Permanent link to this section: https://repository.kpi.kharkov.ua/handle/KhPI-Press/35393
Search results: 3 documents
Document: Collection and processing of a Medical Corpus in Ukrainian (2020)
Cherednichenko, Olga; Kanishcheva, Olga; Yakovleva, Olena; Arkatov, Denis
Text corpora are the basis of natural language study. We describe the structure of a Ukrainian-language corpus (UKRMED), which contains a variety of medical text genres (clinical protocols, blogs, and Wikipedia). The paper shows the process of collecting, creating, and processing a corpus of medical data in Ukrainian. We present our own framework for creating a text corpus. The medical domain and text simplification are chosen as the corpus directions. The authors give statistical characteristics of the corpus and provide an analysis of the morphological parts of speech. The most frequent lemmas in this medical corpus are analyzed. The UKRMED corpus can be used for solving the task of natural language simplification.

Document: Developing the Key Attributes for Product Matching Based on the Item’s Image Tag Comparison (2020)
Cherednichenko, Olga; Yanholenko, Olha; Kanishcheva, Olga
With the constant growth of the number of products on e-marketplaces, buyers find it hard to find and choose items that would satisfy all their needs and expectations. The search and filtering algorithms of recommender systems, although designed to help users, still fail quite often because item descriptions are incomplete and inaccurate. This work suggests combining the analysis of both the item description and the item image in order to construct groups of similar items. Since a person can decide whether two items are similar by looking at two images and a brief description, it is suggested to form a set of similar items based on users’ judgments and then to extract the core of keywords for the specific type of product. Further, it is proposed to use this core to evaluate the similarity of any new item added to a given group. The case study deals with building the core of keywords for sneakers. The developed key attributes allow matching the items with high precision, thus demonstrating the effectiveness of the core construction method.

Document: Readability Evaluation for Ukrainian Medicine Corpus (UKRMED) (2021)
Cherednichenko, Olga; Kanishcheva, Olga
In our work, we demonstrate how different readability formulas work on our Ukrainian-language corpus (UKRMED) of medical texts. UKRMED contains three types of texts in the medical domain, divided by their complexity: “Complex texts”, “Moderate texts”, and “Simple texts”. This research aims to (1) demonstrate the use of the most commonly used readability formulas on written health information in Ukrainian, (2) compare and contrast these formulas across various texts (simple, moderate, and complex), (3) research different medical text features that will be used for text simplification and for the classification of medical texts, and (4) prepare recommendations for using these formulas to evaluate the readability of medical texts in Ukrainian.
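
As an illustration of what applying a readability formula to a text involves, here is a minimal Python sketch of the Automated Readability Index (ARI). This is an assumption made for illustration: the abstract does not name the formulas the authors used, and ARI coefficients were calibrated on English, so scores for Ukrainian text are only useful for relative comparison. The function name and the simple regex-based tokenization are hypothetical and are not taken from the paper.

```python
import re


def automated_readability_index(text: str) -> float:
    """Compute ARI = 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43.

    Illustrative sketch only: tokenization is a plain regex split, so words
    containing an apostrophe are counted as two tokens.
    """
    # Sentences: non-empty chunks between runs of ., ! or ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: runs of Unicode letters/digits (Cyrillic is matched by \w)
    words = re.findall(r"\w+", text)
    if not words or not sentences:
        raise ValueError("text must contain at least one word")
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    sample = "Це тест. Це ще один тест."
    print(round(automated_readability_index(sample), 2))
```

Longer words and longer sentences both raise the score, so under this sketch a text from a "Complex texts" group would be expected to score higher than one from a "Simple texts" group, although the absolute values should not be read as English grade levels.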