Departments

Permanent link to this section: https://repository.kpi.kharkov.ua/handle/KhPI-Press/35393

Search results

Showing 1 - 5 of 5
  • Collection and processing of a Medical Corpus in Ukrainian
    (2020) Cherednichenko, Olga; Kanishcheva, Olga; Yakovleva, Olena; Arkatov, Denis
    Text corpora are the basis of natural language study. We describe the structure of a Ukrainian-language corpus (UKRMED), which contains a variety of medical text genres (clinical protocols, blogs, and Wikipedia). The paper shows the process of collecting, creating, and processing a corpus of medical data in Ukrainian. We present our own framework for creating a text corpus. The medical domain and text simplification are chosen as the corpus directions. The authors give statistical characteristics of the corpus and provide an analysis of its parts of speech. Lemma frequencies for this medical corpus are analyzed. The UKRMED corpus can be used for solving the task of natural language simplification.
  • Developing the Key Attributes for Product Matching Based on the Item’s Image Tag Comparison
    (2020) Cherednichenko, Olga; Yanholenko, Olha; Kanishcheva, Olga
    With the constant growth of the number of products on e-marketplaces, buyers find it hard to find and choose items that would satisfy all their needs and expectations. Search and filtering algorithms of recommender systems, although they strive to help users, still fail quite often due to incomplete and inaccurate descriptions of items. This work suggests combining analysis of both the item description and the item image in order to construct groups of similar items. Since a person can judge whether two items are similar by looking at two images and a brief description, it is suggested to form a set of similar items based on users’ judgments and then to extract the core of keywords for the specific type of product. Further, it is proposed to use this core to evaluate the similarity of any new item added to a given group. The case study deals with building the core of keywords for sneakers. The developed key attributes allow matching the items with high precision, thus demonstrating the effectiveness of the core construction method.
  • Readability Evaluation for Ukrainian Medicine Corpus (UKRMED)
    (2021) Cherednichenko, Olga; Kanishcheva, Olga
    In our work, we demonstrate how different readability formulas work on our Ukrainian-language corpus (UKRMED) of medical texts. UKRMED contains three types of texts in the medical domain divided by their complexity: “Complex texts”, “Moderate texts”, and “Simple texts”. This research aims to (1) demonstrate the use of the most commonly used readability formulas on written health information in Ukrainian, (2) compare and contrast these formulas across various texts (simple, moderate, and complex), (3) investigate different medical text features that will be used for text simplification and classification of medical texts, and (4) prepare recommendations for using these formulas to evaluate the readability of medical texts in Ukrainian.
  • Towards Improving the Search Quality on the Trading Platforms
    (Springer International Publishing AG, 2018) Cherednichenko, Olga; Vovk, Maryna Anatoliivna; Kanishcheva, Olga; Godlevskyi, Mikhail
    In this paper, the problem of search quality on trading platforms such as AliExpress, eBay, and others is explored, and the major types of problems that arise when customers search for products are considered. The use of classical clustering algorithms for grouping similar products according to their descriptions is studied. A data set for experiments, consisting of different items (smartphones) from the eBay e-shop, is developed. For each entity in this corpus, photos and a product description are given. These texts are used to compare items in order to form groups of similar items. The results show that the k-means algorithm is good for preliminary grouping, but for detailed processing other methods and approaches are required.
  • Studying items similarity for dependable buying on electronic marketplaces
    (2018) Cherednichenko, Olga; Vovk, Maryna Anatoliivna; Kanishcheva, Olga; Godlevskyi, Mikhail
    The process of buying a product is a very difficult task when there are thousands of items in each market category. In order to study item similarity for dependable buying, we analyze item descriptions on the AliExpress and eBay marketplaces and test the k-means algorithm for item grouping and product segmentation. The use of classical clustering algorithms for grouping similar products according to their descriptions is studied. A corpus of different products (bikes and smartphones) from the AliExpress and eBay e-shops is developed. Each entity in this corpus contains photos and a product description with different fields. These short texts are used for experiments. As a result, it is found that the k-means algorithm works well only for data distributed uniformly across categories and is not suitable for the segmentation of heterogeneous descriptions. The task of systematizing item descriptions is set in the research below.
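
The last two abstracts describe grouping product descriptions with the k-means algorithm. As a rough illustration only, the idea can be sketched as follows. The abstracts do not say how descriptions were turned into features, so the bag-of-words vectors, the farthest-point seeding, the function names (`tokenize`, `vectorize`, `kmeans`), and the four sample descriptions are all assumptions made for this sketch, not the authors' implementation or data.

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase a description and split it into alphanumeric tokens."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def vectorize(descriptions):
    """Turn descriptions into L2-normalised bag-of-words vectors (assumed features)."""
    counts = [Counter(tokenize(d)) for d in descriptions]
    vocab = sorted(set().union(*counts))
    vectors = []
    for c in counts:
        vec = [float(c[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Lloyd's k-means with deterministic farthest-point seeding (a sketch choice)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the vector farthest from its nearest existing centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its closest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical product descriptions, invented for this example.
descriptions = [
    "Smartphone 64GB dual SIM black",
    "Smartphone 128GB dual SIM blue",
    "Mountain bike 21 speed aluminium frame",
    "Road bike carbon frame 18 speed",
]
labels = kmeans(vectorize(descriptions), k=2)
print(labels)
```

On such cleanly separated toy data the two smartphones and the two bikes fall into separate clusters. Real marketplace descriptions are far more heterogeneous, which is where the abstracts report that k-means alone becomes insufficient.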