Collection and processing of a Medical Corpus in Ukrainian

Cherednichenko, Olga; Kanishcheva, Olga; Yakovleva, Olena; Arkatov, Denis

Files

Cherednichenko_Collection_and_processing_2020.pdf (621.92 KB)

Date

2020

Authors

ORCID

https://orcid.org/0000-0002-9391-5220
https://orcid.org/0000-0002-9035-1765
https://orcid.org/0000-0002-6129-6146

Abstract

Text corpora are the basis of natural language study. We describe the structure of a Ukrainian-language corpus (UKRMED), which contains a variety of medical text genres (clinical protocols, blogs, and Wikipedia). The paper presents the process of collecting, creating, and processing a corpus of medical data in Ukrainian. We present our own framework for creating a text corpus. The medical domain and text simplification are chosen as the corpus directions. We give statistical characteristics of the corpus and an analysis of its parts of speech. Lemma frequencies for this medical corpus are analyzed. The UKRMED corpus can be used for solving the task of natural language simplification.

Keywords

Medicine Corpus, Corpus Linguistics, Ukrainian, Text Collection, Ukrainian-language corpus, Natural Language Processing

Bibliographic description

Collection and processing of a Medical Corpus in Ukrainian [Electronic resource] / O. Cherednichenko [et al.] // Computational Linguistics and Intelligent Systems (COLINS 2020) : proc. of the 4th Intern. Conf., April 23-24, 2020. Vol. 2604. – Electronic text data. – Lviv, 2020. – 11 p. – Access mode: https://ceur-ws.org/Vol-2604/paper21.pdf, free (accessed 02.02.2024).

URI

https://repository.kpi.kharkov.ua/handle/KhPI-Press/73613

Collections

Department of Software Engineering and Intelligent Management Technologies named after A. V. Dabagyan
