NLP Resources for a Rare Language Morphological Analyzer: Danish Case

dc.contributor.author: Kotov, Mykhailo
dc.date.accessioned: 2025-05-27T11:12:46Z
dc.date.issued: 2017
dc.description.abstract: The paper discusses the characteristics and practical aspects of application of the natural language processing resources available for developing a rare language morphological analysis solution. The case under consideration reveals the pipeline design needed to prepare the grammatical resources for Danish. Being rare not only in terms of distribution, but also in the amount of natural language resources available, the Danish language represents a significant problem in terms of application of third-party tools to help solve various NLP-related issues. The paper focuses on part-of-speech tagging and lemmatization, typical but indispensable tasks at the pre-processing stage within the framework of developing a morphological analyzer as a custom NLP solution.
dc.identifier.citation: Kotov M. NLP Resources for a Rare Language Morphological Analyzer: Danish Case / Mykhailo Kotov // Computational linguistics and intelligent systems. COLINS 2017 : proc. of the 1st Intern. conf., 21 April 2017 / org. com.: O. Kanishcheva [et al.] ; National Technical University "Kharkiv Polytechnic Institute". – Kharkiv : NTU "KhPI", 2017. – P. 31-36.
dc.identifier.orcid: https://orcid.org/0000-0001-8327-5197
dc.identifier.uri: https://repository.kpi.kharkov.ua/handle/KhPI-Press/89854
dc.language.iso: en
dc.publisher: National Technical University "Kharkiv Polytechnic Institute"
dc.subject: morphological analyzer
dc.subject: lemmatization
dc.subject: part-of-speech tagging
dc.subject: Hunspell
dc.subject: OpenNLP
dc.subject: snowball stemmer
dc.subject: SyntaxNet
dc.subject: word-list
dc.title: NLP Resources for a Rare Language Morphological Analyzer: Danish Case
dc.type: Article

Files

File bundle

Name: Kotov_Resources_2017.pdf
Size: 249.54 KB
Format: Adobe Portable Document Format
