Open Information Extraction as Additional Source for Kazakh Ontology Generation

dc.contributor.author: Khairova, N. F. [en]
dc.contributor.author: Petrasova, S. V. [en]
dc.contributor.author: Mamyrbayev, Orken [en]
dc.contributor.author: Mukhsina, Kuralay [en]
dc.date.accessioned: 2022-09-10T15:57:08Z
dc.date.available: 2022-09-10T15:57:08Z
dc.date.issued: 2020
dc.description.abstract: Nowadays, structured information obtained from unstructured texts and Web context can be applied as an additional source of knowledge to create ontologies. In order to extract information from a text and represent it in the RDF-triplets format, we suggest using the Open Information Extraction model. Then we consider the adaptation of the model to fact extraction from unstructured texts in the Kazakh language. In our approach, we identify lexical units that name the participants of the action (the Subject and Object) and semantic relations between them based on word characteristics in a sentence. The model provides semantic functions of the action participants via logical-linguistic equations that express the relations of the grammatical and semantic characteristics of the words in a Kazakh sentence. Using the tag names and some syntactic characteristics of words in the Kazakh sentences as the values of the predicate variables in the corresponding equations allows us to extract Subjects, Objects and Predicates of facts from texts of Web content. The experimental research dataset includes texts extracted from Kazakh bilingual news websites. The experiment shows that we can achieve a fact extraction precision of over 71% for the Kazakh corpus. [en]
dc.identifier.citation: Open Information Extraction as Additional Source for Kazakh Ontology Generation [Electronic resource] / N. Khairova [et al.] // Intelligent Information and Database Systems (ACIIDS 2020) : proc. of the 12th Asian Conf., Phuket, Thailand, March 23-26, 2020. Pt. 1 / ed.: N. T. Nguyen [et al.]. – Electronic text data. – Cham, 2020. – P. 86-96. – (Ser. : Lecture Notes in Computer Science. Vol. 12033). – URL: https://link.springer.com/chapter/10.1007/978-3-030-41964-6_8, paid access (accessed 10.09.2022). [en]
dc.identifier.doi: doi.org/10.1007/978-3-030-41964-6_8
dc.identifier.uri: https://repository.kpi.kharkov.ua/handle/KhPI-Press/57930
dc.language.iso: en
dc.subject: open information extraction [en]
dc.subject: RDF-triplets [en]
dc.subject: unstructured text [en]
dc.subject: logical-linguistic equations [en]
dc.subject: Kazakh bilingual news websites [en]
dc.title: Open Information Extraction as Additional Source for Kazakh Ontology Generation [en]
dc.type: Article [en]

Files

Original bundle
Name: Khairova_Open_information_2020.pdf
Size: 115.87 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 11.25 KB
Format: Item-specific license agreed upon to submission