Department of Intelligent Computer Systems
Permanent link to this collection: https://repository.kpi.kharkov.ua/handle/KhPI-Press/2423
Official department website: http://web.kpi.kharkov.ua/iks
The Department of Intelligent Computer Systems was founded on 12 February 2007 on the basis of the "Applied Linguistics" specialty.
In 2009, the department and the Ukrainian Lingua-Information Fund of the National Academy of Sciences of Ukraine jointly established the Research Center for Intelligent Systems and Computational Linguistics.
The department is part of the Educational and Scientific Institute of Social and Humanitarian Technologies of the National Technical University "Kharkiv Polytechnic Institute".
The department's academic staff includes 2 Doctors of Technical Sciences, 5 Candidates of Philological Sciences, 4 Candidates of Technical Sciences and 1 Candidate of Philosophical Sciences; 2 staff members hold the title of professor and 3 the title of associate professor.
Item: Adaptation of foreign leadership development models to leadership qualities formation of future psychologists in high schools of Ukraine (National Technical University "Kharkiv Polytechnic Institute", 2023). Khaziyev, Arseniy

Item: The aligned Kazakh-Russian parallel corpus focused on the criminal theme (2019). Khairova, N. F.; Kolesnyk, Anastasiia; Mamyrbayev, Orken; Mukhsina, Kuralay
Nowadays, the development of high-quality parallel aligned text corpora is one of the most relevant and advanced directions of modern linguistics. Special emphasis is placed on creating parallel multilingual corpora for low-resourced languages, such as Kazakh. In the study, we explored texts from four Kazakh bilingual news websites and, on their basis, created a parallel Kazakh-Russian corpus of texts focused on the criminal subject. To align the corpus, we used a set of lexical correspondences and the POS-tagging values of both languages. 60% of the sentences in our corpus are automatically aligned correctly. Finally, we analyzed the factors affecting the percentage of errors.

Item: An Overview of Existing Automated Methods Definition of the Author of the Written Text Identification Characteristics (2020). Sliusarieva, Yuliia; Borysova, Natalia; Melnyk, Karina
The paper presents an overview of existing methods for automated identification of the author of a written text: it substantiates the relevance of the research topic, analyzes the existing methods for solving the task, identifies their advantages and disadvantages, and selects the direction for further software implementation.

Item: Applying VSM to Identify the Criminal Meaning of Texts (2020). Khairova, N. F.; Kolesnyk, Anastasiia; Mamyrbayev, Orken; Petrasova, S. V.
Generally, text classification approaches can be used to determine whether a text belongs to a specific theme or domain.
However, the task becomes more complicated when there is no training corpus in which the set of classes and the set of documents belonging to these classes are predetermined. We suggest using the semantic similarity of texts to determine whether they belong to a specific domain. Our training corpus includes news articles containing criminal information. To determine whether the theme of an input document is close to the theme of the training corpus, we propose calculating the cosine similarity between the documents of the corpus and the input document. We have empirically established the average value of the cosine similarity coefficient at which a document can be attributed to the highly specialized documents containing criminal information. We evaluate our approach on a test corpus of articles from Kharkiv news sites. The F-measure of classifying documents with criminal information reaches 96%.

Item: Automated system for the creation and replenishment of users' electronic lexicographical resources (Society for Cultural and Scientific Progress in Central and Eastern Europe, 2018). Borysova, N. V.; Melnyk, K. V.
This article proposes a solution to improve the efficiency of automated generation of electronic lexicographical resources based on processing strongly structured electronic information arrays. The developed automated information system for creating and replenishing lexicographical resources is described. Several supporting subsystems of the developed system are characterized. The effectiveness of the information system has been evaluated.

Item: Automatic Extraction of Synonymous Collocation Pairs from a Text Corpus (Polskie Towarzystwo Informatyczne, Poland, 2018). Khairova, N. F.; Petrasova, S. V.; Lewoniewski, Włodzimierz; Mamyrbayev, Orken; Mukhsina, Kuralay
Automatic extraction of synonymous collocation pairs from text corpora is a challenging NLP task.
To search for collocations of similar meaning in English texts, we use logical-algebraic equations. These equations combine grammatical and semantic characteristics of the words in substantive, attributive and verbal collocation types. With the Stanford POS tagger and the Stanford Universal Dependencies parser, we identify the grammatical characteristics of words. We exploit WordNet synsets to pick synonymous words of collocations. The potential synonymous word combinations found are checked for compliance with the grammatical and semantic characteristics of the proposed logical-linguistic equations. Our dataset includes more than half a million Wikipedia articles from a few portals. The experiment shows that the more frequently synonymous collocations occur in texts, the more related the topics of the texts might be. The precision of the synonymous collocation search in our experiment is close to that of other similar studies.

Item: Automatic Identification of Collocation Similarity (Institute of Electrical and Electronics Engineers, 2015). Petrasova, S. V.; Khairova, N. F.
This paper proposes a logical and linguistic model for automatic identification of collocation similarity. The method of component analysis is proposed to determine the semantic equivalence between collocates. The set of semantic and grammatical characteristics of collocates is identified by means of the algebra of predicates to formalize collocation similarity.

Item: Brainstorming as a part of learning process (National Technical University "Kharkiv Polytechnic Institute", 2016). Gulieva, D. O.

Item: Building the Semantic Similarity Model for Social Network Data Streams (Institute of Electrical and Electronics Engineers, 2018). Petrasova, S. V.; Khairova, N. F.; Lewoniewski, Wlodzimierz
This paper proposes a model for searching for similar collocations in English texts in order to determine semantically connected text fragments for the analysis of social network data streams.
The logical-linguistic model uses semantic and grammatical features of words to obtain a sequence of semantically related text fragments from different actors of a social network. To implement the model, we leverage the Universal Dependencies parser and the Natural Language Toolkit with the lexical database WordNet. Based on the Blog Authorship Corpus, the experiment achieves over 0.92 precision.

Item: Commercial correspondence. Part 1 (2018). Лутай, Наталія Вікторівна; Гулієва, Діна Олександрівна
The textbook is intended for foreign-language faculties and for university faculties that train specialists in the world economy, international economic relations, management and finance, and is designed for 200 hours of classroom instruction. The aim of the course is to study the basics of business correspondence, the writing of letters, enquiries, offers and orders, and work with contracts and with legal and payment documents widely used in business, as well as to improve dialogic speech using professional vocabulary and to develop skills in consecutive interpreting of dialogues with commercial content. Over a number of years, the materials included in the textbook have been tested and have proved effective. The textbook fully corresponds to the English-language syllabus for the aspect "Commercial correspondence and documentation, business communication".

Item: Construction and Analysis of Berber Text Corpus (2020). Zayd, Khayi; Orobinska, Olena
This work is devoted to constructing a tool for analyzing different aspects of Berber languages. It is based on the grammatical parameters of these languages. A collection of more than 500 texts covering a long historical period was assembled. The corpus is freely available and will be useful for further investigations of the Tamazigh language. It was converted into XML format for the purpose of standardization. The corpus contains more than 200 000 words.
Based on linguistic rules and statistical methods, an original user interface and a software prototype were developed by combining web design technologies and object-oriented programming in Python.

Item: Corporative Ecological System and Processes Mathematical Modelling (Kharkiv National University of Radio Electronics, 2009). Kozulia, T. V.; Sharonova, Natalia Valeriyevna
The article considers the basics of the corporate approach in the ecological monitoring system for solving ecological problems at the macro- and microlevel. Practical results of implementing the corporate system to determine an ecological assessment of processes in soils are presented.

Item: Creating a Neural Network for Isolated Words Recognition (2020). Litvichenko, Diana; Kochueva, Zoia
The purpose of this study is the development of a character recognition system based on artificial neural networks. As part of the study, modern artificial neural networks and the field of deep learning were analyzed. As a result of the study, an original method for accomplishing the task was developed.

Item: Design of the User’s Interface of Virtual Lexicographic Laboratory for Explanatory Dictionary of the Spanish Language (2020). Kupriianov, Yevhen; Ostapova, Iryna; Yablochkov, Mykyta
One of the most effective tools for working with dictionaries in a digital environment is the virtual lexicographic laboratory (VLL). Unlike electronic dictionaries, VLLs are intended mostly for professional linguists. The paper shares the authors’ experience in elaborating the interface of the virtual lexicographic laboratory for the Explanatory dictionary of the Spanish language (DLE 23). Using the theory of lexicographic systems, a formal model of DLE 23 was elaborated. On the basis of the model, the database structure and VLL interface elements were defined.
The current version of the VLL DLE 23 interface has the following advantages: 1) making an inventory of language units in the dictionary or in a sample; 2) conducting DLE 23-based linguistic research to reveal lexical-semantic, etymological, grammatical and usage properties of the Spanish language; and 3) building secondary lexicographic objects or sub-dictionaries on the basis of DLE 23, for example sub-dictionaries of morphemes, homonyms, collocations, etc.

Item: Detecting Collocations Similarity via Logical-Linguistic Model (Association for Computational Linguistics, USA, 2019). Khairova, N. F.; Petrasova, S. V.; Mamyrbayev, Orken; Mukhsina, Kuralay
Semantic similarity between collocations, along with word similarity, is one of the main issues of NLP. In particular, it might be addressed to facilitate automatic thesaurus generation. In the paper, we consider a logical-linguistic model that allows defining the relation of semantic similarity of collocations via logical-algebraic equations. We provide the model for English, Ukrainian and Russian text corpora. The implementation for each language differs slightly in the equations of the finite predicates algebra and in the linguistic resources used. As a dataset for our experiment, we use 5801 pairs of sentences from the Microsoft Research Paraphrase Corpus for English and more than 1000 texts of scientific papers for Russian and Ukrainian.

Item: Digital method for assessing thermophysical parameters of an object on its image (National Technical University "Kharkiv Polytechnic Institute", 2022). Babkova, N. V.; Ugolnikov, S. V.

Item: E-Communities как источник данных в неформальном информационном пространстве [E-Communities as a data source in the informal information space] (NTU "KhPI", 2015). Петрасова, Светлана Валентиновна

Item: E-learning: формирование единого образовательного пространства на базе контента пользователей социальных сетей [E-learning: forming a unified educational space based on the content of social network users] (People's Ukrainian Academy, 2014). Петрасова, Светлана Валентиновна; Хайрова, Нина Феликсовна

Item: Efficiency estimation of methods for sentiment analysis of social network messages (National Technical University "Kharkiv Polytechnic Institute", 2019). Borysova, N. V.; Melnyk, K. V.
The results of evaluating the effectiveness of machine learning methods for sentiment analysis of social network messages are presented in this paper. The importance of sentiment analysis as one of the key tasks of natural language processing in general and of textual information processing in particular is substantiated. A review of existing methods and software for sentiment analysis is made. The choice of classifiers for sentiment analysis of texts in this research is substantiated. The principles of operation of a Naïve Bayesian classifier and of a classifier based on a recurrent neural network are described. The classifiers were trained sequentially on two corpora: first on the RuTweetCorp corpus, a corpus of short messages from the social network Twitter, and then on the Slang corpus, a corpus of messages from the social networks Facebook and Instagram and posts from the Pikabu website; in the second corpus, the tonality of slang words was marked up. Information about the tonality of slang words was taken from a youth slang dictionary obtained from a survey of users. Texts were separated by tonality into three classes: positive, negative and neutral. The efficiency of these classifiers was evaluated using the standard metrics Recall, Precision, F-measure and Accuracy.
For the Naïve Bayesian classifier, training on the first corpus gave the following metric values: Recall = 0.853; Precision = 0.869; F-measure = 0.861; Accuracy = 0.855. After training on the second corpus, the values were: Recall = 0.948; Precision = 0.975; F-measure = 0.961; Accuracy = 0.960. For the classifier based on a recurrent neural network, training on the first corpus gave: Recall = 0.870; Precision = 0.878; F-measure = 0.874; Accuracy = 0.861. After training on the second corpus, the values were: Recall = 0.965; Precision = 0.982; F-measure = 0.973; Accuracy = 0.973. These results show that additional training on the second corpus increased the efficiency of the classifiers by 10–11%.

Item: English communication practice (National Technical University "Kharkiv Polytechnic Institute", 2019). Petrasova, S. V.; Bratus, T. V.; Nikonorov, S. I.
The teaching aid for the course "Language Communication Practicum (English)" is intended to build and improve the knowledge and skills needed for a high practical command of English and for accurate and thorough comprehension of original English texts, as well as to further expand students' vocabulary. The teaching aid contains vocabulary, grammar and speaking exercises, and offers texts on job interviews with corresponding tasks for developing written and oral speech skills.