Using long short-term memory networks for natural language processing

Onyshchenko, Kostiantyn; Daniiel, Yana

doi:https://doi.org/10.20998/2079-0023.2023.01.14

dc.contributor.author	Onyshchenko, Kostiantyn
dc.contributor.author	Daniiel, Yana
dc.date.accessioned	2023-07-20T11:13:05Z
dc.date.available	2023-07-20T11:13:05Z
dc.date.issued	2023
dc.description.abstract	Emotion classification is a complex, non-trivial language interpretation task because of the structure of natural language and its dynamic nature. The study is significant because it addresses the automatic processing of client feedback, opinion collection, and trend detection. This work considers a number of existing solutions to the emotion classification problem and illustrates their shortcomings and advantages. The performance of the considered models was evaluated on classification into four emotion classes: Happy, Sad, Angry, and Others. A model for emotion classification in three-sentence conversations is proposed. The model is based on smileys and on word embeddings specific to the domain of contemporary Internet conversations. The importance of taking into account the information extracted from smileys as an additional source of emotional coloring is investigated. The model's performance is evaluated and compared with that of the language model BERT (Bidirectional Encoder Representations from Transformers). The proposed model classified emotions better than BERT (F1 score of 78 versus 75). Further study is needed to improve the model's handling of mixed reviews, represented by the emotion class Others. However, current models for language representation and understanding have not yet reached human performance. A variety of factors must be considered when choosing word embeddings and training methods for designing the model architecture.
dc.identifier.citation	Onyshchenko K., Daniiel Ya. Using long short-term memory networks for natural language processing. Bulletin of the National Technical University "KhPI". Series: System analysis, control and information technology. Kharkiv: NTU "KhPI", 2023, no. 1 (9), pp. 89-96.
dc.identifier.doi	https://doi.org/10.20998/2079-0023.2023.01.14
dc.identifier.orcid	https://orcid.org/0000-0002-7746-4570
dc.identifier.orcid	https://orcid.org/0000-0002-3895-0744
dc.identifier.uri	https://repository.kpi.kharkov.ua/handle/KhPI-Press/67289
dc.language.iso	en
dc.publisher	National Technical University "Kharkiv Polytechnic Institute"
dc.subject	natural language processing
dc.subject	neural network
dc.subject	natural language
dc.subject	long short-term memory networks
dc.subject	text classification
dc.subject	emotional text analysis
dc.title	Using long short-term memory networks for natural language processing
dc.title.alternative	Використання мереж довготривалої пам’яті для обробки природної мови
dc.type	Article

Collections

2023 No. 1 System analysis, control and information technology