Information technologies of neural network speech recognition in real-time

Serdyuk, Iryna; Tonitsa, Oleh; Heliarovska, Oksana; Yanovsky, Oleksiy

doi:https://doi.org/10.20998/3083-6298.2025.02.09

dc.contributor.author	Serdyuk, Iryna
dc.contributor.author	Tonitsa, Oleh
dc.contributor.author	Heliarovska, Oksana
dc.contributor.author	Yanovsky, Oleksiy
dc.date.accessioned	2025-10-07T07:37:31Z
dc.date.issued	2025
dc.description.abstract	The purpose of this work is to explore approaches to building neural network speech recognition systems. Real-time speech recognition has become a useful tool for solving problems in many areas of life. Many companies now offer dictation software that lets people create search queries or dictate emails by voice. The work considers neural network speech recognition for Ukrainian in particular. One of the biggest problems in analyzing Ukrainian speech is the limited number of models available for recognition: there are many models for English but very few for Ukrainian. The potential benefits of sound processing and speech recognition are clear, and new developments in these areas are likely to continue. The paper describes neural networks, their operating principle, and methods of audio recognition based on them. Results: the audio signal, its representation, and statistical and physical methods of working with it were studied. Conclusion: effective models for correct speech recognition and toolkits for model training were identified.
dc.description.abstract	The aim of this work is to study approaches to creating neural network speech recognition systems. Speech recognition in real time has become a very useful tool for solving a variety of problems in different spheres of life. Many companies now offer dictation software that allows people to create search queries or dictate emails using voice commands. It is appropriate to consider neural network speech recognition, in particular for Ukrainian. One of the biggest problems facing the analysis of Ukrainian speech is the limited number of models available for recognition. While there are many models for English, there are very few for Ukrainian. Overall, the potential benefits of audio processing and speech recognition are evident, and it is quite likely that new developments in these fields will continue to appear. Neural networks, the principle of their operation, and ways of recognizing audio with them are described. The following results were obtained: the audio signal, its representation, and statistical and physical methods of working with it were studied. Conclusion: effective models for correct speech recognition and toolkits for model training were found.
dc.identifier.citation	Information technologies of neural network speech recognition in real-time / I. Serdyuk [et al.] // Територія безпеки = Terra security. – 2025. – Vol. 1, No. 2. – P. 72-80.
dc.identifier.doi	https://doi.org/10.20998/3083-6298.2025.02.09
dc.identifier.orcid	https://orcid.org/0009-0001-1143-9145
dc.identifier.orcid	https://orcid.org/0009-0001-8498-0522
dc.identifier.orcid	https://orcid.org/0000-0002-8927-7465
dc.identifier.orcid	https://orcid.org/0009-0002-4310-2843
dc.identifier.uri	https://repository.kpi.kharkov.ua/handle/KhPI-Press/93724
dc.language.iso	en
dc.publisher	Національний технічний університет "Харківський політехнічний інститут"
dc.subject	neural networks
dc.subject	audio signal processing
dc.subject	convolutional neural network
dc.subject	gestalt grouping
dc.subject	cochlear model
dc.subject	dataset
dc.title	Information technologies of neural network speech recognition in real-time
dc.title.alternative	Інформаційні технології нейромережевого розпізнавання мовлення в режимі реального часу
dc.type	Article

Collections

2025 No. 2 Територія безпеки