Departments
Permanent link to this section: https://repository.kpi.kharkov.ua/handle/KhPI-Press/35393
2 results
Search results
Document: Efficiency estimation of methods for sentiment analysis of social network messages (National Technical University "Kharkiv Polytechnic Institute", 2019). Borysova, N. V.; Melnyk, K. V.
This paper presents the results of evaluating the effectiveness of machine learning methods for sentiment analysis of social network messages. The importance of sentiment analysis as a task of natural language processing in general, and of textual information processing in particular, is substantiated. Existing methods and software for sentiment analysis are reviewed, and the choice of classifiers for this research is justified. The operating principles of a naive Bayesian classifier and of a classifier based on a recurrent neural network are described. The classifiers were trained sequentially on two corpora: first on RuTweetCorp, a corpus of short messages from the social network Twitter, and then on the Slang corpus, a corpus of messages from the social networks Facebook and Instagram and posts from the Pikabu website. In the second corpus, slang words were annotated with their tonality. Information about the tonality of slang words was taken from a youth slang dictionary compiled from a user survey. Texts were separated by tonality into three classes: positive, negative and neutral. The efficiency of the classifiers was evaluated using the standard metrics Recall, Precision, F-measure and Accuracy. For the naive Bayesian classifier, training on the first corpus gave Recall = 0.853, Precision = 0.869, F-measure = 0.861 and Accuracy = 0.855; training on the second corpus gave Recall = 0.948, Precision = 0.975, F-measure = 0.961 and Accuracy = 0.960.
For the classifier based on a recurrent neural network, training on the first corpus gave Recall = 0.870, Precision = 0.878, F-measure = 0.874 and Accuracy = 0.861; training on the second corpus gave Recall = 0.965, Precision = 0.982, F-measure = 0.973 and Accuracy = 0.973. These results show that additional training on the second corpus increased the efficiency of the classifiers by 10–11%.

Document: Development of agent-oriented software components to retrieve the marketing information from the web (NTU "KhPI", 2018). Cherednichenko, Olga Yurevna; Melnyk, K. V.; Kirkin, Stanislav Vasylevich; Sokolov, Dmitry Vitalevich; Matveev, Alexander Nikolaevich
The article examines the processes of extracting marketing information from the Web. It concludes that modern businesses need a marketing information system, and a decision was made to develop software for collecting and analyzing marketing information. The main problems of collecting marketing information on the Web are identified and analyzed. External systems for extracting and processing marketing information from the Web were considered. During the analysis of the subject area, functional and non-functional requirements for the software were formulated, and requirements for selecting the development technologies were defined. Software development technologies were analyzed and an approach to developing the software component was chosen. The approaches analyzed were object-oriented programming, service-oriented architecture, component-oriented programming and agent-oriented programming. An agent-based three-tier architecture was chosen for the software.
The programming languages most commonly used in such systems were found to be Java, KIF, KQML, AgentSpeak, April, TeleScript, Tcl/Tk and Oz. The popular agent platforms JADE, Cougaar, ZEUS and Jason and their functions were analyzed. The JADE platform was chosen for development, and its classes, methods and interfaces were examined. The advantages and peculiarities of the SOLID principles are analyzed, the levels of the CLEAN architecture are examined in detail, and the options for implementing this architecture in software are explained. A software architecture was developed for the data collection system. In accordance with the requirements, software development tools were selected: the Java programming language, the Spring Framework, GoF design patterns, the Dependency Injection pattern, and the SOLID and CLEAN architectural principles. A software component was developed for marketing information gathering systems, which helps optimize this process. The limitations of the software system and ways to improve it are analyzed.
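
Returning to the first record (Borysova and Melnyk, 2019): the reported F-measure values can be cross-checked against the reported Precision and Recall. The sketch below assumes that "F-measure" means the balanced F1 score, the harmonic mean of Precision and Recall; the abstract does not state which variant was used. The function names, dictionary keys and tolerance are illustrative and do not come from the paper.

```python
# Metric values copied from the abstract of the first record.
# Assumption: "F-measure" is the balanced F1 score (harmonic mean of
# precision and recall); the abstract does not name the variant.
REPORTED = {
    ("naive_bayes", "RuTweetCorp"): {"recall": 0.853, "precision": 0.869, "f_measure": 0.861},
    ("naive_bayes", "Slang"): {"recall": 0.948, "precision": 0.975, "f_measure": 0.961},
    ("rnn", "RuTweetCorp"): {"recall": 0.870, "precision": 0.878, "f_measure": 0.874},
    ("rnn", "Slang"): {"recall": 0.965, "precision": 0.982, "f_measure": 0.973},
}


def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def check_reported(rows, tolerance=0.0015):
    """Recompute F1 for each row and compare it with the reported value.

    The tolerance is a hypothetical allowance for the inputs themselves
    having been rounded to three decimal places.
    """
    results = {}
    for key, metrics in rows.items():
        computed = f1_score(metrics["precision"], metrics["recall"])
        consistent = abs(computed - metrics["f_measure"]) <= tolerance
        results[key] = (round(computed, 3), consistent)
    return results


if __name__ == "__main__":
    for key, (computed, consistent) in check_reported(REPORTED).items():
        label = "consistent" if consistent else "differs"
        print(key, computed, label)
```

Under this assumption, all four recomputed values round to the F-measure figures given in the abstract.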