Application of Paragraphs Vectors Model for Semantic Text Analysis

Date

2020

Abstract

The paper examines the paragraph vector model and its two methods, distributed memory and distributed bag of words. The distinguishing feature of the model is that it defines objective functions for individual sentences and represents them as local vectors, from which a global vector is built that captures the semantic content of the text as a whole. Various aspects of applying the distributed memory and distributed bag of words methods are considered, together with the algorithms underlying them, which yield distributed vectors of text parts for the task of finding similar articles, where the search is carried out by keywords, abstracts, and articles of various sizes. Experiments established that Doc2Vec with its Bag-of-Words method is the most complete approach: it makes it possible to detect borrowings and analogues depending on the structural elements of the text, in accordance with the review and the task. The Bag-of-Words method also lets the user form an accurate picture of the lexical meaning of a word and its semantic relations in language and texts.
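The similar-article search described in the abstract can be illustrated with a minimal sketch. It assumes that each text (keyword list, abstract, or full article) has already been mapped to a fixed-length vector by a paragraph vector model, for example a Doc2Vec implementation. The function names `cosine` and `rank_similar`, the document identifiers, and the three-dimensional toy vectors are hypothetical and are not taken from the paper. Cosine similarity is used here as one common way to compare such vectors, and the paper may use a different measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(query_vec, doc_vecs):
    """Return (doc_id, similarity) pairs, most similar first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for paragraph vectors of stored articles
# (made-up numbers for illustration only).
doc_vecs = {
    "article_a": [0.9, 0.1, 0.0],
    "article_b": [0.1, 0.8, 0.3],
    "article_c": [0.7, 0.3, 0.1],
}

# Toy vector standing in for the inferred vector of a query text,
# such as a keyword list or an abstract.
query_vec = [1.0, 0.2, 0.0]

ranking = rank_similar(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In this sketch the query is closest to `article_a`, then `article_c`, then `article_b`. In practice the vectors would have many more dimensions and would come from a trained model instead of hand-written values.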

Keywords

text meaning definition, semantic analysis, latent-semantic analysis, experiment, textual information, model, semantic analysis library, text analysis, text fragment

Bibliographic description

Application of Paragraphs Vectors Model for Semantic Text Analysis [Electronic resource] / I. Gruzdo [et al.] // Computational Linguistics and Intelligent Systems (COLINS 2020) : proc. of the 4th Intern. Conf., April 23-24, 2020. Vol. 2604. – Electronic text data. – Lviv, 2020. – 11 p. – Access mode: https://ceur-ws.org/Vol-2604/paper22.pdf, free (accessed 13.02.2024).