Authors: Cherednichenko, Olga; Vovk, Maryna Anatoliivna; Kanishcheva, Olga; Godlevskyi, Mikhail
Dates: 2020-06-01; 2020-06-01; 2018
Citation: Studying items similarity for dependable buying on electronic marketplaces [Electronic resource] / O. Cherednichenko [et al.] // Computational linguistics and intelligent systems (COLINS 2018) : proc. of the 2nd Intern. Conf., June 25-27, 2018. Vol. 1: Main Conference / ed.: V. Lytvyn [et al.]. – Electron. text data. – Lviv, 2018. – P. 78-89. – URL: http://ceur-ws.org/Vol-2136/10000078.pdf, free (accessed 01.06.2020).
URI: https://repository.kpi.kharkov.ua/handle/KhPI-Press/46674
Abstract: Choosing a product to buy is difficult when each market category contains thousands of items. To study item similarity for dependable buying, we analyze item descriptions on the AliExpress and eBay marketplaces and test the k-means algorithm for item grouping (product segmentation). We study the use of classical clustering algorithms for grouping similar products according to their descriptions. A corpus of products (bikes and smartphones) from the AliExpress and eBay e-shops is developed. Each entity in this corpus contains photos and a product description with different fields. These short texts are used for the experiments. The results show that the k-means algorithm works well only when the data are uniformly distributed across categories, and that it is not suitable for segmenting heterogeneous descriptions. The paper sets out the task of systematizing item descriptions.
Language: en
Keywords: e-commerce; dependable buying; recommendation systems; product search; clusterization; k-means; TFIDF
Title: Studying items similarity for dependable buying on electronic marketplaces
Type: Thesis
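The grouping of short product descriptions mentioned in the abstract can be illustrated with a minimal sketch. This record does not give the paper's actual preprocessing, feature weighting, or k-means settings, so everything below is an assumption for illustration: the toy descriptions, the whitespace tokenizer, the smoothed TF-IDF formula, the deterministic farthest-first initialization (used instead of random starts), and the function names tfidf_vectors, to_dense, and kmeans.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors (term -> weight dicts)."""
    tokenised = [doc.lower().split() for doc in docs]  # assumed tokenizer
    n_docs = len(tokenised)
    doc_freq = Counter(t for tokens in tokenised for t in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # Smoothed IDF (an assumption): log((1 + N) / (1 + df)) + 1
        vec = {
            t: (c / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
            for t, c in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def to_dense(vectors):
    """Turn sparse dict vectors into equal-length lists over a shared vocabulary."""
    vocab = sorted({t for v in vectors for t in v})
    return [[v.get(t, 0.0) for t in vocab] for v in vectors]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic start: first point, then repeatedly the point farthest from all chosen centroids."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical descriptions, not taken from the paper's corpus.
descriptions = [
    "mountain bike aluminium frame 21 speed",
    "road bike carbon frame 18 speed",
    "smartphone 6 inch screen 64gb dual sim",
    "smartphone 5 inch screen 32gb dual sim",
]
labels = kmeans(to_dense(tfidf_vectors(descriptions)), k=2)
for text, label in zip(descriptions, labels):
    print(label, text)
```

With clean, balanced toy categories like these, the two groups separate easily. The abstract reports that k-means is not suitable for heterogeneous descriptions, which a toy example of this kind does not capture.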