Komarov Ivan Dmitrievich (postgraduate student, All-Russian Institute of Scientific and Technical Information of the Russian Academy of Sciences (VINITI RAS), Moscow)
This article examines how embeddings behave in the automatic indexing of scientific texts with key terms. To analyze the capabilities and limitations of embedding-based indexing methods, three groups of factors are identified: equality of vectors, contextual blurring, and loss of the structural significance of terms. A structural and semantic model is proposed that uses a weighted representation of each term and an aggregated document vector. The results of the study show that the best indexing quality and reproducibility are achieved by combining the semantic proximity of candidate terms with the structure of the scientific text, which increases the consistency of key terms in digital scientific collections.
Keywords: automatic indexing, embeddings, keywords, vector representations, scientific texts, structural and semantic model, digital scientific collections.
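The abstract does not give the model's formulas, so the following is only a minimal sketch of how a weighted term representation and an aggregated document vector could be combined to rank candidate terms. Everything concrete in it is an assumption made for illustration: the section weights in `SECTION_WEIGHTS`, the toy three-dimensional vectors, the scoring rule (structural weight multiplied by cosine similarity to the document vector), and all function names. It should not be read as the author's actual method.

```python
import math

# Hypothetical section weights; the abstract names no concrete values.
SECTION_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def term_weight(sections):
    """Structural weight: sum of the weights of the sections a term occurs in."""
    return sum(SECTION_WEIGHTS.get(s, 0.0) for s in sections)


def document_vector(candidates):
    """Aggregated document vector: structurally weighted mean of term vectors."""
    dim = len(next(iter(candidates.values()))["vector"])
    total = [0.0] * dim
    weight_sum = 0.0
    for info in candidates.values():
        w = term_weight(info["sections"])
        weight_sum += w
        for i, x in enumerate(info["vector"]):
            total[i] += w * x
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


def rank_terms(candidates, top_k=3):
    """Score each candidate by structural weight times proximity to the document."""
    doc = document_vector(candidates)
    scored = [
        (term, term_weight(info["sections"]) * cosine(info["vector"], doc))
        for term, info in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings invented for illustration; real ones would come from a model.
CANDIDATES = {
    "automatic indexing": {"vector": [0.9, 0.1, 0.0],
                           "sections": ["title", "abstract", "body"]},
    "embeddings": {"vector": [0.8, 0.3, 0.1], "sections": ["title", "body"]},
    "digital collections": {"vector": [0.1, 0.9, 0.2], "sections": ["body"]},
}

ranking = rank_terms(CANDIDATES, top_k=2)
for term, score in ranking:
    print(f"{term}: {score:.3f}")
```

In this sketch a term that appears in the title and abstract both pulls the document vector toward itself and receives a larger multiplier, so section structure influences the ranking in addition to semantic proximity. Treating every occurrence and every section equally corresponds to setting all weights to the same value.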
Citation link: Komarov I. D. EMBEDDINGS AS THE BASIS FOR AUTOMATIC INDEXING OF SCIENTIFIC TEXTS WITH KEY TERMS: ALGORITHMIC CONSTRAINTS AND A STRUCTURAL AND SEMANTIC MODEL // Современная наука: актуальные проблемы теории и практики. Серия: Естественные и Технические Науки. 2026. No. 03. P. 94-100. DOI 10.37882/2223-2966.2026.03.14