Research output: Contribution to journal › Conference article › peer-review
Automatic text summarization based on syntactic links. / Yerimbetova, A. S.; Batura, T. V.; Murzin, F. A. et al.
In: CEUR Workshop Proceedings, Vol. 2570, 01.01.2020.
TY - JOUR
T1 - Automatic text summarization based on syntactic links
AU - Yerimbetova, A. S.
AU - Batura, T. V.
AU - Murzin, F. A.
AU - Sagnayeva, S. K.
N1 - Publisher Copyright: Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0) Copyright: Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/1/1
Y1 - 2020/1/1
N2 - The task of information retrieval is to find the documents in a given collection that are relevant to a query. A document is a text selected by the author as a single fragment. A query is usually a meaningful phrase or a set of words describing the information needed. Instead of searching through whole documents, it is often sufficient to search by topic or by a summary of each document. By "topic" we mean a set of short reference texts. One of the interesting tasks in information retrieval systems is therefore the classification of texts by topic. The classification process is carried out in four stages: preprocessing the text, weighting the terms, weighting the sentences, and extracting meaningful sentences. When topics are selected, fragments of the text (for example, paragraphs) are examined and compared with the chosen reference. Different fragments can be attributed to different topics, and the selected fragments can be combined into a summary on a given topic. This paper considers automatic summarization of text documents that takes into account the syntactic relations between words and word forms in sentences, as produced by the Link Grammar Parser (LGP) system for the Kazakh and Turkish languages. The authors build on the results of earlier studies on adapting the LGP parser to agglutinative languages.
KW - Closeness centrality
KW - Directed graph
KW - Information retrieval
KW - LGP
KW - Summarization
KW - Text topics
KW - Word weight
UR - http://www.scopus.com/inward/record.url?scp=85081539628&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85081539628
VL - 2570
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
SN - 1613-0073
T2 - 1st International Conference of Information Systems and Design, ICID 2019
Y2 - 5 December 2019
ER -
ID: 23804277
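
The four stages named in the abstract (preprocessing, term weighting, sentence weighting, sentence extraction) can be illustrated with a minimal sketch. This is not the authors' method: the paper derives weights from syntactic links produced by LGP, whereas the sketch below assumes plain relative term frequency as a stand-in. The function names, the toy stopword list, and the parameter `k` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy stopword list; a real system would use a language-specific one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "and", "by", "for"}


def split_sentences(text):
    """Stage 1a: split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Stage 1b: lowercase, keep word characters, drop stopwords."""
    return [w for w in re.findall(r"\w+", sentence.lower()) if w not in STOPWORDS]


def term_weights(sentences):
    """Stage 2: weight each term by its relative frequency (assumed stand-in
    for the syntactic-link-based weights described in the abstract)."""
    counts = Counter(w for s in sentences for w in tokenize(s))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def sentence_weight(sentence, weights):
    """Stage 3: a sentence's weight is the mean weight of its terms."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)


def summarize(text, k=2):
    """Stage 4: keep the k highest-weighted sentences, in original order."""
    sentences = split_sentences(text)
    weights = term_weights(sentences)
    scored = [(sentence_weight(s, weights), i) for i, s in enumerate(sentences)]
    # Sort by descending weight; ties go to the earlier sentence.
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    return [sentences[i] for i in sorted(i for _, i in top)]


if __name__ == "__main__":
    sample = "Parsers build links. Links connect words. Cats sleep."
    print(summarize(sample, k=2))
```

Replacing `term_weights` with a function that scores words from a parser's link graph (for example, by a centrality measure on a directed graph, as the keywords suggest) would leave the other three stages unchanged.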