Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Identification of Scientific Texts with Similar Argumentation Complexity. / Pimenov, Ivan; Salomatina, Natalia.
2022 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON). Institute of Electrical and Electronics Engineers (IEEE), 2022. p. 870-875.
TY - GEN
T1 - Identification of Scientific Texts with Similar Argumentation Complexity
AU - Pimenov, Ivan
AU - Salomatina, Natalia
N1 - The research was conducted within the framework of the state contract of the Sobolev Institute of Mathematics (projects no. FWNF-2022-0015). The authors express gratitude to the developers of the argumentation annotation software (Laboratory of Artificial Intelligence, Ershov Institute of Informatics Systems) for the provided opportunity to use it for constructing a dataset of argumentatively-annotated scientific texts.
PY - 2022
Y1 - 2022
N2 - This work studies the formal identification of texts that are similar in argumentation complexity. We analyze scientific articles in Russian using clustering algorithms (K-means, Ward, Spectral). The clustering features are formally calculable characteristics of the argumentation annotations for the dataset texts, so the method is applicable to texts in different genres and languages (after adapting the markers dictionary). The principal limitation is that the method requires quantitative characteristics of the argumentation structures of texts as input; these structures are constructed in accordance with the Argument Interchange Format (in the form of rooted directed graphs) and Walton’s compendium of argumentation schemes. We analyze the performance of the clustering algorithms on different feature sets, which characterize the general properties of argumentation graphs, the specific argumentation patterns (common subgraphs for different texts), and the emotionality and authoritativeness of texts. Argumentation patterns are represented in two forms: standard (in accordance with Walton’s compendium) and generalized (based on functional similarity). We check the similarity of the clustering results produced by different algorithms using several quality measures (Jaccard-index-based, V-measure, FM-score), whose values fall in the 64–71 percent range. The dataset contains more than 1000 arguments from argumentation annotations (graphs) for 30 scientific texts in two thematic areas (linguistics and information technologies). The argumentation graphs were constructed by two annotators with the ArgNetBankStudio tool. The resulting clusters are distinguished by the general complexity of argumentation graphs, by the usage of specific argumentation patterns, and by differences in emotionality and authoritativeness.
UR - https://www.scopus.com/inward/record.url?eid=2-s2.0-85147532679&partnerID=40&md5=05ed0aee31dc2653454c32644eca32df
UR - https://www.mendeley.com/catalogue/96ad565c-51ac-399d-a835-aca2baa4e694/
U2 - 10.1109/sibircon56155.2022.10017119
DO - 10.1109/sibircon56155.2022.10017119
M3 - Conference contribution
SN - 9781665464802
SP - 870
EP - 875
BT - 2022 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON)
PB - Institute of Electrical and Electronics Engineers (IEEE)
T2 - 2022 IEEE International Multi-Conference on Engineering, Computer and Information Sciences, SIBIRCON 2022
Y2 - 11 November 2022 through 13 November 2022
ER -
ID: 46007222
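
The abstract reports that partitions produced by different clustering algorithms are compared with agreement measures (Jaccard-index-based, V-measure, FM-score). As a rough illustration of what such a comparison involves, the sketch below computes two pair-counting measures for two label assignments over the same texts. It is a minimal sketch under stated assumptions, not the authors' implementation: the record does not say which Jaccard-based variant was used, so the pair-counting form is an assumption, and the function names and the example labels are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pair_counts(labels_a, labels_b):
    """Count item pairs grouped together by both clusterings, or by only one."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    both = only_a = only_b = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
    return both, only_a, only_b


def pair_jaccard(labels_a, labels_b):
    """Pair-counting Jaccard index (assumed variant, not confirmed by the source)."""
    both, only_a, only_b = pair_counts(labels_a, labels_b)
    total = both + only_a + only_b
    # If neither clustering groups any pair, treat the partitions as identical.
    return both / total if total else 1.0


def fowlkes_mallows(labels_a, labels_b):
    """Fowlkes-Mallows score: geometric mean of pairwise precision and recall."""
    both, only_a, only_b = pair_counts(labels_a, labels_b)
    denom = sqrt((both + only_a) * (both + only_b))
    return both / denom if denom else 0.0


# Hypothetical cluster labels for six texts from two algorithms.
kmeans_labels = [0, 0, 0, 1, 1, 1]
ward_labels = [1, 1, 0, 0, 0, 0]

print(pair_jaccard(kmeans_labels, ward_labels))
print(fowlkes_mallows(kmeans_labels, ward_labels))
```

Both measures depend only on which texts are grouped together, so they are unaffected by how each algorithm happens to number its clusters.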