Research output: Publications in books, reports, collections, conference proceedings › Conference contribution › Research › Peer-reviewed
Bridging dialectal variations in Uzbek texts: A comparative evaluation of modern approaches. / Mengliev, Davlatyor; Abdurakhmonova, Nilufar; Kholmurodova, Iroda et al.
AIP Conference Proceedings. ed. / Niyetbay Uteuliev; Bakhtiyor Khuzhayorov; Bekzodjion Fayziev. Vol. 3377 American Institute of Physics Inc., 2025. 070001 (AIP Conference Proceedings; Vol. 3377, No. 1).
TY - GEN
T1 - Bridging dialectal variations in Uzbek texts: A comparative evaluation of modern approaches
AU - Mengliev, Davlatyor
AU - Abdurakhmonova, Nilufar
AU - Kholmurodova, Iroda
AU - Ibragimov, Bahodir
AU - Latipova, Gulasal
AU - Kadirova, Zebo
N1 - Conference code: 2
PY - 2025/11/7
Y1 - 2025/11/7
N2 - This paper proposes a solution to the problem of identifying dialect words in Uzbek texts using neural network methods focused on the contextual representation of lexical units. As the main tools for comparison, the authors chose models based on the spaCy library and an architecture combining a bidirectional LSTM with a convolutional neural network (CNN). The authors note that existing solutions can neither generalize from the original data nor analyze sentence context, both of which are needed to standardize dialect words accurately into their formal equivalents. After training, the spaCy-based model achieved an accuracy of 90%, a recall of 89%, and an F1-score of 90%, while the biLSTM+CNN combination achieved an accuracy of 92%, a recall of 91%, and an F1-score of 92%. The authors also review existing solutions, describing their approaches and the reasons they cannot cope with the task under study.
AB - This paper proposes a solution to the problem of identifying dialect words in Uzbek texts using neural network methods focused on the contextual representation of lexical units. As the main tools for comparison, the authors chose models based on the spaCy library and an architecture combining a bidirectional LSTM with a convolutional neural network (CNN). The authors note that existing solutions can neither generalize from the original data nor analyze sentence context, both of which are needed to standardize dialect words accurately into their formal equivalents. After training, the spaCy-based model achieved an accuracy of 90%, a recall of 89%, and an F1-score of 90%, while the biLSTM+CNN combination achieved an accuracy of 92%, a recall of 91%, and an F1-score of 92%. The authors also review existing solutions, describing their approaches and the reasons they cannot cope with the task under study.
UR - https://www.scopus.com/pages/publications/105021335880
UR - https://www.mendeley.com/catalogue/912dcb7f-cae7-37fc-aaf0-4b8ff80e382d/
U2 - 10.1063/5.0299774
DO - 10.1063/5.0299774
M3 - Conference contribution
VL - 3377
T3 - AIP Conference Proceedings
BT - AIP Conference Proceedings
A2 - Uteuliev, Niyetbay
A2 - Khuzhayorov, Bakhtiyor
A2 - Fayziev, Bekzodjion
PB - American Institute of Physics Inc.
T2 - Second International Scientific and Practical Conference on Actual Problems of Mathematical Modeling and Information Technology
Y2 - 12 November 2024 through 13 November 2024
ER -
ID: 72347068