Standard

Integrating morphological stemming and syntactic parsing for low-resource Uzbek texts. / Mengliev, Davlatyor; Abdurakhmonova, Nilufar; Barkhnin, Vladimir et al.

AIP Conference Proceedings. ed. / Niyetbay Uteuliev; Bakhtiyor Khuzhayorov; Bekzodjion Fayziev. Vol. 3377 American Institute of Physics Inc., 2025. 040003 (AIP Conference Proceedings; Vol. 3377, No. 1).

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Research; peer-review

Harvard

Mengliev, D, Abdurakhmonova, N, Barkhnin, V, Ibragimov, B, Jurakulova, M, Urazaliyeva, M & Islombekov, B 2025, Integrating morphological stemming and syntactic parsing for low-resource Uzbek texts. in N Uteuliev, B Khuzhayorov & B Fayziev (eds), AIP Conference Proceedings. vol. 3377, 040003, AIP Conference Proceedings, no. 1, vol. 3377, American Institute of Physics Inc., Second International Scientific and Practical Conference on Actual Problems of Mathematical Modeling and Information Technology, Nukus, Uzbekistan, 12.11.2024. https://doi.org/10.1063/5.0299773

APA

Mengliev, D., Abdurakhmonova, N., Barkhnin, V., Ibragimov, B., Jurakulova, M., Urazaliyeva, M., & Islombekov, B. (2025). Integrating morphological stemming and syntactic parsing for low-resource Uzbek texts. In N. Uteuliev, B. Khuzhayorov, & B. Fayziev (Eds.), AIP Conference Proceedings (Vol. 3377). [040003] (AIP Conference Proceedings; Vol. 3377, No. 1). American Institute of Physics Inc. https://doi.org/10.1063/5.0299773

Vancouver

Mengliev D, Abdurakhmonova N, Barkhnin V, Ibragimov B, Jurakulova M, Urazaliyeva M et al. Integrating morphological stemming and syntactic parsing for low-resource Uzbek texts. In Uteuliev N, Khuzhayorov B, Fayziev B, editors, AIP Conference Proceedings. Vol. 3377. American Institute of Physics Inc. 2025. 040003. (AIP Conference Proceedings; 1). doi: 10.1063/5.0299773

Author

Mengliev, Davlatyor ; Abdurakhmonova, Nilufar ; Barkhnin, Vladimir et al. / Integrating morphological stemming and syntactic parsing for low-resource Uzbek texts. AIP Conference Proceedings. editor / Niyetbay Uteuliev ; Bakhtiyor Khuzhayorov ; Bekzodjion Fayziev. Vol. 3377 American Institute of Physics Inc., 2025. (AIP Conference Proceedings; 1).

BibTeX

@inproceedings{a7ba937ea3ea491a83c270180be5dadc,
title = "Integrating morphological stemming and syntactic parsing for low-resource Uzbek texts",
abstract = "In the context of the lack of language resources for the Uzbek language, the development of complex tools for automatic text processing is particularly relevant. This article proposes a hybrid approach that combines preliminary morphological normalization of word forms with subsequent syntactic analysis of Uzbek sentences. At the first stage, stemming and lemmatization are performed using rules and dictionary resources, which allows obtaining canonical forms of words and reducing the degree of ambiguity. At the next stage, a trained model based on a syntactic parser (spaCy) determines grammatical relations between words. Experiments conducted on a corpus of multi-genre Uzbek texts demonstrated an improvement in the quality of syntactic analysis using morphological normalization: accuracy (Precision) reached 92%, recall (Recall) - about 91%, and F1-measure - about 91%. A comparative analysis with a model without preliminary normalization showed a decrease in quality by several percentage points, which emphasizes the rather important role of the morphological stage. The results obtained indicate the prospects of the proposed solution and create a basis for further development of tools for processing Uzbek texts.",
author = "Davlatyor Mengliev and Nilufar Abdurakhmonova and Vladimir Barkhnin and Bahodir Ibragimov and Madina Jurakulova and Mavluda Urazaliyeva and Bozorboy Islombekov",
year = "2025",
month = nov,
day = "7",
doi = "10.1063/5.0299773",
language = "English",
volume = "3377",
series = "AIP Conference Proceedings",
publisher = "American Institute of Physics Inc.",
number = "1",
editor = "Niyetbay Uteuliev and Bakhtiyor Khuzhayorov and Bekzodjion Fayziev",
booktitle = "AIP Conference Proceedings",
address = "United States",
note = "Second International Scientific and Practical Conference on Actual Problems of Mathematical Modeling and Information Technology, APMMIT2024 ; Conference date: 12-11-2024 Through 13-11-2024",

}

RIS

TY - GEN

T1 - Integrating morphological stemming and syntactic parsing for low-resource Uzbek texts

AU - Mengliev, Davlatyor

AU - Abdurakhmonova, Nilufar

AU - Barkhnin, Vladimir

AU - Ibragimov, Bahodir

AU - Jurakulova, Madina

AU - Urazaliyeva, Mavluda

AU - Islombekov, Bozorboy

N1 - Conference code: 2

PY - 2025/11/7

Y1 - 2025/11/7

AB - In the context of the lack of language resources for the Uzbek language, the development of complex tools for automatic text processing is particularly relevant. This article proposes a hybrid approach that combines preliminary morphological normalization of word forms with subsequent syntactic analysis of Uzbek sentences. At the first stage, stemming and lemmatization are performed using rules and dictionary resources, which allows obtaining canonical forms of words and reducing the degree of ambiguity. At the next stage, a trained model based on a syntactic parser (spaCy) determines grammatical relations between words. Experiments conducted on a corpus of multi-genre Uzbek texts demonstrated an improvement in the quality of syntactic analysis using morphological normalization: accuracy (Precision) reached 92%, recall (Recall) - about 91%, and F1-measure - about 91%. A comparative analysis with a model without preliminary normalization showed a decrease in quality by several percentage points, which emphasizes the rather important role of the morphological stage. The results obtained indicate the prospects of the proposed solution and create a basis for further development of tools for processing Uzbek texts.

UR - https://www.scopus.com/pages/publications/105021346124

UR - https://www.mendeley.com/catalogue/ebf97d9e-e85f-3e62-8a4d-a8447a3a53ac/

U2 - 10.1063/5.0299773

DO - 10.1063/5.0299773

M3 - Conference contribution

VL - 3377

T3 - AIP Conference Proceedings

BT - AIP Conference Proceedings

A2 - Uteuliev, Niyetbay

A2 - Khuzhayorov, Bakhtiyor

A2 - Fayziev, Bekzodjion

PB - American Institute of Physics Inc.

T2 - Second International Scientific and Practical Conference on Actual Problems of Mathematical Modeling and Information Technology

Y2 - 12 November 2024 through 13 November 2024

ER -

ID: 72347744