Research output: Publications in books, reports, collections, conference proceedings › Article in conference proceedings › Research › Peer-reviewed
Development of Named Entity Recognition Model for Analysis of Oceanographic Texts in Uzbek Language. / Mengliev, Davlatyor B.; Abdurakhmonova, Nilufar Z.; Barakhnin, Vladimir B. et al.
Proceedings - 4th International Conference on Technological Advancements in Computational Sciences, ICTACS 2024. Institute of Electrical and Electronics Engineers Inc., 2024. pp. 1-5.
TY - GEN
T1 - Development of Named Entity Recognition Model for Analysis of Oceanographic Texts in Uzbek Language
AU - Mengliev, Davlatyor B.
AU - Abdurakhmonova, Nilufar Z.
AU - Barakhnin, Vladimir B.
AU - Kuvondikova, Gavhar I.
AU - Kadirova, Zebo G.
AU - Ibragimov, Bahodir B.
N1 - Conference code: 4
PY - 2024
Y1 - 2024
N2 - This paper presents the development of a language model for recognizing named entities in Uzbek-language texts on oceanology and navigation. The study included a corpus of 5,000 sentences related to oceanology. These sentences contained more than 33,000 manually annotated words. The BIOES scheme was used to label the data, which allowed labeling both single-word entities and entire phrases. The trained model demonstrated effectiveness in recognizing entities such as geographic features, natural phenomena, vehicles, etc. The accuracy of the model when analyzing test texts was 88%, and the recall was 94%. Despite these results, the model showed a decrease in accuracy when analyzing texts from other areas, indicating the need for further improvement. In addition, the authors also conduct a comparative analysis with existing scientific research in this area to create a more relevant solution to the problem. The article discusses the prospects for improving the model and expanding the scope of its application.
AB - This paper presents the development of a language model for recognizing named entities in Uzbek-language texts on oceanology and navigation. The study included a corpus of 5,000 sentences related to oceanology. These sentences contained more than 33,000 manually annotated words. The BIOES scheme was used to label the data, which allowed labeling both single-word entities and entire phrases. The trained model demonstrated effectiveness in recognizing entities such as geographic features, natural phenomena, vehicles, etc. The accuracy of the model when analyzing test texts was 88%, and the recall was 94%. Despite these results, the model showed a decrease in accuracy when analyzing texts from other areas, indicating the need for further improvement. In addition, the authors also conduct a comparative analysis with existing scientific research in this area to create a more relevant solution to the problem. The article discusses the prospects for improving the model and expanding the scope of its application.
KW - Low-Resource Languages
KW - Machine Learning Model
KW - Named Entity Recognition
KW - Natural Language Processing
KW - Oceanography
KW - Text Processing
KW - Uzbek Language
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85218194885&origin=inward&txGid=5ee6e613d5d9835d51e2aa055305794e
UR - https://www.mendeley.com/catalogue/4e010c8c-3a93-3475-8336-9a3019635545/
U2 - 10.1109/ICTACS62700.2024.10840741
DO - 10.1109/ICTACS62700.2024.10840741
M3 - Conference contribution
SN - 979-8-3503-8747-6
SP - 1
EP - 5
BT - Proceedings - 4th International Conference on Technological Advancements in Computational Sciences, ICTACS 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th International Conference on Technological Advancements in Computational Sciences
Y2 - 13 November 2024 through 15 November 2024
ER -
ID: 64856068