Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Dictionary-Based Medical Text Analysis in Uzbek: Overcoming the Low-Resource Challenge. / Mengliev, Davlatyor; Barakhnin, Vladimir; Eshkulov, Mukhriddin et al.
2023 IEEE Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2023 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2023. p. 85-89 (2023 IEEE Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2023 - Proceedings).
TY - GEN
T1 - Dictionary-Based Medical Text Analysis in Uzbek: Overcoming the Low-Resource Challenge
AU - Mengliev, Davlatyor
AU - Barakhnin, Vladimir
AU - Eshkulov, Mukhriddin
AU - Palvanov, Bozorboy
AU - Abdurakhmonova, Nilufar
AU - Khamraeva, Saida
N1 - © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In the rapidly developing field of computational linguistics, processing low-resource languages poses particular difficulties. The problem becomes harder still in medical text processing, where the algorithm must do more subtle work than simply understanding the context of the source text. The article proposes an algorithm for recognizing named entities (symptoms and medications) in medical texts in Uzbek, which is considered a low-resource language. The algorithm first segments the text into sentences and word forms, then compares each word of the source text with a medical dictionary. Words not found there undergo morphological analysis and are compared with a dictionary of word roots. The proposed approach not only speeds up the recognition of medical entities, but also minimizes redundancy and ensures data integrity. By integrating traditional linguistic methodologies with computational methods, this research offers a robust solution for efficient recognition of medical named entities in languages with limited available resources.
KW - Computational Linguistics
KW - Dictionary-Based Extraction
KW - Low-Resource Languages
KW - Medical Entity Recognition
KW - Medical Informatics
KW - Medical Text Processing
KW - Morphological Analysis
KW - Named Entity Recognition
KW - Stemming
KW - Uzbek Language
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85180369041&origin=inward&txGid=9ca32ec2ab037bc4a1e460f0a1041971
UR - https://www.mendeley.com/catalogue/ca5484dc-9ec4-3fb4-a0bd-a4cce7dce75a/
U2 - 10.1109/CSGB60362.2023.10329819
DO - 10.1109/CSGB60362.2023.10329819
M3 - Conference contribution
SN - 9798350307979
T3 - 2023 IEEE Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2023 - Proceedings
SP - 85
EP - 89
BT - 2023 IEEE Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine, CSGB 2023 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine
Y2 - 28 September 2023 through 29 September 2023
ER -
ID: 59454339
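
The pipeline summarized in the abstract (sentence and word segmentation, lookup in a medical dictionary, then morphological analysis and lookup in a dictionary of word roots) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the dictionary entries, the suffix list, the function names, and the naive suffix-stripping step that stands in for morphological analysis are all assumptions made for illustration only.

```python
import re

# Hypothetical toy lexicons; the paper's actual dictionaries are not reproduced here.
MEDICAL_DICT = {"isitma": "SYMPTOM", "aspirin": "MEDICATION"}
ROOT_DICT = {"yo'tal": "SYMPTOM", "paratsetamol": "MEDICATION"}

# A few common Uzbek case and plural suffixes (illustrative, far from exhaustive).
SUFFIXES = ["ning", "lar", "dan", "ni", "ga", "da"]


def split_sentences(text):
    """Split text into sentences on terminal punctuation."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase a sentence and extract word forms (ASCII letters and apostrophes)."""
    return re.findall(r"[a-z']+", sentence.lower())


def strip_suffixes(word):
    """Naive stand-in for morphological analysis: peel known suffixes one at a
    time and return every intermediate candidate root."""
    candidates = []
    current = word
    changed = True
    while changed:
        changed = False
        for suffix in SUFFIXES:
            # Keep at least three characters so short words are not destroyed.
            if current.endswith(suffix) and len(current) > len(suffix) + 2:
                current = current[: -len(suffix)]
                candidates.append(current)
                changed = True
                break
    return candidates


def recognize(text):
    """Return (word form, matched entry, label) triples for recognized entities."""
    entities = []
    for sentence in split_sentences(text):
        for token in tokenize(sentence):
            # Step 1: direct lookup of the word form in the medical dictionary.
            label, match = MEDICAL_DICT.get(token), token
            # Step 2: for undetected words, try candidate roots in the root dictionary.
            if label is None:
                for candidate in strip_suffixes(token):
                    if candidate in ROOT_DICT:
                        label, match = ROOT_DICT[candidate], candidate
                        break
            if label is not None:
                entities.append((token, match, label))
    return entities


if __name__ == "__main__":
    # Illustrative input sentence, not taken from the paper's data.
    print(recognize("Bemorda isitma bor. U paratsetamolni ichdi."))
```

In this sketch, "isitma" is matched directly, while "paratsetamolni" is matched only after the suffix "ni" is stripped. Multi-word entries and a real morphological analyzer are outside the scope of this sketch.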