Standard

Can LLMs Get to the Roots? Evaluating Russian Morpheme Segmentation Capabilities in Large Language Models. / Морозов, Дмитрий Алексеевич; Glazkova, Anna V.; Iomdin, Boris L.

In: Supercomputing Frontiers and Innovations, Vol. 12, No. 3, 25.12.2025, pp. 63-75.

Research output: Scientific publications in periodicals › Article › Peer review

Harvard

Морозов, ДА, Glazkova, AV & Iomdin, BL 2025, 'Can LLMs Get to the Roots? Evaluating Russian Morpheme Segmentation Capabilities in Large Language Models', Supercomputing Frontiers and Innovations, vol. 12, no. 3, pp. 63-75. https://doi.org/10.14529/jsfi250305

APA

Морозов, Д. А., Glazkova, A. V., & Iomdin, B. L. (2025). Can LLMs Get to the Roots? Evaluating Russian Morpheme Segmentation Capabilities in Large Language Models. Supercomputing Frontiers and Innovations, 12(3), 63-75. https://doi.org/10.14529/jsfi250305

Vancouver

Морозов ДА, Glazkova AV, Iomdin BL. Can LLMs Get to the Roots? Evaluating Russian Morpheme Segmentation Capabilities in Large Language Models. Supercomputing Frontiers and Innovations. 2025 Dec 25;12(3):63-75. doi: 10.14529/jsfi250305

Author

Морозов, Дмитрий Алексеевич ; Glazkova, Anna V. ; Iomdin, Boris L. / Can LLMs Get to the Roots? Evaluating Russian Morpheme Segmentation Capabilities in Large Language Models. In: Supercomputing Frontiers and Innovations. 2025 ; Vol. 12, No. 3. pp. 63-75.

BibTeX

@article{746c8efba0e44979871f72582256886a,
title = "Can LLMs Get to the Roots? Evaluating Russian Morpheme Segmentation Capabilities in Large Language Models",
abstract = "Automatic morpheme segmentation, a crucial task for morphologically rich languages like Russian, is persistently hindered by a significant drop in performance on words containing out-of-vocabulary (OOV) roots. This issue affects even state-of-the-art models, such as fine-tuned BERT models. This study investigates the potential of modern Large Language Models (LLMs) to address this challenge, focusing on the specific task of root identification in Russian. We evaluate a diverse set of eight state-of-the-art LLMs, including proprietary and open-weight models, using a prompt-based, few-shot learning approach. The models' performance is benchmarked against strong baselines – a fine-tuned RuRoberta model and a CNN ensemble – on a 500-word test set. Our results demonstrate that one model, Gemini 2.5 Pro, surpasses both baselines by approximately 5 percentage points in root identification accuracy. An examination of the model's reasoning capabilities shows that while it can produce logically sound, etymologically-informed analyses, it is also highly prone to factual hallucinations. This work highlights that while LLMs show significant promise in overcoming the OOV root problem, the inconsistency of their reasoning presents a significant obstacle to their direct application, underscoring the need for further research into improving their factuality and consistency.",
author = "Морозов, {Дмитрий Алексеевич} and Glazkova, {Anna V.} and Iomdin, {Boris L.}",
year = "2025",
month = dec,
day = "25",
doi = "10.14529/jsfi250305",
language = "English",
volume = "12",
pages = "63--75",
journal = "Supercomputing Frontiers and Innovations",
issn = "2409-6008",
publisher = "Южно-Уральский государственный университет",
number = "3",

}
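
The abstract above outlines the evaluation setup: few-shot prompting for root identification, scored as accuracy against gold roots on a 500-word test set. Purely as a rough illustration of that setup, here is a minimal Python sketch of such an evaluation loop; the few-shot examples, the prompt wording, and the query_llm stub are assumptions for illustration, not taken from the paper:

# Minimal sketch of a few-shot root-identification evaluation,
# in the spirit of the setup the abstract describes. `query_llm`
# is a hypothetical stand-in for any of the eight LLMs the study
# actually evaluated.

# Assumed demonstration pairs (word, gold root); the paper's real
# few-shot prompts are not reproduced here.
FEW_SHOT_EXAMPLES = [
    ("водопад", "вод"),
    ("переход", "ход"),
    ("книжка", "книж"),
]

def build_prompt(word: str) -> str:
    """Compose a few-shot prompt asking for the root of a Russian word."""
    lines = ["Identify the root morpheme of each Russian word."]
    for w, root in FEW_SHOT_EXAMPLES:
        lines.append(f"Word: {w}\nRoot: {root}")
    lines.append(f"Word: {word}\nRoot:")
    return "\n\n".join(lines)

def query_llm(prompt: str) -> str:
    """Hypothetical stub; plug in a call to a concrete model API here."""
    raise NotImplementedError

def root_accuracy(test_set: list[tuple[str, str]]) -> float:
    """Fraction of words whose predicted root exactly matches the gold root."""
    correct = 0
    for word, gold_root in test_set:
        predicted = query_llm(build_prompt(word)).strip()
        correct += predicted == gold_root
    return correct / len(test_set)

With a real model behind query_llm, root_accuracy corresponds to the metric the abstract reports: on it, the paper's best model, Gemini 2.5 Pro, surpassed both the RuRoberta and CNN-ensemble baselines by roughly 5 percentage points.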

RIS

TY - JOUR

T1 - Can LLMs Get to the Roots? Evaluating Russian Morpheme Segmentation Capabilities in Large Language Models

AU - Морозов, Дмитрий Алексеевич

AU - Glazkova, Anna V.

AU - Iomdin, Boris L.

PY - 2025/12/25

Y1 - 2025/12/25

N2 - Automatic morpheme segmentation, a crucial task for morphologically rich languages like Russian, is persistently hindered by a significant drop in performance on words containing out-of-vocabulary (OOV) roots. This issue affects even state-of-the-art models, such as fine-tuned BERT models. This study investigates the potential of modern Large Language Models (LLMs) to address this challenge, focusing on the specific task of root identification in Russian. We evaluate a diverse set of eight state-of-the-art LLMs, including proprietary and open-weight models, using a prompt-based, few-shot learning approach. The models' performance is benchmarked against strong baselines – a fine-tuned RuRoberta model and a CNN ensemble – on a 500-word test set. Our results demonstrate that one model, Gemini 2.5 Pro, surpasses both baselines by approximately 5 percentage points in root identification accuracy. An examination of the model's reasoning capabilities shows that while it can produce logically sound, etymologically-informed analyses, it is also highly prone to factual hallucinations. This work highlights that while LLMs show significant promise in overcoming the OOV root problem, the inconsistency of their reasoning presents a significant obstacle to their direct application, underscoring the need for further research into improving their factuality and consistency.

AB - Automatic morpheme segmentation, a crucial task for morphologically rich languages like Russian, is persistently hindered by a significant drop in performance on words containing out-of-vocabulary (OOV) roots. This issue affects even state-of-the-art models, such as fine-tuned BERT models. This study investigates the potential of modern Large Language Models (LLMs) to address this challenge, focusing on the specific task of root identification in Russian. We evaluate a diverse set of eight state-of-the-art LLMs, including proprietary and open-weight models, using a prompt-based, few-shot learning approach. The models' performance is benchmarked against strong baselines – a fine-tuned RuRoberta model and a CNN ensemble – on a 500-word test set. Our results demonstrate that one model, Gemini 2.5 Pro, surpasses both baselines by approximately 5 percentage points in root identification accuracy. An examination of the model's reasoning capabilities shows that while it can produce logically sound, etymologically-informed analyses, it is also highly prone to factual hallucinations. This work highlights that while LLMs show significant promise in overcoming the OOV root problem, the inconsistency of their reasoning presents a significant obstacle to their direct application, underscoring the need for further research into improving their factuality and consistency.

UR - https://www.scopus.com/pages/publications/105029075142

UR - https://www.mendeley.com/catalogue/c1c52f0b-7250-3fcb-b6cb-062e80ebdfd4/

U2 - 10.14529/jsfi250305

DO - 10.14529/jsfi250305

M3 - Article

VL - 12

SP - 63

EP - 75

JO - Supercomputing Frontiers and Innovations

JF - Supercomputing Frontiers and Innovations

SN - 2409-6008

IS - 3

ER -

ID: 74461329