Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian

Standard

Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian. / Glazkova, Anna; Morozov, Dmitry.

Data Analytics and Management in Data Intensive Domains. ed. / Panos Pardalos; Eduard Babkin; Nikolay Zolotykh; Sergey Stupnikov. Springer, 2026. p. 98-111 7 (Communications in Computer and Information Science; Vol. 2641 CCIS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Glazkova, A & Morozov, D 2026, Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian. in P Pardalos, E Babkin, N Zolotykh & S Stupnikov (eds), Data Analytics and Management in Data Intensive Domains, 7, Communications in Computer and Information Science, vol. 2641 CCIS, Springer, pp. 98-111, 26th International Conference Data Analytics and Management in Data Intensive Domains, Nizhny Novgorod, Russian Federation, 23.10.2024. https://doi.org/10.1007/978-3-032-03997-2_7

APA

Glazkova, A., & Morozov, D. (2026). Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian. In P. Pardalos, E. Babkin, N. Zolotykh, & S. Stupnikov (Eds.), Data Analytics and Management in Data Intensive Domains (pp. 98-111). [7] (Communications in Computer and Information Science; Vol. 2641 CCIS). Springer. https://doi.org/10.1007/978-3-032-03997-2_7

Vancouver

Glazkova A, Morozov D. Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian. In Pardalos P, Babkin E, Zolotykh N, Stupnikov S, editors, Data Analytics and Management in Data Intensive Domains. Springer. 2026. p. 98-111. 7. (Communications in Computer and Information Science). doi: 10.1007/978-3-032-03997-2_7

Author

Glazkova, Anna ; Morozov, Dmitry. / Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian. Data Analytics and Management in Data Intensive Domains. editor / Panos Pardalos ; Eduard Babkin ; Nikolay Zolotykh ; Sergey Stupnikov. Springer, 2026. pp. 98-111 (Communications in Computer and Information Science).

BibTeX

@inproceedings{7283bd2739ad464fad02ebbc41c6a339,

title = "Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian",

abstract = "Keyphrase selection plays a pivotal role within the domain of scholarly texts, facilitating efficient information retrieval, summarization, and indexing. In this work, we explored how to apply fine-tuned generative transformer-based models to the specific task of keyphrase selection within Russian scientific texts. We experimented with four distinct generative models, such as ruT5, ruGPT, mT5, and mBART, and evaluated their performance in both in-domain and cross-domain settings. The experiments were conducted on the texts of Russian scientific abstracts from four domains: mathematics \& computer science, history, medicine, and linguistics. The use of generative models, namely mBART, led to gains in in-domain performance (up to 4.9\% in BERTScore, 9.0\% in ROUGE-1, and 12.2\% in F1-score) over three keyphrase extraction baselines for the Russian language. Although the results for cross-domain usage were significantly lower, they still demonstrated the capability to surpass baseline performances in several cases, underscoring the promising potential for further exploration and refinement in this research field.",

keywords = "Keyphrase Selection, Keywords, Scholarly Documents, Sequence-to-sequence Models, Text Generation, Text Summarization, mBART",

author = "Anna Glazkova and Dmitry Morozov",

note = "Glazkova, A., Morozov, D. (2026). Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian. In: Pardalos, P., Babkin, E., Zolotykh, N., Stupnikov, S. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2024. Communications in Computer and Information Science, vol 2641. Springer, Cham. https://doi.org/10.1007/978-3-032-03997-2_7; 26th International Conference Data Analytics and Management in Data Intensive Domains, DAMDID/RCDL 2024 ; Conference date: 23-10-2024 Through 25-10-2024",

year = "2026",

doi = "10.1007/978-3-032-03997-2_7",

language = "English",

isbn = "978-3-032-03996-5",

series = "Communications in Computer and Information Science",

volume = "2641",

publisher = "Springer",

pages = "98--111",

editor = "Panos Pardalos and Eduard Babkin and Nikolay Zolotykh and Sergey Stupnikov",

booktitle = "Data Analytics and Management in Data Intensive Domains",

address = "Cham",

}

RIS

TY - GEN

T1 - Exploring Fine-Tuned Generative Models for Keyphrase Selection: A Case Study for Russian

AU - Glazkova, Anna

AU - Morozov, Dmitry

N1 - Conference code: 26

PY - 2026

Y1 - 2026

AB - Keyphrase selection plays a pivotal role within the domain of scholarly texts, facilitating efficient information retrieval, summarization, and indexing. In this work, we explored how to apply fine-tuned generative transformer-based models to the specific task of keyphrase selection within Russian scientific texts. We experimented with four distinct generative models, such as ruT5, ruGPT, mT5, and mBART, and evaluated their performance in both in-domain and cross-domain settings. The experiments were conducted on the texts of Russian scientific abstracts from four domains: mathematics & computer science, history, medicine, and linguistics. The use of generative models, namely mBART, led to gains in in-domain performance (up to 4.9% in BERTScore, 9.0% in ROUGE-1, and 12.2% in F1-score) over three keyphrase extraction baselines for the Russian language. Although the results for cross-domain usage were significantly lower, they still demonstrated the capability to surpass baseline performances in several cases, underscoring the promising potential for further exploration and refinement in this research field.

KW - Keyphrase Selection

KW - Keywords

KW - Scholarly Documents

KW - Sequence-to-sequence Models

KW - Text Generation

KW - Text Summarization

KW - mBART

UR - https://www.scopus.com/pages/publications/105021001901

UR - https://www.mendeley.com/catalogue/f0b0b62f-fd49-3584-b4bb-de0b7188a4b0/

U2 - 10.1007/978-3-032-03997-2_7

DO - 10.1007/978-3-032-03997-2_7

M3 - Conference contribution

SN - 978-3-032-03996-5

T3 - Communications in Computer and Information Science

SP - 98

EP - 111

BT - Data Analytics and Management in Data Intensive Domains

A2 - Pardalos, Panos

A2 - Babkin, Eduard

A2 - Zolotykh, Nikolay

A2 - Stupnikov, Sergey

PB - Springer

T2 - 26th International Conference Data Analytics and Management in Data Intensive Domains

Y2 - 23 October 2024 through 25 October 2024

ER -

ID: 72143660