Standard

Cross-Domain Robustness of Transformer-Based Keyphrase Generation. / Glazkova, Anna; Morozov, Dmitry.

Communications in Computer and Information Science. Springer Science and Business Media Deutschland GmbH, 2024. pp. 249-265, 19 (Communications in Computer and Information Science; Vol. 2086 CCIS).

Research output: Publications in books, reports, collections, conference proceedings › Conference contribution › Research › Peer-reviewed

Harvard

Glazkova, A & Morozov, D 2024, Cross-Domain Robustness of Transformer-Based Keyphrase Generation. in Communications in Computer and Information Science, 19, Communications in Computer and Information Science, Vol. 2086 CCIS, Springer Science and Business Media Deutschland GmbH, pp. 249-265, 25th International Conference on Data Analytics and Management in Data Intensive Domains, Moscow, Russian Federation, 24.10.2023. https://doi.org/10.1007/978-3-031-67826-4_19

APA

Glazkova, A., & Morozov, D. (2024). Cross-Domain Robustness of Transformer-Based Keyphrase Generation. In Communications in Computer and Information Science (pp. 249-265). [19] (Communications in Computer and Information Science; Vol. 2086 CCIS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-67826-4_19

Vancouver

Glazkova A, Morozov D. Cross-Domain Robustness of Transformer-Based Keyphrase Generation. In Communications in Computer and Information Science. Springer Science and Business Media Deutschland GmbH. 2024. p. 249-265. 19. (Communications in Computer and Information Science). doi: 10.1007/978-3-031-67826-4_19

Author

Glazkova, Anna ; Morozov, Dmitry. / Cross-Domain Robustness of Transformer-Based Keyphrase Generation. Communications in Computer and Information Science. Springer Science and Business Media Deutschland GmbH, 2024. pp. 249-265 (Communications in Computer and Information Science).

BibTeX

@inproceedings{0193ecfc8a594a92a8c7279b0c7eb976,
title = "Cross-Domain Robustness of Transformer-Based Keyphrase Generation",
abstract = "Modern models for text generation show state-of-the-art results in many natural language processing tasks. In this work, we explore the effectiveness of abstractive text summarization models for keyphrase selection. A list of keyphrases is an important element of a text in databases and repositories of electronic documents. In our experiments, abstractive text summarization models fine-tuned for keyphrase generation show quite high results for a target text corpus. However, in most cases, the zero-shot performance on other corpora and domains is significantly lower. We investigate cross-domain limitations of abstractive text summarization models for keyphrase generation. We present an evaluation of the fine-tuned BART models for the keyphrase selection task across six benchmark corpora for keyphrase extraction including scientific texts from two domains and news texts. We explore the role of transfer learning between different domains to improve the BART model performance on small text corpora. Our experiments show that preliminary fine-tuning on out-of-domain corpora can be effective under conditions of a limited number of samples.",
keywords = "BART, Keyphrase extraction, Scholarly document, Text summarization, Transfer learning",
author = "Anna Glazkova and Dmitry Morozov",
note = "Supported by the grant of the President of the Russian Federation no. MK-3118.2022.4.; 25th International Conference on Data Analytics and Management in Data Intensive Domains, DAMDID/RCDL 2023 ; Conference date: 24-10-2023 Through 27-10-2023",
year = "2024",
doi = "10.1007/978-3-031-67826-4_19",
language = "English",
isbn = "978-3-031-67825-7",
series = "Communications in Computer and Information Science",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "249--265",
booktitle = "Communications in Computer and Information Science",
address = "Germany",
}
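The abstract above describes fine-tuning abstractive summarization models (BART) for keyphrase generation. Below is a minimal sketch of that setup using Hugging Face transformers; it is not the authors' code, and the comma-separated target encoding, the facebook/bart-base checkpoint, and all hyperparameters are illustrative assumptions.

# A minimal sketch (not the authors' released code) of fine-tuning BART
# for keyphrase generation. Targets are encoded as a comma-separated
# keyphrase string, a common but here assumed formulation.
from torch.utils.data import Dataset
from transformers import (
    BartForConditionalGeneration,
    BartTokenizerFast,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)


class KeyphraseDataset(Dataset):
    """Pairs each document with its gold keyphrases joined by ', '."""

    def __init__(self, docs, keyphrases, tokenizer, max_src_len=512, max_tgt_len=64):
        self.docs, self.keyphrases, self.tok = docs, keyphrases, tokenizer
        self.max_src_len, self.max_tgt_len = max_src_len, max_tgt_len

    def __len__(self):
        return len(self.docs)

    def __getitem__(self, i):
        features = self.tok(self.docs[i], truncation=True, max_length=self.max_src_len)
        target = self.tok(", ".join(self.keyphrases[i]),
                          truncation=True, max_length=self.max_tgt_len)
        features["labels"] = target["input_ids"]
        return features


tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Toy single-document corpus; the paper fine-tunes and evaluates on six
# benchmark keyphrase corpora (scientific texts from two domains and news).
train_set = KeyphraseDataset(
    docs=["We study abstractive summarization models for keyphrase selection."],
    keyphrases=[["keyphrase generation", "text summarization"]],
    tokenizer=tokenizer,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="bart-keyphrase",
        num_train_epochs=3,
        per_device_train_batch_size=4,
        learning_rate=3e-5,
    ),
    train_dataset=train_set,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),  # pads labels with -100
)
trainer.train()

The transfer-learning setting the abstract reports would correspond to running this fine-tuning first on a large out-of-domain corpus and then re-running it on the small target corpus, starting from the intermediate checkpoint.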

RIS

TY - GEN

T1 - Cross-Domain Robustness of Transformer-Based Keyphrase Generation

AU - Glazkova, Anna

AU - Morozov, Dmitry

N1 - Conference code: 25

PY - 2024

Y1 - 2024

N2 - Modern models for text generation show state-of-the-art results in many natural language processing tasks. In this work, we explore the effectiveness of abstractive text summarization models for keyphrase selection. A list of keyphrases is an important element of a text in databases and repositories of electronic documents. In our experiments, abstractive text summarization models fine-tuned for keyphrase generation show quite high results for a target text corpus. However, in most cases, the zero-shot performance on other corpora and domains is significantly lower. We investigate cross-domain limitations of abstractive text summarization models for keyphrase generation. We present an evaluation of the fine-tuned BART models for the keyphrase selection task across six benchmark corpora for keyphrase extraction including scientific texts from two domains and news texts. We explore the role of transfer learning between different domains to improve the BART model performance on small text corpora. Our experiments show that preliminary fine-tuning on out-of-domain corpora can be effective under conditions of a limited number of samples.

AB - Modern models for text generation show state-of-the-art results in many natural language processing tasks. In this work, we explore the effectiveness of abstractive text summarization models for keyphrase selection. A list of keyphrases is an important element of a text in databases and repositories of electronic documents. In our experiments, abstractive text summarization models fine-tuned for keyphrase generation show quite high results for a target text corpus. However, in most cases, the zero-shot performance on other corpora and domains is significantly lower. We investigate cross-domain limitations of abstractive text summarization models for keyphrase generation. We present an evaluation of the fine-tuned BART models for the keyphrase selection task across six benchmark corpora for keyphrase extraction including scientific texts from two domains and news texts. We explore the role of transfer learning between different domains to improve the BART model performance on small text corpora. Our experiments show that preliminary fine-tuning on out-of-domain corpora can be effective under conditions of a limited number of samples.

KW - BART

KW - Keyphrase extraction

KW - Scholarly document

KW - Text summarization

KW - Transfer learning

UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85206363258&origin=inward&txGid=9723b476eda80e86bb31d2c08110c0e4

UR - https://www.mendeley.com/catalogue/4722ccdf-ff68-3981-942a-c448b4f3fd51/

U2 - 10.1007/978-3-031-67826-4_19

DO - 10.1007/978-3-031-67826-4_19

M3 - Conference contribution

SN - 978-3-031-67825-7

T3 - Communications in Computer and Information Science

SP - 249

EP - 265

BT - Communications in Computer and Information Science

PB - Springer Science and Business Media Deutschland GmbH

T2 - 25th International Conference on Data Analytics and Management in Data Intensive Domains

Y2 - 24 October 2023 through 27 October 2023

ER -
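Since the abstract reports evaluation across six benchmark corpora, the sketch below shows the kind of exact-match F1 commonly used for keyphrase evaluation. The paper's exact protocol is not reproduced here; in particular, lowercasing stands in for the stemming-based normalization typical in this literature.

# A minimal sketch of exact-match F1 between generated and gold keyphrase
# lists. Real keyphrase evaluation usually stems phrases before matching;
# lowercasing here is a simplifying assumption.
def keyphrase_f1(predicted: list[str], gold: list[str]) -> float:
    pred = {p.strip().lower() for p in predicted}
    ref = {g.strip().lower() for g in gold}
    matched = len(pred & ref)
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)


# Example: one of two predictions matches one of two gold phrases -> F1 = 0.5.
print(keyphrase_f1(["BART", "transfer learning"],
                   ["transfer learning", "keyphrase extraction"]))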
