Standard

NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links. / Loukachevitch, Natalia; Artemova, Ekaterina; Batura, Tatiana et al.

In: Language Resources and Evaluation, Vol. 58, No. 2, 06.2024, p. 547-583.

Research output: Contribution to journal › Article › peer-review

Harvard

Loukachevitch, N, Artemova, E, Batura, T, Braslavski, P, Ivanov, V, Manandhar, S, Pugachev, A, Rozhkov, I, Shelmanov, A, Tutubalina, E & Yandutov, A 2024, 'NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links', Language Resources and Evaluation, vol. 58, no. 2, pp. 547-583. https://doi.org/10.1007/s10579-023-09674-z

APA

Loukachevitch, N., Artemova, E., Batura, T., Braslavski, P., Ivanov, V., Manandhar, S., Pugachev, A., Rozhkov, I., Shelmanov, A., Tutubalina, E., & Yandutov, A. (2024). NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links. Language Resources and Evaluation, 58(2), 547-583. https://doi.org/10.1007/s10579-023-09674-z

Vancouver

Loukachevitch N, Artemova E, Batura T, Braslavski P, Ivanov V, Manandhar S et al. NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links. Language Resources and Evaluation. 2024 Jun;58(2):547-583. doi: 10.1007/s10579-023-09674-z

Author

Loukachevitch, Natalia ; Artemova, Ekaterina ; Batura, Tatiana et al. / NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links. In: Language Resources and Evaluation. 2024 ; Vol. 58, No. 2. pp. 547-583.

BibTeX

@article{d5030b3d26ce4dd6bb3cb8f70376e671,
title = "NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links",
abstract = "This paper describes NEREL—a Russian news dataset suited for three tasks: nested named entity recognition, relation extraction, and entity linking. Compared to flat entities, nested named entities provide a richer and more complete annotation while also increasing the coverage of relations annotation and entity linking. Relations between nested named entities may cross entity boundaries to connect to shorter entities nested within longer ones, which makes it harder to detect such relations. NEREL is currently the largest Russian dataset annotated with entities and relations: it comprises 29 named entity types and 49 relation types. At the time of writing, the dataset contains 56 K named entities and 39 K relations annotated in 933 person-oriented news articles. NEREL is annotated with relations at three levels: (1) within nested named entities, (2) within sentences, and (3) with relations crossing sentence boundaries. We provide benchmark evaluation of current state-of-the-art methods in all three tasks. The dataset is freely available at https://github.com/nerel-ds/NEREL .",
keywords = "Entity linking, Named entity recognition, Nested entities, Nested relations, Relation extraction, 68T35, 68T50",
author = "Natalia Loukachevitch and Ekaterina Artemova and Tatiana Batura and Pavel Braslavski and Vladimir Ivanov and Suresh Manandhar and Alexander Pugachev and Igor Rozhkov and Artem Shelmanov and Elena Tutubalina and Alexey Yandutov",
note = "The work is supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Ivannikov Institute for System Programming of the Russian Academy of Sciences dated November 2, 2021 No. 70-2021-00142. Publication for correction.",
year = "2024",
month = jun,
doi = "10.1007/s10579-023-09674-z",
language = "English",
volume = "58",
pages = "547--583",
journal = "Language Resources and Evaluation",
issn = "1574-0218",
number = "2",

}

RIS

TY - JOUR

T1 - NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links

AU - Loukachevitch, Natalia

AU - Artemova, Ekaterina

AU - Batura, Tatiana

AU - Braslavski, Pavel

AU - Ivanov, Vladimir

AU - Manandhar, Suresh

AU - Pugachev, Alexander

AU - Rozhkov, Igor

AU - Shelmanov, Artem

AU - Tutubalina, Elena

AU - Yandutov, Alexey

N1 - The work is supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Ivannikov Institute for System Programming of the Russian Academy of Sciences dated November 2, 2021 No. 70-2021-00142. Publication for correction.

PY - 2024/6

Y1 - 2024/6

N2 - This paper describes NEREL—a Russian news dataset suited for three tasks: nested named entity recognition, relation extraction, and entity linking. Compared to flat entities, nested named entities provide a richer and more complete annotation while also increasing the coverage of relations annotation and entity linking. Relations between nested named entities may cross entity boundaries to connect to shorter entities nested within longer ones, which makes it harder to detect such relations. NEREL is currently the largest Russian dataset annotated with entities and relations: it comprises 29 named entity types and 49 relation types. At the time of writing, the dataset contains 56 K named entities and 39 K relations annotated in 933 person-oriented news articles. NEREL is annotated with relations at three levels: (1) within nested named entities, (2) within sentences, and (3) with relations crossing sentence boundaries. We provide benchmark evaluation of current state-of-the-art methods in all three tasks. The dataset is freely available at https://github.com/nerel-ds/NEREL .

AB - This paper describes NEREL—a Russian news dataset suited for three tasks: nested named entity recognition, relation extraction, and entity linking. Compared to flat entities, nested named entities provide a richer and more complete annotation while also increasing the coverage of relations annotation and entity linking. Relations between nested named entities may cross entity boundaries to connect to shorter entities nested within longer ones, which makes it harder to detect such relations. NEREL is currently the largest Russian dataset annotated with entities and relations: it comprises 29 named entity types and 49 relation types. At the time of writing, the dataset contains 56 K named entities and 39 K relations annotated in 933 person-oriented news articles. NEREL is annotated with relations at three levels: (1) within nested named entities, (2) within sentences, and (3) with relations crossing sentence boundaries. We provide benchmark evaluation of current state-of-the-art methods in all three tasks. The dataset is freely available at https://github.com/nerel-ds/NEREL .

KW - Entity linking

KW - Named entity recognition

KW - Nested entities

KW - Nested relations

KW - Relation extraction

KW - 68T35

KW - 68T50

UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85171757390&origin=inward&txGid=1b19e870f0d3cb73630957ea4947ec92

UR - https://www.mendeley.com/catalogue/870f1879-7284-36d6-8eba-b3d8c378a277/

U2 - 10.1007/s10579-023-09674-z

DO - 10.1007/s10579-023-09674-z

M3 - Article

VL - 58

SP - 547

EP - 583

JO - Language Resources and Evaluation

JF - Language Resources and Evaluation

SN - 1574-0218

IS - 2

ER -

ID: 59174838