Research output: Publications in periodicals › conference article › peer-reviewed
RUREBUS-2020: A COMPETITION ON RELATION EXTRACTION IN A BUSINESS SETTING. / Ivanin, V. A.; Artemova, E. L.; Batura, T. V. et al.
In: Komp'juternaja Lingvistika i Intellektual'nye Tehnologii, Vol. 2020-June, No. 19, 2020, pp. 416-431.
}
TY - JOUR
T1 - RUREBUS-2020
T2 - 2020 Annual International Conference on Computational Linguistics and Intellectual Technologies, Dialogue 2020
AU - Ivanin, V. A.
AU - Artemova, E. L.
AU - Batura, T. V.
AU - Ivanov, V. V.
AU - Sarkisyan, V. V.
AU - Tutubalina, E. V.
AU - Smurov, I. M.
N1 - Funding Information: Work on maintenance of the annotation system, discussions of results, and manuscript preparation was carried out by Elena Tutubalina, Vladimir Ivanov, Tatiana Batura and supported by the Russian Science Foundation grant no. 20-11-20166. Ekaterina Artemova and Veronika Sarkisyan worked on text annotation, discussions of results, and manuscript. Their work was supported by the framework of the HSE University Basic Research Program and Russian Academic Excellence Project “5-100”. Publisher Copyright: © 2020 ABBYY PRODUCTION LLC. All rights reserved. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020
Y1 - 2020
N2 - In this paper, we present a shared task on core information extraction problems, named entity recognition and relation extraction. In contrast to popular shared tasks on related problems, we try to move away from strictly academic rigor and rather model a business case. As a source for textual data we choose the corpus of Russian strategic documents, which we annotated according to our own annotation scheme. To speed up the annotation process, we exploit various active learning techniques. In total we ended up with more than two hundred annotated documents. Thus we managed to create a high-quality data set in a short time. The shared task consisted of three tracks, devoted to 1) named entity recognition, 2) relation extraction and 3) joint named entity recognition and relation extraction. We provided the annotated texts as well as a set of unannotated texts, which could have been used in any way to improve solutions. In the paper we overview and compare solutions submitted by the shared task participants. We release both raw and annotated corpora along with annotation guidelines, evaluation scripts and results at https://github.com/dialogue-evaluation/RuREBus.
AB - In this paper, we present a shared task on core information extraction problems, named entity recognition and relation extraction. In contrast to popular shared tasks on related problems, we try to move away from strictly academic rigor and rather model a business case. As a source for textual data we choose the corpus of Russian strategic documents, which we annotated according to our own annotation scheme. To speed up the annotation process, we exploit various active learning techniques. In total we ended up with more than two hundred annotated documents. Thus we managed to create a high-quality data set in a short time. The shared task consisted of three tracks, devoted to 1) named entity recognition, 2) relation extraction and 3) joint named entity recognition and relation extraction. We provided the annotated texts as well as a set of unannotated texts, which could have been used in any way to improve solutions. In the paper we overview and compare solutions submitted by the shared task participants. We release both raw and annotated corpora along with annotation guidelines, evaluation scripts and results at https://github.com/dialogue-evaluation/RuREBus.
KW - BERT
KW - Named entity recognition
KW - Relation extraction
KW - Russian fine-tuning
KW - Shared task
UR - http://www.scopus.com/inward/record.url?scp=85093820206&partnerID=8YFLogxK
U2 - 10.28995/2075-7182-2020-19-416-431
DO - 10.28995/2075-7182-2020-19-416-431
M3 - Conference article
AN - SCOPUS:85093820206
VL - 2020-June
SP - 416
EP - 431
JO - Компьютерная лингвистика и интеллектуальные технологии
JF - Компьютерная лингвистика и интеллектуальные технологии
SN - 2221-7932
IS - 19
Y2 - 17 June 2020 through 20 June 2020
ER -
ID: 25999437