Standard

Layout logical labelling and finding the semantic relationships between citing and cited paper content. / Parinov, Sergey; Bakarov, Amir; Vodolazsky, Daniil.

In: International Journal of Metadata, Semantics and Ontologies, Vol. 14, No. 1, 01.01.2020, p. 54-62.

Research output: Contribution to journal › Article › peer-review

Harvard

Parinov, S, Bakarov, A & Vodolazsky, D 2020, 'Layout logical labelling and finding the semantic relationships between citing and cited paper content', International Journal of Metadata, Semantics and Ontologies, vol. 14, no. 1, pp. 54-62. https://doi.org/10.1504/IJMSO.2020.107796

APA

Parinov, S., Bakarov, A., & Vodolazsky, D. (2020). Layout logical labelling and finding the semantic relationships between citing and cited paper content. International Journal of Metadata, Semantics and Ontologies, 14(1), 54-62. https://doi.org/10.1504/IJMSO.2020.107796

Vancouver

Parinov S, Bakarov A, Vodolazsky D. Layout logical labelling and finding the semantic relationships between citing and cited paper content. International Journal of Metadata, Semantics and Ontologies. 2020 Jan 1;14(1):54-62. doi: 10.1504/IJMSO.2020.107796

Author

Parinov, Sergey ; Bakarov, Amir ; Vodolazsky, Daniil. / Layout logical labelling and finding the semantic relationships between citing and cited paper content. In: International Journal of Metadata, Semantics and Ontologies. 2020 ; Vol. 14, No. 1. pp. 54-62.

BibTeX

@article{94c1532943da4e5b8fcaca05e4fce4df,
title = "Layout logical labelling and finding the semantic relationships between citing and cited paper content",
abstract = "Currently, large data sets of in-text citations and citation contexts are becoming available for research and developing tools. Using the “topic model” method to analyse these data, one can characterise thematic relationships between citation contexts from citing and the cited paper content. However, to build relevant topic models and to compare them accurately for papers linked by citation relationships we have to know the semantic labels of PDF papers' layout such as section titles, paragraph boundaries, etc. Recent achievements in papers' conversion from a PDF form into a rich attributed JSON format allow us to develop new approaches for the logical labelling of the papers' layout. This paper presents a re-usable method and open source software for the logical labelling of PDF papers, which gave good quality of a layout element's recognition for a set of research papers. Using these semantic labels we made a precise comparison of topic models built for citing and cited papers and we found some level of similarity between them.",
keywords = "Cirtec project, Citation contexts, Hierarchical topic models, In-text citation, Logical labelling, Research paper layout recognition",
author = "Sergey Parinov and Amir Bakarov and Daniil Vodolazsky",
year = "2020",
month = jan,
day = "1",
doi = "10.1504/IJMSO.2020.107796",
language = "English",
volume = "14",
pages = "54--62",
journal = "International Journal of Metadata, Semantics and Ontologies",
issn = "1744-2621",
publisher = "Inderscience Enterprises Ltd",
number = "1",

}

RIS

TY - JOUR

T1 - Layout logical labelling and finding the semantic relationships between citing and cited paper content

AU - Parinov, Sergey

AU - Bakarov, Amir

AU - Vodolazsky, Daniil

PY - 2020/1/1

Y1 - 2020/1/1

N2 - Currently, large data sets of in-text citations and citation contexts are becoming available for research and developing tools. Using the “topic model” method to analyse these data, one can characterise thematic relationships between citation contexts from citing and the cited paper content. However, to build relevant topic models and to compare them accurately for papers linked by citation relationships we have to know the semantic labels of PDF papers' layout such as section titles, paragraph boundaries, etc. Recent achievements in papers' conversion from a PDF form into a rich attributed JSON format allow us to develop new approaches for the logical labelling of the papers' layout. This paper presents a re-usable method and open source software for the logical labelling of PDF papers, which gave good quality of a layout element's recognition for a set of research papers. Using these semantic labels we made a precise comparison of topic models built for citing and cited papers and we found some level of similarity between them.

AB - Currently, large data sets of in-text citations and citation contexts are becoming available for research and developing tools. Using the “topic model” method to analyse these data, one can characterise thematic relationships between citation contexts from citing and the cited paper content. However, to build relevant topic models and to compare them accurately for papers linked by citation relationships we have to know the semantic labels of PDF papers' layout such as section titles, paragraph boundaries, etc. Recent achievements in papers' conversion from a PDF form into a rich attributed JSON format allow us to develop new approaches for the logical labelling of the papers' layout. This paper presents a re-usable method and open source software for the logical labelling of PDF papers, which gave good quality of a layout element's recognition for a set of research papers. Using these semantic labels we made a precise comparison of topic models built for citing and cited papers and we found some level of similarity between them.

KW - Cirtec project

KW - Citation contexts

KW - Hierarchical topic models

KW - In-text citation

KW - Logical labelling

KW - Research paper layout recognition

UR - http://www.scopus.com/inward/record.url?scp=85087985924&partnerID=8YFLogxK

U2 - 10.1504/IJMSO.2020.107796

DO - 10.1504/IJMSO.2020.107796

M3 - Article

AN - SCOPUS:85087985924

VL - 14

SP - 54

EP - 62

JO - International Journal of Metadata, Semantics and Ontologies

JF - International Journal of Metadata, Semantics and Ontologies

SN - 1744-2621

IS - 1

ER -

ID: 24815507