О географической привязке контента текстовых документов [On the geographic referencing of the content of text documents]

Standard

О географической привязке контента текстовых документов. / Zhizhimov, Oleg L.; Leonova, Yulia V.

In: CEUR Workshop Proceedings, Vol. 2534, 12.01.2020, p. 241-247.

Research output: Contribution to journal › Conference article › peer-review

BibTeX

@article{918fa2729eed48c8a5cecdbe25204401,

title = "О географической привязке контента текстовых документов",

abstract = "Extracting geographical names from arbitrary text documents is important when processing large document collections and linking their content to a specific geographic region. In its simplest form, the model for extracting geographical names from text is a sequence of processing stages, each of which solves its own task. These tasks include parsing the text, analyzing text elements, processing synonyms and abbreviations, reducing text elements to their normal form from possible word forms according to grammar rules, comparing text elements with entries in dictionaries of geographical names, and adding special tags to the text for unambiguous identification of geographical names. This work describes a technology that implements these tasks on top of the freely distributed PostgreSQL DBMS. The standard configuration is used, and all server-side settings are made through documented procedures. The GeoNames Gazetteer database, OpenStreetMap (OSM) databases, and the OKATO and КЛАДР classifiers are used as authoritative sources of geographical names.",

keywords = "Full-text search, Geographical names, Geographical search, Model of extraction of names, PostgreSQL, Text processing",

author = "Zhizhimov, {Oleg L.} and Leonova, {Yulia V.}",

note = "Publisher Copyright: Copyright {\textcopyright} 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Copyright: Copyright 2021 Elsevier B.V., All rights reserved.; 2019 All-Russian Conference {"}Spatial Data Processing for Monitoring of Natural and Anthropogenic Processes{"} ; Conference date: 26-08-2019 Through 30-08-2019",

year = "2020",

month = jan,

day = "12",

language = "Russian",

volume = "2534",

pages = "241--247",

journal = "CEUR Workshop Proceedings",

issn = "1613-0073",

publisher = "CEUR-WS",

}

RIS

TY - JOUR

T1 - О географической привязке контента текстовых документов

AU - Zhizhimov, Oleg L.

AU - Leonova, Yulia V.

N1 - Publisher Copyright: Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Copyright: Copyright 2021 Elsevier B.V., All rights reserved.

PY - 2020/1/12

Y1 - 2020/1/12

N2 - Extracting geographical names from arbitrary text documents is important when processing large document collections and linking their content to a specific geographic region. In its simplest form, the model for extracting geographical names from text is a sequence of processing stages, each of which solves its own task. These tasks include parsing the text, analyzing text elements, processing synonyms and abbreviations, reducing text elements to their normal form from possible word forms according to grammar rules, comparing text elements with entries in dictionaries of geographical names, and adding special tags to the text for unambiguous identification of geographical names. This work describes a technology that implements these tasks on top of the freely distributed PostgreSQL DBMS. The standard configuration is used, and all server-side settings are made through documented procedures. The GeoNames Gazetteer database, OpenStreetMap (OSM) databases, and the OKATO and КЛАДР classifiers are used as authoritative sources of geographical names.

AB - Extracting geographical names from arbitrary text documents is important when processing large document collections and linking their content to a specific geographic region. In its simplest form, the model for extracting geographical names from text is a sequence of processing stages, each of which solves its own task. These tasks include parsing the text, analyzing text elements, processing synonyms and abbreviations, reducing text elements to their normal form from possible word forms according to grammar rules, comparing text elements with entries in dictionaries of geographical names, and adding special tags to the text for unambiguous identification of geographical names. This work describes a technology that implements these tasks on top of the freely distributed PostgreSQL DBMS. The standard configuration is used, and all server-side settings are made through documented procedures. The GeoNames Gazetteer database, OpenStreetMap (OSM) databases, and the OKATO and КЛАДР classifiers are used as authoritative sources of geographical names.

KW - Full-text search

KW - Geographical names

KW - Geographical search

KW - Model of extraction of names

KW - PostgreSQL

KW - Text processing

UR - http://www.scopus.com/inward/record.url?scp=85078519428&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85078519428

VL - 2534

SP - 241

EP - 247

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

SN - 1613-0073

T2 - 2019 All-Russian Conference "Spatial Data Processing for Monitoring of Natural and Anthropogenic Processes"

Y2 - 26 August 2019 through 30 August 2019

ER -

ID: 25095877