Standard

Automated detection of non-relevant posts on the Russian imageboard “2ch”: Importance of the choice of word representations. / Bakarov, Amir; Gureenkova, Olga.

Analysis of Images, Social Networks and Texts - 6th International Conference, AIST 2017, Revised Selected Papers. ed. / WMP VanDerAalst; DI Ignatov; M Khachay; SO Kuznetsov; Lempitsky; IA Lomazova; N Loukachevitch; A Napoli; A Panchenko; PM Pardalos; AV Savchenko; S Wasserman. Springer-Verlag GmbH and Co. KG, 2018. p. 16-21 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10716 LNCS).

Research output: Publications in books, reports, collections, conference proceedings › conference paper › research › peer-reviewed

Harvard

Bakarov, A & Gureenkova, O 2018, Automated detection of non-relevant posts on the Russian imageboard “2ch”: Importance of the choice of word representations. in WMP VanDerAalst, DI Ignatov, M Khachay, SO Kuznetsov, Lempitsky, IA Lomazova, N Loukachevitch, A Napoli, A Panchenko, PM Pardalos, AV Savchenko & S Wasserman (eds), Analysis of Images, Social Networks and Texts - 6th International Conference, AIST 2017, Revised Selected Papers. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10716 LNCS, Springer-Verlag GmbH and Co. KG, pp. 16-21, 6th International Conference on Analysis of Images, Social Networks and Texts, AIST 2017, Moscow, Russian Federation, 27.07.2017. https://doi.org/10.1007/978-3-319-73013-4_2

APA

Bakarov, A., & Gureenkova, O. (2018). Automated detection of non-relevant posts on the Russian imageboard “2ch”: Importance of the choice of word representations. In WMP. VanDerAalst, DI. Ignatov, M. Khachay, SO. Kuznetsov, Lempitsky, IA. Lomazova, N. Loukachevitch, A. Napoli, A. Panchenko, PM. Pardalos, AV. Savchenko, & S. Wasserman (Eds.), Analysis of Images, Social Networks and Texts - 6th International Conference, AIST 2017, Revised Selected Papers (pp. 16-21). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10716 LNCS). Springer-Verlag GmbH and Co. KG. https://doi.org/10.1007/978-3-319-73013-4_2

Vancouver

Bakarov A, Gureenkova O. Automated detection of non-relevant posts on the Russian imageboard “2ch”: Importance of the choice of word representations. In VanDerAalst WMP, Ignatov DI, Khachay M, Kuznetsov SO, Lempitsky, Lomazova IA, Loukachevitch N, Napoli A, Panchenko A, Pardalos PM, Savchenko AV, Wasserman S, editors, Analysis of Images, Social Networks and Texts - 6th International Conference, AIST 2017, Revised Selected Papers. Springer-Verlag GmbH and Co. KG. 2018. p. 16-21. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-319-73013-4_2

Author

Bakarov, Amir ; Gureenkova, Olga. / Automated detection of non-relevant posts on the Russian imageboard “2ch” : Importance of the choice of word representations. Analysis of Images, Social Networks and Texts - 6th International Conference, AIST 2017, Revised Selected Papers. editor / WMP VanDerAalst ; DI Ignatov ; M Khachay ; SO Kuznetsov ; Lempitsky ; IA Lomazova ; N Loukachevitch ; A Napoli ; A Panchenko ; PM Pardalos ; AV Savchenko ; S Wasserman. Springer-Verlag GmbH and Co. KG, 2018. pp. 16-21 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

BibTeX

@inproceedings{aafdd81284da4db982407a1df933d78d,
title = "Automated detection of non-relevant posts on the Russian imageboard “2ch”: Importance of the choice of word representations",
abstract = "This study considers the problem of automated detection of non-relevant posts on Web forums and discusses an approach that resolves this problem by approximating it with the task of detecting semantic relatedness between a given post and the opening post of the forum discussion thread. The approximated task can be resolved by training a supervised classifier on composed word embeddings of the two posts. Considering that success in this task could be quite sensitive to the choice of word representations, we propose a comparison of the performance of different word embedding models. We train 7 models (Word2Vec, GloVe, Word2Vec-f, Wang2Vec, AdaGram, FastText, Swivel), evaluate the embeddings they produce on a dataset of human judgements, and compare their performance on the task of non-relevant post detection. To make the comparison, we propose a dataset of semantic relatedness with posts from one of the most popular Russian Web forums, the imageboard “2ch”, which has challenging lexical and grammatical features.",
keywords = "2ch, Compositional semantics, Distributional semantics, Imageboard, Semantic relatedness, Word embeddings, Word similarity",
author = "Amir Bakarov and Olga Gureenkova",
year = "2018",
month = jan,
day = "1",
doi = "10.1007/978-3-319-73013-4_2",
language = "English",
isbn = "9783319730127",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag GmbH and Co. KG",
pages = "16--21",
editor = "WMP VanDerAalst and DI Ignatov and M Khachay and SO Kuznetsov and Lempitsky and IA Lomazova and N Loukachevitch and A Napoli and A Panchenko and PM Pardalos and AV Savchenko and S Wasserman",
booktitle = "Analysis of Images, Social Networks and Texts - 6th International Conference, AIST 2017, Revised Selected Papers",
address = "Germany",
note = "6th International Conference on Analysis of Images, Social Networks and Texts, AIST 2017 ; Conference date: 27-07-2017 Through 29-07-2017",

}

RIS

TY - GEN

T1 - Automated detection of non-relevant posts on the Russian imageboard “2ch”

T2 - 6th International Conference on Analysis of Images, Social Networks and Texts, AIST 2017

AU - Bakarov, Amir

AU - Gureenkova, Olga

PY - 2018/1/1

Y1 - 2018/1/1

N2 - This study considers the problem of automated detection of non-relevant posts on Web forums and discusses an approach that resolves this problem by approximating it with the task of detecting semantic relatedness between a given post and the opening post of the forum discussion thread. The approximated task can be resolved by training a supervised classifier on composed word embeddings of the two posts. Considering that success in this task could be quite sensitive to the choice of word representations, we propose a comparison of the performance of different word embedding models. We train 7 models (Word2Vec, GloVe, Word2Vec-f, Wang2Vec, AdaGram, FastText, Swivel), evaluate the embeddings they produce on a dataset of human judgements, and compare their performance on the task of non-relevant post detection. To make the comparison, we propose a dataset of semantic relatedness with posts from one of the most popular Russian Web forums, the imageboard “2ch”, which has challenging lexical and grammatical features.

AB - This study considers the problem of automated detection of non-relevant posts on Web forums and discusses an approach that resolves this problem by approximating it with the task of detecting semantic relatedness between a given post and the opening post of the forum discussion thread. The approximated task can be resolved by training a supervised classifier on composed word embeddings of the two posts. Considering that success in this task could be quite sensitive to the choice of word representations, we propose a comparison of the performance of different word embedding models. We train 7 models (Word2Vec, GloVe, Word2Vec-f, Wang2Vec, AdaGram, FastText, Swivel), evaluate the embeddings they produce on a dataset of human judgements, and compare their performance on the task of non-relevant post detection. To make the comparison, we propose a dataset of semantic relatedness with posts from one of the most popular Russian Web forums, the imageboard “2ch”, which has challenging lexical and grammatical features.

KW - 2ch

KW - Compositional semantics

KW - Distributional semantics

KW - Imageboard

KW - Semantic relatedness

KW - Word embeddings

KW - Word similarity

UR - http://www.scopus.com/inward/record.url?scp=85039438003&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-73013-4_2

DO - 10.1007/978-3-319-73013-4_2

M3 - Conference contribution

AN - SCOPUS:85039438003

SN - 9783319730127

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 16

EP - 21

BT - Analysis of Images, Social Networks and Texts - 6th International Conference, AIST 2017, Revised Selected Papers

A2 - VanDerAalst, WMP

A2 - Ignatov, DI

A2 - Khachay, M

A2 - Kuznetsov, SO

A2 - Lempitsky

A2 - Lomazova, IA

A2 - Loukachevitch, N

A2 - Napoli, A

A2 - Panchenko, A

A2 - Pardalos, PM

A2 - Savchenko, AV

A2 - Wasserman, S

PB - Springer-Verlag GmbH and Co. KG

Y2 - 27 July 2017 through 29 July 2017

ER -

ID: 12099770