Research output: Contribution to journal › Article › peer-review
Information-Theoretic method for classification of texts. / Ryabko, B. Ya; Gus’kov, A. E.; Selivanova, I. V.
In: Problems of Information Transmission, Vol. 53, No. 3, 01.07.2017, p. 294-304.
TY - JOUR
T1 - Information-Theoretic method for classification of texts
AU - Ryabko, B. Ya
AU - Gus’kov, A. E.
AU - Selivanova, I. V.
PY - 2017/7/1
Y1 - 2017/7/1
AB - We consider a method for automatic (i.e., unmanned) text classification based on methods of universal source coding (or “data compression”). We show that under certain restrictions the proposed method is consistent, i.e., the classification error tends to zero with increasing text lengths. As an example of practical use of the method we consider the classification problem for scientific texts (research papers, books, etc.). The proposed method is experimentally shown to be highly efficient.
UR - http://www.scopus.com/inward/record.url?scp=85031754667&partnerID=8YFLogxK
U2 - 10.1134/S0032946017030115
DO - 10.1134/S0032946017030115
M3 - Article
AN - SCOPUS:85031754667
VL - 53
SP - 294
EP - 304
JO - Problems of Information Transmission
JF - Problems of Information Transmission
SN - 0032-9460
IS - 3
ER -
ID: 9410463