Research output: Contribution to journal › Article › peer-review
A new method for detecting multiple text change points. / Abebe, Berhane.
In: Glottotheory, 14.04.2025.
TY - JOUR
T1 - A new method for detecting multiple text change points
AU - Abebe, Berhane
PY - 2025/4/14
Y1 - 2025/4/14
AB - This research introduces a new algorithm designed to identify double text change points within a concatenated text composed of three distinct texts. It also investigates the application of text homogeneity and text change point detection techniques to low-resource languages, specifically Tigre and Tigrigna. Leveraging recently developed probability models for text homogeneity and change point detection, the study proposes a novel algorithm capable of accurately locating multiple (double) text change points in a sequence of three texts while evaluating the error rate in estimating the primary and secondary points of concatenation. Data samples were gathered from three different genres in each of the target languages. The results demonstrate a notable reduction in the error rate for detecting text change points as the heterogeneity of the concatenated text increases.
KW - Tigrigna and Tigre
KW - multiple text change point detection
KW - quantitative language models
KW - text homogeneity
KW - urn model
UR - https://www.mendeley.com/catalogue/b46563d6-9b83-3102-97c8-bf8be0b5ef50/
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-105002691735&origin=inward&txGid=e6ec0584f2c0ed02a070e1c1c866397e
U2 - 10.1515/glot-2025-2003
DO - 10.1515/glot-2025-2003
M3 - Article
JO - Glottotheory
JF - Glottotheory
SN - 2196-6907
ER -
ID: 65234642