Standard

A new method for detecting multiple text change points. / Abebe, Berhane.

In: Glottotheory, 14.04.2025.

Research output: Contribution to journal › Article › peer-review

Vancouver

Abebe B. A new method for detecting multiple text change points. Glottotheory. 2025 Apr 14. doi: 10.1515/glot-2025-2003

Author

Abebe, Berhane. / A new method for detecting multiple text change points. In: Glottotheory. 2025.

BibTeX

@article{403fa337ecdc41daa172ffd621e13eb0,
title = "A new method for detecting multiple text change points",
abstract = "This research introduces a new algorithm designed to identify double text change points within a concatenated text composed of three distinct texts. It also investigates the application of text homogeneity and text change point detection techniques to low-resource languages, specifically Tigre and Tigrigna. Leveraging recently developed probability models for text homogeneity and change point detection, the study proposes a novel algorithm capable of accurately locating multiple (double) text change points in a sequence of three texts while evaluating the error rate in estimating the primary and secondary points of concatenation. Data samples were gathered from three different genres in each of the target languages. The results demonstrate a notable reduction in the error rate for detecting text change points as the heterogeneity of the concatenated text increases.",
keywords = "Tigrigna and Tigre, multiple text change point detection, quantitative language models, text homogeneity, urn model",
author = "Berhane Abebe",
year = "2025",
month = apr,
day = "14",
doi = "10.1515/glot-2025-2003",
language = "English",
journal = "Glottotheory",
issn = "2196-6907",
publisher = "Walter de Gruyter"
}

RIS

TY - JOUR

T1 - A new method for detecting multiple text change points

AU - Abebe, Berhane

PY - 2025/4/14

Y1 - 2025/4/14

AB - This research introduces a new algorithm designed to identify double text change points within a concatenated text composed of three distinct texts. It also investigates the application of text homogeneity and text change point detection techniques to low-resource languages, specifically Tigre and Tigrigna. Leveraging recently developed probability models for text homogeneity and change point detection, the study proposes a novel algorithm capable of accurately locating multiple (double) text change points in a sequence of three texts while evaluating the error rate in estimating the primary and secondary points of concatenation. Data samples were gathered from three different genres in each of the target languages. The results demonstrate a notable reduction in the error rate for detecting text change points as the heterogeneity of the concatenated text increases.

KW - Tigrigna and Tigre

KW - multiple text change point detection

KW - quantitative language models

KW - text homogeneity

KW - urn model

UR - https://www.mendeley.com/catalogue/b46563d6-9b83-3102-97c8-bf8be0b5ef50/

UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-105002691735&origin=inward&txGid=e6ec0584f2c0ed02a070e1c1c866397e

U2 - 10.1515/glot-2025-2003

DO - 10.1515/glot-2025-2003

M3 - Article

JO - Glottotheory

JF - Glottotheory

SN - 2196-6907

ER -
