Standard

MetArea: a software package for analysis of the mutually exclusive occurrence in pairs of motifs of transcription factor binding sites based on ChIP-seq data. / Levitsky, V. G.; Tsukanov, A. V.; Merkulova, T. I.

In: Vavilovskii Zhurnal Genetiki i Selektsii, Vol. 28, No. 8, 2024, pp. 822-833.

Research output: Scientific publications in periodicals › Article › Peer-reviewed

BibTeX

@article{c756ca7d2fe344028293856f31dd2881,
title = "MetArea: a software package for analysis of the mutually exclusive occurrence in pairs of motifs of transcription factor binding sites based on ChIP-seq data",
abstract = "ChIP-seq technology, which is based on chromatin immunoprecipitation (ChIP), allows mapping a set of genomic loci (peaks) containing binding sites (BS) for the investigated (target) transcription factor (TF). A TF may recognize several structurally different BS motifs. The multiprotein complex mapped in a ChIP-seq experiment includes target and other “partner” TFs linked by protein-protein interactions. Not all these TFs bind to DNA directly. Therefore, both target and partner TFs recognize enriched BS motifs in peaks. A de novo search approach is used to search for enriched TF BS motifs in ChIP-seq data. For a pair of enriched BS motifs of TFs, the co-occurrence or mutually exclusive occurrence can be detected from a set of peaks: the co-occurrence reflects a more frequent occurrence of two motifs in the same peaks, while the mutually exclusive means their more frequent detection in different peaks. We propose the MetArea software package to identify pairs of TF BS motifs with the mutually exclusive occurrence in ChIP-seq data. MetArea was designed to predict the structural diversity of BS motifs of the same TFs, and the functional relation of BS motifs of different TFs. The functional relation of the motifs of the two distinct TFs presumes that they are interchangeable as part of a multiprotein complex that uses the BS of these TFs to bind directly to DNA in different peaks. MetArea calculates the estimates of recognition performance pAUPRC (partial area under the Precision–Recall curve) for each of the two input single motifs, identifies the “joint” motif, and computes the performance for it too. The goal of the analysis is to find pairs of single motifs A and B for which the accuracy of the joint A&B motif is higher than those of both single motifs.",
keywords = "PR curve, area under curve, cooperative action of transcription factors, de novo motif search, structural variants of transcription factor binding site motifs",
author = "Levitsky, {V. G.} and Tsukanov, {A. V.} and Merkulova, {T. I.}",
note = "The work was supported by the Russian government project No. FWNR-2022-0020, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences",
year = "2024",
doi = "10.18699/vjgb-24-90",
language = "English",
volume = "28",
pages = "822--833",
journal = "Вавиловский журнал генетики и селекции",
issn = "2500-0462",
publisher = "Institute of Cytology and Genetics of Siberian Branch of the Russian Academy of Sciences",
number = "8",

}

RIS

TY - JOUR

T1 - MetArea: a software package for analysis of the mutually exclusive occurrence in pairs of motifs of transcription factor binding sites based on ChIP-seq data

AU - Levitsky, V. G.

AU - Tsukanov, A. V.

AU - Merkulova, T. I.

N1 - The work was supported by the Russian government project No. FWNR-2022-0020, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences

PY - 2024

Y1 - 2024

AB - ChIP-seq technology, which is based on chromatin immunoprecipitation (ChIP), allows mapping a set of genomic loci (peaks) containing binding sites (BS) for the investigated (target) transcription factor (TF). A TF may recognize several structurally different BS motifs. The multiprotein complex mapped in a ChIP-seq experiment includes target and other “partner” TFs linked by protein-protein interactions. Not all these TFs bind to DNA directly. Therefore, both target and partner TFs recognize enriched BS motifs in peaks. A de novo search approach is used to search for enriched TF BS motifs in ChIP-seq data. For a pair of enriched BS motifs of TFs, the co-occurrence or mutually exclusive occurrence can be detected from a set of peaks: the co-occurrence reflects a more frequent occurrence of two motifs in the same peaks, while the mutually exclusive means their more frequent detection in different peaks. We propose the MetArea software package to identify pairs of TF BS motifs with the mutually exclusive occurrence in ChIP-seq data. MetArea was designed to predict the structural diversity of BS motifs of the same TFs, and the functional relation of BS motifs of different TFs. The functional relation of the motifs of the two distinct TFs presumes that they are interchangeable as part of a multiprotein complex that uses the BS of these TFs to bind directly to DNA in different peaks. MetArea calculates the estimates of recognition performance pAUPRC (partial area under the Precision–Recall curve) for each of the two input single motifs, identifies the “joint” motif, and computes the performance for it too. The goal of the analysis is to find pairs of single motifs A and B for which the accuracy of the joint A&B motif is higher than those of both single motifs.

KW - PR curve

KW - area under curve

KW - cooperative action of transcription factors

KW - de novo motif search

KW - structural variants of transcription factor binding site motifs

UR - https://www.mendeley.com/catalogue/c9a02f0b-fbcc-39ef-af42-fe0f7aa7cc1b/

UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85217191627&origin=inward&txGid=57e3dd0aa7db237cf6143e9b28a720cd

U2 - 10.18699/vjgb-24-90

DO - 10.18699/vjgb-24-90

M3 - Article

C2 - 39944799

VL - 28

SP - 822

EP - 833

JO - Вавиловский журнал генетики и селекции

JF - Вавиловский журнал генетики и селекции

SN - 2500-0462

IS - 8

ER -

ID: 64715644