A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package

Standard

A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package. / Levitsky, Victor; Zemlyanskaya, Elena; Oshchepkov, Dmitry et al.

In: Nucleic Acids Research, Vol. 47, No. 21, 02.12.2019, p. e139.

Research output: Contribution to journal › Article › peer-review

BibTeX

@article{e6d2d2868c6943d5b7aa76e3f18f2f3c,

title = "A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package",

abstract = "Recognition of composite elements consisting of two transcription factor binding sites gets behind the studies of tissue-, stage- and condition-specific transcription. Genome-wide data on transcription factor binding generated with ChIP-seq method facilitate an identification of composite elements, but the existing bioinformatics tools either require ChIP-seq datasets for both partner transcription factors, or omit composite elements with motifs overlapping. Here we present an universal Motifs Co-Occurrence Tool (MCOT) that retrieves maximum information about overrepresented composite elements from a single ChIP-seq dataset. This includes homo- and heterotypic composite elements of four mutual orientations of motifs, separated with a spacer or overlapping, even if recognition of motifs within composite element requires various stringencies. Analysis of 52 ChIP-seq datasets for 18 human transcription factors confirmed that for over 60% of analyzed datasets and transcription factors predicted co-occurrence of motifs implied experimentally proven protein-protein interaction of respecting transcription factors. Analysis of 164 ChIP-seq datasets for 57 mammalian transcription factors showed that abundance of predicted composite elements with an overlap of motifs compared to those with a spacer more than doubled; and they had 1.5-fold increase of asymmetrical pairs of motifs with one more conservative 'leading' motif and another one 'guided'.",

keywords = "Algorithms, Animals, Binding Sites, Chromatin Immunoprecipitation Sequencing/methods, Computational Biology/methods, Datasets as Topic, Humans, Mice, Nucleotide Motifs/genetics, Regulatory Elements, Transcriptional/genetics, Sequence Analysis, DNA/methods, Transcription Factors/genetics, TRANSCRIPTION FACTORS, ACTIVATION, DNA-BINDING, CHROMATIN, COMPLEXES, COMPOSITE REGULATORY ELEMENTS, ENHANCERS, DISCOVERY, REGIONS, NF-KAPPA-B",

author = "Victor Levitsky and Elena Zemlyanskaya and Dmitry Oshchepkov and Olga Podkolodnaya and Elena Ignatieva and Ivo Grosse and Victoria Mironova and Tatyana Merkulova",

note = "Publisher Copyright: {\textcopyright} The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. This record is sourced from MEDLINE/PubMed, a database of the U.S. National Library of Medicine.",

year = "2019",

month = dec,

day = "2",

doi = "10.1093/nar/gkz800",

language = "English",

volume = "47",

pages = "e139",

journal = "Nucleic Acids Research",

issn = "0305-1048",

publisher = "Oxford University Press",

number = "21",

}

RIS

TY - JOUR

T1 - A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package

AU - Levitsky, Victor

AU - Zemlyanskaya, Elena

AU - Oshchepkov, Dmitry

AU - Podkolodnaya, Olga

AU - Ignatieva, Elena

AU - Grosse, Ivo

AU - Mironova, Victoria

AU - Merkulova, Tatyana

N1 - Publisher Copyright: © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. This record is sourced from MEDLINE/PubMed, a database of the U.S. National Library of Medicine.

PY - 2019/12/2

Y1 - 2019/12/2

N2 - Recognition of composite elements consisting of two transcription factor binding sites gets behind the studies of tissue-, stage- and condition-specific transcription. Genome-wide data on transcription factor binding generated with ChIP-seq method facilitate an identification of composite elements, but the existing bioinformatics tools either require ChIP-seq datasets for both partner transcription factors, or omit composite elements with motifs overlapping. Here we present an universal Motifs Co-Occurrence Tool (MCOT) that retrieves maximum information about overrepresented composite elements from a single ChIP-seq dataset. This includes homo- and heterotypic composite elements of four mutual orientations of motifs, separated with a spacer or overlapping, even if recognition of motifs within composite element requires various stringencies. Analysis of 52 ChIP-seq datasets for 18 human transcription factors confirmed that for over 60% of analyzed datasets and transcription factors predicted co-occurrence of motifs implied experimentally proven protein-protein interaction of respecting transcription factors. Analysis of 164 ChIP-seq datasets for 57 mammalian transcription factors showed that abundance of predicted composite elements with an overlap of motifs compared to those with a spacer more than doubled; and they had 1.5-fold increase of asymmetrical pairs of motifs with one more conservative 'leading' motif and another one 'guided'.

KW - Algorithms

KW - Animals

KW - Binding Sites

KW - Chromatin Immunoprecipitation Sequencing/methods

KW - Computational Biology/methods

KW - Datasets as Topic

KW - Humans

KW - Mice

KW - Nucleotide Motifs/genetics

KW - Regulatory Elements, Transcriptional/genetics

KW - Sequence Analysis, DNA/methods

KW - Transcription Factors/genetics

KW - TRANSCRIPTION FACTORS

KW - ACTIVATION

KW - DNA-BINDING

KW - CHROMATIN

KW - COMPLEXES

KW - COMPOSITE REGULATORY ELEMENTS

KW - ENHANCERS

KW - DISCOVERY

KW - REGIONS

KW - NF-KAPPA-B

UR - http://www.scopus.com/inward/record.url?scp=85075326742&partnerID=8YFLogxK

U2 - 10.1093/nar/gkz800

DO - 10.1093/nar/gkz800

M3 - Article

C2 - 31750523

AN - SCOPUS:85075326742

VL - 47

SP - e139

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 21

ER -
