Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data

Standard

Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data. / Dergilev, Arthur I.; Orlova, Nina G.; Dobrovolskaya, Oxana B. et al.

In: Journal of integrative bioinformatics, Vol. 19, No. 1, 21.12.2021.

Research output: Contribution to journal › Article › Peer-reviewed

BibTeX

@article{6fdea742670a468993ac1ffbace06a0e,

title = "Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data",

abstract = "The development of high-throughput genomic sequencing coupled with chromatin immunoprecipitation technologies allows studying the binding sites of the protein transcription factors (TF) in the genome scale. The growth of data volume on the experimentally determined binding sites raises qualitatively new problems for the analysis of gene expression regulation, prediction of transcription factors target genes, and regulatory gene networks reconstruction. Genome regulation remains an insufficiently studied though plants have complex molecular regulatory mechanisms of gene expression and response to environmental stresses. It is important to develop new software tools for the analysis of the TF binding sites location and their clustering in the plant genomes, visualization, and the following statistical estimates. This study presents application of the analysis of multiple TF binding profiles in three evolutionarily distant model plant organisms. The construction and analysis of non-random ChIP-seq binding clusters of the different TFs in mammalian embryonic stem cells were discussed earlier using similar bioinformatics approaches. Such clusters of TF binding sites may indicate the gene regulatory regions, enhancers and gene transcription regulatory hubs. It can be used for analysis of the gene promoters as well as a background for transcription networks reconstruction. We discuss the statistical estimates of the TF binding sites clusters in the model plant genomes. The distributions of the number of different TFs per binding cluster follow same power law distribution for all the genomes studied. The binding clusters in Arabidopsis thaliana genome were discussed here in detail.",

keywords = "ChIP-seq, gene expression, plant genomes, regulatory gene networks, transcription factor binding sites, transcription regulation, DNA-BINDING, CONSERVATION, EVOLUTION, NETWORK, FAMILY, GENES, GATA, Mammals/genetics, Transcription Factors/genetics, Genome, Plant, Binding Sites/genetics, Animals, Chromatin Immunoprecipitation, Chromatin Immunoprecipitation Sequencing",

author = "Dergilev, {Arthur I.} and Orlova, {Nina G.} and Dobrovolskaya, {Oxana B.} and Orlov, {Yuriy L.}",

note = "The publication has been prepared with the support of the RUDN University Strategic Academic Leadership Program (Recipients: YO, OD). Publisher Copyright: {\textcopyright} 2021 Arthur I. Dergilev et al., published by De Gruyter, Berlin/Boston.",

year = "2021",

month = dec,

day = "21",

doi = "10.1515/jib-2020-0036",

language = "English",

volume = "19",

journal = "Journal of integrative bioinformatics",

issn = "1613-4516",

publisher = "Walter de Gruyter GmbH",

number = "1",

}

RIS

TY - JOUR

T1 - Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data

AU - Dergilev, Arthur I.

AU - Orlova, Nina G.

AU - Dobrovolskaya, Oxana B.

AU - Orlov, Yuriy L.

N1 - The publication has been prepared with the support of the RUDN University Strategic Academic Leadership Program (Recipients: YO, OD). Publisher Copyright: © 2021 Arthur I. Dergilev et al., published by De Gruyter, Berlin/Boston.

PY - 2021/12/21

Y1 - 2021/12/21

N2 - The development of high-throughput genomic sequencing coupled with chromatin immunoprecipitation technologies allows studying the binding sites of the protein transcription factors (TF) in the genome scale. The growth of data volume on the experimentally determined binding sites raises qualitatively new problems for the analysis of gene expression regulation, prediction of transcription factors target genes, and regulatory gene networks reconstruction. Genome regulation remains an insufficiently studied though plants have complex molecular regulatory mechanisms of gene expression and response to environmental stresses. It is important to develop new software tools for the analysis of the TF binding sites location and their clustering in the plant genomes, visualization, and the following statistical estimates. This study presents application of the analysis of multiple TF binding profiles in three evolutionarily distant model plant organisms. The construction and analysis of non-random ChIP-seq binding clusters of the different TFs in mammalian embryonic stem cells were discussed earlier using similar bioinformatics approaches. Such clusters of TF binding sites may indicate the gene regulatory regions, enhancers and gene transcription regulatory hubs. It can be used for analysis of the gene promoters as well as a background for transcription networks reconstruction. We discuss the statistical estimates of the TF binding sites clusters in the model plant genomes. The distributions of the number of different TFs per binding cluster follow same power law distribution for all the genomes studied. The binding clusters in Arabidopsis thaliana genome were discussed here in detail.

KW - ChIP-seq

KW - gene expression

KW - plant genomes

KW - regulatory gene networks

KW - transcription factor binding sites

KW - transcription regulation

KW - DNA-BINDING

KW - CONSERVATION

KW - EVOLUTION

KW - NETWORK

KW - FAMILY

KW - GENES

KW - GATA

KW - Mammals/genetics

KW - Transcription Factors/genetics

KW - Genome, Plant

KW - Binding Sites/genetics

KW - Animals

KW - Chromatin Immunoprecipitation

KW - Chromatin Immunoprecipitation Sequencing

UR - http://www.scopus.com/inward/record.url?scp=85128160641&partnerID=8YFLogxK

U2 - 10.1515/jib-2020-0036

DO - 10.1515/jib-2020-0036

M3 - Article

C2 - 34953471

VL - 19

JO - Journal of integrative bioinformatics

JF - Journal of integrative bioinformatics

SN - 1613-4516

IS - 1

ER -

ID: 35410691