Research output: Contribution to journal › Article › peer-review
Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data. / Dergilev, Arthur I.; Orlova, Nina G.; Dobrovolskaya, Oxana B. et al.
In: Journal of integrative bioinformatics, Vol. 19, No. 1, 21.12.2021.
TY - JOUR
T1 - Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data
AU - Dergilev, Arthur I.
AU - Orlova, Nina G.
AU - Dobrovolskaya, Oxana B.
AU - Orlov, Yuriy L.
N1 - The publication has been prepared with the support of the RUDN University Strategic Academic Leadership Program (Recipients: YO, OD). Publisher Copyright: © 2021 Arthur I. Dergilev et al., published by De Gruyter, Berlin/Boston.
PY - 2021/12/21
Y1 - 2021/12/21
AB - The development of high-throughput genomic sequencing coupled with chromatin immunoprecipitation technologies allows the binding sites of protein transcription factors (TFs) to be studied at the genome scale. The growing volume of data on experimentally determined binding sites raises qualitatively new problems for the analysis of gene expression regulation, the prediction of transcription factor target genes, and the reconstruction of regulatory gene networks. Genome regulation remains insufficiently studied, although plants have complex molecular mechanisms regulating gene expression and the response to environmental stresses. It is important to develop new software tools for analyzing the location of TF binding sites and their clustering in plant genomes, for visualization, and for subsequent statistical estimates. This study applies the analysis of multiple TF binding profiles to three evolutionarily distant model plant organisms. The construction and analysis of non-random ChIP-seq binding clusters of different TFs in mammalian embryonic stem cells were discussed earlier using similar bioinformatics approaches. Such clusters of TF binding sites may indicate gene regulatory regions, enhancers, and gene transcription regulatory hubs. They can be used for the analysis of gene promoters as well as a background for transcription network reconstruction. We discuss statistical estimates of the TF binding site clusters in the model plant genomes. The distributions of the number of different TFs per binding cluster follow the same power-law distribution for all the genomes studied. The binding clusters in the Arabidopsis thaliana genome are discussed in detail.
KW - ChIP-seq
KW - gene expression
KW - plant genomes
KW - regulatory gene networks
KW - transcription factor binding sites
KW - transcription regulation
KW - DNA-BINDING
KW - CONSERVATION
KW - EVOLUTION
KW - NETWORK
KW - FAMILY
KW - GENES
KW - GATA
KW - Mammals/genetics
KW - Transcription Factors/genetics
KW - Genome, Plant
KW - Binding Sites/genetics
KW - Animals
KW - Chromatin Immunoprecipitation
KW - Chromatin Immunoprecipitation Sequencing
UR - http://www.scopus.com/inward/record.url?scp=85128160641&partnerID=8YFLogxK
U2 - 10.1515/jib-2020-0036
DO - 10.1515/jib-2020-0036
M3 - Article
C2 - 34953471
VL - 19
JO - Journal of integrative bioinformatics
JF - Journal of integrative bioinformatics
SN - 1613-4516
IS - 1
ER -
ID: 35410691