Research output: Contribution to journal › Article › peer-review
HOCOMOCO: Towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis. / Kulakovskiy, Ivan V.; Vorontsov, Ilya E.; Yevshin, Ivan S. et al.
In: Nucleic Acids Research, Vol. 46, No. D1, 04.01.2018, p. D252-D259.
TY - JOUR
T1 - HOCOMOCO: Towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis
AU - Kulakovskiy, Ivan V.
AU - Vorontsov, Ilya E.
AU - Yevshin, Ivan S.
AU - Sharipov, Ruslan N.
AU - Fedorova, Alla D.
AU - Rumynskiy, Eugene I.
AU - Medvedeva, Yulia A.
AU - Magana-Mora, Arturo
AU - Bajic, Vladimir B.
AU - Papatsenko, Dmitry A.
AU - Kolpakov, Fedor A.
AU - Makeev, Vsevolod J.
N1 - Funding Information: The project was primarily supported by Russian Science Foundation [17-74-10188 to I.V.K.]; A.M.M. and V.B.B. were supported by King Abdullah University of Science and Technology (KAUST) [baseline fund BAS/1/1606-01-01 of V.B.B.]; I.E.V. was personally supported by the Skoltech Systems Biology Fellowship. Funding for open access charge: Russian Science Foundation [17-74-10188 to I.V.K.]. Conflict of interest statement. None declared. Publisher Copyright: © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2018/1/4
Y1 - 2018/1/4
N2 - We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
AB - We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
KW - EXPANSION
KW - GENE
KW - MOTIFS
KW - OPEN-ACCESS DATABASE
KW - SITES
UR - http://www.scopus.com/inward/record.url?scp=85040905364&partnerID=8YFLogxK
U2 - 10.1093/nar/gkx1106
DO - 10.1093/nar/gkx1106
M3 - Article
C2 - 29140464
AN - SCOPUS:85040905364
VL - 46
SP - D252-D259
JO - Nucleic Acids Research
JF - Nucleic Acids Research
SN - 0305-1048
IS - D1
ER -