Research output: Contribution to journal › Article › peer-review
GTRD: a database on gene transcription regulation-2019 update. / Yevshin, Ivan; Sharipov, Ruslan; Kolmykov, Semyon et al.
In: Nucleic Acids Research, Vol. 47, No. D1, 08.01.2019, p. D100-D105.
TY - JOUR
T1 - GTRD: a database on gene transcription regulation-2019 update
AU - Yevshin, Ivan
AU - Sharipov, Ruslan
AU - Kolmykov, Semyon
AU - Kondrakhin, Yury
AU - Kolpakov, Fedor
N1 - Russian Foundation for Basic Research [17-00-00296]. Funding for open access charge: Russian Foundation for Basic Research. Publisher Copyright: © The Author(s) 2018.
PY - 2019/1/8
Y1 - 2019/1/8
N2 - The current version of the Gene Transcription Regulation Database (GTRD; http://gtrd.biouml.org) contains information about: (i) transcription factor binding sites (TFBSs) and transcription coactivators identified by ChIP-seq experiments for Homo sapiens, Mus musculus, Rattus norvegicus, Danio rerio, Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae, Schizosaccharomyces pombe and Arabidopsis thaliana; (ii) regions of open chromatin and TFBSs (DNase footprints) identified by DNase-seq; (iii) unmappable regions where TFBSs cannot be identified due to repeats; (iv) potential TFBSs for both human and mouse using position weight matrices from the HOCOMOCO database. Raw ChIP-seq and DNase-seq data were obtained from ENCODE and SRA, and uniformly processed. ChIP-seq peaks were called using four different methods: MACS, SISSRs, GEM and PICS. Moreover, peaks for the same factor and peak calling method, albeit using different experiment conditions (cell line, treatment, etc.), were merged into clusters. To reduce noise, such clusters for different peak calling methods were merged into meta-clusters; these were considered to be non-redundant TFBS sets. Moreover, extended quality control was applied to all ChIP-seq data. Web interface to access GTRD was developed using the BioUML platform. It provides browsing and displaying information, advanced search possibilities and an integrated genome browser.
KW - FACTOR-BINDING SITES
KW - CHIP-SEQ
KW - READ ALIGNMENT
KW - IDENTIFICATION
KW - COLLECTION
KW - HOCOMOCO
KW - ARCHIVE
UR - http://www.scopus.com/inward/record.url?scp=85059798202&partnerID=8YFLogxK
U2 - 10.1093/nar/gky1128
DO - 10.1093/nar/gky1128
M3 - Article
C2 - 30445619
AN - SCOPUS:85059798202
VL - 47
SP - D100-D105
JO - Nucleic Acids Research
JF - Nucleic Acids Research
SN - 0305-1048
IS - D1
ER -
ID: 18118790