Research output: Contribution to journal › Article › peer-review
The specific problems of the experimental design in the high-throughput sequencing studies of transcriptome. / Menshanov, P. N.; Dygalo, N. N.
In: Russian Journal of Genetics: Applied Research, Vol. 7, No. 3, 01.05.2017, p. 258-265.
TY - JOUR
T1 - The specific problems of the experimental design in the high-throughput sequencing studies of transcriptome
AU - Menshanov, P. N.
AU - Dygalo, N. N.
PY - 2017/5/1
Y1 - 2017/5/1
N2 - Some problems in the design of high-throughput sequencing experiments that use RNA-Seq or Ribo-Seq technologies are reviewed. The ENCODE guidelines (2011) and the recommendations of other experts on experimental design for studying animal and plant transcriptomes are also summarized. An optimal limit of sequencing depth exists for the identification of most actively transcribed genes. This limit depends on the transcriptome size of the biological object studied. Additional sequencing beyond this limit does not provide any substantial information about the complexity of the transcriptome. For mammals, the optimal limit of sequencing depth for identification of actively transcribed genes is ~2 × 10^9 bp per biological sample. For other species, the optimal limit of sequencing depth per biological sample can be estimated from this mammalian value by recalculating it for the target species with respect to its transcriptome size and specific RNA amount per cell. Detection of differentially expressed genes, as well as identification of splice junctions in mRNA, can be enhanced by increasing the number of biological samples analyzed per experimental group. At least two biological replicates per experimental group should be sequenced. At least five to eight biological replicates per experimental group should be sequenced to achieve optimal results (similar to the qRT-PCR quantification of single gene expression). For transcriptome studies, sequencing technologies with an accuracy of ≥0.999 per base pair are recommended. For RNA-Seq, sequencing platforms that give reads of ≥75 bp are optimal for minimizing the sequencing cost. The relative cost of sequencing control groups can be reduced by increasing the number of experimental groups, either by combining several similar experiments or by making the initial experiment more sophisticated. These recommendations can be helpful in designing transcriptome experiments in functional genomics.
KW - design of experiment
KW - high-throughput sequencing
KW - Ribo-Seq
KW - RNA-Seq
KW - transcriptome
UR - http://www.scopus.com/inward/record.url?scp=85018959828&partnerID=8YFLogxK
U2 - 10.1134/S207905971703011X
DO - 10.1134/S207905971703011X
M3 - Article
AN - SCOPUS:85018959828
VL - 7
SP - 258
EP - 265
JO - Russian Journal of Genetics: Applied Research
JF - Russian Journal of Genetics: Applied Research
SN - 2079-0597
IS - 3
ER -
ID: 8715639