Standard

Improving Illumina assemblies with Hi-C and long reads: An example with the North African dromedary. / Elbers, Jean P.; Rogers, Mark F.; Perelman, Polina L. et al.

In: Molecular Ecology Resources, Vol. 19, No. 4, 01.07.2019, p. 1015-1026.

Research output: Contribution to journal › Article › peer-review

Harvard

Elbers, JP, Rogers, MF, Perelman, PL, Proskuryakova, AA, Serdyukova, NA, Johnson, WE, Horin, P, Corander, J, Murphy, D & Burger, PA 2019, 'Improving Illumina assemblies with Hi-C and long reads: An example with the North African dromedary', Molecular Ecology Resources, vol. 19, no. 4, pp. 1015-1026. https://doi.org/10.1111/1755-0998.13020

APA

Elbers, J. P., Rogers, M. F., Perelman, P. L., Proskuryakova, A. A., Serdyukova, N. A., Johnson, W. E., Horin, P., Corander, J., Murphy, D., & Burger, P. A. (2019). Improving Illumina assemblies with Hi-C and long reads: An example with the North African dromedary. Molecular Ecology Resources, 19(4), 1015-1026. https://doi.org/10.1111/1755-0998.13020

Vancouver

Elbers JP, Rogers MF, Perelman PL, Proskuryakova AA, Serdyukova NA, Johnson WE et al. Improving Illumina assemblies with Hi-C and long reads: An example with the North African dromedary. Molecular Ecology Resources. 2019 Jul 1;19(4):1015-1026. doi: 10.1111/1755-0998.13020

Author

Elbers, Jean P.; Rogers, Mark F.; Perelman, Polina L. et al. / Improving Illumina assemblies with Hi-C and long reads: An example with the North African dromedary. In: Molecular Ecology Resources. 2019; Vol. 19, No. 4. pp. 1015-1026.

BibTeX

@article{de9a9491b6d54e829fe086c20bc09f50,
title = "Improving Illumina assemblies with Hi-C and long reads: An example with the North African dromedary",
abstract = "Researchers have assembled thousands of eukaryotic genomes using Illumina reads, but traditional mate-pair libraries cannot span all repetitive elements, resulting in highly fragmented assemblies. However, both chromosome conformation capture techniques, such as Hi-C and Dovetail Genomics Chicago libraries and long-read sequencing, such as Pacific Biosciences and Oxford Nanopore, help span and resolve repetitive regions and therefore improve genome assemblies. One important livestock species of arid regions that does not have a high-quality contiguous reference genome is the dromedary (Camelus dromedarius). Draft genomes exist but are highly fragmented, and a high-quality reference genome is needed to understand adaptation to desert environments and artificial selection during domestication. Dromedaries are among the last livestock species to have been domesticated, and together with wild and domestic Bactrian camels, they are the only representatives of the Camelini tribe, which highlights their evolutionary significance. Here we describe our efforts to improve the North African dromedary genome. We used Chicago and Hi-C sequencing libraries from Dovetail Genomics to resolve the order of previously assembled contigs, producing almost chromosome-level scaffolds. Remaining gaps were filled with Pacific Biosciences long reads, and then scaffolds were comparatively mapped to chromosomes. Long reads added 99.32 Mbp to the total length of the new assembly. Dovetail Chicago and Hi-C libraries increased the longest scaffold over 12-fold, from 9.71 Mbp to 124.99 Mbp and the scaffold N50 over 50-fold, from 1.48 Mbp to 75.02 Mbp. We demonstrate that Illumina de novo assemblies can be substantially upgraded by combining chromosome conformation capture and long-read sequencing.",
keywords = "chromosome conformation capture, chromosome mapping, dromedary, genome annotation, genome assembly, scaffolding, PLANT, CATTLE, ABYSS, PRODUCTION TRAITS, ANNOTATION, SEQUENCE, ARCHITECTURE, GENOME-WIDE ASSOCIATION, RESOURCE, MAKER",
author = "Elbers, {Jean P.} and Rogers, {Mark F.} and Perelman, {Polina L.} and Proskuryakova, {Anastasia A.} and Serdyukova, {Natalia A.} and Johnson, {Warren E.} and Horin, Petr and Corander, Jukka and Murphy, David and Burger, {Pamela A.}",
note = "Publisher Copyright: {\textcopyright} 2019 The Authors. Molecular Ecology Resources Published by John Wiley \& Sons Ltd",
year = "2019",
month = jul,
day = "1",
doi = "10.1111/1755-0998.13020",
language = "English",
volume = "19",
pages = "1015--1026",
journal = "Molecular Ecology Resources",
issn = "1755-098X",
publisher = "Wiley-Blackwell",
number = "4",

}

RIS

TY - JOUR

T1 - Improving Illumina assemblies with Hi-C and long reads

T2 - An example with the North African dromedary

AU - Elbers, Jean P.

AU - Rogers, Mark F.

AU - Perelman, Polina L.

AU - Proskuryakova, Anastasia A.

AU - Serdyukova, Natalia A.

AU - Johnson, Warren E.

AU - Horin, Petr

AU - Corander, Jukka

AU - Murphy, David

AU - Burger, Pamela A.

N1 - Publisher Copyright: © 2019 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd

PY - 2019/7/1

Y1 - 2019/7/1

AB - Researchers have assembled thousands of eukaryotic genomes using Illumina reads, but traditional mate-pair libraries cannot span all repetitive elements, resulting in highly fragmented assemblies. However, both chromosome conformation capture techniques, such as Hi-C and Dovetail Genomics Chicago libraries and long-read sequencing, such as Pacific Biosciences and Oxford Nanopore, help span and resolve repetitive regions and therefore improve genome assemblies. One important livestock species of arid regions that does not have a high-quality contiguous reference genome is the dromedary (Camelus dromedarius). Draft genomes exist but are highly fragmented, and a high-quality reference genome is needed to understand adaptation to desert environments and artificial selection during domestication. Dromedaries are among the last livestock species to have been domesticated, and together with wild and domestic Bactrian camels, they are the only representatives of the Camelini tribe, which highlights their evolutionary significance. Here we describe our efforts to improve the North African dromedary genome. We used Chicago and Hi-C sequencing libraries from Dovetail Genomics to resolve the order of previously assembled contigs, producing almost chromosome-level scaffolds. Remaining gaps were filled with Pacific Biosciences long reads, and then scaffolds were comparatively mapped to chromosomes. Long reads added 99.32 Mbp to the total length of the new assembly. Dovetail Chicago and Hi-C libraries increased the longest scaffold over 12-fold, from 9.71 Mbp to 124.99 Mbp and the scaffold N50 over 50-fold, from 1.48 Mbp to 75.02 Mbp. We demonstrate that Illumina de novo assemblies can be substantially upgraded by combining chromosome conformation capture and long-read sequencing.

KW - chromosome conformation capture

KW - chromosome mapping

KW - dromedary

KW - genome annotation

KW - genome assembly

KW - scaffolding

KW - PLANT

KW - CATTLE

KW - ABYSS

KW - PRODUCTION TRAITS

KW - ANNOTATION

KW - SEQUENCE

KW - ARCHITECTURE

KW - GENOME-WIDE ASSOCIATION

KW - RESOURCE

KW - MAKER

UR - http://www.scopus.com/inward/record.url?scp=85066040780&partnerID=8YFLogxK

U2 - 10.1111/1755-0998.13020

DO - 10.1111/1755-0998.13020

M3 - Article

C2 - 30972949

AN - SCOPUS:85066040780

VL - 19

SP - 1015

EP - 1026

JO - Molecular Ecology Resources

JF - Molecular Ecology Resources

SN - 1755-098X

IS - 4

ER -

ID: 20038501