Article

Capturing single‐copy nuclear genes, organellar genomes, and nuclear ribosomal DNA from deep genome skimming data for plant phylogenetics: A case study in Vitaceae

Binbin Liu, Department of Botany, National Museum of Natural History, Smithsonian Institution, PO Box 37012, Washington DC 20013‐7012, USA
Zhi‐Yao Ma, Department of Botany, National Museum of Natural History, Smithsonian Institution, PO Box 37012, Washington DC 20013‐7012, USA
Chen Ren, Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China
Richard G.J. Hodel, Department of Botany, National Museum of Natural History, Smithsonian Institution, PO Box 37012, Washington DC 20013‐7012, USA
Miao Sun, Department of Biology—Ecoinformatics and Biodiversity, Aarhus University, 8000 Aarhus C, Denmark
Xiu‐Qun Liu, Key Laboratory of Horticultural Plant Biology (Ministry of Education), College of Horticulture and Forestry Science, Huazhong Agricultural University, Wuhan 430070, China
Guang‐Ning Liu, College of Architecture and Urban Planning, Tongji University, Shanghai 200092, China
Hong De‐Yuan, State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
Elizabeth A. Zimmer, Department of Botany, National Museum of Natural History, Smithsonian Institution, PO Box 37012, Washington DC 20013‐7012, USA
Jun Wen, Department of Botany, National Museum of Natural History, Smithsonian Institution, PO Box 37012, Washington DC 20013‐7012, USA

2021

ABI

Abstract

With decreasing sequencing costs and the availability of many newly developed bioinformatics pipelines, next‐generation sequencing (NGS) has revolutionized plant systematics in recent years. Genome skimming has been widely used to obtain the high‐copy fractions of genomes, including plastomes, mitochondrial DNA (mtDNA), and nuclear ribosomal DNA (nrDNA). In this study, we used simulations to evaluate the minimum sequencing depth needed to recover single‐copy nuclear genes (SCNs) from genome skimming data, and the performance of that recovery, by subsampling genome resequencing data in silico to generate 10 data sets with different sequencing coverage. We tested the performance of four data sets obtained from genome skimming (plastome, nrDNA, mtDNA, and SCNs) in phylogenetic analyses of the Vitis clade at the genus level and of Vitaceae at the family level. Our results showed that the minimum sequencing depth for high‐quality SCN assembly via genome skimming was about 10× coverage. Because it requires no bait synthesis or enrichment experiments and sequencing costs are now very low, we show that deep genome skimming (DGS) is as effective as the widely used Hyb‐Seq approach for capturing large data sets of SCNs, while also capturing plastomes, mtDNA, and entire nrDNA repeats. DGS may serve as an efficient and economical alternative to the popular target enrichment/Hyb‐Seq approach, and may even be superior to it.
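The arithmetic behind subsampling to a target coverage can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the usual relation coverage = (number of reads × read length) / genome size, and the genome size (500 Mb), read length (150 bp, paired-end), and function names are hypothetical placeholders that do not come from the article.

```python
import math
import random


def read_pairs_for_coverage(target_coverage, genome_size_bp, read_length_bp):
    """Read pairs needed so that total sequenced bases / genome size
    reaches target_coverage. Each pair contributes 2 * read_length_bp bases."""
    bases_needed = target_coverage * genome_size_bp
    return math.ceil(bases_needed / (2 * read_length_bp))


def subsample_pair_indices(total_pairs, n_keep, seed=0):
    """Choose which read pairs to keep, without replacement.
    A fixed seed makes the in silico subsample reproducible."""
    rng = random.Random(seed)
    n_keep = min(n_keep, total_pairs)
    return sorted(rng.sample(range(total_pairs), n_keep))


# Hypothetical inputs (assumptions, not values from the article).
GENOME_SIZE_BP = 500_000_000
READ_LENGTH_BP = 150

pairs_10x = read_pairs_for_coverage(10, GENOME_SIZE_BP, READ_LENGTH_BP)
kept = subsample_pair_indices(total_pairs=1_000, n_keep=100, seed=42)
```

In practice the selected indices would be used to filter both FASTQ files of a paired-end library in the same order, so that mates stay together.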

Identifiers

DOI: 10.1111/jse.12806

Citations and references

Cited by 20 references
