Article

SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler

Ruibang Luo [1], Binghang Liu [1], Yinlong Xie [2], Zhenyu Li [1], Weihua Huang [1], Jianying Yuan [1], Guangzhu He [1], Yanxiang Chen [1], Qi Pan [1], Yunjie Liu [1], Jingbo Tang [1], Gengxiong Wu [1], Hao Zhang [1], Yujian Shi [1], Yong Liu [1], Chang Yu [1], Bo Wang [1], Yao Lu [1], Changlei Han [1], David W. Cheung [2], Siu‐Ming Yiu [2], Shaoliang Peng [4], Zhu Xiaoqian [4], Guangming Liu [4], Xiangke Liao [4], Yingrui Li [1], Huanming Yang [1], Jian Wang [1], Tak‐Wah Lam [2], Jun Wang [1]

[1] BGI HK Research Institute, 16 Dai Fu Street, Tai Po Industrial Estate, Hong Kong
[2] HKU-BGI Bioinformatics Algorithms and Core Technology Research Laboratory & Department of Computer Science, University of Hong Kong, Pokfulam, Hong Kong
[4] School of Computer Science, National University of Defense Technology, No.47, Yanwachi street, Kaifu District, Changsha, Hunan 410073, China
2012, en
ABI

Abstract

BACKGROUND: The amount of de novo genome assembly from next-generation sequencing (NGS) short reads is increasing rapidly; however, several major challenges must still be overcome for it to be efficient and accurate. SOAPdenovo has been successfully applied to assemble many published genomes, but it still needs improvement in continuity, accuracy and coverage, especially in repeat regions. FINDINGS: To overcome these challenges, we have developed its successor, SOAPdenovo2, whose new algorithm design reduces memory consumption in graph construction, resolves more repeat regions in contig assembly, increases coverage and length in scaffold construction, improves gap closing, and is optimized for large genomes. CONCLUSIONS: Benchmarks using the Assemblathon1 and GAGE datasets showed that SOAPdenovo2 greatly surpasses its predecessor SOAPdenovo and is competitive with other assemblers on both assembly length and accuracy. We also provide an updated assembly of the 2008 Asian (YH) genome using SOAPdenovo2. Here, the contig and scaffold N50 of the YH genome were ~20.9 kbp and ~22 Mbp, respectively, which are 3-fold and 50-fold longer than in the first published version. The genome coverage increased from 81.16% to 93.91%, and memory consumption was ~2/3 lower at the point of peak memory consumption.
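
The contig and scaffold N50 values quoted in the abstract follow the usual definition: the length L such that sequences of length L or longer together make up at least half of the total assembly length. The sketch below illustrates that statistic only; the function name `n50` and the toy lengths are hypothetical and are not taken from SOAPdenovo2 or its output.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Hypothetical helper for illustration; not part of SOAPdenovo2.
    """
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first length where the cumulative sum reaches
        # half of the total assembly length.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # made-up lengths, total 32
    print(n50(toy_contigs))
```

A larger N50 means that half of the assembled bases sit in longer sequences, which is why the abstract reports it as a measure of continuity.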

Citations and references

2 citations, 0 references used