Article

A fast, lock-free approach for efficient parallel counting of occurrences of k-mers

Guillaume Marçais, Carl Kingsford
Affiliations: (1) Program in Applied Mathematics, Statistics and Scientific Computation and (2) Department of Computer Science and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
2011, English
ABI

Abstract

MOTIVATION: Counting the number of occurrences of every k-mer (substring of length k) in a long string is a central subproblem in many applications, including genome assembly, error correction of sequencing reads, fast multiple sequence alignment and repeat detection. Recently, the deep sequence coverage generated by next-generation sequencing technologies has caused the amount of sequence to be processed during a genome project to grow rapidly, and has rendered current k-mer counting tools too slow and memory intensive. At the same time, large multicore computers have become commonplace in research facilities allowing for a new parallel computational paradigm.

RESULTS: We propose a new k-mer counting algorithm and associated implementation, called Jellyfish, which is fast and memory efficient. It is based on a multithreaded, lock-free hash table optimized for counting k-mers up to 31 bases in length. Due to their flexibility, suffix arrays have been the data structure of choice for solving many string problems. For the task of k-mer counting, important in many biological applications, Jellyfish offers a much faster and more memory-efficient solution.

AVAILABILITY: The Jellyfish software is written in C++ and is GPL licensed. It is available for download at http://www.cbcb.umd.edu/software/jellyfish.
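
To make the counting task concrete, here is a minimal single-threaded sketch in Python. It is not Jellyfish's implementation and does not reproduce its lock-free, multithreaded hash table; it only illustrates why a limit of 31 bases is convenient: at 2 bits per base, a k-mer with k up to 31 packs into 62 bits, so it fits in a single 64-bit word. The 2-bit base codes, the function names `count_kmers` and `decode`, and the handling of non-ACGT characters (the window is restarted) are assumptions made for this example.

```python
from collections import Counter

# 2-bit code per base. This particular mapping is an assumption for illustration.
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_kmers(sequence, k):
    """Count every k-mer in `sequence`, keyed by its 2-bit packed integer."""
    if not 1 <= k <= 31:
        raise ValueError("k must be between 1 and 31 so a k-mer fits in 62 bits")
    mask = (1 << (2 * k)) - 1  # keeps only the bits of the last k bases
    counts = Counter()
    packed = 0
    valid = 0  # number of consecutive valid bases ending at the current position
    for base in sequence.upper():
        code = ENCODE.get(base)
        if code is None:
            # Any character outside ACGT (for example 'N') restarts the window.
            packed = 0
            valid = 0
            continue
        # Slide the window: shift in the new base and drop the oldest one.
        packed = ((packed << 2) | code) & mask
        valid += 1
        if valid >= k:
            counts[packed] += 1
    return counts


def decode(packed, k):
    """Turn a packed integer back into its k-mer string."""
    bases = "ACGT"
    return "".join(bases[(packed >> (2 * (k - 1 - i))) & 3] for i in range(k))
```

For example, `count_kmers("ACGTACGT", 3)` counts the 3-mers ACG and CGT twice each and GTA and TAC once each; `decode` maps the integer keys back to strings for inspection. A parallel counter in the spirit of the abstract would replace the `Counter` with a shared table updated by several threads, which this sketch does not attempt.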

Citations and references

Citations: 6; references used: 0