Article

TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions

Daehwan Kim (Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, 20742, USA)
Geo Pertea (Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N. Broadway, Baltimore, MD, 21205, USA)
Cole Trapnell (Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA, 02142, USA)
Harold Pimentel (Department of Electrical Engineering and Computer Science, University of California, 101 Sproul Hall, Berkeley, CA, 94720, USA)
Ryan Kelley (Illumina Inc., 5200 Illumina Way, San Diego, CA, 92122, USA)
Steven L. Salzberg (Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N. Broadway, Baltimore, MD, 21205, USA)
2013 (en)
ABI

Abstract

TopHat is a popular spliced aligner for RNA-sequence (RNA-seq) experiments. In this paper, we describe TopHat2, which incorporates many significant enhancements to TopHat. TopHat2 can align reads of various lengths produced by the latest sequencing technologies, while allowing for variable-length indels with respect to the reference genome. In addition to de novo spliced alignment, TopHat2 can align reads across fusion breaks, which can occur after genomic translocations. TopHat2 combines the ability to identify novel splice sites with direct mapping to known transcripts, producing sensitive and accurate alignments, even for highly repetitive genomes or in the presence of pseudogenes. TopHat2 is available at http://ccb.jhu.edu/software/tophat.
