TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.


Tophat accepts files in Sanger FASTQ format. Use the FASTQ Groomer to prepare your files.

Tophat produces two output files:

a UCSC BED track of junctions reported by TopHat. Each junction consists of two connected BED blocks, where each block is as long as the maximal overhang of any read spanning the junction. The score is the number of alignments spanning the junction.
a list of read alignments in BAM format.

Two other possible outputs, depending on the options you choose, are insertions and deletions, both of which are in BED format.


TopHat: discovering splice junctions with RNA-Seq.
C. Trapnell, L. Pachter, S.L. Salzberg.
Bioinformatics (2009) 25 (9): 1105-1111