Edena

Edena (V3.130110) is an overlaps graph based short reads assembler and is suited to Illumina GA reads. It can assemble both direct-reverse (paired-ends) and reverse-direct (mate-pairs) datasets. This program requires the reads to be all the same length, as Illumina GA reads are. This is due to historical reasons and because it greatly simplifies several computational steps. 454 or Sanger reads are therefore not suited to Edena. If you provide multiple files with different read lengths, Edena will trim the 3’ end of the reads so that the reads are all the same length as the shortest reads in the file.

The program was developed in a framework of whole genome bacterial assemblies. It is therefore more suited for this kind of task though we also successfully used it for other types of projects.

Details

An assembly with Edena is a two step process: overlapping and assembling.

Overlapping mode
The reads files are provided to the program which computes the transitively reduced overlaps graph. This structure is then stored together with the sequence reads in a binary file suffixed with .ovl.
Assembling mode
The .ovl file is provided to the program, as well as some assembly parameters. A set of contigs in FASTA format is output.

The purpose of having a two step process is that the .ovl file is computed only once and can then be used to produce assemblies with different parameters.

References

De novo bacterial genome sequencing: millions of very short reads assembled on a desktop computer.
D. Hernandez, P. François, L. Farinelli, M. Østerås, J. Schrenzel.
Genome Research. 18:802-809, 2008.