Tuesday 25 June 2013

PubMed Highlight: Benchmarking of short sequence aligners for NGS

The introduction of NGS technology has posed the problem of fast and accurate mapping of the millions of short sequences produced by every single experiment. To address this challenge, a number of alignment tools have been developed and updated, each one optimized for a specific type of input or a specific kind of alignment problem (gapped, ungapped and so on).
Even though every aligner has its strengths and pitfalls, a detailed comparison of the overall performance of the various tools is always useful when it comes to choosing the best one for your own analysis. This recent paper published in BMC Bioinformatics reports a benchmark of the most popular aligners: Bowtie, Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST. Moreover, the authors have developed a benchmarking suite that can be used to assess the performance of any other aligner of interest.

If you need a compass to find your way in the world of aligners, don't forget to also visit the HTS mapper page at EBI, which is really useful and summarizes the main features of each piece of software.

Benchmarking short sequence mapping tools
Ayat Hatem, Doruk Bozdağ, Amanda E Toland and Ümit V Çatalyürek
BMC Bioinformatics 2013, 14:184. Published: 7 June 2013

Abstract
Background
The development of next-generation sequencing instruments has led to the generation of millions of short sequences in a single run. The process of aligning these reads to a reference genome is time consuming and demands the development of fast and accurate alignment tools. However, the current proposed tools make different compromises between the accuracy and the speed of mapping. Moreover, many important aspects are overlooked while comparing the performance of a newly developed tool to the state of the art. Therefore, there is a need for an objective evaluation method that covers all the aspects. In this work, we introduce a benchmarking suite to extensively analyze sequencing tools with respect to various aspects and provide an objective comparison.
Results
We applied our benchmarking tests on 9 well known mapping tools, namely, Bowtie, Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST (mrFAST) using synthetic data and real RNA-Seq data. MAQ and RMAP are based on building hash tables for the reads, whereas the remaining tools are based on indexing the reference genome. The benchmarking tests reveal the strengths and weaknesses of each tool. The results show that no single tool outperforms all others in all metrics. However, Bowtie maintained the best throughput for most of the tests while BWA performed better for longer read lengths. The benchmarking tests are not restricted to the mentioned tools and can be further applied to others.
Conclusion
The mapping process is still a hard problem that is affected by many factors. In this work, we provided a benchmarking suite that reveals and evaluates the different factors affecting the mapping process. Still, there is no tool that outperforms all of the others in all the tests. Therefore, the end user should clearly specify his needs in order to choose the tool that provides the best results.
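
To make the distinction mentioned in the abstract a bit more concrete (hashing the reads versus indexing the reference genome), here is a toy sketch in Python. It is only an illustration under strong assumptions: exact matches only, the first k bases of each read used as a seed, and reads at least k bases long. The function names are hypothetical, and none of the tools listed above works exactly like this (real aligners handle mismatches and gaps and use far more compact index structures).

```python
from collections import defaultdict


def index_reference(reference, k):
    """Map every k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_with_reference_index(reference, reads, k):
    """Strategy 1 (toy): index the reference once, then look up each read."""
    index = index_reference(reference, k)
    hits = {}
    for read in reads:
        seed = read[:k]  # assumption: the first k bases act as the seed
        # Verify the full read at every candidate position (exact match only).
        hits[read] = [p for p in index.get(seed, [])
                      if reference[p:p + len(read)] == read]
    return hits


def map_with_read_hash(reference, reads, k):
    """Strategy 2 (toy): hash the reads by seed, then scan the reference once."""
    by_seed = defaultdict(list)
    for read in dict.fromkeys(reads):  # drop duplicate reads, keep order
        by_seed[read[:k]].append(read)
    hits = {read: [] for read in reads}
    for pos in range(len(reference) - k + 1):
        for read in by_seed.get(reference[pos:pos + k], []):
            if reference[pos:pos + len(read)] == read:
                hits[read].append(pos)
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTTGCA"
    reads = ["ACGT", "GTTG", "TTTT"]
    print(map_with_reference_index(reference, reads, k=3))
    print(map_with_read_hash(reference, reads, k=3))
```

Both functions report the same positions; what changes is which side is preprocessed. Indexing the reference pays off when the same genome is reused across many read sets, while hashing the reads means the table has to be rebuilt for every new experiment.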