- Mapping: aligning the reads and generating the BAM files
- Sorting: sorting the BAM files by chromosome and cleaning them up (removal of PCR duplicates, base recalibration and sequence realignment)
- Reduction: detecting the variants (SNPs, indels and SVs) and eventually annotating them.
The entire process is highly optimized and built to run in parallel, which reduces the analysis time. The various steps rely on robust and widely used tools such as BWA, Picard, SAMtools, VCFtools, GATK and ANNOVAR. Frustratingly, these tools often expect non-standard input files and produce non-standard output as well... But luckily you now have HugeSeq, which can make your life easier!
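To make the three phases and the per-chromosome parallelism a bit more concrete, here is a toy sketch in Python. It is not HugeSeq's code and it calls none of the real tools: the reads, the reference and all function names are made up for illustration. Grouping reads by chromosome stands in for mapping, sorting plus removal of identical reads stands in for the cleanup, and a naive mismatch scan stands in for variant detection.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: a tiny "reference" and reads given as
# (chromosome, start position, sequence). Real pipelines work on FASTQ/BAM files.
REFERENCE = {"chr1": "ACGTACGT", "chr2": "TTGACCAA"}
READS = [
    ("chr2", 2, "GACC"),
    ("chr1", 4, "ACGA"),
    ("chr1", 0, "ACGT"),
    ("chr1", 4, "ACGA"),  # exact duplicate of an earlier read
    ("chr2", 0, "TCGA"),
]


def map_phase(reads):
    """Mapping stand-in: split the reads into one bucket per chromosome."""
    by_chrom = defaultdict(list)
    for read in reads:
        by_chrom[read[0]].append(read)
    return dict(by_chrom)


def sort_phase(chrom_reads):
    """Sorting stand-in: drop identical reads, then order by start position."""
    return sorted(set(chrom_reads), key=lambda r: (r[1], r[2]))


def reduce_phase(chrom_reads):
    """Reduction stand-in: report every mismatch against the toy reference."""
    variants = set()
    for chrom, start, seq in chrom_reads:
        for offset, base in enumerate(seq):
            ref_base = REFERENCE[chrom][start + offset]
            if base != ref_base:
                variants.add((start + offset, ref_base, base))
    return sorted(variants)


def process_chromosome(chrom_reads):
    """Run the sorting and reduction stand-ins for a single chromosome."""
    return reduce_phase(sort_phase(chrom_reads))


def run_pipeline(reads):
    """Map once, then process each chromosome independently in parallel."""
    buckets = map_phase(reads)
    chroms = sorted(buckets)
    with ThreadPoolExecutor() as pool:
        results = pool.map(process_chromosome, (buckets[c] for c in chroms))
    return dict(zip(chroms, results))


if __name__ == "__main__":
    for chrom, variants in run_pipeline(READS).items():
        print(chrom, variants)
```

The point of the sketch is only the shape of the workflow: once the reads are split by chromosome, each chromosome can be cleaned and analyzed on its own, which is what makes running the steps in parallel possible.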
For details, see the related article in Nature Biotechnology.
1 comment:
We are looking at installing the HugeSeq pipeline. Are you currently using it? If so, has it worked well for you?