Saturday, 25 January 2014

Illumina presents two new sequencing platforms: population-scale genomics and the $1,000 genome

A few days ago Illumina announced two new NGS platforms: a huge factory-scale sequencer called HiSeq X Ten and a new benchtop sequencer called NextSeq 500, halfway between a MiSeq and a HiSeq.

Both platforms represent a huge advance in data production, made possible by several technical innovations and a new chemistry. First of all, Illumina worked hard to increase the accuracy and speed of image acquisition, using a larger number (up to 6) of new LED cameras for image capture and a new flow cell design with larger random clusters, which lets the instrument work with lower-resolution optics, as well as a new surface chemistry to enhance the signal.
The HiSeq X Ten will also integrate a dual-direction image scan system, doubling the scan speed, and a new flow cell containing nanowells that allow for precise cluster separation, resulting in denser clustering.

Both instruments will run with the new 2-color chemistry. This method uses only 2 different fluorescent molecules, red and green: T and C bases are marked with a green or red signal, respectively; A is marked with both signals and G lacks any marker. Thus, only 2 image acquisitions, one per color channel, are needed every cycle instead of the classic 4, cutting down the processing time. The chemistry is well explained on the CoreGenomics blog and in the Illumina tech sheet.
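To make the 2-color encoding concrete, here is a minimal sketch of how a base could be called from the two channels. The intensity scale (0 to 1), the fixed threshold and the function names are my own assumptions for illustration; the real base caller is certainly more sophisticated than this.

```python
# (green signal present, red signal present) -> base, as described above:
# T = green only, C = red only, A = both colors, G = no signal.
TWO_COLOR_CODE = {
    (True, False): "T",
    (False, True): "C",
    (True, True): "A",
    (False, False): "G",
}


def call_base(green, red, threshold=0.5):
    """Call one base from two channel intensities (hypothetical 0-1 scale)."""
    return TWO_COLOR_CODE[(green >= threshold, red >= threshold)]


def call_read(intensities, threshold=0.5):
    """Call a read from (green, red) intensity pairs, one pair per cycle."""
    return "".join(call_base(g, r, threshold) for g, r in intensities)


if __name__ == "__main__":
    cycles = [(0.9, 0.1), (0.1, 0.8), (0.95, 0.9), (0.05, 0.02)]
    print(call_read(cycles))  # TCAG
```

Note that a G is inferred from the absence of signal in both images, so in this scheme a cluster that gives no signal at all would also read as G.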
The NextSeq 500 will come with two different flow cells and two different run modes, resembling the fast run mode of the HiSeq 2500.
The mid-output flow cell includes 130 million clusters and will support a 2x75 base kit that will generate 16-19 gigabases of data per 15-hour run, or a 2x150 base kit that will generate 32-39 gigabases of data in a 26-hour run.
The high-output flow cell includes 400 million clusters and will support a 2x150 base kit that will generate 100-120 gigabases of data per 29-hour run, a 2x75 base kit that will generate 50-60 gigabases per 18-hour run, and a 1x75 base kit that will generate 25-30 gigabases per 11-hour run.
These numbers will allow whole-genome sequencing (WGS) at about 30X in little more than a day, making the NextSeq 500 the first benchtop sequencer able to sequence a whole human genome.
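The yield figures above follow from simple arithmetic: clusters times read length times number of reads per cluster. Here is a rough sketch that reproduces the upper bounds and the resulting coverage. The 3.2 gigabase human genome size is my own approximation, and the calculation assumes every cluster yields a usable read, which is probably why the quoted ranges sit at or below these values.

```python
HUMAN_GENOME_GB = 3.2  # approximate human genome size in gigabases (assumption)


def run_yield_gb(clusters, read_length, paired=True):
    """Theoretical maximum yield of a run, in gigabases."""
    reads_per_cluster = 2 if paired else 1
    return clusters * read_length * reads_per_cluster / 1e9


def coverage(yield_gb, genome_gb=HUMAN_GENOME_GB):
    """Average depth of coverage for a given yield."""
    return yield_gb / genome_gb


if __name__ == "__main__":
    high_output = run_yield_gb(400_000_000, 150)  # high-output flow cell, 2x150
    mid_output = run_yield_gb(130_000_000, 150)   # mid-output flow cell, 2x150
    print(f"high output 2x150: {high_output:.0f} Gb, {coverage(high_output):.1f}X")
    print(f"mid output 2x150: {mid_output:.0f} Gb, {coverage(mid_output):.1f}X")
```

With the 100-120 gigabases quoted for the high-output 2x150 run, this gives roughly 31X to 37X on a human genome, in line with the "about 30X" figure.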

The HiSeq X Ten is a huge sequencer, actually composed of 10 individual sequencing units that will cost you a total of $10M. Illumina will only accept a minimum order of ten units, with each additional unit costing $1M. One unit will be able to generate 600 gigabases of data in one day, enough to sequence five human genomes, or 1.8 terabases of data in under three days, so that the total data production will be 18 terabases every 3 days, allowing the sequencing of 18,000 genomes every year!
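The 18,000 genomes per year figure can be checked with a back-of-the-envelope calculation. In this sketch the 120 gigabases per genome comes from the "600 gigabases for five genomes" statement above; the `uptime` parameter and the function name are my own additions, not something Illumina specified.

```python
def genomes_per_year(units=10, tb_per_run=1.8, days_per_run=3.0,
                     gb_per_genome=120.0, uptime=1.0):
    """Estimate yearly genome output of a multi-unit system.

    uptime is the fraction of the year the units are actually running
    (1.0 means non-stop, back-to-back runs).
    """
    runs_per_year = 365.0 / days_per_run * uptime
    genomes_per_run = tb_per_run * 1000.0 / gb_per_genome
    return units * genomes_per_run * runs_per_year


if __name__ == "__main__":
    print(round(genomes_per_year()))            # non-stop operation
    print(round(genomes_per_year(uptime=0.5)))  # machines idle half of the time
```

Running non-stop gives a bit more than 18,000 genomes per year, so the headline number leaves essentially no room for downtime.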
Illumina claims that this juggernaut will meet the needs of population-scale sequencing programs, often national health programs, such as the UK initiative to sequence 100,000 individuals or the Danish project to sequence the entire population of an isolated island.
The HiSeq X Ten will enable the "first real $1,000 genome," said Flatley, CEO of Illumina. A reagent kit supporting 16 genomes per run will cost $12,700, or about $800 per genome for reagents. Hardware will add another $137 per genome, while sample prep will range between $55 and $65 per genome.
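Adding up the three components shows how the $1,000 figure is reached. This is just the arithmetic on the numbers quoted above; the function and parameter names are mine.

```python
def cost_per_genome(kit_price=12700.0, genomes_per_kit=16,
                    hardware=137.0, sample_prep=(55.0, 65.0)):
    """Return (reagents, low total, high total) in dollars per genome."""
    reagents = kit_price / genomes_per_kit
    low = reagents + hardware + sample_prep[0]
    high = reagents + hardware + sample_prep[1]
    return reagents, low, high


if __name__ == "__main__":
    reagents, low, high = cost_per_genome()
    print(f"reagents: ${reagents:.2f} per genome")
    print(f"total: ${low:.2f} to ${high:.2f} per genome")
```

The kit price works out to $793.75 per genome, so the total lands between about $986 and $996; using the rounded $800 for reagents gives $992 to $1,002.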
However, the new machine will sequence ONLY whole human genomes; no other applications are supported for now. Also, given the hard work needed to produce and set up such a huge instrument, Illumina will deliver only 5 of them in the first year.

Despite the $10M price, Illumina has already sold four HiSeq X Ten systems: to the South Korean sequencing service provider Macrogen, the Garvan Institute in Australia, the New York Genome Center, and the Broad Institute, which purchased a 14-unit system.

Detailed information and interesting discussions about the two new platforms and their technical innovations can be found around the web: CoreGenomics (presentation, HiSeq X Ten, NextSeq 500), MassGenomics, Omics!Omics!, Opniomics, GenomeWeb, Nature News.

So has the mythical goal of the $1,000 genome finally been achieved? Well, almost, it seems...
First of all, one has to consider the initial investment and the overhead costs of running the 10 machines. Moreover, Illumina's cost estimates are based on 4 years of full activity of the HiSeq X Ten, which means 18,000 genomes per year for 4 years with the machines running 24 hours a day. This scenario seems unlikely to many experts, since we simply don't have that many samples to sequence.
Finally, as usual, data analysis costs beyond simple sequence alignment and perhaps SNP calling are not included. For a more detailed evaluation of the real costs, read the interesting post on the allseq blog.
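To see how much the hardware share depends on keeping the machines busy, here is a small sketch that spreads the $10M purchase price over the genomes actually sequenced. The `utilization` parameter is my own addition, and the exact inputs behind Illumina's $137 figure are not given in the announcement, so take this as an approximation only.

```python
def hardware_cost_per_genome(system_price=10_000_000, genomes_per_year=18_000,
                             years=4, utilization=1.0):
    """Purchase price divided by the number of genomes sequenced.

    utilization is the fraction of the full-capacity output that is
    actually achieved over the amortization period.
    """
    total_genomes = genomes_per_year * years * utilization
    return system_price / total_genomes


if __name__ == "__main__":
    for utilization in (1.0, 0.5, 0.25):
        cost = hardware_cost_per_genome(utilization=utilization)
        print(f"utilization {utilization:.0%}: ${cost:.2f} per genome")
```

At full capacity this gives about $139 per genome, close to the quoted $137. At half capacity the hardware share alone roughly doubles to about $278, which is already enough to push the total above $1,000.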

Thursday, 2 January 2014

Human Genome Variation Journal announced

The Nature group has just announced a new open access journal focused on studies and discoveries about variation in the human genome and its relation to human phenotypes, with particular interest in disease-related studies. "The journal was born from a demand by the community for a place to publish important discoveries, observations and analysis about research on the human genome," Nature states on the home page of the new journal.

The topic is quite interesting, but the most intriguing aspect of this newborn publication is a new kind of article named "Data Reports". Under this category the journal will publish "standardized reports about genomic variation and variability, especially in relation to disease". Although peer reviewed, Data Reports will follow a short editorial procedure to allow for rapid publication. Moreover, the journal will create an open access database for querying these data, providing a powerful new tool to rapidly find associations between genomic variation and particular health traits.
As defined in the journal guidelines: "Data Reports are short reports about human genome variation and variability, which describe disease-causing variation and/or their frequencies. In addition, Data Reports can describe, document and analyze human multifactorial disease-associated variations and their frequencies."

The journal will start considering submissions in March 2014.