Monday, 14 July 2014
New challenges in NGS
About a decade after the first appearance of NGS, we have seen incredible improvements in throughput, accuracy and analysis methods, and sequencing is now more widespread and within reach of small labs as well. Researchers have produced tons of sequencing data, and the new technology has allowed us to investigate DNA and human genomic variation at unprecedented scale and precision.
However, besides the milestones achieved, we now have to deal with new challenges that were largely underestimated in the early days of NGS.
MassGenomics has a nice blog post underlining the main ones, which I report here:
Where do we put all the data from large genomic sequencing projects? Can we afford the cost of storing everything, or do we have to be more selective about what we keep on our hard drives?
GWAS studies have shown us that large numbers, on the order of 10,000 samples, are needed to achieve statistical significance in association studies, particularly for common diseases. Even at the present low price of $1,000 per genome, such a sequencing project would require around $10 million. So we can either reduce our sample size (and thus significance) or create mega-consortia, with all the management issues they bring.
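The arithmetic behind that estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function names are made up for this post, the $2 million budget is a hypothetical figure, and the calculation covers sequencing alone, ignoring storage, analysis and sample collection.

```python
def project_cost(n_samples: int, price_per_genome: float) -> float:
    """Total sequencing cost for a cohort (sequencing only)."""
    return n_samples * price_per_genome


def affordable_samples(budget: float, price_per_genome: float) -> int:
    """How many whole genomes a given budget can pay for."""
    return int(budget // price_per_genome)


samples = 10_000   # cohort size on the order suggested by GWAS
price = 1_000      # dollars per genome, as quoted in the text

total = project_cost(samples, price)
print(f"{samples:,} genomes at ${price:,} each: ${total:,.0f}")

# Hypothetical budget, chosen only to illustrate the trade-off
budget = 2_000_000
print(f"${budget:,} pays for {affordable_samples(budget, price):,} genomes")
```

With a smaller budget the cohort shrinks proportionally, which is exactly the trade-off between sample size and cost described above.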
Samples have become precious resources.
In the present scenario, sequencing power is no longer a limitation. The real issue is finding enough well-characterized samples to sequence!
Whole-genome and whole-exome approaches let researchers rapidly identify new variants potentially related to phenotypes. But which of them are truly relevant? Our present knowledge does not allow a confident prediction of the functional impact of genetic variation, so functional studies are often needed to assess the actual role of each variant. These studies, based on cellular or animal models, can be expensive and complicated.
With a large and increasing amount of genomic data available to the community, and studies showing that people's ancestry and place of residence can be traced from it (at least in a proportion of cases), there are concerns about how "anonymous" this kind of data can really be. This is going to become a real problem as more and more genomes are sequenced.