Sunday, 30 December 2012

Rita Levi Montalcini: "I'm the mind"

Remembering a life spent in the name of science and research. A brilliant mind that endured through old age, showing us that science means marvelling at common things: a lifelong desire to pose questions and search for answers. As Rita Levi-Montalcini said in this interview on her 101st birthday: "I'm the mind, let the body act as it will".

Rita Levi-Montalcini, who conducted underground research in defiance of Fascist persecution, died today at her home in Rome at the age of 103. She was an Italian neurologist who lived through anti-Semitic discrimination and Nazi invasion, becoming one of the country's leading scientists.
Together with her colleague Stanley Cohen, she received the 1986 Nobel Prize in Physiology or Medicine for the discovery of nerve growth factor (NGF). From 2001, she also served in the Italian Senate as a Senator for Life. Rita Levi-Montalcini was the oldest living Nobel laureate and the first ever to reach a 100th birthday. Her research increased our understanding of many conditions, including tumors, developmental malformations, and senile dementia.

Friday, 28 December 2012

Pandas, Bats, Goats, Cats... What a genomic zoo!

At the end of 2012 some interesting papers came out, providing new insights into some well-known animal species through whole-genome sequencing.

First of all, I report a curious genetic test offered by the feline genetics lab at the University of California, Davis. Provide them with a swab from your beloved cat together with $120 and they will trace back the geographic and breed ancestry of your pet. However, if your cat is a true genetic mix, they may fail to identify a breed makeup, since most mixed cats descend from populations of so-called "random bred cats". On the other hand, if you do get a match, the UC Davis lab currently estimates that the test's breed probability matches are over 90% accurate.

Next, I'd like to introduce a paper published in Science by Zhang et al. providing de novo assemblies of two bat species. The authors assembled 100x draft sequences and performed comparative analyses of two distantly related bat species, the fruit bat Pteropus alecto and the insectivorous Myotis davidii. Comparison of the bat genomes with those of other mammalian species has provided new insights into bat biology and evolution. Indeed, bats are the only known mammals capable of sustained flight, and they also present other peculiar biological features such as echolocation, hibernation and the ability to host several highly infectious pathogens (such as Ebola and SARS viruses).
The authors found signs of positive selection on several DNA damage checkpoint genes that might have contributed to the development of flying ability. Moreover, their results suggest that some of the genetic adaptations that support flight, such as those involved in dealing with reactive oxygen species, could also have had secondary effects on bat immunity.
Indeed, positive selection for DNA damage response genes may also have helped bats avoid some of the negative effects of viruses, as did selection for other components of innate and adaptive immune pathways, including genes that interact with the NF-kappa-B transcription factor family.
Both new bat genomes are missing some of the genes responsible for sensing and responding to microbial pathogens in other mammals, supporting the idea that certain features of bat immunity are distinct from those found in other animals.

Another paper, published in Nature Genetics by Zhao et al., reports low-coverage genome sequencing of 34 giant pandas from different sites in central China, providing new information on both the history and current state of panda populations and revealing three geographically related genetic clusters within existing wild panda communities in China. Using the Illumina HiSeq 2000, the team sequenced each of the giant panda genomes to an average 4.7-fold depth across 91.5 percent of the 2.25-billion-base panda genome.
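To get a feeling for what 4.7-fold depth means, here is a rough sanity check. It uses the idealized Lander-Waterman (Poisson) model, in which reads fall uniformly at random on the genome; that model is my own assumption and not something taken from the paper, and the function names are hypothetical.

```python
import math


def expected_breadth(mean_depth: float) -> float:
    """Fraction of bases expected to be hit by at least one read,
    assuming read placement follows a Poisson process (idealized)."""
    return 1.0 - math.exp(-mean_depth)


def total_bases_gb(mean_depth: float, covered_fraction: float, genome_gb: float) -> float:
    """Approximate yield (gigabases) needed to reach mean_depth over
    covered_fraction of a genome that is genome_gb gigabases long."""
    return mean_depth * covered_fraction * genome_gb


if __name__ == "__main__":
    # Figures quoted in the post: 4.7x depth, 91.5% of a 2.25 Gb genome.
    print(f"ideal breadth at 4.7x: {expected_breadth(4.7):.3f}")
    print(f"approx. yield per panda: {total_bases_gb(4.7, 0.915, 2.25):.2f} Gb")
```

Under this ideal model almost every base would be covered at 4.7x, so a lower real-world figure is not surprising: repeats, mappability and sequencing biases usually keep part of a genome out of reach.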
Results suggest that humans may have contributed to some of the more recent divergence events. For example, the MIN-QXL panda split seems to coincide with the establishment of ancient human Shu populations on a river that separates the panda populations. Human activities may have widened this divide by cutting down the trees that make up the animals' forest habitat. The new NGS data also give researchers an opportunity to look further back at population expansions and bottlenecks, as well as genetic divergence and adaptation events.
The genetic diversity observed within the three existing panda populations appears to be quite high, suggesting that it will be important, from a conservation standpoint, to consider pandas' genetic adaptations to specific habitats.

A paper by Dong et al. published in Nature Biotechnology applies next-generation sequencing (NGS) and automated whole-genome optical mapping to reconstruct the genome of the domestic goat (Capra hircus). Until now, the goat has lacked a reference genome, making breeding and genetic studies of this ruminant difficult and limiting our ability to select for productive QTLs in this species.
This article is the first example of a large genome assembled de novo using whole-genome mapping technology, without the aid of traditional genetic maps. The researchers first performed NGS (using an Illumina platform), generating 191.5 gigabases at about 65-fold coverage. The reads were assembled into 542,145 contigs and 285,383 scaffolds longer than 100 bp. By sequencing fosmid ends, the researchers were able to further increase the size of their scaffolds, generating a goat genome assembly of about 2.66 gigabases, close to the estimated goat genome size of about 2.92 gigabases. The next step involved whole-genome mapping to develop even longer scaffolds that were closer to the size of chromosomes. The authors used OpGen's Argus system to develop a single-molecule restriction map using genomic DNA from the same goat. Then, using the company's GenomeBuilder hybrid assembly software to bring together the short-read scaffolds with the single-molecule restriction maps, the researchers succeeded in joining 2,090 scaffolds into 315 super-scaffolds. Finally, with cattle genome assemblies as a guide, the researchers anchored the super-scaffolds to presumptive goat chromosomes, obtaining 2.52 gigabases mapped to 30 pseudo-chromosomes.
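The numbers quoted above hang together nicely, and a few lines of arithmetic show it. This is only a back-of-the-envelope sketch using the figures reported in the post; the constant and function names are my own.

```python
# Figures as quoted in the post (gigabases unless noted).
RAW_YIELD_GB = 191.5
FOLD_COVERAGE = 65.0
ASSEMBLY_GB = 2.66
ESTIMATED_GENOME_GB = 2.92
ANCHORED_GB = 2.52


def implied_genome_size(yield_gb: float, coverage: float) -> float:
    """Genome size implied by total yield divided by fold coverage."""
    return yield_gb / coverage


def fraction(part: float, whole: float) -> float:
    """Simple ratio, used for 'how much of X is covered by Y'."""
    return part / whole


if __name__ == "__main__":
    size = implied_genome_size(RAW_YIELD_GB, FOLD_COVERAGE)
    print(f"genome size implied by yield/coverage: {size:.2f} Gb")
    print(f"assembly vs estimated genome: {fraction(ASSEMBLY_GB, ESTIMATED_GENOME_GB):.1%}")
    print(f"assembly anchored to pseudo-chromosomes: {fraction(ANCHORED_GB, ASSEMBLY_GB):.1%}")
```

The yield divided by the coverage lands very close to the estimated genome size, the assembly covers roughly nine tenths of that estimate, and most of the assembly ends up anchored to pseudo-chromosomes.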
One of the main reasons for raising goats is the production of wool and cashmere. To get more information on how cashmere fibers are generated, the authors sequenced the transcriptomes of the primary and secondary hair follicles of an Inner Mongolia cashmere-producing goat and then mapped those reads to the goat genome they had produced. The study identified 51 genes showing at least two-fold changes in expression between the two hair follicle types, many of which were keratin genes.
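For readers who have never applied a "two-fold change" filter, here is a minimal sketch of the idea. It is not the authors' pipeline: the pseudocount, the gene names and the expression values are all invented for illustration.

```python
import math
from typing import Dict


def log2_fold_change(expr_a: float, expr_b: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of two expression values; the pseudocount (an assumption
    of this sketch) avoids division by zero for unexpressed genes."""
    return math.log2((expr_a + pseudocount) / (expr_b + pseudocount))


def differential_genes(
    primary: Dict[str, float],
    secondary: Dict[str, float],
    min_fold: float = 2.0,
) -> Dict[str, float]:
    """Return genes whose expression differs by at least min_fold in
    either direction, mapped to their log2 fold change."""
    threshold = math.log2(min_fold)
    hits = {}
    for gene in primary.keys() & secondary.keys():
        lfc = log2_fold_change(primary[gene], secondary[gene])
        if abs(lfc) >= threshold:
            hits[gene] = lfc
    return hits


if __name__ == "__main__":
    # Hypothetical expression values for three made-up genes.
    primary = {"GENE_A": 99.0, "GENE_B": 10.0, "GENE_C": 7.0}
    secondary = {"GENE_A": 24.0, "GENE_B": 12.0, "GENE_C": 31.0}
    for gene, lfc in sorted(differential_genes(primary, secondary).items()):
        print(gene, round(lfc, 2))
```

Working on the log2 scale makes up- and down-regulation symmetric: a two-fold increase is +1 and a two-fold decrease is -1, so one absolute threshold catches both.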

Thursday, 27 December 2012

2012: Big achievements and promising young scientists for the future

We are close to the end of 2012 and it's time to sum up the successes and achievements of the year.
A good starting point is this summary from GenomeWeb, which underlines how 2012 has been "a good year for genomics": ENCODE project milestones, several large sequencing projects started, papers from 1000G and ESP shedding new light on human genetic variability, new genomic techniques for non-invasive prenatal screening, the rapid spread of genetic tests and new developments in personalized medicine and cancer treatments.

Forbes has published its usual "30 under 30" list of young scientists who have made a critical impact in their field. In science and health there are several inspiring stories: UCLA's Christina Agapakis, who works on synthetic biology; Immumetrix's Christina Fan, who focuses on prenatal genetic testing; Caltech's Mitchell Guttman, who helped uncover lincRNAs; Isaac Kinde at Johns Hopkins School of Medicine, who is working to improve the accuracy of DNA sequencing technology; and Adina Mangubat at Spiral Genetics, who is offering a cloud-based solution to manage DNA sequencing data.
These stories fascinate me every year, particularly now that I'm approaching my thirties. They represent a powerful source of inspiration, something that should push everyone to work harder and think outside the box!

Genome Technology also points to the future with its Seventh Annual Young Investigators list. They compiled a list of successful young researchers who are no more than five years into their first faculty appointment (and a few are still completing their training). Their contributions to genomics cover a wide range of topics, such as: sequencing data sharing technology; single codon alternative splicing; plant epigenetics; genetics of sleep; clinical translation of cancer omics data; use of gene expression data, together with cellular and mouse models, to study complex psychiatric disease; protocols for personalized treatments; drug development; genomic epidemiology for disease tracking; genetic mechanisms driving differences in gene expression; and single cell proteomics.

Nature has also published its list of the 10 people who mattered this year.
I will report a few examples here:
- Jun Wang, the head of the BGI sequencing powerhouse. The incredible productivity of BGI has turned huge projects on the variability of humans and other life forms from imagination into reality. The Center is taking a leading role in sequencing 10,000 vertebrates through the Genome 10K project; 5,000 insects and other arthropods through the i5k initiative; and more than 1,000 birds, including some extinct ones, in a separate project. Moreover, BGI has launched the Million Project, aimed at expanding the number of sequenced samples to 1M for humans, animals, plants and bacteria. This would increase our knowledge about life on earth, how it works and how it has developed.
- Cedric Blanpain, who has used single cell sequencing to track the evolution of single tumor cells, discovering that they descend clonally from different progenitor cells (some kind of tumor stem cells) in the tumor mass. This also implies that a tumor mass is composed of different cell populations with peculiar properties and different responses to therapies.
- Elizabeth Iorns, who has focused her attention on the reproducibility of results published in scientific papers. With the number of frauds and retractions increasing in the last 3 years, partly as a consequence of funding cuts and the resulting higher pressure to publish, the transparency of papers and new methods of quality checking are topics to care about. With this in mind Iorns founded the Reproducibility Initiative, based in Palo Alto, California, which allows authors to submit their papers for replication.
- Bernardo de Bernardinis, who in 2009 was deputy head of the Italian Department of Civil Protection and was involved in the recent trial over inadequate communication and assessment of seismic risk before the earthquake disaster in L'Aquila. The trial has assumed international relevance mainly because it suggests direct responsibility on the part of the scientific committee called to evaluate the actual seismic risk for the city.

In the USA it's time to award the National Medal of Science and the National Medal of Technology and Innovation. The recipients will receive their awards at a White House ceremony in early 2013.
“I am proud to honor these inspiring American innovators,” President Obama said. “They represent the ingenuity and imagination that has long made this Nation great—and they remind us of the enormous impact a few good ideas can have when these creative qualities are unleashed in an entrepreneurial environment.”.
Here is the list for the National Medal of Science:

Dr. Allen Bard, University of Texas at Austin, TX. Investigations in electro-organic chemistry, photoelectrochemistry, electrogenerated chemiluminescence, and electroanalytical chemistry.
Dr. Sallie Chisholm, Massachusetts Institute of Technology, MA. Marine ecology.
Dr. Sidney Drell, Stanford University, CA.
Dr. Sandra Faber, University of California, Santa Cruz, CA. Formation and evolution of galaxies and the evolution of structure in the universe.
Dr. Sylvester James Gates, University of Maryland, MD. Supersymmetry theory.
Dr. Solomon Golomb, University of Southern California, CA.
Dr. John Goodenough, University of Texas at Austin, TX. Relationships between the chemistry, structure and electrical properties of solids in order to design new or improved technical materials.
Dr. M. Frederick Hawthorne, University of Missouri, MO. International Institute of Nano and Molecular Medicine.
Dr. Leroy Hood, Institute for Systems Biology, WA. Integrating biology, technology and computation to create a predictive, personalized, preventive and participatory approach to medicine.
Dr. Barry Mazur, Harvard University, MA. Number theory, Automorphic forms, and related issues in algebraic geometry.
Dr. Lucy Shapiro, Stanford University School of Medicine, CA. Mechanisms used to generate the three-dimensional organization of a cell from a one-dimensional genetic code; define these mechanisms using both molecular genetics and biochemistry.
Dr. Anne Treisman, Princeton University, NJ. Visual attention, object perception and memory. Explore the nature of the limits to human perception, the information-processing that results in the perception of objects and events.

PubMed Highlights: an improved strategy for Whole-Genome Sequencing of Single Cells

MALBAC single-cell whole-genome amplification, Science 21-12-2012: vol. 338 no. 6114 1622-1626.

Scientists at Harvard University have developed a new technique they call MALBAC (multiple annealing and looping-based amplification cycles) that yields better sequence coverage than has previously been available for single-cell genome sequencing. Sequencing MALBAC-amplified DNA achieves 93% genome coverage ≥1x for a single human cell at 25x mean sequencing depth, as reported in the latest issue of Science.
In MALBAC, genomic DNA is copied to form looped products that can’t serve as templates, so in each cycle only the genomic DNA can be copied. The amount of DNA increases linearly rather than exponentially as it would in PCR or multiple displacement amplification. After five MALBAC cycles, the DNA loops are collected and used as templates for further amplification by PCR. Apparently, the linear amplification step reduces the sequencing bias. “Most of the amplification bias is generated in the first few cycles of PCR,” one of the authors of the study says. “By doing linear amplification first we avoid this strong bias. That makes it very even across the genome.”
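Why linear amplification should reduce bias is easier to see with a toy calculation. The sketch below compares two loci that are copied with slightly different per-cycle efficiencies; the model and the efficiency values (0.9 and 0.8) are my own simplification, not figures from the paper.

```python
from typing import Callable


def exponential_copies(efficiency: float, cycles: int) -> float:
    """PCR-like growth: every existing molecule is a template,
    so the copy number compounds at each cycle."""
    return (1.0 + efficiency) ** cycles


def linear_copies(efficiency: float, cycles: int) -> float:
    """MALBAC-like pre-amplification: only the original template is
    copied at each cycle, so copies are added rather than multiplied."""
    return 1.0 + efficiency * cycles


def bias_ratio(
    copies: Callable[[float, int], float],
    eff_a: float,
    eff_b: float,
    cycles: int,
) -> float:
    """Over-representation of locus A relative to locus B after amplification."""
    return copies(eff_a, cycles) / copies(eff_b, cycles)


if __name__ == "__main__":
    cycles = 5  # number of MALBAC cycles quoted in the post
    print("linear bias:     ", round(bias_ratio(linear_copies, 0.9, 0.8, cycles), 3))
    print("exponential bias:", round(bias_ratio(exponential_copies, 0.9, 0.8, cycles), 3))
```

In this toy model the same small difference in efficiency produces a noticeably larger imbalance under exponential growth than under linear growth, and the gap widens with every extra cycle, which matches the intuition in the quote above.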

MALBAC was successfully used in these two impressive studies:

Science. 2012 Dec 21;338(6114):1622-6. doi: 10.1126/science.1229164.

Genome-wide detection of single-nucleotide and copy-number variations of a single human cell.


Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA.


Kindred cells can have different genomes because of dynamic changes in DNA. Single-cell sequencing is needed to characterize these genomic differences but has been hindered by whole-genome amplification bias, resulting in low genome coverage. Here, we report on a new amplification method, multiple annealing and looping-based amplification cycles (MALBAC), that offers high uniformity across the genome. Sequencing MALBAC-amplified DNA achieves 93% genome coverage ≥1x for a single human cell at 25x mean sequencing depth. We detected digitized copy-number variations (CNVs) of a single cancer cell. By sequencing three kindred cells, we were able to identify individual single-nucleotide variations (SNVs), with no false positives detected. We directly measured the genome-wide mutation rate of a cancer cell line and found that purine-pyrimidine exchanges occurred unusually frequently among the newly acquired SNVs.

PMID: 23258894

Science. 2012 Dec 21;338(6114):1627-30. doi: 10.1126/science.1229112.

Probing meiotic recombination and aneuploidy of single sperm cells by whole-genome sequencing.


Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA.


Meiotic recombination creates genetic diversity and ensures segregation of homologous chromosomes. Previous population analyses yielded results averaged among individuals and affected by evolutionary pressures. We sequenced 99 sperm from an Asian male by using the newly developed amplification method, multiple annealing and looping-based amplification cycles, to phase the personal genome and map recombination events at high resolution, which are nonuniformly distributed across the genome in the absence of selection pressure. The paucity of recombination near transcription start sites observed in individual sperm indicates that such a phenomenon is intrinsic to the molecular mechanism of meiosis. Interestingly, a decreased crossover frequency combined with an increase of autosomal aneuploidy is observable on a global per-sperm basis.
PMID: 23258895

Flash Report: the University of Connecticut is planning to sequence the genome/exome of Adam Lanza, the Newtown killer.

Geneticists at the University of Connecticut are making plans to study the DNA of Adam Lanza, who killed 20 children and seven adults in Newtown. 
I personally disagree with Dr. Beaudet, Professor at the Baylor College of Medicine and chairman of its Department of Molecular and Human Genetics, who said: “We can’t afford not to do this research”.
A New York Times article provides more details on this controversial effort to discover biological clues to extreme violence.

The 1000 Genomes Project Tutorials at ASHG 2012

The 1000 Genomes Project has released the sequence data and an integrated set of variants, genotypes, and haplotypes for the 1092 samples in the phase 1 set, and the sequence data for the phase 2 set. This tutorial describes the data sets, how to access them, and how to use them. During the ASHG annual meeting the 1000 Genomes Consortium organized an interesting set of tutorials on how to access, use and analyze the data produced.

The topics being covered are:

1. (15 min talk, 3 min questions) Description of the 1000 Genomes data – Mark DePristo [slides]
2. (15 min talk, 3 min questions) How to access the data – Laura Clarke [slides]
3. (15 min talk, 3 min questions) Structural variants -- Ryan Mills [slides]
4. (15 min talk, 3 min questions) Population genetic and admixture analyses – Eimear Kenny [slides]
5. (15 min talk, 3 min questions) Functional analyses – Ekta Khurana [slides]
6. (15 min talk, 3 min questions) How to use the data in disease studies -- Stephan Ripke
7. (12 min) Q&A

A poster was also presented on Wednesday 7th, and a copy is available on the FTP site.

Wednesday, 26 December 2012

UK tries hard on genomics!

At the end of 2012, UK researchers received a really nice Christmas present. George Osborne, the UK finance minister, announced an investment of £600 million ($963 million) to sustain the Medical Research Council as well as facilities for applied research and development (as reported by the Nature News blog).
At the beginning of December, the Prime Minister revealed that a substantial part of this funding (around £100 million, or $160.9 million) will be directed to genomics research, for an ambitious project that aims to perform whole genome sequencing on 100,000 UK individuals and to use their genomic information for studies and treatments of cancer and other diseases. The funds will be used for the sequencing itself as well as for the training of scientific and medical staff, to improve the capabilities of the UK health system in managing genomic data and providing personalized treatments.
The interesting fact about the UK effort is its focus on whole genome sequencing instead of the classical exome sequencing, currently the dominant approach in other large NGS efforts. This choice promises to be especially effective for the discovery of regulatory variants related to cancer and other diseases (see also the post from GenomeWeb).
Indeed, the primary effort of the project will be directed toward cancer and rare diseases, as reported on the official Prime Minister website and by BBC medical news:

The Government has earmarked £100 million:
- to train a new generation of British genetic scientists to lead on the development of new drugs, treatments and cures, building the UK as the world leader in the field. And train the wider healthcare community in harnessing this technology;
- to pump-prime DNA sequencing for cancer and rare inherited diseases;
- to build the NHS data infrastructure to ensure that this new technology leads to better care for patients.

The Prime Minister's Office said that the genome sequencing will be entirely voluntary. Patients will be able to opt out of the sequencing, and the DNA data will be anonymized except when it is used in the context of a patient's individual care. The NHS will explore a number of ways to store the data, and it plans to make patient privacy and confidentiality an important factor in the decision about which platforms and technologies it will use.

To promote the initiative, Prime Minister David Cameron stopped by the genomics core facility at the Cancer Research UK Cambridge Research Institute (see the news here), where he kicked off a sequencing run to demonstrate how easy sequencing has become (to be honest, the cartridge and flowcell had been prepared by the lab, so basically it was all about pushing some buttons...). However, the Prime Minister's visit and the funding announcement made for an exciting event, as reported on the CoreGenomics blog by James Hadfield, who runs the lab.

Friday, 7 December 2012

What's new from the annual conference of Italian Society of Human Genetics

The annual meeting of the Italian Society of Human Genetics (SIGU) concluded a couple of weeks ago in Sorrento, a small, beautiful town in southern Italy.
As usual, the meeting had a really busy schedule with several parallel sessions, so my report is not a complete survey but it's based only on talks I've attended.

The opening session was really inspiring, particularly the speech given by Prof. A. Ballabio from TIGEM on lysosomal function/biogenesis and autophagy. He presented his brilliant work on the genetic network controlled by the TFEB transcription factor, which tunes the entire lysosomal compartment (published about 3 years ago in Science), and the recent findings on autophagy and how TFEB can also control this process. Besides the remarkable results themselves, I found this story interesting for how the whole study started. The starting point was one of the main concepts that emerged from Systems Biology studies: it seems that each of the main metabolic pathways of the cell is under the control of one or a few master genes driving the co-expression of all the machinery needed for that specific function. Ballabio and his collaborators decided to screen the available gene expression databases (such as GEO) to find genes that are co-regulated in various conditions affecting the lysosomal compartment. They isolated a list of lysosome-related genes, analyzed their promoter sequences in silico to find shared transcription factor binding sites, and finally tested which transcription factor binds to them. This led to the identification of TFEB as the master gene controlling the lysosomal compartment. This demonstrates once more how an intelligent use of publicly available data generated by high-throughput studies can lead to new discoveries when the correct question is posed (and of course when brilliant minds are at work on it...). Moreover, Ballabio showed how the expression of TFEB is able to rescue the healthy phenotype in cells and mouse models affected by lysosomal storage disorders. This process is based on increased exocytosis of mature lysosomes, so that even if the catabolic enzymes remain nonfunctional, junk material no longer accumulates in the cells, strongly reducing its cytotoxicity.
Even if he admitted that the question remains on how to control the side effects and how to predict the long-term effects of the materials secreted by the cells, this strategy represents a promising opportunity for future therapies of lysosomal disorders.

At the conference there was a lot of talk about clinical genetics and, interestingly, also about clinical and diagnostic genomics. Genetic diagnosis based on NGS technology is attracting increasing attention, and this trend was clearly reflected in a growing number of talks covering the diagnostic use of NGS, even if doubts remain among clinicians about its real accuracy and reliability.
In this area the sponsored sessions of both Life Tech and Illumina had interesting data to show. Indeed, under pressure from the manufacturers, the speakers also reported technical details on protocols and performance, aspects that usually do not receive a lot of attention. Benchtop sequencers from both sides (essentially Life's Ion PGM and Illumina's MiSeq) have demonstrated that they are almost ready for a diagnostic and clinical debut, each with its own drawbacks, which have been discussed in depth in recent months (see for example 1, 2, 3). Speaking for myself, I see a ready-to-go future for both platforms in the field of targeted re-sequencing of already known mutations or disease genes, with validated disease-targeted panels as the best option. This agrees with the trends that emerged from the Q&A sessions: small gene panels can guarantee higher coverage, and so higher confidence in variant calling, and moreover you get information only on the specific disease gene(s), avoiding more complex counseling and ethical issues. Even if exome sequencing has proved to be an effective approach for syndromes showing extreme genetic heterogeneity, or even better for rare conditions caused by private mutations, this approach still remains on the research side of the line. Technical difficulties related to data interpretation and accuracy, cost-benefit considerations and ethical issues raised by genetic data not strictly related to the pathology have to be resolved before this approach can develop into a "routine" genetic test (however, initiatives in this direction are already being tested, such as Baylor College's exome diagnostic service, which has been running for about one year and has already received more than a hundred requests for exome tests from clinicians. See this interview on GenomeWeb).
Going back to the presentations, it is worth mentioning that Illumina made its move into the clinical market, announcing that its MiSeq has received CLIA certification and so the company is ready to sell a certified version of the machine (and the sequencing kits) from the second half of 2013. We'll see how Life Technologies will respond with its PGM to avoid the risk of being cut out of the market... I think that Ion Torrent's community-based panels (targeted amplicon resequencing kits tailored to a specific pathology or cancer type and developed with the support and collaboration of the research community) should allow them to rapidly develop robust disease-oriented panels, but an official certification is still an essential requirement to compete in the clinical market.

An entire session was dedicated to non-coding RNA. This class of molecules was initially limited to rRNA and tRNA, but since their appearance on the regulatory scene with siRNA and miRNA, the catalogue and relevance of ncRNAs have rapidly increased. Today we know a lot of long non-coding RNAs (lncRNAs), and they have been implicated in basically every aspect of transcriptional regulation in the cell, with a role also in pathological conditions (see for example this review). Among the various talks, the one on facioscapulohumeral dystrophy (FSHD) impressed me the most. The chromosomal locus associated with this pathology has been known for years. This locus contains a repetitive element of about 3 kb, and the pathological manifestations appear in subjects showing fewer than 10 (but at least 1) of these elements. In these subjects the expression of a group of genes located in the region is altered, and this leads to the observed phenotype. However, the exact mechanism that connects the number of repeats with the change in gene expression was not clear. In his presentation at SIGU, Gabellini from Milan showed his latest results demonstrating that a lncRNA is involved in this process. This RNA is transcribed from the same genetic region linked to FSHD and is partially encoded in the repeated element (which explains why at least 1 repeat is needed for the pathology). This RNA acts in cis as an epigenetic factor: it binds to the repetitive elements and serves as a binding site for a regulatory protein factor that represses the transcriptional activity of the entire locus (it seems that it can induce chromatin remodeling). In affected subjects, however, the reduced number of repeat elements also reduces the binding of the lncRNA, so the transcription level of genes in this region increases, leading to pathology.
In another talk Gustincich from Trieste showed how lncRNAs can also act by stimulating translation. This finding opens new possibilities for the regulatory role of non-coding RNA, since until now they have been mostly implicated in the repression of transcription. It enriches the picture of protein level regulation, which appears to occur at three distinct levels (plus post-translational mechanisms, of course): mRNA transcription, regulated by historically studied mechanisms such as TF binding and chromatin remodeling, which make the transcript available; RNA stability and translation, regulated by the availability of specific RNA-binding factors; and a fine regulation of mRNA levels and mRNA translation based on ncRNAs or more complex mechanisms such as RNA editing. Things are made even more complicated by the fact that a single ncRNA can often target multiple mRNAs with different affinities, so its effect on protein level depends on the final equilibrium of all the possible interactions, as illustrated by the theory of competing endogenous RNA (ceRNA) that appeared in Cell.

To make the rest of the story short, I've seen increasing attention to CNVs and their role in pathological conditions; there was also a session dedicated to new protocols in gene therapy, and there were interesting talks on mitochondria-related pathologies (this field has taken advantage of NGS technology too).

Walking around the poster session, I noticed that almost all of the projects presented involving exome sequencing were case studies on rare syndromes or even on unique patients showing a peculiar phenotype. The idea is relatively simple: you have an affected subject with an unusual phenotype, or a clinical diagnosis but an unknown molecular defect; luckily you also have access to genetic material from the relatives and a family tree from which to assess the mode of genetic transmission... Now you can perform exome sequencing on the affected subject and (at least) the parents and try to identify the causative mutation(s). This approach has proved to be extremely effective, and new papers applying this method have been published almost every day in the last year. Often this is reported as diagnostic exome sequencing, since the final result is a molecular diagnosis for a previously unsolved case. There is a double advantage in this process: first, the research group identifies the genetic defect and can make assumptions about the molecular mechanism underlying the pathology (and this may also lead to treatment); second, we gain new knowledge on gene function and gene-to-phenotype relationships. This latter aspect is of major interest for medical research, since a better knowledge of gene function is one of the main missing factors preventing researchers from making predictions about the SNVs identified by NGS studies. Now that the new technologies have made sequencing of sporadic and ultra-rare cases affordable, research groups can also reanalyze old families that remained undiagnosed, and our knowledge of gene function will rapidly increase.
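The family-based filtering described above can be sketched in a few lines. This is a deliberately simplified illustration, not a real variant-calling pipeline: genotypes are encoded as the number of alternate alleles (0, 1 or 2), only two inheritance patterns are checked, and the variant identifiers are invented.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class TrioGenotype(NamedTuple):
    """Alternate-allele counts (0, 1 or 2) for one variant in a trio."""
    variant_id: str  # hypothetical identifier, for illustration only
    child: int
    mother: int
    father: int


def classify(g: TrioGenotype) -> Optional[str]:
    """Return the inheritance pattern the variant is compatible with, if any."""
    if g.child >= 1 and g.mother == 0 and g.father == 0:
        return "de_novo"  # present in the child, absent in both parents
    if g.child == 2 and g.mother == 1 and g.father == 1:
        return "recessive_homozygous"  # one copy inherited from each carrier parent
    return None


def candidate_variants(genotypes: Iterable[TrioGenotype]) -> Dict[str, str]:
    """Keep only the variants matching one of the patterns above."""
    candidates = {}
    for g in genotypes:
        pattern = classify(g)
        if pattern is not None:
            candidates[g.variant_id] = pattern
    return candidates


if __name__ == "__main__":
    trio = [
        TrioGenotype("var1", child=1, mother=0, father=0),
        TrioGenotype("var2", child=2, mother=1, father=1),
        TrioGenotype("var3", child=1, mother=1, father=0),  # inherited, dropped
    ]
    print(candidate_variants(trio))
```

In practice this step comes after filtering on call quality and population frequency, and other models (compound heterozygous, X-linked) would need their own rules; the point here is only that the family structure shrinks the candidate list dramatically.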

OK, I've written enough... even if a lot more could be written!
Keep your genetic enthusiasm alive!

Friday, 30 November 2012

EU consortium receives $15M funding to shed light on neurological diseases

The European Community has recently funded the Neuromics Consortium with €12 million over 5 years to investigate the causes of neurological and neuromuscular diseases and to find new causative genes for developing diagnostic gene panels.
Headed by the University of Tübingen, the consortium of 18 European and Australian institutions will perform whole-exome sequencing of 1,100 individuals and aims to identify causative genes for at least 80% of the studied syndromes.
The Neuromics Consortium hopes its work will yield better diagnostic panels that can increase the diagnosis rate for ten main neurodegenerative and neuromuscular disease types — including ataxia, spastic paraplegia, Huntington's disease, muscular dystrophy and spinal muscular atrophy — as well as provide information on genes and pathways that could inform new treatments.

The consortium will collaborate with Iceland's Decode Genetics, which will perform the sequencing on Illumina HiSeq and provide the main data; with Agilent Technologies, to develop new diagnostic gene panels based on the HaloPlex technology; and with Ariadne Genomics, which will provide bioinformatic support for data analysis. RNA-seq and other omics approaches will also be used by the consortium in the second phase of diagnostic panel development and validation.

A recent post on GenomeWeb reports that in a document describing the project, the consortium wrote that at the end of the funding period, it expects "to have elucidated the genetic basis for [more than] 80 [percent] of investigated patient groups." According to the group, the new genes will be added to existing databases and used to develop the first overlapping gene panel that can be used to diagnose several of these individual diseases, "overcoming time consuming and costly single gene analysis."

PubMed Highlight: Watermelon and Pear sequenced!

We can now add watermelon and pear to the rapidly increasing list of fruits with a sequenced genome! The draft sequence of watermelon was recently published in Nature Genetics, while the pear sequence makes its appearance in Genome Research.

Both papers report a high-quality draft genome sequence, together with gene prediction and chromosomal mapping, and reconstruct the evolution of the modern fruit species, also identifying the main effects of human selection.
Besides providing useful information for further genetic improvements, the increasing number of fruit genomes available will shortly allow for identification of fruit salad or mixed juice components by sequencing... Imagine this future: just get your Oxford Nanopore MinION sequencer out of your pocket, take a drop of juice and in a matter of minutes you'll know exactly which fruits you're going to drink or eat!

OK, maybe I'm pushing the sequencing thing a little too far!

Friday, 23 November 2012

Proton presented @ SIGU

Flash update directly from the Conference of the Italian Society of Human Genetics (SIGU). Besides a lot of interesting talks applying NGS in diagnostic protocols and for disease gene discovery (I will cover them as soon as I'm back) and a fascinating session on ncRNA, I want to report the official presentation of the Life Technologies Proton and Ion Torrent diagnostic applications... and they are giving away free T-shirts too!

Friday, 16 November 2012

In situ single cell RNA-seq to build a 3D transcriptional map of the brain

Do you remember the results from the Connectome project, which mapped the connections in the human brain and produced those beautiful images of colorful wired brains? In the next 5 years we will probably get much further in understanding the spatial organization of cells in the brain!

A team from the University of California, San Diego, has recently won a five-year, $9.3 million grant from the NIH to perform RNA-seq on 10,000 single neurons and reconstruct a 3D map of gene activity in the brain. The team plans to perform complete RNA-seq, not just of poly-A RNAs, on this large number of cells to get a complete picture of the high genetic variability of neuronal sub-populations.

What impresses me the most (and almost sounds like science fiction to me) is the idea of applying an in situ RNA-seq protocol to reconstruct the expression profile of 500 genes. These profiles will act as a "fingerprint" of transcriptional location, so that for any other whole-transcriptome dataset from an isolated cell the authors can look at these 500 genes, find the matching pattern and infer where in the brain the sequenced cell came from.
The in situ RNA-seq protocol is fascinating in itself. It's based on a technique developed by George Church's group at Harvard, which involves a chemical reaction to create pores in the cells, followed by the application of a customized microfluidic device to deposit sequencing reagents into the cell. The sequencing will take place within the tissue and the signal will be read out with a microscope.
The team leader Kun Zhang, an associate professor at UCSD's Department of Bioengineering and Institute for Genomic Medicine, anticipated that the group would spend the first two to three years developing and optimizing this protocol, while canonical sequencing will be conducted at Illumina.
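
Just to illustrate the "fingerprint" matching idea described above, here is a small Python sketch. It is my own guess at how such a lookup could work, not the UCSD team's method: it assumes the reference map stores one expression vector per brain location, uses plain Pearson correlation as the similarity measure, and shrinks the panel from 500 genes to 4 for readability. Location names and expression values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no pattern to match
    return cov / (sd_x * sd_y)


def infer_location(cell_profile, reference_map):
    """Return the reference location whose fingerprint best matches the cell."""
    return max(
        reference_map,
        key=lambda location: pearson(cell_profile, reference_map[location]),
    )


# Hypothetical in situ fingerprints (4 marker genes instead of 500).
reference_map = {
    "cortex_layer_2": [9.0, 1.0, 0.5, 4.0],
    "caudate": [0.5, 8.0, 6.0, 1.0],
}

# The same marker genes extracted from an isolated cell's whole transcriptome.
isolated_cell = [8.0, 2.0, 1.0, 3.5]

best_match = infer_location(isolated_cell, reference_map)
print(best_match)
```

With real data one would need normalization between the in situ and the standard RNA-seq measurements, and probably a more robust classifier, but the principle is the same: the marker genes act as coordinates.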

For deeper coverage of the story, see this post from GenomeWeb or the news directly from the UCSD press office.

Wednesday, 7 November 2012

PubMed Highlight: Single-cell genetic variability in neurons

Theoretically, any two cells in our body have a high probability of showing some genetic diversity, due to somatic mutations and other specific genetic rearrangements activated by cell differentiation. This concept has been proposed as a key factor in neurons. These cells show great plasticity and are divided into several different sub-populations with specific molecular and cellular characteristics, and all this diversity might rely on some kind of genetic reorganization that is particularly active in brain neurons (with retrotransposable elements as the best candidates). This hypothesis has been discussed in past years, and now this new paper published in Cell could shed some light on the real state of genetic diversity in neurons. The authors applied single-cell sequencing to 300 single neurons from the cerebral cortex and caudate nucleus of three normal individuals to evaluate specific insertions of LINE-1 elements. They also evaluated the presence and spread of a somatic mutation in the AKT3 gene in single cortical cells to characterize the mosaicism in a child with hemimegalencephaly. This study shows that neuronal disorders can arise from mutations that are specific to brain tissue or even to neuron sub-populations (somatic mutations that appeared in some precursor) and can thus be assessed only by sequencing the neurons themselves.
Further analysis of other neuron populations could lead to the definition of a genetic profile specific to each neuron type and/or patient.

Single-Neuron Sequencing Analysis of L1 Retrotransposition and Somatic Mutation in the Human Brain
Gilad D. Evrony, Xuyu Cai, Eunjung Lee, L. Benjamin Hills, Princess C. Elhosary, Hillel S. Lehmann, J.J. Parker, Kutay D. Atabay, Edward C. Gilmore, Annapurna Poduri, Peter J. Park and Christopher A. Walsh

A major unanswered question in neuroscience is whether there exists genomic variability between individual neurons of the brain, contributing to functional diversity or to an unexplained burden of neurological disease. To address this question, we developed a method to amplify genomes of single neurons from human brains. Because recent reports suggest frequent LINE-1 (L1) retrotransposition in human brains, we performed genome-wide L1 insertion profiling of 300 single neurons from cerebral cortex and caudate nucleus of three normal individuals, recovering >80% of germline insertions from single neurons. While we find somatic L1 insertions, we estimate <0.6 unique somatic insertions per neuron, and most neurons lack detectable somatic insertions, suggesting that L1 is not a major generator of neuronal diversity in cortex and caudate. We then genotyped single cortical cells to characterize the mosaicism of a somatic AKT3 mutation identified in a child with hemimegalencephaly. Single-neuron sequencing allows systematic assessment of genomic diversity in the human brain.

Friday, 2 November 2012

PubMed Highlight: 1000 Genomes Phase I published

Data from the Phase I analysis of the 1000 Genomes Project have just been published in Nature! Besides data on the average distribution of SNVs, indels and structural variants in a genome, the paper also provides interesting insights into the population-specific distribution of variants and a lot of technical details (more than 100 pages of Supplementary materials!!) that will serve as useful guidelines for NGS data analysis!

Check this out!

An integrated map of genetic variation from 1,092 human genomes
The 1000 Genomes Project Consortium

By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations.

Wednesday, 31 October 2012

PubMed Highlight: A Refreshing Pineapple Juice!

Do you like a sweet, refreshing pineapple juice on a hot summer afternoon? I sure do! And now there is hope that this juice will become even better! The full transcriptome of pineapple has just been released and described in this paper published in PLOS ONE, so that genetic selection could be applied to improve the fruit and the juice!

De Novo Assembly, Characterization and Functional Annotation of Pineapple Fruit Transcriptome through Massively Parallel Sequencing

Wen Dee Ong, Lok-Yung Christopher Voo, Vijay Subbiah Kumar
Biotechnology Research Institute, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia


Pineapple (Ananas comosus var. comosus), is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanism and processes underlying fruit ripening would enable scientists to enhance the improvement of quality traits such as, flavor, texture, appearance and fruit sweetness. Although, the pineapple is an important fruit, there is insufficient transcriptomic or genomic information that is available in public databases. Application of high throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed.
Methodology/Principal Findings
To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 millions Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. Sequence similarity search against non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%) which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown.
The unique transcripts derived from this work have rapidly increased of the number of the pineapple fruit mRNA transcripts as it is now available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple.

Monday, 29 October 2012

BGI Workshop in Milan, Italy (December 5, 2012)

The Translational Science of Mendelian Disorders: From transomics to daily life

Wednesday, December 5, 2012 from 9:00 AM to 6:00 PM (PST), Milan, Italy

About the workshop

The aim of this workshop is to exchange ideas and discuss cutting-edge research in the field of Mendelian Disorder by bringing together scientists from all around Europe and even the world. Our workshop will be a forward thinking platform that aids our participants to review the specific topics, as well as to discuss future research.
We expect to have 150-200 high-ranking participants who are top researchers from leading European research centers and universities, as well as government representatives. These researchers will come from Italy, Switzerland, France and other nearby countries.

Contact Information:

Ms Wenyan Li, Project Coordinator of Italy, BGI Europe
+39 3313812359

Ms Mingyu Tian, Project Coordinator of Italy, BGI Europe
+39 3426496811

Monday, 22 October 2012

NGS PubMed Highlights: Human Molecular Genetics "Genomic Medicine" Issue

Human Molecular Genetics has published its latest review issue, featuring invited articles from top researchers in the field of Genomic Medicine.

Saturday, 20 October 2012

Flash Report: Beer genome sequenced

Actually, it is the barley genome that has been sequenced and described in the journal Nature by scientists of the International Barley Sequencing Consortium (IBSC).
Malted barley, along with hops and yeast, is a key ingredient in brewing beer (while the yeast genome was unraveled more than 15 years ago, to the best of my knowledge the genome of hops, Humulus lupulus, has not been sequenced yet).
First cultivated more than 10,000 years ago, barley (Hordeum vulgare) is the world's fourth most important cereal crop (both in terms of area of cultivation and in quantity of grain produced), trailing only maize, rice and wheat. Its genome is almost twice the size of the human genome and contains a large proportion of closely related sequences, which are difficult to piece together.
Professor Robbie Waugh (Scotland's James Hutton Institute) who led the research said: "this research will streamline efforts to improve barley production through breeding for improved varieties. This could be varieties better able to withstand pests and disease, deal with adverse environmental conditions, or even provide grain better suited for beer and brewing".

Nature. 2012 Oct 17. doi: 10.1038/nature11543.

A physical, genetic and functional sequence assembly of the barley genome.


Barley (Hordeum vulgare L.) is among the world's earliest domesticated and most important crop plants. It is diploid with a large haploid genome of 5.1 gigabases (Gb). Here we present an integrated and ordered physical, genetic and functional sequence resource that describes the barley gene-space in a structured whole-genome context. We developed a physical map of 4.98 Gb, with more than 3.90 Gb anchored to a high-resolution genetic map. Projecting a deep whole-genome shotgun assembly, complementary DNA and deep RNA sequence data onto this framework supports 79,379 transcript clusters, including 26,159 'high-confidence' genes with homology support from other plant genomes. Abundant alternative splicing, premature termination codons and novel transcriptionally active regions suggest that post-transcriptional processing forms an important regulatory layer. Survey sequences from diverse accessions reveal a landscape of extensive single-nucleotide variation. Our data provide a platform for both genome-assisted research and enabling contemporary crop improvement.

Thursday, 18 October 2012

Flash Report: The search for extra-terrestrial genomes

Although we are not even close to April Fools' Day, I find it quite unbelievable that Craig Venter and Jonathan Rothberg, founder of Ion Torrent, are each independently developing a DNA sequencing machine to be delivered to Mars to search for life.
You can find additional details about this almost unbelievable project in this Technology Review article.
By the way, I didn't know of the existence of a NASA-funded project at Harvard and MIT called SET-G, or "the search for extra-terrestrial genomes".

Wednesday, 17 October 2012

Flash Report: Ion great discounts!

Competition between the two NGS giants (Illumina and Life Technologies) is resulting in more and more advantages for the final users, particularly in terms of technological advancements and price rebates. 

Now that its new Proton is on the market and ready to ship, Life Technologies seems to have adopted an aggressive marketing policy offering great discounts on both PGM and Proton...
They also offer an interesting trade-in solution: a 40% discount on the Proton in exchange for your old Illumina GAIIx!

See the offer on the Life Technologies site!

NGS PubMed Highlights: Can we predict the face of an individual from its DNA?

Apparently we are not there yet; however, a recent article published in PLoS Genetics describes a genome-wide association study that identified five loci influencing facial morphology in Europeans. The study was carried out in almost 5,400 individuals of European descent. The researchers defined four dozen facial traits measurable from three-dimensional magnetic resonance images as well as from two-dimensional portrait photographs.
PRDM16, PAX3, TP63, C5orf50, and COL17A1 are the five candidate genes involved in the determination of the human face.
The scientists involved in the study speculate that it should be possible to identify additional variants, including some with smaller effects, through studies that involve larger sample sets and more detailed facial measurements.
As reported by the Medical Daily web site "perhaps one day, police officers will be able to use DNA found at the crime scene to create an image of a person's face, rather than relying on witness testimony told to sketch artists".

PLoS Genet. 2012 Sep;8(9):e1002932. doi: 10.1371/journal.pgen.1002932. Epub 2012 Sep 13.

A genome-wide association study identifies five loci influencing facial morphology in Europeans.


Department of Forensic Molecular Biology, Erasmus MC University Medical Center Rotterdam, Rotterdam, The Netherlands.


Inter-individual variation in facial shape is one of the most noticeable phenotypes in humans, and it is clearly under genetic regulation; however, almost nothing is known about the genetic basis of normal human facial morphology. We therefore conducted a genome-wide association study for facial shape phenotypes in multiple discovery and replication cohorts, considering almost ten thousand individuals of European descent from several countries. Phenotyping of facial shape features was based on landmark data obtained from three-dimensional head magnetic resonance images (MRIs) and two-dimensional portrait images. We identified five independent genetic loci associated with different facial phenotypes, suggesting the involvement of five candidate genes-PRDM16, PAX3, TP63, C5orf50, and COL17A1-in the determination of the human face. Three of them have been implicated previously in vertebrate craniofacial development and disease, and the remaining two genes potentially represent novel players in the molecular networks governing facial development. Our finding at PAX3 influencing the position of the nasion replicates a recent GWAS of facial features. In addition to the reported GWA findings, we established links between common DNA variants previously associated with NSCL/P at 2p21, 8q24, 13q31, and 17q22 and normal facial-shape variations based on a candidate gene approach. Overall our study implies that DNA variants in genes essential for craniofacial development contribute with relatively small effect size to the spectrum of normal variation in human facial morphology. This observation has important consequences for future studies aiming to identify more genes involved in the human facial morphology, as well as for potential applications of DNA prediction of facial shape such as in future forensic applications.