Monday, 3 February 2014

The sequencing frenzy! Not only human genomes!

With NGS cutting costs, Illumina pushing hard to increase sequencing output, and some expectations riding on the new nanopore technology, the past year saw an explosion of genome sequencing.
Lots of organisms, even some bizarre creatures, had their genomes sequenced, and many new sequencing programs were started, aiming to sequence thousands and thousands of new genomes in the next few years!
I have a personal interest in evolutionary and comparative genomics, so I always appreciate a new genome, and I'm particularly intrigued by the genomes of exotic organisms... Sometimes they may not be so informative, but it is always fun to tell your friends the story of the genome of the white tiger!
I made a quick survey of what I've missed in the last couple of months, and here are some cool new genomes:

The Burmese python genome (total of 1.44 Gb) and the King cobra genome (total of 1.66 Gb).

These two snake genomes were published in PNAS in December and give new insights into the evolution of snakes and the peculiar adaptations related to their metabolism and to venom production. Both papers report results from genome sequencing as well as transcriptome characterization, providing a complete picture of several interesting and poorly understood aspects of snake biology.



The first paper focuses on the molecular basis of morphological and physiological adaptations in snakes. Positive selection acted in ancestral snakes on many genes related to metabolism, development, lungs, eyes, heart, kidney, and skeletal structure, all highly modified features in snakes. To better study the genetic basis of the extreme phenotypes of the python, the authors also compared the python genome with the king cobra genome and genomic samples from other snakes. They also performed a detailed transcriptome analysis and found responsive genes associated with metabolism, development, and also mammalian diseases.


The second paper focuses on snake venom, a fascinating cocktail of toxic proteins. The authors investigated the evolution of these complex biological weapons by sequencing the genome of the king cobra and performing transcriptome analysis to assess the composition of venom gland expressed genes, small RNAs, and secreted venom proteins. They found that "toxin genes important for prey capture have massively expanded by gene duplication and evolved under positive selection, resulting in protein neofunctionalization." There is a lot of interest in animal venoms as a source of new bioactive peptides with possible applications as human drugs, and this article advances the field by providing lots of new information on the origin and evolution of venom proteins.


The Locust genome (total of 6.5 Gb)

The genome of L. migratoria was published in Nature in January.
There is no doubt that locusts are one of the world’s most destructive agricultural pests, as also demonstrated by their use as a divine punishment! Locusts are grasshopper species with a remarkable ability for swarming and long-distance migration. Locust swarms form suddenly from the congregation of billions of insects, and they can fly hundreds of kilometres each day and even cross oceans. They are also quite voracious: a single individual can consume its own body weight in food every day! The authors of this paper combined genome sequencing with a set of transcriptome and methylome data from gregarious and solitarious locusts to get insights into the adaptations behind the locust machine! They reported peculiar findings on the neuronal regulatory mechanisms underlying phase change in the locust, together with a significant expansion of gene families associated with energy consumption and detoxification, consistent with long-distance flight capacity and phytophagy. They also identified hundreds of potential insecticide target genes, such as ion channels, G-protein-coupled receptors and lethal genes. Beware, locusts!

The Elephant shark genome (0.93 Gb)

In this paper, published in Nature in January, the authors report the whole-genome analysis of a cartilaginous fish, the elephant shark (Callorhinchus milii). This genome will provide new insights into the evolution of gnathostomes from jawless vertebrates, a transition accompanied by many morphological and phenotypic innovations: jaws, paired appendages and an adaptive immune system based on immunoglobulins, T-cell receptors and major histocompatibility complex (MHC). They also found a lack of genes encoding secreted calcium-binding phosphoproteins, suggesting an explanation for the absence of bone.
They also found that "the C. milii genome is the slowest evolving of all known vertebrates and features extensive synteny conservation with tetrapod genomes, making it a good model for comparative analyses of gnathostome genomes". The paper also analyzes some peculiar aspects of the adaptive immune system of cartilaginous fishes: "it lacks the canonical CD4 co-receptor and most transcription factors, cytokines and cytokine receptors related to the CD4 lineage, despite the presence of polymorphic major histocompatibility complex class II molecules. It thus presents a new model for understanding the origin of adaptive immunity."

The Giant Galápagos tortoise (C. nigra) transcriptome

In this study, published in Genome Biology in December, the authors performed transcriptome sequencing on five C. nigra individuals from three distinct subspecies. They also analyzed samples from the congeneric red-footed tortoise C. carbonaria and from the Spanish pond turtle Mauremys leprosa. To get a complete picture of tortoise evolution, previously published transcriptome data from the European pond turtle Emys orbicularis and the pond slider Trachemys scripta were also considered.
Based on this dataset, they performed a population genomic study of the giant Galápagos tortoise, a species endemic to the Galápagos archipelago. C. nigra is an interesting turtle species: it is the largest known living species of terrestrial turtle and can live well beyond 100 years. From mtDNA analyses, the authors suggested that "this insular species has been isolated from the South American continent during millions of years". C. nigra is therefore a perfect model for the study of adaptation following island colonization, and the results "point to island endemic species as a promising model for the study of the deleterious effects on genome evolution of a reduced long-term population size". Among other interesting results, the authors found a reduced diversity of immunity genes, supporting the hypothesis of attenuated pathogen diversity in the island-restricted habitat, and an increased selective pressure on genes involved in the response to stress, potentially linked to the response to climatic instability and to the extended lifespan of this species.


After these intriguing examples, here are the major sequencing programs that promise to provide us with more and more genomes in the next few years:

The Genome 10K project aims to sequence the genomes of 10,000 vertebrate species, covering amphibians, birds, reptiles, mammals and fishes, including teleosts. The declared goal is "To understand how complex animal life evolved through changes in DNA and use this knowledge to become better stewards of the planet."
The project is co-directed by David Haussler (Howard Hughes Medical Institute, University of California, Santa Cruz); Stephen J. O'Brien (Chief Scientific Officer, Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg, Russia) and Oliver A. Ryder (San Diego Zoo, Institute for Conservation Research, San Diego, CA). Its collaborators and promoters also include BGI.

The i5K Insect and other Arthropod Genome Sequencing Initiative
Started officially in 2011, "the i5k initiative plans to sequence the genomes of 5,000 insect and related arthropod species over the next 5 years. This project will be transformative because it aims to sequence the genomes of all insect species known to be important to worldwide agriculture, food safety, medicine, and energy production; all those used as models in biology; the most abundant in world ecosystems; and representatives in every branch of insect phylogeny so as to achieve a deep understanding of arthropod evolution and phylogeny". The collaborators on the project have already produced more than 60 genomes, including various species of Drosophila, Apis mellifera, Bombyx mori, Aedes aegypti, Anopheles gambiae, Ixodes scapularis and many others.

Supported by BGI and China National Genebank (CNGB), the Fish T1K project started at the end of 2013 and aims to "unveil the mysteries of the origin, evolution and diversification of the largest group of vertebrates." Moreover, "all data generated from Fish T1K will be made available publicly through CNGB, ensuring that scientists have access to new developments and trends in fish research and the use of RNA-seq technology."

This is another insect-related initiative. Insects are one of the most species-rich groups of metazoan organisms. They play a pivotal role in most non-marine ecosystems and they are of enormous economic and medical importance. With about 20 international partners involved, the 1K Insect Transcriptome Evolution project "aims to study the transcriptomes (that is the entirety of expressed genes) of more than 1,000 insect species encompassing all recognized insect orders. For each species, so-called ESTs (Expressed Sequence Tags) will be produced using next generation sequencing techniques (NGS). [...]. The expected data will allow inferring for the first time a robust phylogenetic backbone tree of insects. Furthermore, the project includes the development of new software for data quality assessment and analysis."

The Global Invertebrate Genomics Alliance (GIGA) is an initiative started in 2013 that brings together diverse scientists "with the intent of growing a collaborative network that can address the major problems associated with genomic sequencing of a large taxonomic spectrum - sample collection and processing, data handling, sequence annotation, alignment and access, as well as intellectual property issues." The entire project is focused on (non-insect/non-nematode) invertebrates, which "comprise over 70% of all described metazoan species diversity, yet most of their genomes (complete hereditary material, DNA code) remain relatively unknown and understudied".
