FishingCNV: a graphical software package for detecting rare copy number variations in exome sequencing data.
A blog with news and curiosities on genomics, with a particular interest in topics related to Next Generation Sequencing, Personal Genomics and Bioinformatics. We work at the University of Brescia (Italy) and are new to the field, but have a lot of energy to share.
Saturday, 30 March 2013
Friday, 29 March 2013
PubMed Highlight: Next-Generation sequencing visualization
I've just finished an intensive course on NGS data analysis where command-line-based solutions were, of course, presented as the best way to manage and make sense of data.
Playing with scripts, Unix code and the R language makes you feel a sort of bioinformatic power. You start to blame all those wet-lab colleagues spending hours on Excel spreadsheets. You are amazed by the results of your last programming trick and the effectiveness of your command-line skills. Even if this makes you proud, keep in mind that a screen full of symbols and tables with over a million rows have, for most biologists and geneticists, the same appeal as the flowing characters of The Matrix... As in the famous movie, not everyone can see the meaning behind the code; most will just see a bunch of characters and numbers, doubting that this is the real world!
A good visualization of genomic data from NGS experiments makes your results nicer to look at and easier to explain and explore. Moreover, a colorful alignment of reads in genome-browser style or a Circos graph surely makes a better impact when you show it in your presentations! The scientific community constantly asks for visualization tools that simplify the task of explaining and exploring NGS data, so that the data become accessible to everyone, even to the old-school ones.
The latest special issue of Briefings in Bioinformatics provides an extensive review of the main visualization tools, with an overview of their specific advantages and main features. Web-based browsers, UCSC Genome Browser, IGV, Tablet, BamView and GBrowse are all covered, making this issue the ideal answer to the colleague asking you: "I've just received this great NGS data, but what are all these bam and vcf files? I want to see them nicely placed on my favourite chromosome!".
Main articles in the special issue:
Jun Wang, Lei Kong, Ge Gao, and Jingchu Luo
A brief introduction to web-based genome browsers
Robert M. Kuhn, David Haussler, and W. James Kent
The UCSC genome browser and associated tools
Lincoln D. Stein
Using GBrowse 2.0 to visualize and share next-generation sequence data
Oscar Westesson, Mitchell Skinner, and Ian Holmes
Visualizing next-generation sequencing data with JBrowse
Helga Thorvaldsdóttir, James T. Robinson, and Jill P. Mesirov
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration
Iain Milne, Gordon Stephen, Micha Bayer, Peter J.A. Cock, Leighton Pritchard, Linda Cardle, Paul D. Shaw, and David Marshall
Using Tablet for visual exploration of second-generation sequencing data
Tim Carver, Simon R. Harris, Thomas D. Otto, Matthew Berriman, Julian Parkhill, and Jacqueline A. McQuillan
BamView: visualizing and interpretation of next-generation sequencing read alignments
Michael C. Schatz, Adam M. Phillippy, Daniel D. Sommer, Arthur L. Delcher, Daniela Puiu, Giuseppe Narzisi, Steven L. Salzberg, and Mihai Pop
Hawkeye and AMOS: visualizing and assessing the quality of genome assemblies
Thursday, 28 March 2013
Elementary elements in bioinformatics
These days I'm attending an intensive course on NGS data analysis... Every day we deal with about 9 hours of bioinformatics, both theory and scripting... And tons of useful tools have been cited during the course...
Since the names of these tools are anything but easy to remember, I found myself wishing for a summary that gives a compact and organized overview of, and quick access to, the main ones.
Considering that bioinformatics tricks have become as essential as chemical elements, The Elements of Bioinformatics table from Eagle Genomics is an efficient and funny answer to my needs.
If programming, analyzing DNA data and talking about stats and complex biology don't satisfy your need to look nerdy, using this table to remember strangely named tools should improve your reputation as a real geek!!
Have fun (if you read this blog I'm sure you will!)
Thursday, 21 March 2013
PubMed Highlight: The origin, evolution and functional impact of short insertion-deletion variants identified in 179 human genomes
However, a detailed genome-wide assessment of the impact and distribution of indels was still missing... until now.
In this interesting paper, which appeared in Genome Research, Montgomery et al. address exactly this question, with amazing results. First of all, the authors had to deal with the challenge of calling short indels, which is one of the biggest issues when analyzing NGS data. Starting with DNA sequences from 179 individuals from 3 population groups, they made several optimizations to the standard pipeline used by the 1000 Genomes Project to obtain a set of high-quality indels. Even if indels in homopolymeric regions remain out of reach, the improved pipeline described in the paper is certainly a guideline for anyone working in the field. Among the other interesting findings, the authors confirmed that rates of indel mutagenesis are highly heterogeneous, with 43-48% of indels occurring in 4.03% of the genome (loci defined as indel hotspots by the authors), and they proposed fork stalling and template switching (FoSTeS), together with polymerase slippage, as the main mechanisms originating the indels.
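To get a feeling for how concentrated the hotspots are, here is a quick back-of-the-envelope calculation. It is just my own arithmetic on the two figures quoted above (43-48% of indels in 4.03% of the genome), not an analysis from the paper; the function name `fold_enrichment` is made up for this illustration, and the "expected" share assumes indels spread uniformly along the genome.

```python
def fold_enrichment(indel_fraction, genome_fraction):
    """Observed share of indels divided by the share expected if indels
    were spread uniformly along the genome (a simplifying assumption)."""
    return indel_fraction / genome_fraction


# Figures quoted in the post: hotspots cover 4.03% of the genome
# and contain between 43% and 48% of the indels.
genome_fraction = 0.0403

for indel_fraction in (0.43, 0.48):
    fold = fold_enrichment(indel_fraction, genome_fraction)
    print(
        f"{indel_fraction:.0%} of indels in {genome_fraction:.2%} of the genome: "
        f"about {fold:.1f}x the uniform expectation"
    )
```

In other words, under this simple uniform model the hotspots carry roughly 11 to 12 times more indels than their size alone would predict.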
Take a look!
The origin, evolution and functional impact of short insertion-deletion variants identified in 179 human genomes
- Stephen B Montgomery,
- David Goode,
- Erika Kvikstad,
- Cornelis A Albers,
- Zhengdong Zhang,
- Xinmeng Jasmine Mu,
- Guruprasad Ananda,
- Bryan Howie,
- Konrad J Karczewski,
- Kevin S Smith,
- Vanessa Anaya,
- Rhea Richardson,
- Joe Davis,
- Daniel G MacArthur,
- Arend Sidow,
- Laurent Duret,
- Mark Gerstein,
- Kateryna Markova,
- Jonathan Marchini,
- Gilean A McVean and
- Gerton Lunter
Wednesday, 20 March 2013
PubMed Highlight: the genome of HeLa cell line has been sequenced
HeLa cells, sampled in 1951 from the cervical tumor of a woman named Henrietta Lacks, are probably the world's most commonly used human cell line and have been used as a standard for understanding many fundamental biological processes, leading to more than 60,000 scientific publications.
In a new study published in G3 (Genes, Genomes, Genetics), scientists announce that they have successfully sequenced the genome of a HeLa cell line. While previous work had shown that HeLa cells have extra copies of each chromosome and sometimes multiple extra chromosomes, the analysis of the HeLa genome revealed additional features commonly associated with cancer cells, such as the loss of healthy copies of genes. In particular, the researchers found that countless regions of the chromosomes in each cell were arranged in the wrong order and had extra or fewer copies of genes.
The results of the study are also discussed in a Nature commentary.
Published Early Online March 11, 2013, doi:10.1534/g3.113.005777