• Document: Analysing genomes and transcriptomes using Illumina sequencing



Analysing genomes and transcriptomes using Illumina sequencing
Dr. Heinz Himmelbauer
Centre for Genomic Regulation (CRG), Ultrasequencing Unit, Barcelona

From Whole Genome to Whole Solution, Disease Analysis Tools for the Next Generation

The Sequencing Revolution: High-Throughput Sequencing
  • 2000: 96 sequences per hour
  • 2009: 2.6 million sequences per hour
Image credit: U.S. Department of Energy, JGI

Next generation sequencing platforms: Applications
  • mRNA-Seq: expression profiling, transcript discovery
  • Small RNAs (miRNAs)
  • ChIP-Seq, RIP-Seq
  • DNA methylation
  • Re-sequencing, haplotyping (SNPs and structural variation)
  • De novo sequencing

Sequencing platforms

Technology        Year     Read length (nt)   Chemistry   Template amplification
Sanger            1977     1000               SBS         cloning/PCR
454               2005     250-500            Pyro        PCR
Solexa            2006     36-75              SBS         PCR
SOLiD             2007     35                 Ligation    PCR
Helicos           2008     30                 SBS         no amplification
PacBio            2010 ?   1000               SBS         no amplification
VisiGen           2009 ?   1000               SBS         no amplification
Oxford Nanopore   ?        ?                  Elco        no amplification

SBS = Sequencing by Synthesis
Pyro = Pyrosequencing
Ligation = Sequencing by ligation assays
Elco = Electrochemical detection

Illumina/Solexa technology
Library preparation → Cluster generation → Cycle sequencing → Base calling

Solexa GA II run:
  - 45 hours
  - 1 TB image data
  - 10 x 8 mio. 36-50 nt
  - up to 2.9 Gbp

Summary of Solexa properties
  • Reads are short (36 nt), but get progressively longer (50 nt, 75 nt)
  • 8 samples can be sequenced in parallel

Protocols in use at the CRG:
  • Shotgun sequencing (genomic DNA, cDNA, ChIP-Seq, RIP-Seq)
  • mRNA-Seq
  • Indexed RNA-Seq
  • Small RNA (miRNAs) identification and profiling
  • Gene expression profiling (Tag sequencing, similar to SAGE)
  • Paired end sequencing (long and short paired ends)

Advantages
  • Millions of reads from your sample
  • Relatively cheap

Disadvantages
  • Millions of reads from your sample
  • Some biases, poorly understood
  • Sequencing errors (but Illumina improves)
  • Demands on IT infrastructure (data processing, analysis, backup)

Illumina GA II workflow
Illumina Genome Analyzer output:
  • tiff (images): 4 images/tile per cycle, ~23 GB per cycle, ~800 GB for 36 cycles
  • Int (intensity files): 1 intensity file per tile, ~1.4 GB per cycle, ~50 GB for 36 cycles
Firecrest: quantifies
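
The run and storage figures quoted in the slides can be cross-checked with simple arithmetic. The following is a minimal Python sketch, assuming that "10 x 8 mio." means roughly 10 million reads in each of 8 lanes (the slides do not spell this out). The function and constant names are illustrative only and are not part of any Illumina software.

```python
# Back-of-the-envelope check of the GA II figures quoted in the slides.
# ASSUMPTION: "10 x 8 mio." is read as ~10 million reads in each of 8 lanes.

READS_PER_LANE = 10_000_000  # assumed reading of "10 x 8 mio."
LANES = 8                    # "8 samples can be sequenced in parallel"


def run_yield_gbp(read_length_nt, reads_per_lane=READS_PER_LANE, lanes=LANES):
    """Total sequence per run in Gbp (1 Gbp = 1e9 bases)."""
    return read_length_nt * reads_per_lane * lanes / 1e9


def storage_gb(gb_per_cycle, cycles):
    """Total storage in GB for a run, given the per-cycle figure."""
    return gb_per_cycle * cycles


def throughput_ratio(new_per_hour, old_per_hour):
    """How many times faster the newer platform is."""
    return new_per_hour / old_per_hour


if __name__ == "__main__":
    print(f"Yield at 36 nt: {run_yield_gbp(36):.2f} Gbp")
    print(f"tiff images, 36 cycles: {storage_gb(23, 36):.0f} GB")
    print(f"Intensity files, 36 cycles: {storage_gb(1.4, 36):.1f} GB")
    print(f"2009 vs 2000 throughput: {throughput_ratio(2.6e6, 96):,.0f}x")
```

Under that assumption, 36 nt reads give 2.88 Gbp, in line with the quoted "up to 2.9 Gbp". The per-cycle figures multiply out to 828 GB of images and 50.4 GB of intensity files over 36 cycles, which matches the rounded ~800 GB and ~50 GB on the workflow slide.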
