Single molecule sequencing of the human transcriptome

We will discuss sequencing technologies for transcriptome data. We will focus on the downstream analyses and use the PacBio human transcriptome paper as the starting point for our meeting (Sharon et al. 2013. A single-molecule long-read survey of the human transcriptome. Nature Biotechnology 31: 1009-1014).

Abstract

Global RNA studies have become central to understanding biological processes, but methods such as microarrays and short-read sequencing are unable to describe an entire RNA molecule from 5′ to 3′ end. Here we use single-molecule long-read sequencing technology from Pacific Biosciences to sequence the polyadenylated RNA complement of a pooled set of 20 human organs and tissues without the need for fragmentation or amplification. We show that full-length RNA molecules of up to 1.5 kb can readily be monitored with little sequence loss at the 5′ ends. For longer RNA molecules more 5′ nucleotides are missing, but complete intron structures are often preserved. In total, we identify ~14,000 spliced GENCODE genes. High-confidence mappings are consistent with GENCODE annotations, but >10% of the alignments represent intron structures that were not previously annotated. As a group, transcripts mapping to unannotated regions have features of long, noncoding RNAs. Our results show the feasibility of deep sequencing full-length RNA from complex eukaryotic transcriptomes on a single-molecule level.

Published Nov. 18, 2013 10:34 AM - Last modified Nov. 18, 2013 10:34 AM