Fig. 1 | Genome Medicine

Fig. 1

From: Transcriptome sequencing reveals thousands of novel long non-coding RNAs in B cell lymphoma

De novo transcript discovery identifies 2,632 novel lncRNAs in DLBCL tumors. a Pipeline used to discover candidate lncRNAs. Using aligned reads from 116 tumor and eight normal B cell samples, the transcriptome was reassembled with Cufflinks in de novo mode. The per-sample assemblies were then merged with Cuffmerge and screened for novel, unannotated transcripts with Cuffcompare against a consensus reference of known annotations from UCSC, RefSeq, Ensembl, and GENCODE v17. This initially yielded 40,258 unannotated novel transcripts. b Stringent filtering steps applied to the initially discovered novel transcripts to remove artefactual lncRNAs: selection of non-coding transcripts using CPAT, followed by transcript length and exon count filters. Expression level filtering (FPKM > 0.1 in two or more samples) was then performed separately within each sample group. c GENEID coding potential scores were used to validate coding potential across known lncRNAs, novel lncRNAs, and protein-coding genes. Novel lncRNAs have a low average coding potential score similar to that of known lncRNAs, whereas protein-coding genes have a higher average score. d Density plot of transcript length, showing similar distributions for known and novel lncRNAs, whereas protein-coding genes are much longer. e Bar plot of exon counts. Consistent with the two-or-more-exon cutoff, all novel lncRNAs, like the known lncRNAs, have at least two exons; protein-coding genes have many more exons than either known or novel lncRNAs. f Bar plot of lncRNAs selected by FPKM filtering (FPKM > 0.1 in two or more samples) in DLBCL tumor samples, ordered by the number of selected lncRNAs expressed per tumor sample (n = 116)
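
The filtering cascade in panel b and the per-sample count in panel f can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the caption states only the FPKM cutoff (> 0.1 in two or more samples) and the two-exon minimum, so the length and coding-probability thresholds, the `Transcript` record, and all function names are hypothetical, and the ascending sort order is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Thresholds stated in the caption.
MIN_FPKM = 0.1
MIN_EXPRESSED_SAMPLES = 2
MIN_EXONS = 2
# Thresholds NOT stated in the caption; placeholder values for illustration only.
ASSUMED_MIN_LENGTH_NT = 200
ASSUMED_MAX_CODING_PROB = 0.364


@dataclass
class Transcript:
    """Hypothetical record for one assembled novel transcript."""
    transcript_id: str
    length_nt: int
    exon_count: int
    coding_prob: float               # e.g. a CPAT-style coding probability
    fpkm_by_sample: Dict[str, float]


def is_expressed(t: Transcript) -> bool:
    """True if FPKM exceeds the cutoff in at least two samples."""
    n = sum(1 for v in t.fpkm_by_sample.values() if v > MIN_FPKM)
    return n >= MIN_EXPRESSED_SAMPLES


def passes_filters(t: Transcript) -> bool:
    """Apply the coding-potential, length, exon-count and expression filters."""
    return (
        t.coding_prob < ASSUMED_MAX_CODING_PROB
        and t.length_nt >= ASSUMED_MIN_LENGTH_NT
        and t.exon_count >= MIN_EXONS
        and is_expressed(t)
    )


def filter_candidates(transcripts: List[Transcript]) -> List[Transcript]:
    """Keep only transcripts that pass every filter."""
    return [t for t in transcripts if passes_filters(t)]


def expressed_per_sample(transcripts: List[Transcript]) -> List[Tuple[str, int]]:
    """Count transcripts above the FPKM cutoff in each sample, sorted by count."""
    counts: Dict[str, int] = {}
    for t in transcripts:
        for sample, fpkm in t.fpkm_by_sample.items():
            counts.setdefault(sample, 0)
            if fpkm > MIN_FPKM:
                counts[sample] += 1
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
```

Applying `filter_candidates` to the tumor and normal groups separately would mirror the per-group expression filtering described in panel b, and `expressed_per_sample` on the filtered tumor set gives the kind of ordered counts plotted in panel f.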
