Improved coverage and accuracy with strand-conserving sequence enrichment
© BioMed Central Ltd 2013
Published: 29 May 2013
Targeted next-generation sequencing is becoming a common tool in the molecular diagnostic laboratory. However, currently available methods to enrich for regions of interest in the DNA sequence suffer from drawbacks such as high cost, complex protocols, lack of clinical-level accuracy and uneven target coverage. A target-enrichment approach using complementary long padlock probes described in a recent article significantly improves on previous methods in most of these areas.
See related Research: http://genomemedicine.com/content/5/5/50
From whole-genome sequencing to target capture
In the almost 13 years since the first whole human genome was sequenced and published [1, 2], tremendous advances in technology have enabled the sequencing of human genomes for a fraction of the cost and time. However, although the cost of sequencing has dropped considerably, large-scale whole-genome sequencing remains challenging, particularly in the clinical arena. This is due to the still significant cost of sequencing an entire human genome, and the challenges of analyzing enormous amounts of data with tools that are not standardized to a level acceptable for routine diagnostic use. Consequently, targeted sequencing approaches may be more suitable for clinically actionable genes.
Cheap and high-quality targeted sequencing is key for a number of clinical research applications, including large-scale variant screening in disease genes and follow-up of genetic markers identified as significant in genome-wide association studies. Various methods have been developed to enable whole-exome sequencing and targeted-region sequencing. Early on, solid-state capture arrays were used, but these were expensive and had relatively complex protocols [3]. In-solution capture and PCR-based enrichment methods have reduced the cost and complexity of protocols considerably [4]. These improvements led to a wider adoption of next-generation sequencing and, in the past 12 months particularly, an increase in the use of targeted resequencing as a diagnostic tool [5].
The size of a target region is limited to a few megabases, which restricts the number of genes/exons that can be included in a clinical sequencing panel. In addition, all current capture methods use only one strand of genomic DNA, missing out on an additional level of possible accuracy.
Overcoming current limitations in target enrichment
In contrast to standard capture methods, the complementary long padlock probe (cLPP) approach, as presented by Shen et al. in a recent article [8], captures both strands of the target region, effectively doubling the target sequence information compared with other capture methods. This is achieved by generating double-stranded connector inversion probes (CIPs) that are incubated at high temperature to separate them into single strands, which are then hybridized to the sense and antisense strands of genomic DNA, effectively forming two complementary single-stranded DNA circles. In addition, cLPP enables the sequencing of both strands in both the forward and reverse direction (Shen et al. call this reciprocal paired-end sequencing), resulting in a total of four unique sequence reads per template. This redundancy reduces uneven coverage due to differences in the amplification efficiencies of the target regions, and increases coverage and accuracy. This should lead to increased confidence in variant calls in the downstream bioinformatics analysis, and might allow for a reduced average depth of sequence coverage, resulting in less sequencing per sample and thus lower cost. Shen et al. also demonstrate that copy number variation (CNV) detection can be improved with this enrichment method owing to its significantly better discrimination between targets with high and low coverage.
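The value of four reads per template can be illustrated with a minimal Python sketch. It is not the analysis pipeline of Shen et al.; the function names, the read labels and the calling rule (a base is reported only when both strands independently agree) are assumptions made purely for illustration.

```python
# Hypothetical labels for the four reads obtained per template:
# sense and antisense strand, each read in forward and reverse direction.
# Antisense reads are assumed to be reverse-complemented already, so that
# all four reads report the base on the same (sense) strand.
SENSE = ("sense_fwd", "sense_rev")
ANTISENSE = ("antisense_fwd", "antisense_rev")


def strand_call(bases):
    """Return the base the reads of one strand agree on, else None."""
    observed = {b for b in bases if b is not None}
    return observed.pop() if len(observed) == 1 else None


def consensus_base(reads):
    """Call a base at one position only if both strands support it.

    `reads` maps the four read labels to a base ('A', 'C', 'G', 'T') or None.
    """
    sense = strand_call(reads.get(k) for k in SENSE)
    antisense = strand_call(reads.get(k) for k in ANTISENSE)
    if sense is not None and sense == antisense:
        return sense
    return "N"  # ambiguous: strands disagree or one strand is uncovered


concordant = {"sense_fwd": "A", "sense_rev": "A",
              "antisense_fwd": "A", "antisense_rev": "A"}
discordant = {"sense_fwd": "G", "sense_rev": "G",
              "antisense_fwd": "A", "antisense_rev": "A"}
print(consensus_base(concordant))  # A
print(consensus_base(discordant))  # N
```

Requiring agreement between strands is only one possible rule; a real pipeline would also weigh base qualities and read depth.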
An additional potential application for cLPP is the targeted resequencing of problematic DNA samples derived from formalin-fixed paraffin-embedded (FFPE) tissues. DNA extracted from FFPE samples frequently contains lesions such as abasic sites that lead to a significant increase in sequencing errors when using traditional single-strand sequence capture methods [9]. Owing to the ability of cLPP to capture both strands, it could become a compelling option for targeted resequencing of these sample types. Although cLPP appears to be better suited than traditional CIPs for clinical use, both methods require a large sample size to be economical because of the initial cost of assay development. Furthermore, to our knowledge, reagents based on cLPP are not yet commercially available, which poses a challenge to its widespread adoption.
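One way to picture how double-strand capture could help with damaged DNA is a simple strand-concordance check on a candidate variant. The sketch below is an illustrative assumption rather than a method described by Shen et al.: the function `classify_variant`, its `min_reads` threshold and the count-based rule are all hypothetical.

```python
def classify_variant(sense_alt, sense_total, antisense_alt, antisense_total,
                     min_reads=2):
    """Classify a candidate variant by its per-strand read support.

    The arguments are read counts derived from the captured sense and
    antisense strands: reads carrying the alternate allele, and all reads.
    `min_reads` is an arbitrary illustrative threshold, not a validated value.
    """
    if sense_total < min_reads or antisense_total < min_reads:
        return "insufficient_coverage"
    on_sense = sense_alt >= min_reads
    on_antisense = antisense_alt >= min_reads
    if on_sense and on_antisense:
        return "supported_on_both_strands"
    if on_sense or on_antisense:
        # Support from only one strand is the pattern a lesion on a
        # single strand of the duplex might be expected to produce.
        return "single_strand_only"
    return "not_supported"


print(classify_variant(10, 20, 9, 18))  # supported_on_both_strands
print(classify_variant(8, 20, 0, 22))   # single_strand_only
```

With single-strand capture the second case could not be distinguished from the first, because no reads from the opposite strand would be available for comparison.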
cLPP is an innovative approach to high-throughput target enrichment for next-generation sequencing. It addresses several shortcomings of current targeted sequencing methods in areas such as accuracy, CNV detection and cost. Most compelling is its ability to preserve strand information and separately sequence sense and antisense strands. Beyond the resulting improvement in variant detection fidelity, other applications that rely on double-strand targeting could benefit. Such applications include problematic DNA samples, where damage to a single DNA strand makes redundancy important for retrieving as much information as possible.
List of abbreviations
CIPs: connector inversion probes
cLPP: complementary long padlock probes
CNV: copy number variation
FFPE: formalin-fixed paraffin-embedded
References
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921.
- Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, Gocayne JD, Amanatides P, Ballew RM, Huson DH, Wortman JR, Zhang Q, Kodira CD, Zheng XH, Chen L, Skupski M, Subramanian G, Thomas PD, Zhang J, Gabor Miklos GL, Nelson C, Broder S, Clark AG, Nadeau J, McKusick VA, Zinder N: The sequence of the human genome. Science. 2001, 291: 1304-1351.
- Rowida A, van der Heijden J, Ariyurek Y, Lai Y, Bakker E, van Galen M, Breuning MH, den Dunnen J: Experiences with array-based sequence capture; toward clinical applications. Eur J Hum Genet. 2011, 19: 50-55.
- Hedges DJ, Guettouche T, Yang S, Bademci G, Diaz A, Andersen A, Hulme WF, Linker S, Mehta A, Edwards YJ, Beecham GW, Martin ER, Pericak-Vance MA, Zuchner S, Vance JM, Gilbert JR: Comparison of three targeted enrichment strategies on the SOLiD sequencing platform. PLoS One. 2011, 6: e18595.
- Boyd SD: Diagnostic applications of high-throughput DNA sequencing. Annu Rev Pathol. 2013, 8: 381-410.
- Akhras MS, Unemo M, Thiyagarajan S, Nyrén P, Davis RW, Fire AZ, Pourmand N: Connector inversion probe technology: a powerful one-primer multiplex DNA amplification system for numerous scientific applications. PLoS One. 2007, 2: e195.
- Krishnakumar S, Zheng J, Wilhelmy J, Faham M, Mindrinos M, Davis R: A comprehensive assay for targeted multiplex amplification of human DNA sequences. Proc Natl Acad Sci USA. 2008, 105: 9296-9301.
- Shen P, Wang W, Chi A-K, Fan Y, Davis RW, Scharfe C: Multiplex target capture with double-stranded DNA probes. Genome Med. 2013, 5: 50.
- Kerick M, Isau M, Timmermann B, Sültmann H, Herwig R, Krobitsch S, Schaefer G, Verdorfer I, Bartsch G, Klocker H, Lehrach H, Schweiger MR: Targeted high throughput sequencing in clinical cancer settings: formaldehyde fixed-paraffin embedded (FFPE) tumor tissues, input amount and tumor heterogeneity. BMC Med Genomics. 2011, 4: 68.