Improved coverage and accuracy with strand-conserving sequence enrichment

Targeted next-generation sequencing is becoming a common tool in the molecular diagnostic laboratory. However, currently available methods to enrich for regions of interest in the DNA sequence suffer from drawbacks such as high cost, complex protocols, lack of clinical-level accuracy and uneven target coverage. A target-enrichment approach using complementary long padlock probes described in a recent article significantly improves on previous methods in most of these areas. See related Research: http://genomemedicine.com/content/5/5/50

relatively complex protocols [3]. In-solution capture and PCR-based enrichment methods have reduced the cost and complexity of protocols considerably [4]. These improvements led to a wider adoption of next-generation sequencing and, in the past 12 months particularly, an increase in the use of targeted resequencing as a diagnostic tool [5].
Nevertheless, current methods are far from perfect. For example, PCR-based methods require highly multiplexed oligonucleotide pairs targeted to heterogeneous sequences with a range of melting temperatures and CG content to generate hundreds or thousands of amplicons in a single tube. This leads to differences in amplicon presentation and uneven sequence coverage. Hybridization-based methods exhibit significantly more off-target capture than other enrichment methods, do not capture repetitive sequences, and poorly cover GC- and AT-rich regions. Methods employing 'capture by circularization' (Figure 1), such as connector inversion probes (CIPs), also have problems. These methods use single-stranded DNA molecules with gene-specific targeting regions at the 5' and 3' ends that are complementary to the targeted genomic DNA [6]. After hybridization of the targeting ends of the CIP to the genomic DNA, a single-stranded DNA circle is formed and closed by gap filling and ligation. The single-stranded DNA circle is then linearized by restriction digest, and the target region is enriched by PCR and finally sequenced. CIPs require a large backbone for the probes to capture targets efficiently, which makes them expensive and difficult to manufacture [7].
The size of a target region is limited to a few megabases, which restricts the number of genes/exons that can be included in a clinical sequencing panel. In addition, all current capture methods use only one strand of genomic DNA, missing out on an additional level of possible accuracy.
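The heterogeneity that troubles multiplexed PCR can be made concrete with a rough calculation. The following Python sketch is not taken from any of the cited methods: it estimates GC content and an approximate melting temperature using the Wallace rule (2 °C per A/T and 4 °C per G/C, a coarse rule of thumb for short oligonucleotides) and reports the spread across a primer pool. The three primer sequences and their names are invented for illustration only.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is only a rough estimate
    for short oligonucleotides, not a thermodynamic model.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_spread(pool: dict) -> int:
    """Difference between the highest and lowest estimated Tm in a pool."""
    tms = [wallace_tm(seq) for seq in pool.values()]
    return max(tms) - min(tms)


# Hypothetical 20-mers chosen to span AT-rich to GC-rich targets.
primers = {
    "at_rich": "ATTATAAATTTAGATAATTA",
    "balanced": "ATGCCGTAAGCTTGACCTGA",
    "gc_rich": "GCCGCGGGCCGCACGGCGCC",
}

if __name__ == "__main__":
    for name, seq in primers.items():
        print(name, f"GC={gc_fraction(seq):.2f}", f"Tm~{wallace_tm(seq)}C")
    print("Tm spread across pool:", tm_spread(primers), "C")
```

A wide spread in estimated melting temperature within one tube is one plausible reason why some amplicons are over-represented and others under-represented after multiplexed amplification.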

Overcoming current limitations in target enrichment
By contrast with standard capture methods, the complementary long padlock probe (cLPP) approach, as described by Shen et al. [8], captures both strands of the target region, effectively doubling the target sequence information compared with other capture methods. This is achieved by generating double-stranded CIPs that are incubated at high temperatures to create single DNA strands, and then hybridized to the sense and antisense strands of genomic DNA, effectively forming two complementary single-stranded DNA circles. In addition, cLPP enables the sequencing of both strands in both the forward and reverse direction (Shen et al. call this reciprocal paired-end sequencing), resulting in a total of four unique sequence reads per template. This redundancy reduces uneven coverage due to differences in the amplification efficiencies of the target regions, and increases coverage and accuracy. This should lead to increased confidence in variant calls in the downstream bioinformatics analysis, and might allow for a reduced average depth of sequence coverage, resulting in less sequencing per sample and thus lowering cost. Shen et al. also demonstrate that copy number variation (CNV) detection can be improved with this enrichment method owing to its significantly better discrimination of high- and low-covered targets.
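How four reads per template could raise confidence in a base call can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the analysis pipeline used by Shen et al.: it assumes that all four reads cover exactly the same interval, that reads derived from the antisense strand are reported as the reverse complement of the sense sequence, and that a base is accepted only when at least `min_agree` of the four reads agree. The function names and the threshold are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def consensus(reads: list, min_agree: int = 3) -> str:
    """Column-wise consensus of equal-length reads in the same orientation.

    A position is reported as 'N' when fewer than min_agree reads
    support the most common base.
    """
    if len({len(r) for r in reads}) != 1:
        raise ValueError("all reads must have the same length")
    called = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count >= min_agree else "N")
    return "".join(called)


def consensus_from_strands(sense_reads: list, antisense_reads: list,
                           min_agree: int = 3) -> str:
    """Orient antisense reads to the sense strand, then call a consensus."""
    oriented = [r.upper() for r in sense_reads]
    oriented += [reverse_complement(r) for r in antisense_reads]
    return consensus(oriented, min_agree)


if __name__ == "__main__":
    # Toy template; the second antisense read carries one simulated error.
    sense = ["ACGGTCAT", "ACGGTCAT"]
    antisense = ["ATGACCGT", "ATGAACGT"]
    print(consensus_from_strands(sense, antisense))
    print(consensus_from_strands(sense, antisense, min_agree=4))
```

With the default threshold the single discordant read is outvoted; requiring all four reads to agree instead masks the affected position, which trades sensitivity for specificity.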
An additional interesting potential application for cLPP is the targeted resequencing of problematic DNA samples derived from formalin-fixed paraffin-embedded (FFPE) tissues. DNA extracted from FFPE samples frequently contains lesions such as abasic sites that lead to a significant increase in sequencing errors when using traditional single-strand sequence capture methods [9]. Owing to the ability of cLPP to capture both strands, it could become a compelling option for targeted resequencing of these sample types. Although cLPP appears to be better suited than traditional CIPs for clinical use, both methods require a large sample size to be economical because of the initial cost of assay development. Furthermore, to our knowledge, reagents based on cLPP are not yet commercially available, which poses a challenge to its widespread adoption.
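The rationale for FFPE samples can be expressed as a simple strand-concordance check. The sketch below rests on an assumption: a lesion confined to one DNA strand produces apparent variants only in reads derived from that strand, whereas a true variant is seen on both. The class, the function, the category labels and the thresholds (`min_alt_fraction`, `min_depth`) are all hypothetical and chosen for illustration; they are not parameters from the cLPP study.

```python
from dataclasses import dataclass


@dataclass
class StrandCounts:
    """Reference and alternate allele read counts for one strand."""
    ref: int
    alt: int

    @property
    def depth(self) -> int:
        return self.ref + self.alt

    @property
    def alt_fraction(self) -> float:
        return self.alt / self.depth if self.depth else 0.0


def classify_variant(sense: StrandCounts, antisense: StrandCounts,
                     min_alt_fraction: float = 0.2,
                     min_depth: int = 10) -> str:
    """Label a candidate variant by whether both strands support it."""
    if sense.depth < min_depth or antisense.depth < min_depth:
        return "insufficient_depth"
    on_sense = sense.alt_fraction >= min_alt_fraction
    on_antisense = antisense.alt_fraction >= min_alt_fraction
    if on_sense and on_antisense:
        return "strand_concordant"
    if on_sense or on_antisense:
        # Seen on one strand only: consistent with single-strand damage.
        return "single_strand_only"
    return "no_variant"


if __name__ == "__main__":
    # Invented counts for two candidate sites.
    print(classify_variant(StrandCounts(60, 40), StrandCounts(55, 45)))
    print(classify_variant(StrandCounts(70, 30), StrandCounts(100, 0)))
```

A capture method that reports only one strand cannot make this distinction, because the second set of counts is never observed.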

Conclusion
cLPP is an innovative approach for high-throughput target enrichment for next-generation sequencing. It addresses a number of shortcomings of current targeted sequencing methods in areas such as accuracy, CNV detection and cost. Most compelling is its ability to preserve strand information and separately sequence sense and antisense strands. Beyond the resulting improvement of variant detection fidelity, other applications that rely on double-strand targeting could benefit. Such applications include problematic DNA samples, where redundancy is important to retrieve as much information as possible because of damage to a single DNA strand.

Competing interests
The authors declare that they have no competing interests.