Figure 1 | Genome Medicine

Figure 1

From: Ultradeep analysis of tumor heterogeneity in regions of somatic hypermutation

Deep-Drilling iterative Mapping (DDiMAP) flowchart. (A) This overview schematic illustrates the novel components in a DDiMAP pipeline. Key points include partitioning of the reference sequence into computational units called regions of analysis (ROAs), with mapped reads uniquely assigned to ROAs using alignment information within BAM files. Variant sequence patterns are collected in each ROA, forming a ‘dictionary’ of unique ‘words’, which are retained based on frequency thresholds. Retained words are partially assembled with words from overlapping ROAs in a cross-verification process. Partially assembled sequences containing variant sequence patterns may be used as additional reference sequences for the optional remapping of reads, a process that is repeated until no new variants above a coverage-dependent threshold are observed. For variant identification, data from all ROAs are tallied at each location from the verified dictionary-based sequences. (B) The reference sequence is partitioned into abutting ROAs. Reads are assigned to an ROA based on their ability to completely cover the ROA, maintaining the contained read segments (blue letters) while discarding the excess (gray letters). (C) ROA analysis includes counting all read segments matching observed word sequence patterns. This compresses all read data from each ROA into a listing of unique words with a count of occurrences in each read direction. (D) An ROA collection is a pair of tracks of abutting ROAs that overlap by half, with reads assigned to one and only one track. Overlapping dictionaries facilitate partial assembly of sequences into longer fragments that contain observed variation from the initial reference sequence; these fragments are added as references to enhance the mapping of reads that contain a high density of variation. Additionally, comparison of overlapping dictionary entries formed using independent sets of overlapping variant sequences provides an independent cross-validation of variant sequences for SNV identification. See text and Additional file 1 for complete details.
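
The ROA assignment and dictionary steps in panels B to D can be illustrated with a minimal Python sketch. It is not the DDiMAP implementation: the function names, the gap-free read representation `(start, sequence, strand)`, the rule of assigning each read to the first ROA it fully covers, and the `min_fraction` threshold are all assumptions made for illustration. The caption only states that reads are uniquely assigned and that words are retained based on frequency thresholds.

```python
from collections import Counter, defaultdict


def assign_read(start, length, roa_len):
    """Return the start of the first ROA fully covered by a read, or None.

    Assumption: two tracks of abutting ROAs offset by half an ROA, so ROA
    starts fall on multiples of roa_len // 2. Picking the first covered ROA
    is a hypothetical tie-break that keeps the assignment unique.
    """
    half = roa_len // 2
    first = -(-start // half) * half  # smallest multiple of half >= start
    if first + roa_len <= start + length:
        return first
    return None


def build_dictionaries(reads, roa_len):
    """Compress reads into per-ROA dictionaries of (word, strand) counts.

    Assumption: each read is (0-based start, sequence, strand) and aligns
    to the reference without gaps.
    """
    dictionaries = defaultdict(Counter)
    for start, seq, strand in reads:
        roa = assign_read(start, len(seq), roa_len)
        if roa is None:
            continue  # read does not completely cover any ROA
        # Keep only the segment inside the ROA; discard the excess bases.
        word = seq[roa - start : roa - start + roa_len]
        dictionaries[roa][(word, strand)] += 1
    return dictionaries


def retain_words(dictionary, min_fraction):
    """Keep words whose share of the ROA's reads meets a hypothetical threshold."""
    total = sum(dictionary.values())
    per_word = Counter()
    for (word, _strand), count in dictionary.items():
        per_word[word] += count
    return {w for w, n in per_word.items() if n / total >= min_fraction}
```

With `roa_len = 4`, ROAs start at positions 0, 2, 4 and so on, alternating between the two tracks, so a read starting at position 1 contributes its word to the ROA starting at position 2. Counting by `(word, strand)` preserves the per-direction tallies described in panel C.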
