
Table 1 Overview of whole-genome haplotyping methods

From: Whole-genome haplotyping approaches and genomic medicine

Columns: Method; Minimal cohort; Advantages; Limitations*

Molecular

Single and paired-end physical reads
Minimal cohort: Individual
Advantages: Haplotype is directly observed from sequence data; simple; can resolve private and rare haplotypes; can phase de novo variants
Limitations*: Produces short haplotypes, even after assembly

Chromosome sorting, clone-by-clone, dilution, proximity ligation
Minimal cohort: Individual
Advantages: Haplotype is directly observed from sequence data; highly accurate; can resolve private and rare haplotypes; can phase de novo variants; can resolve long-range and chromosome-length haplotypes (depending on method); ideal for generating personalized genome-resolved haplotypes
Limitations*: May be labor-intensive, time-consuming and expensive, and therefore difficult to translate to large sample sizes

 

Haplotype assembly
Minimal cohort: Individual
Advantages: Leverages molecular haplotype information from WGS data and/or from sorted chromosomes or clones; works well when molecular haplotypes are long (that is, from cosmid or BAC)
Limitations*: Assembly requires variants in overlapping sequence reads; limited by the accuracy and availability of suitable reference data; generates short-range haplotypes; may introduce phase errors

Genetic analysis
Minimal cohort: Trios, nuclear families
Advantages: Can accurately phase high-throughput short-read sequencing reads; low error rate; precisely maps recombinations and inheritance states; enables detection of sequencing errors; can phase private and rare alleles; can phase entire chromosomes; suitable for clinical applications
Limitations*: Cannot resolve sites where all family members are heterozygous; may not be possible to ascertain family members

Population inference
Minimal cohort: Unrelated individuals, duos, trios
Advantages: Cost-effective; facilitates haplotype imputation in samples with low-density microarray panels; useful when family members cannot be ascertained; large sample sizes increase accuracy; good for large samples of unrelated individuals; incorporation of family duos and trios improves accuracy
Limitations*: Can only phase common variants; difficult to impute private variants or rare haplotypes; limited by the accuracy and availability of suitable reference data; generates short-range haplotypes; sample size impacts haplotype frequency estimations; methods are probabilistic and accuracy must be balanced against computational costs

*All of these methods are limited by the accuracy of the sequence data.
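
The genetic-analysis limitation listed in the table (sites where all family members are heterozygous cannot be resolved) can be illustrated with a minimal sketch of trio phasing at a single biallelic site. This is an illustration under stated assumptions, not the method of any particular tool: genotypes are represented as unordered allele pairs (0 = reference, 1 = alternate), and the names `consistent_transmissions` and `phase_child` are hypothetical.

```python
from typing import Optional, Set, Tuple

# Assumption: one biallelic site, genotype given as an unordered allele pair
# (0 = reference allele, 1 = alternate allele).
Genotype = Tuple[int, int]


def consistent_transmissions(
    child: Genotype, mother: Genotype, father: Genotype
) -> Set[Tuple[int, int]]:
    """Return every (maternal_allele, paternal_allele) pair that reproduces the child's genotype."""
    return {
        (m, f)
        for m in set(mother)   # alleles the mother could transmit
        for f in set(father)   # alleles the father could transmit
        if sorted((m, f)) == sorted(child)
    }


def phase_child(
    child: Genotype, mother: Genotype, father: Genotype
) -> Optional[Tuple[int, int]]:
    """Return the child's alleles as (maternal, paternal), or None if the site cannot be phased."""
    options = consistent_transmissions(child, mother, father)
    if len(options) == 1:
        return next(iter(options))
    # Zero options: the genotypes violate Mendelian inheritance
    # (a possible sequencing error or de novo variant).
    # Two options: child and both parents are heterozygous, so the
    # parental origin of each allele is ambiguous.
    return None


if __name__ == "__main__":
    print(phase_child((0, 1), (0, 0), (0, 1)))  # informative site
    print(phase_child((0, 1), (0, 1), (0, 1)))  # all heterozygous
    print(consistent_transmissions((1, 1), (0, 0), (0, 1)))  # Mendelian inconsistency
```

In this sketch, a site is phased only when exactly one transmission is consistent with the family's genotypes. An empty result from `consistent_transmissions` flags a Mendelian inconsistency, which is how family data can help detect sequencing errors, while two consistent transmissions correspond to the all-heterozygous case that the table lists as unresolvable.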