
Table 2 Summary of selected software available for whole-genome haplotyping

From: Whole-genome haplotyping approaches and genomic medicine

| Method | Software | Summary | Reference(s) |
| --- | --- | --- | --- |
| Molecular - haplotype assembly | HapCut (OSS) | A combinatorial approach implementing a max-cut-based algorithm and optimized minimum error correction (MEC) solution | [22] |
| | Single Individual Haplotyper (OSS) | A collection of algorithms including RefHap, a heuristic algorithm for sorting reads into haplotype bins | [31],[37] |
| | H-BOP (OSS) | Heuristic algorithm for optimizing a combination of the MEC and Maximum Fragments Cut models | [38] |
| | MixSIH (OSS) | Probabilistic mixture model | [39] |
| | HASH (OSS) | Markov chain Monte Carlo algorithm | [21] |
| Genetic analysis | Haploscribe (OSS) | Implements a parsimony approach to generate inheritance state vectors and a hidden Markov model to deduce haplotypes | [8] |
| Population inference | Beagle (OSS) | Phased input data are used to build a local haplotype cluster model, which is sampled using a hidden Markov model. Iterations and the Viterbi algorithm are used to select the most likely haplotype | [40] |
| | fastPHASE (OSS) | Enhancement of PHASE. Implements a haplotype-clustering model with a fixed number of clusters and hidden Markov model assumptions for cluster membership. Expectation-maximization methods are incorporated for parameterization | [41] |
| | GERMLINE (OSS) | Implements a hashing-algorithm approach to identifying whole-haplotype segment sharing | [42] |
| | IMPUTE2 | Pre-phasing, imputation and haplotype sampling strategy incorporating a Monte Carlo algorithm and Markov model calculations | [43] |
| | MaCH | Implements a Markov chain algorithm for genotype imputation and haplotyping | [44] |
| | PHASE (OSS) | Implements Bayesian haplotype reconstruction | [45] |
| | SHAPEIT | Implements hidden Markov model sampling | [46],[47] |
| | SNPTools (OSS) | A population imputation pipeline that generates genotype likelihoods using a binary sequence map-specific binomial mixture model. Haplotypes are then sampled using a hidden Markov model | [48] |
| | WinHAP (OSS) | Scalable sliding windows are used to optimize haplotypes and a parsimony approach iteratively restricts the number of solutions | [49] |
| Combination strategies | HARSH (OSS) | Sampling within a probabilistic model combining read data with a reference panel of haplotypes. Successor to Hap-SeqX | [50] |
| | SHAPEIT2 | Adds short-read molecular information to population inference | [51] |
| | Prism | Combines haplotype assembly and population inference | [36] |
| | PPHS | Implements a phylogeny model to estimate haplotype frequencies recursively using the expectation maximization algorithm | [52] |
| | FamilyQuartet (OSS) | Integrates physical, genetic and population phasing | [53] |
Abbreviations: OSS, open source software; MEC, minimum error correction.
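
Several of the haplotype assembly tools listed above optimize a minimum error correction (MEC) objective. As a rough illustration of what that objective measures, the sketch below scores a candidate haplotype against a set of read fragments by counting the fewest allele calls that would need to be flipped for every fragment to agree with either the haplotype or its complement. All names (`Fragment`, `mismatches`, `mec_score`, `best_haplotype_bruteforce`) and the toy data are hypothetical and are not taken from any tool in the table. The exhaustive search is shown only to make the objective concrete on a tiny example; it is not the max-cut or heuristic strategy that HapCut, H-BOP or the other listed programs use.

```python
import itertools
from typing import Dict, List, Tuple

# A fragment maps heterozygous site index -> observed allele (0 or 1).
# Hypothetical representation chosen for this sketch only.
Fragment = Dict[int, int]


def mismatches(fragment: Fragment, haplotype: List[int]) -> int:
    """Count sites where the fragment disagrees with the haplotype."""
    return sum(1 for site, allele in fragment.items() if haplotype[site] != allele)


def mec_score(fragments: List[Fragment], haplotype: List[int]) -> int:
    """MEC score of a diploid phasing (haplotype plus its complement).

    Each fragment is assigned to whichever of the two haplotypes it
    matches best; the score is the total number of remaining mismatches.
    """
    complement = [1 - allele for allele in haplotype]
    return sum(
        min(mismatches(f, haplotype), mismatches(f, complement))
        for f in fragments
    )


def best_haplotype_bruteforce(
    fragments: List[Fragment], n_sites: int
) -> Tuple[List[int], int]:
    """Exhaustively find a minimum-MEC haplotype (toy sizes only).

    The first allele is fixed to 0 because a haplotype and its
    complement always have the same score.
    """
    best_hap: List[int] = []
    best_score = -1
    for bits in itertools.product([0, 1], repeat=n_sites - 1):
        candidate = [0, *bits]
        score = mec_score(fragments, candidate)
        if best_score < 0 or score < best_score:
            best_hap, best_score = candidate, score
    return best_hap, best_score


# Toy data: four heterozygous sites, five fragments, one of which
# (the last) carries a single allele call that fits neither haplotype.
FRAGMENTS: List[Fragment] = [
    {0: 0, 1: 0, 2: 1},
    {1: 1, 2: 0, 3: 0},
    {0: 1, 1: 1},
    {2: 1, 3: 1},
    {1: 0, 2: 1, 3: 0},
]

if __name__ == "__main__":
    hap, score = best_haplotype_bruteforce(FRAGMENTS, n_sites=4)
    print(hap, score)
```

The search space doubles with every additional site, so exhaustive enumeration is impractical beyond very small examples.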