Table 1 Considerations and challenges for the identification of disease causal mutations

From: Identifying disease mutations in genomic medicine settings: current challenges and how to accelerate progress

| Category | Consideration | Challenges | Solutions |
| --- | --- | --- | --- |
| Mutation detection | Platform selection | Different sequencing platforms have variable error rates | Increase sequencing coverage for platforms with high error rates |
| | Sequencing target selection | Exome sequencing may miss regulatory variants that are disease causal | Use whole genome sequencing when budget is not a concern, or when diseases other than well-studied classical Mendelian diseases are encountered |
| | Variant generation | Genotype calling algorithms differ from each other and have specific limitations | Use multiple alignment and variant calling algorithms and look for concordant calls. Use local assembly to improve indel calls |
| | Variant annotation | Multiple gene models and multiple function prediction algorithms are available | Perform a comprehensive set of annotations and make informed decisions; use a probabilistic model for ranking genes/variants |
| | Variant validation | Predicted disease causal mutations may be false positives | Secondary validation by Sanger sequencing or capture-based sequencing on specific genes/regions |
| Type of mutations | Coding and splice variants | Many prediction algorithms are available | Evaluate all prediction algorithms under different settings. Develop consensus approaches for combining evidence from multiple algorithms |
| | Untranslated region, synonymous and non-coding variants | Little information on known causal variants in databases such as HGMD | Improve bioinformatics predictions using multiple sources of information (ENCODE data, multispecies conservation, RNA structure, and so on) |
| Specific application areas | Somatic mutations in cancer | Tissues selected for sequencing may not harbor large fractions of cells with causal mutations due to heterogeneity; variant calling is complicated by stromal contamination; current databases on allele frequencies do not apply to somatic mutations; current function prediction algorithms focus on loss-of-function mutations | Sample several tissue locations for sequencing; use algorithms specifically designed for tumors that account for heterogeneity; use somatic mutation databases such as COSMIC; develop function prediction algorithms specifically for gain-of-function mutations in cancer-related genes/pathways |
| | Non-invasive fetal sequencing | Variants from fetal and maternal genomes need to be teased apart; severe consequences when variants are incorrectly detected and predicted to be highly pathogenic | Much greater sequencing depth and more sophisticated statistical approaches that best leverage prior information for inferring fetal alleles; far more stringent criteria to predict pathogenic variants |
| Inheritance pattern | Inherited from affected parents | Rare/private mutations may be neutral | Evaluate extended pedigrees and 'clans' to assess the potential role of private variants |
| | De novo mutations from unaffected parents | Every individual is expected to carry three de novo mutations, including about one amino acid altering mutation per newborn | Detailed functional analysis of the impacted genes |
| Biological validation | Known disease causal genes | Difficult to conclude causality when a mutation is found in a well-known disease causal gene | Examine public databases such as locus-specific databases |
| | Previously characterized genes not known to cause the disease of interest | Relate known molecular function to phenotype of interest | Evaluate loss of function by biochemical assays where available |
| | Genes without known function | Difficult to design functional follow-up assays | Evaluate gene expression data. Use model organisms to recapitulate the phenotype of interest |
| Statistical validation | Rare diseases | Limited power to declare association | Sequence candidate genes in unrelated patients to identify additional causal variants |
| | Idiopathic diseases | Lack of additional unrelated patients | Comprehensive functional follow-up of the biospecimens from patients to prove causality |
| | Mendelian diseases or traits | Finding rare, unrelated individuals with same phenotype and same mutation to help prove causality | Networking of science through online databases can help find similarly affected people with same phenotype and mutation |
| Type of phenotypes | Mendelian forms of complex diseases or traits | Several major-effect mutations may work together to cause disease | Statistical models of combined effects (additive and epistatic) of multiple variants within each individual |
| | Complex diseases or traits | Many variants may contribute to disease risk, each with small effect sizes | Refrain from making predictions unless prior evidence suggests that such predictive models are of practical utility (for example, area under the receiver operating characteristic curve >0.8) |
HGMD, Human Gene Mutation Database.
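
The "Variant generation" row recommends running multiple variant callers and looking for concordant calls. A minimal sketch of that idea follows, assuming each caller's output has already been parsed into a set of (chromosome, position, reference allele, alternate allele) tuples. The function name `concordant_calls`, the `min_callers` parameter, and the example variants are hypothetical and are not taken from the source article.

```python
from collections import Counter
from typing import Iterable

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def concordant_calls(call_sets: Iterable[set[Variant]], min_callers: int = 2) -> set[Variant]:
    """Return the variants reported by at least `min_callers` of the call sets."""
    counts: Counter = Counter()
    for calls in call_sets:
        # Each call set is a set, so a caller contributes at most one vote per variant.
        counts.update(calls)
    return {variant for variant, n in counts.items() if n >= min_callers}


# Made-up call sets from three hypothetical callers.
caller_a = {("chr1", 12345, "A", "G"), ("chr2", 500, "C", "T"), ("chr3", 42, "G", "GA")}
caller_b = {("chr1", 12345, "A", "G"), ("chr3", 42, "G", "GA"), ("chr7", 999, "T", "C")}
caller_c = {("chr1", 12345, "A", "G"), ("chr2", 500, "C", "T")}

majority = concordant_calls([caller_a, caller_b, caller_c], min_callers=2)
unanimous = concordant_calls([caller_a, caller_b, caller_c], min_callers=3)
print(sorted(majority))
print(sorted(unanimous))
```

Exact tuple matching only works if all callers represent the same variant identically. Indels in particular can be written in several equivalent ways, so call sets would normally need to be normalized (for example, left-aligned) before a comparison like this is meaningful.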
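
The "Complex diseases or traits" row sets a practical-utility bar of a receiver operating characteristic value above 0.8, which is read here as the area under the ROC curve (AUC); that reading is an assumption. The sketch below computes AUC through its rank interpretation: the probability that a randomly chosen affected individual receives a higher risk score than a randomly chosen unaffected one, with ties counted as one half. The function name `roc_auc`, the scores, and the labels are illustrative only.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (labels: 1 = case, 0 = control)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # case ranked above control
            elif case_score == control_score:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Made-up risk scores for six individuals (three cases, three controls).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}; meets the 0.8 bar: {auc > 0.8}")
```

The pairwise loop is quadratic in the number of individuals, which is fine for illustration; for large cohorts a sort-based computation or an established statistics library would be the more usual choice.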