
Table 1 Considerations and challenges for the identification of disease causal mutations

From: Identifying disease mutations in genomic medicine settings: current challenges and how to accelerate progress

| Category | Consideration | Challenges | Solutions |
| --- | --- | --- | --- |
| Mutation detection | Platform selection | Different sequencing platforms have variable error rates | Increased sequencing coverage for platforms with high error rates |
| | Sequencing target selection | Exome sequencing may miss regulatory variants that are disease causal | Use whole genome sequencing when budget is not a concern, or when diseases other than well-studied classical Mendelian diseases are encountered |
| | Variant generation | Genotype calling algorithms differ from each other and have specific limitations | Use multiple alignment and variant calling algorithms and look for concordant calls. Use local assembly to improve indel calls |
| | Variant annotation | Multiple gene models and multiple function prediction algorithms are available | Perform a comprehensive set of annotations and make informed decisions; use a probabilistic model for ranking genes/variants |
| | Variant validation | Predicted disease causal mutations may be false positives | Secondary validation by Sanger sequencing or capture-based sequencing on specific genes/regions |
| Type of mutations | Coding and splice variants | Many prediction algorithms are available | Evaluate all prediction algorithms under different settings. Develop consensus approaches for combining evidence from multiple algorithms |
| | Untranslated region, synonymous and non-coding variants | Little information on known causal variants in databases such as HGMD | Improved bioinformatics predictions using multiple sources of information (ENCODE data, multispecies conservation, RNA structure, and so on) |
| Specific application areas | Somatic mutations in cancer | Tissues selected for sequencing may not harbor large fractions of cells with causal mutations due to heterogeneity; variant calling is complicated by stromal contamination; current databases on allele frequencies do not apply to somatic mutations; current function prediction algorithms focus on loss-of-function mutations | Sample several tissue locations for sequencing; utilize algorithms specifically designed for tumors, with consideration for heterogeneity; use somatic mutation databases such as COSMIC; develop function prediction algorithms specifically for gain-of-function mutations in cancer-related genes/pathways |
| | Non-invasive fetal sequencing | Variants from fetal and maternal genomes need to be teased apart; severe consequences when variants are incorrectly detected and predicted to be highly pathogenic | Substantially increased sequencing depth and more sophisticated statistical approaches that best leverage prior information for inferring fetal alleles; far more stringent criteria to predict pathogenic variants |
| Inheritance pattern | Inherited from affected parents | Rare/private mutations may be neutral | Evaluate extended pedigrees and 'clans' to assess the potential role of private variants |
| | De novo mutations from unaffected parents | Every individual is expected to carry three de novo mutations, including about one amino acid altering mutation per newborn | Detailed functional analysis of the impacted genes |
| Biological validation | Known disease causal genes | Difficult to conclude causality when a mutation is found in a well-known disease causal gene | Examine public databases such as locus-specific databases |
| | Previously characterized genes not known to cause the disease of interest | Relate known molecular function to phenotype of interest | Evaluate loss of function by biochemical assays where available |
| | Genes without known function | Difficult to design functional follow-up assays | Evaluate gene expression data. Use model organisms to recapitulate the phenotype of interest |
| Statistical validation | Rare diseases | Limited power to declare association | Sequence candidate genes in unrelated patients to identify additional causal variants |
| | Idiopathic diseases | Lack of additional unrelated patients | Comprehensive functional follow-up of the biospecimens from patients to prove causality |
| | Mendelian diseases or traits | Finding rare, unrelated individuals with the same phenotype and the same mutation to help prove causality | Networking of science through online databases can help find similarly affected people with the same phenotype and mutation |
| Type of phenotypes | Mendelian forms of complex diseases or traits | Several major-effect mutations may work together to cause disease | Statistical models of combined effects (additive and epistatic) of multiple variants within each individual |
| | Complex diseases or traits | Many variants may contribute to disease risk, each with small effect sizes | Refrain from making predictions unless prior evidence suggests that such predictive models are of practical utility (for example, area under the receiver operating characteristic curve >0.8) |

  1. HGMD, Human Gene Mutation Database.
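
The 'Variant generation' row recommends running several alignment and variant calling algorithms and looking for concordant calls. A minimal sketch of that idea follows. It assumes each caller's output has already been reduced to a set of (chromosome, position, reference allele, alternate allele) tuples. The caller names, the example variants, the `concordant_calls` helper, and the `min_callers` threshold are all hypothetical and not taken from the article. A real pipeline would probably also need to normalize indel representation before comparing calls.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def concordant_calls(calls_by_caller: Dict[str, Set[Variant]],
                     min_callers: int = 2) -> List[Variant]:
    """Return variants reported by at least `min_callers` callers, sorted."""
    support = Counter()
    for variants in calls_by_caller.values():
        # Each caller contributes at most one vote per variant (sets dedupe).
        support.update(variants)
    return sorted(v for v, n in support.items() if n >= min_callers)


# Hypothetical call sets from three callers (illustrative data only).
calls = {
    "caller_a": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")},
    "caller_b": {("chr1", 1000, "A", "G"), ("chr3", 42, "G", "GA")},
    "caller_c": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")},
}

# Variants supported by at least two callers.
consensus = concordant_calls(calls, min_callers=2)
# Variants supported by every caller.
unanimous = concordant_calls(calls, min_callers=len(calls))
print(consensus)
print(unanimous)
```

Raising `min_callers` trades sensitivity for specificity: requiring agreement from every caller keeps only the calls least likely to be caller-specific artifacts, while a lower threshold retains variants that a single algorithm may have missed.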