Considerations | | Challenges | Solutions |
---|---|---|---|
Mutation detection | Platform selection | Different sequencing platforms have variable error rates | Increase sequencing coverage on platforms with high error rates |
 | Sequencing target selection | Exome sequencing may miss regulatory variants that are disease causal | Use whole genome sequencing when budget is not a concern, or when diseases other than well-studied classical Mendelian diseases are encountered |
 | Variant generation | Genotype calling algorithms differ from each other and have specific limitations | Use multiple alignment and variant calling algorithms and look for concordant calls. Use local assembly to improve indel calls |
 | Variant annotation | Multiple gene models and multiple function prediction algorithms are available | Perform a comprehensive set of annotations and make informed decisions; use a probabilistic model for ranking genes/variants |
 | Variant validation | Predicted disease causal mutations may be false positives | Secondary validation by Sanger sequencing or capture-based sequencing on specific genes/regions |
Type of mutations | Coding and splice variants | Many prediction algorithms are available | Evaluate all prediction algorithms under different settings. Develop consensus approaches for combining evidence from multiple algorithms |
 | Untranslated region, synonymous and non-coding variants | Little information on known causal variants in databases such as HGMD | Improved bioinformatics predictions using multiple sources of information (ENCODE data, multispecies conservation, RNA structure, and so on) |
Specific application areas | Somatic mutations in cancer | Tissues selected for sequencing may not harbor large fractions of cells with causal mutations due to heterogeneity; variant calling is complicated by stromal contamination; current databases on allele frequencies do not apply to somatic mutations; current function prediction algorithms focus on loss-of-function mutations | Sample several tissue locations for sequencing; utilize algorithms specifically designed for tumors that account for heterogeneity; use somatic mutation databases such as COSMIC; develop function prediction algorithms specifically for gain-of-function mutations in cancer-related genes/pathways |
 | Non-invasive fetal sequencing | Variants from fetal and maternal genomes need to be teased apart; severe consequences when variants are incorrectly detected and predicted to be highly pathogenic | Greatly increased sequencing depth and more sophisticated statistical approaches that best leverage prior information for inferring fetal alleles; far more stringent criteria to predict pathogenic variants |
Inheritance pattern | Inherited from affected parents | Rare/private mutations may be neutral | Evaluate extended pedigrees and 'clans' to assess the potential role of private variants |
 | De novo mutations from unaffected parents | Every individual is expected to carry three de novo mutations, including about one amino acid altering mutation per newborn | Detailed functional analysis of the impacted genes |
Biological validation | Known disease causal genes | Difficult to conclude causality when a mutation is found in a well-known disease causal gene | Examine public databases such as locus-specific databases |
 | Previously characterized genes not known to cause the disease of interest | Difficult to relate known molecular function to the phenotype of interest | Evaluate loss of function by biochemical assays where available |
 | Genes without known function | Difficult to design functional follow-up assays | Evaluate gene expression data. Use model organisms to recapitulate the phenotype of interest |
Statistical validation | Rare diseases | Limited power to declare association | Sequence candidate genes in unrelated patients to identify additional causal variants |
 | Idiopathic diseases | Lack of additional unrelated patients | Comprehensive functional follow-up of the biospecimens from patients to prove causality |
 | Mendelian diseases or traits | Difficult to find rare, unrelated individuals with the same phenotype and the same mutation to help prove causality | Networking of science through online databases can help find similarly affected people with the same phenotype and mutation |
Type of phenotypes | Mendelian forms of complex diseases or traits | Several major-effect mutations may work together to cause disease | Statistical models of combined effects (additive and epistatic) of multiple variants within each individual |
 | Complex diseases or traits | Many variants may contribute to disease risk, each with small effect sizes | Refrain from making predictions unless prior evidence suggests that such predictive models are of practical utility (for example, area under the receiver operating characteristic curve >0.8) |
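
The "Variant generation" row recommends running multiple variant calling algorithms and looking for concordant calls. A minimal sketch of that idea follows, assuming each caller's output has already been parsed into `(chromosome, position, ref, alt)` tuples. The function name `concordant_calls`, the caller labels, and the example variants are hypothetical and are not taken from the source or from any particular tool.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def concordant_calls(
    calls_by_caller: Dict[str, Iterable[Variant]],
    min_callers: int = 2,
) -> Set[Variant]:
    """Return variants reported by at least `min_callers` distinct callers."""
    counts: Counter = Counter()
    for variants in calls_by_caller.values():
        # De-duplicate within a caller so each caller votes once per variant.
        counts.update(set(variants))
    return {variant for variant, n in counts.items() if n >= min_callers}


# Hypothetical call sets from three callers (illustrative data only).
calls = {
    "caller_a": [("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")],
    "caller_b": [("chr1", 1000, "A", "G"), ("chr3", 42, "G", "GA")],
    "caller_c": [("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")],
}

consensus = concordant_calls(calls, min_callers=2)
print(sorted(consensus))
```

Exact tuple matching is a simplification: the same indel can be written in more than one way by different callers, so in practice calls would likely need to be normalized (for example, left-aligned and trimmed) before comparison. The `min_callers` threshold trades sensitivity for specificity and is a choice for the analyst, not something the table prescribes.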