Molecular subtyping and improved treatment of neurodevelopmental disease

The next-generation sequencing revolution has substantially increased our understanding of the mutated genes that underlie complex neurodevelopmental disease. Exome sequencing has enabled us to estimate the number of genes involved in the etiology of neurodevelopmental disease, whereas targeted sequencing approaches have provided the means for quick and cost-effective sequencing of thousands of patient samples to assess the significance of individual genes. By leveraging such technologies and clinical exome sequencing, a genotype-first approach has emerged in which patients with a common genotype are first identified and then clinically reassessed as a group. This approach has proven a powerful methodology for refining disease subtypes. We propose that the molecular characterization of these genetic subtypes has important implications for diagnostics and also for future drug development. Classifying patients into subgroups with a common genetic etiology and applying treatments tailored to the specific molecular defect they carry is likely to improve management of neurodevelopmental disease in the future.

A shift to a genotype-first approach
Neurodevelopmental disorders (NDs) refer to a complex collection of phenotypes that encompass clinically recognizable disorders such as autism spectrum disorders (ASD), intellectual disability (ID), epilepsy and schizophrenia. The diagnosis of NDs has classically fallen within the clinical realm. The diagnosis of epilepsy is somewhat quantitative, with the frequency, onset and family history of seizure events being considered for classification [1], whereas the diagnosis of ASD, ID and schizophrenia is historically more complex. The Diagnostic and Statistical Manual of Mental Disorders (DSM, currently DSM-5) is recognized by the US healthcare system as a standard battery of diagnostic criteria for classifying mental disorders. These criteria recognize patients with ASD as those with primarily communication deficits, which can be measured by several standardized tests (e.g., ADOS, ADI-R and BAPQ). In addition to intelligence quotient (IQ) testing, ID is classified by the DSM-5 as involving adaptive functioning impairments in the conceptual, social and practical skills domains. Individuals diagnosed with schizophrenia must present with at least two disease-associated symptoms, which include delusions, hallucinations, disorganized speech and behavior, and social/occupational dysfunction [2].
Earlier versions of the DSM included phenotypic subtypes for many mental health disorders that have since been eliminated owing to inconsistent diagnoses between clinicians. However, the study of these disorders, ASD and ID in particular, has shown that disease subtypes do exist (such as high-functioning ASD, previously Asperger syndrome) [3]. Twin studies of ASD, epilepsy and schizophrenia showed that NDs have a strong genetic component (heritability [h²] = 40-80 % [4][5][6], h² = 70-88 % [7], and h² = 64-81 % [8,9], respectively). The existence of extensive comorbidity among ND diagnoses has long been recognized; for example, 28 % of individuals who have ID also present with ASD [10], whereas 26 % present with epilepsy [11] and 3.7-5.2 % with schizophrenia [12]. Phenotypic overlap between NDs led to an early hypothesis that common risk genes underlie multiple NDs and, furthermore, that genetic characterization could be a useful diagnostic tool for ND identification and treatment [13].
Studies of copy number variation and whole-exome or whole-genome sequencing (WES and WGS, respectively) of families have highlighted the importance of rare, de novo gene-disruptive mutations in the genetic etiology of NDs. These studies frequently implicated the same copy number variant, biochemical pathway or even the same gene as an underlying factor of seemingly diverse clinical and etiological outcomes (Table 1). One classic example of this genetic overlap is a microdeletion in chromosome 15q13.3, which has been associated with multiple NDs (ASD, ID, epilepsy and schizophrenia) [14]. At the single-gene level, exome sequencing studies have highlighted that specific loci, such as SYNGAP1, ARID1B and ADNP, are likely to contribute to both ASD and ID, whereas mutations in genes such as STXBP1 and WDR45 might contribute to ID and epilepsy but not ASD (Table 1). Recognition of this genetic overlap and the subtlety of the clinical diagnoses of NDs have led to the development of a so-called genotype-first approach, in which patients with a common genotype (i.e., a disruptive variation in the same gene) are collected for deep clinical phenotyping to define the specific disease attributes associated with each candidate ND risk gene [15]. This approach contrasts with phenotype-driven approaches, in which patients are collected on the basis of a shared clinical presentation and used to identify candidate risk genes post hoc.
The goal of this Opinion is to review advances in the discovery of candidate genes based on next-generation sequencing of patients and the impact of these advances on refining specific subtypes of ND. Linking genotypes to deep clinical phenotypes (including information obtained through application of best-practice DSM-5 criteria, clinical dysmorphology assessment, analysis of family histories and electroencephalography) is providing important insight into ND risk gene models [16][17][18].
We propose that grouping patients on the basis of a shared genetic etiology is a critical first step in tailoring improved therapeutics to a defined subset of patients.

Table 1 footnotes: All counts represent de novo mutations that are likely to be gene-disruptive, including frameshift, splice and nonsense mutations.
(a) Gene also identified through genotype-first approaches.
(b) 5001-5922 individuals with ASD were screened depending on the gene. ASD data have been previously published [19][23][24][25][26][27][28].
(c) 1284 individuals with ID/DD were screened. ID/DD data have been previously published [29][30][31].
(d) 274 individuals with EP were screened. EP data have been previously published [32,33].
(e) 785 individuals with SZ were screened. SZ data have been previously published [34][35][36].
(f) Data from 45,376 control individuals were obtained from the ExAC database. The disruptive mutations counted here represent unaffected population control individuals and individuals with diseases other than neuropsychiatric disorders [37]. These data were used to calculate the Fisher's exact test p value. Only disruptive (frameshift, splice, nonsense) variants were scored in cases and controls.
(g) Pathway annotations determined using the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 [57,58].
Abbreviations: A Autophagy, ASD Autism spectrum disorders, C Chromatin remodeling, CI Confidence interval, D Broad development, Db DNA binding, DD Developmental delay, EP Epilepsy, ID Intellectual disability, K Kinase, OR Odds ratio, R Replication, S Synapse function, SZ Schizophrenia, T Gene transcription, U E3 ubiquitin-protein ligase, W Wnt/β-catenin signaling

Gene discovery and phenotypic refinement
The affordable application of next-generation sequencing in the clinical and research arenas has rapidly increased our understanding of the genetic variation that underlies NDs. Exome and targeted sequencing studies of patients with ND have revealed dozens of new genes emerging as high-risk candidate loci in recent years (Table 2). WES of patients with ASD led to estimates that 500-1000 genes contribute to disease etiology [19], whereas in ID this number is greater than 1000 [20]. Epilepsy and schizophrenia are thought to be less genetically heterogeneous, involving approximately 500 [21] and 600 [22] genes, respectively. Although associations between certain gene variants and ND risk have been consistently replicated (such as de novo disruptive mutations in CHD8, ADNP and DYRK1A [19] among ASD and ID simplex families), hundreds of ND risk genes remain undiscovered or have not been associated with NDs with sufficient statistical significance owing to ultra-low mutation frequencies in the patient population.
We combined the results of multiple published WES, WGS and targeted sequencing studies including 5001-5922 individuals with ASD (single gene denominators varied owing to the variety of WES, WGS and targeted sequencing approaches used) [19][23][24][25][26][27][28], 1284 individuals with ID [29][30][31], 274 individuals with epilepsy [32,33] and 785 individuals with schizophrenia [34][35][36] to look for genetic overlap between these NDs. Using this large dataset (over 7000 individuals/families), we identified the top 25 genes that show an excess of disruptive (frameshift, splice, nonsense) gene mutations in disease cases when these individuals are compared with 45,376 controls drawn from the ExAC database [37], where neuropsychiatric cases were masked before analysis (Table 1). Although the number of individuals represented in each disease study differs and several genes reach only nominal significance, the identified genes clearly converge on common biochemical and neurodevelopmental pathways, such as synaptic function, chromatin remodeling, gene transcription and Wnt/β-catenin signaling. Importantly, significance thresholds are likely to be highly conservative, as the ND studies that were included in the analysis only considered confirmed de novo events, whereas the ExAC database variants have not been filtered for population frequency and inheritance status is unknown. Using our large dataset of de novo mutations associated with ND, we can apply a recurrent de novo simulation model, which considers the size and evolutionary conservation of individual genes, to calculate the likelihood of observing a number of de novo mutations in any given ND-associated gene [23].
In some cases we find that genes that were not statistically significant for overall disruptive mutational burden after Bonferroni correction (p < 10⁻⁶) of the Fisher's exact test p value are indeed significant for recurrent de novo mutation burden, such as GRIN2B, which has a de novo p value of 0.001 after correction. Therefore, although some genes (e.g., GRIN2B) reach only nominal significance for an overall increased burden in disruptive mutations in ND cases compared with unaffected controls, based on a de novo model they may prove to be bona fide ND risk genes.
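The two kinds of test discussed above can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test for the case-control burden comparison and, as a simplified stand-in for the recurrent de novo simulation model of [23], a Poisson tail probability driven by a per-gene mutation rate. The function names, the number of tests used for the Bonferroni threshold and all counts and rates in the example are hypothetical and are not taken from Table 1.

```python
from math import comb, exp


def fisher_exact_greater(case_hits, case_total, ctrl_hits, ctrl_total):
    """One-sided Fisher's exact test for an excess of carriers among cases.

    Returns P(X >= case_hits) under the hypergeometric null, where X is the
    number of carriers that fall among the cases by chance.
    """
    total = case_total + ctrl_total
    hits = case_hits + ctrl_hits
    denom = comb(total, case_total)
    upper = min(hits, case_total)
    numer = sum(
        comb(hits, k) * comb(total - hits, case_total - k)
        for k in range(case_hits, upper + 1)
    )
    return numer / denom


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def poisson_recurrence_p(observed, n_probands, mu_per_chrom):
    """P(at least `observed` de novo hits in a gene) under a Poisson model.

    Simplifying assumption (not the simulation model of the source): the
    expected count is 2 * n_probands * mu_per_chrom, where mu_per_chrom is a
    per-gene, per-chromosome rate of disruptive de novo mutation.
    """
    lam = 2 * n_probands * mu_per_chrom
    below = 0.0
    term = exp(-lam)  # P(X = 0)
    for k in range(observed):
        below += term
        term *= lam / (k + 1)  # advance to P(X = k + 1)
    return max(0.0, 1.0 - below)


if __name__ == "__main__":
    # Hypothetical counts: 12 carriers in 5000 cases vs 3 in 45,376 controls.
    p_burden = fisher_exact_greater(12, 5000, 3, 45376)
    # Hypothetical number of genes tested.
    threshold = bonferroni_threshold(0.05, 20000)
    print(f"burden p = {p_burden:.3g}, significant: {p_burden < threshold}")

    # Hypothetical per-gene rate of disruptive de novo mutation.
    p_recurrent = poisson_recurrence_p(4, 5000, 2e-6)
    print(f"recurrence p = {p_recurrent:.3g}")
```

The two tests answer different questions: the burden test compares carrier frequencies between cases and controls, whereas the recurrence test asks how surprising the number of de novo events is given the gene's own mutability. A gene can therefore pass one and not the other, as described for GRIN2B.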
The discovery of recurrently mutated genes has been used to successfully identify additional patients with disruptive mutations in these risk genes who, when collectively phenotyped, define new syndromic and sub-syndromic forms of ND [16][17][18]. These efforts have proceeded in parallel with the coordination of clinical exome sequencing of patients, which has led to the identification of dozens of individuals with the same type of rare molecular defect (Table 2). This coordination led to the emergence of refined patient checklists that enable a systematic reassessment of pediatric, neuroimaging, neurobehavioral and morphological features [15]. Such genotype-phenotype studies have shown that individuals sharing a genetic etiology have more features in common with one another than with the general ND population. These observations have led to the description of both genetic and clinical subtypes of ND, some of which may be considered syndromic by clinicians.
Many of the top ND risk genes identified in our analysis are correlated with an observable phenotype that may have been predicted based on our analysis (Tables 1 and 2). For example, CHD8 is an ASD-associated gene linked with macrocephaly and gastrointestinal dysfunction [16], whereas ADNP mutations are associated with ASD and the complete loss of expressive language [17]. Some genes seem to be predominantly associated with ID (e.g., ARID1B, ANKRD11, CTNNB1, STXBP1 and CHAMP1). SCN1A mutations have been primarily observed in epilepsy [38]. Other genes are strongly associated with epilepsy and ID (e.g., CHD2 and DYRK1A), often with very specific clinical manifestations (e.g., microcephaly and late-onset epilepsy in the case of patients with DYRK1A variants [18]). The potential contribution of some of these ND genes (e.g., SCN2A, CHD8 and POGZ; Table 1) to adult neuropsychiatric diseases, such as schizophrenia, is intriguing, although statistical significance supporting these associations is still lacking. The existence of such associations would suggest that mutations in these genes have broad phenotypic effects or variable expressivity that manifests as ND at different developmental stages. It will be important to identify families in which gene-disruptive mutations in these genes segregate in order to explore phenotypic differences among the familial carriers.

Molecular pathways and therapeutic potential
Beyond genetic subtypes, network-based approaches that more globally predict the effects of ND risk genes on molecular pathways have repeatedly shown an enrichment for synapse function and gene transcription/chromatin remodeling [19,39]. Although these pathways remain the most statistically significant pathways found among ND datasets, other pathways have been identified, including interaction with SNARE proteins and vesicular transport pathways in epilepsy (p < 0.03) and FMRP targets in ASD, ID and epilepsy (p < 0.00001) [39]. Given the extensive locus heterogeneity of these diseases, pathway-defined 'molecular subtypes' are likely to become the ultimate target for behavioral and pharmacological therapeutics. Each of these large functional networks can be further subdivided into smaller pathways, such as long-term potentiation, calcium signaling, postsynaptic density and synapse structure in the case of synaptic function, in which enrichment is driven by signals from de novo mutations in genes such as SYNGAP1, SCN2A, STXBP1, GRIN2B and SCN1A (Table 1). SCN2A and SCN1A are members of the same gene family of voltage-gated sodium channels that are responsible for the generation and propagation of action potentials and have been associated with seizure phenotypes in animal models [38].
Although SCN1A de novo mutations seem to be specific to epilepsy [38], we observe SCN2A de novo mutations in both ASD and ID (Table 1), which suggests that long-term potentiation has a role in multiple forms of ND. It is important to note, however, that we are classifying mutations using the primary clinical diagnosis under which each patient's cohort was originally ascertained. As a large phenotypic overlap exists between NDs, we could reasonably hypothesize, for example, that patients with ASD or ID and an SCN2A mutation could also manifest with seizure phenotypes.
An enrichment for synapse function in ND has been observed primarily in a subset of patients with ID, epilepsy and schizophrenia [39]. Many antipsychotic and psychotropic compounds have been developed to modulate synaptic function to treat comorbid conditions (hyperactivity, depression, anxiety, aggression and seizures) often associated with NDs. These medications may be used more effectively when applied to patients with a molecular perturbation in the relevant gene or pathway. For example, benzodiazepines (e.g., clonazepam) are a class of drugs that increase GABA-A receptor activity and thus contribute to the inhibition of action potentials in the central nervous system, which are often overactive in seizure conditions [40]. Efforts are currently underway to specifically tailor benzodiazepines to treat patients with mutations in SCN2A and SCN1A [41,42] (Dr. Raphael Bernier, personal communication). Clemizole, a compound approved by the US Food and Drug Administration, has been shown to mitigate some of the convulsive behavior of Scn1a mutant zebrafish [43]. Scn2a mutant mice are being used in the development of other similar sodium-channel-inhibiting compounds, including GS967 [44].
Table 2 abbreviations: ADHD Attention deficit hyperactivity disorder, ASD Autism spectrum disorders, CNS Central nervous system, DD Developmental delay, EP Epilepsy, ID Intellectual disability, SZ Schizophrenia
Studies of simplex ASD and ID families have highlighted an enrichment for gene-disruptive mutations in transcription and chromatin remodeling pathways (e.g., SWI/SNF complex, Wnt/β-catenin and mTOR) [19,39,45]. Wnt/β-catenin and mTOR pathways are involved in gene transcription, cell growth, migration and patterning during embryonic development [46,47]. These pathways are closely linked to the SWI/SNF nucleosome remodeling complex, which is involved in the regulation of gene expression and is thought to have a role in neural specification [48].
Understanding the molecular biology of these pathways may reveal additional therapeutic targets. ADNP, for example, is a transcription factor that interacts directly with the SWI/SNF complex. Davunetide, a derivative octapeptide of ADNP, has been shown to ameliorate some of the cognitive deficits in animal models with ADNP mutations, which is a promising line of therapeutic research for ADNP patients with similar defects [49]. Some ND-associated genes (Table 1) are simultaneously involved in chromatin remodeling and transcription, such as ARID1B [50] and CHD8 [51], which have been linked to the SWI/SNF and Wnt/β-catenin signaling pathways [16,50] (Table 1) and are known to be important for proliferation of neural precursors [23,39,52]. The study of genetic subtypes of ND associated with the Wnt/β-catenin pathway, specifically DDX3X and CHD8, suggests that mutations in this pathway are important in the very early stages of development [16,53]. Importantly, mutations in DDX3X account for a large percentage of unexplained ID in female individuals (1-3 %) [53], which was overlooked in studies of ASD alone [54] (Table 1). The Wnt/β-catenin pathway is commonly dysregulated in cancer; over 40 compounds have been shown to modulate Wnt/β-catenin pathway activity in model systems or in vivo that might be considered for use in specific genetic subtypes of ND in the future [55].
Mutations in the mTOR pathway involving genes such as TSC and PTEN have also been implicated in tumorigenesis and ND owing to their role in transcription and cell growth [47]. Rapalogues, including sirolimus (rapamycin) and everolimus, which inhibit TORC1 and are commonly used to treat cancer, are currently under investigation to assess whether they can improve ASD-related symptoms in patients with TSC mutations [56]. Similar disease-modifying therapies might be useful to treat patients with other genetic subtypes of ND in which mTOR function is abrogated. However, the use of drugs targeting both Wnt/β-catenin and mTOR pathways will need to be carefully considered and fine-tuned for use in NDs to avoid adverse side effects. Although killing healthy cells in adults is an acceptable consequence of cancer treatment, this is not the case during pediatric brain development.

Conclusions
The success of the genotype-first approach for subtyping NDs can be primarily attributed to technological advances that make WES and targeted sequencing fast and cost-effective. ND candidate gene discovery can be maximized by combining many datasets from overlapping conditions (e.g., ASD, ID, epilepsy and schizophrenia) to (1) increase the genetic evidence supporting individual ND risk gene models, (2) build stronger molecular interaction networks that implicate specific pathways in disease biology and (3) assess the robustness of genotype-phenotype links. Beyond providing a potential genetic explanation for disease to families, our understanding of the biological pathways that are disrupted by specific variants is leading to improved assessment of disease risk in families and to the prospect of tailored treatments for patients with these debilitating diseases.

Competing interests
E.E.E. is on the scientific advisory board (SAB) of DNAnexus, Inc. and is a consultant for Kunming University of Science and Technology (KUST) as part of the 1000 China Talent Program. H.A.S. and T.N.T. declare that they have no competing interests.

Authors' contribution
All authors read and approved the final manuscript.