De novo truncating mutations in ASXL3 are associated with a novel clinical phenotype with similarities to Bohring-Opitz syndrome

Background Molecular diagnostics can resolve locus heterogeneity underlying clinical phenotypes that may otherwise be co-assigned as a specific syndrome based on shared clinical features, and can associate phenotypically diverse diseases to a single locus through allelic affinity. Here we describe an apparently novel syndrome, likely caused by de novo truncating mutations in ASXL3, which shares characteristics with Bohring-Opitz syndrome, a disease associated with de novo truncating mutations in ASXL1. Methods We used whole-genome and whole-exome sequencing to interrogate the genomes of four subjects with an undiagnosed syndrome. Results Using genome-wide sequencing, we identified heterozygous, de novo truncating mutations in ASXL3, a transcriptional repressor related to ASXL1, in four unrelated probands. We found that these probands shared similar phenotypes, including severe feeding difficulties, failure to thrive, and neurologic abnormalities with significant developmental delay. Further, they showed less phenotypic overlap with patients who had de novo truncating mutations in ASXL1. Conclusion We have identified truncating mutations in ASXL3 as the likely cause of a novel syndrome with phenotypic overlap with Bohring-Opitz syndrome.


Background
Widespread use of high-throughput sequencing has helped elucidate the genetic heterogeneity underlying phenotypically similar syndromes. Bohring-Opitz syndrome (BOS; MIM 605039) is characterized by distinct craniofacial features and posture, severe intellectual disability, feeding problems, small size at birth, and failure to thrive [1], but shares some of these features with other syndromes. Recently, de novo truncating mutations in ASXL1 have been shown to account for approximately 50% of cases with BOS [2]; we initially and independently identified two individuals with de novo truncating mutations in a related gene, ASXL3. Subsequent interrogation of a small cohort identified two additional individuals with similar mutations. In all four families, the affected children had BOS-like features, but had no specific recognizable syndromic diagnosis. Subjects 1, 2, and 4 had similar clinical histories, including severe psychomotor retardation, feeding problems, severe post-natal growth retardation, arched eyebrows, anteverted nares, and ulnar deviation of the hands (Table 1, Figure 1), which are features partially shared with Cornelia de Lange Syndrome (CdLS) and BOS, but they did not have the trigonocephaly that is characteristic of BOS. Subject 3 also displayed anteverted nares, but had less severe psychomotor retardation and had normal growth. At 5 years of age, she has intellectual disability and does not speak. Magnetic resonance imaging data, which were available only for subject 4, showed cerebral volume loss over time, a feature seen in some cases of BOS (see Additional file 1, Figure S1). No rare variants in the genes known to cause BOS or CdLS were found in the exome data of any of the subjects.

Ethics approval
The study was approved by the Baylor Institutional Review Board (IRB) and by the appropriate ethical committee at the MPIMG, and informed consent was obtained from the guardians of all subjects.

DNA
DNA from subjects and their parents was obtained under written informed consent, provided by their parents or legal guardians, for participation in the study. Whole genomes were resequenced by a commercial company (Complete Genomics, Mountain View, CA, USA) with unchained base reads on self-assembling DNA nanoarrays [3], and variants were called using mated gapped reads [4].

Sequence alignment, variant calling, annotation, and verification
At BCM, Illumina data were aligned using Burrows-Wheeler Aligner (BWA) software. Variants were called using ATLAS-SNP v2.0 and the SAMtools program Pileup. De novo variants were found by in silico subtraction of the variants discovered in either parent. Variants were subsequently annotated for effect on the protein, known minor allele frequencies, and gene function using AnnoVar and custom in-house developed software. Candidate variants were verified and segregation examined using Sanger capillary sequencing.
At MPIMG, the exonic regions of the human genome were enriched (SureSelect Human All Exon Kits; Agilent Technologies) and deep sequenced (HiSeq 2000; Illumina). Raw sequence reads were pre-screened to remove low-quality reads, and then aligned to the human reference genome with SOAP (version 2.21). Aligned and unaligned reads were used to call single-nucleotide variants (SNVs) and indels, respectively. Variant lists were filtered against the reference databases and ranked as potential candidates [5].
The primary variant list from Complete Genomics was filtered and prioritized by an in-house pipeline [5]. De novo mutations in each patient were defined as those that were absent from all other members of the family, had been flagged as 'high quality', and were supported by at least five reads with an allele fraction of 0.4 to 0.6. Candidate variants were verified and segregation tested using Sanger capillary sequencing.
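The de novo criteria described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the in-house pipeline [5]: the `Call` record, the function name, the variant labels, and all read counts are hypothetical, and "supported by at least five reads" is interpreted here as at least five reads carrying the alternate allele.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One variant call in the proband (hypothetical record layout)."""
    variant: str        # placeholder label, e.g. "chrN:1000:C>T"
    high_quality: bool  # caller's 'high quality' flag
    alt_reads: int      # reads supporting the variant allele
    total_reads: int    # total reads covering the site


def is_de_novo_candidate(call, relatives_variants,
                         min_reads=5, low=0.4, high=0.6):
    """Apply the filters named in the text to a single proband call."""
    if call.variant in relatives_variants:
        return False  # seen in another family member, so not de novo
    if not call.high_quality:
        return False
    if call.total_reads == 0 or call.alt_reads < min_reads:
        return False
    allele_fraction = call.alt_reads / call.total_reads
    return low <= allele_fraction <= high


# Hypothetical data, for illustration only.
relatives = {"chrN:2000:G>A"}  # union of variants in parents and sibling
proband_calls = [
    Call("chrN:1000:C>T", True, 12, 25),  # fraction 0.48, passes
    Call("chrN:2000:G>A", True, 14, 28),  # inherited, rejected
    Call("chrN:3000:A>G", True, 5, 40),   # fraction 0.125, rejected
    Call("chrN:4000:T>C", True, 3, 6),    # only 3 supporting reads
    Call("chrN:5000:G>T", False, 10, 20), # not flagged high quality
]

candidates = [c.variant for c in proband_calls
              if is_de_novo_candidate(c, relatives)]
```

Candidates that pass such a filter would still need Sanger confirmation in the child and both parents, as described in the text.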

RNA extraction and reverse transcriptase-PCR experiments
A cell line from subject 2 was established by Epstein-Barr virus transformation, in accordance with standard protocols after informed consent. Total RNA was extracted from patient cells using a commercial kit (RNeasy Plus; Qiagen Inc., Valencia, CA, USA) in accordance with the manufacturer's recommendations. For reverse transcription, 1 μg of RNA was used with 10 U of Superscript III (Invitrogen Corp., Carlsbad, CA, USA), and 150 μmol/l random hexamers in the presence of ribonuclease inhibitor (RNasin; Promega Corp., Madison, WI, USA) in accordance with the manufacturer's protocol. Reverse transcriptase (RT)-PCRs for allele expression analysis of ASXL3 were carried out using a PCR mix (BIO-X-ACT Long Mix; Bioline Reagent Ltd, London, UK) and primers (ASXL3_RTmut_for 5'-AAATGCAGTTGCGGATAAGG-3' and ASXL3_RTmut_rev 5'-TGGGGTTCTTCATGAGAATTC-3'), located in exons 10 and 11. After initial denaturation at 94°C for 3 min, cycling conditions were 40 cycles of 94°C, 55°C, and 72°C, with each step lasting 45 seconds.

ASXL3 analysis
Exon data for the gene and PROSITE regions were extracted from the ENSEMBL genome browser [6]. Vertebrate conservation was obtained from the UCSC Genome browser [7], using the vertebrate conserved elements track (Vertebrate Multiz Alignment & Conservation (46 Species): Vertebrate Conserved Elements) with a minimum LOD score of 700. Predicted motifs were obtained from the eukaryotic linear motif [8] server with a motif probability cut-off of 0.005. All phosphorylation events were grouped together. ASXL family similarity was calculated by aligning the ASXL1 and ASXL2 amino acid sequences against ASXL3 using NCBI-BLASTP [9].

Results and discussion
To identify potential causative alleles, genome sequencing was undertaken in each family we identified (see Additional file 2, Table S1). For family 1, we interrogated the exome of both parents and the affected child using a custom exon-capture reagent, followed by high-throughput sequencing with the Illumina HiSeq platform. For family 2, we obtained the whole genome sequence of the affected child, his unaffected sibling, and both parents from Complete Genomics. We also performed exome sequencing in this family as part of a pilot study to compare the yield of these methods (see Methods). For families 3 and 4, exome sequencing was performed on the proband only. After sequencing, we identified all coding and near-intronic differences between the subjects and the human reference genome (see Methods). By using the unaffected parents as controls, it was possible to identify de novo mutations in the affected subject.
In each generation, 70 to 175 de novo point mutations are expected [10,11], and 0 to 3 of these are anticipated to cause protein-coding changes [11]. In subject 1, a single protein-coding de novo mutation was identified (chr18, g.31318578C > T, p.Q404X; hg19). In subject 2, two coding de novo mutations were identified. The first (chr16, g.75258715G > A, p.R248H) occurred in CTRB1, and was considered non-pathogenic, as the variant has been reported in samples analyzed for the Thousand Genomes project (rs191950160), and is identical to the orthologous alleles in the chimpanzee and rhesus macaque. The second occurred in ASXL3 (chr18, g.31318764C > T, p.Q466X) and has not been previously reported. In subject 3, a de novo 4 bp deletion was identified (chr18, g.31319343_31319346delACAG, p.T659fsX41); this frameshift was predicted to generate a premature termination codon (TGA) after an additional 41 amino acids. In subject 4, a de novo 1 bp insertion was identified (chr18, g.31318789_insT, p.P474fs); this frameshift mutation was predicted to generate a premature termination codon immediately. In all cases, the variant was interrogated in the affected child and parental DNA obtained from peripheral blood by Sanger capillary sequencing. All four de novo ASXL3 mutations generated stop codons, and were predicted, in silico, to generate a truncated ASXL3. Further, the mRNAs containing these premature stop-codon mutations may be degraded by nonsense-mediated decay (NMD) and thus could represent loss-of-function alleles. Neither SNV allele occurred in a CpG dinucleotide, the only known site with an increased propensity for de novo mutations [10] (see Additional file 3, Table S2).
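The CpG observation above is the kind of check that can be done mechanically once the reference sequence flanking a variant is in hand. The sketch below is a minimal illustration; the function name and the example sequences are hypothetical and are not taken from the ASXL3 locus.

```python
def in_cpg(seq: str, idx: int) -> bool:
    """Return True if the base at 0-based position idx lies in a CpG.

    A C counts if it is followed by G; a G counts if it is preceded
    by C (the C of the CpG on the opposite strand).
    """
    seq = seq.upper()
    base = seq[idx]
    if base == "C":
        return idx + 1 < len(seq) and seq[idx + 1] == "G"
    if base == "G":
        return idx > 0 and seq[idx - 1] == "C"
    return False


# Hypothetical flanking sequences, for illustration only.
cpg_context = "AACGTT"      # the C at index 2 is followed by G
non_cpg_context = "AACATT"  # the C at index 2 is followed by A

c_in_cpg = in_cpg(cpg_context, 2)
c_not_in_cpg = in_cpg(non_cpg_context, 2)
```

Checking both the C and the G position matters because a C > T change on one strand is read as G > A on the other.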
In Drosophila, the additional sex combs (Asx) gene is required to maintain homeotic gene activation and silencing, and in mice, three orthologs (Asxl1, Asxl2, and Asxl3) have been identified. Asxl1 acts on the developmentally important Hox genes both as a repressor (HoxA4, HoxA7, and HoxC8) and as an enhancer (HoxC8) [12]; dysregulation of the human HOX genes may account for the developmental phenotype. Little information is available about either embryologic or fetal expression of the ASXL gene family; however, in Drosophila, regulation of the Asx gene is highly variable and tightly controlled in the first 3 hours after fertilization [13]. In humans, ASXL3, like ASXL1 and ASXL2, is a putative polycomb protein and probably acts as a histone methyltransferase in a complex with other proteins [14]. ASXL3 is expressed in similar tissues to ASXL1 including brain, spinal cord, kidney, liver, and bone marrow, but at a lower level [15] (see Additional file 4, Figure S2). Within the brain, ASXL1 has much higher expression in the white matter, whereas ASXL3 has moderately higher expression in the insula, cingulate gyrus, and amygdala, with approximately similar expression elsewhere [16] (see Additional file 5, Figure S3). The high correlation of expression patterns between ASXL1 and ASXL3 may account for some of the shared phenotypic features.
No deleterious ASXL3 mutation was found in a small cohort of patients with BOS without causative ASXL1 mutation (Hoischen, personal communication), consistent with ASXL3 mutations conveying a phenotype distinct from BOS. Using large-scale datasets (Thousand Genomes [17], dbSNP, ESP5400, and Cohorts for Heart and Aging Research in Genomic Epidemiology [18]) we identified four other truncating mutations in ASXL3, which occurred as singletons within each dataset in reportedly phenotypically normal individuals (Figure 2), and thus may represent benign variants. Two of these mutations occur at the extreme 3' of the gene, and thus may escape NMD and retain protein activity. One mutation, p.L902X (rs187354298), identified in a single sample from the Thousand Genomes cohort, occurs more 3' to, but within the same penultimate exon as, the two disease-causing mutations (see Figure 2). More interestingly, however, a high-quality nonsense mutation (R322X) was identified in exon 9 of the 12-exon ASXL3 gene, and is anticipated to undergo NMD or may be otherwise highly deleterious to protein function.
Although all four disease-associated de novo variants and rs187354298 are predicted to potentially undergo NMD, it is challenging to reconcile such a hypothesis with the observed range of phenotypes for nonsense alleles reported at this locus. However, the current ability to predict NMD is limited, and it has been shown that around 75% of mRNA transcripts that are predicted to undergo NMD escape destruction, and that the nonsense codon-harboring mRNA is expressed at levels similar to wild type in lymphoblastoid cells [19]. Furthermore, the dynamics of mRNA stability and degradation may differ for cells and tissues undergoing rapid developmental changes. Parenthetically, we noted an enrichment of around 50% in gene regions where mRNA would be predicted to escape NMD, or 3' gene bias to the predicted loss-of-function nonsense codon mutations in normal controls from the Thousand Genomes data [19].
Another distinct interpretation of our observations is that all ASXL3 disease-causing nonsense-encoding mRNAs are translated into prematurely terminated proteins, which act in a dominant-negative fashion. In support of this hypothesis, Sanger sequencing of multiple cDNA extractions derived from a transformed lymphoblast cell line from subject 2 showed that both alleles were expressed (see Additional file 6, Figure S4), although this does not exclude some degree of NMD, nor does it necessarily reflect what occurred during development. Further, we observed that in previously reported cases of BOS, known disease-causing nonsense mutations in ASXL1 [2,20] occur almost entirely within a very limited region of the protein. Furthermore, database searches reveal that truncating mutations, in reportedly phenotypically normal individuals, can occur both 5' and 3' of these mutations, just as we now report for ASXL3 (see Additional file 7, Figure S5). This disease-causing mutation hotspot falls between two paralogous regions shared by all ASXL genes (Figure 2) and into a region unique to ASXL1. Interestingly, the presumptive disease-causing mutations we describe here occur within an analogous region in ASXL3, within the first half of the penultimate exon. Further, disease severity may decrease the more 3' the mutation occurs within this region. This region contains a number of predicted phosphorylation sites, an evolutionarily conserved region (residues 420 to 470, approximately), and an evolutionarily conserved serine-rich motif between residues 600 and 800, approximately. We speculate that disruption of these conserved regions may result in dysregulation of post-translational protein modification, resulting in constitutive activation.
Truncating ASXL3 mutations are uncommon, and their de novo nature makes it even less likely that we identified these individuals by chance, which highlights the value of de novo mutation-based methods to find disease-causing loci. To determine the probability of observing multiple de novo truncating mutations in ASXL3, we developed a model [21,22] accounting for gene size, GC content, the de novo rates of SNVs and small insertions/deletions, and the probability of those mutations causing a truncation of the protein. The probability of developing a de novo nonsense mutation in ASXL3 is 3.35 × 10^-6 per generation, whereas the probability of developing a de novo coding insertion or deletion in ASXL3 is approximately 3.91 × 10^-6. Thus, the total probability of observing three additional individuals with truncating ASXL3 mutation, given the first de novo observation, is around 4.0 × 10^-17. The observation of four de novo truncating mutations occurring in association with a sporadic disease that shares similar phenotypic features is highly unlikely to have occurred by chance; nevertheless, functional studies will be required to show conclusively that truncating mutations in ASXL3 have pathological consequences that cause the observed disease trait. Although all four subjects shared clinical findings, these characteristics were mostly non-specific. Severe feeding difficulties, present from birth and requiring intervention, occurred in 3/4 subjects. The subjects had small size at birth (3/4), with microcephaly (3/4) and severe psychomotor delay, with missed milestones (4/4) at their most recent evaluation. Deep palmar creases (4/4) and slight ulnar deviation of the hands (3/4), combined with a high arched palate (3/4), were also common. No patient had the typical 'BOS posture' of elbow and wrist flexion, myopia, or trigonocephaly (0/4).
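The chance estimate given in the paragraph above rests on multiplying small per-generation probabilities for independent events. The following Python sketch illustrates only that arithmetic, using the two per-generation values quoted in the text. It is not the model of [21,22], which accounts for further factors, so this simplified product is not expected to reproduce the published estimate. The function and variable names are hypothetical, and treating every coding insertion or deletion as truncating is a simplifying assumption.

```python
# Per-generation probabilities quoted in the text for ASXL3.
P_NONSENSE = 3.35e-6      # de novo nonsense mutation
P_CODING_INDEL = 3.91e-6  # de novo coding insertion or deletion


def prob_all_independent(per_event_probs):
    """Probability that every one of several independent events occurs."""
    result = 1.0
    for p in per_event_probs:
        result *= p
    return result


# Assumption: every coding indel truncates the protein, so the two
# classes are simply added. This overstates the truncating rate.
p_truncating = P_NONSENSE + P_CODING_INDEL

# Three further unrelated probands, each with any truncating mutation.
p_three_any = prob_all_independent([p_truncating] * 3)

# Alternative reading: three further probands with one nonsense and
# two indel events, in a fixed order.
p_three_by_class = prob_all_independent(
    [P_NONSENSE, P_CODING_INDEL, P_CODING_INDEL]
)

print(f"{p_three_any:.2e}", f"{p_three_by_class:.2e}")
```

Either reading gives a vanishingly small probability, which is the point the argument depends on; the exact exponent depends on which events are counted and how the indel rate is restricted to truncating changes.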
The phenotype present in the four affected individuals varies in both presentation and severity, a phenomenon that is also reported in subjects with ASXL1 mutations. Several factors may account for this. First, truncating mutations occurring earlier in the gene seem to be associated with a more severe phenotype, with truncating mutations at the extreme 3' end of the gene yielding no observed phenotype. Interestingly, this does not seem to be the case for ASXL1 [2]. Additional subjects will be needed for further genotype-phenotype analysis to address a potential polarity hypothesis [23]. Second, because of the importance of the ASXL gene family in very early development, the time at which the mutation arose may also influence the phenotypic outcome; mutations that occurred in the parental gametes could convey a more severe phenotype than those arising post-zygotically or during later embryogenesis [24]. Third, ASXL proteins form complexes with other proteins, and have been shown to influence Trx gene mutations in flies [12]. Mutational load and other epistatic effects may contribute to the observed phenotype, and we could not discern such alleles using our de novo variant approach. Finally, epigenetic factors may contribute to the phenotype. In mice, homozygous Asxl2 mutations can lead to two primary outcomes: around 20% are born very small and die by the age of 2 months, whereas the remaining 80% are smaller at birth but gain weight normally and are successfully weaned [25]. Thus, other factors may contribute to the penetrance and/or expressivity of de novo ASXL3 mutations in humans, and severe phenotypes could be atypical.
The condition defined molecularly in the current study is phenotypically distinct from BOS, but has similar and overlapping features. This is probably the consequence of functional overlap between the causative candidate genes ASXL1 and ASXL3, which are both developmentally important putative polycomb genes. Differentiating two phenotypically similar syndromes based on clinical presentation alone is challenging, and is further complicated by phenotypic variability. Molecular methods provide an objective means to establish and secure a diagnosis. Moreover, these methods now enable comparative analyses between novel and well-described syndromes to make use of evolutionary genetics in addition to phenotypic features in disease nosology. This allows a distinct molecular diagnosis, and increases diagnostic capabilities for rare syndromes. Interestingly, in this study, the subjects were identified not by establishing phenotypic overlap between them, but rather by identifying that they shared de novo nonsense mutations in the same gene, and that mutations in a related gene, ASXL1, resulted in a similar phenotype. In particular, subjects 3 and 4 were identified from a small clinical cohort (n = 192) of individuals with psychomotor delay, based upon the presence of rare truncating ASXL3 mutations, which were later determined to be de novo. This is a novel way in which molecular diagnostics can help foster international and inter-institutional collaborations that will be vital to both solving the multitude of very rare diseases and to functionally annotating the human genome.

characterization and sample collection; DMM and YY conducted high-throughput and validation sequencing; and MNB conducted data analysis and interpretation.
Berlin group (subject 2): HHR conceived and planned the experiments and helped prepare the manuscript; TFW enabled clinical characterization and sample collection, and performed database screening; WC conducted high-throughput sequencing; HH conducted data analysis and interpretation; and LM validated the WES and WGS results. All authors read and approved the final manuscript.