Neurodevelopmental disorders are genetically and phenotypically heterogeneous, encompassing developmental delay (DD), intellectual disability (ID), autism spectrum disorders (ASDs), structural brain abnormalities, and neurological manifestations, with variants in hundreds of genes implicated. To date, a few de novo mutations potentially disrupting TCF20 function have been reported in patients with ID, ASD, and hypotonia. TCF20 encodes a transcriptional co-regulator structurally related to RAI1, the dosage-sensitive gene responsible for Smith–Magenis syndrome (deletion/haploinsufficiency) and Potocki–Lupski syndrome (duplication/triplosensitivity).
Genome-wide analyses by exome sequencing (ES) and chromosomal microarray analysis (CMA) identified individuals with heterozygous, likely damaging, loss-of-function alleles in TCF20. We implemented further molecular and clinical analyses to determine the inheritance of the pathogenic variant alleles and studied the spectrum of phenotypes.
We report 25 unique inactivating single nucleotide variants/indels (1 missense, 1 canonical splice-site variant, 18 frameshift, and 5 nonsense) and 4 deletions of TCF20. The pathogenic variants were detected in 32 patients and 4 affected parents from 31 unrelated families. Among cases with available parental samples, the variants were de novo in 20 instances and inherited from 4 symptomatic parents in 5, including in one set of monozygotic twins. Two pathogenic loss-of-function variants were recurrent in unrelated families. Patients presented with a phenotype characterized by developmental delay, intellectual disability, hypotonia, variable dysmorphic features, movement disorders, and sleep disturbances.
TCF20 pathogenic variants are associated with a novel syndrome manifesting clinical characteristics similar to those observed in Smith–Magenis syndrome. Together with previously described cases, the clinical entity of TCF20-associated neurodevelopmental disorders (TAND) emerges from a genotype-driven perspective.
The human chromosome 22q13 region is involved with various genetic and genomic disorders, including Phelan–McDermid syndrome (MIM 606232), in which terminal deletion of 22q13.3 encompassing the critical gene SHANK3 is frequently observed. Occasionally, deletions proximal to the classical Phelan–McDermid syndrome region have been reported, affecting chromosome 22q13.2 without directly disrupting SHANK3 [2,3,4]. It remains unknown whether the abnormal neurodevelopmental phenotypes observed in patients with 22q13.2 deletions are caused by dysregulation of SHANK3 or haploinsufficiency of previously undefined “disease genes” within the deletion. Recently, a bioinformatics analysis of genes within 22q13.2 highlighted that TCF20 and SULT4A1 are the only two genes within this region that are predicted to be highly intolerant to loss-of-function (LoF) variants and are involved in human neurodevelopmental processes. In particular, TCF20 was predicted to be more intolerant to LoF variants, as reflected by its higher pLI (probability of LoF intolerance) score (pLI = 1), making it the most promising candidate disease gene underlying neurodevelopmental traits associated with 22q13.2 deletion disorders.
TCF20 (encoding a protein previously known as SPRE-binding protein, SPBP) is composed of six exons, which encode two open reading frames of 5880 or 5814 nucleotides generated by alternative splicing. The shorter isoform (referred to as isoform 2, Genbank: NM_181492.2) lacks exon 5 in the 3′ coding region. Isoform 1 (Genbank: NM_005650.3) is exclusively expressed in the brain, heart, and testis and predominates in the liver and kidney. Isoform 2 is mostly expressed in the lung ([6, 7]; Fig. 1). TCF20 was originally found to be involved in transcriptional activation of the MMP3 (matrix metalloproteinase 3, MIM 185250) promoter through a specific DNA sequence. More recently, it has been shown to act as a transcriptional regulator augmenting or repressing the expression of a multitude of transcription factors including SP1 (specificity protein 1, MIM 189906), PAX6 (paired box protein 6, MIM 607108), ETS1 (E twenty-six 1, MIM 164720), SNURF (SNRPN upstream reading frame)/RNF4 (MIM 602850), and AR (androgen receptor, MIM 313700) among others [9,10,11]. TCF20 is widely expressed and shows increased expression in the developing mouse brain, particularly in the hippocampus and cerebellum [12, 13]. Babbs et al. studied a cohort of patients with autism spectrum disorders (ASDs) and proposed TCF20 as a candidate gene for ASD based on four patients with de novo heterozygous potentially deleterious changes, including two siblings with a translocation disrupting the coding region of TCF20, one frameshift and one missense change in another two patients. Subsequently, Schafgen et al. reported two individuals with de novo truncating variants in TCF20 who presented with intellectual disability (ID) and overgrowth. In addition, pathogenic variants in TCF20 have also been observed in two large cohort studies with cognitive phenotypes of ID and developmental delay (DD) [15, 16]. These isolated studies clearly support a role for TCF20 as a disease gene.
However, a systematic study of patients with TCF20 pathogenic variant alleles from a cohort with diverse clinical phenotypes is warranted in order to establish a syndromic view of the phenotypic and molecular mutational spectrum associated with a TCF20 allelic series.
Interestingly, TCF20 shares substantial homology with a well-established Mendelian disease gene, RAI1, which is located in human chromosome 17p11.2 (MIM 607642). LoF mutations or deletions of RAI1 are the cause of Smith–Magenis syndrome (SMS; MIM 182290), a complex disorder characterized by ID, sleep disturbance, multiple congenital anomalies, obesity, and neurobehavioral problems [17,18,19,20,21], whereas duplications of RAI1 are associated with a developmental disorder characterized by hypotonia, failure to thrive, ID, ASD, and congenital anomalies [22, 23], designated Potocki–Lupski syndrome (PTLS; MIM 610883). Recent studies suggested that TCF20 and RAI1 might derive from an ancestral gene duplication event during the early history of vertebrates . Therefore, it is reasonable to hypothesize that, as paralogous genes, mutations in TCF20 may cause human disease by biological perturbations and molecular mechanisms analogous to those operative in RAI1-mediated SMS/PTLS.
In this study, we describe the identification of TCF20 pathogenic variations by either clinical exome sequencing (ES) or clinical chromosomal microarray analysis (CMA) from clinically ascertained subjects consisting of cohorts of patients presenting with neurodevelopmental disorders as the major phenotype as well as with various other suspected genetic disorders. We report the clinical and molecular characterization of 28 subjects with TCF20 de novo or inherited pathogenic single nucleotide variants/indels (SNV/indels) and 4 subjects with interstitial deletions involving TCF20. These subjects present with a core phenotype of DD/ID, dysmorphic facial features, congenital hypotonia, and variable neurological disturbances including ataxia, seizures, and movement disorders; some patients presented features including sleep issues resembling those observed in SMS. Additionally, we report the molecular findings of 10 anonymized subjects with pathogenic TCF20 SNVs or deletion/duplication copy-number variants (CNVs). We demonstrate that ascertainment of patients from clinical cohorts driven by molecular diagnostic findings (TCF20 LoF variants) delineates the phenotypic spectrum of a potentially novel syndromic disorder.
The study cohort consists of 31 unrelated families including one family with a set of affected monozygotic twins; four affected heterozygous parents from these families are also included. All the affected individuals were recruited under research protocols approved by the institutional review boards of their respective institutions after informed consent was obtained. Subject #17, who received clinical exome sequencing evaluation at Baylor Genetics, presented with hypotonia, autism spectrum disorder, and behavioral abnormalities. Six additional patients carrying SNV/indels (subjects #1, #6, #11, #13, #20, and #25) were identified retrospectively from the Baylor Genetics exome cohort of > 11,000 individuals after filtering for rare potential LoF variants in previously unsolved cases with overlapping neurological phenotypes. Subject #7 was recruited from Children’s Hospital of San Antonio (TX), and the pathogenic variant in TCF20 was detected via diagnostic exome sequencing at Ambry Genetics (Aliso Viejo, CA, USA). Subjects #3 and #4 were recruited from the Hadassah Medical Center in Israel. Subjects #2, #5, #8, #9, #10, #12, #14, #15, #16, #18, #19, #21, #22, #23, #24, #26, #27, and #28 were identified through the DDD (Deciphering Developmental Disorders) Study in the UK.
Two patients (subjects #29 and #30) carrying deletion CNVs in chromosome 22q13 were identified in the Baylor Genetics CMA cohort of > 65,000 subjects. Subject #31 carrying a deletion of TCF20 was recruited from the Decipher study. Subject #32 carrying a deletion encompassing 11 genes including TCF20 was recruited from Boston Children’s Hospital through microarray testing from GeneDX. These cases with positive CNV findings did not receive exome sequencing evaluation.
All participating families provided informed consent via the procedures approved under the respective studies to which they were recruited. The parents or legal guardians of subjects shown in Fig. 2 provided consent for publication of photographs.
Clinical ES analysis was completed for subjects #1, #6, #11, #13, #17, #20, and #25 in the exome laboratory at Baylor Genetics and was conducted as previously described. Samples were also analyzed by cSNP array (Illumina HumanExome-12 or CoreExome-24 array) for quality control assessment of exome data, as well as for detecting large copy-number variants (CNVs) and regions of absence of heterozygosity [25, 26].
The ES-targeted regions cover > 23,000 genes for capture design (VCRome by NimbleGen®), including the coding and the untranslated region exons. The mean coverage of target bases was 130X, and > 95% of target bases were covered at > 20X. PCR amplification and Sanger sequencing to verify all candidate variants were done in the proband and the parents when available, according to standard procedures, and candidate variants were annotated using the TCF20 RefSeq transcript NM_005650.3. Exome sequencing and data analysis for the DDD study were performed at the Wellcome Sanger Institute as previously described. Sequencing and data analysis at the Hadassah Medical Center and Ambry Genetics were conducted as previously described [27, 28].
The two CNV deletions were detected using customized exon-targeted oligo arrays (OLIGO V8, V9, and V10) designed at Baylor Genetics [29,30,31], which cover more than 4200 known or candidate disease genes with exon-level resolution. The deletion in subject #32 was detected by a customized Agilent 180k array, which provides interrogation of 220 regions of microdeletion/microduplication syndrome and 35 kb backbone. The deletion in subject #31 from the Decipher study was detected by the Agilent 180k array.
RNA studies to evaluate for potential escape from nonsense-mediated decay (NMD) associated with the TCF20 alleles with premature stop codons
Total cellular RNA was extracted from peripheral blood according to the manufacturer’s protocol. After DNase I treatment to remove genomic DNA (Ambion), cDNA was synthesized from oligo dT with SuperScript III Reverse Transcriptase (Invitrogen). Primers were designed to span multiple exons of TCF20 to amplify the target variant site from cDNA. The amplified fragments were sized and Sanger sequenced to ensure that cDNA rather than genomic DNA was amplified. Negative controls were also set up without reverse transcriptase to confirm that there was no genomic DNA interference. Sanger sequencing results were analyzed for the ratio of mutant allele versus wild-type allele to infer whether there was an escape from nonsense-mediated decay.
Table 1 summarizes the clinical findings in the 32 subjects; further details can be found in Additional file 1: Clinical information. Twenty individuals are male, 12 are female, and at the last examination, ages ranged from 1 to 20 years. Additionally, an affected biological parent of each of subjects #1, #5, and #7 and of twins #27 and #28 was found to carry the TCF20 pathogenic variant; their ages ranged from 42 to 47 years (these parents are not listed in the tables but are briefly described in the text of Additional file 1: Clinical information). Five individuals (#2, #8, #10, #19, and #26) from the DDD cohort, previously reported in a large study of relatively uncharacterized neurodevelopmental disorders, have been included in this study after obtaining more detailed clinical information.
Overall, the majority of the subjects included in our cohort presented with a shared core phenotype of motor delay (94%, n = 30/32), language delay (88%, n = 28/32), moderate-to-severe ID (75%, n = 24/32), and hypotonia (66%, n = 21/32). Some of the variable features reported in the patients include ASD/neurobehavioral abnormalities (66%, n = 21/32), movement disorder (44%, n = 14/32), sleep disturbance (38%, n = 12/32), seizures (25%, n = 8/32), structural brain abnormalities (22%, n = 7/32), growth delay and feeding problems (13%, n = 4/32), macrocephaly (25%, n = 8/32), digital anomalies (34%, n = 11/32), otolaryngological anomalies (9%, n = 3/32), and inverted nipples (13%, n = 4/32) (Tables 1 and 2 and Additional file 1: Clinical information). Facial dysmorphisms (78%, n = 25/32) were also variable and included anomalies reminiscent of SMS such as a tented or protruding upper lip in a subset of the patients (16%, n = 5/32) and the affected mother of subject #5, brachycephaly (9%, n = 3/32), and midface hypoplasia (6%, n = 2/32) (Tables 1 and 2, Additional file 1: Clinical information, and Fig. 2).
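The percentages quoted above follow directly from the counts over the 32 subjects. As a minimal sketch (the function name and the half-up rounding convention are assumptions made for illustration and are not part of the study's methods), the arithmetic for the core features can be reproduced as follows:

```python
def percent_half_up(count: int, total: int) -> int:
    """Whole-number percentage of count/total, rounding halves up.

    Integer arithmetic is used on purpose: Python's built-in round()
    rounds halves to the nearest even number, so round(12.5) == 12.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (200 * count + total) // (2 * total)


# Counts taken from the paragraph above (denominator: 32 subjects).
core_features = {
    "motor delay": 30,
    "language delay": 28,
    "moderate-to-severe ID": 24,
    "hypotonia": 21,
}

for feature, n in core_features.items():
    print(f"{feature}: {percent_half_up(n, 32)}% (n = {n}/32)")
```

Half-up rounding is chosen here only because it matches the reported values such as 13% for 4/32; any other convention would need to be stated alongside the table.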
To date, deleterious variants in TCF20 have been identified in cohorts of individuals with diverse neurodevelopmental disorders (NDDs) including ID (67%, n = 8/12), language delay (42%, n = 5/12), neurobehavioral abnormalities (58%, n = 7/12), hypotonia (25%, n = 3/12), seizures (8%, n = 1/12), and macrocephaly/overgrowth (25%, n = 3/12) [14,15,16] (Tables 1, 2, and 3). In Babbs et al., the first study reporting TCF20 as a potential disease gene, all four patients presented with ASD, three with ID and one of the patients with midface hypoplasia. Of note, subject #17 of our cohort presented with mildly delayed motor milestones, generalized hypotonia, and, in particular, dysmorphic features including midface hypoplasia and tented upper lips, along with sleep issues, ASD, food-seeking behavior, and aggressive behavior; these clinical features are similar to those reported in SMS [32,33,34]. In Schafgen et al., both patients presented with ID, developmental delay, relative macrocephaly, and postnatal overgrowth. Postnatal overgrowth, overweight, and tall stature are seen in 4, 3, and 2 patients from our cohort, respectively. Patients that present with these three “growth acceleration” features account for 28% (9/32) of our cohort. Furthermore, we have observed sleep disturbance (38%, n = 12/32) and neurological features absent from previously published studies including ataxia/balance disorder (22%, n = 7/32), dyspraxia (6%, n = 2/32), dyskinesia/jerky movements (6%, n = 2/32), and peripheral spasticity (19%, n = 6/32) (Tables 1 and 2).
We detected a spectrum of variant types including 25 unique heterozygous SNVs/indels and 4 CNVs involving TCF20 (Figs. 1 and 3). The 25 variants include missense (n = 1), canonical splice-site change (n = 1), frameshift (n = 18), and nonsense changes (n = 5) (Table 3), and they are all located in exons 2 or 3 or the exon 2/intron 2 boundary of TCF20. All of these variants are absent from the Exome Aggregation Consortium (ExAC) and gnomAD databases (accessed September 2018) (Table 2, Fig. 1). The variant c.5719C>T (p.Arg1907*) has been detected in both subjects #25 and #26 while c.3027T>A (p.Tyr1009*) is present in both subjects #8 and #9 (Table 2). Although recurring in unrelated subjects, neither of these two changes occurs within CpG dinucleotides. The missense mutation in codon 1710 (p.Lys1710Arg) in subject #17, which was confirmed by Sanger sequencing to have arisen de novo, affects a highly conserved amino acid (Fig. 1c) within the PHD/ADD domain of TCF20, and the substitution is predicted to be damaging by multiple in silico prediction tools including SIFT and PolyPhen-2. In addition to this variant, another de novo missense variant, c.1307G>T (p.Arg436Leu) in ZBTB18 (MIM 608433; autosomal dominant mental retardation 22, phenotype MIM 612337), was found in this patient. A nonsense mutation in ZBTB18 has been recently reported in a patient with ID, microcephaly, growth delay, seizures, and agenesis of the corpus callosum. The c.1307G>T (p.Arg436Leu) variant in ZBTB18 is also absent from the ExAC and gnomAD databases, is predicted to be damaging by PolyPhen-2 and SIFT, and could possibly contribute to the phenotype in this patient, representing a potential blended (overlapping) phenotype due to a dual molecular diagnosis.
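For single-nucleotide substitutions, the coding-DNA position and the affected residue number in the notation used above are tied together by simple arithmetic: coding position p falls in codon (p - 1) // 3 + 1. The following is a minimal sketch of that sanity check; the helper names and the simplified regular expressions are assumptions for illustration, they cover only simple substitutions, and they do not verify the amino acid identities, which would require the reference sequence.

```python
import re

# Simple coding-DNA substitutions such as "c.5719C>T" (illustrative pattern only).
_SUBSTITUTION = re.compile(r"^c\.(\d+)[ACGT]>[ACGT]$")
# Protein-level changes such as "p.Arg436Leu" or "p.Arg1907*" (illustrative pattern only).
_PROTEIN = re.compile(r"^p\.[A-Za-z]{3}(\d+)(?:[A-Za-z]{3}|\*)$")


def codon_of(cdna: str) -> int:
    """Return the 1-based codon number containing a coding-DNA substitution."""
    match = _SUBSTITUTION.match(cdna)
    if match is None:
        raise ValueError(f"not a simple substitution: {cdna!r}")
    return (int(match.group(1)) - 1) // 3 + 1


def residue_of(protein: str) -> int:
    """Return the residue number from a three-letter protein-level description."""
    match = _PROTEIN.match(protein)
    if match is None:
        raise ValueError(f"unsupported protein notation: {protein!r}")
    return int(match.group(1))


def is_consistent(cdna: str, protein: str) -> bool:
    """True when the coding position falls in the codon named at protein level."""
    return codon_of(cdna) == residue_of(protein)


# Substitutions quoted in the paragraph above.
for cdna, protein in [
    ("c.5719C>T", "p.Arg1907*"),
    ("c.3027T>A", "p.Tyr1009*"),
    ("c.1307G>T", "p.Arg436Leu"),
]:
    print(cdna, protein, is_consistent(cdna, protein))
```

Frameshift descriptions are deliberately excluded from this check, because the protein-level description names the first changed residue, which need not be the codon containing the deleted base.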
Interestingly, in addition to the c.2685delG (p.Arg896Glyfs*9) variant in TCF20 inherited from the affected mother, subject #7 also harbors a de novo likely pathogenic variant (p.Gln397*) in SLC6A1 that, as described for subject #17, could contribute to a blended phenotype in this patient. Defects in SLC6A1 can cause epilepsy and developmental delay (MIM 616421), overlapping with the presentation observed and reported to date in patients with deleterious variants in TCF20. For all the other patients, the clinical test referenced in this study, either exome sequencing or microarray, did not detect additional pathogenic or likely pathogenic variants in other known disease genes underlying the observed neurodevelopmental disorder.
Sanger sequencing confirmed that subjects #1 to #28 are heterozygous for the TCF20 variants and showed that these changes were absent from the biological parents in 17 patients; in 4 families (subjects #1, #5, #7, and siblings #27 and #28), the variants were inherited from parents with a similar phenotype, confirming the segregation of the phenotype with the variant within the families (Table 2, Fig. 1, and Additional file 1: Clinical information). One or both of the parental samples were unavailable for study in six cases.
In addition to SNVs/indels, we have studied four patients with heterozygous interstitial deletions (128 kb to 2.64 Mb in size) that include TCF20 (subjects #29 to #32, Fig. 3, Tables 1, 2, and 3). Subject #29 is a 4-year-old adopted female with global developmental delay, hypotonia, mixed receptive-expressive language disorder, ASD, ID, ADHD, and sleep disturbance. She was found to have a 2.64-Mb deletion at 22q13.2q13.31 involving TCF20 and 36 other annotated genes. Subject #30 is a 14-year-old male with global psychomotor delay, ASD, severe language delay, macrocephaly, congenital hypotonia, scoliosis, and abnormal sleep pattern. A heterozygous de novo 163-kb deletion was found in this individual removing exon 1 of TCF20. Subject #31 is a 5-year-old male with developmental disorder, seizures, and balance disorder with a 128-kb de novo heterozygous deletion involving TCF20, CYP2D6, and CYP2D7P1. Subject #32 is a 13-month-old female with global developmental delay, hypotonia, and emerging autistic features with a 403-kb deletion encompassing 11 annotated genes including TCF20. The deletions in subjects #30, #31, and #32 do not contain genes other than TCF20 that are predicted to be intolerant to loss-of-function variants, making TCF20 the most likely haploinsufficient disease gene contributing to these patients’ phenotypes. In patient #29, two genes included in the deletion, SCUBE1 and SULT4A1, have pLI scores of 0.96 and 0.97, respectively. These two genes may contribute to the phenotypic presentation of this patient together with TCF20 (pLI = 1) (Fig. 3).
We have also observed additional individuals presenting with neurodevelopmental disorders of variable severity from our clinical database, carrying de novo truncating variants (n = 6, Fig. 1, in green), deletions (n = 1, de novo, Fig. 3), and duplications (n = 3, Fig. 3) involving TCF20. These individuals are included in this study as anonymized subjects (Figs. 1 and 3). Additionally, we observed nine deletions (six are de novo) and five duplications (five are de novo) spanning TCF20 from the DECIPHER database; in some cases, the deletion CNV incorporates other potentially haploinsufficient genes (Fig. 3 and Additional file 1: Table S1). Taken together, these data from anonymized subjects combined with the current clinically characterized subjects in this study corroborate TCF20 being associated with a specific Mendelian disease condition.
Our results indicate that all variants identified in subjects #1 to #32 and four affected carrier parents represent either pathogenic or likely pathogenic (the de novo missense variant in subject #17) alleles. We performed RNA studies in patients #11, #25, and #7 and in the affected mother and sister of patient #7, who all carry premature termination codon (PTC) TCF20 variants that are expected to be subject to NMD as predicted by the NMDEscPredictor tool, because the PTCs are upstream of the 50-bp boundary from the penultimate exon based on both TCF20 transcripts (NM_181492.2 and NM_005650.3). Our data suggest that the mutant TCF20 mRNAs did not obey the “50-bp penultimate exon” rule and they all escaped from NMD (Additional file 1: Figure S2), which is consistent with a previous observation. Despite this, we did not observe a clear genotype-to-phenotype correlation among the different mutation categories. For instance, patients with missense mutations or truncating mutations near the terminal end of the gene did not present with milder phenotypes when compared with patients carrying early-truncating mutations in TCF20 or large deletions encompassing TCF20 and several surrounding genes; the phenotype appears consistent.
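The “50-bp penultimate exon” rule referred to above can be stated compactly: a PTC is predicted to trigger NMD when it lies more than about 50 nucleotides upstream of the last exon-exon junction of the spliced transcript. The sketch below is a simplified illustration of that rule only; it is not the NMDEscPredictor algorithm, the function name is hypothetical, and the exon lengths are toy values rather than TCF20 coordinates.

```python
from typing import Sequence


def predicted_to_trigger_nmd(
    ptc_position: int,
    exon_lengths: Sequence[int],
    boundary: int = 50,
) -> bool:
    """Apply the simplified 50-nt rule to a PTC in a spliced transcript.

    ptc_position is the 1-based transcript coordinate of the PTC;
    exon_lengths lists exon sizes in transcript order (toy values here).
    """
    transcript_length = sum(exon_lengths)
    if not 1 <= ptc_position <= transcript_length:
        raise ValueError("PTC position lies outside the transcript")
    if len(exon_lengths) < 2:
        # No exon-exon junction, so the rule predicts no NMD.
        return False
    last_junction = sum(exon_lengths[:-1])
    # NMD is predicted only when the PTC is more than `boundary` nt
    # upstream of the last exon-exon junction.
    return last_junction - ptc_position > boundary


# Hypothetical three-exon transcript; the last junction is at position 420.
toy_exons = [120, 300, 180]
print(predicted_to_trigger_nmd(200, toy_exons))  # far upstream of the junction
print(predicted_to_trigger_nmd(390, toy_exons))  # within 50 nt of the junction
print(predicted_to_trigger_nmd(500, toy_exons))  # in the last exon
```

A prediction from this rule is only a prior expectation; as the RNA studies described above show, transcripts predicted to be degraded can still escape NMD.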
We report 32 patients and 4 affected carrier parents with likely damaging pathogenic variants in TCF20. Phenotypic analysis of our patients, together with a literature review of previously reported patients, highlights shared core syndromic features of individuals with TCF20-associated neurodevelopmental disorder (TAND). Previous reports have collectively associated deleterious variants in TCF20 with ID, DD, ASD, macrocephaly, and overgrowth [6, 14,15,16] (Tables 1 and 2). The majority of the individuals in our cohort displayed an overlapping phenotype characterized by congenital hypotonia, motor delay, ID/ASD with moderate to severe language disorder, and variable dysmorphic facial features with additional neurological findings (Tables 1 and 2 and Fig. 2). We observe in our cohort that it is possible to have TCF20 deleterious variants transmitting across generations in familial cases (subjects #1, #5, and #7 and the twin brothers #27 and #28; Table 1, Additional file 1: Clinical information). Our parent carriers presented with an apparently milder phenotype; the mother of subject #1 showed mild dysmorphic facial features; the mother of subject #5 had features including ID, prominent forehead, tented upper lip, and short nose.
It is intriguing that TCF20 contains regions of strong sequence and structural similarity to RAI1 (Additional file 1: Figure S1) [22, 38,39,40,41]. RAI1 encodes a nuclear chromatin-binding multidomain protein with conserved domains found in many chromatin-associated proteins, including a polyglutamine and two polyserine tracts, a bipartite nuclear localization signal, and a zinc-finger-like plant homeodomain (PHD) (Additional file 1: Figure S1). A previous phylogenetic study of TCF20 and RAI1 suggested that a gene duplication event may have taken place early in vertebrate evolution, just after branching from insects, giving rise to TCF20 from RAI1, the latter representing the ancestral gene. The two proteins share organization of several domains such as the N-terminal transactivation domain, nuclear localization signals (NLS), and PHD/ADD at their C-terminus (Additional file 1: Figure S1). The PHD/ADD domain associates with nucleosomes in a histone tail-dependent manner and has an important role in chromatin dynamics and transcriptional control. Here, we report that some patients with TCF20 mutations may present phenotypic features reminiscent of SMS such as craniofacial abnormalities, which include brachycephaly, tented upper lips, and midface hypoplasia, as well as neurological disturbance (seizure, ataxia, abnormal gait), failure to thrive, food-seeking behaviors, and sleep disturbance.
To our knowledge, ataxia, hypertonia, food-seeking behavior, sleep disturbance, and facial gestalt reminiscent of SMS have not been previously reported in association with TCF20 pathogenic variants and represent a further refinement of TAND. Interestingly, subject #17 who presented features reminiscent of SMS harbors a missense variant c.5129A>G (p.Lys1710Arg) in the F-box/GATA-1-like finger motif part of the PHD/ADD domain in TCF20. The PHD/ADD domain that maps between amino acid positions 1690–1930 of TCF20 is highly conserved in RAI1 and confers the ability to bind the nucleosome and function as a “histone-reader” (HR) [8, 9]. Interestingly, mutations occurring in the region of GATA-1-like finger of RAI1 (p.Asp1885Asn and p.Ser1808Asn), in close proximity to the corresponding region of TCF20 where p.Lys1710 lies, are also associated with SMS [38, 39, 43].
Postnatal overgrowth has been previously reported in two patients with TCF20 defects. We observe overgrowth, obesity, or tall stature in nine of the patients from our cohort. Interestingly, eight of these nine patients fall into an older age group (> 9.5 years old), representing 73% (8/11) of the patients older than 9.5 years old from our cohort; in the age group younger than 9.5 years old, only 6.7% (1/15) of them presented overgrowth. Further longitudinal clinical studies are warranted to dissect the etiologies of overgrowth, obesity, and tall stature, and to investigate whether these growth accelerations are age-dependent.
Of note, a subset of patients reported herein have sleep disturbance (38%, n = 12/32), hyperactivity (28%, n = 9/32), obsessive–compulsive traits (9%, n = 3/32), anxiety (6%, n = 2/32), and food-seeking behavior/early obesity (16%, n = 5/32) (Table 2), which could ultimately be attributed to circadian rhythm alterations as seen in SMS and PTLS [22, 38, 39]. Receptors for the steroid hormones estrogen (ER) and androgen (AR) have an emerging role in the regulation of circadian rhythms and other metabolic functions in the suprachiasmatic nuclei in vertebrates through alteration of brain-derived neurotrophic factor (BDNF) expression in animal models [44,45,46,47]. Interestingly, Bdnf is also downregulated in the hypothalamus of Rai1+/− mice, which are hyperphagic, have impaired satiety, develop obesity, and consume more food during the light phase [48,49,50]. Since TCF20 has also been implicated in the regulation of ER- and AR-mediated transcriptional activity [10, 11, 51], we speculate that TCF20 might play a role in the regulation of circadian rhythms through steroid hormone modulation and that disruption of its activity could lead to the phenotype observed in a subset of our patients.
Besides patient #17, all other patients carry either deletions or truncating variants occurring before the last exon of TCF20 that are predicted to be loss-of-function, either presumably through NMD or by truncating essential domains of the TCF20 protein (Fig. 1). The frameshifting mutations from patients #27 and #28 are expected to result in a premature termination codon beyond the boundary of NMD, therefore rendering the mutant transcript immune to NMD. Future studies are warranted to delineate the exact correlation between genotype and phenotype in light of the potential escape from NMD and the potential pathway overlap and interaction between TCF20 and RAI1 in the determination of the phenotype. It has been shown that around 75% of mRNA transcripts that are predicted to undergo NMD escape destruction and that the nonsense codon-harboring mRNA may be expressed at similar levels to wild type. Therefore, as an alternative to NMD, we can speculate that, for instance, the truncating mutations that occur earlier in the gene before the first NLS (amino acid position 1254–1268) (Fig. 1, Additional file 1: Figure S1) in subjects #1 to #12 may determine loss-of-function of TCF20 due to either a decreased level of protein in the nucleus with consequent cytoplasmic accumulation and/or to the absence of key functional C-terminal domains including PHD/ADD domains and/or DBD, AT-hook, NLS2, and NLS3, the latter representing unique motifs not conserved between TCF20 and RAI1 (Fig. 1, Additional file 1: Figure S1). It has been previously shown that the frameshift mutation c.3518delA (p.Lys1173Argfs*5) in TCF20 in one patient with ASD produces a stable mRNA that escapes NMD. Data from our RNA studies corroborate this observation that TCF20 alleles with premature termination codon mutations may in general escape NMD. However, it should also be noted that NMD and mRNA turnover may be tissue specific and that the tissue tested here is limited to blood.
Based on this hypothesis, the position of amino acid truncation, for example, within the NLS or DNA-binding domain, may contribute to the prediction of genotype–phenotype correlation. The truncated TCF20 protein may retain partial function, representing hypomorphic alleles, or act in a dominant-negative manner sequestering transcription factors and co-factors in the absence of transcriptional modulation. Another possibility is that, due to the similarity between RAI1 and TCF20, mutated products of TCF20 could interfere with RAI1 pathways through the aforementioned mechanisms. Due to the complexity of the protein regulation and the variety of functional domains present in TCF20 (Additional file 1: Figure S1) that are not fully characterized, further studies are needed to refine the genotype–phenotype correlation.
Finally, although disorders associated with 22q13.2 deletions (encompassing TCF20) share similar features with Phelan–McDermid syndrome caused by deletion of SHANK3, our study provides evidence for the hypothesis that the major phenotypes observed in the former disorder are likely caused by direct consequence of TCF20 defects. Phenotypes specific for TCF20, such as sleep disturbances and movement disorders, may help clinically distinguish the 22q13.2 deletions from the 22q13.3 deletions (SHANK3). It is tempting to hypothesize that dosage gain of TCF20 may also be disease causing, given the similar observation at the 17p11.2 locus, where copy number gain of RAI1 was found to cause PTLS, potentially presenting mirror trait endophenotypes in comparison to SMS (e.g., underweight versus overweight) [53, 54]. This hypothesis predicts that TCF20 duplications are expected to cause similar neurodevelopmental defects as observed in the deletions, which is supported by the observation of TCF20 duplications from anonymized individuals with neurodevelopmental disorders, some of which are de novo (Fig. 2 and Additional file 1: Figure S1); additionally, one may speculate that specific phenotypes caused by TCF20 duplication may present mirror trait compared to those associated with the deletions, such as underweight versus overweight and schizophrenia spectrum disorders versus autism spectrum disorders. Further work is warranted to investigate the consequence of dosage gain of TCF20 in human disease.
Our findings confirm the causative role of TCF20 in syndromic ID, broaden the spectrum of TCF20 mutations recently reported, begin to establish an allelic series at this locus, and may help to understand the molecular basis of this new TAND syndrome. We also observe that some patients with pathogenic variants in TCF20 present phenotypes reminiscent of SMS, suggesting potential common downstream targets of TCF20 and RAI1. We suggest that, without molecular testing, it is challenging to reach a TAND diagnosis clinically based purely on the phenotypes observed in most patients. This underlines the importance of clinical reverse genetics for patients presenting with developmental delay and minor dysmorphic features, where positioning genotype-driven analysis (ES, CMA, or a combination of both) early in the “diagnostic odyssey” could improve the molecular diagnostic outcome and facilitate appropriate clinical management, including recurrence risk counseling.
ASD: Autism spectrum disorder
CMA: Chromosomal microarray analysis
PTC: Premature termination codon
TAND: TCF20-associated neurodevelopmental disorders
Wilson HL, Wong AC, Shaw SR, Tse WY, Stapleton GA, Phelan MC, et al. Molecular characterization of the 22q13 deletion syndrome supports the role of haploinsufficiency of SHANK3/PROSAP2 in the major neurological symptoms. J Med Genet. 2003;40(8):575–84.
Simenson K, Oiglane-Shlik E, Teek R, Kuuse K, Ounap KA, et al. A patient with the classic features of Phelan-McDermid syndrome and a high immunoglobulin E level caused by a cryptic interstitial 0.72-Mb deletion in the 22q13.2 region. Am J Med Genet A. 2014;164A(3):806–9.
Thummler S, Giuliano F, Karmous-Benailly H, Richelme C, Fernandez A, De Georges C, et al. Neurodevelopmental and immunological features in a child presenting 22q13.2 microdeletion. Am J Med Genet A. 2016;170(3):792–4.
Naoufal R, Legendre M, Couet D, Gilbert-Dussardier B, Kitzis A, Bilan F, Harbuz R. Association of structural and numerical anomalies of chromosome 22 in a patient with syndromic intellectual disability. Eur J Med Genet. 2016;59(9):483–7.
Babbs C, Lloyd D, Pagnamenta AT, Twigg SR, Green J, McGowan SJ, et al. De novo and rare inherited mutations implicate the transcriptional coregulator TCF20/SPBP in autism spectrum disorder. J Med Genet. 2014;51(11):737–47.
Elvenes J, Thomassen EI, Johnsen SS, Kaino K, Sjottem E, Johansen T. Pax6 represses androgen receptor-mediated transactivation by inhibiting recruitment of the coactivator SPBP. PLoS One. 2011;6(9):e24659.
Schafgen J, Cremer K, Becker J, Wieland T, Zink AM, Kim S, et al. De novo nonsense and frameshift variants of TCF20 in individuals with intellectual disability and postnatal overgrowth. Eur J Hum Genet. 2016;24(12):1739–45.
Lelieveld SH, Reijnders MR, Pfundt R, Yntema HG, Kamsteeg EJ, de Vries P, et al. Meta-analysis of 2,104 trios provides support for 10 new genes for intellectual disability. Nat Neurosci. 2016;19(9):1194–6.
Greenberg F, Guzzetta V, Montes de Oca-Luna R, Magenis RE, Smith AC, Richter SF, et al. Molecular analysis of the Smith-Magenis syndrome: a possible contiguous-gene syndrome associated with del(17)(p11.2). Am J Hum Genet. 1991;49(6):1207–18.
Liu P, Lacaria M, Zhang F, Withers M, Hastings PJ, Lupski JR. Frequency of nonallelic homologous recombination is correlated with length of homology: evidence that ectopic synapsis precedes ectopic crossing-over. Am J Hum Genet. 2011;89(4):580–8.
Bi W, Yan J, Shi X, Yuva-Paylor LA, Antalffy BA, Goldman A, Yoo JW, et al. Rai1 deficiency in mice causes learning impairment and motor dysfunction, whereas Rai1 heterozygous mice display minimal behavioral phenotypes. Hum Mol Genet. 2007;16(15):1802–13.
Potocki L, Bi W, Treadwell-Deering D, Carvalho CM, Eifert A, Friedman EM, et al. Characterization of Potocki-Lupski syndrome (dup(17)(p11.2p11.2)) and delineation of a dosage-sensitive critical interval that can convey an autism phenotype. Am J Hum Genet. 2007;80(4):633–49.
Zhang F, Potocki L, Sampson JB, Liu P, Sanchez-Valle A, Robbins-Furman P, et al. Identification of uncommon recurrent Potocki-Lupski syndrome-associated duplications and the distribution of rearrangement types and mechanisms in PTLS. Am J Hum Genet. 2010;86(3):462–70.
Lalani SR, Liu P, Rosenfeld JA, Watkin LB, Chiang T, Leduc MS, et al. Recurrent muscle weakness with rhabdomyolysis, metabolic crises, and cardiac arrhythmia due to bi-allelic TANGO2 mutations. Am J Hum Genet. 2016;98(2):347–57.
Boudreau EA, Johnson KP, Jackman AR, Blancato J, Huizing M, Bendavid C, et al. Review of disrupted sleep patterns in Smith-Magenis syndrome and normal melatonin secretion in a patient with an atypical interstitial 17p11.2 deletion. Am J Med Genet A. 2009;149A(7):1382–91.
De Munnik SA, Garcia-Minaur S, Hoischen A, van Bon BW, Boycott KM, Schoots J, et al. A de novo non-sense mutation in ZBTB18 in a patient with features of the 1q43q44 microdeletion syndrome. Eur J Hum Genet. 2014;22(6):844–6.
Posey JE, Harel T, Liu P, Rosenfeld JA, James RA, Coban Akdemir ZH, et al. Resolution of disease phenotypes resulting from multilocus genomic variation. N Engl J Med. 2017;376(1):21–31.
Coban-Akdemir Z, White JJ, Song X, Jhangiani SN, Fatih JM, Gambin T, et al. Identifying genes whose mutant transcripts cause dominant disease traits by potential gain-of-function alleles. Am J Hum Genet. 2018;103(2):171–87.
Carmona-Mora P, Canales CP, Cao L, Perez IC, Srivastava AK, Young JI, et al. RAI1 transcription factor activity is impaired in mutants associated with Smith-Magenis syndrome. PLoS One. 2012;7(9):e45155.
Chen L, Mullegama S, Alaimo JT, Elsea SH. Smith-Magenis syndrome and its circadian influence on development, behavior, and obesity - own experience. Dev Period Med. 2015;19(2):149–56.
Burns B, Schmidt K, Williams SR, Kim S, Girirajan S, Elsea SH. Rai1 haploinsufficiency causes reduced Bdnf expression resulting in hyperphagia, obesity and altered fat distribution in mice and humans with no evidence of metabolic syndrome. Hum Mol Genet. 2010;19(20):4026–42.
Lyngsø C, Bouteiller G, Damgaard CK, Ryom D, Sanchez-Muñoz S, Nørby PL, et al. Interaction between the transcription factor SPBP and the positive cofactor RNF4. An interplay between protein binding zinc fingers. J Biol Chem. 2000;275:26144–9.
MacArthur DG, Tyler-Smith C. Loss-of-function variants in the genomes of healthy humans. Hum Mol Genet. 2010;19(R2):R125–30.
We thank the patients and their families for participating in this study. This study makes use of data generated by the DECIPHER community. A full list of centers that contributed to the generation of the data is available from http://decipher.sanger.ac.uk and via e-mail from email@example.com. The views expressed in this publication are those of the author(s) and not necessarily those of the Wellcome or the Department of Health. The research team acknowledges the support of the National Institute for Health Research, through the Comprehensive Clinical Research Network.
This study was supported in part by the National Human Genome Research Institute/National Heart, Lung, and Blood Institute (NHGRI/NHLBI) grant UM1HG006542 to the Baylor Hopkins Center for Mendelian Genomics (BHCMG); and National Institute of Neurological Disorders and Stroke (NINDS) grant R35 NS105078-01 to JRL. JEP was supported by the NHGRI grant K08 HG008986. Funding for the DDD study project was provided by the Wellcome. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund (grant number HICF-1009-003), a parallel funding partnership between the Wellcome and the Department of Health, and the Wellcome Sanger Institute (grant number WT098051).
Availability of data and materials
The datasets supporting the conclusions of this article are included within the article and its additional files. Our raw data cannot be submitted to publicly available databases because the patient families did not consent to sharing their raw data, which could potentially identify the individuals.
Authors and Affiliations
Baylor Genetics, Houston, TX, 77021, USA
Francesco Vetrini, Wenmiao Zhu, Sarah H. Elsea, Weimin Bi, Seema Lalani, Fan Xia, Yaping Yang, Christine M. Eng, James R. Lupski & Pengfei Liu
Northern Ireland Regional Genetics Service, Belfast City Hospital, Belfast, UK
Shane McKee & Vivienne McConnell
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
Jill A. Rosenfeld, Andrea M. Lewis, Kimberly Margaret Nugent, Elizabeth Roeder, Rebecca O. Littlejohn, Joseph T. Alaimo, Brett Graham, Daryl A. Scott, Lindsay C. Burrage, Donna M. Muzny, Richard A. Gibbs, Sarah H. Elsea, Jennifer E. Posey, Weimin Bi, Seema Lalani, Fan Xia, Yaping Yang, Christine M. Eng, James R. Lupski & Pengfei Liu
Nottingham Genetics Service, Nottingham City Hospital, Nottingham, UK
Department of Pediatrics, Baylor College of Medicine, San Antonio, TX, 78207, USA
Kimberly Margaret Nugent, Elizabeth Roeder & Rebecca O. Littlejohn
North West Thames Regional Genetics Service, 759 Northwick Park Hospital, London, UK
Sue Holder & Birgitta Bernhard
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
Donna M. Muzny, Richard A. Gibbs & James R. Lupski
Dell Children’s Medical Group, Austin, TX, 78723, USA
Jill M. Harris & James B. Gibson
Division of Genetic and Genomic Medicine, Nationwide Children’s Hospital; and Department of Pediatrics, College of Medicine, Ohio State University, Columbus, OH, 43205, USA
Matthew Pastore & Kim L. McBride
Department of Pediatrics, College of Medicine & Health Sciences, United Arab Emirates University, Al Ain, UAE
Makanko Komara & Lihadh Al-Gazali
Department of Pediatrics, Tawam Hospital, Al-Ain, UAE
Aisha Al Shamsi
Department of Pediatrics, Section of Genetics, University of Oklahoma Health Sciences Center, Oklahoma City, OK, 73104, USA
Elizabeth A. Fanning & Klaas J. Wierenga
Department of Human Genetics and Metabolic Diseases, Hadassah-Hebrew University Medical Center, Jerusalem, Israel
Ziva Ben-Neriah & Vardiella Meiner
Department of Pediatrics, Texas Children’s Hospital, Houston, TX, 77030, USA
J. Lloyd Holder Jr, Seema Lalani & James R. Lupski
Department of Pediatrics, University of Hawaii, Honolulu, HI, 96826, USA
Laurie H. Seaver
Centre de Génétique Humaine, Université de Franche-Comté, Besançon, France
Lionel Van Maldergem
Department of Neurology, Boston Children’s Hospital, Boston, MA, 02115, USA
Sonal Mahida, Janet S. Soul & Margaret Marlatt
GeneDx, Gaithersburg, MD, 20877, USA
West Midlands Regional Clinical Genetics Service and Birmingham Health Partners; and Women’s and Children’s Hospitals NHS Foundation Trust, Birmingham, UK
East Anglia Regional Genetics Service, Addenbrooke’s Hospital, Cambridge, UK
June-Anne Gold & Soo-Mi Park
All-Wales Medical Genetics Service, University Hospital of Wales, Cardiff, UK
South East of Scotland Clinical Genetic Service, Western General Hospital, Edinburgh, UK
Anne K. Lampe
North East Thames Regional Genetics Service, Great Ormond Street Hospital, London, UK
Ajith Kumar & Melissa Lees
South East Thames Regional Genetics Service, Guy’s Hospital, London, UK
Oxford Regional Genetics Service, Oxford University Hospitals, Oxford, UK
Wessex Clinical Genetics Service, Princess Anne Hospital, Southampton, UK
The Hebrew University of Jerusalem, Jerusalem, Israel
Monique and Jacques Roboh Department of Genetic Research, Hadassah-Hebrew University Medical Center, 91120, Jerusalem, Israel
Present address: Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
Francesco Vetrini & Brett Graham
Department of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
Daryl A. Scott
Present address: Mayo Clinic Florida, Department of Clinical Genomics, Jacksonville, FL, 32224, USA
FV drafted the manuscript. PL conceived and supervised the study. FV, SMc, SHE, JEP, JRL, and PL participated in the manuscript writing and advised on the data analysis. SMc and JAR organized and coordinated patient recruitment and cohort assembly. MS, AML, KMN, ER, ROL, SH, BG, JMH, JBG, MP, KLM, MK, LA, AAS, EAF, KJW, DS, ZB, VM, HC, OE, JLH, LB, LHS, LVM, SMa, JS, MM, LM, JV, JG, SP, VV, AKL, AK, ML, MH, VM, BB, EB, VH, and SL contributed to patient recruitment and characterization of individual patient phenotypes/genotypes. WZ and JA performed the RNA experiments. DMM, RAG, YY, and CME supervised the clinical exome sequencing studies. WB and CME supervised the clinical microarray studies. FV, WB, SL, FX, YY, and PL generated the original clinical molecular analyses and interpretation of individual patients. All authors have read and approved the final manuscript.
All participants provided written informed consent to participate in the study. The study was approved by the Institutional Review Board of Baylor College of Medicine (H-22769 and H-41191) and by the UK Research Ethics Committee (10/H0305/83, granted by the Cambridge South REC, and GEN/284/12, granted by the Republic of Ireland REC). The research conforms to the principles of the Declaration of Helsinki.
Consent for publication
The consent to publish all identifiable information presented in the study including Fig. 2 was provided by the parents or legal guardians of the subjects.
Baylor College of Medicine (BCM) and Miraca Holdings Inc. have formed a joint venture with shared ownership and governance of Baylor Genetics (BG), which performs chromosomal microarray analysis and clinical exome sequencing. JAR, SHE, WB, FX, YY, CME and PL are employees of BCM and derive support through a professional services agreement with BG. FV and WZ are employees of BG. JRL serves on the Scientific Advisory Board of BG. JRL has stock ownership in 23andMe, is a paid consultant for Regeneron Pharmaceuticals, and is a coinventor on multiple US and European patents related to molecular diagnostics for inherited neuropathies, eye diseases, and bacterial genomic fingerprinting. The other authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original version of this article was revised as it contained a typographical error in the Results section. Subject 17 was incorrectly cited as Subject 1.
Clinical information. Clinical presentation of the subjects in this study. Table S1. Phenotypes for de-identified subjects from the DECIPHER database. Figure S1. Schematic representation of key conserved domains between TCF20 and RAI1. Figure S2. TCF20 alleles with premature termination codon variants escape from nonsense-mediated decay (NMD). (PDF 393 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Vetrini, F., McKee, S., Rosenfeld, J.A. et al. De novo and inherited TCF20 pathogenic variants are associated with intellectual disability, dysmorphic features, hypotonia, and neurological impairments with similarities to Smith–Magenis syndrome. Genome Med 11, 12 (2019). https://doi.org/10.1186/s13073-019-0623-0