Genetically complex epilepsies, copy number variants and syndrome constellations

Epilepsy is one of the most common neurological disorders, with a prevalence of 1% and lifetime incidence of 3%. There are numerous epilepsy syndromes, most of which are considered to be genetic epilepsies. Despite the discovery of more than 20 genes for epilepsy to date, much of the genetic contribution to epilepsy is not yet known. Copy number variants have been established as an important source of mutation in other complex brain disorders, including intellectual disability, autism and schizophrenia. Recent advances in technology now facilitate genome-wide searches for copy number variants and are beginning to be applied to epilepsy. Here, we discuss what is currently known about the contribution of copy number variants to epilepsy, and how that knowledge is redefining classification of clinical and genetic syndromes.

In older terminology, genetic epilepsies were referred to as 'idiopathic epilepsies' [4]. Syndromes, and sometimes subsyndromes, are delineated when the seizures are defined by easily recognizable electroclinical features and are similar enough to be regarded as a homogeneous group, distinct from other groups at the same classification level (Table 1). For example, genetic generalized epilepsies are frequently divided into their subsyndromes of childhood absence epilepsy, juvenile absence epilepsy, juvenile myoclonic epilepsy and generalized tonic-clonic seizures.
There is a subset of epilepsy syndromes that are clearly monogenic, and traditional linkage studies in large families have been useful for identifying causative genes [5,6]. However, the vast majority of the genetic epilepsies are multifactorial, with an underlying genetic contribution that is polygenic, where few or usually none of the susceptibility genes have been identified. This multifactorial concept dates back to the early works of William Lennox [7] and was well established in the modern era with additional twin data [8]. It is important to note that epilepsy with complex genetics and complex epilepsy are distinct concepts. To the geneticist, complex epilepsy is epilepsy with complex genetics; that is, multifactorial epilepsy that is polygenic and influenced by environmental effects, both internal and external. Complex epilepsy to the epileptologist, on the other hand, refers to the complexity of the seizure pattern. Without an appreciation of the difference, interactions between basic and clinical scientists can be, and in our experience have been, confused by 'complex epilepsy' meaning different things to different people. In the context of this article, complex epilepsy will mean that which is multifactorial in origin, rather than necessarily having complex seizure patterns.

Monogenic epilepsies
To date, more than 20 genes have been identified for the group of genetic epilepsies that are primarily monogenic [5,6,9,10], prompting a recent update of clinically based classification [1]. While individual syndromes that comprise each of these groups are generally diagnosed through clinical assessment, molecular testing now facilitates more accurate definition of clinically similar disorders that are now known to be caused by mutation of different genes. While gene identity provides an alternative or additional criterion for syndrome classification, it also has clinical efficacy, providing a rapid definitive diagnosis to obviate an otherwise circuitous set of invasive or costly investigative procedures. Furthermore, in some cases, specific therapeutic intervention can be enabled to achieve improved outcomes or more accurate prognosis. Genetic testing for the epilepsies has high clinical utility in cases that may involve SLC2A1 (glucose transporter type 1 deficiency), SCN1A (Dravet syndrome), PCDH19 (familial epilepsy and mental retardation limited to females, 'Dravet-like' PCDH19 syndrome), ARX (X-linked infantile spasms and myoclonic seizures, dystonia, and X-linked lissencephaly with ambiguous genitalia) or STK9 (X-linked infantile spasms) mutations. Testing has high analytical sensitivity (ability to detect the presence of a causative mutation) and high analytical specificity (ability to exclude mutation in a candidate gene) for all of the monogenic epilepsies, but not necessarily high clinical utility apart from some of the syndromes associated with the above genes [9]. It has little or no clinical utility at this time when knowledge of the gene is not needed for accurate syndrome classification, when knowledge of the gene does not direct or affect treatment, or in cases of genetically complex epilepsies triggered by the combined effects of multiple genes spread across the genome, most likely each having only a small effect on phenotype.

Complex epilepsies
Speculation about the genetic architecture of the genetically complex epilepsies centers on the common disease-common variant hypothesis [11] and the common disease-rare variant hypothesis [12]. The general failure of linkage and association studies applied to the complex epilepsies [13-16] argues against the common disease-common variant hypothesis, although the major criticism of such studies is that they are underpowered to detect the magnitude of odds ratios that are likely associated with susceptibility variants in the genetically complex epilepsies [17] and indeed other neuropsychiatric brain disorders.
The common disease-rare variant hypothesis, which suggests a variable subset of multiple rare genetic variants, has greater appeal for complex epilepsy [18,19], especially given the failure of association studies, which work on the premise of the common disease-common variant hypothesis [16], to deliver consistent findings. A mixture of the two models is also entirely plausible [19], with functional differences in the electrophysiological properties of ion channels demonstrated for both rare and polymorphic genetic variation detected at the GABRD (encoding γ-aminobutyric acid A receptor, δ), CACNA1H (encoding calcium channel, voltage-dependent, T type, α1H subunit) and CLCN2 (encoding chloride channel 2) genes [20-23], for example. Computer simulation supports the notion that genetic variations associated with only very small functional changes in ion channel properties are sufficient to make meaningful contributions to increasing susceptibility to epilepsy [24].
Multiple sclerosis is another disorder with complex inheritance where extensive study suggests 'risk variants likely to include hundreds of modest effects and possibly thousands of very small effects' [25]. Similar conclusions with systematic effects of multiple rare variants across the genome have been suggested for schizophrenia and bipolar disorder [26]. We predict the same for epilepsy with complex inheritance, with seizure susceptibility thresholds determined by combinations of many rare to moderately common sequence variants, copy number variants (CNVs) and perhaps noncoding DNA sequences with functional effects. Weak effects will only be detectable by genome-wide association studies using massive sample sizes. Kryukov et al. [27] pre-empted outcomes from deep resequencing by massively parallel sequencing (previously referred to as next-generation sequencing [28]) by promoting an association study approach based on the premise of multiple rare variants present in susceptibility genes in higher numbers for a given disease group (for example, epilepsy) than in their corresponding controls. The statistical tools to support that approach are now surfacing [29].
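The case-control comparison described above, asking whether carriers of rare variants in a candidate gene are more numerous among cases than among controls, can be illustrated with a one-sided Fisher's exact test. The following is a minimal sketch only: it is not the method of references [27] or [29], the function name is hypothetical, and the carrier counts in the usage lines are invented for illustration.

```python
from math import comb


def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test for enrichment of carriers among cases.

    Returns P(X >= case_carriers) under the hypergeometric null, where X is
    the number of carriers that fall among the cases if carrier status is
    unrelated to disease status.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denom = comb(total, case_total)
    upper = min(carriers, case_total)
    p = 0
    for k in range(case_carriers, upper + 1):
        # Ways to place k carriers among the cases and the rest among controls
        p += comb(carriers, k) * comb(total - carriers, case_total - k)
    return p / denom


if __name__ == "__main__":
    # Hypothetical counts: 12 of 500 cases and 2 of 500 controls carry at
    # least one rare variant in the candidate gene.
    print(fisher_one_sided(12, 500, 2, 500))
```

Collapsing all rare variants in a gene into a single carrier/non-carrier status is the simplest possible burden measure; it ignores variant function and effect direction, which more refined statistical tools attempt to model.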
The heritability of genetic generalized epilepsy suggests a major genetic component [8] but virtually none has yet been identified. This constitutes the 'dark matter' [30]. The task is to find this missing heritability and characterize it in terms of number of loci, effect sizes, allelic frequencies of variants and the nature of the variants [31]. Areas being investigated include cis-acting genome-wide regulatory variants [32], genome-wide copy number variants [33,34] as discussed below, and, in the future, next-generation sequencing [28].

Copy number variation in epilepsy
CNVs are deletions, duplications or insertions of DNA in the genome that range in size from approximately 1 kb to [...] Genome-wide methods to detect CNVs include array comparative genomic hybridization (array CGH) and SNP genotyping arrays. These technologies can be targeted to specific chromosomal regions [43,45-49]. However, their real power lies in their capability for genome-wide interrogation, where there is no need for a priori knowledge of where a lesion may lie [33,34,46,50]. Using that approach, Depienne et al. [46] discovered a Dravet-like syndrome caused by severe PCDH19 mutations on chromosome X, and McMahon et al. [50] 'rediscovered' the 15q13.3 CNV and found a novel 10q21.2 microduplication. Mefford et al. [33] and Heinzen et al. [34] used genome-wide approaches to establish the extent of rare CNVs in the genetic epilepsies (see below). For CNVs with boundaries extending beyond the target gene, array CGH is a powerful tool for accurately determining size and gene content. Large epilepsy-associated CNVs detectable by MLPA, but extending well beyond the one gene of special interest (for example, beyond SCN1A), can also be reliably detected by array technologies [40,43,45].
The role of CNVs in epilepsy has now been addressed by several groups using both targeted and genome-wide approaches. Helbig and colleagues [51] first directed our attention to the role of the 15q13.3 microdeletion in the etiology of epilepsy. This microdeletion was first described in a series of patients with ID, most of whom also suffered from seizures [52], but is much more common in epilepsy cohorts [51,53,54]. This is one of the most prevalent genetic risk factors identified for the genetic generalized epilepsy syndromes. A range of rare mutations within SLC2A1 encoding the GLUT1 glucose transporter are at least as important within the childhood absence epilepsy subsyndrome of genetic generalized epilepsy [55,56]. Although the confidence interval is broad, the estimated odds ratio of 68 (95% confidence interval 29 to 181) for the 15q13.3 deletion [54] greatly exceeds that of most common susceptibility variants detectable by genome-wide association studies in disorders other than epilepsy. Despite its relative 'severity' in relation to risk, its frequency in epilepsy cohorts is relatively high at around 1.3%. Conversely, this variant is difficult to find in the general control population, despite the screening of large numbers of controls, even though family studies following detection of an index case disclose frequent transmissions from nonpenetrant carrier parents [54,57]. Moreover, the position of the original mutation in the pedigree is often not too far back into its living ancestry, suggesting a relatively high recurrent mutation rate. Of the seven genes within the lesion, haploinsufficiency of CHRNA7 (nicotinic acetylcholine receptor, α7) is considered to be the most likely pathogenic element, although it is not the only neuronally expressed gene affected by the deletion.
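To show how an odds ratio and a broad confidence interval of this kind arise from carrier counts, the following sketch computes an odds ratio with a Wald (log-scale) 95% confidence interval from a 2x2 table. It is an illustration under stated assumptions, not the analysis performed in reference [54]: the function name is hypothetical, the counts in the usage lines are invented, and the 0.5 continuity correction for empty cells is one common convention among several.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Wald confidence interval on the log scale.

    If any cell is zero, 0.5 is added to every cell (Haldane-Anscombe
    correction) so that the ratio and its standard error are defined.
    """
    cells = [case_carriers, case_noncarriers,
             control_carriers, control_noncarriers]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR); dominated by the smallest cell count
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 12 carriers among 1,000 cases and
    # 1 carrier among 5,000 controls.
    print(odds_ratio_ci(12, 988, 1, 4999))
```

Because the standard error is driven by the reciprocal of the smallest cell, a deletion that is almost never seen in controls necessarily yields a wide interval, however many controls are screened.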
Interestingly, early genome-wide linkage studies implicated the CHRNA7 region in juvenile myoclonic epilepsy [58], but this could not be replicated [59], and screening of CHRNA7 did not detect convincing mutations [60]. Could it be that the families studied by Elmslie et al. [58] contained enough families segregating the 15q13.3 microdeletion to give a linkage signal?
Subsequent studies investigated the role of other large CNVs that had previously been associated with increased risk of ID, autism and schizophrenia [53]. Somewhat surprisingly, significant numbers of the same recurrent CNVs involved in the disorders listed above were implicated as a component of the polygenic pathogenic genetic architecture in the clinically and genetically complex (idiopathic) epilepsies. Two microdeletions commonly associated with epilepsy are at 15q11.2 and 16p13.11 [33,34,53]. Together with the 15q13.3 microdeletion, their combined frequency in test populations of genetic generalized epilepsy is approximately 3% [33]. Other large recurrent CNVs associated with ID, autism or schizophrenia that have also been detected in epilepsy are at 1q21.1, 16p12, 22q11 and two regions within 16p11.2 [33,53]. These CNVs represent clearly defined genetic determinants shared by a number of disorders hitherto regarded as distinct, comprising part or all of their genetic architectures. The three most common recurrent CNVs, which together account for up to 3% of epilepsies, are shown in Figure 1. Notably, the 15q13.3 microdeletion has been consistently present in 0.5% to 1% of all genetic generalized epilepsy cohorts but has not been seen in >3,000 patients who presented with focal epilepsy syndromes [34], and therefore it may be a risk factor specifically for generalized epilepsy syndromes. Deletions at 16p13.11 and 15q11.2 have been found in both generalized and focal epilepsies [33,34,53].
The large, recurrent CNVs described above occur because of specific genomic architecture at each respective chromosome region. CNV is mediated by naturally occurring sets of low copy repeats or segmental duplications [61-63] that facilitate nonallelic homologous recombination [64,65], resulting in deletion or duplication of the intervening unique sequence. Therefore, each region with such architecture is prone to rearrangement at meiosis, causing recurrence of large CNVs with nearly identical breakpoints in unrelated individuals. Because CNVs at these rearrangement-prone regions of the genome occur with an appreciable frequency, it has been possible to detect a statistically significant difference between cases and controls.
Apart from the recurrent CNVs discussed above, the rare nonrecurrent CNVs are also likely to play a significant role in the genetic etiology of epilepsy. Two recent studies applied genome-wide technologies to detect CNVs in affected individuals. Heinzen and colleagues [34] evaluated 3,812 individuals and found an enrichment of large (>1 Mb) deletions in affected individuals, the majority of which were seen in one individual each. Mefford et al. [33] evaluated 517 individuals with various types of epilepsy and found that nearly 10% carried one or more rare CNVs that had not been previously found at an appreciable frequency in controls. Again, the majority of events were seen only once, and represent a subset of the rare nonrecurrent CNVs involving genes that have been implicated in ID, autism or schizophrenia.

Syndrome constellations associated with CNVs
Taken literally, a constellation is a number of stars grouped within an outline. Here, we regard the CNV as the 'outline' encompassing a group of its associated syndromes comprising the syndrome constellation. Different combinations of syndromes define the constellations that are packaged within different CNVs. The CNVs can be recurrent in the population, and any recurrent CNV located in a given region is virtually identical from patient to patient. The syndrome constellations include one or more types of ID, dysmorphism, autism, schizophrenia and, more recently, genetic generalized epilepsy. The various syndromes within the constellations are themselves genetically and phenotypically heterogeneous, and in some cases have defined subsyndromes. For example, genetic generalized epilepsy consists of the subsyndromes childhood absence epilepsy, juvenile absence epilepsy, juvenile myoclonic epilepsy and generalized tonic-clonic seizures. Recurrent deletions at 15q13.3 (1.5 Mb, seven genes), at 16p13.11 (1.2 Mb, eight genes) and at 15q11.2 (1.3 Mb, four genes) are emerging as the most common genetic determinants for various distinct disorders with complex inheritance. These generally include intellectual disability with or without dysmorphism, autism, schizophrenia or genetic generalized or focal epilepsy. Epilepsy was the latest addition to the constellations of syndromes associated with each of these CNVs, and is now well established [33,34,51,53,54]. A similar picture is emerging for the rarer recurrent CNVs at 1q21.1, 16p12 and two regions within 16p11.2 [33,53]. Given the comorbidity of ID and epilepsy, autism and ID, and autism and epilepsy, for example, perhaps it should not be surprising that some CNVs cause overlapping neuropsychiatric features in affected individuals.
However, it seems remarkable that the same CNV susceptibility lesion can be a genetic determinant for apparently disparate conditions (for example, only epilepsy in one patient, only schizophrenia in another). One possible explanation might be that the odds ratios associated with disorders included within a given constellation of syndromes are relatively high in the context of disorders with complex inheritance. For example, genetic generalized epilepsy has an odds ratio of 68 (95% confidence interval 29 to 181) for the 15q13.3 deletion [54]; this is far higher than for susceptibility variants generally detected in complex genetic disorders. Another possible explanation is the presence of as yet undetected additional genetic or epigenetic variants that influence the phenotypic outcome. All of the 'common' recurrent CNVs in epilepsy (15q13.3, 16p13.11 and 15q11.2) have probably been identified already, given the extent of the array CGH genome-wide searches already completed [33,34]. Some of the less common recurrent microdeletions at 1q21.1, 16p12 and two regions within 16p11.2 may be associated with their own multisyndrome constellations.
Rare or unique nonrecurrent CNVs are collectively more common than the recurrent ones combined. These lesions provide a wealth of leads to candidate epilepsy genes within or closely adjacent to them. The number, frequency and distribution of each gene-bearing CNV are consistent with the common disease-rare variant model for the genetic architecture of complex epilepsy. Overall genetic profiles of susceptibility genes for each individual are likely to be unique and fit the polygenic heterogeneity concept [18]. Genes within these epilepsy-associated CNVs and genes identified through massively parallel sequencing [66] each represent independent opportunities to break out of the ion channel paradigm, which might otherwise constrain our thinking if the genetic architecture of epilepsy extends beyond ion channels. Results of studies performed so far suggest that haploinsufficiency (deletions) or overexpression (duplications) of some of the genes in nonrecurrent CNVs may elicit the same syndromes as those in their associated constellations.
There are two common threads in these discussions. First, the constellations of syndromes associated with each recurrent CNV can include a range of diverse phenotypes, including, in most cases, some combination of ID, autism, schizophrenia and epilepsy. Each CNV probably elicits its own specific distribution of phenotypes and frequency of each phenotype, defining the associated constellation. Second, the mechanism for genesis of this extreme clinical heterogeneity observed within virtually identical lesions is not yet known. Several mechanistic possibilities have been outlined [34,67-69] but none has been proven as a general mechanism, or even a mechanism specific to any given CNV. The clinical heterogeneity is likely to depend upon the nature of the other risk factors or genetic modifiers in the rest of the genome that alone or in combination may specify the phenotype.

Conclusions and future perspectives
The concept of extensive clinical heterogeneity in epilepsy associated with a well-defined genetic lesion is not new. Well-known examples are genetic generalized epilepsy with febrile seizures plus [19], caused by mutations in sodium channel genes, and recently, genetic generalized epilepsy caused by the 15q13.3 CNV [70]. These observations have challenged complete reliance on the phenotype-first approach to diagnosis. Investigations will always begin with general clinical evaluation to broadly classify cases into disease categories. Taking genetic generalized epilepsy as an example, is it then necessary to further refine down to subsyndromes using clinical criteria alone, and to even contemplate endophenotyping for deeper clinical refinement? The answer is clearly no in the context of syndromic constellations associated with some CNVs and phenotypic spectrums associated with some familial missense mutations. The aim of that exercise of making phenotypes as clinically homogeneous as possible would be to promote genetic homogenization of study populations so that associations are easier to detect. But for CNVs and missense mutations in some genes, collections of the same CNV or same mutation are already genetically homogeneous, at least for that component of the complex polygenic architecture.
The approach needs to be turned upside down, by adoption of a genotype-first approach where novel genomic disorders such as genetic generalized epilepsy are classified and defined by detection of a common deletion or duplication. The collection of large numbers of patients with the same CNV genotype but a wide variety of phenotypes including epilepsy will facilitate genotype-phenotype studies that might provide insight into the mechanisms that influence phenotype diversity in these and other disorders. Conversely, the collection of large numbers of genetic generalized epilepsy patients (not even subtyped into subsyndromes) with significantly more multiple rare DNA sequence changes within the same putative epilepsy susceptibility gene, as compared with unaffected controls, might be an outcome of their pursuit through massively parallel sequencing. That would enable us to work backwards, to endophenotype just those cases with mutations in a defined susceptibility gene to see if they have subtle phenotypic features in common. Thus might emerge a subsyndrome classification that is different to that currently in use, based on more relevant components of the phenotype that better reflect the underlying molecular genetics.
Finally, we agree that careful clinical phenotyping is a vital component of our research, as the constellations associated with each of the CNVs need to be accurately characterized. Consider cohorts comprising 15q13.3 deletions, for example. Some of the cases are regarded as epilepsy only. Others are regarded as having dual phenotypes, of epilepsy and ID, for example. Are these really dual phenotypes? Consider the hypothetical possibility that the haploid content of the 15q13.3 region lowers the seizure threshold and adversely affects intelligence in everyone who carries it. Some carriers will not have epilepsy because their susceptibility profile contains too few susceptibility variants at other loci throughout the genome, in addition to 15q13.3, to take them across the seizure threshold. Some carriers will not have ID because their baseline intelligence quotient will be high enough to begin with that even with some depression of intelligence quotient through the effects of the 15q13.3 deletion they remain within the normal range. Others, toward the lower end of the normal range to begin with, unfortunately drop down into the ID range. We challenge the clinical researchers to prove us wrong or, like us, seriously question the notion of dual phenotypes presenting in only a subset of the 15q13.3 deletion carriers.
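The hypothetical scenario above, in which a deletion depresses intelligence quotient in every carrier but only some carriers cross into the ID range, can be made concrete with a simple shift model. The sketch below is purely illustrative: the conventional population values (mean 100, standard deviation 15, ID cutoff of 70) are assumptions of the example, and the size of the downward shift attributed to the deletion is an invented number, not an estimate from any cited study.

```python
from statistics import NormalDist


def fraction_below_cutoff(mean_iq=100.0, sd_iq=15.0, shift=0.0, cutoff=70.0):
    """Fraction of a normally distributed population scoring below `cutoff`
    after every individual's score is lowered by the same `shift`."""
    return NormalDist(mu=mean_iq - shift, sigma=sd_iq).cdf(cutoff)


if __name__ == "__main__":
    # Hypothetical shifts (in IQ points) applied uniformly to all carriers
    for shift in (0.0, 10.0, 20.0):
        fraction = fraction_below_cutoff(shift=shift)
        print(f"shift {shift:4.1f}: {fraction:.3f} below cutoff")
```

Under this model every carrier is affected, yet most remain above the cutoff for moderate shifts, which is the sense in which an apparent 'epilepsy only' carrier need not be free of a cognitive effect.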

Competing interests
The authors declare that they have no competing interests.