Copy number variation in Parkinson's disease

A central theme of human genetic studies is to understand genomic variation and how this underlies the inherited basis of disease. Genomic variation can provide increased biological understanding of disease processes, which is necessary to develop future treatments. Recent technological advances have highlighted the role of copy number variants in normal and pathological phenotypic expression. These applications have been used in studies of Parkinson's disease, a common, late-onset, progressive neurodegenerative disorder. At present the main therapeutic approach is administration of symptom-alleviating drugs, which neither reverses the disease process nor halts its progression. However, the generation of in vivo model systems and development of novel disease intervention strategies for Parkinson's disease have come from research on monogenic forms of the disorder, including those caused by copy number variants. Here, we review the role of copy number variants and the mechanistic insights they have provided on the pathogenesis of Parkinson's disease.

Copy number variants (CNVs) range from less than one kilobase (kb) to several megabases (Mb) in size and can be caused by genomic rearrangements such as deletions, multiplications, inversions and translocations [2]. The vast majority of CNVs are unbalanced, may be limited to a single gene or include a contiguous set of genes, and can either be inherited or arise de novo. Altered expression levels of the genes within these CNVs may be responsible for observed phenotypic variability, complex behavioral traits and disease susceptibility. So far, CNVs have been implicated in the pathogenesis of several neurological disorders, including Parkinson's disease (PD) [3,4].

Parkinson's disease
PD is a common neurodegenerative disorder affecting approximately 1% of the population aged over 60 years. The disease presents clinically as a movement disorder characterized by tremor at rest, bradykinesia, rigidity and postural instability. The neuropathological changes leading to a diagnosis of PD are dopaminergic cell loss in the substantia nigra accompanied by the formation of Lewy bodies, which are intracytoplasmic protein aggregates within the remaining neurons. The neuropathology associated with PD progresses over time, and in more advanced stages patients develop a range of non-motor symptoms, including cognitive decline. Although drugs such as levodopa or surgical intervention (deep brain stimulation) can help alleviate the motor symptoms, they do not halt disease progression and are not effective against the non-motor aspects of the disease, such as rapid eye movement sleep behavior disorder, constipation, depression and cognitive decline.
A major breakthrough in recent years has been the mapping of chromosomal loci linked to familial PD and the cloning of five genes causing monogenic forms of the syndrome [5]. Mendelian forms of PD are relatively rare and the genetic mutations account for only a small fraction of affected individuals; however, common variation at some of these loci has been shown to confer population disease risk in association-based studies. The identification of CNVs in several of these genes has provided mechanistic insights into PD pathogenesis, generated in vivo model systems and driven the development of novel therapeutic intervention strategies.

Pathogenic copy number variation in autosomal dominant Parkinson's disease
A seminal discovery in the study of PD was the report of missense mutations in the SNCA gene, encoding the α-synuclein protein, in dominantly inherited disease [6]. This was the first evidence that genomic variation could result in an inherited form of PD. The subsequent identification of α-synuclein as the major protein component of the pathological substrate (Lewy bodies) placed α-synuclein at the center of PD research. However, the missense mutations reported in SNCA provided limited insight into the pathological mechanisms involved.
Toft and Ross Genome Medicine 2010, 2:62 http://genomemedicine.com/content/2/9/62
In 2003, genomic multiplications of chromosome 4q21-22 containing the SNCA locus were shown to cause familial PD (Figure 1). Genomic triplication of the gene causes a rapidly progressive form of PD [4]. In contrast, the clinical phenotype of families with a duplication of SNCA resembles idiopathic PD, with late age at onset, slower disease progression and no early development of dementia. Several asymptomatic duplication carriers over 70 years of age without any signs of PD have recently been identified, indicating reduced penetrance [7]. Segmental intra-allelic duplication and interallelic recombination with unequal crossing-over both seem to be responsible for SNCA multiplications [8]. Measurements of protein levels in triplication carriers confirmed the predicted doubling of α-synuclein protein in blood and increased levels of the protein in the brain. Therefore, even without sequence mutations, increased wild-type α-synuclein dosage may cause PD. Recent studies have confirmed the association between common non-coding genetic variants at the SNCA locus and increased risk of PD [9]. These findings suggest that variation in regions of SNCA, most likely deregulating constitutive gene overexpression, may provide a therapeutic target for a substantial proportion of patients with the more frequent sporadic form of PD. Currently, studies are underway to generate in vivo SNCA multiplication model systems and to identify sensitive and specific downregulators of SNCA gene and protein expression.
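The dosage argument above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming that carriers are heterozygous for the multiplication and that expression scales linearly with gene copy number; both are simplifying assumptions made for illustration, and the function names are hypothetical.

```python
# Idealized gene-dosage arithmetic for SNCA multiplications.
# Assumption (for illustration only): expression scales linearly with copy number.

NORMAL_COPIES = 2  # one SNCA copy on each chromosome 4 homolog


def total_copies(extra_copies_on_one_allele: int) -> int:
    """Total SNCA copies in a heterozygous carrier of a multiplication."""
    return NORMAL_COPIES + extra_copies_on_one_allele


def relative_dosage(copies: int) -> float:
    """Expected dosage relative to a non-carrier under the linear assumption."""
    return copies / NORMAL_COPIES


genotypes = {
    "non-carrier": total_copies(0),
    "heterozygous duplication": total_copies(1),   # 2 + 1 copies on one allele
    "heterozygous triplication": total_copies(2),  # 3 + 1 copies on one allele
}

for label, copies in genotypes.items():
    print(f"{label}: {copies} copies, {relative_dosage(copies):.1f}x expected dosage")
```

Under these assumptions a heterozygous triplication gives four copies in total, which corresponds to the predicted doubling of α-synuclein, whereas a duplication gives three copies and an intermediate 1.5-fold dosage.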

Recessive forms of PD
Whereas the SNCA story suggests a gain of function, several early-onset forms of PD-like disorders have demonstrated the role of loss of function in the disease. The Parkin (PRKN) gene is one of the largest known genes, spanning a 1.4 Mb genomic region. PRKN mutations are the most common cause of early-onset PD identified so far and are particularly frequent in individuals with evidence of recessive inheritance. The initial pathogenic PRKN mutations identified were large homozygous genomic deletions, and more than 100 different pathogenic variants in this gene have since been published, including deletions, multiplications and missense and nonsense mutations [10]. The PRKN gene is located in one of several genomic regions of very high deletion frequency ('hotspots'), where independent rare deletions are found at frequencies up to 100-fold higher than the average for the genome as a whole [11]. Approximately a third of all pathogenic PRKN variants are CNVs occurring between exons 2 and 5, which form a recombination hotspot [10].
Pathogenic mutations in the PTEN-induced kinase (PINK1) gene are much less common than PRKN mutations and probably account for only 1 to 2% of early-onset PD cases. CNVs in PINK1 have been reported in families with PD, including a deletion of the entire PINK1 gene [12]. This deletion also partly involved two neighboring genes, and two highly similar AluJo repeat sequences that enclose the putative breakpoints were found within these genes. It is likely that the deletion resulted from unequal crossing-over between these two sequences. Furthermore, a homozygous deletion of PINK1 exons 4 to 8 was found in three affected siblings from a consanguineous Sudanese family with early-onset PD [13]. Breakpoint analysis revealed a complex rearrangement combining a large deletion and the insertion of a sequence duplicated from the neighboring dolichyl-diphosphooligosaccharide-protein glycosyltransferase (DDOST) gene.
The PARK7 locus on chromosome 1p36 was localized by homozygosity mapping in two consanguineous families from genetically isolated communities in the Netherlands and Italy. In one of the families, a 14 kb deletion involving the first five of seven exons in the DJ-1 gene was identified [14]. The DJ-1 protein is involved in the oxidative stress response and may be targeted to the mitochondria. Alu repeat elements flank the deleted sequence on both sides, suggesting that unequal crossing-over was the likely origin of this genomic rearrangement. Very few mutations in this gene have been reported, and mutation of DJ-1 accounts for less than 1% of early-onset PD cases. A heterozygous duplication involving the first five exons of DJ-1 in a single patient is the only other CNV published so far [15].
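Unequal crossing-over between flanking repeats, as proposed for the PINK1 and DJ-1 deletions, can be illustrated with a toy model. The sketch below is an assumption-laden illustration: the segment labels and their layout are hypothetical and do not represent real coordinates, and the function name is invented for this example. It shows only that misalignment of two similar repeats yields one product lacking the intervening segment and a reciprocal product carrying it twice.

```python
# Toy model of unequal crossing-over between two similar flanking repeats.
# Segment names are hypothetical labels, not real genomic coordinates.


def unequal_crossover(chromatid, upstream_repeat, downstream_repeat):
    """Recombine two identical chromatids after the upstream repeat of one
    copy has misaligned with the downstream repeat of the other.

    Returns (deletion_product, duplication_product).
    """
    i = chromatid.index(upstream_repeat)
    j = chromatid.index(downstream_repeat)
    if i >= j:
        raise ValueError("upstream repeat must precede downstream repeat")
    # Product 1: sequence up to the upstream repeat joined to the sequence
    # that follows the downstream repeat; the segment in between is lost.
    deletion = chromatid[: i + 1] + chromatid[j + 1:]
    # Product 2 (reciprocal): sequence up to the downstream repeat joined to
    # the sequence that follows the upstream repeat; the segment is doubled.
    duplication = chromatid[: j + 1] + chromatid[i + 1:]
    return deletion, duplication


chromatid = ["5' flank", "Alu_a", "exons 1-5", "Alu_b", "exons 6-7", "3' flank"]
deletion, duplication = unequal_crossover(chromatid, "Alu_a", "Alu_b")

print("deletion product:   ", deletion)
print("duplication product:", duplication)
```

In a real recombination event the repeat at the junction would be a hybrid of the two parental repeats; the model keeps one parental label at each junction for simplicity.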

Genome-wide association studies
Genome-wide association studies (GWASs) have helped to elucidate the genetic basis for a number of complex disorders. GWASs use microarrays with up to one million single nucleotide polymorphisms (SNPs) to capture a significant amount of an individual's genetic variation. In addition to identifying SNPs associated with disease, the array data also allow quantitative assessment of the genome, including the detection of CNVs. A total of five GWASs of PD have been published so far. However, analyses of structural genetic variation have been published from only one relatively small study of 273 patients [16]. No new regions associated with PD were identified, but several deletions and duplications in the PRKN gene were observed. The potential power of this approach in PD and other neurodegenerative disorders has yet to be fully exploited.
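How array data can yield a quantitative copy number estimate can be sketched as follows. This is an idealized illustration and not the method of the cited study: it assumes that probe intensity is proportional to copy number, so that a segment with n copies has a log2 ratio of log2(n/2) relative to a diploid reference. The probe values and segment names are made up, and real data are noisier and typically analyzed with dedicated segmentation or hidden Markov model callers.

```python
import math
from statistics import mean

REFERENCE_COPIES = 2  # autosomal locus in a diploid genome


def expected_log2_ratio(copies: int) -> float:
    """Ideal log2(sample / reference) intensity ratio for a given copy number."""
    if copies <= 0:
        return float("-inf")  # homozygous deletion: no signal in the ideal case
    return math.log2(copies / REFERENCE_COPIES)


def call_copy_number(log2_ratio: float) -> int:
    """Nearest integer copy number implied by an ideal log2 ratio."""
    return max(0, round(REFERENCE_COPIES * 2 ** log2_ratio))


def classify(copies: int) -> str:
    """Label a copy number relative to the diploid reference."""
    if copies < REFERENCE_COPIES:
        return "deletion"
    if copies > REFERENCE_COPIES:
        return "gain"
    return "normal"


# Hypothetical per-probe log2 ratios for three genomic segments (made-up values).
segments = {
    "segment_1": [0.02, -0.05, 0.04, 0.01],
    "segment_2": [-0.95, -1.10, -1.02, -0.98],
    "segment_3": [0.55, 0.61, 0.60, 0.57],
}

calls = {}
for name, probes in segments.items():
    # Average the probes in a segment, then convert to an integer copy number.
    copies = call_copy_number(mean(probes))
    calls[name] = (copies, classify(copies))
    print(name, copies, classify(copies))
```

Averaging over several probes before calling reflects the general idea that a CNV is supported by a run of consecutive markers and not by a single SNP.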

Conclusions
A recent study of eight common diseases (including coronary artery disease, rheumatoid arthritis and diabetes) concluded that common CNVs are unlikely to have a major pathogenic role [17]. However, this does not exclude the possible existence of several rare CNVs causing the same disorders in a proportion of patients. The number of individuals carrying a given CNV at a known gene locus is frequently small. Nevertheless, the discovery of these changes by GWASs or family-based studies is important.
Limited data from GWASs have been published on the extent of CNVs in PD, even though previous family-based studies have highlighted CNVs in both familial and sporadic PD. A large proportion of functional and translational PD research today is based on hypotheses provided by the identification of mutations in familial forms of the disorder, including CNVs. Point mutations in the leucine-rich repeat kinase-2 (LRRK2) gene are the most common genetic cause of PD, and CNVs have yet to be identified at this locus. The hypothesized toxic gain of function (kinase activity) would be supported if LRRK2 multiplications were identified. The SNCA story in PD has shown how the identification of a single CNV can provide insights into pathogenic mechanisms and drive therapeutic development. Although it remains unclear what proportion of genetic disease is caused by CNVs, it is likely that many of those affecting risk of PD are still to be identified.