The human genome and structural variation
One of the most intriguing findings to follow the release of the reference genome sequence from the Human Genome Project has been the realization of how extensively individual genomes differ, not only in single nucleotide polymorphisms but also in large deletions, duplications and other rearrangements, a phenomenon now referred to as copy number variation. Some 3,000 protein-coding genes (around 10% of the human gene complement) are known to be associated with copy number variants (CNVs), so two unrelated human genomes may differ quite dramatically in gene content. Indeed, it is becoming increasingly evident that CNVs are a major source of genetic variation, contributing not only to phenotypic traits but also to inherited disease. A growing number of reports support a role for CNVs in the etiology of complex genomic disorders, including the Smith-Magenis and Potocki-Lupski syndromes, Charcot-Marie-Tooth disease 1A (CMT1A) and hereditary neuropathy with liability to pressure palsies (HNPP), Sotos syndrome, Williams-Beuren syndrome, Pelizaeus-Merzbacher disease and autism [1].
In light of these findings, the mechanisms underlying CNV formation are of central importance from both a theoretical and a clinical standpoint. Analyses of CNVs in humans and across lines of Drosophila melanogaster have revealed that the sites of chromosomal rearrangements are characterized either by stretches of homology or by little to no homology at all, suggesting that both non-allelic homologous recombination and homology-independent repair are likely to lead to CNV formation. A study [2] also showed that DNA sequences flanking CNV breakpoints often contain repetitive sequence motifs known to form alternative DNA structures, or non-B DNA (non-canonical DNA conformations, including left-handed Z-DNA, triplexes, G-quadruplexes, cruciforms and slipped structures). This conclusion is important because it implies that DNA structure, rather than sequence per se, may predispose to chromosomal breakage and subsequent repair, thereby promoting CNV formation. These results [2] extend earlier observations made by a number of laboratories, including our own, using different analyses and model systems [3, 4]. Recent molecular analyses of novel CNVs, such as those in the NRXN1 region associated with autism spectrum and other neurodevelopmental disorders [5] and non-recurrent microdeletions of the FOXL2 gene associated with blepharophimosis-ptosis-epicanthus-inversus syndrome, also support these conclusions.