Genomic disorders ten years on
© BioMed Central Ltd 2009
Published: 24 April 2009
It is now becoming generally accepted that a significant amount of human genetic variation is due to structural changes of the genome rather than to base-pair changes in the DNA. As for base-pair changes, knowledge of gene and genome function has been informed by structural alterations that convey clinical phenotypes. Genomic disorders are a class of human conditions that result from structural changes of the human genome that convey traits or susceptibility to traits. The path to the delineation of genomic disorders is intertwined with the evolving technologies that have enabled the resolution of human genome analyses to continue increasing. Similarly, the ability to perform high-resolution human genome analysis has fueled the current and future clinical implementation of such discoveries in the evolving field of genome medicine.
Nevertheless, progress was blocked by both technological and conceptual limitations. Technically, we had no way to view the entire human genome simultaneously at a level of resolution that would enable insights into molecular mechanisms. Conceptually, locus-specific thinking had permeated genetics for over a century, with genocentric (gene-specific) views and base-pair changes as the one form of mutation predominating during the latter half of the 20th century and often blindly biasing genetic thinking to this day. The significant heritability and uncertain molecular basis of common disorders have been approached with such genocentric and 'point mutation' genetic thinking. Even now, we witness this as a recurrent theme, with an excessive focus on genome-wide association studies (GWASs) evaluating ancient SNPs, as contrasted with the potential involvement of recent or new mutations and/or CNV.
These findings crystallized and solidified the concept of genomic disorders. The concept is predicated on two general ideas: firstly, that genomic disorders occur by rearrangements of our genome (the human genome is disordered) and not by DNA-sequence-based changes (that is, not by base-pair changes or by SNPs that cause disease); and secondly, that genome architecture incites genome instability. The original article on the topic stated that structural characteristics of the human genome predispose it to rearrangements that result in human disease traits, and that genome alterations can occur through many mechanisms, including homologous recombination between region-specific low-copy repeats (LCRs). This first mechanism was later termed non-allelic homologous recombination (NAHR). The term NAHR stresses the mechanism by which these particular rearrangements of the human genome occur, including the requirement for homologous substrates and the observations of gene conversion and recombination hotspots. Furthermore, NAHR can cause duplication, deletion and inversion. In contrast, unequal crossing-over usually refers to the segregation of marker genotypes and can lead to duplication or deletion chromosomes [25–27]. Admittedly, almost all of the cases used to bolster the argument for genomic disorders in that article occurred mechanistically by NAHR. However, both Pelizaeus-Merzbacher disease (MIM 312080), caused by genomic duplications, and spinal muscular atrophy (MIM 253300), associated with genomic deletion, were mentioned as other diseases commonly caused by DNA rearrangements that might reflect genomic instability due to unique genome structural features.
The same article also suggested that for disorders caused by genomic deletion rearrangements, the reciprocal duplications might be under-recognized. It provided examples of contiguous-gene-deletion syndromes, such as Williams-Beuren (WBS; MIM 194050), Prader-Willi (MIM 176270), Angelman (MIM 105830) and DiGeorge/velocardiofacial syndromes (DG/VCFS; MIM 188400), that might result from a molecular mechanism similar to that of Smith-Magenis syndrome (SMS), and suggested that the reciprocal duplication, as seen for SMS, may occur. It was also pointed out that patients with such duplications might have different clinical findings and milder phenotypic features than those with deletions, because excess information is usually less detrimental to the organism than deficiency. Therefore, these cases could escape identification through under-ascertainment or be missed by routine cytogenetic analysis, because duplications are technically more challenging to recognize than deletions.
The first predicted reciprocal microduplication syndrome was identified shortly thereafter, the duplication of the genomic interval deleted in SMS (Figure 2), but it would take another 7 years to systematically study and describe the phenotypic variability of what has come to be known as the Potocki-Lupski syndrome (PTLS; MIM 610883). Interestingly, these clinical studies showed that autism, as defined by objective psychological testing, was one feature of PTLS, thus linking the autism trait to a specific CNV. The apparent predicted reciprocal duplications for both the DG/VCFS [30–32] (MIM 608363) and WBS regions [33–35] (MIM 609757) followed rapidly. Reciprocal duplication syndromes are now being defined for almost all microdeletion syndromes in which the deletion is flanked by LCRs/SDs and that occur by NAHR (for example, dup(17)q21.31q21.31 and duplication of the Sotos syndrome (MIM 117550) region); these are often described within the same year [38, 39] or even the same paper [40–44] as the microdeletion syndromes themselves.
After several years of study, the rules for NAHR were elucidated [14, 24]. A hallmark experimental approach based on an understanding and implementation of the new knowledge of the NAHR mechanism was executed by Evan Eichler and colleagues. With a reference human genome sequence in hand [45–47] and the technology of genome-wide array comparative genomic hybridization (aCGH), they designed a research array to interrogate genomic intervals flanked by LCRs greater than 10 kb in length, over 95% sequence identical, in direct orientation, and mapping within 50 kb to 5 Mb of each other [49, 50]. These arrays were then used to assay patient cohorts with idiopathic mental retardation and other birth defects. In this manner, they defined five new microdeletion syndromes (deletions of 17q21.31 [50–54], 17q12, 15q24, 15q13.3 and 1q21.1) within less than 2 years [44, 50, 55–57]. Interestingly, the 17q12 deletion was found to be associated with maturity-onset diabetes of the young [56, 58], a common, albeit genetically heterogeneous, disorder. The latter two deletions, 15q13.3 and 1q21.1, have also been associated with schizophrenia [59–61], whereas 15q13.3 has also been associated with idiopathic seizures [57, 62], mental retardation, autism [63, 64] and behavioral abnormalities with antisocial behavior.
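The array design criteria described above amount to a four-part filter on pairs of LCRs. The following is a minimal sketch of such a filter, not the procedure actually used to design the arrays: the `LCRPair` record layout, the way spacing is measured (between the facing edges of the two copies), and the example coordinates are all assumptions made for illustration. Only the four thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the design criteria in the text.
MIN_LCR_LENGTH_BP = 10_000      # LCRs greater than 10 kb in length
MIN_IDENTITY = 0.95             # over 95% sequence identity
MIN_SPACING_BP = 50_000         # copies map 50 kb ...
MAX_SPACING_BP = 5_000_000      # ... to 5 Mb from each other


@dataclass(frozen=True)
class LCRPair:
    """Two LCR copies on one chromosome (hypothetical record layout)."""
    chrom: str
    proximal_start: int
    proximal_end: int
    distal_start: int
    distal_end: int
    identity: float           # fraction of identical bases, 0.0 to 1.0
    direct_orientation: bool  # True if both copies face the same way

    @property
    def shorter_copy_length(self) -> int:
        return min(self.proximal_end - self.proximal_start,
                   self.distal_end - self.distal_start)

    @property
    def spacing(self) -> int:
        # Assumption: spacing is measured between the facing edges of the copies.
        return self.distal_start - self.proximal_end


def is_candidate_interval(pair: LCRPair) -> bool:
    """Return True if the flanked interval meets all four design criteria."""
    return (
        pair.shorter_copy_length > MIN_LCR_LENGTH_BP
        and pair.identity > MIN_IDENTITY
        and pair.direct_orientation
        and MIN_SPACING_BP <= pair.spacing <= MAX_SPACING_BP
    )


# Made-up coordinates, for illustration only.
pairs = [
    LCRPair("chrA", 1_000_000, 1_020_000, 2_500_000, 2_520_000, 0.98, True),
    LCRPair("chrA", 1_000_000, 1_020_000, 2_500_000, 2_520_000, 0.98, False),
    LCRPair("chrB", 1_000_000, 1_005_000, 2_500_000, 2_505_000, 0.99, True),
    LCRPair("chrC", 1_000_000, 1_020_000, 9_000_000, 9_020_000, 0.97, True),
]
candidates = [p for p in pairs if is_candidate_interval(p)]
print(len(candidates))
```

In this toy set only the first pair passes: the second is in inverted orientation, the third has copies shorter than 10 kb, and the fourth has copies more than 5 Mb apart.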
Many other common and complex disorders are being shown to be due to CNV in some fraction of patients. Thus, genomic disorders encompass not only rare multiple congenital anomaly and mental retardation syndromes, but also common and complex traits, such as autism and schizophrenia, as well as other neurobehavioral phenotypes. For instance, deletion and duplication of 16p11.2 can also cause autism [40, 65]. Duplication and/or deletion CNVs of the human genome have been associated with HIV susceptibility, Crohn's disease [67–69], glomerulonephritis, psoriasis, systemic lupus erythematosus [72, 73], pancreatitis and many other human diseases. Furthermore, animal models for SMS and PTLS show that obesity and several of the objectively assayed behavioral traits can result from a specific gene CNV (that is, the mouse Rai1 gene).
In the past decade, many important basic science questions have also been addressed through studies of genomic disorders. NAHR hotspots [15, 16] had been identified long before allelic homologous recombination (AHR) hotspots were generally appreciated through studies that emerged from the HapMap Project [77, 78]. NAHR and AHR hotspots were found to coincide at the two loci where they were studied: the CMT1A duplication/HNPP deletion locus and the neurofibromatosis type 1 deletion locus at 17q11.2. Fundamental insights into human recombination have been gleaned from studies of genomic rearrangements and genomic disorders [82–86]. Importantly, locus-specific mutation rates for de novo genomic rearrangements that result in CNV were shown both theoretically and experimentally to be 100 to 10,000 times greater than locus-specific mutation rates for de novo SNPs. Interestingly, for rearrangements generated by NAHR at a given locus, deletions can outnumber duplications about 2:1 at selected autosomal loci and about 4:1 on the Y chromosome. Studies of genomic disorders have also provided fundamental insights into human gene [89–93] and genome [94–100] evolution. Such studies were among the first to provide examples of exon accretion by segmental duplication in the evolution of novel gene functions, gene duplication/triplication by de novo CNV formation [92, 93], accumulation of LCRs/SDs during primate genome evolution [98, 99], and LCRs/SDs at evolutionary chromosomal breakpoints [95, 98] and at breaks in synteny between the mouse and human genome [94, 100].
As genome-wide tools became more readily available after the consecutive completion of the draft, reference and finished human haploid genome [45–47], many laboratories shifted their experimental approach from locus-specific and genocentric thinking to genomic studies. As a result, the field of genomic disorders exploded. First, it became apparent that structural variation, including CNV, of the normal human genome was much greater than anticipated [102–105]. In fact, any two individuals vary more as a result of CNV, in terms of the number of base pairs involved, than as a result of all SNPs combined. Moreover, the clinical implementation of genomic techniques enables high-resolution human genome analysis and can resolve CNVs 10, 100 and even 1,000 times smaller than the 3-5 Mb resolution afforded by a clinical G-banded karyotype. This has revolutionized medical genetics and bolstered the emerging field of genome medicine [106–121]. Array-based technologies can resolve pathogenic subtelomeric CNV better than subtelomere fluorescent in situ hybridization can, and can reveal genomic rearrangements in patients with apparently balanced translocations [120, 121]. Moreover, these technologies also enable mosaicism to be detected as a cause of a clinical phenotype [114, 115]. Such mosaicism previously went undetected because selected cell types are stimulated for karyotype analysis [114, 115]. Such techniques have also enabled prenatal detection of submicroscopic abnormalities [122–126] and the detection of de novo genomic rearrangement events causing sporadic birth defects. Submicroscopic duplications are now being revealed as a cause of X-linked mental retardation [128, 129] and other mental retardation syndromes [130, 131]. Many new genomic disorders caused by submicroscopic duplications and deletions continue to be described and are catalogued in the DECIPHER database.
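To make the resolution comparison above concrete, the short sketch below works out the CNV sizes implied by a 10-, 100- and 1,000-fold improvement over a 3-5 Mb karyotype resolution. It is plain arithmetic on the figures quoted in the text; the function and constant names are illustrative only, and the results are not performance figures for any specific array platform.

```python
# Resolution of a clinical G-banded karyotype, as quoted in the text (base pairs).
KARYOTYPE_RESOLUTION_BP = (3_000_000, 5_000_000)
FOLD_IMPROVEMENTS = (10, 100, 1_000)


def resolvable_size_range(fold: int) -> tuple[int, int]:
    """Smallest resolvable CNV size range at a given fold improvement."""
    low, high = KARYOTYPE_RESOLUTION_BP
    return low // fold, high // fold


def format_bp(n: int) -> str:
    """Render a base-pair count in Mb, kb or bp."""
    if n >= 1_000_000:
        return f"{n / 1_000_000:g} Mb"
    if n >= 1_000:
        return f"{n / 1_000:g} kb"
    return f"{n} bp"


for fold in FOLD_IMPROVEMENTS:
    low, high = resolvable_size_range(fold)
    print(f"{fold}x finer: {format_bp(low)} to {format_bp(high)}")
```

The three rows correspond to CNVs of roughly 300-500 kb, 30-50 kb and 3-5 kb respectively.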
In addition to NAHR and FoSTeS/MMBIR, other mechanisms may remain to be uncovered that fulfill the original conception of genomic disorders. Genome architecture may be different for individuals as a result of structural variation within a particular population [50–54, 139], so particular individuals may be more susceptible than others to having either a genomic disorder or an offspring with one. Furthermore, other mechanisms, such as nonhomologous end joining and retrotransposition, can lead to structural variation that results in genomic disorders, and unique genome architectural features other than LCR/SD, such as AT-rich palindromes [141, 142] and non-B DNA conformations [86, 143], can incite genome instability. Systematic studies of disorders that occur by such mechanisms may provide insights into local genome architecture that could potentially influence susceptibility to rearrangement; they may thus delineate the 'rules' for FoSTeS/MMBIR as was done for NAHR.
It was initially not known whether human genomic rearrangements reflected random DNA breaks or perhaps selection/survival of genomic regions that could tolerate the gains and losses of CNV. Over the past decade, our thinking has evolved and we can now speak of specific mechanisms (NAHR, MMBIR/FoSTeS, nonhomologous end joining and retrotransposition), and elucidation of the rules for such mechanisms has enabled powerful predictions that have had a direct clinical impact. We have also learnt some of the 'rules' regarding genome architecture. It seems that each rearrangement mechanism can occur anywhere in the human genome, but one mechanism may be preferred over another at a given locus depending on local genome architecture (for example, LCR/SD or non-B DNA). We have realized that CNVs are as important as SNPs to human mutation and perhaps even more important with regard to human sporadic traits [87, 127]. Whether CNV or SNP is the more favored mutational event at a given locus may again reflect the local genome architecture around that locus. The elucidation of both the mechanisms of CNV formation and how CNVs affect genes to convey phenotypes, whether the latter occurs through altered copy number [75, 146], gene dysregulation or position effect, has to a large extent come from studies of genomic disorders. The clinical phenotype allows the ascertainment of the genomic rearrangement from the population to enable the molecular studies.
The 'rules' for MMBIR/FoSTeS remain to be further defined with respect to the human genome architecture that might stimulate the events [93, 133]. Unquestionably, many more genomic disorders are still to be defined, and many Mendelian and complex traits may be shown to be caused by CNV, rather than by SNPs of a given gene, in selected patients. Thus, a potentially more fruitful and cost-efficient approach to the study of human complex traits may be to examine a few hundred patients for CNV associated with the trait, rather than to perform SNP-based GWASs. Such an approach recently yielded insights into Wolff-Parkinson-White syndrome, a common pre-excitation phenomenon resulting in a characteristic electrocardiographic pattern. Certainly all GWASs should look for CNV and not just focus on SNPs.
Perhaps the most significant findings regarding the human genome that were not anticipated by the human genome project [45–47, 77, 78] were the elucidation of genomic disorders and the discovery of the extent to which we vary from each other genetically as a result of CNV. In fact, the establishment of a reference haploid, rather than diploid, genome truly reflects our naiveté with regard to the importance of CNV for human traits. With further widespread clinical implementation of high-resolution human genome analysis, submicroscopic genomic duplications and deletions will probably be identified at an increasing rate. Potentially, the vast majority of the human genome could be involved in CNV; perhaps more of the genome will be subject to, or tolerate, duplication CNV than deletion, as observed for chromosomal studies [150, 151], and 'reverse genomics' could be used to systematically delineate genomotype-phenotype correlations. The genomic change accompanying a CNV results in a genomotype that may include either more than one gene, or no genes, involved in conveying the specific phenotype and thus is distinct from a genotype.
Such studies will directly address the question: what is the genomic code? This is needed because the genetic code has only addressed the functions of under 2% of the human genome: the coding exons. Systematic analyses of the size, extent and genomic content of CNV and associated phenotypes might lead to a new understanding of 'cis-genetics', the phenotypic consequences of CNV encompassing multiple genes and/or regulatory sequences on one chromosome homolog, as opposed to the 'trans-genetics' focus of Mendelian segregation and transmission of homologous chromosomes. Furthermore, the extent to which human genomic rearrangements occur somatically in mitotic cells is only beginning to be explored [135, 152–156]. Thus, genomic disorders will probably continue to be a fruitful area for ongoing and future research.
AHR: allelic homologous recombination
CMT1A: Charcot-Marie-Tooth disease type 1A
CGH: comparative genomic hybridization
CNV: copy number variation
DECIPHER: database of chromosomal imbalance and phenotype in humans using Ensembl resources
FoSTeS: fork stalling and template switching
GWAS: genome-wide association study
HNPP: hereditary neuropathy with liability to pressure palsies
MMBIR: microhomology mediated break induced replication
NAHR: non-allelic homologous recombination
SNP: single nucleotide polymorphism.
I appreciate the critical reviews of Art Beaudet, Weimin Bi, Claudia Carvalho, Evan Eichler, Matt Hurles, Bernice Morrow, Pawel Stankiewicz and Feng Zhang. I apologize for, and take full responsibility for, omissions of citations given space limitations. Work in my laboratory has been supported by the Charcot-Marie-Tooth Association, the Muscular Dystrophy Association, the March of Dimes, the Texas Children's Hospital General Clinical Research Center, Baylor College of Medicine Mental Retardation Research Center, Baylor Intellectual and Developmental Disabilities Research Center, and The National Institutes of Health (National Institute of Neurological Disorders and Stroke, R01 NS27042, National Institute of Child Health and Human Development, P01 HD39420, National Eye Institute, R01 EY1325, National Cancer Institute, P01 CA75719, National Institute of Dental and Craniofacial Research, R01 DE015210).