Trinucleotide repeats: triggers for genomic disorders?
© BioMed Central Ltd 2010
Published: 30 April 2010
Among the various sequence repeats that shape the human genome, trinucleotide repeats have attracted special interest as a result of their involvement in a class of human genetic disorders known as triplet repeat expansion diseases. Recently, long TGG repeat tracts were shown to be implicated in a genomic disorder resulting from chromosome 14q32.2 deletion. Various different mechanisms might trigger this deletion, and looking at the problem from a structural biology perspective may help. Deeper insight into repeated sequences and their features may shed light on the mechanisms involved in this microdeletion and similar genomic rearrangements.
Genomic repeats and human diseases
At least a third of the human genome consists of repetitive sequences of various types, including large segmental duplications, also known as low-copy-number repeats (LCRs), long and short interspersed transposon-derived elements (LINEs and SINEs) and tandem repeats. The tandemly repeated sequences encompass satellites (with repeated units longer than 100 bp), minisatellites (with units between 10 bp and 100 bp) and microsatellites (with a repeated motif shorter than 10 bp). The latter, also known as short tandem repeats (STRs) or simple sequence repeats, account for about 3% of the genome. Most of the STR tracts occur in intergenic regions and introns, but a fraction of them, predominantly trinucleotide repeats (TNRs), also reside in exons and may be beneficial, neutral or deleterious. Among the beneficial roles of TNRs, which contribute about 0.1% to all STR sequences and are often polymorphic in length, is their potential to modulate cellular processes, including transcription, splicing and translation. These TNRs include repeats of CGG, CAG and AGG, which are overrepresented in human exons. On the other hand, AAT, AAC and AAG are probably disadvantageous as they are negatively selected in exons. TNR sequences undergo mutations at a very high frequency, and this may increase disease risk or trigger disease in specific conditions [6, 7].
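The size classes of tandem repeats described above can be written down as a small lookup. This is only an illustrative sketch: the function name `classify_tandem_repeat` is hypothetical, and treating units of exactly 10 bp and exactly 100 bp as minisatellites is an assumption about how the stated boundaries should be read.

```python
def classify_tandem_repeat(unit_length_bp: int) -> str:
    """Map the length of the repeated unit (in bp) to a tandem repeat class.

    Thresholds follow the text: satellites > 100 bp, minisatellites between
    10 bp and 100 bp, microsatellites (STRs) < 10 bp. Including both
    boundary values in the minisatellite class is an assumption.
    """
    if unit_length_bp < 1:
        raise ValueError("unit length must be a positive number of base pairs")
    if unit_length_bp > 100:
        return "satellite"
    if unit_length_bp >= 10:
        return "minisatellite"
    return "microsatellite"


# A trinucleotide repeat has a 3 bp unit and so falls among the microsatellites.
tnr_class = classify_tandem_repeat(3)
```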
Research on the pathogenesis of trinucleotide repeat expansion diseases (TREDs) includes studies on toxic RNA that triggers alterations of alternative splicing in numerous genes linked to the clinical symptoms of myotonic dystrophy type 1 (DM1) [12, 13], and studies on toxic poly-Q proteins that impair many cellular functions. Research on repeat instability mechanisms is also very active, and there are still many challenges ahead [7, 8]. The consensus opinion at present is that several processes, including replication, recombination, DNA repair and transcription, contribute to repeat instability and that the formation of unusual non-B-DNA structures by the repeats is at the heart of the expansion processes [7, 8]. When classified by the size of the underlying mutation, TREDs lie between the many genetic diseases resulting from small base substitutions, deletions and insertions and the class of diseases known as genomic disorders, which are caused by deletions or insertions of tens of thousands to several million base pairs. The group of genomic disorders with identified mutation mechanisms is growing steadily, with major mechanisms including non-allelic homologous recombination (NAHR), non-homologous end joining (NHEJ) and replication fork stalling and template switching (FoSTeS).
TGG repeats trigger recurrent microdeletion
A recently published article shows a link between a TNR sequence and a human genomic disorder related to OMIM #608149. The authors demonstrated that the recurrent 1.11 Mb microdeletion from the long arm of paternal chromosome 14 (14q32.2) is catalyzed by long tracts of interrupted TGG repeats (approximately 500 bp in size) that flank the deletion on both sides and share 88% sequence similarity (Figure 1d). An identical heterozygous deletion was found in two unrelated patients diagnosed with several clinical phenotypes (such as growth retardation, hypotonia, precocious puberty and mental retardation) characteristic of maternal uniparental disomy (UPD(14)mat). UPD is defined by the inheritance of two copies of a chromosome from only one parent, the mother in this case, and is related to parent-specific imprinting of some genes. The deleted 14q32.2 region harbors 13 protein-coding genes, small nucleolar RNA (snoRNA) and microRNA loci (Figure 1d). Two of these genes, Delta-like homolog 1 (DLK1) and retrotransposon-like 1 (RTL1), are maternally imprinted (paternally expressed), which explains several disease symptoms.
Structural insight into TGG repeats
A closer inspection of the nucleotide sequences of the TGG repeat segments (Figure 2a) may shed more light on the likelihood of the proposed mechanisms. Both segments (A and B in Figure 2a) contain approximately 60 repeat interruptions (mainly single nucleotide substitutions). The longest uninterrupted TGG repeat is 15 repeat units, and 12 tracts are at least 8 units. Pure repeat tracts of this length probably show only moderate repeat number polymorphism. The repeat interruptions are mostly TGA, TAG and AGG triplets in one repeat tract and TGA, TGT and TAC in the other (Figure 2a). The interrupting triplets may prevent repeated sequences from expansion, which is known to be the case for interrupted CGG and CAG repeats in genes implicated in FXS, SCA1 and SCA2. Repeat expansions in these genes require the previous loss of repeat interruptions, which are thought to inhibit inter-strand slippage and to suppress intra-strand interaction [7, 19]. Bena et al. consider the possibility that the TGG repeat tracts are unstable. They demonstrate that TGG repeats are, on average, much longer than any other TNR in the genome. The analysis we have performed using the same constraints (our unpublished work) shows the frequency of TNR tracts in the genome and reveals that AGG and TGG repeats most frequently form the longest tracts of at least 100 units (300 bp), which may facilitate the NAHR mechanism (Figure 2c). Considering only pure repeat tracts of at least 8 units, which may be implicated in repeat instability, the total number of TGG repeats in the genome is similar to that of AGG and much lower than that of TAA and CAA repeats (Figure 2c).
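The kind of inspection described above (finding pure TGG tracts of at least 8 units and tallying the interrupting triplets) can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used for Figure 2: the function names are hypothetical, the toy sequence is invented, and the interruption count assumes that interruptions are substitutions only, so that the triplet reading frame is preserved across the whole segment.

```python
import re
from collections import Counter


def pure_tracts(seq, motif="TGG", min_units=8):
    """Return (start, units) for every uninterrupted run of `motif`
    that is at least `min_units` repeat units long."""
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_units},}}")
    return [(m.start(), len(m.group()) // len(motif))
            for m in pattern.finditer(seq.upper())]


def interrupting_triplets(seq, motif="TGG"):
    """Count in-frame triplets that differ from `motif`.

    Assumption: the segment starts in frame and contains substitutions
    only; an insertion or deletion would shift the frame and make this
    count meaningless.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    return Counter(t for t in triplets if t != motif)


# Invented toy segment: 9 pure units, a TGA interruption, 3 units,
# an AGG interruption, then 15 pure units.
toy = "TGG" * 9 + "TGA" + "TGG" * 3 + "AGG" + "TGG" * 15

tracts = pure_tracts(toy)                      # runs of at least 8 units
longest = max(units for _, units in tracts)    # longest uninterrupted run
interruptions = interrupting_triplets(toy)
```

On a real repeat segment, the same two functions would give the number of tracts of at least 8 units, the longest pure run and the spectrum of interrupting triplets, provided the frame assumption holds.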
Taking the structural perspective, the repeated sequences within DNA become transiently single-stranded during DNA replication, recombination, repair and transcription, which allows non-B-DNA structure formation and various downstream effects. The repeat interruptions present within the TGG repeats will no doubt influence their ability to form G-quadruplexes and would be likely to diversify the G-quadruplex structures. It is likely that there will be a heterogeneous mixture of structural variants formed by the repeated sequence and their core elements may resemble the G-quadruplex structures described for AGG repeats (Figure 2d). Notably, the longest repeat tracts of at least 100 units consist of AGG and TGG repeats (Figure 2c), which are capable of forming G-quadruplex structures. For both of these repeat types, the presence of just four repeats is sufficient to form minimal G-quadruplex structures (Figure 2d) that can stack on each other and become more stable. One lesson that can be taken from our analysis of the putative mechanisms underlying the 14q32.2 deletion is that deeper insight into the features of repeated sequences may be needed to identify and better understand the mechanism involved.
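The statement that four repeats suffice for a minimal G-quadruplex suggests a rough way of counting how many such minimal units a tract could host. The sketch below is a heuristic under stated assumptions, not a structure prediction: the name `four_repeat_blocks` is hypothetical, AGG and TGG units are treated as interchangeable within a run (an assumption, made because both keep the GG dinucleotide), and dividing the run length by four gives only an upper bound on non-overlapping four-repeat blocks.

```python
import re

# Assumption: any mix of AGG and TGG units counts towards a run, while other
# interrupting triplets (for example TGA, TGT or TAC) break it.
G_UNIT_RUN = re.compile(r"(?:[AT]GG){4,}")


def four_repeat_blocks(seq):
    """Return (start, units, blocks) for each run of at least four
    consecutive AGG/TGG units, where `blocks` is the number of
    non-overlapping four-repeat blocks that fit into the run."""
    results = []
    for match in G_UNIT_RUN.finditer(seq.upper()):
        units = len(match.group()) // 3
        results.append((match.start(), units, units // 4))
    return results


# Invented toy segment: 3 units, TGA, then 5 TGG + AGG + 2 TGG, TAC, 2 units.
toy = "TGG" * 3 + "TGA" + "TGG" * 5 + "AGG" + "TGG" * 2 + "TAC" + "TGG" * 2
blocks = four_repeat_blocks(toy)
```

Only the middle run of the toy segment is long enough to qualify; the flanking runs of three and two units fall below the four-repeat minimum.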
The tip of the iceberg or a scarce phenomenon?
Whatever the exact mechanism implicated in the 14q32.2 deletion, the involvement of TGG repeat tracts in this deletion cannot be questioned. One important issue that needs to be addressed now is how general this kind of mechanism could be. If NAHR is in operation, similar TNR-mediated genomic rearrangements should be predictable, as was shown earlier for LCR sequences. If stable structure is important, the analysis can be narrowed to repeats having the potential to form G-quadruplex (TGG, AGG and CGG) and hairpin (CNG, GAC and GTC) structures [23, 24]. If repeat instability is essential, more attention needs to be paid to the nature, density and localization of the repeat interruptions. Genome-wide copy-number variation discovery studies may provide important information on this intriguing question.
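Narrowing a genome-wide search to structure-forming repeats amounts to a lookup from motif to structural class. The following sketch encodes only the motif lists given above; the function names are hypothetical, CNG is expanded to CAG, CCG, CGG and CTG, and treating cyclic rotations of a motif (for example GGT and TGG) as the same repeat is an assumption about naming, since a tandem tract reads the same from any starting frame.

```python
G_QUADRUPLEX_MOTIFS = {"TGG", "AGG", "CGG"}
# CNG expanded over N = A, C, G, T, plus GAC and GTC.
HAIRPIN_MOTIFS = {"CAG", "CCG", "CGG", "CTG", "GAC", "GTC"}


def rotations(motif):
    """All cyclic rotations of a motif, e.g. TGG -> {TGG, GGT, GTG}."""
    return {motif[i:] + motif[:i] for i in range(len(motif))}


def structure_classes(motif):
    """Return the structural classes listed in the text for a TNR motif.

    The result can be empty (no class listed) or contain both classes
    (CGG appears in both lists).
    """
    forms = rotations(motif.upper())
    classes = set()
    if forms & G_QUADRUPLEX_MOTIFS:
        classes.add("G-quadruplex")
    if forms & HAIRPIN_MOTIFS:
        classes.add("hairpin")
    return classes


tgg_classes = structure_classes("TGG")
cgg_classes = structure_classes("CGG")
```

The sketch does not consider the complementary strand; whether a CCA tract should be scored through its TGG complement is a separate design decision.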
Abbreviations
DM1: myotonic dystrophy type 1
FoSTeS: replication fork stalling and template switching
FXS: fragile X syndrome
NAHR: non-allelic homologous recombination
NHEJ: non-homologous end joining
STR: short tandem repeat
TRED: trinucleotide repeat expansion disease
Acknowledgements
This work was supported by the Ministry of Science and Higher Education, Grant Nos. PBZ-MNiI-2/1/2005, N301-112-32/3910, N302-278937, N302-260938, and Operational Program 'Innovative economy' POIG.01.03.01-00-098/08.
References
1. Jasinska A, Krzyzosiak WJ: Repetitive sequences that shape the human transcriptome. FEBS Lett. 2004, 567: 136-141. doi:10.1016/j.febslet.2004.03.109
2. Voineagu I, Freudenreich CH, Mirkin SM: Checkpoint responses to unusual structures formed by DNA repeats. Mol Carcinog. 2009, 48: 309-318. doi:10.1002/mc.20512
3. Kashi Y, King DG: Simple sequence repeats as advantageous mutators in evolution. Trends Genet. 2006, 22: 253-259. doi:10.1016/j.tig.2006.03.005
4. Kozlowski P, de Mezer M, Krzyzosiak WJ: Trinucleotide repeats in human genome and exome. Nucleic Acids Res. 2010, doi:10.1093/nar/gkq127
5. Hurles M: How homologous recombination generates a mutable genome. Hum Genomics. 2005, 2: 179-186.
6. Hannan AJ: Tandem repeat polymorphisms: modulators of disease susceptibility and candidates for 'missing heritability'. Trends Genet. 2010, 26: 59-65. doi:10.1016/j.tig.2009.11.008
7. Mirkin SM: Expandable DNA repeats and human disease. Nature. 2007, 447: 932-940. doi:10.1038/nature05977
8. Lopez Castel A, Cleary JD, Pearson CE: Repeat instability as the basis for human diseases and as a potential target for therapy. Nat Rev Mol Cell Biol. 2010, 11: 165-170. doi:10.1038/nrm2854
9. Orr HT, Zoghbi HY: Trinucleotide repeat disorders. Annu Rev Neurosci. 2007, 30: 575-621. doi:10.1146/annurev.neuro.29.051605.113042
10. Ranum LP, Day JW: Pathogenic RNA repeats: an expanding role in genetic disease. Trends Genet. 2004, 20: 506-512. doi:10.1016/j.tig.2004.08.004
11. Zoghbi HY, Orr HT: Pathogenic mechanisms of a polyglutamine-mediated neurodegenerative disease, spinocerebellar ataxia type 1. J Biol Chem. 2009, 284: 7425-7429. doi:10.1074/jbc.R800041200
12. Shin J, Charizanis K, Swanson MS: Pathogenic RNAs in microsatellite expansion disease. Neurosci Lett. 2009, 466: 99-102. doi:10.1016/j.neulet.2009.07.079
13. Cooper TA, Wan L, Dreyfuss G: RNA and disease. Cell. 2009, 136: 777-793. doi:10.1016/j.cell.2009.02.011
14. Zhang F, Gu W, Hurles ME, Lupski JR: Copy number variation in human health, disease, and evolution. Annu Rev Genomics Hum Genet. 2009, 10: 451-481. doi:10.1146/annurev.genom.9.081307.164217
15. Bena F, Gimelli S, Migliavacca E, Brun-Druc N, Buiting K, Antonarakis SE, Sharp AJ: A recurrent 14q32.2 microdeletion mediated by expanded TGG repeats. Hum Mol Genet. 2010, doi:10.1093/hmg/ddq075
16. Jones C, Mullenbach R, Grossfeld P, Auer R, Favier R, Chien K, James M, Tunnacliffe A, Cotter F: Co-localisation of CCG repeats and chromosome deletion breakpoints in Jacobsen syndrome: evidence for a common mechanism of chromosome breakage. Hum Mol Genet. 2000, 9: 1201-1208. doi:10.1093/hmg/9.8.1201
17. Gu W, Zhang F, Lupski JR: Mechanisms for human genomic rearrangements. Pathogenetics. 2008, 1: 4. doi:10.1186/1755-8417-1-4
18. Rozanska M, Sobczak K, Jasinska A, Napierala M, Kaczynska D, Czerny A, Koziel M, Kozlowski P, Olejniczak M, Krzyzosiak WJ: CAG and CTG repeat polymorphism in exons of human genes shows distinct features at the expandable loci. Hum Mutat. 2007, 28: 451-458. doi:10.1002/humu.20466
19. Pearson CE, Eichler EE, Lorenzetti D, Kramer SF, Zoghbi HY, Nelson DL, Sinden RR: Interruptions in the triplet repeats of SCA1 and FRAXA reduce the propensity and complexity of slipped strand DNA (S-DNA) formation. Biochemistry. 1998, 37: 2701-2708. doi:10.1021/bi972546c
20. Lin Y, Dent SY, Wilson JH, Wells RD, Napierala M: R loops stimulate genetic instability of CTG.CAG repeats. Proc Natl Acad Sci USA. 2010, 107: 692-697. doi:10.1073/pnas.0909740107
21. Matsugami A, Okuizumi T, Uesugi S, Katahira M: Intramolecular higher order packing of parallel quadruplexes comprising a G:G:G:G tetrad and a G(:A):G(:A):G(:A):G heptad of GGA triplet repeat DNA. J Biol Chem. 2003, 278: 28147-28153. doi:10.1074/jbc.M303694200
22. Sharp AJ, Locke DP, McGrath SD, Cheng Z, Bailey JA, Vallente RU, Pertz LM, Clark RA, Schwartz S, Segraves R, Oseroff VV, Albertson DG, Pinkel D, Eichler EE: Segmental duplications and copy-number variation in the human genome. Am J Hum Genet. 2005, 77: 78-88. doi:10.1086/431652
23. Sobczak K, Michlewski G, de Mezer M, Kierzek E, Krol J, Olejniczak M, Kierzek R, Krzyzosiak WJ: Structural diversity of triplet repeat RNAs. J Biol Chem. 2010, doi:10.1074/jbc.M109.078790
24. Bacolla A, Larson JE, Collins JR, Li J, Milosavljevic A, Stenson PD, Cooper DN, Wells RD: Abundance and length of simple repeats in vertebrate genomes are determined by their structural properties. Genome Res. 2008, 18: 1545-1553. doi:10.1101/gr.078303.108
25. Conrad DF, Pinto D, Redon R, Feuk L, Gokcumen O, Zhang Y, Aerts J, Andrews TD, Barnes C, Campbell P, Fitzgerald T, Hu M, Ihm CH, Kristiansson K, Macarthur DG, Macdonald JR, Onyiah I, Pang AW, Robson S, Stirrups K, Valsesia A, Walter K, Wei J, The Wellcome Trust Case Control Consortium, Tyler-Smith C, Carter NP, Lee C, Scherer SW, Hurles ME: Origins and functional impact of copy number variation in the human genome. Nature. 2009, 464: 704-712. doi:10.1038/nature08516