Emerging patterns of genetic overlap across autoimmune disorders

Most of the recently identified autoimmunity loci are shared among multiple autoimmune diseases. The pattern of genetic association with autoimmune phenotypes varies, suggesting that certain subgroups of autoimmune diseases are likely to share etiological similarities and underlying mechanisms of disease. In this review, we summarize the major findings from recent studies that have sought to refine genotype-phenotype associations in autoimmune disease by identifying both shared and distinct autoimmunity loci. More specifically, we focus on information from recent genome-wide association studies of rheumatoid arthritis, ankylosing spondylitis, celiac disease, multiple sclerosis, systemic lupus erythematosus, type 1 diabetes and inflammatory bowel disease. Additional work in this area is warranted given both the opportunity it provides to elucidate pathogenic mechanisms in autoimmunity and its potential to inform the development of improved diagnostic and therapeutic tools for this group of complex human disorders.

showing similarity to both groups and Crohn's disease to neither.
Early candidate gene studies, particularly those focusing on genes within the human leukocyte antigen (HLA) region [11], also supported the notion of shared 'autoimmunity' loci. Strong support for genetic loci that are shared across autoimmune disorders and located outside the HLA region has been demonstrated for several loci encoding proteins that have immune-mediating functions, including cytotoxic T-lymphocyte antigen 4 (CTLA4; a member of the immunoglobulin superfamily that is expressed on the surface of helper T cells and transmits an inhibitory signal to T cells), protein tyrosine phosphatase non-receptor type 22 (PTPN22; which is expressed primarily in lymphoid tissue and plays a role in the regulation of T-cell receptor signaling pathways), and tumor necrosis factor (TNF) alpha-induced protein 3 (TNFAIP3; which inhibits NF-kappa B activation as well as TNF-mediated apoptosis) [12-14]. Many of the recently identified AID loci involve pathways related to B-cell or T-cell activation and differentiation, innate immunity, and regulation of cytokine signaling [15,16]. Certain loci, however, appear to be associated with specific autoimmune diseases. For example, variants in NOD2 (nucleotide-binding oligomerization domain containing 2) and ATG16L1 (ATG16 autophagy-related 16-like 1) have been associated with defective autophagy in dendritic cells from Crohn's disease patients [17].
A theme emerging from recent genetic studies of AIDs is the surprising degree of overlap between genetic loci for this phenotypically diverse group of disorders. Several recent reviews have summarized emerging work that identifies both genetic loci that are shared across the spectrum of autoimmune disease and the biologic pathways whose involvement is implicated by these shared loci [15,16,18,19]. For example, Zhernakova et al. [18] completed a detailed review of 16 genome-wide association (GWA) or non-synonymous SNP scans for 11 immune-related disorders that were published in 2007 or 2008. Their analysis underscores the extensive sharing of genetic risk loci across this spectrum of disorders, and the fact that most of these loci can be mapped to a few shared biologic pathways, including those related to innate immunity, immune signaling, T-cell differentiation, cytokines and chemokines.
The analysis by Zhernakova et al. [18] also suggests that the degree to which each of these disorders is characterized by shared (rather than unique) susceptibility loci varies substantially, from all loci shared (for RA) to 50% or more shared for celiac disease, psoriasis, MS, SLE, T1D, AS and AITD [18]. The two types of IBD examined, Crohn's disease and ulcerative colitis (UC), shared substantial numbers of loci between them but relatively few with the other AIDs studied. The extent to which the T-cell differentiation, immune cell signaling, innate immunity and TNF signaling, or other pathways are implicated for each of these disorders varies, but overall the analysis by Zhernakova et al. [18] suggests that most of these pathways contribute (to a variable degree) to most of these disorders.
In this review we focus on recent studies that have sought to refine genotype-phenotype associations by comparing susceptibility loci between specific AIDs. We concentrate on RA, AS, celiac disease, MS, SLE, T1D and IBD. Table 1 summarizes these AIDs in terms of their prevalence in the population and major phenotypic features. In particular, we focus on comparative studies that use GWA results to distinguish genetic variants that are specific to individual AIDs from those that are shared among multiple AIDs. We also summarize the results of a recently published cross-phenotype meta-analysis that uses genetic association results to highlight four main AID clusters. A detailed understanding of these shared and distinct genetic loci provides insight into fundamental etiologic mechanisms in autoimmune disease. It has the potential to inform the choice of current therapies and the development of novel targeted therapies and other interventions that could improve our ability to manage these complex human disorders.

Comparative studies of GWA data to identify shared and distinct genetic loci for AIDs
GWA and other recent genetic studies of AIDs have been remarkably successful in terms of the number of genetic loci that have been identified. For example, more than 70 genetic loci have now been firmly established as susceptibility factors for Crohn's disease [20], and more than 30 loci that contribute to the risk of RA and/or SLE have been identified [21,22]. Following the completion of many large GWAS of individual AIDs, a number of studies have sought to refine the specificity of loci that are associated with AIDs. More specifically, these recent studies have examined risk loci identified for one AID in case-control collections that were developed for another AID, in order to distinguish risk loci that are shared from those that are distinct for the AIDs being compared. In Table 2, the specific disease comparison studies that are discussed in this review are listed alongside both the major shared (and unique) loci that have been identified and the biologic pathways or mechanisms implicated by these analyses. Table 3 and Figure 1 further highlight the patterns of shared AID risk loci and their associated pathways revealed in these studies. In a later section, we summarize very recent work by Cotsapas et al. [23] that addresses these relationships in an analytically powerful way. Specifically, they utilized GWA data generated for seven AIDs and performed a cross-phenotype meta-analysis (CPMA) to distinguish genetic variants that are common to all seven AIDs from variants that are common to some but not all of these AIDs and from variants that are specific to one AID.

Type 1 diabetes
Several recent studies have compared the pattern of genetic association between T1D and other AIDs, including celiac disease [24], RA [25] and IBD [26]. On the basis of the association of both T1D and celiac disease with HLA class II loci, Smyth et al. [24] evaluated both the association of eight non-HLA celiac disease risk loci with T1D and of 18 T1D loci with risk of celiac disease in very large samples of patients, controls and families. Their analysis revealed that seven loci were common to these two AIDs, including RGS1 on chromosome 1q31, IL18RAP on chromosome 2q12, TAGAP on chromosome 6q25, a 32-bp insertion-deletion variant on chromosome 3p21, PTPN2 on chromosome 18p11, CTLA4 on chromosome 2q33, and SH2B3 on chromosome 12q24. Further, the associated alleles for IL18RAP and TAGAP confer risk of celiac disease but protection against T1D. Non-shared loci include the T1D risk loci INS (chromosome 11p15), IL2RA (chromosome 10p15), and PTPN22 (chromosome 1p13) and the celiac disease risk loci IL12A (chromosome 3q25) and LPP (chromosome 3q28).
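The comparison described above can be expressed as simple set operations. The following is a minimal sketch, not the analysis performed by Smyth et al. [24]: it only encodes the loci named in this paragraph, and the function name compare_loci and the label "3p21 indel" are illustrative assumptions.

```python
# Loci named in the text for the T1D versus celiac disease comparison.
# "3p21 indel" is a shorthand label (an assumption) for the 32-bp
# insertion-deletion variant on chromosome 3p21.
SHARED = {"RGS1", "IL18RAP", "TAGAP", "3p21 indel", "PTPN2", "CTLA4", "SH2B3"}
t1d_loci = SHARED | {"INS", "IL2RA", "PTPN22"}
celiac_loci = SHARED | {"IL12A", "LPP"}

# Loci whose associated alleles act in opposite directions in the two diseases
# (risk of celiac disease, protection against T1D), as stated in the text.
opposing_effects = {"IL18RAP", "TAGAP"}


def compare_loci(first, second, opposing):
    """Partition two sets of risk loci into shared and disease-specific groups."""
    shared = first & second
    return {
        "shared_same_direction": sorted(shared - opposing),
        "shared_opposing": sorted(shared & opposing),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }


result = compare_loci(t1d_loci, celiac_loci, opposing_effects)
for label, loci in result.items():
    print(f"{label}: {', '.join(loci)}")
```

Keeping the direction of effect separate from membership in the shared set makes it easy to see that a locus can be shared between two diseases while still acting in opposite directions.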
Eyre et al. [25] extended this work by investigating genetic overlap between T1D and celiac disease risk loci and RA. They studied eight celiac disease risk loci and six T1D risk loci in a large sample of RA patients and control individuals. Although they found significant evidence for association of the TAGAP locus (which is associated with both celiac disease and T1D but with opposing effects) with RA and modest evidence of association between the C1QTNF6 T1D risk locus and RA, overall their investigation revealed little evidence of association between celiac disease and T1D risk loci and RA, suggesting that RA might be more genetically distinct.
Finally, Wang et al. [26] studied GWA data from large collections of IBD patients (Crohn's disease and UC), T1D patients and control individuals of European ancestry to identify shared susceptibility loci. Although they identified a number of overlapping susceptibility loci among these diseases, their results were notable for the frequency with which risk alleles for one disease appear to provide protection against another. They interpret these data as indicating that many AID risk loci could be under balancing selection and that variants that have opposing effects on different AIDs might contribute to the maintenance of common susceptibility alleles in human populations.

Inflammatory bowel disease
Although the two types of IBD, Crohn's disease and UC, differ in several important ways, such as the depth and location of inflammation in the gastrointestinal tract (Table 1), the clustering of these diseases within certain families and their overlapping risk loci support their etiologic relationship. Thus, these diseases have often been considered together in GWA and other genetic studies.
In addition to the aforementioned investigation of IBD and T1D [26], other recent work has investigated genetic overlap between IBD and other AIDs, including AS, celiac disease, psoriasis, SLE, RA and MS [20,27-29]. As mentioned previously, a large number of loci that

Rheumatoid arthritis and juvenile idiopathic arthritis
Several recent studies have investigated genetic risk loci that are shared between RA or juvenile idiopathic arthritis (JIA) and other AIDs, including SLE, T1D, celiac disease, IBD and MS [25,30-33]. Studies investigating

Control of cell division cycles and regulation of cyclin-dependent kinases
CCR3 (Chemokine receptor 3): Binds to eotaxin, eotaxin-3, MCP-3, MCP-4, RANTES and MIP-1
CCR7 (Chemokine receptor 7): Receptor for the MIP-3-beta chemokine
CD40: Member of the TNF-receptor superfamily; receptor for CD40L
CD226: Receptor involved in intercellular adhesion, lymphocyte signaling, cytotoxicity and lymphokine secretion mediated by cytotoxic T-lymphocyte (CTL) and NK cells
COG6 (Component of oligomeric Golgi complex 6): Required for normal Golgi function
CTLA4 (Cytotoxic T-lymphocyte-associated protein 4): Negative regulator of T-cell responses
HERC2 (Hect domain and RLD 2): E3 ubiquitin-protein ligase
FCGR2A (Fc fragment of IgG): Binds to the Fc region of immunoglobulins gamma
ICOSLG (Inducible T-cell co-stimulator ligand): Co-stimulatory signal for T-cell proliferation and cytokine secretion. Ligand for the T-cell-specific cell surface receptor ICOS
IKZF1 (IKAROS family zinc finger 1): Transcriptional regulator of hematopoietic cell differentiation
IL10: Inhibits the synthesis of a number of cytokines, including interferon-gamma, IL-2, IL-3, TNF and GM-CSF; produced by activated macrophages and by helper T-cells
IL12A: Cytokine that can act as a growth factor for activated T and NK cells
IL18RAP: NFkB and JNK activation (IL-18 dependent)
IL2/IL21: Cytokines required for T-cell or B-cell proliferation
IL23R: Binds IL23 and mediates T-cell and NK cell stimulation
IL26: Activates STAT1 and STAT3, MAPK1/3 (ERK1/2), JUN and AKT
IL27: Broad functions in adaptive immunity
IRF5 (Interferon regulatory factor 5): Transcription factor involved in virus-mediated activation of interferon
IRF8 (Interferon regulatory factor 8): Plays a negative regulatory role in cells of the immune system
IRGM (Immunity-related GTPase family, M): Might play a role in the innate immune response by regulating autophagy formation in response to intracellular pathogens
KIF5A (Kinesin family member 5A): Microtubule-dependent motor required for intracellular protein transport
LPP (LIM-domain-containing preferred translocation partner in lipoma): Role in cell shape and motility
MMEL1 (Membrane metallo-endopeptidase-like 1): Metalloprotease involved in sperm function
TNFSF14: Activates NFkB and stimulates T-cell proliferation
ORMDL3 (ORM1-like 3): Might indirectly regulate endoplasmic reticulum-mediated Ca2+ signaling
PLCL1 (Phospholipase C-like 1): Involved in an inositol phospholipid-based intracellular signaling cascade
PRKCQ (Protein kinase C theta): TCR-mediated T-cell activation
PTPN2 (Protein tyrosine phosphatase, non-receptor type 2): Lymphocyte cell signaling
PTPN22: Involved in the TCR signaling pathway
PUS10 (Pseudouridylate synthase 10): Post-transcriptional nucleotide modification of RNAs

including HLA, PTPN22, STAT4 and 6q23, were not studied. None of the nine SLE risk variants studied was significantly associated with RA, suggesting that the genetic contribution to these two AIDs is relatively distinct, although it is also possible that this study was not powered sufficiently to identify shared risk loci.
Coenen et al. [32] investigated the extent of genetic overlap between RA and celiac disease. Specifically, they evaluated 11 RA and 11 celiac disease risk loci among Dutch RA patients, celiac disease patients and control individuals. Their analyses revealed six risk loci that were shared by RA and celiac disease, which included the TNFAIP3, IL2/IL21, SH2B3, LPP, MMEL1/TNFRSF14 and PFKFB3/PRKCQ loci. Overall, the shared loci supported the importance of both adaptive and innate immunity in susceptibility to these two disorders.

Systemic lupus erythematosus
In addition to the studies mentioned above that have compared the overlap between SLE and RA risk loci [30,31], recent work by Ramos et al. [34] suggests only modest overlap between SLE and other AID risk loci. More specifically, these authors evaluated 446 genetic variants that had previously been associated with one or more of 17 AIDs to determine which loci were significantly associated with SLE susceptibility. A number of AID loci, including FCGR2A, IL10, IRGM, TNFAIP3, IKZF1, IRF5, BLK, IRF8, and UBE2L3, were associated with SLE and one or more other AIDs. However, many SLE loci, including ITGAM, TNFSF4, PTTG1, PHRF1, WDFY4 and BANK1, were associated with other AIDs only weakly, if at all.

Multiple sclerosis (MS)
Multiple sclerosis is characterized by very strong associations with HLA class II variants, but some of the other genes that are strongly associated with multiple AIDs, such as PTPN22, do not appear to contribute substantially to the risk of MS. Nonetheless, emerging evidence from GWAS supports overlap of MS-associated genes with genes that have been linked to a broad spectrum of AIDs [35]. For example, work by Alcina et al. [36] in which 12 genetic variants previously associated with other AIDs were studied in a large collection of Spanish MS cases and control individuals identified three shared susceptibility loci, including KIF5A, SH2B3 and CD226, that also influence risk of RA, T1D and SLE (SH2B3). More recently, a collaborative GWAS involving almost 10,000 MS cases recruited from 15 different countries has identified a large number of susceptibility loci, most of which map to regions containing immunologically relevant genes [37]. Particularly overrepresented are loci implicated in T-helper-cell differentiation. Further, just over one-third of the MS risk loci identified overlap with regions previously identified in GWAS of other AIDs. Most of these shared risk loci have been associated with celiac disease, T1D, RA and/or IBD [37]. As mentioned previously, recent work by Sirota et al. [10] highlights the fact that certain variants that are associated with increased risk for some AIDs appear to be protective for others. More specifically, Sirota et al. [10] studied six AIDs (T1D, MS, AS, RA, Crohn's disease and AITD) and found that AS and RA formed one group, and MS and AITD formed another group (with T1D showing similarity to both groups and Crohn's disease to neither). Further, susceptibility variants that are associated with the first class of AIDs generally had a protective effect in relation to the second class of AIDs.
As an example, TAP2, which is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum for association with MHC class I molecules, was found to be a susceptibility factor for AS, RA and T1D, but a protective factor for MS and AITD.

Results of CPMA to identify shared and distinct genetic loci in AID
Cotsapas et al. [23] have recently completed a CPMA that significantly extends our understanding of shared and distinct AID loci. More specifically, they studied 107 SNPs associated in recent GWAS with one or more of the following AIDs: celiac disease, Crohn's disease, MS, psoriasis, RA, SLE and T1D. Their study indicates that almost half of these loci (47/107, 44%) are associated with multiple AIDs; many of these variants were not previously known to be shared across AIDs. Nine of these 47 variants had opposing effects in different AIDs. Cotsapas and colleagues also examined patterns of disease association for the 47 shared loci, and found that just one locus, a variant in an exon of SH2B3 (rs3184504), was significantly associated with all seven of the AIDs examined. The remaining 46 variants were associated with subsets of the seven AIDs.
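The distinction drawn here between variants associated with one AID, with a subset of AIDs, or with all seven can be illustrated with a simple thresholding rule. The sketch below is not the CPMA statistic used by Cotsapas et al. [23]; it swaps in a naive per-disease significance cutoff, and the function name, the threshold of 0.05, the variant labels and all p-values are hypothetical values chosen for illustration only.

```python
from typing import Dict

DISEASES = ["celiac disease", "Crohn's disease", "MS", "psoriasis", "RA", "SLE", "T1D"]


def classify_variant(pvalues: Dict[str, float], threshold: float = 0.05) -> str:
    """Label a variant by how many diseases pass a significance threshold.

    This is a naive stand-in for a cross-phenotype test, used only to show
    the three categories discussed in the text.
    """
    n_hits = sum(1 for p in pvalues.values() if p < threshold)
    if n_hits == 0:
        return "not associated"
    if n_hits == 1:
        return "disease-specific"
    if n_hits == len(pvalues):
        return "shared by all"
    return "shared by a subset"


# Hypothetical p-values for three made-up variants; not taken from any study.
toy_variants = {
    "variant_A": {d: 1e-6 for d in DISEASES},
    "variant_B": {**{d: 0.6 for d in DISEASES}, "RA": 1e-5, "SLE": 1e-4},
    "variant_C": {**{d: 0.4 for d in DISEASES}, "T1D": 1e-7},
}
labels = {name: classify_variant(p) for name, p in toy_variants.items()}

# Arithmetic check of the proportion reported in the text (47 of 107 SNPs).
shared_percent = round(47 / 107 * 100)
print(labels, shared_percent)
```

A fixed per-disease cutoff ignores differences in sample size and power between studies, which is one reason a dedicated cross-phenotype statistic is preferable in practice.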
The authors extended their analysis of these variants to try to elucidate the molecular pathways underlying these subgroups of AID. Four clusters were revealed on the basis of the patterns of AID associations. The first cluster, represented by variants in IL23R, IL12B, PTGER4, JAK2, KIF21B, STAT3 and other genes, was most strongly associated with Crohn's disease, psoriasis and MS. A second cluster, represented by variants in STAT4, IRF5, TNFAIP3, RGS1, CCR1, IL18RAP, IL2-IL21 and UBE2L3, was most strongly associated with celiac disease, RA and SLE. A third cluster, represented by variants in ORMDL3, CLEC16A, IL2RA, PRKCQ, CYP27B1, IKZF1 and ETS1, was most strongly associated with T1D, MS and RA. A fourth cluster, represented by variants in SH2B3, PTPN2, PTPN22, PRKCQ, CTLA4, UBASH3A, IL10, IFIH1, IL2, BACH2, IL27, CD226 and other genes, was most strongly associated with T1D, RA, celiac disease, Crohn's disease and SLE. Further, an analysis of protein-protein interactions revealed that the proteins encoded by variants within groups were more likely to interact with each other (either directly or via intermediates) than with proteins encoded by variants in other groups, underscoring the biologic relevance of the AID relationships defined by this CPMA.
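One way to read the four clusters is to count, for each pair of diseases, how many clusters list both of them. The following is a minimal sketch under that assumption; it uses only the cluster-to-disease assignments stated in this paragraph, it is not the clustering procedure used in the CPMA, and the function name co_occurrence is illustrative.

```python
from collections import Counter
from itertools import combinations

# Diseases most strongly associated with each cluster, as listed in the text.
clusters = {
    1: {"Crohn's disease", "psoriasis", "MS"},
    2: {"celiac disease", "RA", "SLE"},
    3: {"T1D", "MS", "RA"},
    4: {"T1D", "RA", "celiac disease", "Crohn's disease", "SLE"},
}


def co_occurrence(cluster_map):
    """Count how many clusters each pair of diseases appears in together."""
    counts = Counter()
    for diseases in cluster_map.values():
        # Sorting gives every pair a single canonical key.
        for pair in combinations(sorted(diseases), 2):
            counts[pair] += 1
    return counts


pair_counts = co_occurrence(clusters)
for pair, n in pair_counts.most_common():
    print(f"{pair[0]} + {pair[1]}: {n} cluster(s)")
```

Under this reading, RA appears together with T1D, celiac disease and SLE in two clusters each, whereas psoriasis co-occurs only with Crohn's disease and MS. The count says nothing about the strength of any individual association.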
The delineation of genes and pathways that relate more specifically to certain AIDs than to others provides valuable information that can be used to target autoimmune phenotypes with interventions that are relevant to those pathways. The highlighted biologic pathways then provide a focus for more fundamental research, aimed at elucidating the underlying disease mechanisms in autoimmunity, and they could inform the development of novel therapies. The success of anti-TNF targeted therapies for a diverse group of autoimmune disorders, including RA, IBD, psoriasis, AS and others [38], nicely illustrates the potential value of this information. Similarly, the aforementioned collaborative GWAS of MS [37] highlights loci that are related to MS therapies, including VCAM1 (natalizumab) and IL2RA (daclizumab).
As recently summarized in a review by Rai and Wakeland [16], despite the dramatic increase both in the number of risk loci recently identified for human AIDs and in information about patterns of shared risk and biologic pathways, the current literature does not provide a complete mechanistic understanding of biologic pathways that explain the pattern of AID susceptibility in human populations. Additional work will be required to refine genotype-phenotype relationships in autoimmune disease more completely. This research should include larger case-control studies in diverse population groups and the application of new technologies, such as next-generation sequencing, to define all of the relevant genetic variation. Given the 'missing heritability' of human AIDs, and the fact that current GWAS have captured primarily common SNP variants, it is likely that rare or structural variants explain much of the missing heritability; identifying these variants will require new and emerging technologies. Finally, once the complete genetic architecture underlying human AIDs has been characterized, additional methods will be required to define the functional mechanisms that explain these genetic associations.

Summary and conclusions
Owing to the rapid pace of identification of AID-associated genes during the past 5 years, primarily as a result of GWAS, there is now a wealth of information available that allows for a more thorough delineation of the extent of genetic overlap across this broad group of disorders. Loci that are shared between various AIDs and involved in a wide range of immune pathways (for example, T-cell activation, B-cell activation, cytokine signaling) might help explain common pathogenic features and inform the development of novel therapies. Further, the lack of overlap for other loci and pathways (for example, IL23R and STAT3 in IBD or spondyloarthritis) also suggests distinct pathogenic mechanisms that could explain, at least in part, the phenotypic diversity across the spectrum of autoimmune disease. It is important to keep in mind, however, that current studies are likely insufficiently powered to characterize fully the genetic architecture of AIDs, including shared and distinct loci and biologic pathways. Thus, the ongoing generation and analysis of data emerging from GWA and other genetic studies is warranted in order to better define genotype-phenotype associations in human AIDs and to clarify which pathways and specific targets are most relevant to the diseases within this diverse group of human disorders.