
A multilayered post-GWAS assessment on genetic susceptibility to pancreatic cancer

Abstract

Background

Pancreatic cancer (PC) is a complex disease in which both non-genetic and genetic factors interplay. To date, 40 GWAS hits have been associated with PC risk in individuals of European descent, explaining 4.1% of the phenotypic variance.

Methods

We complemented a new conventional PC GWAS (1D) with genome spatial autocorrelation analysis (2D), which allowed us to prioritize low-frequency variants not detected by GWAS. These were further expanded via Hi-C map (3D) interactions to gain additional insight into the inherited basis of PC. In silico functional analysis of public genomic information allowed prioritization of potentially relevant candidate variants.

Results

We identified several new variants located in genes for which there is experimental evidence of their implication in the biology and function of pancreatic acinar cells. Among them is a novel independent variant in NR5A2 (rs3790840) with a meta-analysis p value = 5.91E−06 in 1D approach and a Local Moran’s Index (LMI) = 7.76 in 2D approach. We also identified a multi-hit region in CASC8—a lncRNA associated with pancreatic carcinogenesis—with a lowest p value = 6.91E−05. Importantly, two new PC loci were identified both by 2D and 3D approaches: SIAH3 (LMI = 18.24), CTRB2/BCAR1 (LMI = 6.03), in addition to a chromatin interacting region in XBP1—a major regulator of the ER stress and unfolded protein responses in acinar cells—identified by 3D; all of them with a strong in silico functional support.

Conclusions

This multi-step strategy, combined with an in-depth in silico functional analysis, offers a comprehensive approach to advance the study of PC genetic susceptibility and could be applied to other diseases.

Background

Pancreatic cancer (PC) has a relatively low incidence, but it is one of the deadliest tumors. In Western countries, PC ranks fourth among cancer-related deaths with 5-year survival of 3–7% in Europe [1,2,3]. In the last decades, progress in the management of patients with PC has been meager. In addition, mortality is rising [2] and it is estimated that PC will become the second cause of cancer-related deaths in the USA by 2030 [4].

PC is a complex disease in which both genetic and non-genetic factors participate. However, relatively little is known about its etiologic and genetic susceptibility background. In comparison with other major cancers, fewer genome-wide association studies (GWAS) have been carried out and the number of patients included in them is relatively small (N = 9040). According to the GWAS Catalog (January 2019) [5], 40 common germline variants sited in 32 loci and associated with PC risk have been identified in individuals of European descent [6,7,8,9,10,11]. However, these variants only explain 4.1% of the phenotypic variance for PC [12]. More importantly, given the challenges in performing new PC case-control studies with adequate clinical, epidemiological, and genetic information, the field is far from reaching the statistical power that has been achieved in other more common cancers such as breast, colorectal, or prostate cancers with > 100,000 subjects included in GWAS, yielding a much larger number of genetic variants associated with them [5].

Current GWAS methodology relies on establishing simple SNP-disease associations by setting a strict statistical threshold of significance (p value < 5 × 10−8) and replicating them in independent studies. This approach has been successful in minimizing false positive hits, at the expense of discarding variants that may be truly associated with the disease but whose association p values do not reach genome-wide significance after multiple testing correction or that are not replicated in independent populations. These false negatives can result from weak associations or from a low prevalence of the variant assessed, among other causes. The “simple” solution to this problem is to increase the sample size. However, it will take considerable time for PC GWAS to reach the number of subjects achieved in other tumors, and the funding climate for replication studies is extremely weak. While a meta-analysis based on available datasets provides an alternative strategy for novel variant identification, this approach introduces heterogeneity because studies differ regarding methods, data quality, testing strategies, genetic background of the included individuals (e.g., population substructure), and study design, factors that can lead to lack of replicability. Therefore, we are faced with the need to explore alternative approaches to discover new putative genetic risk variants missed by conventional GWAS criteria.

Here, we build upon one of the largest PC case-control studies with extensive standardized clinical and epidemiological data annotation and apply both a classical GWAS approach (1D strategy) and novel strategies for risk-variant discovery. We use, for the first time in genomics, the Local Moran’s Index (LMI) [13], an approach that is widely followed in geospatial statistics. In its original application to geographic two-dimensional (2D) analysis, LMI identifies the existence of relevant autocorrelated clusters in the spatial arrangement of a variable, highlighting points closely surrounded by others with similar risk estimate values, allowing the identification of “hot spots.” We computed LMI of (genomic) spatial autocorrelation to identify clusters of SNPs based on their similar risk estimates (odds ratio, OR) weighted by their genomic distance as measured by linkage disequilibrium (LD). By capturing LD structures of nearby SNPs, LMI leverages the values of SNPs with low minor allele frequencies (MAFs) that conventional GWAS fail to assess properly. In this regard, LMI offers a novel opportunity to identify potentially relevant new sets of genomic candidates associated with PC genetic susceptibility.

In addition, we have taken advantage of recent advances in 3D genomic analyses providing insights into the spatial relationship of regulatory elements and their target genes. Since GWAS have largely identified variants present in non-coding regions of the genome, a challenge has been to ascribe such variants to the corresponding regulated genes, which may lie far away in the genomic sequence. Chromosome Conformation Capture experiments (3C-related techniques) [14] can provide insight into the biology and function underlying previously “unexplained” hits, in addition to identifying further genetic susceptibility loci [15, 16].

The combined use of conventional GWAS (1D) analysis with LMI (2D) and 3D genomic approaches has enhanced the discovery of novel candidate variants involved in PC genetic susceptibility (Fig. 1). As high-throughput technologies have produced large amounts of publicly available data from cell types and tissues, these resources offer a valuable means of performing an in silico functional validation of prioritized variants using novel criteria, as well as of interpreting genetic findings functionally. Importantly, here we identified several new variants located in genes for which there is functional evidence of their implication in the biology of pancreatic acinar cells. Among them are a novel independent variant in NR5A2, a multi-hit region in CASC8, and three new PC loci in SIAH3, CTRB2/BCAR1, and XBP1, all of them with strong in silico functional support.

Fig. 1

Study flowchart: overview of the complementary approaches adopted in this study to identify new pancreatic cancer susceptibility regions

Methods

1D approach: PanGenEU GWAS—single marker association analyses

Study population

We used the resources from the PanGenEU case-control study conducted in Spain, Italy, Sweden, Germany, UK, and Ireland, between 2009 and 2014 [17, 18]. Eligible PC patients, men and women ≥ 18 years of age, were invited to participate. Eligible controls were hospital in-patients with primary diagnoses not associated with known risk factors of PC. Controls from Ireland and Sweden were population-based. Institutional Review Board approval and written informed consent were obtained from all participating centers and study participants, respectively. To increase statistical power, we included controls from the Spanish Bladder Cancer (SBC)/EPICURO study, carried out in the same geographical areas where PanGenEU study was conducted. Characteristics of the study populations are detailed in Additional file 1: Table S1.

Genotyping and quality control of PanGenEU study subjects

DNA samples were genotyped using the Infinium OncoArray-500K [19] at the CEGEN facility (Spanish National Cancer Research Center, CNIO, Madrid, Spain). Genotypes were called using the GenTrain 2.0 cluster algorithm in GenomeStudio software v.2011.1.0.24550 (Illumina, San Diego, CA). Genotyping quality control criteria considered the missing call rate, unexpected heterozygosity, discordance between reported and genotyped gender, unexpected relatedness, and estimated European ancestry < 80%. After removing samples that did not pass the quality control filters, duplicated samples, and individuals with incomplete data regarding age of diagnosis/recruitment, 1317 cases and 700 controls were available for the association analyses. SNPs in sex chromosomes and those that did not pass the Hardy-Weinberg equilibrium (p value < 10−6) were also discarded. Overall, 451,883 SNPs passed the quality control filters conducted before the imputation.
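The Hardy-Weinberg filter described above can be illustrated with a short sketch. The text does not state which test was applied, so the 1-df chi-square goodness-of-fit test, the function names, and the genotype counts used here are assumptions for illustration only; exact tests are often preferred when alleles are rare.

```python
import math

HWE_P_THRESHOLD = 1e-6  # threshold quoted in the text


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square goodness-of-fit test stands in for whatever
    test the study actually used.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_hwe(n_aa, n_ab, n_bb, threshold=HWE_P_THRESHOLD):
    """Keep a SNP unless its HWE p value falls below the threshold."""
    return hwe_chisq_pvalue(n_aa, n_ab, n_bb) >= threshold


# Hypothetical genotype counts, not taken from the study
print(passes_hwe(100, 200, 100))  # counts in exact equilibrium
print(passes_hwe(200, 0, 200))    # no heterozygotes at all
```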

Genotyping and quality control of SBC/EPICURO controls

Genotyping of germline DNA was performed using the Illumina 1M Infinium array at the National Cancer Institute (NCI) Core Genotyping Facility as previously described [20], which provided calls for 1,072,820 SNP genotypes. We excluded SNPs in sex chromosomes, those with a low genotyping rate (< 95%), and those that did not pass the Hardy-Weinberg equilibrium threshold. In addition, the exome of 36 controls was sequenced with the TruSeq DNA Exome and a standard quality control procedure both at the SNP and individual level was applied: SNPs with read depth < 10 and those that did not pass the tests of base sequencing quality, strand bias or tail distance bias, were considered as missing and imputed (see the “Imputation” section for further details). Overall, 1,122,335 SNPs were available for imputation from a total of 916 additional controls.

Imputation

Imputation was performed at the Wellcome Sanger Institute (Cambridge, UK) and at CNIO for the PanGenEU and the SBC/EPICURO studies, respectively. Imputation of missing genotypes was performed using IMPUTE v2 [21], and genotypes of SBC/EPICURO controls were pre-phased to produce best-guess haplotypes using SHAPEIT v2 software [22]. For both PanGenEU and EPICURO studies, the 1000 G (Phase 3, v1) reference dataset was used [23].

Association analyses

A final set of 317,270 common SNPs (MAF > 0.05) that passed quality control in both studies and showed comparable MAF across genotyping platforms was used. We ensured the inclusion of the 40 variants previously associated with PC risk in individuals of Caucasian origin compiled in GWAS Catalog [5]. Logistic regression models [24] were computed assuming an additive mode of inheritance for the SNPs, adjusted for age at PC diagnosis or at control recruitment, sex, the area of residence [Northern Europe (Germany and Sweden), European islands (UK and Ireland), and Southern Europe (Italy and Spain)], and the first 5 principal components (PCA) calculated with prcomp R function based on the genotypes of 32,651 independent SNPs (J Tyrer, personal communication) to control for potential population substructure.
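To make the additive mode of inheritance concrete, the sketch below fits a logistic model with a single predictor, the count of risk alleles (0, 1, or 2), by Newton-Raphson. It is a simplified illustration, not the study's model: the adjustment covariates (age, sex, area of residence, principal components) are omitted, and the function name and toy data are hypothetical.

```python
import math


def fit_additive_logistic(genotypes, status, n_iter=25):
    """Fit logit P(case) = b0 + b1 * g by Newton-Raphson and return (b0, b1).

    genotypes: allele counts coded 0/1/2 (additive coding)
    status:    1 for cases, 0 for controls
    Assumption: no covariates, unlike the adjusted models in the text.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for g, y in zip(genotypes, status):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * g)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient, intercept
            g1 += (y - p) * g      # gradient, genotype
            h00 += w               # information matrix entries
            h01 += w * g
            h11 += w * g * g
        det = h00 * h11 - h01 * h01
        # Newton step: beta += information^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical toy data: 150 subjects with genotype 0, 120 with genotype 1
genotypes = [0] * 150 + [1] * 120
status = [1] * 50 + [0] * 100 + [1] * 60 + [0] * 60
b0, b1 = fit_additive_logistic(genotypes, status)
print("per-allele OR:", math.exp(b1))
```

Under additive coding, exp(b1) is the odds ratio per additional copy of the allele, which is the quantity the later LMI step takes as input.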

Validation of the novel GWAS hits

To replicate the top 20 associations identified in the Discovery phase, we performed a meta-analysis using risk estimates obtained in previous GWAS from the Pancreatic Cancer Cohort Consortium (PanScan: https://epi.grants.cancer.gov/PanScan/) and the Pancreatic Cancer Case-Control Consortium (PanC4: http://www.panc4.org/), based on 16 cohort and 13 case-control studies, respectively. Details on individual studies, namely PanScan I, PanScan II, PanScan III, and PanC4, have been described elsewhere [6,7,8,9]. Genotyping for PanScan studies was performed at the NCI Cancer Genomic Research Laboratory using HumanHap550v3.0, and Human 610-Quad genotyping platforms for PanScan I and II, respectively, and the Illumina Omni series arrays for PanScan III. Genotyping for PanC4 was performed at the Johns Hopkins Center for Inherited Disease Research using the Illumina HumanOmniExpressExome-8v1 array. PanScan I/II datasets were imputed together while PanScan III and PanC4 were each imputed independently using the 1000 G (Phase3, v1) reference dataset [23] and IMPUTE2 [21]. Association models were adjusted for study (PanScan I and II), geographical region (for PanScan III), age, sex, and PCA of population substructure (5 PCA for PanScan I+II, 6 for PanScan III) for PanScan models, and for study, age, sex, and 7 PCA population substructure for PanC4 models. Summary statistics from PanScanI/II, PanScan III, and PanC4 were used for a meta-analysis using a random-effects model based on effect estimates and standard errors with the metafor R package [25].
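The pooling step can be sketched as follows. The text states only that a random-effects model based on effect estimates and standard errors was fitted with metafor; the DerSimonian-Laird estimator used here is a substitute chosen for its closed form and may differ from the estimator actually used. The function name and the input values are hypothetical.

```python
import math


def random_effects_meta(betas, ses):
    """DerSimonian-Laird random-effects pooling of log-OR estimates.

    betas: per-study log odds ratios
    ses:   per-study standard errors
    Requires at least two studies.
    Returns (pooled_beta, pooled_se, tau2, two_sided_p).
    """
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    # Cochran's Q measures between-study heterogeneity
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(betas) - 1)) / c)
    # Random-effects weights add tau2 to each study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * b for wi, b in zip(w_star, betas)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return pooled, pooled_se, tau2, p


# Hypothetical log-ORs and standard errors for three studies
beta, se, tau2, p = random_effects_meta([0.25, 0.15, 0.22], [0.08, 0.06, 0.07])
print("meta OR:", math.exp(beta), "tau2:", tau2, "p:", p)
```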

2D approach: Local Moran’s Index

Local Moran’s Index calculation

The LMI was obtained for each SNP considered in the GWAS (n = 317,270) using the summary statistics resulting from the association analyses as follows. First, we standardized the OR of each SNP after referring it to the risk-increasing allele (i.e., OR > 1) using the inverse of the normal distribution. Then, we calculated the weight matrix containing the linkage disequilibrium (r2) as proxy for the distance between each SNP and each of its neighboring SNPs (± 500 kb). SNPs present within this window were matched by MAF to maximize the chance that haplotypes match. Therefore, the LMI for ith SNP was calculated as:

$$ {LMI}_i={z}_i\times \frac{\sum_j {z}_j\times {r}_{i,j}^2}{\sum_j {r}_{i,j}^2}, $$

where \( {LMI}_i \) is the LMI value for the ith SNP; \( {z}_i \) is the standardized OR for the ith SNP, obtained from the inverse of the normal distribution of ORs for all SNPs; \( {z}_j \) is the standardized OR for the jth SNP within the physical-distance and MAF-matched bounds; \( {r}_{i,j}^2 \) is the LD value, measured by r2, between the ith SNP and the jth SNP [26]; and the sums run over the neighboring SNPs j.
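The formula can be sketched in a few lines. The sketch assumes that the standardized ORs and the pairwise r2 values for the MAF-matched neighbors within ± 500 kb have already been computed; the function name, the data layout, and the handling of a zero LD sum are illustrative choices rather than details taken from the study.

```python
def local_moran_index(i, z, r2_neighbors):
    """Local Moran's Index for SNP i.

    z:            list of standardized ORs, indexed by SNP
    r2_neighbors: dict {j: r2 between SNP i and SNP j} for the neighboring
                  SNPs (assumed already restricted to the +/- 500 kb,
                  MAF-matched window)
    Returns None when the LD weights sum to zero (choice made here).
    """
    total_r2 = sum(r2_neighbors.values())
    if total_r2 == 0:
        return None
    # LD-weighted mean of the neighbors' standardized ORs
    weighted_mean = sum(z[j] * r2 for j, r2 in r2_neighbors.items()) / total_r2
    return z[i] * weighted_mean


# Hypothetical values: SNP 0 with two neighbors in equal LD
z = [2.0, 1.0, 3.0]
print(local_moran_index(0, z, {1: 0.5, 2: 0.5}))
```

The index is large and positive when a SNP with an extreme standardized OR sits among LD-linked neighbors with similarly extreme values, and negative when the neighbors point the other way.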

The LMI score could be estimated for 98.8% of the SNPs in our dataset, as 1.2% of the SNPs were not genotyped in the 1000 G (Phase 3, v1) reference dataset [23] or had a MAF < 1% in the CEU European population (n = 85 individuals, phase 1, version 3). We then discarded the SNPs that (1) had a negative LMI, meaning either that the target SNP and its surrounding SNPs have largely different ORs or that they are in linkage equilibrium and therefore do not belong to the same cluster, or (2) had a positive LMI, i.e., target and surrounding SNPs have similar ORs, but fell in the bottom 50% of the ordered distribution of transformed ORs.
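The two discard rules amount to keeping SNPs with a positive LMI whose transformed OR lies in the upper half of the distribution. A minimal sketch follows; using the median as the 50% cutoff, treating ties at the median as discarded, and the function name are assumptions made here.

```python
from statistics import median


def select_lmi_candidates(z, lmi):
    """Return indices of SNPs kept after the two discard rules.

    z:   transformed (standardized) ORs, one per SNP
    lmi: LMI scores, one per SNP (None where the score is undefined)
    Assumption: the bottom 50% is everything at or below the median of z.
    """
    cutoff = median(z)
    kept = []
    for idx, (zi, li) in enumerate(zip(z, lmi)):
        if li is None or li <= 0:
            continue  # rule 1: negative (or undefined) LMI
        if zi <= cutoff:
            continue  # rule 2: positive LMI but bottom half of the ORs
        kept.append(idx)
    return kept


# Hypothetical scores for four SNPs
print(select_lmi_candidates([-1.0, 0.5, 1.0, 2.0], [1.5, -0.2, 0.8, 3.0]))
```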

To assess the usefulness of the LMI score for SNP prioritization, we ran two benchmarking tests. First, we evaluated whether the SNPs known to be associated with PC in European populations (GWAS Catalog, n = 40 [5]) had an LMI value higher than expected. Then, we assessed how many of the previously reported loci, out of the 30 independent signals of ≥ 1 SNPs, were also identified according to the LMI. Further details can be found in Additional file 1: Supplementary methods.

3D approach: Hi-C pancreas interaction maps and interaction selection

The 3D Hi-C interaction maps for both healthy pancreas tissue [27] and for a pancreatic cancer cell line (PANC-1) were generated using TADbit as previously described [28]. Briefly, Hi-C FASTQ files for 7 replicates of healthy pancreas tissue were downloaded from the GEO repository (Accession number: GSE87112; Sequence Read Archive Run IDs: SRR4272011, SRR4272012, SRR4272013, SRR4272014, SRR4272015, SRR4272016, SRR4272017), and FASTQ files for PANC-1 were obtained from ENCODE (Accession number: ENCSR440CTR). Merged FASTQ files of the 7 healthy samples and those of PANC-1 were mapped against the human reference genome hg19, parsed, and filtered with TADbit to get the final number of valid interacting read pairs (99,074,082 and 287,201,883 valid interaction pairs, respectively). From this set, we built chromosome-wide interaction matrices at 40 kb resolution. The HOMER package [29] was used to detect significant interactions between bins using the --center and --maxDist 2000000 parameters. Using HOMER’s default parameters, the final number of nominally significant (p value ≤ 0.001) interactions was 41,833 for the healthy dataset and 357,749 for the PANC-1 dataset. To further filter the interactions, we retained those that passed a Bonferroni corrected threshold < 1 × 10−5, resulting in 6761 for the healthy sample (16.2% top interactions from those originally selected by HOMER default parameters). To make the two datasets comparable, we also kept the top 16.2% of interactions identified in PANC-1, resulting in 57,813 significant interactions.
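The matching of the two datasets rests on simple arithmetic: the share of healthy-tissue interactions that survive the stricter threshold is applied as a top fraction to the PANC-1 list. The sketch below illustrates the idea; the function name and the toy p values are hypothetical, and the exact PANC-1 count reported in the text depends on rounding details that are not described.

```python
def top_fraction(pvalues, fraction):
    """Return the smallest p values making up the given fraction of the list."""
    k = round(len(pvalues) * fraction)
    return sorted(pvalues)[:k]


# Counts reported in the text for the healthy pancreas dataset
fraction_kept = 6761 / 41833
print(round(fraction_kept * 100, 1))  # percentage of interactions retained

# Hypothetical p values standing in for a list of called interactions
toy_pvalues = [0.5, 1e-6, 1e-3, 1e-9, 0.2]
print(top_fraction(toy_pvalues, 0.4))
```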

Functional in silico analysis

An exhaustive in silico analysis was conducted for associations with p values < 1 × 10−4 in the PanGenEU GWAS (N = 143) and for the top 0.5% loci according to their LMI (N = 510) (Additional file 1: Figure S1). Bioinformatics assessments included evidence of functional impact [30,31,32], annotation in overlapping genes and pathways [31], methylation quantitative trait locus in leukocyte DNA from a subset of the PanGenEU controls (mQTLs), expression QTL (eQTLs) in normal and tumoral pancreas (GTEx and TCGA, respectively) [33, 34], annotation in PC-associated long non-coding RNA (lncRNAs) [35], protein quantitative trait locus analysis in plasma (pQTLs) [36], overlap with regulatory chromatin marks in pancreatic tissue obtained from ENCODE [37], association with relevant human diseases [38], and annotation in differentially open chromatin regions (DORs) in human pancreatic cells [39]. We also investigated whether prioritized variants had been previously associated with PC comorbidities or other types of cancers [5].

We also computed the credible sets (calculated following the procedure in [40]; code at https://github.com/hailianghuang/FM-summary), with an r2 > 0.1, physical distance ± 500 kb, and up to a posterior probability of 0.99 for the variants prioritized by the 1D (N = 143 SNPs) and the 2D (510 SNPs) approaches within a 1-Mb window.
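A credible set collects the most probable variants in a region until their cumulative posterior probability reaches the target coverage. The sketch below is a generic illustration and not the procedure of [40]: it assumes a single causal variant per region, a flat prior, and posteriors proportional to exp(chi-square / 2), and it leaves out the r2 > 0.1 and ± 500 kb restrictions. The function name and the input statistics are hypothetical.

```python
import math


def credible_set(chi2_by_snp, coverage=0.99):
    """Return the SNPs whose cumulative posterior reaches the coverage.

    chi2_by_snp: dict {snp_id: association chi-square statistic}
    Assumptions: one causal variant, flat prior, and posterior
    proportional to exp(chi2 / 2).
    """
    top = max(chi2_by_snp.values())
    # Subtract the maximum before exponentiating for numerical stability
    weights = {s: math.exp((c - top) / 2.0) for s, c in chi2_by_snp.items()}
    total = sum(weights.values())
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for snp, w in ranked:
        selected.append(snp)
        cumulative += w / total
        if cumulative >= coverage:
            break
    return selected


# Hypothetical statistics for three SNPs in one region
print(credible_set({"snp_a": 20.0, "snp_b": 20.0, "snp_c": 0.0}))
```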

In addition to the in silico functional analyses at the variant level, we conducted enrichment analyses at the gene level using the FUMAGWAS web tool [38] and investigated whether our prioritized set of genes appeared altered at the tumor level in a collection of pancreatic tumor samples [41]. Methodological details of all bioinformatics analyses conducted are described in detail in Additional file 1: Supplementary methods.

Results

1D approach: PanGenEU GWAS—single marker association analyses

We performed a GWAS including data from 1317 patients diagnosed with PC (cases) and 1616 control individuals from European countries. In addition to the genotyped SNPs that passed the quality control, we considered the imputed genotypes for previously reported PC-associated hits not included in the OncoArray-500K (19 SNPs with info score ≥ 0.91). In all, 317,270 SNPs were tested (Additional file 1: Figure S2) with little evidence of genomic inflation (Additional file 1: Figure S3).

Replication of previously reported GWAS hits

Of the 40 previously GWAS-discovered variants associated with PC risk in European ancestry populations [5], 17 (42.5%) were replicated with nominal p values < 0.05. For all 17, the associations were in the same direction as in the primary reports (Additional file 1: Table S2). Among them, we replicated NR5A2-rs2816938 and NR5A2-rs3790844, both in NR5A2, a gene for which extensive experimental evidence supports a role in PC. We also observed significant associations for seven variants tagging NR5A2 previously reported in the literature [7,8,9,10, 42]. Replicated GWAS hits included LINC00673-rs7214041, reported to be in complete LD with LINC00673-rs11655237 [11], previously shown to be a PC-associated variant [9]. At the GWAS significance level, we also replicated TERT-rs2736098 [8, 11].

The top 20 PanGenEU GWAS hits: validation in independent populations

The risk estimates of the top 20 variants in the PanGenEU GWAS were included in a meta-analysis with those derived from PanScanI+II, PanScan III, and PanC4 consortia GWAS, representing a total of 10,357 cases and 14,112 controls (Additional file 1: Table S3). PanGenEU identified a new variant in NR5A2 associated with PC (NR5A2-rs3790840, metaOR = 1.23, p value = 5.91 × 10−6) which is in moderate LD with NR5A2-rs4465241 (r2 = 0.45, metaOR = 0.81, p value = 3.27 × 10−10) and had previously been reported in a GWAS pathway analysis [42]. NR5A2-rs3790840 remained significant (p value < 0.05) when conditioned on NR5A2-rs4465241, on NR5A2-rs3790844 plus NR5A2-rs2816938, and even on the 13 NR5A2 GWAS hits reported in the literature, indicating that NR5A2-rs3790840 is a new, distinct, PC risk signal. The SKAT-O [43] (seqMeta R package: https://rdrr.io/cran/seqMeta/man/skatOMeta.html), a gene-based analysis considering all significant NR5A2 hits plus NR5A2-rs3790840, yielded a significant association (p value = 8.9 × 10−4). Furthermore, in a case-only analysis conducted within the PanGenEU study, the overall NR5A2 variation was associated with diabetes (p value = 6.0 × 10−3), suggesting an interaction between both factors in relation to PC risk.

While not replicated in the meta-analysis or not in the top 20 SNPs, other variants of interest identified by the 1D approach are located in the SETDB1, FAM63A, SCTR, SEC63, CASC8, and RPH3AL loci (Table 1). Their potential functionality is discussed below.

Table 1 Novel pancreatic cancer genetic susceptibility hits prioritized by approaches 1D, 2D, and 3D, as well as by in silico functional analyses

2D approach: genomic spatial integration

We scaled up from the single-SNP (1D) to the genomic region (2D) association analysis by considering both genomic distance (LD) between variants and the magnitude of the association (OR) with the variants. We calculated an LMI score and selected those SNPs with a positive LMI and within the top 50% of OR values, resulting in a final set of 102,146 SNPs. The LMI scores and p values for these variants showed a direct correlation (Spearman r = 0.62; p value = 2.2 × 10−16, Additional file 1: Figure S4). To assess the versatility of LMI, we ran two benchmarks based on the MAFs and the ORs. Out of the 30 PC independent signals (r2 < 0.2) derived from the GWAS Catalog, 22 were present in our 102,146 selected set. The observed median rank position for the 22 PC signals in this list was 22,640, significantly higher than that of 10,000 randomly selected sets of the same size (one tail p value = 0.0013) (Additional file 1: Figure S5). Moreover, out of the PC genomic loci, LMI was able to capture those reported by at least two studies (21 out of 30 PC independent genomic loci).
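The comparison against random sets can be sketched as a resampling test. The sketch assumes that rank 1 is the SNP with the highest LMI, so that a smaller median rank means a better position, and it counts the random sets whose median rank is at least as good as the observed one. The function name, the seed, and the example ranks are hypothetical, and the study's exact procedure may differ.

```python
import random
from statistics import median


def median_rank_pvalue(known_ranks, n_total, n_sets=10_000, seed=0):
    """One-tailed empirical p value for the median rank of known signals.

    known_ranks: ranks of the known signals in the LMI-ordered list
                 (assumption: rank 1 = highest LMI)
    n_total:     number of SNPs in the ranked list
    n_sets:      number of random sets of the same size to draw
    """
    rng = random.Random(seed)
    observed = median(known_ranks)
    hits = 0
    for _ in range(n_sets):
        sample = rng.sample(range(1, n_total + 1), len(known_ranks))
        if median(sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero
    return (hits + 1) / (n_sets + 1)


# Hypothetical example: five known signals ranked near the top of 1000 SNPs
print(median_rank_pvalue([1, 2, 3, 4, 5], n_total=1000, n_sets=2000))
```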

An LMI-enriched variant set was generated by selecting the top 0.5% of SNPs according to their LMI scores (LMI ≥ 5.1071) resulting in 510 SNPs, which included 29 out of the 143 SNPs prioritized by the 1D approach (Additional file 2: Table S4). We compared the MAF of the independent SNPs (r2 < 0.2) (see Additional file 1: Supplementary methods) prioritized by the 1D approach (N = 97/143) against the top 97 independent variants, out of 196 independent signals for the 510 SNPs selected by LMI. Notably, the LMI-identified SNPs had a lower MAF than GWAS-identified variants: 0.07 (SD = 0.03) vs. 0.24 (SD = 0.13) (Wilcoxon statistic p value < 2.2 × 10−16) (Additional file 1: Figure S6). In line with this observation, the average OR for the LMI-based SNPs was significantly higher than that for the GWAS-based SNPs (1.46 vs. 1.32, respectively, Wilcoxon statistic p value = 1.63 × 10−10).

The Manhattan plot of the LMI score across the genome displays the hits identified through this approach (Additional file 1: Figure S7). Among the 0.5% top LMI prioritized variants (N = 510), there were 8 SNPs in NR5A2, including the novel PanGenEU GWAS identified variant (rs3790840). All of them showed a high LMI score (> 6.859), which further supports this approach. Other variants of interest identified by the 2D approach are in the SETDB1, FAM63A/MINDY1, GPRC6A, RFX6, CASC8, CDKN2A, KDM4C, ROR2, MS4A5, SIAH3, LRRC36, and CTRB2/BCAR1 loci (Table 1). Their potential functionality is discussed below.

A total of 199 credible sets were identified among the 510 LMI-based SNPs. Of them, 118 (60%) contained the SNP with the lowest p value in the region. Moreover, we observed an enrichment of SNPs with low p values in the 1-Mb region for the LMI-based SNP set that was even higher among the 118 credible sets (Additional file 1: Figure S8).

3D approach: genomic interaction analysis

To gain further insight into the biological function of the 624 candidate SNPs prioritized using the 1D and 2D approaches, and to identify additional PC genetic susceptibility loci, we focused on a set of 6761 significant chromatin interactions (p values ≤ 1 × 10−5) identified using Hi-C interaction pancreatic tissue maps at 40 Kb resolution [27]. Throughout the rest of the text, we will refer to the chromatin interaction component containing the prioritized SNP as “bait” and to its interacting region as “target.” In total, 54 target loci overlapping with 37 genes interacted with bait regions harboring 76/624 (12.1%) SNPs (Additional file 3: Table S5).
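The bait and target assignment reduces to an interval overlap: whichever end of a significant interaction contains a prioritized SNP is the bait, and the opposite end is the target. A minimal sketch follows, with a hypothetical function name, a hypothetical SNP identifier and position, and interaction coordinates written as (chromosome, start, end) tuples.

```python
def find_bait_target_pairs(snps, interactions):
    """Label interaction ends as bait or target.

    snps:         dict {snp_id: (chrom, position)}
    interactions: list of (end_a, end_b), each end a (chrom, start, end) tuple
    Returns a list of (bait, target, [snp_ids in bait]).
    """
    pairs = []
    for end_a, end_b in interactions:
        # Either end may act as bait, so test both orientations
        for bait, target in ((end_a, end_b), (end_b, end_a)):
            chrom, start, end = bait
            hits = sorted(
                name
                for name, (snp_chrom, pos) in snps.items()
                if snp_chrom == chrom and start <= pos <= end
            )
            if hits:
                pairs.append((bait, target, hits))
    return pairs


# Hypothetical SNP and a single 40 kb by 40 kb interaction on chromosome 22
snps = {"snp_A": ("22", 28_620_000)}
interactions = [(("22", 28_602_352, 28_642_352), ("22", 29_197_371, 29_237_371))]
print(find_bait_target_pairs(snps, interactions))
```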

As a proof of concept of the utility of the 3D approach to identify novel PC genetic susceptibility loci, we highlight a target region harboring the XBP1 promoter (22:29,197,371-29,237,371 bp, p value = 1.3 × 10−9) that interacts with an intronic region of TTC28 (bait: 22:28,602,352-28,642,352 bp) including four LMI-selected SNPs (rs9620778, rs9625437, rs17487463, and rs75453968, all in high LD, r2 > 0.95, in CEU population) (Fig. 2). Other loci of interest identified by the 3D approach are SIAH3, CTRB2, and MN1 (Table 1). Their potential functionality is discussed below.

Fig. 2

Three-dimensional genome organization in healthy and PANC-1 cells and association results corresponding to the genomic region around XBP1 using the standard GWAS and 2D approaches. a Coverage-normalized Hi-C maps of healthy samples and PANC-1 cells at 40 Kb resolution. Green ellipses highlight the interaction between the region harboring four Local Moran’s Index (LMI)-selected SNPs and the XBP1 promoter. b Tracks of the ChromHMM Chromatin for 8 states in healthy pancreas, PANC-1 cells, and a Pancreatic Intraepithelial Neoplasia 1B. Promoters are colored in light purple, strong enhancers in dark green, and weak enhancers in yellow. Note that the strong enhancer in the target region is lost in the PANC-1 and PanIN-1B samples, compared to the healthy samples. c UCSC tracks of H3K27ac, an enhancer-associated mark, and arcs linking significant interactions called by HOMER. Interactions in healthy pancreas samples are in green and those in PANC-1 and in the PanIN-1B sample are in purple. Red arc represents the interaction between LMI-prioritized SNPs and the XBP1 promoter (highlighted region in Hi-C map in a). d Scatterplots of SNPs in region chr22:28,400,000-29,600,000 (hg19) and their –log10 (p value), LMI, and odds ratio. Bait and target chromatin interaction regions are highlighted in yellow and blue, respectively

Functional in silico validation

We performed a systematic and exhaustive in silico functional analysis of SNPs prioritized by GWAS (N = 143) and LMI (N = 510) at the variant, gene, and pathway levels (Fig. 1 and Additional file 1: Figure S1).

Assessment of potential functionality of the variants

The evidence for potential functionality of the most relevant SNPs for each of the approaches used is reviewed here and summarized in Table 1, Additional file 1: Supplementary methods, and Additional file 2: Table S4 and Additional file 4: Table S6.

Among the 143 variants prioritized in the 1D approach, we highlight those in CASC8 (8q24.21) (Fig. 3): 27 variants with p values < 1 × 10−4 organized in four LD-blocks, 9 of which were also captured in the 2D approach. The CASC8 locus is amplified in 5% of PC and codes for a non-protein coding RNA overexpressed in tumor vs. normal pancreatic tissue (Log2FC = 1.25, p value = 2.29 × 10−56). CASC8 also overlaps with a PC-associated lncRNA [35], suggesting that genetic variants in CASC8 may contribute to the transcriptional program of pancreatic tumor cells. All CASC8 variants were also associated with differential leukocyte methylation (mQTL) of RP11-382A18.1-cg25220992 in our PanGenEU population sample. Moreover, 20 of them were associated with differential methylation of cg03314633, also in RP11-382A18.1. Twenty-three of the variants overlapped with at least one histone mark in either endocrine or exocrine pancreatic tissue. Alterations in CASC8 significantly co-occur with alterations in TG (adjusted p values < 0.001), also associated with PC in our GWAS, which is located downstream.

Fig. 3

Zoom plot of the 8q24.21 CASC8 (Cancer Susceptibility 8) region and linkage disequilibrium pattern of the PanGenEU GWAS prioritized variants. Red and green points indicate OR < 1 and OR > 1, respectively

Three of the variants prioritized for in silico analysis in the 1D approach (but not in the 2D approach) are located in genes involved in pancreatic function: rs1220684 is in SEC63, coding for a protein involved in endoplasmic reticulum (ER) function and ER stress response [44]; rs7212943, a putative regulatory variant, is in NOC2/RPH3AL, a gene involved in exocytosis in exocrine and endocrine cells [45]; and rs4383344 is in SCTR, which encodes the secretin receptor, selectively expressed in ductal cells and involved in the regulation of bicarbonate, electrolyte, and volume secretion. Interestingly, secretin regulation is affected by Helicobacter pylori, which has been suggested as a PC risk factor [46]. High expression of SCTR has also been reported in PC [47].

Two variants in high LD (r2 = 0.92) and potentially relevant at the functional level are in 1q21.3 (SETDB1-rs17661062 and FAM63A-rs59942146). SETDB1 has recently been reported to be required for formation of PC in mice by inhibiting p53-mediated apoptosis [48], and FAM63A/MINDY1 has been found to interact significantly with diabetes (duration ≥ 3 years) in a meta-analysis on PC risk conducted within the PanC4 and PanScan consortia [49]. Interestingly, these two variants were also associated with an increased methylation of the cg17724175 in MCL1. High mRNA expression of this gene has been associated with poor survival [50], and Mcl-1 has been explored to selectively radiosensitize PC cells [51]. Importantly, these two variants were also the top two prioritized by the 2D approach with a LMI score > 16.87 (Additional file 2: Table S4).

Using the 2D approach, we prioritized several other regions with potential functional relevance (Table 1, Fig. 4, Additional file 1: Figure S7, Additional file 2: Table S4). In chromosome 6, we identified rs6907580 (LMI = 8.93), a well-characterized stop-gain and likely disease-causal variant (CADD score = 35) in exon 1 of GPRC6A (G protein-coupled receptor family C group 6 member A). GPRC6A is expressed in pancreatic acinar, ductal, and β-cells; it participates in endocrine metabolism [52]; and it has been implicated in pancreatitis in mouse models [53]. Downstream in the same region, the LMI approach also identified rs17078438 (LMI = 8.90) in RFX6, a pancreas-specific gene involved in pancreatic development [39].

Fig. 4. Scatterplots of the –log10 p values, Local Moran’s Index (LMI) values, and odds ratios (OR) for three genomic regions prioritized based on their LMI value. Highlighted regions show the hits identified in the 2D, but not in the 1D approach

Other potentially functional SNPs relevant to PC and prioritized in the 2D approach comprised six SNPs (LMI ≤ 5.60) in the vicinity of CDKN2A/p16, a gene that is almost universally inactivated in PC [54] and that is mutated in some hereditary forms of PC [55, 56], with three variants (LMI ≤ 5.48) in CDKN2A-AS1 and two (LMI ≤ 5.88) in CDKN2B/p15, other important cell-cycle regulators; three variants in KDM4C (LMI ≤ 11.27), which encodes lysine demethylase 4C and is highly expressed in PC [57]; and two SNPs tagging ROR2 (LMI ≤ 5.57), a member of the Wnt pathway that plays a relevant role in PC [58].

Another region, in chromosome 16, comprises BCAR1-rs7190458, a variant with a relevant role in PC [59] reported in two previous GWAS [8, 11], as well as a novel SNP (rs13337397, LMI = 6.03) located in the first exon of BCAR1. The two SNPs are in low LD (r² = 0.36). This second SNP is intergenic to CTRB2 and BCAR1 (Table 1, Fig. 4, Additional file 1: Figure S7, Additional file 2: Table S4). While BCAR1 is ubiquitously expressed, CTRB2 codes for chymotrypsinogen B2, a protease expressed exclusively in the exocrine pancreas; genetic variation therein has been previously associated with alcoholic pancreatitis [60] and type 2 diabetes [61, 62]. The expression of that gene is reduced in tumors vs. normal tissue [63]. Our finding using the 3D approach further supports that this locus harbors genetic variation of relevance to PC risk.

Based on their CADD deleteriousness scores [32], we highlight two variants in coding transcripts: MS4A5-rs34169848 at chr11:60,197,299 (CADD score = 24.4) and LRRC36-rs8052655 at chr16:67,409,180 (CADD score = 24.4). CADD scores of such magnitude are likely to correspond to disease-causal variants [64].
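
To put the CADD values quoted above in context, scaled CADD scores are PHRED-like: a score of 10 corresponds to the top 10% of predicted deleteriousness among all possible single-nucleotide substitutions, 20 to the top 1%, and 30 to the top 0.1%. The short sketch below converts a scaled score into that rank fraction; the function name is an illustrative assumption.

```python
def cadd_phred_to_top_fraction(phred):
    """Fraction of all possible substitutions ranked at least as deleterious."""
    return 10.0 ** (-phred / 10.0)


# Scores reported in the text
frac_coding = cadd_phred_to_top_fraction(24.4)    # roughly 0.36% of substitutions
frac_stop_gain = cadd_phred_to_top_fraction(35)   # roughly 0.03% of substitutions
```

Under this scaling, the two coding variants with a score of 24.4 fall within about the top 0.4% of substitutions, and the GPRC6A stop-gain variant (score 35) within about the top 0.03%.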

The 3D approach highlighted XBP1 as a target region. As noted above, this target region, which includes the XBP1 promoter, interacts with four of the LMI-selected SNPs (Fig. 2) that are in moderate LD with rs16986177. The alternative allele (T) of this SNP is associated with decreased expression of XBP1 in normal pancreas in GTEx (−0.19, p value = 1.3 × 10⁻⁴) and with an increased risk of PC in our GWAS (OR = 1.28, p value = 8.71 × 10⁻³). Expression of XBP1 is reduced in PC samples from TCGA, compared to normal pancreas samples from GTEx (Log2FC = −1.561, p value = 1.72 × 10⁻³⁴). ChIP-Seq data from all pancreatic samples available in ENCODE, as well as from PANC-1 pancreatic cancer cells (see the “Methods” section), showed that, in comparison to normal pancreas, the H3K27Ac mark present in the XBP1 promoter is completely lost in PANC-1 cells and is reduced in a sample of pancreatic intraepithelial neoplasia 1B, a PC precursor (Fig. 2). To further characterize the bait and promoter regions upstream of XBP1, we fitted an eight-state chromatin model using ChromHMM [65] (Additional file 1: Supplementary methods). We observed a clear loss of enhancers/weak promoters in the corresponding target regions in the precursor lesions and in PANC-1 cells. This loss of activity is in line with the observation that XBP1 expression is reduced in cancer. Moreover, small enhancers are also lost in the bait region of the aforementioned samples. The 3D maps for this region revealed loss of 3D contact in PANC-1 cells (Fig. 2).
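
For readers less used to log-scale expression contrasts, the reported Log2FC can be converted to a linear fold change with a one-line calculation. The sketch below only restates the arithmetic; the function name is an illustrative assumption.

```python
def log2fc_to_fold_change(log2fc):
    """Convert a log2 fold change into a linear expression ratio."""
    return 2.0 ** log2fc


fold = log2fc_to_fold_change(-1.561)       # tumor / normal expression ratio
percent_reduction = (1.0 - fold) * 100.0   # relative decrease in tumors
```

Taken at face value, a Log2FC of −1.561 corresponds to XBP1 expression in tumors of about one third (approximately 0.34 times) of that in normal pancreas.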

Gene set enrichment analyses (GSEA)

We performed GSEA of the genes harboring the SNPs prioritized using the 1D and 2D approaches. Six chromosomal regions were significantly enriched among the 81 genes harboring the 143 prioritized SNPs in the 1D approach (Additional file 5: Table S7). GSEA for the gene-trait associations reported in the GWAS Catalog yielded 29 enriched traits (Additional file 5: Table S7). The most relevant GWAS traits significantly enriched were “Pancreatic cancer,” “Lung cancer,” “Prostate cancer,” “Uric acid levels,” “Obesity-related traits,” and “Major depressive disorder.” We also performed a network analysis to visualize the relationships between the enriched GWAS traits and the prioritized genes using the igraph R package [66]. Twelve densely connected subgraphs were identified via random walks (Fig. 5). Interestingly, “pancreatic cancer” and “uric acid levels” GWAS traits were connected through NR5A2, which is also linked to “chronic inflammatory diseases” and “lung carcinoma” traits. NR5A2 is an important regulator of pancreatic differentiation and inflammation in the pancreas [67].

Fig. 5. Network of traits in the GWAS Catalog enriched with the genes prioritized in the 1D approach of PanGenEU GWAS. Twelve densely connected subgraphs identified via random walks are displayed in different colors
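
The first step behind a trait network of this kind, namely linking traits through the prioritized genes they share, can be sketched as follows. This is a simplified illustration and not the authors' igraph workflow: it does not perform random-walk community detection, and apart from the NR5A2 link between “Pancreatic cancer” and “Uric acid levels” described in the text, the gene and trait labels (`GENE_A`, `GENE_B`, `Trait X`) are placeholders.

```python
from collections import defaultdict
from itertools import combinations


def traits_linked_by_gene(trait_to_genes):
    """Map each pair of traits to the set of genes annotated to both."""
    gene_to_traits = defaultdict(set)
    for trait, genes in trait_to_genes.items():
        for gene in genes:
            gene_to_traits[gene].add(trait)
    links = defaultdict(set)
    for gene, traits in gene_to_traits.items():
        # Every pair of traits sharing this gene gets an edge labelled by it
        for a, b in combinations(sorted(traits), 2):
            links[(a, b)].add(gene)
    return dict(links)


# Toy annotation table; GENE_A, GENE_B and "Trait X" are placeholders
toy = {
    "Pancreatic cancer": {"NR5A2", "GENE_A"},
    "Uric acid levels": {"NR5A2"},
    "Trait X": {"GENE_B"},
}
links = traits_linked_by_gene(toy)
```

The resulting trait pairs and shared genes define the edges of the graph on which a community-detection method, such as the random-walk approach used for Fig. 5, can then be run.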

GSEA of the genes harboring the variants included in the credible sets corresponding to the 2D approach revealed enrichment in “Pancreatic cancer” as well as other GWAS traits related to PC risk factors, including alcoholic chronic pancreatitis, type 2 diabetes, body mass index and waist-to-hip ratio adjusted for body mass index, and HDL cholesterol (Additional file 6: Table S8). These findings lend support to the validity of the 2D approach as a tool to identify disease-relevant genetic variants.

Pathway enrichment analysis

The genes prioritized in the 1D approach were significantly enriched in 112 Gene Ontology Biological Process (GO:BP) terms (adjusted p values < 0.05, with a minimum of three genes overlapping), 7 Cellular Component GO terms (GO:CC), and 11 Molecular Function (GO:MF) terms (Additional file 5: Table S7). Importantly, GO terms relevant to exocrine pancreatic function were overrepresented. Three KEGG pathways were significantly enriched with ≥ 2 genes from our prioritized set, including “Glycosaminoglycan biosynthesis heparan sulfate” (adj-p = 3.86 × 10⁻³), “ERBB signaling pathway” (adj-p = 3.73 × 10⁻²), and “Melanogenesis” (adj-p = 3.73 × 10⁻²) (Additional file 5: Table S7). Pathways enriched with the genes prioritized in the 2D approach included GO terms related to the nervous system and G protein-coupled receptor signaling. Interestingly, one of the hallmarks of PC is perineural invasion. Because the standard databases generally lack pathways related to acinar pancreatic function, we generated several curated gene sets and assessed their enrichment among the SNPs/genes prioritized in the 1D and 2D approaches. We found an overrepresentation of LMI genes in a signature including transcription factors differentially expressed in normal pancreas (GTEx). This signature was also enriched with genes prioritized in the other two approaches used in our study (including 11 overlapping genes: SETDB1, LHX4, NR5A2, ZBED6, ELK4, SIM1, RFX6, KLF14, ZNF32, ZNF133, and XBP1).
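
The logic of an over-representation analysis with adjusted p values can be sketched as follows, assuming a one-sided hypergeometric test followed by Benjamini-Hochberg correction. This is a common choice for gene set enrichment of a prioritized gene list, but the exact procedure and background used in the study are those described in its Methods; the function names and parameter labels here are illustrative assumptions.

```python
from math import comb


def hypergeom_sf(k, background, set_size, n_selected):
    """P(overlap >= k) when n_selected genes are drawn from the background,
    of which set_size belong to the gene set being tested."""
    upper = min(set_size, n_selected)
    total = comb(background, n_selected)
    favourable = sum(
        comb(set_size, i) * comb(background - set_size, n_selected - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In such a scheme, each gene set would be tested with `hypergeom_sf(overlap, background, set_size, n_selected)`, and the resulting p values across all gene sets passed to `benjamini_hochberg` before applying a cutoff such as 0.05.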

In summary, the in silico functional analysis revealed a remarkable enrichment of pathways related to the function of acinar and ductal cells, including SNPs associated with novel genes in these pathways.

Discussion

To overcome some of the limitations of standard GWAS analyses, we have expanded the scope of genomic studies of PC susceptibility to include novel approaches that build on spatial genome autocorrelation (LMI) and 3D chromatin contacts. An in-depth in silico functional analysis leveraging available genomic information from public databases allowed us to prioritize new candidate variants with strong biological plausibility in well-established (e.g., NR5A2) as well as in novel (e.g., XBP1) genes playing a key role in acinar function (Table 1). We have thus drawn a novel landscape of the inherited basis of PC and have paved the way for the application of a similar strategy to any other human disease or trait of interest.

This is the first PC GWAS involving an exclusively Europe-based population sample. Of the previously reported European ancestry population GWAS hits, 42.5% were replicated, supporting the methodological soundness of the study. The lack of replication of other PC GWAS hits may be explained by variation in the MAFs of the SNPs among Europeans, population heterogeneity, differences in the genotyping platform used, and differences in calling methods applied, among others. This result emphasizes that statistical significance for GWAS-SNPs is largely dependent on MAF and the statistical power of the study, highlighting this as a major limitation of classical GWAS analyses.

We applied the LMI (2D approach) for the first time in the genomics field. LMI captured a new dimension of signals independent of MAF and the statistical power of the study (Additional file 1: Figure S6). The benchmarking tests showed that LMI prioritizes SNPs on the basis of OR and that these SNPs were largely present in credible sets (Additional file 1: Figure S8). We replicated 6.4% of the previously reported GWAS Catalog signals for PC in European populations by considering the top 0.5% LMI variants, an LMI threshold that is overly conservative, given that many of the GWAS Catalog-replicated signals have a lower LMI than the cutoff value we selected. The ability of LMI to prioritize low-MAF SNPs, unlike the GWAS approach, may also explain the low replication rate. LMI helps to identify signals within genomic regions by assigning lower scores to regions that do not maintain LD structure.
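
To make the 2D approach more concrete, the following is a minimal sketch of Anselin's Local Moran's I [13] applied along a chromosome, followed by selection of a top-scoring fraction of variants. It assumes that the per-SNP value is the odds ratio and that the pairwise weights are LD r² values, in line with the statement that LMI builds on summary statistics and LD structure; the window size, normalization, and other details of the authors' implementation (available in the repository cited under “Availability of data and materials”) may differ. All data below are invented for illustration.

```python
def local_moran(values, weights):
    """Anselin's Local Moran's I for each position.

    values  : per-SNP statistic (assumed here: odds ratios)
    weights : square matrix of pairwise weights (assumed here: LD r2)
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]        # deviations from the mean
    m2 = sum(zi * zi for zi in z) / n     # second moment of the deviations
    if m2 == 0:
        raise ValueError("all values are identical; the index is undefined")
    return [
        (z[i] / m2) * sum(weights[i][j] * z[j] for j in range(n) if j != i)
        for i in range(n)
    ]


def top_fraction(scores, fraction):
    """Indices of the highest-scoring items (at least one is returned)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy data, invented for illustration only
odds_ratios = [1.0, 1.0, 1.6, 1.7, 1.0, 1.0]
neighbour_r2 = [0.8, 0.1, 0.9, 0.1, 0.8]   # r2 between adjacent SNPs
n = len(odds_ratios)
ld = [[0.0] * n for _ in range(n)]
for i, r2 in enumerate(neighbour_r2):
    ld[i][i + 1] = ld[i + 1][i] = r2

lmi = local_moran(odds_ratios, ld)
selected = top_fraction(lmi, 0.34)          # the study used the top 0.5%
```

In this toy example the two adjacent SNPs with elevated odds ratios and high mutual LD receive the largest scores, which illustrates why the index favors clusters of concordant signals in LD rather than isolated extreme values.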

The 3D genomic approach identified a potentially important chromatin-interacting region in XBP1. This candidate was detected through a previously uncharacterized “bait” SNP. These findings are particularly important considering the overwhelming evidence of a major role of ER stress and unfolded protein responses in acinar function; both processes are highly relevant to acinar homeostasis because of the high protein-producing capacity of these cells. XBP1 also plays an important role in pancreatic regeneration [68]. In addition, genetic mouse models have unequivocally shown that Xbp1 is required for acinar homeostasis, and pancreatic ductal adenocarcinoma, the most common form of PC, can be initiated from acinar cells [69]. Overall, these analyses indicate that the SNPs interacting in 3D space with the XBP1 promoter could contribute to the differential expression of the gene associated with malignant transformation. These findings provide proof of concept that 3D genomics can contribute to identifying further susceptibility loci and to deciphering the biological relevance of orphan SNPs. Similar results were found with other LMI-selected SNPs that were associated with their target genes only through the detection of significant spatial interactions between them (Additional file 3: Table S5).

To shed light on the functionality of the newly identified variants, we applied novel post-GWAS approaches to interrogate several databases at the SNP, gene, and pathway levels. We found sound evidence pointing to the functional relevance of several variants prioritized by the 1D and 2D approaches (Additional files 2 and 4: Tables S4 and S6, respectively, and Additional file 1: Supplemental methods). The importance of the multi-hit CASC8 region (8q24.21) is further supported by in silico functional analyses as well as by its previous associations with PC at the gene level [35]. In particular, 12/27 SNPs identified in CASC8 were annotated as regulatory variants. None of the CASC8 hits were in LD with CASC11-rs180204, a GWAS hit previously associated with PC risk, which is ~205 kb downstream [10]. CASC8-rs283705 and CASC8-rs2837237 (r² = 0.68) are likely to be functional, with a score of 2b in RegulomeDB (TF binding + any motif + DNase Footprint + DNase peak). CASC8-rs1562430, in high LD (r² > 0.85) with 18 CASC8 prioritized variants, has been previously associated with other cancers (breast, colorectal, and stomach) [70]. None of the prostate cancer-associated SNPs in CASC8 overlapped with the 27 variants identified in our study. The fact that this gene has not been reported previously in other PC GWAS could be due to the different genetic background of the study populations or to an overrepresentation of the variants tagging CASC8 in the Oncoarray platform used here.

In addition to confirming SNPs in TERT, we found strong evidence for the participation of novel susceptibility genes in telomere biology (PARN) and in the post-transcriptional regulation of gene expression (PRKCA and EIF2B5) (Additional file 1: Supplemental methods). Our study also expands the landscape of variants and genes involved in exocrine biology, including SEC63, NOC2/RPH3AL, and SCTR, whose products participate in acinar function and possibly in acinar-ductal metaplasia, a PC pre-neoplastic lesion [71].

KEGG pathway enrichment analysis further supported our results, as the prioritized genes are involved in pathways important for PC, including “Glycosaminoglycan biosynthesis heparan sulfate” and “ERBB signaling pathway.” Heparan sulfate (HS) is formed by unbranched chains of disaccharide repeats which play roles in cancer initiation and progression [72]. Interestingly, the expression of HS proteoglycans increases in PC [73], and related molecules, such as hyaluronic acid, are important therapeutic targets in PC [74, 75]. ERBB signaling is important both in PC initiation and as a therapeutic target [76].

The enrichment analysis indicates that urate levels, depression, and body mass index—three GWAS traits previously reported to be associated with PC risk—were enriched in our prioritized gene set. Urate levels have been associated with both PC risk and prognosis [77, 78]. In addition, patients with lower relative levels of kynurenic acid have more depression symptoms [79]. PC is one of the cancers with the highest occurrence of depression preceding its diagnosis [80]. Furthermore, body mass index has been previously associated with PC risk in diverse populations [81,82,83] and it has been suggested that increasing PC incidence may be partially attributed to the obesity epidemic. Insulin resistance is one of the mechanisms possibly underlying the obesity and PC association, through hyperinsulinemia and inflammation [84].

The post-GWAS approach used has limitations that should be addressed in future studies. For example, our study has a relatively small sample size and some imbalances regarding gender and geographical areas, and the Hi-C maps that we used have limited resolution (40 kb). To account for population imbalances, regression models were adjusted for gender and for country of origin, as well as for the first five principal components. The study of a European-only population reduces population heterogeneity not only at the genetic but also at the non-genetic level. This is particularly advantageous given the novel nature of the analysis performed here and the relatively small sample size of our study. Moreover, LMI is based on both the summary statistics and LD structure; therefore, it was important to test its validity for the first time in a more homogeneous population, with individuals sharing a more consistent LD pattern. It is now warranted to extend this approach to more generalizable multi-ethnic populations.

Our study has many other strengths: a standardized methodology was applied in all participating centers to recruit cases and controls, to collect information, and to obtain and process biosamples; state-of-the-art methodology was used to extend the identification of variants, genes, and pathways involved in PC genetic susceptibility. Most importantly, the combination of GWAS, LMI, and 3D genomics to identify new variants is completely novel and has proven crucial to refine results, reduce the number of false positives, and establish whether borderline GWAS p value signals could be true positives. These three strategies, together with an in-depth in silico functional analysis, offer a comprehensive approach to advance the study of PC genetic susceptibility.

Conclusions

We present a novel multilayered post-GWAS assessment of genetic susceptibility to PC. We showed that combining conventional GWAS (1D) analysis with LMI (2D) and 3D genomic approaches enhances the discovery of novel candidate variants involved in PC. Importantly, several of the new variants are located in genes relevant to the biology and function of acinar and ductal cells.

This multi-step strategy, combined with an in-depth in silico functional analysis, offers a comprehensive approach to advance the study of PC genetic susceptibility and could be applied to other diseases.

Availability of data and materials

GWAS summary statistics generated in this study are available in the GWAS Catalog repository with the accession numbers GCST90011857 (ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90011857) [85] and GCST90011858 (ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90011858) [86]. Code for performing the association analysis and meta-analysis is available at https://github.com/EvangelinaLdM/Multilayered_postGWAS_PanGenEU [24]. Code for calculating LMI is freely available at https://github.com/pollicipes/Local-Moran-Index-1D [26].

Abbreviations

PC: Pancreatic cancer
GWAS: Genome-wide association study
LMI: Local Moran’s Index
OR: Odds ratio
LD: Linkage disequilibrium
MAF: Minor allele frequency
SBC: Spanish Bladder Cancer
NCI: National Cancer Institute
PCA: Principal component analysis
mQTL: Methylation quantitative trait locus
eQTL: Expression quantitative trait locus
lncRNA: Long non-coding RNA
pQTLs: Protein quantitative trait loci
DORs: Differentially open chromatin regions
CADD: Combined Annotation Dependent Depletion
ER: Endoplasmic reticulum
GO:BP: Gene Ontology Biological Process
GO:CC: Gene Ontology Cellular Component
GO:MF: Gene Ontology Molecular Function
HS: Heparan sulfate

References

1. Carrato A, Falcone A, Ducreux M, Valle JW, Parnaby A, Djazouli K, et al. A systematic review of the burden of pancreatic cancer in Europe: real-world impact on survival, quality of life and costs. J Gastrointest Cancer. 2015;46:201–11.

2. Malvezzi M, Bertuccio P, Rosso T, Rota M, Levi F, La Vecchia C, et al. European cancer mortality predictions for the year 2015: does lung cancer have the highest death rate in EU women? Ann Oncol. 2015;26:779–86.

3. Torre LA, Siegel RL, Ward EM, Jemal A. Global cancer incidence and mortality rates and trends - an update. Cancer Epidemiol Biomark Prev. 2016;25:16–27.

4. Rahib L, Smith BD, Aizenberg R, Rosenzweig AB, Fleshman JM, Matrisian LM. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res. 2014;74:2913–21.

5. Buniello A, Macarthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47:D1005–12.

6. Amundadottir L, Kraft P, Stolzenberg-Solomon RZ, Fuchs CS, Petersen GM, Arslan AA, et al. Genome-wide association study identifies variants in the ABO locus associated with susceptibility to pancreatic cancer. Nat Genet. 2009;41:986–90.

7. Petersen GM, Amundadottir L, Fuchs CS, Kraft P, Stolzenberg-Solomon RZ, Jacobs KB, et al. A genome-wide association study identifies pancreatic cancer susceptibility loci on chromosomes 13q22.1, 1q32.1 and 5p15.33. Nat Genet. 2010;42:224–8.

8. Wolpin BM, Rizzato C, Kraft P, Kooperberg C, Petersen GM, Wang Z, et al. Genome-wide association study identifies multiple susceptibility loci for pancreatic cancer. Nat Genet. 2014;46:994–1000.

9. Childs EJ, Mocci E, Campa D, Bracci PM, Gallinger S, Goggins M, et al. Common variation at 2p13.3, 3q29, 7p13 and 17q25.1 associated with susceptibility to pancreatic cancer. Nat Genet. 2015;47:911–6.

10. Zhang M, Wang Z, Obazee O, Jia J, Childs EJ, Hoskins J, et al. Three new pancreatic cancer susceptibility signals identified on chromosomes 1q32.1, 5p15.33 and 8q24.21. Oncotarget. 2016;7:66328–43.

11. Klein AP, Wolpin BM, Risch HA, Stolzenberg-Solomon RZ, Mocci E, Zhang M, et al. Genome-wide meta-analysis identifies five new susceptibility loci for pancreatic cancer. Nat Commun. 2018;9:556.

12. Chen F, Childs EJ, Mocci E, Bracci P, Gallinger S, Li D, et al. Analysis of heritability and genetic architecture of pancreatic cancer: a PANC4 study. Cancer Epidemiol Biomark Prev. 2019;28:1238–45.

13. Anselin L. Local indicators of spatial association—LISA. Geogr Anal. 1995;27:93–115.

14. Dekker J, Rippe K, Dekker M, Kleckner N. Capturing chromosome conformation. Science. 2002;295:1306–11.

15. Claussnitzer M, Dankel SN, Kim KH, Quon G, Meuleman W, Haugen C, et al. FTO obesity variant circuitry and adipocyte browning in humans. N Engl J Med. 2015;373:895–907.

16. Montefiori LE, Sobreira DR, Sakabe NJ, Aneas I, Joslin AC, Hansen GT, et al. A promoter interaction map for cardiovascular disease genetics. Elife. 2018;7:e35788.

17. Gomez-Rubio P, Zock J-P, Rava M, Marquez M, Sharp L, Hidalgo M, et al. Reduced risk of pancreatic cancer associated with asthma and nasal allergies. Gut. 2017;66:314–22.

18. Molina-Montes E, Gomez-Rubio P, Márquez M, Rava M, Löhr M, Michalski CW, et al. Risk of pancreatic cancer associated with family history of cancer and other medical conditions by accounting for smoking among relatives. Int J Epidemiol. 2018;47:473–83.

19. Amos CI, Dennis J, Wang Z, Byun J, Schumacher FR, Gayther SA, et al. The oncoarray consortium: a network for understanding the genetic architecture of common cancers. Cancer Epidemiol Biomark Prev. 2017;26:126–35.

20. Rothman N, Garcia-Closas M, Chatterjee N, Malats N, Wu X, Figueroa JD, et al. A multi-stage genome-wide association study of bladder cancer identifies multiple susceptibility loci. Nat Genet. 2010;42:978–84.

21. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 2009;5:e1000529.

22. Delaneau O, Marchini J, Zagury JF. A linear complexity phasing method for thousands of genomes. Nat Methods. 2012;9:179–81.

23. Altshuler DL, Durbin RM, Abecasis GR, Bentley DR, Chakravarti A, Clark AG, et al. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–73.

24. de Maturana EL. Multilayered_postGWAS_PanGenEU. GitHub. 2020. https://github.com/EvangelinaLdM/Multilayered_postGWAS_PanGenEU.

25. Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Softw. 2010;36:1–48.

26. Rodriguez JA. Local Moran Index 1D. GitHub. 2020. https://github.com/pollicipes/Local-Moran-Index-1D.

27. Schmitt AD, Hu M, Jung I, Xu Z, Qiu Y, Tan CL, et al. A compendium of chromatin contact maps reveals spatially active regions in the human genome. Cell Rep. 2016;17:2042–59.

28. Serra F, Baù D, Goodstadt M, Castillo D, Filion G, Marti-Renom MA. Automatic analysis and 3D-modelling of Hi-C data using TADbit reveals structural features of the fly chromatin colors. PLoS Comput Biol. 2017;13:e1005665.

29. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010;38:576–89.

30. McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The Ensembl variant effect predictor. Genome Biol. 2016;17:122.

31. Martín-Antoniano I, Alonso L, Madrid M, López De Maturana E, Malats N. DoriTool: a bioinformatics integrative tool for post-association functional annotation. Public Health Genomics. 2017;20:126–35.

32. Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019;47:D886–94.

33. Ardlie KG, DeLuca DS, Segrè AV, Sullivan TJ, Young TR, Gelfand ET, et al. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science. 2015;348:648–60.

34. Gong J, Mei S, Liu C, Xiang Y, Ye Y, Zhang Z, et al. PancanQTL: systematic identification of cis-eQTLs and trans-eQTLs in 33 cancer types. Nucleic Acids Res. 2018;46:D971–6.

35. Arnes L, Liu Z, Wang J, Maurer HC, Sagalovskiy I, Sanchez-Martin M, et al. Comprehensive characterisation of compartment-specific long non-coding RNAs associated with pancreatic ductal adenocarcinoma. Gut. 2019;68:499–511.

36. Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Blackshaw J, et al. Genomic atlas of the human plasma proteome. Nature. 2018;558:73–9.

37. Sloan CA, Chan ET, Davidson JM, Malladi VS, Strattan JS, Hitz BC, et al. ENCODE data at the ENCODE portal. Nucleic Acids Res. 2016;44:D726–32.

38. Watanabe K, Taskesen E, Van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat Commun. 2017;8:1826.

39. Arda HE, Tsai J, Rosli YR, Giresi P, Bottino R, Greenleaf WJ, et al. A chromatin basis for cell lineage and disease risk in the human pancreas. Cell Syst. 2018;7:310–322.e4.

40. Demontis D, Walters RK, Martin J, Mattheisen M, Als TD, Agerbo E, et al. Discovery of the first genome-wide significant risk loci for attention deficit/hyperactivity disorder. Nat Genet. 2019;51:63–75.

41. Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, Sumer SO, et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal. 2013;6:11.

42. Li D, Duell EJ, Yu K, Risch HA, Olson SH, Kooperberg C, et al. Pathway analysis of genome-wide association study data highlights pancreatic development genes as susceptibility factors for pancreatic cancer. Carcinogenesis. 2012;33:1384–90.

43. Lee S, Wu MC, Lin X. Optimal tests for rare variant effects in sequencing association studies. Biostatistics. 2012;13:762–75.

44. Linxweiler M, Schick B, Zimmermann R. Let’s talk about secs: sec61, sec62 and sec63 in signal transduction, oncology and personalized medicine. Signal Transduct Target Ther. 2017;2:17002.

45. Matsumoto M, Miki T, Shibasaki T, Kawaguchi M, Shinozaki H, Nio J, et al. Noc2 is essential in normal regulation of exocytosis in endocrine and exocrine cells. Proc Natl Acad Sci U S A. 2004;101:8313–8.

46. Risch HA. Etiology of pancreatic cancer, with a hypothesis concerning the role of N-nitroso compounds and excess gastric acidity. J Natl Cancer Inst. 2003;95:948–60.

47. Körner M, Hayes GM, Rehmann R, Zimmermann A, Friess H, Miller LJ, et al. Secretin receptors in normal and diseased human pancreas: marked reduction of receptor binding in ductal neoplasia. Am J Pathol. 2005;167:959–68.

48. Ogawa S, Fukuda A, Matsumoto Y, Hanyu Y, Sono M, Fukunaga Y, et al. SETDB1 inhibits p53-mediated apoptosis and is required for formation of pancreatic ductal adenocarcinomas in mice. Gastroenterology. 2020;159:682–96.

49. Tang H, Jiang L, Stolzenberg-Solomon R, Arslan AA, Beane Freeman LE, Bracci P, et al. Genome-wide gene-diabetes and gene-obesity interaction scan in 8,255 cases and 11,900 controls from the PanScan and PanC4 Consortia. Cancer Epidemiol Biomark Prev. 2020;29:1784–91.

50. Castillo L, Young AIJ, Mawson A, Schafranek P, Steinmann AM, Nessem D, et al. MCL-1 antagonism enhances the anti-invasive effects of dasatinib in pancreatic adenocarcinoma. Oncogene. 2020;39:1821–9.

  51. 51.

    Wei D, Zhang Q, Schreiber JS, Parsels LA, Abulwerdi FA, Kausar T, et al. Targeting Mcl-1 for radiosensitization of pancreatic cancers. Transl Oncol. 2015;8:47–54.

    PubMed  PubMed Central  Article  Google Scholar 

  52. 52.

    Pi M, Quarles LD. Multiligand specificity and wide tissue expression of GPRC6A reveals new endocrine networks. Endocrinology. 2012;153:2062–9.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  53. 53.

    Zhang X, Jin T, Shi N, Yao L, Yang X, Han C, et al. Mechanisms of pancreatic injury induced by basic amino acids differ between L-arginine, L-ornithine, and L-histidine. Front Physiol. 2019;9:1922. https://doi.org/10.3389/fphys.2018.01922.

  54. 54.

    Notta F, Chan-Seng-Yue M, Lemire M, Li Y, Wilson GW, Connor AA, et al. A renewed model of pancreatic cancer evolution based on genomic rearrangement patterns. Nature. 2016;538:378–82.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  55. 55.

    Bartsch DK, Sina-Frey M, Lang S, Wild A, Gerdes B, Barth P, et al. CDKN2A germline mutations in familial pancreatic cancer. Ann Surg. 2002;236:730–7.

    PubMed  PubMed Central  Article  Google Scholar 

  56. 56.

    Lynch HT, Brand RE, Hogg D, Deters CA, Fusaro RM, Lynch JF, et al. Phenotypic variation in eight extended CDKN2A germline mutation familial atypical multiple mole melanoma-pancreatic carcinoma-prone families: the familial atypical multiple mole melanoma-pancreatic carcinoma syndrome. Cancer. 2002;94:84–96.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  57. 57.

    Gregory BL, Cheung VG. Natural variation in the histone demethylase, KDM4C, influences expression levels of specific genes including those that affect cell growth. Genome Res. 2014;24:52–63.

  58.

    Morris JP, Wang SC, Hebrok M. KRAS, Hedgehog, Wnt and the twisted developmental biology of pancreatic ductal adenocarcinoma. Nat Rev Cancer. 2010;10:683–95.

  59.

    Duan B, Hu J, Liu H, Wang Y, Li H, Liu S, et al. Genetic variants in the platelet-derived growth factor subunit B gene associated with pancreatic cancer risk. Int J Cancer. 2018;142:1322–31.

  60.

    Rosendahl J, Kirsten H, Hegyi E, Kovacs P, Weiss FU, Laumen H, et al. Genome-wide association study identifies inversion in the CTRB1-CTRB2 locus to modify risk for alcoholic and non-alcoholic chronic pancreatitis. Gut. 2018;67:1855–63.

  61.

    Morris AP, Voight BF, Teslovich TM, Ferreira T, Segrè AV, Steinthorsdottir V, et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet. 2012;44:981–90.

  62.

    Xue A, Wu Y, Zhu Z, Zhang F, Kemper KE, Zheng Z, et al. Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes. Nat Commun. 2018;9:2941.

  63.

    Tang Z, Li C, Kang B, Gao G, Li C, Zhang Z. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res. 2017;45:W98–102.

  64.

    Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014;46:310–5.

  65.

    Ernst J, Kellis M. ChromHMM: automating chromatin-state discovery and characterization. Nat Methods. 2012;9:215–6.

  66.

    Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal Complex Syst. 2006. https://doi.org/10.1186/1471-2105-12-455.

  67.

    Cobo I, Martinelli P, Flández M, Bakiri L, Zhang M, Carrillo-De-Santa-Pau E, et al. Transcriptional regulation by NR5A2 links differentiation and inflammation in the pancreas. Nature. 2018;554:533–7.

  68.

    Hess DA, Humphrey SE, Ishibashi J, Damsz B, Lee A, Glimcher LH, et al. Extensive pancreas regeneration following acinar-specific disruption of Xbp1 in mice. Gastroenterology. 2011;141:1463–72.

  69.

    Guerra C, Schuhmacher AJ, Cañamero M, Grippo PJ, Verdaguer L, Pérez-Gallego L, et al. Chronic pancreatitis is essential for induction of pancreatic ductal adenocarcinoma by K-Ras oncogenes in adult mice. Cancer Cell. 2007;11:291–302.

  70.

    Turnbull C, Ahmed S, Morrison J, Pernet D, Renwick A, Maranian M, et al. Genome-wide association study identifies five new breast cancer susceptibility loci. Nat Genet. 2010;42:504–7.

  71.

    Notta F, Hahn SA, Real FX. A genetic roadmap of pancreatic cancer: still evolving. Gut. 2017;66:2170–8.

  72.

    Nagarajan A, Malvi P, Wajapeyee N. Heparan sulfate and heparan sulfate proteoglycans in cancer initiation and progression. Front Endocrinol (Lausanne). 2018;9:483.

  73.

    Theocharis AD, Skandalis SS, Tzanakakis GN, Karamanos NK. Proteoglycans in health and disease: novel roles for proteoglycans in malignancy and their pharmacological targeting. FEBS J. 2010;277:3904–23.

  74.

    Hingorani SR, Zheng L, Bullock AJ, Seery TE, Harris WP, Sigal DS, et al. HALO 202: randomized phase II study of PEGPH20 plus nab-paclitaxel/gemcitabine versus nab-paclitaxel/gemcitabine in patients with untreated, metastatic pancreatic ductal adenocarcinoma. J Clin Oncol. 2018;36:359–66.

  75.

    Provenzano PP, Cuevas C, Chang AE, Goel VK, Von Hoff DD, Hingorani SR. Enzymatic targeting of the stroma ablates physical barriers to treatment of pancreatic ductal adenocarcinoma. Cancer Cell. 2012;21:418–29.

  76.

    McCleary-Wheeler AL, McWilliams R, Fernandez-Zapico ME. Aberrant signaling pathways in pancreatic cancer: a two compartment view. Mol Carcinog. 2012;51:25–39.

  77.

    Mayers JR, Wu C, Clish CB, Kraft P, Torrence ME, Fiske BP, et al. Elevation of circulating branched-chain amino acids is an early event in human pancreatic adenocarcinoma development. Nat Med. 2014;20:1193–8.

  78.

    Stotz M, Szkandera J, Seidel J, Stojakovic T, Samonigg H, Reitz D, et al. Evaluation of uric acid as a prognostic blood-based marker in a large cohort of pancreatic cancer patients. PLoS One. 2014;9:e104730.

  79.

    Botwinick IC, Pursell L, Yu G, Cooper T, Mann JJ, Chabot JA. A biological basis for depression in pancreatic cancer. HPB (Oxford). 2014;16:740–3.

  80.

    Eguia V, Gonda TA, Saif MW. Early detection of pancreatic cancer. JOP. 2012;13:131–4.

  81.

    Carreras-Torres R, Johansson M, Gaborieau V, Haycock PC, Wade KH, Relton CL, et al. The role of obesity, type 2 diabetes, and metabolic factors in pancreatic cancer: a Mendelian randomization study. J Natl Cancer Inst. 2017;109:djx012.

  82.

    Koyanagi YN, Matsuo K, Ito H, Tamakoshi A, Sugawara Y, Hidaka A, et al. Body-mass index and pancreatic cancer incidence: a pooled analysis of nine population-based cohort studies with more than 340,000 Japanese subjects. J Epidemiol. 2018;28:245–52.

  83.

    Lauby-Secretan B, Scoccianti C, Loomis D, Grosse Y, Bianchini F, Straif K. Body fatness and cancer - viewpoint of the IARC working group. N Engl J Med. 2016;375:794–8.

  84.

    Park J, Morley TS, Kim M, Clegg DJ, Scherer PE. Obesity and cancer - mechanisms underlying tumour progression and recurrence. Nat Rev Endocrinol. 2014;10:455–65.

  85.

    de Maturana EL, Rodríguez JA, Alonso L, Lao O, Molina-Montes E, Martín-Antoniano I, et al. A multilayered post-GWAS assessment on genetic susceptibility to pancreatic cancer. PanGenEU GWAS summary statistics. GWAS Catalog. 2020. ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90011857.

  86.

    de Maturana EL, Rodríguez JA, Alonso L, Lao O, Molina-Montes E, Martín-Antoniano I, et al. A multilayered post-GWAS assessment on genetic susceptibility to pancreatic cancer. PanGenEU and EPICURO controls GWAS summary statistics. GWAS Catalog. 2020. ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST90011858.

Acknowledgements

The authors thank the patients, coordinators, field and administrative workers, and technicians of the European Study into Digestive Illnesses and Genetics (PanGenEU) and the Spanish Bladder Cancer (SBC/EPICURO) studies. We also thank Marta Rava, former member of the GMEG-CNIO; Guillermo Pita and Anna González-Neira from CGEN-CNIO; and Joe Dennis and Laura Fachal from the University of Cambridge, for genotyping PanGenEU samples, performing variant calling and SNP imputation, and editing data.

The PanGenEU Study investigators are listed in Additional file 1: Annex 1, and the SBC/EPICURO investigators in Additional file 1: Annex 2.

Funding

This work was partially supported by the Fondo de Investigaciones Sanitarias (FIS), Instituto de Salud Carlos III, Spain (#PI061614, #PI11/01542, #PI0902102, #PI12/01635, #PI12/00815, #PI15/01573, #PI18/01347); Red Temática de Investigación Cooperativa en Cáncer, Spain (#RD12/0036/0034, #RD12/0036/0050, #RD12/0036/0073); Spanish Ministerio de Ciencia, Innovación y Universidades (#BFU2017-85926-P); Fundación Científica de la AECC, Spain; European Cooperation in Science and Technology (COST Action #BM1204: EUPancreas); EU-6FP Integrated Project (#018771-MOLDIAG-PACA); EU-FP7-HEALTH (#259737-CANCERALIA, #256974-EPC-TM-Net); EU-FP7-ERC (#609989); Associazione Italiana Ricerca sul Cancro (#12182); Cancer Focus Northern Ireland and Department for Employment and Learning; ALF (#SLL20130022), Sweden; Pancreatic Cancer Collective (PCC): Lustgarten Foundation & Stand-Up to Cancer (SU2C #6179); Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, USA; PANC4 GWAS RO1CA154823; and NCI, US-NIH (#HHSN261200800001E).

Author information

Contributions

Study conception: NM, ELM. Design of the work: ELM, JAR, DE, MAMR, FXR. Data acquisition: EMM, PGR, RTL, AC, MH, MI, XM, ML, CWM, JP, MOR, BMB, AT, AF, LMB, TCJ, LDM, TG, WG, LS, LA, LC, JB, EC, LI, JK, NK, MM, JM, DOD, AS, WY, JY, PanGenEU Investigators, MGC, MK, NR, DS, SBC/EPICURO Investigators, DA, AAA, LBF, PMB, PB, BBM, JB, FC, MD, SG, JMG, PJG, MG, LLM, LD, NRN, UP, GMP, HAR, MJS, XOS, LDT, KV, WZ, SC, BMW, RZSS, APK, LA, FXR, NM. Data analysis: ELM, JAR, LA, OL. Interpretation of data: ELM, JAR, LA, OL, EMM, MAMR, FXR, NM. Creation of new software used in the work: ELM, LA, JAR, OL, MAMR. Drafting the work or substantively revising it: ELM, JAR, LA, OL, TCJ, UP, HAR, APK, LA, MAMR, FXR, NM. Approval of the submitted version (and any substantially modified version that involves the author’s contribution to the study): all authors. Agreement both to be personally accountable for the author’s own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even ones in which the author was not personally involved, are appropriately investigated, resolved, and the resolution documented in the literature: all authors. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Marc A Marti-Renom or Núria Malats.

Ethics declarations

Ethics approval and consent to participate

All participating centers contributing to the PanGenEU, SBC/EPICURO, PanC4, and PanScan I–III international consortia obtained IRB ethical approval, and written informed consent was obtained from all study participants. The study was conducted in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Tables S1–S3 and Figures S1–S9.

  • Table S1. Characteristics of the study populations.
  • Table S2. Replication of the SNPs reported as associated with pancreatic cancer risk in European population and published in GWAS Catalog.
  • Table S3. Validated variants (at the nominal p-value), in PanScan and PanC4 populations, among the top 20 SNPs identified in the PanGenEU GWAS study.
  • Figure S1. Functional in-silico analysis strategy followed to identify novel genomic regions previously prioritized using the 1D, 2D and 3D approaches.
  • Figure S2. GWAS Manhattan plot for the PanGenEU study. The x-axis is the genomic position of each variant and the y-axis is the −log10 p-value obtained in the 1D analysis.
  • Figure S3. Q-Q plots for pancreatic cancer risk of the association results using the PanGenEU case-control study (S2a) and PanGenEU&EPICURO study populations (S2b).
  • Figure S4. Scatterplot of the local Moran’s index (LMI) obtained in the 2D approach and the −log10 p-value obtained in the GWAS analysis (1D approach).
  • Figure S5. Results of the benchmarking test showing that the median rank position of the LMI values for the 22 pancreatic cancer signals from the GWAS Catalog is significantly higher than 10,000 randomly selected sets of the same size.
  • Figure S6. Minor allele frequency (MAF) distributions for the top 97 SNPs identified by LMI (in pink) and by GWAS (in blue).
  • Figure S7. LMI Manhattan plot for the PanGenEU study. The x-axis is the genomic position of each variant and the y-axis is the LMI value obtained in the 2D analysis.
  • Figure S8. Q-Q plots show significant enrichment of SNPs with low p-values in the variants prioritized in the 2D-approach (S8a), and in the credible sets derived from them (S8b).
  • Figure S9. Complementary network (in blue our input KEGG pathways; in green, the complementary pathways interconnected with them) obtained with Pathway-connector webtool.

Additional file 2: Table S4.

List of the 510 SNPs prioritized according to their Local Moran Index (LMI).

Additional file 3: Table S5.

List of the 76 SNPs overlapping with a chromatin interaction region (bait regions) and their 54 targets.

Additional file 4: Table S6.

Annotation and functional in silico analysis of the 143 prioritized SNPs.

Additional file 5: Table S7.

Results from the gene enrichment analysis performed with FUMA in the 1D prioritized genes.

Additional file 6: Table S8.

Results from the gene enrichment analysis performed with FUMA in the 2D prioritized genes.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

López de Maturana, E., Rodríguez, J.A., Alonso, L. et al. A multilayered post-GWAS assessment on genetic susceptibility to pancreatic cancer. Genome Med 13, 15 (2021). https://doi.org/10.1186/s13073-020-00816-4

Keywords

  • Pancreatic cancer risk
  • Genome-wide association analysis
  • Genetic susceptibility
  • 3D genomic structure
  • Local indices of genome spatial autocorrelation