Sample characteristics
To investigate PanNENswe performed methylation analysis using the Infinium Methylation EPIC (850K) beadchip platform in addition to high-depth DNA panel sequencing of 57 PanNENs. The cohort consisted of 43 PanNETs including NETG1 (n = 18), NETG2 (n = 13), NETG3 (n = 12), and 14 PanNECs (Fig. 1a, representative H&E sections in Additional file 2: Fig. S1a, details in 1 and Additional file 1: Table S1).
DNA methylation classification identifies a distinct PanNEC subgroup within PanNEN samples
To classify PanNEN tumors we performed unsupervised class discovery using the 10,000 (10K) most variable probes and defined three distinct groups at the methylation level: Groups A, B, and C (Fig. 1b; Additional file 2: Fig. S1c and methods for details). A t-distributed stochastic neighbor embedding (tSNE) analysis showed consistent segregation of the samples, affirming the presence of distinct methylation patterns in the groups identified (Fig. 1c). The methylation beta values from the 10K most variable probes revealed no distinction between the groups (Fig. 1d), as Group A was composed of 39 well-differentiated PanNETs of all grades including 10 NETG3, while Group B harbored 13 from a total of 14 PanNECs, and one NETG2 sample. Group C consisted of two NETG3s, one NETG2, and one NEC (Additional file 1: Table S1). Patient survival data confirms the distinct separation of Group A and B enriched in PanNET and PanNEC, respectively (Additional file 2: Fig. S1d). Mean CpG island methylation was significantly higher in Group B and Group C compared to Group A (Fig. 1e). Importantly, the methylation profiles separated high-grade NETG3 from NEC tumors, as they were assigned with high precision to Group A and Group B, respectively.
Gene Ontology (GO) analysis of the genes associated with the 10K probes significantly enriched for terms involved in biological processes associated with organ development (Additional file 1: Table S2; Fig. 1f). Recent work evaluating the transition of human embryonic stem cells (hESC) primed to a naive state demonstrated a gradual and s acquisition of CpG hypermethylation in genes associated with development, which was mirrored in multiple cancer entities [53]. We found that naive hESC-associated and hypermethylated CpGs were significantly more hypermethylated in Group B compared to Group A tumors (Fig. 1g left panel). A closer analysis of NETG3 samples in Group A compared to NEC samples from Group B maintained a similar significant difference in the distribution of hypermethylated CpGs of naive hESC (Fig. 1g right panel), highlighting the difference in developmental states between these histologically similar high-grade PanNEN subtypes.
Distinct recurrent mutations separate PanNENs in Groups A and B
To further characterize the PanNEN samples at the mutational level, we employed high-depth panel sequencing utilizing the commercially available Comprehensive Cancer Panel (CCP), and a custom PanNEN panel (Additional file 2: Fig. S2a, Additional file 1: Table S3). Alterations characteristically associated with PanNETs were enriched in Group A, for example recurring mutations in MEN1, DAXX, ATRX, and TSC2 in 13, 6, 4, and 4 of the 39 Group A samples, respectively. In contrast, mutations in DAXX and ATRX were absent in Group B, while only one Group B sample each was mutated in MEN1 and TSC2 (Fig. 2a shows recurring mutations; complete list in Additional file 2: Fig. S2b and Additional file 1: Table S4). In addition, four Group A samples contained aberrations in VHL, and two samples contained PTEN mutations. In contrast, KRAS (5) (with G12D, G12R, G12V, and Q61D mutations), SMAD4 (2), and TP53 (3) mutations were exclusively seen in Group B. In total, 16 samples, including the single non-PanNEC sample in Group B, did not harbor any known driver mutations detected by the DNA panels (all mutations discussed are either non-synonymous, deletions, or indels. Variant allele frequencies in Additional file 2: Fig. S2c). Two patients in our cohort had multiple samples: PNET77 and PNET56. Patient PNET77 had tissues of primary tumor and liver metastasis (PNET77P and PNET77M, both NETG3 samples in Group C) surgically removed two years apart. Neither sample displayed mutations covered by either of the panels. In contrast, Patient PNET56 had two liver metastasis samples in our cohort; PNET56P1 and P2 (both NETG3s in Group A) removed one year apart. Both samples carried the same alterations in TSC2 and BRD3. Collectively, the mutational profiles uncovered key molecular distinctions between Group A and B samples, which are enriched for aberrations in MEN1, DAXX, and ATRX and KRAS, TP53, and SMAD4, respectively.
Copy number alterations separate PanNECs from PanNETs
Within Group A we found three profiles of copy number alterations (CNAs): amplification-rich, deletion-rich, and low-CNA signature, displaying whole chromosomal copy number gains, losses, and mixed but limited aberrations respectively (Fig. 2b). Mean signal counts per sample from fluorescence in situ hybridization (FISH) and mean log2 ratio from CNA analysis were correlated and showed a regression coefficient of R2 =0.6531 and p=6.243 × 10−7 (Fig. 2c and Additional file 2: Fig. S3a and b). Recurrent mutations of tumor suppressor genes MEN1, DAXX, TSC2, and VHL were enriched in the amplification-rich and deletion-rich signature (Fig. 2b). The low-CNA signature contained few aberrations in most chromosomes with no clear recurrences, and this signature was a predominant feature of Group A NETG1 tumors. Tumors in Group B and Group C were defined by few recurrent whole chromosomal aberrations, likely due to the low number of samples in the groups.
We additionally investigated focal CNA using GISTIC (Fig. 2e). Chromosome regional gains of 1q21.2, 8p23.1, and 14q11.2, and deletion of chromosomal region 15q11.2 were significantly associated with both Group A and Group B. Group A had unique significant gains in chromosomal regions 10q11.22 and 16p11.2, which were not present in Group B. In contrast, only Group B showed focal gains in 4p16.1 and 6q21. Chromosomal deletions exclusive to Group A were in regions 1p36.32, 2q37.3, 4q34.3, 6p25.3, 8p23.2, 9p21.3, 10q26.3, 12p11.1, 21q11.2, 22q12.3, and 22q13.32, whereas Group B carried focal deletions in 4p15.32 and 13q14.2. Deletion of 9p21.3 in Group A resulted in loss of CDKN1B and CDKN2A, two key regulators of the cell cycle (Additional file 1: Table S5). Interestingly, deletion of 13q14.2 affecting RB1, also an important cell cycle regulator, was seen in 50% of Group B. (Fig. 2f, Additional file 1: Table S6). On closer examination, we found that NETG3s in Group A did not show losses of the RB1 region (Fig. 2f lower panel). Therefore, while Group A is enriched for recurrent whole chromosomal and focal copy number aberrations, Group B exhibited significant focal aberrations and exclusively harbored RB1 loss associated with PanNECs.
PanNETs in Group A harbor endocrine cell of origin signatures
Cell-of-origin studies in PanNEN have been thus far restricted to well-differentiated PanNETs. Therefore, we investigated cell-of-origin methylation patterns in our diverse PanNEN cohort. First, we curated a list of 174 markers from PanglaoDB [33] uniquely expressed in each differentiated cell type of the adult pancreas (Additional file 1: Table S7). We called differentially methylated probes (DMPs) from our samples (Additional file 1: Table S8) and identified genes overlapping with the curated list. We identified 85 significantly enriched probes (red points in Fig. 3a, Additional file 2: Fig. S4a, Additional file 1: Table S9) associated with 23 markers of α, β, ɣ, δ, ductal, acinar and Islet Schwann cells (Additional file 2: Fig. S3d). Among these, α cell markers such as IRX2, TTR, and GLS were hypomethylated across Group A samples (Fig. 3b). IRX2 and PDX1 characterize α-like and β-like tumors, respectively [22, 25,26,27]. Group A tumors were consistently hypomethylated in promoter-associated probes of IRX2, while Group B showed strong hypermethylation (Fig. 3a, b). Although PDX1 was not differentially methylated between Group A and Group B, the 10K probes establishing the groups displayed variable methylation patterns of PDX1 within the groups (Fig. 3c and Additional file 2: Fig. S3e for sample IDs). Group A tumors separated into two subgroups with respect to IRX2 and PDX1 methylation status, whereby one subgroup carried hypomethylation of both IRX2 and PDX1, while in the second subgroup PDX1 was hypermethylated. The latter subgroup was enriched for MEN1, DAXX, and ATRX mutations, while recurring VHL mutations were seen in the prior subgroup (Fig. 3c). The transcription factor ARX was also found in a subgroup of Group A tumors and ARX+PDX1− phenotype was significantly more common when compared to Group B (p-value=0.02036: Fisher’s exact test) (Fig. 3d and Additional file 2: Fig. S4). In addition, genes associated with endocrine cell lineage maintenance, such as PAX6, NKX6-1, and NKX2-2 were mainly hypomethylated in Group A, except for NETG3 samples (PNET42, PNET57, PNET61, PNET56P1, PNET56P2, and PNET107), and one NETG2 (PNET24), which showed hypermethylation of PAX6, NKX6-1, and NKX2-2 genes similar to Group B samples (Fig. 3b). In contrast, Group B, comprising almost exclusively PanNECs, was characterized by hypermethylation of all DMPs under investigation, with the exception of KRT7.
PanNECs in Group B display an acinar-like cell signature and differ from PDACs in ductal-like cell signatures
In order to identify the cell of origin for samples from Group B, we extended our analyses to include normal pancreatic methylation profiles. We obtained Illumina 450K array methylation profiles of presorted normal pancreatic cell types: α (n =2), β (n =3), acinar (n =3), and ductal (n =3) cells [46, 48]. We identified 46,500 DMPs differentiating the cell types from one another (adjusted P value < 0.01, absolute Δ beta > 0.2, Additional file 1: Table S10 to S15). We determined Pearson distance between the tumor samples and the pancreatic cell types using these DMPs and constructed a phylo-epigenetic tree (Fig. 4a). Two main branches separate the whole cohort; the lower branch consists completely of Group A and clustered closely with β and α cells. The insulinomas grouped together and maintained the closest distance to the normal endocrine cell type. Strikingly, all tumors from Group B, except PNET58 an ARX+ and PDX1+ (Additional file 2: Fig. S4d), distinctly clustered with ductal and acinar cells, forming a separate clade together, indicating a clear resemblance to the exocrine cells of the pancreas and evidently distant from the endocrine cells of the pancreas.
We next applied an independent method to determine the normal cell signature composition of our PanNEN samples. As a reference, we utilized 6096 CpG probes which distinguish ductal, acinar, and β cells [47]. For the missing α cell reference we utilized the 450K profiles mentioned in the previous section [46] and determined the methylation values for the CpGs which differentiated β, acinar, and ductal cells (Additional file 1: Table S16). Using Euclidean distance analysis, we confirmed that the reference cell types were clearly discerned from one another (Fig. 4b). Given the reference and the tumor samples, we ran a deconvolution algorithm which models the profile of the tumors as a linear combination of the methylation profiles of the cell types (Additional file 1: supplementary methods). This method determines the methylation signature proportion using non-negative least squares linear regression (NNLS). We additionally added 167 PDACs as an independent pancreatic tumor type [54]. Group A PanNETs displayed significant enrichment of α-cell signature composition compared to Group B, and to a lesser extent to Group C (Fig. 4c, Additional file 2: Fig. S5a, b, Additional file 1: Table S17 and S18). Group B NECs, on the other hand, showed a significant increase in acinar cell signature proportion when compared to Group A and to Group C. Interestingly, the acinar cell composition of Group B was similar to that found in PDAC tumors. However, the ductal signature composition of PDAC tumors was significantly higher, distinguishing this pancreatic exocrine cancer from PanNENs and specifically from Group B. NETG3 samples from Group A contained similar profiles to other PanNET samples, and showed no resemblance to the acinar cell similarity of PanNECs (Additional file 2: Fig. S5c). Using the same method, we further looked at the two subgroups within Group A, which showed differences in PDX1 methylation patterns (Fig. 3c). We found an α-like cell profile signature more significantly enriched in samples with hypermethylation of PDX1, and a β-like, intermediate cell profile in samples with hypomethylation of both PDX1 and ARX (Additional file 2: Fig. S5d).
To determine the key probes that establish the resemblance of Group B to the exocrine cells and distinguishing them from the endocrine cells as well as Group A and C samples, we determined the 10K most variable probes from the aforementioned DMPs and performed an unsupervised clustering using ConcensusClusterPlus [55]. We computed the mean methylation of each probe in the two clusters from the new analysis; cluster 1 and cluster 2 (Additional file 2: Fig. S6). Next, we determined the distribution of the absolute difference in the mean methylation between cluster 1 and cluster 2 (ranging between 0.0001 to 0.5). To determine the highest varying probes between the clusters, and thereby the genes that establish the resemblance within cluster 2, we placed a cut-off for the absolute difference in mean methylation at 0.4 and higher (Additional file 2: Fig. S6a-d). We further focused only on the probes which were associated with the promoters (TSS200, TSS1500, 5′UTR, and 1st Exonic) to identify potential genes of interest. Genes that were hypomethylated or hypermethylated in the promoter regions within Group B samples and exocrine cells can be found in Additional file 1: Table S19.
The phylo-epigenetic analysis (Fig. 4a) as well as the cell signature composition analysis (Fig S5a) led us to reassess several outlier samples: PNET4 was the only NET sample found within Group B, PNET60 and PNET103 were seen to consist of a high proportion of acinar cells (Additional file 2: Fig. S5a), as well as all Group C samples, whose relationship seemed unclear. Therefore, we stained all Group B samples with Trypsin, a marker for acinar cell carcinoma of the pancreas (ACC), and all Group C samples with Keratin to exclude paraganglioma. PNET4 was stained with CD10 to control for pseudopapillary neoplasia (SPN), ß-catenin, and cytokeratin to exclude paraganglioma.
This re-analysis revealed that PNET60 is most likely an ACC, while 3 other Group B samples (PNETs 101, 103, 106) are tumors of mixed identity (MiNEN), consisting of different proportions of ACC and NETs (Additional file 1: Table S1). PNET4 was identified as a likely SPN being strongly positive for CD10 and ß-catenin (Fig. 4a (i, ii)). Based on these results, we excluded PNET60 and PNET4 from our cohort and re-conducted the phylo-epigenetic analysis, which showed the identical tree phylogeny (Fig. 4a (iii)).
Group B PanNECs display SOX9 patterns similar to exocrine cells
The pattern of Group B acinar similarity to PDACs compelled us to investigate whether this similarity was SOX9 dependent [38]. We performed IHC of SOX9 for representative samples of the cohort. From 11 Group B PanNECs 9 samples were positive for SOX9 staining (Fig. 4d, Additional file 2: Fig. S4e). In Group A, 4 from 17 samples (three NETG3s and one NETG2 (an insulinoma and a MEN1 syndrome tumor)) displayed staining for SOX9, albeit with a low and heterogenous pattern, unlike Group B positive tumors (Fig. 4d and Additional file 2: Fig. S4f). We analyzed IHC of ARX and PDX1, with regard to SOX9 expression, on the same samples. The Group A SOX9+ NETG3 were additionally ARX+. Three Group B SOX9+ samples were additionally ARX+. The SOX9- Group B sample was however ARX- and PDX1- (Additional file 2: Fig. S4f). Our IHC analyses showed that α-/β- like tumors in Group A harbored significantly more ARX+PDX1-SOX9, while Group B acinar-like tumors were enriched for ARX-PDX1-SOX9+ features (adjusted p-value = 0.001998: Fisher’s exact test and fdr corrected).