BAP1 Loss Predicts Therapeutic Vulnerability in Malignant Peritoneal Mesothelioma

Background: Malignant Peritoneal Mesothelioma (PeM) is a rare but frequently fatal cancer that originates from the peritoneal lining of the abdomen. Standard treatment of PeM is limited to cytoreductive surgery and/or chemotherapy, and no effective targeted therapies for PeM yet exist. In the search for novel therapeutic target candidates in PeM, we performed a comprehensive integrative multi-omics analysis of 19 treatment-naïve PeM tumors. Results: The analysis identified PeM tumors with BAP1 loss to form a distinct molecular subtype characterized by distinct expression patterns of genes involved in chromatin remodeling, DNA repair pathway, and immune checkpoint receptor activation. This PeM subtype could potentially benefit from immune checkpoint, PARP, or HDAC inhibition therapies. Conclusions: Our findings uncover BAP1 as a trackable prognostic and predictive biomarker, and refine PeM disease classification. This integrated molecular characterization provides a comprehensive foundation for developing PeM precision medicine. In this study, we performed an integrated analysis of the genome, transcriptome, and proteome of 19 PeM tumors Primary untreated PeM tumors and matched benign samples were obtained from patients undergoing cytoreductive following This study approved and All patients signed a formal consent form approved by the respective institutional ethics board. Histologic parameters and pathological scoring of tumors confirming PeM was established by three independent pathologists. Hematoxylin and eosin (H&E) and immunostained formalin-fixed paraffin-embedded (FFPE) slides were reviewed by at least two specialized pathologists to diagnose PeM and its subtype. H&E staining was used to determine the highest tumor cellularity (≥75%) from sections for sequencing. The surgical resections were snap frozen and processed at respective institutions. The tumors have a companion normal tissue specimen (either adjacent normal tissue or peripheral blood previously extracted for germline DNA control). Each tumor specimen was approximately 1cm 3 in size and weighed between 100-300 mg. Specimen were shipped overnight on dry ice that maintained an average temperature of less than -80 o C. Upon receipt, the tissues were sectioned into 5 slices for DNA, RNA, and protein extraction as well as construction of tissue microarrays (TMA). computed between PeM subtypes. h Correlation between immune score and mRNA expression of immune checkpoint receptors. The expression levels are log2 transformed and mean normalized


5
in PeM ( Fig. 1b and Additional file 1: Figure S6). Using the software deconstructSigs [12], we found that mutational signature 1, 5, 12, and 6 were operative in PeM. Interestingly, signature 1 was often correlated with age at diagnosis, and signature 6 was associated with DNA mismatch repair and mostly found in microsatellite instable tumors [13].
The mutation landscape of PeM was found to be highly heterogeneous. The nuclear deubiquitinase BAP1 was the most frequently mutated gene (5 out of 19 tumors) in PeM tumors. BAP1 is a tumor-suppressor gene known to be involved in chromatin remodeling, DNA double-strand break repair, and regulation of transcription of many other genes [14]. Previous studies have also reported BAP1 as the most frequently mutated gene in both PeM [3,6] and PM [14,15]. The BAP1 missense mutation in MESO-18A/E resulted in a single amino-acid (AA) change in the ubiquitin carboxyl hydrolase domain keeping the rest of the amino acid chain intact (Additional file 1: Figure   S7). In MESO-06 and MESO-09, a BAP1 frameshift deletion resulted in a premature stop codon and chain termination. In MESO-09 approximately 91% of BAP1 amino acid chains were intact, but in MESO-06 only 2% of BAP1 amino acid chains were intact (Additional file 1: Figure S7-S8). We also observed a BAP1 germline mutation in only one case (MESO-09). In three (15%) tumors, we identified a recurrent R396I mutation in ZNF678 -a zinc finger protein containing zinc-coordinating DNA binding domains involved in transcriptional regulation.
We compared the mutated genes in our VPC-PeM cohort with publically available datasets [3,15,16] of both PeM and PM (Additional file 1: Figure S9). BAP1 was the only mutated gene common between the three PeM cohorts. Twenty-five genes including tumor suppressors LATS1, TP53, and chromatin modifiers SETD2 were common between at least two PeM cohorts. Many mutated genes in VPC-PeM were also previously reported in PM. BAP1 and SETD2 were the two mutated genes found common between VPC-PeM and all four PM cohorts.
Next, we identified driver genes of PeM using our recently developed algorithm HIT'nDRIVE [17]. Briefly, HIT'nDRIVE measures the potential impact of genomic aberrations on changes in the global expression of other genes/proteins which are in close proximity in a gene/protein-interaction network. It then prioritizes those aberrations with the highest impact as cancer driver genes. HIT'nDRIVE prioritized 25 unique driver genes in 15 PeM tumors for which matched genome and transcriptome data were available ( Fig. 1c and Additional file 2: certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Copy number landscape in PeM tumors
To investigate the somatic copy number aberration (CNA) profiles of PeM, we derived copy-number calls from exome sequencing data using the software Nexus Copy Number Discovery Edition Version 8.0. The aggregate CNA profile of PeM tumors is shown in Fig. 2a-b. We observed a total of 1,281 CNA events across all samples (Additional file 2: Table S5). On an average, 10% of the protein-coding genome was altered per PeM tumor.
We also compared the CNA burden in protein-coding regions of the VPC-PeM cohort with different adult cancers from The Cancer Genome Atlas (TCGA) project. Similar to the mutation burden, VPC-PeM tumors were observed at the lower end of the pan-cancer CNA burden spectrum. Only UCEC, PRAD, and PAAD tumors had lower median CNA burden as compared to PeM tumors (Fig. 2c). CNA status and mRNA expression for around half of the genes were positively correlated (R ≥ 0.1) and 16% of the genes had strong correlation (R ≥ 0.5). To identify cancer genes, we compared aberrations in protein-coding genes with data from the Cancer Gene Census (CGC) [18]. Intriguingly, CNA status and mRNA expression for majority of CGC genes were positively correlated (Additional file 1: Figure S10).
To identify recurrent focal CNAs in PeM tumors, we used the GISTIC [19] algorithm which yielded 5 regions of focal deletions (q < 0.05) including in 3p21 and 22q13 which are characteristic of malignant mesotheliomas ( Fig.   2d and Additional file 2: Table S6). Furthermore, GISTIC prioritized 8 regions of focal amplification (q < 0.05) which included genes such as IGH, VEGFD, BRD9, FOXL1, EGFR, and PDGFA (Fig. 2d). Copy-number status of these genes was also significantly correlated with their respective mRNA expression (Additional file 1: Figure   S10). Chromosome 1 was the most aberrant region in PeM and chromosomes 13 and 18 were relatively unchanged except for MESO-14 (Fig. 2b). certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint 7 Chromosome 3p21 was deleted in almost half of the tumors (8 of 19) in the cohort. Here, we call tumors with 3p21 (or BAP1) loss as BAP1 del and the rest of the tumors with 3p21 (or BAP1) copy-number intact as BAP1 intact .
Interestingly, BAP1 mRNA transcripts in BAP1 del tumors were expressed at lower levels as compared to those in BAP1 intact tumors (Wilcoxon signed-rank test p-value = 3x10 -4 ) (Fig. 2e). Immunohistochemical (IHC) staining showed lack of BAP1 nuclear staining in the tumors with BAP1 homozygous deletion (Fig. 2f). Tumors with BAP1 heterozygous loss still displayed BAP1 nuclear staining (Additional file 1: Figure S11). We observed three BAP1 mutated cases among BAP1 intact tumors. BAP1 mRNA transcripts in these three tumors, were expressed at high levels. As mentioned in the previous section, the mutation analysis also predicted that despite mutation in BAP1 in these three tumors, the entire BAP1 amino-acid chain is still intact and may be functionally active (Additional file 1: Figure S7). Furthermore, we found DNA copy loss of 3p21 locus to include four concomitantly deleted cancer genes -BAP1, SETD2, SMARCC1, and PBRM1, consistent with [20]. Copy-number status of these four genes was significantly correlated with their corresponding mRNA expression (Additional file 1: Figure S12), suggesting that the allelic loss of these genes is associated with decreased transcript levels. These four genes are chromatin modifiers, and PBRM1 and SMARCC1 are part of SWI/SNF complex that regulates transcription of a number of genes. Furthermore, HIT'nDRIVE prioritized BAP1, PBRM1, and SETD2 as driver genes of PeM ( Fig.   1c and Additional file 2: Table S4).
CNA status of genes associated with a number of key cancer pathways was observed to be different between the PeM subtypes (i.e. BAP1 del and BAP1 intact ). We observed many genes involved in chromatin remodeling, SWI/SNF complex and DNA repair pathway to be deleted in BAP1 del tumors as compared to BAP1 intact tumors (Fig. 1c). In contrast, we found copy-number gain of many genes in DNA repair pathways (BRCA2, ATM, MGMT, and RAD51) and the cell cycle (MYC, CDK5, CCNB1, and CCND1) in the BAP1 intact tumors. Furthermore, PeM tumors (both BAP1 del and BAP1 intact ) harbored CNA events in carcinogenic pathways such as MAPK, PI3K, MTOR, Wnt, and Hippo pathways. Interestingly, ESR1 copy number deletion is enriched in BAP1 del tumors while co-amplification of EGFR and BRAF were present in three BAP1 intact tumors. Notably, we identified copy-number loss of tumor suppressor LATS1/2 and copy-number gain of NF2 in one case, both of which has been previously associated with certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint 8 mesotheliomas [15], in BAP1 del tumors. Notably, both LATS1/2 and NF2 are key regulators of the Hippo pathway [21]. Unsupervised consensus clustering of tumor samples based on copy-number segmentation mean values of the 3349 most variable genes identified four tumor sub-groups ( Fig. 2g and Additional file 1: Figure S13). We observed that BAP1 del and BAP1 intact tumors were grouped into distinct clusters. This indicates that BAP1 del tumors have distinct copy-number profiles from those of BAP1 intact tumors. We identified 692 genes (p-value < 0.01, Kruskal-Wallis test) with significantly differential CNA genes segments between the clusters. These genes were mapped to eight distinct chromosome loci -19p, 6q, 1q, and, 13q and were mostly gained in clusters 1 and 3, whereas Xq, 22q, and 7p loss were mostly in clusters 1 and 3 (Additional file 1: Figure S13).
Next, we compared the CNA profile of PeM with PM. We note that PeM and PM tumors displayed substantial overlap (Additional file 1: Figure S14). Both PeM and PM tumors harbored characteristic loss of chromosome segments 3p21, 9p21 and 22q. Copy number loss of CDKN2A (9p21) or NF2 (22q) is observed in more than 40% of PM tumors [15]. However, in PeM, loss of CDKN2A and NF2 was observed in only in a single case (MESO-14). We identified 1670 genes (p-value < 0.0005, Wilcoxon signed-rank test) with significant differential CNA segments between PeM and PM (Additional file 2: Table S7). Intriguingly, these 1670 genes mapped to eleven distinct chromosome loci 1q, 3p, 6p, 11p, 11q, 12p, 12q, and 17q that were mostly gained in PM and lost in PeM.
Vice versa 22q, Xp, and Xq were mostly lost in PM and gained in PeM (Additional file 1: Figure S14).

Gene fusions in PeM
To identify the presence of gene fusions, we analyzed RNA-seq data in 15 PeM using deFuse algorithm [22].
Overall, 82 unique gene fusion events were identified using our filtering criteria (Methods and Additional file 2: Table S8), out of which we successfully validated 18 gene fusions using Sanger sequencing (Additional file 1: Figure S15 and Additional file 2: Table S9). We observed more gene fusion events in BAP1 del tumors as compared to that in BAP1 intact tumors ( Fig. 3a-b). Similar to mutation or copy-number profiles, the gene fusion landscape of PeM was highly heterogeneous. MTG1-SCART1 was the most recurrent gene fusion observed in 7 certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint 9 cases. This was followed by GKAP1-KIF27 and KANSL1-ARL17B (Fig. 3c) each of which was identified in 6 different cases. Three unique fusions were present in PBRM1, 2 in KANSL1, and 1 each in BAP1 and SETD2 all of which are involved in chromatin remodeling process ( Fig. 1c and 3d-f). Notably, BAP1, SETD2, PBRM1, and KANSL1 were prioritized as a driver gene by HIT'nDRIVE on basis of gene-fusion. Fusions in these genes were mostly found in the BAP1 del subtype.

The global transcriptome and proteome profile of PeM
To segregate transcriptional subtypes of PeM, we performed total RNA-seq (Illumina HiSeq 4000) and its quantification of 15 PeM tumor samples for which RNA were available (RNA for remaining four tumor samples did not pass the quality control checks). We first performed principal-component analyses and unsupervised consensus clustering of all PeM tumors to determine transcriptomic patterns using genes based on variance among tumor specimens. Consensus clustering revealed two distinct transcriptome sub-groups (Fig. 4a). We found BAP1 intact and BAP1 del have some distinct transcriptomic patterns; however, a few samples showed an overlapping pattern (Additional file 1: Figure S16).
We performed mass spectrometry (Fusion Orbitrap LC/MS/MS) with isobaric tagging for expressed peptide identification and its corresponding protein quantification using Proteome Discoverer for processing pipeline for 16 PeM tumors and 7 matched normal tissues (matched normal samples for the remaining tumors were not available). We identified 8242 unique proteins in 23 samples analyzed (we were surprised BAP1 protein was however not detected in our MS experiment, likely due to inherent technical limitations with these samples and/or processing. Quality control analysis of in solution Hela digests also have very low BAP1 with only a single peptide observed in occasional runs). First, we analyzed global matched mRNA-protein expression correlation. Although, ~58% (4715 of 8109) of proteins showed positive mRNA-protein correlation (Pearson correlation; R ≥ 0.1), only 22.7% (1839) of the proteins were strongly correlated with their corresponding mRNA (R ≥ 0.5) (Additional file 1: Figure S17). Expression of 2.4% (194) of proteins strongly negatively correlated with their corresponding mRNA (R ≤ -0.5). To analyze the proteomic pattern across PeM tumors, we performed principal-component analyses and unsupervised consensus clustering following the same procedure as described above for the certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint transcriptome. Unlike in transcriptome profiles, the proteome profiles of BAP1 PeM tumor sub-types did not group into distinct clusters ( Fig. 4b and Additional file 1: Figure S18).
To identify differentially expressed genes (DEG) between BAP1 intact and BAP1 del , we performed Wilcoxon signedrank test using mRNA and protein expression data independently. We identified 1520 and 466 DEG (with p-value < 0.05) using mRNA and protein expression data respectively (Additional file 1: Figure S19 and Additional file 2: Table S10-S11). However, only 53 genes were found common between the two sets of DEG. As expected, BAP1, PBRM1 and SMARCA4, SMARCD3 were among the top-500 DEG. Many other important cancer-related genes were differentially expressed such as CDK20, HIST1H4F, ERCC1, APOBEC3A, CDK11A, CSPG4, TGFB1, IL6, LAG3, and ATM.

Transcriptional and post-transcriptional mechanisms regulate chromatin remodeling protein-complexes
Next, we aimed to study the extent to which changes in copy number profile affects its corresponding protein expression. For this, we calculated Pearson correlation between CNA-mRNA expression and CNA-protein expression. While, copy number profile of genes, on average, have good agreement with their corresponding mRNA expression, a number of detected proteins had poor correlation with their respective gene's copy number profile. Approximately 25% (1871 of 7462) of proteins were observed to have poor correlation with their genes copy number which we here define as "attenuated proteins" (Methods, Fig. 4c, and Additional file 2: Table S12).
Among the attenuated proteins, we identified important chromatin remodeling proteins -PBRM1, SETD2, and SMARCC1. The attenuated proteins also included cancer genes such as NF2, EGFR, APC, PIK3CA, and MAP3K4. We observed that the attenuated proteins were significantly enriched with direct protein-protein interaction partners of the UBC (hypergeometric test p-value: 10 -5 ), BAP1 (10 -3 ), and PBRM1 (10 -2 ) in STRING v10 interaction network. Notably, geneset enrichment analysis revealed that attenuated proteins are more likely to form a part of a multimeric complex or bind to macromolecules (Fig. 4d). These results corroborate previous findings from studies analyzing breast, ovarian and colorectal cancer datasets [23]. These attenuated proteins were found to be involved in mRNA processing, DNA repair pathway, cell cycle regulation, the immune system, and in carbohydrate and lipid metabolism. Strikingly, we found that DEG between the PeM subtypes are significantly certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint associated with protein attenuation (Chi-Squared test p-value: 10 -4 using mRNA expression DEG, 10 -6 using protein expression DEG). These findings suggest that the effects of CNA are attenuated at the protein level via post-transcriptional modification.
To identify large protein complexes containing the attenuated proteins and that are variable (i.e. at least a protein subunit of the complex is differentially expressed) between PeM subtypes, we leveraged a manually curated set of core protein complexes from the CORUM database [24]. These included many protein complexes involved in DNA conformation modification, DNA repair, transcriptional regulation, post-translational modification including ubiquitination (Additional file 2: Table S13-S14). Using our data, we observed that the majority of the protein complexes were highly co-regulated at the protein level rather than at the mRNA level. Notably, we identified SWI/SNF (BAF and PBAF) and HDAC complex which were highly co-regulated ( Fig. 4e-g). We found copynumber deletion in many subunits of SWI/SNF complex, mostly in the BAP1 del subtype (Fig. 1c). About one quarter of proteins in the BAF complex and half of proteins in PBAF were attenuated. PBRM1 was both attenuated at the protein level as well as differentially expressed between PeM subtypes. SMARCB1 and SMARCA4 were also differentially expressed between PeM subtypes in this complex (Fig. 4h). We further identified a number of HDAC complex components as highly co-regulated. The complex consisted of Histone deacetylase (HDAC1/2), which regulates expression of a number of genes through chromatin remodeling. About one-third of protein subunits in the complex were attenuated at the protein level. More importantly, HDAC1, CHD4, and ZMYM2 were differentially expressed between PeM subtypes in the protein complex, and different family members of HDAC protein family were highly expressed in the BAP1 del subtype (Fig. 4i). This indicates potential use of HDAC inhibitors to suppress the tumor growth in the BAP1 del subtype. We note that both SWI/SNF and HDAC complexes interact with BAP1. Expression pattern of many subunits of these complexes were either highly correlated or highly anti-correlated with BAP1 expression (Fig. 4e-g). Although mRNA transcripts are transcribed proportional to the changes in copy-number profile of the gene, the corresponding proteins are often stabilized when in complex, and free proteins in excess are usually ubiquitinated and targeted for proteosomal degradation to maintain stoichiometry [23]. certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint BAP1 del subtype is characterized by distinct expression patterns of genes involved in DNA repair pathway,

and immune checkpoint receptor activation
To identify the pathways dysregulated by the DEG between the PeM subtypes, we performed hypergeometric test based geneset enrichment analysis (Methods) using the REACTOME pathway database. Intriguingly, we observed high concordance between pathways dysregulated by the two sets (mRNA and protein expression data) of top-500 DEG ( Fig. 5a-b). The unsupervised clustering of pathways revealed two distinct clusters for BAP1 del and BAP1 intact tumors. This indicates that the enriched pathways, between the patient groups, are also differentially expressed. BAP1 del patients demonstrated elevated levels of RNA and protein metabolism as compared to BAP1 intact patients. Many genes involved in chromatin remodeling and DNA damage repair were differently expressed between the groups (Additional file 1: Figure S19-S20). Our data suggests that BAP1 del tumors have repressed DNA damage response pathways. Most importantly, protein expression data revealed that PARP1 is highly expressed in BAP1 del tumors as compared to BAP1 intact tumors indicating potential inhibition of PARP1 for BAP1 del tumors. Genes involved in cell-cycle and apoptotic pathways were observed to be highly expressed in BAP1 del patients. Furthermore, glucose and fatty-acid metabolism pathways were repressed in BAP1 del as compared to BAP1 intact . More interestingly, we observed a striking difference in immune-system associated pathways between the PeM subtypes. Whereas BAP1 del patients demonstrated strong activity of cytokine signaling and the innate immune system; MHC-I/II antigen presentation system and Adaptive immune system were active in BAP1 intact patients.
Prompted by this finding, we next analyzed whether PeM tumors were infiltrated with leukocytes. To assess the extent of leukocyte infiltration, we computed an expression (RNA-seq and protein) based score using the immunecell and stromal markers proposed by [25]. We discovered that the immune marker gene score was strongly correlated with stromal marker gene score (Methods and Fig. 5c-d). Using CIBERSORT [26] software, we computationally estimated leukocytes representation in the bulk tumor transcriptome. We observed massive infiltration of T cells and CD4/8 cells in majority of the PeM tumors (Fig. 5e). A subset of PeM tumors had massive infiltration of B-cells in addition to T cells and CD4/8 cells. Interestingly, when we group the PeM tumors by their certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint BAP1 aberration status, there was a marked difference in the proportion of infiltrated plasma cells, natural killer

(NK) cells, mast cells, B cells and CD8 cells between the groups. Whereas the proportions of plasma cells, NK cells and B cells were less in the BAP1 del tumors, there was more infiltration of mast cells and CD8 cells were in
BAP1 del tumors as compared to BAP1 intact tumors. We performed TMA IHC staining of CD3 and CD8 antibody on PeM tumors. We observed that BAP1 del PeM tumors were positively stained for both CD3 and CD8 confirming infiltration of B and T cells in BAP1 del PeM tumors ( Fig. 5f and Additional file 1: Figure S21). Combined, this strongly indicates that leukocytes from the tumor-microenvironment infiltrates the PeM tumor.
Finally, we surveyed the PeM tumors for expression of genes involved in immune checkpoint pathways. A number of immune checkpoint receptors were highly expressed in BAP1 del tumors relative to BAP1 intact tumors. These included CD274 (PD-L1), CD80, CTLA4, LAG3, and ICOS (Fig. 5g) for which inhibitors are either clinically approved or are at varying stages of clinical trials. Gene expression of these immune checkpoint receptors were highly correlated with immune score (Fig. 5h). Moreover, a number of MHC genes, immuno-inhibitor genes as well as immuno-stimulator genes were differentially expressed between BAP1 del and BAP1 intact tumors (Additional file 1: Figure S22). Furthermore, we analyzed whether the immune checkpoint receptors were differentially expressed in tumors with and without 3p21 loss in PM tumors from TCGA. Unlike in PeM, we did not observe a significant difference in immune checkpoint receptor expression between the PM tumor groups (i.e. BAP1 del and BAP1 intact ) (Additional file 1: Figure S23). These findings suggest that BAP1 del PeM tumors could potentially be targeted with immune-checkpoint inhibitors while PM tumors may less likely to respond. certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Discussion
In this study, we present a comprehensive integrative multi-omics analysis of malignant peritoneal mesotheliomas.
Even though this is a rare disease we managed to amass a cohort of 19 tumors. Prior studies of mesotheliomas, performed using a single omic platform, have established loss of function mutation or copy-number loss of BAP1 as a key driver event in both PeM and PM. Our novel contribution to PeM is that we provide evidence from integrative multi-omics analyses that BAP1 copy number loss (BAP1 del ) forms a distinct molecular subtype of PeM. This subtype of PeM is characterized by distinct expression patterns of genes involved in chromatin remodeling, DNA repair pathway, and immune checkpoint activation. Our results suggest that BAP1 del tumors might be prioritized for immune checkpoint blockade therapies. Thus BAP1 may be both a prognostic and predictive biomarker for PeM enabling better disease classification and patient treatment.
Structural alterations in PeM tumors were found to be highly heterogeneous, and occur at a lower rate as compared to most other adult solid cancers. The majority of SNVs and CNAs were typically unique to a patient. However, many of these alterations were non-randomly distributed to critical carcinogenic pathways. We observed many alterations in genes involved in chromatin remodeling, SWI/SNF complex, cell cycle and DNA repair pathway. SWI/SNF complex is an ATP-dependent chromatin remodeling complex known to harbor aberrations in almost one-fifth of all human cancers [27]. Our results show that SWI/SNF complex is differentially expressed between PeM subtypes which further regulates oncogenic and tumor suppressive pathways. Notably, we also identified another chromatin remodeling complex -HDAC complex which is differentially expressed between PeM subtypes. HDAC, known to be regulated by BAP1, is a potential therapeutic target for the BAP1 del PeM subtype.
Recent in-vitro experiments demonstrated BAP1 loss altered sensitivity of PM as well as uveal melanoma (UM) cells to HDAC inhibition [28,29].
Loss of BAP1 is known to alter chromatin architecture exposing the DNA to damage, and also impairing the DNArepair machinery [30,31]. Similar to BRCA1/2 deficient breast and ovarian cancers, BAP1 deficient PeM tumors most likely depends on PARP1 for survival. This rationale can be utilized to test PARP inhibitors in BAP1 del PeM certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
DNA repair deficiency leads to the increased secretion of cytokines, including interferons that promote tumorantigen presentation, and trigger recruitment of both T and B lymphocytes to destroy tumor cells. As a response, tumor cells evade this immune-surveillance by increased expression of immune checkpoint receptors. The results presented here also indicate that PeM tumors are infiltrated with immune-cells from the tumor microenvironment.
Moreover, the BAP1 del subtype displays elevated levels of immune checkpoint receptor expression which strongly suggests the use of immune checkpoint inhibitors to treat this subtype of PeM. However, in a small subset of PM tumors in TCGA dataset, the loss of BAP1 did not elevate expression of immune checkpoint marker genes. This warrants further investigation on the characteristics of these groups of PM tumors. Furthermore, recently, BAP1 loss has been shown to define a distinct molecular subtype of clear cell renal cell carcinoma (ccRCC) and UM [33][34][35]. These studies showed that, similar to BAP1 del PeM subtype, BAP1 del tumors from both ccRCC and UM also have dysregulated chromatin modifiers, impaired DNA repair pathway, and immune checkpoint receptor activation. More recent studies in ccRCC [36] and melanoma [37] demonstrated that inactivation of PBRM1 (or PBAF complex) predicts response to immune checkpoint blocking therapies. Similarly, DNA repair defects have also been shown to be predictive of response to immune checkpoint blocking therapies [38][39][40]. This strongly indicates a pan-cancer mechanism of oncogenesis shared among tumors with BAP1 copy-number loss.
The main challenge in mesothelioma treatment is that, all current efforts made towards testing new therapy options are limited to using therapies that have been proven successful in other cancer types, without a good knowledge of underlying molecular mechanisms of the disease. As a result of sheer desperation, some patients have been treated even though no targeted therapy for mesothelioma has been proven effective as yet. For example, a number of clinical trials exploring the use of immune checkpoint inhibitors (anti-PD1/PD-L1 or anti-CTLA4) in PM and/or PeM patients that progressed under chemotherapy, and are positive for immune checkpoint markers are currently under progress. The results of the first few clinical trials report either very low response rate or no benefit to the patients [41][42][43][44]. Notably, BAP1 copy-number or mutation status were not assessed in these studies. We believe certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint 16 that response rates for immune checkpoint blockade therapies in clinical trials for PeM will improve when patients are segregated by their BAP1 copy-number status.

Conclusions
Our first-in-field multi-omics data underscore PeM tumors with BAP1 loss as a distinct molecular subtype that could potentially be targeted with immune checkpoint, PARP, or HDAC inhibitors. This dataset can be used to inform targeted therapies for almost half of PeM cases for which no definite therapies yet exist. Our findings uncover BAP1 as a trackable prognostic and predictive biomarker, and refine PeM disease classification.

Construction of tissue microarrays (TMAs)
FFPE tissue blocks were retrieved from the archives of the Department of Pathology, Vancouver General Hospital (Vancouver, Canada). H&E stained slides from each block were reviewed by two pathologists to identify tumor areas. TMAs were constructed with 1 mm diameter tissue cores from representative tumor areas from FFPE blocks.
Cores were transferred to a paraffin block using a semi-automated tissue array instrument (Pathology Devices TMArrayer, San Diego, CA). Duplicate tissue cores were taken from each specimen, resulting in a composite TMA block. Reactive mesothelial tissues from pleura were also included as benign controls. Following construction, 4µm thick sections were cut for H&E and immunohistochemical staining. For Ion AmpliSeq TM Exome Sequencing, 100ng of DNA based on Qubit ® dsDNA HS Assay (Thermo Fisher Scientific) quantitation was used as input for Ion AmpliSeq TM Exome RDY Library Preparation. This is a polymerase chain reaction (PCR)-based sequencing approach using 294,000 primer pairs (amplicon size range 225-275 bp), and covers >97% of Consensus CDS (CCDS; Release 12), >19,000 coding genes and >198,000 certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Somatic variant calling
Torrent Server (Thermo Fisher Scientific) was used for signal processing, base calling, read alignment, and generation of results files. Specifically, following sequencing, reads were mapped against the human reference genome hg19 using Torrent Mapping Alignment Program. The mean target coverage ranges from 78.62 to 226.44, thus sequencing depth ranges from 78 to 226X. Variants were identified by using Torrent Variant Caller plugin with the optimized parameters for AmpliSeq exome-sequencing recommended by Thermo Fisher. The variant call format (VCF) files from all sample were combined using GATK (3.2-2) [45] and all variants were annotated using ANNOVAR [46]. Only non-silent exonic variants including non-synonymous single nucleotide variations (SNVs), stop-codon gain SNVs, stop-codon loss SNVs, splice site SNVs and In-Dels in coding regions were kept if they were supported by more than 10 reads and had allele frequency higher than 10%. To obtain somatic variants, we filtered against dbSNP build 138 (non-flagged only) and the matched adjacent benign or blood samples sequenced in this study. Putative variants were manually scrutinized on the Binary Alignment Map (BAM) files through Integrative Genomics Viewer (IGV) version 2.3.25 [47].

Copy number aberration (CNA) analysis
Copy number changes were assessed using Nexus Copy Number Discovery Edition Version 9.0 (BioDiscovery, Inc., El Segundo, CA). Nexus NGS functionality (BAM ngCGH) with the FASST2 Segmentation algorithm was used to make copy number calls (a Circular Binary Segmentation/Hidden Markov Model approach). The significance threshold for segmentation was set at 5x10 -6 , also requiring a minimum of 3 probes per segment and a maximum probe spacing of 1000 between adjacent probes before breaking a segment. The log ratio thresholds for single copy gain and single copy loss were set at +0.2 and -0.2, respectively. The log ratio thresholds for gain certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint Molecular subtypes of malignant peritoneal mesothelioma 20 of 2 or more copies and for a homozygous loss were set at +0.6 and -1.0, respectively. Tumor sample BAM files were processed with corresponding normal tissue BAM files. Reference reads per CNA point (window size) was set at 8000. Probes were normalized to median. Relative copy number profiles from exome sequencing data were determined by normalizing tumor exome coverage to values from whole blood controls. The germline exome sequences were used to obtain allele-specific copy number profiles and generating segmented copy number profiles. The GISTIC module on Nexus identifies significantly amplified or deleted regions across the genome.
The amplitude of each aberration is assigned a G-score as well as a frequency of occurrence for multiple samples.
False Discovery Rate q-values for the aberrant regions have a threshold of 0.15. For each significant region, a "peak region" is identified, which is the part of the aberrant region with greatest amplitude and frequency of alteration. In addition, a "wide peak" is determined using a leave-one-out algorithm to allow for errors in the boundaries in a single sample. The "wide peak" boundaries are more robust for identifying the most likely gene targets in the region. Each significantly aberrant region is also tested to determine whether it results primarily from broad events (longer than half a chromosome arm), focal events, or significant levels of both. The GISTIC module reports the genomic locations and calculated q-values for the aberrant regions. It identifies the samples that exhibit each significant amplification or deletion, and it lists genes found in each "wide peak" region.

Whole transcriptome sequencing (RNA-seq)
Total RNA from 100µm sections of snap-frozen tissue was isolated using the mirVana Isolation Kit from Ambion (AM-1560). Strand specific RNA sequencing was performed on quality controlled high RIN value (>7) RNA samples (Bioanalyzer Agilent Technologies) before processing at the high throughput sequencing facility core at BGI Genomics Co., Ltd. (The Children's Hospital of Philadelphia, Pennsylvania, USA). In brief, 200ng of total DNAse treated RNA was first treated to remove the ribosomal RNA (rRNA) and then purified using the Agencourt RNA Clean XP Kit (Beckman Coulter) prior to analysis with the Agilent RNA 6000 Pico Chip to confirm rRNA removal. Next, the rRNA-depleted RNA was fragmented and converted to cDNA. Subsequent steps include end repair, addition of an `A' overhang at the 3' end, ligation of the indexing-specific adaptor, followed by purification with Agencourt Ampure XP beads. The strand specific RNA library prepared using TruSeq (Illumina Catalogue certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

RNA-seq quantification
Using splice-aware aligner STAR (2.3.1z) [48], RNA-seq reads (~200MB in size) were aligned onto the human genome reference (GRCh38) and exon-exon junctions, according to the known gene model annotation from the Ensembl release 80 (http://www.ensembl.org). Apart from protein coding genes, non-coding RNA types and pseudogenes are further annotated and classified. Based on the alignment and by using gene annotation (Ensembl release 80), gene expression profiles was calculated. Only reads unique to one gene and which corresponded exactly to one gene structure, were assigned to the corresponding genes by using the python tool HTSeq [49].
Normalization of read counts was conducted by R package DESeq [50], which was designed for gene expression analysis of RNA-seq data across different samples.

Identification of fusion transcripts and validation
We used the deFuse algorithm [22] to predict rearrangements in RNA sequence libraries. The deFuse fusion transcript prediction calls were further filtered using following criteria: a fusion gene candidate: (1) must be predicted to have arisen from genome rearrangement, rather than via a readthrough event; (2) must be predicted in no more than two sequence libraries; (3) must map unambiguously on both sides of the predicted breakpoints (that is, no multi-mapping reads); (4) must not map entirely to repetitive elements; (5) must be detected in >5 reads (either split or spanning) and (6) Table S9). PCR certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint 22 was performed in 20µl reactions using Q5 buffer (NEB), 0.2mM dNTPs, 0.4 µM each primer, 0.12 units Q5 High-Fidelity DNA Polymerase (NEB) and 2 µl of the RT reaction. The PCR reaction was carried out with the following program: 95°C, 30 seconds, followed by 30 cycles of 95°C for 10 seconds, 57°C for 20 seconds and 72°C for 10 seconds. Resulting PCR products, ranging in size from 150bp to 250bp, were purified using AMPure beads (Agencourt) and sequenced using Sanger sequencing to verify fusion junctions. scanning for TMT tags. A short description follows; more detailed overview is in [53]. Briefly, an in house packed reverse phase column run with a 2 hour low pH acetonitrile gradient (5-40% with 0.1% formic acid) was used to separate and introduce peptides into the MS. Survey scans covering m/z 350-1500 were acquired in profile mode at a resolution of 120,000 (at m/z 200) with S-Lens RF Level of 60%, a maximum fill time of 50 milliseconds, and automatic gain control (AGC) target of 4x10 5 . For MS2, monoisotopic precursor selection was enabled with triggering charge state limited to 2-5, threshold 5x10 3 and 10 ppm dynamic exclusion for 60 seconds.

Proteomics analysis using mass spectrometry
Centroided MS2 scans were acquired in in the ion trap in Rapid mode after CID fragmentation with a maximum certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint fill time of 20 milliseconds and 1 m/z isolation quadrupole isolation window, collision energy of 30%, activation Q of 0.25, injection for all available parallelizable time turned ON, and an AGC target value of 1x10 4 . For MS3, fragment ions were isolated from a 400-1200 m/z precursor range, ion exclusion of 20 m/z low and 5 m/z high, isobaric tag loss exclusion for TMT, with a top 10 precursor selection. Acquisition was in profile mode with the Orbitrap after HCD fragmentation (NCE 60%) with a maximum fill time of 90 milliseconds, 50,000 m/z resolution, 120-750 m/z scan range, an AGC target value of 1x10 5 , and all available parallelizable ON. The total allowable cycle time was set to 4 seconds.

Peptide identification and protein quantification
Qualitative and quantitative proteomics analysis was done using ProteomeDiscoverer 2.1.1.21 (Thermo Fisher Scientific). To maintain consistency with transcriptome annotation, we used Ensembl GRCh38.87 human reference proteome sequence database for proteome annotation. Sequest HT 1.3 was used for peptide-spectralmatches (PSMs), with parameters specified as trypsin enzyme, two missed cleavages allowed, minimum peptide length of 6, precursor mass tolerance 10 ppm, and a fragment mass tolerance of 0.6 Da. We allowed up to 4 variable modifications per peptide from the following categories: acetylation at protein terminus, methionine oxidation, and TMT label at N-terminal residues and the side chains of lysine residues. In addition, carbamidomethylation of cysteine was set as a fixed modification. PSM results were filtered using q-value cut off of 0.05 to control for false discovery rate (FDR) determined by Percolator. Identified peptides from both high and medium-confidence level after FDR-filtering were included in the final stage to provide protein identification and quantification results.
Reporter ions from MS3 scans were quantified with an integration tolerance of 20ppm with the most confident centroid. Proteins were further filtered to include only those found with minimum one peak in all samples.
Proteome Discoverer processed data was exported for further statistical analysis.

Mutational signature analysis
We used deconstructSigs [12], a multiple regression approach to statistically quantify the contribution of mutational signature for each tumor. The mutational signature were obtained from the COSMIC mutational certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint signature database [54]. Both silent and non-silent somatic mutations were used together to obtain the mutational signatures. Only mutational signatures with a weight more than 0.06 were considered for analysis.

Prioritization of driver genes using HIT'nDRIVE
Non-silent somatic mutation calls, CNA gain or loss, and gene-fusion calls were collapsed in gene-patient alteration matrix with binary labels. Gene-expression values were used to derive expression-outlier gene-patient outlier matrix using Generalized Extreme Studentized Deviate (GESD) test. STRING ver10 [55] proteininteraction network was used to compute pairwise influence value between the nodes in the interaction network.

Consensus clustering
We used ConsensusClusterPlus [56] R-package to perform consensus clustering. We used the following parameters: maximum cluster number to evaluate: 10, number of subsamples: 10000, proportion of items to sample: 0.8, proportion of features to sample: 1, cluster algorithm: hierarchical, distance: pearson.

Protein attenuation analysis
For every gene/protein profiled for CNA (segment mean), RNA-seq (normalized log2 expression), and MS (normalized log2 expression), we performed the following analysis. For every gene/protein, the Pearson correlation coefficients were calculated for CNA-mRNA expression ( _ ) and CNA-protein expression ( _ ). The 75 th percentile of the difference between the above two correlation coefficients i.e. = _ − _ was found to be approximately 0.45. Therefore those proteins with ≥ 0.45 were considered as attenuated proteins.

Pathway enrichment analysis
The selected set of differentially expressed genes were tested for enrichment against gene sets of pathways present in Molecular Signature Database (MSigDB) v6.0 [57]. A hypergeometric test based gene set enrichment analysis was used for this purpose (https://github.com/raunakms/GSEAFisher). A cut-off threshold of false discovery rate certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint (FDR) < 0.01 was used to obtain the significantly enriched pathways. Only pathways that are enriched with at least three differentially expressed genes were considered for further analysis. To calculate the pathway activity score, the expression dataset was transformed into standard normal distribution using 'inverse normal transformation' method. This step is necessary for fair comparison between the expression-values of different genes. For each sample, the pathway activity score is the mean expression level of the differentially expressed genes linked to the enriched pathway.

Stromal and immune score
We used two sets of 141 genes (one each for stromal and immune gene signatures) as described in [25]. We used 'inverse normal transformation' method to transform the distribution of expression data into the standard normal distribution. The stromal and immune scores were calculated, for each sample, using the summation of standard normal deviates of each gene in the given set.

Enumeration of tissue-resident immune cell types using mRNA expression profiles
CIBERSORT algorithm [26]  MuTect2, SomaticSniper and VArScan2) were considered. CNA segmented data were further processed using Nexus Copy Number Discovery Edition Version 9.0 (BioDiscovery, Inc., El Segundo, CA) to identify aberrant regions in the genome. In case of the RNA-seq expression data, HTSeq-FPKM-UQ normalized data were used. AACR Project GENIE [16] dataset was downloaded from https://www.synapse.org/#!Synapse:syn7222066 on April 2017.

Additional files
Additional file 1: Figure S1-S23. Figure S1. Pathology of Peritoneal Mesothelioma. Figure S2. Pathology of Peritoneal Mesothelioma. Figure S3. TMA slides of PeM tumors IHC stained for CK5. Figure S4. TMA slides of PeM tumors IHC stained for CALB2. Figure S5. Summary of non-silent somatic mutation landscape of PeM. Figure S6. Distribution of SNV nucleotide substitutions. Figure S7. Effect of BAP1 somatic mutation on resulting amino-acid chain. Figure S8. Effect of BAP1 somatic mutation on resulting amino-acid chain. Figure S9.
Common somatic mutated genes in mesothelioma patient cohorts. Figure S10. Correlation of copy-number with mutation and gene-expression profiles. Figure S11. TMA slides of PeM tumors IHC stained for BAP1. Figure   S12. Co-deletion of four cancer-associated genes in chromosomal region 3p21. Figure S13. Consensus clustering of copy-number segment mean profiles of PeM. Figure S14. Comparison of SCNA profile of peritoneal and pleural mesothelioma. Figure S15. Gene Fusions in PeM. Figure S16. Consensus clustering of mRNA expression profiles of PeM. Figure S17. Correlation between mRNA and protein expression profiles. Figure S18. Consensus clustering of mRNA expression profiles of PeM. Figure S19. Significant differentially expressed genes/proteins between molecular subtypes of PeM. Figure S20. Differentially expressed DNA-repair genes between PeM subtypes. Figure S21. TMA slides of BAP1del PeM tumors IHC stained for CD3 and CD8. Figure S22.
Differentially expressed immune genes between PeM subtypes. Figure S23. Immune cell infiltration in Pleural Mesothelioma.
Quality control statistics for WES data. Table S3. Somatic Mutation in PeM tumors. Table S4. Driver genes certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Differentially expressed CORUM complex measured using mRNA expression levels. Table S14. Differentially expressed CORUM complex measured using protein expression levels. Awards. The funding agencies had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Availability of data and materials
The whole-exome and whole-transcriptome sequencing data from this study is available in the European Genomephenome Archive (EGA; https://ega-archive.org/) under accession number EGAS00001002820. The proteome data from mass spectrometry is available in the PRIDE Archive (https://www.ebi.ac.uk/pride/archive/) under accession number PXD008873.

Ethics approval and consent to participate
This study was approved by the Institutional Review Board of the University of British Columbia and the Vancouver Coastal Health (REB Number. H1500902 and V15-00902). All samples and information were collected with written and signed informed consent from the participating patients.

Consent for publication
Not applicable.

Competing interests
Authors declare no competing interests. certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
The copyright holder for this preprint (which was not this version posted March 15, 2018. . https://doi.org/10.1101/243477 doi: bioRxiv preprint     between PeM subtypes. The bar plot on the right represents negative log10 of Wilcoxon signed-rank test p-value certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.