Large-scale public data reuse to model immunotherapy response and resistance
Genome Medicine volume 12, Article number: 21 (2020)
Despite growing numbers of immune checkpoint blockade (ICB) trials with available omics data, it remains challenging to evaluate the robustness of ICB response and immune evasion mechanisms comprehensively. To address these challenges, we integrated large-scale omics data and biomarkers on published ICB trials, non-immunotherapy tumor profiles, and CRISPR screens on a web platform TIDE (http://tide.dfci.harvard.edu). We processed the omics data for over 33K samples in 188 tumor cohorts from public databases, 998 tumors from 12 ICB clinical studies, and eight CRISPR screens that identified gene modulators of the anticancer immune response. Integrating these data on the TIDE web platform with three interactive analysis modules, we demonstrate the utility of public data reuse in hypothesis generation, biomarker optimization, and patient stratification.
Despite growing numbers of published immune checkpoint blockade (ICB) trials in different cancer types with available omics data and clinical outcomes, ICB response prediction remains an open question. Many published ICB response biomarkers were trained and tested on limited cohorts and have shown variable performance across cohorts. Moreover, because each clinical study has a limited data size, it is challenging to comprehensively evaluate the complexity of ICB response and immune evasion mechanisms. To address these challenges, we present a data-driven approach integrating large-scale omics data and biomarkers on published ICB trials, non-immunotherapy tumor profiles, and CRISPR screens on the web platform TIDE (http://tide.dfci.harvard.edu).
Previously, we developed TIDE as a transcriptome biomarker of ICB response by modeling tumor immune dysfunction and exclusion [1]. The statistical model of TIDE was trained on clinical tumor profiles without ICB treatment, since the immune evasion mechanisms in treatment-naïve tumors are also likely to influence patient response to immunotherapies. The TIDE model has been applied to evaluate T cell dysfunction and exclusion signatures across over 33K samples in 188 tumor cohorts from well-curated databases, including TCGA [2], METABRIC [3], and PRECOG [4], as well as our in-house collections. In the current work, we significantly expanded the scope of our previous work by incorporating many new datasets and function modules (Additional file 1: Table S1).
Construction and content
We processed the omics data for 998 tumors from 12 published ICB clinical studies (listed in Additional file 2: Table S2) and eight published CRISPR screens that identified genes modulating lymphocyte-mediated cancer killing and immunotherapy response [5,6,7,8]. The clinical study data from ICB-naïve cohorts include over 33K samples in 188 tumor cohorts from well-curated databases, including TCGA [2], METABRIC [3], and PRECOG [4]. We stored these data in a MySQL database and built the web platform on the Django 3.0 framework. The platform provides three interactive modules for hypothesis generation, biomarker optimization, and patient stratification (Fig. 1).
Utility and discussion
Gene set prioritization module
The first module of the TIDE web platform helps cancer biologists prioritize genes in an input gene set for mechanistic follow-up experiments (Fig. 1A). Typically, a genomic experiment, often conducted on model systems with limited sample sizes, yields tens to hundreds of gene hits. The large-scale omics data and clinical cohorts collected in TIDE enable cancer biologists to focus on the genes with the highest clinical relevance and the most consistent behavior in similar experiments. For any gene set, a cancer biologist can use this module to evaluate each gene for its expression association with ICB response outcome, T cell dysfunction levels, T cell exclusion levels, and phenotypes in genetic screens across diverse cohorts. To probe a candidate gene further, the user can also submit a single gene as a query to evaluate how the expression, copy number, somatic mutation, and DNA methylation levels of this gene influence clinical outcome in all collected datasets. By integrating many independent cohorts, the prioritization module can therefore help identify genes with improved robustness and clinical relevance.
To demonstrate the gene set prioritization module, we queried 696 druggable genes annotated by the OASIS database [9] to find potential therapeutic targets that synergize with ICB (Fig. 2). For example, AXL, a Tyro3/Axl/Mer family receptor tyrosine kinase, is among the top targets ranked by this module as rendering the tumor microenvironment resistant to ICB. High AXL expression is associated with T cell dysfunction phenotypes in all datasets enumerated (Fig. 2 left panel). High expression of AXL is also associated with worse ICB outcome in bladder cancer and treatment-naïve melanoma treated with ICB (Fig. 2 second to left panel). Among the cell types promoting T cell exclusion, both myeloid-derived suppressor cells and cancer-associated fibroblasts have very high AXL expression levels (Fig. 2 right panel). Indeed, in a recent clinical trial (NCT03184571), the combination of an AXL inhibitor and anti-PD1 has shown promising efficacy among AXL-positive non-small cell lung cancer patients [10]. Hence, this module can prioritize genes with the best potential for developing combination immunotherapies.
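The cross-cohort idea behind the prioritization module can be illustrated with a minimal sketch. It assumes each gene already has one association z-score per cohort; the function name `rank_genes`, the toy gene names and values, and the choice to rank by the mean z-score are all illustrative assumptions, not the TIDE implementation.

```python
from statistics import mean


def rank_genes(zscores_by_gene):
    """Rank genes by mean association z-score across cohorts (highest first).

    zscores_by_gene maps a gene name to a list of z-scores, one per cohort.
    None marks a cohort in which the gene was not measured.
    """
    summary = []
    for gene, scores in zscores_by_gene.items():
        observed = [s for s in scores if s is not None]
        if not observed:
            continue  # gene not measured in any cohort: nothing to rank
        avg = mean(observed)
        # Fraction of cohorts whose sign matches the average direction
        agree = sum((s > 0) == (avg > 0) for s in observed) / len(observed)
        summary.append((gene, avg, agree))
    return sorted(summary, key=lambda row: row[1], reverse=True)


# Hypothetical z-scores in three cohorts (illustrative values only)
toy = {
    "GENE_A": [2.1, 1.8, 2.5],
    "GENE_B": [0.4, -0.6, None],
    "GENE_C": [-1.9, -2.2, -1.5],
}
ranked = rank_genes(toy)
for gene, avg, agree in ranked:
    print(f"{gene}\tmean_z={avg:.2f}\tsign_agreement={agree:.2f}")
```

Reporting the sign agreement next to the mean separates genes that behave consistently across cohorts from genes whose average is driven by a single dataset.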
Biomarker evaluation module
The second module allows translational scientists to evaluate the accuracy of their biomarkers on many ICB cohorts in comparison with other published biomarkers (Fig. 1B). We implemented eight published ICB response biomarkers and applied them to our collection of published ICB trial samples. For a user-defined custom biomarker, which can be a gene set or weighted gene score vector, this module calculates the biomarker expression level in all ICB cohorts. The module displays the comparison between the custom biomarker and other published biomarkers based on their predictive power of response outcome and overall survival.
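As a rough illustration of how a custom biomarker could be scored on one tumor profile, the sketch below treats a gene set as a weight vector with +1 per gene and computes a weighted average over the signature genes found in the sample. The function names, the toy expression values, and the averaging over matched genes are assumptions made for illustration; the source does not specify the exact formula used on the platform.

```python
def biomarker_score(expression, weights):
    """Weighted average of expression over signature genes present in the sample."""
    shared = [gene for gene in weights if gene in expression]
    if not shared:
        raise ValueError("no signature gene found in the expression profile")
    return sum(weights[g] * expression[g] for g in shared) / len(shared)


def as_weights(gene_set):
    """Treat an unweighted gene set as a vector with weight +1 per gene."""
    return {gene: 1.0 for gene in gene_set}


# Hypothetical normalized expression values for one tumor
sample = {"G1": 1.5, "G2": -0.5, "G3": 2.0}

# Weighted signature: positive weight = expected to favor response (assumed)
signature = {"G1": 1.0, "G2": -1.0, "G4": 1.0}  # G4 is absent from the sample

score_weighted = biomarker_score(sample, signature)
score_set = biomarker_score(sample, as_weights(["G1", "G3"]))
print(score_weighted, score_set)
```

Skipping genes that are missing from a profile keeps the score computable on cohorts profiled with different platforms, at the cost of comparing slightly different gene subsets between cohorts.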
To demonstrate the biomarker evaluation module, we tested one biomarker containing seven genes with previously reported associations with tumor immune evasion (Additional file 3: Table S3). These genes were weighted by their reported direction of mediating anticancer immune response. This example biomarker gave an area under the receiver operating characteristic curve (AUC) greater than 0.5 in 12 out of the 16 ICB sub-cohorts (Fig. 3), suggesting that it is a robust predictive biomarker. This signature was also significantly associated with prolonged survival in two sub-cohorts (Fig. 4, two-sided Cox-PH p value < 0.05). In contrast, several recently published biomarkers trained on limited clinical cohorts have shown significant performance variations in other cohorts (Additional file 4: Figure S1), underscoring the importance of evaluating biomarker robustness across all available cohorts.
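The per-cohort AUC comparison can be sketched as follows. The AUC is computed here through its rank interpretation, the probability that a randomly chosen responder scores higher than a randomly chosen non-responder, with ties counted as one half. The cohort names and values are made up for illustration and do not come from the ICB studies.

```python
def roc_auc(scores, responded):
    """AUC as the probability that a responder outscores a non-responder.

    Ties between a responder and a non-responder count as 0.5.
    """
    pos = [s for s, r in zip(scores, responded) if r]
    neg = [s for s, r in zip(scores, responded) if not r]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical biomarker scores and response labels in three sub-cohorts
cohorts = {
    "cohort_1": ([0.9, 0.7, 0.4, 0.2], [True, True, False, False]),
    "cohort_2": ([0.3, 0.8, 0.6, 0.1], [True, False, True, False]),
    "cohort_3": ([0.2, 0.5, 0.5, 0.9], [True, False, True, False]),
}
aucs = {name: roc_auc(s, r) for name, (s, r) in cohorts.items()}

# Number of sub-cohorts in which the biomarker performs better than chance
n_above_chance = sum(auc > 0.5 for auc in aucs.values())
print(aucs, n_above_chance)
```

Counting the cohorts with AUC above 0.5 mirrors the "12 out of the 16" style of summary: it rewards consistency across cohorts and is not inflated by one cohort with a very high AUC.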
Biomarker consensus module
The third module, biomarker consensus, aids oncologists in predicting whether a patient will respond to ICB therapy based on multiple biomarkers (Fig. 1C). Based on pre-treatment tumor expression profiles, oncologists could use this TIDE module and multiple published transcriptomic biomarkers (Additional file 4: Supplementary Methods) to predict patient response and potentially make informed treatment decisions. Notably, in the second and third TIDE modules, we evaluated only transcriptomic biomarkers and not mutation biomarkers, for the following reasons. The results of tumor mutation analyses might be influenced by different experimental platforms (whole genome versus custom panel), sample types (FFPE versus fresh frozen), and computational mutation callers. Although tumor mutation burden (TMB) seems to be a consistent ICB response biomarker, the computation of TMB across different cohorts and platforms is still an open question.
To demonstrate the biomarker consensus module, we uploaded the pre-treatment expression matrix of a melanoma cohort [11] treated with anti-PD1 therapy (Table 1). Patients with favorable predictions from multiple biomarkers are highly likely to be responders. For example, the tumor of patient 2 has a negative TIDE score, indicating a lack of tumor immune evasion phenotypes. In addition, the tumor of patient 2 has positive scores for the interferon-gamma (IFNG) signature, microsatellite instability (MSI), and PDL1 (CD274) levels, all of which are positive biomarkers of ICB response. With support from multiple markers, an oncologist could be more confident that patient 2 will respond to anti-PD1, and indeed patient 2 is a responder in the original study [11]. In contrast, this module also reported some patients who are unlikely to benefit from ICB (Table 1). For example, the tumor of patient 10 has a high TIDE score and low IFNG, MSI, and PDL1 levels. Based on the predictions from multiple biomarkers, an oncologist might predict patient 10 to be a non-responder and select an alternative therapy, and indeed patient 10 failed to benefit from anti-PD1 [11]. TIDE also showed that the tumor of patient 10 has a significant enrichment of the T cell exclusion signature due to high infiltration of myeloid-derived suppressor cells (MDSCs) and cancer-associated fibroblasts (CAFs). Therefore, elimination of MDSCs and CAFs might be needed for patient 10 to respond to ICB. In summary, by presenting the predictions from multiple biomarkers in one integrated platform, the biomarker consensus module can potentially inform oncologists on treatment decisions.
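The consensus reasoning above can be sketched as a simple vote count. The directions follow the text (a negative TIDE score and positive IFNG, MSI, and CD274 scores are favorable), while the cutoff of zero, the function name `consensus`, and the patient values are illustrative assumptions and are not taken from Table 1.

```python
# Direction in which each score favors response: -1 means lower is better.
FAVORABLE_SIGN = {"TIDE": -1, "IFNG": +1, "MSI": +1, "CD274": +1}


def consensus(scores, cutoff=0.0):
    """Return (number of favorable biomarkers, their names) for one patient.

    The cutoff of 0 is an assumed threshold for this sketch.
    """
    favorable = [
        name
        for name, sign in FAVORABLE_SIGN.items()
        if name in scores and sign * (scores[name] - cutoff) > 0
    ]
    return len(favorable), favorable


# Hypothetical patients (values invented for illustration)
patients = {
    "patient_A": {"TIDE": -1.2, "IFNG": 0.8, "MSI": 0.3, "CD274": 1.1},
    "patient_B": {"TIDE": 2.0, "IFNG": -0.9, "MSI": -0.4, "CD274": -0.7},
}
votes = {pid: consensus(s)[0] for pid, s in patients.items()}
for pid, n in votes.items():
    print(f"{pid}: {n} of {len(FAVORABLE_SIGN)} biomarkers favorable")
```

A plain vote count treats every biomarker as equally informative; a real decision aid would likely need to weight biomarkers by their validated performance in the relevant cancer type.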
In conclusion, we present the TIDE web platform to infer gene functions in modulating tumor immunity and to evaluate biomarkers for predicting ICB clinical response. Our work underlines the value of data sharing for published trials and code sharing for published biomarkers. Notably, several published ICB clinical studies have not released their omics data or clinical data (Additional file 2: Table S2), and we hope their authors will release these data as an invaluable resource for the whole research community. As immunotherapy data become increasingly available, we foresee the TIDE web platform bringing increasing value to mechanistic studies in cancer immunology and to biomarker discovery in immuno-oncology.
Availability of data and materials
All the processed data can be accessed at http://tide.dfci.harvard.edu/. We collected ICB-naïve cancer datasets with both patient survival durations and tumor gene expression profiles from the TCGA [2], METABRIC [3], and PRECOG [4] databases. Following the access instructions described in the published ICB studies (Additional file 2: Table S2), we downloaded the raw RNA-Seq data, clinical outcome information, and response outcome information of ICB patients (if available). The raw count tables and meta-information of eight published CRISPR screens [5,6,7,8] were also obtained from the original studies. The list of genes with launched drugs, collected from the OASIS database [9], is available in Additional file 5: Table S4. The literature support for the transcriptomic biomarkers is available in Additional file 6: Table S5.
CRISPR: Clustered regularly interspaced short palindromic repeats
ICB: Immune checkpoint blockade
TIDE: Tumor Immune Dysfunction and Exclusion
1. Jiang P, Gu S, Pan D, Fu J, Sahu A, Hu X, Li Z, Traugh N, Bu X, Li B, et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat Med. 2018;24:1550–8.
2. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013;45:1113–20.
3. Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, Speed D, Lynch AG, Samarajiwa S, Yuan Y, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486:346–52.
4. Gentles AJ, Newman AM, Liu CL, Bratman SV, Feng W, Kim D, Nair VS, Xu Y, Khuong A, Hoang CD, et al. The prognostic landscape of genes and infiltrating immune cells across human cancers. Nat Med. 2015;21:938–45.
5. Pan D, Kobayashi A, Jiang P, Ferrari de Andrade L, Tay RE, Luoma AM, Tsoucas D, Qiu X, Lim K, Rao P, et al. A major chromatin regulator determines resistance of tumor cells to T cell-mediated killing. Science. 2018;359:770–5.
6. Kearney CJ, Vervoort SJ, Hogg SJ, Ramsbottom KM, Freeman AJ, Lalaoui N, Pijpers L, Michie J, Brown KK, Knight DA, et al. Tumor immune evasion arises through loss of TNF sensitivity. Sci Immunol. 2018;3:29776993.
7. Patel SJ, Sanjana NE, Kishton RJ, Eidizadeh A, Vodnala SK, Cam M, Gartner JJ, Jia L, Steinberg SM, Yamamoto TN, et al. Identification of essential genes for cancer immunotherapy. Nature. 2017;548:537–42.
8. Manguso RT, Pope HW, Zimmer MD, Brown FD, Yates KB, Miller BC, Collins NB, Bi K, LaFleur MW, Juneja VR, et al. In vivo CRISPR screening identifies Ptpn2 as a cancer immunotherapy target. Nature. 2017;547:413–8.
9. Fernandez-Banet J, Esposito A, Coffin S, Horvath IB, Estrella H, Schefzick S, Deng S, Wang K, AChing K, Ding Y, et al. OASIS: web-based platform for exploring cancer multi-omics data. Nat Methods. 2016;13:9–10.
10. Felip E, Brunsvig P, Vinolas N, Aix SP, Costa EC, Gomez MD, Perez JMT, Arriola E, Campelo RG, Spicer JF, et al. A phase II study of bemcentinib (BGB324), a first-in-class highly selective AXL inhibitor, with pembrolizumab in pts with advanced NSCLC: OS for stage I and preliminary stage II efficacy. J Clin Oncol. 2019;37:9098.
11. Hugo W, Zaretsky JM, Sun L, Song C, Moreno BH, Hu-Lieskovan S, Berent-Maoz B, Pang J, Chmielowski B, Cherry G, et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell. 2016;165:35–44.
The authors thank the authors of the published studies for sharing their data on tumor profiling cohorts (especially PRECOG), CRISPR screens, and immunotherapy trials.
The research was supported by the Cancer Immunologic Data Commons grant (1U24CA224316-01) of the National Cancer Institute (NCI), the Pathway to Independence Award (1K99CA218900-01) of the NCI (to P.J.), and the Partnership for Accelerating Cancer Therapies grant from the Foundation for the National Institutes of Health (to X.S.L.).
Ethics approval and consent to participate
Consent for publication
X.S.L. is a co-founder, board member, and Scientific Advisor of GV20 Oncotherapy and is on the Scientific Advisory Board of 3DMed Care. The remaining authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1: Table S1. New function modules and datasets in the TIDE web server.
Additional file 2: Table S2. Data availability of published ICB studies.
Additional file 3: Table S3. An example gene set related to immunotherapy response.
Additional file 4: Figure S1. The prediction performance of recently published biomarkers varies across data cohorts; and Supplementary Methods.
Additional file 5: Table S4. Genes with approved drugs.
Additional file 6: Table S5. Publications for public immunotherapy biomarkers.
Cite this article
Fu, J., Li, K., Zhang, W. et al. Large-scale public data reuse to model immunotherapy response and resistance. Genome Med 12, 21 (2020). https://doi.org/10.1186/s13073-020-0721-z
- Immune evasion
- Data integration
- Web platform