Large-scale public data reuse to model immunotherapy response and resistance

Fu, Jingxin; Li, Karen; Zhang, Wubing; Wan, Changxin; Zhang, Jing; Jiang, Peng; Liu, X. Shirley

doi:10.1186/s13073-020-0721-z

Database
Open access
Published: 26 February 2020

Jingxin Fu^1,2,3,
Karen Li^4,
Wubing Zhang^1,2,
Changxin Wan^1,2,
Jing Zhang^3,
Peng Jiang^2,5 (ORCID: orcid.org/0000-0002-7828-5486) &
…
X. Shirley Liu^2

Genome Medicine volume 12, Article number: 21 (2020)

Abstract

Despite growing numbers of immune checkpoint blockade (ICB) trials with available omics data, it remains challenging to evaluate the robustness of ICB response and immune evasion mechanisms comprehensively. To address these challenges, we integrated large-scale omics data and biomarkers on published ICB trials, non-immunotherapy tumor profiles, and CRISPR screens on a web platform TIDE (http://tide.dfci.harvard.edu). We processed the omics data for over 33K samples in 188 tumor cohorts from public databases, 998 tumors from 12 ICB clinical studies, and eight CRISPR screens that identified gene modulators of the anticancer immune response. Integrating these data on the TIDE web platform with three interactive analysis modules, we demonstrate the utility of public data reuse in hypothesis generation, biomarker optimization, and patient stratification.

Background

Despite growing numbers of published immune checkpoint blockade (ICB) trials in different cancer types with available omics data and clinical outcomes, ICB response prediction remains an open question. Many published ICB response biomarkers have been trained and tested on limited cohorts and show variable performance across cohorts. Moreover, because each clinical study has limited data, it is challenging to comprehensively evaluate the complexity of ICB response and immune evasion mechanisms. To address these challenges, we present a data-driven approach that integrates large-scale omics data and biomarkers from published ICB trials, non-immunotherapy tumor profiles, and CRISPR screens on the web platform TIDE (http://tide.dfci.harvard.edu).

Previously, we developed TIDE as a transcriptome biomarker of ICB response by modeling tumor immune dysfunction and exclusion [1]. The statistical model of TIDE was trained on clinical tumor profiles without ICB treatments since the immune evasion mechanisms in treatment-naïve tumors are also likely to influence patient response to immunotherapies. The TIDE model has been applied to evaluate T cell dysfunction and exclusion signatures across over 33K samples in 188 tumor cohorts from well-curated databases, including TCGA [2], METABRIC [3], and PRECOG [4], as well as our in-house collections. In the current work, we significantly expanded the scope of our previous work by incorporating many new datasets and function modules (Additional file 1: Table S1).

Construction and content

We processed the omics data for 998 tumors from 12 published ICB clinical studies (listed in Additional file 2: Table S2) and eight published CRISPR screens that identified genes modulating lymphocyte-mediated cancer killing and immunotherapy response [5,6,7,8]. The clinical study data from ICB-naïve cohorts include over 33K samples in 188 tumor cohorts from well-curated databases, including TCGA [2], METABRIC [3], and PRECOG [4]. We integrated these data on the TIDE web platform using a MySQL database. The web platform is built on the Django 3.0 framework. We provide three interactive modules for hypothesis generation, biomarker optimization, and patient stratification (Fig. 1).

Utility and discussion

Gene set prioritization module

The first module of the TIDE web platform helps cancer biologists prioritize genes in their input gene set for mechanistic follow-up experiments (Fig. 1A). Typically, a genomic experiment, often conducted on model systems with limited sample sizes, yields tens to hundreds of gene hits. The large-scale omics data and clinical cohorts collected in TIDE enable cancer biologists to focus on the genes with the highest clinical relevance and consistent behavior in other similar experiments. For any gene set, a cancer biologist can use this module to evaluate each gene for the association of its expression with ICB response outcome, T cell dysfunction levels, T cell exclusion levels, and phenotypes in genetic screens across diverse cohorts. To probe a candidate gene further, the user can also submit a single gene as a query to evaluate how the expression, copy number, somatic mutation, and DNA methylation levels of this gene influence clinical outcome in all collected datasets. By integrating many independent cohorts, the prioritization module can therefore help identify genes with improved robustness and clinical relevance.

To demonstrate an example usage of the gene set prioritization module, we queried 696 druggable genes annotated by the OASIS database [9] to find potential therapeutic targets that act in synergy with ICB (Fig. 2). For example, AXL, a Tyro3/Axl/Mer family receptor tyrosine kinase, is among the top genes that this module ranks as rendering the tumor microenvironment resistant to ICB. High AXL expression is associated with T cell dysfunction phenotypes in all datasets enumerated (Fig. 2 left panel). Meanwhile, high expression of AXL is also associated with worse ICB outcome in bladder cancer and treatment-naïve melanoma treated with ICB (Fig. 2 second to left panel). Among the cell types promoting T cell exclusion, both myeloid-derived suppressor cells and cancer-associated fibroblasts have very high AXL expression levels (Fig. 2 right panel). Indeed, in a recent clinical trial (NCT03184571), the combination of an AXL inhibitor and anti-PD1 has shown promising efficacy among AXL-positive non-small cell lung cancer patients [10]. Hence, this module can prioritize genes with the best potential for developing combination immunotherapies.
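The idea of ranking genes by consistent evidence across several association types can be pictured with a small sketch. Everything in it is an assumption made for illustration: the placeholder gene names other than AXL, the z-score values (made-up numbers, not results from TIDE), the column names, and the mean-based ranking rule, which is not claimed to be the scoring used by the platform.

```python
from statistics import mean

# Toy association z-scores per gene (made-up numbers, for illustration only).
# In this toy convention, a positive value means high expression goes with
# T cell dysfunction, T cell exclusion, or worse ICB outcome.
toy_scores = {
    "AXL":    {"dysfunction": 2.1, "exclusion": 1.8, "icb_outcome": 1.5},
    "GENE_B": {"dysfunction": 0.4, "exclusion": -0.9, "icb_outcome": 0.2},
    "GENE_C": {"dysfunction": -1.2, "exclusion": -0.7, "icb_outcome": -1.0},
}


def rank_genes(scores):
    """Return gene names sorted by mean score across evidence columns, highest first."""
    averaged = {gene: mean(columns.values()) for gene, columns in scores.items()}
    return sorted(averaged, key=averaged.get, reverse=True)


ranking = rank_genes(toy_scores)
print(ranking)
```

A simple mean rewards genes whose evidence agrees in sign across columns; a real analysis would also need to account for cohort size and multiple testing.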

Biomarker evaluation module

The second module allows translational scientists to evaluate the accuracy of their biomarkers on many ICB cohorts in comparison with other published biomarkers (Fig. 1B). We implemented eight published ICB response biomarkers and applied them to our collection of published ICB trial samples. For a user-defined custom biomarker, which can be a gene set or weighted gene score vector, this module calculates the biomarker expression level in all ICB cohorts. The module displays the comparison between the custom biomarker and other published biomarkers based on their predictive power of response outcome and overall survival.
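A custom biomarker given as a gene set or as a weighted gene score vector can be reduced to one score per sample. The sketch below shows one plausible way to do this, assuming a weighted average over the signature genes found in the sample; the function names, the unit weights for plain gene sets, and the normalization by total absolute weight are assumptions, not the platform's documented computation.

```python
def gene_set_to_weights(genes):
    """Treat a plain gene set as a weight vector with weight 1 for every gene."""
    return {gene: 1.0 for gene in genes}


def biomarker_score(expression, weights):
    """Weighted average expression over the signature genes present in the sample.

    expression: dict mapping gene symbol -> normalized expression value
    weights:    dict mapping gene symbol -> signed weight
    """
    shared = [gene for gene in weights if gene in expression]
    total_weight = sum(abs(weights[gene]) for gene in shared)
    if total_weight == 0:
        raise ValueError("no weighted signature genes found in expression profile")
    return sum(weights[gene] * expression[gene] for gene in shared) / total_weight


# Hypothetical two-gene signature and one hypothetical sample.
weights = {"GENE_UP": 1.0, "GENE_DOWN": -1.0}
sample = {"GENE_UP": 2.0, "GENE_DOWN": 0.5, "OTHER": 3.0}
score = biomarker_score(sample, weights)
print(score)
```

Genes missing from a sample are skipped rather than treated as zero, so the score stays comparable between cohorts profiled on different platforms.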

To demonstrate an example usage of the biomarker evaluation module, we tested one biomarker containing seven genes with previously reported associations with tumor immune evasion (Additional file 3: Table S3). These genes were weighted by their reported direction of mediating anticancer immune response. This example biomarker gave an area under the receiver operating characteristic curve (AUC) greater than 0.5 in 12 out of the 16 ICB sub-cohorts (Fig. 3), suggesting that it is a robust predictive biomarker. This signature was also significantly associated with prolonged survival in two sub-cohorts (Fig. 4, two-sided Cox-PH p value < 0.05). In contrast, several recently published biomarkers trained on limited clinical cohorts have shown significant performance variations in other cohorts (Additional file 4: Figure S1), underscoring the importance of evaluating biomarker robustness across all available cohorts.
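The cross-cohort comparison above amounts to computing an AUC per cohort and counting the cohorts in which the AUC exceeds 0.5. A minimal sketch follows, using the rank-based definition of AUC (the probability that a randomly chosen responder scores higher than a randomly chosen non-responder). The cohort names and values are toy data, and the assumption that a higher score predicts response is a convention of this example only.

```python
def roc_auc(scores, responded):
    """AUC as the fraction of responder/non-responder pairs ranked correctly.

    Ties between a responder and a non-responder count as half a correct pair.
    """
    pos = [s for s, r in zip(scores, responded) if r]
    neg = [s for s, r in zip(scores, responded) if not r]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def count_informative(cohorts, cutoff=0.5):
    """Number of cohorts whose AUC exceeds the cutoff."""
    return sum(roc_auc(scores, labels) > cutoff for scores, labels in cohorts.values())


# Toy cohorts: (biomarker scores, response labels). Values are made up.
toy_cohorts = {
    "cohort_a": ([0.9, 0.8, 0.3, 0.1], [True, True, False, False]),
    "cohort_b": ([0.2, 0.7, 0.6, 0.4], [True, False, True, False]),
}
n_informative = count_informative(toy_cohorts)
print(n_informative, "of", len(toy_cohorts), "cohorts have AUC > 0.5")
```

An AUC above 0.5 in a small cohort can still arise by chance, so a count such as this is best read together with cohort sizes.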

Biomarker consensus module

The third module, biomarker consensus, aids oncologists in predicting whether a patient will respond to ICB therapy based on multiple biomarkers (Fig. 1C). Based on pre-treatment tumor expression profiles, oncologists could use this TIDE module and multiple published transcriptomic biomarkers (Additional file 4: Supplementary Methods) to predict patient response and potentially make informed treatment decisions. Notably, in the second and third TIDE modules, we focused on evaluating transcriptomic biomarkers rather than mutation biomarkers for the following reasons. The results of tumor mutation analyses might be influenced by different experimental platforms (whole genome versus custom panel), sample types (FFPE versus fresh frozen), and computational mutation callers. Although tumor mutation burden (TMB) seems to be a consistent ICB response biomarker, the computation of TMB across different cohorts and platforms is still an open question.

To demonstrate an example usage of the biomarker consensus module, we uploaded the pre-treatment expression matrix of a melanoma cohort [11] treated with anti-PD1 therapy (Table 1). Patients with favorable predictions from multiple biomarkers are highly likely to be responders. For example, the tumor of patient 2 has a negative TIDE score, indicating a lack of tumor immune evasion phenotypes. In addition, the tumor of patient 2 has positive scores for the interferon-gamma (IFNG) signature, microsatellite instability (MSI), and PDL1 (CD274) levels, all of which are positive biomarkers of ICB response. With the support of multiple markers, an oncologist could be more confident that patient 2 will respond to anti-PD1, and indeed patient 2 is a responder in the original study [11]. In contrast, this module also reported some patients who are unlikely to benefit from ICB (Table 1). For example, the tumor of patient 10 has a high TIDE score and low IFNG, MSI, and PDL1 levels. Based on the predictions from multiple biomarkers, an oncologist might predict patient 10 to be a non-responder and select an alternative therapy, and indeed patient 10 failed to benefit from anti-PD1 [11]. TIDE also showed that the tumor of patient 10 has a significant enrichment of the T cell exclusion signature due to high infiltration of myeloid-derived suppressor cells (MDSC) and cancer-associated fibroblasts (CAF). Therefore, elimination of MDSC and CAF might be needed for patient 10 to respond to ICB. In summary, by presenting the predictions from multiple biomarkers in one integrated platform, the biomarker consensus module can potentially inform oncologists on treatment decisions.

Table 1 Response prediction output from the biomarker consensus module. The uploaded expression profile comes from a previous study of anti-PD1 response in melanoma [11] (“example 1” on the TIDE website). Rows are ranked in ascending order of TIDE score. Actual Responder: the actual clinical outcome in the study; Predicted Responder: prediction by the threshold of the TIDE score set by the user (default is 0); TIDE: TIDE prediction score [1]; IFNG: average expression of the interferon-gamma response signature; MSI Score: microsatellite instability score predicted through gene expression (Additional file 4: Supplementary Methods); CD274: gene expression value of PD-L1; CD8: average gene expression of CD8A and CD8B; CTL.flag: flag indicating whether the gene expression values are all positive for five cytotoxic T lymphocyte markers (CD8A, CD8B, GZMA, GZMB, and PRF1); Dysfunction, Exclusion, MDSC, CAF, TAM M2: enrichment scores based on the gene expression signatures of T cell dysfunction, T cell exclusion, myeloid-derived suppressor cells, cancer-associated fibroblasts, and M2-type tumor-associated macrophages [1].
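Several columns in the Table 1 legend are simple functions of a patient's expression profile and TIDE score, and they can be sketched as follows. This is an illustration under stated assumptions: the patient identifiers and all numeric values are made up, expression values are assumed to be normalized so that their sign is meaningful, and calling a patient a predicted responder when the TIDE score is strictly below the threshold is an assumption consistent with the text (a negative TIDE score indicating a lack of immune evasion), not a documented rule.

```python
CTL_MARKERS = ("CD8A", "CD8B", "GZMA", "GZMB", "PRF1")


def summarize_patient(expr, tide_score, threshold=0.0):
    """Derive a few Table 1 style columns for one patient."""
    return {
        "TIDE": tide_score,
        # Assumed rule: scores below the user threshold are predicted responders.
        "Predicted Responder": tide_score < threshold,
        "CD274": expr["CD274"],
        "CD8": (expr["CD8A"] + expr["CD8B"]) / 2,
        # True only if all five cytotoxic T lymphocyte markers are positive.
        "CTL.flag": all(expr[gene] > 0 for gene in CTL_MARKERS),
    }


def build_table(patients, threshold=0.0):
    """Return one row per patient, ranked in ascending order of TIDE score."""
    rows = [
        dict(summarize_patient(p["expr"], p["tide"], threshold), patient=name)
        for name, p in patients.items()
    ]
    return sorted(rows, key=lambda row: row["TIDE"])


# Hypothetical patients with made-up values.
toy_patients = {
    "P_high": {"tide": 1.2, "expr": {"CD8A": -0.5, "CD8B": -0.3, "GZMA": 0.1,
                                     "GZMB": -0.2, "PRF1": -0.4, "CD274": -0.6}},
    "P_low": {"tide": -0.8, "expr": {"CD8A": 1.0, "CD8B": 0.5, "GZMA": 0.7,
                                     "GZMB": 0.9, "PRF1": 0.4, "CD274": 1.1}},
}
table = build_table(toy_patients)
for row in table:
    print(row)
```

Raising or lowering `threshold` changes only the Predicted Responder column, which mirrors the user-set TIDE threshold described in the legend.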

Conclusions

In conclusion, we present the TIDE web platform for inferring gene functions in modulating tumor immunity and for evaluating biomarkers that predict ICB clinical response. Our work underlines the value of sharing data from published trials and code for published biomarkers. Notably, several published ICB clinical studies have not released their omics or clinical data (Additional file 2: Table S2), and we hope their authors will release these data to provide an invaluable resource to the whole research community. As immunotherapy data become increasingly available, we expect the TIDE web platform to become increasingly valuable for mechanistic studies in cancer immunology and biomarker discovery in immuno-oncology.

Availability of data and materials

All the processed data can be accessed at http://tide.dfci.harvard.edu/. We collected ICB-naïve cancer datasets with both patient survival durations and tumor gene expression profiles from the TCGA [2], METABRIC [3], and PRECOG [4] databases. Following the access instructions described in published ICB studies (Additional file 2: Table S2), we downloaded the RNA-Seq raw sequencing data, clinical outcome information, and response outcome information of ICB patients from these studies (if available). The raw count tables and meta-information of eight published CRISPR screens [5,6,7,8] were also obtained from the original studies. The list of genes with launched drugs, collected from the OASIS database [9], is available in Additional file 5: Table S4. The literature support for transcriptomic biomarkers is available in Additional file 6: Table S5.

Abbreviations

CRISPR: Clustered regularly interspaced short palindromic repeats
ICB: Immune checkpoint blockade
TIDE: Tumor Immune Dysfunction and Exclusion

References

1. Jiang P, Gu S, Pan D, Fu J, Sahu A, Hu X, Li Z, Traugh N, Bu X, Li B, et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat Med. 2018;24:1550–8.
2. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013;45:1113–20.
3. Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, Speed D, Lynch AG, Samarajiwa S, Yuan Y, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486:346–52.
4. Gentles AJ, Newman AM, Liu CL, Bratman SV, Feng W, Kim D, Nair VS, Xu Y, Khuong A, Hoang CD, et al. The prognostic landscape of genes and infiltrating immune cells across human cancers. Nat Med. 2015;21:938–45.
5. Pan D, Kobayashi A, Jiang P, Ferrari de Andrade L, Tay RE, Luoma AM, Tsoucas D, Qiu X, Lim K, Rao P, et al. A major chromatin regulator determines resistance of tumor cells to T cell-mediated killing. Science. 2018;359:770–5.
6. Kearney CJ, Vervoort SJ, Hogg SJ, Ramsbottom KM, Freeman AJ, Lalaoui N, Pijpers L, Michie J, Brown KK, Knight DA, et al. Tumor immune evasion arises through loss of TNF sensitivity. Sci Immunol. 2018;3:29776993.
7. Patel SJ, Sanjana NE, Kishton RJ, Eidizadeh A, Vodnala SK, Cam M, Gartner JJ, Jia L, Steinberg SM, Yamamoto TN, et al. Identification of essential genes for cancer immunotherapy. Nature. 2017;548:537–42.
8. Manguso RT, Pope HW, Zimmer MD, Brown FD, Yates KB, Miller BC, Collins NB, Bi K, LaFleur MW, Juneja VR, et al. In vivo CRISPR screening identifies Ptpn2 as a cancer immunotherapy target. Nature. 2017;547:413–8.
9. Fernandez-Banet J, Esposito A, Coffin S, Horvath IB, Estrella H, Schefzick S, Deng S, Wang K, Ching KA, Ding Y, et al. OASIS: web-based platform for exploring cancer multi-omics data. Nat Methods. 2016;13:9–10.
10. Felip E, Brunsvig P, Vinolas N, Aix SP, Costa EC, Gomez MD, Perez JMT, Arriola E, Campelo RG, Spicer JF, et al. A phase II study of bemcentinib (BGB324), a first-in-class highly selective AXL inhibitor, with pembrolizumab in pts with advanced NSCLC: OS for stage I and preliminary stage II efficacy. J Clin Oncol. 2019;37:9098.
11. Hugo W, Zaretsky JM, Sun L, Song C, Moreno BH, Hu-Lieskovan S, Berent-Maoz B, Pang J, Chmielowski B, Cherry G, et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell. 2016;165:35–44.

Acknowledgements

The authors thank the authors of published studies for sharing their data on tumor profiling cohorts (especially PRECOG), CRISPR screens, and immunotherapy trials.

Funding

The research was supported by the Cancer Immunologic Data Commons (1U24CA224316-01) grant of the National Cancer Institute (NCI), the Pathway to Independence Award (1K99CA218900-01) grant of NCI (to P.J.), and the Partnership for Accelerating Cancer Therapies Grant from the Foundation for the National Institutes of Health (to X.S.L.).

Author information

Authors and Affiliations

Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200433, China
Jingxin Fu, Wubing Zhang & Changxin Wan
Department of Data Sciences, Dana-Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
Jingxin Fu, Wubing Zhang, Changxin Wan, Peng Jiang & X. Shirley Liu
Tongji Hospital, School of Life Science and Technology, Tongji University, Shanghai, 200065, People’s Republic of China
Jingxin Fu & Jing Zhang
The Winsor School, Boston, MA, 02215, USA
Karen Li
Present Address: Cancer Data Science Laboratory, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA
Peng Jiang

Contributions

JF, JZ, PJ, and XSL designed the study and wrote the manuscript. JF and PJ analyzed the data and developed the website. WZ and CW tested the website functions. KL created the tutorial videos. All authors participated in discussions. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Jing Zhang, Peng Jiang or X. Shirley Liu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

X.S.L. is a co-founder, board member, and Scientific Advisor of GV20 Oncotherapy and a member of the Scientific Advisory Board of 3DMed Care. The remaining authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1:

Table S1. New function modules and datasets in the TIDE web server.

Additional file 2:

Table S2. Data availability of published ICB studies.

Additional file 3:

Table S3. An example gene set related with immunotherapy response.

Additional file 4:

Figure S1. The prediction performance of recently published biomarkers varies across data cohorts; and Supplementary Methods.

Additional file 5:

Table S4. Genes with approved drugs.

Additional file 6:

Table S5. Publications for public immunotherapy biomarkers.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Fu, J., Li, K., Zhang, W. et al. Large-scale public data reuse to model immunotherapy response and resistance. Genome Med 12, 21 (2020). https://doi.org/10.1186/s13073-020-0721-z

Received: 19 November 2019
Accepted: 03 February 2020
Published: 26 February 2020
DOI: https://doi.org/10.1186/s13073-020-0721-z