Skip to main content

Single cell lineage tracing reveals clonal dynamics of anti-EGFR therapy resistance in triple negative breast cancer

Abstract

Background

Most primary Triple Negative Breast Cancers (TNBCs) show amplification of the Epidermal Growth Factor Receptor (EGFR) gene, leading to increased protein expression. However, unlike other EGFR-driven cancers, targeting this receptor in TNBC yields inconsistent therapeutic responses.

Methods

To elucidate the underlying mechanisms of this variability, we employ cellular barcoding and single-cell transcriptomics to reconstruct the subclonal dynamics of EGFR-amplified TNBC cells in response to afatinib, a tyrosine kinase inhibitor (TKI) that irreversibly inhibits EGFR.

Results

Integrated lineage tracing analysis revealed a rare pre-existing subpopulation of cells with distinct biological signature, including elevated expression levels of Insulin-Like Growth Factor Binding Protein 2 (IGFBP2). We show that IGFBP2 overexpression is sufficient to render TNBC cells tolerant to afatinib treatment by activating the compensatory insulin-like growth factor I receptor (IGF1-R) signalling pathway. Finally, based on reconstructed mechanisms of resistance, we employ deep learning techniques to predict the afatinib sensitivity of TNBC cells.

Conclusions

Our strategy proved effective in reconstructing the complex signalling network driving EGFR-targeted therapy resistance, offering new insights for the development of individualized treatment strategies in TNBC.

Background

Triple-negative breast cancer (TNBC) is a highly heterogeneous and aggressive breast cancer subtype characterized by metastatic progression, poor prognosis, and the absence of targetable biomarkers [1,2,3]. Neoadjuvant chemotherapy is initially highly effective on TNBCs but about 30%–50% of the patients rapidly develop resistance associated with higher mortality [1, 2, 4]. Despite Epidermal Growth Factor Receptor (EGFR)-activating mutations and amplifications (≥ 5 copies) are uncommon in TNBC [3, 5,6,7], the majority of primary TNBCs exhibit enhanced expression of EGFR because of an increase in gene copy number (three to four copies) [7,8,9,10,11,12] (Additional File 1: Fig. S1A-F), thus representing a valuable vulnerability for TNBC patients [13]. However, unlike other tumour types where inhibition of wild-type EGFR by monoclonal antibodies or tyrosine kinase inhibitors (TKIs) is beneficial [14,15,16,17], EGFR-targeted therapies in TNBCs have shown variable and unpredictable clinical responses (1.7% to 38.7%) [18,19,20,21,22]. Indeed, we and others [12, 20] found no significant correlation between EGFR status (i.e. copy number, mRNA, protein, or phospho-protein levels) and response to anti-EGFR therapies (Additional File 1: Fig. S1G). Moreover, genomic variants that were found to be predictive of anti-EGFR resistance [23] in other tumour types are infrequent in TNBC patients [6, 24, 25] (Additional File 1: Fig. S1H), suggesting that non-genomic mechanisms of resistance must be at play in TNBCs. As a result, the lack of predictive biomarkers of a response to anti-EGFR therapies in TNBC has hampered the translation of EGFR inhibitors in breast cancer.

Single-cell RNA sequencing (scRNA-seq) technologies have recently emerged as powerful tools to study intra-tumour heterogeneity [26, 27] and to reveal key genes that change in response to an external stimulus such as drug treatment [12, 28]. However, when monitoring the emergence of drug-resistant cell populations, it is crucial to identify and compare how surviving lineages (i.e. clones) of cancer cells adapt to the treatment. Recently, bulk RNA sequencing and cellular barcoding, a technique that uses distinct DNA sequences to mark each cell, have provided the possibility of following cancer progression, metastasis dissemination, and cell differentiation at a single clone level [29]. Hence, developing novel methods that couple DNA marking techniques with techniques measuring cellular states at single-cell resolution is essential to determine the different adaptive trajectories leading to drug resistance [28, 30, 31]. Once these two techniques are coupled, prospective lineage tracing analyses can be used to study single clone dynamics during treatment to identify mechanisms driving drug adaptation, while retrospective analyses can be used to trace backwards in time and reconstruct pre-existing mechanisms of drug resistance by comparing the transcriptional states of tolerant and sensitive clones before the treatment (Fig. 1A).

Fig. 1
figure 1

Platform for the identification of drug response biomarker genes with single cell lineage tracing. A An overview of strategies of lineage tracing. Once cells are marked (i.e. barcoded), they can be exposed to a selected drug concentration to drive the selection of resistant clones (yellow cells). Two types of analyses can then be performed. Prospective analyses where resistant clones (i.e. lineages) are followed over time during treatment to reconstruct driving mechanisms of drug adaptation. Retrospective analyses where resistant lineages are retrospectively mapped on the untreated condition to reconstruct pre-existing mechanisms of drug resistance. B Average log IC50 distribution of MDAMB468 cells and of all the other available TNBC cell lines to anticancer drugs targeting EGFR. C The Cellecta’s lentiviral construct contain a pool of random 48 base pairs (bp) barcodes located 70 bp far from ploy-A tail of the puromycin‐resistance gene cloned downstream of a Venus fluorescent protein. D To enable lineage tracing in MDAMB468 cells, we seeded 50,000 cells and infected them with Cellecta's lentiviral libraries at a multiplicity of infection (MOI) of 0.05. After five days of antibiotic selection, surviving cells were assessed by flow cytometry and subsequently expanded. E Scheme of the time series experiment performed on the barcoded MDAMB468 cell line. D0 are untreated parental MDAMB468 cells (MDAMB468-P) a few hours before afatinib addition, and D3 through D40 is the duration of the experiment where cells where subjected to incremental doses of afatinib ranging from 250 to 2000 nM. Cells at D40 were considered as afatinib tolerant persisted cells (ATPC). 10 × Chromium sequencing was performed at Day 0 (D0), Day 3 (D3), Day 6 (D6) and Day 9 (D9) of 250 nM afatinib treatment. Quantseq-Flex to retrieve afatinib-tolerant lineages in bulk was performed at Day 40 with 2000 nM afatinib treatment. F Dose–response curve showing cell viability of MDAMB468-P and MDAMB468-ATPC cells following treatment with the indicated concentrations of afatinib. G Estimated IC50 for the dose response curve in (G). H Number of unique lineages present in the MDAMB468-P and MDAMB468-ATPC cell lines

Here, we investigate the non-genomic mechanisms determining the cellular response to EGFR inhibitors in TNBCs. We integrate methods for cellular barcoding and single-cell transcriptomics to enable cell lineage tracing and explore the subclonal evolution of adaptation in an established preclinical model of TNBC [32, 33] in response to incremental concentrations of afatinib, a potent TKI that irreversibly inhibits both EGFR and HER2 receptors. Retrospective lineage tracing data analysis uncovered a pre-existing subpopulation of rare afatinib-tolerant cells displaying distinct biological features, such as elevated mRNA levels of IGFBP2 (Insulin-Like Growth Factor Binding Protein 2). We demonstrate by chemical and genetic manipulations that IGFBP2 overexpression is sufficient to render TNBC cells tolerant to afatinib treatment through activation of the compensatory IGF1-R signaling pathway. Prospective lineage tracing, on the other hand, highlighted additional adaptive mechanisms, including lysosome biogenesis, reactive oxygen species (ROS) homeostasis, and fatty acid metabolism. Finally, by linking reconstructed mechanisms of drug resistance with deep learning techniques we developed an algorithm to computationally predict afatinib response starting from the transcriptional status of TNBC cells. Our findings provide a new understanding of the intricate signaling network underlying EGFR-targeted therapy resistance in TNBC that can be helpful in devising novel strategies for TNBC patient stratification and therapeutic intervention.

Methods

Cell culture

HEK293T, MDAMB468, HCC1806 and HDQP1 were obtained from ATCC biobank. HEK293T and HDQP1 cells were cultured with DMEM supplemented with 10% foetal bovine serum (FBS), 1% L-glutamine, 1% penicillin/streptomycin (Euroclone, Milan, Italy), whereas MDAMB468 and HCC1806 cells were cultured with RPMI supplemented with 10% foetal bovine serum (FBS), 1% L-glutamine, 1% penicillin/streptomycin (Euroclone). Cells were kept at 37 °C under 5% CO2 atmosphere. SUM series cell lines were obtained from BioIVT biobank. SUM149PT, SUM185PE, SUM225CWN, SUM229PE and SUM159PT were cultured with Ham’s F12 medium (Gibco) supplemented with 5% heat-inactivated FBS (Gibco), 10 mM HEPES (Sigma-Aldrich), 1 μg/ml hydrocortisone (Sigma-Aldrich), 5 μg/ml insulin (Sigma-Aldrich) and 1% L-glutamine (Euroclone). SUM1315MO2 were cultured with the Ham’s F12 medium supplemented as described above with in addition 10 ng/mL EGF. No penicillin–streptomycin was added to the SUM cell line medium, as recommended by the suppliers.

dCAS9 plasmids and sgRNA cloning

pRDKCE2B-EFS-dCas9-KRAB-2A-Blast, pRDVCE2B-EFS-dCas9-VPH-2A-Blast, pRSG16N-U6-sg-UbiC-TagRFP-2A-Neo and pRSG17H-U6-sg-UbiC-TagGFP2-2A-Hygro were obtained from Cellecta (#SVKRABC9E2B-PS, #SVVPHC9E2B-PS, #SVCRU616N-L and #SVCRU617H-L respectively). sgRNA used to overexpress or downregulate the IGFBP2 gene was designed using the CRISPick Broad Institute tool (https://portals.broadinstitute.org/gpp/public) [34, 35] (Additional File 2: Table S12). IGFBP2 protospacer sequences were synthesized, hybridized, phosphorylated, and inserted into pRSU6-gRNA plasmids using BbsI sites.

Lentiviral packaging and transduction of dCAS9 plasmids

Lentiviral packaging was performed following the manufacturer’s instructions (Cellecta Mountain View, CA). Briefly, 24 h before transfection, 4 × 106 of HEK293T cells were plated in a 100-mm tissue-culture dish. On the day of transfection, 2 μg of the plasmid of interest was combined with 10 μg of the packaging plasmid mix (Cellecta, psPAX2: pMD2.G) in DMEM without serum or antibiotics in the presence of Plus Reagent™ and Lipofectamine™ (Life Technologies). Cells were incubated with the DNA/Plus Reagent™/Lipofectamine™ mix for 24 h. Viral supernatant was collected 48 h after transfection, filtered through a 0.45 μm PES filter and concentrated using LentiFuge™ Viral Concentration Reagent (Cellecta) following the manufacturer’s protocol. The concentrated lentiviral particles were re-suspended in PBS and stored at − 80 °C. 5 × 105 of MDAMB468 and HCC1806 cells were seeded in a 6-well plate and transduced with lentivirus to stably express dCas9-KRAB or dCas9-VHP in the presence of 8 μg/mL of LentiTrans™ Polybrene Transduction Reagent (Cellecta, #LTDR1) at a MOI of 0.5. 72 h after transduction, 2.5 μg/mL of Blasticidin (Thermo Fisher Scientific, Waltham, MA, USA) was added to MDAMB468 cells, while 1 μg/mL of Blasticidin was used for the HCC1806 cell line. Once dCas9-KRAB and dCas9-VHP stable cell lines were obtained, they were transduced as previously with pRSU6-gRNA lentivirus to modulate IGFBP2 expression with a MOI of 0.5. After 5 days of selection, sgRNA RFP-positive or GFP-positive cells were further sorted using BD FACS ARIA III.

HDQP1 cell transfection

The pCMV6-AC-IGFBP2-GFP expression vector encoding human IGFBP2 (NM_000597) fused to the GFP in the C-terminal region was purchased from Origene Technologies (#RG202573). This plasmid was used to obtain HDPQ1 cells stably overexpressing the IGFBP2 gene. Lipofectamine 2000 (Life Technologies, Grand Island, NY, USA) reagent was used to transfect HDQP1 cells according to the manufacturer’s instructions. The stable-transfected HDQ-P1 cells were selected in a medium containing 500 μg/mL G418 (Life Technologies). After 7 days of selection, cells were sorted with BD FACS ARIA III to select only the transfected population.

CloneTracker XP barcode transduction

The MDAMB468 barcoded cell population was generated using CellTracker XP 10 M Barcode-3’ Libraries (Cellecta, #BCXP10M3VP-V). Overall, 50,000 cells were transduced with the library with LentiTrans™ Polybrene Transduction Reagent at a low MOI (0.05) to yield only 5% infection. A population with 2,500 unique different barcodes was obtained. 72 h post-infection, 1 μg/mL of Puromycin (Thermo Fisher Scientific) was added to the medium to select only transduced MDAMB468 cells. After 5 days, selected cells were checked by measuring the number of cells expressing Venus using the BD Accuri™ C6 flowcytometer (BD Biosciences, Franklin Lakes, NJ, USA). The cells were 100% positive and thus expanded in cultured medium with 1 μg/mL of Puromycin.

Time series experiment

5 × 105 barcoded cells were plated in triplicate in a 6-well plate and allowed to adhere for 24 h. Control cells were also plated in triplicate and viability checked over the course of the experiment. Cells were then treated with incremental concentrations of afatinib (Selleckchem cat. Num. S1011) starting from 250 nM with each treatment lasting 72 h. Afatinib concentration was doubled every nine days up to 2 µM. Every 3 days, cell viability was measured for both treated and untreated cells and 1 × 104 of treated cells were collected and cryopreserved from each treated replicate as follows. Cells were centrifuged at 1,100 rpm for 3 min and then resuspended with resuspension buffer (RB) (100% of FBS) to obtain a final concentration of 1,000 cells/µl. Then, 10 µl of this cell suspension was added to 90 µl of RB to obtain a final concentration of 100 cells/µl (total: 1 × 104 cells). Finally, 100 µl of cold freezing buffer (FB) (100% of FBS and 5% DMSO) was added before placing the cells at -80 °C. We chose to initiate afatinib treatment at 250 nM based on the lowest reported sensitivity of a breast cancer cell line to afatinib from GDSC1 screening, available at https://www.cancerrxgene.org. The chosen concentration range intentionally mimics a drug adaptation process, ranging from the minimal level at which a breast cancer cell line responds to an EGFR inhibitor to the highest concentration, which is twice the IC50 value of Afatinib in the MDA-MB-468 cell line. Regarding the 9-day period between doubling the drug concentration, this was determined after extensive experimentation. This interval provided sufficient time for the cells to adapt to the drug treatment and maintain survival. When we attempted to double the drug concentration more frequently, we encountered significant variability in cell viability and experiment reproducibility. The current strategy, with a 9-day interval, ensured consistent results throughout the experiments.

Single cell RNA-sequencing of MDAMB468 cells with 10 × chromium platform

The cryopreserved samples were thawed in a 37 °C water bath and washed multiple times with MACS® Separation Buffer (Miltenyi biotech). Cell viability of the single cell suspension was measured with Trypan Blue using a LUNA-II™ Automated Cell Counter (Logos Biosystems). Cells were then filtered through a 40-μm filter, centrifuged at 300 × g for 5 min and resuspended to obtain a final concentration of 1 × 103 cells/ µl. The cell suspensions were then processed to generate single-cell libraries using a Chromium Single Cell Gene Expression 3′ Library and Gel Bead kit v3.1 following the manufacturer’s instructions (10 × Genomics). Briefly, the cells were suspended in reverse transcription reagents and injected into microfluidic chips of the 10 × Chromium instrument, along with gel beads, and segregated in a nanoliter-scale Gel Beads-in-Emulsion (GEMs). Reverse transcription was carried out on the GEMs in a MiniAmp Thermal Cycler (Thermo Fisher Scientific) with the following protocol: 53 °C for 45 min, 85 °C for 5 min, and hold at 4 °C. After, reverse-transcribed samples were purified with the Recovery agent and isolated with the Dynabeads Cleanup Mix. The cDNA was amplified in a Thermo cycler using the following protocol: 98 °C for 3 min, 12 cycles of (98 °C for 15 s, 63 °C for 20 s, 72 °C for 1 min), 72 °C for 1 min, and hold at 4 °C. The quality of the amplified cDNA was quantified with the BioAnalyzer High Sensitivity DNA kit D5000 (Agilent Technologies) and its concentration was measured using the Qubit Fluorometer. Finally, amplified cDNAs were fragmented, end-repaired and A-tailed with SPRIselect Reagent Kit (Beckman Coulter). The post-ligation products were amplified using the following protocol: 98 °C for 45 s, 10 cycles of (98 °C for 20 s, 54 °C for 30 s, 72 °C for 20 s), 72 °C for 1 min and hold at 4 °C. The sequencing-ready libraries were purified with SPRIselect and the quality were quantified with the BioAnalyzer High Sensitivity DNA kit D1000 and the concentrations were measured using the Qubit Fluorometer. NovaSeq 6000 SP 100 cycles flow cell was used to sequence the libraries.

Specific CloneTracker XP barcode single cell library

To optimize the capture of the CloneTracker XP barcode at the single cell level, a parallel single cell library preparation was performed from the amplified cDNAs by separately amplifying the CloneTracker XP barcode amplicon with a custom P7 primer that targeted it (Additional File 2: Table S12). CloneTracker XP barcode was amplified according to the following protocol: 98 °C for 45 s, 14 cycles of (98 °C for 20 s, 54 °C for 30 s, 72 °C for 20 s), 72 °C for 1 min. PCR products were then SPRI-purified and the quality were quantified with the BioAnalyzer High Sensitivity DNA kit D1000 and the concentrations were measured using the Qubit Fluorometer. Single cell CloneTracker XP barcode libraries were sequenced as spike-ins alongside the parent RNA-seq libraries. This strategy to use a custom P7 primer with unique i7 index had the advantage to allow to separately analyse the barcoded scRNA-derived cDNA in the NGS data.

CloneTracker XP barcode retrieval from scRNA-Seq reads

As shown in Fig. 1C, the 48-nucleotide expressed barcode cassette of the Cellecta CloneTracker XP barcode library has a composite structure built by juxtapositioning 100 different 14-nt sequences (a.k.a. bc14) and 100,000 30-nt sequences (a.k.a. bc30) separated by a 4-nt (TGGT) anchor. Cellecta also provides the whitelist containing the 100 possible bc14, and 100,000 bc30 can appear at both sides of the anchor. Lineage barcodes were identified from the FASTQ file of read two, while cell barcode from read one of both the specific CloneTracker XP barcode single cell library and standard single cell library we produced for each sequenced time point. Specifically, for each fragment present in read two, the occurrence of the TGGT anchor was first searched allowing one possible mismatch. Then, if some correspondence was found, the 14-nt bases before (i.e. putative bc14) and the 30-nt bases after (i.e. putative bc30) the anchor sequence were extracted. Next, if putative bc14 and bc30 were in the whitelist provided by Cellecta, the lineage barcode represented by the string “bc14:bc30” was assigned to the corresponding cell retrieved from read one. On the other hand, if one of the putative lineage barcodes was not found in the Cellecta whitelist, we corrected it with the barcode in the whitelist having a Hamming distance equal to one, but only if there were no other barcodes in the whitelist with a Hamming distance of one. Finally, if both the putative bc14 and bc30 were not found in the Cellecta whitelist, the bc14 barcode sequence is corrected first as above and we corrected the corresponding bc30 only if bc14 correction was successful. This strategy allowed us to reduce the high computing demand required by correcting lineage barcodes for possible sequencing errors but without losing sensitivity. At end of this iterative process, each sequenced cell is associated with the most abundant pair of lineage barcodes bc14:bc30 we have found. Only reads where both bc14 and bc30, after correction, matched with the ones provided by Cellecta’s whitelist were used. This was implemented in an R script that made use of the Biostrings R package to perform string matching and a custom C + + function to compute the Hamming distances between two DNA sequences of the same length.

MDAMB468 scRNA-seq reads alignment and expression quantification

Following demultiplexing, raw sequencing reads were aligned to the Hg38 human reference genome using the Cell Ranger tool version 6.1.2. For reads alignment, the GENECODE annotation v32 of Hg38 was employed, retaining only the genes with biotypes protein coding, lincRNA, and antisense. After reads alignment, cells with fewer than 5,000 UMI were discarded. Next, putative cell doublets and cells expressing a high fraction of mitochondrial reads were removed using the filterCell function from the gficf version 2 R package [27, 36, 37] available at https://github.com/gambalab/gficf. Specifically, the filterCell function employs loess regression to fit the relationship between the total UMI count in a cell (in log scale) and the ratio between the total UMIs falling in mitochondrial genes over the total UMI count. Cells in which the ratio deviates significantly from the expected value with an adjusted p-value (FDR) < 0.1 are discarded. To eliminate putative cell doublets, the same strategy is applied, but in this case, loess regression is used to fit the relationship between the total UMI count in a cell (in log scale) and the ratio between the number of detected genes over the total UMI count. Again, cells for which the ratio deviates significantly from the expected value with an FDR < 0.1 are discarded. According to our long experience in tha analysis of single cell data, this latter strategy is particularly useful for removing not only cell doublets but also cells that, for technical reasons, exhibit a high total UMI count concentrated in a small number of genes. The resulting highly covered 4,101 cells after these filtering processes were used for all downstream analysis. Furthermore, since cells of day 0 were sequenced using two distinct captures in two different flow cells, the possible batch effect was corrected using the sva function of CombatSeq R package [38]. Finally, genes expressed in less than 50 cells and in less than 5% of sequenced cells were excluded only if their average expression, was less than 1.12 UMI. This approach is similar to that employed in [39].

Bulk estimation of the number of lineages in MDAMB468 cells

Bulk estimation of the number of CloneTracker XP barcodes present in parental and ATPC MDAMB468 barcoded cells was performed with QuantSeq-Flex (Lexogen) technology as follows. First, total RNA was isolated from three independent biological replicates using the Qiagen RNeasy Mini kit (Qiagen), according to the manufacturer's instructions. Then, library preparation was performed according to the manufacture’s protocol (Lexogene). The target-specific reverse transcription primers to specifically capture the Cellecta barcode were designed according to the guidelines outlined in the QuantSeq-Flex manual. All primers were composed of a partial Illumina P7 adapter extension followed by the target-specific sequence. A pool of 4 primers was used at 50 nM. Primer lengths were in the range of 44–50 nucleotides, as requested by the manufacture’s protocol. The target-specific first strand synthesis primers can be found in Additional File 2: Table S12.

Cell viability assay

Cells were seeded in either a 96-well plate (4 × 104 cell/well) or a 384-well plate (1 × 103 cell/well). After overnight incubation at 37 °C, cells were treated either with drugs, at the selected dilutions, or with DMSO as negative control (in technical triplicate) and incubated at 37 °C for 72 h. Then, cell viability was evaluated by measuring either luminescence or absorbance (490 nm) with the CellTiter-Glo® luminescent cell viability assay (Promega) or the CellTiter (Promega), respectively, using the GloMax® Discover instrument (Promega) according to the manufacturers’ protocol. Background luminescence or absorbance values were measured in wells without cells and with only culture medium. Background values were subtracted from experimental values. All drugs used in this study were purchased from Selleckchem.

Drug combination assay

4 × 104 of MDAMB468 cells were seeded in a 96-well plate and incubated overnight at 37 °C. Afatinib and GSK-1904529A drugs were prepared in different dilutions and then combined in all possible drug pairs to generate a 5 × 5 or 5 × 6 drug combination matrix. Then, cells were exposed to either single agent drugs or to the drug pairs of the drug combination matrix, while negative controls were treated with DMSO (each treatment was performed in triplicate). Following 72 h incubation at 37 °C, cell viability was measured with the CellTiter (Promega), and the absorbance was read at 490 nm with the plate reader GloMax® Discover instrument. The Afatinib-GSK-1904529A drug interactions and expected drug responses were calculated with the Combenefit tool [40] using the Loewe additivity model.

Generation of spheroids and image acquisition

Cells grown as a monolayer were detached to generate a single-cell suspension that was then diluted to 2.5 × 104 (for 5,000 cells per spheroid) cells per millilitre of ice-cold medium. The Cell Basement Membrane (ATCC, #ATCC-ACS-3035) was thawed on ice overnight and added at a final concentration of 2.5% with ice-cold pipette tips to the cell suspension. A volume of 200 µl of this cell suspension was added to each well of an Ultra-Low Attachment (ULA) 96-well plate with a round or conical bottom (PerkinElmer, #SPA6055330). The spheroid formation was initiated by centrifugation of the plates at 300 g for 10 min. The plates were incubated under standard cell culture conditions at 37 °C, 7% CO2 in humidified incubators. When we studied afatinib response, the drug was added as follows. After 24 h, when spheroids were formed, 100 μl of medium was removed and the treatment was performed adding 100 μl of medium with 2 μM of afatinib into each well. Treatments were renewed after 3 days in new fresh complete medium. Pictures were taken before adding treatments and then at day 3, 5 and 7 after treatment. Images were acquired with the High Content Analysis System Operetta CLS (PerkinElmer) with temperature (37 °C) and CO2 (5%) control. They were acquired in several planes and the area analyses were performed on maximum projections by Harmony software (PerkinElmer). Volumetric analyses were performed by stack processing with 3d analysis.

RNA extraction, reverse transcription and quantitative RT-PCR (qPCR)

Total RNA from cell lines was extracted usingthe Qiagen RNeasy Mini kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. A total of 1 μg of total RNA from each sample was used to obtain double strand cDNA with the QuantiTect Reverse Transcription Kit (Qiagen). Quantitative Real-Time PCR (qRT-PCR) was performed and for each PCR reaction, 10 μL of 2 × Sybr Green (LightCycler 96, Roche Molecular Systems, Inc.), 200 nM of each primer, and 20 ng of the cDNA, previously generated, were used. Relative gene expression was determined using comparative C(T) method, as described elsewhere. RP18S was used as a housekeeping gene.

Clonogenic survival assay

5 × 104 cells were plated into a 12-well plate and allowed to adhere for 24 h. Then the cells were treated with different concentrations of Afatinib and incubated for 10 (Fig. 3C) or 5 days (Fig. 3E). Fresh media was added on the fifth day. On the tenth day, the media was removed from the dishes and washed once with ice-cold PBS. The colonies were fixed and stained with a solution containing crystal violet for 45 min on a rocking platform. The dishes were rinsed three times with PBS and air-dried. Then, the stained crystal violet was resolved with PBS-0.1% SDS and absorbance at 590 nm was determined. The data were performed in triplicate and are shown as mean ± SD.

QuantSeq 3′ mRNA-Sequencing and bioinformatics analysis

Total RNA was purified from three biological replicates by Qiagen RNeasy Mini kit (Qiagen), according to the manufacturer’s instructions. Total RNA was quantified using the Qubit 4.0 fluorometric Assay (Thermo Fisher Scientific). Libraries were prepared from 125 ng of total RNA using the NEGEDIA Digital mRNA-seq research grade sequencing service (Next Generation Diagnostic srl) which included library preparation, quality assessment and sequencing on a NovaSeq 6000 sequencing system using a single-end, 100 cycle strategy (Illumina Inc.). The raw data were analysed as follow. Briefly, Illumina novaSeq base call (BCL) files were transformed into fastq files through bcl2fastq (version v2.20.0.422, Illumina Inc.), and sequence reads were trimmed using bbduk software (bbmap suite 37.31, Joint Genome Institute, Walnut Creek, CA, USA). Alignment was performed on hg38 reference assembly (Ensembl Assembly 93) with star 2.6.0a (GPL v3, open source) and gene expression levels were determined with htseq-count 0.9.1. Gene expression normalization and differentially expressed genes were identified using edger package [41] in the R statistical environment. For differential expression analysis, only genes with an average CPM more than 5 in at least one of the two conditions were considered.

Differential expression analysis and clustering analysis of single cell data

Differentially expressed genes in Fig. 2C between tolerant and sensitive afatinib cells at time 0 (i.e. untreated cells) were identified using the FindMarkers function implemented in Seurat R package v3 with logfc.threshold and min.pct parameters set to 0. All clustering analyses were performed with Seurat.

Fig. 2
figure 2

Retrospective and Prospective lineage tracing analysis. A UMAP representation of MDAMB468 cells colour-coded for the days of treatment. B Retrospective projection of the afatinib tolerant persisted lineages on a UMAP representation of untreated MDAMB468 cells at Day 0. C Differential expression between afatinib tolerant persisted lineages versus afatinib sensitive lineages on untreated MDAMB468 cells (i.e. cells from Day 0). D Expression distribution of IGFBP2 in afatinib tolerant persisted lineages and afatinib sensitive lineages over time. Statistical differences were estimated using a two-tailed Wilcoxon test. E IGFBP2 expression in MDAMB468-P and MDAMB468-ATPC cells measured by real-time quantitative PCR. The statistical difference was estimated using a two tailed t-test. F Cellular frequency of the afatinib tolerant clones bc14-013:bc30-092942 and bc14-013:bc30-092942 and all other clones at the sequenced time points (CTRL (Day 0), Day 3, Day 6, Day 9). G Cellular frequency of clones bc14-013:bc30-092942 and bc14-013:bc30-092942 in the MDAMB468-ATPC cell line. H Up-regulated genes identified by comparing cells belonging to the dominant clones to cells belonging to the neutral clones. I Average log2 fold change (FC) of the top ten up-regulated genes from (C). A 95% confident interval is reported. J Expression distribution of IGFBP2 in dominant and neutral clones over time. Statistical differences were estimated with a two-tailed Wilcoxon test

Pseudotime analysis of tolerant clones and time-course differential expression analysis between cells of the dominant and neutral clones

The 2,836 single cell expression profiles associated with afatinib-tolerant clones were divided into two groups according to the behaviour of their clone of origin, i.e. “dominant” or “neutral”. The expression matrix of dominant clones comprised 1,293 cells while the matrix of neutral clones comprised 1,543 cells. Then psupertime function of psupertime R package [42] was used to infer the pseudotime order of cells in each independent matrix. Psupertime R function was run with default parameters and using as cell labels their sequencing day (i.e., 0, 3, 6, 9 days). Next, for both matrices the expression profile of each gene was (i) ordered according to the reconstructed pseudotime; (ii) smoothed using the moving median method and (iii) finally divided into 50 expression bins according to the inferred pseudotime. To smooth gene expression with the moving median method, we used rollmedian function of the zoo R package with a rolling window equal to 51. After these three steps the expression profiles of each gene in both matrices was represented by 50 pseudotime points. Finally, to identify upregulated genes in the cells associated with dominant clones across the first 9 days of treatment we used the splineTimeR package [43]. With this tool we compared the reconstructed time-course profile of a gene across cells of the dominant clones with its reconstructed time-course profile across cells of the neutral clones. Up-regulated genes were defined as the genes with an FDR < 5% and a positive average fold change across the 50 binned pseudotime points.

Estimation of EGFR copy number alterations, mutations and expression in TNBC patients

Two independent cohorts of TNBC patients were used to estimate gene copy number alterations, mutations and expression reported in Additional File 1: Fig. S1. The first cohort composed of 192 TNBC patients was obtained from the Genomic Data Common (GDC) portal [44]. GDC raw bulk expression, mutational data and copy number variation data along with clinical information were retrieved using TCGAbiolinks R package v.2.25.3 [45]. The second cohort was composed of 465 TNBC Asiatic patients [5] for whom gene expression, gene copy number alteration and mutations were available was gathered from the NODE database after the author’s authorization. The raw expression counts downloaded from GDC were first normalized with edgeR package [41] and transformed in log2(CPM + 1), while those from the Asiatic cohort were already normalized as FPKM values, and thus we only transformed them into log2(FPKM + 1). A sample in the GDC cohort was defined to have EGFR copy gain if the number of copies was 3 or 4, while EGFR amplification was defined when this number was greater than 5. For the Asiatic cohort, EGFR gain and amplification were defined based on GISTIC scores, respectively, + 1 and + 2. For both cohorts, mutations were considered only if they were in the coding region (i.e., missense, non-sense, in-frame and out-of-frame deletions). Next, the deleterious tag was assigned when at least two functional annotations among VEP, PolyPhen and SIFT reported a negative impact of the mutation on protein function. Finally, EGFR activating mutations tag was conferred based on indications in [46].

Cell migration assay

Transwell chambers (8-μm pores) (Euroclone) were used to perform transwell migration assays. Briefly, breast cancer cells were plated in the upper side of the transwell chamber in a serum-deprived medium. Next, as chemoattractant, 300 μl of complete medium was added in the lower chamber. After 24 h, cells migrated to the lower side of the chambers were stained with crystal violet solution (crystal violet 0.05%, methanol 20%). Then, crystal violet in the chamber was de-stained with PBS-0.1% SDS solution and read at 570 nm.

scRNA library preparation of Drop-SEQ and bioinformatics analysis

Single cell transcriptomics of the SUM149PT, SUM185PE, SUM22PE, SUM159PT and SUM1315MO2 cell lines was performed with DROP-seq technology [47] and library preparation as described in Gambardella et al. [26]. scRNA libraries were sequenced with NovaSeq 6000 machine using an SP 100 cycles flow cell. Raw reads pre-processing was performed as described in [37]. Only high depth cells with at least 5,000 UMI were retained and used to test the scATRAL tool. The alignment pipeline can be found at https://github.com/gambalab/dropseq [48].

Gaussian processes to estimate significance in dose–response curves

We used Gaussian processes to evaluate the significance of dose–response curve trend differences. Gaussian processes computation was performed as in [49]. Briefly, we first reconstructed for each condition the dose response trend by interpolating the response values at each tested concentration and then we identified differences in the two trends under consideration in terms of log-likelihood, by computing the related Bayesian factor.

Estimation of the cellular frequency over time of the 192 afatinib-tolerant clones

To compute cellular frequencies of an afatinib-tolerant lineage over time, we estimated the percentage of cells associated with it in each sequenced time point. For each time-point, this percentage was simply estimated as the number of cells associated with a specific lineage divided by the total number of sequenced cells.

scASTRAL: architecture

Contrastive learning is a machine learning approach that aims to create an embedding space where positive examples (i.e. two cells with the same label) are separated from negative examples (i.e. two cells with different labels). In our case, a contrastive autoencoder was used as a model of scASTRAL. The encoder consisted of an input layer composed of 374 neurons, i.e. the number of afatinib response biomarker genes identified with their retrospective lineage tracing strategy. The input layer was followed by two hidden layers comprising 64 and 32 neurons, respectively. A ReLU (Rectified Linear Unit) activation function was used. The decoder was symmetrical to the encoder.

scASTRAL: training

We used the 1,541 MDAMB468 control cells labelled as afatinib-tolerant or afatinib-sensitive as a training set. Before starting the training, the matrix of normalized CPM counts was cut to the 374 afatinib response biomarker genes we identified in Fig. 3C and the expression of these genes rescaled using tf-idf transformation [36]. This dataset was then randomly split into 5 batches of equal size by ensuring that the number of afatinib-tolerant and afatinib-sensitive cells were similar across the different batches. Next, a contrastive model was trained on each batch as follows. First, a positive and a negative example was randomly built for each cell of the batch. This means that for each batch the number of examples on which training is performed was always equal to the number of cells in the batch multiplied by two. Second, the contrastive model was trained using a loss function composed by the weighted sum of three terms: the cosine embedding loss, the reconstruction error, and the latent space regularization factor. Specifically, the cosine embedding loss maximizes the cosine distance between pairs of cells labelled with -1 (i.e. negative example) while maximizing the cosine similarity between pairs of cells labelled with + 1. (i.e. positive example). At the same time, the reconstruction error assures that the decoder was capable of reconstructing original data from the latent representation generated by the encoder. This allows the model to also integrate in the final embedding space the dependencies among the 374 genes we used as input. Third, an SVM classifier is trained using the reconstructed embedding space of the considered batch of cells. Specifically, the SVM classifier was trained with a cosine kernel and with a regularization hyperparameter set to 100. Classification accuracy of the trained SVM of a batch was finally evaluated using the left-out cells. We used the Adam optimizer [50] with a learning rate of 0.0001 while the training was performed using an early stopping criterion, although the maximum number of epochs was set to 250. The metrics considered to compute improvements across epochs were the validation loss and the SVM classification accuracy. The training stopped if no improvements were obtained considering the last 20 epochs. The contrastive model hyperparameters were found by a grid search. Once the 5 contrastive models with associated SVM models were trained, the model with the highest classification accuracy was used to predict afatinib sensitivity at the single cell level as described below. scASTRAL was implemented in Phyton using PyTorch version 1.13.1. Code is available at the following address https://github.com/gambalab/scASTRAL [51].

Fig. 3
figure 3

Identification of genes driving cellular expansion during afatinib adaptation. A Dose–response curve in terms of cell viability following treatment of MDAMB468 control (CTR) and IGFBP2-knockdown cells with afatinib at the indicated concentrations. Significance was assessed using Gaussian processes and BF is the estimated Bayesian Factor expressed in bits representing the difference between the two dose–response curves (Methods section). A value of 6.81 corresponds to a very strong significant difference between the two dose–response curves [52]. B Dose–response curve in terms of cell viability of MDAMB468 control (CTR) and IGFBP2-overexpressing cells following treatment with afatinib at the indicated concentrations. Significance was assessed as in (A). A value of 4.3 corresponds to a strong significant difference between the two dose–response curves [52]. C Colony assay (representative images) for MDAMB468 control (CTR) and IGFBP2-overexpressing cells after 10 days of afatinib exposure at 2, 5 and 10 µM (left). Quantification is reported on the right. Experiments were performed in triplicate. D Spheroid volume growth of MDAMB468 control and IGFBP2-overexpressing cells with 2 µM afatinib (see Methods section) over time. E Colony assay (representative images) for MDAMB468 control and IGFBP2-knockdown cells after 3 days of afatinib exposure at 0.5 and 1 µM (left). Quantification is reported on the right. Experiments were performed in triplicate. F Spheroid volume growth of MDAMB468 control and IGFBP2-knockdown cells with 2 µM of afatinib over time. G Spheroid volume growth over time of MDAMB468 IGFBP2-overexpressing cells. H Spheroid volume growth over time of MDAMB468 IGFBP2-knockdown cells. I Transwell migration assay of MDAMB468 IGFBP2-overexpressing cells. Reported p values from panels C to I were estimated using two-tailed t-test

scASTRAL performance evaluation on single cells of TNBC cell lines

To validate scASTRAL and test its performance in predicting afatinib sensitivity of triple negative breast cancer cells, we used 22,724 single-cell transcriptional profiles from 16 TNBC cell lines (Fig. 5D). Of these, 11 were obtained from Gambardella et al. [26], while five were de-novo sequenced with the drop-seq platform. Before being fed into the scASTRAL method, the raw UMI count matrix of each cell line was first normalized using sklearn to obtain CPM and then cut on the 374 afatinib response biomarker genes identified in Fig. 3C. Next, the expression values of these genes were rescaled using tf-idf transformation [36] before applying scASTRAL. Finally, a cell was deemed to be tolerant or sensitive to afatinib if the SVM classification probability was greater than 0.75 otherwise it was considered undetermined. To convert predictions from the single-cell level to the cell-line level, we computed the fraction of predicted afatinib-tolerant cells in each cell line as the number of cells predicted to be tolerant divided by the total number of cells.

scASTRAL performance evaluation on pseudobulk of TNBC cell lines

Pseudobulk profiles of each cell line were computed by summing the UMI counts of each gene across the sequenced cells of the corresponding cell lines. The corresponding count matrix was cut on the 374 marker genes of afatinib response and then fed to scASTRAL.

scASTRAL performance evaluation on single cells of TNBC patients

To evaluate the performance of scASTRAL on single-cell data from TNBC patients, we compiled 41,189 single-cell transcriptional profiles from 16 treatment-naive primary TNBC patients from the studies conducted by Pal et al. [53] and Wu et al. [54]. As experimental data on the afatinib sensitivity of these patients was not available, we reconstructed this information using an ensemble prediction generated by the DREEP (https://github.com/gambalab/DREEP) [55] and Beyondcell (https://github.com/cnio-bu/beyondcell) [56] tools. These methods employ unique drug signatures to assess the impact of a drug and have been demonstrated to effectively predict drug response at the single-cell level. DREEP possesses five distinct signatures for Afatinib, while Beyondcell has three. To estimate the proportion of afatinib-sensitive cells in each patient's tumor cell population, we applied both tools to each patient's cells and assigned the median values across the eight potential signatures as the percentage of sensitive cells for that patient. Consequently, we could estimate the percentage of afatinib-sensitive cells in each patient's tumor cell population, which we then used as a benchmark to compare the performance of scASTRAL. Both tools were run using default parameters. Subsequently, scASTRAL was applied to single-cell data of the 16 TNBC patients, employing the same approach we used for single cell data of TNBC cell lines. Hence, before being fed into the scASTRAL tool, the raw UMI count matrix of each patient was normalized using sklearn to obtain CPM and then cut on the 374 afatinib response biomarker genes. Subsequently, the expression values of these genes were rescaled using tf-idf transformation before applying scASTRAL. Finally, a cell was considered tolerant or sensitive to afatinib if the SVM classification probability was greater than 0.75; otherwise, it was deemed undetermined. To convert predictions from the single-cell level to the patient level, we calculated the fraction of predicted afatinib-sensitive cells in patient as the number of cells predicted to be sensitive divided by the total number of cells.

Permutation feature importance (PFI) analysis

To assess the significance of features within our classification model, we utilized the permutation feature importance (PFI) algorithm [57], a versatile model inspection technique applicable to any estimator working with tabular data. This technique exclusively relies on input data and output predictions, enabling its seamless integration into the comprehensive end-to-end scAstral pipeline. The PFI of a feature is quantified as the decrease in a scoring metric when the values of that feature are randomly shuffled within a batch of data. In our case, we employed the ROC AUC as the scoring metric. This process is repeated 1,000 times using different seeds for the shuffling, and the mean score across repetitions is calculated, providing a robust metric for evaluating the importance of each feature.

EGFR copy number estimation and afatinib response of TNBC cell lines

Gene copy number data for EGFR was extracted from the DepMap database (version 23Q4) and transformed into actual copy counts. TNBC cell lines were classified into three groups based on their EGFR copy number distribution: Neutral (2 copies), Gained (3–4 copies), Amplified (≥ 5 copies) and Deleted (< 2 copies). Afatinib response data was acquired from the GDSC database, comprising both GDSC1 and GDSC2 datasets.

Results

Single cell lineage tracing in EGFR-dependent TNBC cells

In this study, we use single-cell lineage tracing to identify novel predictive biomarkers of resistance to EGFR inhibitors in the human TNBC cell line MDAMB468. This cell line stands out among EGFR-wt TNBC cell lines due to its unique combination of EGFR gene amplification (> 5 copies) and pronounced response to anti-EGFR therapies (Fig. 1B) [32, 33], including afatinib (Additional File 1: Fig. S2). As shown in Additional File 1: Fig. S3, while other two TNBC cell lines may exhibit higher afatinib sensitivity, none of them harbour EGFR amplification, a key feature of EGFR-driven TNBC with potential for patient stratification. Furthermore, whole exome sequencing of MDAMB468 [58, 59] failed to identify mutations in genes linked to EGFR inhibitor resistance, such as KRAS and HRAS [23].

To enable single cell lineage tracing in these cells, we engineered them with Cellecta’s CloneTracker XP recorder technology. The Cellecta's CloneTracker XP lentiviral barcode libraries (www.cellecta.com) are pooled expressed barcode libraries that enable the tracking and profiling of up to 10 million individual clones derived from a population of cells using either PCR or NGS techniques. As shown in Fig. 1C, the lentiviral-based CloneTracker XP barcode libraries have two main functional elements: the reporter Venus protein and the drug resistance marker (PuroR), both expressed from a single promoter on a single transcript. The 48-nucleotide barcode cassette is embedded within the 3'-UTR sequence of the Venus mRNA and is located approximately 70 nucleotides upstream of the polyA. This design ensures that the barcodes are transcribed and can be captured during cDNA first-strand synthesis using standard oligo-dT primers employed in routine bulk and single-cell RNA-sequencing library preparation protocols. Indeed, after cell infection at low multiplicity of infection (Methods section), antibiotic selection, and cellular expansion (Fig. 1D), bulk RNA-sequencing analysis revealed that our founder population of MDAMB468 cells consisted of 2,336 unique barcodes (i.e. groups of related cells descended from a single clone).

Next, to induce the selection of afatinib-tolerant subpopulations, we subjected barcoded MDAMB468 cells to a gradient of increasing afatinib concentrations, ranging from 250 to 2000 nM (Methods section). Over the course of the experiment, cells were collected at regular interval (i.e. every three days) for single cell transcriptomic profiling (Fig. 1E). Following 40 days of selection, a stable subpopulation of cells emerged that could proliferate in a medium containing 2000 nM of afatinib. This resistant subpopulation, we named MDAMB468-ATPC (Afatinib Tolerant Persistant Cells), exhibited a 4-fold increase in afatinib IC50 values compared to the parental MDAMB468 population (Fig. 1F,G). Barcode lineage retrieval from RNA bulk sequencing of MDAMB468-ATPC at 40 days revealed that only 192 out of the initial 2,336 clones (i.e., 8%) present in the parental MDAMB468 cell line survived to the afatinib selection process (Fig. 1H).

Retrospective lineage tracing to elucidate pre-existing mechanisms of drug resistance of afatinib resistance in MDAMB468 cells

Next, we aimed to reconstruct non-genomic cellular states mediating the response to afatinib. To this end, we performed single-cell transcriptomic analysis on 4,101 cells harvested at four time points, day 0 (untreated), day 3, day 6 and day 9, following 250 nM of afatinib treatment (Fig. 2A). By examining the transcriptional barcodes in individual cells, we were able to capture the transcriptional state of 448 distinct clones at day 0, while 186, 206 and 228 distinct clones were captured at day 3, day 6, and day 9, respectively.

We then retrospectively mapped the barcodes of 192 afatinib-tolerant clones present at day 40 onto the single-cell data of untreated MDAMB468 cells at day 0, as shown in Fig. 2B. To assess whether afatinib-tolerant clones could be identified using conventional clustering approaches, we created a Uniform Manifold Approximation and Projection (UMAP) of D0 cells exclusively. As depicted in Additional File 1: Fig. S4A, the afatinib-tolerant and -resistant clones are interspersed within the UMAP, implying a lack of distinct, specific expression patterns. To further corroborate this observation, we applied unsupervised clustering to the D0 cells (Additional File 1: Fig. S4B). This analysis revealed the absence of any single cluster exclusively comprising tolerant or sensitive cells (Additional File 1: Fig. S5C).

Next, we asked whether we could identify genes that were differentially expressed in untreated cells at day 0 by comparing cells belonging to afatinib-tolerant lineages with those belonging to the afatinib-sensitive lineages, i.e. those that were depleted after 40 days of continuous afatinib exposure. These genes, if present, should highlight pre-existing mechanisms of resistance to afatinib treatment. We found 374 genes that were differentially expressed between these two populations of cells (Additional File 2: Table S1). As shown in Fig. 2C, among these genes, 221 were up-regulated in the afatinib-tolerant lineages (i.e., marker gene of resistance) and 153 were up-regulated in the afatinib-sensitive lineages (i.e., marker genes of sensitivity). Gene Ontology Enrichment Analysis (GOEA) of the 221 up-regulated genes in the afatinib-tolerant cell population revealed several biological processes associated with resistance to EGFR inhibitors in other cancer types, such as oxidative phosphorylation and fatty acid metabolism [28, 60,61,62] (Additional File 2: Table S2). In contrast, the 153 genes up-regulated in the afatinib-sensitive cell population included EGFR and several genes related to the estrogen signalling pathway (Additional File 2: Table S2).

As shown in Fig. 2C, among the 221 up-regulated genes in the afatinib-tolerant cell population, the most significantly upregulated gene was the Insulin-Like Growth Factor Binding Protein 2 (IGFBP2), which is a member of a family of six proteins specifically binding insulin-like growth receptors I and II (IGF1-R and IGF2-R). IGFBP2 has been recently implicated in the progression and metastasis of several tumour types [63, 64]. Interestingly, as shown in (Fig. 2D), the overexpression of IGFBP2 in afatinib-tolerant lineages is maintained during treatment, suggesting that this gene, among the others, could be required both in drug resistance initiation and maintenance. Indeed, higher expression of the IGFBP2 gene in afatinib-tolerant clones was further confirmed by qRT-PCR of MDAMB468-ATPC (Fig. 2E).

Previous studies have shown that resistance mechanisms to anti-EGFR therapies involve the compensatory activation of signalling pathways that share downstream effectors with the EGFR signalling cascade, including IGF1-R, MET and the PI3K-AKT-mTOR signalling pathways [23, 65,66,67,68]. This understanding, coupled with the knowledge that IGFBP2 is part of the IGF1-R pathway, prompted us to co-treat parental MDA-MB-468 cells with different concentrations of Afatinib and GSK-1904529A, a selective inhibitor of IGF1-R and Insulin Receptor (INSR). To gain a broader understanding of this dose–response relationship, we performed two independent experiments in triplicate, one in which afatinib varied from 1 to 16 µM (Additional File 1: Fig. S5A,B) and another in which it varied between 31 nM and 1 µM (Additional File 1: Fig. S5C,D). As shown in Additional File 1: Fig. S5, the synergism between afatinib and GSK4529 is evident over a broad range of afatinib concentrations but is potent around 1 µM. These results support the existence of a pre-existing group of cells with enhanced activity of the compensatory IGF1-R signalling pathway could be sustained by IGFBP2.

In conclusion, these findings collectively emphasize the significance of our retrospective lineage strategy in identifying subtle distinctions that could otherwise be overlooked by conventional clustering methods based solely on gene expression analysis. Our strategy reviled IGFBP2 as a potential player in initiating afatinib resistance in MDAMB468 cells.

Prospective lineage tracing to reconstruct clonal expansion patterns in afatinib-tolerant MDAMB468 cells

Our previous analyses revealed that only 192 out of 2,336 clones (i.e. lineages) originally present in the parental MDAMB468 cell population became fully tolerant to afatinib treatment. To understand how these clones evolved over time, we computed at each sequenced time point the percentage of cells associated with each afatinib-tolerant lineage and plotted them over time, as shown in Fig. 2F and (Additional File 2: Table S3). This analysis revealed that 190 of the 192 lineages were “neutral”, maintaining a constant relative population size (i.e. the same percentage) over time. Two clones, however, showed significantly increased relative size frequency, accounting for almost half (45%) of the total population after nine days of treatment (Fig. 2F). Further analysis confirmed that these two clones became the “dominant” ones, comprising 81% (Fig. 2G) of the drug-tolerant MDAMB468-ATPC cell line (i.e., cells after 40 days of afatinib exposure).

Next, we sought to determine if the two dominant clones displayed distinct characteristics compared to the remaining 190 neutral clones, allowing their identification using conventional clustering approaches. To address this question, we generated UMAP visualizations of tolerant cells at each time point and clustered them based on their transcriptional similarities (Additional File 1: Fig. S6). As seen in Additional File 1: Fig. S6, no single cluster exclusively houses cells derived from either the dominant or neutral clones, except for cluster 5 of Day 0 tolerant cells, which however comprises only 12 cells.

To identify key genes driving the cellular expansion of the two dominant clones during the first nine days of afatinib treatment, we divided cells into two groups according to the behaviour of their clone of origin, i.e. “dominant” or “neutral”. Then, we ordered the two groups of cells along a linear pseudotime (pt) trajectory [42] from day zero to day nine and used time-course differential expression analysis [43] to identify genes that were up-regulated in the cells of the dominant clones (see Methods section). This clone resolution analysis identified a total of 549 driver genes that were significantly up-regulated (FDR < 5%) in the dominant clones (Fig. 2H and Additional File 2: Table S4), with IGFBP2 again being the most up-regulated gene (Fig. 2I,J). These findings suggest that IGFBP2 could be a player in both resistance initiation and cellular expansion during afatinib treatment.

To test this hypothesis, we generated two novel cell lines from the parental MDAMB468 using CRISPRa and CRISPRi techniques: the MDAMB468-VPH-IGFBP2 (stably overexpressed IGFBP2) and the MDAMB468-KRAB-IGFBP2 (stably downregulated IGFBP2) (Additional File 1: Fig. S7A,B and Methods section). As shown in Fig. 3A,B IGFBP2 knockdown significantly increased afatinib cytotoxicity while IGFBP2 overexpression significantly decreased it. Furthermore, IGFBP2 overexpression allowed cell colony growth even with a concentration of afatinib ranging from 2 to 10 µM (Fig. 3C). These results were further validated in 3D cell culture, where IGFBP2 overexpression increased MDAMB468 spheroid growth at 2 µM afatinib exposure (Fig. 3D). In contrast, IGFBP2 knockdown inhibited colony formation after 5 days of exposure to 1 µM afatinib (Fig. 3E) and significantly inhibited MDAMB468 spheroid growth during afatinib treatment (Fig. 3F). To investigate whether IGFBP2 could confer resistance to afatinib in other TNBC cell lines, we selected additional highly afatinib-responsive (Additional File 1: Fig. S3) TNBC cell lines (HCC1806 and HDQP1) and again employed the CRISPRa system to generate cells stably overexpressing IGFBP2 (Additional File 1: Fig. S7C). As shown in Additional File 1: Fig. S8A,B IGFBP2 overexpression decreased afatinib cytotoxicity in both cell lines.

Finally, to dissect the biological process sustained by IGFBP2 expression, we performed bulk 3’ RNA-sequencing of MDAMB468-VHP-IGFBP2 cells. Our analysis identified 856 transcripts that were significantly modulated by IGFBP2 overexpression (FDR < 0.05), with 373 were up-regulated and 483 down-regulated (Additional File 2: Table S5). GOEA of the 856 differentially expressed genes showed a strong enrichment for genes related to the PI3K-AKT axis/AMPK signalling [64, 69] but also many genes related to cell adhesion, growth, and migration (Additional File 2: Table S6). Indeed, IGFBP2 knockdown impaired spheroid growth of untreated MDAMB468 cells, conversely to its overexpression (Fig. 3G,H). IGFBP2 overexpression also increases cell migration in MDAMB468 (Fig. 3I) and additional TN cell lines such as HCC1806 and HDQP1 (Additional File 1: Fig. S8C,D). Moreover, 88 of the 856 genes modulated by IGFBP2 overexpression were also found among the 549 driver genes upregulated in the dominant lineages. These 88 genes were primarily associated to cell growth, adhesion, and metabolic processes (Additional File 2: Table S7).

In conclusion, we have demonstrated the capacity of our cell lineage tracing strategy to enable the reconstruction of cell expansion at single-clone resolution, providing insights into the dynamic nature of drug resistance mechanisms. Our findings also reinforce the role of IGFBP2 as one of the players influencing afatinib adaptation in MDAMB468 cells.

Temporal analysis of afatinib-mediated transcriptional programs in MDAMB468 cells

To investigate the coordination of gene activity and transcriptional programs among drug-resistant sub-populations of cancer cells, we clustered the expression patterns of the 549 driver genes upregulated in drug-tolerant lineages along their reconstructed pseudotime trajectory (Methods section). After clustering, four distinct patterns of transcriptional adaptation to afatinib treatment emerged (Fig. 4A). Cluster one included genes with an “early” transcriptional response, increasing significantly after three days of treatment, while cluster two included genes with a “delayed” transcriptional response, increasing significantly after six days of treatment. Cluster three included genes that progressively downregulate in expression during afatinib treatment, while genes in cluster four exhibited a transient decrease in expression followed by a return to baseline levels.

Fig. 4
figure 4

Gene expression dynamics in response to afatinib adaptation. A 549 driver genes clustered according to their expression. Cells are binned according to their reconstructed pseudotime. The average expression profile of the genes in the cluster and the cell composition of each bin as a percentage of cells in each timepoint is reported below each cluster. B Gene ontology enrichment analysis of 549 genes (bottom) and for each enriched term the percentage of genes belonging to a specific cluster of (A). C Genes in the enriched term EGFR tyrosine kinase inhibitor resistance and the cluster of (A) in which they fall

Next, we employed time-resolved Gene Ontology Enrichment Analyses (GOEA) to decipher the sequential activation of the transcriptional programs driving afatinib adaptation. To this end, we first performed GOEA considering all 549 driver genes (Additional File 2: Table S8) and then reconstructed the specific activation time of each significant GO term by calculating the proportion of genes it included from each cluster. GOEA analysis of the 549 driver genes (Fig. 4B) revealed several biological processes previously linked to drug resistance in other cancer types, including lysosome biogenesis, Reactive Oxygen Species (ROS) homeostasis, and fatty acid metabolism [70,71,72,73,74,75,76], as well as specific genes linked to EGFR resistance, such as PDGF-C [77], FGF2 [78, 79], PIK3R2 [80], HRAS [81], MAPK3 [82, 83], GAS6 [84], and BAX [85] (Fig. 4C).

Interestingly, PDGF-C and FGF2 are PDGFR-α and FGFR ligands, respectively. They are well-established compensatory pathways activated to overcome EGFR inhibition [86, 87] often associated with the acquisition of mesenchymal features [65, 66] sustained by the AXL pathway [88, 89] whose activator is GAS6 (Fig. 4C). Lysosome biogenesis and lipogenesis have been extensively studied for their involvement in drug adaptation to many anticancer drugs in multiple cancer types, including both standard chemotherapeutics and targeted therapy [70,71,72,73,74,75,76]. Enhanced lysosomal biogenesis enables the sequestration of hydrophobic weak base compounds, thereby reducing the cytotoxic potential of a drug by limiting its availability at the site of action [70, 71]. On the other hand, enhanced lipogenesis has been reported to contribute to drug adaptation by reducing drug uptake and participating in antioxidant cell defence by regulating membrane fluidity [72,73,74,75,76].

As shown in Fig. 4B, a temporal program emerges, although most pathways exhibit genes distributed across all four clusters. For instance, lysosomal genes are particularly enriched in clusters one and two, suggesting a possible cellular early response to drug-induced damage. Conversely, genes associated with fatty acid metabolism, ROS, and EGFR TKI resistance (Fig. 4B,C) exhibit expression profiles falling within clusters three and four, where gene expression is elevated from day 0 (i.e., untreated cells), suggesting that these pathways could be inherent characteristics of drug-tolerant cells that are either required in later stages of drug adaptation (cluster four) or not (cluster three).

In conclusion, these results support the hypothesis that during the afatinib response, along with the induction of xenobiotic detoxification mechanisms to reduce drug availability, tolerant clones activate compensatory pathways beyond IGF1-R to maintain pro-survival signals. This enables them to reduce Reactive Oxygen Species (ROS) production [69] while mitigating their damaging effects by producing antioxidants and facilitating Epithelial to Mesenchymal Transition (EMT) [87, 90].

Development of a deep learning approach to predict afatinib sensitivity in TNBC

Our retrospective lineage tracing analysis on barcoded MDAMB468 cells identified 374 genes whose expression levels significantly influence afatinib responsiveness. This led us to question whether these genes could serve as effective predictive biomarkers for afatinib therapy in other TNBC cells. To address this question, we developed scASTRAL (single-cell Afatinib reSponse of TRiple negAtive ceLls) [51], a deep doublet learning approach (Methods section) that utilizes both contrastive learning and Supported Vector Machine (SVM) to predict afatinib sensitivity from single cell data, leveraging the expression levels of the 374 marker genes identified in Fig. 2C.

Contrastive learning is a machine learning approach that aims to construct an optimized embedding space where similar sample pairs are pushed closer and dissimilar ones farther apart. As shown in Fig. 5A, scASTRAL uses as input cells whose transcriptional profiles are represented by the expression levels of the 374 marker genes of the afatinib response identified previously. The algorithm involves two main steps. In the first step, scASTRAL builds an embedding space where the cosine distance between cells belonging to clones of the same afatinib-response class (e.g., afatinib-sensitive or afatinib-tolerant) is minimized, while at the same time, the distance between cells belonging to clones of different afatinib-response classes is maximized. As illustrated in Fig. 5B, this effectively separates cells based on their afatinib response. In the second step scASTRAL trains an SVM classifier on the embedded data to distinguish afatinib-sensitive and afatinib-tolerant cells with high accuracy. This classifier can then be used to predict the afatinib sensitivity of new single-cell data from other treatment-naïve TNBCs.

Fig. 5
figure 5

Single-cell afatinib response prediction in TNBC cells. A Schematics of scASTRAL. Contrastive learning is used in the first step to build an embedding space where positive examples (two cells of the same class) are separated from negative examples (two cells of different classes). In the second step an SVM classifier is trained on the learned embedded space to predict the class (i.e. afatinib-tolerant or afatinib-sensitive) of novel cells. B Inter- and intra- cosine distance between sensitive and tolerant MDAMB468 cells before and after the training. Intra-distance is the cosine distance among cells of the same afatinib response class, while inter-distance is the distance among cells belonging to two different afatinib response classes C UMAP representation of 22,724 triple negative breast cancer cells from 16 cell lines. D Spearman Correlation Coefficient (SCC) between scASTRAL predicted resistance to afatinib and experimentally estimated IC50. E Histogram of SCC values computed between scASTRAL predicted sensitivity to afatinib and experimentally estimated IC50 using 374 random genes. 1,000 simulations were performed. Red arrow indicates SCC value obtained using the 374 afatinib response marker genes we identified with our retrospective lineage tracing approach. F UMAP representation of 41,189 triple-negative breast cancer (TNBC) cells extracted from treatment-naïve primary tumours of 16 patients. G Estimation of the proportion of cells sensitive to afatinib using DREEP and BeyondCell tools. H Spearman Correlation Coefficient (SCC) between scASTRAL-predicted afatinib sensitivity and estimated sensitivity obtained from DREEP and BeyondCell tools

We trained scASTRAL model on the labelled 1,541 MDAMB468 control cells (Methods section) and then assessed its performance in predicting afatinib sensitivity on 22,724 single-cell transcriptional profiles from 16 TNBC cell lines (Fig. 5C). Specifically, single cell data of 11 out of the 16 cell lines (15,022 cells) were obtained from the cell line breast cancer atlas we recently published [26], while data from the remaining 5 cell lines (7,702 cells), were de-novo sequenced using the DROP-seq platform [47] (Methods section). We then employed scASTRAL to project the 22,724 triple negative cells into the learned afatinib-response space and classify each cell as either afatinib-sensitive or afatinib-resistant.

To evaluate scASTRAL’s performance in predicting afatinib sensitivity, we converted predictions to the cell-line level by computing the fraction of predicted afatinib-sensitive cells in each cell line and correlated these values with the experimentally determined afatinib IC50 values of each cell line. Afatinib IC50 values were retrieved from the Genomics of Drug Sensitivity in Cancer (GDSC) database v2 [58] or measured de-novo when unavailable (Additional File 2: Table S9). As shown in Fig. 5D and Additional File 2: Table S10, scASTRAL’s predicted cell-line sensitivity exhibited a significant correlation with the corresponding experimentally determined IC50 values (SCC = 0.68, P = 0.0038). Consistent with expectations, our model's predictive accuracy for resistance in treated cells significantly diminished. scASTRAL accurately classified only 5% of Day 3 cells, 12% of Day 6 cells, and a mere 2% of Day 9 cells as resistant. This decline in predictive performance is probably due to the disruption of marker gene expression induced by afatinib treatment. Indeed, as shown in Additional File 1: Fig. S9, 230 out of the 374 marker genes consistently exhibited differential expression across all time points, indicating compelling evidence of treatment-induced alterations.

Next, to assess the predictive power of our 374 marker genes, we performed a series of Monte Carlo simulations (Methods section) to evaluate the performance of randomly selected gene sets in predicting afatinib sensitivity. Specifically, in each simulation: (i) we randomly selected 347 genes from the overall gene pool; (ii) we trained scASTRAL on MDAMB468 control cells; (iii) we used the trained model to predict the afatinib response of the 16 TNBC cell lines described above; and (iv) we evaluated the performance of scASTRAL correlating predicted sensitivities with the experimentally estimated ones. We repeated this process 1,000 times. As shown in Fig. 5E, our 374 marker genes consistently outperformed the randomly selected gene sets, suggesting that they possess superior predictive power for afatinib sensitivity. Next, we employed the Permutation Feature Importance (PFI) method [57] to evaluate the relative significance of the 374 marker genes associated with afatinib response that we had identified. The PFI algorithm is designed to focus solely on the predictive performance of the model, evaluating the importance of a feature by quantifying the increase in the model’s prediction error when the feature values are permuted (Methods section). This analysis disclosed that out of the 374 genes, 212 had a positive PFI score (Additional File 2: Table S11), indicating their heightened significance in the model’s predictions. Interestingly, 41 of these significant genes, which constitute 19% of the total, are among those modulated by IGFBP2 overexpression.

Finally, to evaluate scASTRAL's applicability on bulk transcriptomic profiles, we employed two approaches: first, aggregating single-cell expression profiles from the 16 TNBC cell lines (Fig. 5A) into pseudo-bulk profiles, and second, downloading bulk RNA-seq data from DeepMap for 19 triple-negative breast cancer cell lines with corresponding afatinib IC50 values from the GDSC portal. We then applied scASTRAL to both datasets and mirrored our single-cell data validation approach by correlating the estimated probability of afatinib resistance with experimentally determined IC50 values. Notably, both approaches yielded significant Spearman correlation coefficients: 0.48 for pseudo-bulk data and 0.49 for bulk RNA-seq data (Additional File 1: Fig. S10). This suggests that scASTRAL may be readily adaptable for use with bulk transcriptomic data, potentially offering a valuable tool for predicting afatinib response in triple-negative breast cancer.

Overall, these findings demonstrate the potential of our retrospective lineage tracing approach to identify biomarker genes predictive of drug response.

scASTRAL performance and comparison with previous method using patient scRNA-seq data

Cancer cell lines often exhibit limited transcriptomic heterogeneity, lacking the full complexity of actual tumors. Therefore, to demonstrate a possible clinical applicability of scASTRAL, we tested our methodology to a series of single-cell datasets derived from primary sites of individuals diagnosed with triple-negative breast cancer (TNBC). To this end, we assembled 41,189 single-cell transcriptional profiles encompassing 16 treatment-naive primary TNBC patients [53, 54] (Fig. 5F). Since experimental data on afatinib sensitivity for these patients is unavailable, we employed an ensemble prediction approach using two state-of-the-art bioinformatics tools, DREEP [55] and Beyondcell [56], to estimate afatinib sensitivity from single-cell expression profiles (Fig. 5G) (Methods section). This strategy enabled us to estimate the proportion of afatinib-sensitive cells within each patient’s tumor cell population, which served as a reference for evaluating the performance of scASTRAL. As shown in Fig. 5H, we observed a spearman correlation coefficient of 0.59 (p-value: 0.019), suggesting that our scASTRAL model can predict afatinib sensitivity with reasonable accuracy in TNBC patients as well. While this preliminary analysis provides some encouraging evidence for the potential clinical utility of scASTRAL, it is crucial to recognize that further validation studies involving larger datasets of patients treated with anti-EGFR therapy are essential to fully evaluate the model’s performance on more complex data as primary tumour’s biopsy. However, such data is currently unavailable.

These results suggest scASTRAL, could potentially be utilized to stratify TNBC patients according to their afatinib response.

Discussion

Owing to its inherent genetic complexity and the absence of recurrent oncogenic alterations, TNBC currently has limited treatment options beyond conventional chemotherapies [3, 5]. Although the majority of primary TNBCs exhibit increased expression of EGFR due to an increased gene copy number, EGFR-targeted therapies have demonstrated variable and unpredictable clinical responses in TNBC. Hence, novel approaches are urgently needed to identify drug response biomarker genes that can effectively stratify TNBC patients for tailored EGFR-targeted therapies [18,19,20,21,22]. Although clinical studies of EGFR inhibitors across various tumour types have not shown sufficiently high response rates to warrant their use in unselected patients, durable responses have been observed at low frequencies across several epithelial tumour types [91, 92]. This highlights the need for novel approaches for robust biomarker identification strategies to accurately predict patient outcomes and optimize the use of EGFR-targeted therapies in TNBC.

In this study, we combined single-cell transcriptomics, cellular barcoding, and time-resolved computational analyses to provide a comprehensive transcriptional characterization of MDAMB468 cells in response to afatinib treatment, a potent TKI that irreversibly inhibits both EGFR and HER2 proteins.

Retrospective lineage tracing analysis uncovered a pre-existing subpopulation of MDAMB468 afatinib-tolerant cells exhibiting distinct biological features, such as elevated mRNA levels of the IGFBP2 gene. We provide experimental validation that IGFBP2 overexpression is sufficient to make TNBC cells tolerant to afatinib treatment through activation of the compensatory IGF1-R signalling pathway in MDAMB468 cells. Additionally, prospective lineage tracing analysis revealed the activation of several mechanisms that contribute to drug adaptation, including lysosome biogenesis, reactive oxygen species homeostasis, and fatty acid metabolism [70,71,72,73,74,75,76].

Our approach not only provided insights into the molecular mechanisms of afatinib resistance in MDAMB468 cells but also yielded a valuable tool for predicting drug response in other TNBC cells. Indeed, by leveraging deep learning techniques we devised an algorithm, scASTRAL, to accurately predict afatinib sensitivity from the expression levels of the 374 genes were identified through retrospective lineage tracing strategy. This signature demonstrated its generalizability by accurately predicting afatinib response in both publicly available and de novo single-cell datasets of TNBC cell lines and primary tumors. However, further studies are necessary to investigate whether the transcriptional programs linked to afatinib resistance are conserved across different tumor types and shared by other tyrosine kinase inhibitors (TKIs).

Conclusions

Unraveling the distinct responses of cancer cell clones to treatment is crucial for understanding how intra-tumoral heterogeneity shapes drug effectiveness and identifying novel druggable targets for personalized cancer therapy. Our study highlights the importance of lineage tracing strategies, enabling the detection of subtle nuances that may go unnoticed by conventional clustering approaches reliant on gene expression alone. Our findings showcase the promising potential of lineage tracing techniques in reconstructing cancer clonal evolution under therapeutic interventions and demonstrate effectiveness in identifying predictive biomarker genes that can guide personalized treatment strategies for cancer patients.

Availability of data and materials

Data produced during this study have been deposited in the Gene Expression Omnibus (GEO) database in the SuperSeries with accession code GSE228382 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE228382) [93]. It contains the single-cell transcriptomics of Afatinib treatment time series, the bulk 3’ RNA-sequencing of MDA-MB-468 cell line over-expressing IGFBP2 and the single-cell sequencings of the additional 5 TNBC cell lines used to test scASTRAL performances. scASTRAL code is available at https://github.com/gambalab/scASTRAL [51]. DropSeq Alignment pipeline can be found at https://github.com/gambalab/dropseq [48]. Seurat object with raw UMI counts, normalized counts and cells’ metadata are available on figshare at following https://doi.org/10.6084/m9.figshare.24787458 [94].

Abbreviations

AMPK:

AMP-activated protein kinase

BAX:

BCL2 Associated X, Apoptosis Regulator

CRISPRa:

CRISPR activation

CRISPRi:

CRISPR interference

DREEP:

Drug Response Estimation from single-cell Expression Profiles

DROP-seq:

Droplet-based single-cell RNA sequencing

EGFR:

Epidermal Growth Factor Receptor

EMT:

Epithelial to Mesenchymal Transition

FDR:

False Discovery Rate

FGF2:

Fibroblast growth factor 2

FGFR:

Fibroblast growth factor receptor

GAS6:

Growth arrest-specific protein 6

GDSC:

Genomics of Drug Sensitivity in Cancer

GOEA:

Gene Ontology Enrichment Analysis

HRAS:

Harvey rat sarcoma viral oncogene homolog

IGF1-R:

Insulin-like growth factor 1 receptor

IGFBP2:

Insulin-Like Growth Factor Binding Protein 2

INSR:

Insulin receptor

KRAS:

Kirsten rat sarcoma viral oncogene homolog

MAPK3:

Mitogen-activated protein kinase 3

PDGF-C:

Platelet-derived growth factor C

PDGFR:

Platelet-derived growth factor receptor

PFI:

Permutation Feature Importance

PT:

Pseudotime

ROS:

Reactive oxygen species

scASTRAL:

Single-cell Afatinib reSponse of TRiple negAtive ceLls

SCC:

Spearman Correlation Coefficient

scRNA-seq:

Single-cell RNA sequencing

SVM:

Supported Vector Machine

TKI:

Tyrosine kinase inhibitors

TN:

Triple-negative

TNBC:

Triple-negative breast cancer

UMAP:

Uniform Manifold Approximation and Projection

References

  1. Foulkes WD, Smith IE, Reis-Filho JS. Triple-Negative Breast Cancer. N Engl J Med. 2010;363:1938–48.

    Article  CAS  PubMed  Google Scholar 

  2. Bianchini G, Balko JM, Mayer IA, Sanders ME, Gianni L. Triple-negative breast cancer: Challenges and opportunities of a heterogeneous disease. Nat Rev Clin Oncol. 2016;13:674–90.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  3. Yin L, Duan J-J, Bian X-W, Yu S. Triple-negative breast cancer molecular subtyping and treatment progress. Breast Cancer Res. 2020;22:61.

    Article  PubMed  PubMed Central  Google Scholar 

  4. Hoogstraat M, Lips EH, Mayayo-Peralta I, Mulder L, Kristel P, van der Heijden I, Annunziato S, van Seijen M, Nederlof PM, Sonke GS, et al. Comprehensive characterization of pre- and post-treatment samples of breast cancer reveal potential mechanisms of chemotherapy resistance. NPJ Breast Cancer. 2022;8:60.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  5. Jiang YZ, Ma D, Suo C, Shi J, Xue M, Hu X, Xiao Y, Yu KD, Liu YR, Yu Y, et al. Genomic and transcriptomic landscape of triple-negative breast cancers: subtypes and treatment strategies. Cancer Cell. 2019;35:428–440.e5.

    Article  CAS  PubMed  Google Scholar 

  6. Koboldt DC, Fulton RS, McLellan MD, Schmidt H, Kalicki-Veizer J, McMichael JF, Fulton LL, Dooling DJ, Ding L, Mardis ER, et al. Comprehensive molecular portraits of human breast tumours. Nature. 2012;490:61–70.

    Article  CAS  Google Scholar 

  7. Perou CM, Sørlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, et al. Molecular portraits of human breast tumours. Nature. 2000;406:747–52.

    Article  CAS  PubMed  Google Scholar 

  8. Ali R, Wendt MK. The paradoxical functions of EGFR during breast cancer progression. Signal Transduct Target Ther. 2017;2:1–7.

    Google Scholar 

  9. Bhargava R, Gerald WL, Li AR, Pan Q, Lal P, Ladanyi M, Chen B. EGFR gene amplification in breast cancer: Correlation with epidermal growth factor receptor mRNA and protein expression and HER-2 status and absence of EGFR-activating mutations. Mod Pathol. 2005;18:1027–33.

    Article  CAS  PubMed  Google Scholar 

  10. Park HS, Jang MH, Kim EJ, Kim HJ, Lee HJ, Kim YJ, Kim JH, Kang E, Kim SW, Kim IA, et al. High EGFR gene copy number predicts poor outcome in triple-negative breast cancer. Mod Pathol. 2014;27:1212–22.

    Article  CAS  PubMed  Google Scholar 

  11. Tischkowitz M, Brunet J-S, Bégin LR, Huntsman DG, Cheang MCU, Akslen LA, Nielsen TO, Foulkes WD. Use of immunohistochemical markers can refine prognosis in triple negative breast cancer. BMC Cancer. 2007;7:134.

    Article  PubMed  PubMed Central  Google Scholar 

  12. Savage P, Blanchet-Cohen A, Revil T, Badescu D, Saleh SMI, Wang YC, Zuo D, Liu L, Bertos NR, Munoz-Ramos V, et al. A targetable EGFR-dependent tumor-initiating program in breast cancer. Cell Rep. 2017;21:1140–9.

    Article  CAS  PubMed  Google Scholar 

  13. Costa R, Shah AN, Santa-Maria CA, Cruz MR, Mahalingam D, Carneiro BA, Chae YK, Cristofanilli M, Gradishar WJ, Giles FJ. Targeting epidermal growth factor receptor in triple negative breast cancer: New discoveries and practical insights for drug development. Cancer Treat Rev. 2017;53:111–9.

    Article  CAS  PubMed  Google Scholar 

  14. Moore MJ, Goldstein D, Hamm J, Figer A, Hecht JR, Gallinger S, Au HJ, Murawa P, Walde D, Wolff RA, et al. Erlotinib plus gemcitabine compared with gemcitabine alone in patients with advanced pancreatic cancer: a phase iii trial of the national cancer institute of Canada clinical trials group. J Clin Oncol. 2007;25:1960–6.

    Article  CAS  PubMed  Google Scholar 

  15. Vermorken JB, Mesia R, Rivera F, Remenar E, Kawecki A, Rottey S, Erfan J, Zabolotnyy D, Kienzer H-R, Cupissol D, et al. Platinum-based chemotherapy plus Cetuximab in head and neck cancer. N Engl J Med. 2008;359:1116–27.

    Article  CAS  PubMed  Google Scholar 

  16. Van Cutsem E, Peeters M, Siena S, Humblet Y, Hendlisz A, Neyns B, Canon J-L, Van Laethem J-L, Maurel J, Richardson G, et al. Open-label phase III trial of Panitumumab plus best supportive care compared with best supportive care alone in patients with chemotherapy-refractory metastatic colorectal cancer. J Clin Oncol. 2007;25:1658–64.

    Article  PubMed  Google Scholar 

  17. Cappuzzo F, Finocchiaro G, Grossi F, Bidoli P, Favaretto A, Marchetti A, Valente ML, Cseh A, Clementi L, Massey D, et al. Phase II study of afatinib, an irreversible ERBB family blocker, in EGFR FISH-positive non-small-cell lung cancer. J Thorac Oncol. 2015;10:665–72.

    Article  CAS  PubMed  Google Scholar 

  18. Bianchini G, De Angelis C, Licata L, Gianni L. Treatment landscape of triple-negative breast cancer — expanded options, evolving needs. Nat Rev Clin Oncol. 2022;19:91–113.

    Article  CAS  PubMed  Google Scholar 

  19. Baselga J, Arteaga CL. Critical update and emerging trends in epidermal growth factor receptor targeting in cancer. J Clin Oncol. 2005;23:2445–59.

    Article  CAS  PubMed  Google Scholar 

  20. Carey LA, Rugo HS, Marcom PK, Mayer EL, Esteva FJ, Ma CX, Liu MC, Storniolo AM, Rimawi MF, Forero-Torres A, et al. TBCRC 001: randomized phase II study of Cetuximab in combination with carboplatin in stage IV triple-negative breast cancer. J Clin Oncol. 2012;30:2615–23.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  21. von Minckwitz G, Jonat W, Fasching P, du Bois A, Kleeberg U, Lück H-J, Kettner E, Hilfrich J, Eiermann W, Torode J, et al. A multicentre phase II study on gefitinib in taxane- and anthracycline-pretreated metastatic breast cancer. Breast Cancer Res Treat. 2005;89:165–72.

    Article  CAS  Google Scholar 

  22. Nakai K, Hung M-C, Yamaguchi H. A perspective on anti-EGFR therapies targeting triple-negative breast cancer. Am J Cancer Res. 2016;6:1609–23.

    CAS  PubMed  PubMed Central  Google Scholar 

  23. He J, Huang Z, Han L, Gong Y, Xie C. Mechanisms and management of 3rd-generation EGFR-TKI resistance in advanced non-small cell lung cancer (Review). Int J Oncol. 2021;59:90.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  24. Baselga J, Albanell J, Ruiz A, Lluch A, Gascón P, Guillém V, González S, Sauleda S, Marimón I, Tabernero JM, et al. Phase II and tumor pharmacodynamic study of gefitinib in patients with advanced breast cancer. J Clin Oncol. 2005;23:5323–33.

    Article  CAS  PubMed  Google Scholar 

  25. Shah SP, Roth A, Goya R, Oloumi A, Ha G, Zhao Y, Turashvili G, Ding J, Tse K, Haffari G, et al. The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature. 2012;486:395–9.

    Article  CAS  PubMed  Google Scholar 

  26. Gambardella G, Viscido G, Tumaini B, Isacchi A, Bosotti R, di Bernardo D. A single-cell analysis of breast cancer cell lines to study tumour heterogeneity and drug response. Nat Commun. 2022;13:1714.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  27. Slovin S, Carissimo A, Panariello F, Grimaldi A, Bouché V, Gambardella G, Cacchiarelli D. Single-cell RNA sequencing analysis: a step-by-step overview BT - RNA bioinformatics. Picardi E. (ed). New York: Springer US; 2021. p. 343–65.

    Google Scholar 

  28. Aissa AF, Islam ABMMK, Ariss MM, Go CC, Rader AE, Conrardy RD, Gajda AM, Rubio-Perez C, Valyi-Nagy K, Pasquinelli M, et al. Single-cell transcriptional changes associated with drug tolerance and response to combination therapies in cancer. Nat Commun. 2021;12:1–25.

    Article  Google Scholar 

  29. Sankaran VG, Weissman JS, Zon LI. Cellular barcoding to decipher clonal dynamics in disease. Science. 2022;378(6616):378.

  30. Oren Y, Tsabar M, Cuoco MS, Amir-Zilberstein L, Cabanos HF, Hütter J-C, Hu B, Thakore PI, Tabaka M, Fulco CP, et al. Cycling cancer persister cells arise from lineages with distinct programs. Nature. 2021;596:576–82.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  31. Simeonov KP, Byrns CN, Clark ML, Norgard RJ, Martin B, Stanger BZ, Shendure J, McKenna A, Lengner CJ. Single-cell lineage tracing of metastatic cancer reveals selection of hybrid EMT states. Cancer Cell. 2021. https://doi.org/10.1016/j.ccell.2021.05.005.

    Article  PubMed  PubMed Central  Google Scholar 

  32. Filmus J, Pollak MN, Cailleau R, Buick RN. MDA-468, a human breast cancer cell line with a high number of epidermal growth factor (EGF) receptors, has an amplified EGF receptor gene and is growth inhibited by EGF. Biochem Biophys Res Commun. 1985;128:898–905.

    Article  CAS  PubMed  Google Scholar 

  33. Thomas T, Balabhadrapathruni S, Gardner CR, Hong J, Faaland CA, Thomas TJ. Effects of epidermal growth factor on MDA-MB-468 breast cancer cells: Alterations in polyamine biosynthesis and the expression of p21/CIP1/WAF1. J Cell Physiol. 1999;179:257–66.

    Article  CAS  PubMed  Google Scholar 

  34. Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C, Orchard R, et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol. 2016;34:184–91.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. Sanson KR, Hanna RE, Hegde M, Donovan KF, Strand C, Sullender ME, Vaimberg EW, Goodale A, Root DE, Piccioni F, et al. Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities. Nat Commun. 2018;9:5416.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. Gambardella G, di Bernardo D. A tool for visualization and analysis of single-cell RNA-Seq data based on text mining. Front Genet. 2019;10:734.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  37. Franchini M, Pellecchia S, Viscido G, Gambardella G. Single-cell gene set enrichment analysis and transfer learning for functional annotation of scRNA-seq data. NAR Genomics Bioinforma. 2023;5:lqad024.

    Article  Google Scholar 

  38. Zhang Y, Parmigiani G, Johnson WE. ComBat-seq: batch effect adjustment for RNA-seq count data. NAR Genomics Bioinforma. 2020;2:lqaa078.

    Article  Google Scholar 

  39. Kleshchevnikov V, Shmatko A, Dann E, Aivazidis A, King HW, Li T, Elmentaite R, Lomakin A, Kedlian V, Gayoso A, et al. Cell 2location maps fine-grained cell types in spatial transcriptomics. Nat Biotechnol. 2022;40:661–71.

    Article  CAS  PubMed  Google Scholar 

  40. Di Veroli GY, Fornari C, Wang D, Mollard S, Bramhall JL, Richards FM, Jodrell DI. Combenefit: an interactive platform for the analysis and visualization of drug combinations. Bioinformatics. 2016;32:2866–8.

    Article  PubMed  PubMed Central  Google Scholar 

  41. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40.

    Article  CAS  PubMed  Google Scholar 

  42. Macnair W, Gupta R, Claassen M. psupertime: supervised pseudotime analysis for time-series single-cell RNA-seq data. Bioinformatics. 2022;38:i290–8.

    Article  PubMed  PubMed Central  Google Scholar 

  43. Michna A, Braselmann H, Selmansberger M, Dietz A, Hess J, Gomolka M, Hornhardt S, Blüthgen N, Zitzelsberger H, Unger K. Natural Cubic Spline Regression Modeling Followed by Dynamic Network Reconstruction for the Identification of Radiation-Sensitivity Gene Association Networks from Time-Course Transcriptome Data. PLoS One. 2016;11:e0160791.

    Article  PubMed  PubMed Central  Google Scholar 

  44. Grossman RL, Heath AP, Ferretti V, Varmus HE, Lowy DR, Kibbe WA, Staudt LM. Toward a shared vision for cancer genomic data. N Engl J Med. 2016;375:1109–12.

    Article  PubMed  PubMed Central  Google Scholar 

  45. Colaprico A, Silva TC, Olsen C, Garofano L, Cava C, Garolini D, Sabedot TS, Malta TM, Pagnotta SM, Castiglioni I, et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2015;44:e71–e71.

    Article  PubMed  PubMed Central  Google Scholar 

  46. Dzul Keflee R, Leong KH, Ogawa S, Bignon J, Chan MC, Kong KW. Overview of the multifaceted resistances toward EGFR-TKIs and new chemotherapeutic strategies in non-small cell lung cancer. Biochem Pharmacol. 2022;205:115262.

    Article  CAS  PubMed  Google Scholar 

  47. Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, et al. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell. 2015;161:1202–14.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Gambardella G. 2024. https://github.com/gambalab/dropseq.

  49. Kalaitzis AA, Lawrence ND. A simple approach to ranking differentially expressed gene expression time courses through gaussian process regression. BMC Bioinformatics. 2011;12:180.

    Article  PubMed  PubMed Central  Google Scholar 

  50. Kingma DP, Ba J. Adam: a method for stochastic optimization. 2017.

    Google Scholar 

  51. Arnese R, Gambardella G. 2024. https://github.com/gambalab/scASTRAL.

  52. Jeffreys H. The theory of probability. 1939.

    Google Scholar 

  53. Pal B, Chen Y, Vaillant F, Capaldo BD, Joyce R, Song X, Bryant VL, Penington JS, Di Stefano L, Tubau Ribera N, et al. A single-cell RNA expression atlas of normal, preneoplastic and tumorigenic states in the human breast. EMBO J. 2021;40:e107333.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  54. Wu SZ, Al-Eryani G, Roden DL, Junankar S, Harvey K, Andersson A, Thennavan A, Wang C, Torpy JR, Bartonicek N, et al. A single-cell and spatially resolved atlas of human breast cancers. Nat Genet. 2021;53:1334–47.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  55. Pellecchia S, Viscido G, Franchini M, Gambardella G. Predicting drug response from single-cell expression profiles of tumours. BMC Med. 2023;21:476.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  56. Fustero-Torre C, Jiménez-Santos MJ, García-Martín S, Carretero-Puche C, García-Jimeno L, Ivanchuk V, Di Domenico T, Gómez-López G, Al-Shahrour F. Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data. Genome Med. 2021;13:187.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Breiman L. Random Forests. Mach Learn. 2001;45:5–32.

    Article  Google Scholar 

  58. Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, Aben N, Gonçalves E, Barthorpe S, Lightfoot H, et al. A Landscape of pharmacogenomic interactions in cancer. Cell. 2016;166:740–54.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  59. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehar J, Kryukov GV, Sonkin D, et al. The Cancer Cell Line encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  60. Yang Y, Li S, Wang Y, Zhao Y, Li Q. Protein tyrosine kinase inhibitor resistance in malignant tumors: molecular mechanisms and future perspective. Signal Transduct Target Ther. 2022;7:329.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. Merino Salvador M, Gómez de Cedrón M, Moreno Rubio J, Falagán Martínez S, Sánchez Martínez R, Casado E, Ramírez de Molina A, Sereno M. Lipid metabolism and lung cancer. Crit Rev Oncol Hematol. 2017;112:31–40.

    Article  PubMed  Google Scholar 

  62. Kim S, Im JH, Kim WK, Choi YJ, Lee J-Y, Kim SK, Kim SJ, Kwon SW, Kang KW. Enhanced sensitivity of nonsmall cell lung cancer with acquired resistance to epidermal growth factor receptor-tyrosine kinase inhibitors to phenformin: the roles of a metabolic shift to oxidative phosphorylation and redox balance. Oxid Med Cell Longev. 2021;2021:1–17.

    Google Scholar 

  63. Wei LF, Weng XF, Huang XC, Peng YH, Guo HP, Xu YW. IGFBP2 in cancer: Pathological role and clinical significance (review). Oncol Rep. 2021;45:427–38.

    Article  CAS  PubMed  Google Scholar 

  64. Russo VC, Azar WJ, Yau SW, Sabin MA, Werther GA. IGFBP-2: The dark horse in metabolism and cancer. Cytokine Growth Factor Rev. 2015;26:329–46.

    Article  CAS  PubMed  Google Scholar 

  65. Zhang J, Jia J, Zhu F, Ma X, Han B, Wei X, Tan C, Jiang Y, Chen Y. Analysis of bypass signaling in EGFR pathway and profiling of bypass genes for predicting response to anticancer EGFR tyrosine kinase inhibitors. Mol Biosyst. 2012;8:2645–56.

    Article  CAS  PubMed  Google Scholar 

  66. Shi K, Wang G, Pei J, Zhang J, Wang J, Ouyang L, Wang Y, Li W. Emerging strategies to overcome resistance to third-generation EGFR inhibitors. J Hematol Oncol. 2022;15:94.

    Article  PubMed  PubMed Central  Google Scholar 

  67. Liu L, Zhu H, Liao Y, Wu W, Liu L, Liu L, Wu Y, Sun F, Lin H. Inhibition of Wnt/β-catenin pathway reverses multi-drug resistance and EMT in Oct4+/Nanog+ NSCLC cells. Biomed Pharmacother. 2020;127:110225.

    Article  CAS  PubMed  Google Scholar 

  68. Lee Y, Wang Y, James M, Jeong JH, You M. Inhibition of IGF1R signaling abrogates resistance to afatinib (BIBW2992) in EGFR T790M mutant lung cancer cells. Mol Carcinog. 2016;55:991–1001.

    Article  CAS  PubMed  Google Scholar 

  69. Chua CY, Liu Y, Granberg KJ, Hu L, Haapasalo H, Annala MJ, Cogdell DE, Verploegen M, Moore LM, Fuller GN, et al. IGFBP2 potentiates nuclear EGFR–STAT3 signaling. Oncogene. 2016;35:738–47.

    Article  CAS  PubMed  Google Scholar 

  70. Gong Y, Duvvuri M, Duncan MB, Liu J, Krise JP. Niemann-Pick C1 protein facilitates the efflux of the anticancer drug daunorubicin from cells according to a novel vesicle-mediated pathway. J Pharmacol Exp Ther. 2006;316:242–7.

    Article  CAS  PubMed  Google Scholar 

  71. Groth-Pedersen L, Ostenfeld MS, Høyer-Hansen M, Nylandsted J, Jäättelä M. Vincristine induces dramatic lysosomal changes and sensitizes cancer cells to lysosome-destabilizing siramesine. Cancer Res. 2007;67:2217–25.

    Article  CAS  PubMed  Google Scholar 

  72. Robey RB, Weisz J, Kuemmerle NB, Salzberg AC, Berg A, Brown DG, Kubik L, Palorini R, Al-Mulla F, Al-Temaimi R, et al. Metabolic reprogramming and dysregulated metabolism: cause, consequence and/or enabler of environmental carcinogenesis? Carcinogenesis. 2015;36(Suppl 1):S203–31.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  73. Stine ZE, Schug ZT, Salvino JM, Dang CV. Targeting cancer metabolism in the era of precision oncology. Nat Rev Drug Discov. 2022;21:141–62.

    Article  CAS  PubMed  Google Scholar 

  74. Xu C, Zhang L, Wang D, Jiang S, Cao D, Zhao Z, Huang M, Jin J. Lipidomics reveals that sustained SREBP-1-dependent lipogenesis is a key mediator of gefitinib-acquired resistance in EGFR-mutant lung cancer. Cell Death Discov. 2021;7:353.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  75. Munir R, Lisec J, Swinnen JV, Zaidi N. Lipid metabolism in cancer cells under metabolic stress. Br J Cancer. 2019;120:1090–8.

    Article  PubMed  PubMed Central  Google Scholar 

  76. Rysman E, Brusselmans K, Scheys K, Timmermans L, Derua R, Munck S, Van Veldhoven PP, Waltregny D, Daniëls VW, Machiels J, et al. De novo lipogenesis protects cancer cells from free radicals and chemotherapeutics by promoting membrane lipid saturation. Cancer Res. 2010;70:8117–26.

    Article  CAS  PubMed  Google Scholar 

  77. Cenciarelli C, Marei HE, Zonfrillo M, Pierimarchi P, Paldino E, Casalbore P, Felsani A, Vescovi A, Maira G, Mangiola A. PDGF receptor alpha inhibition induces apoptosis in glioblastoma cancer stem cells refractory to anti-Notch and anti-EGFR treatment. Mol Cancer. 2014;13:247.

    Article  PubMed  PubMed Central  Google Scholar 

  78. Azuma K, Kawahara A, Sonoda K, Nakashima K, Tashiro K, Watari K, Izumi H, Kage M, Kuwano M, Ono M, et al. FGFR1 activation is an escape mechanism in human lung cancer cells resistant to afatinib, a pan-EGFR family kinase inhibitor. Oncotarget. 2014;5:5908–19.

    Article  PubMed  PubMed Central  Google Scholar 

  79. Terai H, Soejima K, Yasuda H, Nakayama S, Hamamoto J, Arai D, Ishioka K, Ohgino K, Ikemura S, Sato T, et al. Activation of the FGF2-FGFR1 autocrine pathway: a novel mechanism of acquired resistance to gefitinib in NSCLC. Mol Cancer Res. 2013;11:759–67.

    Article  CAS  PubMed  Google Scholar 

  80. Mak VC, Li X, Rao L, Zhou Y, Tsao S-W, Cheung LW. p85β alters response to EGFR inhibitor in ovarian cancer through p38 MAPK-mediated regulation of DNA repair. Neoplasia. 2021;23:718–30.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  81. Leonetti A, Sharma S, Minari R, Perego P, Giovannetti E, Tiseo M. Resistance mechanisms to osimertinib in EGFR-mutated non-small cell lung cancer. Br J Cancer. 2019;121:725–37.

    Article  PubMed  PubMed Central  Google Scholar 

  82. Huang L, Fu L. Mechanisms of resistance to EGFR tyrosine kinase inhibitors. Acta Pharm Sin B. 2015;5:390–401.

    Article  PubMed  PubMed Central  Google Scholar 

  83. Roskoski R. Targeting ERK1/2 protein-serine/threonine kinases in human cancers. Pharmacol Res. 2019;142:151–68.

    Article  CAS  PubMed  Google Scholar 

  84. Zhang Z, Lee JC, Lin L, Olivas V, Au V, Laframboise T, Abdel-Rahman M, Wang X, Levine AD, Rho JK, et al. Activation of the AXL kinase causes resistance to EGFR-targeted therapy in lung cancer. Nat Genet. 2012;44:852–60.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  85. Ma G, Deng Y, Qian L, Vallega KA, Zhang G, Deng X, Owonikoko TK, Ramalingam SS, Fang DD, Zhai Y, et al. Overcoming acquired resistance to third-generation EGFR inhibitors by targeting activation of intrinsic apoptotic pathway through Mcl-1 inhibition, Bax activation, or both. Oncogene. 2022;41:1691–700.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  86. Weng C-H, Chen L-Y, Lin Y-C, Shih J-Y, Lin Y-C, Tseng R-Y, Chiu A-C, Yeh Y-H, Liu C, Lin Y-T, et al. Epithelial-mesenchymal transition (EMT) beyond EGFR mutations per se is a common mechanism for acquired resistance to EGFR TKI. Oncogene. 2019;38:455–68.

    Article  CAS  PubMed  Google Scholar 

  87. Tulchinsky E, Demidov O, Kriajevska M, Barlev NA, Imyanitov E. EMT: A mechanism for escape from EGFR-targeted therapy in lung cancer. Biochim Biophys Acta - Rev Cancer. 2019;1871:29–39.

    Article  CAS  PubMed  Google Scholar 

  88. Taniguchi H, Yamada T, Wang R, Tanimura K, Adachi Y, Nishiyama A, Tanimoto A, Takeuchi S, Araujo LH, Boroni M, et al. AXL confers intrinsic resistance to osimertinib and advances the emergence of tolerant cells. Nat Commun. 2019;10:259.

    Article  PubMed  PubMed Central  Google Scholar 

  89. Kim D, Bach D-H, Fan Y-H, Luu T-T-T, Hong J-Y, Park HJ, Lee SK. AXL degradation in combination with EGFR-TKI can delay and overcome acquired resistance in human non-small cell lung cancer cells. Cell Death Dis. 2019;10:361.

    Article  PubMed  PubMed Central  Google Scholar 

  90. Bekeschus S. Acquired cancer tyrosine kinase inhibitor resistance: ROS as critical determinants. Signal Transduct Target Ther. 2021;6:437.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  91. Maubec E, Petrow P, Scheer-Senyarich I, Duvillard P, Lacroix L, Gelly J, Certain A, Duval X, Crickx B, Buffard V, et al. Phase II study of cetuximab as first-line single-drug therapy in patients with unresectable squamous cell carcinoma of the skin. J Clin Oncol. 2011;29:3419–26.

    Article  CAS  PubMed  Google Scholar 

  92. Mody K, Strauss E, Lincer R, Frank RC. Complete response in gallbladder cancer to erlotinib plus gemcitabine does not require mutation of the epidermal growth factor receptor gene: a case report. BMC Cancer. 2010;10:570.

    Article  PubMed  PubMed Central  Google Scholar 

  93. Pellecchia S, Franchini M, Viscido G, Arnese R, Gambardella G. GEO. 2024. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE228382.

  94. Pellecchia S, Franchini M, Viscido G, Arnese R, Gambardella G. Figshare. 2024. https://doi.org/10.6084/m9.figshare.24787458.

Download references

Acknowledgements

We thank the TIGEM high content screening facility for helping in 3D-spheroid analysis. We thank the TIGEM bioinformatics core for helping in dose response curve analyses. We express our gratitude to Dr. Cathal Wilson for meticulously proofreading our manuscript. His invaluable efforts have significantly contributed to enhance the clarity and accuracy of our work.

Funding

This work was supported by the My First AIRC grant 23162.

Author information

Authors and Affiliations

Authors

Contributions

SP performed time series experiment, 10 × sequencing and all the experimental validations. MF performed computational analyses. GV setup DropSeq platform and performed sequencing. RA developed the drug prediction method. GG supervised the work, wrote the manuscript, and conceived the original idea. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Gennaro Gambardella.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Pellecchia, S., Franchini, M., Viscido, G. et al. Single cell lineage tracing reveals clonal dynamics of anti-EGFR therapy resistance in triple negative breast cancer. Genome Med 16, 55 (2024). https://doi.org/10.1186/s13073-024-01327-2

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13073-024-01327-2