Simultaneous protein quantification increases the power of scRNA-seq to dissect the functional heterogeneity of human CD4+ T cells
In this study, we wanted to investigate the power of a unified high-throughput experimental workflow combining targeted scRNA-seq and the quantification of protein expression at the single-cell level, to dissect the heterogeneity of human primary CD4+ T cells in the blood. To address this question, we initially profiled the expression of 397 genes at the mRNA level, coupled with 37 protein targets (Additional file 1: Table S1) using the BD AbSeq technology, in CD4+ T cells isolated from the blood of an SLE patient. To enrich for the relative distribution of two less abundant CD4+ T cell subsets: (i) CD127lowCD25hi T cells, predominantly containing the Treg population; and (ii) CD127lowCD25low T cells, containing a subset of non-conventional CD25lowFOXP3+ Tregs previously characterised in autoimmune patients [17], we devised a FACS-sorting strategy to isolate and profile equal numbers of cells from the three defined T cell subsets (Fig. 1a). Following sorting, cells from each subset were labelled with a barcoded oligo-conjugated antibody (Sample Tag) prior to cell capture—a method related to the recently described cell-hashing technique [18]—to identify their original sorting gate and to assess the frequency of cell multiplets obtained in this experiment.
A total of 9898 captured cells passed the initial quality control (QC), of which a small proportion (1.9%; Additional file 2: Table S2) were assigned as multiplets and excluded from the analysis. Of note, we observed complete sequencing saturation of the mRNA library, assessed as the number of cDNA molecules with a novel unique molecular identifier (UMI) identified with increasing sequencing coverage, for a read depth of > 2700 reads/cell (Additional file 3: Figure S1a). In contrast, we obtained approximately 80% sequencing saturation at a read depth of > 6000 reads/cell for the AbSeq library (Additional file 3: Figure S1b). This is illustrated by the large dynamic range of expression of the protein targets, reaching up to thousands of unique copies in cells displaying higher levels of expression (Additional file 3: Figure S1c). Of note, the median expression of most assessed proteins, including all those that are known not to be expressed on CD4+ T cells, was zero copies (Additional file 3: Figure S1c), which demonstrates the high specificity of the AbSeq system. To test the sensitivity of this targeted approach to detect lowly expressed genes at the mRNA level at these sequencing coverages compared to whole-transcriptome scRNA-seq methods, we quantified the frequency of cells expressing at least one copy of FOXP3 within the Treg population. To standardise this comparison between studies, we took advantage of two publicly available PBMC datasets from 10× Genomics, with deep paired cell-surface protein expression data, and delineated the CD127lowCD25hi Treg population using the same sorting strategy employed here (Additional file 3: Figure S2a,b). While FOXP3 was detected in only 23.2% and 18.3% of Tregs in whole-transcriptome datasets (using the v3 and NextGEM chemistries, respectively), it was detected in 68.3% of Tregs using the targeted approach described in this manuscript (Additional file 3: Figure S2c). A similar difference was observed within the non-Treg gate, although as expected, the frequency of FOXP3+ cells was substantially lower (Additional file 3: Figure S2c). In addition, we observed different FOXP3 UMI count distributions, with a distinct skew towards higher UMI counts using the targeted approach (Additional file 3: Figure S2d). These data demonstrate the sensitivity of targeted scRNA-seq to detect lowly expressed genes, even at a fraction of the sequencing coverage (approx. 1/10th), therefore allowing to allocate additional sequencing resources to the protein expression libraries, which display much higher dynamic range of expression.
To further test the sensitivity of the AbSeq protein measurements, we next generated a two-dimensional plot depicting the AbSeq expression of IL-2RA (CD25) and IL-7R (CD127), which recapitulated the flow cytometric profile obtained with the same two markers used for the sorting of the assessed T cell subsets (Fig. 1b). Furthermore, by overlaying the Sample Tag information, we were able to confirm that the expression profiles of CD127 and CD25 mimicked the sorting strategy precisely for all three sorted CD4+ T cell subsets (Fig. 1b), therefore illustrating the highly quantitative nature of the protein measurements. We note that in our study, the read depth devoted to the protein library (approx. 6000 reads/cell) was insufficient to reveal the complete spectrum of CD127 and CD25 expression, resulting in reduced power to fully resolve CD25low/int T cells, when compared to flow cytometry. Although increased sequencing coverages in our experiment would provide only minimal gain for the functional characterisation of single cells and clustering, they are necessary to describe the complete dynamic range of protein expression, especially for highly expressed proteins such as CD127 and CD25, thereby leading to similar sensitivity between molecular flow cytometry, as previously demonstrated for CITE-seq [9], REAP-seq [10] and BD AbSeq [19].
Next, we performed unsupervised Louvain clustering combining the mRNA and protein expression data and visualised the clusters in a two-dimensional space using Uniform Manifold Approximation and Projection (UMAP) [20]. One of the main discriminators of functional differentiation in CD4+ T cells is the acquisition of a memory phenotype in response to antigen stimulation, typically marked by the expression of CD45RA on naive cells and CD45RO on memory cells. However, because these are splice isoforms of the same gene (PTPRC; CD45), discrimination cannot be achieved using UMI-based scRNA-seq systems targeting the 3′ or 5′ ends of the transcript. By measuring the expression of the two isoforms at the protein level, we were able to identify a marked expression gradient associated with a gradual loss of CD45RA and concomitant gain of CD45RO along the first component of the UMAP plot, indicating that the acquisition of a memory phenotype is indeed the main source of biological variation driving the clustering of CD4+ T cells in both the Teff and Treg compartments (Fig. 1c). One notable exception was the re-expression of CD45RA in the most differentiated memory cells (Fig. 1c). This observation is consistent with the phenotype of differentiated effector memory CD4+ T cells that re-express CD45RA (TEMRAs) [21] and illustrates the power of this highly multiparametric approach to identify subtle alterations in CD4+ T cell states, while mitigating the potential issue of cell-type misclassification based on a few prototypical markers such as CD45RA/RO.
In this study, we have opted to integrate the mRNA and protein data using standard normalisation methods, which have been developed for traditional scRNA-seq technologies. In the first approach, we do not differentiate between mRNA and protein libraries. To further investigate the relative contribution of the protein library to the underlying clustering, we also applied two alternative normalisation methods: (i) a hybrid method involving independent normalisation of protein and mRNA libraries and (ii) using only the mRNA library for clustering (Additional file 3: Figure S3a,b). Overall, we observed good concordance between the resulting cell clusters and their assigned functional annotation regardless of the normalisation method used, although small differences could be observed in the resolution of the smaller clusters (Additional file 3: Figure S3c,d). These data suggest that for this dataset, containing a relatively small number of protein markers, this issue does not significantly alter the functional annotation of the clusters and the interpretation of the results. Nevertheless, the transcriptomics-only approach provided slightly inferior cell-cluster resolution and spatial separation of the naive-memory differentiation trajectory, indicating the added value of protein data for the functional annotation of the clusters. With the growing popularity of molecular cytometry and increasing number of proteins assessed, it will be important to consider the effects of data normalisation to the interpretation of the results, and therefore, further work is warranted to develop better methods to integrate multi-omics datasets.
Single-cell mRNA and protein immunophenotyping identifies distinct trajectories of CD4+ T cell differentiation in blood
Integration of the multiparametric transcriptional and proteomics data identified discrete clusters of CD4+ T cells along the naive/memory differentiation axis (Fig. 1d). We observed an increased number of clusters within the memory compartment, marked by the differential expression of defined sets of signature genes (Fig. 1e and Additional file 3: Figure S4a), which was consistent with the increased functional heterogeneity in differentiated CD4+ T cell subsets. Moreover, we observed that the expression of the canonical Th1 (TBX21, encoding TBET) and Th17 (RORC, encoding RORγt) lineage-defining transcription factors was restricted to specific clusters within the memory Teff (mTeff) population (Fig. 1f), indicating that these clusters are highly enriched for Th1 and Th17 Teffs. More importantly, we observed a distinct gradient of expression of these transcription factors. Consistent with this gradient of functional differentiation, we observed marked co-expression of canonical Th1 effector-type molecules and TBET (Fig. 1f), revealing a subset of highly activated Th1 T cells with a putative cytotoxic profile in the blood of this SLE patient. Similarly, a gradient of expression of Th17 signature genes, including RORC, could be observed from clusters 8 to 7 (Fig. 1e, f), indicating a trajectory of Th17 differentiation.
In addition to resting CD4+ T cells, we also profiled the same subsets of cells following short in vitro stimulation (90 min) with PMA + ionomycin, to assess cell type-specific cytokine production. Similarly to the resting condition, in vitro-stimulated CD4+ T cells formed discrete clusters along the naive-memory differentiation axis (Additional file 3: Figure S5a,b). Furthermore, we observed a consistent induction of expression of Th1 (IFNγ) and Th17 (IL-22) type cytokines that were restricted to the respective Th1 and Th17 clusters (Additional file 3: Figure S5c,d). Although primers for the Th2 transcription factor gene GATA3 were not included in this assay, therefore precluding the annotation of Th2 cells in resting CD4+ T cells, we noted that in vitro stimulation revealed a distinct cluster of Th2 mTeffs cells marked by the expression of Th2-type cytokines, such as IL-13 (Additional file 3: Figure S5e), IL-4, IL-5 and IL-9 (Additional file 3: Figure S4b).
Recently, several scRNA-seq studies have refined our understanding of the heterogeneity of CD4+ Tregs and their functional adaptation in tissues, in both mice and humans [22, 23]. Given the sorting strategy used in this study, we were able to significantly enrich our CD4+ T cell dataset for Tregs, which are highly enriched within the CD127lowCD25hi population. Consistent with this enrichment strategy, we identified a large Treg population, marked by the expression of the transcription factor FOXP3 and other classical Treg signature genes, including HELIOS (encoded by IKZF2), IL-2RA, CTLA-4 or TIGIT (Fig. 2a, b). In agreement with their Treg-specific transcriptional programme, we found a marked suppression of IL-2 transcription in Tregs following in vitro stimulation (Additional file 3: Figure S5f). We also identified a naive Treg cluster (cluster 0, Fig. 1d) marked by elevated expression of the canonical naive T cell marker CD45RA and concurrent low expression of CD45RO (Fig. 1c). Furthermore, naive Tregs also displayed elevated expression of additional naive T cell signature genes, including CD62L (SELL), CCR7 and CD7 (Additional file 3: Figure S4a), highlighting a population of thymically derived naive Tregs that have been previously characterised in the thymus and cord blood, decreasing with age in adults [24]. In particular, we found that the differentiation of Tregs from a naive to memory phenotype was strongly associated with the expression of two transcription factors: BACH2 and BLIMP1 (encoded by PRDM1). These two key transcription factors displayed a distinct mutually exclusive expression pattern, with high expression of BACH2 mRNA in naive cells, declining gradually—with a concomitant gradual increase in PRDM1 expression—along the naive-memory differentiation axis (Fig. 2c). The gradual increase of PRDM1 expression was found to be strongly associated with the expression of Treg activation markers such as HLA-DRA, DUSP4 and CD39 (Fig. 2d), and revealed a trajectory of Treg activation in resting primary CD4+ T cells. These data suggest that the transcriptional interplay between BACH2 and BLIMP-1 is critical to regulate the differentiation of memory Treg subsets, which is in agreement with previous data in both mice and humans [22]. The dynamic interplay of BACH2 and PRDM1 in the differentiation of Tregs was even more pronounced following in vitro stimulation (Additional file 3: Figure S6), which further supports the hypothesis that they are primary regulators of the transcriptional programme associated with the differentiation of activated Tregs in humans, in response to antigen stimulation.
Statistical methods are currently being developed to identify and reconstruct developmental trajectories from heterogeneous scRNA-seq datasets using pseudotime analysis. To validate our findings, we next applied the recently developed partition-based graph abstraction (PAGA) method [15] to reconstruct the developmental trajectories in our dataset. Consistent with our previous findings, the pseudotime analysis revealed a gradient of T cell differentiation along the naive-memory differentiation axis, which lead to the identification of three distinct differentiation pathways associated with the acquisition of a Th1, Th17 or Treg phenotype (Fig. 3a, b). These identified differentiation trajectories were associated with gradually increased expression of the lineage-specific transcription factors TBET, RORγt and FOXP3, respectively, which regulate the transcriptional programme associated with the respective T cell lineages (Fig. 3c–e). In particular, we confirmed a very distinct and gradual differentiation of the Th1 lineage in this SLE patient, leading to the temporal acquisition of activated Th1 cells expressing IFN-γ in cluster 5 and the terminal differentiation of a subset with a cytotoxic profile (cluster 9). Of note, this analysis identified cluster 2 as an intermediate memory Teff cell state, leading to the differentiation of either Th1 (cluster 5 and 9) or Th17 (cluster 7) T cells. Moreover, the pseudotime analysis also recapitulated the Treg differentiation trajectory from naive Tregs (cluster 0) to activated memory Tregs (cluster 3), which was regulated by the mutually exclusive expression of the BACH2 and BLIMP-1 transcription factors (Fig. 3e). An intriguing observation was the identification of cluster 8 representing a potential intermediate T cell state on a Treg-Th17 developmental pathway, which is consistent with the plasticity and putative common co-evolutionary origin between these two lineages [25].
The identification of the temporal differentiation of these T cell lineages was also recapitulated using single-cell trajectories reconstruction, exploration and mapping (STREAM) [16], another method that has been developed to visualise developmental trajectories using multi-omics data (Fig. 3f). Further supporting a putative common developmental pathway of Treg and Th17 cells, STREAM analysis identified FOXP3+ memory Tregs (mTregs) as a less differentiated T cell state, which shares a developmental trajectory with differentiated RORγt+ Th17 cells, with cluster 8 representing an intermediate transitional cell state in this trajectory (Fig. 3g). These data illustrate not only the potential of the targeted scRNA-seq approach to sensitively quantify lowly expressed transcription factor genes, but also highlight the power of this integrated multi-omics approach to identify subtle cell-state transitions and underlying differentiation trajectories in resting human primary T cells.
Protein expression of CD80 and CD86 marks a subset of recently activated CD4+ Tregs in circulation
A feature of the most activated mTreg cluster (cluster 3) was the marked increased expression of the B7 proteins CD80 (B7.1) and CD86 (B7.2; Fig. 2b, d), two T cell co-stimulatory molecules usually expressed in antigen-presenting cells (APCs). These findings were recapitulated on the pseudotime analysis, which identified CD80/CD86 protein expression as markers of the temporal Treg differentiation trajectory (Fig. 3e). Although we observed virtually no detectable expression of either CD80 or CD86 at the mRNA level in either resting or in vitro-stimulated CD4+ T cells (Additional file 3: Figure S7a,b), previous reports have demonstrated endogenous expression of both genes in human CD4+ T cells upon activation [26,27,28]. Furthermore, an increasing body of work points to a functional role of both B7 proteins in T cell function [29]. Consistent with our AbSeq data, we detected the expression of CD80 and CD86 by flow cytometry on the surface of a small proportion resting CD4+ T cells isolated from the blood of four healthy donors (Fig. 4a, b). The expression of the B7 proteins was restricted to the CD45RA− memory compartment and showed predominant expression in Tregs (Fig. 4a, b). In agreement with these data, a significant proportion of CD80+, and especially CD86+ T cells, displayed co-expression of the Treg transcription factor FOXP3 (Fig. 4c). Furthermore, both CD80+ and CD86+ mTregs displayed normal profiles of FOXP3 and HELIOS expression, supporting a bona fide Treg phenotype; although we noted an increased frequency of the HELIOS−FOXP3+ subset within B7+ Tregs (Fig. 4d). Notably, we detected co-expression between CD86 and the activation markers CTLA-4 and HLA-DR, as indicated by the increased frequency of CTLA-4hi HLA-DR+ cells within CD86+ mTregs (Fig. 4e, f). In contrast to CD86, the co-expression between CD80 and CTLA-4 on mTregs was less pronounced, although still displaying preferential co-expression with the activation markers CTLA-4 and HLA-DR in mTregs (Additional file 3: Figure S7c,d). Taken together, these data support our AbSeq data and identify CD80 and especially CD86 as a specific marker of activated Tregs in circulation.
In vitro activation drives endogenous expression of CD80 and CD86 expression in human CD4+ T cells
Owing to the high sensitivity of Tregs to IL-2, we next investigated the dynamics of endogenous B7 protein acquisition and its dependence on IL-2 signalling in purified CD4+ T cells from two healthy donors incubated in vitro for up to two weeks in the presence or absence of IL-2. Incubation with IL-2 was found to be sufficient to induce the upregulation of CD80 and CD86 expression in the absence of TCR stimulation and was mostly pronounced in FOXP3+ mTregs, which express higher levels of IL-2RA than mTeffs (Fig. 5a–c). Of note, the expression of CD80 was upregulated earlier and increased until day 15, where the majority of CD4 T cells were CD80+, while CD86 expression was only robustly detected from day 12 (Fig. 5b, c). In addition, we observed some level of inter-individual variation associated with the initial timing of CD86 acquisition, which was more rapid in one of the two assessed donors. In the absence of IL-2, the majority of cells died before day 15.
In contrast to incubation with IL-2 alone, CD4+ T cells activated in vitro with TCR stimulation (using αCD3/CD28-conjugated beads) showed robust proliferative capacity, as assessed by increased cell numbers, and increased expression of both B7 molecules in the CD45RA− memory compartment, which reached maximal expression between days 15 and 20 post-stimulation (Fig. 5d and Additional file 3: Figure S7e). No systematic differences were observed in T cells cultured in the presence or absence of IL-2, which was likely due to the high levels of IL-2 production by activated Teffs in this model. However, if cells were not TCR re-stimulated at day 15, we observed a rapid downregulation of CD80 and CD86 expression in cells incubated in the absence of IL-2. In sharp contrast, expression of both B7 proteins in mTeffs was stably maintained for up to 7 days when cells were incubated with high doses (500 U) of IL-2 (Fig. 5e). The acquisition of B7 molecules was associated with T cell activation and was particularly pronounced for CD86 in cells expressing the T cell activation markers CTLA-4 and HLA-DR (Additional file 3: Figure S7f,g). Of note, we observed a distinct pattern of co-expression of these two activation markers in CD80+ and CD86+ mTeffs. CTLA-4 expression on the cell surface was rapidly and transiently expressed on B7+ mTeffs, peaking at day 4, while HLA-DR expression gradually increased over time, showing striking co-expression with both CD80 and CD86 at the later stages of activation (Fig. 5f). In the presence of IL-2, co-expression with CD25 and HLA-DR could be observed in resting cells as late as day 40 post-stimulation, particularly in CD86+ T cells. In contrast, membrane-bound CTLA-4 expression was virtually absent at this time point, although the few CTLA-4+ cells remaining co-expressed both B7 molecules (Fig. 5g). These data not only demonstrate the stability of B7 protein expression on CD4+ T cells following activation, but could also provide a rationale for the observed restricted expression of CD80, and especially CD86, in primary T cells by AbSeq.
The increased proliferative capacity of mTeffs in this model precluded the investigation of the effects of TCR-induced in vitro activation on B7 upregulation in Tregs. To address this question, we next investigated the expression of B7 molecules in flow-sorted CD127lowCD25hi Tregs activated in vitro under conditions that promote Treg expansion. Similarly to total CD4+ T cells, we observed that expanded Tregs showed very high expression of both B7 molecules, which was maintained in cycling cells (assessed as Ki-67+), harvested 8 days after αCD3/CD28 restimulation (Fig. 5h). The expression of B7 molecules in expanded Tregs was also confirmed at the mRNA level (Fig. 5i). Given the absence of APCs in these in vitro activation models, these data strongly support an endogenous upregulation of B7 molecules by CD4+ T cells in response to activation.
Multi-omics immunophenotyping identifies a rare subset of circulating CCR9+ T cells expressing immune checkpoint molecules
Another example of a rare T cell population that we were able to identify in circulation using this multimodal immunophenotyping strategy was a subset of T cells marked by the specific expression of the small intestine-homing chemokine receptor CCR9, as well as increased expression of a number of other classical homing markers, such as ITGA4 (CD49d) and ITGAE (CD103) (cluster 10; Fig. 6a and Additional file 4: Table S3). Analysis of surface-expressed markers identified an increased expression of CD38 in this cluster, along with the co-expression of IL23R, IL12RB1 and KLRB1 (CD161; Fig. 6a). Recently, a subset of CD38+CD62L− effector T cells expressing gut-homing receptors, including CCR9, has been described in the blood in humans [30]. These cells were shown to have strong immunomodulatory properties mediated by the expression of T cell inhibitory receptors such as TIGIT, and their frequency was shown to be decreased in the blood of IBD patients [30]. In agreement with this putative regulatory function, our data confirmed the increased expression not only of TIGIT, but also of the T cell immune checkpoint molecules ICOS, HAVCR2 (TIM-3) and LAG3 (Fig. 6a). Consistent with their effector T cell phenotype, this cluster of CCR9+ T cells displayed increased expression of several classical genes associated with T cell effector function, including FAS, ANXA5, CASP5 and SELPG (Fig. 6a). Furthermore, the pseudotime analysis demonstrated that these cells correspond to a highly differentiated cell state, located within the Treg and Th17 differentiation trajectories (Fig. 3a).
Of the 75 cells assigned to cluster 10, 36 (48.0%) were originally sorted from the CD127lowCD25low gate, while 37 (49.3%) were sorted from the CD127lowCD25hi gate. These data further support the differentiated state of this T cell subset and indicate that the sorting strategy employed in this study, strongly enriching for the frequency of cells in these two gates, allowed for the robust detection of this rare subset, which may have been missed in a more heterogeneous total CD4+ T cell population. In addition, these data also suggest that CCR9+ T cells display intermediate to elevated levels of IL-2RA (CD25) expression, consistent with their frequent detection within the conventional CD127lowCD25hi Treg gate. To investigate the putative regulatory function of these cells, we then compared the transcriptional profile of these cells with the identified memory Teff clusters (clusters 2, 5, 7, 8 and 9; Fig. 6b) or with the memory Treg clusters (clusters 3 and 4; Fig. 6c). In both comparisons, we observed a systematic upregulation of similar sets of genes in the CCR9+ T cells (cluster 10), consistent with the highly differentiated state of this cluster, and putative immunomodulatory properties. We did not observe a distinct upregulation of a set of Treg signature genes, suggesting that these cells do not represent a bona fide Treg subset. In addition, we observed a modest increased expression of FOXP3, when compared to memory Teffs, but lower than memory Tregs (Fig. 6a). As FOXP3 expression is known to be an imperfect marker of Tregs in humans, this intermediate expression could therefore represent transient upregulation in this subset of activated T cells. Moreover, despite the expression of several surface markers usually associated with Th17 cells, we observed no evidence for the expression of RORC (RORγt; Fig. 6a). These data suggest that CCR9+ T cells may respond to the same signalling and migration cues as Th17 cells, for example, IL-23, which could be responsible for their co-localisation and potential regulatory interaction in mucosal sites.
One distinguishing feature of this subset was the expression of the transcription factor POU2AF1 (Fig. 6a). Although POU2AF1 (encoding OCA-B) has been mainly characterised as a B cell-specific transcription factor in the blood, where it plays a role in B cell maturation [31], it has also been recently shown in mice to regulate the maintenance of memory phenotype and function in previously activated CD4+ T cells [32] and the differentiation of T follicular helper (Tfh) cells in the tissue [33].
Single-cell comparison of mRNA and protein expression levels reveals modest and variable levels of correlation in primary CD4+ T cells
Given that the main advantage of this combined targeted scRNA-seq and proteomic approach is the ability to immunophenotype large numbers of cells from multiple donors, we next investigated whether we were able to integrate data generated from independent experiments. We replicated the initial experiment using the same pre-sorting strategy to isolate the three assessed CD4+ T cell subsets from an individual with type 1 diabetes and one healthy donor. To further test the potential of the protein quantification, we extended the AbSeq panel to 43 protein targets expressed on CD4+ T cells (Additional file 1: Table S1). In agreement with the initial experiment, unsupervised clustering of the 23,947 cells passing QC revealed a similar discrimination of CD4+ T cell subsets (Fig. 7a) and good alignment of the data from the three donors (Fig. 7b). Analysis of the donor-specific distribution of the identified CD4+ T cell clusters showed that the frequency of the putative CD4+ cytotoxic Th1 subset (cluster 11), marked by the co-expression of TBET and effector-type cytokines, was highly increased in the SLE patient (Fig. 7c–e). To avoid age-specific differences in the relative distribution of the CD45RA+ naive and CD45RO+ memory compartments in these donors, we normalised the analysis to the memory T cell clusters only, which we were able to robustly annotate using the AbSeq data for the expression of CD45RA and CD45RO. Although we detected a few cells with this activated Th1 phenotype in circulation from every donor, there was a very substantial expansion in the SLE patient (2.5% of memory CD4+ T cells compared to 0.3% and 0.1% in the T1D patient and healthy donor, respectively; Fig. 7d), suggesting that it could represent a pathogenic CD4+ T cell subset associated with systemic autoimmunity in this patient. In addition, we also integrated the full dataset of resting and in vitro-stimulated CD4+ T cells from the three donors, resulting in a combined dataset of 43,656 cells. Despite the known activation-induced changes in the transcriptional profile of in vitro-stimulated T cells—including cytokine expression—we observed a good integration of the datasets, yielding similar annotation of the resulting T cell clusters (Additional file 3: Figure S8 and Additional file 5: Table S4). The only exception was the separation of the in vitro-activated naive T cells into a distinct cluster (cluster 3), caused by the upregulation of cytokine expression (most notably IL-2), which was completely absent in resting cells. In contrast, we observed a more consistent integration of the memory clusters, and particularly the memory Teff cells, thereby providing additional information and increased cell numbers for the functional annotation of this dataset.
The parallel quantification of mRNA and protein expression for a large number of genes expressed in CD4+ T cells in this experiment provided a unique opportunity to investigate their systematic correlation at the single-cell level. From the 43 proteins quantified with AbSeq, 26 were also assessed at the transcriptional level and detected in our CD4+ T cell dataset. Generally, we observed relatively weak (mean Pearson correlation coefficient = 0.214) but variable levels of correlation in total resting CD4+ T cells, ranging from 0.049 for TNFRSF9 to 0.808 in KLRB1 (encoding CD161; Fig. 7). Furthermore, we note that with the exception of CXCR5, the estimated correlations were very consistent between the two independent donors (Fig. 7f). These findings were consistent with previous observations [9, 10] and suggest that primary CD4+ T cells are highly specialised cells, where transcription is subject to tight regulation to avoid excessive energy consumption by the cell and to control effector function. As expected, by normalising our analysis to a functionally more homogeneous population of memory CD4+ T cells, we observed higher levels of correlation (mean = 0.233), which is consistent with their increased expression of the majority of the assessed T cell markers. A slightly decreased correlation (mean = 0.178) was observed in in vitro-stimulated CD4+ T cells (Fig. 7g).
Parallel mRNA and protein profiling provides increased cell-type resolution of the heterogeneous CD45+ immune cell population in blood and tissue
To investigate how this targeted scRNA-seq and transcriptomics approach performed on a more heterogeneous population of immune cells, we isolated total CD45+ cells from the blood and a matching duodenal biopsy from two coeliac disease (CD) patients with active disease. In this experiment, we captured 31,907 single cells that passed QC and expanded the AbSeq panel to the detection of 68 protein targets (Additional file 1: Table S1). As expected, we observed a very defined clustering of the different populations representing the CD45+ immune cells (Fig. 8a). Consistent with previous data [34, 35], we found a clear separation of cells isolated from either blood or the small intestine (Fig. 8b), indicating a transcriptional signature of tissue residency. Furthermore, clustering of cells isolated from blood (Fig. 8c and Additional file 3: Figure S9a) or tissue (Fig. 8d) separately revealed the expected cell populations. The main distinction was the relative distribution of the immune populations, with a marked increased representation of B cell, NK cell and CD14+CD16− monocyte populations in the blood, and a significantly increased proportion of plasma cells in the small intestine. In agreement with our findings in CD4+ T cells, we found that the acquisition of a memory phenotype was the main driver of the clustering of both CD4+ and CD8+ T cells (Additional file 3: Figure S9b,c). In addition, we identified other clusters of non-conventional T cells, including a subset of γδ T cells and mucosal-associated invariant T cells (MAIT) in the blood, which shared similarities with the transcriptional signature of memory CD8+ T cells, marked by the expression of effector-type cytokines genes, such as NKG7 (Additional file 3: Figure S9c). In contrast, tissue-resident CD4+ T cells isolated from the small intestine were restricted to the memory phenotype and displayed a markedly different subset distribution, including a substantially enlarged population of FOXP3+ Tregs (Fig. 8e, f). Moreover, the simultaneous assessment of the protein expression of CXCR5, ICOS and PD-1 identified a cluster of Tfh cells (Fig. 8e), which could be distinctly clustered along a trajectory of Tfh cell activation, as illustrated by the gradient of expression of key Tfh functional transcripts, such as IL21, CXCL13 and BTLA (Fig. 8g).
Similarly, we also identified distinct trajectories of cell differentiation in other immune cell types, as illustrated by the gradient of differentiation and class switching of B cells in the blood (Additional file 3: Figure S10a-c). Peripheral B cells were clearly dominated by a naive IgD+IgM+ CD27− subset, and only a small fraction of class-switched IgG+ CD27+ memory B cells, which was consistent with the young age of the CD patients. In contrast, tissue-resident B cells were much less abundant and contained mostly cells with a class-switched IgG+ CD27+ memory phenotype. In addition, we identified a vastly expanded population of antibody-secreting plasma cells (Additional file 3: Figure S10d,e). Of note, because we were able to specifically assess the expression of the secreted Ig isotypes, we could discriminate the different functional plasma cell subsets, including a very abundant population of IgA-secreting plasma cells (Additional file 3: Figure S10f), which are known to play a critical role in the interaction with the microbiome in the gut. Together, these data provide an example of the power of this multi-omics approach to identify trajectories of cell differentiation and cell states in diverse immune cell and tissue types.