- Research highlight
- Open Access
Single cell expression quantitative trait loci and complex traits
Genome Medicine volume 5, Article number: 72 (2013)
The recently developed ability to quantify mRNA abundance and noise in single cells has allowed the effect of heritable variations on gene function to be re-evaluated. A recent study has shown that major sources of variation are masked when gene expression is averaged over many cells. Heritable variations that determine single-cell expression phenotypes may exert a regulatory function in specific cellular processes underlying disease. Masked effects on gene expression should therefore be modeled, not ignored.
Genetic regulation of gene expression
Understanding how and to what extent inter-individual genetic variation determines gene function in normal and pathological conditions can provide important insights into disease etiology. To this end, the rapid accumulation of large transcriptomic datasets across different tissues has prompted several population-based studies of gene expression variation [1]. In many of these studies, typical transcriptional analyses are carried out within or between whole tissues, with the aim of pinpointing gene expression signatures and/or (tissue-specific) genetic regulation of gene expression. Even at this level, context-dependent genetic regulation of gene expression has been shown to be important, and the underlying regulatory variants have more complex effects than previously anticipated [2]. For instance, characterizing different cis-regulatory mechanisms between tissues (such as opposite allelic effects) is important to understand the tissue-specific function exerted by disease-associated genetic variants.
The genetic variants that are associated with gene expression variation are commonly called expression quantitative trait loci (eQTLs). These can be mapped to the genome by modeling quantitative variation in gene expression and genetic variation (for example, single nucleotide polymorphisms (SNPs)) that have been assessed in the same population, family or segregating population. Essentially, mRNA levels can be treated as a quantitative phenotype and as such can be mapped to discrete genomic regions (genetic loci) that harbor DNA sequence variation affecting gene expression. In many cases, eQTL studies have provided direct insights into the complex regulatory mechanisms of gene expression - for instance, by allowing researchers to differentiate cis (or local) from trans (or distant) control of gene expression in a given tissue, experimental condition or developmental stage. Furthermore, eQTL analyses can be integrated with clinical genome-wide association studies (GWAS) to identify disease-associated variants [3, 4]. Despite this recent, exciting progress in 'genetical genomics' (that is, eQTL studies), the growing number of single-cell transcriptomic analyses now prompts re-evaluation of our understanding of how heritable variations affect gene function in the cell.
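As a rough illustration of treating mRNA levels as a quantitative phenotype, the sketch below regresses expression on allele dosage for a single SNP. It is a minimal example under stated assumptions, not the method of any study cited here: the function name `eqtl_slope`, the additive 0/1/2 dosage coding and the toy data are all invented for illustration.

```python
from statistics import mean

def eqtl_slope(dosages, expression):
    """Least-squares fit of expression on allele dosage (0, 1 or 2 copies).

    Returns (slope, intercept, r_squared). The slope estimates the additive
    effect on expression of one extra copy of the alternate allele.
    The additive coding is an assumption made for this sketch.
    """
    if len(dosages) != len(expression):
        raise ValueError("one expression value is needed per genotyped sample")
    g_bar, e_bar = mean(dosages), mean(expression)
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    syy = sum((e - e_bar) ** 2 for e in expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype and expression must both vary across samples")
    slope = sxy / sxx
    intercept = e_bar - slope * g_bar
    r_squared = sxy ** 2 / (sxx * syy)  # share of expression variance explained
    return slope, intercept, r_squared

# Toy data, invented for illustration: six samples, one SNP, one gene.
toy_dosages = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
toy_expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
toy_slope, toy_intercept, toy_r2 = eqtl_slope(toy_dosages, toy_expression)
```

A real eQTL scan would repeat such a test over many SNP-gene pairs, adjust for covariates and correct for multiple testing; none of that is shown here.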
Neglected single-cell differences and other hidden factors
Establishing a robust link between SNPs and gene expression variation is a non-trivial exercise when multiple cell types are jointly modeled. To aid this process, ad hoc methodological approaches that borrow information among tissues have been recently developed [5, 6]. Nonetheless, emerging approaches such as single-cell transcriptomics have started to change our understanding of the genetic regulation of gene expression in individual cells, which can be hidden in ensemble-averaged experiments. In a recent study published in Nature Biotechnology, Holmes and colleagues [7] carried out single-cell quantification of gene expression for 92 genes in approximately 1,500 individual cells to disentangle the effect of gene variants on cell-to-cell variability, temporal dynamics or cell-cycle dependence in gene expression.
The authors looked at selected genes in fresh, naive B lymphocytes from three individuals and clearly showed how gene expression had much greater variability between cells within an individual than between individuals. This observation set the scene for a comprehensive investigation of the distributions of single-cell gene expression and the properties of gene expression noise in a larger population of cells. These analyses were focused on 92 genes affected by Wnt signaling (that can be chemically perturbed by a Wnt pathway agonist), of which 46 genes were also listed in the Catalog of Genome-Wide Association Studies, and resulted in four important outcomes.
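The comparison between cell-to-cell and inter-individual variability described above can be sketched as a simple variance split for one gene. This is only an illustrative calculation, not the analysis used in the study; the function name `variance_components` and the toy values are assumptions.

```python
from statistics import mean, pvariance

def variance_components(cells_by_individual):
    """Split single-cell variation for one gene into two parts.

    cells_by_individual maps an individual's label to the list of
    expression values measured in that individual's cells.
    Returns (within, between):
      within  = average cell-to-cell variance inside an individual
      between = variance of the per-individual mean expression
    """
    per_individual = list(cells_by_individual.values())
    within = mean([pvariance(cells) for cells in per_individual])
    between = pvariance([mean(cells) for cells in per_individual])
    return within, between

# Toy data, invented for illustration: three individuals, three cells each.
toy_cells = {
    "individual_A": [1.0, 5.0, 9.0],
    "individual_B": [2.0, 6.0, 10.0],
    "individual_C": [3.0, 7.0, 11.0],
}
toy_within, toy_between = variance_components(toy_cells)
```

In this toy example the individuals have similar mean expression but their cells differ widely, so the within-individual component dominates.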
First, perturbing the system with a Wnt pathway agonist exposed significant changes not only in whole-tissue gene expression but also in gene expression noise. Given the intrinsically stochastic nature of gene expression, it was expected that mRNA copy numbers would vary from cell to cell, as previously shown in isogenic bacterial cell populations [8]. The single-cell transcriptomic analyses reported by Holmes and colleagues [7] highlight the large effect of fluctuations in mRNA copy numbers in HapMap lymphoblastoid cell lines, which has been mostly neglected and might influence eQTL detection in this system to a large extent.
Second, single-cell transcriptomic analysis allowed Holmes and colleagues to quantify both the noise from the regulation of transcription and the noise of RNA turnover, which can therefore be modeled independently. In keeping with previous observations [9], genes differed from each other primarily in terms of burst size (that is, the amount of RNA produced when the gene is switched on), resulting in an expression variance between cells that exceeded the expression mean. The expression 'Fano factor' (the gene expression variance divided by the mean) quantifies this phenomenon, and it represents another commonly neglected component that might be important in eQTL studies.
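The Fano factor defined above is straightforward to compute from per-cell transcript counts. The sketch below is a minimal illustration; the function name `fano_factor` and both toy count vectors are invented. For reference, a Poisson process gives a Fano factor of 1, so values well above 1 are consistent with bursty expression.

```python
from statistics import mean, pvariance

def fano_factor(counts):
    """Variance of per-cell counts divided by their mean, for one gene."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined when mean expression is zero")
    return pvariance(counts) / m

# Toy counts, invented for illustration. Both genes have a mean of 5
# transcripts per cell, but very different cell-to-cell spread.
bursty_gene = [0.0, 0.0, 0.0, 0.0, 20.0, 0.0, 0.0, 20.0]
steady_gene = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0]

bursty_fano = fano_factor(bursty_gene)
steady_fano = fano_factor(steady_gene)
```

Averaging over many cells would report the same mean for both genes, which is why this quantity is invisible in whole-tissue measurements.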
Third, when gene expression distributions were described in terms of heterogeneous cell subpopulations with respect to different stages of the cell cycle, Holmes and colleagues showed that the majority of genes analyzed had altered expression between G1 and early S phases. These apparent differences in cell cycle subpopulation proportions between samples represent another determinant of gene expression variation, which is expected to contribute significantly to gene regulation.
Finally, single-cell transcriptomics enabled the reliable quantification of the gene expression noise in the system. The latter can be considered as another source of variability, which can then be used to infer an expression network for each sample. Traditional gene co-expression networks assess gene-gene associations by correlating gene expression profiles across multiple samples. By contrast, in the Nature Biotechnology article, expression networks were built by correlating gene expression across multiple cells, which were profiled in the same lymphoblastoid cell line. For instance, one expression network built with approximately 200 cells from one of the lymphoblastoid cell lines revealed changes in cell-to-cell gene correlations in response to chemical perturbation of the Wnt signaling, which were not detectable at the level of whole-tissue expression. This approach allowed the authors to assess the extent to which the network connectivity of each gene varies in the system in response to other perturbations (for example, chemical, genetic), unmasking an additional factor that is potentially relevant for eQTL analysis.
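The idea of building an expression network across cells rather than across samples can be sketched as follows: correlate each pair of genes over the cells of one sample, then count how many partners each gene has above a cutoff. This is a hedged illustration only; the Pearson statistic, the cutoff of 0.8, the gene labels and the toy values are assumptions, not details taken from the study.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists of values."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

def connectivity(expr_by_gene, threshold=0.8):
    """Count, for each gene, the genes it correlates with across cells.

    expr_by_gene maps a gene label to its expression in each cell of ONE
    sample (same cell order for every gene). The threshold is an
    arbitrary illustrative cutoff on the absolute correlation.
    """
    degree = {gene: 0 for gene in expr_by_gene}
    for a, b in combinations(expr_by_gene, 2):
        if abs(pearson(expr_by_gene[a], expr_by_gene[b])) >= threshold:
            degree[a] += 1
            degree[b] += 1
    return degree

# Toy data, invented for illustration: three genes measured in five cells.
toy_expr = {
    "gene_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_B": [2.0, 4.0, 6.0, 8.0, 10.0],
    "gene_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
toy_degree = connectivity(toy_expr)
```

Running the same calculation on cells profiled before and after a perturbation, and comparing the resulting degrees per gene, would give a crude version of the connectivity change discussed above.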
Single-cell quantitative trait loci
After demonstrating (and quantifying) the important effect on gene function of a number of factors that reflect single-cell differences, Holmes and colleagues tested how each of these factors (alone or in combination) contributed to the detection of cis-eQTLs (that is, regulatory SNPs within 50 kb of the gene) [7]. This is an important question because integrated eQTL and clinical GWAS analyses are commonly employed to identify genes and pathways underlying disease, and eventually generate new hypotheses concerning diagnostic and prognostic biomarkers or potential therapeutic targets [10]. First, the eQTL associations detected at -log10 P = 3 for whole-tissue gene expression (at both baseline and after chemical perturbation of Wnt signaling) represented only a small fraction of the total number of eQTLs in the system (Figure 1). Overall, many more eQTL signals were detected for the other single-cell expression phenotypes tested. This highlights the extent to which different masked sources of variation (detailed above) can significantly affect the detection of cis-eQTLs in the system. Furthermore, it turns out that the complex spatiotemporal expression variability quantified by single-cell analysis ('single-cell expression') is more heritable than, or at least comparable to, gene expression levels averaged over many cells ('whole-tissue expression'), such that the authors of the study named this new class of associated genetic variants 'single-cell quantitative trait loci' (scQTLs) [7].
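The two selection criteria mentioned above, a 50 kb cis window and a -log10 P threshold of 3, can be sketched as a simple filter. This is an illustrative reading rather than the study's pipeline: measuring the window from the gene's start and end coordinates, treating the threshold as inclusive, and all names and toy values below are assumptions.

```python
from math import log10

CIS_WINDOW_BP = 50_000   # "within 50 kb of the gene"
MIN_NEG_LOG10_P = 3.0    # "-log10 P = 3", treated here as an inclusive cutoff

def is_cis(snp_pos, gene_start, gene_end, window=CIS_WINDOW_BP):
    """True if the SNP lies within `window` bp of the gene body.

    Assumes the SNP and gene are on the same chromosome and that the
    window is measured from the gene's start and end coordinates.
    """
    return gene_start - window <= snp_pos <= gene_end + window

def significant_cis_snps(snps, gene_start, gene_end):
    """Return IDs of cis SNPs that pass the association threshold.

    snps is a list of (snp_id, position, p_value) tuples with p_value > 0.
    """
    hits = []
    for snp_id, pos, p_value in snps:
        if is_cis(pos, gene_start, gene_end) and -log10(p_value) >= MIN_NEG_LOG10_P:
            hits.append(snp_id)
    return hits

# Toy data, invented for illustration: one gene spanning 100,000-120,000 bp.
toy_snps = [
    ("snp_a", 60_000, 1e-4),    # in window, passes threshold
    ("snp_b", 40_000, 1e-6),    # strong signal but outside the window
    ("snp_c", 130_000, 1e-2),   # in window, too weak
    ("snp_d", 170_000, 5e-4),   # at the window edge, passes threshold
]
toy_hits = significant_cis_snps(toy_snps, gene_start=100_000, gene_end=120_000)
```

The same filter could be applied to each expression phenotype in turn (for example mean expression or Fano factor) to compare how many associations each one yields.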
Notably, GWAS eQTL genes in particular demonstrated greater cell-cycle (G1 and early S phase) inter-individual variability compared with other genes and greater inter-individual variability of their network connectivities [7]. The implications of these results are two-fold: first, these studies urge caution in the interpretation of eQTL data published to date where only whole-tissue expression was considered; and second, they prompt a deeper evaluation (and accurate modeling) of these 'masked' sources of variation resulting from single-cell differences. It will be intriguing to extend these analyses to the study of more distant genetic control of gene expression at the single-cell level (that is, single-cell trans-eQTLs) and to investigate the functional relevance of scQTLs on whole-body phenotypes in human and animal models. With the growing accessibility of single-cell technologies for transcriptomic studies, the time is right for a deep re-thinking of the key factors determining the observed complexity of gene expression and its regulation.
Abbreviations
eQTL: expression quantitative trait loci
GWAS: genome-wide association study
scQTL: single-cell quantitative trait loci
SNP: single nucleotide polymorphism
References
1. Nica AC, Parts L, Glass D, Nisbet J, Barrett A, Sekowska M, Travers M, Potter S, Grundberg E, Small K, Hedman AK, Bataille V, Tzenova Bell J, Surdulescu G, Dimas AS, Ingle C, Nestle FO, di Meglio P, Min JL, Wilk A, Hammond CJ, Hassanali N, Yang TP, Montgomery SB, O'Rahilly S, Lindgren CM, Zondervan KT, Soranzo N, Barroso I, Durbin R, et al: The architecture of gene regulatory variation across multiple human tissues: the MuTHER study. PLoS Genet. 2011, 7: e1002003. doi:10.1371/journal.pgen.1002003.
2. Fu J, Wolfs MG, Deelen P, Westra HJ, Fehrmann RS, Te Meerman GJ, Buurman WA, Rensen SS, Groen HJ, Weersma RK, van den Berg LH, Veldink J, Ophoff RA, Snieder H, van Heel D, Jansen RC, Hofker MH, Wijmenga C, Franke L: Unraveling the regulatory mechanisms underlying tissue-dependent genetic variation of gene expression. PLoS Genet. 2012, 8: e1002431. doi:10.1371/journal.pgen.1002431.
3. Nicolae DL, Gamazon E, Zhang W, Duan S, Dolan ME, Cox NJ: Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet. 2010, 6: e1000888. doi:10.1371/journal.pgen.1000888.
4. Min JL, Taylor JM, Richards JB, Watts T, Pettersson FH, Broxholme J, Ahmadi KR, Surdulescu GL, Lowy E, Gieger C, Newton-Cheh C, Perola M, Soranzo N, Surakka I, Lindgren CM, Ragoussis J, Morris AP, Cardon LR, Spector TD, Zondervan KT: The use of genome-wide eQTL associations in lymphoblastoid cell lines to identify novel genetic pathways involved in complex traits. PLoS One. 2011, 6: e22070. doi:10.1371/journal.pone.0022070.
5. Petretto E, Bottolo L, Langley SR, Heinig M, McDermott-Roe C, Sarwar R, Pravenec M, Hubner N, Aitman TJ, Cook SA, Richardson S: New insights into the genetic control of gene expression using a Bayesian multi-tissue approach. PLoS Comput Biol. 2010, 6: e1000737. doi:10.1371/journal.pcbi.1000737.
6. Flutre T, Wen X, Pritchard J, Stephens M: A statistical framework for joint eQTL analysis in multiple tissues. PLoS Genet. 2013, 9: e1003486. doi:10.1371/journal.pgen.1003486.
7. Wills QF, Livak KJ, Tipping AJ, Enver T, Goldson AJ, Sexton DW, Holmes C: Single-cell gene expression analysis reveals genetic associations masked in whole-tissue experiments. Nat Biotechnol. 2013, 31: 748-752. doi:10.1038/nbt.2642.
8. Taniguchi Y, Choi PJ, Li GW, Chen H, Babu M, Hearn J, Emili A, Xie XS: Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science. 2010, 329: 533-538. doi:10.1126/science.1188308.
9. Dar RD, Razooky BS, Singh A, Trimeloni TV, McCollum JM, Cox CD, Simpson ML, Weinberger LS: Transcriptional burst frequency and burst size are equally modulated across the human genome. Proc Natl Acad Sci USA. 2012, 109: 17454-17459. doi:10.1073/pnas.1213530109.
10. Califano A, Butte AJ, Friend S, Ideker T, Schadt E: Leveraging models of cell regulation and GWAS data in integrative network-based association studies. Nat Genet. 2012, 44: 841-847. doi:10.1038/ng.2355.
EP is supported by the Medical Research Council UK and thanks Aida Moreno-Moral for proof-reading.
The author declares that they have no competing interests.
Cite this article
Petretto, E. Single cell expression quantitative trait loci and complex traits. Genome Med 5, 72 (2013). https://doi.org/10.1186/gm476
- Lymphoblastoid Cell Line
- Fano Factor
- Heritable Variation
- Gene Expression Variation
- Expression Quantitative Trait Locus