Systems and genome-wide approaches unite to provide a route to personalized medicine

A report on the Keystone Symposium 'Complex Traits: Genomics and Computational approaches', Breckenridge, Colorado, USA, 20-25 February 2012.


A view from the GWAS community
Current efforts of the GWAS community can be divided into two broad categories: discovery of novel risk loci, and extraction of biologically meaningful information from identified loci. Mark McCarthy (Oxford University) illustrated several successful approaches under way in type 2 diabetes (T2D) research. Meta-analysis of multiple case-control cohorts has resulted in new associations and identified a large number of variants with diminishing effect sizes. Fine mapping in non-Caucasian populations, functional genomics, and network-based approaches in disease-relevant tissues are identifying causal genes and elucidating functional mechanisms. Elizabeth Speliotes (University of Michigan) echoed these themes, with approaches implicating novel genes and pathways in obesity and non-alcoholic fatty liver disease. However, how to translate potentially causal genes into therapeutics can be less transparent. Sekar Kathiresan (Massachusetts General Hospital) described a Mendelian randomization approach to test whether the association of higher plasma high-density lipoprotein cholesterol (HDL-C) with reduced myocardial infarction (MI) risk is causal. The results of the study challenge the idea that raising plasma HDL-C will reduce MI risk; it is an important cautionary tale demonstrating that robust disease biomarkers may not always be feasible as therapeutic targets. A shift towards evaluation of rare and low-frequency variant effects on complex traits through whole-genome and exome sequencing is currently under way, as are epigenome-wide association studies, as exemplified by a genome-wide study of brain methylation in Alzheimer's disease (described by Manolis Kellis, Massachusetts Institute of Technology).
Many of these studies are moving from identified associations to an understanding of function. Kellis' whirlwind tour of data resources and analytic tools illustrated how the ENCODE project's data are being used to annotate dynamic regulatory elements in multiple human cell types, and can be mined to develop models of genetic effects.

Focus on health disparities
A workshop was held with the aim of better understanding how genomics research informs and impacts issues related to health disparities. Joshua Akey (University of Washington) provided a population genetics perspective by describing the National Heart, Lung, and Blood Institute Exome Resequencing project, comprising high-coverage exome sequencing of over 2,000 African-American and European-American individuals. A high number of predicted deleterious variants were identified per individual, with the overall frequency spectrum dominated by very rare (mostly singleton) variants, consistent with human population demography models.
Population-level differences with respect to disease risk, drug efficacy, and side effects are areas in which the interplay of population genetics and functional genomics can inform mechanism. One of us (MED) described pharmacogenomics of anticancer agents in different populations. Cell-based models using HapMap lymphoblastoid cell lines are being used to elucidate functional effects and mechanisms of genetic variants influencing chemotherapeutic susceptibility. In contrast to trait mapping in ancestry-homogeneous populations, Elad Ziv (University of California, San Francisco) illustrated how populations of mixed ancestry can be used to map risk variants contributing to differences in disease incidence or age of onset, specifically focusing on benign neutropenia and breast cancer.

Personalized genomics
Personalized cancer therapeutics was a recurrent theme of the meeting. Joseph Lehár (Novartis Institutes for BioMedical Research) described large-scale efforts to test 45,000 drug combinations for synergy in 1,000 well-characterized cancer cell lines. These data, which are available as part of the Cancer Cell Line Encyclopedia, could facilitate methods development for linking pharmacological susceptibilities with genetic variation. Andrea Califano (Columbia University) described efforts to reconstruct and interrogate the regulatory logic of the cancer cell and develop a novel framework for cancer target discovery in a patient-specific manner. Dana Pe'er's (Columbia University) efforts to characterize patient-specific tumor network models are another step towards providing individualized treatment.
The realization that we are in the era of genomic medicine was emphasized by Atul Butte and Euan Ashley (both from Stanford University), who individually presented different aspects of the analysis of Stanford University investigator Stephen Quake's personal genome sequence. Together, they have created the largest curated database of human disease-associated single nucleotide polymorphisms (SNPs), and developed a pipeline for the analysis of clinically actionable findings from personal genomes. Butte discussed the importance of controlled vocabulary and methods for translating risk and effect size to clinicians, who will soon be faced with billions of patient data points to interpret.

Technology innovation
Pacific Biosciences' Stephen Turner presented the company's revolutionary technology that follows real-time enzyme activity, demonstrating that in addition to long-read DNA sequencing, the technology also characterizes nucleotide base modifications. The impression is that this technology is a novel frontier, but as detailed by Eric Schadt (Pacific Biosciences and Mount Sinai School of Medicine), it has already been deployed in important problems, including the 2011 Escherichia coli outbreak in Germany. It is expected that contributions from real-time understanding of living systems will be made in the near future. Similarly fascinating is Garry Nolan's (Stanford University) application of mass flow cytometry to sort subclasses of leukocytes and then organize them into cellular networks to be used for more precise diagnosis. This application of flow cytometry evolved from an urgent clinical need to individualize treatments of life-threatening lymphomas, and amazing results have been reported. In this capacity, Nolan's (and Pe'er's) work stands out as one of the few examples of the application of a computational systems approach currently in use in clinical care. Another exciting development is Leroy Hood (Institute for Systems Biology) and colleagues' efforts to make blood a 'window' into health and disease through the monitoring of organ-specific proteins in the blood.
On the informatics infrastructure side, Jeff Hammerbacher (Cloudera, Inc.) gave the 'Facebook' view of medical informatics. A completely information-driven schema based on a petabyte-scale platform for computational applications will allow routine data gathering and access. Another important data source is the high-throughput genomic data readily available on the Internet, which have so far been analyzed only minimally and from limited perspectives. Joel Dudley (NuMedii) presented a strategy for drug repositioning, in which publicly available gene expression data are used to predict new and often unexpected indications for established drugs.

Data sharing
Given that vast volumes of high-throughput genetic and genomic data are being gathered at an ever faster pace, Stephen H Friend (Sage Bionetworks) emphasized the need for more efficient data sharing and storage to enable discovery. Using Sage Bionetworks as a role model, a 'federation' for efficient data sharing, storage and access has been formed in which members can collaboratively build disease models. Vicki L Seyfert-Margolis (US Food and Drug Administration) provided the administration's perspective on ways to enable drug trial data to be reused by the scientific community.
Informed consent is an important aspect of genomics data sharing. Jason Bobe (PersonalGenomics.org) detailed the problems inherent in making assurances to research volunteers that 'de-identified' or 'anonymized' data will remain confidential, even if data are shared widely. Bobe presented an 'open consent' solution stipulating that researchers: (1) do not promise anonymity and confidentiality of data and (2) acknowledge risks of being re-identified from public data. Several speakers suggested that the public's apprehension about genomics data sharing will likely be tempered by actionable discoveries.

Network modeling
Another predominant theme at the meeting was systems-based approaches, including network or pathway modeling. The idea that GWAS has not uncovered the majority of the heritability for complex traits has been widely discussed, and here, the point was made that the simple, additive genetic components have been explored but the remaining 'missing heritability' lies elsewhere in the universe of molecular and cellular biology. For example, although common variation at the DNA level has been densely and routinely explored for single-SNP associations, interactions have been largely ignored. Alexis Battle (Stanford University) described elegant approaches to look at epistasis, which has been considered primarily in model organisms (and was discussed by Leonid Kruglyak, Princeton University, and Andy Clark, Cornell University). Methodologies and study designs that improve statistical power in humans are necessary and are clearly in development. For example, Trey Ideker (University of California, San Diego) described a framework integrating physical and genetic interaction maps to model regulatory and signaling networks, with implications for network-based patient stratification and drug target discovery.
In addition to complex interactions at the DNA level, a central focus is the integration of multiple data types. Meta-dimensional analysis, as described by one of us (MR), allows the consideration of variability that occurs through the genome, including gene expression patterns and proteomics. MR and colleagues have developed a data integration approach using evolutionary computing techniques along with data mining algorithms, such as neural networks. This type of analytical approach was also implemented by Iya Khalil (GNS Healthcare), who has used the methodology to predict disease phenotypes for complex traits. Pe'er presented novel approaches to integrate heterogeneous genomic data types into patient-specific tumor network models to identify key cancer drivers and their associated phenotypic effects, as well as to interrogate functionality of drug perturbations.
Considering data analysis in this comprehensive manner is supported by the evidence observed in several applications of expression quantitative trait loci (eQTLs), including in inflammatory disease (BES), T2D (Judy Zhong, New York University Medical School), and coronary artery disease (JB). Collectively, these studies emphasize that success depends on the collection of study populations, generation of high-throughput, well-defined cell- and tissue-specific genomic and phenotypic data, and development of powerful analytic strategies for meta-dimensional analysis. To truly elucidate this architecture, parallel non-human strategies are also needed, as highlighted by the efforts of Allan Attie (University of Wisconsin-Madison) to define eQTLs in mouse strains with a wide variety of disease susceptibilities.

The future
Leroy Hood's keynote address provided a big-picture view of the future of medicine. He predicts that we will transition from a clinically reactive to a proactive model, encompassing predictive, personalized, preventative, and participatory, or 'P4' medicine. This way of thinking relies on recognizing medicine as an informational science, both hypothesis-driven and hypothesis-generating, where systems approaches will allow one to understand wellness and disease in a more holistic way. Emerging technologies will allow us to explore new dimensions of patient data space, and new analytic tools will allow us to decipher the billions of data points for each individual. From the cutting-edge research discussed at this meeting, we can see that we are well on our way to that future.

Competing interests
MR, BES and MED have no competing interests. JB is a shareholder and board member of Clinical Gene Networks AB, Stockholm, Sweden (www.clinicalgenenetworks.com, commercializing gene networks in cardiovascular disease).