Novel cancer drivers: mining the kinome

Large-scale cancer genome studies are unveiling significant complexity and heterogeneity even in histopathologically indistinguishable cancers. Differentiating 'driver' mutations that are functionally relevant from 'passenger' mutations is a major challenge in cancer genomics. While recurrent mutations in a gene provide supporting evidence of 'driver' status, novel computational methods and model systems are greatly improving our ability to identify genes important in carcinogenesis. Reimand and Bader have recently shown that driver gene discovery in discrete gene classes (in this case the kinome) is possible across multiple cancer types and has the potential to yield new druggable targets and clinically relevant leads.

The overall goals of extensively characterizing cancer genomes are to refine current therapies through defining markers of therapeutic responsiveness or resistance, and to identify targets for the development of novel therapeutic strategies. Achieving these goals requires the integration of vast amounts of complex multidimensional genomic data with insights from other systems. A recently published study by Reimand and Bader provides a timely example of the importance of large-scale efforts in cancer genomics, and the valuable insights that mining these datasets can yield [1]. While cohort-based cataloguing of genomic aberrations initially reveals candidate driver events in different cancer types, this group and many others are also interrogating these data using innovative approaches to distinguish between driver and passenger mutations. In this study, cancer genome data from 800 patients across 8 cancer types made publicly available by the International Cancer Genome Consortium (ICGC) [2], the Cancer Genome Atlas (TCGA) [3,4] and independent groups [5] were analyzed using methods specifically designed to enrich for cancer drivers.
As we understand more about cancer genomes, profound complexity and heterogeneity are emerging [6]. Apart from mutations in a relative handful of cancer driver genes that occur in a significant proportion of tumors, the number of uncommon and rare mutations is extremely high. This poses challenges for the differentiation of drivers versus passengers, as most approaches focus on recurrently mutated genes, and less frequently mutated genes are probabilistically defined by comparison to the background mutation rate across the whole genome [7]. As a consequence, new approaches that increase confidence in candidate driver prediction are required to generate hypotheses for further study.

Driver mutations in the cancer kinome
Reimand and Bader [1] focused their efforts on kinase genes that regulate phosphorylation, and regions of the genome that encode phosphorylation sites in known substrates, together known as the kinome. These classes of genes play important roles in growth and homeostasis and are often dysregulated in cancer. As such, they are attractive therapeutic targets and have in some instances resulted in the development of effective therapies (for example, erlotinib for the treatment of lung cancers that harbor EGFR mutations). The authors developed 'ActiveDriver', a novel computational algorithm that calculates the significance of non-synonymous single nucleotide variations within phosphoregulatory sites based on the local (gene-wide) background mutation rate rather than the genome-wide rate; the genome-wide rate assumes that all areas of the genome have equal probability of harboring mutations. This approach increases the sensitivity of detection of significant events within a given region of the genome; in this case, the gene where the mutation of interest is located. ActiveDriver identified well-known cancer genes and showed that mutations at some specific phosphoregulatory sites within these were associated with differential patient survival. In addition, they identified novel candidate driver genes with existing functional data suggesting a role in carcinogenesis: FLNB, which has a role in cytoskeleton organization; GRM1, which increases PI3K activity; and POU2F1, a POU domain transcription factor that regulates cell cycle progression. As a consequence, they conclude that ActiveDriver complements existing analysis tools.
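The idea of testing against a local rather than a genome-wide background can be illustrated with a minimal sketch. This is not the ActiveDriver algorithm itself, whose statistical model is not described here; it substitutes a simple one-sided binomial test, and the function names, parameters and example numbers are all hypothetical. The null hypothesis assumed is that each mutation observed in a gene falls within the phosphoregulatory site region with probability equal to the fraction of the gene's length that the region occupies.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def site_enrichment_pvalue(gene_length, site_length, mutations_in_gene, mutations_in_site):
    """One-sided p-value for an excess of mutations in the site region.

    Assumption (illustrative only): under the null, mutations are spread
    uniformly along the gene, so the background is gene-wide, not genome-wide.
    """
    if not 0 < site_length <= gene_length:
        raise ValueError("site_length must be in (0, gene_length]")
    if not 0 <= mutations_in_site <= mutations_in_gene:
        raise ValueError("mutations_in_site must be in [0, mutations_in_gene]")
    # Probability that a single mutation in this gene lands in the site region.
    p_null = site_length / gene_length
    return binomial_tail(mutations_in_site, mutations_in_gene, p_null)


# Hypothetical gene: 1,000 residues, 50 of them in phosphosite regions,
# 20 observed mutations, 6 of which fall inside those regions.
p_value = site_enrichment_pvalue(1000, 50, 20, 6)
print(f"p = {p_value:.3g}")
```

Because the expected count is derived from the mutations seen in the gene itself, a gene that is heavily mutated overall does not automatically yield a significant site; only a disproportionate concentration of mutations within the site region does.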
Next, they performed network analysis and defined modules of kinases that were hierarchically organized, and found that certain networks were associated with differential survival in ovarian cancer. This has significant implications for therapeutic development, as defining key functional dependencies or 'weak points' in otherwise robustly deregulated mechanisms could uncover attractive therapeutic targets. They hypothesized that PRKCZ is one such master regulator of a frequently mutated phosphoregulatory network that contains well-known cancer genes such as PTEN, which is inactivated in many cancer types and functions as a tumor suppressor by negatively regulating Akt/PKB signaling. Although there are no drugs that directly target PRKCZ, multiple inhibitors of an immediately upstream kinase, PDPK1, are available.

Strategies for enriching cancer driver genes
In general, several approaches can assist in enriching for candidate driver genes (Table 1) [8], many of which are exploited by Reimand and Bader [1]. These approaches are described below.

Increasing sample size and/or focusing on uniform clinically relevant groups to define low-frequency recurrent events
Current activities in this area include pan-cancer analyses, which can examine single genes, networks and pathways. The ICGC/TCGA goal within the next few years is to generate comprehensive genomic data for in excess of 25,000 cancer genomes, and when combined with other efforts the number is projected to be even greater.

Investigating the known characteristics of cancer genes
Reimand and Bader [1] exploited this in several ways: (1) choosing to focus on mutations in phosphoregulatory sites, (2) validating ActiveDriver by detecting well-known cancer genes, and (3) using insights from other studies. In addition, other characteristics observed in discovery efforts, such as recurrent inactivation of a gene through different mechanisms (point mutation, deletion, methylation), support its status as a candidate tumor suppressor gene.

Pathway and network analysis
Numerous pathway analysis tools are available (for example, MSigDB, GeneGO and Reactome), and as the underlying information grows, hypotheses concerning function and mechanisms can be better developed.

Integrative analysis
Orthogonal global analysis using different methodologies can assist in enriching for candidate driver genes and pathways. These datasets can include other genomic analyses (for example, transcriptome or epigenome) or incorporate model systems such as animal models (comparative genomics) or in vitro functional screens [9].

Identifying clinical correlates
Correlating with clinical parameters such as prognosis and therapeutic responsiveness can be supportive of functional relevance of a mutated gene or pathway/network. Further insights can be gained if there is association with distinct clinical features such as the pattern of disease spread, vascular invasion, lymph node

Summary
Reimand and Bader [1] used large datasets for the purpose for which they were created; they developed a novel approach to address current challenges in analyzing genomes and used multiple methods to provide significant insights into the molecular pathology of cancer. These data provide increased confidence in pursuing these target mechanisms through more detailed experimentation. We are only at the beginning of mapping out the genomic events that exist in cancer, and the data analyzed in this report examined somatic single nucleotide variants in the protein coding regions of genes [10].
Other classes of pathogenic mutation such as insertions, deletions, and translocations can dramatically impact gene function and also warrant investigation. Furthermore, other levels of gene regulation such as epigenetic modification and RNA-mediated events can be integrated into these studies over time. As the cancer genome atlases continue to grow, it is anticipated that the approach described in the Reimand and Bader study will lead to further significant insights into the underlying mechanisms that play key roles in cancer. This is vitally important for more timely advances in intervention strategies for cancer prevention, early detection and treatment.