DOTS-Finder: a comprehensive tool for assessing driver genes in cancer genomes
© Melloni et al.; licensee BioMed Central Ltd. 2014
Received: 14 February 2014
Accepted: 3 June 2014
Published: 10 June 2014
A key challenge in the analysis of cancer genomes is the identification of driver genes from the vast number of mutations present in a cohort of patients. DOTS-Finder is a new tool that allows the detection of driver genes through the sequential application of functional and frequentist approaches, and is specifically tailored to the analysis of few tumor samples. We have identified driver genes in the genomic data of 34 tumor types derived from existing exploratory projects such as The Cancer Genome Atlas and from studies investigating the usefulness of genomic information in clinical settings. DOTS-Finder is available at https://cgsb.genomics.iit.it/wiki/projects/DOTS-Finder/.
The amount of data regarding somatic mutations in various cancer types has increased enormously in the past few years, thanks to technological advancements and reduction of sequencing costs. The massive sequencing of several cancer genomes has led to the identification of thousands of mutated genes. However, only a minority of the identified mutations has a true impact on the fitness of the cancer cells, in terms of conferring a selective growth advantage and leading to clonal expansion (drivers), while the others are simply passengers, namely, mutations that occur by genetic hitchhiking in an unstable environment and have no role in tumor progression.
Several statistical strategies have been developed to properly identify driver mutations and driver genes. These strategies can be roughly classified into four main categories: ‘protein function’, ‘frequentist’, ‘pathway-oriented’ and ‘pattern-based’ approaches. The ‘protein function’ approaches are based on the prediction of the functional impact of a specific mutation in the coding sequence of a protein [1–3]. Although they do not permit the identification of driver genes, they can predict the effect of the mutation on the protein product. The ‘frequentist’ approaches evaluate the frequency of mutations in a gene compared with the background mutation rate (BMR), a measure of baseline probability of mutation for a given region of DNA [4–6]. The ‘pathway-oriented’ approaches are based on the analysis of the co-occurrence of mutations in a pathway-centered view [7–10] and are usually focused on searching for driver genes belonging to the most significant mutated pathways. Lastly, the ‘pattern-based’ approaches identify driver genes by assessing the type of mutations (for example, missense/truncating/silent) and their relative positions on an amino acid map across many cancer samples [11–14]. They exploit the known structural properties of mutations in tumor suppressor genes (TSGs) and oncogenes (OGs). Nevertheless, the identification of driver mutations in cancer remains a major challenge in computational biology and cancer genomics. Indeed, discovering driver mutations is one of the main goals of genome re-sequencing efforts, as the knowledge generated by exome-sequencing will translate from research to the clinic. The results of some of the cited tools are summarized in a recent database called DriverDB and also aggregated in one of the Pan-Cancer analysis publications. From their comparison, it is clear that all these approaches are complementary and only the integration of many of these strategies can improve the identification of driver genes.
Here, we present an innovative tool called DOTS-Finder (Driver Oncogene and Tumor Suppressor Finder) that integrates a novel pattern-based method with a protein function approach (functional step) and a frequentist method (frequentist step) to identify driver genes. In addition, it allows the classification of driver genes as TSGs or OGs. The software is freely available and has been designed to return robust results even with few tumor samples.
Overview of DOTS-Finder
DOTS-Finder is a comprehensive method that considers three main aspects of a mutated gene: where the mutations are collectively found (pattern-based approach), what effect the mutations have on protein products (protein-change approach), and how frequent these mutations are in the sample (frequentist approach). Our method is able to overcome many of the problems derived from the application of each individual approach. First of all, the prediction ability of frequentist approaches such as MutSigCV relies on the estimation of the BMR. Nevertheless, a precise map of the BMR in the whole genome is still unavailable and constitutes one of the unresolved challenges of cancer genomics. A plethora of genomic events, such as transcription and replication timing, are associated with some parts of the genome being more or less prone to mutation. In particular, experimental data on these two events showed a significant correlation with the probability of a mutational event. However, while these experiments should be context specific (tissue/patient specific), data on replication timing are hard to obtain for every patient and/or tissue. Finally, pure frequentist methods do not allow any classification of the type of aberrations in terms of gain or loss of function. A pattern-based approach can bypass the problem of achieving a correct BMR estimation by focusing on the position of the observed mutations and not on their frequency. Thus, the frequency simply becomes a statistical power boost and not the point of investigation. Vogelstein et al. provide a scheme to assess whether a gene can be considered an OG or a TSG, but large amounts of data are needed in order to evaluate rarely mutated genes.
The approach of Vogelstein et al., as well as the method developed in TUSON Explorer, has been used to collectively evaluate general cancer genes across tumor types; however, when applied to a single tumor type, they were found to lack the statistical power to recapitulate the overall results. In particular, with these methods, the discrete calculation of an OG test requires many mutations in the exact same hotspots to reach statistical significance. In contrast, our approach takes into consideration the proximity of mutations by using Gaussian smoothing, and can therefore also identify small deviations from a uniform distribution.
The main problem in assessing the value of our method is the absence of a gold standard in the identification of driver genes and the lack of benchmark studies. Indeed, the objects of our investigation are the driver genes of the different cancer types, which are still mostly unknown. However, to have an estimate of the prediction ability of DOTS-Finder, we decided to compare the aggregated predictions for 12 cancer types with the results of a well-documented Pan-Cancer 12 global analysis (Text S1a and Figure S1B in Additional file1). In this analysis, the authors combined the outputs of several approaches and we were able to compare our tool with the single output from MutSig, MuSiC, ActiveDriver, OncodriveFM and OncodriveClust (Text S1a and Figure S1 in Additional file1). We also related the predictions of each method with the Cancer Gene Census (CGC) database, a manually curated collection of driver genes (all the results are available in Table S1 in Additional file2). Notably, DOTS-Finder emerged as the best available tool in terms of precision-recall balance.
Moreover, we have applied DOTS-Finder to 34 tumor types and compared its output with the results of other approaches. Our approach shows results that are consistent with the literature for both high and low mutation rate cancers; DOTS-Finder allows detection of new plausible driver candidates while excluding highly mutated genes not associated with cancer, the so-called 'fishy genes', such as those encoding the mucins, titin and most of the olfactory receptors.
DOTS-Finder requires minimal input files, is easy to use, and does not require any programming skill or statistical knowledge. Indeed, we created a tool accessible to researchers in a wide range of fields. Compared with popular tools like MuSiC and MutSigCV, we only require the availability of easily accessible MAF files. Users do not need to have BAM files as in MuSiC, which are not publicly available or easily accessible. In addition, the users do not need any proprietary software, as the source code is written in Python and contains some embedded R code; both are freely available languages. Since DOTS-Finder is released under the GNU GPLv3+ license, users are also free to modify the code and implement new features.
DOTS-Finder is an easy solution for investigating genomic information from existing exploratory projects like The Cancer Genome Atlas (TCGA), but it is especially useful to identify reliable driver candidates in small studies assessing the value of genomic information for clinical purposes, such as understanding and predicting chemoresistance or metastatic spread. Indeed, we performed a saturation analysis on the mutational data present in 238 bladder cancer patients using 9 subsampling fractions, and, as shown in Text S1b in Additional file1, DOTS-Finder can perform statistically better than our best competitor, MutSigCV (Text S1a and Figure S1 in Additional file1), in terms of number of drivers found and precision-recall balance in small sample datasets (Figure S2A,B in Additional file1). Our tool could recapitulate up to 40% of the results of the entire dataset with just 5% (that is, 12 patients) of the dataset (Figure S2C in Additional file1). Thus, it can be used in the clinical research setting to help identify driver genes that can assist patient stratification for prognosis and choice of treatment. We envisage that DOTS-Finder might facilitate the identification of candidate targets, which could be used to develop diagnostic, prognostic or therapeutic strategies, even in situations where the available data are scarce (for example, rare tumors).
The functional step: finding the best tumor suppressor gene and oncogene candidates
An OG, on the other hand, is characterized by gain or switch of function mutations that confer new properties on the protein product or simply enhance the existing ones. Hence, the typical mutations affecting an OG are missense mutations on key amino acids or on specific domains. We consider as missense type mutations all the non-synonymous SNVs that do not create a stop codon and occur outside start codons or stop codons, and all the insertions and deletions not altering the reading frame (inframe indels). These mutations have a particular pattern, as they are generally clustered in one or more regions along the protein (Figure 2B). For example, in leukemias, IDH1 can bear different kinds of mutations, but almost always at amino acid position 132 (Figure S3 in Additional file1).
The TSG-S evaluates whether a gene harbors an elevated number of truncating mutations compared with the total number of mutations present on that gene. Given 64 codons in the DNA and 9 possible SNVs per codon (3 nucleotide positions × 3 possible changes) we have a total of 576 possible base changes. Only 23 of them can be considered truncating (approximately 3.9% of all the SNVs, weighted for the actual human codon usage) against the 415 non-synonymous single base changes that lead to missense variations and 138 silent mutations. If we take into account all the indels that corrupt the reading frame of a gene, we can estimate, based on our sample data, that the ratio between truncating mutations and total number of mutations in cancer is approximately 14%, with a standard deviation of 4. This ranges from a minimum of 9% in glioblastoma to a maximum of 25% in pancreatic adenocarcinoma, with high intra-tumor variability among patients. This discrepancy indicates that some tumors are more prone than others to acquire and maintain truncating mutations (Figure S4 in Additional file 1).
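The codon arithmetic above can be checked by enumerating every single-base change in the standard genetic code. The following Python sketch is only an illustration and is not taken from the DOTS-Finder source; the names CODON_TABLE and classify_snvs are hypothetical. It assumes that a change from a stop codon to a sense codon is counted among the non-synonymous changes, and it reports unweighted counts, with no correction for human codon usage.

```python
from collections import Counter

BASES = "TCAG"
# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def classify_snvs():
    """Count silent, truncating and non-synonymous single-base changes over all codons."""
    counts = Counter()
    for codon, ref_aa in CODON_TABLE.items():
        for pos in range(3):
            for alt in BASES:
                if alt == codon[pos]:
                    continue  # not a change
                mutant = codon[:pos] + alt + codon[pos + 1:]
                alt_aa = CODON_TABLE[mutant]
                if alt_aa == ref_aa:
                    counts["silent"] += 1          # includes stop-to-stop changes
                elif alt_aa == "*":
                    counts["truncating"] += 1      # sense codon becomes a stop codon
                else:
                    counts["non_synonymous"] += 1  # includes stop-to-sense changes
    return counts


snv_counts = classify_snvs()
print(dict(snv_counts))
print("unweighted truncating fraction:", snv_counts["truncating"] / sum(snv_counts.values()))
```

The unweighted truncating fraction is 23/576, close to 4.0%; the 3.9% quoted in the text additionally weights each codon by its usage in the human genome, which this sketch does not attempt.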
The TSG-S is calculated using a binomial distribution under the null hypothesis that the ratio between truncating mutations and total number of mutations found in each gene is equal to the average truncating/total ratio in patients’ exomes (Figure S5 in Additional file1). The calculation of this score is set in the specific cancer-patient environment where the gene is found mutated, following the idea that a truncating mutation in a sample with few other alterations weights more than a mutation in a hypermutated sample.
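A minimal sketch of the binomial test described above is given here for orientation. It is not the DOTS-Finder implementation: the function name truncating_pvalue is hypothetical, a single fixed expected ratio stands in for the patient-specific ratios used by the tool, and the sketch returns only the upper-tail P-value, since the text does not state how the P-value is transformed into the final score.

```python
from math import comb


def truncating_pvalue(n_truncating, n_total, expected_ratio):
    """Upper-tail binomial probability P(X >= n_truncating), X ~ Bin(n_total, expected_ratio).

    expected_ratio is the truncating/total ratio under the null hypothesis
    (an assumption here: one value for the whole cohort).
    """
    return sum(
        comb(n_total, k) * expected_ratio ** k * (1 - expected_ratio) ** (n_total - k)
        for k in range(n_truncating, n_total + 1)
    )


# Hypothetical gene: 8 truncating mutations out of 20, tested against the
# approximately 14% average ratio reported in the text.
p_value = truncating_pvalue(8, 20, 0.14)
print(p_value)
```

A small P-value indicates more truncating mutations than expected from the average ratio, which is the behavior expected of a TSG candidate.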
The OG-S indicates whether a gene harbors an elevated number of missense mutations in certain regions of the gene. The score is based on the Shannon’s entropy of the pattern of missense SNVs and inframe indels, calculated using a Gaussian density model on the protein product. Every mutation is weighted for the actual Functional Impact provided by Mutation Assessor (a ‘protein function’ method) and compared with a random model estimated by a bootstrapping procedure. The score is able to catch the clusterization of mutations around significant hot spots in a gene.
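The idea behind the OG-S can be illustrated with a short Python sketch: mutations are spread along the protein with a Gaussian kernel, the Shannon entropy of the resulting density is computed, and the observed entropy is compared with the entropies obtained by placing the same mutations at random. This is a sketch under stated assumptions, not the published procedure: the bandwidth, the uniform random placement, the number of bootstrap replicates and all function names are hypothetical, and the weights stand in for the Mutation Assessor functional impact scores.

```python
import math
import random


def smoothed_density(positions, weights, length, bandwidth):
    """Weighted Gaussian density over residues 1..length, normalized to sum to 1."""
    density = [0.0] * length
    for pos, weight in zip(positions, weights):
        for residue in range(1, length + 1):
            density[residue - 1] += weight * math.exp(-0.5 * ((residue - pos) / bandwidth) ** 2)
    total = sum(density)
    return [d / total for d in density]


def shannon_entropy(density):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in density if p > 0)


def clustering_pvalue(positions, weights, length, bandwidth=2.0, n_boot=200, seed=0):
    """Observed entropy and the fraction of random placements that are at least as clustered."""
    rng = random.Random(seed)
    observed = shannon_entropy(smoothed_density(positions, weights, length, bandwidth))
    as_low = 0
    for _ in range(n_boot):
        shuffled = [rng.randint(1, length) for _ in positions]
        entropy = shannon_entropy(smoothed_density(shuffled, weights, length, bandwidth))
        if entropy <= observed:
            as_low += 1
    return observed, (as_low + 1) / (n_boot + 1)


# Hypothetical 400-residue protein with ten equally weighted mutations at one hotspot.
hotspot = [132] * 10
print(clustering_pvalue(hotspot, [1.0] * 10, 400))
```

Low entropy corresponds to mutations concentrated around a few positions. Because the kernel is continuous, mutations at neighboring residues also lower the entropy, which is how a smoothed score can pick up clusters that do not fall on exactly the same amino acid.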
We set a threshold for the two scores based on the analysis of the Catalogue Of Somatic Mutations In Cancer (COSMIC), using as positive control the CGC genes that encompass somatic point mutations. To evaluate the quality of our scores with regard to classification as driver and non-driver, and avoid making assumptions on the behavior of driver genes, we adopted two strategies. First, we did not consider any a priori set of true non-driver genes (negative control) and, second, we did not divide the CGC into OGs and TSGs. As mentioned before, the OG-S and TSG-S work on different levels and different mutation types, so we do not exclude the possibility that the same gene might show oncogenic and tumor suppressor features at the same time in different tumors, or even in the same cohort of patients (see the 'Atypical tumor suppressor genes and oncogenes' section below).
Since the number of mutated genes reported in COSMIC is greater than 18,000, the known drivers in CGC account for less than 1% of all the mutated genes. These numbers indicate that the two classes are extremely unbalanced, and that a common 'receiver operating characteristic' analysis is not appropriate to assess the goodness of our scores. We therefore calculated the Matthews correlation coefficient curves for the two scores and maximized their values to obtain our thresholds (Figure S6 in Additional file 1). Compared with other common measures like accuracy, the Matthews correlation coefficient is much more informative for strongly unbalanced classes. Our thresholds were also rescaled for every tumor type in order to take into account the setting-specific mutation rate and the number of samples at our disposal.
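The threshold selection described above can be sketched as follows. The Matthews correlation coefficient is computed from the confusion matrix obtained at each candidate threshold, and the threshold with the highest coefficient is kept. The function names and the toy scores are hypothetical, and the tumor-specific rescaling mentioned in the text is not reproduced.

```python
from math import sqrt


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


def best_threshold(scores, is_driver):
    """Return (threshold, mcc) maximizing the MCC when genes with score >= threshold are called drivers."""
    best_t, best_value = None, -1.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, is_driver) if s >= threshold and d)
        fp = sum(1 for s, d in zip(scores, is_driver) if s >= threshold and not d)
        fn = sum(1 for s, d in zip(scores, is_driver) if s < threshold and d)
        tn = sum(1 for s, d in zip(scores, is_driver) if s < threshold and not d)
        value = mcc(tp, fp, tn, fn)
        if value > best_value:
            best_t, best_value = threshold, value
    return best_t, best_value


# Toy example: five genes, the two highest-scoring ones are known drivers.
toy_scores = [0.1, 0.2, 0.3, 0.9, 0.95]
toy_labels = [False, False, False, True, True]
print(best_threshold(toy_scores, toy_labels))
```

Unlike accuracy, the coefficient stays near zero for a classifier that calls every gene a non-driver, which is why it suits a setting where fewer than 1% of genes are positives.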
The frequentist step: assessing the possible drivers
Genes that exceed at least one of the thresholds of the two scores are classified as OGs or TSGs and four tests are then performed to assess if the mutational pattern in each gene shows a statistically defined 'driver behavior'. This analysis is complex, as it requires the proper estimation of the BMR, which is specific for each gene in each tumor type and patient. Indeed, we foresee at least seven sources of BMR heterogeneity: i) the specific mutation-rate of each tumor type; ii) the specific number of mutations in each patient; iii) the GC content, as most of the mutations found in cancer are point mutations occurring in GC spots; iv) the gene size; v) the gene-specific single nucleotide polymorphism frequency; vi) the replication time; vii) the levels of gene expression. However, other unknown parameters could also influence the BMR of a gene. Our method does not need to take into consideration either replication timing or gene expression levels, since they both require a great amount of new experimental data.
The four tests used by DOTS-Finder are the higher frequency test, the non-synonymous versus synonymous ratio test, the tumor-specificity test and the functional impact test (see Text S3 in Additional file 1 for a full explanation of these). In the higher frequency test, the rate of non-synonymous mutations per megabase in a gene is compared with the rate of mutations in the patients carrying mutations in that gene. Given the total number of mutations found in a specific gene, the non-synonymous versus synonymous ratio test assesses whether the number of non-synonymous mutations is higher than the expected number of non-synonymous mutations. The expected value is calculated on the probabilistic ratio obtained by randomly placing the same number and type of mutations on the specific codon usage structure of the gene. The tumor-specificity test prioritizes the driver genes in the different tumors, although it is not fundamental for the driver assessment. It compares the frequency of non-synonymous mutations in the samples with the frequency found in the COSMIC database across tumor types, and verifies whether the frequency in a particular tumor or situation is higher than the general frequency found in COSMIC. The idea is that some mutations are tissue-specific and might be drivers only in certain kinds of cancers. For example, NPM1 is a clear driver gene specific for leukemias; similarly, VHL is specific for renal cancer. The functional impact test is used to verify whether the functional impact score of the gene mutations, calculated by Mutation Assessor, is higher than the average score in the patients affected by a mutation in that gene. The four P-values obtained from these tests are combined using Stouffer’s method with specific weights, in order to take into account both the dependencies between tests and their relative importance in the driver definition (Text S3 in Additional file 1). The resulting P-value is then adjusted to correct for false discovery rate.
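The combination step can be illustrated with a weighted Stouffer's method followed by a false discovery rate adjustment. The sketch below makes several assumptions that should be read as such: the weights are placeholders (the actual weights and the handling of dependencies between tests are described in Text S3 and are not reproduced), the Benjamini-Hochberg procedure is assumed because the text does not name the correction used, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def weighted_stouffer(p_values, weights):
    """Combine one-sided P-values with Stouffer's weighted Z method."""
    eps = 1e-12  # keep P-values strictly inside (0, 1) so the inverse CDF is defined
    z_scores = [_STD_NORMAL.inv_cdf(1 - min(max(p, eps), 1 - eps)) for p in p_values]
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(sum(w * w for w in weights))
    return 1 - _STD_NORMAL.cdf(combined_z)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P-values for the four tests of one gene, with placeholder weights.
gene_p = weighted_stouffer([0.001, 0.02, 0.30, 0.01], [2.0, 1.0, 0.5, 1.0])
print(gene_p)
# Hypothetical combined P-values for four genes, adjusted together.
print(benjamini_hochberg([gene_p, 0.04, 0.2, 0.5]))
```

Giving a test a smaller weight reduces its influence on the combined Z score, which is one simple way to encode that the tumor-specificity test is described as not fundamental for the driver assessment.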
Results and discussion
Application of DOTS-Finder to individual cancer types characterized by different mutation rates
Significantly mutated genes identified by DOTS-Finder in four cancer types: acute myeloid leukemia (S = 196, MNSp = 11), thyroid carcinoma (S = 326, MNSp = 19), breast cancer (S = 1046, MNSp = 36) and bladder carcinoma (S = 145, MNSp = 177)
We identified three new driver candidates not present in previous publications: AQP7, MEF2A and UBC. AQP7 encodes aquaporin 7, an integral-membrane protein that plays important roles in water and fluid transport and cell migration. Recent discoveries of the involvement of AQPs in cell migration and proliferation suggest that AQPs play key roles in tumor biology. MEF2A encodes a DNA-binding transcription factor that is involved in several cellular processes, including cell growth control and apoptosis. It was recently shown that NOTCH-MEF2 synergy may be significant for modulating human mammary oncogenesis. UBC is a member of the ubiquitin family and is involved in cell cycle and DNA repair. The role of ubiquitination is well established in cancer, especially in breast cancer.
We applied DOTS-Finder to the list of 326 thyroid carcinoma samples from TCGA, identifying 12 driver genes. We could only compare the DOTS-Finder results with the results obtained by TUSON Explorer, since, to date, there are no published TCGA papers for thyroid carcinoma (Figure 3B). Three of our putative driver genes (TG, BRAF and RPTN) are also predicted by TUSON Explorer. TG and BRAF are known driver genes in THCA [30, 31], while RPTN is a poorly characterized protein that has never been associated with THCA.
We identified several putative driver genes that may have relevant functions in cancer development (Table 1): mutations in EMG1 have been recently identified in a screen for mediators of IGF-1 signaling in cancer; germline mutations in PRDM9 are thought to influence genomic instability, increasing the risk of acquiring genomic rearrangements associated with childhood leukemogenesis; and PPM1D is an important interactor of TP53, is amplified in different types of cancers and encodes WIP1, a protein involved in oncogenesis. Recently, mutations and variants of this gene were associated with DNA damage response.
Although only slightly above our threshold, we also detected PTTG1IP and DICER1 as putative OGs. Interestingly, PTTG1IP (pituitary tumor transforming gene-binding factor) is a poorly characterized proto-oncogene that has already been implicated in the etiology of thyroid tumors [36, 37]. Loss of DICER1 is associated with the development of many cancers; somatic missense mutations affecting DICER1 are common in non-epithelial ovarian tumors and these mutations show an oncogenic behavior.
Atypical tumor suppressor genes and oncogenes
Genetic and functional effect of mutations in oncogenes and tumor suppressors
Dominant negative TSG
Typically, the functional information is missing or poorly understood for new driver candidates and the genetic (allele-specific) information is not directly available in cancer sequencing studies. Thus, the OG and TSG classification must be inferred from the structural level. It is not surprising that our tool can classify many genes as being both TSGs and OGs within the same cancer type, or even put them into different categories according to the tumor context. This apparent misclassification might shed light on the particular behavior of some genes.
Inference of biological classification by structural effect of mutational landscape
Typical (gain-of-function) for example, KRAS
Atypical (gain of function through loss of inhibition) for example, NPM1
Atypical (dominant negative, gain-of-function)
Atypical (possible dominant negative, gain-of-function*)
for example, SMARCA4 in lymphoma
for example, RB1
for example, TP53 in UCEC or DNMT3A in AML
The importance of considering subsets of samples
Analyzing the pattern of genetic alterations in tumor subsets classified by clinical or other biologic parameters can reveal important insights into individual pathogenic mechanisms and suggest possible therapeutic avenues. For instance, in LUAD, about 25 to 30% of the cases are not attributable to tobacco smoking as they are found in people that have never smoked (never smokers). Studies have revealed that LUAD in never smokers is a completely different disease from any type of lung cancer arising in smokers (LUAD included), as it differs in terms of clinical and pathological features, with diverse prognosis and strategy of care. The difference in the mutational landscape supports the hypothesis that lung adenocarcinomas in never smokers are driven by distinct genetic mechanisms. To identify additional driver genes with a role in the development of lung cancer in never smokers, we applied DOTS-Finder to the somatic mutations of the 50 never smoker patients present in the LUAD samples of the TCGA. These samples constitute approximately 10% of the population; our driver candidate predictions are reported in Table S4 in Additional file2. At the top of the list of predicted OGs is EGFR, consistent with the fact that EGFR is a key oncogenic player in never smokers with LUAD. 
Besides the identification of very well-known cancer genes such as SMAD4, STK11, SETD2, MET, KEAP1, TP53 and KRAS, we also identified several putative driver genes that might have relevant cancer development functions: somatic mutations in GRM1 disrupt signaling with multiple downstream consequences; mutations in RPL5 have been recently described as a potential oncogenic factor in T-cell acute lymphoblastic leukemia; inactivating mutations in the SDHA gene, which has a role as a TSG, have been identified in familial paragangliomas [52, 53]; and WRN encodes a helicase that is important for genomic integrity and involved in the repair of double strand DNA breaks. Defects in WRN are the cause of the aging-promoting Werner syndrome, and copy number variations or epigenetic inactivation of this gene have been recently found in never smokers with LUAD and in non-small cell lung cancer, respectively.
Similarly, kidney cancer can be classified into different histological subtypes, the most common being kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP) and kidney chromophobe (KICH). Applying DOTS-Finder separately on each kidney dataset (results are in Table S2 in Additional file2), we observed a subtype-specific pattern of genetic alterations. KIRC and KIRP share only SETD2, KIRC and KICH have only TP53 in common, and there are no common driver genes between KIRP and KICH. By analyzing all the datasets together we can predict two new putative driver genes, GFRAL and STAG2, not appearing in the single analyses. Since the KIRC subset is predominant in terms of sample size, the aggregated analysis can recapitulate 69% of its genes, while it can only identify 50% of KICH and 27% of KIRP genes. In KIRP, we lose the following candidate driver genes, which then appear to be tumor specific: KDM6A, SRCAP, SAV1, DARS, OGG1, MET, ATP10A; similarly, in KICH we lose CDKN1A.
DOTS-Finder sets the threshold for OG-S and TSG-S as a function of both the mutation rate of the analyzed tumor and the sample size of the input dataset (Text S3f in Additional file1). These thresholds have a default lower boundary. Nevertheless, for very small sample sizes, these thresholds can still be too high to let genes pass the functional step. We decided to introduce an option called lax that ignores the imposed lower boundary and allows more genes to pass the functional step in the presence of a small sample size. We provide insights on two tumors with small sample size (oligodendroglioma (16 patients) and carcinoid (54 patients)) to highlight the lax option in Text S2d and Table S5 in Additional file1.
DOTS-Finder is the first published software that can identify driver genes and classify them as TSGs and/or OGs, and it can also be used to identify driver genes with atypical patterns of mutations (Figure 4). In addition, it is the first software that can be used by a vast and diverse scientific community, as it is easy to install and use, does not require proprietary software, and does not require the use of low-level and hard to access files (for example, BAM files, coverage files).
We have applied DOTS-Finder on publicly available datasets containing the mutation profile of 34 cancer types. We have obtained plausible driver genes for many low mutation rate cancers like gliomas, acute myeloid leukemia and prostate cancer. Notably, we have obtained results that are consistent with the literature even with some high mutation rate tumor types, like head and neck squamous cell carcinoma and bladder cancer, where the risk of falling into the 'fishy genes' trap is higher.
Our tool outperforms other available methods in terms of precision-recall, considering CGC as a gold standard. Importantly, DOTS-Finder has confirmed the predictions made by other methods and discovered novel driver candidates never identified before.
Using DOTS-Finder, researchers can identify driver genes in large public databases and also in user-defined samples stratified for a given characteristic (for example, obese/normal weight, male/female, and so on), as the software is specifically designed to identify driver genes even in small datasets. The use of few samples in cancer is justified by the high molecular heterogeneity present in tumors. Indeed, we believe that the results produced by DOTS-Finder could be very useful for researchers who want to identify driver genes in user-defined datasets, in order to investigate the significance or relevance of particular somatic mutations in relation to specific clinical questions.
Availability and requirements
Project name: DOTS-Finder.
Project home page: see.
Operating system(s): Unix based (MacOS, Linux).
Programming language: Python/R.
Other requirements: Python 2.7, R > 2.
License: GNU GPLv3+.
Any restrictions to use by non-academics: license needed.
Abbreviations
AML: acute myeloid leukemia
BMR: background mutation rate
CGC: Cancer Gene Census
COSMIC: Catalogue Of Somatic Mutations In Cancer
DOTS-Finder: Driver Oncogene and Tumor Suppressor Finder
KIRC: kidney renal clear cell carcinoma
KIRP: kidney renal papillary cell carcinoma
MAF: Mutation Annotation Format
SNV: single nucleotide variant
TCGA: The Cancer Genome Atlas
TSG: tumor suppressor gene
TSG-S: Tumor Suppressor Gene Score.
LR was supported by a Reintegration AIRC/Marie Curie International Fellowship in Cancer Research. This work was supported by a grant from Fondazione Cariplo to LR. SdP acknowledges funding from the European Community’s Seventh Framework Programme (FP7/2007-2013), project RADIANT (grant agreement no. 305626). We thank Marco J Morelli for the useful discussion on statistical analysis and critical review of the manuscript. We thank Lucilla Luzi for the useful discussion on the biological interpretation of the results and Luciano Giacò, Margherita Bodini, Anna Russo, Francesco Santaniello and Marzia Cremona for testing and debugging the beta version of the software. We thank Paola Dalton and Roberta Aina for critical review of the manuscript.
- Ng PC, Henikoff S: SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003, 31: 3812-3814. doi:10.1093/nar/gkg509
- Shihab HA, Gough J, Cooper DN, Day INM, Gaunt TR: Predicting the functional consequences of cancer-associated amino acid substitutions. Bioinformatics. 2013, 29: 1504-1510. doi:10.1093/bioinformatics/btt182
- Reva B, Antipin Y, Sander C: Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res. 2011, 39: e118. doi:10.1093/nar/gkr407
- Wood LD, Parsons DW, Jones S, Lin J, Sjöblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, Silliman N, Szabo S, Dezso Z, Ustyanksky V, Nikolskaya T, Nikolsky Y, Karchin R, Wilson PA, Kaminker JS, Zhang Z, Croshaw R, Willis J, Dawson D, Shipitsin M, Willson JKV, Sukumar S, Polyak K, Ben Ho P, Pethiyagoda CL, Pant PVK, et al: The genomic landscapes of human breast and colorectal cancers. Science. 2007, 318: 1108-1113. doi:10.1126/science.1145720
- Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, Carter SL, Stewart C, Mermel CH, Roberts SA, Kiezun A, Hammerman PS, McKenna A, Drier Y, Zou L, Ramos AH, Pugh TJ, Stransky N, Helman E, Kim J, Sougnez C, Ambrogio L, Nickerson E, Shefler E, Cortés ML, Auclair D, Saksena G, Voet D, Noble M, DiCara D, et al: Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature. 2013, 499: 214-218. doi:10.1038/nature12213
- Dees ND, Zhang Q, Kandoth C, Wendl MC, Schierding W, Koboldt DC, Mooney TB, Callaway MB, Dooling D, Mardis ER, Wilson RK, Ding L: MuSiC: identifying mutational significance in cancer genomes. Genome Res. 2012, 22: 1589-1598.
- Ciriello G, Cerami E, Sander C, Schultz N: Mutual exclusivity analysis identifies oncogenic network modules. Genome Res. 2012, 22: 398-406.
- Bashashati A, Haffari G, Ding J, Ha G, Liu K: DriverNet: uncovering the impact of somatic driver mutations on transcriptional networks in cancer. Genome Biol. 2012, 13: R124-, 10.1186/gb-2012-13-12-r124PubMed CentralView ArticlePubMedGoogle Scholar
- Vandin F, Upfal E, Raphael BJ: De novo discovery of mutated driver pathways in cancer. Genome Res. 2012, 22: 375-385., 10.1101/gr.120477.111PubMed CentralView ArticlePubMedGoogle Scholar
- Leiserson MDM, Blokh D, Sharan R, Raphael BJ: Simultaneous identification of multiple driver pathways in cancer. PLoS Comput Biol. 2013, 9: e1003054-, 10.1371/journal.pcbi.1003054PubMed CentralView ArticlePubMedGoogle Scholar
- Liu H, Xing Y, Yang S, Tian D: Remarkable difference of somatic mutation patterns between oncogenes and tumor suppressor genes. Oncol Rep. 2011, 26: 1539-1546.PubMedGoogle Scholar
- Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA, Kinzler KW: Cancer genome landscapes. Science. 2013, 339: 1546-1558. doi:10.1126/science.1235122
- Davoli T, Xu AW, Mengwasser KE, Sack LM, Yoon JC, Park PJ, Elledge SJ: Cumulative haploinsufficiency and triplosensitivity drive aneuploidy patterns and shape the cancer genome. Cell. 2013, 155: 948-962. doi:10.1016/j.cell.2013.10.011
- Tamborero D, Gonzalez-Perez A, López-Bigas N: OncodriveCLUST: exploiting the positional clustering of somatic mutations to identify cancer genes. Bioinformatics. 2013, 29: 2238-2244. doi:10.1093/bioinformatics/btt395
- Cheng W-C, Chung I-F, Chen C-Y, Sun H-J, Fen J-J, Tang W-C, Chang T-Y, Wong T-T, Wang H-W: DriverDB: an exome sequencing database for cancer driver gene identification. Nucleic Acids Res. 2014, 42: D1048-D1054. doi:10.1093/nar/gkt1025
- Tamborero D, Gonzalez-Perez A, Perez-Llamas C, Deu-Pons J, Kandoth C, Reimand J, Lawrence MS, Getz G, Bader GD, Ding L, López-Bigas N: Comprehensive identification of mutational cancer driver genes across 12 tumor types. Sci Rep. 2013, 3: 2650.
- Reimand J, Bader GD: Systematic analysis of somatic mutations in phosphorylation signaling predicts novel cancer drivers. Mol Syst Biol. 2013, 9: 637.
- Gonzalez-Perez A, López-Bigas N: Functional impact bias reveals cancer drivers. Nucleic Acids Res. 2012, 40: e169. doi:10.1093/nar/gks743
- Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR: A census of human cancer genes. Nat Rev Cancer. 2004, 4: 177-183. doi:10.1038/nrc1299
- Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, Beare D, Jia M, Shepherd R, Leung K, Menzies A, Teague JW, Campbell PJ, Stratton MR, Futreal PA: COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res. 2011, 39: D945-D950. doi:10.1093/nar/gkq929
- Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H: Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics. 2000, 16: 412-424. doi:10.1093/bioinformatics/16.5.412
- Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, Xie M, Zhang Q, McMichael JF, Wyczalkowski MA, Leiserson MDM, Miller CA, Welch JS, Walter MJ, Wendl MC, Ley TJ, Wilson RK, Raphael BJ, Ding L: Mutational landscape and significance across 12 major cancer types. Nature. 2013, 502: 333-339. doi:10.1038/nature12634
- Lawrence MS, Stojanov P, Mermel CH, Robinson JT, Garraway LA, Golub TR, Meyerson M, Gabriel SB, Lander ES, Getz G: Discovery and saturation analysis of cancer genes across 21 tumour types. Nature. 2014, 505: 495-501. doi:10.1038/nature12912
- Cancer Genome Atlas Network: Comprehensive molecular portraits of human breast tumours. Nature. 2012, 490: 61-70. doi:10.1038/nature11412
- Robinson JLL, Holmes KA, Carroll JS: FOXA1 mutations in hormone-dependent cancers. Front Oncol. 2012, 3: 20.
- Catucci I, Verderio P, Pizzamiglio S, Manoukian S, Peissel B, Zaffaroni D, Roversi G, Ripamonti CB, Pasini B, Barile M, Viel A, Giannini G, Papi L, Varesco L, Martayan A, Riboni M, Volorio S, Radice P, Peterlongo P: The CASP8 rs3834129 polymorphism and breast cancer risk in BRCA1 mutation carriers. Breast Cancer Res Treat. 2011, 125: 855-860.
- Verkman AS, Hara-Chikuma M, Papadopoulos MC: Aquaporins–new players in cancer biology. J Mol Med. 2008, 86: 523-529. doi:10.1007/s00109-008-0303-9
- Pallavi SK, Ho DM, Hicks C, Miele L, Artavanis-Tsakonas S: Notch and Mef2 synergize to promote proliferation and metastasis through JNK signal activation in Drosophila. EMBO J. 2012, 31: 2895-2907. doi:10.1038/emboj.2012.129
- Ohta T, Fukuda M: Ubiquitin and breast cancer. Oncogene. 2004, 23: 2079-2088. doi:10.1038/sj.onc.1207371
- Rubio IGS, Medeiros-Neto G: Mutations of the thyroglobulin gene and its relevance to thyroid disorders. Curr Opin Endocrinol Diabetes Obes. 2009, 16: 373-378. doi:10.1097/MED.0b013e32832ff218
- Kimura ET, Nikiforova MN, Zhu Z, Knauf JA, Nikiforov YE, Fagin JA: High prevalence of BRAF mutations in thyroid cancer: genetic evidence for constitutive activation of the RET/PTC-RAS-BRAF signaling pathway in papillary thyroid carcinoma. Cancer Res. 2003, 63: 1454-1457.
- McMahon M, Ayllón V, Panov KI, O’Connor R: Ribosomal 18 S RNA processing by the IGF-I-responsive WDR3 protein is integrated with p53 function in cancer cell proliferation. J Biol Chem. 2010, 285: 18309-18318. doi:10.1074/jbc.M110.108555
- Hussin J, Sinnett D, Casals F, Idaghdour Y, Bruat V, Saillour V, Healy J, Grenier J-C, de Malliard T, Busche S, Spinella J-F, Larivière M, Gibson G, Andersson A, Holmfeldt L, Ma J, Wei L, Zhang J, Andelfinger G, Downing JR, Mullighan CG, Awadalla P: Rare allelic forms of PRDM9 associated with childhood leukemogenesis. Genome Res. 2013, 23: 419-430. doi:10.1101/gr.144188.112
- Bulavin DV, Demidov ON, Saito S, Kauraniemi P, Phillips C, Amundson SA, Ambrosino C, Sauter G, Nebreda AR, Anderson CW, Kallioniemi A, Fornace AJ, Appella E: Amplification of PPM1D in human tumors abrogates p53 tumor-suppressor activity. Nat Genet. 2002, 31: 210-215. doi:10.1038/ng894
- Dudgeon C, Shreeram S, Tanoue K, Mazur SJ, Sayadi A, Robinson RC, Appella E, Bulavin DV: Genetic variants and mutations of PPM1D control the response to DNA damage. Cell Cycle. 2013, 12: 2656-2664. doi:10.4161/cc.25694
- Stratford AL, Boelaert K, Tannahill LA, Kim DS, Warfield A, Eggo MC, Gittoes NJL, Young LS, Franklyn JA, McCabe CJ: Pituitary tumor transforming gene binding factor: a novel transforming gene in thyroid tumorigenesis. J Clin Endocrinol Metab. 2005, 90: 4341-4349. doi:10.1210/jc.2005-0523
- Read ML, Lewy GD, Fong JCW, Sharma N, Seed RI, Smith VE, Gentilin E, Warfield A, Eggo MC, Knauf JA, Leadbeater WE, Watkinson JC, Franklyn JA, Boelaert K, McCabe CJ: Proto-oncogene PBF/PTTG1IP regulates thyroid cell growth and represses radioiodide treatment. Cancer Res. 2011, 71: 6153-6164. doi:10.1158/0008-5472.CAN-11-0720
- Heravi-Moussavi A, Anglesio MS, Cheng S-WG, Senz J, Yang W, Prentice L, Fejes AP, Chow C, Tone A, Kalloger SE, Hamel N, Roth A, Ha G, Wan ANC, Maines-Bandiera S, Salamanca C, Pasini B, Clarke BA, Lee AF, Lee C-H, Zhao C, Young RH, Aparicio SA, Sorensen PHB, Woo MMM, Boyd N, Jones SJM, Hirst M, Marra MA, Gilks B, et al: Recurrent somatic DICER1 mutations in nonepithelial ovarian cancers. N Engl J Med. 2012, 366: 234-242. doi:10.1056/NEJMoa1102903
- Payne SR, Kemp CJ: Tumor suppressor genetics. Carcinogenesis. 2005, 26: 2031-2045. doi:10.1093/carcin/bgi223
- Xu J, Haigis KM, Firestone AJ, McNerney ME, Li Q, Davis E, Chen S-C, Nakitandwe J, Downing J, Jacks T, Le Beau MM, Shannon K: Dominant role of oncogene dosage and absence of tumor suppressor activity in Nras-driven hematopoietic transformation. Cancer Discov. 2013, 3: 993-1001. doi:10.1158/2159-8290.CD-13-0096
- Oren M, Rotter V: Mutant p53 gain-of-function in cancer. Cold Spring Harb Perspect Biol. 2010, 2: a001107.
- Kim SJ, Zhao H, Hardikar S, Singh AK, Goodell MA, Chen T: A DNMT3A mutation common in AML exhibits dominant-negative effects in murine ES cells. Blood. 2013, 122: 4086-4089. doi:10.1182/blood-2013-02-483487
- Medina PP, Romero OA, Kohno T, Montuenga LM, Pio R, Yokota J, Sanchez-Cespedes M: Frequent BRG1/SMARCA4-inactivating mutations in human lung cancer cell lines. Hum Mutat. 2008, 29: 617-622. doi:10.1002/humu.20730
- Magnani L, Cabot RA: Manipulation of SMARCA2 and SMARCA4 transcript levels in porcine embryos differentially alters development and expression of SMARCA1, SOX2, NANOG, and EIF1. Reproduction. 2009, 137: 23-33. doi:10.1530/REP-08-0335
- Medina PP, Sanchez-Cespedes M: Involvement of the chromatin-remodeling factor BRG1/SMARCA4 in human cancer. Epigenetics. 2008, 3: 64-68. doi:10.4161/epi.3.2.6153
- Mariano AR, Colombo E, Luzi L, Martinelli P, Volorio S, Bernard L, Meani N, Bergomas R, Alcalay M, Pelicci PG: Cytoplasmic localization of NPM in myeloid leukemias is dictated by gain-of-function mutations that create a functional nuclear export signal. Oncogene. 2006, 25: 4376-4380. doi:10.1038/sj.onc.1209453
- Grisendi S, Mecucci C, Falini B, Pandolfi PP: Nucleophosmin and cancer. Nat Rev Cancer. 2006, 6: 493-505. doi:10.1038/nrc1885
- Rudin CM, Avila-Tang E, Harris CC, Herman JG, Hirsch FR, Pao W, Schwartz AG, Vahakangas KH, Samet JM: Lung cancer in never smokers: molecular profiles and therapeutic implications. Clin Cancer Res. 2009, 15: 5646-5661. doi:10.1158/1078-0432.CCR-09-0377
- Govindan R, Ding L, Griffith M, Subramanian J, Dees ND, Kanchi KL, Maher CA, Fulton R, Fulton L, Wallis J, Chen K, Walker J, McDonald S, Bose R, Ornitz D, Xiong D, You M, Dooling DJ, Watson M, Mardis ER, Wilson RK: Genomic landscape of non-small cell lung cancer in smokers and never-smokers. Cell. 2012, 150: 1121-1134. doi:10.1016/j.cell.2012.08.024
- Esseltine JL, Willard MD, Wulur IH, Lajiness ME, Barber TD, Ferguson SSG: Somatic mutations in GRM1 in cancer alter metabotropic glutamate receptor 1 intracellular localization and signaling. Mol Pharmacol. 2013, 83: 770-780. doi:10.1124/mol.112.081695
- De Keersmaecker K, Atak ZK, Li N, Vicente C, Patchett S, Girardi T, Gianfelici V, Geerdens E, Clappier E, Porcu M, Lahortiga I, Lucà R, Yan J, Hulselmans G, Vranckx H, Vandepoel R, Sweron B, Jacobs K, Mentens N, Wlodarska I, Cauwelier B, Cloos J, Soulier J, Uyttebroeck A, Bagni C, Hassan BA, Vandenberghe P, Johnson AW, Aerts S, Cools J: Exome sequencing identifies mutation in CNOT3 and ribosomal genes RPL5 and RPL10 in T-cell acute lymphoblastic leukemia. Nat Genet. 2013, 45: 186-190.
- Francis JM, Kiezun A, Ramos AH, Serra S, Pedamallu CS, Qian ZR, Banck MS, Kanwar R, Kulkarni AA, Karpathakis A, Manzo V, Contractor T, Philips J, Nickerson E, Pho N, Hooshmand SM, Brais LK, Lawrence MS, Pugh T, McKenna A, Sivachenko A, Cibulskis K, Carter SL, Ojesina AI, Freeman S, Jones RT, Voet D, Saksena G, Auclair D, Onofrio R, et al: Somatic mutation of CDKN1B in small intestine neuroendocrine tumors. Nat Genet. 2013, 45: 1483-1486. doi:10.1038/ng.2821
- Bardella C, Pollard PJ, Tomlinson I: SDH mutations in cancer. Biochim Biophys Acta. 2011, 1807: 1432-1443. doi:10.1016/j.bbabio.2011.07.003
- Job B, Bernheim A, Beau-Faller M, Camilleri-Broët S, Girard P, Hofman P, Mazières J, Toujani S, Lacroix L, Laffaire J, Dessen P, Fouret P, LG Investigators: Genomic aberrations in lung adenocarcinoma in never smokers. PLoS One. 2010, 5: e15145. doi:10.1371/journal.pone.0015145
- Agrelo R, Cheng W-H, Setien F, Ropero S, Espada J, Fraga MF, Herranz M, Paz MF, Sanchez-Cespedes M, Artiga MJ, Guerrero D, Castells A, von Kobbe C, Bohr VA, Esteller M: Epigenetic inactivation of the premature aging Werner syndrome gene in human cancer. Proc Natl Acad Sci U S A. 2006, 103: 8822-8827. doi:10.1073/pnas.0600645103
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.