Comprehensive analysis of the genome transcriptome and proteome landscapes of three tumor cell lines
© Akan et al.; licensee BioMed Central Ltd. 2012
Received: 27 September 2012
Accepted: 18 November 2012
Published: 18 November 2012
We here present a comparative genome, transcriptome and functional network analysis of three human cancer cell lines (A431, U251MG and U2OS), and investigate their relation to protein expression. Gene copy numbers significantly influenced corresponding transcript levels; their effect on protein levels was less pronounced. We focused on genes with altered mRNA and/or protein levels to identify those active in tumor maintenance. We provide comprehensive information for the three genomes and demonstrate the advantage of integrative analysis for identifying tumor-related genes amidst numerous background mutations by relating genomic variation to expression/protein abundance data and use gene networks to reveal implicated pathways.
Human cancer cell lines have been an invaluable and practical resource for cancer research. The availability of genomic, transcriptomic and proteomic data on these lines is expected to further increase their utility. To this end, we conducted whole-genome and transcriptome sequencing on three tumor cell lines (A431, U251MG and U2OS) for which there is a large body of proteomics data. The choice of these lines was also motivated by their origin from different lineages (tumor cell lines from mesenchymal, epithelial and glial tumors) and the abundance of literature on them.
A431 is used as a model cell line for epidermoid carcinoma, and there are currently 3,359 publications describing studies using this cell line. It was established from an epidermoid carcinoma in the vulva of an 85-year-old patient. This cell line expresses high levels of epidermal growth factor receptor (EGFR) and is often used to investigate cell proliferation and apoptosis. U251MG is a commonly used glioblastoma cell line (over 1,200 published articles) established from the brain tissue of a male patient. U2OS is an osteosarcoma cell line derived from a 15-year-old female. Osteosarcoma tumors arise from cells of mesenchymal origin that differentiate to osteoblasts. Osteosarcoma is the most common form of bone cancer, responsible for 2.4% of all malignancies in pediatric patients, and its triggers are currently not known. U2OS is a common choice for osteosarcoma research: 35% of the articles associated with the osteosarcoma Medical Subject Headings (MeSH) term in the PubMed database have used this cell line.
Using modern technologies, we subjected these three cell lines to genome and RNA sequencing in order to identify genomic alterations and expression of messenger and microRNAs. A review by Ideker and Sharan summarized studies that demonstrate how genes with a role in cancer tend to cluster together on well-connected sub-networks of protein-protein interactions . We also earlier demonstrated that somatic mutations in a glioblastoma cancer genome produced a pathway-like pattern of enriched connectivity in the gene interaction network. Hence, in this work we analyzed functional relations between all detected somatic mutations, structural variations (altered copy number) and allelic imbalances of expression via network enrichment analysis (NEA) [7, 8]. A biological pathway could be seen as an area of densely connected genes in a functional gene network. The idea of NEA when applied to cancer-related genes is that multiple key mutations (which are believed to be common in cancer genomes) could alter normal cellular programs for proliferation, differentiation, cell death, and so on, sometimes even producing quasi-pathways . These altered pathways could then be detected as denser and more enriched areas and evaluated by comparing patterns formed by the same set of genes in biologically meaningless (random) networks. Either the whole group or members of such a pathway could have links to individual master switches of oncogenesis, which may themselves have not been altered.
In particular, Dutta and co-authors developed a valuable idea, according to which effects of driver genes might be seen as differential (mRNA or protein) expression of network neighbors. In the current work we pursue a similar approach, with the difference that we did not make any prior assumptions about modular properties of driver mutations and entirely summarized their relations to each other and important pathways. This method is the closest analog of gene set enrichment analysis (GSEA), with the important novel option of analyzing single genes against functional sets. Apart from that, gene network information enables much higher sensitivity, which we demonstrate as well.
While different methods of network inference from single or two data sources have been published, only data integration networks have a broader scope and include the multiple molecular mechanisms required for our analysis. For the highest completeness, we employed a network of functional coupling that was drawn up using the methodology of the data integration tool FunCoup, and then merged with curated pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG), protein complex data from CORUM, and a special network from glioblastoma data. However, any state-of-the-art network is likely incomplete or does not account for a specific context; we therefore complement the network analysis of direct links with analogous statistics that account for indirect links, that is, connections via third genes.
To enable a rigorous statistical evaluation, patterns of potential functional couplings are compared to observations in a series of randomized networks that preserve basic topological properties overall, but have no biological function. This results in probabilistic estimates for every tested hypothesis. As the analysis considers relative enrichment rather than absolute signal strength, functional patterns can be discerned in the presence of multiple spurious mutations, which are referred to as passengers. On the other hand, any computation-based gene network would have a high number of individual false edges. Again, looking at statistically significant enrichment patterns instead of focusing on particular links allows ignoring such false positive findings. Of note, a number of reports were dedicated to discovery of network structures (modules, clusters, hypothetical pathways, and so on) that could characterize pathologic conditions [10, 14, 15].
Here we describe, to our knowledge, the first study in which whole-genome and transcriptome data for three cancer genomes were analyzed in conjunction with data on global protein levels. First, we select genes with the potentially highest signal concentration (that is, filter them by expression values, correlation of those to genome alteration, sequence features, and so on), and subject them to network enrichment analysis to prove that both the selection criteria and NEA can bring us closer to the true sets of driver mutations in these genomes. Second, we re-analyze in the interaction network all detected copy number and single nucleotide alterations and present the most likely driver mutations within each genome. We show that passengers account for the overwhelming majority of all detected structural variations. We believe that the results presented herein provide a basis for understanding the functional interactions between the genome, transcriptome and proteome for both these highly influential model cell lines and cancer genomes in general.
Materials and methods
Sequencing and mapping
We sequenced six Illumina paired-end lanes for the osteosarcoma (U2OS) cell line, and five for each of the other two cell lines, glioblastoma (U251MG) and epidermoid carcinoma (A431). In total, there were 16 lanes, amounting to 1.23 billion paired-end reads. The data are publicly available [ERP001947]. The reads from each lane were then mapped to the human genome, hg19, using BWA. BWA was run with default parameters except for -l 25 and -k 2. With these settings, 90%, 92.6% and 88.3% of the reads were mapped for the U251MG, U2OS and A431 cell lines, respectively. Mapped reads were then filtered on a mapping quality higher than 30 to retain only the best mappings. Reads that mapped in multiple locations, which are reported by BWA as having quality 0, were discarded. This conferred coverage of approximately 21× for U2OS. For U251MG and A431 the coverage was approximately 15×. In addition to the paired-end libraries, we also sequenced three mate-pair lanes, one for each cell line. After clipping adapter sequences and reverse complementing the reads, we mapped them using BWA with the same parameters as above.
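The mapping-quality filter described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function name `filter_sam_by_mapq`, the helper `toy_record` and the toy records are hypothetical, and the code only assumes the standard SAM layout in which the fifth tab-separated column holds MAPQ.

```python
def filter_sam_by_mapq(sam_lines, min_mapq=30):
    """Yield header lines and alignments whose MAPQ is strictly above min_mapq.

    Multi-mapped reads, reported with MAPQ 0, are dropped by the same test.
    """
    for line in sam_lines:
        if line.startswith("@"):          # keep SAM header lines untouched
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # skip malformed alignment records
            continue
        if int(fields[4]) > min_mapq:     # column 5 of a SAM record is MAPQ
            yield line


def toy_record(name, mapq):
    """Build a made-up 10 bp alignment record for illustration only."""
    return "\t".join(
        [name, "0", "chr1", "100", str(mapq), "10M", "*", "0", "0",
         "A" * 10, "I" * 10]
    )


sam = [
    "@SQ\tSN:chr1\tLN:249250621",
    toy_record("unique_read", 60),
    toy_record("borderline_read", 30),    # not kept: quality must exceed 30
    toy_record("multi_mapped_read", 0),   # not kept: ambiguous placement
]
kept = list(filter_sam_by_mapq(sam))
```

In practice the same filter is usually applied with a dedicated alignment toolkit rather than by parsing text; the sketch only makes the "higher than 30" criterion explicit.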
Total RNA was extracted using the RNeasy Mini extraction kit from Qiagen (Hilden, Germany) and eluted in 50 μl of RNase-free water. The quality of the RNA was analyzed using the Experion Automated Electrophoresis Station and the standard sensitivity RNA chip from Bio-Rad (Hercules, California, US). The RNA quality indicator (RQI) was 10 for all samples. The RNA extracts were stored at -80°C. Each RNA sample was bar-coded and prepared using the Illumina mRNA-seq sample preparation kit with the automated platform previously described. The bar-coded libraries were pooled together in pairs at equal concentrations and clustered on a cBot cluster-generation system using the Illumina HiSeq single-read cluster generation kit according to the protocol from the manufacturer. The pooled libraries were sequenced on an Illumina HiSeq 2000 following instructions for multiplex single-read sequencing and using 100 + 7 cycles. All lanes were spiked with a control library of phiX, yielding about 1% of the sequencing reads per lane. Reads were then mapped with TopHat, with no quality trimming, using either -g 5 or -g 20. The data are publicly available [ERP001948].
Functional analysis of the gene interaction network
The existing global networks of functional coupling, such as FunCoup, PPI networks, the union of KEGG pathways, and so on, are known to be of high quality and relevance when applied to statistically evaluate functional relations between larger gene sets. As the network for the enrichment analysis, we predicted a human network of functional coupling using the FunCoup computational framework at a confidence cutoff for individual links defined as a final Bayesian score >7. This updated version used the latest protein-protein interactions from the IntAct database, the protein expression atlas HPA and sub-cellular localization data from Gene Ontology. In addition, analysis of glioblastoma multiforme (GBM) published by The Cancer Genome Atlas provided data on the methylation status of about 2,000 genes, and the transcription of more than 17,000 genes; the GBM network was constructed by simultaneously profiling 147 individual tumors for genomic changes in 500 genes. This dataset provided an opportunity to reconstruct a cancer-specific network that considers the three molecular mechanisms. Using partial correlation analysis, we obtained a compact and highly specific GBM network of causative relations between somatic mutations, methylation, and transcription (22,990 links between 15,197 gene symbols; manuscript in preparation). The FunCoup network was then merged with the GBM network and 79,539 curated links between 5,763 genes from the KEGG and CORUM databases. In total, the union contained 889,654 unique links between 18,904 HUGO gene symbols.
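Taking the union of several networks and counting unique links, as described above, can be sketched as follows. The function names and the toy edge lists are hypothetical and stand in for the FunCoup, GBM and curated networks; the only assumption is that links are undirected, so that A-B and B-A count once.

```python
def canonical(a, b):
    """Order the two gene symbols so an undirected link has one representation."""
    return (a, b) if a <= b else (b, a)


def merge_networks(*edge_lists):
    """Union of undirected edge lists, dropping duplicates and self-links."""
    merged = set()
    for edges in edge_lists:
        for a, b in edges:
            if a != b:
                merged.add(canonical(a, b))
    return merged


# Toy edge lists with placeholder gene symbols (not real data).
funcoup_like = [("A", "B"), ("B", "C")]
gbm_like = [("B", "A"), ("C", "D")]       # ("B", "A") duplicates ("A", "B")
curated_like = [("D", "D"), ("A", "C")]   # the self-link is ignored

union = merge_networks(funcoup_like, gbm_like, curated_like)
genes = {gene for edge in union for gene in edge}
```

Storing each link in a canonical order is what makes the final count a count of unique links rather than of database records.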
Functional gene groups for network analysis
To characterize altered gene sets by involvement in known biological processes, we compiled a list of gene membership in pathways and other gene groups of importance in the cancer context: 1) all 235 pathways presented in the KEGG database (as of 21 April 2010), including 9 cancer pathways; 2) 15 Gene Ontology terms that could be related to hallmarks of cancer; 3) 13 cancer-related pathways from publications reporting on large-scale cancer genome projects; 4) gene sets of epithelial-mesenchymal transition (courtesy of S Souchelnytskyi) and tumor-specific pH-shift (courtesy of A de Milito). The list thus included 5,698 distinct HUGO gene symbols assigned to 260 gene groups (multiple membership allowed).
Network enrichment analysis
Enrichment of network connectivity between two gene sets i and j (for example, an altered gene set and a functional gene set) was quantified with the z-score

$$Z = \frac{n_{ij} - \hat{n}_{ij}}{\sigma_{ij}}$$

where $n_{ij}$ is the total number of links between any genes of i and any genes of j found in the given network, $\hat{n}_{ij}$ is the number of such links expected by chance, and $\sigma_{ij}$ is its standard deviation. In biological networks, the distribution of node degree (number of connections per gene node) follows the power law, that is, is very uneven: many nodes have few links, while few nodes have many links. Thus, the expected (mean) number $\hat{n}_{ij}$ and standard deviation $\sigma_{ij}$ estimates are strongly influenced by node degree compositions in particular gene sets. To make the analysis unbiased, we applied the network randomization procedure proposed by Maslov and Sneppen. While systematically re-wiring network nodes, that is, randomly swapping edges between two nodes at a time, the procedure preserved node degrees and the total number of edges in the network. The expected mean $\hat{n}_{ij}$ (counted in the same way as the value of $n_{ij}$) and standard deviation $\sigma_{ij}$ were learned after a sufficient number (50) of random network permutations. The default statistic counted the direct links. An alternative statistic counted links indirectly, via a shared network neighbor, that is, if there was a third gene linked to both genes in question. Under the true null, that is, in the absence of any functional linkages between gene groups, the z-scores are expected to be normally distributed; hence, Z could be converted to P-values by a standard procedure. For both direct and indirect links in each analysis, we evaluated relevant false discovery rates by looking at the left tail of the z-score distribution (that is, the depletion side) where no significant findings were expected and, alternatively, by permutation tests on random gene sets of matching size and topological properties.
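The randomization scheme described above can be sketched in a few lines. This is an illustrative reading of the text, not the authors' implementation: all function names are hypothetical, the z-score is assumed to be (observed minus mean) divided by the standard deviation over the randomized networks, the number of swaps per randomization is an arbitrary choice, and the indirect statistic is assumed to count pairs of genes that share a neighbor.

```python
import random
import statistics
from collections import defaultdict


def degrees(edges):
    """Node degree (number of links per gene) for an undirected edge list."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def count_direct(edges, set_i, set_j):
    """Links with one endpoint in set_i and the other in set_j."""
    return sum(1 for a, b in edges
               if (a in set_i and b in set_j) or (a in set_j and b in set_i))


def count_indirect(edges, set_i, set_j):
    """Pairs (u in set_i, v in set_j, u != v) joined via a shared neighbor."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    total = 0
    for linked in neighbors.values():
        in_i, in_j = linked & set_i, linked & set_j
        total += len(in_i) * len(in_j) - len(in_i & in_j)
    return total


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization: swap partners of two edges at a time."""
    edge_list = [tuple(sorted(e)) for e in edges]
    present = set(edge_list)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        x, y = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[x], edge_list[y]
        if rng.random() < 0.5:            # pick one of the two swap orientations
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that create self-links or duplicate existing links.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edge_list[x], edge_list[y]}
        present |= {new1, new2}
        edge_list[x], edge_list[y] = new1, new2
        done += 1
    return edge_list


def nea_z(edges, set_i, set_j, counter=count_direct, n_random=50, seed=0):
    """Observed count, null mean, null SD and z-score over randomized networks."""
    rng = random.Random(seed)
    observed = counter(edges, set_i, set_j)
    current, null = list(edges), []
    for _ in range(n_random):
        current = rewire(current, n_swaps=10 * len(current), rng=rng)
        null.append(counter(current, set_i, set_j))
    mu, sigma = statistics.mean(null), statistics.stdev(null)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, mu, sigma, z


# Toy network of 12 placeholder genes (not real data): a ring plus chords.
toy = [(f"g{i}", f"g{(i + 1) % 12}") for i in range(12)]
toy += [(f"g{i}", f"g{(i + 3) % 12}") for i in range(12)]
set_i, set_j = {"g0", "g1", "g2"}, {"g3", "g4", "g5"}
shuffled = rewire(toy, n_swaps=200, rng=random.Random(1))
observed, mu, sigma, z = nea_z(toy, set_i, set_j)
```

Passing `counter=count_indirect` to `nea_z` gives the shared-neighbor variant under the same null model. Because every swap exchanges partners between two existing edges, each gene keeps its degree, which is the property that removes the bias from highly connected genes.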
Each gene carrying a potentially damaging single nucleotide variant (SNV) was individually tested for functional relatedness to the rest of the genes with potentially damaging SNVs from the same somatic genome. Formally, we tested for violation of the null hypothesis that stated 'the individual gene is not enriched in connections with somatically mutated genes from the same line' using two different statistics (direct and indirect links); we performed 324 tests in total (2 × (57 + 51 + 54)).
Gene set enrichment analysis
GSEA was performed on fixed-size AGS against the same FGS as described for NEA using the hypergeometric test, also known as the odds ratio test. The z-scores were converted to P-values and adjusted for multiple testing with an R function using the Benjamini and Hochberg method.
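The two statistical steps named above, an overlap test based on the hypergeometric distribution and Benjamini-Hochberg adjustment, can be sketched in plain Python. The study used an R function for the adjustment; the functions below are hypothetical stand-ins that follow the textbook definitions, and the example numbers are made up.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k): chance of at least k shared genes when `draws` genes are
    sampled from `population`, of which `successes` belong to the FGS."""
    upper = min(successes, draws)
    favorable = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    )
    return favorable / comb(population, draws)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up example: 10 genes in total, 3 in the FGS, AGS of size 4, 2 shared.
p_overlap = hypergeom_upper_tail(2, population=10, successes=3, draws=4)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

With only two or three shared genes the upper tail is driven by very few terms, which is one way to see why such overlaps are fragile.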
Results and discussions
Genes affected by structural variations and their functional implications
Analysis of potential downstream effects of point mutations in all cell lines
SNVs were detected within coding genes (Additional file 1). We first investigated effects of splice site SNVs on transcriptomes of the three cell lines. An in-house software package was used to evaluate the effects of splice site SNVs on transcript structures (Additional file 1). Approximately 2,500 SNVs were found that may potentially affect splicing in each cell line; after applying several filters, around a dozen were identified as being potentially damaging and only two of these were validated by reference to mRNA data (Table S3 in Additional file 1). APIP was found to undergo alternative splicing in U251MG, probably due to a homozygous splice site SNV (chr11: 34905054_G/C) at the upstream splice site of exon 6 (Figure S2a in Additional file 1). This mutation causes the sixth exon to be skipped without shifting the reading frame. Aberrant transcription of the proto-oncogene FES was detected in U2OS cells (Figure S2b in Additional file 1): the transcript is missing the first 15 exons (which contain the regulatory region of its protein activity), leaving only 4 expressed exons. FES without its regulatory part has also been observed in lymphoma and lymphoid leukemia cell lines, and appears to be produced from the same transcript as we found in the U2OS osteosarcoma line in this work. FES expression has been found to correlate with tumor growth and metastasis, and it is likely that the short transcript variant observed in U2OS was involved in carcinogenesis.
We also assessed allelic imbalances in the expressed genes by comparing individual SNV frequencies at the DNA and RNA levels (Additional file 1). Genes carrying SNVs that were heterozygous at the DNA level but homozygous in RNA transcripts were considered allelically imbalanced. We detected 17, 6 and 10 such genes in A431, U251MG and U2OS, respectively (Table S4 in Additional file 1), and only one of them (NDN) is imprinted. In A431, several transcription factor genes as well as HDAC8, SMARCA1 and BCLAF1 were expressed from only one allele. MAP2K3 was allelically imbalanced in both the U2OS and U251MG cell lines.
We then looked at the non-synonymous SNVs in these genomes. In order to enrich those involved in tumor maintenance, we applied filters based on their heterogeneity and common polymorphisms (Additional file 1). We then predicted their protein-level effects using PolyPhen to filter out those with no obvious potential to cause a functional change in the protein. This left us with 57, 54 and 51 genes carrying SNVs that were likely to be damaging to protein function in A431, U251MG and U2OS, respectively (Table S5 in Additional file 1).
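The rule used above, heterozygous at the DNA level but homozygous in RNA, can be written as a small predicate. The actual criteria are given in Additional file 1 and are not reproduced here; the function name and both thresholds below are assumptions chosen only to make the rule concrete.

```python
def is_allelically_imbalanced(dna_alt_fraction, rna_alt_fraction,
                              het_range=(0.25, 0.75), hom_cutoff=0.95):
    """Return True for an SNV that looks heterozygous in DNA reads but is
    expressed (almost) exclusively from one allele in RNA reads.

    The thresholds are illustrative assumptions, not the values of the study.
    """
    low, high = het_range
    heterozygous_in_dna = low <= dna_alt_fraction <= high
    # Either the variant allele or the reference allele dominates the RNA reads.
    homozygous_in_rna = (rna_alt_fraction >= hom_cutoff
                         or rna_alt_fraction <= 1 - hom_cutoff)
    return heterozygous_in_dna and homozygous_in_rna


# Made-up allele fractions (variant reads / total reads) for four SNVs.
calls = {
    "variant_allele_only": is_allelically_imbalanced(0.50, 0.98),
    "reference_allele_only": is_allelically_imbalanced(0.40, 0.01),
    "balanced": is_allelically_imbalanced(0.50, 0.55),
    "homozygous_in_dna": is_allelically_imbalanced(1.00, 1.00),
}
```

A gene would then be flagged when at least one of its SNVs satisfies the predicate, subject to whatever read-depth requirements the real analysis imposed.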
Cancer state is likely to be the result of a set of functional mutations in key genes that perturb relevant gene networks at multiple points [9, 39]. To identify such cooperative actions of mutations, we used NEA aiming to find the most likely key genes for each cell line, that is, the impaired genes that contributed to the onset and/or maintenance of the rapid proliferation state. To this end, we evaluated network connections between the genes impaired via SNVs within each cell line. In the A431 cell line, 8 of 57 potentially impaired genes were strongly connected to other genes within the same set; the corresponding numbers for the U251MG and U2OS lines were 12 and 7, respectively (false discovery rate (FDR) <0.10; Table S6 in Additional file 1). One example is PKMYT1, a gene that carries a heterozygous SNV that is predicted to be damaging (NP_004194_E179G, PolyPhen FDR = 0) in U2OS cells. This mutation is at a conserved residue within the catalytic domain of the protein . NEA indicated that this mutation was only directly linked to one other damaging somatic mutation in U2OS - a mutation in carbamoyl phosphate synthetase II (CAD). However, analysis of indirect links (that is, those via shared neighbors) revealed significant relationships between PKMYT1 and the rest of the U2OS somatic mutation set (790 links compared to 406.4 expected by chance, NEA z-score = 19.21). Again, the majority of such links (Figure S3 in Additional file 1) led to CAD through BMP2K and CDK2 (502 links), nuclear protein NUP93 (72 links), the WD repeat and HMG-box DNA binding protein WDHD1 (54 links), and the DNA primase PRIM2 (53 links). Collective actions of these heavily connected impaired genes could produce alterations in associated pathways such as cell cycle regulation [41, 42].
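As a rough consistency check on the PKMYT1 figures quoted above, and assuming the NEA z-score has the usual form of observed minus expected links divided by the standard deviation over randomized networks, the implied standard deviation can be back-calculated. This value is not reported in the text; it is only what the three quoted numbers imply under that assumption.

```latex
\sigma \approx \frac{n_{\mathrm{obs}} - n_{\mathrm{exp}}}{Z}
       = \frac{790 - 406.4}{19.21}
       \approx 19.97
```

In other words, the observed count of indirect links exceeds the randomized expectation by roughly 19 standard deviations of about 20 links each.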
Context-dependent meta-analysis of impaired genes in the three cell lines
We then looked at the overlap with and interactions between our affected gene sets and a comprehensive list of cancer-related genes generated by Ding et al. (referred to as the Ding-set). SNV-impaired genes in U2OS and U251MG were significantly enriched in terms of NEA with the Ding-set but those from A431 were not. All lists manifested some enrichment against KEGG cancer pathways, but only the U251MG cell line was strongly associated with these pathways. The other two only had significant z-scores against small and non-small cell lung cancers as well as prostate and bladder cancer, whereas U251MG was enriched with respect to all of these and ten other cancer pathways. As a final test of whether a copy number alteration (CNA) acts as a driver mutation, we present a context-specific analysis: a NEA of individual CNAs versus the filtered SNV gene sets of the same cell line (Table S7 in Additional file 1). This analysis is analogous to the 'SNV gene versus SNV gene set' analysis described above (Table S6 in Additional file 1). Figure S8 in Additional file 1 shows the case of a specific SNV-impaired gene in U251MG, MCM3, which interacts with several genes in cancer pathways as well as with other SNV-impaired genes in the same cell line.
In this study, we performed whole-genome and mRNA sequencing and analyses for three tumor cell lines. The expression and proteome profiles of these cell lines have already been investigated and fair correlations were shown between RNA expression and protein levels. We here incorporated whole-genome data such as gene copy number and DNA variation profiles of these cell lines to perform an integrative analysis and discover impaired genes and pathways. Genes with elevated copy numbers were identified in all three of the cell lines considered, giving more than 3,000 genes with copy number changes. The expression levels of each such gene and the abundance of their corresponding proteins were then used to identify genes that were likely to contribute to the maintenance of the cancer state. This analysis narrowed the list of affected genes from thousands to a few hundred per cell line, demonstrating the utility of using DNA variation together with expression data. The cell lines used in this work have different origins, so our cross-correlation analysis based on the assessment of copy number-dependent expression could potentially generate false negatives or positives due to some genes being differently regulated in the different cell lines. However, we assume that while these cell lines may retain some aspects of their original identities, the extent of cell-specific changes in the expression of genes in common pathways such as cell cycle regulation, DNA replication or apoptosis has much less impact than those induced by copy number changes.
While the reduction in the number of candidate genes achieved by applying the first filter was substantial, it was not sufficient by itself because the list still contained many passengers. To address this issue, we assumed that 1) cancer is more likely to be maintained by a set of interrelated mutations that alter cellular processes at multiple points than by the effects of a single mutation, and 2) the proliferative benefit conferred by an alteration can depend on already existing mutations or structural variations. We therefore focused on CNA genes that exhibited functional links to genes impaired by SNVs in the same cell line. In conjunction with the first filtering step based on the expression correlations with copy number changes, this second filter afforded significant improvements, reducing the number of putative genes contributing to the rapid proliferative state to a few dozen genes per cell line, all of which exhibited enriched connectivity to major signaling, cell division and cancer-specific gene sets. Despite the low overlap between the altered gene sets for each cell line, the network analysis demonstrated that their cancer-related functionality was cooperative, which we detected at both the pathway and global-network level.
Traditionally, novel experimentally determined AGSs are characterized by significance of overlap (amount of shared genes) with known functional gene sets. This method is generally called gene set enrichment analysis. To illustrate superiority of our NEA, we directly compare analyses from GSEA and NEA in Figure S9 in Additional file 1. Only four of all 420 analyzed AGS-FGS pairs showed a significant GSEA overlap (each case was based on two shared genes) when NEA did not detect enrichment. The number for the opposite case (NEA+, GSEA-) was 75, and 18 pairs were detected by both methods. In addition, grounding a GSEA result on two or three genes would not be robust, whereas NEA results are usually based on tens or hundreds of network links. Of note, these comparisons were only possible on AGS as sets of multiple genes, while single gene analysis against FGS is a unique feature of NEA.
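The counts quoted above can be arranged as a two-by-two agreement table. The sketch below only does arithmetic on the four numbers given in the text (420 pairs in total, 4 GSEA-only, 75 NEA-only, 18 detected by both); the variable names are hypothetical and the number of pairs detected by neither method is derived, not reported.

```python
TOTAL_PAIRS = 420   # all analyzed AGS-FGS pairs
gsea_only = 4       # GSEA significant, NEA not
nea_only = 75       # NEA significant, GSEA not
both = 18           # significant by both methods

# Derived quantities (not stated explicitly in the text).
neither = TOTAL_PAIRS - (gsea_only + nea_only + both)
nea_detected = nea_only + both
gsea_detected = gsea_only + both

agreement_table = {
    ("NEA+", "GSEA+"): both,
    ("NEA+", "GSEA-"): nea_only,
    ("NEA-", "GSEA+"): gsea_only,
    ("NEA-", "GSEA-"): neither,
}

nea_fraction = nea_detected / TOTAL_PAIRS
gsea_fraction = gsea_detected / TOTAL_PAIRS
```

Under these numbers NEA flags 93 of the 420 pairs and GSEA flags 22, which is the sensitivity difference the paragraph refers to.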
Cancer cells modulate their metabolism to switch from mitochondrial to glycolytic metabolism despite the presence of sufficient oxygen levels to support the former; this is known as the Warburg effect. In A431 cells, lactate dehydrogenase (LDHA) levels are elevated (RPKM of 751, no gain or loss), which suggests heavy use of glycolytic metabolism. The gene PPARGC1A, expressed strongly in normal tissues with high-energy demands, including cardiac tissue, brown fat, and the central nervous system [46–48], is heavily amplified in these cells. It is a master co-activator for mitochondrial biogenesis, which might suggest utilization of oxidative phosphorylation by A431 cells. The functional implications of this amplification are currently being assessed.
We also detected several allelically imbalanced genes, and most of these genes did not have any copy number changes and/or damaging SNVs. One special case was necdin (NDN), a gene that is typically maternally imprinted and is only expressed in the brain and placenta. NDN is highly expressed in the U2OS cell line but not in U251MG or A431. Previous comparisons of H3K36me3 gene expression patterns between osteoblasts and U2OS suggested that NDN is not expressed in osteoblasts. Maheswaran et al. showed that overexpression of TP53 causes rapid apoptotic cell death in U2OS cells. However, transfection of U2OS cells with necdin together with TP53 inhibited TP53-induced apoptosis. A single functional copy of TP53 is present in U2OS cells. This suggests that U2OS cells may evade apoptosis in vivo due to their constitutive expression of NDN together with reduced expression of TP53.
We also examined splice-site SNVs and detected many that could cause improper splicing. Only a few were supported by RNA sequencing data, which suggests that the splicing mechanism is fairly robust, in keeping with previous findings.
This study demonstrates that the combined analysis of genomic and transcriptomic data can provide a better functional understanding of the mutational landscape of cancer genomes than can be obtained by considering either one of these sources in isolation. The combined analysis of genomic variation and expression datasets enabled us to distinguish between variants contributing to rapid proliferation and those that are passengers. The mutational landscapes of cancers are highly variable, with few shared mutations but numerous private mutations even among similar tumors [54, 55]. Our method could be particularly beneficial in these scenarios since it evaluates each mutated gene within its biological context to reveal impaired functional couplings to cancer-related genes that have themselves not been altered. Moreover, the analyses over global gene and protein networks enabled us to uncover relations between alterations that drive or are driven by expression and those constitutively present in the cell but impaired via damaging mutations. As an example, a very recent study profiled 947 independent cancer cell lines and provided information on the copy numbers and RNA expression profiles of their genes. Applying the combined analysis reported herein to these cell lines could provide valuable insights into their impaired pathways and related anticancer drug sensitivity.
AGS: altered gene set
EGFR: epidermal growth factor receptor
FDR: false discovery rate
FGS: functional gene set
FISH: fluorescence in situ hybridization
GSEA: gene set enrichment analysis
KEGG: Kyoto Encyclopedia of Genes and Genomes
NEA: network enrichment analysis
RPKM: reads per kilobase per million mapped reads
SNV: single nucleotide variant.
We would like to acknowledge Charlotte Stadler for performing protein western blots and Katalin Benedek (Karolinska Institute KIVIF visualization facility) for performing FISH. This work was supported by the Swedish Scientific Council and Swedish Cancer Foundation. The authors would like to acknowledge support from Science for Life Laboratory, the national infrastructure SNISS, and Uppmax for providing assistance in massive parallel sequencing and computational infrastructure.
- Uhlen M, Oksvold P, Fagerberg L, Lundberg E, Jonasson K, Forsberg M, Zwahlen M, Kampf C, Wester K, Hober S, Wernerus H, Bjorling L, Ponten F: Towards a knowledge-based Human Protein Atlas. Nat Biotechnol. 2010, 28: 1248-1250. 10.1038/nbt1210-1248.
- Giard DJ, Aaronson SA, Todaro GJ, Arnstein P, Kersey JH, Dosik H, Parks WP: In vitro cultivation of human tumors: establishment of cell lines derived from a series of solid tumors. J Natl Cancer Inst. 1973, 51: 1417-1423.
- Westermark B, Ponten J, Hugosson R: Determinants for the establishment of permanent tissue culture lines from human gliomas. Acta Pathol Microbiol Scand A. 1973, 81: 791-805.
- Ponten J, Saksela E: Two established in vitro cell lines from human mesenchymal tumours. Int J Cancer. 1967, 2: 434-447. 10.1002/ijc.2910020505.
- Ottaviani G, Jaffe N: The epidemiology of osteosarcoma. Cancer Treat Res. 2009, 152: 3-13. 10.1007/978-1-4419-0284-9_1.
- Ideker T, Sharan R: Protein networks in disease. Genome Res. 2008, 18: 644-652. 10.1101/gr.071852.107.
- Alexeyenko A, Lee W, Pernemalm M, Guegan J, Dessen P, Lazar V, Lehtio J, Pawitan Y: Network enrichment analysis: extension of gene-set enrichment analysis to gene networks. BMC Bioinformatics. 2012, 13: 226. 10.1186/1471-2105-13-226.
- Reynolds CA, Hong MG, Eriksson UK, Blennow K, Wiklund F, Johansson B, Malmberg B, Berg S, Alexeyenko A, Gronberg H, Gatz M, Pedersen NL, Prince JA: Analysis of lipid pathway genes indicates association of sequence variation near SREBF1/TOM1L2/ATPAF2 with dementia risk. Hum Mol Genet. 2010, 19: 2068-2078. 10.1093/hmg/ddq079.
- Hanahan D, Weinberg RA: Hallmarks of cancer: the next generation. Cell. 2011, 144: 646-674. 10.1016/j.cell.2011.02.013.
- Dutta B, Pusztai L, Qi Y, Andre F, Lazar V, Bianchini G, Ueno N, Agarwal R, Wang B, Shiang CY, Hortobagyi GN, Mills GB, Symmans WF, Balazsi G: A network-based, integrative study to identify core biological pathways that drive breast cancer clinical subtypes. Br J Cancer. 2012, 106: 1107-1116. 10.1038/bjc.2011.584.
- Alexeyenko A, Lee W, Pernemalm M, Guegan J, Dessen P, Lazar V, Lehtiö J, Pawitan Y: Network enrichment analysis: extension of gene-set enrichment analysis to gene networks. BMC Bioinformatics. 2012, 13: 226. 10.1186/1471-2105-13-226.
- De Smet R, Marchal K: Advantages and limitations of current network inference methods. Nat Rev Microbiol. 2010, 8: 717-729.
- Alexeyenko A, Sonnhammer EL: Global networks of functional coupling in eukaryotes from comprehensive data integration. Genome Res. 2009, 19: 1107-1116. 10.1101/gr.087528.108.
- Nibbe RK, Koyuturk M, Chance MR: An integrative -omics approach to identify functional sub-networks in human colorectal cancer. PLoS Comput Biol. 2010, 6: e1000639. 10.1371/journal.pcbi.1000639.
- Alexeyenko A, Wassenberg DM, Lobenhofer EK, Yen J, Linney E, Sonnhammer EL, Meyer JN: Dynamic zebrafish interactome reveals transcriptional mechanisms of dioxin toxicity. PLoS One. 2010, 5: e10465. 10.1371/journal.pone.0010465.
- WGS data. [http://www.ebi.ac.uk/ena/data/view/ERP001947]
- Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009, 25: 1754-1760. 10.1093/bioinformatics/btp324.
- Stranneheim H, Werne B, Sherwood E, Lundeberg J: Scalable transcriptome preparation for massive parallel sequencing. PLoS One. 2011, 6: e21910. 10.1371/journal.pone.0021910.
- Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009, 10: R25. 10.1186/gb-2009-10-3-r25.
- RNA-Seq data. [http://www.ebi.ac.uk/ena/data/view/ERP001948]
- Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008, 455: 1061-1068. 10.1038/nature07385.
- Reverter A, Chan EK: Combining partial correlation and an information theory approach to the reversed engineering of gene co-expression networks. Bioinformatics. 2008, 24: 2491-2497. 10.1093/bioinformatics/btn482.
- Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M: KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2011.
- Ruepp A, Brauner B, Dunger-Kaltenbach I, Frishman G, Montrone C, Stransky M, Waegele B, Schmidt T, Doudieu ON, Stumpflen V, Mewes HW: CORUM: the comprehensive resource of mammalian protein complexes. Nucleic Acids Res. 2008, 36: D646-650.
- Maslov S, Sneppen K: Specificity and stability in topology of protein networks. Science. 2002, 296: 910-913. 10.1126/science.1065103.
- Bland JM, Altman DG: Statistics notes. The odds ratio. BMJ. 2000, 320: 1468. 10.1136/bmj.320.7247.1468.
- AVADIS: Data analysis was performed using Avadis® NGS software, Version 1.2.2, Build 146913. © Strand Scientific Intelligence, Inc., San Francisco, CA, USA. Avadis is a registered trademark of Strand Life Sciences Pvt. Ltd.
- Korbel JO, Abyzov A, Mu XJ, Carriero N, Cayting P, Zhang Z, Snyder M, Gerstein MB: PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. Genome biology. 2009, 10: R23-10.1186/gb-2009-10-2-r23.View ArticlePubMed CentralPubMedGoogle Scholar
- Boeva V, Zinovyev A, Bleakley K, Vert JP, Janoueix-Lerosey I, Delattre O, Barillot E: Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization. Bioinformatics. 2011, 27: 268-269. 10.1093/bioinformatics/btq635.View ArticlePubMed CentralPubMedGoogle Scholar
- Gebow D, Miselis N, Liber HL: Homologous and nonhomologous recombination resulting in deletion: effects of p53 status, microhomology, and repetitive DNA length and orientation. Molecular and cellular biology. 2000, 20: 4028-4035. 10.1128/MCB.20.11.4028-4035.2000.View ArticlePubMed CentralPubMedGoogle Scholar
- Ritchie K, Seah C, Moulin J, Isaac C, Dick F, Berube NG: Loss of ATRX leads to chromosome cohesion and congression defects. The Journal of cell biology. 2008, 180: 315-324. 10.1083/jcb.200706083.View ArticlePubMed CentralPubMedGoogle Scholar
- Lundberg E, Fagerberg L, Klevebring D, Matic I, Geiger T, Cox J, Algenas C, Lundeberg J, Mann M, Uhlen M: Defining the transcriptome and proteome in three functionally different human cell lines. Mol Syst Biol. 2010, 6: 450-View ArticlePubMed CentralPubMedGoogle Scholar
- Schwanhausser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M: Global quantification of mammalian gene expression control. Nature. 2011, 473: 337-342. 10.1038/nature10098.View ArticlePubMedGoogle Scholar
- Lin J, Handschin C, Spiegelman BM: Metabolic control through the PGC-1 family of transcription coactivators. Cell Metab. 2005, 1: 361-370. 10.1016/j.cmet.2005.05.004.View ArticlePubMedGoogle Scholar
- Jucker M, Roebroek AJ, Mautner J, Koch K, Eick D, Diehl V, Van de Ven WJ, Tesch H: Expression of truncated transcripts of the proto-oncogene c-fps/fes in human lymphoma and lymphoid leukemia cell lines. Oncogene. 1992, 7: 943-952.PubMedGoogle Scholar
- Zhang S, Chitu V, Stanley ER, Elliott BE, Greer PA: Fes tyrosine kinase expression in the tumor niche correlates with enhanced tumor growth, angiogenesis, circulating tumor cells, metastasis, and infiltrating macrophages. Cancer research. 2011, 71: 1465-1473. 10.1158/0008-5472.CAN-10-3757.View ArticlePubMed CentralPubMedGoogle Scholar
- Geneimprint. [http://www.geneimprint.com]
- Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR: A method and server for predicting damaging missense mutations. Nat Methods. 2010, 7: 248-249. 10.1038/nmeth0410-248.View ArticlePubMed CentralPubMedGoogle Scholar
- McMurray HR, Sampson ER, Compitello G, Kinsey C, Newman L, Smith B, Chen SR, Klebanov L, Salzman P, Yakovlev A, Land H: Synergistic response to oncogenic mutations defines gene class critical to cancer phenotype. Nature. 2008, 453: 1112-1116. 10.1038/nature06973.View ArticlePubMed CentralPubMedGoogle Scholar
- Liu F, Stanton JJ, Wu Z, Piwnica-Worms H: The human Myt1 kinase preferentially phosphorylates Cdc2 on threonine 14 and localizes to the endoplasmic reticulum and Golgi complex. Molecular and cellular biology. 1997, 17: 571-583.View ArticlePubMed CentralPubMedGoogle Scholar
- Booher RN, Holman PS, Fattaey A: Human Myt1 is a cell cycle-regulated kinase that inhibits Cdc2 but not Cdk2 activity. J Biol Chem. 1997, 272: 22300-22306. 10.1074/jbc.272.35.22300.View ArticlePubMedGoogle Scholar
- Sigoillot FD, Berkowski JA, Sigoillot SM, Kotsis DH, Guy HI: Cell cycle-dependent regulation of pyrimidine biosynthesis. J Biol Chem. 2003, 278: 3403-3409.View ArticlePubMedGoogle Scholar
- Stratton MR, Campbell PJ, Futreal PA: The cancer genome. Nature. 2009, 458: 719-724. 10.1038/nature07943.View ArticlePubMed CentralPubMedGoogle Scholar
- Ding L, Getz G, Wheeler DA, Mardis ER, McLellan MD, Cibulskis K, Sougnez C, Greulich H, Muzny DM, Morgan MB, Fulton L, Fulton RS, Zhang Q, Wendl MC, Lawrence MS, Larson DE, Chen K, Dooling DJ, Sabo A, Hawes AC, Shen H, Jhangiani SN, Lewis LR, Hall O, Zhu Y, Mathew T, Ren Y, Yao J, Scherer SE, Clerc K, Metcalf GA, et al: Somatic mutations affect key pathways in lung adenocarcinoma. Nature. 2008, 455: 1069-1075. 10.1038/nature07423.View ArticlePubMed CentralPubMedGoogle Scholar
- Warburg O: On the origin of cancer cells. Science. 1956, 123: 309-314. 10.1126/science.123.3191.309.View ArticlePubMedGoogle Scholar
- Kressler D, Schreiber SN, Knutti D, Kralli A: The PGC-1-related protein PERC is a selective coactivator of estrogen receptor alpha. J Biol Chem. 2002, 277: 13918-13925. 10.1074/jbc.M201134200.View ArticlePubMedGoogle Scholar
- Lin J, Puigserver P, Donovan J, Tarr P, Spiegelman BM: Peroxisome proliferator-activated receptor gamma coactivator 1beta (PGC-1beta), a novel PGC-1-related transcription coactivator associated with host cell factor. J Biol Chem. 2002, 277: 1645-1648. 10.1074/jbc.C100631200.View ArticlePubMedGoogle Scholar
- Puigserver P, Wu Z, Park CW, Graves R, Wright M, Spiegelman BM: A cold-inducible coactivator of nuclear receptors linked to adaptive thermogenesis. Cell. 1998, 92: 829-839. 10.1016/S0092-8674(00)81410-5.View ArticlePubMedGoogle Scholar
- MacDonald HR, Wevrick R: The necdin gene is deleted in Prader-Willi syndrome and is imprinted in human and mouse. Hum Mol Genet. 1997, 6: 1873-1878. 10.1093/hmg/6.11.1873.View ArticlePubMedGoogle Scholar
- Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, Zhang X, Wang L, Issner R, Coyne M, Ku M, Durham T, Kellis M, Bernstein BE: Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011, 473: 43-49. 10.1038/nature09906.View ArticlePubMed CentralPubMedGoogle Scholar
- Maheswaran S, Englert C, Bennett P, Heinrich G, Haber DA: The WT1 gene product stabilizes p53 and inhibits p53-mediated apoptosis. Genes Dev. 1995, 9: 2143-2156. 10.1101/gad.9.17.2143.View ArticlePubMedGoogle Scholar
- Taniura H, Matsumoto K, Yoshikawa K: Physical and functional interactions of neuronal growth suppressor necdin with p53. J Biol Chem. 1999, 274: 16242-16248. 10.1074/jbc.274.23.16242.View ArticlePubMedGoogle Scholar
- Lu ZX, Jiang P, Cai JJ, Xing Y: Context-dependent robustness to 5' splice site polymorphisms in human populations. Hum Mol Genet. 2011, 20: 1084-1096. 10.1093/hmg/ddq553.View ArticlePubMed CentralPubMedGoogle Scholar
- Mardis ER, Ding L, Dooling DJ, Larson DE, McLellan MD, Chen K, Koboldt DC, Fulton RS, Delehaunty KD, McGrath SD, Fulton LA, Locke DP, Magrini VJ, Abbott RM, Vickery TL, Reed JS, Robinson JS, Wylie T, Smith SM, Carmichael L, Eldred JM, Harris CC, Walker J, Peck JB, Du F, Dukes AF, Sanderson GE, Brummett AM, Clark E, McMichael JF, Meyer RJ, et al: Recurring mutations found by sequencing an acute myeloid leukemia genome. N Engl J Med. 2009, 361: 1058-1066. 10.1056/NEJMoa0903840.View ArticlePubMed CentralPubMedGoogle Scholar
- Varela I, Tarpey P, Raine K, Huang D, Ong CK, Stephens P, Davies H, Jones D, Lin ML, Teague J, Bignell G, Butler A, Cho J, Dalgliesh GL, Galappaththige D, Greenman C, Hardy C, Jia M, Latimer C, Lau KW, Marshall J, McLaren S, Menzies A, Mudie L, Stebbings L, Largaespada DA, Wessels LF, Richard S, Kahnoski RJ, Anema J, Tuveson DA, et al: Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma. Nature. 2011, 469: 539-542. 10.1038/nature09639.View ArticlePubMed CentralPubMedGoogle Scholar
- Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehar J, Kryukov GV, Sonkin D, Reddy A, Liu M, Murray L, Berger MF, Monahan JE, Morais P, Meltzer J, Korejwa A, Jane-Valbuena J, Mapa FA, Thibault J, Bric-Furlong E, Raman P, Shipway A, Engels IH, Cheng J, Yu GK, Yu J, Aspesi P, de Silva M, Jagtap K, et al: The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012, 483: 603-607. 10.1038/nature11003.View ArticlePubMed CentralPubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.