Large hypomethylated blocks as a universal defining epigenetic alteration in human solid tumors
© Irizarry et al.; licensee BioMed Central Ltd. 2014
Received: 12 June 2014
Accepted: 12 August 2014
Published: 26 August 2014
One of the most provocative recent observations in cancer epigenetics is the discovery of large hypomethylated blocks, including single copy genes, in colorectal cancer, that correspond in location to heterochromatic LOCKs (large organized chromatin lysine-modifications) and LADs (lamin-associated domains).
Here we performed a comprehensive genome-scale analysis of 10 breast, 28 colon, nine lung, 38 thyroid, and 18 pancreas cancers and five pancreas neuroendocrine tumors, together with matched normal tissue from most of these cases and 51 premalignant lesions. We used a new statistical approach that allows the identification of large hypomethylated blocks on the Illumina HumanMethylation450 BeadChip platform.
We find that hypomethylated blocks are a universal feature of common solid human cancer, and that they occur at the earliest stage of premalignant tumors and progress through clinical stages of thyroid and colon cancer development. We also find that the disrupted CpG islands widely reported previously, including hypermethylated island bodies and hypomethylated shores, are enriched in hypomethylated blocks, with flattening of the methylation signal within and flanking the islands. Finally, we found that genes showing higher between-individual gene expression variability are enriched within these hypomethylated blocks.
Thus hypomethylated blocks appear to be a universal defining epigenetic alteration in human cancer, at least for common solid tumors.
The original observation of altered DNA methylation in cancer was widespread hypomethylation affecting as many as one-third of single copy genes and arising at the earliest stages. Later studies identified CpG island hypermethylation as well. More recently, large heterochromatin regions termed LOCKs were found to become euchromatic in cancer cell lines and partially methylated domains in embryonic stem cell lines. Recent whole genome bisulfite sequencing studies of human colorectal cancer showed that hypomethylation affects large genomic regions corresponding to chromatin regions (LOCKs) and nuclear organization (LADs), accounting for >95% of the DNA methylation change in cancer. This manifests itself across samples as an erosion of the normal methylation profile (hence an increase in local, sequence-related variation). Other work has identified similar hypomethylated blocks in breast cancer cell lines, and found direct correlation to chromatin modifications in the same population. More recent work has even identified these blocks in medulloblastomas without obvious genetic drivers, underscoring the importance of this type of epigenetic change in cancer. Large-scale hypomethylated blocks have also been associated with Epstein-Barr virus-induced B-cell immortalization, neuronally expressed genes, epigenetic changes prior to morphological transformation, and age-related drift in the pathogenesis of MDS and AML.
Tissue samples analyzed in this study
The methylation changes within the blocks are progressive over time, showing a greater drift away from the normal profile as the cancer progresses.
DNA was isolated from tissue samples using either the MasterPure DNA Purification Kit (Epicentre) or the DNeasy Blood and Tissue Kit (Qiagen) according to the manufacturer's protocol.
Purity and quantity of DNA were measured using a NanoDrop spectrophotometer.
A total of 500 ng of gDNA was bisulfite treated using the EZ-DNA Gold methylation kit (Zymo Research). The resulting bisulfite-treated DNA was then processed according to the manufacturer's protocol for the Illumina Infinium HumanMethylation450 BeadChip Kit. The data are publicly available from the GEO repository under accession GSE53051; processed data can be browsed at .
Single CpG analysis
Percent of single CpGs that result in a q value <0.05 and effect size >0.10 when comparing cancer to normal samples with a t-test
Collapsed CpGs analysis
For each sample we collapsed measurements from islands, shores, and shelves into one value per region. Specifically, we averaged all the measurements within each of these regions to produce one measure per region. We then grouped open sea probes that were within 500 bp of each other. If one of the resulting regions exceeded 1,500 bp, we broke it up into smaller subsets. Details are available in the code of the cpgCollapse function in minfi. This resulted in 223,497 collapsed regions: 26,571 CGIs, 47,344 CGI shores, 35,725 shelves, and 113,857 open sea regions. We then computed differences, standard deviations, and t-tests in the same way as for the single CpG analyses. R code for the analysis is available upon request.
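The open sea grouping described above can be sketched as follows. This is a minimal illustration in Python, not the minfi cpgCollapse implementation: the function names are hypothetical, the 500 bp and 1,500 bp limits are treated as inclusive, and the greedy rule for splitting an over-long region is an assumption, since the text does not say how the subsets are formed.

```python
from statistics import mean

def cluster_positions(positions, max_gap=500, max_width=1500):
    """Greedily group probe positions (one chromosome) into clusters.

    A probe joins the current cluster if it lies within max_gap bp of the
    previous probe and the cluster would span at most max_width bp.
    Both limits are assumed to be inclusive.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters:
            current = clusters[-1]
            within_gap = pos - current[-1] <= max_gap
            within_width = pos - current[0] <= max_width
            if within_gap and within_width:
                current.append(pos)
                continue
        clusters.append([pos])
    return clusters

def collapse(beta_by_position, max_gap=500, max_width=1500):
    """Return (start, end, mean beta) for each collapsed region."""
    clusters = cluster_positions(beta_by_position, max_gap, max_width)
    return [(c[0], c[-1], mean(beta_by_position[p] for p in c)) for c in clusters]

# Hypothetical open sea probes: position (bp) -> methylation value
example = {100: 0.8, 400: 0.7, 900: 0.6, 1700: 0.5, 5000: 0.2}
regions = collapse(example)
```

Each collapsed region then carries a single averaged value per sample, which is the input to the block-finding step.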
Block intersection P values
Large blocks of aberrant methylation identified in normal versus hyperplastic, adenoma, or cancer samples
Columns: Total Mb inside blocks | Intersection with colon blocks (%) | 25% length quartile (Mb) | 50% length quartile (Mb) | 75% length quartile (Mb) | Median diff value
Gene expression hyper-variability analysis
We obtained fRMA-normalized Affymetrix HGU133plus2 gene expression data for colon, breast, lung, pancreas, and thyroid tumors (curation and preprocessing of these data were previously described). We calculated the log ratio of observed to expected variability as described in Alemu et al. This method, which fits a local-likelihood regression to estimate expected variability as a function of each gene's mean expression level, was shown to better control for the variability of lowly expressed genes than the commonly used coefficient of variation. To calculate enrichment in hypomethylation domains we only considered probesets of genes with transcription start sites within the collapsed 450k regions (described above), since these are the genomic regions covered by the 450k array within which blocks can be detected.
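The idea of an observed-to-expected variability ratio can be illustrated with a short sketch. Note the assumptions: this Python example replaces the local-likelihood regression of Alemu et al. with a much cruder running median over genes with neighboring mean expression, uses the standard deviation as the variability measure, and treats a gene as hyper-variable when its observed variability is at least twice the expected value. All function names and the window size are hypothetical.

```python
import math
from statistics import mean, median, stdev

def log_variability_ratio(expr, window=2):
    """expr maps gene -> list of expression values across tumor samples.

    Expected variability for a gene is approximated by the median SD of
    the genes closest to it in mean expression (a stand-in for a local
    regression fit). Genes with zero variance are dropped.
    """
    stats = sorted(
        (mean(values), stdev(values), gene)
        for gene, values in expr.items()
        if len(values) > 1 and stdev(values) > 0
    )
    ratios = {}
    for i, (_, sd, gene) in enumerate(stats):
        lo = max(0, i - window)
        hi = min(len(stats), i + window + 1)
        expected = median(s for _, s, _ in stats[lo:hi])
        ratios[gene] = math.log2(sd / expected)
    return ratios

def is_hypervariable(log_ratio):
    """Observed variability at least twice the expected (assumed cutoff)."""
    return log_ratio >= 1.0

# Hypothetical expression values for five genes in three tumors
expr = {
    "g1": [0, 1, 2],
    "g2": [1, 2, 3],
    "g3": [-1, 3, 7],  # same mean as a neighbor-like gene, larger spread
    "g4": [3, 4, 5],
    "g5": [4, 5, 6],
}
ratios = log_variability_ratio(expr)
```

Conditioning on mean expression in this way is what distinguishes the measure from a plain coefficient of variation.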
Profiles around CpG islands
At the single CpG level many cancer to normal differences are far from CpG islands
We used the Illumina HumanMethylation450 BeadChip methylation array to profile methylation in 10 breast, 28 colon, nine lung, 38 thyroid, and 18 pancreas cancers and five pancreas neuroendocrine tumors (PNETs), as well as matched normal tissue from most of these cases and 51 premalignant lesions (Table 1). We stratified the 485,512 probes included in the array into CpG islands, CpG island shores (1 to 2,000 bp from island), CpG island shelves (2,001 to 4,000 bp from island), and CpG open seas (>4,000 bp from island). For each tissue we computed cancer and normal across-individual averages for each probe. We then examined the differences between these pairs and declared a difference statistically and biologically significant when the q value was below 0.05 and the observed difference was above 0.10 or below −0.10. We found that the majority of significant probes were either hypomethylated probes located in open sea sites or hypermethylated CpG island probes (Table 2). For colon, lung, thyroid, and PNET there were more significantly hypomethylated probes than hypermethylated probes, and for pancreas adenocarcinoma the numbers were about the same. For breast there were more significantly hypermethylated probes than hypomethylated probes. In general, the hypomethylated probes were characterized by average methylation of approximately 75% in normal samples that dropped to approximately 60% in cancer samples (Figure 1). In contrast, the hypermethylated CpG island probes were characterized by approximately 10% methylation in normal samples, increasing to approximately 40% in cancer (Figure 1). In both cases the methylation pattern moved from the extremes toward the middle. The probes in CpG island shores are a hybrid of the other two types.
We computed the same summaries for the difference between early neoplastic tissue and normal tissue: specifically, breast ductal carcinoma in situ (DCIS) and normal breast tissue, colon tubular adenoma and normal colon, intraductal papillary mucinous neoplasms (IPMNs) and normal pancreas, and follicular thyroid adenomas and normal thyroid tissue. We observed the same trend of methylation changes in these early neoplasms as in the fully developed cancers (Additional file 1: Figure S1).
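The probe stratification and the significance rule above amount to two small decision functions. The following Python sketch is illustrative only: the function names are hypothetical, and a distance of 0 is assumed to mean that the probe lies inside an island.

```python
def classify_probe(distance_to_island):
    """Assign a probe to a category from its distance (bp) to the nearest island."""
    if distance_to_island == 0:
        return "island"
    if distance_to_island <= 2000:
        return "shore"
    if distance_to_island <= 4000:
        return "shelf"
    return "open sea"

def call_difference(mean_cancer, mean_normal, q_value, q_cut=0.05, effect_cut=0.10):
    """Apply the q < 0.05 and |difference| > 0.10 rule to one probe."""
    diff = mean_cancer - mean_normal
    if q_value < q_cut and diff > effect_cut:
        return "hypermethylated"
    if q_value < q_cut and diff < -effect_cut:
        return "hypomethylated"
    return "not significant"

# Values in the range reported for typical probes (approximately 75% to 60%
# for open sea probes, approximately 10% to 40% for island probes)
open_sea_call = call_difference(0.60, 0.75, q_value=0.01)
island_call = call_difference(0.40, 0.10, q_value=0.01)
```

Requiring both a q value cutoff and a minimum effect size keeps statistically significant but very small shifts out of the counts.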
Hypomethylated blocks are present in six cancer types
To determine whether the observed hypomethylation is related to the large hypomethylated blocks previously identified in colon cancer using whole-genome bisulfite sequencing, we applied a new method that permits the detection of large differentially methylated regions using 450k methylation microarray data (see Methods section). To declare a region a statistically and biologically significant block, we required a q value <0.05 and inclusion of at least five measurements (see Methods). We also excluded the X and Y chromosomes. In the majority of cases, these blocks were hypomethylated regions, with median length on the order of hundreds of kb (Table 3); a full tabulation of identified blocks is included as Additional file 2: Data 1-11. Hypomethylated blocks were observed in each of the six cancer types as well as in the early stage samples (Table 3). Typically, blocks had an average methylation of approximately 75% in all the normal tissues (Figure 2A; solid lines), but in cancer became distinctly hypomethylated (Figure 2A; dotted lines). The difference between cancer and normal samples varied between types, with colon cancer showing the greatest area difference and thyroid showing the least (Figure 2B). The great majority of detected blocks were hypomethylated (83%, 99%, 98%, 99%, and 78% for breast, colon, lung, PNET, and thyroid, respectively), except for pancreas adenocarcinoma, for which 48% were hypermethylated. For each hypomethylated block, we determined whether it intersected with a colon hypomethylated block (at least 5,000 bp in common) and found that these were highly co-localized (Table 3). This co-localization is observed in the top ranked blocks for each tissue type (Figure 2).
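The intersection criterion (at least 5,000 bp in common with a colon block) can be sketched as a simple interval-overlap check. This Python example is an illustration under stated assumptions: blocks are represented as hypothetical (chromosome, start, end) tuples with half-open coordinates, and the 5,000 bp must be shared with a single colon block rather than summed over several.

```python
def overlap_bp(a, b):
    """Number of shared bases between two (chrom, start, end) blocks."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))

def percent_intersecting(blocks, colon_blocks, min_overlap=5000):
    """Percent of blocks sharing at least min_overlap bp with a colon block."""
    if not blocks:
        return 0.0
    hits = sum(
        1 for blk in blocks
        if any(overlap_bp(blk, c) >= min_overlap for c in colon_blocks)
    )
    return 100.0 * hits / len(blocks)

# Hypothetical coordinates, not taken from the study
tissue_blocks = [("chr1", 0, 100000), ("chr1", 200000, 300000), ("chr2", 0, 50000)]
colon_blocks = [("chr1", 90000, 150000), ("chr1", 296000, 400000), ("chr2", 10000, 20000)]
pct = percent_intersecting(tissue_blocks, colon_blocks)
```

For genome-wide block lists, a sorted sweep or an interval tree would avoid the quadratic comparison used here for clarity.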
For colon, lung, and breast hypermethylated islands are enriched inside blocks
Hypermethylated CpG island location relative to blocks
Columns: CGIs in testable area (n) | Testable CGIs significantly (q <0.05; deltaM >0.1) hypermethylated (%) | CGIs in blocks (n) | Hypermethylated CGIs in blocks (%) | Odds ratio of CGI being in block and hypermethylated | P value (Chi-squared test)
Inside blocks, methylation profiles flatten around CpG islands
We divided CpG islands into those inside and those outside hypomethylated blocks. For each cancer type, for distances ranging from 1 bp to 15,000 bp in both genomic directions, we then computed the average methylation value across all islands for normal and cancer samples. We also computed this average for probes within CpG islands. We found that across all examined tissues, the average methylation profile in normal tissue went from methylated outside islands to unmethylated inside islands and back to methylated outside islands (Figure 3; Additional file 4: Figure S3). Outside blocks this pattern remained about the same for cancer samples, but within blocks the island methylation went up while the methylation immediately outside went down, both moving from the extremes toward the middle. The general trend is one of hypermethylation in islands and hypomethylation of the surrounding area (Figure 3; Additional file 4: Figure S3).
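A distance-based profile of this kind can be sketched by pooling probes from all islands and averaging within distance bins. In this Python illustration the 1,000 bp bin width, the use of signed distances (negative upstream, positive downstream, 0 inside the island), and the function name are all assumptions; the text gives only the 15,000 bp range.

```python
from collections import defaultdict
from statistics import mean

def island_profile(records, bin_width=1000, max_distance=15000):
    """Average methylation by signed distance bin around CpG islands.

    records: iterable of (signed_distance_bp, beta) pooled over islands.
    Bin 0 holds probes inside islands; bin +1 covers 1..bin_width bp on
    one side, bin -1 the same range on the other side, and so on.
    """
    bins = defaultdict(list)
    for dist, beta in records:
        if abs(dist) > max_distance:
            continue
        if dist == 0:
            key = 0
        else:
            idx = (abs(dist) - 1) // bin_width + 1
            key = idx if dist > 0 else -idx
        bins[key].append(beta)
    return {k: mean(v) for k, v in sorted(bins.items())}

# Hypothetical pooled probes from normal samples
normal_records = [(-1500, 0.8), (-1200, 0.6), (0, 0.1), (0, 0.3), (500, 0.7), (20000, 0.9)]
profile = island_profile(normal_records)
```

Computing one such profile for normal and one for cancer samples, separately for islands inside and outside blocks, gives the four curves that the comparison above relies on.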
Hyper-variably expressed genes are enriched inside blocks
Gene expression hyper-variability in colon cancer was reported to be enriched in long hypomethylation blocks obtained from whole genome bisulfite sequencing. To establish how consistent this association is across solid tumor types, we performed a similar association test for the five tissues profiled here. We obtained publicly available gene expression microarray data for tumors of each of the five tissues from the Gene Expression Barcode project. Since expression data are not available for normal samples in all tissues on this platform, we defined hyper-variability by calculating the log ratio of observed variability to expected variability (conditioned on mean expression level) across tumor samples for each gene, and then tested the association between hyper-variability (observed variability twice the expected variability) and the gene's TSS being inside a hypomethylation block in each cancer type. We found that hyper-variability is enriched in the hypomethylation blocks in each cancer type (P <0.05) except breast cancer (P = 0.5), where the small number of hypomethylation domains results in a lack of power. We also observed that the odds ratio for hypomethylation domain presence increases along with hyper-variability for all tissues (Additional file 5: Figure S4).
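An association test of this kind reduces to a 2x2 table of genes (hyper-variable or not, TSS inside a block or not). The Python sketch below computes an odds ratio and a Pearson chi-squared P value without continuity correction. The choice of test is an assumption made for illustration, since the text does not name the test used for this comparison, and the counts are invented.

```python
import math

def enrichment(a, b, c, d):
    """Odds ratio and chi-squared test (1 df) for a 2x2 table.

    a: hyper-variable genes with TSS inside a block
    b: hyper-variable genes with TSS outside blocks
    c: other genes with TSS inside a block
    d: other genes with TSS outside blocks
    All row and column totals are assumed to be non-zero, as are b and c.
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-squared distribution with one degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Invented counts for illustration only
odds_ratio, chi2, p_value = enrichment(30, 10, 20, 40)
```

An odds ratio above 1 together with a small P value indicates that hyper-variable genes fall inside blocks more often than expected by chance.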
The blocks occur in early neoplasms
Large blocks of aberrant methylation arise early in carcinogenesis and develop along with cancer
Columns: Total Mb inside blocks | Intersection with colon blocks (%) | 25% length quartile (Mb) | 50% length quartile (Mb) | 75% length quartile (Mb) | Median diff value
Rows include: Minimally invasive carcinoma | Capsular invasive carcinoma | Vascular invasive carcinoma
To summarize and evaluate how average methylation in blocks changes with progression (Figure 4A-B), we calculated a value for each sample using the average methylation level inside all blocks and inside all islands. Each sample then had a single value for blocks and a single value for islands. We performed this analysis for colon (Figure 4C-D) and thyroid (Figure 4E-F), with increasing stages of progression plotted along the x-axis. In both cases the normal samples clustered tightly. However, even the earliest lesions showed marked alterations of large domains, as seen in the later cancers.
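The per-sample summary described above is a pair of averages. A minimal Python sketch, assuming a hypothetical data layout in which each sample maps probe identifiers to methylation values and each probe is labeled as lying in a block, in an island, or in neither:

```python
from statistics import mean

def sample_summaries(beta, region_of_probe):
    """Return {sample: (mean methylation in blocks, mean methylation in islands)}.

    beta: dict sample -> dict probe -> methylation value
    region_of_probe: dict probe -> "block", "island", or None
    Every sample is assumed to have at least one probe of each kind.
    """
    summaries = {}
    for sample, values in beta.items():
        block_vals = [v for p, v in values.items() if region_of_probe.get(p) == "block"]
        island_vals = [v for p, v in values.items() if region_of_probe.get(p) == "island"]
        summaries[sample] = (mean(block_vals), mean(island_vals))
    return summaries

# Hypothetical values for one normal and one tumor sample
region_of_probe = {"p1": "block", "p2": "block", "p3": "island", "p4": None}
beta = {
    "normal1": {"p1": 0.8, "p2": 0.7, "p3": 0.1, "p4": 0.5},
    "tumor1": {"p1": 0.6, "p2": 0.5, "p3": 0.4, "p4": 0.5},
}
summaries = sample_summaries(beta, region_of_probe)
```

Plotting the two values per sample against stage of progression gives the kind of view summarized in the figure panels cited above.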
There are three major results of this study. First, we have found that large hypomethylated blocks in cancer, which we first described in three colorectal cancers, are a universal feature of solid tumors. Blocks were found in all five tumor types and in every cancer within them, and hyper-variably expressed genes are enriched within hypomethylated blocks in all tumor types. Second, the hypomethylated blocks occur early in cancer: all four groups of premalignant lesions also showed the hypomethylated blocks. Thus more than any other mutation, copy number change, or individual methylation change, hypomethylated blocks represent the genetic signature of human solid tumors.
Third, in breast, colon, and lung cancer, altered DNA methylation in CpG islands is enriched in hypomethylated blocks. The hypermethylated islands contained in the blocks do not show hypermethylation per se, but flattening, that is, hypermethylation of the islands and hypomethylation of the shores and shelves that flank them. Note that we may be underestimating the enrichment, for two reasons. First, we may be underestimating the genomic coverage of the blocks, both because of the statistically conservative threshold we use for defining them and because the array does not cover the entire genome (approximately two-thirds of the genome). Second, the effect-size threshold we used to define a hypermethylated CpG island is somewhat arbitrary.
Note that these large domains defined by the hypomethylated blocks in cancer have been previously shown to co-localize with regions showing heterochromatin modifications such as H3K9Me2 or H3K9Me3 (LOCKs) or lamin-associated domains (LADs) in normal cells. A recent report on epithelial-mesenchymal transition (EMT) showed that the loss of LOCKs is associated with this process reversibly, and the properties of cell spreading and chemoresistance can be inhibited by biochemical modification of LOCK demethylation. In the original report of LOCKs, their loss was also described in cancer cell lines. A recent report in prostate cancer demonstrates both hypo- and hypermethylation associated with reduced chromatin acetylation. These results motivate a relatively new view of cancer epigenetics in which large-scale heterochromatin structures are disrupted generally, at least in solid tumors, leading to loss of both epigenetic and gene expression regulation, resulting in hyper-variability of gene expression. These changes could even interact with large-scale genetic domains important in cancer.
The data in this paper also offer a new perspective on the role of CpG island methylation in cancer. While historically the focus was on island hypermethylation, we see that: (1) much of the methylation change in cancer involves hypomethylated blocks; and (2) many of the methylation changes at islands are a flattening out of methylation rather than simply hypermethylation. The presence of these regions within the block domains suggests that the mechanism for island disruption is not necessarily island-specific but could be part of the loss of structural integrity of heterochromatin in these regions. That would explain the lack of data for specific mutations at islands or of island modifying or recognizing genes in most solid tumors. It is intriguing to speculate that the blocks might be the functional target of many of the chromatin modifiers already known to be disrupted in cancer. In particular, the advent of histone lysine demethylase therapy seems particularly relevant to these structures.
In summary, this is the first genome-scale analysis of DNA methylation in a large number of cancers and matched tissues, spanning six tumor types, and including premalignant lesions from four of the tumor types. This analysis allowed us to identify common features of the cancer epigenome in solid tumors and assess the timing of those changes. We also took advantage of new software that leverages the power of statistical smoothing and resampling to detect large statistically significant regions that are differentially methylated.
Ethics and consent
Cryogenically stored freshly frozen samples were obtained from the Cooperative Human Tissue Network (National Cancer Institute (NCI)), and Johns Hopkins Hospital under an institutional review board-approved waiver of consent. This conforms to the Helsinki Declaration as well as local legislation.
This work was supported by NIH grants HG003223 and CA054358 to APF, and GM083084 and RR021967/GM103552 to RAI.
- Feinberg AP, Vogelstein B: Hypomethylation distinguishes genes of some human cancers from their normal counterparts. Nature. 1983, 301: 89-92. 10.1038/301089a0.
- Greger V, Passarge E, Hopping W, Messmer E, Horsthemke B: Epigenetic changes may contribute to the formation and spontaneous regression of retinoblastoma. Hum Genet. 1989, 83: 155-158. 10.1007/BF00286709.
- Wen B, Wu H, Shinkai Y, Irizarry RA, Feinberg AP: Large histone H3 lysine 9 dimethylated chromatin blocks distinguish differentiated from embryonic stem cells. Nat Genet. 2009, 41: 246-250. 10.1038/ng.297.
- Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, Tonti-Filippini J, Nery JR, Lee L, Ye Z, Ngo QM, Edsall L, Antosiewicz-Bourget J, Stewart R, Ruotti V, Millar AH, Thomson JA, Ren B, Ecker JR: Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009, 462: 315-322. 10.1038/nature08514.
- Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, McDonald OG, Wen B, Wu H, Liu Y, Diep D, Briem E, Zhang K, Irizarry RA, Feinberg AP: Increased methylation variation in epigenetic domains across cancer types. Nat Genet. 2011, 43: 768-775. 10.1038/ng.865.
- Berman BP, Weisenberger DJ, Aman JF, Hinoue T, Ramjan Z, Liu Y, Noushmehr H, Lange CP, van Dijk CM, Tollenaar RA, Van Den Berg D, Laird PW: Regions of focal DNA hypermethylation and long-range hypomethylation in colorectal cancer coincide with nuclear lamina-associated domains. Nat Genet. 2012, 44: 40-46. 10.1038/ng.969.
- Hon GC, Hawkins RD, Caballero OL, Lo C, Lister R, Pelizzola M, Valsesia A, Ye Z, Kuan S, Edsall LE, Camargo AA, Stevenson BJ, Ecker JR, Bafna V, Strausberg RL, Simpson AJ, Ren B: Global DNA hypomethylation coupled to repressive chromatin domain formation and gene silencing in breast cancer. Genome Res. 2012, 22: 246-258. 10.1101/gr.125872.111.
- Hovestadt V, Jones DT, Picelli S, Wang W, Kool M, Northcott PA, Sultan M, Stachurski K, Ryzhova M, Warnatz HJ, Ralser M, Brun S, Bunt J, Jager N, Kleinheinz K, Erkek S, Weber UD, Bartholomae CC, von Kalle C, Lawrenz C, Eils J, Koster J, Versteeg R, Milde T, Witt O, Schmidt S, Wolf S, Pietsch T, Rutkowski S, Scheurlen W: Decoding the regulatory landscape of medulloblastoma using DNA methylation sequencing. Nature. 2014, 510: 537-541. 10.1038/nature13268.
- Hansen KD, Sabunciyan S, Langmead B, Nagy N, Curley R, Klein G, Klein E, Salamon D, Feinberg AP: Large-scale hypomethylated blocks associated with Epstein-Barr virus-induced B-cell immortalization. Genome Res. 2014, 24: 177-184. 10.1101/gr.157743.113.
- Schroeder DI, Lott P, Korf I, LaSalle JM: Large-scale methylation domains mark a functional subset of neuronally expressed genes. Genome Res. 2011, 21: 1583-1591. 10.1101/gr.119131.110.
- Teschendorff AE, Jones A, Fiegl H, Sargent A, Zhuang JJ, Kitchener HC, Widschwendter M: Epigenetic variability in cells of normal cytology is associated with the risk of future morphological transformation. Genome Med. 2012, 4: 24. 10.1186/gm323.
- Maegawa S, Gough SM, Watanabe-Okochi N, Lu Y, Zhang N, Castoro RJ, Estecio MR, Jelinek J, Liang S, Kitamura T, Aplan PD, Issa JP: Age-related epigenetic drift in the pathogenesis of MDS and AML. Genome Res. 2014, 24: 580-591. 10.1101/gr.157529.113.
- Nejman D, Straussman R, Steinfeld I, Ruvolo M, Roberts D, Yakhini Z, Cedar H: Molecular rules governing de novo methylation in cancer. Cancer Res. 2014, 74: 1475-1483. 10.1158/0008-5472.CAN-13-3042.
- Chelaru F, Smith L, Goldstein N, Bravo HC: Epiviz: interactive visual analytics for functional genomics data. Nat Methods. 2014, [epub ahead of print]. 10.1038/nmeth.3038.
- Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, Irizarry RA: Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics. 2014, 30: 1363-1369. 10.1093/bioinformatics/btu049.
- Jaffe AE, Murakami P, Lee H, Leek JT, Fallin MD, Feinberg AP, Irizarry RA: Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies. Int J Epidemiol. 2012, 41: 200-209. 10.1093/ije/dyr238.
- McCall MN, Bolstad BM, Irizarry RA: Frozen robust multiarray analysis (fRMA). Biostatistics. 2010, 11: 242-253. 10.1093/biostatistics/kxp059.
- Bravo HC, Pihur V, McCall M, Irizarry RA, Leek JT: Gene expression anti-profiles as a basis for accurate universal cancer signatures. BMC Bioinformatics. 2012, 13: 272. 10.1186/1471-2105-13-272.
- Alemu EY, Carl JW, Corrada Bravo H, Hannenhalli S: Determinants of expression variability. Nucleic Acids Res. 2014, 42: 3503-3514. 10.1093/nar/gkt1364.
- Zilliox MJ, Irizarry RA: A gene expression bar code for microarray data. Nat Methods. 2007, 4: 911-913. 10.1038/nmeth1102.
- McCall MN, Uppal K, Jaffee HA, Zilliox MJ, Irizarry RA: The Gene Expression Barcode: leveraging public data repositories to begin cataloging the human and murine transcriptomes. Nucleic Acids Res. 2011, 39: D1011-1015. 10.1093/nar/gkq1259.
- Cibas ES, Ali SZ: The Bethesda System For Reporting Thyroid Cytopathology. Am J Clin Pathol. 2009, 132: 658-665. 10.1309/AJCPPHLWMI3JV4LA.
- Guelen L, Pagie L, Brasset E, Meuleman W, Faza MB, Talhout W, Eussen BH, de Klein A, Wessels L, de Laat W, van Steensel B: Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature. 2008, 453: 948-951. 10.1038/nature06947.
- McDonald OG, Wu H, Timp W, Doi A, Feinberg AP: Genome-scale epigenetic reprogramming during epithelial-to-mesenchymal transition. Nat Struct Mol Biol. 2011, 18: 867-874. 10.1038/nsmb.2084.
- Coolen MW, Stirzaker C, Song JZ, Statham AL, Kassir Z, Moreno CS, Young AN, Varma V, Speed TP, Cowley M, Lacaze P, Kaplan W, Robinson MD, Clark SJ: Consolidation of the cancer genome into domains of repressive chromatin by long-range epigenetic silencing (LRES) reduces transcriptional plasticity. Nat Cell Biol. 2010, 12: 235-246.
- Shen H, Laird PW: Interplay between the cancer genome and epigenome. Cell. 2013, 153: 38-55. 10.1016/j.cell.2013.03.008.
- Hojfeldt JW, Agger K, Helin K: Histone lysine demethylases as targets for anticancer therapy. Nat Rev Drug Discov. 2013, 12: 917-930. 10.1038/nrd4154.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.