DNA methylation signatures for breast cancer classification and prognosis

Changes in gene expression that reset a cell program from a normal to a diseased state involve multiple genetic circuitries, creating a characteristic signature of gene expression that defines the cell's unique identity. Such signatures have been demonstrated to classify subtypes of breast cancers. Because DNA methylation is critical in programming gene expression, a change in methylation from a normal to diseased state should be similarly reflected in a signature of DNA methylation that involves multiple gene pathways. Whole-genome approaches have recently been used with different levels of success to delineate breast-cancer-specific DNA methylation signatures, and to test whether they can classify breast cancer and whether they could be associated with specific clinical outcomes. Recent work suggests that DNA methylation signatures will extend our ability to classify breast cancer and predict outcome beyond what is currently possible. DNA methylation is a robust biomarker, vastly more stable than RNA or proteins, and is therefore a promising target for the development of new approaches for diagnosis and prognosis of breast cancer and other diseases. Here, I review the scientific basis for using DNA methylation signatures in breast cancer classification and prognosis. I discuss the role of DNA methylation in normal gene regulation, the aberrations in DNA methylation in cancer, and candidate-gene and whole-genome approaches to classify breast cancer subtypes using DNA methylation markers.

Breast cancer is a heterogeneous disease with very different therapeutic responses and outcomes. It has traditionally been staged by histopathological criteria that are based on size, level of invasiveness and lymph node infiltration, and by immunochemical characterization of cell surface receptors, including the estrogen receptor (ER), the progesterone receptor (PR) and the human epidermal growth factor receptor 2 (HER2). However, in many instances staging breast cancer fails to predict prognosis or therapeutic response because of the heterogeneity of the disease. More recently, molecular approaches focusing on gene expression profiles have been used. Classifications based on gene expression profiles have expanded the detailed classification of breast cancer by revealing cellular identity profiles, with particular emphasis on presence of stem cells and the nature of the immune response to the tumor. These new molecular-based classifications are termed 'intrinsic subtypes of breast cancer' because they reveal the molecular identity of the breast cancer cell in the tumor rather than its stage. Several distinct tumor types and normal breast-like intrinsic classes of breast cancer were previously described [1] (see list in Table 1). These different subtypes are found in all stages of breast cancer, even in the early stages, and therefore serve as early prognostic and therapeutic predictors. They have contributed prognostic value in breast cancer management as they are now guiding prediction of patient relapse, survival and response to chemotherapy. However, there are still major challenges in accurate early prediction of breast cancer, prognosis, and prediction of therapeutic outcome. There is significant room for improving our predictive and prognostic tools, particularly in guiding therapeutic choices.
DNA methylation is a molecular modification of DNA that is tightly associated with gene function and cell-type-specific gene function and therefore provides an exquisite identity descriptor of a cell. In the recent past, DNA methylation analysis was targeted at a few candidate genes using either methylation-sensitive restriction enzymes or gene-specific DNA methylation mapping by sequencing bisulfite-converted DNA. These studies provided the initial proof of principle that DNA methylation patterns are different between tissue types and between tumors and normal surrounding tissue. However, because these studies focused on a small number of genes, they provided a narrow, low-content portrait of the DNA methylation pattern.
The utility of low-content DNA methylation profiles in classification of different subtypes of cancer in general and breast cancer in particular is inherently limited, as it is evident that genes do not act on their own and that gene networks and modules define cellular identities [2]. Therefore, DNA methylation in cancer would be predicted to influence multiple gene networks rather than single genes. Recent technological advances in DNA methylation mapping, including high-density oligonucleotide arrays, Illumina bead arrays and next-generation high-throughput sequencing, together with advances in bioinformatics, have allowed examination of broad regions of the genome and delineation of high-content profiles of DNA methylation for the first time. Moreover, it is possible to study genome function at several levels, including analysis of microRNA levels, DNA copy number, DNA methylation and histone modifications, and integrate these into combined genomic pathways. Several such studies have examined associations between whole-genome DNA methylation analyses and breast cancer classification and prognosis, and will be reviewed below. For example, Flanagan et al. [3] have delineated DNA methylation signatures that are associated with BRCA mutation state, but these were not predictive of subtypes defined by gene expression profiling. Kristensen et al. [4] used an integrated molecular approach that examined genome-wide transcription, DNA copy number, microRNA and DNA methylation profiles. This study did not reveal improvement to prognostic value by adding DNA methylation and microRNA to the analysis. However, Dedeurwaerder et al. [5] have described DNA methylation profiles in a relatively large study that reveal and classify the existence of new breast cancer groups that are not classified by current expression subtypes.
The study points to the prospect that DNA methylation signatures will extend our ability to classify breast cancer and predict outcome beyond what is currently possible. It is anticipated that such integrated methods will reveal the genomic basis for heterogeneity of breast cancer.

Hypothesis and critical questions
Given that DNA methylation changes are plausibly critical components of the molecular mechanisms involved in breast cancer, breast cancers as a group and specific subtypes of breast cancer might be expected to show distinct DNA methylation states. These DNA methylation states could serve as diagnostic tools in breast cancer care. The important questions are: 1) are the changes in DNA methylation in breast cancer limited to a narrow set of candidate genes? 2) could DNA methylation states serve as early predictors of breast cancer? 3) could DNA methylation states provide information regarding the stage of breast cancer? and 4) could DNA methylation states provide tools for prognosis and stratification for different therapeutic approaches?
Overall, it is crucial to delineate how specific changes in DNA methylation patterns of subsets of genes relate to the molecular pathologies involved in breast cancer initiation, progression and metastasis. It is also important to determine whether a limited set of specific gene methylation events would be sufficient in breast cancer diagnostics, or whether this would require more complex 'signatures' that involve coordinated changes in groups of genes. Here, I discuss the state of knowledge in this emerging field as well as future directions and prospects. First, I provide a short introduction to DNA methylation and its role in regulating gene function, and the changes in DNA methylation that occur in cancer. I then review candidate-gene and whole-genome DNA methylation mapping approaches aimed at associating DNA methylation profiles with breast cancer subtypes and prognosis.

DNA methylation
Vertebrate DNA is covalently modified by addition of methyl residues at the 5' position of cytosines residing mostly in CG (also known as CpG) dinucleotides [6]. Not all CGs are methylated in vertebrate genomes, and the distribution of methylated and unmethylated CGs in the genome is tissue-specific, resulting in a cell-specific pattern of DNA methylation [7]. The idea that different cell types have different patterns of methylation was introduced three decades ago [7] and was recently confirmed by whole-genome DNA methylation mapping of differentiating human embryonic stem cells [8].

Table 1 Intrinsic classification of breast cancer by gene expression profiles and cell surface hormonal expression
ER-positive | ER-low | ER-negative, PR-negative, HER2-negative | Normal breast-like
Luminal A | HER2-enriched | Claudin-low | High stromal content
Luminal B | | Basal-like subtype | High lymphocyte infiltration
 | | | True normal epithelial cell contamination
Column headings indicate primary immunohistochemical criteria only (ER-positive and so on; in italics). The subtypes are indicated below each heading. Subtypes in bold exhibit worse prognosis (relapse and mortality). There are four most commonly referred subtypes in the literature: luminal A, luminal B, HER2-enriched and triple-negative basal-like. Additional subtypes that have been proposed are claudin-low in the triple-negative group and normal breast-like, which are subdivided into additional subtypes. Subtypes in bold show significantly poorer outcome than luminal A subtypes. ER, estrogen receptor; HER2, human epidermal growth factor receptor 2; PR, progesterone receptor.

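The notion of a cell-specific methylation pattern over a fixed set of CG sites can be sketched in a few lines of Python. This is only an illustration: the sequence, the two "cell types" and their methylated positions are hypothetical, and the function names are mine rather than part of any published tool.

```python
def cpg_positions(seq: str) -> list[int]:
    """Return the 0-based index of the cytosine in every CG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def methylation_pattern(seq: str, methylated: set[int]) -> str:
    """Encode each CG site as 'M' (methylated) or 'u' (unmethylated)."""
    return "".join("M" if pos in methylated else "u" for pos in cpg_positions(seq))


# Hypothetical sequence and hypothetical per-cell-type methylation calls.
sequence = "ACGTTCGACGGA"
cell_type_a = {1, 8}  # CG sites methylated in illustrative cell type A
cell_type_b = {5}     # CG sites methylated in illustrative cell type B

print(cpg_positions(sequence))                     # [1, 5, 8]
print(methylation_pattern(sequence, cell_type_a))  # MuM
print(methylation_pattern(sequence, cell_type_b))  # uMu
```

The same DNA sequence yields two different strings, which is the sense in which the methylation pattern, and not the sequence, distinguishes the two cell types.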
The DNA methylation reaction is catalyzed by DNA methyltransferases (DNMTs) [9,10]. DNA methylation is unique among all the factors that are involved in programming gene expression because the methyl moiety is a component of the chemical structure of the genome. Thus, a DNA molecule contains, in addition to the ancestral genetic information encoded by the four bases comprising the DNA sequence, a coating of methyl moieties that contains epigenetic information. The genetic information is inherited and copied by the DNA replication enzymatic complex, while the DNA methylation pattern is established during embryonic development by an independent enzymatic process that includes DNMTs and as yet unknown demethylating enzymes and proteins that target DNMTs to specific positions in the genome [6].
Three distinct DNMTs have been identified in mammals. DNMT1 shows preference for hemimethylated DNA in vitro, which is consistent with its role as a maintenance DNMT [11,12], an enzyme that copies the DNA methylation pattern from the methylated parental strand to the unmethylated daughter strand during cell division. DNMT3a and DNMT3b are de novo DNMTs, as they methylate unmethylated and hemimethylated DNA at an equal rate [13].
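The logic of maintenance methylation can be sketched as a toy model. The sketch below assumes perfect fidelity and represents each strand simply as a set of methylated CG positions; both the representation and the function names are illustrative assumptions, not a description of enzyme kinetics.

```python
def replicate(parental: frozenset) -> tuple:
    """Semi-conservative replication: the template strand keeps its methyl
    marks, the newly synthesized daughter strand starts with none."""
    return parental, frozenset()


def maintenance_methylation(template: frozenset, daughter: frozenset) -> frozenset:
    """Idealized DNMT1-like step: methylate the daughter strand only at
    hemimethylated sites (methylated on the template, not on the daughter)."""
    hemimethylated = template - daughter
    return daughter | hemimethylated


parental_pattern = frozenset({1, 8})  # hypothetical methylated CG positions
template, daughter = replicate(parental_pattern)
daughter = maintenance_methylation(template, daughter)
print(sorted(daughter))  # [1, 8]
```

In this idealized model the daughter strand ends up with exactly the parental pattern. A de novo activity would instead add marks at positions that are unmethylated on both strands.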
Several proteins have been shown to target DNMTs to specific positions in the genome. For example, EZH2, a member of the multiprotein Polycomb complex PRC2/3, which methylates histone H3 at lysine 27, is believed to target DNMTs to specific locations in the genome [14-17]. This relationship between EZH2 and DNMT is thought to be important in the methylation of tumor suppressor genes in cancer [14-17]. UHRF1 targets the maintenance DNMT1 to hemimethylated DNA generated during DNA replication and is required for the copying of the DNA methylation pattern from the template to the daughter DNA strand [18,19]. The binding of transcription factors to specific DNA sequences is also important in targeting or preventing DNA methylation during development, as has been suggested previously [20,21] and confirmed recently [22]. Early candidate-gene approaches and recent genome-wide approaches for measuring DNA methylation are described in Box 1.

DNA methylation and its role in programming gene expression
DNA methylation patterns in vertebrates are distinguished by their correlation with chromatin structure. Active regions of the chromatin, which enable gene expression, are associated with hypomethylated DNA, whereas hypermethylated DNA is packaged in inactive chromatin [23]. It has been known for more than three decades that DNA methylation in regulatory regions such as promoters and enhancers can silence gene expression and that there is an inverse correlation between gene expression and DNA methylation in promoters [7]. Recent whole-genome approaches have also revealed that promoters of vertebrate genes are generally devoid of DNA methylation and that overall there is an inverse correlation between promoter DNA methylation and gene expression [24].
Two important mechanisms for inhibition of gene expression by promoter DNA methylation are well established. First, methylcytosine residues in the recognition elements of transcription factors block their binding, resulting in reduced transcriptional activity [25,26]. A second mechanism involves recruitment of methylated DNA binding domain (MBD) proteins to methylated cytosines in promoters [27]. MBDs recruit histone-modifying complexes containing histone deacetylases (HDACs), such as the NuRD complex, and histone methyltransferases (HMTases) to promoters, resulting in an inactive chromatin configuration around the genes [28]. It is also emerging that gene bodies of actively transcribed genes are more methylated than gene bodies of silent genes [24,29,30]. The regulatory role of gene-body methylation is unclear but if indeed it has a role in gene regulation, gene bodies should also be of interest for DNA-methylation-based diagnostics. Gene bodies have attracted almost no attention in the mapping of cancer methylomes and this might need to change.

DNA methylation and human disease
It is likely that all common human diseases involve changes in gene expression. Genes act to shape normal physiology through interacting networks and functional circuitries [31]. If indeed DNA methylation is involved in stable regulation of gene expression, it then makes sense that changes in DNA methylation would be detected in human disease. Aberrations in DNA methylation have been reported in schizophrenia [32-34], lupus [35-37] and type II diabetes [38-42] and have been proposed to be involved in cardiovascular disease [38,43,44].
However, changes in DNA methylation associated with human disease are just associations, and it is unclear whether these changes are causal or not. It is extremely difficult to demonstrate a causal relationship between differential DNA methylation and the pathobiology. Nevertheless, the plausibility of a causal relationship is increased in particular examples by additional lines of evidence. In systemic lupus erythematosus (lupus) patients, T-cell DNA is hypomethylated relative to normal controls; demethylating drugs such as hydralazine and procainamide reduce T-cell DNA methylation and also induce autoreactivity in culture and lupus symptoms in humans [45], which is consistent with a causal link between demethylation and the lupus phenotype. In addition, genes that are differentially methylated in lupus are known or suspected to be involved in pathobiology of the disease from other lines of study. Genes encoding interleukin-4 and interleukin-6 are demethylated in T cells from lupus patients and are also activated and demethylated in normal T cells following treatment with the demethylating drug 5-azacytidine [46]. A recent whole-genome approach was used to map differential DNA methylation profiles in pancreatic islets from type II diabetes patients and non-diabetic donors. Differentially methylated regions were uncovered in 254 genes in diabetic islets. A fraction of these genes also showed concordant transcriptional changes, suggesting a function for these DNA methylation differences. A biocomputational analysis of the functional pathways involving these genes revealed pathways implicated in β-cell survival and function, supporting a role for these DNA methylation changes in the disease [42].

Box 1 Methods for measuring DNA methylation: from candidate-gene approaches to whole-genome methods
Many of the first principles of DNA methylation and its involvement in cancer were derived from the analysis of the DNA methylation state of a limited number of genes. The first method that allowed studies of DNA methylation used methylation-sensitive bacterial restriction enzymes and their methylation-insensitive isoschizomers (enzymes that cleave the same sequence). The most commonly used enzymatic pair is MspI and its isoschizomer HpaII [103]. Southern blotting and hybridization with gene-specific probes first enabled studies of the state of methylation of specific regions in the genome. Later, bisulfite treatment was used to convert unmethylated cytosine residues to uracil residues, while methylated cytosines were protected from bisulfite conversion. The bisulfite treatment therefore creates a sequence difference between methylated and unmethylated cytosines. Specific regions in the genome are then amplified using gene-specific primers and PCR and the fragments are cloned and sequenced [104,105].
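The principle of bisulfite mapping described above can be sketched in silico. The example assumes complete conversion, a read that is already aligned base-for-base to the reference on the same strand, and that uracil is read as thymine after PCR; the sequence, the methylated positions and the function names are hypothetical.

```python
def bisulfite_convert(seq: str, methylated: set[int]) -> str:
    """Unmethylated C is converted (read as T after PCR); methylated C is protected."""
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference: str, converted_read: str) -> dict[int, bool]:
    """For each reference C: a C in the read means methylated, a T means unmethylated."""
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper())):
        if ref == "C" and obs in ("C", "T"):
            calls[i] = obs == "C"
    return calls


reference = "ACGTTCGACGGA"                   # hypothetical amplicon
read = bisulfite_convert(reference, {1, 8})  # cytosines at 1 and 8 are methylated
print(read)                                  # ACGTTTGACGGA
print(call_methylation(reference, read))     # {1: True, 5: False, 8: True}
```

Comparing the converted read with the reference recovers the methylation state of each cytosine, which is the sequence difference that cloning and sequencing of bisulfite-treated DNA exploits.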
Several technological advances have enabled the extension of these initial studies of several genes to genome-wide mapping. High-density oligonucleotide arrays combined with different methods for enrichment of methylated DNA have enabled studies of DNA methylation states of broad regions of the genome. Several methods for enriching methylated DNA have been developed, including immunoprecipitation with methylated cytosine antibodies [106] or capturing methylated DNA with methylated DNA binding domain proteins [83]. Illumina has introduced the bead array platforms, which have allowed interrogation of the state of methylation of thousands of CGs concurrently [107]. Current arrays can examine up to 450,000 CG sites [108]. The current Illumina 450K technology combines two methods of differentiating the methylated from unmethylated alleles, the original Infinium I assay used with the 27K arrays and Infinium II assays. While the Infinium I assay differentiates between the methylated and unmethylated alleles by differential hybridization to methylated (C) or unmethylated (U) versions of the beads followed by fluorescent single base extension, the Infinium II assay uses a common oligo-on-bead primer for both the methylated and unmethylated alleles followed by differential fluorescent nucleotide base extension across a methylated (C) or unmethylated (T) CG site in the sample template.
High-throughput genome sequencing of bisulfite-treated DNA has enabled the mapping of cancer methylomes genome-wide [8]. This method is still costly and unfeasible for high-throughput DNA methylation profiling. However, a limited number of studies have provided important insights into the organization of the cancer epigenome. For example, a recent study used shotgun bisulfite genome sequencing for three colorectal cancers and matched normal colonic mucosa and described large hypomethylated blocks of DNA in these cancers [59].
There are strengths and weaknesses to each of these genome-wide methods. Antibody-enrichment-based methods such as methylated DNA immunoprecipitation (MeDIP) are not biased towards CG sequences and will immunoprecipitate DNA that contains methylated CGs and methylated cytosines in other dinucleotide sequences. In addition, this method does not require bisulfite conversion of DNA and therefore avoids biases in amplification of bisulfite-converted DNA that could dramatically affect results for samples that usually contain a mixture of methylated and unmethylated sequences [109]. MeDIP methods are extremely effective when a small fraction of the population of cells is methylated, because the method focuses on the methylated DNA population and measures the change in this population. This is particularly important in cancer given that DNA methylation patterns in tumors and other tissues are heterogeneous [59]. Other methods based on bisulfite conversion measure both methylated and unmethylated DNA and lose sensitivity when the percentage of cells whose DNA is methylated in a certain region is lower than the noise level of the method. For example, a change in methylation from 1% to 2% will theoretically double the signal in MeDIP approaches and be easily detected, but a methylation change of 1% in a range from 0% to 100% will be within the noise range of pyrosequencing or bisulfite mapping. The main disadvantage of MeDIP is that it provides an overall average of the state of methylation but does not provide information at single base resolution.
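The arithmetic behind the 1% to 2% example can be made explicit. The sketch below contrasts the relative change that an enrichment-based signal would ideally track with the absolute change on a 0% to 100% scale; the noise threshold of 5 percentage points is a purely hypothetical value chosen for illustration, not a measured property of any method.

```python
def fold_change(before_pct: float, after_pct: float) -> float:
    """Relative change, the quantity an enrichment-based signal would ideally track."""
    return after_pct / before_pct


def absolute_change(before_pct: float, after_pct: float) -> float:
    """Change in percentage points on the 0-100% methylation scale."""
    return after_pct - before_pct


def exceeds_noise(before_pct: float, after_pct: float, noise_pct_points: float) -> bool:
    """True if the absolute change is larger than an assumed noise level."""
    return abs(absolute_change(before_pct, after_pct)) > noise_pct_points


print(fold_change(1.0, 2.0))                          # 2.0
print(absolute_change(1.0, 2.0))                      # 1.0
print(exceeds_noise(1.0, 2.0, noise_pct_points=5.0))  # False (hypothetical noise level)
```

The same biological change is a two-fold difference on the relative scale but a single percentage point on the absolute scale, which is why it can disappear into the noise of methods that measure methylated and unmethylated DNA together.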
Illumina bead arrays and high-throughput sequencing provide information at single base resolution but are potentially confounded by bias in amplification of mixtures of methylated and unmethylated bisulfite-converted DNA. The extent of the bias varies from region to region, confounding interpretation of the data. Although altering the temperature of amplification could reduce the bias in specific regions [109], this is not currently feasible in a whole-genome approach as the optimal temperature of amplification differs from region to region. Illumina bead arrays are also limited in representation of CGs and biased towards CG dinucleotides.
Szyf Genome Medicine 2012, 4:26 http://genomemedicine.com/content/4/3/26
DNA methylation serves as a mechanism that provides different functions to identical sequences. A clear example is parental imprinting of genes, whereby an allele of a gene that was paternally inherited has a different DNA methylation state and expression from the allele that was maternally inherited [47-49]. Several human disease states, such as Prader-Willi syndrome (deletion or absence of a paternal contribution to chromosome 15q11-q13), Angelman syndrome (absence of a normal maternal copy of the same region) and Beckwith-Wiedemann syndrome (deregulated parental imprinting by DNA methylation of chromosome 11p15) [50] involve disruption of parental imprinting, providing perhaps the strongest evidence for a causal link between aberrations in DNA methylation and human disease. Loss of imprinting (LOI) in chromosome 11p15 is associated with several cancers, including Wilms tumor [51]. Furthermore, LOI of specific genes in this region has been associated with different cancers; LOI of H19 (which encodes a long non-coding RNA) is associated with lung cancer [52] and hepatoblastoma [53], and LOI of IGF2 (which encodes insulin-like growth factor 2) and H19 is associated with cervical cancer [54].

DNA methylation and cancer
Cancer was the first group of diseases to be associated with DNA methylation and to be considered for DNA-methylation-targeted therapeutics, and it serves as a prototype for determining the role of DNA methylation and DNA-methylation-targeted therapeutics in other diseases [55]. Several types of aberration in DNA methylation and in the proteins involved in DNA methylation occur in cancer: hypermethylation of tumor suppressor genes, aberrant expression of DNMT1 and other DNMTs, and hypomethylation of unique genes and repetitive sequences [56-58]. Silencing of tumor suppressor genes by DNA methylation provides a powerful molecular mechanism by which DNA methylation can trigger cancer, and also provides a rationale for therapeutics aimed at inhibition of DNA methylation and re-expression of silenced tumor suppressor genes. Recent genome-wide studies suggest that not only high-density CG islands but also regions of lower CG density near islands, termed shores, are differentially methylated in several cancers, and that the same regions are differentially methylated between tissues, suggesting a role for these regions in defining tissue specificity of gene function [59].
DNA methylation of tumor suppressor genes has been the focus of numerous studies that have aimed to identify DNA methylation biomarkers of cancer. However, it is becoming clear that hypomethylation is equally important, because critical genes for cancer growth and metastasis are hypomethylated in cancer [60-63]. DNA demethylation has an important role in cancer by turning on the expression of pro-metastatic genes, such as the heparanase gene [60], MMP2 (which encodes matrix metalloproteinase-2) [61] and uPA (which encodes urokinase plasminogen activator) [62]. We have recently delineated the DNA hypomethylation landscape of liver cancer. My colleagues and I [63] showed that there is an equal number of genes that are demethylated and hypermethylated in hepatocellular carcinoma in comparison with surrounding normal liver tissue. The hypomethylated genes are clustered in broad genomic regions, suggesting a high level of organization of demethylation in liver cancer. Functional biocomputational analysis of the hypomethylated genes suggests that they are involved in functions relating to cell growth, invasion and metastasis [63]. A causal role for demethylation in cancer metastasis is supported by the fact that treatment of non-metastatic breast cancer cells with demethylating agents increases their invasiveness [64,65], and that treatment of invasive breast cancer and liver cancer cell lines with agents that reverse demethylation results in inhibition of invasiveness and metastasis [62,63]. A recent genome-wide approach involving bisulfite mapping of several colorectal cancer samples revealed blocks of hypomethylation encompassing half the genome relative to normal colon tissue [59]. Therefore, an interesting question that has important diagnostic implications is whether the hypomethylation state of certain genes is characteristic of a more advanced and metastatic stage of breast cancer and could be of use in breast cancer staging.

DNA methylation of candidate genes in breast cancer
The original concept driving investigation of changes in DNA methylation in diseased states was that limited sets of candidate genes were critical for disease initiation and progression. However, unbiased approaches could potentially reveal new genes and new functional gene networks that are associated with a disease, whereas candidate approaches essentially allow validation of genes that are already known to be involved. Early studies attempting to take advantage of the emerging role of methylation of promoters of tumor suppressor genes in cancer examined whether methylation of specific CGs in tumor suppressor genes correlates with different breast cancer clinical states [66]. Methylation of the p16 tumor suppressor gene was proposed to be an early biomarker for detection of breast cancer [67]. Methylation-specific PCR of six known tumor suppressor genes was used to generate a hypermethylation profile of primary breast tumors, and the methylation states of different genes were found to be significantly associated with several known prognostic factors [68]. However, our current understanding of the functional pathways of gene expression in physiological and pathological processes suggests that it is highly unlikely that analysis of a few specific CG sites will be sufficient to stage and provide prognostic information on breast cancer with high accuracy and specificity.

DNA methylation signatures and early whole-genome approaches
One of the lessons learnt from gene expression analyses in breast cancer is that the transcription profiles that distinguish breast cancer stages involve coordinated changes in expression of many genes generating a 'signature' that characterizes a stage of disease (Table 1). Transcription signatures have been used in classifying breast cancer, and in differentiating molecular signatures in the primary tumor of breast cancers that metastasize to bone marrow or lymphatic nodes [69]. Expression signatures differentiate tumors with BRCA1 and BRCA2 mutations [70], supporting the idea of unique molecular signatures for subtypes of breast cancer. Therefore, it is likely that, similar to transcription profiles, DNA methylation signatures involve multiple coordinate changes in several genes and that specific patterns of DNA methylation across a broader spectrum of genes will be able to differentiate subtypes of breast cancer and their prognosis with high accuracy.
Early methods were sufficiently developed to examine a limited set of genes with bisulfite sequencing, methylation-sensitive PCR or methylation-sensitive restriction enzyme analysis. However, to delineate DNA methylation signatures without bias, whole-genome methods for mapping DNA methylation were required.
Over a decade ago, more comprehensive methods were developed to interrogate a large number of CG islands in either cell lines or tumor samples using differential methylation hybridization. This method used methylation-sensitive restriction enzymes to enrich for methylated DNA fragments, followed by hybridization to CG island arrays (containing 1,000 CG islands). By focusing on CG islands the bias for hypermethylated CG islands was preserved, and the basic assumption remained that the informative DNA methylation event in cancer is hypermethylation of CG islands. A pioneering study by the Huang group [71] used this approach to identify DNA methylation signatures by comparing 28 paired primary breast tumor and normal samples, and to determine whether patterns of specific CG hypermethylation correlate with pathological parameters in the patients analyzed. The study found that the number of hypermethylated CG islands increased with decreased differentiation of the tumors [71]. This was an early demonstration of the potential of broad DNA methylation signatures for differentiating and staging breast cancer. The main caveat of this approach is its bias towards hypermethylation of CG islands.

The use of breast cancer cell lines for identification of DNA methylation signatures specific to breast cancer stages
As an alternative to genome-wide delineation of differentially methylated genes in breast cancer, a pharmacological transcriptome-wide approach has been used to identify novel genes that are putatively hypermethylated in cancer cells. This approach is methodologically biased towards promoters of genes that are hypermethylated in cancer and ignores the hypomethylated genes that are emerging as important players in the advanced metastatic stages [60-62,72-74]. Breast cancer cell lines were treated with 5-azacytidine, a demethylating drug, to reveal genes whose expression was induced in response to the drug treatment using expression arrays. It was considered that this broad collection of genes whose expression was induced by a DNA demethylating drug represented a group of genes that are hypermethylated and silenced in breast cancer, and that could be useful for diagnostic and prognostic purposes clinically [75]. This approach, which makes necessary the use of breast cancer cell lines rather than primary tumors for genome-wide discovery, has several caveats. Firstly, demethylating agents could affect gene expression indirectly and by DNA-demethylation-independent mechanisms [76]. Secondly, cells in culture show DNA methylation changes that are different in many instances from the situation in vivo [77,78]. For example, in a study by Stefanska et al. [63] primary liver cells in culture showed a signature of DNA methylation that was categorically different from that of normal liver, and Fu et al. [79] showed distinct expression and DNA methylation patterns of Hedgehog ligands between primary colorectal tumors and colorectal cancer cell lines. Thirdly, although this approach could reveal genes that are potentially broadly methylated in breast cancer, fine demarcation and exquisite phenotyping of breast cancer cell lines is required for a cell-culture-based approach to deliver revealing DNA methylation signatures that differentiate breast cancer subtypes or signatures that have prognostic value.
The use of distinctly phenotyped but highly related breast cancer cell lines for delineating DNA methylation signatures that differentiate and classify breast cancers has recently been reported [80]. A multidimensional comprehensive analysis of DNA methylation was performed, using enrichment of differentially methylated regions with methylation-specific restriction sites followed by microarray hybridization and gene expression, and this was repeated more recently by including copy number analysis. The study compared two isogenic breast cancer cell lines, MDA-MB-468GFP and MDA-MB-468GFP-LN (the latter derived from a lymphatic metastasis). This study [81] revealed broad changes in DNA methylation that included both hypomethylation and hypermethylation, and measured their association with gene expression and copy number variation. The correspondence of some of the hypomethylation and hypermethylation events with gain and loss of copy number suggested a linkage between these two events that needs to be further explored. The changes in DNA methylation were highly organized structurally and functionally; specific networks and functional pathways were affected. These results support the hypothesis that broad signatures exquisitely define variations between closely related breast cancer cells that take a different metastatic course. However, the main caveat of this approach is, again, the use of breast cancer cell lines. It is unclear what fraction of the DNA methylation signature identified in vitro will be relevant in primary breast cancer tumors. If such signatures hold in vivo as well, they could be instrumental in prognosis and have a profound impact on breast cancer care.
A different study [82] examined whether broad DNA methylation, expression and copy number signatures differentiate ER+ and ER- breast cancer cell lines. The study was able to identify a cluster of differentially methylated genes that differentiate ER+ and ER- cells. The relevance of this signature to primary breast cancer was highlighted by the finding in primary tumors of 84 genes that are components of the methylation signature identified in ER+ cell lines [82]. However, the clinical advantage of these DNA methylation biomarkers over current immunochemical and histopathological methods remains to be tested in larger studies.
Differential DNA methylation of a panel of 15 CG islands was used [83] to define the cellular origin of breast cancer cells, particularly focusing on stem cells. Park et al. [83] first determined cancer stem cell phenotype by CD44/CD24 and ALDH1 immunohistochemistry in 36 luminal A, 33 luminal B, 30 luminal-HER2, 40 HER2-enriched, and 40 basal-like subtypes of breast cancer (Table 1) [83]. They reported that the number of CG island regions that were methylated was different between the subtypes. The basal-like subtype was enriched with the CD44+/CD24- and aldehyde dehydrogenase 1-positive (ALDH1+) putative stem cell population. For example, methylation of promoter CG islands was significantly lower in CD44+/CD24- cell-positive tumors than in CD44+/CD24- cell-negative tumors, even within the basal-like subtype, suggesting that DNA methylation could detect the 'stemness' of breast cancers. However, it is unclear whether these differences created a true DNA methylation profile that would increase the accuracy of prognosis or identification of cell of origin beyond the classification achieved by traditional immunohistochemistry [83].

The use of current genome-wide methods to delineate stage-specific DNA methylation signatures for staging primary breast cancer
In recent years several methods have been developed to provide a genome-wide picture of the state of DNA methylation, including: next-generation genome-wide sequencing of bisulfite-converted DNA [8]; methylated DNA immunoprecipitation (MeDIP) followed by either hybridization to high-density oligonucleotide arrays [84] or next-generation sequencing [85]; and dedicated Illumina 27K and 450K arrays [86] that measure the state of methylation of well-characterized CG sites distributed in the genome. Although genome-wide sequencing is still prohibitively costly for large population studies, array approaches are being frequently used to delineate DNA methylation signatures of disease states in primary clinical material rather than cell lines. Several studies have used this approach to differentiate breast cancer subtypes and their prognosis. Li et al. [87] used 27K arrays in a small sample of ER/PR+ and ER/PR- breast cancer samples, and identified and validated four genes whose DNA methylation was affected by ER/PR status. A similar approach was recently used to identify a group of genes that showed an association with relapse-free survival [88].
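To make concrete what such arrays report, the following is a minimal sketch, not taken from any of the studies cited here. It assumes the conventional array definition of the beta value at a CG site, beta = M / (M + U + offset), where M and U are the methylated and unmethylated probe intensities and the offset (commonly 100) stabilizes low-intensity probes. The function names, the site identifiers and the 0.2 difference threshold are hypothetical choices made for illustration only.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Approximate fraction of methylated molecules at one CG site.

    Negative intensities (possible after background correction) are
    clipped to zero before the ratio is taken.
    """
    m = max(methylated, 0.0)
    u = max(unmethylated, 0.0)
    return m / (m + u + offset)


def differential_sites(tumor_betas, normal_betas, min_delta=0.2):
    """Label sites as 'hyper' or 'hypo' methylated in tumor versus normal.

    Both arguments map a site identifier to a beta value. Only sites
    present in both samples are compared. min_delta is an illustrative
    threshold, not a value taken from the literature discussed here.
    """
    calls = {}
    for site, tumor_beta in tumor_betas.items():
        if site not in normal_betas:
            continue
        delta = tumor_beta - normal_betas[site]
        if delta >= min_delta:
            calls[site] = "hyper"
        elif delta <= -min_delta:
            calls[site] = "hypo"
    return calls


if __name__ == "__main__":
    # Hypothetical beta values for three sites in one tumor/normal pair.
    tumor = {"cg_a": 0.8, "cg_b": 0.1, "cg_c": 0.5}
    normal = {"cg_a": 0.2, "cg_b": 0.6, "cg_c": 0.45}
    print(differential_sites(tumor, normal))
```

Recording both directions of change, rather than hypermethylation alone, reflects the point made throughout this review that hypomethylated sites can be as informative as hypermethylated CG islands.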
Fang et al. [80] also used the 27K array to delineate DNA methylation signatures that would differentiate breast cancers based on their metastatic potential. The study first discovered a 'methylator' phenotype, a coordinated methylation of a large group of CG islands in groups of tumors, which they termed 'breast CpG island methylator phenotype' (B-CIMP), and which resembles the previously characterized methylator phenotype in colorectal cancer [89]. The methylator phenotype was associated with low risk of breast cancer metastasis and improved rates of survival independently of other known breast cancer prognostic markers, such as ER+ status. This provides strong evidence for the potential of DNA methylation signatures to prognostically differentiate breast cancers beyond current classifications.
These results have important implications for further development of DNA methylation signature biomarkers and epigenetic cancer therapeutics and highlight the importance of genome-wide and unbiased approaches for DNA methylation mapping in breast cancer. Firstly, this study by Fang et al. [80] in primary tumors provides strong support to studies from my laboratory [62,90] that proposed that DNA hypomethylation is a driving force in breast cancer metastasis. It further highlights the importance of DNA hypomethylation markers in molecular diagnosis of aggressive breast cancers. The study also illustrates how DNA methylation signatures could have important therapeutic implications in guiding the use of epigenetic drugs in anticancer therapy [91]. It supports the conclusion that the use of hypomethylating drugs exclusively might exacerbate rather than cure cancer by unleashing the expression of hypermethylated prometastatic genes and converting nonaggressive breast cancers to highly metastatic aggressive tumors with low survival prognosis [64,92]. These data are consistent with our data in primary liver cancer that showed extensive hypomethylation in advanced liver cancer [74]. The genes whose promoters were demethylated in liver cancer were mainly involved in cell growth, cell adhesion and communication, signal transduction, mobility and invasion, functions that are essential for cancer progression and metastasis [74].
The use of genome-wide approaches and a larger number of breast cancer samples and controls in the past 2 years has enabled further investigation of the classification and prognostic value of DNA methylation profiles in breast cancer. More interestingly, recent studies suggest that DNA methylation profiling might provide information on the cellular origin of cancer cells in a breast tumor, as well as the microenvironment, particularly the immune cell types, that are present in the tumor [93].
Related to this, a distinct profile of T cell subtype gene expression could be detected in mixed populations of tumors and stroma [4]. Kristensen et al. [4] used an integrated approach termed 'Pathway Recognition Algorithm using Data Integration on Genomic Models' (PARADIGM), integrating DNA methylation and microRNA profiling with mRNA expression and DNA copy number. The analysis was conducted on approximately 110 breast carcinomas and then the PARADIGM clusters derived from the discovery sample set were tested in two other breast cancer cohorts [4]. The authors identified key tumor and stromal signatures in the mixed tumor-stroma samples, suggesting that it is possible to obtain informative stromal molecular signatures without dissecting the stromal cells. This would simplify the diagnostic protocol. In addition to molecular signatures that classify breast cancer cell subtypes, they found a chronic inflammatory signature in all breast cancers. The strongest predictor of good outcome was a high T-helper 1 (Th1)/cytotoxic T lymphocyte signature, in contrast to a Th2 signature. The PARADIGM clustering seems to expand classification beyond traditional immunohistochemistry, as a distinction was found between two clusters within luminal A breast cancer (called the PDGM3 and PDGM4 clusters) and luminal B breast cancer (PDGM4) clusters. However, although the differential DNA methylation profiles were mapped onto functional pathways that were identified by gene expression analysis, the authors did not demonstrate that adding DNA methylation profiling improved the prognostic value of the PARADIGM clusters over using mRNA expression and copy number variation [4].
Flanagan et al. [3] used a MeDIP approach to determine whether genome-wide DNA methylation profiles would predict tumor mutation status and intrinsic subtypes. Although DNA methylation profiles predicted tumor subtypes with some estimated error rates, they did not accurately predict the intrinsic subtypes defined by gene expression [3]. A distinct subgroup of BRCAx tumors defined by methylation profiles was identified, supporting the hypothesis that DNA methylation profiling might expand subtype classification beyond mutation analysis.
DNA methylation profiling will become important for breast cancer diagnosis and prognosis only if it provides additional classification value to other currently used methods, namely immunohistochemistry and mRNA expression analysis. A recent detailed whole-genome DNA methylation analysis by Dedeurwaerder et al. using the Illumina 27K arrays [93,94] suggests that DNA methylation profiling might expand current classifications of breast cancer subtypes. The analysis of 248 breast cancer tumor samples, comprising a 'main set' of 123 samples (4 normal and 119 infiltrating ductal carcinomas (IDCs)), and a 'validation set' of 125 samples (8 normal and 117 IDCs), revealed an immune 'signature' in a mixed tumor-stromal population, as also reported by Kristensen et al. [4]. DNA methylation profiles revealed six classes, three of which defined new groups that were not classified by expression subtypes, and these might reflect different cells of origin [94]. However, the sample size of the main set was too small to allow investigation of the prognostic value of these methylation classes.

Clinical testing
DNA methylation is cell-type-specific [7]. Given that it is considered that molecular changes in cancer occur in the cancer cell, it is anticipated that changes in DNA methylation that characterize the cancer stage will be limited to the cancerous cell. This hypothesis, if true, would in practice limit DNA methylation diagnostics to biopsies rather than fluid samples. Carefully designed clinical studies will be needed to determine whether DNA methylation signatures can form the basis of more accurate and specific diagnostic and prognostic tests than currently available histopathological tools and immunochemical tests. Another important area in which DNA methylation signatures in biopsies might be of value is in stratifying patients for therapy and predicting therapeutic outcomes.

Blood-based tests
If indeed DNA methylation signatures are informative only in tumor sample biopsies, this limits the utility of such markers for use in routine follow-up and population-wide early screening. Biopsies are invasive, so it is highly unlikely that they will become part of a routine screening procedure. Moreover, even in breast cancer patients, biopsies are not applicable for routine follow-up after surgery, particularly when there is no visible tumor growth. Noninvasive methods are essential for early prediction and follow-up of therapeutic response after surgery.
Nevertheless, it is possible that circulating tumor cells display DNA methylation signatures that are reflective of the state of methylation in the tumor mass. Informative breast cancer DNA methylation signatures in tumor cells found in blood would be extremely important in early screening, diagnosis, staging and follow-up of treatment, and there is increased interest in their potential clinical importance. An important area of research is DNA methylation mapping of circulating tumor cells to identify DNA methylation signatures of breast cancer in these circulating cells. The initial focus has been on hypermethylated genes that are characteristic of many cancers.
Jing et al. [95] recently tested a CpG island methylator phenotype (CIMP) in serum from 50 sporadic breast cancer (SBC) patients and paired controls, by examining the state of methylation of CG sites in 10 genes known to be methylated in cancer using methylation-specific PCR. CIMP was found to be more prevalent in serum from SBC patients than in controls. In SBC, 92% of patients (46/50) showed methylation of at least one gene, and serum from only four patients showed no methylation of any of the ten genes. This study demonstrated that it is possible to identify changes in DNA methylation in serum from breast cancer patients. It also showed that a combination of methylated genes provided high-specificity and high-sensitivity markers for breast cancer as well as prognostic value, as CIMP+ status in serum was associated with a relative risk of recurrence of 8.6.
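The two summary statistics quoted for this study can be illustrated with a short sketch. This is not the authors' analysis: the function names are hypothetical, the per-patient matrix below is a synthetic placeholder arranged only to reproduce the reported 46/50 count, and the relative risk function shows the standard definition (risk in the marker-positive group divided by risk in the marker-negative group) rather than the calculation behind the reported value of 8.6, whose underlying counts are not given here.

```python
def detection_rate(calls):
    """Fraction of patients with at least one methylated gene.

    calls is a list with one entry per patient; each entry is a list of
    booleans, one per gene in the panel (True = methylated in serum).
    """
    positive = sum(1 for patient in calls if any(patient))
    return positive / len(calls)


def relative_risk(events_pos, n_pos, events_neg, n_neg):
    """Risk of an event in marker-positive versus marker-negative patients."""
    risk_pos = events_pos / n_pos
    risk_neg = events_neg / n_neg
    return risk_pos / risk_neg


if __name__ == "__main__":
    n_genes = 10
    # Synthetic placeholder: 46 patients with one methylated gene each
    # and 4 patients with none. Real per-gene calls are not reproduced.
    methylated = [[True] + [False] * (n_genes - 1) for _ in range(46)]
    unmethylated = [[False] * n_genes for _ in range(4)]
    print(detection_rate(methylated + unmethylated))
```

Calling a patient positive when any gene in the panel is methylated is what allows a combination of markers to reach a higher detection rate than any single gene, at the cost of accumulating the false-positive rates of the individual markers.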
In another study, Radpour et al. [96] focused on ten candidate genes and investigated two cohorts: a first cohort with 36 plasma samples from breast cancer patients and 30 plasma samples from healthy controls, and a second cohort of 60 triple-matched samples (cancerous tissue, and matched normal tissue and serum samples) from 20 patients with nonfamilial breast cancer. Seven of the genes showed concordant methylation in serum and tumor tissue from the same patient. This supports the hypothesis that serum DNA is derived from and accurately reflects the DNA methylation profile of the primary tumor. A panel of eight genes out of the ten studied was proposed as a highly specific and sensitive test for breast cancer [96]. Furthermore, methylation of particular genes was associated with particular clinical parameters. It is unclear whether this specific set of DNA methylation biomarkers provides any early predictive or prognostic value, and further and more extensive studies are required. Nevertheless, these recent studies suggest that there is potential for using blood samples to detect DNA methylation markers in breast cancer.

Conclusions and future directions
DNA methylation states are involved in long-term gene expression programming of cell-type identity and are therefore exquisite descriptors of the functional state of a cell. The differences in DNA methylation between cell types involve multiple genes that are components of functional gene networks. Therefore, it is reasonable to consider that cancer cells might have a unique profile of DNA methylation that not only reflects their identity as tumor cells, but also differentiates between tumor stages and predicts clinical outcomes and response to therapy. Early attempts to discover breast cancer DNA methylation markers focused on a shortlist of candidate genes and were highly biased towards DNA hypermethylation of CG islands in tumor suppressor genes. The advent of genome-wide methods for DNA methylation mapping should allow us to delineate comprehensive and unbiased high-definition DNA methylation signatures that could provide accurate classification of breast cancers. Such maps might be used in prognosis, prediction of therapeutic outcomes and stratification for different treatment strategies. Several studies have supported the prospect that DNA methylation signatures can be effective diagnostic markers. However, the data so far are very limited and the predictive value of the small number of DNA methylation signatures that have been identified is unclear. Several studies were limited to different breast cancer cell line manipulations and only a few studies have looked at a sizeable number of genes in primary tumors. The critical challenge is to derive high-quality DNA methylation signatures that are confirmed in prospective studies as specific and sensitive predictors of clinical outcome and therapeutic responses. An additional question is whether DNA methylation signatures would provide advantages over current histopathological and immunochemical methods.
It will be particularly important to develop noninvasive molecular markers for breast cancer. Preliminary studies suggest that circulating tumor DNA in plasma samples bears tumor-specific DNA methylation markers that provide potential molecular markers for breast cancer. It is so far unknown whether blood-based DNA methylation markers have prognostic or early predictive value, specifically in follow-up of response to therapy. Furthermore, the DNA methylation signatures of the earliest transition into a transformed state are unknown; if such DNA methylation signatures exist, they could provide early molecular markers of breast cancer, especially if found in serum DNA.
RNA transcription profiles have been used in breast cancer molecular diagnosis [70,97-102]. DNA methylation markers potentially have several advantages over transcription profiles as diagnostic tools in breast cancer. Firstly, DNA is a robust clinical material and could be preserved under harsh conditions, including incubation in serum, whereas RNA is a highly labile material. Secondly, DNA methylation profiles represent a stable long-term programming of the genome, whereas transcription assays provide a snapshot of the transcription activity at a specific time point and in response in part to transient signals. It is therefore anticipated that the noise-to-signal ratio should be significantly lower for DNA methylation signatures, which constitute a stable definition of the molecular state of a cell. The limited data that are available, the advent of genome-wide methods for DNA methylation mapping and the emerging understanding of the cardinal role of DNA methylation in controlling cell-type-specific genome function support continuing studies in this emerging area and provide reasons for optimism that DNA methylation markers could serve as exquisite molecular markers for prediction, prognosis and follow-up of breast cancer therapy.