RNA sequencing and microarray analysis for population-based breast tumors. (A) Hierarchical clustering of 49 primary breast tumors (clustered columns) using the RNA-seq gene expression measurements and the PAM50 intrinsic gene signature (clustered rows). Clinical annotations for estrogen receptor (ER), progesterone receptor (PgR), and HER2 are indicated below the sample dendrogram. PAM50 intrinsic subtypes are shown as classified from the RNA-seq data and from microarray data generated from the same input RNA (90% concordant; results for the Sørlie (92%) and Hu (96%) signatures are presented in Additional file 2: Figure S2). Genes of interest are highlighted in red, and relative expression level is indicated by the box color (see color key below the heatmap). For six tumor samples, technical replicates from the same RNA sources were run on both RNA-seq and microarrays. Plotted in (B) and (C) are representative examples comparing the fold-change for all RefSeq genes between two tumors (Y axis) with the fold-change between the replicated experiments for the same two tumors (X axis). RNA-seq consistently gave values closer to the ideal line of identity and across a broader dynamic range. The two-fold change thresholds in either direction (|log2 fold-change| = 1) are indicated by blue dashed lines. (D) RNA-seq-derived expression level of ESR1, which encodes the ER alpha protein, compared with the clinical ER IHC score for each of the 49 tumors. See Additional file 2: Figure S3 for the corresponding plots for progesterone receptor and ERBB2 (HER2).