Fig. 2

From: De novo transcriptomic subtyping of colorectal cancer liver metastases in the context of tumor heterogeneity

Unsupervised de novo subtyping of CRLMs based on gene expression. a Quality metrics from NMF classification using input gene sets defined by three different thresholds for the cross-sample SD indicated that the optimal number of sample clusters (K) was either 2 or 5. b The sample clusters at K = 2 factorization were most strongly separated by epithelial-mesenchymal characteristics, as illustrated with a sample-wise epithelial score calculated by GSVA (p value from t-test). c Heatmap of NMF clustering output at K = 5 factorization. The top annotation bars indicate sample clusters and the sample-wise silhouette width in each cluster. The red-blue color intensity in the heatmap represents the within-cluster similarity of each sample. Cross-tabulation of samples at K = 2 and K = 5 factorizations indicates that the mesenchymal subtype from K = 2 is also largely retained at K = 5. d Pie chart showing the proportion of samples in each of the de novo liver metastasis subtypes (LMS1-5) at K = 5. e PCA plot of samples based on the input gene set for NMF (cross-sample SD > 0.8), colored according to LMS group, confirms strong separation of the mesenchymal subtype (LMS5) from the four epithelial subtypes (LMS1-4) along PC1. The density plot at the top shows the distinction between the epithelial and mesenchymal sample clusters from K = 2 factorization. f The proportion of LMS5 samples was higher among CRLMs exposed to neoadjuvant chemotherapy, but there was no significant difference between treatment groups for LMS1, LMS2, and LMS4.
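
The caption refers to input gene sets defined by a cross-sample SD threshold (for example, SD > 0.8 in panel e). The following is a minimal sketch of that kind of filter, not the authors' pipeline: the function name `filter_genes_by_sd`, the dictionary layout, and the toy expression values are all hypothetical and chosen only for illustration.

```python
from statistics import stdev

def filter_genes_by_sd(expr, threshold=0.8):
    """Keep genes whose cross-sample standard deviation exceeds `threshold`.

    `expr` maps a gene name to its list of expression values, one per sample
    (hypothetical layout). The sample SD (n - 1 denominator) is used here,
    which is an assumption; the caption does not specify the estimator.
    """
    return {gene: values for gene, values in expr.items()
            if stdev(values) > threshold}

# Toy values, purely illustrative (not data from the study).
toy_expr = {
    "gene_a": [5.0, 5.1, 4.9, 5.0],  # nearly constant across samples
    "gene_b": [2.0, 4.0, 6.0, 8.0],  # highly variable across samples
    "gene_c": [7.0, 7.5, 6.5, 7.0],  # moderately variable, SD below 0.8
}

filtered = filter_genes_by_sd(toy_expr, threshold=0.8)
print(sorted(filtered))
```

Only the genes that pass the filter would then be passed on to a factorization step such as NMF; raising or lowering the threshold changes the size of that input gene set.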