Fig. 2 | Genome Medicine

Fig. 2

From: Identifying Crohn’s disease signal from variome analysis

DKMcost genes outperform the PascalGWAS and KS5 gene sets in leave-one-out cross-validation on the CD-train panel. The x-axis is the number of genes/features used in each model. The y-axis is the area under the curve (AUC) for the precision/recall (PR, gray) and ROC (yellow) curves. At each point on the graph, the SVM models were trained using (a) top-ranked PascalGWAS genes, (b) randomly selected KS5 genes (at most 127 genes were in KS5, i.e., had a Kolmogorov-Smirnov p value below 0.05; see the “Methods” section), and (c) top-ranked DKMcost genes. Error bars are standard deviations over 100 iterations of model training. For each point on the x-axis in panels a and c, the same genes were used in every training iteration, but the resampling of individuals differed. Dotted lines indicate the baseline performance: ROC AUC of 0.5 (yellow) and PR AUC of 0.58 (gray; 64 CD cases out of 111 individuals in total). Dashed lines indicate the highest performance achieved across all gene sets.
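The two baselines in the caption can be checked with a few lines of arithmetic. The sketch below assumes the usual convention that a classifier with no skill has a PR AUC equal to the fraction of positive cases and a ROC AUC of 0.5. The function name `pr_baseline` is hypothetical and is not taken from the article.

```python
def pr_baseline(n_positive: int, n_total: int) -> float:
    """PR AUC expected from a no-skill classifier (assumed: the positive-class prevalence)."""
    if n_total <= 0 or not 0 <= n_positive <= n_total:
        raise ValueError("counts must satisfy 0 <= n_positive <= n_total and n_total > 0")
    return n_positive / n_total


# Counts taken from the caption: 64 CD cases among 111 individuals.
N_CD = 64
N_TOTAL = 111

# ROC AUC of a no-skill classifier does not depend on class balance.
ROC_BASELINE = 0.5

baseline = pr_baseline(N_CD, N_TOTAL)
print(f"PR AUC baseline:  {baseline:.2f}")
print(f"ROC AUC baseline: {ROC_BASELINE:.2f}")
```

Because the PR baseline depends on class balance, it would change on a panel with a different proportion of CD cases, whereas the ROC baseline would stay at 0.5.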
