Table 6 Summary of MeSHOP performance

From: Inferring novel gene-disease associations using Medical Subject Heading Over-representation Profiles

| Scoring method | Mean AUC | AUC standard error | Mean test rank (n = 200) | Overall rank |
|---|---|---|---|---|
| Cosine distance of term frequency-inverse document frequency | 0.93 | 0.03 | 15.03 | 2 |
| Cosine distance of P-values | 0.57 | 0.05 | 87.25 | 16 |
| Cosine distance of term fractions | 0.90 | 0.04 | 20.21 | 4 |
| Sum of the log of combined P-values | 0.91 | 0.03 | 18.88 | 3 |
| Sum of the differences of log P-values | 0.87 | 0.06 | 26.97 | 7 |
| L2 of log-p of overlapping terms only | 0.94 | 0.03 | 12.06 | 1 |
| L2 of term fractions of overlapping terms only | 0.57 | 0.04 | 86.70 | 15 |
| L2 of log of P-values | 0.86 | 0.07 | 28.05 | 10 |
| L2 of P-values | 0.86 | 0.07 | 29.62 | 12 |
| L2 of term fractions | 0.90 | 0.03 | 20.39 | 5 |
| L2 of term frequency | 0.86 | 0.06 | 28.31 | 11 |
| Term coverage | 0.87 | 0.06 | 27.14 | 8 |
| Term overlap | 0.87 | 0.03 | 26.17 | 6 |
| Number of gene MeSH terms | 0.81 | 0.05 | 38.69 | 13 |
| Number of disease MeSH terms | 0.86 | 0.06 | 27.87 | 9 |
| Gene ID | 0.71 | 0.06 | 58.78 | 14 |
  1. The mean AUC, its standard deviation, and the ranking are reported for the MeSHOP scores and for the gene and disease baselines, over all validation sets and both the GeneRIF and gene2pubmed reference sets.
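
The Overall rank column is consistent with ordering the methods by Mean test rank, where a lower mean test rank is better. The source does not state how the column was derived, so the following Python sketch is only an assumption that happens to reproduce the published ranks. The function name `overall_ranks` is hypothetical, and the values are copied from the table.

```python
# Mean test rank per scoring method, copied from Table 6.
mean_test_rank = {
    "Cosine distance of term frequency-inverse document frequency": 15.03,
    "Cosine distance of P-values": 87.25,
    "Cosine distance of term fractions": 20.21,
    "Sum of the log of combined P-values": 18.88,
    "Sum of the differences of log P-values": 26.97,
    "L2 of log-p of overlapping terms only": 12.06,
    "L2 of term fractions of overlapping terms only": 86.70,
    "L2 of log of P-values": 28.05,
    "L2 of P-values": 29.62,
    "L2 of term fractions": 20.39,
    "L2 of term frequency": 28.31,
    "Term coverage": 27.14,
    "Term overlap": 26.17,
    "Number of gene MeSH terms": 38.69,
    "Number of disease MeSH terms": 27.87,
    "Gene ID": 58.78,
}


def overall_ranks(scores):
    """Assumed derivation: rank 1 goes to the lowest mean test rank."""
    ordered = sorted(scores, key=scores.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


ranks = overall_ranks(mean_test_rank)

# Print the methods from best to worst under this assumed ordering.
for name, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(f"{rank:2d}  {mean_test_rank[name]:6.2f}  {name}")
```

Under this ordering, the sketch assigns every method the same rank as the Overall rank column in the table; there are no ties in the mean test ranks, so no tie-breaking rule is needed.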