Table 7 Mean average precision MeSHOP performance

From: Inferring novel gene-disease associations using Medical Subject Heading Over-representation Profiles

| Scoring method | Novel MEDLINE validation MAP (02/2007-04/2010) | Rank (MEDLINE) | Novel CTD validation MAP (11/2008-04/2010) | Rank (CTD) |
|---|---|---|---|---|
| Cosine distance of term frequency-inverse document frequency | 0.87 | 11 | 0.92 | 4 |
| Cosine distance of P-values | 0.55 | 15 | 0.66 | 15 |
| Cosine distance of term fractions | 0.87 | 12 | 0.90 | 6 |
| Sum of the log of combined P-values | 0.88 | 9 | 0.94 | 2 |
| Sum of the differences of log P-values | 0.90 | 3 | 0.79 | 9 |
| L2 of log-p of overlapping terms only | 0.94 | 1 | 0.95 | 1 |
| L2 of term fractions of overlapping terms only | 0.54 | 16 | 0.52 | 16 |
| L2 of log of P-values | 0.89 | 7 | 0.78 | 13 |
| L2 of P-values | 0.89 | 5 | 0.79 | 8 |
| L2 of term fractions | 0.90 | 2 | 0.92 | 5 |
| L2 of term frequency | 0.89 | 8 | 0.79 | 10 |
| Term coverage | 0.90 | 4 | 0.79 | 11 |
| Term overlap | 0.88 | 10 | 0.93 | 3 |
| Number of gene MeSH terms | 0.81 | 13 | 0.88 | 7 |
| Number of disease MeSH terms | 0.89 | 6 | 0.78 | 12 |
| Gene ID | 0.69 | 14 | 0.74 | 14 |

Values are the mean average precision (MAP) for the novel MEDLINE relationships (02/2007 to 04/2010) and the novel CTD relationships (11/2008 to 04/2010). In each trial, 100 positive relationships and 100 negative relationships were chosen uniformly at random, and the average precision was computed for each scoring method. The MAP reported here is the mean over 100 random trials for each validation set.
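
The sampling procedure described in this note can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names `average_precision` and `mean_average_precision`, the `seed` parameter, and the convention that a higher score means a more likely relationship are all assumptions made for the example. Some of the scoring methods in the table are distances, for which the scores would need to be negated first.

```python
import random


def average_precision(scores, labels):
    """Average precision of one ranked list.

    Assumption: a higher score means the pair is ranked earlier.
    Ties keep their input order, so inputs with tied scores should be
    shuffled beforehand to avoid favouring either class.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for position, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            # Precision at the rank of each positive item.
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def mean_average_precision(positive_scores, negative_scores,
                           n_trials=100, sample_size=100, seed=0):
    """Mean of the average precision over repeated random trials.

    Each trial draws `sample_size` positives and `sample_size` negatives
    uniformly at random without replacement, as in the table note.
    `seed` is a hypothetical parameter added for reproducibility.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        pos = rng.sample(positive_scores, sample_size)
        neg = rng.sample(negative_scores, sample_size)
        scores = pos + neg
        labels = [True] * len(pos) + [False] * len(neg)
        total += average_precision(scores, labels)
    return total / n_trials
```

Under these assumptions, a scoring method that ranks every sampled positive above every sampled negative yields a MAP of 1.0, and a method that orders the pairs at random yields roughly 0.5 when the two classes are balanced, which gives a rough baseline for reading the values in the table.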