# Table 3 Explanation of the scoring functions evaluated

| Scoring method | Description |
| --- | --- |
| Cosine distance of term frequency-inverse document frequency | $\dfrac{\sum_{j \in M} g_i(j)\, d_i(j)}{\sqrt{\sum_{j \in M} g_i(j)^2}\,\sqrt{\sum_{j \in M} d_i(j)^2}}$ |
| Cosine distance of P-values | $\dfrac{\sum_{i \in M} g_p(i)\, d_p(i)}{\sqrt{\sum_{i \in M} g_p(i)^2}\,\sqrt{\sum_{i \in M} d_p(i)^2}}$ |
| Cosine distance of term fractions | $\dfrac{\sum_{i \in M} g_f(i)\, d_f(i)}{\sqrt{\sum_{i \in M} g_f(i)^2}\,\sqrt{\sum_{i \in M} d_f(i)^2}}$ |
| Sum of the log of combined P-values | $\sum_{i \in M} \log\bigl(g_p(i) + d_p(i) - g_p(i)\, d_p(i)\bigr)$ |
| Sum of the differences of log P-values | $\sum_{i \in M} \log \dfrac{g_p(i)}{d_p(i)} = \sum_{i \in M} \bigl(\log g_p(i) - \log d_p(i)\bigr)$ |
| L2 of log-p of overlapping terms only | $\sum_{i \in (G \cap D)} \bigl(\log g_p(i) - \log d_p(i)\bigr)^2$ |
| L2 of term fractions of overlapping terms only | $\sum_{i \in (G \cap D)} \bigl(g_f(i) - d_f(i)\bigr)^2$ |
| L2 of log of P-values | $\sum_{i \in M} \Bigl(\log \dfrac{g_p(i)}{d_p(i)}\Bigr)^2 = \sum_{i \in M} \bigl(\log g_p(i) - \log d_p(i)\bigr)^2$ |
| L2 of P-values | $\sum_{i \in M} \bigl(g_p(i) - d_p(i)\bigr)^2$ |
| L2 of term fractions | $\sum_{i \in M} \bigl(g_f(i) - d_f(i)\bigr)^2$ |
| L2 of term frequency | $\sum_{i \in M} \bigl(g(i) - d(i)\bigr)^2$ |
| Term coverage | $\lvert G \cup D \rvert$ |
| Term overlap | $\lvert G \cap D \rvert$ |
| Number of gene MeSH terms | $\lvert G \rvert$ |
| Number of disease MeSH terms | $\lvert D \rvert$ |
| Gene ID | Entrez Gene ID of the gene |
1. $M$ refers to the set of all MeSH terms; $G$ and $D$ refer to the sets of MeSH terms in the gene and disease profiles, respectively. $g(i)$, $g_f(i)$, $g_p(i)$ and $g_i(i)$ refer to the frequency, term fraction, hypergeometric P-value and term frequency-inverse document frequency of MeSH term $i$ in the gene profile. $d(i)$, $d_f(i)$, $d_p(i)$ and $d_i(i)$ refer to the same four quantities for MeSH term $i$ in the disease profile.
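
To make the definitions concrete, the following is a minimal Python sketch of a few of the scoring functions in Table 3. It is not the authors' implementation. Several things are assumptions made only for illustration: profiles are stored as dictionaries from MeSH term to value; a term missing from a profile gets frequency or fraction 0 and P-value 1; term coverage is read as $\lvert G \cup D \rvert$ and term overlap as $\lvert G \cap D \rvert$; the cosine uses the standard definition with square roots in the denominator; and the "L2" scores are computed as plain sums of squared differences, as the formulas are printed. All function names and the example terms are hypothetical.

```python
import math


def cosine(g, d):
    """Standard cosine of two sparse profiles (dict: MeSH term -> weight).

    Terms absent from a profile are treated as weight 0 (assumption).
    Returns 0.0 if either profile has zero norm.
    """
    dot = sum(w * d[t] for t, w in g.items() if t in d)
    norm_g = math.sqrt(sum(w * w for w in g.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_g == 0.0 or norm_d == 0.0:
        return 0.0
    return dot / (norm_g * norm_d)


def sum_squared_diff(g, d, terms, default=0.0):
    """Sum over `terms` of (g(i) - d(i))^2, with `default` for missing terms.

    Pass the union of terms for the "all terms" variants, or the
    intersection for the "overlapping terms only" variants.
    """
    return sum((g.get(t, default) - d.get(t, default)) ** 2 for t in terms)


def sum_log_combined_p(gp, dp, terms, default=1.0):
    """Sum over `terms` of log(g_p + d_p - g_p * d_p).

    Missing P-values default to 1 (assumption), so such terms add log(1) = 0.
    """
    total = 0.0
    for t in terms:
        a = gp.get(t, default)
        b = dp.get(t, default)
        total += math.log(a + b - a * b)
    return total


def term_overlap(g, d):
    """Number of MeSH terms shared by both profiles, |G ∩ D|."""
    return len(set(g) & set(d))


def term_coverage(g, d):
    """Number of MeSH terms in either profile, |G ∪ D| (assumed reading)."""
    return len(set(g) | set(d))


if __name__ == "__main__":
    # Hypothetical term fractions for one gene and one disease profile.
    gene_f = {"Neoplasms": 0.5, "Apoptosis": 0.5}
    disease_f = {"Neoplasms": 0.5, "Mutation": 0.5}
    all_terms = set(gene_f) | set(disease_f)
    shared_terms = set(gene_f) & set(disease_f)

    print(cosine(gene_f, disease_f))
    print(sum_squared_diff(gene_f, disease_f, all_terms))
    print(sum_squared_diff(gene_f, disease_f, shared_terms))
    print(term_overlap(gene_f, disease_f), term_coverage(gene_f, disease_f))
```

Since $g_p(i) + d_p(i) - g_p(i)\,d_p(i) = 1 - (1 - g_p(i))(1 - d_p(i))$, a term whose P-value is 1 in either profile contributes $\log 1 = 0$ to the combined score, which is why 1 is a convenient default for missing P-values in this sketch.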