
Table 1 Performance of gene characteristics at predicting association with disease

From: Inferring novel gene-disease associations using Medical Subject Heading Over-representation Profiles

 

| Scoring method | gene2pubmed: Validation (02/2007-01/2009) | gene2pubmed: Validation (02/2007-04/2010) | gene2pubmed: CTD validation (11/2008) | GeneRIF: Validation (02/2007-01/2009) | GeneRIF: Validation (02/2007-04/2010) | GeneRIF: CTD validation (11/2008) |
|---|---|---|---|---|---|---|
| Percentage GC content | 0.50 | 0.50 | 0.51 | 0.50 | 0.50 | 0.51 |
| Number of transcripts | 0.53 | 0.53 | 0.55 | 0.51 | 0.51 | 0.53 |
| Transcript length | 0.51 | 0.52 | 0.50 | 0.52 | 0.52 | 0.53 |
| Genomic length | 0.52 | 0.52 | 0.50 | 0.51 | 0.51 | 0.52 |
| Gene ID | 0.73 | 0.71 | 0.78 | 0.64 | 0.63 | 0.69 |

  1. Characteristics were compared against the 02/2007-11/2008 validation sets using gene2pubmed and GeneRIF gene references, as well as the 11/2008 Comparative Toxicogenomics Database (CTD) validation set. Gene characteristics were extracted from EnsEMBL. Each characteristic was evaluated on how well it predicts new gene-disease relationships in the validation sets, using only the genes for which that characteristic could be mapped.
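
The table does not name its performance metric. As a minimal sketch, assuming purely for illustration that each value is an area under the ROC curve (AUC), the following shows how a single gene characteristic could be scored as a ranking predictor of disease association. Under that assumption, a value near 0.50 would mean the characteristic ranks genes no better than chance. The function `rank_auc` and the toy data are hypothetical and are not taken from the article.

```python
def rank_auc(scores, labels):
    """AUC via the Mann-Whitney U statistic.

    scores: one numeric characteristic per gene (e.g. GC content).
    labels: 1 if the gene is in the validation set for the disease, else 0.
    Returns the fraction of (positive, negative) gene pairs in which the
    positive gene has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: GC percentage for six genes and whether each gene
# appears in a validation set. These numbers are invented for illustration.
toy_gc = [41.0, 55.2, 48.3, 60.1, 39.7, 52.0]
toy_labels = [1, 0, 1, 0, 0, 1]

auc_gc = rank_auc(toy_gc, toy_labels)
print(f"AUC for toy GC content: {auc_gc:.2f}")
```

The pairwise loop is quadratic in the number of genes, which is fine for a sketch; a rank-based implementation would be preferable for genome-scale inputs.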