Fig. 3 | Genome Medicine

Fig. 3

From: Using multi-scale genomics to associate poorly annotated genes with rare diseases

Using clades improves the performance of EvORanker phylogenetic profiling-based analysis. For each patient candidate gene list in the 109-patient exome and 900-simulated-genome datasets (300 unique genetic disorders), we compared the accuracy of the phylogenetic profiling-based algorithm when retrieving the top 50 genes coevolved with each patient candidate gene across all Eukaryotes versus (1) across all 16 clades in which the query gene has an ortholog, in addition to Eukaryotes, and (2) across only the Animalia clades (Chordata, Mammalia, Archelosauria, Ecdysozoa, Nematoda, Arthropoda, and Platyhelminthes). Performance was measured by the rank of the “true” disease-causing gene relative to the other patient candidate genes. The upper bar plot shows results for the autosomal and X-linked recessive cases in the real-exome dataset (left) and the simulated dataset (right). The lower bar plot shows results for the autosomal and X-linked dominant cases. The simulated dataset contains 181 unique recessive cases and 119 unique dominant cases, and its results pool three independent shuffles, totaling 900 simulations. The y-axis indicates the tested clades. For recessive cases, the x-axis indicates the percentage of cases in which the “true” disease gene was ranked first, within the top 3, or within the top 5 of the candidate genes. For dominant cases, the x-axis indicates the percentage of cases in which the “true” gene was ranked first or within the top 10. Overall, in both datasets, the “true” causative gene was ranked best when the coevolving genes from all clades (the 16 clades in addition to all Eukaryota) were merged.
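
The percentages plotted on the x-axis can be illustrated with a minimal Python sketch. It is not the EvORanker implementation: the function names, the gene symbols, and the toy ranks are hypothetical and chosen only to show how a "ranked first / top 3 / top 5" percentage follows from the rank of the "true" gene in each case.

```python
from typing import Dict, Sequence


def true_gene_rank(ranked_candidates: Sequence[str], true_gene: str) -> int:
    """Return the 1-based rank of the true gene in a best-first candidate list.

    Raises ValueError if the true gene is not among the candidates.
    """
    return list(ranked_candidates).index(true_gene) + 1


def top_k_percentages(ranks: Sequence[int],
                      ks: Sequence[int] = (1, 3, 5)) -> Dict[int, float]:
    """Percentage of cases whose true gene is ranked at or above each cutoff k."""
    if not ranks:
        raise ValueError("at least one case is required")
    n_cases = len(ranks)
    return {k: 100.0 * sum(r <= k for r in ranks) / n_cases for k in ks}


# Hypothetical toy input: one ranked candidate list per case (best first),
# paired with the gene assumed to be causative in that case.
cases = [
    (["GENE_A", "GENE_B", "GENE_C"], "GENE_A"),                      # rank 1
    (["GENE_D", "GENE_E", "GENE_F"], "GENE_E"),                      # rank 2
    (["GENE_G", "GENE_H", "GENE_I", "GENE_J"], "GENE_J"),            # rank 4
    (["G1", "G2", "G3", "G4", "G5", "G6", "G7"], "G7"),              # rank 7
]

ranks = [true_gene_rank(candidates, gene) for candidates, gene in cases]
recessive_style = top_k_percentages(ranks, ks=(1, 3, 5))
dominant_style = top_k_percentages(ranks, ks=(1, 10))
print(recessive_style)
print(dominant_style)
```

Passing `ks=(1, 10)` mirrors the cutoffs the caption gives for dominant cases, while the default `(1, 3, 5)` mirrors those for recessive cases.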
