
Table 2 Summary of the results from the simulated dataset mixing cases and controls

From: Rule-based induction method for haplotype comparison and identification of candidate disease loci

Number of cases/number of controls in case dataset    1/10   2/9   3/8   4/7   5/6   6/5   7/4   8/3   9/2  10/1
Percentage of datasets that had haplotypes (%)           0     0     0     0    12    97    99    99    99   100
Percentage of results that included the mutation (%)     0     0     0     0     8    99    98    98   100    99
Percentage of mutation loci in the top hit (%)           0     0     0     0   100    67    53    43    35    32
Mean length of haplotypes (number of SNPs)               0     0     0     0   163   443   428   397   377   355
Controls and cases were mixed into the same case dataset at different ratios and then analyzed using a 100-marker window, requiring six cases and three controls to share the shared haplotype (SH). When the case dataset contained seven or more controls, no haplotypes were found. When cases outnumbered controls, mutated haplotypes were identified with high frequency. In the borderline case of five cases and six controls, Haplous discovered SHs in 12% of the datasets; only one of these contained the mutation, but it was the top hit in that dataset. The percentage of datasets that included any haplotypes, the percentage that included the mutated haplotype, and the percentage in which the mutated haplotype was the top hit were each calculated from the 100 simulated datasets, as was the mean haplotype length in number of SNPs.
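The arithmetic behind the four summary rows can be sketched as follows. This is a minimal illustration, not the authors' code: the `DatasetResult` record, its field names, and the toy input are all hypothetical. It also assumes nested denominators (mutation percentage taken over datasets that yielded any SH, top-hit percentage taken over datasets whose results included the mutation), which is one reading consistent with the 5/6 column, where 1 of 12 datasets is about 8% and 1 of 1 is 100%.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetResult:
    # Hypothetical per-dataset record; not an actual Haplous output format.
    haplotype_lengths: List[int]   # length in SNPs of each reported SH (empty if none)
    mutation_rank: Optional[int]   # 1-based rank of the SH carrying the mutation, None if absent


def pct(part: int, whole: int) -> float:
    """Percentage of part in whole, defined as 0 when whole is 0."""
    return 100.0 * part / whole if whole else 0.0


def summarize(results: List[DatasetResult]) -> dict:
    # Assumed nesting: all datasets -> those with any SH -> those with the mutation.
    with_sh = [r for r in results if r.haplotype_lengths]
    with_mutation = [r for r in with_sh if r.mutation_rank is not None]
    top_hit = [r for r in with_mutation if r.mutation_rank == 1]
    lengths = [n for r in with_sh for n in r.haplotype_lengths]
    return {
        "pct_with_haplotypes": pct(len(with_sh), len(results)),
        "pct_with_mutation": pct(len(with_mutation), len(with_sh)),
        "pct_mutation_top_hit": pct(len(top_hit), len(with_mutation)),
        "mean_length_snps": sum(lengths) / len(lengths) if lengths else 0.0,
    }


# Toy input shaped like the 5/6 column: 100 datasets, 12 with an SH,
# one of which carries the mutation as its top hit. Lengths are made up.
toy = (
    [DatasetResult([150], 1)]
    + [DatasetResult([150], None) for _ in range(11)]
    + [DatasetResult([], None) for _ in range(88)]
)
summary = summarize(toy)
print(summary)
```

With this toy input the first three values round to 12, 8 and 100, matching the pattern of the 5/6 column; the mean length simply reflects the made-up lengths and is not meant to reproduce the reported 163 SNPs.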