
Table 3 Summary of the results from the simulated dataset using different thresholds

From: Rule-based induction method for haplotype comparison and identification of candidate disease loci

                                Window sizes
Threshold      20    30    50   100   120   150   180   200

Percentage of results that included the mutation (%)
        3      36    80    96   100   100   100   100   100
        4      23    68    91   100   100   100   100   100
        5      20    62    88   100   100   100   100   100
        6      19    62    88   100   100   100   100   100
        7      19    62    88   100   100   100   100   100
        8      19    62    88   100   100   100   100   100
        9      19    62    88   100   100   100   100   100
       10      19    62    88   100   100   100   100    99
       11      18    62    87    99    99    99    98    99

Percentage of mutation loci in the top hit (%)
        3       8     3    18    29    30    37    39    42
        4      13     3    19    28    30    35    37    38
        5      15     3    19    29    31    34    38    41
        6      21     3    19    31    35    39    43    45
        7      21     3    23    33    39    43    48    51
        8      21     5    27    37    43    45    50    54
        9      26    11    31    45    49    54    60    64
       10      26    13    39    55    58    65    68    70
       11      33    26    52    74    80    85    87    87

Mean length of haplotypes (number of SNPs)
        3      18    49   166   470   603   788   849   912
        4      18    39   138   375   485   581   634   676
        5      18    37   129   350   416   479   532   572
        6      19    37   129   344   408   467   520   559
        7      19    37   128   344   408   467   520   559
        8      19    37   128   345   411   470   524   561
        9      19    37   128   344   405   458   508   544
       10      19    37   128   344   403   455   500   538
       11      20    37   125   334   383   430   468   491
  1. The threshold for the number of cases required to share a haplotype was varied for every window size. The rows with a threshold of six cases (shown in bold in the original table), which corresponds to the lymphoma study, were compared with the BEAGLE results. The percentage of datasets that included the mutation, the percentage in which the mutated haplotype was the top hit, and the mean haplotype length were each calculated from the 100 simulated datasets.
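
As a rough illustration of how the three summary measures in the table could be derived for one threshold and window size, the following Python sketch aggregates per-dataset outcomes. The record type `DatasetResult`, its field names, and the function `summarize` are hypothetical and do not come from the article. The sketch also assumes that the mean haplotype length is pooled over all reported haplotypes, which the footnote does not specify.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass
class DatasetResult:
    """Hypothetical outcome of one simulated dataset (names are illustrative)."""
    mutation_included: bool       # some reported haplotype contains the mutation
    mutation_is_top_hit: bool     # the top-ranked haplotype contains the mutation
    haplotype_lengths: List[int]  # lengths, in SNPs, of the reported haplotypes


def summarize(results: List[DatasetResult]) -> Tuple[float, float, float]:
    """Return (% datasets with mutation, % with mutation as top hit, mean length)."""
    n = len(results)
    if n == 0:
        raise ValueError("at least one dataset result is required")
    pct_included = 100 * sum(r.mutation_included for r in results) / n
    pct_top_hit = 100 * sum(r.mutation_is_top_hit for r in results) / n
    # Assumption: lengths are pooled across datasets before averaging.
    lengths = [length for r in results for length in r.haplotype_lengths]
    mean_length = mean(lengths) if lengths else 0.0
    return pct_included, pct_top_hit, mean_length


if __name__ == "__main__":
    demo = [
        DatasetResult(True, True, [10, 20]),
        DatasetResult(True, False, [30]),
        DatasetResult(False, False, []),
        DatasetResult(True, False, [40]),
    ]
    print(summarize(demo))
```

With 100 simulated datasets per cell, as in the table, each percentage is simply the count of datasets meeting the condition.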