Table 2 Summary of the results from the simulated dataset mixing cases and controls

From: Rule-based induction method for haplotype comparison and identification of candidate disease loci

 

| Number of cases/number of controls in case dataset | 1/10 | 2/9 | 3/8 | 4/7 | 5/6 | 6/5 | 7/4 | 8/3 | 9/2 | 10/1 |
|---|---|---|---|---|---|---|---|---|---|---|
| Percentage of datasets that had haplotypes (%) | 0 | 0 | 0 | 0 | 12 | 97 | 99 | 99 | 99 | 100 |
| Percentage of results that included the mutation (%) | 0 | 0 | 0 | 0 | 8 | 99 | 98 | 98 | 100 | 99 |
| Percentage of mutation loci in the top hit (%) | 0 | 0 | 0 | 0 | 100 | 67 | 53 | 43 | 35 | 32 |
| Mean length of haplotypes (number of SNPs) | 0 | 0 | 0 | 0 | 163 | 443 | 428 | 397 | 377 | 355 |

  1. Controls and cases were mixed into the same case dataset at different ratios and then analyzed using a 100-marker window, requiring six cases and three controls to share the haplotype (shared haplotype, SH). When the case dataset contained seven or more controls, no haplotypes were found. When the dataset contained more cases than controls, mutated haplotypes were identified with high frequency. In the borderline case of five cases and six controls, Haplous discovered SHs in 12% of datasets; one of these datasets contained the mutation, and in that dataset the mutated haplotype was the top hit. The percentage of datasets that included any haplotypes, the percentage that included the mutated haplotype, and the percentage in which the mutated haplotype was the top hit were calculated from the 100 simulated datasets, as was the mean haplotype length in number of SNPs.
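The summary rows of Table 2 can be illustrated with a minimal Python sketch. It is not the Haplous implementation: the `DatasetResult` record, the `summarize` function, and the input data are all hypothetical. The denominators are also an assumption. The sketch computes the mutation percentage over datasets that had haplotypes, and the top-hit percentage over datasets that included the mutation. That reading matches the five cases/six controls column, where 1 of 12 datasets is about 8% and 1 of 1 is 100%, but the footnote itself only says the values were calculated from the 100 simulated datasets. The haplotype lengths used here are made up.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class DatasetResult:
    """Outcome of one simulated dataset (hypothetical record layout)."""
    haplotype_lengths: List[int]   # length in SNPs of each reported haplotype; empty if none
    mutation_rank: Optional[int]   # 1-based rank of the hit covering the mutation; None if absent


def summarize(results: List[DatasetResult]) -> Dict[str, int]:
    """Compute the four summary rows under the assumed denominators."""
    with_hap = [r for r in results if r.haplotype_lengths]
    with_mut = [r for r in with_hap if r.mutation_rank is not None]
    top_hit = [r for r in with_mut if r.mutation_rank == 1]

    def pct(part: int, whole: int) -> int:
        # Report 0 when the denominator is empty, as in the 1/10 to 4/7 columns.
        return round(100 * part / whole) if whole else 0

    lengths = [n for r in with_hap for n in r.haplotype_lengths]
    return {
        "pct_datasets_with_haplotypes": pct(len(with_hap), len(results)),
        "pct_results_with_mutation": pct(len(with_mut), len(with_hap)),   # assumed denominator
        "pct_mutation_top_hit": pct(len(top_hit), len(with_mut)),         # assumed denominator
        "mean_haplotype_length": round(mean(lengths)) if lengths else 0,
    }


# Synthetic input shaped like the 5/6 column: 100 datasets, 12 with haplotypes,
# one of which contains the mutation as its top hit. Lengths are invented.
results = (
    [DatasetResult([], None) for _ in range(88)]
    + [DatasetResult([150], None) for _ in range(11)]
    + [DatasetResult([200], 1)]
)
summary = summarize(results)
print(summary)
```

With this synthetic input the first three values come out as 12, 8, and 100, mirroring the borderline column; the mean length depends entirely on the invented lengths and is not meant to reproduce the reported 163 SNPs.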