
Table 3 Summary of the results from the simulated dataset using different thresholds

From: Rule-based induction method for haplotype comparison and identification of candidate disease loci

  

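| Measure | Threshold | Window size 20 | Window size 30 | Window size 50 | Window size 100 | Window size 120 | Window size 150 | Window size 180 | Window size 200 |
|---|---|---|---|---|---|---|---|---|---|
| Percentage of results that included the mutation (%) | 3 | 36 | 80 | 96 | 100 | 100 | 100 | 100 | 100 |
| | 4 | 23 | 68 | 91 | 100 | 100 | 100 | 100 | 100 |
| | 5 | 20 | 62 | 88 | 100 | 100 | 100 | 100 | 100 |
| | **6** | **19** | **62** | **88** | **100** | **100** | **100** | **100** | **100** |
| | 7 | 19 | 62 | 88 | 100 | 100 | 100 | 100 | 100 |
| | 8 | 19 | 62 | 88 | 100 | 100 | 100 | 100 | 100 |
| | 9 | 19 | 62 | 88 | 100 | 100 | 100 | 100 | 100 |
| | 10 | 19 | 62 | 88 | 100 | 100 | 100 | 100 | 99 |
| | 11 | 18 | 62 | 87 | 99 | 99 | 99 | 98 | 99 |
| Percentage of mutation loci in the top hit (%) | 3 | 8 | 3 | 18 | 29 | 30 | 37 | 39 | 42 |
| | 4 | 13 | 3 | 19 | 28 | 30 | 35 | 37 | 38 |
| | 5 | 15 | 3 | 19 | 29 | 31 | 34 | 38 | 41 |
| | **6** | **21** | **3** | **19** | **31** | **35** | **39** | **43** | **45** |
| | 7 | 21 | 3 | 23 | 33 | 39 | 43 | 48 | 51 |
| | 8 | 21 | 5 | 27 | 37 | 43 | 45 | 50 | 54 |
| | 9 | 26 | 11 | 31 | 45 | 49 | 54 | 60 | 64 |
| | 10 | 26 | 13 | 39 | 55 | 58 | 65 | 68 | 70 |
| | 11 | 33 | 26 | 52 | 74 | 80 | 85 | 87 | 87 |
| Mean length of haplotypes (number of SNPs) | 3 | 18 | 49 | 166 | 470 | 603 | 788 | 849 | 912 |
| | 4 | 18 | 39 | 138 | 375 | 485 | 581 | 634 | 676 |
| | 5 | 18 | 37 | 129 | 350 | 416 | 479 | 532 | 572 |
| | **6** | **19** | **37** | **129** | **344** | **408** | **467** | **520** | **559** |
| | 7 | 19 | 37 | 128 | 344 | 408 | 467 | 520 | 559 |
| | 8 | 19 | 37 | 128 | 345 | 411 | 470 | 524 | 561 |
| | 9 | 19 | 37 | 128 | 344 | 405 | 458 | 508 | 544 |
| | 10 | 19 | 37 | 128 | 344 | 403 | 455 | 500 | 538 |
| | 11 | 20 | 37 | 125 | 334 | 383 | 430 | 468 | 491 |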
  1. The threshold for the number of cases required to share a haplotype was varied for every window size. The results for a threshold of six cases (rows in bold), which corresponds to the lymphoma study, were compared with the BEAGLE results. The percentage of datasets that included the mutation, the percentage in which the mutated haplotype was the top hit, and the mean haplotype length were each calculated from the 100 simulated datasets.
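
As a rough illustration of how the three summary measures in the table could be derived for a single (threshold, window size) cell, the sketch below aggregates per-dataset outcomes into two percentages and a mean length. The `RunResult` record, its field names, and the toy values are hypothetical and are not taken from the study; in particular, how the study decided that a result "included the mutation" and which haplotypes entered the mean length are assumptions here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunResult:
    """Outcome of one simulated dataset (hypothetical record layout)."""
    mutation_found: bool       # some reported haplotype covers the mutation locus
    mutation_is_top_hit: bool  # the top-ranked haplotype covers the mutation locus
    haplotype_length: int      # length of the reported haplotype, in SNPs


def summarize(results):
    """Aggregate per-dataset outcomes into the three table measures."""
    n = len(results)
    if n == 0:
        raise ValueError("at least one simulated dataset is required")
    return {
        # share of datasets whose results included the mutation
        "pct_included": 100 * sum(r.mutation_found for r in results) / n,
        # share of datasets where the mutation was in the top hit
        "pct_top_hit": 100 * sum(r.mutation_is_top_hit for r in results) / n,
        # mean haplotype length across datasets, in SNPs
        "mean_length": mean(r.haplotype_length for r in results),
    }


# Toy input: four made-up datasets, NOT values from the table.
toy_results = [
    RunResult(True, True, 40),
    RunResult(True, False, 30),
    RunResult(False, False, 20),
    RunResult(True, True, 30),
]
summary = summarize(toy_results)
print(summary)
```

With 100 simulated datasets per cell, as in the table, each percentage is simply the count of datasets meeting the condition.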