Table 2 Diagnostic structural variants identified by GEM in the benchmarking cohort (20 out of 119 cases). GEM ranks each structural variant (SV) according to the genes it harbors and presents it, under its top-scored gene, alongside other ranked genes that carry coding SNVs or small indels. An asterisk marks genes that the literature (as described in OMIM) considers candidates for the phenotype of the diagnosed disease or syndrome. The results show that GEM can analyze deletions (del) and duplications (dup) ranging from as small as 4 kb up to entire chromosome arms, across diverse modes of inheritance and pedigree structures, using either WGS or WES data. GEM also automatically identified compound heterozygotes formed by an SV and an SNV/indel (cases 1, 2, and 8; case references in this caption count table rows from the top). Input SV calls can be breakpoint-based calls (here “SV”) or imprecise CNV calls from read-depth analysis. Notably, when external SV calls are not provided, GEM can infer SVs directly from the small-variant data (cases 2, 10, 15, and 17) and score them appropriately, identifying diagnostic variants that in the original cases were found by microarrays rather than by sequencing.

From: Artificial intelligence enables comprehensive genome interpretation and nomination of candidate diagnoses for rare genetic diseases

| Case no. | Top scored gene(s) | Gene rank | GEM score | Variant(s) position | SV type | Length (kb) | Mode of inheritance | Pedigree type | Assay type | SV calls in input | Diagnosis |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 252268 | FANCA* | 1 | 2.28 | chr16:89847864-89863349; FANCA: c.3788_3790delTCT | Del | 15 | Recessive | Trio | WGS | SV | Fanconi anemia |
| 223449 | TANGO2* | 1 | 2.13 | chr22:20028937-20057143; TANGO2: c.605+1G>A | Del | 28 | Recessive | Trio | WGS | None | MECRCN |
| 266523 | BTRC* | 1 | 2.05 | chr10:102941001-103430600 | Dup | 490 | Dominant | Duo | WGS | SV | Split hand/foot malformation type 3 |
| 267392 | HIRA, TBX1* | 1 | 2.05 | chr22:18893883-21562619 | Del | 2669 | Dominant | Single | WES | CNV | DiGeorge syndrome; velocardiofacial syndrome |
| 267148 | KMT2A | 1 | 1.87 | chr11:116691508-126432828; chr22:17038511-20307516 | Dup | 9741; 3269 | Dominant | Trio | WES | CNV | Emanuel syndrome |
| 253691 | HIRA, TBX1* | 1 | 1.73 | chr22:18893883-20307516 | Del | 1414 | Dominant | Single | WES | CNV | DiGeorge syndrome; velocardiofacial syndrome |
| 256943 | MAGEL2* | 1 | 1.64 | chr15:22833478-28566610 | Del | 5733 | Dominant | Single | WES | CNV | Prader-Willi syndrome |
| 254012 | NDUFS3* | 1 | 1.56 | chr11:47605229-47609177; NDUFS3: c.374G>A | Del | 4 | Recessive | Trio | WGS | SV | Leigh syndrome |
| 254728 | EPHA4 | 2 | 1.46 | chr2:220309089-224580863 | Del | 4272 | Dominant | Single | WGS | SV | Pathogenic deletion in 2q35q36.1 |
| 44671 | NPAP1 | 1 | 1.42 | chr15 tetrasomy (broken in multiple dups) | Dup | 4542; 991; 358; 158 | Dominant | Trio | WGS | None | Isodicentric chromosome 15 syndrome |
| 360547 | FREM1 | 1 | 1.33 | chr9:1-18477200 | Del | 18437 | Dominant | Trio | WGS | SV | Chromosome 9p deletion syndrome |
| 259685 | TYROBP | 1 | 1.31 | chr19:23158251-33502767 | Dup | 10345 | Dominant | Trio | WES | SV | Partial trisomy 19p12.q13.11 |
| 266700 | TAB2 | 1 | 1.31 | chr6:144951601-150260400 | Del | 5309 | Dominant | Trio | WGS | SV | Chromosome 6q24-q25 syndrome |
| 244102 | MAGEL2* | 1 | 1.28 | chr15:23684685-26108259 | Del | 2424 | Dominant | Single | WES | CNV | Prader-Willi syndrome |
| 204560 | JAG1* | 2 | 1.21 | chr20:10471400-13459333 | Del | 44 | Dominant | Trio | WGS | None | Alagille syndrome |
| 246146 | HCN1 | 1 | 1.20 | chr5:213101-46270700 | Dup | 44 | Dominant | Single | WGS | SV | Trisomy 5p |
| 45020 | PCDH19* | 1 | 1.15 | chrX:92925011-99669272 | Del | 6744 | X-linked dominant | Trio | WGS | None | Developmental and epileptic encephalopathy 9 |
| 248678 | FANCC* | 1 | 1.14 | chr9:97998556-98009092 | Del | 11 | Recessive | Single | WGS | SV | Fanconi anemia |
| 352726 | THRA | 1 | 1.00 | chr17:32147833-79020944 | Dup | 46873 | Dominant | Proband | WGS | SV | Distal trisomy 17q |
| 251355 | TRIP11 | 4 | 0.58 | chr14:84783523-96907490 | Del | 12124 | Dominant | Duo | WGS | SV | Chromosome 14q31.2q32.2 syndrome |
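
The Length (kb) column can be related to the Variant(s) position column with a little arithmetic. The sketch below is illustrative only: the source does not state how lengths were derived, so it assumes a length is the end coordinate minus the start coordinate, rounded to the nearest kb. The function name `interval_lengths_kb` and the regular expression are hypothetical helpers, not part of GEM.

```python
import re

# Matches a genomic interval such as "chr16:89847864-89863349".
# Thousands separators inside the coordinates are tolerated.
INTERVAL = re.compile(r"(chr[0-9XYM]+):([\d,]+)-([\d,]+)")


def interval_lengths_kb(position):
    """Return the length in kb of every chr:start-end interval in a table cell.

    Assumption (not stated in the source): length = end - start,
    rounded to the nearest kb. Small-variant annotations such as
    "FANCA: c.3788_3790delTCT" do not match the pattern and are skipped.
    """
    lengths = []
    for _chrom, start, end in INTERVAL.findall(position):
        span_bp = int(end.replace(",", "")) - int(start.replace(",", ""))
        lengths.append((span_bp + 500) // 1000)  # integer round-to-nearest
    return lengths


if __name__ == "__main__":
    print(interval_lengths_kb("chr22:18893883-21562619"))  # [2669]
```

Under this assumption the computed values agree with the listed lengths for most rows. The exceptions are cases 204560 and 246146, both listed as 44 kb although their coordinates span roughly 2988 kb and 46058 kb, and case 360547, whose coordinates span 18477 kb. The table alone does not show whether the coordinates or the listed lengths are in error.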