Table 2 Diagnostic structural variants identified by GEM in the benchmarking cohort (20 out of 119 cases). GEM ranks each structural variant (SV) by the genes it harbors and, using its top-scored gene, places it in the same ranked list as genes carrying coding SNVs or small indels. An asterisk marks genes that the literature reports as candidates for the phenotype of the diagnosed disease or syndrome (as described in OMIM). The results show that GEM can analyze both deletions (Del) and duplications (Dup), ranging in size from 4 kb to entire chromosome arms, across diverse modes of inheritance and pedigree structures, and from either WGS or WES assay data. GEM also automatically identified compound heterozygotes between SVs and SNVs/indels (cases 1, 2, and 8). Input SV calls can be breakpoint-based calls (here "SV") or imprecise CNV calls based on read-depth analysis. Notably, when external SV calls are not provided, GEM can also infer SVs directly from the small-variant data and score them appropriately (cases 2, 10, 15, and 17), identifying diagnostic variants that in the original cases were found by microarrays and not by sequencing.

From: Artificial intelligence enables comprehensive genome interpretation and nomination of candidate diagnoses for rare genetic diseases

| Case no. | Top scored gene(s) | Gene rank | GEM score | Variant(s) position | SV type | Length (kb) | Mode of inheritance | Pedigree type | Assay type | SV calls in input | Diagnosis |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 252268 | FANCA* | 1 | 2.28 | chr16:89847864-89863349; FANCA: c.3788_3790delTCT | Del | 15 | Recessive | Trio | WGS | SV | Fanconi anemia |
| 223449 | TANGO2* | 1 | 2.13 | chr22:20028937-20057143; TANGO2: c.605+1G>A | Del | 28 | Recessive | Trio | WGS | None | MECRCN |
| 266523 | BTRC* | 1 | 2.05 | chr10:102941001-103430600 | Dup | 490 | Dominant | Duo | WGS | SV | Split hand/foot malformation type 3 |
| 267392 | HIRA, TBX1* | 1 | 2.05 | chr22:18893883-21562619 | Del | 2669 | Dominant | Single | WES | CNV | DiGeorge syndrome; velocardiofacial syndrome |
| 267148 | KMT2A | 1 | 1.87 | chr11:116691508-126432828; chr22:17038511-20307516 | Dup | 9741; 3269 | Dominant | Trio | WES | CNV | Emanuel syndrome |
| 253691 | HIRA, TBX1* | 1 | 1.73 | chr22:18893883-20307516 | Del | 1414 | Dominant | Single | WES | CNV | DiGeorge syndrome; velocardiofacial syndrome |
| 256943 | MAGEL2* | 1 | 1.64 | chr15:22833478-28566610 | Del | 5733 | Dominant | Single | WES | CNV | Prader-Willi syndrome |
| 254012 | NDUFS3* | 1 | 1.56 | chr11:47605229-47609177; NDUFS3: c.374G>A | Del | 4 | Recessive | Trio | WGS | SV | Leigh syndrome |
| 254728 | EPHA4 | 2 | 1.46 | chr2:220309089-224580863 | Del | 4272 | Dominant | Single | WGS | SV | Pathogenic deletion in 2q35q36.1 |
| 44671 | NPAP1 | 1 | 1.42 | chr15 tetrasomy (broken in multiple dups) | Dup | 4542; 991; 358; 158 | Dominant | Trio | WGS | None | Isodicentric chromosome 15 syndrome |
| 360547 | FREM1 | 1 | 1.33 | chr9:1-18477200 | Del | 18,437 | Dominant | Trio | WGS | SV | Chromosome 9p deletion syndrome |
| 259685 | TYROBP | 1 | 1.31 | chr19:23158251-33502767 | Dup | 10,345 | Dominant | Trio | WES | SV | Partial trisomy 19p12.q13.11 |
| 266700 | TAB2 | 1 | 1.31 | chr6:144951601-150260400 | Del | 5309 | Dominant | Trio | WGS | SV | Chromosome 6q24-q25 syndrome |
| 244102 | MAGEL2* | 1 | 1.28 | chr15:23684685-26108259 | Del | 2424 | Dominant | Single | WES | CNV | Prader-Willi syndrome |
| 204560 | JAG1* | 2 | 1.21 | chr20:10471400-13459333 | Del | 44 | Dominant | Trio | WGS | None | Alagille syndrome |
| 246146 | HCN1 | 1 | 1.20 | chr5:213101-46270700 | Dup | 44 | Dominant | Single | WGS | SV | Trisomy 5p |
| 45020 | PCDH19* | 1 | 1.15 | chrX:92925011-99669272 | Del | 6744 | X-linked dominant | Trio | WGS | None | Developmental and epileptic encephalopathy 9 |
| 248678 | FANCC* | 1 | 1.14 | chr9:97998556-98009092 | Del | 11 | Recessive | Single | WGS | SV | Fanconi anemia |
| 352726 | THRA | 1 | 1.00 | chr17:32147833-79020944 | Dup | 46,873 | Dominant | Proband | WGS | SV | Distal trisomy 17q |
| 251355 | TRIP11 | 4 | 0.58 | chr14:84783523-96907490 | Del | 12,124 | Dominant | Duo | WGS | SV | Chromosome 14q31.2q32.2 syndrome |
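
The "Length (kb)" column can be cross-checked against the "Variant(s) position" column with simple arithmetic. The sketch below is a minimal illustration only, not part of GEM: the helper name `sv_length_kb` is hypothetical, and it assumes positions are written as `chrN:start-end` and that lengths are rounded to the nearest kilobase.

```python
import re


def sv_length_kb(region: str) -> int:
    """Return the span of a 'chrN:start-end' region, rounded to the nearest kb.

    Hypothetical helper for checking the table; thousands separators in the
    coordinates (e.g. '46,270,700') are tolerated and stripped.
    """
    match = re.fullmatch(r"chr\w+:([\d,]+)-([\d,]+)", region.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end region: {region!r}")
    start = int(match.group(1).replace(",", ""))
    end = int(match.group(2).replace(",", ""))
    if end < start:
        raise ValueError(f"end precedes start in {region!r}")
    # Assumption: length is (end - start) in bases, reported in whole kb.
    return round((end - start) / 1000)


# Coordinates taken from the FANCA, BTRC, and NDUFS3 rows of the table.
for region in (
    "chr16:89847864-89863349",
    "chr10:102941001-103430600",
    "chr11:47605229-47609177",
):
    print(region, sv_length_kb(region), "kb")
```

Rows whose entry is not a single `chrN:start-end` interval (for example, the chr15 tetrasomy reported as multiple duplications) are outside the scope of this helper and would raise `ValueError`.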