Rare predicted loss-of-function variants of type I IFN immunity genes are associated with life-threatening COVID-19

Background We previously reported that impaired type I IFN activity, due to inborn errors of TLR3- and TLR7-dependent type I interferon (IFN) immunity or to autoantibodies against type I IFN, account for 15–20% of cases of life-threatening COVID-19 in unvaccinated patients. Therefore, the determinants of life-threatening COVID-19 remain to be identified in ~ 80% of cases. Methods We report here a genome-wide rare variant burden association analysis in 3269 unvaccinated patients with life-threatening COVID-19, and 1373 unvaccinated SARS-CoV-2-infected individuals without pneumonia. Among the 928 patients tested for autoantibodies against type I IFN, a quarter (234) were positive and were excluded. Results No gene reached genome-wide significance. Under a recessive model, the most significant gene with at-risk variants was TLR7, with an OR of 27.68 (95%CI 1.5–528.7, P = 1.1 × 10−4) for biochemically loss-of-function (bLOF) variants. We replicated the enrichment in rare predicted LOF (pLOF) variants at 13 influenza susceptibility loci involved in TLR3-dependent type I IFN immunity (OR = 3.70[95%CI 1.3–8.2], P = 2.1 × 10−4). This enrichment was further strengthened by (1) adding the recently reported TYK2 and TLR7 COVID-19 loci, particularly under a recessive model (OR = 19.65[95%CI 2.1–2635.4], P = 3.4 × 10−3), and (2) considering as pLOF branchpoint variants with potentially strong impacts on splicing among the 15 loci (OR = 4.40[9%CI 2.3–8.4], P = 7.7 × 10−8). Finally, the patients with pLOF/bLOF variants at these 15 loci were significantly younger (mean age [SD] = 43.3 [20.3] years) than the other patients (56.0 [17.3] years; P = 1.68 × 10−5). Conclusions Rare variants of TLR3- and TLR7-dependent type I IFN immunity genes can underlie life-threatening COVID-19, particularly with recessive inheritance, in patients under 60 years old. Supplementary Information The online version contains supplementary material available at 10.1186/s13073-023-01173-8.


Background
Clinical variability is high in unvaccinated individuals infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), ranging from silent infection to lethal disease.In ~ 3% of cases, infection leads to critical COVID-19 pneumonia, requiring high-flow oxygen (O 2 > 6 L/min), mechanical ventilation (non-invasive or by intubation), or extracorporeal membrane oxygenation (ECMO) [1].Advanced age is by far the strongest predictor of COVID-19 severity, with the risk of death doubling every 5 years of age from childhood onward [2,3].Men are also at greater risk of death than women [3][4][5].Genome-wide (GW) association studies have identified several common loci associated with COVID-19 severity, the most significant being a region on chromosome 3p21.31that was introduced by archaic introgression from Neanderthals [6][7][8][9][10].The risk haplotype encompasses six genes (SLC6A20, LZTFL1, CCR9, FYCO1, CXCR6, and XCR1) and confers an estimated OR per copy of between 1.6 and 2.1, with higher values for individuals under 60 years old [7,11].Twentyfour GW regions have been shown to be significantly associated with critical COVID-19 [10][11][12].Four of these regions encompass genes involved in type I IFN immunity.The first, on chr12q24.13,containing protective variants, is also a Neanderthal haplotype [13] and includes the OAS1, OAS2, and OAS3 cluster, these interferon-stimulated genes (ISGs) being required for the activation of antiviral RNaseL.The second, a region on chr21q22.1,includes IFNAR2.The third, a region on chr19p13.2,includes TYK2.The fourth, a region on chr9p21, includes IFNA10.However, common variants have a modest effect size and explain only a very small fraction of the clinical variability [6,8].This prompted us to search for rare variants conferring a stronger predisposition to life-threatening COVID-19.
Through a candidate approach focusing on influenza susceptibility genes, the COVID Human Genetics Effort (CHGE [14]) provided proof-of-concept that autosomal inborn errors of TLR3-dependent and -independent type I interferon (IFN) immunity, including autosomal recessive (AR) deficiencies of IFNAR1 or IRF7, can underlie critical COVID-19 [15].Other children with AR IFNAR1, IFNAR2, TBK1, or STAT2 deficiency were subsequently reported, as well as children with AR TYK2 deficiency [16][17][18][19][20] (Fig. 1).Some other groups were unable to replicate these findings, but the variants were not tested biochemically and it is unclear whether recessive defects were considered [11,[21][22][23].There may also be other reasons for their findings [1,24], the most important being the age distribution of the case cohorts.The other case cohorts were much older than ours (mean age of 66 vs. 52 years) and we found that inborn errors of immunity (IEI) were more frequent in patients under 60 years old [25].Consistently, we recently reported that ~ 10% of children with moderate, severe, or critical COVID-19 pneumonia had recessive inborn errors of type I IFN immunity [19].Moreover, older patients are more likely to carry pre-existing autoantibodies (auto-Abs) neutralizing type I IFN, which are found in about 15% of critical cases and up to 21% of patients over the age of 80 years [26,27].The presence of such auto-Abs has been replicated by at least 26 studies worldwide [28,29], and we also recently showed that autoimmunity to type I IFNs is a strong common predictor of COVID-19 death in unvaccinated individuals, providing further evidence for the role of type I IFN immunity in life-threatening COVID-19 [29].
Using an unbiased X-wide gene burden test, we also identified X-linked recessive (XR) TLR7 deficiency in 17 male patients aged 7-71 years with critical COVID-19 pneumonia, accounting for ~ 1% of cases in men (Fig. 1) [30].Moreover, six of the 11 TLR7 variants previously reported in patients from other studies were deleterious  .SARS-CoV-2 infection can induce type I IFN production in a TLR3-dependent manner in tissue-resident RECs (which express TLR3 but not TLR7) and in a TLR7-dependent manner in circulating pDCs (which express TLR7 but not TLR3).IRF7 is constitutively expressed in pDCs, at higher levels than in other cell types, whereas it is mostly induced by viral infection in RECs.Reported in red are the 13 genes (IFNAR1, IFNAR2, IRF3, IRF7, IRF9, IKBKG, STAT1, STAT2, TBK1, TICAM1, TLR3, TRAF3, and UNC93B1) investigated in a previous study [15]; TYK2 and TLR7 were subsequently shown to underlie severe COVID-19 [19,30] (carried by nine of 16 patients) [31][32][33][34][35][36], whereas the TLR7 variants in other studies were not disclosed [21,22].TLR3 senses viral dsRNA in respiratory epithelial cells, whereas TLR7 senses ssRNA in plasmacytoid dendritic cells [25,28].Both pathways induce the production of type I IFNs.TLR7 gain-of-function variants were recently shown to be associated with human systemic lupus erythematosus [37], providing an example of mirror genetic effects between infectious and inflammatory/autoimmune diseases [38].Collectively, these findings suggest that type I IFNs are essential for protective immunity to SARS-CoV-2 in the respiratory tract, with insufficient type I IFN activity accounting for up to 15-20% of cases of life-threatening COVID-19.Despite this high proportion, the determinants of critical COVID-19 pneumonia remain to be identified in ~ 80% of cases.Here, we tested the hypotheses that other IEI may underlie critical COVID-19 pneumonia in at least some patients and that our initial findings could be replicated in a new cohort.With the CHGE, we performed a GW gene-based rare variant association analysis.This analysis was performed in both previously investigated patients who had not been screened at the GW level [15,19,30], and in newly recruited patients.We also tested the hypothesis that we could replicate our initial finding of an enrichment in pLOF variants of candidate type I IFN-related genes in newly recruited patients, given the controversy from other groups.We extended the analysis to two other type I IFN-related genes, TLR7 and TYK2, that we had recently found to be associated with critical COVID-19 [19,30], and to branchpoint (BP) variants with a potentially strong impact on the splicing of the 15 type I IFN-related genes [39].Finally, we refined the analysis of the type I IFN-related genes by taking age, sex, and zygosity into account.

Cohort
Since the beginning of the pandemic, we have enrolled more than 9000 individuals with SARS-CoV-2 infection and broad clinical manifestations from all over the world through the COVID Human Genetic Effort (CHGE).In this study, we focused on 3503 patients with life-threatening COVID-19 and 1373 individuals with asymptomatic/mild infection.Life-threatening COVID-19 cases were defined as patients with pneumonia who developed critical disease, whether pulmonary with high-flow oxygen (> 6 L/min) or mechanical ventilation [continuous positive airway pressure (CPAP), bilevel positive airway pressure (BIPAP), and intubation], septic shock, or any other type of organ damage requiring intensive care unit admission.We screened for the presence of autoantibodies (auto-Abs) against type I IFNs in all patients for whom plasma was available (N = 928), as previously described [26,27], and we excluded 234 patients who tested positive for auto-Abs as they already have a major risk factor for developing critical COVID-19 [29].In total, 3269 patients with life-threatening COVID-19 were included in the analysis.Among those 3269 patients, 1301 had been included in previous studies restricted to a short list of 18 candidate genes [15,19] or to the X chromosome [30], and 1968 had not been studied before.Controls were defined as individuals infected with SARS-CoV-2 who remained asymptomatic or pauci-symptomatic, with the presence of mild, self-healing, ambulatory disease (N = 1373).The presence of infection was assessed on the basis of a positive PCR test and/or serological test and/or the presence of typical symptoms such as anosmia or agueusia after exposure to a confirmed COVID-19 case.Whole-exome (N = 2003 cases and 866 controls) or whole-genome (N = 1266 cases and 507 controls) sequencing was performed for the cases and controls, and high-quality variants were obtained from the sequencing data as detailed in the Additional file 1: Supplementary Methods.

Population stratification
Principal component analysis (PCA) was performed with PLINK v1.9 software [40] on a pruned subset of ~ 14,600 SNPs not in linkage disequilibrium (maximum r2 value for linkage disequilibrium 0.4 between pairs of SNPs) with a minor allele frequency (MAF) > 1%, call rate > 99%, and P value for departure from Hardy-Weinberg equilibrium > 10 −5 , as previously described [41].Ethnic origin was inferred from the PCA as previously described [41].

Variant selection
For each gene, we considered several sets of candidate coding variants, defined according to (i) functional annotation: predicted loss-of-function (pLOF) variants only (including stop gain/lost, start lost, frameshift, or splice variants), or pLOF with missense and in-frame variants (MISSLOF); (ii) the gnomAD v2.1 allele frequency (AF): variants with a gnomAD allele frequency below 1%, 0.1%, or 0.01%; and (iii) Combined Annotation Dependent Depletion (CADD) score [42] for missense and in-frame variants: CADD score ≥ mutation significance cut-off (MSC) for the corresponding gene [43] or all variants regardless of the CADD score.We considered nine sets of variants in total: (1)

Rare variant burden analysis
We performed a genome-wide gene-based rare variants burden analysis.For each gene, the genotypic information for candidate rare variants was summarized into a genetic score defined according to three genetic models: (1) co-dominant: samples were coded 2 if they carried at least one biallelic variant, 1 if they carried at least one monoallelic variant, and 0 otherwise; (2) heterozygous: samples were coded 1 if they carried at least one monoallelic variant and 0 otherwise; and (3) recessive: samples were coded 1 if they carried at least one biallelic variant and 0 otherwise.For the X chromosome, hemizygous males are considered to be equivalent to homozygous females.The association between the genetic score for each gene and the disease status was assessed with a logistic regression-based likelihood ratio test (LRT) from EPACTS (Efficient and Parallelizable Association Container Toolbox) [44] for the genome-wide burden analysis or R 3.6.0[45] for the candidate type I IFN-related pathway.Firth's bias correction, with the fast.logistf.fitfunction of EPACTS or the logistf function of the R logistf package [46], was applied if the P value of the LRT was below 0.05.Analyses were adjusted for sex, age (in years), and the first five PCs of the PCA In Firth's regression, a penalty term is assigned to the standard maximum likelihood function used to estimate the parameters of a logistic regression model when there are rare events or when complete separation exists [47].With no covariates, this corresponds to adding 0.5 to every cell of a 2 by 2 table of allele counts versus case-control status.For a given gene and variant set, the burden test was not performed if the number of carriers across all samples was below 3.
We used three analysis strategies: (1) joint analysis of all samples; (2) trans-ethnic meta-analysis: the analysis was stratified according to 7 inferred ancestry subgroups (African, North African, European, admixed American, Middle Eastern, South Asian, East Asian).For each subgroup, an ethnicity specific PCA was performed and used in the logistic regression model; and (3) transpipeline meta-analysis to account for heterogeneity due to the type of sequencing data: the analysis was stratified according to the type of data shared (FASTQ vs. VCF).Subgroup P values were subjected to further meta-analysis, accounting for the direction of the effect and sample size, with METAL [48].

Correction for multiple testing
For each gene, up to 9 burden tests were performed per genetic model.These tests were not independent; we therefore assessed the effective number of burden tests Meff with a method adapted from that described by Patin et al. [49], based on the approach of Li and Ji [50].This approach makes use of the variance of the eigenvalues of the observed statistics correlation matrix to estimate Meff.The Bonferroni-corrected threshold was then defined as 0.05/Meff.

Odds ratio (OR) equality for homozygous/hemizygous versus heterozygous carriers of pLOF variants at type I IFN genes
We investigated whether the odds of critical COVID-19 differed for carriers and non-carriers of pLOF variants at the type I IFN immunity loci as a function of zygosity (homozygous/hemizygous vs heterozygous).In the full sample, we used LRT to compare a full Firth bias-corrected logistic regression model including two different parameters for carriers of pLOF as a function of zygosity (alternative hypothesis) with a Firth bias-corrected logistic regression model including only one parameter for carriers of pLOF, not taking zygosity into account (null hypothesis).The analysis was performed with the R logistf package.

Cohort description
Through the CHGE, we collected whole-exome sequencing (WES) or whole-genome sequencing (WGS) data for 3503 patients with life-threatening COVID-19 pneumonia (hereafter referred to as "patients"; see Supplemental Methods) and 1373 individuals with mild or asymptomatic infection, i.e., without pneumonia (hereafter referred to as "controls").In total, 928 of the 3503 patients were screened for the presence of auto-Abs against type I IFN [26,27] (Supplemental Methods) and the 234 patients who tested positive were excluded from this analysis as they already have a major risk factor for the development of critical COVID-19 [29].In total, 1301 of the 3269 remaining patients had been included in previous studies restricted to a short list of 18 candidate genes [15,19] or to the X chromosome [30], and 1968 had not been studied before.The mean age (SD) of the patients was 55.7 (17.4) years, with a male-to-female ratio of 2.4 (Table 1).The controls were significantly younger than the patients (P < 0.0001), with a mean age (SD) of 43.8 years (20.1 years) and were more likely to be female (P < 0.0001; male-to-female ratio = 0.7).The patients and controls were of various ethnic origins, mostly of European and Middle Eastern ancestry, according to principal component analysis (PCA) (Fig. 2).Raw sequencing data were either centralized in the HGID laboratory and processed with the HGID pipeline (2492 cases and 870 controls) or processed separately by each sequencing hub (777 cases and 503 controls; See Supplemental Methods).A joint analysis was performed first on the combined sample of 3269 patients and 1373 controls.Given the heterogeneity of the cohort due to different ancestries and processing pipelines, we also performed a trans-ethnic and a trans-pipeline meta-analysis; only results consistent across the three analyses are reported here (See Supplemental Methods).

Genome-wide analysis under a co-dominant model
We first performed a GW rare variant burden analysis on the 3269 patients with life-threatening COVID-19 and 1373 controls with asymptomatic/mild COVID-19 under a co-dominant model, using nine sets of variants (See Supplemental Methods).The QQ plots for the joint analysis of the samples revealed no systematic deviations from the null hypothesis, and the genomic inflation factors (λ) were close to 1 (Additional file 2: Table S1).In total, 18,064 genes were analyzed with at least one of the nine variant sets, resulting in an effective number of independent tests (Meff ) for the joint analysis of 108,384, giving a Bonferroni-corrected significance threshold of 4.61 × 10 −7 .No gene was found to be of GW significance (see the Manhattan plot in Fig. 3A, Additional file 2: Table S2).The gene with the strongest association was TREH, encoding the trehalase enzyme, which hydrolyses trehalose, with rare (gnomAD allele frequency [AF] < 10 −4 ) nonsynonymous variants associated with a lower risk of lifethreatening COVID-19 (OR = 0.12[95% CI 0.05-0.28],P = 1.9 × 10 −6 ; Additional file 2: Table S3).In analyses of genes for which rare predicted loss-of-function (pLOF) variants were associated with an increase in the risk of life-threatening COVID-19 (Table 2), the strongest association was that for NPC2, for rare (gnomAD AF < 0.01) pLOF variants, with 28 heterozygous carriers among patients (0.9%), and four heterozygous carriers (0.3%) among controls (OR = 5.41 [95% CI 1.8-16.4],P = 5.8 × 10 −4 ).NPC2 encodes the Niemann-Pick disease type C2 protein and homozygous LOF mutations of this gene cause Niemann-Pick disease [51].NPC2 interacts with NPC1, which is also an essential endosomal receptor for the Ebola virus [52,53].Both NPC1 and NPC2 were implicated in the regulation of SARS-CoV-2 entry in a CRISPR screen [54].The GW burden analysis under a dominant model yielded similar conclusions (Additional file 2: Table S3).

Genome-wide analysis under a recessive model
We then performed a GW screen under a recessive model (autosomal and X-linked).In total, 4511 genes were analyzed with at least one of the nine variant sets, resulting in 27,066 independent tests, giving a Bonferroni-corrected significance threshold of 1.85 × 10 −6 .No gene reached GW significance (Fig. 3B).In analyses of genes with rare variants increasing the risk of lifethreatening COVID-19, TLR7 was, by two orders of magnitude, the most significant gene, with 51 carriers (1.6%) of at least one rare (gnomAD AF < 0.01) missense or pLOF variant in patients versus two carriers (0.1%) in controls (OR = 8.41[95% CI 1.9-35.5],P = 8.95 × 10 −5 ) (Table 3).Most of the carriers were male, with only one carrier among the patients and one among the controls being female.The variants carried by the two controls were previously shown to be biochemically neutral [19,30] (Additional file 2: Table S4).The 51 cases carried 33 different variants, 13 of which had been shown to be neutral; 16 were previously shown to be hypomorphic or amorphic [19,30], and four were previously unknown.
The four new variants were tested: one was found to be neutral and the other three were deleterious (Additional file 1: Fig S1).Restricting the analysis to biochemically proven LOF variants (bLOF) decreased the number of carriers (20 cases vs. 0 controls), but the association signal remained highly significant, with a much higher odds ratio (OR = 27.68 [95% CI 1.5-528.7],P = 1.08 × 10 −4 ) (Table 3).These findings confirm that TLR7 is a critical COVID-19 susceptibility locus, responsible for 0.9% of critical cases in male patients.

Genome-wide gene-based analysis including common variants
Published GWAS identified a number of common variants associated with severe COVID-19 pneumonia [8,10,12,55].We then assessed the combined effect of common and rare candidate coding variants at the gene level, in a weighted burden approach [56], as detailed in the Supplemental Methods.Briefly, for each individual, we calculated a genetic score by summing the number of minor alleles for each variant and weighting this sum by the frequency of the variant [57].We then tested the association between this genetic score and case-control status in a logistic regression framework.As described above, we focused on pLOF only, pLOF and in-frame variants with CADD > MSC, or pLOF and in-frame variants without filtering on CADD score (Additional file 2: Table S5).As in the analysis focusing on rare variants only, no gene reached genome-wide significance after correction for multiple testing in this analysis considering both rare and common variants.The top-ranked  Combined Annotation Dependent Depletion (CADD) score [42] greater than the Mutation Significance Cut-off (MSC) for the corresponding gene.The MSC is defined for a given gene as the lower limit of the confidence interval (95%) of the CADD score of all its known pathogenic mutations [43] Chr gene, with consistent results across the joint analysis and the trans-ethnic and trans-pipeline meta-analyses, was TREH, with a protective effect against life-threatening COVID-19 of pLOF or nonsynonymous variants with a CADD score greater than the MSC (OR = 0.85 [95%CI 0.78-0.91],P = 3.6 × 10 −6 ).Finally, we analyzed 20 candidate genes identified by GWAS for critical pneumonia in more detail [8,10,12,55].No significant association was detected for any of these genes (Additional file 2: Table S6), even with a relaxed Bonferroni threshold of 2.5 × 10 −3 , accounting for the number of GWAS genes.

Enrichment in rare pLOF variants at 13 type I IFN-related influenza susceptibility loci
Following on from our initial analysis [15], we also performed a candidate pathway enrichment analysis focusing on the 13 genes involved in Toll-like receptor 3 (TLR3)-and interferon regulatory factor 7 (IRF7)dependent type I IFN immunity to influenza virus (IFNAR1, IFNAR2, IRF3, IRF7, IRF9, IKBKG, STAT1, STAT2, TBK1, TICAM1, TLR3, TRAF3, and UNC93B1) (Fig. 1).We confirmed the significant enrichment in rare (gnomAD AF < 10 −3 ) pLOF variants at the 13 loci in patients with critical COVID-19, with 34 carriers among patients versus six among controls (OR = 3.70 [95% CI 1.7-9.5],P = 2.1 × 10 −4 under a co-dominant model; Table 4).We also estimated this p value in a simulation study taking 13 loci randomly selected from a set of genes with similar pLI and CoNeS values (see Additional file 1: Supplemental Methods); we obtained an empirical p value of 3.7 × 10 −4 .Our cohort included 551 patients and 314 controls already screened for pLOF variants of the 13 genes included in our previous study [15] (Additional file 2: Table S7).The exclusion of these 551 cases and 314 controls resulted in a similar conclusion of enrichment in rare pLOF at the 13 loci (OR = 3.21 [95% CI 1.3-8.2],P = 5.97 × 10 −3 ) formally replicating our initial association.Significant replication was also observed in the trans-ethnic (P = 0.01) and the trans-pipeline (P = 0.009) analyses.We found that 31 of the 34 carriers of pLOF variants were heterozygous, and three were homozygous: one for a frameshift variant of IRF7 described in a previous study [15], one for a previously reported deletion spanning 4394 base pairs in IFNAR1 [16,19], and one for a previously unknown deletion spanning 6624 base pairs of IFNAR1 (Additional file 2: Table S8).All the biallelic pLOF variants were found in patients.Consequently, the OR for homozygous carriers (OR = 15.79 [95%CI 1.4-2170.4],P = 0.02) was higher than that for heterozygous carriers (OR = 3.11 [95%CI 1.4-8.6],P = 5.2 × 10 −3 ), but both were significant.

Inclusion of TYK2 and TLR7 genes and branchpoint variants
Since the publication of the aforementioned study [15], AR TYK2 deficiency has been reported in children with COVID-19 pneumonia [19].We identified two patients homozygous for a rare pLOF variant of TYK2 already described in a previous study [19] and one patient and one control heterozygous for a rare pLOF variant (Additional file 2: Table S8).Adding these patients to the analysis gave very similar results under a co-dominant model (OR = 3.30[95% CI 1.6-7.8],P = 1.4 × 10 −4 ) and strengthened the evidence for association under a recessive model (OR = 19.65[95%CI 2.1-2635.4],P = 3.4 × 10 −3 ) (Table 4).An analysis of the rare pLOF variants at these 14 loci plus the bLOF variants of TLR7 revealed highly significant enrichment (OR = 3.82 [95%CI 2.0-7.2],P = 1.3 × 10 −7 under a co-dominant model).The effect was stronger for homozygous/hemizygous carriers (OR = 39.19 [95%CI 5.2-5037.01],P = 4.7 × 10 −7 ) than for heterozygous carriers (OR = 2.27 [95%CI 1.0-5.2],P = 0.04), and these two ORs were significantly different (P = 0.008).We further screened the full cohort of cases and controls for intronic branchpoint (BP) variants, which might potentially have a strong impact on splicing and be considered pLOF variants, in the 15 type I IFN-related genes, with our new tool BPHunter [39].We identified six branchpoint (BP) variants (Additional file 2: Table S9) carried in heterozygous state by 10 additional cases and no controls.Adding these BP variants to the analysis of the 15 type I IFN-related loci under a co-dominant model further strengthened the association signal (OR = 4.40 [2.3-8.4],P = 7.7 × 10 −8 ) (Table 4).

Age and sex stratified analysis of the 15 type I IFN-related loci
Advanced age is the strongest risk factor for life-threatening COVID-19.Male individuals are also at higher risk than female individuals.As for the main GWAS hits [58, 59], we performed an analysis stratified for age and sex for the 15 type I IFN-related loci.The analysis stratified for sex revealed a much stronger association signal in male than in female individuals, as expected given the X-linked recessive mode of inheritance of TLR7 deficiency (Additional file 2: Table S10).Nevertheless, the enrichment in rare pLOF variants at the 15 loci in female subjects remained significant under a co-dominant model (P = 0.02) and a recessive model (P = 0.05).
The addition of the BP variants strengthened the association signal in female subjects under a co-dominant model (P = 3.7 × 10 −3 ).In the analysis stratified for age, we assigned the cases to two age groups (under 60 years of age vs. 60 years and over), which we compared with all controls.We used an age cut-off of 60 years, which was close to the median age of the cases, in accordance with the analyses performed in [7,

In-frame nonsynonymous variants at the 15 loci
We further screened our cohort for rare in-frame nonsynonymous variants with a gnomAD AF < 10 −3 at these type I IFN-related susceptibility loci.For the 13 initial loci, the enrichment disappeared when in-frame nonsynonymous variants were added to pLOF variants under a co-dominant model (OR = 1.08 [95%CI 0.9-1.3],P = 0.42) (Additional file 2: Table S11), whereas a non-significant trend persisted under the recessive model (OR = 5.02 [95% CI 0.7-52.7],P = 0.06).Focusing exclusively on inframe variants decreased the strength of this trend considerably, with only eight homozygous carriers among patients and one among controls (OR = 1.14 [0.2-912.5],P = 0.68).Adding TYK2 variants led to similar conclusions (Additional file 2: Table S11).We then added TLR7 variants and considered the 15 loci together.Under a codominant model, the enrichment became non-significant when in-frame nonsynonymous variants were added (OR = 1.15 [1.0-1.4],P = 0.09), but enrichment remained significant under a recessive model (OR = 6.54 [2.4-24.8],P = 5.3 × 10 −6 ; Additional file 2: Table S11).In analyses considering only rare in-frame homozygous/hemizygous nonsynonymous variants, the effect size was smaller, but the enrichment remained significant (OR = 3.52 [1.3-13.3],P = 2.8 × 10 −3 ).In total, 41 patients carried a rare homozygous/hemizygous in-frame nonsynonymous variant at one of the 15 loci, and 16 of these variants (carried by 16 patients) were TLR7 in-frame variants already shown to be bLOF.After excluding the TLR7 bLOF variants, there was no residual significant enrichment in rare in-frame nonsynonymous variants in patients relative to controls, whatever the genetic model considered.

Discussion
In this exome-wide gene burden analysis for rare variants underlying critical COVID-19, no gene reached GW statistical significance after accounting for multiple testing.We used simulations to determine the power of our sample to detect an association at the 2.5 × 10 −6 exomewide significance threshold (Additional file 1: Fig S3 ); our sample had a power of more than 80% for detecting alleles with a carrier frequency of 5 × 10 −3 in the general population and a relative risk of critical COVID-19 of at least 6.These results are consistent with those of two previous large exome-wide studies including more than 1000 critical cases and thousands of population-based controls that found no statistically significant autosomal gene burden associations at stringent significance thresholds accounting for the number of phenotypes and variant sets analyzed [11,21].However, under a recessive model, the strongest association-albeit not statistically significant at GW level-was obtained with the X-linked TLR7 gene, for which association has consistently been reported across studies [21,22,30,32], reaching the less conservative exome-wide significance threshold of 2.5 × 10 −6 in some of these previous studies [21,22].It should be stressed that stringent correction for multiple testing, while necessary to avoid false positives, is a conservative strategy, and that the lack of formal statistical significance at a GW level does not preclude biological causality and medical significance.The burden of proof can be provided experimentally via biochemical, virological, and immunological experiments, as our previous studies of TLR7 in which we showed that biochemically deleterious TLR7 variants blunted the pDC-dependent sensing of SARS-CoV-2 and induction of type I IFN, thereby accounting for ~ 1% of critical pneumonia cases in men [30].Additional genes may be found by restricting the association analysis to variants experimentally proven to be deleterious.This analysis also confirms our previous findings of an enrichment in rare pLOF variants of 13 genes involved in TLR3-and IRF7-dependent type I IFN immunity to seasonal influenza virus in critical cases relative to controls with mild/asymptomatic infection [15].These results were strengthened by the addition of TYK2, which was recently shown to underlie severe COVID-19 [19,20], and TLR7, especially under a recessive model.We found that homozygous/hemizygous carriers of rare pLOF or bLOF variants at the 15 loci had a significantly higher risk of life-threatening COVID-19 than heterozygotes.This is consistent with the generally higher clinical penetrance of recessive than dominant IEI [1].Overall, 1.7% of the patients with life-threatening COVID-19 carried a rare pLOF or bLOF variant at one of the 15 loci, these variants being homozygous/hemizygous in 0.8% of cases.
Adding the BP variants at the 15 loci increased the proportion of carriers among patients with life-threatening COVID-19 to 2.1%.One of the STAT2 BP variants identified (2:56749159:T > A) has already been validated experimentally [39], but the effects of the five other BP variants identified require confirmation.The study of in-frame nonsynonymous variants might also increase this proportion, but would require the experimental characterization of all these variants.Indeed, in analyses restricted to rare in-frame nonsynonymous variants, we detected no significant enrichment in patients relative to controls.This result is not surprising, as we showed in a previous study [15] that less than 15% of the rare in-frame nonsynonymous variants at the 13 loci carried by cases initially studied were bLOF variants, whereas all the pLOF variants were found to be bLOF.Similar results were obtained for TLR7, with only 10 of 108 (9.2%) in-frame nonsynonymous variants observed in gnomAD being bLOF [30].This high proportion of neutral variants strongly affects the power of burden tests and highlights the need for the experimental characterization of variants.
We also showed that patients carrying rare pLOF or bLOF variants of these 15 type I IFN-related genes were significantly younger than the remaining patients (mean age [SD] in years: 43.3 [20.3] vs. 56.0[17.3] years).This was particularly true for homozygous/hemizygous carriers of rare pLOF or bLOF variants (35.2 [20.3] years), potentially accounting for the lack of replication of this finding by other studies including older patients [11,[21][22][23].Consistent with this result, we recently found that ~ 10% of children hospitalized for COVID-19 pneumonia carry recessive inborn errors of type I IFN immunity [19].In addition, older patients are more likely to carry auto-Abs against type I IFN, and unlike previous studies, we excluded patients carrying such antibodies from this analysis.None of the 234 patients with critical COVID-19 excluded from this study due to the presence of auto-Abs against type I IFN carried a rare pLOF variant of the 15 genes.Hence, samples in which the vast majority of patients are over the age of 60 years and of unknown status for auto-Abs against type I IFNs would have a much lower power to identify these rare inborn errors of type I IFN immunity.

Conclusions
Rare autosomal inborn errors of type I IFN-dependent immunity to influenza viruses can underlie critical forms of COVID-19, especially in subjects below 60 years of age, in addition to X-linked TLR7 deficiency.The search for additional rare mutations conferring a strong predisposition to life-threatening COVID-19 should focus on young patients with critical COVID-19 without auto-Abs against type I IFNs.
Additional file 1: Supplementary Methods.Fig S1 .Luciferase assay on HEK293T cells transfected with the pGL4.32 luciferase reporter construct and an expression vector for Renilla luciferase together with no vector (mock), EV, WT, or 4 TLR7 variants found in our cohort.After 24 h, transfected cells were left untreated or were treated by incubation with 1 μg/mL R848 for 24 h.These data were established from two independent experiments.The y-axis represents NF-κB transcriptional activity as a percentage of the WT.The x-axis indicates the alleles used for transfection.Fig S2 .Age distribution as boxplot and violin plot of the critical COVID-19 cases according to the carrier status of pLOF/bLOF at 15 type I IFN-related loci.Mean age of the patients for each category is shown in red.T-test was used to compare the means, showing a significant difference between non-carriers and carriers of heterozygous or homozygous/hemizygous variants (P = 2.2 × 10 −6 ) and between heterozygous carriers and homozygous/hemizygous carriers (P = 0.008).Fig S3 .Empirical power of our sample to detect an association at the 2.5 × 10 −6 exome-wide significance threshold for various relative risks and proportion of carriers of at least one disease causing variant in the general population (PD), as estimated by simulation study (N = 1000 replicates).
Additional file 2: Table S1.Number of genes tested and Genomic inflation factor for each model and variant set.Table S2.Complete results of the genome-wide burden joint analysis, trans-pipeline meta-analysis and trans-ethnic meta-analysis on rare variants.Table S3.Best results of the genome-wide burden analysis on rare variants under a co-dominant and dominant model.Table S4.TLR7 homozygous and hemizygous variants (AF < 0.01).Table S5.Results of the genome-wide burden analysis on common and rare variants under a co-dominant model.Table S6.Results of the genome-wide burden analyses for the candidate genes identified by GWAS under co-dominant model.Table S7.Characteristics of patients and controls in the full sample and according to the inclusion in the Zhang Q. et al., Science 2020 paper.Table S8.Carriers of rare pLOF/ bLOF variants in genes involved in type I IFN immunity to influenza virus.Table S9.Branchpoint variants identified by BPHunter and characteristics of the carriers.Table S10.Age and sex stratified analysis for the 15 type I IFN-related loci.Table S11.Enrichment analysis of rare variants, including missense and inframe variants, in genes involved in type I IFN immunity in the full cohort of 3269 cases and 1373 controls.Table S12.pLI and CoNeS distribution of the analyzed genes.

Fig. 3
Fig.3Manhattan plot for genome-wide burden analysis under the co-dominant (top) and recessive (bottom) models.For each gene, the negative log-transformed p value of the joint analysis for the most significant variant set under a co-dominant (top) or recessive (bottom) model is plotted.For each gene, variant sets providing inconsistent results across the joint analysis, the trans-ethnic meta-analysis, and the trans-pipeline meta-analysis (i.e., P < 0.001 in the joint analysis and P > 0.05 in the trans-ethnic or trans-pipeline meta-analysis) were discarded.The red lines represent the significance threshold after Bonferroni correction to account for the total number of independent tests (P = 4.61 × 10 −7 under a co-dominant model and 1.85 × 10 −6 under a recessive model).The names of the top-ranked genes with a joint P < 10 −4 are shown in red for rare variants associated with an increase in the risk of critical COVID-19 and in blue otherwise

Table 1
Baseline characteristics of study participantsHGID Human Genetics of Infectious Diseases a Chi-squared tests were used to compare proportions, and t tests were used to compare the mean ages

Table 2
Top results of the genome-wide burden analysis for rare pLOF variants increasing the risk of life-threatening COVID-19 under a co-dominant model Only genes with a P value ≤ 5 × 10 −3 in the joint analysis and P values < 0.05 in trans-ethnic and trans-pipeline meta-analyses are displayed

Table 3
Top results of the genome-wide burden analysis for rare variants increasing the risk of life-threatening COVID-19 under a recessive model Only genes with P values ≤ 0.01 in the joint analysis and P values < 0.05 in trans-ethnic and trans-pipeline meta-analyses are displayed AF allele frequency a