Fig. 1 | Genome Medicine

Fig. 1

From: Calculating variant penetrance from family history of disease and average family size in population-scale data

Summary of the key steps within this penetrance estimation approach. Legend: Step 1: Variant frequencies (M) and weighting factors (W) are defined for a valid subset of the familial (F), sporadic (S), unaffected (U), and affected (A) states (see Table 1) to calculate the rate of one of these states, arbitrarily labelled state X, among families harbouring the pathogenic variant across those states with data provided, \(R{\left(X\right)}^{obs}\). Step 2: Eqs. (5–8) are applied to calculate \(P(familial)\), \(P(sporadic)\), \(P(unaffected)\), and \(P(affected)\) for a series of penetrance values, \(f_{i}=0,\dots ,1\), at a defined sibship size, \(N\), and with disease risk \(g\) for people not harbouring the variant. The rate of state X expected at each \(f_{i}\) among variant-harbouring families from those states represented in Step 1, \(R{\left(X\right)}_{i}^{ex}\), is calculated and stored alongside the corresponding \(f_{i}\) in a lookup table. Step 3: The lookup table is queried using \(R{\left(X\right)}^{obs}\) to identify the closest \(R{\left(X\right)}_{i}^{ex}\) value and its corresponding \(f_{i}\). Step 4: Bias in the obtained \(f_{i}\) estimate is corrected by simulating a population of families representative of the sample data, estimating the difference between true and estimated penetrance values in this population for \(f=0,\dots ,1\), and adjusting the estimated \(f_{i}\) by the error predicted by a polynomial regression model fitted to the simulated estimate errors. Optional step: Confidence intervals for \(R{\left(X\right)}^{obs}\) can be calculated from error in the estimates of \(M\) provided [48]; penetrance is then estimated as in Steps 3 and 4 for the interval bounds. All steps within this approach are comprehensively detailed in Additional File: Sect. 1.1.
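
The lookup-table logic of Steps 2 and 3 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`build_lookup`, `query_lookup`, `toy_expected_rate`) are hypothetical, and `toy_expected_rate` is a deliberately simple stand-in for Eqs. (5–8), which are not reproduced in this caption. The stand-in assumes each of \(N\) siblings is affected independently with probability \(f\) and returns the chance that at least one is affected; it ignores \(g\), the weighting factors, and the bias correction of Step 4.

```python
from typing import Callable, List, Tuple

LookupTable = List[Tuple[float, float]]


def build_lookup(expected_rate: Callable[[float], float], steps: int = 1000) -> LookupTable:
    """Step 2 (sketch): tabulate (f_i, R_i_ex) on an evenly spaced grid f_i = 0, ..., 1."""
    return [(i / steps, expected_rate(i / steps)) for i in range(steps + 1)]


def query_lookup(table: LookupTable, r_obs: float) -> float:
    """Step 3 (sketch): return the f_i whose expected rate is closest to the observed rate."""
    f_best, _ = min(table, key=lambda row: abs(row[1] - r_obs))
    return f_best


def toy_expected_rate(f: float, n: int = 2) -> float:
    """Hypothetical stand-in for Eqs. (5-8), NOT the paper's model:
    probability that at least one of n siblings is affected, each with probability f."""
    return 1.0 - (1.0 - f) ** n


table = build_lookup(toy_expected_rate, steps=1000)
f_hat = query_lookup(table, r_obs=0.75)
print(f_hat)
```

In practice the expected-rate function would be replaced by the rate derived from Eqs. (5–8) for the chosen states, sibship size \(N\), and baseline risk \(g\); the grid resolution (`steps`) is likewise an assumption of this sketch.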
