Each cell division brings with it a risk of a new mutation. Mutations that occur after fertilization lead to the formation of distinct cell lineages, a state known as genetic mosaicism. Depending on the functional consequence of the mutation, the timing of its acquisition, and its tissue distribution, the effect of a mosaic variant on patient phenotype can range from negligible to catastrophic. Although mosaic variation has been known to cause disease for decades, high-throughput sequencing technologies with the analytical sensitivity to consistently detect variants at reduced allelic fractions have only recently emerged as routine clinical diagnostic tests. Therefore, empirical studies of the frequency of mosaicism in large patient populations are only now being performed and published. The incidence of mosaic CNVs and aneuploidy found in patients referred for microarray testing has been estimated at 0.55–1% [18, 19]. Without additional verification studies, it is challenging in routine ES analyses to distinguish real somatic variants from apparently de novo heterozygous variants with highly skewed (lower than 0.36) AAF. Therefore, we have focused here only on clinically relevant SNVs. A systematic assessment of the rate of clinically relevant mosaic variant detection in large cohorts of individuals referred for ES with heterogeneous clinical presentations still requires further investigation [13].
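As an illustration of why a skewed AAF alone is difficult to interpret, the following minimal sketch asks how surprising an observed alternate read count would be if the variant were truly heterozygous (expected AAF of 0.5). It is not the method used in this study; the function names and the choice of an exact two-sided binomial test are assumptions made for illustration, and the model ignores reference bias, capture bias, and mapping artifacts.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k alternate reads out of n reads when the true AAF is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binomial_p(alt_reads, depth, expected_aaf=0.5):
    """Exact two-sided p-value for the observed alternate read count.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one under the hypothesis of a constitutional heterozygous variant.
    Hypothetical helper, intended for typical exome depths (tens to hundreds of reads).
    """
    observed = binom_pmf(alt_reads, depth, expected_aaf)
    cutoff = observed * (1 + 1e-9)  # small tolerance for floating-point ties
    total = 0.0
    for k in range(depth + 1):
        pk = binom_pmf(k, depth, expected_aaf)
        if pk <= cutoff:
            total += pk
    return min(1.0, total)


if __name__ == "__main__":
    # Illustrative read counts only; these are not data from the cohort.
    for alt, depth in [(50, 100), (36, 100), (20, 100)]:
        print(alt, depth, alt / depth, two_sided_binomial_p(alt, depth))
```

A small p-value indicates only that the read counts are unlikely under simple heterozygosity; it does not by itself establish mosaicism, which is why orthogonal verification remains necessary.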
We endeavored to study the frequency, type, allelic fraction, and phenotypic consequences of reportable mosaic SNVs in a cohort of nearly 12,000 consecutive unrelated patients referred for clinical ES. A total of 120 mosaic variants in 107 established disease genes were detected and reported in either proband (n = 80) or parental (n = 39)/grandparental (n = 1) samples. Mosaic variation was considered definitely or possibly contributory to disease in approximately 1% of 11,992 subjects in this study. Assuming a molecular diagnosis was ascertained in 25% of patients in this cohort [14], an estimated 1.5% of all molecular diagnoses could be attributed to a mosaic variant detected in the proband samples. The fact that these estimates are low relative to other published cohorts was anticipated, as existing reports have studied mosaicism in specific genes [9, 20] or phenotypes [10, 11, 21], and/or have assessed the frequency of rare mosaic variants [11] but not specifically clinically reportable variants.
To assess the phenotypic effects of mosaicism in our cohort, we analyzed the provided clinical information and compared the phenotype of each patient to descriptions in the literature and/or in Online Mendelian Inheritance in Man (OMIM) of individuals with predominantly non-mosaic mutations. In the vast majority of probands with mosaic P/LP variants in AD/X-linked/somatic genes and no confounding factors (e.g., presence of multiple mosaic variants, underlying structural variation), the clinical presentation was not appreciably diminished in severity. In contrast, among parents with mosaic variants, only two (82M-Mo, 120F-Fa) were reported to have a phenotype that could be attributed to the identified mosaic mutation. Excluding mosaic variants detected in X-linked genes in males, a comparison of the AAF of mosaic variants in parental samples (14.6% ± 8.0%) relative to proband samples (20.0% ± 9.8%) showed that unaffected parents with mosaic variants have a significantly lower AAF (p = 0.004, t-test). It is intriguing that a difference in mean AAF of only ~5 percentage points distinguishes mosaic variants associated with mild or absent phenotypes from those that cause clinically significant manifestations. One explanation would be that the impact of any given postzygotic variant is likely to be dependent on the biological function of the gene and the distribution of the mutation in critical tissues. This notion is supported by the mosaic variants found in MTOR, PIK3CA, and CACNA1A in our study. Mosaic variants in MTOR and PIK3CA with AAFs ranging from 12.7 to 24.4% were detected in affected probands with Smith-Kingsmore syndrome [MIM: 616638], Cowden syndrome 5 [MIM: 615108], and/or megalencephaly-capillary malformation-polymicrogyria syndrome [MIM: 602501]. Conversely, mosaic variants in CACNA1A with similar AAFs ranging from 15.7 to 29.5% were all detected in asymptomatic parents.
The contrasting severity of phenotypes seen in probands versus clinically unaffected parents highlights the challenge of predicting phenotypic outcomes based on genetic testing alone. It also raises the question of how variant mosaicism should be weighed in the course of variant classification given that both pathogenic and benign effects are possible depending on the clinical context in which the variant is detected.
Interestingly, recurrent mosaic variants in a subset of nine genes (MTOR, CREBBP, CACNA1A, DDX3X, DNM1, DYRK1A, GRIA3, KMT2D, and PIK3CA) accounted for 18.3% (22/120) of all detected mosaic variants in the analyzed cohort. Mosaic variants in several of these genes have been reported previously in the literature: MTOR [11], CREBBP [22], CACNA1A [23], DNM1 [24], KMT2D [25], and PIK3CA [26]. In some cases, e.g., the MTOR and PIK3CA genes, somatic variants are the predominant or the only form of disease-causing mutation described in affected individuals. We have also noted that 10 (12.5%) of the 80 de novo mosaic variants detected in the proband samples were found in a gene associated with the Ras or PI3K-AKT-mTOR pathway, including one variant each in BRAF, NF1, HRAS, and KRAS, and three variants each in PIK3CA and MTOR. Heterozygous variants in the same six genes were reported in less than 1% of the entire cohort, indicating that mosaic variation is disproportionately likely to affect this pathway. In fact, mosaic events in this pathway have been commonly observed [27]. The reason for enrichment of mosaicism in the Ras or PI3K-AKT-mTOR signaling pathway is unclear; possible explanations include (1) preferential expansion of hematologic clones with variants in these genes, increasing the likelihood of mosaic variant detection, (2) high penetrance of mosaic variants in Ras pathway genes relative to other genes, and (3) a preponderance of intragenic mutation-prone residues.
The recognition that certain genes are more prone to pathogenic postzygotic mutation critically informs recurrence risk counseling and enables optimization of test development and data interpretation in the diagnostic lab setting. Panel-based tests targeting genes with recurrent mosaic variants should have sufficient depth of coverage and, to account for the risk of parental mosaicism, should include recommendations for parental testing. AAF filters are often utilized for comprehensive genomic assays such as exome and whole genome sequencing to exclude variants that are likely to represent sequencing artifacts, a practice that can preclude detection of low-level mosaicism. Even with an average ES read depth of 130×, mosaic variants with an AAF of less than 10% may be filtered out and excluded from review. For these methodologies, relaxing AAF filters for a defined subset of phenotypically relevant genes in which recurrent mosaic events are known to occur may help to optimize mosaic variant detection. Additionally, testing of tissues distant from the hematopoietic lineage (e.g., urine or hair follicles) could be performed to confirm mosaic status [7].
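The idea of relaxing AAF filters for a defined gene subset can be sketched as follows. This is an illustrative outline under stated assumptions, not the pipeline used in this study: the gene set, the default cutoff of 0.10 (chosen to mirror the 10% figure above), the relaxed cutoff of 0.03, and all function names are hypothetical.

```python
# Hypothetical subset of genes with recurrent mosaic variants (illustrative only).
RELAXED_GENES = frozenset({"MTOR", "PIK3CA"})


def expected_alt_reads(depth, aaf):
    """Mean number of variant-supporting reads at a given depth and true AAF."""
    return depth * aaf


def min_aaf_for(gene, default_min_aaf=0.10, relaxed_min_aaf=0.03):
    """Return the AAF cutoff for a gene; both cutoffs are assumed values."""
    return relaxed_min_aaf if gene in RELAXED_GENES else default_min_aaf


def passes_aaf_filter(gene, alt_reads, depth, **cutoffs):
    """Keep a variant call if its observed AAF meets the gene-specific cutoff."""
    if depth <= 0:
        return False
    return alt_reads / depth >= min_aaf_for(gene, **cutoffs)


if __name__ == "__main__":
    # At 130x, a 10% AAF variant is supported by about 13 reads on average.
    print(expected_alt_reads(130, 0.10))
    # The same low-level call (8 of 130 reads) is kept or dropped depending on the gene.
    print(passes_aaf_filter("MTOR", 8, 130))   # relaxed cutoff applies
    print(passes_aaf_filter("COX15", 8, 130))  # default cutoff applies
```

Restricting the relaxed cutoff to a short gene list limits the number of additional low-AAF calls, many of which would be artifacts, that must be reviewed manually.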
Adding to the complexity of mosaic variant interpretation, several patients in our cohort were found to harbor more than one mosaic variant. One patient (12U) with multiple congenital malformations was found to have compound heterozygous variants in RAD51C, a gene associated with Fanconi anemia (FA) [28], a mosaic variant of uncertain significance (VOUS) in ENG, and seven additional mosaic variants in genes with no definitive disease association. Genomic instability resulting from spontaneous chromosome breakage is a hallmark of FA [29], and previous studies have shown an increased risk of mosaic copy-number and structural variants in affected individuals [30]. However, the impact of underlying FA on acquisition of somatic single nucleotide and small insertion/deletion variants has not been clearly elucidated. Therefore, although likely, the mosaic variants detected in this patient cannot be unequivocally attributed to the FA diagnosis. Multiple mosaic variants (n = 17) were also detected in patient 3M, who was referred for ES with a history of malignant astrocytoma, myelodysplasia, and dysmorphic features. The mosaic mutations detected in this individual were likely related to the patient’s recent history of myelodysplastic syndrome. Although the phenomenon of mutation acquisition in pre-cancerous and cancerous states is not novel [31], multiple mosaic events stemming from malignancy can be an unexpected finding on assays like ES that are generally performed for the detection of germline, rather than somatic, mutations. These findings are also challenging from the standpoint of clinical follow-up, as guidelines do not exist to direct management of incidentally ascertained cancer variants in individuals without a known malignancy.
Finally, we have noted that apparent SNV mosaicism can also be explained by chromosomal abnormalities. Patient 52F with developmental delay and microcephaly was found to have a pathogenic variant in the COX15 gene detected at an AAF of 12%. Analysis of the parental samples for the pathogenic change indicated that the father was heterozygous and the mother was negative for the variant. Because the AAF of the purportedly inherited COX15 variant was unexpectedly low in the proband, the SNP array data were reviewed, revealing mosaic maternal uniparental disomy (UPD) of distal chromosome 10q encompassing the COX15 gene. In a second case, patient 55F with macrocephaly, dysmorphic features, and digital anomalies was found to have a mosaic pathogenic variant in ZMPSTE24 at an AAF of 80%. The pathogenic variant was found to be heterozygous in the mother and negative in the father. Analysis of the SNP array data again revealed mosaic copy-neutral absence of heterozygosity (AOH) suspicious for UPD involving chromosome 1 and encompassing the ZMPSTE24 gene, which presumably served as the “second hit” for the autosomal recessive disorder.
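The relationship between the observed AAF and the proportion of cells carrying the UPD in these two cases can be made explicit with a simple two-population model. This is a back-of-the-envelope sketch, not an analysis reported in this study: it assumes copy-neutral UPD, a mixture of only two cell populations (heterozygous cells and UPD cells), and unbiased read sampling. The function names are hypothetical.

```python
def upd_fraction_variant_lost(aaf):
    """Fraction of cells with UPD when the UPD removes the variant allele.

    Heterozygous cells contribute AAF 0.5 and UPD cells contribute 0, so
    AAF = 0.5 * (1 - f), which gives f = 1 - 2 * AAF.
    """
    if not 0.0 <= aaf <= 0.5:
        raise ValueError("this model requires 0 <= AAF <= 0.5")
    return 1.0 - 2.0 * aaf


def upd_fraction_variant_homozygous(aaf):
    """Fraction of cells with UPD when the UPD duplicates the variant allele.

    Heterozygous cells contribute AAF 0.5 and UPD cells contribute 1, so
    AAF = 0.5 * (1 - f) + f, which gives f = 2 * AAF - 1.
    """
    if not 0.5 <= aaf <= 1.0:
        raise ValueError("this model requires 0.5 <= AAF <= 1")
    return 2.0 * aaf - 1.0


if __name__ == "__main__":
    # Paternally inherited COX15 variant at AAF 12% with mosaic maternal UPD.
    print(upd_fraction_variant_lost(0.12))
    # Maternally inherited ZMPSTE24 variant at AAF 80% with mosaic AOH.
    print(upd_fraction_variant_homozygous(0.80))
```

Under these assumptions, the reported AAFs would correspond to UPD in roughly three quarters of sampled cells for the COX15 case and roughly 60% for the ZMPSTE24 case; these figures are model-derived estimates and would need to be reconciled with the SNP array data.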
The many variables that complicate mosaic variant interpretation can also be leveraged in research studies to make inferences about variant pathogenicity and to provide insights into gene function. For example, from the observation that activating mutations in GNAS (associated with McCune-Albright syndrome [MIM: 174800]) are detected only in the mosaic state, one can infer that constitutional activating mutations in this gene are incompatible with life [8, 32]. It is plausible that studies of affected individuals, including analyses of AAF by tissue type, would help to define key aspects of gene function, including the critical developmental period after which the mutation must occur to ensure viability. Model systems can provide similar insights: conditional PIK3CA activation in mouse cortex showed that abnormal mTOR activation in excitatory neurons and glia, but not interneurons, is sufficient for abnormal cortical overgrowth [33].
Although our cohort comprises nearly 12,000 families and we have detected and reported 120 mosaic mutations, only a minority of individuals were found to have mosaic variants in the same gene, which limits our ability to draw conclusions about gene function from analysis of mosaic variation in this cohort specifically. Moreover, causative mutations may be restricted to brain or other tissues that are not commonly studied sources of DNA [34]. As such, additional studies dedicated to assessing mosaicism, including larger cohorts of affected and unaffected individuals, will be necessary to accumulate the evidence needed to make broad conclusions about gene function based on mosaic variation in the population. Such studies may also allow the use of quantitative information, such as AAF, to predict clinical phenotype, particularly if multiple tissues can be analyzed. Finally, single-cell sequencing will permit a more accurate evaluation of the role of somatic mutations in neurodevelopmental disorders and during normal brain development [35].