Exome sequencing reveals a high prevalence of BRCA1 and BRCA2 founder variants in a diverse population-based biobank
Genome Medicine volume 12, Article number: 2 (2020)
Pathogenic variants in BRCA1 and BRCA2 (BRCA1/2) lead to increased risk of breast, ovarian, and other cancers, but most variant-positive individuals in the general population are unaware of their risk, and little is known about prevalence in non-European populations. We investigated BRCA1/2 prevalence and impact in the electronic health record (EHR)-linked BioMe Biobank in New York City.
Exome sequence data from 30,223 adult BioMe participants were evaluated for pathogenic variants in BRCA1/2. Prevalence estimates were made in population groups defined by genetic ancestry and self-report. EHR data were used to evaluate clinical characteristics of variant-positive individuals.
There were 218 (0.7%) individuals harboring expected pathogenic variants, resulting in an overall prevalence of 1 in 139. The highest prevalence was in individuals with Ashkenazi Jewish (AJ; 1 in 49), Filipino and other Southeast Asian (1 in 81), and non-AJ European (1 in 103) ancestry. Among 218 variant-positive individuals, 112 (51.4%) harbored known founder variants: 80 had AJ founder variants (BRCA1 c.5266dupC and c.68_69delAG, and BRCA2 c.5946delT), 8 had a Puerto Rican founder variant (BRCA2 c.3922G>T), and 24 had one of 19 other founder variants. Non-European populations were more likely to harbor BRCA1/2 variants that were not classified in ClinVar or that had uncertain or conflicting evidence for pathogenicity (uncertain/conflicting). Within mixed ancestry populations, such as Hispanic/Latinos with genetic ancestry from Africa, Europe, and the Americas, there was a strong correlation between the proportion of African genetic ancestry and the likelihood of harboring an uncertain/conflicting variant. Approximately 28% of variant-positive individuals had a personal history, and 45% had a personal or family history, of BRCA1/2-associated cancers. Approximately 27% of variant-positive individuals had prior clinical genetic testing for BRCA1/2. However, individuals with AJ founder variants were about twice as likely to have had a clinical test (39%) as those with other pathogenic variants (20%).
These findings deepen our knowledge about BRCA1/2 variants and associated cancer risk in diverse populations, indicate a gap in knowledge about potential cancer-related variants in non-European populations, and suggest that genomic screening in diverse patient populations may be an effective tool to identify at-risk individuals.
The recognition of strong familial clustering of breast and ovarian cancer, followed by the discovery of the BRCA1 and BRCA2 (BRCA1/2) genes in 1994 and 1995, respectively, has led to the study and characterization of BRCA1/2-related hereditary breast and ovarian cancer syndrome (HBOC). Inherited pathogenic variants in either of these genes cause a significantly elevated risk for cancer of the female breast as well as high-grade serous ovarian, tubal, and peritoneal carcinoma. The risk for other cancers, including prostate, male breast, pancreas, melanoma and possibly others, is also increased. Pathogenic variants in these genes are highly penetrant and inherited in an autosomal dominant pattern.
The prevalence of pathogenic BRCA1/2 variants has been previously estimated, with historical data suggesting a prevalence of approximately 1 in 400 individuals in the general population [5, 6]. A higher prevalence has been observed in certain populations; for example, approximately 1 in 42 individuals of Ashkenazi Jewish (AJ) descent harbor one of three common founder variants [7, 8]. Founder variants in other populations have also been described, including Icelandic, French Canadian, and Puerto Rican populations. Recent unselected population-based genomic screening efforts have demonstrated a higher than expected prevalence of BRCA1/2 pathogenic variants in predominantly European-ancestry individuals, approximately 1 in 190, with only half of these individuals meeting current guidelines for genetic testing [10,11,12] and only 18% having prior knowledge of their BRCA1/2 status through clinical genetic testing.
Understanding of the prevalence and contribution to cancer risk of BRCA1/2 variants in non-European populations has been limited by racial and ethnic disparities in genetic research. In addition to reduced uptake of genetic testing in diverse populations [15,16,17,18], there is a higher rate of detection of variants of uncertain significance in non-European populations [19,20,21]. Here, we evaluated the range of BRCA1/2 variants in a diverse patient population from the BioMe Biobank in New York City and explored clinical characteristics of individuals harboring expected pathogenic variants in BRCA1/2.
Setting and study population
The BioMe Biobank is an electronic health record (EHR)-linked biobank of over 50,000 participants from the Mount Sinai Health System (MSHS) in New York, NY. Participant recruitment into BioMe has been ongoing since 2007 and occurs predominantly through ambulatory care practices across the MSHS. The BioMe participants in this analysis were recruited between 2007 and 2015, with approximately half coming from general medicine and primary care clinics and the rest from different specialty or multi-specialty sites at MSHS. BioMe participants consent to provide DNA and plasma samples linked to their de-identified EHRs. Participants provide additional information on self-reported ancestry, personal and family medical history through questionnaires administered upon enrollment. This study was approved by the Icahn School of Medicine at Mount Sinai’s Institutional Review Board. The study population consisted of 30,223 consented BioMe participants aged 18 years or older (upon enrollment) and with exome sequence data available through a collaboration with the Regeneron Genetics Center.
Generation and QC of genomic data
Sample preparation and exome sequencing were performed at the Regeneron Genetics Center as previously described, yielding N = 31,250 samples and n = 8,761,478 sites. Genotype array data using the Illumina Global Screening Array were also generated for each individual. Post-hoc filtering of the sequence data included filtering of N = 229 low-quality samples, including low-coverage, contaminated, and genotype-exome discordant samples; N = 208 gender-discordant and duplicate samples were also removed. This resulted in N = 30,813 samples for downstream analysis, and N = 30,223 samples from participants aged 18 years and older. Mean depth of coverage for the remaining samples was 36.4x, with a minimum depth of 27.0x, and sequence coverage was sufficient to provide at least 20x haploid read depth at > 85% of targeted bases in 96% of samples. Sites with missingness greater than 0.02 (n = 267,955 sites) were removed, as were sites showing allele imbalance (n = 320,877; allelic balance < 0.3 or > 0.8). Samples were stratified by self-reported ancestry, and sites with Hardy-Weinberg equilibrium p < 1 × 10⁻⁶ (n = 12,762) were removed from analysis. Variants at multi-allelic sites in BRCA1 and BRCA2 (n = 124) underwent the same quality control workflow as those from bi-allelic sites, with the exception that allelic balance was calculated only among heterozygous carriers of multi-allelic variants. Multi-allelic sites for which the mean allelic balance among heterozygous carriers was < 0.3 or > 0.8 were excluded from downstream analysis. This resulted in the exclusion of n = 1 site, leaving a total of n = 123 for further analysis. Manual inspection of pileups was performed for carriers (N = 22) of the n = 13 multi-allelic sites annotated as pathogenic in ClinVar. Of these, N = 6 out of 7 carriers of the 13:32339421:C:CA variant were determined to be false positives and excluded from downstream analyses.
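The site-level filters described above can be read as a single pass/fail predicate. The following is a minimal sketch under stated assumptions, not the pipeline used in the study: the function name and the per-site summary inputs are hypothetical, and only the thresholds and sample counts are taken from the text.

```python
# Thresholds quoted in the text; the function and its inputs are illustrative.
MAX_MISSINGNESS = 0.02       # sites with missingness above this were removed
MIN_ALLELIC_BALANCE = 0.3    # sites with allelic balance below this were removed
MAX_ALLELIC_BALANCE = 0.8    # sites with allelic balance above this were removed
MIN_HWE_P = 1e-6             # sites with HWE p below this were removed


def passes_site_qc(missingness: float, allelic_balance: float, hwe_p: float) -> bool:
    """Return True if a site survives all three site-level filters.

    In the study the HWE test was applied within self-reported ancestry
    strata; here hwe_p is assumed to be the p-value for one such stratum.
    """
    if missingness > MAX_MISSINGNESS:
        return False
    if allelic_balance < MIN_ALLELIC_BALANCE or allelic_balance > MAX_ALLELIC_BALANCE:
        return False
    if hwe_p < MIN_HWE_P:
        return False
    return True


# Sample-level bookkeeping reported in the text.
samples_sequenced = 31_250
low_quality = 229
discordant_or_duplicate = 208
samples_retained = samples_sequenced - low_quality - discordant_or_duplicate
```

The sample arithmetic reproduces the 30,813 samples carried forward before the age restriction.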
Self-reported and genetic ancestry
Self-reported ancestry categories were derived from a multiple-choice survey administered to participants upon enrollment into the BioMe Biobank. Participants could select one or more of the following categories: African American/African, American Indian/Native American, Caucasian/White, East/Southeast Asian, Hispanic/Latino, Jewish, Mediterranean, South Asian/Indian, or Other. Individuals who selected “Jewish,” “Caucasian/White,” or both were designated as “European.” Individuals who selected “Mediterranean,” “Other,” or both were designated as “Other.” Individuals who selected multiple categories including “Hispanic/Latino” were designated as “Hispanic/Latino.” Individuals from the “Native American,” “Other,” or “Multiple Selected” categories were excluded from downstream analysis of prevalence in self-reported groups.
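The designation rules above amount to a small decision procedure. The sketch below is one possible reading, with assumptions flagged: the order in which the rules are applied, the handling of remaining multi-category selections as “Multiple Selected,” and the function name are not stated in the text and are illustrative only.

```python
EUROPEAN_CHOICES = {"Jewish", "Caucasian/White"}
OTHER_CHOICES = {"Mediterranean", "Other"}


def assign_self_reported_group(selections: set) -> str:
    """Map a participant's survey selections to one analysis group.

    Assumption: rules are applied in the order below. The text does not
    state a precedence, so this ordering is hypothetical.
    """
    if not selections:
        raise ValueError("at least one category must be selected")
    # Any selection that includes Hispanic/Latino is designated Hispanic/Latino.
    if "Hispanic/Latino" in selections:
        return "Hispanic/Latino"
    # "Jewish", "Caucasian/White", or both.
    if selections <= EUROPEAN_CHOICES:
        return "European"
    # "Mediterranean", "Other", or both.
    if selections <= OTHER_CHOICES:
        return "Other"
    # A single remaining category keeps its own label.
    if len(selections) == 1:
        return next(iter(selections))
    # Assumption: every other combination falls into "Multiple Selected".
    return "Multiple Selected"
```

For example, a participant selecting both “Jewish” and “Caucasian/White” maps to “European,” while one selecting “South Asian/Indian” and “Jewish” falls into “Multiple Selected” under this reading.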
Genetic ancestry in the form of identity-by-descent community designation was performed on a subset of participants excluding second-degree relatives and above, yielding 17 distinct communities representing patterns of cultural endogamy and recent diaspora to New York City. Eight of these communities with > 400 unrelated participants were used for downstream analysis of prevalence. These communities included individuals with African American and African ancestry (N = 6874), non-AJ European ancestry (N = 5474), AJ ancestry (N = 3887), Filipino and other Southeast Asian ancestry (N = 556), as well as ancestry from Puerto Rico (PR; N = 5105), the Dominican Republic (DR; N = 1876), Ecuador (N = 418), and other Central and South American communities (N = 1116). Full details of the global ancestry inference, genetic community detection, and genotype quality control are described in Belbin et al. Finally, we determined the proportion of African genetic ancestry in mixed ancestry Hispanic/Latino populations using the ADMIXTURE software. We assumed five ancestral populations (k = 5) with 5-fold cross validation across n = 256,052 SNPs in N = 27,984 unrelated participants who were also genotyped on the Global Screening Array (GSA), in addition to N = 4149 reference samples representing 5 continental regions. Unrelated, self-reported Hispanic/Latino participants with both exome sequence and GSA genotype data (N = 8457) were extracted and binned into four groups by proportion of African genetic ancestry: 0–20% (N = 3748), > 20–40% (N = 2779), > 40–60% (N = 1242), and > 60% (N = 688). We estimated relatedness using the software KING, and for all prevalence estimates in self-reported and genetic ancestry groups, we excluded second-degree relatives and above.
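The four-way binning of African genetic ancestry can be sketched as follows. This is an illustration only: the function name is hypothetical, the input is assumed to be an ADMIXTURE-style proportion between 0 and 1, and the bin sizes are the counts reported in the text.

```python
def african_ancestry_bin(proportion: float) -> str:
    """Assign an ancestry proportion (0 to 1) to one of the four reported bins."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    if proportion <= 0.20:
        return "0-20%"
    if proportion <= 0.40:
        return ">20-40%"
    if proportion <= 0.60:
        return ">40-60%"
    return ">60%"


# Bin sizes as reported in the text.
reported_bin_sizes = {"0-20%": 3748, ">20-40%": 2779, ">40-60%": 1242, ">60%": 688}
total_binned = sum(reported_bin_sizes.values())
```

The four reported bin sizes sum to the 8457 Hispanic/Latino participants described above.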
BRCA1/2 variant annotation
Sequence variants were annotated with the Variant Effect Predictor (VEP; Genbank gene definitions; BRCA1 NM_007294.3, BRCA2 NM_000059.3). In order to reduce the set of false positive predicted loss-of-function (pLOF) calls, we also ran the Loss-Of-Function Transcript Effect Estimator (LOFTEE) and defined the consensus calls from both methods as the set of pLOF variants for the study. Sequenced variants were cross-referenced with the ClinVar database (accessed July 2018) and annotated according to their ClinVar assertions when available as pathogenic, likely pathogenic, uncertain significance, benign, likely benign, or with conflicting interpretations of pathogenicity. All variants with conflicting interpretations were manually reviewed in ClinVar (accessed November 2018) by a genetic counselor (J.A.O. or E.R.S.). In addition, we included the following categories of pLOF variants not classified in ClinVar: single nucleotide variants (SNVs) leading to a premature stop codon, loss of a start codon, or loss of a stop codon; SNVs or insertion/deletion sequence variants (indels) disrupting canonical splice acceptor or donor dinucleotides; and open reading frame shifting indels leading to the formation of a premature stop codon. The union of ClinVar pathogenic/likely pathogenic and pLOF variants was termed “expected pathogenic,” and this set of variants was used to identify individuals in BioMe for subsequent analyses of HBOC-related clinical characteristics.
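The definition of “expected pathogenic” is a set construction: the consensus of two pLOF callers, restricted to variants without a ClinVar classification, joined with the ClinVar pathogenic/likely pathogenic set. A minimal sketch, assuming variants are represented by string identifiers and ClinVar assertions by a plain dictionary (both hypothetical simplifications, as are the placeholder variant names):

```python
PATHOGENIC_ASSERTIONS = {"pathogenic", "likely pathogenic"}


def expected_pathogenic(clinvar_assertions: dict, vep_plof: set, loftee_plof: set) -> set:
    """Combine ClinVar P/LP variants with consensus pLOF variants absent from ClinVar."""
    clinvar_plp = {
        variant
        for variant, assertion in clinvar_assertions.items()
        if assertion in PATHOGENIC_ASSERTIONS
    }
    # Consensus pLOF: called by both annotation methods.
    consensus_plof = vep_plof & loftee_plof
    # Only pLOF variants with no ClinVar classification are added.
    unclassified_plof = consensus_plof - set(clinvar_assertions)
    return clinvar_plp | unclassified_plof


# Toy input with placeholder identifiers (not real variants).
toy_clinvar = {
    "var_A": "pathogenic",
    "var_B": "uncertain significance",
    "var_C": "likely pathogenic",
}
toy_vep = {"var_B", "var_D", "var_E"}
toy_loftee = {"var_B", "var_D"}
toy_result = expected_pathogenic(toy_clinvar, toy_vep, toy_loftee)
```

In this toy case `var_B` is excluded because ClinVar classifies it as uncertain, and `var_E` is excluded because only one method calls it pLOF.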
BRCA1/2 founder variants
All expected pathogenic variants detected in BRCA1/2 were reviewed for evidence of a founder effect. This was carried out by manual review of each expected pathogenic variant by a genetic counselor (E.R.S.) in the Human Gene Mutation Database, ClinVar, and PubMed utilizing the currently designated HGVS nomenclature for each variant, as well as previous designations as noted in ClinVar. Variants were considered to be founder variants if they were described as such in the primary literature, based on confirmatory haplotype analysis or population frequency.
Clinical characteristics in variant-positive individuals
Individuals harboring expected pathogenic variants in BRCA1/2 in BioMe, termed “variant positive,” were evaluated for any evidence of personal or family histories of HBOC-related cancers, through extraction of International Classification of Diseases (ICD)-9 and ICD-10 codes from participant EHRs (Additional file 1: Table S1). These data were supplemented by participant questionnaire data for personal and family histories of HBOC-related cancers, which were available for 61 variant-positive individuals. Medical record review of variant-positive individuals was carried out independently by two individuals, including genetic counselors (J.A.O., E.R.S., or S.A.S.) and a clinical research coordinator (J.E.R.) to determine whether participants had evidence of previous clinical genetic testing for BRCA1/2. Data were summarized using medians and interquartile ranges (IQR) for continuous variables and frequencies and percentages for categorical variables. Pearson’s chi-squared test with Yates correction was used to test for statistical independence of different categorical outcomes measured in the study.
HBOC-related cancer case-control and phenome-wide association studies
Cases were defined as participants having any of the ICD-9 or ICD-10 codes for personal history of HBOC-related cancers (Additional file 1: Table S1). Controls were defined as individuals without any of these ICD-9 or ICD-10 codes. We tested for association with variant-positive compared with variant-negative participants (defined as not having any variants that were pathogenic, uncertain/conflicting, or unclassified in ClinVar (novel)). Genotypes were coded using a binary model (0 for variant negative and 1 for variant positive). We repeated the analysis to compare participants with uncertain/conflicting variants with variant-negative participants. We excluded individuals determined to be second-degree relatives and above from the analysis. Odds ratios were estimated by logistic regression and adjusted for age, sex, and the first 5 principal components of ancestry.
We also performed a phenome-wide association study (PheWAS) of variant-positive vs. variant-negative participants using ICD-9- and ICD-10-based diagnosis codes that were collapsed to hierarchical clinical disease groups (termed phecodes) [29, 30]. We performed logistic regression systematically using BRCA1/2 expected pathogenic carrier status as the primary predictor variable and the presence of a given phecode as the outcome variable, excluding second-degree relatives and above and adjusting for age, sex, and the first 5 principal components. To minimize spurious associations due to limited numbers of case observations, we restricted analyses to phecodes present in at least 5 variant-positive participants, resulting in a total of 260 tests. Statistical significance was determined using Bonferroni correction (Bonferroni-adjusted significance threshold p < 1.9 × 10⁻⁴). Logistic regression analyses were performed using PLINK (v1.90b3.35) software.
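The Bonferroni threshold quoted above follows from dividing a family-wise error rate by the number of tests. The sketch below assumes a family-wise rate of 0.05, which the text does not state explicitly but which is consistent with the reported threshold.

```python
# Assumption: a family-wise error rate of 0.05 (not stated in the text).
family_wise_alpha = 0.05
n_tests = 260  # phecodes tested, as reported

bonferroni_threshold = family_wise_alpha / n_tests
threshold_text = f"{bonferroni_threshold:.1e}"
```

Under that assumption the threshold is about 1.9 × 10⁻⁴, matching the value used in the PheWAS.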
We evaluated BRCA1/2 variants among 30,223 adult participants of the BioMe Biobank with available exome sequence data and genotype array data. Participants were 59.3% female and had a median age of 59 years (Table 1). The majority of participants (74.3%) were of non-European descent, based on self-report. A total of 1601 variants were analyzed, including 1478 (92.3%) occurring at bi-allelic sites and 123 (7.7%) at multi-allelic sites. The majority of variants were missense (63.5%), and 1335 (83.4%) variants were available in ClinVar (Additional file 1: Table S2). The proportion of individuals harboring BRCA1/2 variants that were not classified in ClinVar (novel) was lowest in individuals of self-reported European descent (0.8%) and highest in individuals of South Asian descent (2.3%; Fig. 1a). The proportion of individuals harboring BRCA1/2 variants of uncertain significance or with conflicting interpretations of pathogenicity (uncertain/conflicting) in ClinVar was lowest in individuals of self-reported European descent (4.1%) and highest in those of self-reported African American/African descent (12.2%; Fig. 1b). We saw a similar trend when investigating genetic ancestry within populations with recent mixed ancestry, for example, Hispanic/Latino populations, who can trace their recent ancestry to Europe, Africa, and the Americas (Additional file 1: Figure S1). Although the mean uncertain/conflicting variant rate in all self-reported Hispanic/Latino participants was 8.5% (95% CI 7.9–9.1%; Fig. 1b), this rate was almost twofold higher in those with > 60% African genetic ancestry (11.3% (95% CI 9.2–13.9%)) compared with those with < 20% African genetic ancestry (6.9% (95% CI 6.1–7.7%); chi-squared p = 7.8 × 10⁻⁵; Additional file 1: Figure S1).
Exome sequence data of the BRCA1/2 genes were then used to identify expected pathogenic variants. There were 102 variants with a pathogenic or likely pathogenic assertion in ClinVar, all of which had a 2- or 3-star review status (Additional file 1: Table S3). There were 10 additional pLOF variants (frameshift or stop gained) that were not classified in ClinVar, including 2 in BRCA1 and 8 in BRCA2. The 10 pLOF variants were each observed as singletons in BioMe, and only one of them (BRCA2 c.1039C>T) was found in the gnomAD database, with an allele frequency of 0.000004, suggesting that these are rare in the general population. The union of 102 ClinVar pathogenic and 10 additional rare pLOF variants was the set of expected pathogenic BRCA1/2 variants (n = 112) used to define variant-positive individuals in BioMe.
Overall, 218 (0.7%) individuals in BioMe harbored expected pathogenic variants in BRCA1/2: 86 (39.4%) of these individuals had an expected pathogenic variant in BRCA1, 131 (60.1%) had a variant in BRCA2, and 1 (0.5%) individual had a variant in both BRCA1 (c.68_69delAG) and BRCA2 (c.5946delT). Variant-positive individuals were 62.8% female and had a median age of 58 years (Table 1). The prevalence of BioMe participants harboring expected pathogenic variants in BRCA1/2 was 1:139 (Table 2). In a subset of individuals excluding second-degree relatives and above (N = 27,816), overall prevalence was similar at 1:134. In the unrelated subset, prevalence was highest in individuals of self-reported European descent (1:66) and lowest in those of Hispanic/Latino descent (1:283). We previously used genotype array data to identify fine-scale population groups in BioMe using genetic ancestry, revealing eight communities with greater than 400 individuals represented (Table 2). Across these, prevalence was highest in individuals with AJ ancestry (1:49), among whom the majority (72 out of 80 individuals, or 90.0%) harbored one of the three AJ founder variants (c.5266dupC and c.68_69delAG in BRCA1, and c.5946delT in BRCA2), and 8 individuals (10.0%) harbored a different variant in BRCA1/2 (Additional file 1: Table S3). Prevalence was lower in non-AJ Europeans (1:103) and lowest in those with ancestry from PR (1:340) and DR (1:469; Table 2).
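The “1 in N” prevalence figures are carrier counts re-expressed as a rounded ratio. The sketch below uses a hypothetical helper name; the counts come from the text, with the 208 unrelated variant-positive individuals taken from the case-control analysis reported elsewhere in the Results.

```python
def one_in_n(carriers: int, total: int) -> int:
    """Express prevalence as the rounded N in '1 in N'."""
    if carriers <= 0 or total <= 0:
        raise ValueError("counts must be positive")
    return round(total / carriers)


# Full cohort: 218 variant-positive individuals among 30,223 participants.
overall = one_in_n(218, 30_223)
overall_percent = round(100 * 218 / 30_223, 1)

# Unrelated subset: 208 variant-positive individuals among 27,816 participants.
unrelated = one_in_n(208, 27_816)
```

These reproduce the reported overall prevalence of 1 in 139 (0.7%) and the unrelated-subset prevalence of 1 in 134.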
We identified 23 unique founder variants that have previously been reported in multiple founder populations, including 13 variants in BRCA1 and 10 in BRCA2 (Table 3). A total of 112 of 218 variant-positive individuals (51.4%) were identified as harboring at least one founder variant (61 individuals with a variant in BRCA1, 50 with BRCA2, and 1 with both BRCA1 and BRCA2). The majority of identified founder variants were accounted for by the three AJ founder variants, with 80 individuals in BioMe harboring at least one of these variants, 72 of whom had AJ genetic ancestry. There were 32 participants harboring non-AJ founder variants in BRCA1/2, the most common being BRCA2 c.3922G>T, a well-documented founder variant in PR. Among 15 BRCA1/2 variant-positive individuals with genetic ancestry from PR, 7 (46.7%) harbored the BRCA2 c.3922G>T variant, and 3 others (20.0%) harbored Chilean or Spanish founder variants (Table 3).
We evaluated the clinical characteristics of BRCA1/2 variant-positive individuals using EHR-extracted diagnosis codes (Additional file 1: Table S1), as well as additional personal and family medical history questionnaire data available for 61 of these individuals. Overall, 61 of 218 (28.0%) BRCA1/2 variant-positive individuals had a documented personal history and 98 (45.0%) had either a personal or family history of HBOC-related cancer (breast, ovarian, pancreatic, prostate, or melanoma; Table 4). Variant-positive females were 2.8 times more likely than males to have a personal or family history of HBOC-related cancers (chi-squared p = 9.9 × 10⁻⁸). Among variant-positive females (N = 137), 53 (38.7%) had HBOC-related cancers, including 50 (36.5%) with breast or ovarian cancer. Among the three females with cancer other than breast or ovarian, two had pancreatic cancer and one had melanoma. There were 3 (2.2%) variant-positive females who had more than one cancer, all of whom had both breast and ovarian cancers: one with BRCA1 c.68_69delAG and two with BRCA2 c.5946delT. Among variant-positive males (N = 81), 2 (2.5%) had breast cancer (BRCA1 c.5266dupC and BRCA2 c.4471_4474delCTGA) and 6 (7.4%) had prostate cancer (two men with BRCA1 c.5266dupC and one man each with BRCA1 c.68_69delAG, BRCA2 c.2808_2811delACAA, BRCA2 c.5946delT, and BRCA2 c.4716_4717delinsAAAGACC). One of these men (1.2%) had more than one cancer (breast and pancreatic) and harbored BRCA2 c.4471_4474delCTGA.
We assessed the number of variant-positive individuals with prior knowledge of their BRCA1/2 variant status. Review of medical records revealed that 58 (26.6%) had EHR evidence of clinical genetic testing for BRCA1/2 (Table 4). Among 98 variant-positive individuals with a personal or family history of HBOC-related cancer, 51 (52.0%) had evidence of clinical genetic testing. Only 5 of 81 (6.2%) males had evidence of clinical genetic testing, compared with 53 of 137 (38.7%) females (chi-squared p = 3.6 × 10⁻⁷). Although personal rates of cancer were similar among individuals with AJ founder variants and those with other variants (28.8% vs. 27.5%, chi-squared p = 0.97), knowledge of BRCA1/2 variant status varied: 31 of 80 (38.8%) individuals with AJ founder variants had documented evidence of clinical genetic testing, compared with only 27 of 138 (19.6%) individuals harboring other BRCA1/2 variants (chi-squared p = 3.4 × 10⁻³).
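The two testing comparisons above are 2 × 2 tables, so Pearson's chi-squared test with Yates correction can be sketched in a few lines. This is an illustrative reimplementation using only the standard library, not the software used in the study; the function name is hypothetical, and the cell counts are derived from the figures in the text (for example, 5 tested and 76 untested males).

```python
import math


def yates_chi2_2x2(a: int, b: int, c: int, d: int):
    """Yates-corrected chi-squared statistic and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Males vs. females: tested / not tested.
chi2_sex, p_sex = yates_chi2_2x2(5, 81 - 5, 53, 137 - 53)

# AJ founder variants vs. other variants: tested / not tested.
chi2_aj, p_aj = yates_chi2_2x2(31, 80 - 31, 27, 138 - 27)
```

Under these assumptions the two p-values should land close to the reported 3.6 × 10⁻⁷ and 3.4 × 10⁻³, which suggests the reported tests were run on tables of this form.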
We tested for association with HBOC-related cancers in variant-positive (N = 208) compared with variant-negative (not harboring any ClinVar pathogenic, uncertain/conflicting, or novel variants; N = 24,927) participants in the unrelated subset. Variant-positive individuals had increased odds of HBOC-related cancers (odds ratio (OR) 5.6; 95% confidence interval (CI) 4.0 to 8.0; p = 6.7 × 10⁻²³). In contrast, participants harboring uncertain/conflicting variants (N = 2395) did not have increased odds of HBOC-related cancers (OR 1.2; 95% CI 1.0 to 1.4; p = 0.1). To more comprehensively evaluate the clinical consequences of expected pathogenic variants in BRCA1/2, we performed a PheWAS of variant-positive vs. variant-negative participants. Using a Bonferroni significance threshold of p < 1.9 × 10⁻⁴ for associations with 260 clinical diagnoses, we identified significant associations with “malignant neoplasm of female breast” (OR 8.1; 95% CI 5.4 to 12.2; p = 2.2 × 10⁻²³) and “other specified disorders of breast” (OR 6.9; 95% CI 2.9 to 16.2; p = 9.0 × 10⁻⁶; Additional file 1: Figure S2). There were no associations with other types of cancer or non-cancer phenotypes, including known HBOC-related cancers, suggesting we may have been underpowered to observe other relevant associations.
In this study, we demonstrate the ability of large-scale, population-based genomic sequencing to identify and characterize consequential variants in BRCA1/2 in a large, ethnically diverse health system. We found an overall prevalence of 1 in 139 individuals with expected pathogenic variants in BRCA1/2, observed differing frequencies of such variants among a broad range of represented ancestries, and discovered that the majority of individuals harboring these variants were unaware of their genomic risk status.
The overall prevalence of expected pathogenic BRCA1/2 variants in our population was higher than previous estimates [5, 6, 13] and may be partly explained by the large number of founder variants detected. The highest prevalence was 1 in 49 (2.1%) in individuals with AJ genetic ancestry, which is similar to the previously established prevalence of 1 in 42 (2.4%) in this population [7, 8]. The high proportion of AJ individuals in our cohort (14.0%) contributed to the high overall prevalence observed. Multiple other founder variants were also detected in different populations in our study, including the c.3922G>T (p.Glu1308Ter) variant in BRCA2 that we found in almost half of the variant-positive individuals with ancestry from PR, consistent with previous findings. We report, for the first time, prevalence estimates in a number of diverse populations, including African American and Hispanic/Latino populations for which these estimates did not previously exist.
Our findings also revealed that non-European populations, and particularly those most genetically divergent from European populations, are more likely to harbor BRCA1/2 variants that are not classified in public databases or that have uncertain or conflicting evidence for pathogenicity. This was also evident in mixed ancestry populations such as Hispanic/Latino populations, in whom the proportion of variants with uncertain/conflicting interpretations correlated with the percent African genetic ancestry. While BRCA1/2 variant-positive individuals had significantly increased risk of HBOC-related cancers, those with uncertain/conflicting variants did not, suggesting that many of these variants are likely to be benign or of low penetrance. These data add to a growing body of literature [19,20,21] underscoring the pressing need to further characterize genomic variation across diverse populations.
As with previous studies, there was a higher rate of relevant cancers in BRCA1 variant-positive individuals than in BRCA2 variant-positive individuals, and in women than in men [13, 54, 55]. Over one-third of the variant-positive females in our study had a documented current or prior diagnosis of a HBOC-related cancer. Genomic screening in individuals with cancer still provides an opportunity for early detection or prophylaxis, as evidenced by the finding of a second primary cancer in four participants. Genomic screening in apparently healthy men may represent an opportunity for intervention through increased prostate surveillance, given the recently recognized contribution of germline BRCA1/2 variants to metastatic prostate cancer burden.
Knowledge of BRCA1/2 status as documented in participant EHRs was only 27% overall, and even lower (20%) in individuals with variants other than the AJ founder variants, confirming prior reports of clinical under-ascertainment. Of note, 10% of the variant-positive AJ individuals harbored non-founder variants, consistent with previous findings and highlighting the need for comprehensive testing of BRCA1/2 genes rather than targeted screening for specific founder variants in this population. The observed difference in clinical testing among individuals with or without AJ founder variants, despite similar rates of cancer, indicates that there may be additional barriers to genetic testing in populations that are not considered higher risk on the basis of ancestry. Obstacles in non-AJ populations could include lack of patient awareness about BRCA1/2, lower suspicion for HBOC by healthcare providers, or reduced access and/or uptake of genetic testing in certain populations within the context of broader healthcare disparities. Such barriers have been described in African American and Hispanic/Latino populations, the two largest non-European populations in BioMe, suggesting that interventions to improve awareness, risk perception, and patient-provider communication are needed to reduce disparities in BRCA1/2 testing in diverse populations.
Current evidence- and expert opinion-driven guidelines [10, 11, 59] as well as statistical models [60,61,62,63] to identify potential candidates for BRCA1/2 testing are mainly based on the number of individuals with relevant cancers in a kindred, age(s) of diagnosis, and ancestry. Testing criteria have widened over time with the recognition that they do not sufficiently identify all individuals harboring a BRCA1/2 pathogenic variant. Nevertheless, our findings suggest that current clinical practices still miss a significant opportunity for reducing morbidity and mortality through identification of high-risk variant-positive individuals. While we were unable to evaluate whether variant-positive individuals would meet current testing criteria, we did observe that almost half of those with a relevant personal or family history of cancer had no evidence of clinical BRCA1/2 testing. The potential for improved health outcomes from genomic screening through ascertainment of patients and identification of at-risk relatives through cascade testing [64, 65] supports the Centers for Disease Control and Prevention’s designation of HBOC as a tier 1 genomic condition for which positive public health impact exists (https://www.cdc.gov/genomics/implementation/toolkit/tier1.htm).
There are limitations to our study. The study population consisted of individuals recruited from clinical care sites, which does not necessarily reflect the general population of New York City. However, these findings do provide insight into diverse patient populations that were ascertained in a relatively unselected, population-based manner and that have not been previously represented in similar research efforts. The observed prevalence of BRCA1/2 expected pathogenic variants may represent an underestimate, as certain variants would not be detected via this approach, including large copy number variants, which make up approximately 10% of all BRCA1/2 pathogenic variants [66,67,68,69]. Additionally, some percentage of variants of uncertain significance may in fact be pathogenic and likely will be classified as such in the future. We were also constrained by the use of EHR-extracted clinical information, which may not reflect complete medical and family history, and may downwardly bias the true penetrance of HBOC in our cohort.
Genomic screening for pathogenic BRCA1/2 variants in apparently healthy individuals has the potential to lead to earlier diagnosis of cancer via increased surveillance, as well as cancer risk reduction via prophylactic medical interventions. In this study, we provide evidence for a higher overall prevalence of BRCA1/2 expected pathogenic variants in the BioMe Biobank than historically appreciated, in line with recent findings from another unselected clinical care cohort. We show that this approach can effectively identify at-risk individuals across ethnically diverse and underserved populations such as those present in BioMe. These findings are in part due to the cross-sectional representation of founder variants from multiple different populations, which accounted for over half of individuals harboring pathogenic variants in this study. We demonstrate that genomic screening for BRCA1/2 in diverse patient populations may be an effective tool to identify otherwise unrecognized HBOC-associated variants, in order to prevent or diagnose disease. However, further work is needed to accurately classify pathogenic variants in non-European populations, in order to most effectively use this strategy to improve health outcomes in diverse settings.
Availability of data and materials
Expected pathogenic variants in BRCA1/2 reported in this paper are tabulated in Additional file 1: Table S3. Summary statistics, including genotype counts across self-reported and genetic ancestry groups from BioMe, for all BRCA1/2 variants are available at https://sinaigenomichealth.org/research-resources/. Exome sequencing and genotyping of BioMe were performed in collaboration with the Regeneron Genetics Center. Individual-level data generated via this collaboration are not publicly available due to the terms of the BioMe biospecimen and data access agreement but may be requested directly from the corresponding author.
Smithers DW. Family histories of 459 patients with cancer of the breast. Br J Cancer. 1948;2(2):163–7.
Miki Y, Swensen J, Shattuck-Eidens D, Futreal PA, Harshman K, Tavtigian S, et al. A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science. 1994;266(5182):66–71.
Wooster R, Bignell G, Lancaster J, Swift S, Seal S, Mangion J, et al. Identification of the breast cancer susceptibility gene BRCA2. Nature. 1995;378(6559):789–92.
Petrucelli N, Daly MB, Pal T. BRCA1- and BRCA2-Associated Hereditary Breast and Ovarian Cancer. In: Adam MP, Ardinger HH, Pagon RA, Wallace SE, Bean LJH, Stephens K, et al., editors. GeneReviews®. Seattle (WA); 1993.
Anglian Breast Cancer Study Group. Prevalence and penetrance of BRCA1 and BRCA2 mutations in a population-based series of breast cancer cases. Br J Cancer. 2000;83(10):1301–8.
McClain MR, Palomaki GE, Nathanson KL, Haddow JE. Adjusting the estimated proportion of breast cancer cases associated with BRCA1 and BRCA2 mutations: public health implications. Genet Med. 2005;7(1):28–33.
Hartge P, Struewing JP, Wacholder S, Brody LC, Tucker MA. The prevalence of common BRCA1 and BRCA2 mutations among Ashkenazi Jews. Am J Hum Genet. 1999;64(4):963–70.
Roa BB, Boyd AA, Volcik K, Richards CS. Ashkenazi Jewish population frequencies for common mutations in BRCA1 and BRCA2. Nat Genet. 1996;14(2):185–7.
Rebbeck TR, Friebel TM, Friedman E, Hamann U, Huo D, Kwong A, et al. Mutational spectrum in a worldwide study of 29,700 families with BRCA1 or BRCA2 mutations. Hum Mutat. 2018;39(5):593–620.
Daly MB, Pilarski R, Berry M, Buys SS, Farmer M, Friedman S, et al. NCCN guidelines insights: genetic/familial high-risk assessment: breast and ovarian, version 2.2017. J Natl Compr Cancer Netw. 2017;15(1):9–20.
Hampel H, Bennett RL, Buchanan A, Pearlman R, Wiesner GL, Guideline Development Group ACoMG, et al. A practice guideline from the American College of Medical Genetics and Genomics and the National Society of Genetic Counselors: referral indications for cancer predisposition assessment. Genet Med. 2015;17(1):70–87.
Moyer VA, Force USPST. Risk assessment, genetic counseling, and genetic testing for BRCA-related cancer in women: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2014;160(4):271–81.
Manickam K, Buchanan AH, Schwartz MLB, Hallquist MLG, Williams JL, Rahm AK, et al. Exome sequencing-based screening for BRCA1/2 expected pathogenic variants among adult biobank participants. JAMA Netw Open. 2018;1(5):e182140.
Popejoy AB, Ritter DI, Crooks K, Currey E, Fullerton SM, Hindorff LA, et al. The clinical imperative for inclusivity: race, ethnicity, and ancestry (REA) in genomics. Hum Mutat. 2018;39(11):1713–20.
Butrick M, Kelly S, Peshkin BN, Luta G, Nusbaum R, Hooker GW, et al. Disparities in uptake of BRCA1/2 genetic testing in a randomized trial of telephone counseling. Genet Med. 2015;17(6):467–75.
Cragun D, Bonner D, Kim J, Akbari MR, Narod SA, Gomez-Fuego A, et al. Factors associated with genetic counseling and BRCA testing in a population-based sample of young Black women with breast cancer. Breast Cancer Res Treat. 2015;151(1):169–76.
Cragun D, Weidner A, Lewis C, Bonner D, Kim J, Vadaparampil ST, et al. Racial disparities in BRCA testing and cancer risk management across a population-based sample of young breast cancer survivors. Cancer. 2017;123(13):2497–505.
Lynce F, Graves KD, Jandorf L, Ricker C, Castro E, Moreno L, et al. Genomic disparities in breast cancer among Latinas. Cancer Control. 2016;23(4):359–72.
Caswell-Jin JL, Gupta T, Hall E, Petrovchich IM, Mills MA, Kingham KE, et al. Racial/ethnic differences in multiple-gene sequencing results for hereditary cancer risk. Genet Med. 2018;20(2):234–9.
Hall MJ, Reid JE, Burbidge LA, Pruss D, Deffenbaugh AM, Frye C, et al. BRCA1 and BRCA2 mutations in women of different ethnicities undergoing testing for hereditary breast-ovarian cancer. Cancer. 2009;115(10):2222–33.
Ricks-Santi L, McDonald JT, Gold B, Dean M, Thompson N, Abbas M, et al. Next generation sequencing reveals high prevalence of BRCA1 and BRCA2 variants of unknown significance in early-onset breast cancer in African American women. Ethn Dis. 2017;27(2):169–78.
Dewey FE, Gusarova V, O'Dushlaine C, Gottesman O, Trejos J, Hunt C, et al. Inactivating variants in ANGPTL4 and risk of coronary artery disease. N Engl J Med. 2016;374(12):1123–33.
Belbin GM, Wenric S, Cullina S, Glicksberg BS, Moscati A, Wojcik GL, et al. Towards a fine-scale population health monitoring system. bioRxiv. 2019:780668. https://doi.org/10.1101/780668.
Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19(9):1655–64.
Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):2867–73.
Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, Church DM, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42(Database issue):D980–5.
Stenson PD, Mort M, Ball EV, Shaw K, Phillips A, Cooper DN. The Human Gene Mutation Database: building a comprehensive mutation repository for clinical and molecular genetics, diagnostic testing and personalized genomic medicine. Hum Genet. 2014;133(1):1–9.
den Dunnen JT, Dalgleish R, Maglott DR, Hart RK, Greenblatt MS, McGowan-Jordan J, et al. HGVS recommendations for the description of sequence variants: 2016 update. Hum Mutat. 2016;37(6):564–9.
Denny JC, Ritchie MD, Basford MA, Pulley JM, Bastarache L, Brown-Gentry K, et al. PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010;26(9):1205–10.
Wu P, Gifford A, Meng X, Li X, Campbell H, Varley T, et al. Developing and evaluating mappings of ICD-10 and ICD-10-CM codes to PheCodes. bioRxiv. 2019:462077. https://doi.org/10.1101/462077.
Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, et al. Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. bioRxiv. 2019:531210. https://doi.org/10.1101/531210.
De Leon Matsuda ML, Liede A, Kwan E, Mapua CA, Cutiongco EM, Tan A, et al. BRCA1 and BRCA2 mutations among breast cancer patients from the Philippines. Int J Cancer. 2002;98(4):596–603.
Im KM, Kirchhoff T, Wang X, Green T, Chow CY, Vijai J, et al. Haplotype structure in Ashkenazi Jewish BRCA1 and BRCA2 mutation carriers. Hum Genet. 2011;130(5):685–99.
Torres D, Rashid MU, Gil F, Umana A, Ramelli G, Robledo JF, et al. High proportion of BRCA1/2 founder mutations in Hispanic breast/ovarian cancer families from Colombia. Breast Cancer Res Treat. 2007;103(2):225–32.
Vezina H, Durocher F, Dumont M, Houde L, Szabo C, Tranchant M, et al. Molecular and genealogical characterization of the R1443X BRCA1 mutation in high-risk French-Canadian breast/ovarian cancer families. Hum Genet. 2005;117(2–3):119–32.
Weitzel JN, Clague J, Martir-Negron A, Ogaz R, Herzog J, Ricker C, et al. Prevalence and type of BRCA mutations in Hispanics undergoing genetic cancer risk assessment in the southwestern United States: a report from the Clinical Cancer Genetics Community Research Network. J Clin Oncol. 2013;31(2):210–6.
Alvarez C, Tapia T, Perez-Moreno E, Gajardo-Meneses P, Ruiz C, Rios M, et al. BRCA1 and BRCA2 founder mutations account for 78% of germline carriers among hereditary breast cancer families in Chile. Oncotarget. 2017;8(43):74233–43.
Janavicius R. Founder BRCA1/2 mutations in the Europe: implications for hereditary breast-ovarian cancer prevention and control. EPMA J. 2010;1(3):397–412.
Thomassen M, Hansen TV, Borg A, Lianee HT, Wikman F, Pedersen IS, et al. BRCA1 and BRCA2 mutations in Danish families with hereditary breast and/or ovarian cancer. Acta Oncol. 2008;47(4):772–7.
Zhang B, Fackenthal JD, Niu Q, Huo D, Sveen WE, DeMarco T, et al. Evidence for an ancient BRCA1 mutation in breast cancer patients of Yoruban ancestry. Familial Cancer. 2009;8(1):15–22.
Vega A, Campos B, Bressac-De-Paillerets B, Bond PM, Janin N, Douglas FS, et al. The R71G BRCA1 is a founder Spanish mutation and leads to aberrant splicing of the transcript. Hum Mutat. 2001;17(6):520–1.
Gorski B, Byrski T, Huzarski T, Jakubowska A, Menkiszak J, Gronwald J, et al. Founder mutations in the BRCA1 gene in Polish families with breast-ovarian cancer. Am J Hum Genet. 2000;66(6):1963–8.
Cini G, Mezzavilla M, Della Puppa L, Cupelli E, Fornasin A, D'Elia AV, et al. Tracking of the origin of recurrent mutations of the BRCA1 and BRCA2 genes in the north-east of Italy and improved mutation analysis strategy. BMC Med Genet. 2016;17:11.
Neuhausen SL, Mazoyer S, Friedman L, Stratton M, Offit K, Caligo A, et al. Haplotype and phenotype analysis of six recurrent BRCA1 mutations in 61 families: results of an international study. Am J Hum Genet. 1996;58(2):271–80.
Neuhausen SL, Godwin AK, Gershoni-Baruch R, Schubert E, Garber J, Stoppa-Lyonnet D, et al. Haplotype and phenotype analysis of nine recurrent BRCA2 mutations in 111 families: results of an international study. Am J Hum Genet. 1998;62(6):1381–8.
Ossa CA, Torres D. Founder and recurrent mutations in BRCA1 and BRCA2 genes in Latin American countries: state of the art and literature review. Oncologist. 2016;21(7):832–9.
Diaz-Zabala HJ, Ortiz AP, Garland L, Jones K, Perez CM, Mora E, et al. A recurrent BRCA2 mutation explains the majority of hereditary breast and ovarian cancer syndrome cases in Puerto Rico. Cancers (Basel). 2018;10(11):419.
Ikeda N, Miyoshi Y, Yoneda K, Shiba E, Sekihara Y, Kinoshita M, et al. Frequency of BRCA1 and BRCA2 germline mutations in Japanese breast cancer families. Int J Cancer. 2001;91(1):83–8.
Tonin PN, Perret C, Lambert JA, Paradis AJ, Kantemiroff T, Benoit MH, et al. Founder BRCA1 and BRCA2 mutations in early-onset French Canadian breast cancer cases unselected for family history. Int J Cancer. 2001;95(3):189–93.
Caputo S, Benboudjema L, Sinilnikova O, Rouleau E, Beroud C, Lidereau R, et al. Description and analysis of genetic variants in French hereditary breast and ovarian cancer families recorded in the UMD-BRCA1/BRCA2 databases. Nucleic Acids Res. 2012;40(Database issue):D992–1002.
Seong MW, Cho S, Noh DY, Han W, Kim SW, Park CM, et al. Comprehensive mutational analysis of BRCA1/BRCA2 for Korean breast cancer patients: evidence of a founder mutation. Clin Genet. 2009;76(2):152–60.
Sarantaus L, Huusko P, Eerola H, Launonen V, Vehmanen P, Rapakko K, et al. Multiple founder effects and geographical clustering of BRCA1 and BRCA2 families in Finland. Eur J Hum Genet. 2000;8(10):757–63.
Machackova E, Foretova L, Lukesova M, Vasickova P, Navratilova M, Coene I, et al. Spectrum and characterisation of BRCA1 and BRCA2 deleterious mutations in high-risk Czech patients with breast and/or ovarian cancer. BMC Cancer. 2008;8:140.
Kuchenbaecker KB, Hopper JL, Barnes DR, Phillips KA, Mooij TM, Roos-Blom MJ, et al. Risks of breast, ovarian, and contralateral breast cancer for BRCA1 and BRCA2 mutation carriers. JAMA. 2017;317(23):2402–16.
Chen S, Parmigiani G. Meta-analysis of BRCA1 and BRCA2 penetrance. J Clin Oncol. 2007;25(11):1329–33.
Pritchard CC, Mateo J, Walsh MF, De Sarkar N, Abida W, Beltran H, et al. Inherited DNA-repair gene mutations in men with metastatic prostate cancer. N Engl J Med. 2016;375(5):443–53.
Rosenthal E, Moyes K, Arnell C, Evans B, Wenstrup RJ. Incidence of BRCA1 and BRCA2 non-founder mutations in patients of Ashkenazi Jewish ancestry. Breast Cancer Res Treat. 2015;149(1):223–7.
Williams CD, Bullard AJ, O'Leary M, Thomas R, Redding TS, Goldstein K. Racial/ethnic disparities in BRCA counseling and testing: a narrative review. J Racial Ethn Health Disparities. 2019;6(3):570–83.
Force USPST. Risk assessment, genetic counseling, and genetic testing for BRCA-related cancer in women: recommendation statement. Am Fam Physician. 2015;91(2):Online.
Claus EB, Schildkraut J, Iversen ES Jr, Berry D, Parmigiani G. Effect of BRCA1 and BRCA2 on the association between breast cancer risk and family history. J Natl Cancer Inst. 1998;90(23):1824–9.
Parmigiani G, Berry D, Aguilar O. Determining carrier probabilities for breast cancer-susceptibility genes BRCA1 and BRCA2. Am J Hum Genet. 1998;62(1):145–58.
Antoniou AC, Cunningham AP, Peto J, Evans DG, Lalloo F, Narod SA, et al. The BOADICEA model of genetic susceptibility to breast and ovarian cancers: updates and extensions. Br J Cancer. 2008;98(8):1457–66.
Tyrer J, Duffy SW, Cuzick J. A breast cancer prediction model incorporating familial and personal risk factors. Stat Med. 2004;23(7):1111–30.
Tuffaha HW, Mitchell A, Ward RL, Connelly L, Butler JRG, Norris S, et al. Cost-effectiveness analysis of germ-line BRCA testing in women with breast cancer and cascade testing in family members of mutation carriers. Genet Med. 2018;20(9):985–94.
Roberts MC, Dotson WD, DeVore CS, Bednar EM, Bowen DJ, Ganiats TG, et al. Delivery of cascade screening for hereditary conditions: a scoping review of the literature. Health Aff (Millwood). 2018;37(5):801–8.
Judkins T, Rosenthal E, Arnell C, Burbidge LA, Geary W, Barrus T, et al. Clinical significance of large rearrangements in BRCA1 and BRCA2. Cancer. 2012;118(21):5210–6.
Palma MD, Domchek SM, Stopfer J, Erlichman J, Siegfried JD, Tigges-Cardwell J, et al. The relative contribution of point mutations and genomic rearrangements in BRCA1 and BRCA2 in high-risk breast cancer families. Cancer Res. 2008;68(17):7006–14.
Ewald IP, Ribeiro PL, Palmero EI, Cossio SL, Giugliani R, Ashton-Prolla P. Genomic rearrangements in BRCA1 and BRCA2: a literature review. Genet Mol Biol. 2009;32(3):437–46.
Kang P, Mariapun S, Phuah SY, Lim LS, Liu J, Yoon SY, et al. Large BRCA1 and BRCA2 genomic rearrangements in Malaysian high risk breast-ovarian cancer families. Breast Cancer Res Treat. 2010;124(2):579–84.
Abul-Husn NS, Kenny EE. Personalized medicine and the power of electronic health records. Cell. 2019;177(1):58–69.
We would like to thank participants of the BioMe Biobank for their permission to use their health and genomic information.
This work is supported by dedicated funding to the Center for Genomic Health by the Icahn School of Medicine at Mount Sinai. E.E.K., N.S.A-H., S.A.S., J.A.O., J.E.R., and G.M.B. are supported by the National Institutes of Health, National Human Genome Research Institute (NHGRI), and National Institute on Minority Health and Health Disparities (U01 HG009610). E.E.K. is also supported by NHGRI (R01 HG010297, U01 HG009080, UM1 HG0089001, U01 HG007417); the National Heart, Lung, and Blood Institute (R01 HL104608, X01 HL1345); and the National Institute of Diabetes and Digestive and Kidney Diseases (R01 DK110113).
Ethics approval and consent to participate
The Icahn School of Medicine at Mount Sinai’s Institutional Review Board approved this study (protocol number 18-1771), including a waiver of informed consent and a HIPAA waiver of authorization. The study population consisted of 30,223 participants aged 18 years or older from Mount Sinai’s BioMe Biobank (protocol number 07-0529). This research conformed to the Declaration of Helsinki.
N.S.A-H. was previously employed by Regeneron Pharmaceuticals and has received a speaker honorarium from Genentech. E.E.K. has received speaker honoraria from Illumina and Regeneron Pharmaceuticals. The remaining authors declare no competing interests.
Additional file 1: Figure S1. Correlation between proportion African genetic ancestry and the likelihood of harboring an uncertain/conflicting BRCA1/2 variant in Hispanic/Latinos. Figure S2. Phenome-wide association study of BRCA1/2 variant-positive vs. variant-negative participants using EHR-derived clinical diagnoses (phecodes). Table S1. International Classification of Diseases (ICD)-9 and -10 codes used to characterize BRCA1/2 variant-positive individuals. Table S2. Distribution of 1601 BRCA1/2 variants obtained from exome sequence data available from 30,223 adult BioMe Biobank participants, according to ClinVar assertion and variant type. Table S3. BRCA1/2 expected pathogenic variants identified in 30,223 exome sequenced adults from the BioMe Biobank.
About this article
Cite this article
Abul-Husn, N.S., Soper, E.R., Odgis, J.A. et al. Exome sequencing reveals a high prevalence of BRCA1 and BRCA2 founder variants in a diverse population-based biobank. Genome Med 12, 2 (2020). https://doi.org/10.1186/s13073-019-0691-1