Risk assessment for colorectal cancer via polygenic risk score and lifestyle exposure: a large-scale association study of East Asian and European populations

Background The genetic architectures of colorectal cancer are distinct across different populations. To date, the majority of polygenic risk scores (PRSs) are derived from European (EUR) populations, which limits their accurate extrapolation to other populations. Here, we aimed to generate a PRS by incorporating East Asian (EAS) and EUR ancestry groups and validate its utility for colorectal cancer risk assessment among different populations. Methods A large-scale colorectal cancer genome-wide association study (GWAS), harboring 35,145 cases and 288,934 controls from EAS and EUR populations, was used for the EAS-EUR GWAS meta-analysis and the construction of candidate EAS-EUR PRSs via different approaches. The performance of each PRS was then validated in external GWAS datasets of EAS (727 cases and 1452 controls) and EUR (1289 cases and 1284 controls) ancestries, respectively. The optimal PRS was further tested using the UK Biobank longitudinal cohort of 355,543 individuals and ultimately applied to stratify individual risk attached by healthy lifestyle. Results In the meta-analysis across EAS and EUR populations, we identified 48 independent variants beyond genome-wide significance (P < 5 × 10−8) at previously reported loci. Among 26 candidate EAS-EUR PRSs, the PRS-CSx approach-derived PRS (defined as PRSCSx) that harbored genome-wide variants achieved the optimal discriminatory ability in both validation datasets, as well as better performance in the EAS population compared to the PRS derived from known variants. Using the UK Biobank cohort, we further validated a significant dose-response effect of PRSCSx on incident colorectal cancer, in which the risk was 2.11- and 3.88-fold higher in individuals with intermediate and high PRSCSx than in the low score subgroup (Ptrend = 8.15 × 10−53). 
Notably, the detrimental effect of being at a high genetic risk could be largely attenuated by adherence to a favorable lifestyle, with a 0.53% reduction in 5-year absolute risk. Conclusions In summary, we systematically constructed an EAS-EUR PRS that effectively stratifies colorectal cancer risk, highlighting its clinical implications across diverse ancestries. Importantly, these findings also support the view that a healthy lifestyle could reduce the genetic impact on incident colorectal cancer. Supplementary Information The online version contains supplementary material available at 10.1186/s13073-023-01156-9.


Background
Colorectal cancer is one of the most commonly diagnosed cancers and the second leading cause of cancer death worldwide, with over 1.8 million new cases and 0.9 million deaths in 2020 [1]. Cumulative evidence has demonstrated that colorectal cancer is caused by environmental factors (e.g., lifestyle), genetic factors, and their interactions [2]. Although environmental risk factors contribute the most, genetic variants can separately explain approximately 7-16% of heritability for colorectal cancer among European (EUR) and East Asian (EAS) populations, indicating the vital role of variants in the development of colorectal cancer [3,4].
In the past decades, genome-wide association studies (GWASs) have identified over 100 single nucleotide polymorphisms (SNPs) associated with the risk of colorectal cancer [5][6][7]. Although each of these risk variants has only a small effect on colorectal cancer risk, the polygenic risk score (PRS), a method that combines the weak effects of these known or genome-wide variants, has been found to be an efficient tool for identifying individuals at high risk of developing colorectal cancer [8][9][10]. However, most PRSs were developed and optimized based on GWAS data of EUR ancestry and had limited discriminatory ability in other populations (e.g., EAS) [10,11]. Therefore, there is an urgent need to construct a trans-ancestry PRS that can improve colorectal cancer risk prediction in diverse populations.
Unhealthy lifestyles are known to be associated with an increased risk of colorectal cancer, while healthy lifestyle habits show inverse associations [12]. In particular, accumulating evidence indicates that, among individuals with high genetic risk, cancer risk can be attenuated by adherence to a healthy lifestyle; this has been shown for colorectal cancer [13], as well as in our previous studies of gastric cancer [14] and lung cancer [15].
In this study, we performed a large-scale meta-analysis of EAS and EUR populations, to identify common genetic variants associated with colorectal cancer risk across the two ethnic groups. Subsequently, we aimed to develop a novel EAS-EUR PRS that can be used to stratify colorectal cancer risk in diverse populations, and further evaluate the benefit of adherence to a healthy lifestyle stratified by different levels of genetic risk for developing colorectal cancer in a longitudinal cohort (Fig. 1).

Case-control studies of derivation stage
EAS of the Chinese population
The subjects of four independent Chinese colorectal cancer GWAS (Additional file 1: Table S1 and Fig. S1) were recruited from the National ColoRectal Cancer Cohort (NCRCC), including NJCRC GWAS [1316 cases and 2207 controls [16], being part of the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO)], BJCRC GWAS (932 cases and 966 controls) [17], SHCRC GWAS (1116 cases and 1054 controls), and ZJCRC GWAS (1046 cases and 1184 controls). The detailed information is described in Additional file 1: Supplementary Materials.

EAS of the Japanese population
All participants of the Japanese GWAS were collected in the BioBank Japan Project (BBJ), and the population details have been published in a previous study [18]. We obtained the GWAS summary statistics of colorectal cancer (7062 cases and 195,745 controls) from the JENGER website.

EUR population (GECCO)
The GWAS datasets of the GECCO consortium were deposited in the database of Genotypes and Phenotypes (dbGaP, phs001315.v1.p1; phs001415.v1.p1 and phs001078.v1.p1). All cases were confirmed by medical records, pathologic reports, cancer registries, or death certificates. The population details have been published in previous studies [5,6]. After individual-level quality control (Additional file 1: Supplementary Materials), a total of 21,608 cases and 20,278 controls, which did not include the datasets of Prostate, Lung, Colorectal, and Ovarian (PLCO) and Colorectal Cancer Study of Austria (CORSA), were retained for analysis.

EUR population (PLCO)
The PLCO cancer screening trial is a cohort study that aims to evaluate the accuracy and reliability of screening methods for prostate, lung, colorectal, and ovarian cancer [19], and the detailed information was described in our previous study [20]. We obtained the up-to-date GWAS summary statistics of colorectal cancer (2065 cases and 67,500 controls; October 18, 2022) in the EUR population from the PLCOjs website [21]. This study was approved by the ethics committees of the PLCO consortium providers (#PLCO-84).

Xin et al. Genome Medicine (2023) 15:4

Case-control studies of the validation stage

EAS of the Chinese population
The confirmed cases from the JSCRC study were consecutively recruited from hospitals in Jiangsu province, China. The cancer-free control subjects were selected from individuals receiving routine physical examination at hospitals or those participating in community screening for non-communicable diseases in Jiangsu province. A total of 727 cases and 1452 controls were finally included in this study.

EUR population (CORSA)
The CORSA dataset included colorectal cancer and adenoma cases and colonoscopy-negative controls. Controls received a complete colonoscopy and were free of colorectal cancer or polyps [22]. We accessed the CORSA genotype data from dbGaP (phs001415.v1.p1) and kept 1289 cases and 1284 controls for subsequent analysis after the individual-level quality control process (Additional file 1: Supplementary Materials).

Longitudinal cohort of the testing stage
The UK Biobank cohort is a prospective, population-based study, which recruited 502,528 adults aged 40 [23]. After individual-level quality control (Additional file 1: Supplementary Materials), a total of 355,543 participants were retained for our analysis (Additional file 1: Table S2) [24]. The follow-up time was calculated from baseline assessment to the first diagnosis of colorectal cancer [International Classification of Diseases, 10th revision (ICD-10) codes C18-C20], loss to follow-up, death, or last follow-up (December 14, 2016). This study was conducted using the UK Biobank Resource under Application #45611.

GWAS meta-analysis of colorectal cancer
The genotyping, imputation, and SNP-level quality control procedures of all GWAS datasets are described in Additional file 1: Supplementary Materials. We used a multivariable logistic regression model to estimate the odds ratios (ORs) and 95% confidence intervals (CIs) for each SNP with adjustment for sex, age, and principal components of ancestry, separately for each individual-level GWAS dataset.
We then performed a meta-analysis based on the summary statistics derived from the EAS and EUR populations of the derivation datasets (35,145 cases and 288,934 controls in total) using the inverse variance-weighted fixed-effects model implemented in the METAL software [25]. From the resulting summary statistics, we excluded SNPs that (i) showed substantial heterogeneity among studies (P value for heterogeneity test < 0.001) or (ii) did not pass filters in both EAS and EUR populations. A total of 4.7 million SNPs were retained for further analysis, and variants with P < 5 × 10−8 were considered genome-wide significant. Within previously reported regions, genome-wide significant SNPs with P conditional < 5 × 10−8 in a conditional analysis, performed with the Genome-wide Complex Trait Analysis (GCTA) software conditioning on the known SNPs [26], were considered novel variants.
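The inverse variance-weighted fixed-effects combination described above can be illustrated with a minimal sketch. This is not the METAL implementation; the function name, the input effect sizes, and the standard errors are hypothetical values chosen only for illustration.

```python
import math

def ivw_fixed_effect(betas, ses):
    """Combine per-study log odds ratios using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    # Cochran's Q statistic, the basis of a between-study heterogeneity test.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return {"beta": beta, "se": se, "or": math.exp(beta), "z": z, "p": p, "q": q}

# Hypothetical per-study estimates for one SNP (log OR and standard error).
result = ivw_fixed_effect(betas=[0.10, 0.06], ses=[0.02, 0.03])
print(result)
```

Studies with smaller standard errors receive larger weights, so the pooled estimate lies closer to the more precise study. Converting Cochran's Q to a heterogeneity P value requires a chi-square distribution with (number of studies − 1) degrees of freedom, which is omitted here.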

Calculation of PRS
We calculated the PRS to aggregate the weak effects of individual SNPs [8], based on the following formula: $\mathrm{PRS} = \sum_{i=1}^{n} \beta_i \, \mathrm{SNP}_i$, where $n$ is the number of SNPs, and $\mathrm{SNP}_i$ and $\beta_i$ are the number of risk alleles (i.e., 0, 1, or 2) and the weight carried by the $i$th SNP, respectively. The EAS-ancestry (Additional file 1: Table S3) and EUR-ancestry PRSs [10] were constructed using GWAS-reported variants. Furthermore, candidate EAS-EUR PRSs were developed using five different approaches (Additional file 1: Supplementary Materials): the clumping and P value thresholding (C+T) approach (12 scores) [27], LDpred (11 scores) [28], lassosum (1 score) [29], LDpred2 (1 score) [30], and PRS-CSx (1 score) [31]. The 1000 Genomes EAS and EUR populations (Phase 3; 769 individuals) were used as a reference panel. The proportions of the different ethnic groups in the reference panel were consistent with those in the meta-analysis of EAS and EUR GWASs.
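As a minimal sketch of the weighted sum in the formula above, the following function multiplies each risk-allele count by its weight and adds the products. The weights and genotypes are hypothetical and do not correspond to any variant in this study.

```python
def polygenic_risk_score(dosages, weights):
    """Return sum_i beta_i * SNP_i for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    for g in dosages:
        if g not in (0, 1, 2):
            raise ValueError("risk-allele counts must be 0, 1, or 2")
    return sum(beta * g for beta, g in zip(weights, dosages))

# Hypothetical per-SNP weights (log odds ratios) and risk-allele counts.
weights = [0.05, 0.12, 0.08]
dosages = [0, 1, 2]
score = polygenic_risk_score(dosages, weights)
print(round(score, 4))
```

The approaches listed above (C+T, LDpred, lassosum, LDpred2, PRS-CSx) differ in how the set of SNPs and the weights are chosen; the final scoring step is the same weighted sum.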
Calculation of lifestyle score
We calculated healthy lifestyle scores based on eight lifestyle factors [32], including body mass index (BMI), tobacco smoking, alcohol consumption, waist-to-hip ratio (WHR), physical activity, sedentary time, red and processed meat intake, and vegetable and fruit intake (Additional file 1: Table S4). Each lifestyle factor was given a score of 0 or 1, with 1 representing the healthy behavior category, and the sum of the eight scores was used as the healthy lifestyle score. The detailed information is described in Additional file 1: Supplementary Materials.
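The scoring rule can be sketched as follows. The cut-offs that define the healthy category of each factor are given in Additional file 1: Table S4 and are not reproduced here, so this sketch assumes each factor has already been coded as 0 or 1; the dictionary keys are hypothetical names.

```python
LIFESTYLE_FACTORS = (
    "bmi", "smoking", "alcohol", "whr", "physical_activity",
    "sedentary_time", "red_processed_meat", "vegetable_fruit",
)

def healthy_lifestyle_score(flags):
    """Sum eight binary indicators (1 = healthy category) into a 0-8 score."""
    for factor in LIFESTYLE_FACTORS:
        if factor not in flags:
            raise ValueError(f"missing lifestyle factor: {factor}")
        if flags[factor] not in (0, 1):
            raise ValueError(f"{factor} must be coded as 0 or 1")
    return sum(flags[factor] for factor in LIFESTYLE_FACTORS)

# Hypothetical participant with five healthy behaviors.
participant = {
    "bmi": 1, "smoking": 1, "alcohol": 0, "whr": 1, "physical_activity": 1,
    "sedentary_time": 0, "red_processed_meat": 0, "vegetable_fruit": 1,
}
print(healthy_lifestyle_score(participant))
```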

Estimation of 5-year absolute risk
We estimated individual 5-year absolute risk for developing colorectal cancer by combining the relative risk (incorporating genetic risk and lifestyle) with the incidence rate of colorectal cancer and the mortality rate for all causes except for colorectal cancer [9], and the exact details of the calculations were described in our previous study [16].
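The exact calculation is given in the cited references and is not reproduced in this article. Purely as an illustration of how a relative risk, age-specific incidence rates, and competing mortality rates can be combined, the sketch below uses a common discrete-time competing-risk approximation. The formula, the function name, and all rates are assumptions, not the authors' implementation; the relative risk is assumed to be already scaled relative to the population average.

```python
import math

def five_year_absolute_risk(age, relative_risk, incidence, other_mortality, years=5):
    """Approximate the absolute risk of diagnosis within `years` of `age`.

    incidence and other_mortality map each age to an annual rate
    (colorectal cancer incidence and death from all other causes).
    """
    risk = 0.0
    cumulative_hazard = 0.0  # total hazard accumulated before the current year
    for t in range(age, age + years):
        cancer_hazard = relative_risk * incidence[t]
        # Probability of being event-free at the start of year t,
        # times the chance of a diagnosis during year t.
        risk += cancer_hazard * math.exp(-cumulative_hazard)
        cumulative_hazard += cancer_hazard + other_mortality[t]
    return risk

# Hypothetical annual rates for ages 60-64.
incidence = {a: 0.0010 for a in range(60, 65)}
other_mortality = {a: 0.0080 for a in range(60, 65)}
risk = five_year_absolute_risk(60, relative_risk=2.0,
                               incidence=incidence, other_mortality=other_mortality)
print(f"{100 * risk:.2f}%")
```

Competing mortality lowers the absolute risk because individuals who die of other causes can no longer be diagnosed; a higher relative risk raises it roughly proportionally when rates are small.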

Statistical analysis
The population structure was estimated using the EIGENSOFT software [33], and the Manhattan plot and quantile-quantile plot based on the −log10(P value) were created using the R package qqman (https://cran.r-project.org/web/packages/qqman/index.html). We evaluated the discriminatory ability of the PRSs derived from the different approaches described above using the crude and covariate-adjusted area under the receiver operating characteristic curve (AUC) via the R package RISCA [34].
In the UK Biobank cohort, the Cox proportional hazards model was used to estimate the hazard ratios (HRs) and 95% CIs after adjusting for the corresponding confounding factors. We compared the distribution of PRS between two or more groups using the Wilcoxon or Kruskal-Wallis test. Participants were classified into ten equal subgroups according to the decile distribution of PRS and categorized into low (bottom 10%), intermediate (10-90%), and high genetic risk (top 10%) subgroups for group comparisons. Similarly, participants were classified into unfavorable (score of 0 or 1), intermediate (score of 2 or 3), and favorable (score ≥ 4) lifestyle subgroups based on lifestyle scores ranging from 0 to 8. The log-rank test was used to evaluate the difference in cumulative incidence (one minus the Kaplan-Meier estimate) stratified by different levels of PRS or lifestyle score. The incidence proportion and 95% CI in each group were estimated by the exact Poisson test. The R package Shiny (https://cran.r-project.org/web/packages/shiny/) was used to construct the colorectal cancer risk prediction web server, which is freely available and open source.
In addition, to assess the robustness of the results, we performed the following sensitivity analyses: (i) we excluded incident colorectal cancer cases that occurred during the first year of follow-up; (ii) we evaluated the associations using an ancestry-corrected PRS (briefly, a linear regression model was fitted using the first ten principal components of ancestry to predict PRS, and the residuals from this model were used as the ancestry-corrected PRS); (iii) we reclassified the healthy lifestyle categories into unfavorable (score of 0, 1, or 2), intermediate (score of 3 or 4), and favorable (score ≥ 5) groups; and (iv) we excluded non-colorectal cancer participants with other cancers that occurred during follow-up.
All other statistical analyses were performed using the R software (version 3.6.1, https://cran.r-project.org/), and a two-sided P value less than 0.05 was considered significant.
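To make the crude AUC concrete, the sketch below computes it as the probability that a randomly chosen case has a higher PRS than a randomly chosen control, counting ties as one half. This is the rank-based (Mann-Whitney) definition, not the RISCA implementation, and it does not cover the covariate-adjusted AUC; the scores are hypothetical.

```python
def crude_auc(case_scores, control_scores):
    """Fraction of case-control pairs in which the case scores higher."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical PRS values.
cases = [0.9, 0.4, 0.7]
controls = [0.3, 0.5, 0.2, 0.6]
print(crude_auc(cases, controls))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation. The pairwise loop is quadratic in sample size and is meant for illustration only.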
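The two categorizations described above can be sketched as follows. The handling of tied PRS values (broken by sort order) is an assumption of this sketch, and the function names are hypothetical.

```python
def genetic_risk_groups(scores):
    """Label each PRS as low (bottom 10%), intermediate, or high (top 10%)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        # Integer comparisons avoid floating-point issues at the cut points.
        if rank * 10 < n:
            labels[i] = "low"
        elif rank * 10 >= 9 * n:
            labels[i] = "high"
        else:
            labels[i] = "intermediate"
    return labels

def lifestyle_group(score):
    """Map a 0-8 lifestyle score to the main-analysis categories."""
    if not 0 <= score <= 8:
        raise ValueError("lifestyle score must be between 0 and 8")
    if score <= 1:
        return "unfavorable"
    if score <= 3:
        return "intermediate"
    return "favorable"

# Hypothetical scores for 20 participants.
labels = genetic_risk_groups([0.1 * k for k in range(20)])
print(labels.count("low"), labels.count("intermediate"), labels.count("high"))
print(lifestyle_group(4))
```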
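The ancestry correction in sensitivity analysis (ii) amounts to taking residuals from an ordinary least-squares fit of PRS on the principal components. The sketch below implements this with the standard library only, under the assumption that the model includes an intercept; in practice a numerical library would be used. The example uses two hypothetical components rather than ten to keep it short.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ancestry_corrected_prs(prs, pcs):
    """Residuals of PRS regressed on an intercept plus principal components."""
    design = [[1.0] + list(row) for row in pcs]
    k = len(design[0])
    # Normal equations: (X'X) coef = X'y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, prs)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in design]
    return [y - f for y, f in zip(prs, fitted)]

# Hypothetical PRS values and two principal components per participant.
prs = [1.2, 0.7, 1.9, 1.1, 0.4]
pcs = [[0.01, -0.02], [0.03, 0.01], [-0.02, 0.02], [0.00, -0.01], [0.02, 0.03]]
print([round(r, 4) for r in ancestry_corrected_prs(prs, pcs)])
```

Because the model includes an intercept, the residuals sum to zero and are uncorrelated with each component, so any association that remains cannot be attributed to the linear effect of those components.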

EAS-EUR GWAS meta-analysis of colorectal cancer
The combined EAS-EUR GWAS dataset of colorectal cancer comprised a total of 35,145 cases and 288,934 controls, and there was no residual population stratification observed via genomic control inflation factors (lambda = 1.002; Additional file 1: Fig. S2).
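For readers unfamiliar with the genomic control inflation factor, the sketch below shows the usual definition: the median of the observed chi-square statistics (squared z-scores) divided by the median of a chi-square distribution with one degree of freedom, about 0.455. The z-scores are simulated from null quantiles and are not data from this study.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation_factor(z_scores):
    """Lambda = median(z^2) / median of chi-square(1)."""
    chi2_stats = [z * z for z in z_scores]
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

# Evenly spaced quantiles of the standard normal mimic a well-calibrated null.
null_dist = statistics.NormalDist()
n = 10001
null_z = [null_dist.inv_cdf((i + 0.5) / n) for i in range(n)]
lambda_null = genomic_inflation_factor(null_z)
print(round(lambda_null, 3))
```

Values close to 1, such as the 1.002 reported here, indicate that the bulk of the test statistics follows the null distribution, whereas values well above 1 suggest inflation from stratification or other confounding.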
In total, we identified 48 independent SNPs [linkage disequilibrium (LD) r² < 0.1] that were associated with colorectal cancer risk at genome-wide significance (P < 5 × 10−8; Table 1; Additional file 1: Fig. S3). All of these SNPs were located within 1 Mb of well-established regions reported by previous GWASs, and one novel risk variant (LD r² < 0.1 with the previously reported SNPs) was found to be independently associated with colorectal cancer risk in conditional analyses on GWAS-reported risk variants [rs7623129 (3p14.1), OR conditional = 1.06, P conditional = 1.18 × 10−8; Additional file 1: Table S5]. Notably, functional annotation showed that rs7623129 overlapped with an enhancer histone mark and a DNase hypersensitivity site, indicating that it may be involved in the development of colorectal cancer by regulating the expression of the nearby ADAMTS9 gene (Additional file 1: Table S6).

PRS calculation and validation in the independent datasets
Subsequently, we aimed to construct and validate a novel PRS for colorectal cancer risk stratification by incorporating EAS and EUR populations. As shown in Table 2, although the EUR-ancestry PRS showed great discriminatory ability in the EUR population (i.e., CORSA dataset; AUC crude = 0.629, AUC adjust = 0.638), its performance in the EAS population (i.e., JSCRC dataset; AUC crude = 0.511, AUC adjust = 0.510) was limited. Similar results were also found in EAS-ancestry PRS, demonstrating the limited transferability of single-ancestry PRS in other populations.

PRS test in the UK Biobank cohort
We further evaluated the performance of the optimal PRS CSx for colorectal cancer risk prediction in the UK Biobank cohort, in which 2621 colorectal cancer cases among 355,543 individuals were confirmed during a median follow-up of 7.88 years. As expected, colorectal cancer cases had a higher PRS CSx value than those without colorectal cancer [HR = 1.42, 95% CI = 1.37 to 1.48 per SD increase, P = 3.53 × 10−72, Additional file 1: Table S7; P Wilcoxon < 2 × 10−16; Additional file 1: Fig. S6A]. Importantly, PRS CSx had a stable discriminatory ability with an AUC of 0.595 (crude AUC) and 0.597 (covariate-adjusted AUC; Additional file 1: Fig. S6B), similar to that in the validation dataset of EUR ancestry. Notably, there was a dose-response effect of PRS CSx on developing colorectal cancer at both the decile classification (P trend = 1.57 × 10−56; Additional file 1: Fig. S6C) and the three-level genetic risk classification, in which the risk was 2.11- and 3.88-fold higher in individuals with intermediate and high PRS CSx than in the low score subgroup (P trend = 8.15 × 10−53; Additional file 1: Table S7; log-rank P < 2 × 10−16; Fig. 2A). Similar results were observed in the sensitivity analyses (Additional file 1: Table S8).

Evaluation of the benefit of adherence to a healthy lifestyle stratified by genetic risk
In the UK Biobank cohort, several healthy lifestyle factors were associated with a decreased risk of colorectal cancer; for example, compared to smokers, non-smokers had a 0.18-fold reduced risk of developing colorectal cancer (OR = 0.82, P = 3.58 × 10−7; Additional file 1: Table S4). Furthermore, we noticed a significant protective effect of the combined lifestyle score, in a dose-response manner, on colorectal cancer development at both continuous levels (HR = 0.90, 95% CI = 0.88 to 0.93 per lifestyle score increase, P = 3.39 × 10−12; Additional file 1: Table S9) and stratified levels (intermediate vs. unfavorable: HR = 0.79, 95% CI = 0.72 to 0.87, P = 2.86 × 10−6; favorable vs. unfavorable: HR = 0.65, 95% CI = 0.58 to 0.74, P = 2.56 × 10−12; P trend = 1.92 × 10−12; log-rank P < 2 × 10−16; Fig. 2B). Similar findings were observed in the sensitivity analyses (Additional file 1: Table S10). Intriguingly, there was an inverse relationship between the PRS CSx and several lifestyle factors (P Wilcoxon < 0.05; Additional file 1: Fig. S7A) or the lifestyle score (P Kruskal-Wallis = 1.60 × 10−8; P chi-square = 9.83 × 10−7; Additional file 1: Fig. S7B-C), but their effects on colorectal cancer risk were not mutually influenced (Additional file 1: Tables S7-10). Therefore, we further evaluated the joint effect of genetic and lifestyle factors on the risk of incident colorectal cancer. As expected, colorectal cancer risk increased in a dose-response manner as PRS CSx increased and the lifestyle score decreased (i.e., toward an unfavorable lifestyle) (log-rank P < 2 × 10−16; Fig. 2C, D), but no multiplicative interaction between genetic risk and lifestyle score was observed (P interaction = 0.539).
Interestingly, when stratifying individuals by PRS CSx categories, we observed that a healthy lifestyle could still be significantly associated with a reduced risk of developing colorectal cancer broadly, regardless of the genetic risk effect (low: P trend = 0.043, intermediate: P trend = 7.18 × 10 −11 , high: P trend = 0.077; Table 3). Similar trends were found in the sensitivity analyses (Additional file 1: Table S11).

Estimation of 5-year absolute risk
Subsequently, we estimated the 5-year absolute risk of developing colorectal cancer using a combination of genetic and lifestyle factors and observed that colorectal cancer patients had a higher 5-year absolute risk than those without colorectal cancer (P Wilcoxon < 2 × 10−16; Additional file 1: Fig. S8A). In particular, when stratified by age group, a higher 5-year absolute risk was observed in individuals carrying a high genetic risk or an unfavorable lifestyle (P Kruskal-Wallis < 2 × 10−16; Additional file 1: Fig. S8B-C). Furthermore, in the stratification by genetic risk (Table 3 and Fig. 3A), there was a significant risk reduction in individuals with a low PRS and a favorable lifestyle (risk = 0.14%, reduction = 0.14%) compared with those with a low PRS but an unfavorable lifestyle (risk = 0.28%). Among individuals with a high PRS, the risk was 1.07% for those with an unfavorable lifestyle and 0.54% for those with a favorable lifestyle (reduction = 0.53%).
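The reductions quoted above follow directly from the reported risks, as the short worked example below shows. The relative reductions are derived here for illustration and are not reported in the article.

```python
def risk_reduction(unfavorable_pct, favorable_pct):
    """Return (absolute reduction in percentage points, relative reduction)."""
    absolute = unfavorable_pct - favorable_pct
    relative = absolute / unfavorable_pct
    return absolute, relative

# 5-year absolute risks (%) reported for unfavorable vs. favorable lifestyle.
reported = {"low PRS": (0.28, 0.14), "high PRS": (1.07, 0.54)}
for group, (unfavorable, favorable) in reported.items():
    absolute, relative = risk_reduction(unfavorable, favorable)
    print(f"{group}: absolute {absolute:.2f} points, relative {100 * relative:.1f}%")
```

The relative reduction is close to one half in both groups, while the absolute reduction is nearly four times larger in the high PRS group because its baseline risk is higher.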

Construction of ColoRectal Cancer Risk Prediction System (CRC-RPS)
Furthermore, we stratified the population according to the median value (0.34%; used as a reference threshold) and two times the threshold (0.68%) of the 5-year absolute risk among individuals without colorectal cancer, defining low (< 0.34%), intermediate (0.34 to 0.68%), and high risk (> 0.68%) groups. As expected, both the intermediate- and high-risk populations had a higher risk of developing colorectal cancer than the low-risk population (intermediate: HR = 2.47, 95% CI = 2.21 to 2.75; high: HR = 4.30, 95% CI = 3.87 to 4.78; Fig. 3B). To facilitate the application of our findings, we developed a colorectal cancer risk prediction web server, CRC-RPS, to help users estimate their 5-year absolute risk of developing colorectal cancer by combining genetic and lifestyle factors (http://njmu-edu.cn:3838/CRC-RPS/). In brief, users can input their sex, age, and lifestyle information along with the genotypes of 1.15 million SNPs to obtain an estimated 5-year absolute risk and the assigned risk group. For example, a user with a predicted 5-year absolute risk of 0.2% would be classified as being at low risk of developing colorectal cancer.
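The grouping rule can be written as a small function. This is a sketch of the thresholds stated above, not the CRC-RPS server code; the function and constant names are hypothetical.

```python
REFERENCE_THRESHOLD = 0.34  # percent; median 5-year risk among non-cases
HIGH_THRESHOLD = 0.68       # percent; twice the reference threshold

def risk_group(five_year_risk_pct):
    """Assign low (< 0.34%), intermediate (0.34 to 0.68%), or high (> 0.68%)."""
    if five_year_risk_pct < 0:
        raise ValueError("risk cannot be negative")
    if five_year_risk_pct < REFERENCE_THRESHOLD:
        return "low"
    if five_year_risk_pct <= HIGH_THRESHOLD:
        return "intermediate"
    return "high"

# The example from the text: a predicted 5-year absolute risk of 0.2%.
print(risk_group(0.2))
```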

Discussion
In the present study, we comprehensively constructed several sets of EAS-EUR PRSs based on the large-scale GWAS data of colorectal cancer across EAS and EUR populations and subsequently found a solid PRS framework (i.e., PRS CSx ) derived from genome-wide SNPs, independent of individual lifestyle, for stratifying the risk populations of developing colorectal cancer evidenced by independent validation datasets and a longitudinal cohort. Importantly, even though there was diversity in genetic risk, adherence to a healthy lifestyle behavior could consistently reduce the risk of developing colorectal cancer.
In recent decades, convincing evidence has emerged suggesting that identifying high-risk individuals can enable enhanced screening and the application of other interventions, thereby reducing the incidence of colorectal cancer [35]. Therefore, researchers have paid more attention to the clinical use of PRS, by determining whether it can stratify populations into subgroups with distinct risks of developing disease for early intervention [8,36]. To date, multiple PRSs have been constructed and confirmed to have discriminatory ability in distinguishing colorectal cancer cases from healthy controls [9,10,37]. However, most PRSs were derived from individuals of EUR ancestry, which might limit their application in other ethnic populations. Cumulative evidence has demonstrated that PRS models trained on EUR individuals are less accurate when applied to other ethnic populations than when applied to EUR populations [11,38]. In particular, Thomas et al. found that the PRS model of colorectal cancer derived from 120,184 subjects of EUR ancestry performed worse for Asians, Hispanics, and African Americans than for Europeans [10]. These
Although the performance of our PRS in the EUR population (e.g., CORSA dataset) is substantially lower than that of previous EUR-ancestry PRSs (e.g., Thomas et al.'s genome-wide PRS) [10], our aim was to improve the clinical utility of PRS in multiple ethnic groups, especially non-EUR (e.g., EAS) populations. As evidenced in a recent trans-ancestry PRS study, when the target population was EUR, the improvement of the multi-ancestry PRS over the EUR-ancestry PRS was limited; however, when predicting in EAS populations, the multi-ancestry PRS clearly outperformed the EUR-ancestry PRS [31], which was also found in our study. Therefore, the advantage of our PRS compared to EUR-ancestry PRSs should be further validated in independent EAS longitudinal cohorts.
A healthy lifestyle is known to be associated with a decreased risk of colorectal cancer. For instance, Kirkegaard et al. found that 23% of colorectal cancer cases might be caused by a lack of adherence to five lifestyle recommendations in a prospective Danish cohort study with 55,487 participants [39]. In our study, another important finding was that the detrimental effect of high genetic risk on incident colorectal cancer could be largely attenuated by adherence to a healthy lifestyle, which was consistent with previous findings [13,32,40]. Moreover, although the reduction in 5-year absolute risk associated with adherence to a healthy lifestyle was greatest in the group at high genetic risk, our results still emphasize that public awareness of a healthy lifestyle across the whole population will lead to an evident reduction in colorectal cancer risk.
This study has several strengths. First, to our knowledge, this is the first study to develop an EAS-EUR PRS with a sufficient sample size, followed by the performance evaluation on incident colorectal cancer risk via external case-control studies and prospective cohort. This study provided further genetic information supporting the contribution of germline variation to ancestry disparity in the development of colorectal cancer. Second, we constructed a user-friendly web server to help generate a customized estimate of risk for developing colorectal cancer, for use as an early screening method. Nevertheless, we acknowledge several limitations. First, we need to validate the predictive ability of this novel PRS in an independent EAS longitudinal cohort with sufficient samples. Second, we currently focus on EAS and EUR populations in this study, and other populations (e.g., African Americans and Hispanics) need to be included in future work. Third, the limited model performance in the EUR population needs to be further improved using a larger sample size in the training set, as well as more sophisticated trans-ancestry PRS methods.

Conclusions
In conclusion, we applied an EAS-EUR combined approach to construct a PRS framework derived from genome-wide SNPs that can effectively predict colorectal cancer risk, which reduced the gap in genetic risk prediction between diverse populations. Importantly, these findings also provided further evidence that a healthy lifestyle can attenuate the genetic impact on incident colorectal cancer.
Additional file 1: Table S1. Basic characteristics of colorectal cancer GWASs. Table S2. Basic characteristics of the UK Biobank cohort. Table S3. Summary of 37 colorectal cancer GWAS-reported SNPs in East Asian. Table S4. Summary of eight lifestyle factors in the UK Biobank cohort. Table S5. Summary of one novel EAS-EUR conditionally independent variant at known colorectal cancer risk loci. Table S6. Functional annotations of one novel colorectal cancer risk locus. Table S7. The association of PRS with colorectal cancer risk in the UK Biobank. Table S8. Sensitivity analyses for the association of PRS with colorectal cancer risk in the UK Biobank cohort. Table S9. The association of lifestyle score with colorectal cancer risk in the UK Biobank cohort. Table S10. Sensitivity analyses for the association of lifestyle score with colorectal cancer risk in the UK Biobank cohort. Table S11. Sensitivity analyses for cumulative risk of developing colorectal cancer according to different levels of PRS and lifestyle score in the UK Biobank cohort. Fig. S1. Principal component analysis based on the colorectal cancer GWAS subjects and 1000 Genomes Project populations. Fig. S2. Quantile-quantile plot and genomic inflation factor for the association with colorectal cancer risk in the meta-analysis of EAS-EUR GWASs. Fig. S3. Manhattan plot from colorectal cancer EAS-EUR GWAS meta-analysis. Fig. S4. The association of PRS CSx with incident colorectal cancer in the JSCRC GWAS dataset. Fig. S5. The association of PRS CSx with incident colorectal cancer in the CORSA GWAS dataset. Fig. S6. The association of PRS with incident colorectal cancer in the UK Biobank cohort. Fig.  S7. The association of PRS with lifestyle factors in the UK Biobank cohort. Fig. S8. Distribution of 5-year absolute risk of developing colorectal cancer in the UK Biobank cohort.