Abstract
Background
BOADICEA is a widely used algorithm for predicting breast and ovarian cancer risks, using a combination of genetic and lifestyle, hormonal and reproductive risk factors. However, it has largely been developed using data from White/European individuals, limiting its applicability to other ethnicities. Here, we updated BOADICEA to provide ethnicity-specific risk estimates.
Methods
We utilised data from multiple sources to derive estimates for the distributions and effect sizes of risk factors in major UK ethnic groups (White, Black, South Asian, East Asian, and Mixed), along with ethnicity-specific population cancer incidences. We also developed a method for deriving adjusted polygenic scores for individuals of mixed genetic ancestry.
Results
The predicted average absolute risks were smaller in all non-White ethnic groups than in Whites, and the risk distributions were narrower. The proportion of women classified as at moderate or high risk of breast or ovarian cancer, according to national guidelines, was considerably smaller in non-Whites.
Discussion
The updated BOADICEA, available in the CanRisk tool (www.canrisk.org), is based on more appropriate estimates for non-White women in the UK. Further validation of the model in prospective studies is required. Considering these findings, risk classification guidelines for non-White women may need to be revised.
Similar content being viewed by others
Background
Breast cancer (BC) is the most common cancer in women in the UK, with 55,920 new cases diagnosed and 11,499 deaths (annual averages, 2017–2019 [1]). Epithelial tubo-ovarian cancer (EOC) is less common but is associated with significant mortality, with 7495 EOC cases diagnosed and 4142 deaths in the UK (annual averages, 2017–2019 [1]). Surveillance [2,3,4,5] and prevention [6,7,8,9,10,11] options are available; however, applying them universally is impractical and expensive [12], and may be associated with adverse effects [13, 14]. Personalised cancer risk assessment tools enable the classification of women into different risk categories, facilitating targeted screening and prophylaxis for those who would benefit most.
BOADICEA (the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm) [15,16,17] incorporates separate multifactorial BC and EOC risk prediction models; it is available via the CanRisk tool (www.canrisk.org, [18]). The models predict future risks of BC and EOC for women by combining information on lifestyle/hormonal/reproductive risk factors (questionnaire-based risk factors, QRFs), rare moderate- and high-risk pathogenic variants (PVs) in cancer susceptibility genes, joint effects of common genetic variants summarised via polygenic scores (PGS), cancer family history (FH), mammographic density (MD), BC pathology information, and population cancer incidences. Like most risk prediction models, BOADICEA (BC model v6 and earlier; EOC model v2) was largely developed and validated using data from populations of White ethnicity/European ancestry [19,20,21]. A key challenge in developing models for non-White ethnicities has been their under-representation in studies of both genetic and lifestyle/hormonal/reproductive risk factors [22, 23].
BC and EOC incidences vary among countries and across ethnic groups, even within the same country [24,25,26]. The distributions of established lifestyle and hormonal risk factors also vary by ethnicity [27, 28]. Ethnicity has also been associated with differences in breast tumour pathology; in particular, triple negative BC is more common in individuals of African descent [29,30,31,32]. There are also important differences in the effects of genetic risk factors by population. BOADICEA incorporates a 313-SNP PGS for BC [33] and a 36-SNP PGS for EOC [34], developed using data from individuals of European ancestry [19, 33, 35,36,37,38]. Although these PGS are associated with cancer risks for Asian and African ancestry women, the associations are attenuated; moreover, the PGS distributions vary by ancestry [34, 39,40,41]. Therefore, to allow broad applicability to diverse populations, the BOADICEA models need to be adapted to account for these differences.
In the UK, 18.3% of the population self-identified as non-White in 2021 [42]. Here, we describe our approach to adapting the BOADICEA BC and EOC risk prediction models to ethnically diverse populations (BC model v7, EOC model v3). This involved updating the population cancer incidences, relative risk (RRs), and joint risk factor distributions with ethnicity-specific estimates using a synthetic modelling approach. [15, 43]. We also developed a novel approach for incorporating PGS for individuals of mixed genetic ancestry. We have focused specifically on developing a model applicable to the UK population, but the approach could be used to develop similar generalised adaptations of BOADICEA for other countries.
Methods
Underlying cancer models
In BOADICEA, the BC and EOC incidences are modelled as functions of the effects of QRFs, MD, PVs in cancer susceptibility genes, PGS, and a residual polygenic component modelling residual familial effects on cancer risks ([16, 17], Supplementary Materials). There are separate models for BC and EOC, developed using the same framework, which consider (partially) different cancer susceptibility genes.
Both models consider BC and EOC, and within each model the incidences of each cancer are assumed to be independent, conditional on the genotypes and QRFs included in the model [16, 17]. The genotype- and QRF-specific incidences are calculated by constraining the total age-specific incidences to agree with the overall population incidences [15], which are birth cohort- and country-specific [17].
The models also incorporate data on breast tumour pathology, in the overall population and for PV carriers. For this purpose, tumours are classified into three subtypes: 1) oestrogen receptor (ER) positive; 2) triple negative (TN), i.e. ER, progesterone receptor, and human epidermal growth factor receptor 2 negative; and 3) ER-negative but not TN. The models use age-specific estimates of tumour subtype distributions, considering the known differences in subtype distributions between PV carriers and non-carriers. [16, 44,45,46,47].
For this BOADICEA extension, we examined and, if necessary, updated all parameters while preserving the underlying models structure.
Ancestries and ethnicities considered
Ethnicity refers to a social group identity that is based on shared characteristics, such as cultural traditions, ancestry, language, religion, or social experiences [48] (Supplementary Materials). Since ethnicity could influence behavioural choices and risk factors, we used data on self-reported ethnicity to obtain risk estimates for lifestyle, hormonal, and reproductive risk factors. To adapt BOADICEA to the major ethnic groups in the UK, we adopted the ethnic categories employed in UK Biobank, which correspond to those collected in recent UK Censuses [42]. We considered six broad ethnic groups: White (British, Irish, other White), South Asian (Bangladeshi, Indian, Pakistani, other South Asian), East Asian (Chinese, other East Asian), Black (Black African, Black Caribbean, other Black), Mixed (White and Black African, White and Black Caribbean, White and Asian, other Mixed) and Other.
Genetic ancestry refers to the ‘complex inheritance of one’s genetic material’ [49]. Statistical methods identify groups of individuals with high (genetic) affinity; this allows measurement of the genetic similarity between populations and individuals (Supplementary Materials). Typically, genetic factors are driven by ancestry and related data are ancestry-specific; therefore, for genetic risk factors, we based estimates on genetic ancestry rather than self-reported ethnicity. For this purpose, we considered four main ancestries (European, African, South Asian, East Asian) and a derived category (‘Mixed’).
Datasets used in the modelling process
The study populations used to define the different model components are described in detail in the Supplementary Materials. We used data from UK Biobank[50] and the KARMA [51], MyBrCa [52], SGBCC [53], and BCSC [54] studies.
Cancer incidences and tumour subtype distributions
Ethnicity-specific population cancer incidences
We considered breast, ovarian, and pancreatic cancer incidences for women and pancreatic and prostate cancer incidences for men. Due to data limitations, we could not derive ethnicity-specific incidences for male BC, and these were assumed to be independent of ethnicity. We derived ethnicity-specific cancer incidences for White, Black, Mixed, and East/South Asian UK populations; due to sample size limitations, South and East Asians were grouped together.
For each cancer type, we used published incidence rate ratios by ethnic group relative to Whites, based on Public Health England data (2013–2017) [25]. These ratios were then re-standardised to obtain incidence rate ratios relative to the whole UK population. The proportion of each ethnic group in the UK population (Supplementary Table S1) was estimated from the Office for National Statistics (ONS, [55]); the re-standardisation procedure is described in the Supplementary Materials. We applied these incidence rate ratios to the UK population cancer incidences currently used in BOADICEA [16], thus generating ethnicity- and birth cohort-specific population cancer incidences. This approach ensures that the models remain internally consistent (i.e. the weighted sums of ethnicity-specific incidences are consistent with the overall population incidences) and backward compatible when ethnicity information is missing (i.e., in such cases, the updated models yield the same risk predictions as previously published models).
Gene- and ethnicity-specific tumour subtype distributions
We used published estimates from NCRAS (National Cancer Registration and Analysis Service) data [56, 57] on the distribution, by ethnicity and age at cancer diagnosis, of the three BC subtypes included in BOADICEA (Supplementary Materials). We derived age-, ethnicity and gene-specific subtype proportions for PV carriers by combining those estimates with published ORs for PV carriers and the respective age-interactions for each subtype [47].
Genetic components
Polygenic scores
PGS models
Given a set of SNPs, a PGS (\({R}_{{PGS}}\)) for a given individual is of the form:
where \({x}_{k}\) is the allele dosage for SNP k, \({w}_{k}\) is the corresponding weight and K is the total number of SNPs in the set. The weights are assumed to be the same for each ancestry so that the same raw PGS would be calculated regardless of the individual’s ancestry. In the results section, ‘PGS’ refers to this raw score.
A ‘PGS model’ here is defined by the list of SNPs, their corresponding weights and ancestry-specific frequencies, and the mean (\({\mu }_{i}\)) and standard deviation (\({\sigma }_{i}\)) in each genetically defined ancestry group (i). For each individual, \({\mu }_{i}\) and \({\sigma }_{i}\) are used to obtain the normalised PGS (\({Z}_{i}\)), which depends on ancestry i even though the raw PGS is the same:
The PGS models we focused on here were derived from validated breast and ovarian cancer PGS models already incorporated into BOADICEA. However, the algorithm can be adapted to other PGS [58].
The BC PGS models were based on the 313-SNP PGS [33] and ancestry-specific parameter estimates from UK Biobank. Imputed genotypes were available in UK Biobank for 309 of 313 SNPs. Therefore, we constructed a 309-SNP PGS model, using the same weights as in the 313-SNP PGS model but excluding the 4 unavailable SNPs. SNP frequencies were estimated for all ancestries. We also constructed a 307-SNP PGS model by excluding two further SNPs that are correlated with the CHEK2 PV c.1100del (22_29203724_C_T and 22_29551872_A_G). This PGS model can be used when separate sequencing data for CHEK2 are available, in order to prevent double-counting its effect [58]. Parameters for the 313-SNP, 309-SNP, and 307-SNP PGS models were also estimated using Asian ancestry data from the MyBrCa [52] and SGBCC [53] studies, together with those for a 303-SNP PGS that included only SNPs with an imputation accuracy of >0.5 in both Europeans and Asians.
For EOC, we used the 36-SNP PGS model, developed by Dareng et al. [34], that is already included in the BOADICEA EOC risk model [17]; SNP frequencies were estimated for all ancestries using UK Biobank data.
Use of PGS in BOADICEA
The PGS is incorporated into BOADICEA by partitioning the total (normalised) polygenic component into the sum of a known component, measured by the normalised PGS (Z), and an unmeasured residual component [15]:
\({\alpha }^{2}\) represents the proportion of the model polygenic variance attributable to the PGS, which can be ancestry-specific. \({x}_{R}\) is normally distributed with mean 0 and variance \({1-\alpha }^{2}\). Figure 1 summarises the workflow for generating the final PGS models implemented in the updated BOADICEA.
a process for BC PGS models. b process for EOC PGS models. Blue-shaded area: workflow and parameters derived from UK Biobank data. Green-shaded area: workflow and parameters derived from ‘BCAC data’. Red-shaded area: parameters derived from published OCAC (Ovarian Cancer Association Consortium) studies [34]. Outputs: ancestry-specific means, standard deviations (SDs), and proportions of the polygenic variance explained (alpha).
For BC, a retrospective likelihood (RL) method was used to estimate the ancestry-specific alphas (\({\alpha }_{i}\)) for South and East Asian women, using data from the MyBrCa [52] and SGBCC [53] studies. This approach models the probability of observing the PGS conditional on the phenotypes of the individuals (age of diagnosis and case/control status, Supplementary Materials and [58]). Comparable data were not available for Africans; therefore, we estimated \(\alpha\) indirectly from three published estimates of the OR per standard deviation (SD) [59,60,61], assuming that \(\alpha\) was proportional to the log-odds ratio and using the effect sizes in East Asians as the comparison (Supplementary Materials). For Europeans, we calculated \(\alpha\) using published estimates [33] and procedures to incorporate alternative PGS into BOADICEA [58]. Data used for calculating alphas (for BC PGS models) are referred to collectively as ‘Breast Cancer Association Consortium (BCAC) data’.
For EOC, we first calculated the variance explained by the PGS using ancestry-specific allele frequencies estimated from UK Biobank and published log-odds ratios estimated from OCAC [34]. These variances were then used to calculate the corresponding \({\alpha }_{i}\) parameters (Supplementary Materials). For Europeans, we used published \(\alpha\) estimates [17] already implemented in CanRisk.
PGS for individuals of mixed genetic ancestry
The ancestry-specific parameters (e.g. SNP frequencies, alphas) and normalised polygenic scores (\({Z}_{i}\)) assume that each individual can be assigned unequivocally to one ancestry. We consider individuals to be ‘single ancestry’ if the estimated proportion of their genome derived from their main ancestry is ≥0.8 (0.7 for South Asian). Otherwise, we consider them as ‘mixed genetic ancestry’ (or ‘Mixed’, for brevity); for those individuals, corrected normalised score (\(Z\)) and overall alpha (\(\alpha\)) can be computed if the estimates of the proportions of the genome attributable to each ancestry (\({P}_{i}\)) are known. \(Z\) and \(\alpha\) are given by:
where \({\mu }_{i}\) \({\sigma }_{i}\), \({\alpha }_{i}\) are ancestry-specific parameters, \({Z}_{i}\) is the polygenic scores normalised using the ancestry-specific \({\mu }_{i}\) and \({\sigma }_{i}\), \({P}_{i}\) is the ancestry proportion for ancestry \(i\), and M is the number of ancestry groups considered (here, 4). The variance term \({\sigma }^{2}\) is given by
where \({q}_{{ik}}\) is the ancestry-specific frequency of allele k and \({w}_{k}\) is its corresponding weight. The variance depends on the proportion of each ancestry in the individual but also accounts for the fact that SNP frequencies vary among ancestries (Supplementary Materials for derivations).
Rare genetic variants
No systematic reviews are currently available on the risks and frequencies of rare genetic variants associated with BC or EOC across multiple ethnicities/ancestries. Therefore, we conducted a literature search using PubMed in October 2024. The keywords used in the search can be found in Supplementary Materials. We manually reviewed articles and selected those that reported either RRs (or ORs) for the associations of germline PVs with primary BC or EOC, and/or PV frequencies in different populations.
Lifestyle, hormonal, and reproductive risk factors
A recent scoping review [22] examined the associations of BC QRFs by ethnicity and concluded that there was no convincing evidence of differences in RRs among different ethnic groups for the QRFs included in BOADICEA. This was also confirmed in other systematic reviews, e.g. on Asian women [62]. Given that the largest studies and most robust studies to date have been conducted in White populations [22], we assumed that the RRs currently used in the BC model [16] apply to all ethnic groups. Similarly, previous studies have found no clear evidence for differences in RRs for EOC across different ethnic groups, for the QRFs in BOADICEA [63,64,65,66,67,68]. We therefore assumed that the RRs currently used in the EOC model are also applicable to all ethnic groups [17].
However, the distributions of these QRFs vary across different ethnicities [62]. We used data from UK Biobank to estimate the BC and EOC QRFs distributions for each ethnic group. It is well recognised that UK Biobank is a highly selected cohort and may not be representative of the entire UK population. To assess the impact of using the UK Biobank QRFs distributions on the predicted risks we compared the 10-year BC risk predicted in the KARolinska Mammography Project for Risk Prediction of Breast Cancer (KARMA) cohort using 1) the default QRFs distributions in BOADICEA, which were estimated from large epidemiological studies [15], and 2) the QRFs distributions in the White populations in UK Biobank.
Mammographic density
BOADICEA incorporates MD, scored using the four-category BIRADS scale (Breast Imaging Reporting & Data System, version 4) [15]. We used the public version of the BCSC (Breast Cancer Surveillance Consortium) risk estimation dataset [54] to estimate the ORs and distribution of BIRADS categories for each ethnic group (Supplementary Materials and Supplementary Tables S2, S3).
Assessing risk stratification
We used the final models to assess the BC and EOC risk distributions by ethnicity. Risks were calculated for hypothetical UK women considering all possible combinations of risk predictors (QRFs, MD and PGS); distributions were then obtained by considering the frequency of each combination in the reference population of each model. We then estimated the proportion of women in different risk categories, given the risk factors studied. For BC, we used the risk categories employed in the NICE guidelines [8]: population (<17% lifetime risk), moderate (17%-30%), and high (>30%). For EOC, NICE guidelines [11, 69] set a single threshold (5% lifetime risk) for recommending risk-reducing surgery. We considered this threshold and an additional intermediate threshold of 3.5% [21].
Results
Cancer incidences and tumour subtype distributions
Ethnicity-specific population cancer incidences
Table 1 shows the incidence rate ratios for each ethnic group relative to the whole UK population, for the four cancers considered in BOADICEA. Supplementary Fig. S1 shows the corresponding population cancer incidences for the 1980–1989 birth cohort, by age and ethnicity. The incidences of all cancer types are lower in each of the non-White ethnicities than in Whites, except for pancreatic and prostate cancer incidences for Blacks. The largest difference is for prostate cancer incidence, which is approximately twice as high in Black men but half as high in Asian men compared to the UK male population overall.
Gene- and ethnicity-specific tumour subtype distributions
The distributions of breast tumour subtypes by age at cancer diagnosis and ethnicity, based on published estimates from NCRAS data [56, 57], are shown in Supplementary Figs. S2–S4. The distributions of tumour subtypes in the general population varied by ethnicity, with higher proportions of ER-negative and TN breast cancers in non-Whites (particularly Black women, Supplementary Materials). The derived age- and gene-specific subtype proportions for PV carriers by ethnicity are shown in Supplementary Tables S4–S6 and Supplementary Figs. S5–S7; differences between ethnicities were particularly marked for BRCA1, BRCA2, PALB2, BARD1, RAD51C, and RAD51D carriers.
Genetic components
Polygenic scores
PGS distribution in European and non-European individuals from UK Biobank
Supplementary Tables S7–S10 contain the SNP weights (\({w}_{k}\)) and ancestry-specific allele frequencies (\({q}_{{ik}}\)) for the BC PGS models (Supplementary Tables S7 and S8, respectively), along with the ancestry-specific means and SDs (Supplementary Tables S9 and S10). Supplementary Table S11 reports the ancestry distributions in UK Biobank participants. Since the PGS means estimated in women and men were similar, we used the parameter estimates based on the combined sample of men and women.
For both 309-SNP and 307-SNP PGS models, the mean PGS was highest in African individuals, followed by South and East Asians, and lowest in Europeans. Individuals of mixed genetic ancestry had intermediate means, similar to South Asians. The SDs were generally similar, but slightly higher among Europeans than other populations and highest among the ‘Mixed’.
Supplementary Table S12 contains the \({w}_{k}\) and \({q}_{{ik}}\) parameters for the EOC PGS model, and Supplementary Table S13 contains the corresponding ancestry-specific means and SDs. The mean EOC PGS was highest among European individuals, and similar among individuals of African, South Asian, and East Asian ancestry; the SD was also higher for European ancestry individuals than other ancestries.
Association between PGS and cancer risk, and proportion of polygenic variance explained
Supplementary Table S14 shows the parameters for the different BC PGS models by ancestry, estimated using ‘BCAC data’. The means and SDs of the BC 309-SNP and 307-SNP PGS were very similar to those estimated in the corresponding ancestry groups in UK Biobank above.
The OR per 1 SD associated with the 309-SNP PGS model was slightly lower in both South and East Asian women (1.49 and 1.53 respectively) than that reported for European women (OR = 1.64). The corresponding alphas (αRL) were markedly lower in South and East Asian ancestry women (0.326 and 0.331 respectively) than the corresponding European parameter (0.50). Very similar αRL estimates were obtained for the other PGS. Removing the two CHEK2-correlated SNPs resulted in large differences in the PGS mean. For African women, we indirectly estimated α to be ~0.196 for all the BC PGS models. For the EOC PGS model, the estimated α was highest in Europeans (α = 0.223) and lowest for East Asians (0.171) (Supplementary Table S13).
PGS for individuals of mixed genetic ancestry
Supplementary Fig. S8 shows the distribution of \(\sigma\), the standard deviation of the raw PGS, calculated over a range of combinations of ancestry proportions: \(\sigma\) is greater than the weighted average of the ancestry-specific standard deviations \({\sigma }_{i}\), and the discrepancy is more evident when no ancestry dominates (i.e. when two or more ancestries have similar proportions). As a result, for individuals of mixed genetic ancestry, the normalised PGS and \(\alpha\) are smaller than the estimates obtained by calculating a simple weighted average of the ancestry-specific values. The effect of the PGS on the risk estimates is therefore attenuated, relative to the predictions assuming a simple weighted average (Supplementary Fig. S9).
Figure 2 outlines the steps needed to include PGS in BOADICEA risk calculations, for individuals of single or mixed genetic ancestry. Table 2 summarises the corresponding ancestry-specific parameters (\({\mu }_{i}\), \({\sigma }_{i}\) and \({\alpha }_{i}\)) used in the models.
Red lines: population-based parameters of the PGS model. Blue lines: ancestry-specific parameters of the PGS from UK Biobank. Green lines: ancestry-specific alphas (αi) from ‘BCAC data’ (BC) or UK Biobank (EOC).
Rare genetic variants
Five large studies have examined the associations between rare PVs and BC risk in individuals from different ethnicities/ancestries (Supplementary Materials) [70,71,72,73,74]. There were no significant differences in the BC OR estimates among ancestries for any of the genes in any of the studies (Supplementary Table S16). Similarly, two studies have assessed the associations between germline PVs and EOC risk in non-European populations [75, 76] and showed no significant differences in the ancestry-specific EOC ORs estimates. For instance, Ho et al. estimated OR = 39.8 (95% CI: 29.6–53.3) and OR = 6.8 (95% CI: 3.9–11.9) in Asian women, for BRCA1 and BRCA2 respectively [76]. We therefore assumed that the RRs for both BC and EOC associated with PV susceptibility genes, as previously implemented in BOADICEA for European women [16], are also applicable to non-European ancestries.
Supplementary Table S17 shows that the estimated PV allele frequencies were similar across ancestry groups except for CHEK2, where the overall PV frequency was lower among non-Europeans than in Europeans. This difference is driven by the c.1100del variant, which accounts for the majority of PV carriers in Western European populations but is much rarer in non-Europeans [70]. The overall CHEK2 PV frequency in non-Europeans was estimated to be 0.00109 (Supplementary Materials). The PV frequencies for the other susceptibility genes were assumed to be the same across all ancestries and equal to those previously assumed in BOADICEA [16].
Lifestyle, hormonal, and reproductive risk factors
We estimated QRFs distributions using baseline data on 250,739 women in UK Biobank, including 235,843 Whites, 4352 Blacks, 3510 South Asians, 937 East Asians, 1765 Mixed, and 4332 of other or unknown ethnicity (Supplementary Table S18). A comparison of the 10-year BC predicted risks in the KARMA study, using the default QRFs distributions in BOADICEA [16] with those assuming the estimated distributions in UK Biobank Whites showed only marginal differences in absolute risk predictions (median difference: 0.0007, IQR: 0.0006–0.0009, range: −0.0007 to 0.0045, Supplementary Fig. S10). Given that the risk distributions were similar, that the default QRFs distributions in BOADICEA (validated in Whites) were based on multiple large studies, and to maintain consistency with the previous BOADICEA model for Whites, we applied the current default distributions to Whites and individuals of other/unknown ethnicity. The estimated QRF distributions from UK Biobank were incorporated for Black, South Asian, East Asian, and Mixed ethnicities.
Mammographic density
We used the imputed BCSC dataset to estimate ORs for the associations of MD with BC risk by ethnicity and age category (Supplementary Table S19). The OR estimates for Asian, Black, and Mixed ethnicities were associated with wide confidence intervals. Models that included an interaction term between ethnicity and MD (Supplementary Table S20) did not fit significantly better: likelihood-ratio test (LRT) p-value = 0.12 in the <50 years age group; LRT p-value = 0.053 in the ≥50 years group. We therefore assumed that the RRs previously incorporated in BOADICEA could be applied to all ethnic groups.
The ethnicity-specific BIRADS distributions were estimated using the imputed BCSC dataset, (Supplementary Tables S18 and S21). The distributions differ by ethnicity in both the <50 and ≥50 years age groups. (Pearson’s Chi-squared test, p-value < 10−5). MD was highest in Asians (e.g. category D was 1.7–1.8 times more frequent than in Whites, in both age groups) and lowest in Black women (e.g. category A was 1.1–1.3 times more frequent than in Whites).
Multifactorial models
Implications for risk stratification
Supplementary Figs. S11 and S12 show the cumulative BC and EOC risks for PV carriers. The estimated risks are highest for White ethnicity PV carriers. For example, the BC risk by age 80 ranged from 47% for Mixed to 50% for Asian, 54% for Black, and 58% for White BRCA2 PV carriers. Similarly, the corresponding EOC risks were 8% for Mixed, 10% for Black, 11% for Asian, and 15% for White BRCA2 PV carriers.
Figure 3 shows the distribution of lifetime BC and EOC risks for women of different ethnicities, considering all risk predictors available in the models; ethnicity-specific parameters are summarised in Supplementary Table S22. The distributions of BC and EOC risks were widest for White women; all other ethnicities had similar narrower ranges and lower median risks. Supplementary Figs. S13–S20 further show the distribution of lifetime and 10-year BC and EOC risks by ethnicity and by predictor combinations; Supplementary Fig. S21 shows the distribution of lifetime BC risk in BRCA2 carrier women. These figures show that the wider risk distribution in Whites is primarily driven by the higher effect size of the PGS (\({\alpha }_{i}\)) whereas the lower median risks in non-Whites are mainly driven by the ethnicity-specific incidences.
a, b BC risk; backgrounds are shaded to indicate three risk categories: <17% (light yellow); 17–30% (yellow); ≥30% (light blue). c, d EOC risk; backgrounds are shaded to indicate three risk categories: < 3.5% (light yellow); 3.5–5% (yellow); ≥5% (light blue). a, c Probability density function against absolute risk; b, d absolute risk against cumulative distribution. BC breast cancer, EOC epithelial ovarian cancer, PGS polygenic score, QRFs questionnaire-based risk factors, MD mammographic density.
Table 3 and Supplementary Table S23 show the proportions of women in different BC and EOC risk categories, based on lifetime absolute risk thresholds (unknown FH and mother with BC/EOC cancer at 50 years, respectively). Supplementary Tables S24 and S25 display the corresponding proportions for 10-year BC risk. The proportions of women at high or moderate risk of BC and EOC were markedly smaller among non-White ethnic groups, regardless of the predictors used. For example, using data on PGS, QRFs, and MD (unknown FH), 79.9% of White women were in the population-level category for BC, 18.2% in the moderate-risk category, and 1.86% in the high-risk category. In contrast, 95–96% of non-White women were in the population-level, 3–4% in the moderate-risk, and only 0.01% in the high-risk category. Similar patterns were seen for EOC lifetime risk.
Discussion
In this manuscript, we present the methodological framework we used to adapt the multifactorial BOADICEA BC and EOC risk prediction models to ethnically diverse populations. This is one of the first risk prediction models to be adapted for multiple ethnicities. While we derived and implemented these models in the context of ethnic groups in the UK, the same framework could be used for other countries, provided that risk factor parameters and incidences are available for the relevant ethnic groups. It could also be applied to the prostate cancer model based on a similar algorithm [77]. We also present a novel approach for adjusting the PGS parameters for individuals of mixed genetic ancestry, which corrects the variance in the PGS due to the variability of individual SNPs across ancestries.
A key challenge in adapting cancer risk prediction models is the lack of data on different predictors in diverse populations. Although such data are becoming more readily available, no single dataset contains the necessary information on all risk factors. Therefore, we used a synthetic approach, combining data from several sources to inform ethnicity-specific parameters. We previously demonstrated that this approach is robust and yields models that provide valid risk estimates for both BC and EOC [15,16,17, 19,20,21]. The models described here have been designed to apply to the four broad ethnicities most common in the UK: White, Black, South Asian, and East Asian. Further work will be necessary to adapt the model for specific subpopulations within these broad categories, and for other minority groups (e.g. from the Middle East or North Africa). Our flexible implementation allows for the models to be easily updated as more data become available.
Age-specific cancer incidence data by ethnicity are not currently available in the UK. We derived ethnicity-specific incidences using published incidence rate ratios (IRRs), which compare the incidence in each ethnic group relative to Whites. The resulting BC and EOC incidences are lower for Black and Asian women; the differences in incidence by ethnicity are broadly similar to those reported in other Western countries [78]. The ethnicity-specific IRRs were assumed to be independent of age and birth cohort. Ethnicity-specific incidence rates might converge, becoming more similar in more recent birth cohorts (i.e. IRRs approaching 1). Also, differences in IRRs may be less pronounced at certain ages; if so, the model might underestimate the lifetime risks for some age groups of non-White women and will require further adaptation. Nevertheless, published IRRs for the <65 and ≥65 years age groups were similar [25], in line with the model assumptions. In part, the lower observed cancer incidences among Asian and Black women might reflect existing disparities in diagnosis, e.g. lower BC screening uptake [79]; this could contribute to an underestimation of their risk (if screened similarly to White women).
The narrower range of BC and EOC risks observed among women of Asian and African ancestry is primarily due to the lower discrimination provided by the PGS (developed in European ancestry datasets) in these ancestries. Incorporating ancestry-specific data in PGS development has been shown to improve discrimination [80,81,82], and larger GWAS including diverse populations should generate PGS with better discrimination in non-Europeans [83]. Incorporating such PGS into BOADICEA should then provide a model with a broader range of predicted risks in non-White populations. Our methodological approach is sufficiently flexible to incorporate these advances in PGS development by adjusting the corresponding alpha parameter, i.e. the proportion of the familial polygenic component explained by the PGS [58].
Another assumption in our analyses is that the overall polygenic variance in each model is the same across all ethnic groups. These variances were previously estimated using segregation analyses and based predominantly on data from White women [84,85,86]. There are no comparable segregation analyses in other populations; however, data from case-control studies suggest that the RRs associated with a positive family history of BC or EOC (which determines the polygenic variance) are similar across ethnic groups [22], indicating that this is a plausible assumption.
Due to limited available sample sizes in the UK, the proportions of the polygenic variance explained by the PGS models (alphas) were determined using ancestry-specific PGS effect sizes from non-UK studies. We used data on South and East Asian women from Malaysia/Singapore [39, 53, 81] and published data on, primarily, African American women [59,60,61]. Given the similarities in PGS distributions between these studies and UK Biobank, this seems a reasonable assumption. However, there may be important differences that are not accounted for (e.g. the parameters for women of Afro-Caribbean ancestry could differ from African American women) and parameters may need further adjustment as more data become available.
Published evidence provides no convincing evidence of differences in RRs for different QRFs across ethnicities [22]. Papers reporting RRs in single countries or smaller geographical areas [87,88,89] differ greatly in the methods employed, complicating comparisons. We therefore made the plausible simplifying assumption that the RRs associated with QRFs were independent of ethnicity. However, there are marked differences in the QRFs distributions by ethnic group. For this implementation, we used data from UK Biobank, which may not fully represent the broader UK population due to healthy volunteer bias [90,91,92]. To assess the impact of this assumption on the risk predictions, we conducted a sensitivity analysis in which we replaced the default QRFs distributions in BOADICEA [15] with the UK Biobank distributions for Whites and predicted BC risks in the KARMA cohort [51]. The predicted risks were very similar (Figure S10) suggesting that using UK Biobank data for QRFs distributions does not induce any substantial bias in the predictions. Another limitation is the fact that ~90% of UK Biobank participants are White Europeans, thus limiting the amount of data available for estimating ancestry- or ethnicity-specific risk-factor distributions. Data from studies with larger sample sizes from diverse populations could improve the accuracy of these estimates. Unfortunately, population-level data on ethnicity-specific distributions of lifestyle, reproductive, or hormonal risk factors are currently lacking.
For MD, published studies vary in the methodological approaches used [89, 93, 94]. We used the BCSC cohort to model ethnicity-specific components, ensuring a consistent methodological approach across ethnic groups. Our analyses suggest that the ORs for the association are similar across ethnic groups in the BCSC dataset, consistent with findings from other populations, which reported no statistically significant differences in the estimates across ethnicities [95, 96]. We therefore assumed the same RR associations with MD for all ethnicities. One limitation of the BCSC public dataset is that it only includes data on incident cancers within one year of the baseline mammogram; datasets with longer follow-up would be valuable to confirm the consistency of MD associations across ethnicities. Another limitation of the model described here is the assumption that the density distribution, as for the QRFs, is independent of age; this limitation has been addressed in a separate model extension based on continuous MD measures [97].
Our implementation uses both self-reported ethnicity (to determine QRFs, pathology, and cancer incidence) and genetically determined ancestry (for the PGS). In practice, ancestry informative markers are not always available, e.g. if a targeted sequencing panel has been used to capture the PGS SNPs [98, 99]. In such cases, self-reported ethnicity can serve as a proxy since it is highly correlated with genetic ancestry (Supplementary Table S11). However, genetic data should ideally be used to determine ancestry for more accurate PGS estimation.
Due to the lower incidence rates, the predicted absolute BC and EOC risks for non-White women are on average lower than for White women; moreover, the distribution of risk, based on currently available risk factors, is narrower for non-White women. This shows that the previous versions of the models, calibrated for White women, resulted in substantial overestimation of risk if used for women of other ethnicities; the new version corrects this. Incorporating ethnicity-specific parameters substantially impacts risk stratification, with far fewer Asian and Black women classified as moderate- or high-risk of developing cancer based on existing clinical management thresholds [8, 11]. This raises the question as to whether current thresholds, primarily based on White populations, may need adjustment for other ethnic groups. Moreover, it is important to highlight that BOADICEA does not currently predict development of subtype specific disease or prognosis and survival, outcomes which also differ by ethnicity.
In summary, we have extended the BOADICEA BC and EOC risk prediction models to better apply to the diverse UK population. These updated models (BC v7, EOC v3) have been implemented in the CanRisk user-friendly interface (www.canrisk.org [18]), allowing the specification of self-reported ethnicity and genetic ancestry to determine the relevant parameters. Future large prospective studies will be required to provide direct validation of the BC and EOC risk models in non-White populations.
Data availability
The study uses primarily UK Biobank (www.ukbiobank.ac.uk) and BCSC data, which are available at www.bcsc-research.org. Data from the KARMA study are available upon request from Karolinska Institutet, through the MTA form available at karmastudy.org/contact/data-access/. MyBrCa and SGBCC data may be requested via the study investigators of via application to the BCAC data access committee (www.ccge.medschl.cam.ac.uk/breast-cancer-association-consortium-bcac/data-data-access). All parameters added to BOADICEA (BC v7, EOC v3) are summarised in Table S22; parameters for calculating raw PGS and standardising it for individuals of single ancestry are available in Supplementary Tables. Scripts and parameters for standardising the raw PGS for individuals of mixed genetic ancestry and deriving the appropriate overall alpha are available on GitHub (https://github.com/CCGE-BOADICEA/SHARE-PRScalculation). A simple script for extracting ancestries proportions depending on the ethnicity (including Mixed) is also provided there.
References
Cancer Research UK. 2024. Available from: https://www.cancerresearchuk.org/.
Duffy SW, Vulkan D, Cuckle H, Parmar D, Sheikh S, Smith RA, et al. Effect of mammographic screening from age 40 years on breast cancer mortality (UK Age trial): final results of a randomised, controlled trial. Lancet Oncol. 2020;21:1165–72.
Nystrom L, Andersson I, Bjurstam N, Frisell J, Nordenskjold B, Rutqvist LE. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet. 2002;359:909–19.
Jacobs IJ, Menon U, Ryan A, Gentry-Maharaj A, Burnell M, Kalsi JK, et al. Ovarian cancer screening and mortality in the UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS): a randomised controlled trial. Lancet. 2016;387:945–56.
Menon U, Gentry-Maharaj A, Burnell M, Singh N, Ryan A, Karpinskyj C, et al. Ovarian cancer population screening and mortality after long-term follow-up in the UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS): a randomised controlled trial. Lancet. 2021;397:2182–93.
Britt KL, Cuzick J, Phillips KA. Key steps for effective breast cancer prevention. Nat Rev Cancer. 2020;20:417–36.
Pashayan N, Antoniou AC, Ivanus U, Esserman LJ, Easton DF, French D, et al. Personalized early detection and prevention of breast cancer: ENVISION consensus statement. Nat Rev Clin Oncol. 2020;17:687–705.
National Institute for Health and Care Excellence. Familial breast cancer: classification, care and managing breast cancer and related risks in people with a family history of breast cancer. 2023. Available from: https://www.nice.org.uk/guidance/cg164
Kotsopoulos J, Narod SA. Prophylactic salpingectomy for the prevention of ovarian cancer: who should we target? Int J Cancer. 2020;147:1245–51.
Manchanda R, Legood R, Antoniou AC, Gordeev VS, Menon U. Specifying the ovarian cancer risk threshold of ‘premenopausal risk-reducing salpingo-oophorectomy’ for ovarian cancer prevention: a cost-effectiveness analysis. J Med Genet. 2016;53:591–9.
National Institute for Health and Care Excellence. Ovarian cancer: identifying and managing familial and genetic risk. 2024. Available from: https://www.nice.org.uk/guidance/ng241
Pashayan N, Morris S, Gilbert FJ, Pharoah PDP. Cost-effectiveness and benefit-to-harm ratio of risk-stratified screening for breast cancer: a life-table model. JAMA Oncol. 2018;4:1504–10.
Srivastava S, Koay EJ, Borowsky AD, De Marzo AM, Ghosh S, Wagner PD, et al. Cancer overdiagnosis: a biological challenge and clinical dilemma. Nat Rev Cancer. 2019;19:349–58.
Pashayan N, Pharoah PD, Schleutker J, Talala K, Tammela T, Maattanen L, et al. Reducing overdiagnosis by polygenic risk-stratified screening: findings from the Finnish section of the ERSPC. Br J Cancer. 2015;113:1086–93.
Lee A, Mavaddat N, Wilcox AN, Cunningham AP, Carver T, Hartley S, et al. BOADICEA: a comprehensive breast cancer risk prediction model incorporating genetic and nongenetic risk factors. Genet Med. 2019;21:1708–18.
Lee A, Mavaddat N, Cunningham A, Carver T, Ficorella L, Archer S, et al. Enhancing the BOADICEA cancer risk prediction model to incorporate new data on RAD51C, RAD51D, BARD1 updates to tumour pathology and cancer incidence. J Med Genet. 2022;59:1206–18.
Lee A, Yang X, Tyrer J, Gentry-Maharaj A, Ryan A, Mavaddat N, et al. Comprehensive epithelial tubo-ovarian cancer risk prediction model incorporating genetic and epidemiological risk factors. J Med Genet. 2022;59:632–43.
Carver T, Hartley S, Lee A, Cunningham AP, Archer S, Babb de Villiers C, et al. CanRisk Tool-A web interface for the prediction of breast and ovarian cancer risk and the likelihood of carrying genetic pathogenic variants. Cancer Epidemiol Biomark Prev. 2021;30:469–73.
Yang X, Eriksson M, Czene K, Lee A, Leslie G, Lush M, et al. Prospective validation of the BOADICEA multifactorial breast cancer risk prediction model in a large prospective cohort study. J Med Genet. 2022;59:1196–205.
Yang X, Mooij TM, Leslie G, Ficorella L, Andrieu N, Kast K, et al. Validation of the BOADICEA model in a prospective cohort of BRCA1/2 pathogenic variant carriers. J Med Genet. 2024;61:803–9.
Yang X, Wu Y, Ficorella L, Wilcox N, Dennis J, Tyrer J, et al. Validation of the BOADICEA model for epithelial tubo-ovarian cancer risk prediction in UK Biobank. Br J Cancer. 2024;131:1473–9.
Hurson AN, Ahearn TU, Koka H, Jenkins BD, Harris AR, Roberts S, et al. Risk factors for breast cancer subtypes by race and ethnicity: a scoping review. J Natl Cancer Inst. 2024;116:1992–2002.
Mills MC, Rahal C. The GWAS Diversity Monitor tracks diversity by disease in real time. Nat Genet. 2020;52:242–3.
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71:209–49.
Delon C, Brown KF, Payne NWS, Kotrotsios Y, Vernon S, Shelton J. Differences in cancer incidence by broad ethnic group in England, 2013-2017. Br J Cancer. 2022;126:1765–73.
Shirley MH, Barnes I, Sayeed S, Finlayson A, Ali R. Incidence of breast and gynaecological cancers by ethnic group in England, 2001-2007: a descriptive study. BMC Cancer. 2014;14:979.
Gathani T, Ali R, Balkwill A, Green J, Reeves G, Beral V, et al. Ethnic differences in breast cancer incidence in England are due to differences in known risk factors for the disease: prospective study. Br J Cancer. 2014;110:224–9.
Evans DG, Brentnall AR, Harvie M, Astley S, Harkness EF, Stavrinos P, et al. Breast cancer risk in a screening cohort of Asian and white British/Irish women from Manchester UK. BMC Public Health. 2018;18:178.
Gathani T, Reeves G, Broggio J, Barnes I. Ethnicity and the tumour characteristics of invasive breast cancer in over 116,500 women in England. Br J Cancer. 2021;125:611–7.
Copson E, Maishman T, Gerty S, Eccles B, Stanton L, Cutress RI, et al. Ethnicity and outcome of young breast cancer patients in the United Kingdom: the POSH study. Br J Cancer. 2014;110:230–41.
DeSantis C, Ma J, Bryan L, Jemal A. Breast cancer statistics, 2013. CA Cancer J Clin. 2014;64:52–62.
Shoemaker ML, White MC, Wu M, Weir HK, Romieu I. Differences in breast cancer incidence among young women aged 20-49 years by stage and tumor characteristics, age, race, and ethnicity, 2004-2013. Breast Cancer Res Treat. 2018;169:595–606.
Mavaddat N, Michailidou K, Dennis J, Lush M, Fachal L, Lee A, et al. Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am J Hum Genet. 2019;104:21–34.
Dareng EO, Tyrer JP, Barnes DR, Jones MR, Yang X, Aben KKH, et al. Polygenic risk modeling for prediction of epithelial ovarian cancer risk. Eur J Hum Genet. 2022;30:349–62.
Torkamani A, Wineinger NE, Topol EJ. The personal and clinical utility of polygenic risk scores. Nat Rev Genet. 2018;19:581–90.
Pal Choudhury P, Brook MN, Hurson AN, Lee A, Mulder CV, Coulson P, et al. Comparative validation of the BOADICEA and Tyrer-Cuzick breast cancer risk models incorporating classical risk factors and polygenic risk in a population-based prospective cohort of women of European ancestry. Breast Cancer Res. 2021;23:22.
Li SX, Milne RL, Nguyen-Dumont T, Wang X, English DR, Giles GG, et al. Prospective evaluation of the addition of polygenic risk scores to breast cancer risk models. JNCI Cancer Spectr. 2021;5:pkab021.
Lakeman IMM, Rodriguez-Girondo M, Lee A, Ruiter R, Stricker BH, Wijnant SRA, et al. Validation of the BOADICEA model and a 313-variant polygenic risk score for breast cancer risk prediction in a Dutch prospective cohort. Genet Med. 2020;22:1803–11.
Ho WK, Tan MM, Mavaddat N, Tai MC, Mariapun S, Li J, et al. European polygenic risk score for prediction of breast cancer shows similar performance in Asian women. Nat Commun. 2020;11:3833.
Duncan L, Shen H, Gelaye B, Meijsen J, Ressler K, Feldman M, et al. Analysis of polygenic risk score usage and performance in diverse human populations. Nat Commun. 2019;10:3328.
Shieh Y, Fejerman L, Lott PC, Marker K, Sawyer SD, Hu D, et al. A polygenic risk score for breast cancer in US Latinas and Latin American women. J Natl Cancer Inst. 2020;112:590–8.
Office for National Statistics. Ethnic group, England and Wales: Census 2021. 2022. Available from: https://www.ons.gov.uk/peoplepopulationandcommunity/culturalidentity/ethnicity/bulletins/ethnicgroupenglandandwales/census2021.
Pal Choudhury P, Maas P, Wilcox A, Wheeler W, Brook M, Check D, et al. iCARE: an R package to build, validate and apply absolute risk models. PLoS One. 2020;15:e0228198.
Mavaddat N, Rebbeck TR, Lakhani SR, Easton DF, Antoniou AC. Incorporating tumour pathology information into breast cancer risk prediction algorithms. Breast Cancer Res. 2010;12:R28.
Lee AJ, Cunningham AP, Kuchenbaecker KB, Mavaddat N, Easton DF, Antoniou AC, et al. BOADICEA breast cancer risk prediction model: updates to cancer incidences, tumour pathology and web interface. Br J Cancer. 2014;110:535–45.
Mavaddat N, Barrowdale D, Andrulis IL, Domchek SM, Eccles D, Nevanlinna H, et al. Pathology of breast and ovarian cancers among BRCA1 and BRCA2 mutation carriers: results from the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA). Cancer Epidemiol Biomark Prev. 2012;21:134–47.
Breast Cancer Association Consortium, Mavaddat N, Dorling L, Carvalho S, Allen J, Gonzalez-Neira A, et al. Pathology of tumors associated with pathogenic germline variants in 9 breast cancer susceptibility genes. JAMA Oncol. 2022;8:e216744.
Caliebe A, Tekola-Ayele F, Darst BF, Wang X, Song YE, Gui J, et al. Including diverse and admixed populations in genetic epidemiology research. Genet Epidemiol. 2022;46:347–71.
Constantinescu AE, Mitchell RE, Zheng J, Bull CJ, Timpson NJ, Amulic B, et al. A framework for research into continental ancestry groups of the UK Biobank. Hum Genomics. 2022;16:3.
Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12:e1001779.
Gabrielson M, Eriksson M, Hammarstrom M, Borgquist S, Leifland K, Czene K, et al. Cohort profile: the Karolinska mammography project for risk prediction of breast cancer (KARMA). Int J Epidemiol. 2017;46:1740–1g.
Tan MM, Ho WK, Yoon SY, Mariapun S, Hasan SN, Lee DS, et al. A case-control study of breast cancer risk factors in 7,663 women in Malaysia. PLoS One. 2018;13:e0203469.
Ho PJ, Yeoh YS, Miao H, Lim SH, Tan EY, Tan BKT, et al. Cohort profile: The Singapore Breast Cancer Cohort (SGBCC), a multi-center breast cancer cohort for evaluation of phenotypic risk factors and genetic markers. PLoS One. 2021;16:e0250102.
Barlow WE, White E, Ballard-Barbash R, Vacek PM, Titus-Ernstoff L, Carney PA, et al. Prospective breast cancer risk prediction model for women undergoing screening mammography. J Natl Cancer Inst. 2006;98:1204–14.
Office for National Statistics. Population estimates by ethnic group and religion, England and Wales: 2019. 2021. Available from: https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/articles/populationestimatesbyethnicgroupandreligionenglandandwales/2019.
Hassan H, Allen I, Rahman T, Bacon A, Knott C, Huntley C, et al. Long-term outcomes of bilateral salpingo-oophorectomy in women with personal history of breast cancer. BMJ Oncol. 2025;4:e000574
Hassan H. Long-term outcomes of bilateral salpingo-oophorectomy in the general population and in women with pathogenic variants in BRCA1 and BRCA2 [Unpublished PhD thesis]: University of Cambridge; 2025.
Mavaddat N, Ficorella L, Carver T, Lee A, Cunningham AP, Lush M, et al. Incorporating alternative polygenic risk scores into the BOADICEA breast cancer risk prediction model. Cancer Epidemiol Biomark Prev. 2023;32:422–7.
Du Z, Gao G, Adedokun B, Ahearn T, Lunetta KL, Zirpoli G, et al. Evaluating polygenic risk scores for breast cancer in women of African ancestry. J Natl Cancer Inst. 2021;113:1168–76.
Liu C, Zeinomar N, Chung WK, Kiryluk K, Gharavi AG, Hripcsak G, et al. Generalizability of polygenic risk scores for breast cancer among women with European, African, and Latinx ancestry. JAMA Netw Open. 2021;4:e2119084.
Shang H, Ding Y, Venkateswaran V, Boulier K, Kathuria-Prakash N, Malidarreh PB, et al. Generalizability of PGS(313) for breast cancer risk in a Los Angeles biobank. HGG Adv. 2024;5:100302.
Ang BH, Teo SH, Ho WK. Systematic review and meta-analysis of lifestyle and reproductive factors associated with risk of breast cancer in Asian women. Cancer Epidemiol Biomark Prev. 2024;33:1273–85.
Gay GM, Lim JS, Chay WY, Chow KY, Tan MH, Lim WY. Reproductive factors, adiposity, breastfeeding and their associations with ovarian cancer in an Asian cohort. Cancer Causes Control. 2015;26:1561–73.
Meagher NS, White KK, Wilkens LR, Bandera EV, Berchuck A, Carney ME, et al. Racial and ethnic differences in epithelial ovarian cancer risk: an analysis from the Ovarian Cancer Association Consortium. Am J Epidemiol. 2024;193:1242–52.
Moorman PG, Alberg AJ, Bandera EV, Barnholtz-Sloan J, Bondy M, Cote ML, et al. Reproductive factors and ovarian cancer risk in African-American women. Ann Epidemiol. 2016;26:654–62.
Peres LC, Risch H, Terry KL, Webb PM, Goodman MT, Wu AH, et al. Racial/ethnic differences in the epidemiology of ovarian cancer: a pooled analysis of 12 case-control studies. Int J Epidemiol. 2018;47:460–72.
Sarink D, Wu AH, Le Marchand L, White KK, Park SY, Setiawan VW, et al. Racial/ethnic differences in ovarian cancer risk: results from the multiethnic cohort study. Cancer Epidemiol Biomark Prev. 2020;29:2019–25.
Wu AH, Pearce CL, Tseng CC, Pike MC. African Americans and Hispanics remain at lower risk of ovarian cancer than non-hispanic whites after considering nongenetic risk factors and oophorectomy rates. Cancer Epidemiol Biomark Prev. 2015;24:1094–100.
Slade E, Berg L, Dworzynski K, Manchanda R. Guideline C. Ovarian cancer: identifying and managing familial and genetic risk-summary of new NICE guidance. BMJ. 2024;385:q807.
Dorling L, Carvalho S, Allen J, González-Neira A, Luccarini C, Wahlström C, et al. Breast cancer risk genes — association analysis in more than 113,000 women. N Engl J Med. 2021;384:428–39.
Hu C, Hart SN, Gnanaolivu R, Huang H, Lee KY, Na J, et al. A population-based study of genes previously implicated in breast cancer. N Engl J Med. 2021;384:440–51.
Ahearn TU, Choudhury PP, Derkach A, Wiafe-Addai B, Awuah B, Yarney J, et al. Breast cancer risk in women from Ghana carrying rare germline pathogenic mutations. Cancer Epidemiol Biomark Prev. 2022;31:1593–601.
Palmer JR, Polley EC, Hu C, John EM, Haiman C, Hart SN, et al. Contribution of germline predisposition gene mutations to breast cancer risk in African American women. J Natl Cancer Inst. 2020;112:1213–21.
Momozawa Y, Iwasaki Y, Parsons MT, Kamatani Y, Takahashi A, Tamura C, et al. Germline pathogenic variants of 11 breast cancer genes in 7,051 Japanese patients and 11,241 controls. Nat Commun. 2018;9:4083.
Oak N, Cherniack AD, Mashl RJ, Network TA, Hirsch FR, Ding L, et al. Ancestry-specific predisposing germline variants in cancer. Genome Med. 2020;12:51.
Ho WK, Hassan NT, Yoon SY, Yang X, Lim JMC, Binte Ishak ND, et al. Age-specific breast and ovarian cancer risks associated with germline BRCA1 or BRCA2 pathogenic variants - an Asian study of 572 families. Lancet Reg Health West Pac. 2024;44:101017.
Nyberg T, Brook MN, Ficorella L, Lee A, Dennis J, Yang X, et al. CanRisk-Prostate: a comprehensive, externally validated risk model for the prediction of future prostate cancer. J Clin Oncol. 2023;41:1092–1104.
Bray F, Colombet M, Aitken JF, Bardot A, Eser S, Galceran J, et al. Cancer Incidence in Five Continents Vol. XII (IARC CancerBase No. 19). Lyon: International Agency for Research on Cancer; 2023.
Gathani T, Chaudhry A, Chagla L, Chopra S, Copson E, Purushotham A, et al. Ethnicity and breast cancer in the UK: where are we now? Eur J Surg Oncol. 2021;47:2978–81.
Barnes DR, Tyrer JP, Dennis J, Leslie G, Bolla MK, Lush M, et al. Large-scale genome-wide association study of 398,238 women unveils seven novel loci associated with high-grade serous epithelial ovarian cancer risk. medRxiv. 2024, https://www.medrxiv.org/content/10.1101/2024.02.29.24303243v1.
Ho WK, Tai MC, Dennis J, Shu X, Li J, Ho PJ, et al. Polygenic risk scores for prediction of breast cancer risk in Asian populations. Genet Med. 2022;24:586–600.
Jia G, Ping J, Guo X, Yang Y, Tao R, Li B, et al. Genome-wide association analyses of breast cancer in women of African ancestry identify new susceptibility loci and improve risk prediction. Nat Genet. 2024;56:819–26.
National Cancer Institute. Confluence Data Platform. 2024. Available from: https://confluence.cancer.gov/.
Antoniou AC, Cunningham AP, Peto J, Evans DG, Lalloo F, Narod SA, et al. The BOADICEA model of genetic susceptibility to breast and ovarian cancers: updates and extensions. Br J Cancer. 2008;98:1457–66.
Jervis S, Song H, Lee A, Dicks E, Harrington P, Baynes C, et al. A risk prediction algorithm for ovarian cancer incorporating BRCA1, BRCA2, common alleles and other familial effects. J Med Genet. 2015;52:465–75.
Jervis S, Song H, Lee A, Dicks E, Tyrer J, Harrington P, et al. Ovarian cancer familial relative risks by tumour subtypes and by known ovarian cancer genetic susceptibility variants. J Med Genet. 2014;51:108–13.
Wang JM, Zhao HG, Liu TT, Wang FY. Evaluation of the association between mammographic density and the risk of breast cancer using Quantra software and the BI-RADS classification. Medicine. 2020;99:e23112.
Tran TXM, Moon SG, Kim S, Park B. Association of the interaction between mammographic breast density, body mass index, and menopausal status with breast cancer risk among Korean women. JAMA Netw Open. 2021;4:e2139161.
Nishiyama K, Taira N, Mizoo T, Kochi M, Ikeda H, Iwamoto T, et al. Influence of breast density on breast cancer risk: a case control study in Japanese women. Breast Cancer. 2020;27:277–83.
Fry A, Littlejohns TJ, Sudlow C, Doherty N, Adamska L, Sprosen T, et al. Comparison of sociodemographic and health-related characteristics of UK Biobank participants with those of the general population. Am J Epidemiol. 2017;186:1026–34.
Schoeler T, Speed D, Porcu E, Pirastu N, Pingault JB, Kutalik Z. Participation bias in the UK Biobank distorts genetic associations and downstream analyses. Nat Hum Behav. 2023;7:1216–27.
van Alten S, Domingue BW, Faul J, Galama T, Marees AT. Reweighting UK Biobank corrects for pervasive selection bias due to volunteering. Int J Epidemiol. 2024;53:dyae054.
Ye DM, Li Q, Yu T, Wang HT, Luo YH, Li WQ. Clinical and epidemiologic factors associated with breast cancer and its subtypes among Northeast Chinese women. Cancer Med. 2019;8:7431–45.
Park B, Cho HM, Lee EH, Song S, Suh M, Choi KS, et al. Does breast density measured through population-based screening independently increase breast cancer risk in Asian females? Clin Epidemiol. 2018;10:61–70.
Conroy SM, Woolcott CG, Koga KR, Byrne C, Nagata C, Ursin G, et al. Mammographic density and risk of breast cancer by adiposity: an analysis of four case-control studies. Int J Cancer. 2012;130:1915–24.
Razzaghi H, Troester MA, Gierach GL, Olshan AF, Yankaskas BC, Millikan RC. Mammographic density and breast cancer risk in White and African American Women. Breast Cancer Res Treat. 2012;135:571–80.
Ficorella L, Eriksson M, Czene K, Leslie G, Yang X, Carver T, et al. Incorporating continuous mammographic density into the BOADICEA breast cancer risk prediction model. medRxiv. 2025, https://www.medrxiv.org/content/10.1101/2025.02.14.25322305v1.
University of Cambridge. Assessing the impact of personalised risk estimates on the uptake and timing of risk management options in women who have inherited a change in genes associated with an increased risk of breast and ovarian cancer. 2022. Available from: https://www.isrctn.com/ISRCTN15331714.
Brooks JD, Nabi HH, Andrulis IL, Antoniou AC, Chiquette J, Després P, et al. Personalized Risk Assessment for Prevention and Early Detection of Breast Cancer: Integration and Implementation (PERSPECTIVE I&I). J Pers Med. 2021;11:511.
UK Biobank. 2025. Available from: https://www.ukbiobank.ac.uk/.
Acknowledgements
This research has been conducted using data from UK Biobank, a major biomedical database [50, 100]. We thank the investigators of the KARMA [51], MyBrCa [52], and SGBCC [53] studies. MYBRCA thanks study participants and research staff (particularly Patsy Ng, Nurhidayu Hassan, Yoon Sook-Yee, Daphne Lee, Lee Sheau Yee, Phuah Sze Yee and Norhashimah Hassan) for their contributions and commitment to this study. SGBCC thanks the participants and all research coordinators for their excellent help with recruitment, data and sample collection. We also thank the BCSC [54] participants, investigators, mammography facilities, and radiologists for the data they have provided for this study.
Funding
This work was supported by: grants from Cancer Research UK (grant PPRPGM-Nov20\100002 and Catalyst Award CanGene-CanVar, C61296/A27223); core funding from the NIHR Cambridge Biomedical Research Centre (NIHR203312) [*]; the Gray Foundation; the European Union’s Horizon 2020 research and innovation programme, under grant agreement numbers 633784 (B-CAST) and 634935 (BRIDGES); the PERSPECTIVE I&I project, which the Government of Canada funds through Genome Canada (#13529) and the Canadian Institutes of Health Research (#155865), the Ministère de l’Économie et de l’Innovation du Québec through Genome Québec, the Quebec Breast Cancer Foundation; the CHU de Quebec. ACA is supported by Cancer Research UK grant: SEBCD3-2024/100001. The BCSC data collection and sharing was supported by the National Cancer Institute-funded Breast Cancer Surveillance Consortium (HHSN261201100031C). MyBrCa is funded by research grants from the Wellcome Trust (v203477/Z/16/Z), the Malaysian Ministry of Higher Education (UM.C/HlR/MOHE/06) and Cancer Research Malaysia. SGBCC is funded by the National Research Foundation Singapore, NUS start-up Grant, National University Cancer Institute Singapore (NCIS) Centre Grant, Breast Cancer Prevention Programme, Asian Breast Cancer Research Fund and the NMRC Clinician Scientist Award (SI Category); population-based controls were from the Multi-Ethnic Cohort (MEC) funded by grants from the Ministry of Health, Singapore, National University of Singapore and National University Health System, Singapore. *The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.
Author information
Authors and Affiliations
Contributions
Conceptualisation: LF, XY, NM, FSD, SA, JS, PDPP, JAUS, MT, DFE and ACA. Data collection: LF, XY, NM, HH, WKH, SHT, MH, JL, ME, KC, PH, TR, AB, SH. Data analysis: LF, XY, NM, HH, JD, JT. Model development: LF, TC, AES. Writing in the initial draft: LF, XY, NM, HH, DFE and ACA. Funding: JAUS, MT, DFE, ACA. All authors reviewed the manuscript, provided feedback and approved the final manuscript text.
Corresponding authors
Ethics declarations
Competing interests
LF, TC, DFE and ACA are listed as creators of the BOADICEA model, which has been licensed by Cambridge Enterprise (University of Cambridge).
Ethics approval and consent to participate
MyBrCa study was approved by the Independent Ethics Committee, Ramsay Sime Darby Health Care (reference nos: 201109.4 and 201208.1), and the Medical Ethics Committee, University Malaya Medical Centre (reference no: 842.9). Regarding SGCBBstudy, the approval committees for ‘cases’ were National Health Group (NHG) Domain Specific Review Board (DSRB) and SingHealth Centralised Institutional Review Board (CIRB); the approval committee for ‘controls’ was the National University of Singapore (NUS) IRB. UK Biobank study was approved by the UK North West Multi-centre Research Ethics Committee (MREC). KARMA study was approved by the ethical review board at Karolinska Institutet (2010/958-31/1). All participants to the studies provided informed consent, either written or electronically signed. The studies were performed in accordance with the Declaration of Helsinki.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ficorella, L., Yang, X., Mavaddat, N. et al. Adapting the BOADICEA breast and ovarian cancer risk models for the ethnically diverse UK population. Br J Cancer 133, 844–855 (2025). https://doi.org/10.1038/s41416-025-03117-y
Received:
Revised:
Accepted:
Published:
Version of record:
Issue date:
DOI: https://doi.org/10.1038/s41416-025-03117-y
This article is cited by
-
Performance of different polygenic risk scores for breast cancer risk prediction: in-depth evaluations across large UK and Australian cohorts
European Journal of Human Genetics (2026)





