Introduction

The global burden of breast cancer is projected to exceed 3 million new cases and 1 million deaths annually by 20401. Several lifestyle factors that increase breast cancer risk have been identified2. Higher alcohol consumption3, overweight or obesity (especially in postmenopausal women)4, delayed childbirth5, not breastfeeding6, and prolonged use of hormone replacement therapy (over five years)7 are associated with an elevated risk of breast cancer. These risk factors are particularly relevant to estrogen receptor-positive (ER+) breast cancer, as its growth is driven by estrogen signaling8. Additionally, the Women’s Health Initiative (WHI) Dietary Modification (DM) clinical trial showed that reducing fat intake to 20% of total energy may lower the risk of breast cancer mortality in postmenopausal women9. Moreover, a large-scale prospective cohort study indicated that a long-term anti-inflammatory diet, characterized by a higher intake of vegetables, fruits, nuts, seeds, and fish, may improve survival among breast cancer survivors10.

Ultra-processed foods (UPF) are industrial formulations typically containing five or more ingredients, including additives such as flavorings, colorings, emulsifiers, and preservatives11. These foods are often ready-to-eat, affordable, hyper-palatable, energy-dense, and heavily marketed. Two case-control studies12,13 suggest a potential relationship between UPF consumption and increased breast cancer risk. In addition, the European Prospective Investigation into Cancer and Nutrition (EPIC) study, which enrolled 450,111 participants from ten European countries, indicated that substituting 10% of processed foods with an equal amount of minimally processed foods was associated with a reduced risk of postmenopausal breast cancer14. Some studies also suggest that younger women who consume more UPF may face a significantly higher risk15. Nevertheless, studies focusing on the association between UPF and pan-cancer risk are less likely to fully adjust for potential confounders specific to breast cancer, such as reproductive risk factors and hormone replacement therapy use.

Here, we aim to evaluate whether UPF consumption is associated with breast cancer risk or death in the US population using prospective data from the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial.

Method

Participants

The dataset used in this study originates from the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, a large-scale, randomized prospective study initiated and supported by the U.S. National Cancer Institute (NCI) to evaluate the effects of cancer screening on mortality outcomes among adults aged 55–74. Detailed trial information, including registration numbers (NCT00002540, NCT01696968, NCT01696981, NCT01696994, and NCT00339495), is accessible through ClinicalTrials.gov and the CDAS portal (https://cdas.cancer.gov/plco/). Approximately 155,000 individuals were recruited between November 1993 and July 2001, with cancer incidence and mortality data tracked through 2022. This analysis focused solely on female participants. Among the 78,209 women initially randomized into either the screening (n = 39,103) or usual care (n = 39,106) arm, a series of exclusions was applied. These included women who did not return the baseline questionnaire (n = 2,094), submitted incomplete or invalid baseline data (n = 5,120), had a cancer diagnosis before completing the diet history questionnaire (DHQ) (n = 6,722), failed to provide a valid DHQ (n = 2,664), reported implausible dietary energy intake based on interquartile range thresholds (n = 2,004), or lacked sufficient follow-up time (under 6 months post-DHQ or missing follow-up; n = 21,086). After these exclusions, the final analytic cohort comprised 50,855 women, with 25,441 in the intervention group and 25,414 in the control group, as illustrated in Supplementary Fig. 1.

Assessment of ultra-processed food consumption

The method used to assess UPF consumption in this study has been described previously16. In brief, two dietitians categorized all food and drink items in the DHQ into one of four groups according to the NOVA classification11. The NOVA classification, which is based on the purpose, nature, and extent of food processing, defines four food groups: unprocessed or minimally processed foods, processed culinary ingredients, processed foods, and UPF. Detailed descriptions, including definitions and examples for each group, can be found in the NOVA classification11. Our study focused primarily on UPF, which include items such as sour cream, cream cheese, ice cream, frozen yogurt, fried foods, breads, cookies, cakes, pastries, salty snacks, breakfast cereals, instant noodles and soups, sauces, margarine, candy, soft drinks, fruit drinks, and fast food items like hamburgers, hot dogs, and pizza. Using a previously reported categorization method17, we further classified all UPF items into nine categories: (1) animal-based processed foods; (2) sugar-sweetened beverages; (3) artificially sweetened beverages; (4) cookies, cakes, and pastries; (5) savoury snacks; (6) milkshakes and dairy-based desserts; (7) sweets and confectionery; (8) salads, spreads, and sauces; and (9) other UPF. We calculated each participant’s total UPF intake by summing the reported amount consumed (grams per day) across all UPF items (see Supplementary Table 1). We next estimated the energy contribution of each UPF item by converting the reported grams to kilocalories (amount in grams ÷ 100 × energy value per 100 g). The total energy from UPF was obtained by summing these item-specific values. However, because UPF intake includes non-energy-providing products (e.g., artificially sweetened beverages) as well as additives and other non-nutritional components introduced during processing, we used total UPF weight (grams per day) as the primary exposure metric in this study.
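
The two exposure metrics described above (total UPF weight and the energy contributed by UPF) can be illustrated with a minimal sketch. The `FoodItem` structure, the function names, and the three example items with their gram and energy values are hypothetical and are not taken from the DHQ or the PLCO data; only the conversion (grams ÷ 100 × energy per 100 g) follows the text.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    """One reported food item (hypothetical structure for illustration)."""
    name: str
    grams_per_day: float
    kcal_per_100g: float
    is_upf: bool  # True if the item falls in NOVA group 4


def item_energy_kcal(item: FoodItem) -> float:
    # Conversion described in the text: grams / 100 * energy per 100 g
    return item.grams_per_day / 100.0 * item.kcal_per_100g


def upf_summary(items: list) -> dict:
    """Return UPF weight (g/day), UPF energy (kcal/day), and UPF energy share (%)."""
    upf_grams = sum(i.grams_per_day for i in items if i.is_upf)
    upf_kcal = sum(item_energy_kcal(i) for i in items if i.is_upf)
    total_kcal = sum(item_energy_kcal(i) for i in items)
    share = upf_kcal / total_kcal * 100.0 if total_kcal > 0 else 0.0
    return {"upf_grams": upf_grams, "upf_kcal": upf_kcal, "upf_energy_pct": share}


# Illustrative values only: a zero-calorie beverage adds weight but no energy.
example_items = [
    FoodItem("artificially sweetened soda", 355.0, 0.0, True),
    FoodItem("cookies", 40.0, 480.0, True),
    FoodItem("apple", 150.0, 52.0, False),
]
summary = upf_summary(example_items)
```

In this toy example the diet soda contributes most of the UPF weight but none of the UPF energy, which is the situation that motivates a weight-based primary exposure.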

Covariates

The women reported their BMI (kg/m2), pack-years of cigarette smoking, age at baseline, menarche, and menopause, history of fertility, race (white non-Hispanic, black non-Hispanic, Hispanic, Asian, Pacific Islander, and American Indian), education level (less than high school, high school graduate or equivalent, post-high school education, college education, or higher), female hormone and birth control pill use, and family history of breast cancer. Data on total energy intake, fat intake, and alcohol drinking status were collected from the DHQ. In addition, the Supplemental Questionnaire (SQX), which included patient-reported measures, contained questions related to physical activity. Participants were asked whether they had engaged in moderate or strenuous physical activity at least once per month during the past 12 months. For this study, we focused on four binary (yes/no) items assessing the frequency of exercise.

Outcome ascertainment

Outcomes included diagnosis of breast cancer and time to breast cancer events. All reports of breast cancer were followed up, and medical records were abstracted and reviewed for case ascertainment. We identified breast cancer cases according to the International Classification of Diseases for Oncology, Third Edition (ICD-O-3; codes C50.0–C50.6 and C50.8–C50.9). Death and cause of death were determined through annual questionnaires, next-of-kin notifications, death certificate verification, and National Death Index searches.
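
As a minimal sketch of the case definition, the snippet below flags breast cancer records by topography code. It assumes the code range in the text is read as C50.0 to C50.6 plus C50.8 and C50.9 (that is, C50.7 is not included), and the function names and the handling of undotted codes such as `C509` are illustrative assumptions rather than the PLCO coding scheme.

```python
# Topography codes taken from the range quoted in the text (assumed reading).
BREAST_TOPOGRAPHY_CODES = frozenset(
    f"C50.{digit}" for digit in (0, 1, 2, 3, 4, 5, 6, 8, 9)
)


def normalize_code(raw: str) -> str:
    """Upper-case the code and insert the dot if it was stored as e.g. 'C509'."""
    code = raw.strip().upper()
    if len(code) == 4 and "." not in code:
        code = code[:3] + "." + code[3]
    return code


def is_breast_cancer(raw_code: str) -> bool:
    """True if the code falls in the breast cancer case definition."""
    return normalize_code(raw_code) in BREAST_TOPOGRAPHY_CODES
```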

Statistical analysis

Eligible female participants were categorized into quintiles (Q1–Q5) based on their UPF intake (grams per day) to enable more detailed stratification in analyses of breast cancer incidence, which involved a relatively large number of cases. For breast cancer mortality, however, UPF consumption was grouped into tertiles (T1–T3) because of the smaller number of events, minimizing the risk of sparse data and improving the robustness of the estimates. Baseline characteristics across UPF intake groups were summarized using means and standard deviations (SD) for continuous variables, whose distributions were assessed with the Shapiro-Wilk test (P < 0.05 for all). Categorical variables were described using counts and percentages. Group comparisons were assessed using independent t-tests for continuous variables and chi-square tests for categorical variables.
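
The quintile and tertile grouping can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: cut points come from Python's `statistics.quantiles` with the inclusive method, values equal to a cut point are assigned to the lower group, and the intake values are made up.

```python
import bisect
import statistics


def quantile_groups(values, n_groups):
    """Assign each value to a group 1..n_groups using sample quantile cut points.

    Values equal to a cut point go to the lower group (assumed tie rule).
    """
    cuts = statistics.quantiles(values, n=n_groups, method="inclusive")
    return [bisect.bisect_left(cuts, v) + 1 for v in values]


# Hypothetical UPF intakes in g/day for ten participants.
intakes = [100.0 * k for k in range(1, 11)]
quintiles = quantile_groups(intakes, 5)  # Q1..Q5 for incidence analyses
tertiles = quantile_groups(intakes, 3)   # T1..T3 for mortality analyses
```

The same helper serves both analyses; only the number of groups changes with the number of events available.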

Cox proportional hazards regression models were employed to estimate both crude and adjusted hazard ratios (HRs) with corresponding 95% confidence intervals (CIs). The proportional hazards assumption was verified using the Schoenfeld residuals test18. For modeling breast cancer-specific mortality, a Fine-Gray subdistribution hazards model was applied, treating deaths from other causes as competing risks. Confounding variables were identified using the “change-in-estimate” approach19, which evaluated potential confounders based on their associations with both UPF intake and breast cancer outcomes, or on their ability to alter the crude HR by more than 10% in bivariate analyses. Follow-up time for breast cancer incidence was defined as the interval (in days) between completion of the DHQ and breast cancer diagnosis. For those who remained cancer-free, follow-up was censored at the date of death or last known contact on or before December 31, 2009.
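
Two of the rules in this paragraph lend themselves to a short sketch: the definition of follow-up time with censoring, and the 10% change-in-estimate criterion. The function names and argument layout are hypothetical, the relative change is computed on the HR scale (an assumption; some implementations use the log scale), and a known death date is assumed to take precedence over last contact.

```python
from datetime import date

ADMIN_CENSOR_DATE = date(2009, 12, 31)  # end of follow-up stated in the text


def follow_up_days(dhq_date, diagnosis_date=None, death_date=None, last_contact=None):
    """Return (days of follow-up, event indicator) for breast cancer incidence."""
    if diagnosis_date is not None and diagnosis_date <= ADMIN_CENSOR_DATE:
        return (diagnosis_date - dhq_date).days, 1
    # Cancer-free by the cutoff: censor at death or last contact,
    # but never later than the administrative cutoff.
    end = death_date if death_date is not None else last_contact
    if end is None or end > ADMIN_CENSOR_DATE:
        end = ADMIN_CENSOR_DATE
    return (end - dhq_date).days, 0


def changes_estimate(crude_hr, adjusted_hr, threshold=0.10):
    """True if adjusting for a covariate shifts the crude HR by more than 10%."""
    return abs(adjusted_hr - crude_hr) / crude_hr > threshold
```

For example, a covariate that moves the HR from 1.50 to 1.30 (a 13% change) would be retained under this rule, whereas one that moves it from 1.15 to 1.16 would not.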

To assess potential effect modification, interaction terms between UPF intake and key covariates were added to the multivariable-adjusted models. Statistical significance of interactions was determined using likelihood ratio tests, with P for interaction < 0.05 indicating significant modification. Accordingly, we performed stratified analyses based on predefined variables, including age, smoking status, and family history of breast cancer, which are well-established factors known to influence breast cancer risk and/or dietary effects20.
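
A likelihood ratio test for interaction compares the log-likelihood of the model with the interaction terms against the model without them. The sketch below shows the arithmetic only; the log-likelihood values in the usage note are invented, and to stay within the standard library the chi-square tail probability is implemented only for 1 or an even number of degrees of freedom (for instance, 4 when quintile indicators are crossed with a binary covariate).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or even df only)."""
    if x < 0:
        raise ValueError("statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        half = x / 2.0
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch supports df = 1 or even df only")


def interaction_lrt(loglik_without, loglik_with, df):
    """Return (LRT statistic, P for interaction) for nested models."""
    stat = 2.0 * (loglik_with - loglik_without)
    return stat, chi2_sf(max(stat, 0.0), df)
```

With hypothetical log-likelihoods of -1000.0 without and -995.0 with the interaction terms and 4 degrees of freedom, `interaction_lrt(-1000.0, -995.0, 4)` gives a statistic of 10.0 and a P value below 0.05.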

All statistical tests were two-sided, and results with P < 0.05 were considered statistically significant. Data analyses were performed using R software (version 4.2.1).

Ethics statement

The study protocol of the PLCO Cancer Screening Trial was approved by the Institutional Review Board of the National Cancer Institute (PLCO-1308), and all participants provided written informed consent. The present study is a secondary analysis of de-identified data obtained from the PLCO database. Because the dataset does not contain individually identifiable private information, this research does not involve human subjects as defined by U.S. federal regulations (45 CFR 46.102(e)). Therefore, this secondary analysis is considered exempt from IRB review. The Guangdong Food and Drug Vocational College Institutional Review Board has confirmed this classification. All methods were carried out in accordance with relevant guidelines and regulations.

Results

Participant characteristics

Baseline characteristics of the study population (n = 50,855) by UPF consumption are shown in Table 1. In the whole study population, the mean (SD) UPF consumption was 950 (725) g/day, and UPF contributed a mean (SD) of 53.8% (16.9%) of total daily energy intake. Participants with higher UPF intake (highest vs. lowest quintile) were more likely to be younger and white or black non-Hispanic, and to have a higher BMI, more pack-years of cigarette smoking, higher total energy and fat intake, and a lower education level (all P < 0.05). Among breast cancer-related lifestyle factors, higher UPF consumption was associated with current alcohol drinking, female hormone use, younger age at first birth, birth control pill use, and a higher number of births (all P < 0.05).

Table 1 Baseline characteristics of female subjects by ultra‑processed food consumption in the PLCO screening trial.

UPF and the risk of breast cancer occurrence and breast cancer-specific mortality

The risk estimates for breast cancer incidence and mortality in relation to UPF consumption (Q1: lowest UPF consumption; Q5: highest UPF consumption) are presented in Table 2. Over an average follow-up period of 8.8 years, a total of 2,245 breast cancer cases and 270 breast cancer-specific deaths were recorded. Over 445,998 person-years, the overall rates were 50 breast cancer cases and 6 breast cancer-specific deaths per 10,000 person-years.
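
The reported rates follow directly from the event counts and person-years given above, as this short check shows. The counts are those stated in the text; the variable and function names are illustrative.

```python
def rate_per_10000(events, person_years):
    """Crude event rate per 10,000 person-years."""
    return events / person_years * 10_000


cases = 2_245            # incident breast cancer cases
deaths = 270             # breast cancer-specific deaths
person_years = 445_998   # total follow-up
participants = 50_855    # analytic cohort size

incidence_rate = rate_per_10000(cases, person_years)   # rounds to 50
mortality_rate = rate_per_10000(deaths, person_years)  # rounds to 6
mean_follow_up_years = person_years / participants     # close to 8.8 years
```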

In the unadjusted (Q2: HR 1.15; 95% CI 1.01–1.31, P = 0.04; Q5: HR 1.15; 95% CI 1.01–1.31, P = 0.04) and age-adjusted (Q2: HR 1.15; 95% CI 1.01–1.32, P = 0.04; Q5: HR 1.16; 95% CI 1.02–1.33, P = 0.03) models, participants in the second and fifth quintiles of UPF consumption had a 15–16% higher risk of breast cancer than those in the lowest quintile (Table 2). We then applied the change-in-estimate approach (see Method) to identify confounders, which included age, body mass index (BMI), physical activity, total energy intake, fat intake, race, educational level, hormone replacement therapy, alcohol intake, pack-years of cigarette smoking, and family history of breast cancer. After adjustment for these confounders, only the second quintile of UPF consumption remained significantly associated with breast cancer, with a 16% increased risk (HR for Q2 vs. Q1: 1.16; 95% CI 1.01–1.34, P = 0.04; Table 2). No linear trend across quintiles of UPF consumption was observed.

We used the competing risk model for breast cancer-specific mortality to estimate subdistribution hazard ratios. In the age-adjusted model, the highest tertile of UPF consumption was associated with a higher risk of breast cancer mortality (T3 vs. T1: HR 1.34; 95% CI 1.00–1.80, P = 0.05). In addition, each one-tertile increase in UPF consumption was associated with a higher risk of breast cancer mortality in the age-adjusted model (HR: 1.16, 95% CI: 1.00–1.35, P = 0.05; Table 2). A similar trend was observed in the fully adjusted model, although the association did not reach statistical significance (HR: 1.16, 95% CI: 0.99–1.35, P = 0.06; Table 2).

Table 2 HRs (95% CIs) for breast cancer incidence by quintiles/tertiles of UPF consumption (gram/day) in the screening arm of the PLCO screening trial: 1993–2009.

Subgroup analysis

To identify potential interactions between known breast cancer risk factors and UPF consumption on breast cancer incidence, we assessed baseline characteristics and lifestyle factors (age, BMI, total energy intake, fat intake, race, educational level, hormone replacement therapy, alcohol intake, pack-years of smoking, and family history of breast cancer) using multivariable Cox regression models (Table 3). UPF intake was significantly associated with breast cancer risk particularly among older participants (age > 65 years; HR for Q3: 1.30; HR for Q5: 1.41; P for trend = 0.02), women who drank alcohol (HR for Q2: 1.18), and those with a family history of breast cancer (HR for Q2: 1.64; HR for Q3: 1.45) (all P for interaction ≤ 0.05; Table 3).

Table 3 Risk of breast cancer incidence by quintiles of UPF consumption, stratified by clinicopathological factors, in the PLCO cancer screening trial: 1993–2009.

Discussion

Principal findings

In this large prospective study with long-term follow-up, we found that UPF consumption was associated with higher risks of breast cancer incidence and mortality. Subgroup analyses further showed that the harmful association between UPF consumption and breast cancer risk was more pronounced in women who were older (> 65 years), had detrimental lifestyle habits such as smoking or drinking alcohol, or had a family history of breast cancer.

Comparison with prior work

Several studies12,13,14,15,21,22 have examined the association of UPF consumption with breast cancer risk, but the results are inconsistent. Kliemann et al. demonstrated that higher consumption of processed and ultra-processed foods was correlated with increased risks of overall cancer and specific cancers, except for breast cancer, in a large, multicenter, prospective cohort study14. Similarly, Jacobs et al. found no significant association between higher consumption of culinary ingredients, processed foods, and ultra-processed foods and breast cancer risk in a case-control study involving Black women from Soweto, South Africa22. However, their classification appears less comprehensive, as it did not include certain culturally relevant or potentially impactful foods such as fried chicken and processed meats. Notably, these two studies primarily considered baseline characteristics and dietary information as confounding factors in their multivariable models and did not fully adjust for reproductive factors related to breast cancer or the use of hormone replacement therapy. In studies that accounted for reproductive factors and breast cancer-related lifestyle factors such as smoking, alcohol drinking, and hormone replacement therapy, UPF intake was consistently associated with an increased risk of breast cancer in dose-response analyses12,21 and categorical analyses12,13,15. A recent meta-analysis23, which included six articles involving 462,292 participants, suggested that higher consumption of UPF is associated with a slightly higher risk of breast cancer. Jian-Yuan et al. found that among 1,100 colorectal, 1,750 lung, 4,336 prostate, and 2,443 breast cancer patients, reducing ultra-processed food consumption before diagnosis improved overall survival for lung and prostate cancer patients, but not for those with breast cancer24.

In this study, several associations were observed in intermediate UPF categories rather than following a consistent dose–response pattern. This may indicate a non-linear relationship between UPF intake and breast cancer outcomes, which could reflect differential contributions of specific UPF subgroups, measurement error in self-reported dietary assessments, or threshold effects. Future studies with more granular dietary data and repeated measurements are warranted to clarify these potential non-linear associations.

Strengths and limitations

The strengths of this study include its prospective cohort design, comprehensive assessment of UPF intake based on the NOVA classification, and long follow-up period. To minimize confounding, potential confounding factors were screened using the “change-in-estimate” method19 and adjusted for in the multivariable model. In the analysis of UPF intake and breast cancer mortality, competing events such as deaths from cardiovascular disease or other cancers were modelled using the Fine and Gray model. However, there are several limitations. The PLCO trial only recruited participants aged 55–74 years, limiting the generalizability of the findings to other age groups. Misclassification bias might occur in categorizing food items because of insufficient information in the DHQ, although this bias is likely nondifferential and could bias risk estimates toward the null. Most importantly, residual confounding due to unmeasured confounders may still be present, particularly given the modest magnitude of the associations between UPF consumption and breast cancer risk and mortality. Additionally, higher UPF consumption is associated with poorer overall diet quality25, which may mediate the relationship between UPF consumption and breast cancer mortality. We acknowledge that using a single baseline measure of UPF consumption and baseline confounders, rather than longitudinal profiles, may also limit the generalizability of our findings to the current population. Lifestyle patterns, including BMI, energy intake, UPF consumption, and alcohol and tobacco use, may have shifted over time, potentially affecting the associations. Finally, a key limitation was the absence of individual-level data on breast cancer screening practices within the PLCO dataset, which could represent an important confounding factor.

Conclusion

To conclude, the findings of this study showed that a high intake of UPF is associated with a slightly elevated risk of breast cancer incidence and death. In addition, these findings suggest that dietary interventions may warrant further investigation in future trials, particularly among women with adverse lifestyle factors such as smoking, alcohol consumption, or a family history of breast cancer. Further prospective studies are needed to confirm these results.