Introduction

Breast cancer (BC) is currently the most commonly diagnosed cancer and the leading cause of cancer mortality among women in Europe, with an estimated 557,900 new cases and 144,500 deaths in 2022. In France alone, there were an estimated 61,200 new diagnoses and 12,100 deaths in 2023 (refs. 1,2).

To detect BC, public health agencies recommend tailored screening strategies based on individual risk profiles (Fig. 1)3.

Figure 1 Screening guidelines in France.

Clinical trials indicate that mammography can reduce BC mortality by 10–20% by aiding early detection and thereby improving treatment outcomes4,5,6,7. Expert groups presented substantial evidence supporting the performance and cost-effectiveness of mammographic screening5, which prompted the initiation, in the 2000s, of organized screening programs across Europe8. The French Organized Screening Program for Breast Cancer (DOCS) was introduced in 2004; in 2020–2021, its participation rate was 47.7% among 10.8 million eligible women9.

In 2023, a meta-review of BC screening guidelines reported the five most recommended metrics to assess mammography performance: sensitivity (recommended in 11 guidelines), cancer detection rate per 1000 women screened (10 guidelines), cancer size, interval cancer rate, and positive predictive value (7 guidelines)10. Public health institutions and other studies emphasize the importance of these metrics to evaluate mammography performance.

In 2004, the National Agency for Healthcare Accreditation and Evaluation (ANAES) recommended two types of test validity measures: sensitivity and specificity, assessing the intrinsic validity of screening tests, and predictive values, relevant to the screened population11. In 2008, the European Commission Initiative on Breast Cancer (ECIBC) recommended organized screening mammography based on findings from a Danish study indicating that its sensitivity and specificity outperformed those of opportunistic or non-organized screening programs12,13. Regional studies in France and Germany estimated that approximately 15–20% of cancers detected within screened populations are identified within 24 months following a negative mammography14,15.

In 2019, a report by Santé Publique France on the performance of DOCS mammography highlighted the need to align with European guidelines to improve sensitivity and predictive values, and pointed to the need for registries to track cancers occurring after a negative screening16. In 2018, 33 cancer registries covered 24 of the 101 French departments17,18, and their heterogeneous database formats made them difficult to use for data linkage19.

This study aimed (1) to assess the performance of mammography by estimating the sensitivity, specificity, predictive values, cancer detection rate, and interval cancer rates of the screening program without relying on a cancer registry and (2) to compare these estimates with findings from other studies.

Material and methods

Data sources

Screening data were obtained from the Coordination Center of Cancer Screening in the Occitanie region (CRCDC-OC) and included individual patient characteristics, full-field digital mammography (FFDM) images, and mammography results. Mammography exams comprise a clinical assessment (palpation and observation) and a double reading of four images, two per breast: one cranio-caudal (CC) view and one mediolateral oblique (MLO) view. Following the initial reading, additional exams such as image enlargement or fine-needle biopsy may be conducted as needed20. Data were collected from centers in the Gard and Lozère departments (see Supplementary Figure S1) between 2004 and 2020, supported by individual or collective communications. Patients’ data were pseudonymized twice. The data processing was designated as being of public interest and complied with the European General Data Protection Regulation (GDPR) and French privacy standards, as authorized by the French National Data Protection Authority (CNIL) under authorization number DR-2020-365, granted in 2020. All methods were performed in accordance with these guidelines and regulations.

For all screened women, medico-administrative data from the National Health Data System (SNDS) were also collected. Established in 2016 to unify public insurance databases, the SNDS provides general information (age, sex, and place of residence), treatment and medical procedure data (SNIIRAM tables), patient records from healthcare institutions (PMSI tables), and cause-of-death data (CépiDc tables), generally retained for 20 years from the date of inclusion21,22. The SNDS contains no individual socioeconomic data for the general population23.

The linkage between the screening database and the SNDS database was facilitated through a technological platform provided by the Health Data Hub (HDH), a public entity formed in 2019 to support SNDS-related research projects24,25.

Study population

Every two years, the DOCS screening program invites women aged 50 to 74 years who have a Social Security number (the identifier used by the French healthcare reimbursement system), no reported risk factors, and no previously reported breast lesions26. Our study population included all women invited to the program who had at least one screening mammography between 2011 and 2018. The start year, 2011, was chosen to avoid the impact of the shift from analog to digital mammography between 2008 and 2010, which likely affected false-negative rates in 2010 (see Supplementary Figure S2). The end year, 2018, was selected to allow a two-year follow-up period in the SNDS database for cancer detection. Mean age in the study cohort was 61 years. A total of 1,407 examinations (0.2%) that did not comply with DOCS guidelines, such as negative first readings not followed by a second reading, were excluded27. In total, 252,786 screening exams (29,661 to 33,447 annually) from 111,783 women were included.

BC identification

BC cases were identified either by the screening program or through the SNDS medico-administrative follow-up, which lasted for the shorter of two periods: 24 months or until the next screening, if applicable (Fig. 2). Women with screen-detected cancer were also followed up in the SNDS to provide an initial assessment of the accuracy of our identification method.

In the SNDS, five criteria were used for BC identification. BC deaths were identified in the Center of Epidemiology on Medical Causes of Death (CépiDc) tables (1). BC surgeries, such as total mastectomy or partial mastectomy with axillary lymph node dissection, were identified in the SNIIRAM tables, with the first qualifying surgery found up to 24 months after mammography serving as a BC identification proxy (2). Breast surgeries potentially indicating cancer (e.g. partial mastectomy or tumorectomy without axillary dissection) were identified in the SNIIRAM tables, with cancer treatments used to confirm the BC diagnosis (3). Surgery-to-treatment intervals were based on the existing oncological literature28. Targeted therapy (TT): from 250 days before to 180 days after surgery. Endocrine therapy (ET): from 250 days before to 365 days after surgery. Radiotherapy (RT): from 150 days before to 365 days after surgery. Chemotherapy (CT): from 250 days before to 180 days after surgery.

Other cases without recorded death or surgery were identified up to 24 months post-screening either via the SNIIRAM Long-Term Conditions tables (LTC; ALD in French), where ICD-10 (CIM-10 in French) diagnosis codes C50 and D05 were confirmed with treatment (4), or via BC diagnoses found in the hospital stay data tables (PMSI), with confirmatory treatments (5). Among LTC cases, 2% had no record of TT, ET, RT, or CT. Among PMSI cases, 64% had no record of TT, ET, RT, or CT.
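The confirmation step based on surgery-to-treatment windows can be sketched as follows. This is a minimal illustration, not the study's code: it assumes each window runs from a number of days before surgery to a number of days after it, and the names `TREATMENT_WINDOWS`, `is_confirmatory`, and `surgery_is_confirmed` are hypothetical.

```python
from datetime import date

# Windows in days relative to the surgery date (negative = before surgery).
# Assumption: each window spans "days before" to "days after" surgery,
# as spelled out for targeted therapy in the text.
TREATMENT_WINDOWS = {
    "TT": (-250, 180),  # targeted therapy
    "ET": (-250, 365),  # endocrine therapy
    "RT": (-150, 365),  # radiotherapy
    "CT": (-250, 180),  # chemotherapy
}


def is_confirmatory(treatment: str, surgery_date: date, treatment_date: date) -> bool:
    """Return True if the treatment date falls inside its window around surgery."""
    lower, upper = TREATMENT_WINDOWS[treatment]
    offset = (treatment_date - surgery_date).days
    return lower <= offset <= upper


def surgery_is_confirmed(surgery_date: date, treatments: list[tuple[str, date]]) -> bool:
    """A surgery counts as confirmed if at least one treatment falls in its window."""
    return any(is_confirmatory(kind, surgery_date, when) for kind, when in treatments)
```

For example, a radiotherapy session starting 59 days after a partial mastectomy would confirm the case, whereas chemotherapy recorded 200 days after surgery would not.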

Mammography result and screening result

The American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS)29 defines seven categories for breast imaging test results (category definitions are detailed in Supplementary Figure S3). In the DOCS screening program, mammography results range from 1 (normal) to 5 (highly suggestive of malignancy). A mammography result was deemed positive if rated 3, 4, or 5 and negative if rated 1 or 2, following the DOCS evaluation methods of national authorities16,30,31. The screening outcome was classified as positive or negative by integrating both mammography results and any additional exam findings.

Interval cancers

Interval cancers, defined as cases detected after a negative mammography within 24 months and before the next screening, were identified through SNDS follow-up or through women self-reporting their cancer diagnosis to the program (Fig. 2)12,13,14,15,16. The interval cancer rate was defined as the number of interval cancer cases per 1,000 screenings. The relative interval cancer rate was defined as the proportion of cancers detected after a negative mammography.
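The interval cancer definition above can be illustrated with a short sketch. The boundary handling is an assumption: a diagnosis exactly 24 calendar months after screening counts, and one on the day of the next screening does not. The function names are hypothetical.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day (e.g. 29 Feb -> 28 Feb)."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_interval_cancer(
    mammography_positive: bool,
    screen_date: date,
    diagnosis_date: date,
    next_screen_date: Optional[date] = None,
) -> bool:
    """Cancer after a negative mammography, within 24 months and before the next screening."""
    if mammography_positive:
        return False
    if diagnosis_date <= screen_date:
        return False
    if diagnosis_date > add_months(screen_date, 24):
        return False
    if next_screen_date is not None and diagnosis_date >= next_screen_date:
        return False
    return True
```

Under these assumptions, a cancer diagnosed 18 months after a negative exam is an interval cancer unless the woman was screened again before the diagnosis.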

Figure 2 Screening track with additional SNDS follow-up for BC identification. Keys (1) to (5) refer to the SNDS identification criteria defined in the BC identification section. KS means “cancer”; conf. means “confirmatory”. The dashed line illustrates patients self-reporting cancer to a screening center.

Metrics definition

Sensitivity (SE), specificity (SP), positive and negative predictive values (PPV/NPV), cancer detection rate (CDR), interval cancer rate (ICR), and relative interval cancer rate (RICR) were computed with different combinations of SNDS criteria for cancer identification, in addition to screen-detected cases (Table 1).

Table 1 Definition of mammography performance metrics.
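As an illustration of how these metrics relate to one another, the sketch below computes them from the four cells of a confusion table. It uses the standard textbook definitions, which may differ in detail from Table 1. In particular, CDR is assumed here to count all cancers identified during follow-up per 1,000 screenings, which is how the Results describe it. The class name, function name, and counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # positive mammography, cancer identified
    fp: int  # positive mammography, no cancer identified
    fn: int  # negative mammography, cancer identified (interval cancer)
    tn: int  # negative mammography, no cancer identified


def performance_metrics(c: Counts) -> dict[str, float]:
    """Standard screening metrics; rates are per 1,000 screenings."""
    n = c.tp + c.fp + c.fn + c.tn
    cancers = c.tp + c.fn
    return {
        "SE": c.tp / cancers,
        "SP": c.tn / (c.tn + c.fp),
        "PPV": c.tp / (c.tp + c.fp),
        "NPV": c.tn / (c.tn + c.fn),
        "CDR": 1000 * cancers / n,  # assumption: all cancers found within follow-up
        "ICR": 1000 * c.fn / n,
        "RICR": c.fn / cancers,     # equals 1 - SE under these definitions
    }


# Made-up counts for illustration only, not the study's data.
example = performance_metrics(Counts(tp=780, fp=3000, fn=220, tn=96000))
```

With these definitions RICR is the complement of SE, which matches the reported pair of 22.1% and 77.9%.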

Outcomes and statistical analyses

Metrics were computed over the whole 2011–2018 period and annually, with and without stratification.

To stratify the metrics, the following features were considered: age at the exam, rank of the screening exam and ACR level of the exam.
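Stratification by one of these features can be sketched as a simple group-by. The example below computes PPV per age band, assuming exam records are dictionaries with the hypothetical keys `age`, `mammography_positive`, and `cancer`. The age bands follow those reported in the Results, and the helper names are illustrative.

```python
from collections import defaultdict
from typing import Callable, Iterable, Optional


def age_band(age: int) -> Optional[str]:
    """Map an age at exam to the bands used in the Results; None if outside 50-74."""
    if 50 <= age <= 59:
        return "50-59"
    if 60 <= age <= 69:
        return "60-69"
    if 70 <= age <= 74:
        return "70-74"
    return None


def ppv_by_stratum(
    exams: Iterable[dict],
    stratum_of: Callable[[dict], Optional[str]],
) -> dict[str, float]:
    """PPV per stratum: confirmed cancers among positive mammographies."""
    positives: dict[str, int] = defaultdict(int)
    confirmed: dict[str, int] = defaultdict(int)
    for exam in exams:
        if not exam["mammography_positive"]:
            continue  # PPV only concerns positive exams
        key = stratum_of(exam)
        if key is None:
            continue  # exam falls outside the defined strata
        positives[key] += 1
        if exam["cancer"]:
            confirmed[key] += 1
    return {key: confirmed[key] / positives[key] for key in positives}
```

The same function could be reused with another `stratum_of` callable, for instance one returning the exam rank or the ACR level.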

Confidence intervals (CI) were computed with a dedicated method for each metric32,33,34. Results were obtained with a Python 3 kernel in the HDH environment.
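The text does not spell out which interval method was used for each metric. As one common choice for proportions such as SE or PPV, the sketch below implements the Wilson score interval with the Python standard library. It is an illustrative assumption, not necessarily the method of the cited references.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width
```

Unlike the simple normal approximation, this interval stays within [0, 1] and behaves reasonably for proportions close to 0 or 1, which matters for metrics such as NPV.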

Results

Additional cases identified in the SNDS

Among positive mammography exams, the DOCS screening program identified 1,611 cases, and an additional 387 cases were identified by SNDS follow-up up to 24 months after screening.

Among negative mammography exams, the DOCS identified 101 cases, and 670 additional cases were identified in the SNDS (Fig. 3 for the full distribution).

Of the screen-detected cases, 95.3% were also identified through SNDS follow-up.

Figure 3 Additional cases identified in the SNDS, by time elapsed since mammography exam.

Global performance of mammography

Over the period, the SE of mammography was 98.8% (95% CI 98.2–99.2) when SNDS-identified cancers were not included. Including SNDS cases, SE was 77.9% (76.3–79.3) (Fig. 4), indicating that 77.9% of cases appearing within 24 months after screening had a positive mammography result. This result is above the 70% minimum set by European standards35 and aligns with findings from studies using cancer registries13,36,37.

Figure 4 Global performance results. Metric estimates with 95% confidence intervals and yearly minimum/maximum, with and without SNDS-detected cases (bold metrics take into account SNDS-identified cases), compared to other studies (A-B-C: 13,37,38) and European guidelines (EU)35,39.

The PPV of mammography was 15.1% (14.5–15.8) without SNDS-identified cases and 19.8% (19.0–20.5) with SNDS cases. Over time, an upward trend in PPV was observed both with and without SNDS cases. In 2017, 22.4% (20.3–24.5) of patients with a positive mammography had cancer confirmed within 24 months.

CDR without SNDS cases was 6.6‰ (6.2–6.8). Including SNDS cases, CDR was 10.9‰ (10.5–11.3), meaning that approximately 1 in 100 screened patients was diagnosed with cancer within 24 months. This result is consistent with reported CDRs for organised screening with FFDM38,40.

ICR with SNDS cases was 2.4‰ (2.2–2.6), indicating that more than 2 in every 1,000 mammograms were followed by an interval cancer within 24 months. ICR for the first year post-screening was 0.8‰ (0.7–0.9), increasing to 1.7‰ (1.5–1.9) in the second year (Fig. 4). These values are in line with the literature37,38 and comply with European guidelines (details on the ICR guidelines are available at the end of Supplementary Materials)35.

RICR with SNDS cases was 22.1% (20.6–23.7) over the whole period, meaning that 22.1% of cases identified within 24 months after screening occurred after a negative mammography. First year post-screening RICR was 9% (7.9–10.2), and second year RICR was 80% (76.5–83.1). In other words, approximately 1 in 10 cancers diagnosed within the first year after screening had previously been classified as negative on mammography, compared to about 4 in 5 for cancers diagnosed in the second year following screening.

Mammography performance by exam rank

PPV significantly increased with the exam rank (Table 2). This may reflect improved radiologist accuracy with a cumulative screening record, enhancing understanding of the current exam. Alternatively, increased PPV might indicate severity differences among patients with higher screening participation. However, using mean positive ACR result as a severity proxy showed no evidence of correlation between severity and participation rates.

A slight decrease in CDR was observed with higher exam rank, potentially reflecting a reduced cancer incidence in populations undergoing repeated testing.

Table 2 Mammography PPV and CDR by screening exam rank (95% CI).

Mammography performance by age

PPV varied significantly by age (Table 3). For women aged 50–59, PPV was 14.6%, whereas it reached 28.1% for women aged 70–74, suggesting that a positive mammography result was more predictive of cancer in older patients. CDR also increased with age, from 9.1‰ for women aged 50–59 to 14.8‰ for women aged 70–74.

Table 3 Mammography PPV and CDR by age (95% CI).

Mammography performance by ACR level

PPV varied considerably by ACR level. Over the 2011–2018 period, PPV was 6.1% (5.6–6.8) for ACR 3, 50.2% (47.8–52.5) for ACR 4, and 94.2% (92.5–95.6) for ACR 5, without significant changes over time (see Supplementary Figure S4). These findings align with radiologists’ observations on PPV by ACR level41 and comply with French national BC screening guidelines27.

Discussion

In this study, interval cancers were defined as mammography false negatives, whether or not precursor signs could be seen on the mammograms. This definition assumes that truly undetectable interval cancers could stem from the screening program’s limitations, in particular the time interval between two consecutive invitations. Distinguishing interval cancers missed by radiologists (false negatives) from those retrospectively undetectable (true negatives) would allow us to better separate screening inaccuracies from the inherent constraints of the program.

Our stratified analyses showed notable variations in performance metrics, specifically in positive predictive value (PPV) and cancer detection rate (CDR), across age groups and with successive screening exams. These findings may suggest that breast cancer in older patients has characteristics that make it easier to detect, as evidenced by the increase in ACR scores along with age (3.35 average for women aged 50–59, 3.46 for 60–69, and 3.54 for 70–74, see Supplementary Figure S5). The observed improvement in PPV with subsequent screening exams requires further investigation to identify causal factors.

Although an initial validation showed that our BC identification algorithm is 95.3% sensitive to screen-detected cases, the only way to test whether the algorithm over-identifies cases is to link a BC registry to the SNDS in a French region equipped with a population-based cancer registry. In other words, a future study applying this methodology in a registry-covered department should validate the effectiveness of linking screening data to SNDS data for systematic BC identification.

Conclusion

Our findings indicate substantial changes in SE, PPV, and CDR with SNDS-identified cancers. Additionally, SNDS follow-up enables the computation of ICR and RICR. Most metrics align with European guidelines and findings from registry-based studies (when available). Given the national coverage of the SNDS, this approach has the potential to bridge the gap created by the limited availability of registries in France. Future studies applying this methodology in regions with registries could validate the effectiveness of linking screenings to the SNDS for cancer identification.

Although breast cancer is relatively well contained compared to other cancers, with only 5% of cases diagnosed at stage 4, versus 57% for lung and 65% for colorectal cancers42,43, approximately 30% of survivors eventually develop metastases44. This underscores that, while early detection and treatment are effective, the disease mortality burden remains considerable. Thus, advancing the prediction of cancer severity and outcomes based on patient characteristics is essential. The database used in this study, one of the largest imaging resources for organized breast cancer screening research with more than 250,000 mammographic images and linked medico-administrative follow-up45, offers a valuable opportunity to address these challenges.