Background

The incidence of colorectal cancer (CRC) in patients under 50 years, generally referred to as early-onset (EO)CRC, has rapidly increased since the early 2000s [1,2,3,4]. In the UK, adults under 50 years now account for 10% of all new diagnoses, a significant rise from 5% in 2017. This concerning trend contrasts with a slight decline in overall CRC incidence in the UK over the last decade [5]. The reasons for the rise in EOCRC are unclear but are likely to have resulted from Westernisation of lifestyles, typified by increased consumption of sugar, processed foods and beverages, lack of exercise, urbanisation, pollution and increased antibiotic usage that can disrupt the gut microbiome [6]. EOCRC incidence exhibits a birth cohort effect, whereby the odds of developing CRC are three times higher among adults born in the mid-1980s than in the mid-1960s [6, 7].

In the absence of CRC screening for patients under 50 years, diagnosis depends on symptomatic presentation, which frequently leads to delay because the possibility of cancer is rarely considered. Consequently, EOCRC patients are diagnosed at an advanced stage more often than those with late-onset CRC [8]. The faecal immunochemical test (FIT) measures the amount of haemoglobin (Hb) in a faeces sample and is a diagnostic triage tool for patients presenting with clinical features of CRC in primary care, with a threshold of ≥10 µg Hb/g of faeces used for urgent referral for CRC investigation. FIT became more widely used in primary care practice after the guideline recommendation by the Association of Coloproctology of Great Britain and Ireland and the British Society of Gastroenterology (ACPGBI/BSG) in July 2022 [9]. This prompted the UK’s National Institute for Health and Care Excellence (NICE) to update their Diagnostics Guidance 56 (DG56), incorporating a broader range of symptoms and revising the role of age in FIT recommendations [10]. This is outlined in Box 1.

While the evidence supporting the use of FIT to triage patients aged 50 years or above is well established [10,11,12], there is limited evidence on the performance of FIT for younger patients presenting to primary care in the UK [10]. The primary aim of this study was to determine the diagnostic performance of FIT in the detection of EOCRC, using a symptomatic population under 50 years of age in the upper South West of England.

Methods

This retrospective cohort study was conducted and reported following the Standards for Reporting of Diagnostic Accuracy Group (STARD) initiative checklist for diagnostic tests [13], as well as the Reporting of Studies Conducted Using Observational Routinely-Collected Health Data (RECORD) statement [14], an extension of the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement [15] (Supplementary Materials 1).

Patients and data collection

Participants eligible for inclusion in the cohort were patients aged between 18 and 49 years undergoing at least one FIT in primary care from 1st January 2021 to 10th July 2023 in the Upper South West of England. We collected results from all FITs analysed by Severn Pathology in Bristol, UK, using the HM-JACKarc analyser. The sample included patients tested at any primary care practice in the Somerset, Wiltshire, Avon, and Gloucestershire (SWAG) NHS Cancer Alliance, which covers the Upper South West of England. A threshold of ≥10 µg Hb/g of faeces defined a positive result. The range of possible results was <2 to >400 µg Hb/g. Patients with a result of <2 were assigned a result of 2 µg Hb/g, and patients with a result of >400 were assigned a result of 400 µg Hb/g. Data extracted from the laboratory included NHS number, patient postcode, date of birth, date of FIT, FIT result, and symptoms preceding FIT. If a patient had more than one FIT result recorded, their first test in the study period was used. Symptoms were coded as ‘low risk symptoms’ (reflecting symptoms set out in NICE guidance at the time of testing [16]), ‘change in bowel habit’, ‘non-site-specific symptoms’, ‘IDA’ (iron deficiency anaemia), or ‘two week wait pathway’ (indicating that the patient had a feature or symptom that ‘qualifies’ for an urgent suspected cancer referral). Socioeconomic status was characterised using quintiles of the area-based Index of Multiple Deprivation 2015 (IMD) [17] using postcode data. Postcodes were securely deleted after IMD assignment and after linkage to cancer diagnosis data.
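The handling of out-of-range results and repeat tests described above can be illustrated with a minimal Python sketch. It is not the study's code (the analysis was run in Stata); the string format of the laboratory results ('<2', '>400') and the function names are assumptions made for illustration.

```python
from datetime import date

LOWER_LIMIT = 2.0          # results reported as "<2" are assigned 2 µg Hb/g
UPPER_LIMIT = 400.0        # results reported as ">400" are assigned 400 µg Hb/g
POSITIVE_THRESHOLD = 10.0  # a result of >=10 µg Hb/g defines a positive FIT


def parse_fit_result(raw):
    """Convert a reported result such as '<2', '37' or '>400' to a number.

    Assumption: out-of-range results arrive as strings prefixed by '<' or '>'.
    """
    text = raw.strip()
    if text.startswith("<"):
        return LOWER_LIMIT
    if text.startswith(">"):
        return UPPER_LIMIT
    return float(text)


def is_positive(value, threshold=POSITIVE_THRESHOLD):
    """A FIT is positive when the result is at or above the threshold."""
    return value >= threshold


def first_fit_per_patient(records):
    """Keep each patient's earliest FIT in the study period.

    records: iterable of (patient_id, fit_date, raw_result) tuples.
    Returns {patient_id: (fit_date, numeric_result)}.
    """
    first = {}
    for patient_id, fit_date, raw in records:
        if patient_id not in first or fit_date < first[patient_id][0]:
            first[patient_id] = (fit_date, parse_fit_result(raw))
    return first
```

For example, a hypothetical patient with results of '15' in March 2022 and '<2' in June 2021 would contribute only the June 2021 test, recorded as 2 µg Hb/g and classed as negative.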

All patients aged under 50 years who were diagnosed with CRC in the secondary care NHS trusts within the SWAG Cancer Alliance (Gloucestershire Hospitals NHS Foundation Trust, Royal United Hospitals Bath NHS Foundation Trust, Salisbury NHS Foundation Trust, Somerset NHS Foundation Trust, North Bristol Trust and University Hospitals Bristol and Weston NHS Foundation Trust) from 1st January 2021 to 10th October 2024 were identified. This included patients with and without a pre-diagnosis FIT. This time frame was chosen to ensure at least 12 months of follow-up from the last date of FIT sampling (10th July 2023). Previous work suggests that a 12-month time frame is appropriate to capture FIT-negative CRC patients who may be diagnosed through other routes, such as emergency presentations, while limiting the inclusion of patients diagnosed after 12 months, whose CRC may not have been causing symptoms at the time of their FIT [11]. Data extracted from secondary care included NHS number, ICD-10 codes, and cancer stage at diagnosis. Proximal tumours were defined as ICD-10 codes C180 to C186 (caecum to descending colon), while codes C187, C19, and C20 were classified as distal (sigmoid colon to rectum). Early stage was defined as stage I or II, and advanced stage as stage III or IV.
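The tumour location and stage groupings defined above amount to simple lookups, sketched below in Python. The function names, the tolerance for a dot in the code (e.g. 'C18.2') and the stripping of sub-stage letters (e.g. 'IIIB') are assumptions for illustration; the source states only the code ranges and stage groups.

```python
PROXIMAL_CODES = {f"C18{i}" for i in range(0, 7)}  # C180 to C186
DISTAL_CODES = {"C187", "C19", "C20"}


def tumour_location(icd10):
    """Classify an ICD-10 code as 'proximal' or 'distal'.

    Returns None for codes that do not determine location,
    such as C18 or C189 (colon, unspecified).
    """
    code = icd10.strip().upper().replace(".", "")
    if code in PROXIMAL_CODES:
        return "proximal"
    if code in DISTAL_CODES:
        return "distal"
    return None


def stage_group(stage):
    """Group stage I/II as 'early' and stage III/IV as 'advanced'.

    Assumption: stage is a Roman numeral, optionally followed by a
    sub-stage letter A-C. Returns None when stage is missing or unrecognised.
    """
    if not stage:
        return None
    numeral = stage.strip().upper().rstrip("ABC")
    if numeral in {"I", "II"}:
        return "early"
    if numeral in {"III", "IV"}:
        return "advanced"
    return None
```

Returning None rather than guessing mirrors the way ambiguous ICD-10 codes and unrecorded stages are treated as undetermined in the Results.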

The list of patients with a FIT was linked to the list of CRC diagnoses using NHS number. CRC diagnoses were retained if they occurred within 12 months of a patient’s first FIT, and patients were excluded if they had a CRC diagnosis prior to their first or only recorded FIT. After linkage had been performed, an encrypted and anonymous patient identifier was assigned to each patient and all NHS numbers were securely deleted.
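The linkage rule can be expressed as a small decision function. The sketch below is illustrative only: the outcome labels are hypothetical, 12 months is approximated as 365 days, and treating a diagnosis made more than 12 months after the FIT as 'no_crc' for the purposes of the accuracy analysis is an interpretation of the text rather than something the source states explicitly.

```python
from datetime import date, timedelta
from typing import Optional

FOLLOW_UP = timedelta(days=365)  # assumption: 12 months taken as 365 days


def link_outcome(first_fit: date, crc_diagnosis: Optional[date]) -> str:
    """Classify a patient after linking their first FIT to any CRC diagnosis.

    'excluded': CRC was diagnosed before the first FIT.
    'crc':      CRC was diagnosed on or after the FIT date, within follow-up.
    'no_crc':   no diagnosis, or a diagnosis later than the follow-up window.
    """
    if crc_diagnosis is None:
        return "no_crc"
    if crc_diagnosis < first_fit:
        return "excluded"
    if crc_diagnosis - first_fit <= FOLLOW_UP:
        return "crc"
    return "no_crc"
```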

Statistical analysis

Analyses were conducted using Stata SE version 18.0 [18], undertaking a complete case analysis approach. Summary statistics described all patients with a FIT, all patients with a diagnosis of CRC, and the subgroup of patients with a pre-diagnosis FIT who were subsequently diagnosed with CRC. The Chi-squared test was used to evaluate differences in proportions and the Mann–Whitney U test to assess differences in median values of the summary statistics. Diagnostic sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were estimated using Stata’s diagt command [19] at the following thresholds: 10, 20, 30, 40, 50, 75, and 100 µg Hb/g of faeces. This was completed for all patients, then stratified by age group: 18–29 years, 30–39 years, and 40–49 years. Due to the low number of CRC diagnoses in the 18 to 29 age group, we also included a combined age group of 18–39 years. Stata’s roctab command estimated the area under the receiver operating characteristic curve (AUC) for quantitative f-Hb against CRC diagnosis [20], and the smallest sum of squares of 1-sensitivity and 1-specificity was selected to estimate the FIT threshold to maximise both sensitivity and specificity using Stata’s rocmic command [21].
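The threshold-selection criterion described above (the cut-off minimising the sum of squares of 1-sensitivity and 1-specificity, i.e. the point on the ROC curve closest to the top-left corner) can be sketched in plain Python. This is an illustration of the criterion, not a reimplementation of the Stata commands, and the data are invented toy values, not study data.

```python
def sens_spec(values, has_crc, threshold):
    """Sensitivity and specificity when 'positive' means value >= threshold.

    Assumes at least one patient with CRC and one without.
    """
    tp = sum(1 for v, d in zip(values, has_crc) if d and v >= threshold)
    fn = sum(1 for v, d in zip(values, has_crc) if d and v < threshold)
    tn = sum(1 for v, d in zip(values, has_crc) if not d and v < threshold)
    fp = sum(1 for v, d in zip(values, has_crc) if not d and v >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def closest_to_corner_threshold(values, has_crc):
    """Return the observed value minimising (1 - sens)^2 + (1 - spec)^2."""
    best_threshold, best_distance = None, None
    for t in sorted(set(values)):
        sens, spec = sens_spec(values, has_crc, t)
        distance = (1 - sens) ** 2 + (1 - spec) ** 2
        if best_distance is None or distance < best_distance:
            best_threshold, best_distance = t, distance
    return best_threshold


# Invented toy data (µg Hb/g), purely for illustration.
toy_values = [2, 2, 5, 8, 12, 30, 60, 150, 400, 400]
toy_has_crc = [False, False, False, False, False, False, True, False, True, True]

print(closest_to_corner_threshold(toy_values, toy_has_crc))  # 60
```

With very few cancers in a stratum, the selected cut-off can shift substantially when a single case changes, which is consistent with the wide confidence intervals reported for the youngest age group.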

Multivariable fractional polynomial regression estimated CRC risk by FIT result as a continuous variable to identify the threshold above which estimated CRC risk is 3% or higher, as described previously [11]. Here, the analysis was also stratified by age group (18–29 years, 30–39 years and 40–49 years). Due to the large number of patients with a FIT result >400 µg Hb/g, a post-hoc sensitivity analysis excluded patients with a FIT result of >400 µg Hb/g.

Data governance

This project evaluated service delivery rather than changing routine clinical practice; therefore, ethical committee approval was not required. Data sharing agreements and Caldicott guardian approvals were in place between all involved parties, allowing data sharing. The use of individual NHS numbers in this study aligns with the criteria outlined in section 6 of the General Data Protection Regulation: Guidance on Lawful Processing. The processing of data is based upon GDPR Article 6(1)(e), ‘exercise of official authority’, and Article 9(2)(h), ‘management of health and care services’. The legal foundation for this processing is the NHS Act 2006, Section 13E, which mandates NHS England to ensure continuous service quality improvement. This same legal basis applied to the secondary care providers contributing data.

Patient and public involvement and engagement

The experience and reflections of our Patient and Public Involvement and Engagement representative and co-author, an EOCRC survivor, have shaped the interpretation of results and directed areas for future research to improve the use of FIT in the younger population.

Results

A total of 41,068 FIT samples were analysed by Severn Pathology between 1st January 2021 and 10th July 2023 in patients aged 18–49 (Fig. 1). Of these, 2877 were repeat FITs for the same patient, 49 results were missing, and 25 results were recorded after a CRC diagnosis, leaving a total of 38,117 eligible patients with a FIT result (the FIT cohort). Between 1st January 2021 and 10th October 2024, 528 patients under 50 years were diagnosed with CRC in the Upper South West (the CRC cohort). Approximately one-fifth of these patients (105 patients, 19.8%) had a record of a primary care-ordered FIT in the year before their diagnosis (CRC patients who underwent a pre-diagnostic FIT).

Fig. 1: Cohort derivation.

Flow diagram showing patients tested, patients diagnosed with CRC, and patients eligible for inclusion.

The cohort characteristics of the CRC cohort, FIT cohort, and cohort of CRC patients with a pre-diagnostic FIT are outlined in Table 1. There was little difference in age, ethnicity, and socioeconomic status across the cohorts, although there was a higher proportion of women who underwent FIT compared to the proportion of women diagnosed with CRC (62% vs 46%, p < 0.001).

Table 1 Cohort characteristics of patients diagnosed with CRC (CRC cohort), patients with a FIT result (FIT cohort), and CRC patients who had a FIT result in the year before their diagnosis.

CRC characteristics

Two-thirds of the CRC cohort with a recorded stage at diagnosis were diagnosed at an advanced stage (266 cases, 68.7%), though stage was not recorded in 151 cases (28.1%). Among CRC patients with a determinable tumour location based on their ICD-10 code, most were diagnosed with proximal tumours (77.1%), while 22.9% were distal tumours. However, 114 CRC patients had an ambiguous ICD-10 code (C18—malignant neoplasm of colon and C189—malignant neoplasm of colon unspecified), preventing determination of the tumour location. The distribution of missing tumour location was similar between the CRC cohorts with and without a pre-diagnostic FIT. There was no difference in advanced-stage diagnoses between patients diagnosed with distal and proximal tumours.

FIT results

Of the 38,117 patients with a FIT result, 4474 (11.7%) had a positive result (Supplementary Material 2). A positive FIT was more common in men (12.7%) than women (11.1%) (p < 0.001), with a negligible difference in the median age of patients with a positive result (40 years vs 41 years, p < 0.001). The distribution of positive FIT results by CRC status is displayed in Fig. 2. Approximately half of the CRC patients with a positive FIT had the maximum FIT result of ≥400 µg Hb/g (46, 47%), compared to one-quarter of the patients without CRC (1033, 24%).

Fig. 2: Histogram of positive FIT results, by CRC status.

Red bars represent CRC patients, blue bars represent patients without CRC.

Symptoms recorded before FIT

Symptoms preceding FIT included ‘low risk symptoms’ (15,703, 56.4%), ‘change in bowel habit’ (5326, 19.1%), ‘non-site-specific symptoms’ (2485, 8.9%), ‘IDA’ (2330, 8.4%), and ‘two week wait pathway’ (1987, 7.1%). No symptom was recorded for 10,286 (27.0%) patients.

Patients who were subsequently diagnosed with CRC were more likely than those who were not diagnosed to have a specific symptom recorded, including change in bowel habit (27.5% vs 19.1%, p = 0.03) and IDA (13.8% vs 8.7%, p = 0.04). Additionally, a lower proportion of EOCRC patients reported low-risk symptoms (38.8% vs 56.5%, p = 0.001). Patients aged <40 years were more likely to have a non-specific ‘low risk symptoms’ record compared to those aged ≥40 years (71.4% vs 45.0%, p < 0.0001), whereas patients aged ≥40 years were more likely to have a specific symptom recorded than those aged <40 years, including change in bowel habit (28.6% vs 6.8%, p < 0.0001) and IDA (10% vs 6%, p < 0.0001).

The diagnostic performance of FIT

At the ≥10 µg Hb/g of faeces threshold, the PPV of a positive test for adults under 50 years was 2.2% (95% CI 1.8–2.6%), the NPV was 100% (100–100%), the sensitivity was 92.4% (85.5–96.7%), and the specificity was 88.5% (88.2–88.8%). Stratification by age group (18–29 years, 30–39 years, and 40–49 years) indicated varying diagnostic performance by age, with improved performance with increasing age (Table 2). FIT performance was excellent for patients aged 40 to 49 years, with a PPV of 3.2% (95% CI 2.5–4.0%), NPV 100% (99.9–100%), sensitivity 92.9% (85.1–97.3%), and specificity 89.0% (88.6–89.4%). Although sensitivity, specificity and NPVs were similarly high in younger patients, the PPVs were lower: 0.4% (0.1–1.3%) for 18–29 years and 1.2% (0.7–1.9%) for 30–39 years. There was little difference in the sensitivity, specificity, NPV and AUC of FIT between the sexes or by tumour location; however, a positive FIT was slightly more predictive of CRC for males compared to females, and for distal tumours compared to proximal tumours.
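As a worked check of how these four measures relate, the sketch below recomputes them from a 2×2 table. The totals (38,117 patients, 4474 positive results, 105 CRC patients with a pre-diagnostic FIT) are taken from the text; the true-positive count of 97 is not stated in the source and is back-calculated here as the only whole number consistent with the reported sensitivity of 92.4%, so it should be read as an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # CRC patients with a positive FIT
        "specificity": tn / (tn + fp),  # non-CRC patients with a negative FIT
        "ppv": tp / (tp + fp),          # positive FITs that were CRC
        "npv": tn / (tn + fn),          # negative FITs that were not CRC
    }


total, positives, crc_cases = 38_117, 4_474, 105  # reported in the text
tp = 97                          # assumption: back-calculated from 92.4% sensitivity
fp = positives - tp              # 4377 positive FITs without CRC
fn = crc_cases - tp              # 8 CRC patients with a negative FIT
tn = total - positives - fn      # 33,635 negative FITs without CRC

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Under these assumed counts, the rounded values agree with the reported point estimates (92.4%, 88.5%, 2.2% and 100%); the NPV rounds to 100% because only a handful of negative tests are false negatives.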

Table 2 Diagnostic performance of FIT at the ≥10 µg Hb/g of faeces threshold for patients by age group (18–29 years, 30–39 years, and 40–49 years), sex, and tumour location (proximal and distal).

The AUC for FIT was excellent for all three age groups: 0.95 (95% CI 0.92–0.99) for 18–29 years, 0.93 (0.86–0.99) for 30–39 years, and 0.95 (0.92–0.97) for 40–49 years (Fig. 3). The estimated FIT threshold to maximise both sensitivity and specificity was 65 µg Hb/g (95% CI 0–341 µg Hb/g), 9 µg Hb/g (0–63 µg Hb/g), and 17 µg Hb/g (6–28 µg Hb/g) for the three age groups, respectively.

Fig. 3: The area under the receiver operating characteristic curve for the diagnostic performance of FIT.

The blue line represents patients aged 18–29 years, the red line represents patients aged 30–39 years, and the green line represents patients aged 40–49 years. AUC, area under the receiver operating characteristic curve.

Table 3 outlines the diagnostic performance of FIT at thresholds of 10, 20, 30, 40, 50, 75, and 100 μg Hb/g for patients aged 18–29 years and 30–39 years, as well as a post-hoc analysis of patients aged 18–39 years due to the low number of cancer diagnoses in the 18–29 year group. The NPVs remain extremely high at all thresholds and age groups (lowest NPV value was 99.9%).

Table 3 The diagnostic performance of FIT at 10, 20, 30, 40, 50, 75, and 100 μg Hb/g for patients aged 18–29 years, 30–39 years, and 18–39 years.

The predicted probability of a CRC diagnosis at a given faecal Hb concentration is shown in Fig. 4. Patients under 40 did not reach a 3% probability at any concentration, whereas a patient aged 40–49 years with a FIT result of 136 μg Hb/g (95% CI: 108–171 μg Hb/g) was estimated to have a 3% probability of CRC. In the sensitivity analysis, excluding patients with a FIT result >400 μg Hb/g, this was slightly lower: a FIT result of 104 μg Hb/g (95% CI: 79–138 μg Hb/g) corresponded to a 3% probability of CRC. However, for patients under 40, the probability of CRC still did not reach 3% at any concentration.

Fig. 4: The predicted probability that a patient will be diagnosed with CRC at a set f-Hb level.

Results from the multivariable fractional polynomial regression model. The solid line represents the predicted probability, whereas the dashed lines represent the 95% confidence intervals.

Discussion

The diagnostic performance of FIT for EOCRC was high for all patients aged 18–49 years in this audit of routinely collected health data. However, due to the relatively low incidence of EOCRC, the PPV only exceeded 3% (the threshold at which NICE recommend urgent referral for suspected cancer) in patients aged 40–49 years at a threshold of 10 μg Hb/g. Although the PPVs were lower than 3% in the 18–39-year age groups at all thresholds up to 100 μg Hb/g, the otherwise promising diagnostic performance suggests FIT may be useful in this age range as well, possibly in combination with other factors.

Two previous studies have evaluated the performance of FIT in symptomatic patients under 50 years in the UK. D’Souza and colleagues (2021) conducted an NHS-commissioned evaluation of the diagnostic accuracy of FIT in 1103 patients referred for urgent colonoscopy between October 2017 and December 2019 [22], and Tibbs and Benton (2024) evaluated the use of FIT in 3119 symptomatic patients in primary care from June 2019 to October 2020 [23]. These studies identified only 16 and 12 EOCRC cases, respectively, resulting in wider confidence intervals and limiting further age-stratified analysis. Despite the small number of EOCRC cases, both studies reported diagnostic sensitivity, specificity, AUC, and NPV values comparable to those in the present study, reinforcing the reliability of these findings. The PPV reported by D’Souza et al. (6.8%, 95% CI 3.7–11.4%) was approximately triple that observed here (2.2%, 2.0–2.3%), likely reflecting a higher-risk population already referred for colonoscopy, rather than patients who received a FIT as part of diagnostic work-up in primary care. The PPV reported by Tibbs and Benton (2.7%, 2.2–3.0%) was only slightly higher than that reported here, potentially due to their study period preceding the ACPGBI/BSG guideline recommendations.

The diagnostic sensitivity, specificity, and NPV of FIT reported here were remarkably similar to those reported in patients aged 50 and above. Bailey et al. (2021) [11] reported a sensitivity of 86.3% (95% CI 71.4–93.0%), specificity of 85.0% (83.8–86.1%), and NPV of 99.8% (99.5–99.9%) at a threshold of 10 μg Hb/g. Likewise, Withrow and colleagues (2022) [12] reported a sensitivity of 92.1% (86.4–95.5%), specificity of 91.5% (91.1–91.9%), and NPV of 99.9% (99.9–100%) at the same threshold. The corresponding estimates in the present study were 92.4% (85.5–96.7%), 88.5% (88.2–88.8%), and 100% (100–100%), respectively. The notable difference was in the PPVs, which ranged from 0.4% (0.1–1.3%) for 18–29 years, 1.2% (0.7–1.9%) for 30–39 years, and 3.2% (2.5–4.0%) for 40–49 years in the present study to 7.0% (5.1–9.3%) [11] and 8.4% (7.1–9.9%) [12] for patients aged 50 and over, reflecting the increasing prevalence of CRC with age.
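The dependence of PPV on prevalence follows from Bayes' theorem and can be illustrated with a short Python sketch. The sensitivity and specificity are the point estimates reported for the whole cohort; the cohort prevalence of 105 in 38,117 is derived from the reported counts; the second prevalence of 1% is a hypothetical value chosen only to show the direction of the effect.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' theorem."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


SENSITIVITY = 0.924  # reported point estimate, all patients under 50
SPECIFICITY = 0.885  # reported point estimate, all patients under 50

cohort_prevalence = 105 / 38_117  # derived from reported counts (about 0.28%)
hypothetical_prevalence = 0.01    # assumption, for illustration only

ppv_cohort = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, cohort_prevalence)
ppv_hypothetical = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, hypothetical_prevalence)

print(f"PPV at cohort prevalence: {100 * ppv_cohort:.1f}%")
print(f"PPV at 1% prevalence:     {100 * ppv_hypothetical:.1f}%")
```

With sensitivity and specificity held fixed, the PPV rises as prevalence rises, which is why a test with near-identical accuracy yields much lower PPVs in the youngest age groups.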

The lower PPVs for the under-40s suggest FIT may be used too broadly in this age group. Indeed, the proportion of patients reporting ‘low-risk symptoms’ was significantly higher for patients under 40 years (71.4% vs 45.0%, p < 0.0001), which contrasts with NICE guidance in place during the study period. Incidentally, patients subsequently diagnosed with EOCRC were less likely to have reported ‘low-risk symptoms’. Under the guidance preceding the DG56 update, FIT was only recommended for patients with changes in bowel habit or IDA in this age group [16]. This deviation from national guidance may reflect increased regional awareness of EOCRC in the South West, where incidence is rising fastest in the UK [6]. When ACPGBI/BSG guidelines were implemented in the region, primary care clinicians in the South West were encouraged to be permissive in their use of FIT in adults under 50 [24]. Data from Severn Pathology laboratory indicate that ~20% of all FITs are performed in patients under 50 years [Correspondence 1]. Despite this, the majority of EOCRC patients (~80%) did not undergo FIT in primary care in the year before their diagnosis. Combined with the low PPVs, this suggests that FIT may not be optimally utilised in younger patients, highlighting the need to better define the risk profile of younger patients in whom FIT could enhance early detection. However, the low proportion of EOCRC patients who underwent a FIT suggests these findings may not fully represent the broader EOCRC population.

This study was a retrospective analysis of clinical data from primary care settings where FIT is currently in use, and therefore minimised spectrum bias and provided findings that reflect FIT’s true clinical performance. All secondary care providers in the Upper South West region were recruited to ensure complete case follow-up, and dedicated cancer managers ensured accurate data recording. The 12-month follow-up period allowed detection of EOCRCs in patients who may not have been offered further investigation following a negative FIT result. However, there is a potential bias in undercounting CRC diagnoses among patients who were 49 years old at the time of their FIT but turned 50 before diagnosis. This could lead to either an underestimation of false negatives or an overestimation of false positives. Nonetheless, the number of such cases is likely very small and would not impact the overall conclusion that FIT performs excellently for patients aged 40 and over. Many of the FIT samples included in this study were likely to have been outside the contemporary guideline recommendations, although the evidence we present here, that FIT has reasonable performance in younger age groups, probably means that the off-recommendation use of FIT did not matter.

The cohorts included in this audit are largely representative of patients aged under 50 years in England; however, there is a lower representation of non-White patients and those in the most deprived IMD quintile in the South West [25]. Future research may wish to validate these findings in other regions of the UK or in countries with similar healthcare systems. As with previous reports, FIT usage was more common in women [9, 23, 26], whereas EOCRC was equally common between men and women. This disparity likely contributed to the lower PPVs observed for women, which may be explained by the higher prevalence of IDA in women [27] as well as their greater likelihood of attending primary care consultations [28].

This study supports the use of FIT to guide referral decisions in primary care for symptomatic patients aged 40 years and over at the current threshold of 10 µg Hb/g faeces. The lower PPVs observed in patients aged under 40 years suggest that applying a threshold of 20 or even 30 µg Hb/g for patients <40 years may lessen the diagnostic burden on secondary care, with only a negligible increase in false negative (missed) tests, an approach supported by previous work [23, 29]. Meanwhile, further research is needed to evaluate how symptomatic profiles impact FIT’s predictive performance in patients under 40. Given the wide variability in the predictive value of gastrointestinal symptoms for CRC [30, 31] and the lower absolute risks for each symptom [32], incorporating additional clinical factors, such as blood work, could enhance FIT’s utility in this population. Understanding FIT’s predictive accuracy in younger adults will be valuable in defining optimal FIT thresholds should the screening age be reduced below 50 years. Additionally, a composite endpoint combining high-risk polyps and CRC may improve PPVs, albeit at the expense of NPV, but could offer a clinically meaningful strategy for earlier cancer detection.

Conclusion

FIT performs excellently for symptomatic patients aged 40–49 years in primary care at the ≥10 µg Hb/g of faeces threshold. The lower PPVs in the 18–39-year group suggest that FIT may be being used too widely in low-risk patients in this age group, and that a different strategy could be needed to guide definitive investigation. Further research is needed on how specific symptoms and the incorporation of clinical features (such as blood work) may increase FIT’s predictive ability in younger adults. Meanwhile, a higher threshold for f-Hb may be more appropriate for patients aged under 40 years with low-risk symptoms in primary care.