Normal breast tissue (NBT)-classifiers: advancing compartment classification in normal breast histology

Chen, Siyuan; Parreno-Centeno, Mario; Booker, Graham; Verghese, Gregory; Mohamed, Fathima Sumayya; Arslan, Salim; Pandya, Pahini; Oozeer, Aasiyah; D’Angelo, Marcello; Barrow, Rachel; Nelan, Rachel; Sobral-Leite, Marcelo; de Martino, Fabio; Brisken, Cathrin; Smalley, Matthew J.; Lips, Esther H.; Gillett, Cheryl; Jones, Louise J.; Banerji, Christopher R. S.; Pinder, Sarah E.; Grigoriadis, Anita

doi:10.1038/s41523-026-00896-2

Article
Open access
Published: 09 February 2026

npj Breast Cancer volume 12, Article number: 41 (2026)

Abstract

Cancer research emphasises early detection, yet quantitative methods for normal tissue analysis remain limited. Digitised haematoxylin and eosin (H&E)-stained slides enable computational histopathology, but artificial intelligence (AI)-based analysis of normal breast tissue (NBT) in whole slide images (WSIs) remains scarce. We curated 70 WSIs of NBTs from multiple sources and cohorts with pathologist-guided manual annotations of epithelium, stroma, and adipocytes (https://github.com/cancerbioinformatics/OASIS). We developed robust convolutional neural network (CNN)-based, patch-level classification models, named NBT-Classifiers, to tessellate and classify NBTs at different scales. Across three external cohorts, NBT-Classifiers trained on 128 × 128 µm and 256 × 256 µm patches achieved AUCs of 0.98–1.00. The model learned independent normal features different from those of precancerous and cancerous epithelium, which were further visualised using two explainable AI techniques. When integrated into an end-to-end preprocessing pipeline, NBT-Classifiers facilitate efficient downstream analysis within peri-lobular regions. NBT-Classifiers provide robust compartment-specific analytical tools and enhance our understanding of NBT appearances, which serve as valuable reference points for identifying premalignant changes and guiding early breast cancer prevention strategies.

Introduction

Normal breast tissue (NBT) research is gaining momentum for the early detection of breast cancer¹. The epithelium, organised into ducts and lobules, also referred to as terminal duct-lobular units (TDLUs), is of primary importance, as these structures are where most physiological processes and pathological alterations occur². Apart from the epithelium, recent evidence suggests that the stroma, particularly the peri-lobular stroma, may also harbour precursor alterations and have a crucial role in breast cancer progression^3,4,5,6. Therefore, in-depth histopathological research into these structures, especially in women with varying breast cancer risk, holds promise for detecting early signs of cancer initiation in breast tissues that appear “normal”. This is particularly significant for women at higher breast cancer risk, such as germline BRCA1/2 mutation (gBRCA1/2m) carriers⁷.

Histopathology remains the gold standard for diagnosing cancer, relying on microscopic examination to detect tissue abnormalities. In routine practice, pathologists first survey the entire tissue section on a glass slide before focusing on pathological regions at higher magnification for more detailed assessment. Whole slide images (WSIs) enable the hierarchical storage of microscopic histological images scanned at multiple magnifications⁸. Their ease of management and ability to facilitate rapid sharing have led to their growing integration into clinical workflows⁹, as well as driving advancements in digital image analysis using platforms such as QuPath¹⁰. This shift has led to the recent growth of computational pathology (CPath), a rapidly evolving field that leverages artificial intelligence (AI) and deep learning, particularly convolutional neural networks (CNNs) and the latest vision transformer (ViT)-based foundation models¹¹, for high-throughput analysis of large-scale WSI datasets^12,13,14.

Automatically classifying tissue and decomposing individual tissue compartments into smaller pieces, namely patches, are crucial WSI pre-processing steps. Numerous CNN-based models have been developed for breast cancer¹⁵, including approaches that classify tissue compartments within normal tissues adjacent to tumours, such as HistoROI¹⁶. In contrast, few studies have specifically focused on NBTs from individuals without malignancy, with prior efforts primarily aimed at detecting and quantifying lobules^17,18,19,20,21,22 or assessing overall tissue composition^23,24. None of these approaches, however, has been specifically designed to provide patch-level tissue classifications on digitised WSIs of NBTs. Moreover, these studies typically relied on a single source of NBTs, which may limit the generalisability of the resulting models²⁵. NBTs are also heavily underrepresented in large annotated WSI databases. A recent literature review²⁶ summarised publicly available breast haematoxylin and eosin (H&E) WSI datasets from 2015 to 2023, identifying 17 datasets comprising a total of 10,385 breast H&E WSIs. Of these, only two contain NBTs (350 female normal breast WSIs), and just one (44 WSIs) includes manual annotations. Thus, there is a clear need for a universal patch-level classification model for NBTs, developed on a curated dataset that captures real-world variability and is supported by ground-truth manual annotations.

We present CNN-based NBT-Classifiers to classify patches of epithelium, stroma and adipocytes at two different scales. We provide pathologist-guided manually annotated NBTs on digitised WSIs, sourced from five cohorts, including NBTs from healthy women without high-risk germline mutations who underwent reduction surgery or core biopsy, gBRCA1/2m carriers who underwent a risk-reducing mastectomy, as well as contralateral and ipsilateral NBTs from breast cancer patients. NBT-Classifiers achieved robust performance across three external cohorts by leveraging normal-specific features, which were further validated by explainable AI-based approaches^27,28. Additionally, an end-to-end framework was integrated, outputting pseudo patch-level annotations of target regions, including lobules and their microenvironment. Collectively, NBT-Classifiers provide a primary tool for studying different tissue compartments at sub-tissue level on large-scale digitised images in the context of NBT.

Results

Optimisation of NBT-Classifiers

In practice, pixel-level annotated digitised histological image datasets of NBTs are scarce²⁶. Furthermore, the biological heterogeneity of NBTs from diverse sources is often overlooked when assembling such datasets. To address this, we collected in total 70 digitised formalin-fixed, paraffin-embedded (FFPE), H&E-stained images (n = 70 patients) of NBTs from various sources, including core biopsies of healthy donors (non-gBRCA1/2m carriers, n = 12 WSIs), women undergoing reduction mammoplasties (non-gBRCA1/2m carriers, n = 30 WSIs), gBRCA1/2m carriers undergoing risk-reducing mastectomy (n = 21 WSIs), and contralateral or ipsilateral normal tissues from breast cancer patients (n = 7 WSIs) (Fig. 1a, and Supplementary Table 1, “Methods”). These NBT WSIs were sourced from five distinct patient cohorts: the King’s Health Partners Cancer Biobank (KHP) in London (UK), the Netherlands Cancer Institute (NKI) in Amsterdam (Netherlands), the Barts Cancer Institute (BCI) in London (UK), the École Polytechnique Fédérale de Lausanne (EPFL) in Lausanne (Switzerland), and the publicly available Susan G. Komen Tissue Bank (SGK)²⁹ (Fig. 1b). Patient age ranged from 16 to 74 years. Collectively, these digitised images capture both technical variability (differences in H&E-staining and scanning) and biological diversity (variations in tissue sources and a broad patient age range) within NBTs. Due to their greater data diversity, WSIs from the NKI and BCI cohorts were used for model training and optimisation, while WSIs from the KHP, EPFL, and SGK cohorts were held out for external validation. Expert manual annotations of epithelium, stroma, and adipocytes were performed under the supervision of a consultant pathologist (SEP) using QuPath v0.3.0 (Fig. 1c, d, and Supplementary Fig. 1, “Methods”). The pathology-guided annotated WSIs of NBTs are available in the nOrmal breASt tISsue Dataset (OASIS) repository, accessible at https://github.com/cancerbioinformatics/OASIS.

NBTs inherently exhibit a hierarchical organisation. Within the epithelium, structures ranging from individual epithelial cells, through to larger islands of acini, and then lobules, can be captured at fields of view (FOVs) of 64 × 64 µm, 128 × 128 µm, and 256 × 256 µm (Supplementary Fig. 2). Therefore, we aimed to train models enabling tissue classification at different scales to offer greater analytical flexibility for downstream analysis. To train these models, we generated patch-level datasets using QuPath v0.3.0 for the NKI and BCI cohorts. These datasets consisted of non-overlapping patches with sizes of 224 × 224, 512 × 512, and 1024 × 1024 pixels, all extracted at a resolution of 0.25 µm/pixel (Supplementary Table 2, “Methods”).
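
The link between patch size in pixels, scanning resolution, and physical field of view, together with the non-overlapping tessellation itself, can be sketched in a few lines. This is a minimal illustration, not the QuPath workflow used in the study; the function names `fov_um` and `tile_origins` are hypothetical, and discarding partial tiles at the region edges is an assumption.

```python
def fov_um(patch_px: int, mpp: float = 0.25) -> float:
    """Physical edge length (in µm) of a square patch at `mpp` µm/pixel."""
    return patch_px * mpp


def tile_origins(width: int, height: int, patch_px: int):
    """Yield top-left (x, y) pixel coordinates of non-overlapping patches.

    Partial tiles at the right and bottom edges are dropped (an assumption).
    """
    for y in range(0, height - patch_px + 1, patch_px):
        for x in range(0, width - patch_px + 1, patch_px):
            yield x, y


if __name__ == "__main__":
    for px in (224, 512, 1024):
        print(px, "px ->", fov_um(px), "µm")
    # Number of 512-pixel patches in a 5120 x 5120-pixel region
    print(len(list(tile_origins(5120, 5120, 512))))
```

At 0.25 µm/pixel, 512- and 1024-pixel patches span exactly 128 µm and 256 µm, while a 224-pixel patch spans 56 µm, so the 64 × 64 µm field of view quoted for the smallest scale appears to be approximate.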

In the training pipeline (Fig. 1c), patches were subjected to stain normalisation to minimise cohort variability (Supplementary Fig. 3, “Methods”). The model employs a pre-trained CNN backbone for feature extraction, with a trainable classification head that predicts the probabilities of epithelium, stroma, and adipocytes (Supplementary Fig. 4, “Methods”). Given the critical role of stain normalisation³⁰ and the selection of CNN backbone³¹, we initially performed threefold cross-validation experiments to optimise these parameters (“Methods”). The stain normalisation methods evaluated included Reinhard³², Vahadane³³, Macenko³⁴, and StainGAN³⁵ (Supplementary Fig. 5, “Methods”). For the CNN backbones, we tested MobileNet³⁶, ResNet50³⁷, DenseNet³⁸, and InceptionV3³⁹, using weights pre-trained on ImageNet⁴⁰. These experiments were conducted using patch-level datasets containing non-overlapping 512 × 512-pixel patches from the NKI and BCI cohorts, with patches sampled in a class-balanced manner to ensure equal representation of tissue types (Supplementary Figs. 6 and 7, “Methods”). The combination of Reinhard stain normalisation and the MobileNet backbone yielded the best overall performance (Fig. 1e, and Supplementary Table 3).
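
Class-balanced sampling of patches can be illustrated as below. This is a sketch under stated assumptions: the text does not specify how balance was achieved, so undersampling every class to the size of the rarest one is assumed here, and `balance_by_class` is a hypothetical helper rather than a function from the NBT-Classifier code.

```python
import random
from collections import defaultdict


def balance_by_class(patches, seed=0):
    """Downsample every class to the size of the rarest class.

    `patches` is a list of (patch_id, label) pairs. Undersampling is one
    common balancing strategy and is an assumption in this sketch.
    """
    by_label = defaultdict(list)
    for patch_id, label in patches:
        by_label[label].append(patch_id)

    n_per_class = min(len(ids) for ids in by_label.values())
    rng = random.Random(seed)  # fixed seed gives a reproducible draw

    balanced = []
    for label in sorted(by_label):
        for patch_id in rng.sample(by_label[label], n_per_class):
            balanced.append((patch_id, label))
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    toy = (
        [(f"e{i}", "epithelium") for i in range(50)]
        + [(f"s{i}", "stroma") for i in range(300)]
        + [(f"a{i}", "adipocytes") for i in range(120)]
    )
    balanced = balance_by_class(toy)
    print(len(balanced))  # 3 classes x 50 patches each
```

Undersampling keeps the example simple; oversampling the rare class or weighting the loss are alternatives that retain more of the abundant stroma and adipocyte patches.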

Configured with Reinhard stain normalisation and the MobileNet backbone, we trained models using class-balanced, non-overlapping patches of 224 × 224, 512 × 512, and 1024 × 1024 pixels from the NKI and BCI cohorts (Supplementary Figs. 6 and 7, “Methods”). The models trained on larger FOVs generally achieved higher accuracies (Fig. 1f, and Supplementary Table 3), while the 224 × 224-pixel model exhibited the lowest accuracy and was more prone to misclassifying epithelium (Supplementary Fig. 8). Due to the suboptimal performance of the 224 × 224-pixel model, we proceeded to retrain two final versions of the NBT-Classifier: one using 512 × 512-pixel patches and the other using 1024 × 1024-pixel patches.

NBT-Classifiers demonstrate strong generalisability underpinned by interpretable tissue representations

We then evaluated the optimised NBT-Classifiers on the external KHP, EPFL, and SGK cohorts using receiver operating characteristic (ROC) analysis (“Methods”). For the 512px-based NBT-Classifier, the class-specific area under the curve (AUC) values were 0.98 for epithelium, 0.98 for stroma, and 1.00 for adipocytes across all cohorts (Fig. 2a), with some variation in cohort-specific AUCs (Supplementary Fig. 9). The 1024px-based NBT-Classifier achieved similarly robust class-specific AUCs of 0.99 for epithelium, 0.98 for stroma, and 1.00 for adipocytes (Fig. 2b), with slight variations across cohort-specific AUCs (Supplementary Fig. 10). For both 512px- and 1024px-based NBT-Classifiers, we observed no bias in accuracy with respect to patient age or NBT source in the external cohorts (Supplementary Fig. 11, “Methods”).
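
A class-specific (one-vs-rest) AUC of the kind reported above can be computed directly from per-patch class probabilities. The sketch below uses the rank interpretation of the AUC, the probability that a randomly chosen positive patch scores higher than a randomly chosen negative one. The function names and toy values are illustrative assumptions, not the study's evaluation code, and the quadratic pairwise loop is only suitable for small examples.

```python
def auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(probs, labels, classes):
    """Class-specific AUCs from per-patch probability vectors."""
    result = {}
    for k, name in enumerate(classes):
        scores = [row[k] for row in probs]
        result[name] = auc(scores, [lab == name for lab in labels])
    return result


if __name__ == "__main__":
    classes = ("epithelium", "stroma", "adipocytes")
    probs = [  # toy probabilities, one row per patch
        [0.90, 0.08, 0.02],
        [0.60, 0.35, 0.05],
        [0.40, 0.55, 0.05],
        [0.20, 0.70, 0.10],
        [0.05, 0.15, 0.80],
        [0.10, 0.10, 0.80],
    ]
    labels = ["epithelium", "epithelium", "stroma", "stroma",
              "adipocytes", "adipocytes"]
    print(one_vs_rest_auc(probs, labels, classes))
```

For realistic patch counts, a sort-based implementation or an established library routine would be preferable to the pairwise comparison used here.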

**Fig. 2: NBT-Classifiers demonstrate high generalisability based on robust normal breast features.**

To dissect the underlying learned representations, we first visualised the feature representations from the last hidden layer of both models using t-distributed Stochastic Neighbour Embedding (t-SNE)⁴¹ (“Methods”). Patches predicted as epithelium, stroma, and adipocytes formed well-separated visual clusters (Fig. 2c, and Supplementary Fig. 12a) that closely mirrored their respective histology (Fig. 2d, and Supplementary Fig. 12b, “Methods”). We then examined the specific histological patterns driving class-specific predictions using class activation mapping (CAM)²⁷ and gradient-weighted class activation mapping (Grad-CAM)²⁸ (Supplementary Fig. 13, “Methods”). Both methods consistently highlighted biologically meaningful regions (epithelial cellular contents, collagen fibres, and adipocyte membranes) corresponding to the three tissue classes (Fig. 2e, and Supplementary Fig. 14a, b). When applied to a larger, representative region (5120 × 5120 pixels, 0.25 µm/pixel) encompassing all three tissue compartments, these characteristic patterns were coherently captured across non-overlapping patches, preserving spatial continuity (Fig. 2f). Notably, these patterns were reproducible across all cohorts (Supplementary Fig. 15a), including images scanned at 20× magnification from the SGK cohort (Supplementary Fig. 15b). When comparing the predictions with ground-truth annotations, most misclassifications occurred at the boundaries between class-specific visual clusters, where histological features are less distinct (Supplementary Figs. 12c and 16a). Correspondingly, the confidence of the model’s predictions, approximated by the probability associated with the predicted tissue class, was notably lower at these boundaries (Supplementary Fig. 16b). We then applied CAM and Grad-CAM to assess patches with low-confidence predictions. The resulting heatmaps for all three tissue classes showed that the model’s attention was proportionally distributed across mixed tissue regions (Supplementary Fig. 17). This suggests that the misclassifications are not due to a systematic bias in the model, but rather to ambiguity in the test patches, where tissue classes are less clearly defined. Together, these analyses demonstrate the strong generalisability of the NBT-Classifiers, underpinned by interpretable and biologically grounded representations of NBT architecture.
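
For reference, the two attribution maps used above can be written compactly in their standard published forms. The notation is introduced here for illustration and does not come from the article: $A^k$ is the $k$-th feature map of the last convolutional layer, $y^c$ the pre-softmax score for class $c$, $w_k^c$ the weight linking globally pooled feature map $k$ to class $c$, and $Z$ the number of spatial positions in $A^k$.

```latex
% CAM: requires global average pooling followed by a single linear layer
M^{c}(i, j) = \sum_{k} w_{k}^{c} \, A^{k}_{ij}

% Grad-CAM: channel weights come from spatially averaged gradients
\alpha_{k}^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A^{k}_{ij}},
\qquad
L^{c}_{\text{Grad-CAM}} = \operatorname{ReLU}\!\left( \sum_{k} \alpha_{k}^{c} \, A^{k} \right)
```

CAM reads its channel weights directly from the classification layer, whereas Grad-CAM derives them from gradients and therefore applies to architectures without a pooling-plus-linear head. How either method was configured for the NBT-Classifiers is described in the “Methods” and is not reproduced here.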

Training exclusively on normal tissue enables learning of distinctive features in the normal breast

Because there are no CNN models trained exclusively on NBT, we benchmarked the 1024px-based NBT-Classifier against HistoROI¹⁶, a ResNet18-based³⁷ model trained on mixed breast tissue WSIs, ranging from histologically normal, noncancerous, precancerous, to cancerous tissues^42,43, to classify epithelium, stroma, adipocytes, along with lymphocytes, miscellaneous tissues, and artefacts (“Methods”). HistoROI demonstrated comparable AUCs for epithelium (0.99 vs 0.99), while performance was lower for stroma (0.97 vs 0.98) and adipocytes (0.95 vs 1.00) (Fig. 3a, DeLong’s test, P < 0.0001, “Methods”). This performance gap was more pronounced in accuracy metrics: HistoROI achieved 80%, 91%, and 72% for epithelium, stroma, and adipocytes, respectively, lower than the 87%, 94%, and 96% attained by the NBT-Classifier (Supplementary Fig. 18).

**Fig. 3: NBT-Classifiers capture features unique to normal breast epithelium.**

The fundamental difference between the 1024px-based NBT-Classifier and HistoROI lies in their training datasets: HistoROI was developed using mixed breast tissues, while NBT-Classifiers were trained exclusively on NBT. This raised the question: could NBT-Classifiers capture unique architectural features of NBT, distinguishing it from abnormal tissues? To test this hypothesis, we curated a patch dataset using the same dataset employed to develop the HistoROI model^16,42,43 (“Methods”). In total, 56,000 1024 × 1024-pixel patches were extracted at 40× magnification for evaluation of both the NBT-Classifier and HistoROI. The dataset comprised 8000 patches per class across the following categories: normal (peri-tumoral, N), benign (pathological benign, PB; usual ductal hyperplasia, UDH), precancerous (flat epithelial atypia, FEA; atypical ductal hyperplasia, ADH), and cancerous (ductal carcinoma in situ, DCIS; invasive carcinoma, IC). Of these, 96.14% were predicted as epithelium by the NBT-Classifier, indicating good generalisability on external datasets. Notably, the epithelium probability distributions were shifted for NBT compared with pathological categories. In contrast, the mixed-tissue-based HistoROI exhibited substantial overlap between the two classes (Fig. 3b). ROC analysis further demonstrates that the NBT-Classifier outperforms HistoROI in discriminating normal breast epithelium from abnormal tissue (Fig. 3c, DeLong’s test, P < 0.0001). Importantly, differences between normal and precancerous or cancerous epithelium were consistently better captured by the NBT-Classifier (Fig. 3d). When applying Grad-CAM to patches of normal epithelium, we observed differences in the resolution and granularity of attention due to the distinct CNN architectures used (MobileNet⁴⁴ for NBT-Classifiers and ResNet18³⁷ for HistoROI). 
Beyond these architectural differences, HistoROI showed limited recognition of entire normal lobules, with attention misaligned and incomplete, whereas the normal-specific NBT-Classifier provided more comprehensive and well-aligned coverage of normal epithelium (Fig. 3e). Together, these findings suggest that the NBT-Classifier captures normal-specific features, in contrast to the more general epithelial features learned by models trained on mixed-breast-tissue datasets.

Slide-level tissue compartment classification and visualisation

NBT-Classifiers are patch-level tissue classification models that, once trained on annotated patches, can be applied to entire WSIs. To demonstrate this, we performed whole-slide tissue classification on a representative WSI using both the 512px- and 1024px-based NBT-Classifiers. The results were then converted and imported into QuPath v0.3.0 for interactive histological inspection (Fig. 4a, b, “Methods”). Heatmaps generated by both models showed strong alignment between predicted tissue classes and the underlying histology at different scales. For a larger tissue region (5120 × 5120 pixels, 0.25 µm/pixel), we created a smoother heatmap by using predicted tissue class probabilities (“Methods”). This high-resolution heatmap effectively delineated individual lobules and segmented tissue compartments with smooth transitions (Fig. 4c). Additionally, we analysed six other regions from tissue across different patient age groups, all of which demonstrated similarly strong performance (Supplementary Fig. 19). These results highlight the potential of this approach as a foundation for more advanced downstream image analyses, such as lobule detection.

**Fig. 4: Visualisation of whole slide tissue classification.**

Among all patches used for external validation, 98.2% had probabilities greater than 0.7, and 84% had probabilities greater than 0.99 (Fig. 4d). We then examined the spatial distribution of patches predicted with varying confidence in the same WSI example. Patches with confidence above 0.7 covered most tissue regions, while those with scores above 0.99 covered only partial regions, with gaps primarily at the boundaries between tissue compartments and within large lobules (Fig. 4e). These low-confidence patches in large lobules may explain the reduced accuracy of the 224px-based NBT-Classifier, as the smaller patch size tends to capture more intra-lobular stroma, leading to a higher rate of misclassification of epithelium as stroma. Furthermore, the varying spatial localisation of model confidence could serve as a valuable metric for capturing spatial and histological variations within each tissue compartment, making it particularly useful for patch selection in slide-level analysis, such as multiple-instance learning (MIL)^16,45.
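
Confidence-based patch selection of this kind reduces to thresholding the probability of the predicted class. The sketch below is illustrative only: the helper names and toy probabilities are assumptions, while the 0.7 and 0.99 thresholds mirror those discussed above.

```python
def confidence(prob_vector):
    """Confidence approximated by the probability of the predicted class."""
    return max(prob_vector)


def fraction_above(prob_vectors, threshold):
    """Share of patches whose confidence exceeds `threshold`."""
    if not prob_vectors:
        raise ValueError("no patches given")
    kept = sum(confidence(p) > threshold for p in prob_vectors)
    return kept / len(prob_vectors)


def select_patches(patch_ids, prob_vectors, threshold=0.7):
    """Keep only the patches predicted with confidence above `threshold`."""
    return [pid for pid, p in zip(patch_ids, prob_vectors)
            if confidence(p) > threshold]


if __name__ == "__main__":
    ids = ["p0", "p1", "p2", "p3"]
    probs = [  # toy (epithelium, stroma, adipocytes) probabilities
        [0.995, 0.004, 0.001],
        [0.80, 0.15, 0.05],
        [0.50, 0.45, 0.05],  # ambiguous patch, e.g. at a compartment boundary
        [0.02, 0.03, 0.95],
    ]
    print(fraction_above(probs, 0.7), fraction_above(probs, 0.99))
    print(select_patches(ids, probs, threshold=0.7))
```

A stricter threshold yields purer but sparser patch sets, so the threshold itself is a natural parameter to tune when preparing inputs for slide-level models.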

End-to-end pipeline for analysing large-scale WSIs of NBT

Building on the demonstrated utility of the proposed NBT-Classifiers, we integrated the models into an end-to-end WSI pre-processing pipeline (Fig. 5a). This generates tissue classification results that facilitate large-scale digital image analysis within target sub-tissue regions in the normal breast, such as lobules and their microenvironment, and supports the patch selection process for more advanced deep learning frameworks, such as MIL^16,45. The pipeline begins with HistoQC⁴⁶ to detect foreground tissue regions (Supplementary Fig. 20, “Methods”). Once identified, the region is tessellated into non-overlapping patches, which are then analysed by the corresponding NBT-Classifier to predict tissue classes. This produces a whole slide tissue class heatmap, from which lobules and their peri-lobular regions within a specified range can be detected and localised (Fig. 5b, “Methods”). For frameworks such as MIL^16,45, patches from target tissue regions can be selectively balanced, with the balance treated as a hyperparameter to optimise performance during training (Fig. 5a). In addition, the pipeline outputs a binary mask that localises individual lobules on each WSI and can be directly imported into QuPath v0.3.0 as pseudo lobule annotations (Fig. 5c). Within each lobule, QuPath v0.3.0’s built-in nuclei detection and advanced spatial analysis tools, such as Delaunay triangulation, can be utilised for object-level lobule quantification. The original patch-level classifications of the target regions (lobules and peri-lobular areas) can also be exported and loaded into QuPath v0.3.0, where digital image analyses, such as texture analysis, can be performed across the slide at the patch level (Fig. 5d). These approaches build upon previous efforts towards standardised quantification of TDLUs^17,18,19,20,21 and aim to extend the interpretation of lobules and their immediate microenvironment, which might unlock novel biomarkers indicative of breast cancer precursors. The pipeline performed robustly across WSIs with varying staining at both 40× and 20× magnification (Supplementary Fig. 21), offering flexibility for studies involving heterogeneous datasets. The source code for running this pipeline is available at https://github.com/cancerbioinformatics/NBT-Classifier and https://hub.docker.com/repository/docker/siyuan726/nbtclassifier.
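
The idea of locating peri-lobular regions "within a specified range" on a patch-level class map can be illustrated with a simple neighbourhood search. This is a deliberately simplified sketch, not the pipeline's actual lobule detection (described in the “Methods”): it treats every epithelium patch as lobular, measures range in whole patches, and uses the hypothetical helper `peri_lobular_mask`.

```python
def peri_lobular_mask(grid, radius=1, target="epithelium"):
    """Mark non-target patches within `radius` patches of any target patch.

    `grid` is a list of rows of class labels, one label per patch. Distance
    is counted in whole patches (Chebyshev distance), an assumption here.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    mask = [[False] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != target:
                continue
            # Visit the square neighbourhood around this target patch.
            for rr in range(max(0, r - radius), min(n_rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1)):
                    if grid[rr][cc] != target:
                        mask[rr][cc] = True
    return mask


if __name__ == "__main__":
    E, S, A = "epithelium", "stroma", "adipocytes"
    grid = [
        [S, S, S, A],
        [S, E, S, A],
        [S, S, S, A],
        [A, A, A, A],
    ]
    for row in peri_lobular_mask(grid, radius=1):
        print("".join("#" if flag else "." for flag in row))
```

With 512-pixel patches at 0.25 µm/pixel, a radius of one patch corresponds to a peri-lobular band of 128 µm; a production implementation would more likely group connected epithelium patches into lobule objects before expanding them.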

**Fig. 5: Approaches facilitating downstream WSI analysis of NBTs.**

Discussion

AI-driven WSI analysis has given rise to groundbreaking advancements in histopathology¹³. In breast research, the frontier remains largely focused on breast cancer, and histologically normal tissues are significantly underrepresented among millions of digitised WSIs, with few applications specifically designed to facilitate in-depth morphological characterisation of NBTs^17,18,19,20,21,22,23,24. To address this gap, we compiled high-quality, manually annotated WSIs of NBTs, sourced from women with varying breast cancer risk across multiple institutions. Through extensive optimisation and external validation, we developed robust NBT-Classifiers capable of processing WSIs at different scales and performing patch-level tissue classification of epithelium, stroma, and adipocytes. To the best of our knowledge, NBT-Classifiers represent the first deep learning-based CPath models specifically trained to classify tissue compartments within NBT (from individuals without any lesions). Building on this, we integrated an end-to-end WSI pre-processing pipeline to ensure objective and reproducible tessellation and patch selection from biologically important regions, such as lobules and their microenvironment. The outputs are compatible with QuPath v0.3.0, one of the most widely used digital image analysis platforms, and can be seamlessly integrated with deep learning-based CPath frameworks such as MIL^16,45. Our models (https://github.com/cancerbioinformatics/NBT-Classifier and https://hub.docker.com/repository/docker/siyuan726/nbtclassifier) and datasets (https://github.com/cancerbioinformatics/OASIS) are publicly available. The datasets will be a highly valuable resource for training pathology foundation models, as they provide rich histology data of the normal breast.

To establish fundamental tissue classification tools for NBTs, particularly as a primary step for downstream analyses, ensuring generalisability, robustness, and adaptability is essential. Achieving this requires a diverse and representative WSI dataset, ground-truth manual annotations, and seamless compatibility with downstream analytical frameworks. Previous NBT studies have typically relied on a single data source, such as benign breast biopsies^17,18,19,23, healthy donors^20,22, or reduction mammoplasties²¹. Only one exception²⁴ incorporated NBTs from both risk-reducing mastectomy and reduction mammoplasty specimens (gBRCA1/2m status unknown). However, these datasets do not capture the diverse origins of NBTs, which may exhibit varying predispositions to cancer initiation and distinct histological manifestations. In contrast, a key strength of our work is the incorporation of a broad spectrum of NBT variations, including differences in patient age, breast cancer risk, H&E staining protocols, and scanning platforms.

In the era of AI, manual annotations should not be overlooked when developing supervised applications, especially in highly specialised histopathological domains⁴⁷. Our fully pathologist-supervised ground truth likely explains the slightly better performance in stroma and adipocyte classification compared to HistoROI¹⁶, which employed human-in-the-loop annotations. Among previous NBT studies, three provide manual annotations, yet their sample sizes and annotation levels vary (Supplementary Table 4). We expanded on this by providing detailed annotations of lobules, stroma and adipocytes that display biological variation. Moreover, we demonstrated that NBT-Classifiers, trained exclusively on histologically normal breast tissue, learn distinctive, normal-specific features. Their outputs reflect the degree of similarity to truly normal epithelium, whereas models trained on mixed breast tissues capture more general epithelial features. The comparable performance observed in the mixed-breast-tissue model might largely arise from the relative similarity of epithelium to other tissue compartments, making the prediction effectively an approximation rather than a normal-specific assessment. This distinction is critical, as a classifier that is grounded in the unique characteristics of NBT offers a more robust foundation for future applications in early detection, positioning it as a screening tool for identifying pathological deviations in gigapixel WSIs of histologically normal breast tissue.

Performing patch-level analysis within key regions of NBTs, such as lobules and their microenvironment, is essential for detecting early cancer precursors. Prior NBT studies have focused on well-established cancer risk biomarkers, such as TDLU involution^17,18,19,20, immune infiltration in lobular regions²¹, alterations in tissue composition^22,23, and fibroglandular density²⁴. In addition to these visible pathological biomarkers, recent studies have highlighted greater predictive potential within NBTs at the single-cell level^48,49,50,51. Given the extensive validation of genotype-phenotype links in various CPath studies focused on cancer^52,53,54, analysing sub-tissue regions at the cellular and sub-cellular level shows significant promise for NBTs as well. In this context, the ability to adapt to broader downstream pathomics feature extraction tools, such as QuPath-based digital image analysis, or deep learning-driven CPath frameworks, opens new opportunities for evaluating high-throughput histopathological measurements. This, in turn, provides insights into localised histological markers that may signal the earliest stages of breast cancer initiation. Building upon this, advanced techniques such as spatial transcriptomics⁵⁵ and spatial proteomics⁵⁶ could uncover the underlying molecular mechanisms driving these histological changes. By linking molecular data with histological features, a deeper understanding of how early molecular alterations contribute to the development of breast cancer can be achieved.

One limitation of this study is the limited diversity of the external validation cohorts, which may not fully capture the spectrum of variations in NBTs, such as contralateral tissues or those from risk-reducing mastectomies. Future work will systematically evaluate the model’s performance, particularly across diverse demographic groups and healthcare settings, with a specific focus on ensuring equitable generalisability in underrepresented populations. Another limitation lies in the annotation scope, which included only a subset of tissue types. Consequently, regions such as artefacts, necrosis and blood vessels were misclassified as epithelium when applying the NBT-Classifiers to whole WSIs. Including these regions as additional classes could improve classification accuracy. While expert annotation remains essential due to the histological complexity of NBTs, strategies to improve efficiency, such as AI-powered annotation platforms or carefully guided crowdsourcing approaches⁴⁷, could help scale the process. Biological assays or molecular markers²¹ could further support the generation of more precise, cell-level ground-truth annotations. ViT-based foundation models were not evaluated, as the near-perfect AUCs achieved suggest that the semantic distinctions between epithelium, stroma, and adipocytes were sufficiently pronounced for capture by conventional CNNs. Moreover, while ViTs excel at capturing global context, their patch-based architecture and multi-head attention often produce coarse or fragmented saliency maps, limiting interpretability at the cellular and sub-tissue level^57,58. In contrast, CNNs, with their locality and translation-invariance biases, generate more granular and biologically intuitive explanations that align with pathologists’ reasoning⁵⁹. Thus, although CNNs may lack some global context, their explainability was critical for our goal of visualising distinct, plausible patterns in NBT. 
Future work could explore whether transformer-based backbones provide additional performance gains or improve generalisability. Lastly, due to out-of-distribution effects between training and testing data, domain shift was observed in features extracted from both NBT-Classifiers, as well as HistoROI (Supplementary Fig. 22), suggesting stain normalisation cannot fully address the variations in data distributions across cohorts. Recent studies indicate that even foundation models^60,61 trained on large and diverse datasets do not always avoid domain shift^62,63. In future work, domain generalisation techniques could be applied to improve the models, enabling them to learn domain-invariant features and reduce cohort-specific biases^30,64.

In summary, we present deep learning-based NBT-Classifiers for patch-level tissue type classification in NBTs. These have the potential to enhance our understanding of how various NBT components contribute to both benign and malignant breast pathology, and they lay the groundwork for more advanced deep learning models and large-scale, spatially defined molecular analyses in the future.

Methods

Patient cohorts

In this study, NBTs were obtained from the following sources: core biopsies of healthy donors (non-gBRCA1/2m carriers, n = 12 WSIs); women undergoing reduction mammoplasties (non-gBRCA1/2m carriers, n = 30 WSIs); gBRCA1/2m carriers undergoing risk-reducing mastectomy (n = 21 WSIs); and contralateral or ipsilateral normal tissues from breast cancer patients (n = 7 WSIs). A total of 70 digitised FFPE, H&E-stained WSIs were collected across five cohorts: the King’s Health Partners Cancer Biobank (KHP) in London (UK) (n = 16 WSIs), the Netherlands Cancer Institute (NKI) in Amsterdam (Netherlands) (n = 16 WSIs), the Barts Cancer Institute (BCI) in London (UK) (n = 16 WSIs), the École Polytechnique Fédérale de Lausanne (EPFL) in Lausanne (Switzerland) (n = 10 WSIs), and the public Susan G. Komen Tissue Bank (SGK) (n = 12 WSIs)²⁹. Patient age ranged from 16 to 74 years. WSIs from the KHP, BCI, and EPFL cohorts were scanned on Hamamatsu NanoZoomer scanners, whilst WSIs from the NKI cohort were scanned on MIRAX scanners, all at 40× magnification (0.25 µm/pixel). WSIs from the SGK cohort were downloaded from the virtual tissue bank (https://virtualtissuebank.iu.edu/query/), having been scanned at 20× magnification²⁹.

To develop NBT-Classifiers, we used the NKI and BCI cohorts. These included samples of NBTs from reduction mammoplasties (n = 4 patients, non-gBRCA1/2m carriers), risk-reducing mastectomies in gBRCA1/2m carriers (n = 21 patients, no cancer at surgery), and contralateral or ipsilateral NBTs from breast cancer patients (n = 7 patients). The remaining WSIs from the KHP, EPFL and SGK cohorts were used for external validation. These included NBTs from reduction mammoplasties (n = 26 patients, non-gBRCA1/2m carriers) and core biopsies from healthy donors (n = 12 patients, non-gBRCA1/2m carriers). Detailed information on individual slides can be found in Supplementary Table 1 and is available at the OASIS repository (https://github.com/cancerbioinformatics/OASIS).

Manual annotation

Expert manual annotations of epithelium, stroma, and adipocytes were performed for each WSI under the supervision of a consultant pathologist (SEP) using QuPath v0.3.0¹⁰. To improve the efficiency of the annotation process, we adopted a hybrid strategy: lobular boundaries were precisely delineated to annotate epithelium, while stroma and adipocytes were annotated using rectangular boxes of comparable size, randomly sampled from various regions throughout the NBT, to ensure a balanced representation of tissue classes during training (Fig. 1c, d, and Supplementary Fig. 1). The pathology-guided annotated WSIs of NBTs are available in the OASIS repository, accessible at https://github.com/cancerbioinformatics/OASIS.

Patch-level datasets for training

To train models classifying NBTs at different scales, we used QuPath v0.3.0 to extract labelled patches from the NKI and BCI cohorts. These datasets consisted of non-overlapping patches with sizes of 224 × 224, 512 × 512, and 1024 × 1024 pixels, all extracted at a resolution of 0.25 µm/pixel (Supplementary Table 2). Only patches with their centres located within the annotated regions were extracted and inherited the labels from the pixel-level manual annotations. The patch extraction was performed using Groovy scripts in QuPath v0.3.0. The script is available at: https://github.com/cancerbioinformatics/NBT-Classifier/blob/main/patching_qupath.groovy.

Stain normalisation

In experiments optimising the stain normalisation method, we used the same reference image when implementing the Reinhard, Macenko, and Vahadane methods. The reference image is available at: https://github.com/cancerbioinformatics/NBT-Classifier/blob/main/data/he.jpg.

Architecture of NBT-Classifiers

The NBT-Classifiers leverage transfer learning, each employing a pre-trained CNN backbone to extract visual features from image patches, followed by a trainable classification head for tissue class prediction (Supplementary Fig. 4). Specifically, the feature maps from the last convolutional layer of each CNN backbone were transformed into a one-dimensional vector using a global average pooling layer. The reduced visual features serve as input to the subsequent classification module, which consists of two densely connected layers. The first dense layer linearly transforms the input features into a 1024-dimensional representation, followed by a second dense layer that reduces it to 512 dimensions, both employing the “ReLU” activation function. To mitigate overfitting, a Dropout layer with a rate of 50% is applied between these layers. The final output layer maps the 512-dimensional representation to a 3-dimensional output, using a “Softmax” activation function to predict the probability distribution across the three tissue classes. The class with the highest probability is assigned as the predicted tissue class (0: epithelium, 1: stroma, 2: adipocytes).
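
The data flow through the classification head can be illustrated with a minimal NumPy sketch of the forward pass. This is not the authors' TensorFlow implementation: the feature-map shape (16 × 16 × 1024), the random weights, and the function names are assumptions made for illustration, and dropout is treated as the identity, as it would be at inference.

```python
import numpy as np

CLASS_NAMES = ("epithelium", "stroma", "adipocytes")  # indices 0, 1, 2 as in the text

def xavier_normal(rng, fan_in, fan_out):
    # Xavier/Glorot-style normal initialisation: std = sqrt(2 / (fan_in + fan_out)).
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def init_head(n_features, seed=0):
    # Hypothetical helper: weights and zero biases for the 1024 -> 512 -> 3 head.
    rng = np.random.default_rng(seed)
    sizes = [n_features, 1024, 512, 3]
    return [(xavier_normal(rng, a, b), np.zeros(b)) for a, b in zip(sizes[:-1], sizes[1:])]

def head_forward(feature_maps, params):
    """feature_maps: (batch, H, W, C) output of the frozen backbone (shape assumed)."""
    x = feature_maps.mean(axis=(1, 2))             # global average pooling -> (batch, C)
    (w1, b1), (w2, b2), (w3, b3) = params
    x = np.maximum(x @ w1 + b1, 0.0)               # dense 1024 + ReLU
    # Dropout (rate 0.5) sits here during training; at inference it is the identity.
    x = np.maximum(x @ w2 + b2, 0.0)               # dense 512 + ReLU
    logits = x @ w3 + b3                           # output layer, 3 classes
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)      # softmax

# Synthetic stand-in for backbone features of four patches.
feats = np.random.default_rng(1).random((4, 16, 16, 1024))
probs = head_forward(feats, init_head(1024))
pred = probs.argmax(axis=1)                        # predicted class index per patch
```

Each row of `probs` sums to one, and `pred` holds the index of the most probable tissue class for each patch.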

Three-fold cross-validation

NBT-Classifiers were trained and optimised through threefold cross-validation. To ensure methodological consistency and fair comparison across experiments, a single fixed partition of the data was applied throughout all analyses. In each fold, patches were split at the WSI level: 20 WSIs for training (10 from each cohort), 4 WSIs for internal validation (2 from each cohort), and 8 WSIs for internal testing (4 from each cohort).
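
A slide-level partition of this shape can be sketched as follows. The actual fixed partition used in the study is not reproduced here; the slide identifiers, the random seed, and the rotation scheme that keeps the three test sets disjoint are all assumptions.

```python
import random

def make_folds(slides_by_cohort, n_folds=3, n_test=4, n_val=2, seed=0):
    """Split slides at the WSI level so that each cohort contributes equally per fold."""
    rng = random.Random(seed)
    folds = [{"train": [], "val": [], "test": []} for _ in range(n_folds)]
    for cohort, slides in sorted(slides_by_cohort.items()):
        order = list(slides)
        rng.shuffle(order)
        for k in range(n_folds):
            # Rotate the shuffled list so each fold tests a different block of slides.
            rotated = order[k * n_test:] + order[:k * n_test]
            folds[k]["test"] += rotated[:n_test]
            folds[k]["val"] += rotated[n_test:n_test + n_val]
            folds[k]["train"] += rotated[n_test + n_val:]
    return folds

# Hypothetical slide identifiers: 16 WSIs in each development cohort.
slides = {c: [f"{c}_{i:02d}" for i in range(16)] for c in ("NKI", "BCI")}
folds = make_folds(slides)
```

With 16 slides per cohort, every fold contains 20 training, 4 validation and 8 test WSIs, and no slide appears in more than one subset of the same fold.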

For experiments optimising the stain normalisation method and CNN backbone, patch-level datasets containing non-overlapping 512 × 512-pixel patches from the NKI and BCI cohorts were used. To ensure class balance, 250 patches per tissue class were randomly sampled from each WSI. For experiments optimising NBT-Classifiers on patches of different FOVs, non-overlapping patches of 224 × 224, 512 × 512, and 1024 × 1024 pixels from the NKI and BCI cohorts were used. To ensure class balance, 150 patches per tissue class were randomly sampled from each WSI (Supplementary Figs. 6 and 7, and Supplementary Table 2).
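
The per-slide, per-class sampling can be sketched as below. The tuple layout, the file names, and the choice to keep all patches when fewer than the requested number are available are assumptions; the text does not state how such cases were handled.

```python
import random
from collections import Counter, defaultdict

def sample_balanced(patches, n_per_class, seed=0):
    """patches: iterable of (wsi_id, tissue_class, patch_path) tuples (layout assumed)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for wsi_id, tissue_class, path in patches:
        groups[(wsi_id, tissue_class)].append(path)
    sampled = []
    for wsi_id, tissue_class in sorted(groups):
        pool = groups[(wsi_id, tissue_class)]
        k = min(n_per_class, len(pool))   # assumption: keep everything if the pool is small
        sampled += [(wsi_id, tissue_class, p) for p in rng.sample(pool, k)]
    return sampled

# Hypothetical patch index: one slide with 300 patches per class, one with 100 stroma patches.
patches = [("s1", c, f"s1_{c}_{i}.png")
           for c in ("epithelium", "stroma", "adipocytes") for i in range(300)]
patches += [("s2", "stroma", f"s2_stroma_{i}.png") for i in range(100)]
balanced = sample_balanced(patches, n_per_class=250)
counts = Counter((wsi, cls) for wsi, cls, _ in balanced)
```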

For training patches, six data augmentation techniques were employed, including horizontal flipping, random rotation within the range of −40° to 40°, random shifts, random shear transformations, random zoom in/out, and random shifts of the channel values within the range of [0, 10], to enhance the model’s robustness and regularise the algorithm. The weights of the CNN backbones were all frozen, and the densely connected layers in the classification head were initialised by the Xavier normal initializer and set to be trainable. We used categorical cross-entropy as the loss function and the Adam optimiser with a learning rate of 1e−5. The batch size was 16 and we trained each model for 50 epochs. The best model was selected according to the highest validation accuracy, reflecting the overall multi-class classification performance, and was then saved for further testing.

External validation

For the KHP and EPFL external validation cohorts, we extracted non-overlapping patches of 512 × 512 and 1024 × 1024 pixels at a resolution of 0.25 µm/pixel. For the SGK cohort, scanned at 20× magnification, we extracted non-overlapping patches of 256 × 256 and 512 × 512 pixels at a resolution of 0.5 µm/pixel (Supplementary Table 2). All patches were normalised using the Reinhard method, and patches from SGK were resized to match the input dimensions of the corresponding NBT-Classifier. For each patch, each NBT-Classifier outputs probabilities of epithelium, stroma and adipocytes, and the tissue class with the highest predicted probability was assigned as the final class prediction. The primary statistical endpoint was the area under the receiver operating characteristic curve (AUC). The secondary statistical endpoint was class-specific accuracy. Accuracy was determined by comparing the ground-truth labels with the class assigned the highest predicted probability. We report both the overall accuracy and the individual accuracies for the KHP, EPFL and SGK cohorts.
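
The two endpoints can be illustrated with a short NumPy sketch. It assumes a one-vs-rest AUC computed from rank sums and interprets class-specific accuracy as the fraction of patches of a given ground-truth class that receive that class as their top prediction; neither detail is specified in the text, and the labels and probabilities below are toy values rather than study data.

```python
import numpy as np

def average_ranks(values):
    # Rank from 1..n, with tied values sharing the mean of their ranks.
    sorted_vals = np.sort(values)
    lo = np.searchsorted(sorted_vals, values, side="left")
    hi = np.searchsorted(sorted_vals, values, side="right")
    return (lo + hi + 1) / 2.0

def one_vs_rest_auc(y_true, probs, cls):
    # AUC as the normalised Mann-Whitney U statistic of the class probability.
    pos = y_true == cls
    n_pos, n_neg = int(pos.sum()), int((~pos).sum())
    ranks = average_ranks(probs[:, cls])
    u = ranks[pos].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))

def class_accuracy(y_true, probs, cls):
    # Share of patches of class `cls` whose highest-probability class is `cls`.
    pred = probs.argmax(axis=1)
    return float((pred[y_true == cls] == cls).mean())

# Toy example: 0 = epithelium, 1 = stroma, 2 = adipocytes.
y_true = np.array([0, 0, 1, 1, 2, 2])
probs = np.array([[0.8, 0.1, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.5, 0.4, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.2, 0.2, 0.6]])
aucs = [one_vs_rest_auc(y_true, probs, c) for c in range(3)]
accs = [class_accuracy(y_true, probs, c) for c in range(3)]
```

In this toy case every class is perfectly ranked (AUC of 1.0 for all three), yet one stroma patch is assigned to epithelium, which shows why the two endpoints can diverge.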

Confounder analysis

We evaluated whether the NBT-Classifiers exhibited any bias towards potential confounders such as patient age and the source of NBTs. For a fair comparison, we sampled 15 patients from a total of 38 patients across three external cohorts, ensuring balanced representation across three age groups, namely “premenopausal years” <45 years: 5 patients; “menopausal years” 45–55 years: 5 patients; “postmenopausal years” >55 years: 5 patients. We sampled 30 patients from a total of 32 patients across NKI and BCI cohorts to include equal representation across three NBT sources: reduction mammoplasty (10 patients), risk-reducing mastectomy (10 patients), and NBTs from breast cancer patients (10 patients). To mitigate class imbalance, 500 patches per tissue class per WSI were randomly sampled when using the 512px-based model and 250 patches per class per WSI were sampled when using the 1024px-based model. These experiments were repeated three times.

Implementation of HistoROI

HistoROI¹⁶ is a CNN-based tissue classification model trained on WSIs of histologically normal, noncancerous, precancerous, and cancerous breast tissue from the public BReAst Carcinoma Subtyping (BRACS) dataset (https://research.ibm.com/haifa/Workshops/BRIGHT/, https://www.bracs.icar.cnr.it/)^42,43. The model classifies epithelium, stroma, and adipocytes, along with lymphocytes, miscellaneous tissues, and artefacts. It processes 256 × 256-pixel patches at a resolution of 1 µm/pixel, capturing the same field of view (FOV) as 1024 × 1024-pixel patches at 0.25 µm/pixel. To ensure a fair comparison, we resized the 1024 × 1024-pixel patches extracted from WSIs in three external cohorts and applied Reinhard stain normalisation before using them with HistoROI¹⁶.

Analysis of epithelium patches from the BReAst Carcinoma Subtyping (BRACS) database

To evaluate the ability of NBT-Classifier and HistoROI to discriminate between histologically normal, precancerous, and cancerous epithelium, we systematically curated a patch dataset from the public BRACS database^42,43 that was used to develop the HistoROI model. Specifically, region of interest (ROI) images (scanned at 40×, available at: https://www.bracs.icar.cnr.it/) encompassing peri-tumoral normal breast tissue (N), ADH, DCIS, and invasive carcinoma (IC) were tessellated into patches compatible with both models. We then randomly sampled 8000 patches per class and applied both classifiers to obtain epithelium probabilities and feature representations. The distributional differences in epithelium probabilities between the normal class (N) and each pathological class (ADH, DCIS, IC) were assessed using the Wilcoxon rank-sum test, followed by Bonferroni correction for multiple comparisons. The code is available at: https://github.com/cancerbioinformatics/NBT-Classifier/blob/main/notebooks/BRACS_analyses.ipynb.
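
The statistical comparison can be sketched as follows, assuming a two-sided rank-sum test with the normal approximation (no tie correction) and a Bonferroni adjustment that multiplies each p-value by the number of comparisons. The epithelium probabilities are synthetic draws used purely for illustration; they are not results from the study, and the published notebook may rely on a statistics library instead.

```python
import math
import numpy as np

def average_ranks(values):
    # Rank from 1..n, with tied values sharing the mean of their ranks.
    sorted_vals = np.sort(values)
    lo = np.searchsorted(sorted_vals, values, side="left")
    hi = np.searchsorted(sorted_vals, values, side="right")
    return (lo + hi + 1) / 2.0

def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    ranks = average_ranks(np.concatenate([x, y]))
    w = ranks[:n1].sum()                              # rank sum of the first group
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))

def bonferroni(p_values):
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Synthetic placeholder probabilities (NOT study data).
rng = np.random.default_rng(0)
normal = rng.beta(8, 2, size=500)
groups = {"ADH": rng.beta(6, 3, size=500),
          "DCIS": rng.beta(4, 4, size=500),
          "IC": rng.beta(2, 6, size=500)}
raw = [rank_sum_p(normal, v) for v in groups.values()]
adjusted = dict(zip(groups, bonferroni(raw)))
```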

Visual assessment of feature representations

We performed t-SNE⁴¹ to visualise, in a 2-dimensional space, the features output by the layer preceding the final output layer of each NBT-Classifier. To facilitate inspection of the histology corresponding to these features, we randomly selected 500 H&E patches from each ground-truth tissue class (epithelium, stroma, adipocytes) and projected them onto the t-SNE plot based on their coordinates. The code for this visualisation is available at: https://github.com/cancerbioinformatics/NBT-Classifier/blob/main/vis_features.ipynb.

CAM and Grad-CAM visualisation

To enhance model interpretability, we applied CAM²⁷ and Grad-CAM²⁸. These techniques generate activation heatmaps that highlight the spatial regions most important to the model’s predictions for any given input image, utilising the final convolutional layer of the MobileNet backbone weighted by importance scores. Based on their distinct mechanisms (Supplementary Fig. 13), CAM typically generates more localised heatmaps that highlight areas relevant to the final classification. In contrast, Grad-CAM may produce more diffuse or generalised heatmaps in some instances, due to its reliance on gradients to capture the model’s learning dynamics. Once the activation heatmaps were computed, we applied bicubic interpolation to match their dimensions to those of the original H&E images. The resulting smoothed heatmaps were overlaid onto the digitised H&E images to visualise the high-attention histological structures corresponding to each particular tissue class prediction. The source code is available at: https://github.com/cancerbioinformatics/NBT-Classifier/blob/main/vis_CAMs.ipynb.
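
The core weighting step of Grad-CAM can be written in a few lines of NumPy. The sketch assumes that the feature maps of the final convolutional layer and the gradients of the target class score with respect to them have already been obtained from the deep learning framework; the function name and the normalisation to [0, 1] are illustrative choices, and the bicubic upsampling step is omitted.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """feature_maps, gradients: (H, W, K) arrays for one image and one target class."""
    weights = gradients.mean(axis=(0, 1))              # one importance score per channel
    cam = (feature_maps * weights).sum(axis=-1)        # importance-weighted sum of channels
    cam = np.maximum(cam, 0.0)                         # ReLU keeps positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam             # scale to [0, 1] for display

# Toy input: channel 0 fires top-left with positive gradient, channel 1 bottom-right with negative.
maps = np.zeros((2, 2, 2))
maps[0, 0, 0] = 1.0
maps[1, 1, 1] = 1.0
grads = np.stack([np.ones((2, 2)), -np.ones((2, 2))], axis=-1)
heatmap = grad_cam(maps, grads)
```

Here only the top-left location remains highlighted, because the channel active at the bottom-right carries a negative importance score and is suppressed by the ReLU.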

Foreground tissue detection

Given that WSIs often contain substantial non-tissue regions, including white background and artefacts such as coverslip edges, excluding these regions is essential for reducing the model’s inference time. We employed HistoQC⁴⁶, a WSI quality control software, to automatically generate binary masks that highlight the artefact-free foreground tissue regions on each WSI (Supplementary Fig. 20), ensuring that only tissue-containing regions are included in downstream tissue classification analyses.

Whole slide tissue classification

When using the 512px-based NBT-Classifier, the detected foreground tissue regions are divided into non-overlapping patches with a fixed size of 128 × 128 µm to account for variations in the micron per pixel (mpp) values across different scanners. These patches are then resized to 512 × 512 pixels to match the input dimensions of the model. When using the 1024px-based NBT-Classifier, the detected foreground region is divided into non-overlapping patches with a fixed size of 256 × 256 µm and resized to 1024 × 1024 pixels. Each NBT-Classifier outputs the probabilities of epithelium, stroma and adipocytes and the final class prediction for each patch, along with the patch’s original coordinates on the slide.
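
The conversion from a fixed physical field of view to a scanner-specific patch size is simple arithmetic, sketched below. The function names are hypothetical, and the foreground-mask filtering and the final resizing to the model input size are omitted.

```python
def patch_size_px(fov_um, mpp):
    # Pixels needed to cover `fov_um` micrometres at `mpp` micrometres per pixel.
    return int(round(fov_um / mpp))

def tile_origins(width_px, height_px, fov_um, mpp):
    """Top-left corners of non-overlapping patches that fit entirely inside the slide."""
    size = patch_size_px(fov_um, mpp)
    return [(x, y)
            for y in range(0, height_px - size + 1, size)
            for x in range(0, width_px - size + 1, size)]

size_40x = patch_size_px(128, 0.25)   # 128 µm at 0.25 µm/pixel
size_20x = patch_size_px(128, 0.5)    # 128 µm at 0.5 µm/pixel
origins = tile_origins(2048, 1024, fov_um=128, mpp=0.25)
```

A 128 µm patch therefore spans 512 pixels on a 40× scan and 256 pixels on a 20× scan, which is why patches from 20× slides need to be resized before inference.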

Whole slide visualisation of tissue classification results

To visualise the whole slide tissue classification in a single image, we arranged the class predictions of patches according to their corresponding coordinates on the slide to create a heatmap. For interactive histological inspection, the tissue class heatmap can be exported as a JSON file and imported into the QuPath v0.3.0 platform. Additionally, we generated smoother tissue class heatmaps by upscaling the predicted probability matrix through interpolation. The visualisation method is available at: https://github.com/cancerbioinformatics/NBT-Classifier/blob/main/NBT_pipeline.ipynb.
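
Arranging patch-level predictions into a slide-level class map can be sketched as follows. The coordinate convention (top-left pixel of each patch), the background value of −1, and the function name are assumptions; the published notebook may organise this differently.

```python
import numpy as np

def build_class_map(coords, preds, patch_px, background=-1):
    """coords: (x, y) top-left pixel of each patch; preds: class index per patch."""
    cols = [x // patch_px for x, _ in coords]
    rows = [y // patch_px for _, y in coords]
    grid = np.full((max(rows) + 1, max(cols) + 1), background, dtype=int)
    grid[rows, cols] = preds          # one grid cell per patch
    return grid

# Three hypothetical 512-pixel patches: epithelium, stroma, adipocytes.
class_map = build_class_map([(0, 0), (512, 0), (1024, 512)], [0, 1, 2], patch_px=512)
```

Grid cells without a patch (for example, background excluded by foreground detection) keep the background value, so the map can be colour-coded directly as a heatmap.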

Lobule detection

To detect individual lobules with their surrounding peri-lobular regions (adjustable), the epithelial layer in the whole slide tissue class heatmap is extracted and enlarged by a factor of 32. Connected component analysis is then applied to identify groups of connected foreground pixels (filled with a value of 1) as distinct objects. Through manual examination, we observed that lobules are typically covered by more than two epithelium patches (~500,000 pixels at 0.25 µm/pixel). Therefore, we removed foreground objects (filled with a value of 1) smaller than 400,000 pixels and filled “holes” (filled with a value of 0) smaller than 400,000 pixels inside each detected foreground object. For slides scanned at 20× magnification, the threshold is adjusted to 250,000 pixels. This step reduces noise and increases the robustness of lobule localisation. The method outputs a binary mask highlighting individual lobules and, optionally, the surrounding stroma within a pre-defined range. The mask can be converted into a JSON file and imported into QuPath v0.3.0 for downstream digital image analyses. The source code is available at: https://github.com/cancerbioinformatics/NBT-Classifier/blob/main/NBT_pipeline.ipynb.
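
The mask clean-up can be illustrated with a small, dependency-light sketch. It assumes 4-connectivity and treats a “hole” as a background region that does not touch the image border; neither choice is stated in the text. The example uses a toy scale and threshold so that the pure-Python flood fill stays fast, and a production pipeline would more likely use library routines for labelling.

```python
from collections import deque
import numpy as np

def components(mask):
    """Yield lists of (row, col) pixels, one list per 4-connected region of True values."""
    seen = np.zeros(mask.shape, dtype=bool)
    h, w = mask.shape
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        seen[r0, c0] = True
        queue, region = deque([(r0, c0)]), []
        while queue:
            r, c = queue.popleft()
            region.append((r, c))
            for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if 0 <= rr < h and 0 <= cc < w and mask[rr, cc] and not seen[rr, cc]:
                    seen[rr, cc] = True
                    queue.append((rr, cc))
        yield region

def clean_lobule_mask(epi_map, scale=32, min_size=400_000):
    # Enlarge the patch-level epithelium map by repeating each cell `scale` times.
    mask = np.repeat(np.repeat(epi_map.astype(bool), scale, axis=0), scale, axis=1)
    for region in list(components(mask)):            # remove small foreground objects
        if len(region) < min_size:
            rows, cols = zip(*region)
            mask[rows, cols] = False
    h, w = mask.shape
    for region in list(components(~mask)):           # fill small enclosed holes
        rows, cols = zip(*region)
        on_border = min(rows) == 0 or min(cols) == 0 or max(rows) == h - 1 or max(cols) == w - 1
        if len(region) < min_size and not on_border:
            mask[rows, cols] = True
    return mask

# Toy map: a ring of epithelium patches with a one-patch hole, plus one isolated patch.
epi = np.array([[0, 0, 0, 0, 0, 0],
                [0, 1, 1, 1, 0, 0],
                [0, 1, 0, 1, 0, 0],
                [0, 1, 1, 1, 0, 1],
                [0, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0]])
lobules = clean_lobule_mask(epi, scale=2, min_size=10)
```

In the toy example the ring is kept, its central hole is filled, and the isolated patch is discarded as noise.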

Deep learning implementation

All deep learning experiments were implemented with TensorFlow 2 in Python on two NVIDIA A100 GPUs from the high-performance computing cluster of King’s Computational Research, Engineering, and Technology Environment (CREATE).

Statistical analysis

To compare distributions between two groups, we performed pairwise Wilcoxon tests, applying Bonferroni correction for multiple comparisons. A p-value of < 0.05 was considered statistically significant. DeLong’s test was employed to compare the AUCs between different classifiers, assessing the significance of performance differences^65,66.

Data availability

WSIs involved in this study are stored at the OASIS repository: https://github.com/cancerbioinformatics/OASIS, which currently can be accessed upon request.

Code availability

The implementation code for this study can be found at: https://github.com/cancerbioinformatics/NBT-Classifier, including Python code for implementing NBT-Classifiers and the pre-processing pipeline, and QuPath v0.3.0 scripts for tessellation and importing pseudo annotations.

References

1. Salim, M. et al. AI-based selection of individuals for supplemental MRI in population-based breast cancer screening: the randomized ScreenTrustMRI trial. Nat. Med. 30, 2623–2630 (2024).
2. Henson, D. E. & Tarone, R. E. Involution and the etiology of breast cancer. Cancer 74, 424–429 (1994).
3. Conklin, M. W. et al. Aligned collagen is a prognostic signature for survival in human breast carcinoma. Am. J. Pathol. 178, 1221–1232 (2011).
4. Li, H. et al. Collagen fiber orientation disorder from H&E images is prognostic for early stage breast cancer: clinical trial validation. npj Breast Cancer 7, 104 (2021).
5. Provenzano, P. P. et al. Collagen reorganization at the tumor-stromal interface facilitates local invasion. BMC Med. 4, 38 (2006).
6. Gadaleta, E. et al. Field cancerization in breast cancer. J. Pathol. 257, 561–574 (2022).
7. Kuchenbaecker, K. B. et al. Risks of breast, ovarian, and contralateral breast cancer for BRCA1 and BRCA2 mutation carriers. JAMA 317, 2402–2416 (2017).
8. Ferreira, R. et al. The virtual microscope. In Proc. AMIA Annual Fall Symposium 449 (American Medical Informatics Association, 1997).
9. Pantanowitz, L. et al. Twenty years of digital pathology: an overview of the road travelled, what is on the horizon, and the emergence of vendor-neutral archives. J. Pathol. Inform. 9, 40 (2018).
10. Bankhead, P. et al. QuPath: open source software for digital pathology image analysis. Sci. Rep. 7, 16878 (2017).
11. Lipkova, J. & Kather, J. N. The age of foundation models. Nat. Rev. Clin. Oncol. 21, 769–770 (2024).
12. Cooper, M., Ji, Z. & Krishnan, R. G. Machine learning in computational histopathology: challenges and opportunities. Genes Chromosomes Cancer 62, 540–556 (2023).
13. Hosseini, M. S. et al. Computational pathology: a survey review and the way forward. J. Pathol. Inform. 15, 100357 (2024).
14. Song, A. H. et al. Artificial intelligence for digital and computational pathology. Nat. Rev. Bioeng. 1, 930–949 (2023).
15. Zhu, Z., Wang, S. H. & Zhang, Y. D. A survey of convolutional neural network in breast cancer. Comput. Model. Eng. Sci. 136, 2127–2172 (2023).
16. Patil, A. et al. Efficient quality control of whole slide pathology images with human-in-the-loop training. J. Pathol. Inform. 14, 100306 (2023).
17. Wetstein, S. C. et al. Deep learning assessment of breast terminal duct lobular unit involution: towards automated prediction of breast cancer risk. PLoS ONE 15, e0231653 (2020).
18. de Bel, T. et al. Automated quantification of levels of breast terminal duct lobular (TDLU) involution using deep learning. npj Breast Cancer 8, 13 (2022).
19. Kensler, K. H. et al. Automated quantitative measures of terminal duct lobular unit involution and breast cancer risk. Cancer Epidemiol. Biomark. Prev. 29, 2358–2368 (2020).
20. Ogony, J. et al. Towards defining morphologic parameters of normal parous and nulliparous breast tissues by artificial intelligence. Breast Cancer Res. 24, 45 (2022).
21. Apou, G. et al. Detection of lobular structures in normal breast tissue. Comput. Biol. Med. 74, 91–102 (2016).
22. Abubakar, M. et al. Host, reproductive, and lifestyle factors in relation to quantitative histologic metrics of the normal breast. Breast Cancer Res. 25, 97 (2023).
23. Ish, J. L. et al. Outdoor air pollution and histologic composition of normal breast tissue. Environ. Int. 176, 107984 (2023).
24. Heydarlou, H. et al. A deep learning approach for the classification of fibroglandular breast density in histology images of human breast tissue. Cancers 17 (2025).
25. Stacke, K. et al. Measuring Domain Shift for Deep Learning in Histopathology. IEEE J. Biomed. Health Inf. 25, 325–336 (2021).
26. Tafavvoghi, M. et al. Publicly available datasets of breast histopathology H&E whole-slide images: a scoping review. J. Pathol. Inform. 15, 100363 (2024).
27. Zhou, B. et al. Learning deep features for discriminative localization. In Proc. IEEE conference on computer vision and pattern recognition 2921–2929 (2016).
28. Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. 128, 336–359 (2020).
29. Sherman, M. E. et al. The Susan G. Komen for the Cure Tissue Bank at the IU Simon Cancer Center: a unique resource for defining the “molecular histology” of the breast. Cancer Prev. Res. 5, 528–535 (2012).
30. Jahanifar, M. et al. Domain generalization in computational pathology: survey and guidelines. ACM Computing Surveys 57, 1–37 (2025).
31. Voon, W. et al. Performance analysis of seven Convolutional Neural Networks (CNNs) with transfer learning for Invasive Ductal Carcinoma (IDC) grading in breast histopathological images. Sci. Rep. 12, 19200 (2022).
32. Reinhard, E. et al. Color transfer between images. IEEE Comput. Graph. Appl. 21, 34–41 (2001).
33. Vahadane, A. et al. Structure-preserving color normalization and sparse stain separation for histological images. IEEE Trans. Med. Imaging 35, 1962–1971 (2016).
34. Macenko, M. et al. A method for normalizing histology slides for quantitative analysis. In Proc. International Symposium on Biomedical Imaging: From Nano to Macro, 1107–1110 (IEEE, 2009).
35. Kang, H. et al. StainNet: a fast and robust stain normalization network. Front. Med. 8, 746307 (2021).
36. Howard, A. G. Mobilenets: efficient convolutional neural networks for mobile vision applications. Preprint at https://arxiv.org/abs/1704.04861 (2017).
37. He, K. et al. Deep residual learning for image recognition. In Proc. conference on computer vision and pattern recognition, 770–778 (IEEE, 2016).
38. Huang, G. et al. Densely connected convolutional networks. In Proc. IEEE conference on computer vision and pattern recognition, 4700–4708 (2017).
39. Szegedy, C. et al. Rethinking the inception architecture for computer vision. In Proc. conference on computer vision and pattern recognition, 2818–2826 (IEEE, 2016).
40. Deng, J. et al. Imagenet: a large-scale hierarchical image database. In Proc. 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).
41. Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
42. BReAst Carcinoma Subtyping (BRACS), https://www.bracs.icar.cnr.it/background/.
43. Brancati, N. et al. BRACS: A Dataset for BReAst Carcinoma Subtyping in H&E Histology Images. Database 2022, baac093 (2022). https://doi.org/10.1093/database/baac093.
44. Howard, A. G. et al. Mobilenets: efficient convolutional neural networks for mobile vision applications. Preprint at https://arxiv.org/abs/1704.04861 (2017).
45. Yamashita, R. et al. Deep learning model for the prediction of microsatellite instability in colorectal cancer: a diagnostic study. Lancet Oncol. 22, 132–141 (2021).
46. Janowczyk, A. et al. HistoQC: an open-source quality control tool for digital pathology slides. JCO Clin. Cancer Inf. 3, 1–7 (2019).
47. Montezuma, D. et al. Annotation practices in computational pathology: a European Society of Digital and Integrative Pathology (ESDIP) survey study. Lab. Invest. 105, 102203 (2024).
48. Reed, A. D. et al. A single-cell atlas enables mapping of homeostatic cellular shifts in the adult human breast. Nat. Genet. 56, 652–662 (2024).
49. Kumar, T. et al. A spatially resolved single-cell genomic atlas of the adult human breast. Nature 620, 181–191 (2023).
50. Sun, S. et al. Single-cell analysis of somatic mutation burden in mammary epithelial cells of pathogenic BRCA1/2 mutation carriers. J. Clin. Invest. 132, e148113 (2022).
51. Pal, B. et al. A single-cell RNA expression atlas of normal, preneoplastic and tumorigenic states in the human breast. EMBO J. 40, e107333 (2021).
52. Anaya, J. et al. Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status. Nat. Biomed. Eng. 8 (2024).
53. Naik, N. et al. Deep learning-enabled breast cancer hormonal receptor status determination from base-level H&E stains. Nat. Commun. 11, 5727 (2020).
54. El Nahhas, O. S. M. et al. Regression-based deep-learning predicts molecular biomarkers from pathology slides. Nat. Commun. 15, 1253 (2024).
55. Caputo, A. et al. Spatial transcriptomics suggests that alterations occur in the preneoplastic breast microenvironment of BRCA1/2 mutation carriers. Mol. Cancer Res. 22, 169–180 (2024).
56. Trujillo, K. A. et al. Markers of fibrosis and epithelial to mesenchymal transition demonstrate field cancerization in histologically normal tissue adjacent to breast tumors. Int. J. Cancer 129, 1310–1321 (2011).
57. Jo, S., Jang, G. & Park, H. GMAR: gradient-driven multi-head attention rollout for vision transformer interpretability. Preprint at https://arxiv.org/abs/2504.19414 (2025).
58. Chowdhury, A. et al. Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis. In Proc. Computer Vision and Pattern Recognition Conference, 4375–4385 (2025).
59. Oh, S., Kim, N. & Ryu, J. Analyzing to discover origins of CNNs and ViT architectures in medical images. Sci. Rep. 14, 8755 (2024).
60. Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).
61. Lu, M. Y. et al. A visual-language foundation model for computational pathology. Nat. Med. 30, 863–874 (2024).
62. Gustafsson, F. K. & Rantalainen, M. Evaluating computational pathology foundation models for prostate cancer grading under distribution shifts. Preprint at https://arxiv.org/abs/2410.06723 (2024).
63. Song, A. H. et al. Morphological prototyping for unsupervised slide representation learning in computational pathology. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 11566–11578 (2024).
64. Yun, J. et al. Enhancing whole slide pathology foundation models through stain normalization. Preprint at https://arxiv.org/html/2408.00380v1 (2024).
65. DeLong, E. R., DeLong, D. M. & Clarke-Pearson, D. L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837–845 (1988).
66. Sun, X. & Xu, W. Fast Implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process. Lett. 21, 1389–1393 (2014).

Acknowledgements

The authors wish to acknowledge the support of the King’s Health Partners Cancer Biobank in London, the Netherlands Cancer Institute in Amsterdam, the Barts Cancer Institute in London, and the École Polytechnique Fédérale de Lausanne in Lausanne, Switzerland, and thank Aasiyah Oozeer, Marcello D’Angelo, Rachel Barrow, Rachel Nelan, Marcelo Sobral-Leite and Fabio de Martino for material collection. The authors would like to thank all members of the Cancer Bioinformatics team at King’s College London (London, UK) for their helpful suggestions. We thank the Breast Cancer Research Trust, Breast Cancer Now (and their legacy charity Breakthrough Breast Cancer), the Medical Research Council (MRC) [MR/X012476/1], Cancer Research UK [CRUK/07/012, KCL-BCN-Q3], the CRUK City of London Centre Award [CTRQQR-2021/100004], Guy’s Cancer Charity, and the UK Government through the Research Venture Catalyst award, Department for Science, Innovation, and Technology for funding this project. Siyuan Chen is funded by a China Scholarship Council PhD scholarship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. During the preparation of this work, the authors used Grammarly (free version) and ChatGPT (version 2) to correct some grammatical errors and enhance the overall readability of the manuscript. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Author information

These authors contributed equally: Siyuan Chen, Mario Parreno-Centeno.

Authors and Affiliations

Cancer Bioinformatics, School of Cancer & Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King’s College London, London, UK
Siyuan Chen, Mario Parreno-Centeno, Graham Booker, Gregory Verghese, Fathima Sumayya Mohamed, Christopher R. S. Banerji & Anita Grigoriadis
PharosAI, London, UK
Gregory Verghese, Aasiyah Oozeer, Rachel Barrow, Rachel Nelan, Cheryl Gillett, Louise J. Jones & Anita Grigoriadis
The Breast Cancer Now Research Unit, School of Cancer and Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King’s College London, London, UK
Gregory Verghese & Anita Grigoriadis
Panakeia Technologies, London, UK
Salim Arslan & Pahini Pandya
King’s Health Partners Cancer Biobank, King’s College London, London, UK
Aasiyah Oozeer, Marcello D’Angelo & Cheryl Gillett
Centre for Tumour Biology, Barts Cancer Institute, Queen Mary University of London, London, UK
Rachel Barrow, Rachel Nelan & Louise J. Jones
Division of Molecular Pathology, Netherlands Cancer Institute, Amsterdam, The Netherlands
Marcelo Sobral-Leite & Esther H. Lips
Swiss Institute for Experimental Cancer Research, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Fabio de Martino & Cathrin Brisken
The Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, UK
Cathrin Brisken
European Cancer Stem Cell Research Institute, School of Biosciences, Cardiff University, Cardiff, UK
Matthew J. Smalley
The Alan Turing Institute, London, UK
Christopher R. S. Banerji
University College London NHS Trust, London, UK
Christopher R. S. Banerji
School of Cancer & Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, King’s College London, London, UK
Sarah E. Pinder

Authors

Siyuan Chen
Mario Parreno-Centeno
Graham Booker
Gregory Verghese
Fathima Sumayya Mohamed
Salim Arslan
Pahini Pandya
Aasiyah Oozeer
Marcello D’Angelo
Rachel Barrow
Rachel Nelan
Marcelo Sobral-Leite
Fabio de Martino
Cathrin Brisken
Matthew J. Smalley
Esther H. Lips
Cheryl Gillett
Louise J. Jones
Christopher R. S. Banerji
Sarah E. Pinder
Anita Grigoriadis

Contributions

Conceptualisation: A.G., S.P., S.C., M.P.-C., C.R.S.B. Data curation: S.C., G.B. Formal analysis: S.C., M.P.-C. Funding acquisition: A.G. Investigation: A.G., S.P., S.C., M.P.-C., C.R.S.B. Methodology: S.C., M.P.-C., C.R.S.B., G.V., S.A. Project administration: A.G. Resources: C.G., L.J.J., E.H.L., C.B., A.O., M.D’A., R.B., R.N., M.S.-L., F.de.M. Software: S.C., M.P.-C. Supervision: A.G., S.P., C.R.S.B. Validation: F.S.M. Visualisation: S.C. Writing – original draft: S.C. Writing – review & editing: A.G., S.P., C.R.S.B., E.H.L., C.B., L.J.J., M.J.S., G.V., F.S.M., G.B., S.C., M.P.-C., S.A., P.P.

Corresponding author

Correspondence to Anita Grigoriadis.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

NPJ_BreastCancer_2fd7f8ba-044e-423c-89b3-ff07ca907853_Supplementary_Material_Revision_clean (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chen, S., Parreno-Centeno, M., Booker, G. et al. Normal breast tissue (NBT)-classifiers: advancing compartment classification in normal breast histology. npj Breast Cancer 12, 41 (2026). https://doi.org/10.1038/s41523-026-00896-2

Received: 30 April 2025
Accepted: 14 January 2026
Published: 09 February 2026
Version of record: 17 March 2026
DOI: https://doi.org/10.1038/s41523-026-00896-2