Deep learning-based quantitative analysis of glomerular morphology in IgA nephropathy whole slide images and its prognostic implications

Cho, Seung Yeon; Kim, Yisak; Park, Sehoon; Paik, Jin Ho; Chin, Ho Jun; Park, Jeong Hwan; Lee, Jung Pyo; Kim, Yong-Jin; Park, Sun-Hee; Lee, Ho-chang; Cho, Hyunjeong; Lim, Beom Jin; Kim, Hyung Woo; Han, Seung Hyeok; Go, Heounjeong; Baek, Chung Hee; Lee, Hajeong; Moon, Kyung Chul; Kim, Young-Gon

doi:10.1038/s41598-025-09031-w

Article
Open access
Published: 02 July 2025

Scientific Reports volume 15, Article number: 23566 (2025)

Abstract

Kidney pathology of immunoglobulin A nephropathy (IgAN), which is central to both diagnosis and risk stratification, requires labor-intensive manual interpretation and is subject to unavoidable interpreter-dependent variability. We propose an artificial intelligence-based framework for quantitatively analyzing glomerular histologic features that can predict kidney disease progression in IgAN. A deep learning model based on DeepLabV3Plus and EfficientNet-B3 was developed to segment glomeruli and quantify their morphological features using digitized whole slide images from seven tertiary hospitals. The extracted features were then used for machine learning-based risk prediction of IgAN progression, and the predictive performance was compared with that of conventional clinicopathologic feature-based models. In total, 1,241 whole slide images were obtained. The weighted averages of average precision and Dice similarity coefficient were 0.795 and 0.721 in internal validation and 0.818 and 0.743 in external validation, respectively. Kidney outcome prediction models based on image features alone showed predictive performance similar to that of models based on clinical features alone. In addition, incorporating the image-based deep learning model into the clinical feature-based models improved predictive performance, although the improvement was not statistically significant. These results show that the prognostic value of quantitative glomerular histologic features is comparable to that of clinical data, suggesting that they may offer additional prognostic insights not covered by the Oxford classification or other clinical parameters.

Introduction

Immunoglobulin A nephropathy (IgAN) is one of the most prevalent glomerular diseases worldwide, contributing substantially to the progression of kidney failure^1,2. It is characterized by the abnormal deposition of immunoglobulin A, leading to subsequent inflammation of glomeruli³. Accurate and timely diagnosis of IgAN and its risk assessment are necessary for guiding appropriate clinical management, as early intervention can potentially slow or prevent disease progression. Kidney biopsy is an essential routine clinical procedure for patients with glomerular diseases, providing diagnostic and prognostic information to guide clinical decision-making^3,4. Although it remains the gold standard for the diagnosis and risk assessment of IgAN, the interpretation of kidney biopsy slides is complex and labor-intensive, requiring the expertise of pathologists.

Grading systems and prognostic tools have been developed to better characterize IgAN and predict progression outcomes^5,6. Particularly, the Oxford classification (the MEST-C score) of IgAN has been reported as a highly successful prognostic classification system in nephropathology, standardizing the biopsy reports and improving reliability^5,6. However, Oxford classification requires the manual examination of the biopsy slides and involves the examiner’s visual estimation. Such inherent subjectivity in histopathological assessments may lead to intra- and inter-observer variations, underscoring the need for complementary approaches^7,8,9. Recently, the International IgAN Prediction Tool (IIgAN-PT), a risk prediction model, was introduced, leveraging clinical and histological data to quantify the risk of kidney disease progression at diagnosis¹⁰. Although it demonstrated significant advancement in the prognostic modeling of IgAN, it incorporates Oxford classification, and its utility may be limited by the availability and completeness of the data required¹⁰.

In recent years, the integration of digital pathology and artificial intelligence techniques has opened new avenues for enhancing the reliability and efficiency of kidney biopsy interpretation, gaining increasing attention from the medical community^11,12,13,14. Specifically, deep learning-based glomerulus segmentation has been employed to analyze glomerulosclerosis in whole slide images (WSIs) of Periodic acid-Schiff (PAS)-stained biopsy specimens^15,16. Recognizing the importance of quantifying glomeruli in the histopathologic assessment of kidney tissue, researchers have developed ensembles of deep learning models to further improve the performance of glomerulus segmentation¹⁷. In addition, machine learning methods have been applied to identify critical clinical and histopathological predictors of disease progression in minimal change disease or focal segmental glomerulosclerosis¹⁸. Given these advancements, an artificial intelligence approach that analyzes various characteristics of glomerular lesions in the biopsy images of IgAN patients may offer additional valuable information to clinicians.

This study aimed to develop and validate an AI-based framework to automatically quantify and analyze glomerular histologic features for prognostic classification of IgAN using a multi-center cohort. Various types of glomerular lesions were segmented and classified by deep learning models, and the morphological features of the glomeruli were extracted to predict the progression of IgAN. In addition, image-based and clinical data-based prognostic models were compared to evaluate whether glomerular histologic features provide prognostic value comparable to that of clinical parameters. This study is novel in that it integrates lesion-specific glomerular segmentation with quantitative morphometric analysis across a large, multi-center PAS-stained biopsy cohort, and demonstrates that even basic image-derived features can achieve prognostic performance comparable to clinical models.

Materials and methods

Study subjects

This was a retrospective, multi-center study of biopsy-confirmed cases with a diagnosis of primary IgAN. The cases were obtained from seven medical institutions in South Korea: Seoul National University Hospital (SNUH), Seoul National University Bundang Hospital (SNUBH), Seoul Metropolitan Government Seoul National University Boramae Medical Center (SMG-SNU BMC), Kyungpook National University Hospital (KNUH), Chungbuk National University Hospital (CBNUH), Severance Hospital (SVH), and Asan Medical Center (AMC). All patients included in this study had digital biopsy images available; corresponding demographic, clinical, and pathological characteristics were also collected (Table 1). The clinical variables included estimated glomerular filtration rate (eGFR), serum creatinine level, and urine protein-to-creatinine ratio (UPCR), which collectively describe kidney function. For the pathological variables, only the scores available in pathology reports were included. Because the biopsy images were the primary input for the glomerulus segmentation model, biopsy slides of inferior quality were excluded; cases with less than six months of follow-up were further excluded during the development of the prognostic classification model.

Table 1 Clinical and pathologic characteristics of the study population.

Study outcome

The primary study outcome was defined as either a reduction in eGFR to below 50% of the value at biopsy or the occurrence of end-stage kidney disease (eGFR < 15 mL/min/1.73 m² or kidney replacement therapy). Patients were censored at the occurrence of the outcome event or at loss to follow-up.
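The outcome definition can be illustrated with a minimal sketch for a single follow-up measurement. The function name and arguments are hypothetical and are not taken from the study code; "below 50% of the value at biopsy" is read here as a follow-up eGFR under half of the baseline value.

```python
def reached_primary_outcome(egfr_at_biopsy, egfr_followup, on_krt=False):
    """Return True if a follow-up visit meets the composite outcome.

    Hypothetical helper: eGFR values are in mL/min/1.73 m^2 and
    on_krt marks kidney replacement therapy.
    """
    halved = egfr_followup < 0.5 * egfr_at_biopsy  # eGFR below 50% of baseline
    eskd = egfr_followup < 15 or on_krt            # end-stage kidney disease
    return bool(halved or eskd)
```

Under this reading, a patient with a baseline eGFR of 80 reaches the outcome at a follow-up eGFR of 39 but not at 45.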

Data description

The slides from SNUH, SNUBH, SMG-SNU BMC, and parts of KNUH were acquired in ScanScope Virtual Slide (SVS) format, and those from AMC were acquired in TIF format. The remaining biopsy images were provided as microscopic glass slides and scanned at SNUH in SVS format using a digital microscopy scanner (Aperio AT2; Leica Biosystems, Wetzlar, Germany); these digitized slides are referred to as WSIs. For each WSI obtained, nephropathologists annotated a single tissue core and its glomeruli. The manually scanned and TIF slides were not annotated. Each annotated glomerulus was labeled as one of five classes describing the lesion type: no lesion, global sclerosis, segmental sclerosis, crescent, or ischemic change (Table 2). The annotations underwent two-stage validation by expert nephropathologists, in which one group labeled the glomeruli and another group validated the labels. As the WSIs from SNUH formed the largest data collection, they were split into training, development, and validation sets; WSIs from the remaining six institutions were used for external validation. WSIs containing minority-class glomeruli were allocated first during splitting, so that these less prevalent classes were distributed proportionally across the training and validation sets. All WSIs used in this study were PAS-stained specimens. Fig. 1 provides a detailed overview of case inclusion, the exclusion criteria, and the split of the dataset into training, development, and validation sets, summarizing the flow of data collection and preparation used for model development and evaluation.

Table 2 Number of annotated glomeruli in the internal and external cohort.

Model development

Data pre-processing

A patch-based approach with a sliding window of 512 × 512 pixels was used to train the segmentation model. Adjacent windows overlapped by 50% to ensure that all biopsy regions were adequately covered; image patches that did not contain any glomerulus pixels were discarded. All patches were extracted at the 10× magnification level of the slide. Next, sampling methods were applied to mitigate the impact of class imbalance, as glomeruli without lesions greatly outnumbered those with lesions. In this study, combined sampling was used to reduce the overfitting caused by class imbalance: patches with no-lesion glomeruli were under-sampled, and patches containing the remaining classes were over-sampled. Finally, various data augmentations were applied to each input image using the Albumentations library before it was passed to the model¹⁹.
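The sliding-window extraction can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the function name `extract_patches` is hypothetical, the slide region is assumed to be already loaded as NumPy arrays at the working magnification, and windows that would extend past the image border are simply not generated.

```python
import numpy as np

def extract_patches(image, mask, patch=512, overlap=0.5):
    """Yield (row, col, image_patch, mask_patch) for a sliding window.

    image: H x W x 3 array; mask: H x W array with 0 as background.
    Border regions that do not fill a whole window are skipped.
    """
    stride = int(patch * (1 - overlap))  # 256 px for a 50% overlap
    h, w = mask.shape
    for r in range(0, max(h - patch, 0) + 1, stride):
        for c in range(0, max(w - patch, 0) + 1, stride):
            m = mask[r:r + patch, c:c + patch]
            if not m.any():              # no glomerulus pixels: discard
                continue
            yield r, c, image[r:r + patch, c:c + patch], m
```

With a 50% overlap, every interior pixel is covered by up to four windows, so a glomerulus cut by one patch border usually appears whole in a neighboring patch.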

Segmentation model

The deep learning model aimed to segment the glomeruli in WSIs and classify their lesion type. Instead of chaining separate segmentation and classification models, a multi-class segmentation model was designed to segment and classify the glomeruli simultaneously. Therefore, the input data comprised an RGB image of a tissue patch and a 5-channel mask image, with each channel representing a mask for one of the five glomerulus classes. The model was trained using DeepLabV3Plus with EfficientNet-B3 as an ImageNet-pretrained backbone encoder^20,21. The Adam optimizer was used with the Dice loss as the loss function. The initial learning rate was 1e-4, with a step decay factor of 0.5 applied after every 10 epochs without improvement in loss; training was stopped after another 10 epochs without improvement in loss. During validation, patch-wise Test Time Augmentation²², including flip, rotation, and multiply, was applied to further improve the segmentation performance. The patch-level inference results were aggregated to produce the final slide-level inference.
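The learning-rate schedule and stopping rule can be made concrete with a small framework-independent sketch. It encodes one reading of the description, which is an assumption: the learning rate is halved once 10 consecutive epochs pass without a new best loss, and training stops if 10 further epochs also bring no improvement. The function name `plateau_schedule` is hypothetical.

```python
def plateau_schedule(losses, lr=1e-4, factor=0.5, patience=10):
    """Replay per-epoch losses and return (lr_history, stop_epoch).

    The learning rate is multiplied by `factor` after `patience` stale
    epochs; training stops after `patience` further stale epochs.
    stop_epoch is None if the stopping rule never triggers.
    """
    best = float("inf")
    stale = 0                      # epochs since the last new best loss
    history = []
    for epoch, loss in enumerate(losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale == patience:  # plateau reached: decay the learning rate
                lr *= factor
        history.append(lr)
        if stale >= 2 * patience:  # still no improvement: stop training
            return history, epoch
    return history, None
```

In a PyTorch setup, `torch.optim.lr_scheduler.ReduceLROnPlateau` with `factor=0.5` and `patience=10` gives comparable decay behavior, although the text does not state which scheduler was used.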

Data post-processing

After glomerulus segmentation, a post-processing step was applied to separate connected over-segments, in which two or more closely located glomeruli are segmented as a single region because the boundaries of their masks merge. First, a distance transform was applied to the segmented glomeruli to generate a distance map, which encodes the distance from each pixel to the nearest background pixel. Subsequently, the watershed algorithm used markers derived from the local minima of the inverted distance map (that is, the peaks of the distance map) and expanded the regions gradually from these markers. This process effectively separated the connected over-segments into distinct, non-overlapping glomerulus instances.
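A minimal sketch of this separation step is given below, using SciPy for the distance transform and a small priority-flood watershed written out explicitly so that the example stays self-contained. Several choices are assumptions made for illustration: the function name `split_touching` is hypothetical, markers are taken as connected regions where the distance map exceeds a fraction (`marker_frac`) of its global maximum rather than as true local peaks, and 4-connectivity is used for flooding. Blobs much smaller than the largest one may receive no marker under this simplification and then stay unlabeled.

```python
import heapq
import numpy as np
from scipy import ndimage as ndi

def split_touching(mask, marker_frac=0.7):
    """Separate touching blobs in a boolean mask; returns an integer label image."""
    dist = ndi.distance_transform_edt(mask)       # distance to nearest background pixel
    markers, _ = ndi.label(dist > marker_frac * dist.max())  # one seed per blob core
    labels = markers.copy()
    heap, order = [], 0
    for r, c in zip(*np.nonzero(markers)):
        heap.append((-dist[r, c], order, int(r), int(c)))
        order += 1
    heapq.heapify(heap)
    h, w = mask.shape
    while heap:                                   # flood outward, deepest pixels first
        _, _, r, c = heapq.heappop(heap)
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and mask[nr, nc] and labels[nr, nc] == 0:
                labels[nr, nc] = labels[r, c]     # neighbor joins the current basin
                heapq.heappush(heap, (-dist[nr, nc], order, nr, nc))
                order += 1
    return labels
```

Flooding the negated distance map means that two seeds first meet at the narrow neck between touching glomeruli, which is where the instances are split. Library routines such as `skimage.segmentation.watershed` implement the same idea more efficiently.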

Morphological feature extraction

The predicted mask images of glomeruli with various types of lesions were used to analyze the morphological characteristics of the predicted glomeruli. Basic computer vision techniques were used to compute the area (number of pixels) of glomeruli, number of glomeruli, length of major and minor axes, solidness, compactness, eccentricity, and roundness (Supplementary Table S2). Each feature was extracted separately for each glomerular lesion class and averaged across glomeruli to represent the slide-level feature. For the area and number of glomeruli, the ratio of the value of each class relative to the total was calculated. The intention of using these features was to evaluate whether basic, simple, and explainable features that are directly measurable from glomerular morphology could provide meaningful prognostic information at the slide level.
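The feature computation can be sketched with NumPy alone. The exact definitions used in the study are given in Supplementary Table S2, which is not reproduced here, so the formulas below are common textbook definitions and should be treated as assumptions: axis lengths come from the second-order moments of the pixel coordinates, eccentricity is that of the equivalent ellipse, and roundness is 4·area / (π·major²). Solidness and compactness are omitted because they additionally require a convex hull and a perimeter estimate. The function names are hypothetical.

```python
import numpy as np

def shape_features(instance_mask):
    """Moment-based descriptors for one binary glomerulus mask (two or more pixels)."""
    rows, cols = np.nonzero(instance_mask)
    area = rows.size                                   # area as a pixel count
    cov = np.cov(np.stack([rows, cols]).astype(float), bias=True)
    minor_var, major_var = np.clip(np.linalg.eigvalsh(cov), 0, None)  # ascending order
    major, minor = 4 * np.sqrt(major_var), 4 * np.sqrt(minor_var)
    return {
        "area": area,
        "major_axis": float(major),
        "minor_axis": float(minor),
        "eccentricity": float(np.sqrt(max(0.0, 1 - (minor / major) ** 2))),
        "roundness": float(4 * area / (np.pi * major ** 2)),
    }

def slide_features(instances):
    """Average features per class and add each class's share of count and area.

    instances: list of (class_name, binary_mask) pairs for one slide.
    """
    per_class = {}
    for cls, mask in instances:
        per_class.setdefault(cls, []).append(shape_features(mask))
    total_n = len(instances)
    total_area = sum(f["area"] for feats in per_class.values() for f in feats)
    out = {}
    for cls, feats in per_class.items():
        for key in feats[0]:
            out[f"{cls}_{key}_mean"] = float(np.mean([f[key] for f in feats]))
        out[f"{cls}_count_ratio"] = len(feats) / total_n
        out[f"{cls}_area_ratio"] = sum(f["area"] for f in feats) / total_area
    return out
```

For a filled disk, the two axis lengths are nearly equal, so eccentricity is close to 0 and roundness close to 1; elongated shapes move eccentricity toward 1.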

Prognostic classification model

As the final stage of the proposed framework, classification models were trained using the previously extracted image-based features to predict kidney disease prognosis in IgAN. The image-based prognostic model was compared against two other models trained using clinical information: one was trained using the basic clinical data collected in the electronic medical record, and the other was trained using the variables used for IIgAN-PT. The complete list of input variables included for each model can be found in Supplementary Table S1. Additionally, the input features of the two clinical data-based models were combined with the image-based features for further assessment. Training was conducted using three classic machine learning algorithms, namely extreme gradient boosting (XGBoost), random forest, and logistic regression^23,24,25. The Scikit-learn library was used for the machine learning implementation²⁶. A step-by-step schematic of the full workflow, including input WSIs, glomerular segmentation, feature extraction, and prognostic classification, is illustrated in Fig. 2.
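The comparison of feature sets can be sketched with scikit-learn as follows. Everything specific in this example is an assumption made for illustration: the data are synthetic placeholders and not study data, the function name `compare_feature_sets` is hypothetical, a single stratified hold-out split stands in for whatever validation scheme the study used, and only two of the three algorithms are shown. XGBoost is left out to avoid an extra dependency; its `xgboost.XGBClassifier` follows the same `fit` and `predict_proba` interface.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def compare_feature_sets(feature_sets, y, seed=0):
    """Fit two classifiers per feature set and return held-out AUCs."""
    results = {}
    for name, X in feature_sets.items():
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, stratify=y, random_state=seed)
        models = {
            "logistic_regression": LogisticRegression(max_iter=1000),
            "random_forest": RandomForestClassifier(n_estimators=200, random_state=seed),
        }
        for model_name, model in models.items():
            model.fit(X_tr, y_tr)
            prob = model.predict_proba(X_te)[:, 1]   # probability of the outcome class
            results[(name, model_name)] = roc_auc_score(y_te, prob)
    return results

# Synthetic placeholder data only; these are not study data or study results.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=300)
image_feats = rng.normal(size=(300, 5)) + y[:, None]     # hypothetical image features
clinical_feats = rng.normal(size=(300, 3)) + y[:, None]  # hypothetical clinical features
aucs = compare_feature_sets({
    "image": image_feats,
    "clinical": clinical_feats,
    "combined": np.hstack([image_feats, clinical_feats]),
}, y)
```

The returned dictionary is keyed by (feature set, algorithm), which mirrors the image-only, clinical-only, and combined comparisons described above.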

Statistical analysis

The performance of the segmentation model was evaluated in terms of average precision (AP) and Dice similarity coefficient (DSC) scores, each with 95% confidence intervals (CI). For AP, a glomerulus prediction was considered a true positive when the predicted region overlapped with at least 50% of a ground-truth region. Both metrics were evaluated for each class of glomeruli, and the weighted average scores were computed for the internal and external validation sets to compare the overall performance. The performance of the classification models was assessed through receiver operating characteristic (ROC) analysis and evaluated using the area under the ROC curve (AUC). The AUC values of the classification models were compared using DeLong's test²⁷. All statistical analyses were performed in Python (v.3.8.0). The level of statistical significance was set at p < 0.05.
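The segmentation metrics can be written out in a short sketch. Two points are assumptions, since the text does not specify them: the overlap criterion is read as the intersection covering at least 50% of the ground-truth area (rather than an intersection-over-union threshold), and the weighted average is taken with the number of glomeruli per class as weights. The function names are hypothetical.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0

def is_true_positive(pred, truth, min_overlap=0.5):
    """True when the prediction covers at least min_overlap of the ground truth."""
    truth = truth.astype(bool)
    covered = np.logical_and(pred.astype(bool), truth).sum()
    total = truth.sum()
    return bool(total > 0 and covered / total >= min_overlap)

def weighted_average(scores, counts):
    """Average per-class scores weighted by the number of glomeruli per class."""
    return float(np.average(scores, weights=counts))
```

For example, two 16-pixel squares that share 4 pixels give a DSC of 2·4 / (16 + 16) = 0.25 and do not count as a true positive under the 50% rule.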

Ethical statement

The study protocols were approved by the Institutional Review Board Committees of SNUH/SNUBH/SMG-SNU BMC (IRB number: H-2103-091-1205), KNUH (IRB number: 2021-04-036), CBNUH (IRB number: 2021-09-004), SVH (IRB number: 4-2021-0376), and AMC (IRB number: 2021-1333), which waived the need for informed patient consent. The study adhered to the principles of the Declaration of Helsinki.

Code availability

The code used for morphological feature extraction, prognostic model development, and evaluation is available via GitHub: https://github.com/younggon2/Research-Segmentation_Glomerulus.