Introduction

Knee osteoarthritis (OA) is the most prevalent joint disease causing knee pain and disability1,2,3. However, preventing OA progression and treating the disease remain challenging because of its heterogeneous characteristics. Knee OA can involve a variety of risk factors, including local biomechanical factors (e.g., joint malalignment and post-traumatic, sports, and occupational injury) and systemic factors (e.g., age, sex, metabolic syndrome, bone mineral density [BMD], and race)4,5,6. Therefore, identifying patient-specific characteristics is important for preventing and effectively treating OA.

Based on this concept, OA has been grouped into distinct phenotypes according to different characteristics of knee OA4,7,8. Three phenotypes reportedly exist based on the predominant tissue reaction that causes joint damage: a bone-driven phenotype associated with subchondral bone osteopenia, a phenotype with predominant joint inflammation, and a phenotype in which cartilage damage is the structural hallmark9,10. Phenotypes have been classified by various OA risk factors, but studies on radiological features representing these three phenotypes are lacking.

Patient-specific characteristics are expected to affect whether knee OA involves a limited number of compartments or the whole joint4. Local biomechanical factors contribute to the cartilaginous degeneration of a single knee compartment, whereas systemic factors generate a low-grade inflammatory response, resulting in the degeneration of the cartilage and bone in all compartments (medial tibiofemoral [MTF], lateral tibiofemoral [LTF], and patellofemoral [PF] joints)11. Therefore, OA with systemic factors is considered more likely to progress to tricompartmental OA, whereas OA with local factors progresses to unicompartmental OA (MTF, LTF, or PF). In addition, the systemic factors that cause joint space narrowing (JSN) differ from those that cause osteophyte formation in terms of pathogenic mechanism12. These radiographic distinctions of knee OA probably reflect differences in the underlying patient-specific characteristics and may serve to distinguish among the structural phenotypes.

The machine learning (ML) approach is widely used to classify disease types13. As it is also used to analyze unpredictable meaningful patterns of heterogeneous data, we selected this approach to obtain new insight into predicting the radiographic pattern of OA progression and identify an association with OA phenotypes. Here, we focused on interpreting the classification output of ML algorithms and analyzed which factors were the most influential for classifying radiographic phenotypes of OA. This study aimed to identify key factors associated with different OA progression patterns by developing a model that predicts the progression of early OA using patient-specific characteristics (demographic and radiographic factors). The study hypotheses were as follows: 1) patients with systemic factors are more likely to progress to tricompartmental OA; and 2) the contributing factors would differ between JSN and osteophyte formation in tricompartmental OA.

Results

Data obtained from the clinical data warehouse (CDW) of a single institution were collected from 79,634 patients between April 2003 and October 2017 who visited the outpatient clinic for knee pain. From the dataset, this study included knees that progressed to unicompartmental or tricompartmental OA with a minimum follow-up period of 5 years. The patient-specific characteristics that increase the risk of OA progression and could be obtained from the CDW were divided into demographic and radiographic factors. The patients’ demographic data at the initial visit and radiographic data at the initial and final visits were retrospectively analyzed. The following demographic factors were included: age, sex, body mass index (BMI), BMD, metabolic diseases (hypertension [HTN], diabetes mellitus [DM], and dyslipidemia), and other comorbidities (tuberculosis, liver disease, cardiovascular disease, cancer, epilepsy, and kidney disease)14. The exclusion criteria were as follows: (1) age < 20 years; (2) no follow-up radiographic data above 5 years; (3) no demographic data; (4) severe OA (any JSN or osteophyte Osteoarthritis Research Society International [OARSI] grade = 3) at the initial visit; (5) bicompartmental or tricompartmental knee OA at the initial visit; (6) bicompartmental OA at the final visit; (7) prior knee surgery or fracture history; (8) rheumatoid arthritis, infectious arthritis, gouty arthritis, or psoriatic arthritis. A total of 833 knee data points were used to develop a model for predicting the progression of knee OA (Fig. 1).

Fig. 1: Flow chart for data enrollment.

A total of 833 knee data points were included to develop a model for predicting the progression of knee OA. OA osteoarthritis.

The JSN and osteophyte formation grades in each knee joint were used as radiographic factors. The JSN was evaluated at three sites in the MTF, LTF, and PF joints. Marginal osteophyte formation was evaluated at five sites in the medial femur (MF), lateral femur (LF), medial tibial plateau (MT), lateral tibial plateau (LT), and PF.

Patients were classified into one of the following two classes using the radiographic dataset of the final visit X-ray: (1) unicompartmental OA (if knee OA was defined in a single compartment) and (2) tricompartmental OA (if knee OA was defined in all three compartments). Tricompartmental OA was subclassified to determine how demographic and radiographic factors affected the radiographic patterns of JSN and osteophyte formation. The highest grade of JSN and osteophyte formation at any site was used to define the subclasses. Each knee progressing to tricompartmental OA was classified into one of the following three subclasses: (1) JSN-dominant OA (if the maximum JSN grade was greater than the maximum osteophyte grade), (2) osteophyte (O)-dominant OA (if the maximum osteophyte grade was greater than the maximum JSN grade), and (3) indeterminate OA (if the maximum osteophyte grade was equal to the maximum JSN grade)12.
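The subclass rule above can be written as a short function. The following is a minimal illustrative sketch, not the authors' code; the function name and the list-based input format are assumptions made for this example.

```python
from typing import Sequence


def tricompartmental_subclass(jsn_grades: Sequence[int],
                              osteophyte_grades: Sequence[int]) -> str:
    """Assign a tricompartmental OA subclass from final-visit OARSI grades (0-3).

    jsn_grades: grades at the three JSN sites (MTF, LTF, PF).
    osteophyte_grades: grades at the five osteophyte sites (MF, LF, MT, LT, PF).
    The input layout is a hypothetical choice for this sketch.
    """
    max_jsn = max(jsn_grades)                # highest JSN grade at any site
    max_osteophyte = max(osteophyte_grades)  # highest osteophyte grade at any site
    if max_jsn > max_osteophyte:
        return "JSN-dominant"
    if max_osteophyte > max_jsn:
        return "O-dominant"
    return "indeterminate"
```

For example, a knee with JSN grades (3, 1, 2) and osteophyte grades (2, 2, 1, 1, 2) would be labeled JSN-dominant under this rule, because only the maximum grade of each type is compared.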

Data description

Detailed patient information about the demographic and radiographic features is presented in Table 1. The mean patient age at the initial visit was 57.9 ± 9.4 years. The average follow-up period was 10.9 ± 3.4 years. The detailed compartmental distribution of knee OA progression and sex-based comparisons of demographics and compartmental patterns are also presented in Table 1. No significant differences were found between male and female patients, except for DM and dyslipidemia.

Table 1 Patient characteristics and compartmental distribution of OA progression with sex-based comparisons

Representative cases of the knee OA classes and subclasses are shown in Fig. 2.

Fig. 2: Representative cases of knee OA.

a Unicompartmental OA (F/66; BMD: −1.1; BMI: 29.75 kg/m2; comorbidities: none). b Tricompartmental joint space narrowing-dominant OA (F/85; BMD: −4.2; BMI: 27.5 kg/m2; comorbidities: HTN). c Tricompartmental osteophyte-dominant OA (F/72; BMD: −2.7; BMI: 28.04 kg/m2; comorbidities: HTN, DM, dyslipidemia). d Tricompartmental OA (F/68; BMD: −2.9; BMI: 25.55 kg/m2; comorbidities: HTN, dyslipidemia, chronic hepatitis B). OA osteoarthritis, BMD bone mineral density, BMI body mass index, HTN hypertension.

Intra-rater reliability (κ) for the osteophyte and JSN OARSI grading at the initial visit was 0.72–0.83 and 0.78–0.87, respectively. Intra-rater reliability for the osteophyte and JSN OARSI grading at the final visit was 0.74–0.87 and 0.76–0.85, respectively.

The demographic and radiographic features of unicompartmental versus tricompartmental OA are summarized in Table 2. The mean age at the initial and final visits and the BMI of the tricompartmental OA class were higher than those of the unicompartmental OA class (p < 0.001). The mean BMD value of the tricompartmental OA class was lower than that of the unicompartmental OA class (p < 0.001). The prevalence of metabolic diseases was higher in the tricompartmental OA class than in the unicompartmental OA class (p < 0.001). There was no difference in the mean follow-up period, sex ratio, or prevalence of comorbidities between the two classes. The grades of all radiographic features were higher in the tricompartmental versus unicompartmental OA classes (p < 0.001), except for the MTF JSN grade at the initial visit (p = 0.011).

Table 2 Demographic and radiologic factors of unicompartmental versus tricompartmental OA

The demographic and radiographic features of the tricompartmental O-dominant, indeterminate, and JSN-dominant OA classes are compared in Table 3. Among the tricompartmental OA subclasses, the mean BMD value was lowest in the tricompartmental JSN-dominant OA class (p = 0.005). The prevalence of metabolic diseases was higher in the tricompartmental O-dominant OA class than in the tricompartmental JSN-dominant OA class (p = 0.010).

Table 3 Demographic and radiologic factors of tricompartmental OA subclasses

Prediction performance of ML models

The performance metrics of the OA progression models are listed in Table 4. The light gradient boosting machine (LGBM) showed a higher area under the curve (AUC) than logistic regression (LR) for all classification models. Among the LGBM models, the model for unicompartmental/tricompartmental OA showed the highest AUC of 0.94, while that for tricompartmental O-dominant OA showed the lowest AUC (0.73; Fig. 3).

Fig. 3: ROC curve of LR and LGBM model for predicting OA progression.

a Unicompartmental/tricompartmental OA classification model. b Tricompartmental joint space narrowing-dominant OA classification model. c Tricompartmental osteophyte-dominant OA classification model. ROC receiver operating characteristic, LR logistic regression, OA osteoarthritis, LGBM light gradient boosting machine.

Table 4 Performance of prediction models

Identification of strong contributors

We calculated the feature importance for each LGBM model using SHapley Additive exPlanations (SHAP). For the unicompartmental/tricompartmental OA classification model, the most important feature was PF osteophyte grade, followed by BMI, BMD, MF osteophyte grade, and LF osteophyte grade. The lower the BMD, the greater the risk of progression to tricompartmental OA. The higher the PF, MF, and LF osteophyte grade, the greater the risk of progression to tricompartmental OA (Fig. 4).

Fig. 4: SHAP summary plot of unicompartmental/tricompartmental OA classification model.

All features were ordered by SHAP importance. OA osteoarthritis, SHAP SHapley Additive exPlanations.

For the tricompartmental JSN-dominant OA classification model, BMD was ranked first, followed by BMI, MTF JSN grade, and age. The lower the BMD, the greater the risk of progression to tricompartmental JSN-dominant OA. The higher the MTF JSN grade and age, the greater the risk of progression to tricompartmental JSN-dominant OA. For the tricompartmental O-dominant OA classification model, BMD was ranked first, followed by BMI, age, and metabolic disease. The higher the prevalence of metabolic diseases and the younger the age, the greater the risk of progression to tricompartmental O-dominant OA (Fig. 5). However, the linear relationship between BMI and risk of progression to each class was unclear.

Fig. 5: SHAP summary plot of tricompartmental OA subclassification models.

All features were ordered by SHAP importance. a Tricompartmental joint space narrowing-dominant OA classification model. b Tricompartmental osteophyte-dominant OA classification model. OA osteoarthritis, SHAP SHapley Additive exPlanations.

Discussion

The principal findings of this study were as follows: Patients with a low BMD and high osteophyte grade for any site were likely to progress to tricompartmental OA, whereas those with a high BMD and low osteophyte grade for any site were likely to progress to unicompartmental OA. Among tricompartmental OA knees, patients with a low BMD, old age, and high MTF JSN grade were likely to progress to tricompartmental JSN-dominant OA, while those with metabolic disease and a relatively young age were likely to progress to tricompartmental O-dominant OA.

OA is a heterogeneous and multifactorial disease known to progress due to a wide variety of factors, and recent studies have classified OA phenotypes based on the factors affected7,9. Several studies have used the ML approach to predict knee OA progression and identify contributing factors for OA phenotypes13,15. Here, we used the LGBM ML model, which had better performance than the conventional LR for predicting OA progression patterns, focusing on the contributing factors and SHAP methodology for the interpretation of ML models.

In the unicompartmental/tricompartmental OA classification model, patients with a low BMD were likely to progress to tricompartmental OA. The association between knee OA and BMD has not yet been established and remains under debate. Zhang et al. demonstrated that high BMD decreased the risk of OA progression in the Framingham cohort, whereas Hart et al. reported that low BMD at the hip appears weakly related to OA progression16,17. In this study, low BMD was identified as a contributing factor for progression to tricompartmental OA. In patients with OA and low BMD, the imbalance between bone resorption and formation could impair subchondral bone remodeling. This results in a weakened microstructure and worsened biomechanical properties in the subchondral bone layer, increasing the loading stress on the cartilage layer and leading to more severe cartilage damage18. Low BMD affects the subchondral bone systemically, causing microfractures in all three compartments of the knee joint and increasing the risk of progression to tricompartmental OA19,20.

In contrast, patients with a high BMD were likely to progress to unicompartmental OA. Previous studies showed that patients with a high BMD had an increased risk of incident OA, while low BMD was related to the progression of OA21,22,23. This apparent discrepancy with previous findings may be explained by differences in the dominant mechanisms of OA development. Localized biomechanical stresses are more relevant in compartment-specific OA, whereas systemic alterations in bone quality, such as low BMD, may contribute to the pathogenesis of generalized OA. In addition, Bergink et al. demonstrated a positive association between BMD and radiologic OA in weight-bearing joints due to local biomechanical factors, such as malalignment24. They focused on the incidence and progression of hypertrophic OA, a phenotype characterized by elevated BMD, osteoformative responses, slower disease progression, and reduced bone turnover. In this study, patients with high BMD may exhibit increased subchondral bone stiffness, which could amplify mechanical stress, particularly in the presence of lower limb malalignment. Such malalignment could impose excessive stress on the articular cartilage and subchondral bone of a single compartment. Repetitive forces may result in denser and stiffer bone with a lower load-absorbing ability, which may lead to increased deterioration of the overlying cartilage24,25. Alignment factors were not assessed in this study, so future research using whole-leg radiographs may help predict the progression of OA.

The association between OA and BMD may differ depending on whether a patient has a generalized or localized form of OA. OA may be amenable to tailored treatments that target specific phenotypes13. Unicompartmental OA is suggested to be an OA phenotype associated with high BMD, characterized by mechanical overload and cartilage damage, which may improve with alignment surgery9. In contrast, tricompartmental OA is suggested to be an OA phenotype associated with low BMD, characterized by aberrant subchondral bone remodeling, which may improve with osteoporosis treatment.

BMI was also an important contributing factor in the classification model, with the tricompartmental OA group showing higher BMI values than the unicompartmental OA group. However, the linear relationship between BMI and each class was not clear. This pattern may reflect a non-linear dose-response association between BMI and the risk of knee OA, with BMI influencing OA risk primarily beyond a certain threshold, rather than contributing uniformly across its entire range26,27. Furthermore, obesity could affect knee OA as an intermediate mediator, both locally and systemically: it causes mechanical overloading at specific joints, which leads to a localized effect, while also contributing to low-grade systemic inflammation at multiple joints. Fat mass accumulation and dysregulation drive inflammation and extracellular matrix degradation in the musculoskeletal system, with adipokines playing a key role in the development of osteoarthritis. However, the interactions among inflammatory pathways and mechanical and metabolic processes in cartilage and bone disorders remain unclear28.

In the tricompartmental JSN-dominant OA classification model, a low BMD and old age were important demographic factors. Osteoporotic OA is characterized by weakened subchondral bone strength and improper structural support for articular cartilage, with greater stress on the medial platform compared to the lateral platform under physiological conditions29,30. Osteoporosis should be considered the major risk factor for varus deformity in knee OA because it causes both narrowing of the joint space and changes in the bony structure around the joint, through pathogenetic mechanisms that differ from those of unicompartmental OA with a high BMD. This pattern may correspond to the bone-driven OA phenotype, which may benefit structurally from treatment with bone-acting agents. Moreover, older age may be associated with prolonged repetitive force on the knee joint, which can cause JSN.

The tricompartmental O-dominant OA classification model showed the lowest AUC among the OA progression models, which makes its interpretation less convincing. Rather than being immediately applied to clinical practice, it is better interpreted as an initial exploratory model for the characterization of OA subtypes. Nevertheless, metabolic disease emerged as an important contributing factor with a linear relationship. Several studies have indicated that metabolic diseases increase the risk of OA31,32,33. Subchondral bone ischemia, caused by hypertension-induced atheromatous lesions, may accelerate damage to joint tissues. The accumulation of advanced glycation end products or cholesterol activates inflammatory mechanisms and increases oxidative stress within the joint. These processes contribute to chronic low-grade inflammation in all three knee compartments and promote synovial activation and osteophyte formation in early-stage OA. Increasing awareness of this classification and of patients’ comorbidities may enable interventions that could prevent the progression of OA. Although this study did not include laboratory markers indicative of systemic inflammation, future research with a larger sample size and the incorporation of biochemical and inflammatory markers would be valuable in confirming these associations and further characterizing the metabolic OA phenotype.

Many studies of clinical phenotypes and endotypes of OA have examined the prevention and treatment of early-stage OA7,13,33. However, despite many studies that classified phenotypes using biomechanical data, those of structural phenotypes using radiographic data are limited12,13. This study attempted to develop an ML algorithm using patient-specific information that could be easily obtained from clinics, such as demographics, comorbidities, and radiographic data. By interpreting the feature importance of each classification model using the SHAP methodology, we were able to identify the factors contributing to each OA class and predict the corresponding OA phenotype. Assessing OA progression patterns and identifying patient-specific information at the time of the clinical visit may enable targeted interventions, thereby supporting more effective personalized treatment and prevention strategies. However, the application of ML in predicting OA progression also requires careful consideration of its potential influence on clinical decision-making and the communication of prognostic information to patients. Although the current model was primarily trained on participants with progressed OA due to the study design, it may serve as a foundation for future clinical tools. With additional training on larger and more diverse datasets, including non-progressors, the model could be adapted and validated for use in clinical prediction. This could assist clinicians in the earlier identification of OA progression patterns and support personalized monitoring and intervention strategies.

This study has several limitations. First, the dataset was obtained from a single institution, and the number of knees available for both the tricompartmental JSN-dominant OA classification model and the tricompartmental O-dominant OA classification model may not be sufficient for optimal gradient boosting, potentially leading to bias in the results. However, a recent review article suggests that an AUC between 0.7 and 0.8 can be considered acceptable discrimination, and the performance indicators in this study showed no evidence of overfitting34. Second, patients who had progressed to bicompartmental OA were excluded from the analysis. While this approach allowed us to focus on distinct OA progression patterns, it may have limited the generalizability of our findings to broader patient populations. A multicenter study with a larger sample size may be valuable for future analysis. Third, clinical symptoms and biomechanical and lifestyle-related factors, such as pain severity, lower limb alignment, physical activity, and occupation, were not included because of a lack of available data, which limits the comprehensive characterization of OA progression. Including these factors in future studies may improve clinical relevance and facilitate the identification of contributors to specific OA phenotypes. Fourth, the study population exhibited a clear sex imbalance, with ~80% of patients being female. The predominance of female patients may limit the applicability of our findings to male populations. However, sex showed minimal contribution in the classification models, and subgroup analyses revealed no major differences between male and female patients. Fifth, this study was retrospective in nature, and the follow-up periods differed among patients. The retrospective design limits control over potential confounders and warrants caution in interpreting the findings. However, no significant difference in mean follow-up periods was observed between classes.

In conclusion, this study provides insights into the relationship between patient-specific characteristics, radiographic features, and OA phenotypes. Patients with osteoporosis were likely to progress to tricompartmental OA with JSN, while those with a high BMD were likely to progress to unicompartmental OA. Metabolic disease-related OA was associated with tricompartmental OA involving large osteophytes.

Methods

This study was approved by the institutional review board of Seoul National University Bundang Hospital (IRB No. B-2211-790-103), and the requirement for informed consent was waived, considering the retrospective study design and the use of anonymized patient data.

Radiologic data

INFINITT ver. 5.0.9.2 (INFINITT, Seoul, Korea) was used for all radiographic assessments. All knee radiographs of the enrolled patients were analyzed on weight-bearing AP views for the tibiofemoral joints and 30° flexion LAT views for the PF joints obtained at the initial and final visits. All radiographic factors in the dataset were scored at 6-week intervals by a blinded independent orthopedic surgeon who was not aware of the clinical characteristics of the patients.

Grades for JSN and osteophyte formation at each site were scored according to the OARSI atlas on a four-point scale (0, absent; 1, mild; 2, moderate; 3, severe)35,36,37. MTF or LTF OA was defined using the OARSI atlas grading system if one or more of the following criteria were fulfilled: (1) JSN grade ≥ 2, (2) sum of the two marginal osteophyte grades in the same compartment ≥ 2, or (3) JSN grade 1 and marginal osteophyte grade 1 in the same tibiofemoral compartment38,39. PF OA was defined as follows: (1) any osteophyte grade ≥ 2 or (2) JSN grade ≥ 1 and any osteophyte grade ≥ 1 in the PF compartment40.
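The compartment-level definitions can be expressed as simple predicates. The sketch below is illustrative only and is not the authors' implementation; the function names and the dictionary keys (for example "MTF_JSN") are hypothetical, and a single PF osteophyte grade per knee is assumed because the study lists one PF osteophyte site.

```python
def tibiofemoral_oa(jsn: int, femoral_osteophyte: int, tibial_osteophyte: int) -> bool:
    """MTF or LTF OA: true if any of the three criteria in the text is met."""
    return (
        jsn >= 2                                             # criterion 1
        or femoral_osteophyte + tibial_osteophyte >= 2       # criterion 2
        or (jsn == 1 and 1 in (femoral_osteophyte, tibial_osteophyte))  # criterion 3
    )


def patellofemoral_oa(jsn: int, osteophyte: int) -> bool:
    """PF OA: osteophyte grade >= 2, or JSN >= 1 together with osteophyte >= 1."""
    return osteophyte >= 2 or (jsn >= 1 and osteophyte >= 1)


def compartment_class(grades: dict) -> str:
    """Count involved compartments and name the resulting class.

    `grades` maps hypothetical keys to OARSI grades (0-3).
    """
    involved = [
        tibiofemoral_oa(grades["MTF_JSN"], grades["MF"], grades["MT"]),
        tibiofemoral_oa(grades["LTF_JSN"], grades["LF"], grades["LT"]),
        patellofemoral_oa(grades["PF_JSN"], grades["PF"]),
    ]
    names = {0: "no OA", 1: "unicompartmental", 2: "bicompartmental", 3: "tricompartmental"}
    return names[sum(involved)]
```

Knees that this function would label "bicompartmental" correspond to those excluded from the analysis according to the exclusion criteria.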

Feature variables

Demographic and radiographic features were obtained from the CDW to predict OA progression as follows. Among the demographic features, age, BMI, and BMD were used as continuous variables. For BMD, the lowest initial T-score found at the spine, total hip, femoral neck, or trochanter was used, as recommended for the diagnosis of osteoporosis by the International Society for Clinical Densitometry41,42. The remaining demographic features (sex, metabolic disease, HTN, DM, dyslipidemia, and other comorbidities) were categorical variables. Metabolic diseases and other comorbidities were coded as present if at least one constituent disease was present. All radiographic features were ordinal variables: JSN grade for the MTF, LTF, and PF joints, and osteophyte grade for the MF, LF, MT, LT, and PF sites.
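The two derived demographic features described above (lowest T-score and composite disease flags) could be computed as in the following sketch. The function names and input formats are assumptions for illustration and do not come from the study.

```python
def lowest_t_score(t_scores: dict) -> float:
    """Return the lowest T-score across the measured sites.

    `t_scores` maps a site name (e.g. "spine", "total_hip", "femoral_neck",
    "trochanter") to its T-score; the key names are hypothetical.
    """
    return min(t_scores.values())


def composite_flag(*conditions: bool) -> int:
    """Code a composite variable as 1 if at least one constituent disease is present."""
    return int(any(conditions))


# Example: metabolic disease is coded from HTN, DM, and dyslipidemia.
bmd = lowest_t_score({"spine": -1.2, "total_hip": -2.1,
                      "femoral_neck": -2.5, "trochanter": -1.9})
metabolic_disease = composite_flag(False, True, False)  # HTN, DM, dyslipidemia
```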

Data preprocessing

Seventeen features, comprising the demographic and radiographic features at the initial visit, were used as model inputs. The dataset was randomly divided into training and test sets at a ratio of 70:30 using stratified randomization. All preprocessing steps, including Z-score normalization of continuous variables, were performed using parameters derived from the training set and subsequently applied to the test set. Fig. 6 shows the development process of the prediction model for OA progression.
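The key point of this preprocessing step is that the normalization parameters come from the training set only. The sketch below illustrates the idea with the Python standard library; it is not the authors' pipeline (which used scikit-learn), and the function names, the fixed seed, and the use of the population standard deviation are assumptions.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split sample indices so each class keeps roughly the same test proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_zscore(train_values):
    """Estimate mean and standard deviation from the training set only."""
    return mean(train_values), pstdev(train_values)


def apply_zscore(values, mu, sigma):
    """Normalize any set (training or test) with the training-set parameters."""
    return [(value - mu) / sigma for value in values]
```

In use, `fit_zscore` would be called once on the training values of a continuous variable such as age, and `apply_zscore` would then be applied to both the training and test values with the same `mu` and `sigma`, which prevents information from the test set leaking into the model.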

Fig. 6: Pipeline of machine learning algorithm classification models.

The dataset was split into training and test sets (70:30) using stratified randomization. Data preprocessing steps, including Z-score normalization of continuous variables, were performed. OA osteoarthritis, BMI body mass index, BMD bone mineral density, HTN hypertension, DM diabetes mellitus, JSN-dominant joint space narrowing-dominant, O-dominant osteophyte-dominant.

Model development and evaluation

We trained supervised binary classification models using the data of all included knees (unicompartmental OA vs. tricompartmental OA). We also built two one-versus-rest (OVR) classification models, which transformed the multi-class classification into multiple binary classifications with the data of knees progressing to tricompartmental OA (tricompartmental JSN-dominant OA vs. rest tricompartmental OA [O-dominant and indeterminate OA] and tricompartmental O-dominant OA vs. rest tricompartmental OA [JSN-dominant and indeterminate OA])43.
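The one-versus-rest construction amounts to relabeling the three subclasses as a binary target, once per model. A minimal sketch follows; the function name and the label strings are hypothetical.

```python
def one_vs_rest_labels(subclasses, positive):
    """Return 1 for knees in the `positive` subclass and 0 for all other subclasses."""
    return [1 if subclass == positive else 0 for subclass in subclasses]


subclasses = ["JSN-dominant", "O-dominant", "indeterminate", "JSN-dominant"]
y_jsn = one_vs_rest_labels(subclasses, "JSN-dominant")  # target for the JSN-dominant model
y_o = one_vs_rest_labels(subclasses, "O-dominant")      # target for the O-dominant model
```

Note that indeterminate knees fall in the "rest" group of both models, as described in the text.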

Here, we focused on the interpretation of each classification model. We used two ML algorithms, conventional LR and LGBM, to develop the classifiers. For each algorithm, hyper-parameter optimization was conducted using five-fold cross-validation within the training set, with the area under the receiver operating characteristic curve as the criterion to maximize. The best hyper-parameters were searched using GridSearchCV: for LR, the regularization method and the inverse of regularization strength; and for LGBM, the number of leaves and the regularization type.

The performance metrics were the AUC, sensitivity, specificity, positive predictive value, and negative predictive value. The AUC was used as the primary metric for selecting the final model. All model development processes were performed using scikit-learn version 1.0.2 and Python version 3.8.15.
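For reference, the listed metrics can be computed from binary labels, predicted classes, and predicted scores as sketched below. This is a plain-Python illustration rather than the scikit-learn calls used in the study; the function names are assumptions, and the code assumes both classes occur among the labels and the predictions so that no denominator is zero.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

The pairwise definition of the AUC used here is equivalent to the area under the ROC curve and makes clear why the AUC does not depend on a classification threshold, whereas the other four metrics do.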

Model interpretation

We applied SHAP to the ML algorithm that had the higher AUC for each classification model. SHAP analysis is a model-agnostic method for quantifying the contribution of each feature to a prediction. The SHAP value of a feature is its marginal contribution to the model prediction, averaged over the possible combinations of the other features. All features were ordered by SHAP feature importance and visualized in SHAP summary plots. On the y-axis of the SHAP summary plot, the predictive feature with the largest feature importance is located at the top, and the remaining features are listed in descending order. The x-axis shows the contribution (SHAP value) of each feature to progression to the OA class: positive values indicate a higher risk of progression to the OA class, and negative values indicate a lower risk. Each dot represents one sample, colored according to its feature value; red indicates a higher feature value, while blue indicates a lower feature value44.
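To make the idea of averaging marginal contributions concrete, the sketch below computes exact Shapley values for a toy three-feature risk function by enumerating every feature ordering. It is an illustration only: the toy model, its coefficients, and the choice of replacing absent features with a fixed baseline value are assumptions, and this brute-force approach is not how SHAP libraries handle tree models in practice.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings.

    Features not yet "added" in an ordering take their baseline value
    (a simplifying assumption for this sketch).
    """
    n_features = len(x)
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for j in order:
            current[j] = x[j]                      # add feature j to the coalition
            output = model(current)
            totals[j] += output - previous_output  # marginal contribution of j
            previous_output = output
    return [total / len(orderings) for total in totals]


def toy_model(features):
    """Hypothetical risk score: lower BMD raises risk; BMI and PF grade interact."""
    bmd, bmi, pf_grade = features
    return -0.5 * bmd + 0.1 * bmi * pf_grade


x = [-2.0, 30.0, 2.0]        # one hypothetical knee: BMD T-score, BMI, PF osteophyte grade
baseline = [0.0, 25.0, 0.0]  # hypothetical reference values
phi = shapley_values(toy_model, x, baseline)
```

The resulting values sum to the difference between the model output for this knee and the output at the baseline, which is the additivity property that allows per-feature contributions to be plotted on a common axis.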

Statistical analysis

All statistical analyses were performed using the SPSS statistical package (version 25.0; IBM Corp., Armonk, NY, USA). Continuous variables are expressed as mean ± standard deviation. Categorical variables are expressed as number and percentage. Differences in continuous variables were analyzed using Student’s t-test or the Mann-Whitney test between two classes and one-way analysis of variance (ANOVA) with Bonferroni’s correction among three or more classes. Differences in categorical variables were analyzed using Pearson’s chi-squared test or Fisher’s exact test. The reproducibility of the radiographic assessment was evaluated using the weighted kappa (κ) statistic. The significance level was set at p < 0.05.