Abstract
Bimodal convolutional neural networks (CNNs) are frequently combined with patient information or multiple medical images to enhance diagnostic performance. However, technologies that integrate automatically generated clinical measurements with the images themselves are scarce. Hence, we developed a bimodal model that produces clinical measurements from radiographic images via an automatic algorithm for clinical measurement (aaCM) and integrates them with CNNs. In this multicenter research project, the diagnostic performance of the model was investigated with 813 radiographic hip images of infants at risk of developmental dysplasia of the hip (232 and 581 images of unstable and stable hips, respectively), with the ground truth defined by provocative examinations. The results indicated that the accuracy of aaCM was equal to or higher than that of specialists, and the bimodal model showed better diagnostic performance than LightGBM, XGBoost, SVM, and single-CNN models. aaCM can encode expert knowledge at a high level, and our proposed bimodal model performs better than state-of-the-art models.
Introduction
Developmental dysplasia of the hip (DDH) is one of the most common congenital abnormalities of the musculoskeletal system affecting infants, ranging from mild dysplasia to dislocated hips1. Provocative maneuvers have been widely used for screening unstable hips2,3,4,5. Therapeutic interventions should be performed on the unstable hips of infants under 6 months of age6, whereas infants with stable hips can be observed. Notably, a randomized study found that interventions for stable dysplasia did not affect acetabular growth7, making it crucial to identify unstable hips because therapeutic strategies differ according to hip instability. Imaging can also assist in diagnosing DDH through the following four parameters measured on the hip region: acetabular index, O-edge angle, Yamamuro A, and Yamamuro B8,9 (Fig. 1). Thus, these clinical measurements are considered substantial for detecting unstable hips.
Machine learning models have been introduced for DDH detection across multiple image modalities10,11,12,13. In general, physicians gain patient information from a medical interview and/or pathological review when making an image-based diagnosis, and such data have conventionally been obtained manually. Recently, many researchers have combined machine learning models on images with this patient information14,15,16,17,18,19, and the integrated models performed better than image-only models14,19,20. In another multi-modal approach, a few researchers combined series of radiographic images to detect fractures or anomalies with bimodal convolutional neural network (CNN) models21,22. In short, existing multi-modal models add manually prepared tabular data or an additional external image to a single image; technologies that integrate automatically generated clinical measurements derived from the images themselves remain scarce. We therefore hypothesized that an automatic algorithm for clinical measurements (aaCM) with high accuracy can enhance diagnostic performance.
Here, we developed a bimodal machine learning model that generates aaCM outputs and integrates them with radiographic images. The proposed model comprises two components: aaCM and a pure CNN. Regarding aaCM, traditional image processing has been applied to calculate fine-grained parameters of radiographic images pertaining to DDH, such as angles and distances23,24,25, while artificial intelligence and CNNs have also profoundly impacted image processing in the medical field26. Building on both, in this study a hybrid image-processing pipeline was developed for clinical measurements: a CNN was applied to detect the regions of interest (ROIs) in the radiographic images, and traditional image processing was then applied to detect bone contours, identify feature points, and generate the parameters.
This study was aimed at evaluating the diagnostic performance of the bimodal model on infantile hip images by integrating clinical measurements produced by aaCM.
Materials and methods
Model development
Datasets and ground truth
In this multicenter, retrospective study, we enrolled all infants at risk for DDH who visited the orthopedic department of our hospital and a related hospital between 2010 and 2020, based on the Japanese Pediatric Orthopaedic Association's guidelines27. Anterior–posterior X-rays were collected from infants aged 4–6 months. A total of 813 images were collected as Digital Imaging and Communications in Medicine (DICOM) data, with no data excluded. As a binary ground truth, three orthopedic surgeons with 22, 17, and 16 years of experience determined whether each infant had an unstable hip. An unstable hip was diagnosed if provocative examinations, such as the Barlow or Ortolani tests, were positive. Therapeutic intervention, such as brace treatment, was performed on infants with unstable hips. The three surgeons also established the ground truth for the four parameters in each image. Demographic data for the whole dataset showed the affected side (right or left) in infants with unstable hips (Table 1). Six-fold cross-validation stratified by the binary ground truth was adopted for the dataset28.
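The stratified six-fold split can be sketched as follows; this is a minimal illustration (the labels array and random seed are assumptions, not the authors' code), preserving the 232:581 class ratio in every fold.

```python
# Sketch of stratified six-fold cross-validation on the 813-image dataset.
# `labels` is a binary ground truth (1 = unstable hip); values are illustrative.
import numpy as np
from sklearn.model_selection import StratifiedKFold

labels = np.array([1] * 232 + [0] * 581)        # 232 unstable, 581 stable hips
images = np.arange(len(labels))                  # stand-ins for the 813 images

skf = StratifiedKFold(n_splits=6, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(images, labels)):
    pos_rate = labels[test_idx].mean()           # class balance preserved per fold
    print(f"fold {fold}: {len(test_idx)} test images, positive rate {pos_rate:.2f}")
```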
Framework of the proposed models
The settings were prepared as follows: Setting A included pure CNNs; Setting B included aaCM followed by models for tabular data; Setting C was the integrated model (Fig. 2). In Setting A, unstable hips were directly predicted from the X-ray images. In Setting B, ROIs were extracted from the X-ray images, and then four clinical parameters per hip were generated. This process yielded eight clinical measurements per image, which were then used to predict unstable hips using models for tabular data. In Setting C, the features extracted from Setting A were concatenated with the clinical measurements obtained from aaCM, and the combined data were fed into a fully connected model for prediction.
Overview of end-to-end models. The settings were prepared as follows: Setting A as pure CNNs, Setting B as aaCM followed by models for tabular data, and Setting C as the bimodal model. In Setting A, unstable hips were directly predicted from the X-ray images. In Setting B, the regions of interest (ROIs) were extracted from the X-ray images, and then four clinical parameters per hip (acetabular index, O-edge angle, Yamamuro A, Yamamuro B) were generated. Finally, unstable hips were predicted using models for tabular data. In Setting C, the features obtained from Setting A were concatenated with the automatically produced clinical measurements. The combined data were fed into a fully connected model for prediction. aaCM: automatic algorithms of clinical measurements; CNNs: convolutional neural networks on images.
aaCM part
Definition of bone ROIs and hip ROIs
As training data, the right and left sides of the acetabulum, proximal femur, and ischium were annotated and validated by the three surgeons. As these ROIs were considered bone ROIs (bROIs), each image had six bROIs. Using the three ipsilateral bROIs, a rectangular hip ROI (hROI) was built as follows: its height was defined as the distance between the upmost and bottommost points of the bROIs, and its width as the distance between the lateralmost and innermost points.
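The hROI construction described above amounts to taking the union bounding box of the three ipsilateral bROIs. A hypothetical sketch with made-up pixel coordinates (boxes assumed as (x_min, y_min, x_max, y_max) tuples):

```python
# Build one hip ROI (hROI) as the bounding rectangle spanning the
# upmost/bottommost and lateralmost/innermost extremes of three bROIs.
def hroi_from_brois(brois):
    """Union bounding box of a list of (x0, y0, x1, y1) boxes."""
    x0 = min(b[0] for b in brois)
    y0 = min(b[1] for b in brois)
    x1 = max(b[2] for b in brois)
    y1 = max(b[3] for b in brois)
    return (x0, y0, x1, y1)

# e.g. hypothetical acetabulum, proximal femur, and ischium boxes for one hip
hroi = hroi_from_brois([(40, 30, 120, 90), (60, 80, 140, 160), (50, 100, 110, 150)])
print(hroi)  # → (40, 30, 140, 160)
```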
Component 1: ROI detection
YOLOv5 was used to detect and extract six distinct bROIs from each image, represented as rectangular regions29, with the S, M, and L model variants trained on the input images. To ensure the detection of all six bROIs in each image, we applied post-processing steps to remove any overlapping detections, guaranteeing the presence of exactly six unique bROIs per image. As augmentation methods, in addition to the basic techniques provided by the library29, we flipped the images and bROIs horizontally to account for the right and left sides.
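The overlap-removal step might look like the following sketch (illustrative, not the authors' implementation): keep only the highest-confidence detection per bROI class, so that exactly one box per class, i.e. six unique bROIs, remains.

```python
# Post-processing sketch: deduplicate YOLO detections so each of the six
# bROI classes keeps only its highest-confidence box.
def keep_six_brois(detections):
    """detections: list of (class_id, confidence, box); class_id in 0..5."""
    best = {}
    for cls, conf, box in detections:
        if cls not in best or conf > best[cls][0]:
            best[cls] = (conf, box)
    return {cls: box for cls, (conf, box) in sorted(best.items())}

dets = [(0, 0.91, (10, 10, 50, 50)),
        (0, 0.62, (12, 11, 52, 49)),   # overlapping duplicate of class 0
        (1, 0.88, (60, 10, 100, 50))]
unique = keep_six_brois(dets)
print(unique)                           # one box per detected class
```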
Component 2: yielding clinical measurements
Binarization with adaptive thresholding was applied to each bROI to transform the bone area into a blob30, while γ correction was adopted to adjust the image contrast. The local threshold was calculated at every individual point of the image with sliding-window image processing31; the threshold value was based on the intensity of each pixel and its neighborhood. The blob itself was detected by labeling processing, and the contour of the blob was described by a convex hull. Then, the feature points based on the bone contour were detected as follows (Fig. 3):
- P1: the bottommost point in the ROI containing the acetabulum.
- P2: the point lateral to P1 along a straight line.
- P3: the lateralmost point in the ROI containing the ischium.
- P4: the upmost point in the ROIs.
- P5: the point nearest to the midpoint between the innermost points in the ROIs containing the femur.
Automatic algorithm of the clinical parameters. For each bROI extracted by YOLOv5, binarization was applied to transform the bone area into a blob. The local threshold was calculated at every individual point of the image. Then, the blob was detected by labeling processing. The contour of the blob, colored red, was described by a convex hull. Next, the feature points based on the contour were detected as follows: P1: the bottommost point in the ROI containing the acetabulum; P2: the point lateral to P1 along a straight line; P3: the lateralmost point in the ROI containing the ischium; P4: the upmost point in the ROIs; P5: the point nearest to the midpoint between the innermost points in the ROIs containing the femur. bROI: bone region of interest; YOLO: You Only Look Once.
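The local-threshold binarization described above can be sketched with a naive numpy-only implementation; a real pipeline would use an optimized routine (e.g. OpenCV's adaptiveThreshold) together with γ correction, and the window size and offset here are illustrative.

```python
# Sliding-window adaptive thresholding: each pixel is compared against the
# mean of its local neighborhood, turning the brighter bone area into a blob.
import numpy as np

def adaptive_binarize(img, window=3, offset=0):
    h, w = img.shape
    pad = window // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros_like(img, dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            local_mean = padded[y:y + window, x:x + window].mean()
            out[y, x] = 1 if img[y, x] > local_mean + offset else 0
    return out

img = np.array([[10, 10, 10], [10, 200, 10], [10, 10, 10]])  # one bright "bone" pixel
binary = adaptive_binarize(img)
print(binary)
```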
The radiographic hip parameters were measured using P1–P5. The output was defined as the clinical measurements acetabular index, O-edge angle, Yamamuro A, and Yamamuro B.
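A hedged sketch of turning feature points into an angle parameter. The exact point-to-parameter mapping follows the authors' definitions (Fig. 3); here we only illustrate the geometry with hypothetical coordinates, taking the angle at P1 between the line P1–P2 and the line P1–P4 as a stand-in for an angular parameter such as the acetabular index.

```python
# Compute the angle (in degrees) at a vertex between two rays, from
# hypothetical feature-point coordinates in pixel units.
import math

def angle_deg(vertex, a, b):
    """Angle at `vertex` between rays vertex->a and vertex->b, in degrees."""
    v1 = (a[0] - vertex[0], a[1] - vertex[1])
    v2 = (b[0] - vertex[0], b[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.acos(dot / (math.hypot(*v1) * math.hypot(*v2))))

p1, p2, p4 = (100, 200), (160, 200), (160, 160)   # made-up points
print(round(angle_deg(p1, p2, p4), 2))             # → 33.69
```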
Setting A: pure CNNs
As widely investigated models, the EfficientNet B0, B4, and B8 models were examined, with their initial parameters ported from ImageNet-pretrained models32,33,34,35. A sigmoid function was used for activation, and binary cross-entropy loss was used to train the network36. Each source image was converted into an 8-bit grayscale image, and the resulting pixel values were normalized using the mean and standard deviation of the dataset. Image augmentations, including flipping, Gaussian noise, blurring, Contrast-Limited Adaptive Histogram Equalization (CLAHE), and saturation adjustment, were performed. Furthermore, the gradient-weighted class activation mapping (Grad-CAM) technique was applied to an image in each fold37.
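The grayscale conversion and normalization step can be sketched as follows; the dataset statistics here are illustrative stand-ins, not the real cohort values.

```python
# Preprocessing sketch: 8-bit grayscale, then normalization with the
# dataset mean and standard deviation (hypothetical values).
import numpy as np

def preprocess(image, dataset_mean, dataset_std):
    gray = np.clip(image, 0, 255).astype(np.uint8)         # 8-bit grayscale
    return (gray.astype(np.float32) - dataset_mean) / dataset_std

x = np.array([[0, 128], [255, 64]], dtype=np.float32)
z = preprocess(x, dataset_mean=111.75, dataset_std=94.0)    # illustrative stats
print(float(z.mean()))   # mean-centering drives the sample mean to ~0
```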
Setting B: aaCM followed by models for tabular data
Light Gradient-Boosting Machine (LightGBM), support vector machine (SVM), random forest (RF), and logistic regression (LR) models were applied and compared. LightGBM was trained38 using the outputs of aaCM; the maximum depth was three, the early-stopping round was 50, and the boosting round limit was 10,000. For the SVM model, we used the radial basis function (RBF) kernel, and the optimal values for the γ and C parameters were determined through a grid search.
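The grid search for the SVM's C and γ might look like the following sketch; the parameter grid and the toy data standing in for the eight aaCM measurements are assumptions.

```python
# RBF-kernel SVM with grid-searched C and gamma over toy tabular data.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))                    # stand-in for 8 clinical measurements
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # stand-in for unstable-hip labels

grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}   # illustrative grid
search = GridSearchCV(SVC(kernel="rbf"), grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```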
To compare the feature importance of the affected and contralateral sides in infants with unstable hips, the SHapley Additive exPlanations (SHAP) methodology was applied39,40 to the 232 positive cases (Table 1). SHAP values individually represent feature-importance estimates and their direction39,40; features with a positive sign contribute to the prediction. The color of each point in the plot represents the relative magnitude of the feature value, while the position indicates the SHAP value; higher SHAP values signify a greater positive impact on the outcome. Thus, the top-ranked parameters have higher SHAP values.
Setting C: integrated model
The features obtained from the convolutional layers of the CNNs were concatenated with the clinical measurements and used as input for the fully connected layer. Rather than caching pre-generated features, this architecture enables training and inference in a single shot by dynamically combining the CNN features with the additional inputs in an online setting.
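A minimal numpy sketch of this fusion; the feature dimension, the eight-measurement vector, and the random weights are illustrative (the real model learns the fully connected layer jointly with the CNN).

```python
# Setting C fusion sketch: concatenate CNN features with the eight aaCM
# measurements, then apply a fully connected layer with a sigmoid output.
import numpy as np

rng = np.random.default_rng(0)
cnn_features = rng.normal(size=1280)       # e.g. pooled CNN feature vector
measurements = rng.normal(size=8)          # 4 parameters x 2 hips from aaCM

fused = np.concatenate([cnn_features, measurements])    # shape (1288,)
W = rng.normal(size=(1, fused.size)) * 0.01             # fully connected weights
b = np.zeros(1)
logit = (W @ fused)[0] + b[0]
p = 1.0 / (1.0 + np.exp(-logit))                        # predicted probability
print(fused.shape, round(float(p), 4))
```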
Evaluation metrics
Six-fold cross-validation has been used to reach a majority decision41,42 and was used in this study to evaluate the outcomes. The model output undergoes a logistic transformation to constrain the predicted probabilities (p) between 0 and 1. The model performance was evaluated by comparing p with the ground truth labels at different thresholds using the following metrics:

Accuracy = (TP + TN) / (TP + FP + FN + TN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Specificity = TN / (TN + FP)
F1 score = 2 × Precision × Recall / (Precision + Recall)

where TP, FP, FN, and TN are the abbreviations for true positive, false positive, false negative, and true negative, respectively.
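These metrics follow directly from the confusion counts; a minimal sketch with hypothetical counts:

```python
# Standard classification metrics from the four confusion-matrix counts.
def metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, specificity, f1

acc, prec, rec, spec, f1 = metrics(tp=30, fp=10, fn=8, tn=90)  # hypothetical counts
print(round(acc, 3), round(prec, 3), round(rec, 3), round(spec, 3), round(f1, 3))
```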
The reported values are based on the Youden Index43, which maximizes the sum of recall and specificity. The receiver operating characteristic (ROC) curve plots recall against the false positive rate, while the precision–recall (PR) curve plots precision against recall. The areas under these curves (AUC) summarize the overall performance of the model.
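Threshold selection via the Youden Index can be sketched as a sweep over candidate thresholds, keeping the one that maximizes recall + specificity − 1 (the toy labels and probabilities below are illustrative):

```python
# Youden Index threshold sweep over unique predicted probabilities.
import numpy as np

def youden_threshold(y_true, p):
    best_t, best_j = 0.5, -1.0
    for t in np.unique(p):
        pred = (p >= t).astype(int)
        tp = np.sum((pred == 1) & (y_true == 1))
        fn = np.sum((pred == 0) & (y_true == 1))
        tn = np.sum((pred == 0) & (y_true == 0))
        fp = np.sum((pred == 1) & (y_true == 0))
        j = tp / (tp + fn) + tn / (tn + fp) - 1   # Youden's J statistic
        if j > best_j:
            best_t, best_j = t, j
    return best_t

y = np.array([0, 0, 0, 1, 1, 1])
p = np.array([0.1, 0.3, 0.45, 0.6, 0.7, 0.9])
print(youden_threshold(y, p))   # → 0.6 (perfectly separates the toy classes)
```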
The mean average precision (mAP) was calculated to evaluate the performance of ROI detection44. Mean absolute errors between the clinical measurements from aaCM and the two orthopedic specialists were calculated.
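The mean absolute error computation is straightforward; a sketch with hypothetical angle readings in degrees:

```python
# MAE between aaCM estimates and reference (specialist) measurements.
import numpy as np

aacm = np.array([24.0, 31.5, 28.0])          # hypothetical estimated acetabular indices
specialist = np.array([25.0, 30.0, 27.5])    # hypothetical reference measurements
mae = float(np.mean(np.abs(aacm - specialist)))
print(mae)   # → 1.0
```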
Several evaluation metrics were analyzed to compare the settings: accuracy, average AUPRC, AUROC, and F1 score for the test data in each fold41. These values were compared using a paired t-test with Bonferroni correction for multiple comparisons42.
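The statistical comparison can be sketched as follows; the per-fold scores and the number of pairwise comparisons are illustrative assumptions.

```python
# Paired t-test across the six folds with a Bonferroni correction.
import numpy as np
from scipy import stats

setting_a = np.array([0.80, 0.81, 0.79, 0.82, 0.80, 0.81])   # e.g. fold accuracies
setting_c = np.array([0.83, 0.84, 0.82, 0.85, 0.83, 0.82])

t_stat, p_raw = stats.ttest_rel(setting_a, setting_c)
n_comparisons = 3                      # e.g. A vs C, B vs C, A vs B
p_adj = min(p_raw * n_comparisons, 1.0)   # Bonferroni-adjusted p value
print(round(float(p_adj), 4))
```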
Ethics approval and consent to participate
This multicenter and retrospective study was approved by the local ethics committee of Hokkaido University hospital (018-0397) and by a participating hospital, and informed consent was obtained from every person prior to their inclusion in this study. The guidelines of human ethics according to the Declaration of Helsinki were followed.
Results
Performance of end-to-end models for unstable hips
The performance was evaluated among Settings A–C using the CNNs and models for tabular data (Table 5). After performing six-fold cross-validation using EfficientNet B4, the average accuracy of the models for predicting unstable hips was 0.809 (± 0.010) and 0.822 (± 0.024) for Settings A and B, respectively (Fig. 4a–c). The average area under the PR curve (AUPRC) was 0.733 (± 0.043) and 0.706 (± 0.042) for Settings A and B, respectively. The average area under the ROC (AUROC) was 0.840 (± 0.029) and 0.789 (± 0.037) for Settings A and B, respectively. The average F1 score was 0.690 (± 0.027) and 0.710 (± 0.039) for Settings A and B, respectively. Setting C was trained on all images simultaneously, obtaining an average accuracy of 0.832 (± 0.018), average AUPRC of 0.804 (± 0.053), average AUROC of 0.885 (± 0.022), and average F1 score of 0.738 (± 0.025). Thus, Setting C offered significantly better performance than Settings A and B with P ≤ 0.05 for accuracy, AUPRC, AUROC, and F1 score (Fig. 4a–c).
Model performance. (a) Distribution of the accuracy, AUPRC, AUROC, and F1 score of the model. EfficientNet B4 models were trained during six-fold cross-validation per group. The 10th, 50th (median), and 90th quantiles, as well as the minimum and maximum, are shown. P values were calculated using a paired t-test with Bonferroni correction for multiple comparisons; *P < 0.05, **P < 0.01, ***P < 0.001 compared with Setting C. (b) Precision–recall and receiver operating characteristic curves of Settings A, B, and C. The mean of the six-fold cross-validation is shown. AUPRC: the area under the precision–recall curve; AUROC: the area under the receiver operating characteristic curve.
The diagnostic performance was evaluated using EfficientNet B0, B4, and B8; the average AUROC in Setting C was highest at 0.885 (± 0.022) with EfficientNet B4. This was significantly higher than the average AUROC in Setting A of 0.848 (± 0.027), obtained with EfficientNet B8 (P = 0.02).
ROI detection
The right and left sides of the acetabulum, proximal femur, and ischium were annotated. As these ROIs were considered bROIs, each image had six bROIs, which were extracted in all images using the YOLOv5 models. The mean value of mAP50 for the hROIs was 0.978 (± 0.011) in YOLOv5 S, 0.987 (± 0.004) in YOLOv5 M, and 0.990 (± 0.003) in YOLOv5 L (Table 4). Using the three ipsilateral bROIs, a rectangular hip ROI (hROI) was built.
Accuracy of the estimated parameters by aaCM
The mean absolute error of the estimated acetabular index was 1.76° on the right and 1.91° on the left, smaller than previously reported values13,24 (Table 2). Moreover, the estimated acetabular index produced by aaCM had a significantly smaller error than that produced by the orthopedic specialists (Table 3).
SHAP values in estimated parameters by aaCM
The feature importance scores of the individual parameters in the 232 positive cases (Table 1) were compared between the affected and contralateral sides (Fig. 4). Parameters with higher SHAP values signify a greater positive impact on the outcome and are ranked higher. Thus, the three parameters on the affected side (Yamamuro A, O-edge angle, and acetabular index) had higher positive impacts than those on the contralateral side. Furthermore, the Grad-CAM heatmaps also revealed that the weighted region was consistent with the affected side in the positive case (Fig. 5b).
Feature importance and Grad-CAM. (a) SHAP values of each parameter in the images of the infants with unstable hips. The position of each point indicates the SHAP value; higher SHAP values signify a greater positive impact on the outcome, and the top-ranked parameters have higher SHAP values. The color of each point represents the relative magnitude of the feature value. (b) Representative images of positive and negative cases in which the Grad-CAM heat map was integrated with the hROIs. SHAP: SHapley Additive exPlanations; hROIs: hip regions of interest; Contr: contralateral.
Discussion
We developed a bimodal model in which X-ray images were integrated with automatically produced clinical measurements. The bimodal model showed an average AUPRC and AUROC of 0.804 (± 0.060) and 0.885 (± 0.022), respectively, thus outperforming the pure CNNs; for Setting A, we obtained an average AUPRC and AUROC of 0.733 (± 0.050) and 0.840 (± 0.029), respectively. Furthermore, the accuracy of aaCM was equal to or higher than that of the orthopedic specialists. Feature importance scores of the parameters on the affected side were also ranked higher than those on the contralateral side (Tables 4, 5).
The diagnostic performance of the end-to-end model was enhanced without any manual preparation of clinical data by physicians, owing to an ad hoc adoption of clinical insights. According to previous studies, patient demographic information enhances the diagnostic performance of medical-image models in orthopedic and other areas14,19,45. Patient information can also improve the performance of artificial intelligence on various modalities of medical images, including computed tomography, magnetic resonance imaging, and pathological images. Generally, such patient demographic information is prepared manually. In another multi-modal technique, radiographic images in different views (anterior, posterior, lateral, and oblique) were combined and used as input data for CNNs to detect fractures or anomalies21,22. Yoon et al. developed a bimodal CNN by combining grayscale images and scalograms of electrocardiograms; this model outperformed single-CNN models19. Other researchers developed bimodal models by adding manual tabular data or another image to a single image. Conversely, our model automatically created additional data from a single image. The clinical measurements are based on expert knowledge; therefore, Setting C might not have been established if the models had been built only by technicians without domain knowledge about DDH. Ad hoc adoptions like our model could contribute toward performance reinforcement independently of versatile machine learning (ML) techniques.
The estimated parameters produced by aaCM had a smaller absolute error than those of the orthopedic specialists or previous reports13,24. YOLOv5 was applied to detect ROIs in the radiographic images; traditional image processing was then applied to detect bone contours, identify feature points, and generate the parameters with high accuracy. Thus, the combination of traditional image processing and artificial intelligence had sufficient performance to measure the clinical parameters. Artificial intelligence and CNNs have profoundly impacted image processing in medical fields26. The accuracy of aaCM was attributable to the reliable detection using YOLOv5, whose robustness on radiographic images was also shown by previous literature related to DDH46. Individual feature importance was also evaluated by SHAP39,40, revealing that the parameters on the affected side were weighted more heavily than those on the contralateral side. This result was consistent with the Grad-CAM heatmaps, and Setting B exhibited moderate diagnostic performance.
Clinically, unstable hips have been conventionally diagnosed by provocative maneuvers2,3,4,5, which are widely used around the world as screening techniques. However, the reliability of this test depends on the skill and experience of the examiner, with iatrogenic effects possible from repeated examinations47; thus, a standardized image analysis method is desired to address this issue. Automated analysis of images could decrease the number of provocative maneuvers or shorten the waiting period for infants to get the correct diagnosis.
This study has some limitations. First, in this study, we used radiographic images. While ultrasound inspection is a representative modality used to evaluate DDH in clinics, radiographs are also accepted for four- to six-month-old infants. Some reports state that radiographs are preferred for this age48,49. Second, the ground truth for unstable hips is provocative maneuvers, considered a conventional method. Although those maneuvers are widely used for screening, dynamic or static ultrasounds might also be preferred for defining unstable hips. Third, our dataset was relatively small compared to that of a previous study13 because we focused on six-month-old infants, a critical age group for DDH.
In conclusion, we presented the bimodal model on infantile hip images integrated with clinical measurements. We demonstrated that the automatic algorithm of clinical measurements had high accuracy and reinforced the diagnostic performance.
Data availability
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Code availability
The underlying code for this study is available from the corresponding author upon reasonable request.
References
Mureşan, S., Mărginean, M. O., Voidăzan, S., Vlasa, I. & Sîntean, I. Musculoskeletal ultrasound: a useful tool for diagnosis of hip developmental dysplasia: One single-center experience. Medicine 98, e14081. https://doi.org/10.1097/md.0000000000014081 (2019).
Cook, K. A. et al. Pavlik Harness initiation on Barlow positive hips: Can we wait?. J. Orthop. 16, 378–381. https://doi.org/10.1016/j.jor.2019.03.012 (2019).
Neal, D. et al. Comparison of Pavlik Harness treatment regimens for reduced but dislocatable (Barlow positive) hips in infantile DDH. J. Orthop. 16, 440–444. https://doi.org/10.1016/j.jor.2019.06.027 (2019).
Jackson, J. C., Runge, M. M. & Nye, N. S. Common questions about developmental dysplasia of the hip. Am. Fam. Phys. 90, 843–850 (2014).
Williams, N. Improving early detection of developmental dysplasia of the hip through general practitioner assessment and surveillance. Aust. J. Gener. Pract. 47, 619–623 (2018).
Agostiniani, R. et al. Recommendations for early diagnosis of Developmental Dysplasia of the Hip (DDH): Working group intersociety consensus document. Ital. J. Pediatr. 46, 150. https://doi.org/10.1186/s13052-020-00908-2 (2020).
Pollet, V. et al. Abduction treatment in stable hip dysplasia does not alter the acetabular growth: Results of a randomized clinical trial. Sci. Rep. 10, 9647. https://doi.org/10.1038/s41598-020-66634-1 (2020).
Narayanan, U. et al. Reliability of a new radiographic classification for developmental dysplasia of the hip. J. Pediatr. Orthop. 35, 478–484. https://doi.org/10.1097/bpo.0000000000000318 (2015).
Ohmori, T. et al. Radiographic prediction of the results of long-term treatment with the Pavlik harness for developmental dislocation of the hip. Acta Med. Okayama 63, 123–128. https://doi.org/10.18926/amo/31850 (2009).
Kinugasa, M. et al. Diagnosis of developmental dysplasia of the hip by ultrasound imaging using deep learning. J. Pediatr. Orthop. 43, e538–e544. https://doi.org/10.1097/bpo.0000000000002428 (2023).
Zhang, S. C. et al. Clinical application of artificial intelligence-assisted diagnosis using anteroposterior pelvic radiographs in children with developmental dysplasia of the hip. Bone Jt. J. 102-b, 1574–1581. https://doi.org/10.1302/0301-620x.102b11.Bjj-2020-0712.R2 (2020).
Fraiwan, M., Al-Kofahi, N., Ibnian, A. & Hanatleh, O. Detection of developmental dysplasia of the hip in X-ray images using deep transfer learning. BMC Med. Inform. Decis. Mak. 22, 216. https://doi.org/10.1186/s12911-022-01957-9 (2022).
Liu, C. et al. Misshapen pelvis landmark detection with local-global feature learning for diagnosing developmental dysplasia of the hip. IEEE Trans. Med. Imaging 39, 3944–3954. https://doi.org/10.1109/tmi.2020.3008382 (2020).
Kita, K. et al. Bimodal artificial intelligence using TabNet for differentiating spinal cord tumors-Integration of patient background information and images. iScience 26, 107900. https://doi.org/10.1016/j.isci.2023.107900 (2023).
Tiulpin, A. et al. Multimodal machine learning-based knee osteoarthritis progression prediction from plain radiographs and clinical data. Sci. Rep. 9, 20038. https://doi.org/10.1038/s41598-019-56527-3 (2019).
Joo, S. et al. Multimodal deep learning models for the prediction of pathologic response to neoadjuvant chemotherapy in breast cancer. Sci. Rep. 11, 18800. https://doi.org/10.1038/s41598-021-98408-8 (2021).
Han, X. et al. The value of longitudinal clinical data and paired CT scans in predicting the deterioration of COVID-19 revealed by an artificial intelligence system. iScience 25, 104227. https://doi.org/10.1016/j.isci.2022.104227 (2022).
Bozzo, A., Tsui, J. M. G., Bhatnagar, S. & Forsberg, J. Deep learning and multimodal artificial intelligence in orthopaedic surgery. J. Am. Acad. Orthop. Surg. 32, e523–e532. https://doi.org/10.5435/jaaos-d-23-00831 (2024).
Yoon, T. & Kang, D. Bimodal CNN for cardiovascular disease classification by co-training ECG grayscale images and scalograms. Sci. Rep. 13, 2937. https://doi.org/10.1038/s41598-023-30208-8 (2023).
Zheng, X. et al. Deep learning radiomics can predict axillary lymph node status in early-stage breast cancer. Nat. Commun. 11, 1236. https://doi.org/10.1038/s41467-020-15027-z (2020).
Choi, J. W. et al. Using a dual-input convolutional neural network for automated detection of pediatric supracondylar fracture on conventional radiography. Invest. Radiol. 55, 101–110. https://doi.org/10.1097/rli.0000000000000615 (2020).
Rayan, J. C., Reddy, N., Kan, J. H., Zhang, W. & Annapragada, A. Binomial classification of pediatric elbow fractures using a deep learning multiview approach emulating radiologist decision making. Radiol. Artif. Intell. 1, e180015. https://doi.org/10.1148/ryai.2019180015 (2019).
Yang, G. et al. A semi-automatic diagnosis of hip dysplasia on X-ray films. Front. Mol. Biosci. 7, 613878. https://doi.org/10.3389/fmolb.2020.613878 (2020).
Al-Bashir, A. K., Al-Abed, M., Abu Sharkh, F. M., Kordeya, M. N. & Rousan, F. M. Algorithm for automatic angles measurement and screening for Developmental Dysplasia of the Hip (DDH). Annu. Int. IEEE Eng. Med. Biol. Soc. 2015, 6386–6389. https://doi.org/10.1109/embc.2015.7319854 (2015).
Li, Q. et al. Auxiliary diagnosis of developmental dysplasia of the hip by automated detection of Sharp’s angle on standardized anteroposterior pelvic radiographs. Medicine 98, e18500. https://doi.org/10.1097/md.0000000000018500 (2019).
Pinto-Coelho, L. How artificial intelligence is shaping medical imaging technology: A survey of innovations and applications. Bioengineering https://doi.org/10.3390/bioengineering10121435 (2023).
Shimizu, T. et al. Validation of parameters recommended for secondary screening for developmental dysplasia of the hip in Japan. J. Orthop. Sci. https://doi.org/10.1016/j.jos.2023.06.013 (2023).
Jung, Y. & Hu, J. A K-fold averaging cross-validation procedure. J. Nonparametric Stat. 27, 167–179. https://doi.org/10.1080/10485252.2015.1010532 (2015).
Zhu, X., Lyu, S., Wang, X. & Zhao, Q. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops 2778–2788 (2021).
Korzynska, A. et al. Validation of various adaptive threshold methods of segmentation applied to follicular lymphoma digital images stained with 3,3’-Diaminobenzidine&Haematoxylin. Diagn. Pathol. 8, 48. https://doi.org/10.1186/1746-1596-8-48 (2013).
Sezgin, M. & Sankur, B. Survey over image thresholding techniques and quantitative performance evaluation. J. Electron. Imaging 13, 146–165. https://doi.org/10.1117/1.1631315 (2004).
Tan, M. & Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning 6105–6114 (PMLR, 2019).
Marques, G., Ferreras, A. & de la Torre-Diez, I. An ensemble-based approach for automated medical diagnosis of malaria using EfficientNet. Multimed. Tools Appl. 81, 28061–28078. https://doi.org/10.1007/s11042-022-12624-6 (2022).
Chen, X. et al. Application of EfficientNet-B0 and GRU-based deep learning on classifying the colposcopy diagnosis of precancerous cervical lesions. Cancer Med. 12, 8690–8699. https://doi.org/10.1002/cam4.5581 (2023).
Sharma, N. et al. EfficientNetB0 cum FPN based semantic segmentation of gastrointestinal tract organs in MRI scans. Diagnostics https://doi.org/10.3390/diagnostics13142399 (2023).
Tan, M. & Le, Q. V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv:1905.11946 (2019). https://ui.adsabs.harvard.edu/abs/2019arXiv190511946T
Selvaraju, R. R. et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In 2017 IEEE International Conference on Computer Vision (ICCV) 618–626 (2017).
Ke, G. et al. LightGBM: A highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems 3149–3157 (Curran Associates Inc., Long Beach, California, USA, 2017).
Rodríguez-Pérez, R. & Bajorath, J. Interpretation of machine learning models using shapley values: application to compound potency and multi-target activity predictions. J. Comput. Aided Mol. Des. 34, 1013–1026. https://doi.org/10.1007/s10822-020-00314-0 (2020).
Shapley, L. S. A value for n-person games. In Contributions to the Theory of Games II (Princeton University Press, New Jersey, 1953).
Foersch, S. et al. Multistain deep learning for prediction of prognosis and therapy response in colorectal cancer. Nat. Med. 29, 430–439. https://doi.org/10.1038/s41591-022-02134-1 (2023).
Moncada-Torres, A., van Maaren, M. C., Hendriks, M. P., Siesling, S. & Geleijnse, G. Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival. Sci. Rep. 11, 6968. https://doi.org/10.1038/s41598-021-86327-7 (2021).
Schisterman, E. F., Faraggi, D., Reiser, B. & Hu, J. Youden Index and the optimal threshold for markers with mass at zero. Stat. Med. 27, 297–315. https://doi.org/10.1002/sim.2993 (2008).
Lin, T.-Y. et al. Microsoft COCO: Common objects in context. In Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part V 740–755 (Springer, 2014).
Vollmer, A. et al. Multimodal artificial intelligence-based pathogenomics improves survival prediction in oral squamous cell carcinoma. Sci. Rep. 14, 5687. https://doi.org/10.1038/s41598-024-56172-5 (2024).
Den, H., Ito, J. & Kokaze, A. Diagnostic accuracy of a deep learning model using YOLOv5 for detecting developmental dysplasia of the hip on radiography images. Sci. Rep. 13, 6693. https://doi.org/10.1038/s41598-023-33860-2 (2023).
Sewell, M. D. & Eastwood, D. M. Screening and treatment in developmental dysplasia of the hip-where do we go from here?. Int. Orthop. 35, 1359–1367. https://doi.org/10.1007/s00264-011-1257-z (2011).
Schaeffer, E., Lubicky, J. & Mulpuri, K. AAOS appropriate use criteria: The management of developmental dysplasia of the hip in infants up to 6 months of age: Intended for use by general pediatricians and referring physicians. J. Am. Acad. Orthop. Surg. 27, e364–e368. https://doi.org/10.5435/jaaos-d-18-00500 (2019).
Shaw, B. A. & Segal, L. S. Evaluation and referral for developmental dysplasia of the hip in infants. Pediatrics https://doi.org/10.1542/peds.2016-3107 (2016).
Author information
Authors and Affiliations
Contributions
H.S. designed the study and collected the data under the supervision of D.T. The models were developed by K.E. and H.K. D.T., T.S., and S.S. defined the positive cases. K.S. and T.O. measured the clinical parameters. H.S. wrote the manuscript, and K.E. illustrated the figures. N.I. and S.T. directed the entire study.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shimizu, H., Enda, K., Koyano, H. et al. Bimodal machine learning model for unstable hips in infants: integration of radiographic images with automatically-generated clinical measurements. Sci Rep 14, 17826 (2024). https://doi.org/10.1038/s41598-024-68484-7