Table 2 Summary of performance metrics of predictive models for Hologic BMD.

From: Automated bone mineral density prediction and fracture risk assessment using plain radiographs via deep learning

| Patient strata | Number of ROIs | Predicted vs. measured mean BMD (SD, g/cm²); p** | Correlation coefficient | Linear regression R², RMSE | Calibration slope, CITL | Bland-Altman bias (g/cm²; SD) |
|---|---|---|---|---|---|---|
| The hip testing set (Hologic) | | | | | | |
| Overall | 5164 | 0.692 (0.144) vs. 0.689 (0.156); p < 0.001 | 0.92 | 0.84, 0.062 | 0.982, −0.003 | −0.003 (0.062) |
| Female | 3997 | 0.668 (0.137) vs. 0.661 (0.144); p < 0.001 | 0.91 | 0.83, 0.056 | 0.961, −0.007 | −0.007 (0.059) |
| Male | 1167 | 0.774 (0.137) vs. 0.782 (0.151); p < 0.001 | 0.89 | 0.80, 0.062 | 0.985, 0.008 | 0.008 (0.068) |
| 40–59 years | 712 | 0.796 (0.132) vs. 0.790 (0.140); p = 0.010 | 0.91 | 0.82, 0.056 | 0.987, −0.006 | −0.006 (0.061) |
| 60–74 years | 1817 | 0.722 (0.132) vs. 0.717 (0.141); p < 0.001 | 0.90 | 0.82, 0.057 | 0.963, −0.005 | −0.005 (0.061) |
| 75–90 years | 2635 | 0.644 (0.135) vs. 0.642 (0.148); p = 0.175 | 0.91 | 0.82, 0.057 | 0.998, −0.002 | −0.002 (0.062) |
| The spine testing set (Hologic)* | | | | | | |
| Overall | 57,662 | 0.837 (0.172) vs. 0.839 (0.186); p < 0.001 | 0.90 | 0.81, 0.081 | 0.978, 0.003 | 0.003 (0.081) |
| Female | 46,349 | 0.813 (0.162) vs. 0.813 (0.176); p < 0.001 | 0.89 | 0.80, 0.079 | 0.969, 0.000 | 0.000 (0.079) |
| Male | 11,313 | 0.931 (0.177) vs. 0.945 (0.191); p = 0.94 | 0.89 | 0.79, 0.088 | 0.958, 0.014 | 0.014 (0.088) |
| 40–59 years | 14,501 | 0.912 (0.160) vs. 0.909 (0.174); p < 0.001 | 0.90 | 0.80, 0.077 | 0.973, −0.003 | −0.003 (0.081) |
| 60–74 years | 26,935 | 0.827 (0.166) vs. 0.828 (0.181); p = 0.003 | 0.90 | 0.80, 0.079 | 0.978, 0.001 | 0.001 (0.079) |
| 75–90 years | 16,226 | 0.784 (0.166) vs. 0.795 (0.188); p < 0.001 | 0.88 | 0.79, 0.086 | 1.004, 0.011 | 0.011 (0.086) |
| L1 | 12,731 | 0.742 (0.150) vs. 0.747 (0.167); p < 0.001 | 0.87 | 0.76, 0.081 | 0.968, 0.006 | 0.006 (0.082) |
| L2 | 15,809 | 0.816 (0.160) vs. 0.821 (0.177); p < 0.001 | 0.90 | 0.81, 0.078 | 0.993, 0.006 | 0.006 (0.078) |
| L3 | 15,679 | 0.873 (0.163) vs. 0.873 (0.181); p = 0.22 | 0.90 | 0.80, 0.080 | 0.993, 0.001 | 0.001 (0.080) |
| L4 | 13,443 | 0.909 (0.167) vs. 0.907 (0.182); p = 0.01 | 0.88 | 0.78, 0.084 | 0.966, −0.002 | −0.002 (0.085) |

  1. *Calculated per eligible vertebra.
  2. **Means were compared using Student's t-test and medians were compared using the Wilcoxon rank-sum test. Two-sided p values are reported.
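
For readers who want to reproduce this kind of summary on their own data, the following is a minimal sketch of how the agreement metrics in Table 2 are commonly computed from paired predicted and measured BMD values. It is not the authors' code. The function name `agreement_metrics`, the choice to regress measured on predicted BMD, and the RMSE denominator (n rather than n − 2) are assumptions made for illustration. The sign convention for CITL and Bland-Altman bias (measured minus predicted) is chosen to match the reported values, for example 0.689 − 0.692 = −0.003 for the overall hip testing set.

```python
import math
from statistics import mean, stdev


def agreement_metrics(predicted, measured):
    """Agreement metrics between predicted and measured BMD (g/cm^2).

    Hypothetical helper for illustration; conventions are assumptions:
    measured is regressed on predicted, and RMSE divides by n.
    """
    if len(predicted) != len(measured) or len(predicted) < 3:
        raise ValueError("need two equal-length sequences with at least 3 values")

    mp, mm = mean(predicted), mean(measured)
    sxx = sum((p - mp) ** 2 for p in predicted)
    syy = sum((m - mm) ** 2 for m in measured)
    sxy = sum((p - mp) * (m - mm) for p, m in zip(predicted, measured))

    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    slope = sxy / sxx                # calibration slope (measured ~ predicted)
    intercept = mm - slope * mp

    # Residuals of the fitted line; RMSE here uses n in the denominator.
    residuals = [m - (intercept + slope * p) for p, m in zip(predicted, measured)]
    rmse = math.sqrt(sum(e ** 2 for e in residuals) / len(residuals))

    # Bland-Altman: paired differences, measured minus predicted.
    diffs = [m - p for p, m in zip(predicted, measured)]

    return {
        "r": r,
        "r2": r ** 2,                # equals R^2 for simple linear regression
        "rmse": rmse,
        "calibration_slope": slope,
        "citl": mm - mp,             # calibration-in-the-large
        "bias": mean(diffs),         # Bland-Altman bias
        "bias_sd": stdev(diffs),     # SD of the paired differences
    }


# Usage with made-up values (not data from the study).
example = agreement_metrics([0.60, 0.70, 0.80, 0.90], [0.58, 0.71, 0.79, 0.93])
print(example)
```

Under these conventions, CITL and the Bland-Altman bias are the same quantity, which is consistent with the matching values in the last two columns of the table.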