Table 13 Statistical analysis of model performance across RGB, NIR, and multimodal variants.

From: TomatoRipen-MMT: transformer-based RGB and NIR spectral fusion for tomato maturity grading

| Model / Variant | Modality | Accuracy Mean (%) | Accuracy SD | Accuracy 95% CI | mIoU Mean (%) | mIoU SD | mIoU 95% CI | Statistical Test Applied | p-value | Effect Size | Significance |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet-50 | RGB | 78.4 | 2.1 | [77.6, 79.8] | 58.2 | 2.4 | [57.1, 59.3] | ANOVA vs group | < 0.001 | η² = 0.77 | Yes |
| U-Net (RGB) | RGB | | | | 67.3 | 1.8 | [66.6, 68.2] | ANOVA | < 0.001 | η² = 0.77 | Yes |
| ViT-B/16 | RGB | 81.3 | 2.0 | [80.2, 82.1] | 69.2 | 1.6 | [68.5, 69.9] | ANOVA | < 0.001 | η² = 0.77 | Yes |
| Swin Transformer (RGB) | RGB | 82.1 | 1.9 | [81.3, 83.2] | 71.4 | 1.7 | [70.5, 72.1] | Tukey vs ResNet | 0.0003 | d = 1.31 | Yes |
| NIR-MLP | NIR | 81.0 | 2.3 | [80.1, 82.4] | | | | t-test vs Swin-NIR | < 0.001 | d = 0.98 | Yes |
| NIR U-Net | NIR | | | | 68.5 | 1.8 | [67.8, 69.3] | ANOVA | < 0.001 | η² = 0.77 | Yes |
| Swin-NIR | NIR | 87.6 | 2.0 | [86.8, 88.7] | 68.5 | 1.7 | [67.8, 69.1] | t-test vs NIR-MLP | < 0.001 | d = 0.98 | Yes |
| Early Fusion (A3) | RGB + NIR | 89.0 | 1.6 | [88.3, 89.6] | 74.2 | 1.5 | [73.6, 75.0] | ANOVA | < 0.01 | η² = 0.71 | Yes |
| Late Fusion (A4) | RGB + NIR | 90.2 | 1.5 | [89.5, 90.9] | 78.1 | 1.4 | [77.4, 78.7] | Tukey vs Early Fusion | 0.009 | d = 0.72 | Yes |
| Cross-Attention (A5) | RGB + NIR | 92.5 | 1.4 | [91.9, 93.1] | 80.4 | 1.3 | [79.7, 80.9] | RM-ANOVA | 0.002 | η² = 0.71 | Yes |
| TomatoRipen-MMT (Proposed, A6) | RGB + NIR | 94.8 | 1.2 | [94.3, 95.4] | 82.6 | 1.1 | [82.1, 83.1] | t-test vs best baseline | < 0.0001 | d = 2.14 | Highly Significant |
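
The Mean, SD, 95% CI, and Cohen's d entries in Table 13 are summary statistics over repeated runs. As a rough illustration of how such values are commonly derived, the sketch below computes them from per-run scores. It is not the authors' procedure: the table does not state the number of runs or how the intervals were built, so the normal-approximation interval, the pooled-SD form of Cohen's d, the helper names `summarize` and `cohens_d`, and the per-run scores are all assumptions made for this example.

```python
import math
import statistics


def summarize(scores, confidence=0.95):
    """Return (mean, sample SD, CI) for per-run scores.

    Assumption: the interval uses a normal approximation; a t-based
    interval would be wider when the number of runs is small.
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 denominator)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sd / math.sqrt(n)
    return mean, sd, (mean - half_width, mean + half_width)


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample SD."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var)


# Hypothetical per-run accuracies (%), NOT taken from the paper.
proposed_runs = [94.1, 95.6, 93.9, 95.2, 94.7]
baseline_runs = [92.0, 93.4, 91.8, 93.1, 92.6]

mean, sd, (lo, hi) = summarize(proposed_runs)
d = cohens_d(proposed_runs, baseline_runs)
print(f"mean={mean:.1f}  SD={sd:.1f}  95% CI=[{lo:.1f}, {hi:.1f}]  d={d:.2f}")
```

With only a handful of runs, replacing the normal quantile with a Student's t quantile gives a more conservative interval. That change needs a t-distribution implementation, which the Python standard library does not provide.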