Table 4 Convergence (Cohen’s Weighted Kappa) with reference truth for ASCO-SNO-ASTRO Guideline

	ASCO-SNO-ASTRO guideline
	SoR evaluation			QoE evaluation
	κ-value	95% CI	p-value	κ-value	95% CI	p-value
Nrs_a1:	0.060	−0.267 to 0.387	0.710	0.286	0.049 to 0.583	0.037
Nrs_a2:	−0.118	−0.421 to 0.184	0.353	−0.054	−0.554 to 0.446	0.756
Nrs_r1:	−0.321	−0.728 to −0.086	0.040	0.071	−0.073 to 0.215	0.443
Nrs_r2:	0.428	0.042 to 0.813	0.050	0.644	0.354 to 0.934	0.004
Rad_a1:	−0.118	−0.293 to −0.057	0.353	0.085	−0.030 to 0.199	0.279
Rad_a2:	0.025	−0.239 to 0.289	0.853	0.106	−0.131 to 0.342	0.429
Rad_r1:	0.042	−0.043 to 0.127	0.488	−0.010	−0.186 to 0.167	0.913
Rad_r2:	0.066	−0.220 to 0.352	0.634	0.133	−0.047 to 0.314	0.242
GPT-4o	0.291	0.098 to 0.484	0.026	0.117	−0.023 to 0.256	0.188
Gemini	0.060	−0.276 to 0.396	0.710	0.286	0.029 to 0.542	0.037
Copilot	−0.090	−0.353 to 0.173	0.506	−0.022	−0.213 to 0.186	0.835
Deepseek	0.428	0.041 to 0.814	0.069	0.264	−0.153 to 0.681	0.187

The values shown in bold in the tables represent the highest values achieved by participants in the relevant section.
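
For readers unfamiliar with the statistic in Table 4, the following minimal sketch shows how a Cohen's weighted kappa between one participant's ratings and the reference ratings can be computed. It is an illustration under stated assumptions, not the analysis pipeline used in the study: the table does not state whether linear or quadratic weights were applied, so both are offered; the function name `weighted_kappa`, the category labels, and the sample ratings are hypothetical; and the confidence intervals and p-values reported in the table are not reproduced here.

```python
from collections import Counter


def weighted_kappa(rater, reference, categories, weights="linear"):
    """Cohen's weighted kappa for two equal-length lists of ordinal ratings.

    `categories` lists the ordinal levels from lowest to highest.
    `weights` is "linear" or "quadratic" (an assumption; the weighting
    scheme behind Table 4 is not stated in the table).
    """
    if len(rater) != len(reference) or not rater:
        raise ValueError("ratings must be non-empty and of equal length")
    if len(categories) < 2:
        raise ValueError("at least two categories are required")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")

    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(rater)

    # Joint counts of (rater, reference) pairs and the two marginal counts.
    joint = Counter((index[a], index[b]) for a, b in zip(rater, reference))
    row = Counter(index[a] for a in rater)
    col = Counter(index[b] for b in reference)

    def disagreement(i, j):
        # Distance between two levels, scaled to [0, 1].
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    # Weighted disagreement actually observed.
    observed = sum(disagreement(i, j) * c / n for (i, j), c in joint.items())
    # Weighted disagreement expected by chance from the marginals.
    expected = sum(
        disagreement(i, j) * row[i] * col[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - observed / expected


# Hypothetical ratings for illustration only (not data from the study).
levels = ["weak", "moderate", "strong"]
participant = ["weak", "moderate", "strong", "strong", "weak", "moderate"]
guideline = ["weak", "moderate", "strong", "moderate", "weak", "strong"]
kappa_linear = weighted_kappa(participant, guideline, levels, "linear")
kappa_quadratic = weighted_kappa(participant, guideline, levels, "quadratic")
```

A value of 1 indicates perfect agreement with the reference, 0 indicates agreement no better than chance, and negative values (as in several rows of Table 4) indicate agreement worse than chance.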
