Table 2 Preferences over coaching messages generated by LLMs versus human expert-crafted messages, stratified by stages of change
From: Fine-tuning LLMs in behavioral psychology for scalable health coaching
Stage of change | N | Prefer LLM (N, %) | Prefer expert human (N, %) | Chi-Square P-value |
---|---|---|---|---|
LLM vs Expert | | | | |
Precontemplation | 37 | 27, 73.0% | 10, 27.0% | 0.005 |
Contemplation | 89 | 64, 71.9% | 25, 28.1% | <0.001 |
Preparation | 112 | 73, 65.2% | 39, 34.8% | 0.001 |
Action | 76 | 61, 80.3% | 15, 19.7% | <0.001 |
Maintenance | 318 | 205, 64.5% | 113, 35.5% | <0.001 |
Total | 632 | 430, 68.0% | 202, 32.0% | <0.001 |
Comparison | N | Prefer General LLM Message (N, %) | Prefer General Coaching Message (N, %) | Chi-Square P-value |
Generic Message | 632 | 540, 85.4% | 92, 14.6% | <0.001 |
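
The table does not say which chi-square variant produced the P-values. As a minimal sketch, the code below assumes a goodness-of-fit test of the two preference counts against an equal (50/50) split with one degree of freedom. The function name `chi_square_equal_split` and the `counts` dictionary are illustrative and do not come from the paper. The counts are copied from the rows above.

```python
import math

# Counts copied from the table: (prefer LLM, prefer expert human).
counts = {
    "Precontemplation": (27, 10),
    "Contemplation": (64, 25),
    "Preparation": (73, 39),
    "Action": (61, 15),
    "Maintenance": (205, 113),
    "Total": (430, 202),
}


def chi_square_equal_split(a, b):
    """Goodness-of-fit test of two counts against a 50/50 null (assumed here)."""
    expected = (a + b) / 2
    stat = ((a - expected) ** 2 + (b - expected) ** 2) / expected
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


for stage, (llm, expert) in counts.items():
    n = llm + expert
    stat, p = chi_square_equal_split(llm, expert)
    print(f"{stage}: N={n}, LLM share={llm / n:.1%}, chi2={stat:.2f}, p={p:.4f}")
```

Under this assumption, the Precontemplation row gives a statistic of about 7.81 and a P-value of about 0.005, which agrees with the reported value. The Preparation row gives about 10.32 and 0.001. The agreement suggests, but does not confirm, that the reported P-values come from this kind of test.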