Table 4 Two-sample t-test results comparing AIPatient and H-SP

From: Simulated patient systems powered by large language model-based AI agents offer potential for transforming medical education

| Metric | Question | t-statistic |
| --- | --- | --- |
| Fidelity | | |
| Role/Text Adherence | The SP followed the case script without contradictions. | 0.57 |
| | The SP’s responses matched the intended medical condition. | 1.77* |
| Contextual Appropriateness | The SP’s responses felt natural and relevant to my questions. | 1.10 |
| Emotional Realism | The SP displayed believable emotions (e.g., pain, anxiety). | 3.02*** |
| Coherence/Consistency | The SP’s dialogue was coherent (no abrupt shifts). | 1.23 |
| Response Quality | The SP’s answers were directly relevant to clinical questions. | 0.17 |
| Usability | | |
| Ease of Use | Interacting with this SP required minimal effort. | 1.62 |
| | I encountered no technical difficulties (e.g., delays). | 2.68*** |
| Feasibility/Scalability | This SP could be easily integrated into our training program. | 0.47 |
| Effectiveness | | |
| Diagnostic Accuracy | I have reached a preliminary diagnosis at the end. | 1.59 |
| Learner Satisfaction | This session improved my clinical reasoning skills. | 2.19** |

*p < 0.1, **p < 0.05, ***p < 0.01; reported p values are from two-sided tests.
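
As a rough illustration of how a two-sample t-statistic and the significance markers in the footnote relate, the sketch below computes a t-statistic for two groups of ratings and maps a two-sided p value to the star notation. It assumes Welch's unequal-variance form, which the table does not specify, and the ratings are invented for illustration and are not data from the study. The function names `welch_t` and `stars` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t-statistic without assuming equal variances (Welch).

    Assumption: the table does not state which t-test variant was used.
    """
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def stars(p):
    """Map a two-sided p value to the markers defined in the table footnote."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# Hypothetical 5-point Likert ratings, invented purely for illustration.
group_a = [4, 5, 4, 5, 4, 5]
group_b = [3, 4, 3, 4, 3, 4]

t = welch_t(group_a, group_b)
print(f"t = {t:.2f}")
print(f"p = 0.03 is marked '{stars(0.03)}'")
```

Turning the t-statistic into a p value requires the t distribution's cumulative distribution function, which a statistics library would normally supply; that step is left out here to keep the sketch dependency-free.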