Table 3 Distribution of number of evaluation samples and human evaluators in each healthcare application

From: A framework for human evaluation of large language models in healthcare derived from literature review

Healthcare application                       Number of human evaluators    Number of evaluation samples
                                             Median   Mean   S.D.          Median   Mean   S.D.
Clinical decision support                    2        4      3             22       89     155
Medical examination and medical education    3        7      14            50       97     125
Patient education                            4        8      13            48       89     113
Patient-provider question answering          3        12     24            50       76     71
Others                                       5        11     21            39       270    820
Clinical trials                              8        8      -             7        7      -