Table 4. Distribution of annotated relations across all datasets. n is the number of documents; each cell gives the number of annotated instances of the relation in that split.

From: Improving social determinants of health documentation in French electronic health records using large language models

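| Relations | MUSCADET-InHouse (n = 1700), Train | MUSCADET-InHouse (n = 1700), Dev | MUSCADET-InHouse (n = 1700), Test | MUSCADET-Synthetic (n = 340), Test | UW-FrenchSDOH (n = 364), Test | InHouse tuberculosis and ALS (n = 400), Test |
|---|---|---|---|---|---|---|
| Status | 1206 | 175 | 314 | 514 | 646 | 117 |
| Amount | 1182 | 155 | 301 | 253 | 148 | 82 |
| Duration | 51 | 8 | 18 | 12 | 33 | 10 |
| Frequency | 489 | 68 | 124 | 124 | 135 | 35 |
| History | 262 | 30 | 73 | 44 | 55 | 13 |
| Type | 1096 | 146 | 289 | 261 | 154 | 96 |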