Table 1 Demographic, clinical, and recording characteristics of participants across Dataset-1 to Dataset-5

From: FedOcw: optimized federated learning for cross-lingual speech-based Parkinson’s disease detection

| Dataset (Language) | Gender (Male/Female) | Age Range, years (Mean ± SD) | Disease Severity (Mean ± SD) | Recording Conditions | Speech Tasks |
| --- | --- | --- | --- | --- | --- |
| Dataset-1 (Spanish) [40] | PD: 25/25; HC: 25/25 | PD: 33–81 (61 ± 9.4); HC: 31–86 (61 ± 9.5) | UPDRS speech score: 6–93 (37.7 ± 18.3) | Soundproof booth; 44.1 kHz, 16-bit; professional recording setup | Sustained vowels, isolated words, sentence reading, spontaneous speech |
| Dataset-2 (Italian) [41] | PD: 19/9; HC: 23/14 | PD: 40–80 (67.2 ± 8.7); HC: 19–77 (48.3 ± 23.4) | UPDRS II speech score: 0–4 (1.1 ± 1.2) | Echo-free environment; 15–25 cm mic distance | Text and phrase reading, syllable repetition (/pa/, /ta/), sustained vowels |
| Dataset-3 (Chinese) [37] | PD: 16/14; HC: 7/8 | PD: 36–86 (60 ± 13.6); HC: 23–72 (51.9 ± 14.1) | Hoehn and Yahr: 1–5 (2.5 ± 0.8) | Smartphone; 10 cm from mouth | Sustained vowels (/a/, /e/), short sentence reading |
| Dataset-4 (Czech) [43] | PD: 10/12; HC: 11/11 | PD: 48–82 (64.4 ± 9.6); HC: 41–79 (63.6 ± 10.0) | UPDRS III: 6–34 (15.9 ± 7.6) | Headset mic; 5 cm distance; 48 kHz, 16-bit | Sustained vowels (/A/, /I/) |
| Dataset-5 (English) [44] | PD: 9/7; HC: 19/2 | – | UPDRS II Part 5: 0–3 (0.8 ± 0.9) | Smartphone (Moto G4); 44.1 kHz, 16-bit | Not specified in full; smartphone-based voice tasks |

  1. “–” denotes unavailable data.