Table 1 Overview of validation studies classifying five stages

From: Evaluating the performance of wearable EEG sleep monitoring devices: a meta-analysis approach

Study

Device

# of el.a

El. position

El. type

# of part.

Age

Part.

Env.b

Nights

Ref.

Epochs

Device scoringc

 

OA (mi)

OA (ma)

Wake

N1

N2

N3

REM

Forehead

Headband

Li et al. (2025)27

WPSG-I27

6

Fp1, Fp2, M1, M2, Chin 1 (EMG), Chin 2 (EMG)

Dry

20 (M: 13, F: 7)

56.2 ± 9.5

Healthy: 6, PD: 8, ALS: 5, narcolepsy: 1

Controlled

1

PSG scored by 2 experts according to AASM

26,572

WPSG-I proprietary algorithm

ACC

0.89

0.96

0.96

0.93

0.93

0.98

0.98

κ

0.80

0.71

0.91

0.32

0.79

0.82

0.70

SE

0.89

0.74

0.97

0.34

0.83

0.93

0.63

SP

0.97

0.97

0.93

0.97

0.96

0.98

0.99

PPV

0.89

0.75

0.96

0.36

0.85

0.75

0.81

NPV

0.97

0.97

0.95

0.96

0.95

1.00

0.98

F1

0.89

0.74

0.97

0.35

0.84

0.83

0.71

MCC

0.80

0.71

0.91

0.32

0.79

0.82

0.70

Manual scoring

ACC

0.95

0.98

0.98

0.96

0.97

0.99

0.99

κ

0.90

0.86

0.95

0.65

0.91

0.91

0.89

SE

0.95

0.89

0.98

0.70

0.91

0.98

0.89

SP

0.99

0.98

0.97

0.98

0.99

0.99

1.00

PPV

0.95

0.86

0.98

0.63

0.95

0.86

0.89

NPV

0.99

0.98

0.97

0.98

0.97

1.00

1.00

F1

0.95

0.88

0.98

0.66

0.93

0.91

0.89

MCC

0.90

0.86

0.95

0.65

0.91

0.91

0.89

Ravindran et al. (2025)28

Dreem69

5

F7, F8, O1, O2, Fpz

Dry

62 (M: 35, F: 27)

70.5 ± 6.7 (44–83)

Healthy: 50, AD: 12

Controlled

1

PSG scored by 2 experts according to AASM

63,546

Dreem proprietary algorithm

ACC

0.69

0.88

0.89

0.88

0.77

0.92

0.93

κ

0.58

0.54

0.74

0.14

0.52

0.68

0.61

SE

0.69

0.61

0.74

0.14

0.86

0.65

0.68

SP

0.92

0.92

0.96

0.97

0.73

0.97

0.95

PPV

0.69

0.65

0.89

0.34

0.59

0.83

0.62

NPV

0.92

0.92

0.89

0.90

0.92

0.93

0.96

F1

0.69

0.62

0.81

0.20

0.70

0.73

0.65

MCC

0.59

0.55

0.74

0.16

0.54

0.69

0.61

Seol et al. (2024)29

Insomnograf K273

4

Fp1, Fp2, M1, M2 (Fpz as ref)

Wet

77 (N/R)

>20 years

Suspected or known OSA

Controlled

1

PSG scored by an expert according to AASM

75,677

Manual scoring

ACC

0.78

0.91

0.96

0.84

0.85

0.96

0.96

κ

0.71

0.73

0.86

0.55

0.69

0.71

0.81

SE

0.78

0.78

0.92

0.56

0.90

0.67

0.85

SP

0.95

0.94

0.97

0.94

0.82

0.99

0.97

PPV

0.78

0.80

0.85

0.78

0.75

0.81

0.83

NPV

0.95

0.94

0.99

0.85

0.93

0.97

0.98

F1

0.78

0.78

0.88

0.65

0.82

0.73

0.84

MCC

0.71

0.73

0.86

0.56

0.70

0.71

0.81

Rusanen et al. (2023)30

Focusband78

2

Fp1, Fp2 (Fpz as ref)

Dry

10 (M: 7, F: 3)

23–37

Healthy

Home (devices fitted by a specialist)

1

PSG scored by an expert according to AASM

9337

Deep learning (CNN)

ACC

0.82

0.93

0.94

0.98

0.85

0.93

0.94

κ

0.75

0.67

0.78

0.26

0.69

0.79

0.67

SE

0.82

0.70

0.80

0.21

0.87

0.77

0.86

SP

0.96

0.95

0.97

0.99

0.83

0.98

0.96

PPV

0.82

0.74

0.83

0.35

0.79

0.89

0.85

NPV

0.96

0.95

0.96

0.98

0.90

0.94

0.97

F1

0.82

0.72

0.82

0.28

0.82

0.83

0.85

MCC

0.75

0.67

0.78

0.28

0.69

0.79

0.82

Casciola et al. (2021)31

Cognionics79

4

F3, F4, A1, A2

Dry

12 (M: 6, F: 6)

21–61

Healthy

Controlled

1

PSG scored by an expert according to AASM

9747

Deep learning (CNN + LSTM)

ACC

0.74

0.90

0.92

0.89

0.83

0.95

0.90

κ

0.64

0.59

0.76

0.20

0.65

0.76

0.57

SE

0.74

0.69

0.82

0.28

0.74

0.87

0.73

SP

0.93

0.93

0.94

0.93

0.91

0.96

0.92

PPV

0.74

0.64

0.81

0.24

0.88

0.73

0.55

NPV

0.93

0.93

0.95

0.95

0.79

0.98

0.96

F1

0.74

0.66

0.81

0.26

0.80

0.79

0.63

MCC

0.64

0.59

0.76

0.20

0.66

0.77

0.58

Machine learning (Ensemble-bagged trees model)

ACC

0.68d

0.49d

N/R

N/R

N/R

N/R

N/R

SE

N/R

N/R

0.80d

0.04d

0.82d

0.50d

0.28d

Arnal et al. (2020)32

Dreem69

4

F7, F8, O1, O2, Fpz

Dry

25 (M: 19, F: 6)

35.3 ± 7.5 (23–50)

Mostly healthy, some with mild symptoms of anxiety or depression, one with insomnia

Controlled

1

PSG scored by 5 experts according to AASM

24,662

Deep learning (2 layers of LSTM + Softmax function)

ACC

0.81

0.92

0.95

0.93

0.86

0.95

0.93

κ

0.73

0.69

0.76

0.43

0.72

0.78

0.78

SE

0.81

0.76

0.78

0.48

0.83

0.87

0.86

SP

0.95

0.95

0.97

0.96

0.90

0.96

0.95

PPV

0.81

0.74

0.80

0.46

0.89

0.76

0.79

NPV

0.95

0.94

0.97

0.96

0.84

0.98

0.97

F1

0.81

0.75

0.79

0.47

0.86

0.81

0.83

MCC

0.73

0.70

0.76

0.43

0.72

0.78

0.78

Lin et al. (2017)33

Prototype developed in the study33

4

AF7, Fp1, Fp2, AF8

Dry

10 (M: 10, F: 0)

24 ± 6

Healthy

Controlled

1

PSG scored by an expert according to AASM

8251

Machine learning (RVM)

ACC

0.77

0.91

0.94

0.86

0.86

0.95

0.92

κ

0.69

0.65

0.80

0.23

0.70

0.83

0.72

SE

0.77

0.72

0.84

0.22

0.85

0.87

0.83

SP

0.94

0.94

0.96

0.96

0.86

0.97

0.94

PPV

0.77

0.72

0.83

0.43

0.79

0.84

0.71

NPV

0.94

0.94

0.96

0.89

0.90

0.98

0.97

F1

0.77

0.71

0.83

0.29

0.82

0.86

0.77

MCC

0.69

0.66

0.80

0.24

0.70

0.83

0.73

Levendowski et al. (2017)22

X4 Sleep Profiler70

2

AF7, AF8 (Fpz as ref)

Dry

47 (M: 35, F: 12)

23–77

Sleep-disordered breathing and healthy

Controlled

1

PSG scored by 5 experts according to AASM

33,635

X4 Sleep Profiler proprietary algorithm

ACC

0.77

0.91

0.90

0.87

0.86

0.96

0.95

κ

0.68

0.64

0.73

0.22

0.72

0.78

0.75

SE

0.77

0.71

0.77

0.32

0.83

0.79

0.83

SP

0.94

0.94

0.94

0.92

0.89

0.98

0.96

PPV

0.77

0.70

0.82

0.26

0.85

0.82

0.73

NPV

0.94

0.94

0.92

0.94

0.87

0.97

0.98

F1

0.77

0.70

0.79

0.29

0.84

0.81

0.78

MCC

0.68

0.64

0.73

0.22

0.72

0.78

0.75

Automatic scoring corrected by reviewer

ACC

0.80

0.92

0.92

0.88

0.87

0.96

0.97

κ

0.72

0.69

0.79

0.28

0.74

0.79

0.85

SE

0.80

0.75

0.82

0.36

0.85

0.80

0.94

SP

0.95

0.95

0.96

0.93

0.89

0.98

0.97

PPV

0.80

0.74

0.87

0.33

0.86

0.83

0.80

NPV

0.95

0.95

0.94

0.94

0.88

0.97

0.99

F1

0.80

0.74

0.84

0.34

0.85

0.81

0.87

MCC

0.72

0.69

0.79

0.28

0.74

0.79

0.85

Finan et al. (2016)34

X4 Sleep Profiler70

2

AF7, AF8 (Fpz as ref)

Dry

14 (M: 6, F: 8)

26.4 ± 3.7 (22–34)

Healthy

Controlled

1

PSG scored by an expert according to AASM

13,445*

X4 Sleep Profiler proprietary algorithm

ACC

0.66

0.86

0.92

0.89

0.78

0.88

0.85

κ

0.52

0.45

0.42

0.07

0.55

0.62

0.58

SE

0.66

0.55

0.44

0.27

0.72

0.60

0.72

SP

0.92

0.91

0.96

0.91

0.83

0.96

0.89

PPV

0.66

0.56

0.47

0.07

0.79

0.81

0.63

NPV

0.92

0.91

0.96

0.98

0.77

0.89

0.92

F1

0.66

0.54

0.46

0.11

0.75

0.69

0.67

MCC

0.53

0.46

0.42

0.09

0.56

0.58

0.58

Manual scoring

ACC

0.74

0.90

0.94

0.93

0.81

0.90

0.90

κ

0.62

0.52

0.44

0.14

0.61

0.71

0.72

SE

0.74

0.61

0.40

0.27

0.76

0.74

0.86

SP

0.93

0.93

0.98

0.95

0.85

0.95

0.91

PPV

0.74

0.61

0.57

0.12

0.81

0.80

0.72

NPV

0.93

0.92

0.95

0.98

0.80

0.93

0.96

F1

0.74

0.60

0.47

0.17

0.79

0.77

0.79

MCC

0.62

0.53

0.45

0.15

0.61

0.71

0.73

Sleepmask

Liang et al. (2015)35

Prototype developed in the study35

2

EOG L, EOG R (Fpz as ref)

Dry

16 (M: 11, F: 5)

25.3 ± 2.5

Healthy

Controlled

1

PSG scored by an expert according to AASM

6480

Machine learning (LDA)

ACC

0.84

0.94

0.96

0.97

0.87

0.93

0.96

κ

0.77

0.69

0.69

0.33

0.74

0.80

0.87

SE

0.84

0.77

0.84

0.43

0.83

0.81

0.94

SP

0.96

0.96

0.97

0.98

0.91

0.97

0.96

PPV

0.84

0.71

0.62

0.29

0.89

0.87

0.87

NPV

0.96

0.95

0.99

0.99

0.85

0.95

0.98

F1

0.84

0.73

0.71

0.35

0.86

0.84

0.90

MCC

0.77

0.69

0.70

0.34

0.74

0.80

0.88

Sheet-like/patches

Massie et al. (2025)36

Prototype developed in the study36

1

EOG R (Fpz as ref)

Dry

106 (M: 60, F: 46)

58 ± 15 (22–82)

Suspected OSA

Controlled

1

PSG scored by experts according to AASM

81,786

Deep learning (RNN)

ACC

0.80

0.92

0.92

0.95

0.84

0.94

0.95

κ

0.70

0.66

0.77

0.44

0.69

0.62

0.78

SE

0.80

0.74

0.84

0.47

0.82

0.76

0.80

SP

0.95

0.94

0.94

0.97

0.88

0.95

0.98

PPV

0.80

0.71

0.80

0.47

0.87

0.57

0.82

NPV

0.95

0.94

0.95

0.97

0.82

0.98

0.97

F1

0.80

0.72

0.82

0.47

0.84

0.65

0.81

MCC

0.70

0.66

0.77

0.44

0.69

0.62

0.78

Roach et al. (2025)37

Somfit80

1

Fpz

Wet

27 (M: 13, F: 14)

22.3 ± 5.1

Healthy

Controlled

1

PSG scored by 3 experts according to AASM

21,600

Somfit proprietary algorithm

ACC

N/R

N/R

N/R

N/R

N/R

N/R

N/R

κ

0.47d

N/R

N/R

N/R

N/R

N/R

N/R

SE

N/R

N/R

0.60d

0.19d

0.69d

0.61d

0.53d

Um et al. (2025)38

Prototype developed in the study38

4

F7, F8, EOG L, EMG L, chin

Dry

1 (N/R)

N/R

Healthy

Controlled

1

PSG scored by an expert according to AASM

688

Deep learning (BiLSTM + attention model on spectrogram input)

ACC

0.73

0.89

0.96

0.78

0.79

0.98

0.95

κ

0.61

0.65

0.74

0.42

0.59

0.66

0.83

SE

0.73

0.75

0.72

0.73

0.70

0.77

0.83

SP

0.93

0.93

0.98

0.79

0.90

0.99

0.98

PPV

0.73

0.72

0.81

0.45

0.88

0.59

0.89

NPV

0.93

0.92

0.98

0.93

0.74

0.99

0.96

F1

0.73

0.72

0.76

0.56

0.78

0.67

0.86

MCC

0.63

0.66

0.74

0.45

0.60

0.66

0.83

McMahon et al. (2024)39

Somfit80

1

Fpz

Wet

106 (M: 59, F: 47)

<65: 85, ≥65: 21

Suspected or known OSA

Controlled

1

PSG scored by 3 experts according to AASM

N/R

Deep learning (U-sleep CNN)

ACC

N/R

N/R

0.89d

0.91d

0.84d

0.94d

0.95d

κ

0.67d

N/R

N/R

N/R

N/R

N/R

N/R

SE

N/R

N/R

0.78d

0.22d

0.84d

0.58d

0.87d

SP

N/R

N/R

0.91d

0.97d

0.83d

0.99d

0.96d

PPV

N/R

N/R

0.76d

0.38d

0.76d

0.85d

0.76d

NPV

N/R

N/R

0.93d

0.93d

0.90d

0.94d

0.98d

F1

N/R

N/R

0.77e

0.28e

0.80e

0.69e

0.81e

Oz et al. (2023)40

X-trodes soft electrode array81

8

4 EEG (forehead), 2 EOG R, 2 EMG R (chin)

Dry

50 (M: 32, F: 18)

61.4 ± 7.9

Healthy: 21, PD: 29

Controlled

1

PSG scored by 2 experts according to the AASM

N/R

Manual scoring

ACC

0.77d

N/R

N/R

N/R

N/R

N/R

N/R

κ

0.69d

N/R

0.70d

0.22d

0.58d

0.41d

0.72d

SE

N/R

N/R

0.91d

0.16d

0.84d

0.68d

0.77d

SP

N/R

N/R

0.94d

0.99

0.80d

0.97d

0.98d

PPV

N/R

N/R

0.84d

0.44d

0.71d

0.83d

0.85d

F1

N/R

N/R

0.87e

0.23e

0.77e

0.75e

0.81e

Kwon et al. (2023)41

Skin patch developed in the study41

5

2 EEG (forehead), EOG R, EOG L, EMG (chin)

Dry

8 (N/R)

N/R

Healthy

Controlled

1

PSG scored by an expert according to AASM

4961

Deep learning (CNN)

ACC

0.84

0.94

0.94

0.93

0.89

0.97

0.96

κ

0.76

0.68

0.82

0.23

0.77

0.80

0.80

SE

0.84

0.72

0.84

0.17

0.94

0.78

0.88

SP

0.96

0.95

0.97

0.99

0.84

0.99

0.97

PPV

0.84

0.78

0.88

0.52

0.85

0.86

0.77

NPV

0.96

0.96

0.95

0.94

0.93

0.98

0.99

F1

0.84

0.73

0.86

0.25

0.89

0.82

0.82

MCC

0.76

0.69

0.82

0.27

0.78

0.81

0.80

Manual scoring

ACC

0.82

0.93

0.92

0.90

0.88

0.96

0.98

κ

0.74

0.69

0.76

0.23

0.76

0.78

0.92

SE

0.82

0.73

0.72

0.30

0.94

0.75

0.92

SP

0.96

0.95

0.98

0.94

0.83

0.99

0.99

PPV

0.82

0.77

0.93

0.28

0.84

0.86

0.93

NPV

0.96

0.95

0.92

0.95

0.93

0.97

0.99

F1

0.82

0.74

0.81

0.29

0.89

0.80

0.93

MCC

0.74

0.70

0.77

0.23

0.77

0.79

0.92

Matsumori et al. (2022)42

Prototype developed in the study42

6

Forehead (ref on mastoid)

Wet

27 (M: 23, F: 4)

27.4 ± 9.2

Healthy

Controlled

1

PSG scored by an expert according to the AASM

24,979

Deep learning (DSN: CNN + BiLSTM)

ACC

0.81

0.92

0.97

0.90

0.92

0.95

0.89

κ

0.74

0.71

0.77

0.47

0.83

0.82

0.69

SE

0.81

0.76

0.70

0.58

0.85

0.85

0.83

SP

0.95

0.95

0.99

0.93

0.97

0.97

0.90

PPV

0.81

0.77

0.88

0.48

0.95

0.86

0.70

NPV

0.95

0.95

0.97

0.95

0.90

0.97

0.95

F1

0.81

0.76

0.78

0.52

0.90

0.85

0.76

MCC

0.74

0.72

0.77

0.47

0.83

0.82

0.69

Myllymaa et al. (2016)43

Bittium Brainstatus EEG82

11

Fp1, Fp2, AF7, AF8, F8, F7, Sp1, Sp2, T10, T9, EOG R

Wet

31 (M: 10, F: 21)

31.3 ± 11.8

Sleep bruxism or healthy

Controlled

1

PSG scored by 2 experts according to the AASM

27,692

Manual scoring

ACC

0.80

0.92

0.94

0.89

0.86

0.94

0.96

κ

0.71

0.69

0.76

0.43

0.72

0.79

0.77

SE

0.80

0.74

0.76

0.57

0.87

0.77

0.75

SP

0.95

0.94

0.97

0.93

0.85

0.98

0.98

PPV

0.80

0.77

0.83

0.44

0.83

0.89

0.85

NPV

0.95

0.94

0.96

0.96

0.89

0.95

0.97

F1

0.80

0.75

0.79

0.49

0.85

0.83

0.79

MCC

0.71

0.70

0.76

0.44

0.72

0.79

0.77

Ear

                    

In-ear

                    

Borges et al. (2025)44

Prototype developed in Goverdovsky et al. (2016)74, (2017)75

4

2 per ear

Wet

14 (M: 6, F: 8)

53.2 ± 17.4 (25–78)

Mostly OSA

Controlled

1

PSG scored by an expert according to AASM

30,960

Automatic (not specified)

ACC

0.83

0.93

0.94

0.91

0.89

0.95

0.98

κ

0.77

0.77

0.83

0.56

0.77

0.77

0.89

SE

0.83

0.80

0.81

0.60

0.91

0.74

0.93

SP

0.96

0.95

0.98

0.95

0.87

0.98

0.98

PPV

0.83

0.83

0.93

0.62

0.82

0.87

0.88

NPV

0.96

0.96

0.95

0.94

0.94

0.96

0.99

F1

0.83

0.81

0.87

0.61

0.86

0.80

0.91

MCC

0.77

0.77

0.83

0.56

0.77

0.78

0.89

Hammour et al. (2024)45

Prototype developed in Goverdovsky et al. (2016)74, (2017)75

4

2 per ear

Wet

13 (M: 9, F: 4)

71.8 ± 4.4 (65–83)

Mostly healthy; some with stable comorbidities (type-2 diabetes, sleep apnea, hypertension)

Controlled

1

PSG scored by 2 scorers according to AASM

13,403

Machine learning (fine-tuned pre-trained LightGBM83)

ACC

0.74

0.90

0.90

0.86

0.83

0.94

0.95

κ

0.64

0.62

0.79

0.29

0.59

0.73

0.69

SE

0.74

0.67

0.90

0.36

0.72

0.74

0.64

SP

0.93

0.93

0.90

0.92

0.87

0.97

0.98

PPV

0.74

0.71

0.84

0.38

0.70

0.79

0.81

NPV

0.93

0.93

0.94

0.91

0.88

0.96

0.96

F1

0.74

0.69

0.87

0.37

0.71

0.77

0.72

MCC

0.64

0.62

0.79

0.29

0.59

0.73

0.70

Borup et al. (2023)46

Prototype developed in Kappel et al. (2018)72

12

6 per ear

Dry

20 (M: 7, F: 13)

25.9 (22–36)

Healthy

Home (devices fitted by a specialist)

4

PSG scored by an expert according to AASM

72,942

Deep learning (personalized ensemble deep learning)

ACC

0.84

0.94

0.98

0.94

0.88

0.95

0.94

κ

0.78

0.75

0.88

0.47

0.76

0.85

0.80

SE

0.84

0.79

0.92

0.44

0.89

0.83

0.85

SP

0.96

0.96

0.98

0.98

0.87

0.99

0.96

PPV

0.84

0.81

0.88

0.57

0.83

0.95

0.82

NPV

0.96

0.96

0.99

0.96

0.92

0.95

0.97

F1

0.84

0.80

0.90

0.50

0.86

0.88

0.84

MCC

0.78

0.75

0.88

0.47

0.76

0.86

0.80

Tabar et al. (2023)47

Prototype developed in the study47

4

2 per ear

Dry

10 (M: 6, F: 4)

27.4 ± 4.9 (22–35)

Healthy

Home (device fitted by participants, PSG by specialist)

2

Partial PSG scored by an expert according to AASM

15,709

Machine learning (Random Forest)

ACC

0.81

0.92

0.96

0.92

0.85

0.96

0.93

κ

0.72

0.65

0.70

0.23

0.70

0.86

0.77

SE

0.81

0.71

0.76

0.17

0.89

0.90

0.81

SP

0.95

0.94

0.97

0.99

0.81

0.97

0.96

PPV

0.81

0.75

0.70

0.55

0.81

0.87

0.82

NPV

0.95

0.95

0.98

0.93

0.89

0.98

0.95

F1

0.81

0.71

0.73

0.25

0.85

0.89

0.81

MCC

0.72

0.66

0.70

0.27

0.71

0.86

0.77

Jørgensen et al. (2023)48

Prototype developed in Kappel et al. (2018)72

12

6 per ear

Dry

1 (M: 0, F: 1)

29 ± 3.8 (22–35)

Healthy

Home (devices fitted by a specialist)

2

PSG scored by an expert according to AASM

1578

Machine learning (Random Forest)

ACC

0.79

0.92

0.98

0.91

0.82

0.93

0.95

κ

0.72

0.67

0.83

0.24

0.60

0.81

0.88

SE

0.79

0.71

0.90

0.48

0.58

1.00

0.96

SP

0.95

0.95

0.99

0.93

0.98

0.91

0.95

PPV

0.79

0.78

0.78

0.20

0.94

0.75

0.89

NPV

0.95

0.95

1.00

0.98

0.77

1.00

0.98

F1

0.79

0.72

0.84

0.28

0.72

0.86

0.92

MCC

0.73

0.69

0.83

0.27

0.63

0.83

0.89

Kjaer et al. (2022)49

Prototype developed in Kappel et al. (2018)72

12

6 per ear

Dry

20 (M: 7, F: 13)

25.9 ± 3.8 (22–36)

Healthy

Home (devices fitted by a specialist)

4

PSG scored by 2 experts according to AASM

72,942

Machine learning (Random Forest)

ACC

0.80

0.92

0.96

0.94

0.84

0.96

0.91

κ

0.73

0.65

0.84

0.15

0.68

0.85

0.70

SE

0.80

0.70

0.91

0.10

0.87

0.85

0.75

SP

0.95

0.94

0.97

1.00

0.82

0.98

0.95

PPV

0.80

0.76

0.82

0.54

0.78

0.92

0.76

NPV

0.95

0.95

0.98

0.95

0.90

0.96

0.94

F1

0.80

0.70

0.87

0.16

0.82

0.88

0.76

MCC

0.73

0.66

0.84

0.21

0.68

0.86

0.70

Jørgensen et al. (2020)50

Prototype developed in Zibrandtsen et al. (2016)84

8

4 per ear

Wet

13 (M: 5, F: 8)

41.5 (18–60)

Epilepsy

Controlled

1–4

PSG scored by an expert according to AASM

27,593

Manual scoring

ACC

0.81

0.92

0.96

0.89

0.85

0.94

0.98

κ

0.74

0.74

0.85

0.47

0.70

0.79

0.90

SE

0.81

0.80

0.91

0.63

0.80

0.80

0.87

SP

0.95

0.95

0.97

0.92

0.89

0.97

0.99

PPV

0.81

0.79

0.84

0.46

0.86

0.86

0.96

NPV

0.95

0.95

0.98

0.96

0.85

0.96

0.98

F1

0.81

0.79

0.87

0.53

0.83

0.83

0.91

MCC

0.74

0.74

0.85

0.48

0.70

0.80

0.90

Nakamura et al. (2020)51

Prototype developed in Goverdovsky et al. (2016)74, (2017)75

4

2 per ear

Wet

22 (N/R)

23.8 ± 4.8

Healthy

Home (devices fitted by a specialist)

1

PSG scored by an expert according to AASM

11,610

Machine learning (SVM)

ACC

0.74

0.90

0.89

0.99

0.77

0.93

0.90

κ

0.61

0.51

0.69

0.09

0.55

0.75

0.49

SE

0.74

0.57

0.74

0.05

0.84

0.75

0.46

SP

0.94

0.92

0.94

1.00

0.71

0.97

0.97

PPV

0.74

0.73

0.77

0.64

0.71

0.84

0.67

NPV

0.94

0.93

0.93

0.99

0.84

0.95

0.92

F1

0.74

0.59

0.76

0.09

0.77

0.79

0.54

MCC

0.62

0.53

0.69

0.18

0.56

0.75

0.50

Mikkelsen et al. (2019)52

Prototype developed in Kappel et al. (2018)72

12

6 per ear

Dry

20 (M: 7, F: 13)

25.9 (22–36)

Healthy

Home (devices fitted by a specialist)

4

Partial PSG scored by an expert according to AASM

72,942

Machine learning (Random Forest)

ACC

0.81

0.92

0.96

0.93

0.85

0.96

0.91

κ

0.73

0.68

0.85

0.29

0.69

0.86

0.70

SE

0.81

0.77

0.84

0.52

0.79

0.93

0.77

SP

0.95

0.95

0.98

0.94

0.90

0.97

0.94

PPV

0.81

0.72

0.91

0.23

0.88

0.85

0.74

NPV

0.95

0.94

0.97

0.98

0.82

0.99

0.95

F1

0.81

0.73

0.87

0.32

0.83

0.89

0.76

MCC

0.73

0.68

0.85

0.31

0.69

0.86

0.70

Mikkelsen et al. (2017)53

Prototype developed in the study53

12

6 per ear

Wet

9 (M: 6, F: 3)

26–44

Healthy

Home (devices fitted by a specialist)

1

Partial PSG scored by experts according to AASM

7411

Machine learning (Random Forest)

ACC

0.60

0.84

0.84

0.92

0.74

0.91

0.80

κ

0.45

0.40

0.52

0.04

0.47

0.61

0.34

SE

0.60

0.52

0.53

0.16

0.69

0.74

0.46

SP

0.90

0.89

0.94

0.93

0.78

0.94

0.88

PPV

0.60

0.51

0.74

0.04

0.71

0.59

0.47

NPV

0.90

0.89

0.86

0.98

0.77

0.97

0.87

F1

0.60

0.50

0.61

0.07

0.70

0.66

0.47

MCC

0.45

0.40

0.53

0.05

0.47

0.61

0.34

Around-the-ear

da Silva Souto et al. (2022)55

Prototype developed in the study55

7

2 EMG, 1 EOG, 2 forehead, 2 around-the-ear

Wet

12 (M: 9, F: 3)

28.9 (18–45)

Healthy

Home (devices fitted by a specialist)

1

PSG scored by an expert according to AASM

10,632

Manual scoring

ACC

0.78

0.93

0.95

0.91

0.85

0.91

0.93

κ

0.70

0.57

0.75

0.46

0.66

0.79

0.76

SE

0.78

0.62

0.84

0.56

0.79

0.83

0.72

SP

0.96

0.95

0.96

0.94

0.87

0.95

0.98

PPV

0.78

0.62

0.72

0.46

0.76

0.87

0.90

NPV

0.96

0.95

0.98

0.96

0.89

0.93

0.94

F1

0.78

N/R

0.77

0.51

0.77

0.85

0.80

MCC

0.70

0.57

0.75

0.46

0.66

0.79

0.77

da Silva Souto et al. (2021)54

cEEGrid85

16

8 per ear

Wet

10 (M: 2, F: 8)

28.4 ± 4.3

Healthy

Home (devices fitted by a specialist)

1

EEG of Fpz, EOG_L and EOG_R scored by expert according to AASM

9341

Manual scoring

ACC

0.75

0.92

0.94

0.92

0.82

0.95

0.91

κ

0.67

0.60

0.71

0.37

0.62

0.85

0.69

SE

0.75

0.70

0.69

0.35

0.82

0.90

0.67

SP

0.95

0.94

0.98

0.97

0.82

0.97

0.97

PPV

0.75

0.67

0.80

0.50

0.74

0.87

0.83

NPV

0.95

0.95

0.96

0.95

0.88

0.97

0.93

F1

0.75

0.65

0.74

0.41

0.78

0.88

0.74

MCC

0.67

0.61

0.71

0.38

0.62

0.85

0.70

Mikkelsen et al. (2019)56

cEEGrid85

16

8 per ear

Wet

15 (M: 6, F: 9)

35.3 ± 14.3

Healthy

Controlled

1

PSG scored by 2 experts according to AASM

18,920

Machine learning (Random Forest)

ACC

0.700f

N/R

N/R

N/R

N/R

N/R

N/R

κ

0.600f

N/R

N/R

N/R

N/R

N/R

N/R

Sterr et al. (2018)57

cEEGrid85

16

8 per ear

Wet

15 (M: 6, F: 9)

35.3 ± 14.3

Healthy

Controlled

1

PSG scored by 2 experts according to AASM

18,920

Manual scoring

ACC

0.59d

N/R

N/R

N/R

N/R

N/R

N/R

κ

0.42d

N/R

N/R

N/R

N/R

N/R

N/R

  1. All evaluation metric values were calculated from confusion matrices provided in the original publications unless indicated otherwise.
  2. AASM American Academy of Sleep Medicine, ACC accuracy, AD Alzheimer’s disease, ALS amyotrophic lateral sclerosis, BiLSTM bidirectional long short-term memory, CNN convolutional neural network, DSN deep stacking networks, EEG electroencephalogram, F females, F1 F1 score, κ Cohen’s kappa, LDA linear discriminant analysis, LightGBM light gradient boosting machine, LSTM long short-term memory, M males, MCC Matthews correlation coefficient, N/R not reported, NPV negative predictive value, OA (ma) overall macro-averaged metrics, OA (mi) overall micro-averaged metrics, OSA obstructive sleep apnea, PD Parkinson’s disease, PPV positive predictive value, REM rapid eye movement, RNN recurrent neural network, RVM relevance vector machine, SE sensitivity, SP specificity, SVM support vector machine, WPSG-I wearable polysomnogram.
  3. (a) The number of recording electrodes.
  4. (b) For controlled environment studies, devices were fitted by researchers or technicians. For home studies, self-application vs. expert fitting is indicated in parentheses.
  5. (c) In studies reporting manual scoring of wEEG data, scoring was conducted on raw signals from the device. As most wEEG devices lacked EOG and EMG channels, REM sleep was identified based on EEG characteristics alone, such as low-amplitude mixed-frequency activity.
  6. (d) Reported in the original study.
  7. (e) Calculated from reported metrics.
  8. (f) Estimated from the graph.
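
The footnotes state that most values were calculated from confusion matrices and that some F1 scores were calculated from reported metrics, but the table does not give the formulas. The sketch below shows one plausible way to do both, assuming the standard one-vs-rest definitions of each metric and macro-averaging as the unweighted mean across stages. The confusion matrix `cm` is hypothetical and the function names are illustrative; only the SE and PPV values in the last example are taken from the table (McMahon et al., Somfit).

```python
import math

STAGES = ["Wake", "N1", "N2", "N3", "REM"]

def ratio(num, den):
    """Return num / den, or NaN when the denominator is zero (e.g. a stage never scored)."""
    return num / den if den else float("nan")

def one_vs_rest_counts(cm, k):
    """TP, FN, FP, TN for stage k; cm[i][j] counts epochs with reference
    stage i that the device scoring labelled as stage j."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # reference k, scored as another stage
    fp = sum(row[k] for row in cm) - tp  # scored k, reference was another stage
    tn = total - tp - fn - fp
    return tp, fn, fp, tn

def stage_metrics(tp, fn, fp, tn):
    """Standard binary metrics from one-vs-rest counts (assumed definitions)."""
    n = tp + fn + fp + tn
    acc = ratio(tp + tn, n)
    se = ratio(tp, tp + fn)
    sp = ratio(tn, tn + fp)
    ppv = ratio(tp, tp + fp)
    npv = ratio(tn, tn + fn)
    f1 = ratio(2 * ppv * se, ppv + se)
    mcc = ratio(tp * tn - fp * fn,
                math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_chance = ratio((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn), n * n)
    kappa = ratio(acc - p_chance, 1 - p_chance)
    return {"ACC": acc, "kappa": kappa, "SE": se, "SP": sp,
            "PPV": ppv, "NPV": npv, "F1": f1, "MCC": mcc}

def overall_accuracy(cm):
    """Fraction of epochs on the diagonal of the confusion matrix."""
    return ratio(sum(cm[k][k] for k in range(len(cm))),
                 sum(sum(row) for row in cm))

def macro_average(per_stage, name):
    """Unweighted mean of one metric across stages."""
    return sum(m[name] for m in per_stage.values()) / len(per_stage)

def f1_from_se_ppv(se, ppv):
    """F1 as the harmonic mean of sensitivity and positive predictive value."""
    return ratio(2 * se * ppv, se + ppv)

# Hypothetical 5-stage confusion matrix (rows: reference PSG, columns: device).
cm = [
    [80, 10,   5,  0,  5],  # Wake
    [10, 20,  15,  0,  5],  # N1
    [ 5, 10, 170, 10,  5],  # N2
    [ 0,  0,  15, 85,  0],  # N3
    [ 5,  5,  10,  0, 80],  # REM
]
counts = [one_vs_rest_counts(cm, k) for k in range(len(STAGES))]
per_stage = {s: stage_metrics(*c) for s, c in zip(STAGES, counts)}

# Micro-averaging pools the one-vs-rest counts over all stages before dividing.
micro = stage_metrics(*(sum(c) for c in zip(*counts)))

# SE and PPV as reported for McMahon et al. (Wake, N1, N2, N3, REM).
mcmahon_se = [0.78, 0.22, 0.84, 0.58, 0.87]
mcmahon_ppv = [0.76, 0.38, 0.76, 0.85, 0.76]
mcmahon_f1 = [round(f1_from_se_ppv(se, ppv), 2)
              for se, ppv in zip(mcmahon_se, mcmahon_ppv)]

print("overall accuracy:", round(overall_accuracy(cm), 3))
print("macro kappa:", round(macro_average(per_stage, "kappa"), 3))
print("McMahon F1 from SE and PPV:", mcmahon_f1)
```

With these definitions, the harmonic mean of the reported SE and PPV reproduces the F1 row listed for McMahon et al. (0.77, 0.28, 0.80, 0.69, 0.81). Pooling the one-vs-rest counts makes micro-averaged SE, PPV and F1 all equal to overall accuracy, which is consistent with the identical OA (mi) values in those rows of the table.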