Fig. 6: Distribution of values for the best-performing features in sleep-wake classification using the unbalanced and ADASYN-balanced datasets.

Each subplot shows a histogram and a boxplot of the feature values for the sleep and wake classes, with the corresponding p-value indicating a significant difference between the two groups. 1 Unbalanced dataset (1a–1d): 1a PPG_skew, the skewness of the PPG waveform; 1b PPG_TM25, the trimmed mean of the PPG signal with 25% of extreme values removed; 1c PPG_LC, the Lyapunov coefficient capturing signal complexity; and 1d e_a_ratio_std, the standard deviation of the ratio between the ‘e’ and ‘a’ points in the second derivative of the PPG waveform. 2 ADASYN-balanced dataset (2a–2d), showing features derived from frequency-domain analysis of the peak-to-peak interval (PPI) time series: 2a PPI_LF_HF_power, the ratio of low-frequency (LF) to high-frequency (HF) power; 2b PPI_VLF_LF_power, the ratio of very-low-frequency (VLF) to LF power; 2c PPI_VLF_HF_power, the ratio of VLF to HF power; and 2d PPI_LF_Total_power, the proportion of LF power relative to the total spectral power.