Background & Summary

Sleep Apnea Syndrome (SAS) is a serious respiratory disorder with a complex pathophysiology. It is primarily caused by partial or complete obstruction of the upper airway, and it can also result from the brain failing to send appropriate respiratory signals, leading to hypoventilation or apnea1,2. In some extreme cases, patients may exhibit both factors, making the condition more complex3. The pathophysiology of SAS involves multiple aspects, including anatomical abnormalities of the upper airway, neurological dysregulation, and muscular dysfunction4. According to large-scale community cohort studies, the prevalence of Obstructive Sleep Apnea (OSA) is 26%, with major risk factors including age, gender, body mass index (BMI), hypertension, and smoking history2,3. During nocturnal episodes, SAS patients typically experience reduced airflow and significant decreases in blood oxygen levels5. These episodes of hypoxemia and apnea occur frequently, severely impacting the quality of nighttime sleep and leading to frequent awakenings6,7.

Chronic sleep fragmentation and hypoxemia not only affect daytime functioning but can also lead to various physiological and psychological issues, such as cognitive dysfunction, mood swings, and a decline in quality of life8,9,10,11. SAS is not only a disruptor of sleep quality but is also a significant risk factor for multiple cardiovascular diseases12,13. Studies have shown that SAS is closely associated with conditions like hypertension, arrhythmias, coronary artery disease, and stroke9,14. Repeated episodes of hypoxemia and sleep fragmentation trigger systemic inflammatory responses and oxidative stress, which damage cardiovascular function and increase the risk of cardiovascular events9,15. Furthermore, SAS is linked to metabolic disorders, such as insulin resistance and type 2 diabetes, further exacerbating the health burden on patients16.

Currently, the accurate diagnosis of OSA relies primarily on polysomnography (PSG), the gold standard. PSG provides clinicians with detailed data on sleep and respiratory function by comprehensively monitoring physiological parameters such as airflow, blood oxygen saturation, respiratory effort, and brain activity, enabling an effective assessment of the severity of OSA17,18. However, despite being a reliable diagnostic tool, PSG has limitations that significantly hinder its widespread application. Firstly, PSG must be conducted in a hospital or sleep laboratory, where patients are required to wear multi-sensor devices, which increases discomfort, especially during long-term monitoring. This complexity greatly reduces patient compliance. Secondly, unfamiliar sleep environments (e.g., hospital beds, lighting, and noise) can reduce sleep quality, thereby affecting the accuracy of the monitoring results. Additionally, wearing PSG equipment may induce or exacerbate snoring, disrupting natural sleep patterns and causing the collected data to deviate from real conditions. These issues make PSG inconvenient for daily home monitoring and make it difficult to meet the urgent demand for convenient, continuous sleep health management. Therefore, portable OSA monitoring algorithms based on deep learning have become a solution with significant engineering application potential: they use portable devices (such as smartphones and smart bands) to collect data and achieve non-invasive, real-time detection of sleep apnea. This approach can reduce patient inconvenience in a home environment, ensure more natural sleep patterns, and enhance diagnostic accuracy and efficiency through deep learning algorithms, providing more options and support for continuous patient management and treatment19,20.

However, most publicly available datasets for OSA research currently rely on high-quality audio and multimodal data collected via PSG. For example, Georgia Korompili et al. collected a public dataset containing 212 PSG records and their synchronized audio, recorded in controlled environments with low background noise, high sensor sensitivity, and precise annotations, making it suitable for training high-performance models21. Similarly, Andrea Bernardini et al. developed a PSG dataset for stroke ward patients, including data related to obstructive sleep apnea, but with a limited sample size22. The idealized collection conditions of these datasets make them difficult to apply directly to real-world engineering scenarios, while audio data collected by portable devices (such as smartphone microphones) is significantly lower in quality compared to PSG. Smartphone audio is constrained by factors such as lower sampling rates, noise, device placement, and signal attenuation, resulting in higher variability and noise interference. Models trained on PSG datasets often experience a significant drop in performance when handling these low-quality, complex-background smartphone audio recordings, failing to meet the practical application needs of portable devices in home environments. Therefore, constructing a dataset that closely mirrors real-world usage scenarios for smartphone-collected data has become a key foundation for developing robust, engineering-compatible OSA monitoring algorithms.

To address these needs, we constructed a new OSA dataset to support the development of deep learning-based portable snore monitoring algorithms, specifically targeting real-world smartphone audio applications. Data were collected from 50 patients over more than 400 hours of sleep, with audio captured using smartphone microphones and professional digital recorders placed naturally on bedside tables. Simultaneously, PSG devices were used to collect various physiological data during the night to assist professional annotators in labeling snore-related events. Finally, key physiological data such as blood oxygen levels, heart rate, sleep structure, and airflow were retained for research purposes. The dataset is stored in formats such as WAV, MP3, CSV, and JSON, and has been uploaded to the Science Data Bank for cloud access. This dataset is intended to bridge the performance gap between PSG and smartphone data, enabling the training of high-performance models and supporting the engineering application of portable OSA monitoring devices.

Methods

Ethical issues management

This study strictly adheres to international and Chinese national ethical standards to ensure transparency, integrity, and respect for participants. The study design and implementation have been approved by the Ethics Committee of Shenzhen Second People’s Hospital (Approval No. 2023-113-01YJ), with all methods and data analyses undergoing rigorous review. Each participant involved in data collection is required to sign an informed consent form, agreeing to participate in the study and to the recording of audio signals and polysomnography (PSG) data during sleep. Additionally, the informed consent form specifies that all de-identified data may be used for scientific research purposes. For individuals who decline to participate in data collection, their physiological data generated during hospital visits will not be extracted or used for any research purposes. We are committed to protecting participants’ personal privacy and data security. All data will be stored and analyzed in an anonymized manner, with all personally identifiable information (such as names, ID numbers, and contact details) removed from the collected files. All researchers have signed conflict-of-interest declarations to ensure the objectivity and impartiality of the study. We will regularly report study progress and any ethical issues to the Ethics Committee. Should any ethical violations be identified, immediate corrective actions will be taken, and the matter will be reported to the Ethics Committee.

Data collection and storage

Data collection was conducted at Shenzhen Second People’s Hospital, involving 50 patients diagnosed with sleep apnea. To ensure data consistency, we used the Embla SDx Polysomnography (PSG) equipment to record the patients’ physiological data throughout the night. Simultaneously with the PSG data collection, we employed an OPPO Reno8 smartphone and a Newamy V03 digital voice recorder to capture the patients’ respiratory sounds during sleep. To ensure that the collected data could be aligned on the same timeline, we synchronized the Embla SDx PSG equipment, smartphone, and voice recorder to Beijing time, calibrating the timelines of these three data sources. Finally, we anonymized the collected data for processing. Table 1 lists the data channels recorded by the PSG equipment along with their respective sampling rates, and Fig. 1 displays the complete data for one of the patients.

Table 1 Data Sample Rate.
Fig. 1 Visualization of patients’ snoring sounds, blood oxygen saturation, heart rate, and airflow monitoring data over time.

Data annotation

We selected a professional team from the hospital, trained in using the Embla RemLogic software, to annotate all collected PSG data; all annotators are clinical doctors. The annotation standards follow the American Academy of Sleep Medicine (AASM) Manual for the Scoring of Sleep and Associated Events18. Based on the physiological data collected by PSG, four types of snoring states were primarily annotated: obstructive apnea, central apnea, mixed apnea, and hypopnea. Since the smartphone, voice recorder, and PSG equipment were all calibrated to Beijing time, the snoring sound data can directly utilize the PSG annotation files.

Data synchronization

Data collected from multiple devices often faces synchronization challenges. Even PSG-audio datasets, with a maximum tolerable error of 2 seconds, can result in inconsistencies across thousands of data points. To address this, we synchronized all devices to Beijing time, leveraging millisecond-precise timestamps from PSG data and audio file creation times from smartphones and digital voice recorders, aligning all data to a unified timeline.
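As an illustration of the alignment described above, the following minimal sketch maps a PSG timestamp onto a position in an audio recording, assuming both clocks are already expressed in the same time zone. The function names, the example timestamps, and the 16 kHz sampling rate are hypothetical and are not taken from the dataset.

```python
from datetime import datetime


def audio_offset_seconds(audio_start: datetime, psg_timestamp: datetime) -> float:
    """Seconds elapsed from the start of the audio file to a PSG timestamp."""
    return (psg_timestamp - audio_start).total_seconds()


def audio_sample_index(offset_s: float, sample_rate_hz: int) -> int:
    """Index of the audio sample closest to the given offset."""
    if offset_s < 0:
        raise ValueError("PSG timestamp precedes the start of the audio file")
    return round(offset_s * sample_rate_hz)


# Hypothetical values, for illustration only.
audio_start = datetime(2023, 6, 1, 22, 15, 30)           # audio file creation time
psg_time = datetime(2023, 6, 1, 22, 47, 12, 250000)      # millisecond-precise PSG timestamp
offset = audio_offset_seconds(audio_start, psg_time)
index = audio_sample_index(offset, 16000)                 # assumed 16 kHz audio
```

Because the audio side relies on file creation times, the achievable precision of such a mapping is bounded by the accuracy of those times rather than by the PSG timestamps.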

Data Records

The dataset is organized into individual folders for each of the 50 patients and can be accessed at Science Data Bank (https://doi.org/10.57760/sciencedb.19070)23. Each patient’s folder contains a smartphone audio recording, two digital voice recorder recordings, sleep structure data, heart rate data, blood oxygen saturation data, and annotation data. In later experiments, we added airflow measurements for 36 of these patients, so their folders also include this additional physiological data.

The original PSG data included EEG and ECG recordings. However, as the dataset focuses on detecting sleep apnea events via respiratory sounds, only the relevant physiological data (blood oxygen, airflow, and heart rate) were retained. This serves two purposes: providing a reference for annotation and enabling analysis of the relationship between physiological data (e.g., blood oxygen, airflow) and respiratory sounds. These data were exported from Embla RemLogic software into CSV files, with columns for patient data index, timestamp, and physiological data or sleep stage.
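A file with this three-column layout could be read as sketched below. The header names (index, timestamp, spo2), the timestamp format, and the 90% threshold are assumptions made for illustration; the actual files may use different column names and formats.

```python
import csv
import io

# Hypothetical CSV content; real column names and timestamp format may differ.
SAMPLE_CSV = """index,timestamp,spo2
0,22:15:30,96
1,22:15:31,95
2,22:15:32,89
3,22:15:33,88
4,22:15:34,93
"""


def read_signal(handle, value_column):
    """Return (timestamp, value) pairs from a three-column signal file."""
    reader = csv.DictReader(handle)
    return [(row["timestamp"], float(row[value_column])) for row in reader]


def below_threshold(samples, threshold):
    """Keep only the samples whose value is strictly below the threshold."""
    return [(t, v) for t, v in samples if v < threshold]


samples = read_signal(io.StringIO(SAMPLE_CSV), "spo2")
low = below_threshold(samples, 90.0)  # assumed illustrative threshold
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle.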

For the respiratory sound data recorded during the patients’ nighttime sleep, we directly exported the recordings from the smartphone and voice recorder, saving them in MP3 and WAV formats.

The annotation data is exported from the Embla RemLogic software in TXT file format. Subsequently, all event information is extracted from the annotations and saved as a JSON file, structured into three main components: record start, awake intervals, and events. record start: the time in seconds marking the PSG recording start (24-hour format). awake intervals: a list of time intervals when the patient was awake, per PSG sleep staging. events: a list of dictionaries, each detailing a snoring event with its type (event type), start time (event start), duration (event duration), and sleep stage (sleep stage). To align the annotation data with the snoring data, the start time (event start) of each event can be adjusted by subtracting record start.
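The alignment step can be sketched as follows. The underscore-separated key names (record_start, awake_intervals, events, event_type, event_start, event_duration, sleep_stage), the pair layout of the awake intervals, and all values are assumptions for illustration; the keys in the released JSON files should be checked before use. The modulo over 86,400 seconds is also an assumption, added so that events after midnight still yield a positive offset if times are stored as seconds of the day.

```python
import json

SECONDS_PER_DAY = 86400

# Hypothetical annotation content; key names and values are illustrative only.
SAMPLE_JSON = """
{
  "record_start": 79200,
  "awake_intervals": [[79200, 79800]],
  "events": [
    {"event_type": "hypopnea", "event_start": 80000,
     "event_duration": 18.5, "sleep_stage": "N2"},
    {"event_type": "obstructive apnea", "event_start": 1200,
     "event_duration": 22.0, "sleep_stage": "REM"}
  ]
}
"""


def events_relative_to_recording(annotation):
    """Express each event start as seconds since the recording start."""
    record_start = annotation["record_start"]
    aligned = []
    for event in annotation["events"]:
        # Subtract record start; wrap around midnight (assumption, see lead-in).
        start = (event["event_start"] - record_start) % SECONDS_PER_DAY
        aligned.append({
            "event_type": event["event_type"],
            "start_s": start,
            "end_s": start + event["event_duration"],
            "sleep_stage": event["sleep_stage"],
        })
    return aligned


annotation = json.loads(SAMPLE_JSON)
aligned = events_relative_to_recording(annotation)
```

The resulting start and end values can then be used to cut the corresponding segments out of an audio recording that begins at the same instant as the PSG recording.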

Technical Validation

This dataset synchronously collects audio through a smartphone microphone and a professional digital recorder, with all timestamps calibrated to Beijing Time. The synchronization error with the PSG device is less than 1 second, ensuring the alignment of multimodal data. Fig. 2 presents box plots of SpO2 (blood oxygen saturation) and heart rate data for 50 patients. The plots indicate that, overall, the patients’ SpO2 levels and heart rates fall within normal ranges; however, there are some patients whose SpO2 and heart rates significantly deviate below the third quartile (Q3), with a similar distribution pattern in SpO2 levels. Specifically, patients numbered 25 and 45 exhibit SpO2 values well below Q3, indicating a substantial number of low SpO2 recordings. This suggests prolonged periods of hypoxemia during sleep. The average heart rate was 64.14 BPM (±8.24 BPM), with a coefficient of variation of 12.85%. Additionally, the average SpO2 was 92.59% (±2.58%), with a coefficient of variation of 2.79%. Such observations reflect the reliability of the data in characterizing the severity of obstructive sleep apnea.
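The reported coefficients of variation follow from the reported means and standard deviations, taking the coefficient of variation as the standard deviation divided by the mean, expressed as a percentage. The short check below uses only the figures quoted above; the function name is illustrative.

```python
def coefficient_of_variation(mean: float, std: float) -> float:
    """Coefficient of variation in percent: 100 * std / mean."""
    return 100.0 * std / mean


# Means and standard deviations as reported in the text.
heart_rate_cv = coefficient_of_variation(mean=64.14, std=8.24)
spo2_cv = coefficient_of_variation(mean=92.59, std=2.58)

print(f"Heart rate CV: {heart_rate_cv:.2f}%")
print(f"SpO2 CV: {spo2_cv:.2f}%")
```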

Fig. 2 Boxplots of blood oxygen saturation (SpO2) and heart rate data for 50 patients, as well as airflow data for 36 patients (with outliers removed).

The last box plot in the figure illustrates airflow data during sleep for the final 36 patients. The data reveal that the dynamic range of airflow primarily spans between 40 and 80 units, depending on the severity of the patient’s sleep apnea. Comparing this with patients 25 and 45 from the previous two plots (patient 29 in the airflow box plot), it is evident that this patient has a broad dynamic range for both SpO2 and airflow. Conversely, the adjacent patients to this individual exhibit smaller interquartile ranges in their airflow data, which correspondingly results in narrower ranges for SpO2 levels in their datasets. Similar analyses of other patients’ airflow data suggest the severity of sleep-disordered breathing. Therefore, we can conclude that the dataset provides reliable insights into the condition of the patients.
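For readers who wish to reproduce per-patient box plot statistics of this kind, a minimal sketch is given below. It assumes the common 1.5 × IQR rule for outlier removal, which may differ from the rule used to produce Fig. 2, and the sample values are hypothetical.

```python
import statistics


def box_stats(values):
    """Quartiles, interquartile range, and the values kept after outlier removal."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed outlier rule: keep values within 1.5 * IQR of the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [v for v in values if low <= v <= high]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "kept": kept}


# Hypothetical SpO2 readings for one patient, including one deep desaturation.
readings = [95, 96, 94, 97, 95, 96, 70, 95, 94]
stats = box_stats(readings)
```

A wider interquartile range computed in this way corresponds to the broader dynamic range discussed above.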