Introduction

As the body’s central organ, the brain plays a pivotal role in regulating most of the physical and mental processes that underlie human behavior. The information encoded in brain waves has therefore garnered significant attention and holds substantial promise. Electroencephalography1 stands out among brain imaging methods for its relatively high temporal resolution and cost effectiveness2. Thus, electroencephalograms (EEGs) are used in a wide variety of fields, including the study of cognitive functions, healthcare, and mental state monitoring3.

Typically, EEGs are measured in a controlled environment to minimize the effects of external factors, such as noise, electromagnetic interference, and light4,5. The subject’s skin is prepared before electrodes are attached to reduce impedance6. High-frequency sampling rates are recommended for obtaining quality data, and wired connections are preferred for stable recording. However, wired connections between the device and computer present challenges in real-life applications because they restrict the subject’s mobility and convenience, limiting the usability of EEG devices outside the laboratory7.

Researchers and EEG device manufacturers have been actively working to overcome these issues and enhance EEG devices’ universal usability8,9. They have developed wireless devices to facilitate unrestricted movement, as well as gel-free sensors that enable users to quickly and comfortably wear the device. These commercial EEG devices are widely used in various fields, such as the brain-computer interface (BCI) field and clinical applications10. However, there are concerns about the signal quality of these devices.

Previous studies have reported the evaluation of EEG devices and the validation of their signal quality through diverse approaches. Some studies assessed consumer-grade EEG devices’ usability11, whereas others evaluated temporal delays and signal distortion through experiments12. Cognitive tasks commonly used in BCI research—such as the oddball task, 0-back task, and stop-signal task—have also been used to examine whether characteristic EEG features, such as quantitative electroencephalography (QEEG) or event-related potential (ERP) components, appear as expected12,13. Resting-state EEGs have been used to evaluate device performance by examining whether characteristic EEG features are reliably observed13,14. Some studies assessed the detectability of large signals, such as eye blinks9,15. Although various evaluation methods have been explored, no studies have compared practical dry wireless commercial devices that have four or fewer electrodes. In addition, basic-level comparisons spanning multiple consumer-grade and research-grade EEG devices have not been reported.

Our work compares the signal quality and user experience of wireless gel-free consumer-grade EEG devices, proposing a validation paradigm specifically designed for this purpose. We selected four widely used consumer-grade EEG devices and one research-grade device and conducted a comparative evaluation. The paradigm assesses the devices’ ability to detect signals progressively, starting from large non-neural physiological artefact signals and moving to smaller brain wave signals. Additionally, it evaluates each device’s performance in noisy environments to ensure reliable signal acquisition under real-world conditions. This approach enables a comprehensive comparison of whether consumer-grade EEG devices can accurately measure EEG signals and how they perform in various challenging scenarios.

Methods and materials

Level of signal quality

Signal quality can be checked at various levels. For example, many researchers have directly tested EEG devices’ applicability in BCI applications16,17. On the other hand, one recent study18 introduced an EEG phantom test that uses a rubber model, a function generator, and an oscilloscope to check signal quality. The procedures in these studies are reasonable in the context of brain signal validity; however, they do not provide the granularity required to assess the specific signal quality that each EEG device captures. The testing procedure needs to be segmented to evaluate device-level signal quality more accurately. EEG device sensors typically measure electrical potentials on the scalp; however, the measurement never reads zero micro-volts because the device always records a residual analog signal19,20. Thus, experimenters usually set EEG electrode impedance below a certain level and monitor for artefactual signals generated from known actions such as eye blinking and jaw clenching21. Once these are confirmed, brain wave acquisition proceeds according to a predefined experimental protocol. Another important consideration in EEG usage is its high sensitivity to noise from movement or environmental changes. Because most consumer-grade EEG devices are designed for user convenience, they are less stable and relatively more sensitive to physical movement22,23,24. Thus, noise robustness is one of the important features of a usable EEG device25,26.
Considering these points, three criteria seem important and straightforward for evaluating consumer-grade EEG devices: the quality of “signal detection,” “brain wave detection,” and “noise robustness.” “Signal detection” refers to the detection of non-neural physiological signals (e.g., eye blinks, jaw clenching) that exhibit greater amplitude than brain waves, whereas “brain wave detection” refers to the detection of characteristic features inherent in brain waves, such as changes in brain rhythms (e.g., alpha, beta, and gamma).

Summarizing the above issues, we introduce an evaluation framework for checking the signal quality of consumer-grade EEG devices, as shown in Table 1. The first level of evaluation should assess basic functionality, ensuring that EEG devices can reliably detect significant fluctuations in scalp potentials, even when they do not originate from brain wave activity. As commonly practiced, artefactual signals generated from known movements such as eye blinking, jaw clenching, and head tilting15 serve as effective benchmarks for confirming EEG devices’ sensitivity to these signal fluctuations.

The second level of evaluation pertains to brain wave activity. Given that the scalp electrical potential is not zero micro-volts even in the absence of brain wave information, it is important to confirm signal variations resulting from neural processes or specific brain rhythm changes. Conventional paradigms such as motor imagery27, ERPs28, and mental arithmetic tasks29 are commonly used to elicit measurable brain responses. In addition, power shifts in the alpha rhythm—called the “Berger effect”—offer a simple yet effective means of validating this level. The primary goal of this level of evaluation is to confirm the device’s capacity for measuring brain activity.

The central question for the last level of evaluation is whether the device is robust against various sources of noise. The assessment methods at this stage should evaluate the EEG device’s robustness against noise from physical movement, environmental changes, and prolonged recording durations. A simple method for this evaluation involves comparing EEG characteristics between normal and noisy situations. For example, a subject may be instructed to remain in a relaxed state, perform physical movements, and then relax again. The resting-state EEGs recorded before and after the movement task can then be compared to see whether they show similar spectral patterns.

Table 1 Evaluation framework of signal quality for consumer-grade EEG devices. For each level, the main question, the evaluation, and an exemplary methodology and experiment are described.

Evaluation study

EEG devices

In this study, we evaluated four consumer-grade EEG devices that are widely used for research and application development, as well as one research-grade device, DSI-24, for comparison. All consumer-grade and research-grade EEG devices used in this study are dry-electrode systems. These devices are shown in Fig. 1.

Fig. 1

EEG devices evaluated in this study. (A) BrainLink Pro, (B) NeuroNicle Fx2, (C) Mindwave Mobile2, (D) Muse2, and (E) DSI-24.

BrainLink Pro (BLP) is a product that Macrotellect, Inc. released in 2018. It has a single channel at Fp1 and one reference electrode on the left ear (Fig. 1-A). The maximum sampling rate is 512 Hz.

NeuroNicle FX2 (FX2) is a product by Laxtha, Inc. It has two electrodes (named EEG1 and EEG2) on the left and right frontal areas (Fig. 1-B), analogous to Fp1 and Fp2. It includes a reference electrode on the left ear and has a maximum sampling rate of 250 Hz.

Mindwave Mobile2 (MW2), which NeuroSky released in 2018, also has a single channel at Fp1 and a reference electrode (Fig. 1-C), with a maximum sampling rate of 512 Hz.

Muse2 is a product that InteraXon, Inc. released in 2018. It has a total of four channels, AF7, AF8, TP9, and TP10 (Fig. 1-D), with a maximum sampling rate of 256 Hz.

DSI-24, a product from Wearable Sensing, Inc., has 21 electrodes and can be used both wired and wirelessly (Fig. 1-E), with a maximum sampling rate of 300 Hz. It is suitable for a wide range of studies30 and has been applied in various BCI paradigms, including P300-based spellers31, motor imagery32, and QEEG33, consistently demonstrating the intended outcomes within these paradigms.

Subjects

A total of 30 subjects (16 females and 14 males, aged 19–27 years, with a mean age of 23.2) participated in the study. The study received approval from the Institutional Review Board (IRB) of Handong Global University (No. [2023-HGUR026]). The experiments were conducted with the subjects’ full understanding and their written consent. All procedures were performed in accordance with the relevant guidelines and regulations.

Experiment

The experiment was designed to verify whether consumer-grade EEG devices can measure brain waves. The composition of the experimental paradigm is shown in Fig. 2. First, we verified whether the devices could detect artefactual signals, which show relatively larger amplitudes than brain waves. Subsequently, to evaluate the validity of the measured EEG signals, we verified whether the devices could detect alpha power changes and the alpha peak frequency. Finally, we verified the devices’ sensitivity to the wearer’s movements, as consumer-grade EEG devices are intended for use not only in controlled environments but also in everyday public situations where movements are likely to occur during use. For checking the devices’ signal quality, we followed the levels described in Table 1.

Fig. 2

Experimental procedure. The experiments were designed to address the level 1 to level 3 settings. Each session consists of three blocks (pre-rest, task, and post-rest). For precise timing of the task, a beep sound was provided during the task block.

For level 1, “signal detection,” we used artefactual signals generated from eye blinking and mandibular contraction (jaw clenching). These actions produce significantly larger amplitudes than brain waves, making them important indicators of artefactual signals in the recordings34. Assessing eye blinking and jaw clenching as initial steps allowed us to verify whether an EEG device was functioning correctly and could accurately capture signal variations on the scalp. The eye-blinking and jaw-clenching paradigm proceeded as follows (see Fig. 2). A one-minute resting-state EEG (pre-rest) was recorded while the subject remained in a comfortable state. The subjects then performed 20 physical movements (eye blinks or jaw clenches), one at the sound of a beep every three seconds, within one minute. After this task, the resting-state EEG (post-rest) was recorded for one minute in a comfortable state. Except for the eye-blinking task, signals were recorded in the eyes-closed condition. During the jaw-clenching task, the action was performed by lightly biting a coffee straw with the teeth; we monitored the movement of the straw to ensure that the subjects performed the task correctly.

For level 2, “brain wave detection,” alpha shifts between the eyes-open and eyes-closed conditions were used. For level 3, “noise robustness,” head movement was chosen because it is a relatively light movement that a subject can easily perform, and because head movement is likely to occur in normal situations involving an EEG headset, possibly influencing signal quality35,36. Thus, we incorporated eyes-open/closed conditions and head movement into the experiment. The head movement paradigm proceeded as follows. A pre-resting-state EEG was measured for one minute in a comfortable state. After that, the subject moved their head from left to right or from right to left at each beep, presented at three-second intervals, for one minute. Sub-beep sounds at one-second intervals were played to prevent the subject from moving the neck too quickly and to encourage an even speed. The movement was performed 20 times in one minute. Afterward, the post-resting-state EEG was measured for one minute while the subject remained in a comfortable state. This procedure was performed in both the eyes-open and eyes-closed states.

The experiment consisted of a total of five sessions, with each session using one randomly assigned device. However, the comparison device, DSI-24, was used in Session 3. Each session included four paradigms: eye blinking, jaw clenching, head movements to the left and right with eyes open, and head movements to the left and right with eyes closed.

All paradigms were conducted for a total of three minutes, consisting of a one-minute pre-rest period, during which an EEG was measured in a resting state before the task; a one-minute task performance; and a one-minute post-rest period, during which an EEG was measured in a resting state after the task (Fig. 2). Participants performed the task at the sound of a beep that played every three seconds. Each experiment required approximately 2.5 h, including EEG setup, user evaluation, and the completion of four paradigms across the five devices.

Questionnaire-based evaluation

A questionnaire-based study was also conducted using an adapted version of the System Usability Scale (SUS)37 to assess user evaluation. At the end of each session, after participants had completed all four paradigms with a single EEG device, they were asked to rate the device on a scale from 1 to 10 across five dimensions, including comfort while wearing, willingness to wear again, design, and familiarity with the device. At the end of the experiment, the participants were asked to rank the five devices by difficulty of wearing and by preference. The survey was conducted in Korean (the participants’ native language), and Table 2 provides a translated version of the survey questions.

Table 2 Questionnaire items for user evaluation.

Data analysis

Data preprocessing

For data analysis, we used the EEG signals obtained from the electrode positioned over the left forehead: the Fp1 channel for BLP, MW2, and DSI-24; AF7 for Muse2; and EEG1 for FX2. For Muse2, bipolar re-referencing was performed by referencing AF7 to TP10.

Not all of the data were usable due to heavy noise, so some data were excluded from the analysis. The number of subjects excluded for each device is given in Table 3. Subject exclusion was categorized into two cases: Case 1, where the eye-blinking or jaw-clenching paradigm could not be used, and Case 2, where the head movement paradigm could not be used. Due to Case 1, three subjects were excluded for Muse2. Due to Case 2, five subjects were excluded for Muse2, and one subject was excluded for each of the other four devices. In Case 2, the data from subject S26, measured with all devices, were excluded.

For the non-neural physiological artefact signal evaluation, the one-minute data obtained during the tasks (Fig. 2-A, B) were used. The full one-minute pre-rest data (Fig. 2-C, D) were used for the EEG signal validity evaluation, whereas both the pre-rest and post-rest one-minute data (Fig. 2-D, E) were used for the EEG signal motion sensitivity evaluation. Data analysis was conducted in MATLAB (MathWorks Inc., R2022b) using the EEGLAB library (version 2022.1).

Table 3 Number of excluded subjects. Subjects were excluded when the raw data contained excessive noise or unstable electrode contact.

Evaluation of signal detection

Figure 3 shows an exemplary plot of the EEG recorded during the eye-blinking and jaw-clenching tasks. We manually counted the peaks stemming from blinking and jaw clenching to verify how well these non-neural physiological artefact signals were detected in the recordings. The time series data for each task were visualized and reviewed by three experimenters to ensure accuracy.

Fig. 3

EEG signal from all five devices, recorded during the eye-blinking and jaw-clenching tasks of subject 03. No preprocessing is applied to the four consumer-grade devices (BrainLink Pro, NeuroNicle Fx2, MindWave Mobile2, and Muse2). For DSI-24 only, a 3-Hz high-pass filter is applied to suppress the prominent low-frequency drift observed in this subject. Normalization is not applied.

Evaluation of brain wave detection

Figure 4 shows an exemplary EEG recording obtained during resting conditions with eyes closed and eyes open. We utilized the alpha rhythm shifts (known as the Berger effect) and the alpha peak frequency to evaluate the capacity of brain wave detection. The Berger effect refers to the phenomenon in which visual stimuli, such as opening the eyes, suppress or reduce stable alpha waves38,39. Thus, an increase in alpha power occurs when the eyes are closed compared with when they are open. Alpha peak frequency denotes the frequency of the highest alpha peak, which varies individually40. These two features were computed as follows.

Fig. 4

EEG signal from all five devices, recorded during the eyes closed and eyes open resting state of subject 03. Signals from the four consumer-grade devices are displayed without any preprocessing. The DSI-24 data are additionally high-pass filtered at 3 Hz to remove strong low-frequency components. No normalization is applied.

First, the Berger effect index was obtained by dividing the alpha power of the eyes-closed condition by the alpha power of the eyes-open condition, i.e., the ratio of alpha powers between the two conditions. The Berger effect index is expected to be higher than 1, as the eyes-closed condition produces a larger alpha rhythm amplitude. To calculate this index without bias, we removed the 1/f trend from each power spectral density (PSD) by fitting the trend (using the Curve Fitting Toolbox in MATLAB), and the alpha power was computed by summing the powers within the 8–13 Hz frequency range.
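
The analysis here was performed in MATLAB; the following is a minimal Python sketch of the same computation. The Welch parameters, the 1–40 Hz fitting range, and the log-log linear fit used to approximate the 1/f trend are illustrative choices, not the exact procedure used in this study.

```python
import numpy as np
from scipy.signal import welch

def berger_index(eeg_closed, eeg_open, fs, alpha_band=(8.0, 13.0)):
    """Ratio of 1/f-detrended alpha power: eyes-closed over eyes-open."""
    def detrended_alpha_power(x):
        freqs, psd = welch(x, fs=fs, nperseg=int(2 * fs))
        # Fit the 1/f background as a line in log-log space over 1-40 Hz,
        # excluding the alpha band so the alpha peak does not bias the fit.
        in_alpha = (freqs >= alpha_band[0]) & (freqs <= alpha_band[1])
        fit = (freqs >= 1) & (freqs <= 40) & ~in_alpha
        coef = np.polyfit(np.log10(freqs[fit]), np.log10(psd[fit]), 1)
        trend = 10 ** np.polyval(coef, np.log10(freqs[1:]))  # skip 0 Hz
        residual = psd[1:] - trend
        return residual[in_alpha[1:]].sum()  # summed alpha-band power
    return detrended_alpha_power(eeg_closed) / detrended_alpha_power(eeg_open)
```

With a clear alpha rhythm in the eyes-closed recording, the function returns a value above 1, matching the expected direction of the Berger effect.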

Second, we picked the frequency of the alpha peak within 6–15 Hz, because the alpha peak can appear in a slightly wider range than the traditional alpha band (8–13 Hz)40. The experimenter manually identified and determined the peak, referring to open-source code41 to calculate the alpha peak frequency. Figure 5 shows exemplary power spectral densities for the two conditions. As shown, alpha power is higher in the eyes-closed condition, and the peak frequency at 10.3 Hz is observable.
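
In this study the peak was identified manually with reference to open-source code41; a simplified automated version of the same search can be sketched as follows, with illustrative Welch settings.

```python
import numpy as np
from scipy.signal import welch

def alpha_peak_frequency(eeg, fs, search_band=(6.0, 15.0)):
    """Return the frequency of the largest PSD value in the search band."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
    band = (freqs >= search_band[0]) & (freqs <= search_band[1])
    return freqs[band][np.argmax(psd[band])]
```

Note that an automated argmax can latch onto noise when no clear alpha peak exists, which is one reason manual inspection was used here.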

Fig. 5

Representative power spectral densities of two conditions (eyes open/closed) from subject 04 across all five devices. No filtering or amplitude normalization is applied. PSD values are expressed in µV²/Hz. Alpha peaks are clearly observable in eyes-closed condition at 9.2, 9.5, 9.2, 9.0, and 9.2 Hz, respectively.

Evaluation of noise robustness

We compared the pre-/post-movement data to evaluate the device’s sensitivity to movement. Based on the assumption that devices may exhibit varying levels of robustness against artefactual movements, we hypothesized that differences would manifest in the EEG spectral patterns. Thus, the power spectral density was calculated, and Pearson correlation analysis was conducted to determine the similarity between the two time points.
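
This similarity measure can be sketched as follows; the 1–45 Hz comparison band and the Welch parameters are assumptions for illustration, not the exact settings used in this study.

```python
import numpy as np
from scipy.signal import welch
from scipy.stats import pearsonr

def psd_similarity(eeg_pre, eeg_post, fs, band=(1.0, 45.0)):
    """Pearson correlation between pre- and post-movement PSDs."""
    def band_psd(x):
        freqs, psd = welch(x, fs=fs, nperseg=int(2 * fs))
        mask = (freqs >= band[0]) & (freqs <= band[1])
        return psd[mask]
    r, _ = pearsonr(band_psd(eeg_pre), band_psd(eeg_post))
    return r
```

A correlation near 1 indicates that the spectral pattern recovered after the movement task, i.e., the device was robust to the intervening motion.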

Statistical analysis

Repeated measures analysis of variance (ANOVA) was employed to compare differences between devices, accounting for within-subject variability across multiple measurements. This approach was chosen because repeated measurements were collected from each subject across the different devices, allowing a more accurate assessment of device differences while controlling for individual variability. We additionally used the Wilcoxon signed-rank test when normality was not satisfied.
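
The pairwise post-hoc comparison with Bonferroni correction can be sketched as follows using SciPy's Wilcoxon signed-rank test (the repeated measures ANOVA itself is omitted here); the input layout is an illustrative assumption.

```python
from itertools import combinations

from scipy.stats import wilcoxon

def pairwise_wilcoxon(scores):
    """Bonferroni-corrected pairwise Wilcoxon signed-rank tests.

    `scores` maps a device name to a list of per-subject values; the
    lists must be paired (same subjects, same order). Returns
    {(device_a, device_b): Bonferroni-adjusted p-value}."""
    pairs = list(combinations(scores, 2))
    adjusted = {}
    for a, b in pairs:
        _, p = wilcoxon(scores[a], scores[b])
        adjusted[(a, b)] = min(p * len(pairs), 1.0)  # Bonferroni correction
    return adjusted
```

Multiplying each raw p-value by the number of comparisons (capped at 1) is the standard Bonferroni adjustment used for the post-hoc tests reported below.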

Results

Evaluation of signal detection

Eye blinking and jaw clenching were performed to evaluate the detection of non-neural physiological artefact signals. Each participant performed each task 20 times per device, and we confirmed that all devices showed counts of 20, or close to 20 (FX2 and Muse2), on average (Table 4). This confirms that all five devices can detect non-neural physiological artefact signals.

Table 4 Counts of eye blinking (EB) and jaw clenching (JC). Data excluded from this analysis due to heavy noise are marked with “-”. Note that the maximum number of events is 20 per task.

Evaluation of brain wave detection

We checked the Berger effect index, which is the ratio between the alpha powers of the eyes-closed and eyes-open conditions. Figure 6 shows the results. The average values of the index are 4.023 (BLP), 5.442 (FX2), 4.284 (MW2), 3.779 (Muse2), and 7.962 (DSI-24). Repeated measures ANOVA across the devices yielded a p-value less than 0.001, indicating a statistically significant difference. A post-hoc pairwise comparison with Bonferroni correction revealed significant differences between device 5 and devices 1, 2, 3, and 4 (adjusted p < 0.05). All devices showed a reasonable score beyond index = 1, although some subjects (shown as dots) fell below the y = 1 line (black dashed horizontal line in Fig. 6). This result indicates that the tested devices are capable of detecting brain wave shifts (here, the alpha rhythm).

Fig. 6

Berger effect index across devices. The Berger effect index is the ratio between the alpha powers of the eyes-closed and eyes-open conditions. Dots denote individual subjects’ recordings, and the black dashed horizontal line marks y = 1 for reference.

Secondly, we picked the individual alpha peak frequency and compared it with the value obtained from the DSI-24 (research-grade device). Table 5 shows the results, and Table 6 presents the average difference in alpha peak frequency between each device and the DSI-24. We observed average differences of Δf = 0.24 Hz (BLP), 0.26 Hz (FX2), 0.20 Hz (MW2), and 0.32 Hz (Muse2). A repeated measures ANOVA revealed no statistically significant difference between them (p > 0.05).

Table 5 Individual alpha peak frequency (Hz). Data excluded from this analysis due to heavy noise are marked with “-”.
Table 6 The average difference of alpha peak frequencies (Hz) between each device and DSI-24.

Evaluation of noise robustness

Movement sensitivity was evaluated through a correlation analysis of the power spectral densities of the two EEG recordings from pre- and post-movement. The results are shown in Fig. 7. The correlation coefficients are r = 0.94 (DSI-24), 0.95 (BLP), 0.94 (MW2), and 0.91 (FX2), with a relatively lower value observed for Muse2 (0.89). Repeated measures ANOVA across the devices resulted in a p-value of 0.038, indicating a statistically significant difference. A post-hoc pairwise comparison using the Wilcoxon signed-rank test with Bonferroni correction revealed significant differences between device 1 and device 4, and between device 3 and device 4 (adjusted p < 0.05).

Fig. 7

Results of the correlation analysis. The white dots represent the average correlation coefficients, whereas colored dots denote individual samples. The estimated distribution is also presented as a violin plot.

User evaluation

Figure 8 presents the user evaluation survey results. The research-grade device, DSI-24, scored lower than the four consumer-grade EEG devices. The average score across the five categories was 7.24 (BLP), 7.11 (FX2), 7.67 (MW2), 7.11 (Muse2), and 4.15 (DSI-24). MW2 received the highest scores across all five question items. Notably, the level of comfort while wearing the DSI-24 was markedly lower, at 3.33, compared with the consumer-grade EEG devices’ average score of 7.9. Repeated measures ANOVA was performed across the devices per question item, and the p-values for all question items were less than 0.05. Post-hoc pairwise comparisons with Bonferroni correction showed that the DSI-24 differed significantly from the four other devices across all questions (all p < 0.001), whereas no significant differences were observed among the four consumer-grade devices. The only exception was a significant difference on the design-related question between MW2, which received the highest score among the consumer-grade devices, and BLP, which received the lowest.

Fig. 8

User evaluation results. Participants rated comfort, usability, and perceived stability of each device using a 10-point Likert scale (1–10). Each bar represents the average value.

In the survey responses regarding the maximum wearable duration shown in Table 7, at least 66.6% of subjects responded that they could wear the consumer-grade EEG devices for 60 min or more, whereas only 33.3% of subjects responded similarly for the DSI-24. Notably, whereas at most 10% of respondents reported being able to wear a consumer-grade EEG device for less than 30 min, 36.7% reported this for the DSI-24, indicating difficulties with prolonged wear.

In the post-experiment survey, 29 out of 30 participants indicated that the DSI-24 was the most difficult device to wear. Additionally, 24 out of 30 participants reported it as the most uncomfortable device to wear. When asked about their preferred devices, nine participants chose BLP, four chose FX2, seven chose MW2, six chose Muse2, and four chose DSI-24. The reasons for choosing a preferred device included comfort during wear, stability while wearing, fit, absence of a reference, and weight.

Table 7 Maximum wearable duration per equipment (number of respondents = 30). The number of respondents is presented in each pair of device and time duration.

Discussion

An experimental paradigm was proposed to verify whether consumer-grade EEG devices accurately measure EEG signals and to assess data quality. First, the devices’ ability to detect artifacts with relatively large amplitudes was assessed to verify accurate measurement, and the validity of the recorded EEG signals was confirmed by analyzing alpha power changes and the alpha peak frequency. Furthermore, the devices’ sensitivity to movement was evaluated to check noise robustness, considering consumer-grade equipment’s diverse applications.

All devices successfully detected non-neural physiological artifacts. In this study, the artifacts observed across 20 trials were visually confirmed by the experimenter. However, when the number of repetitions increases in future sessions, an automated algorithm would be required to efficiently identify such artifacts.
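
As an illustration of such an automated approach, a simple threshold-based peak counter could look like the following; the 100 µV threshold and the one-second refractory gap are hypothetical values that would need tuning per device and artefact type.

```python
import numpy as np
from scipy.signal import find_peaks

def count_artifact_events(eeg_uv, fs, threshold_uv=100.0, min_gap_s=1.0):
    """Count large-amplitude artefact peaks (e.g., eye blinks, jaw clenches).

    threshold_uv and min_gap_s are placeholder values; real settings
    depend on the device's amplitude scale and the artefact type."""
    # Absolute value catches both positive and negative deflections;
    # `distance` keeps only one peak per refractory window.
    peaks, _ = find_peaks(np.abs(eeg_uv), height=threshold_uv,
                          distance=int(min_gap_s * fs))
    return len(peaks)
```

On a recording with 20 well-separated blink-sized deflections, such a counter would report 20 events, replacing the manual visual count used here.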

An increase in alpha power was observed in all devices when the eyes were closed. Additionally, the individual alpha peak frequency, which varies among individuals, was detected similarly across all devices within the same participant. This indicates that the EEG data measured with all devices are valid. The BLP device showed lower sensitivity to movement than the other devices, whereas Muse2 exhibited relatively higher sensitivity. These results may relate to each device’s noise level. We checked the available vendor documentation and found some relevant information: the NeuroNicle FX2 has internal noise below 0.8 µV rms42, and the DSI-24 is reported to have less than 3 µV peak-to-peak within the frequency range of 1–50 Hz43. However, noise floor values are not specified in the vendor documentation for consumer-grade systems such as MindWave Mobile2, BrainLink Pro, and Muse2. Thus, we could not interpret our results further.

The absolute amplitude varied across devices because the hardware configurations of the five systems are not identical. Differences in factors such as reference scheme, electrode characteristics, impedance behavior, and analog front-end properties (including gain and bandwidth) can influence the scale of the recorded voltage even when the same neural activity is measured. As these device-specific elements can affect amplitude independently of signal quality, the absolute magnitude should not be interpreted as a direct measure of performance. Instead, more stable indicators—such as spectral patterns, alpha peak identification, and task-related modulations—provide more meaningful bases for comparing device behavior.

During the experiments, multiple retests were conducted with the Muse2 device due to heavy noise, and some experiments were excluded when the issues could not be resolved. These problems appeared to be related to the participants’ head shapes. When the experimenter wore the device for verification, the device worked normally, but it failed to measure normally when the participant wore it again. Although Muse2 is somewhat sensitive to the participant’s head shape, all five devices seem capable of measuring brain waves (tested here by using the alpha rhythm).

In the user evaluation, a significant difference in responses was found between the consumer-grade EEG devices and the research device. Additionally, in the survey conducted after testing all five devices, the majority of respondents indicated that the DSI-24 was the most difficult and uncomfortable to wear. This suggests that the research-grade device, DSI-24, faces challenges in consumerization not only due to its high cost but also due to user discomfort and difficulties in continuous use from a usability perspective.

In terms of signal characteristics, the four consumer-grade EEG devices demonstrated generally similar performance. However, each device had its own advantages and disadvantages in terms of the number of electrodes and connection methods. In practice, when evaluating and selecting EEG devices, it is important to consider not only signal quality but also whether the device specifications are suitable for the experimental purpose. In this study, we prioritized the use of a program called OpenViBE to connect the EEG devices. OpenViBE is a free software platform widely used in EEG experiments, as it enables the use of multiple EEG devices through a unified interface and facilitates experimental design and execution44.

MW2 was easily connected via OpenViBE, enabling smooth signal acquisition and experiment execution, with no re-recordings due to connection issues. It also received the highest user evaluation scores, reflecting its relatively favorable usability and comfort. However, because it uses a single electrode at Fp1, the amount of information that can be obtained is limited. BLP, like MW2, could also be connected via OpenViBE and allowed easy signal acquisition; however, re-recordings due to connection issues occurred for 10 out of 30 participants. BLP likewise provides limited information due to having only one electrode at Fp1. Since many frontal EEG studies aim to examine asymmetry45,46, MW2 and BLP cannot meet these experimental requirements.

FX2 consists of two electrodes (around Fp1 and Fp2) but has the disadvantage of being incompatible with OpenViBE. Although it provides its own software, stimuli cannot be presented, and event markers cannot be inserted, which substantially limits its use beyond resting-state recordings. Lastly, Muse2 comprises four electrodes and supports Lab Streaming Layer (LSL) communication through external programs, allowing it to be used in OpenViBE-based experiments. However, it showed the highest number of re-recordings due to signal instability, as well as exclusions caused by severe noise, which appeared to be influenced by individual differences in head shape affecting electrode contact quality.

When choosing among several consumer-grade EEG devices, if there is no significant difference in signal quality, the device specifications may become an important factor in determining suitability for the intended experimental purpose.

Future work should include EEG device validation using paradigms widely employed in practical BCI applications. For instance, tasks such as mental arithmetic, cognitive workload assessment, and the oddball paradigm can be analyzed using a relatively small number of channels, making them promising candidates for practical applications with consumer-grade EEG devices. These tasks can be used to compare whether EEG features obtained from consumer-grade EEG devices resemble the features extracted with research-grade devices. This comparison would help clarify not only consumer-grade EEG devices’ practical applicability but also the scope and limitations of the analyses that can be performed with them.

Conclusion

In this study, we designed a conceptual evaluation framework for EEG devices comprising three concrete levels of testing and applied it to assess five EEG devices. The results demonstrated that the four consumer-grade EEG devices were capable of reliably measuring both non-neural physiological artefact signals and brain waves. Moreover, these consumer-grade devices exhibited superior usability compared with the research-grade device. Future research should further assess these devices’ suitability for BCI applications by employing well-established paradigms such as the P300 and other ERPs.