Introduction

Autism Spectrum Disorder (ASD) is a prevalent neurodevelopmental disorder characterized by deficits in social interaction, communication difficulties, and repetitive behaviors, affecting approximately 2.3% of children and 2.2% of adults in the U.S.1. While its etiology involves a complex interplay of genetic, environmental, immunological, and neuroanatomical factors, ASD lacks specific diagnostic biomarkers and targeted treatments2. The economic burden of ASD is profound, with care costs in the U.S. surpassing $268.3 billion annually3. Despite significant progress in understanding the molecular and behavioral aspects of ASD, challenges remain in developing precise treatment strategies, making continuous research imperative for improving patient outcomes.

Electroencephalography (EEG) has emerged as a valuable tool in autism research due to its ability to capture brain activity in real-time with high temporal resolution and low cost4. EEG is a noninvasive neuroimaging technique that measures brain activity through electrodes placed on the scalp, offering insights into cognitive processes and neurological disorders, including ASD5. Individuals with ASD often exhibit abnormalities, such as reduced long-range functional connectivity, increased local overconnectivity, and lower alpha spectral power compared to neurotypical individuals6. These findings highlight the need for further research into EEG signals to deepen our understanding of brain activity patterns and develop potential interventions targeting neural connectivity and signal strength. Additionally, EEG technology advancements, particularly in portability and affordability, are essential to make research more accessible beyond traditional clinical or laboratory environments.

EEG offers a practical approach for exploring neural activity by capturing dynamic brain signals in real time7. It detects the electrical patterns generated by neurons, which are conventionally divided into key frequency bands: Delta \(\delta\) (0.5–4 Hz), linked to deep sleep; Theta \(\theta\) (4–8 Hz), to drowsiness and memory; Alpha \(\alpha\) (8–13 Hz), to relaxation; Beta \(\beta\) (13–30 Hz), to alertness and cognitive tasks; and Gamma \(\gamma\) (>30 Hz), to higher cognitive functions8. ASD-related abnormalities often manifest in these bands, such as lower alpha power, which may indicate impaired relaxation states, and altered connectivity patterns across regions9. However, EEG signals are inherently weak, ranging between 2 and 75 \({\upmu }\)V, and are often contaminated by noise from muscle movements, power line interference, and other artifacts, which complicates signal extraction and interpretation10. Despite these challenges, EEG’s high temporal resolution, which captures events within milliseconds, makes it well suited to tracking rapid neural processes and studying functional connectivity in ASD. As ASD involves disruptions in local and long-range brain interactions, EEG can help identify underconnectivity and overconnectivity patterns11, providing a foundation for more targeted research and therapeutic interventions.

Preprocessing EEG signals is a crucial step in EEG data analysis, aimed at refining raw data to enhance its quality and reliability12. While a definitive standard for preprocessing workflows is yet to be established, this process generally employs digital signal processing techniques to isolate brain activity signals from unwanted noise and artifacts13. Effective artifact removal is vital for improving the quality of brain signal analysis, ensuring that the data collected is free from external and internal disturbances14. Farhangi et al.15 emphasized that data preprocessing is essential for modeling drowsy EEG signals, as it significantly enhances the modeling outcomes. By applying these preprocessing steps, researchers can ensure that the resulting EEG signals are cleaner, more accurate, and better aligned with the specific tasks being studied, ultimately leading to more robust findings and insights into neural activity.

Figure 1. The process of EEG signal analysis.

Various preprocessing methods have been proposed to improve the quality of EEG data and the performance of signal analysis algorithms. One commonly used technique is Independent Component Analysis (ICA), as suggested by Klug et al.16, which effectively removes artifacts by employing high-pass filters with a 1–2 Hz cutoff and high channel density to enhance decomposition. Zhang et al.17 reviewed the use of Principal Component Analysis (PCA) for dimensionality reduction. However, it was found that PCA could decrease decoding accuracy in SVM models, necessitating selective application. Additionally, a combination of IIR-FIR filters with a GUI interface has proven effective for real-time EEG analysis18. Another study highlighted that IIR elliptic filters could significantly improve CNN performance by reducing noise in epilepsy diagnosis19. Jovicic et al.20 compared ICA, Canonical Correlation Analysis (CCA), and signal filtering, where CCA was shown to be superior in removing flicker and drift artifacts, thereby optimizing machine learning performance on affective disorder detection. Kang et al.21 further demonstrated that the Butterworth bandpass filter outperformed the Stationary Wavelet Transform (SWT) filter in maintaining signal clarity, yielding a lower mean square error (MSE) in critical channels such as Fp1, Fp2, and Fz. These findings indicate that each preprocessing technique must be chosen based on the analytical goals and the specific characteristics of the processed EEG signal.

Recent ASD EEG studies increasingly leverage time-frequency features and deep models. For example, a TFD-based automatic ASD framework reported robust performance on spectrogram-like inputs22, and time-frequency synergy networks improved ASD classification by jointly attending to temporal and spectral cues23. However, head-to-head evaluations of ICA, DWT, and Butterworth filtering specifically for ASD EEG, within a single standardized pipeline on the same dataset, are scarce in the literature; prior comparisons typically address non-ASD settings or use non-harmonized metrics and units. Our work complements these studies by standardizing a multi-metric denoising comparison (SNR in dB, MAE/MSE, PSD in \(\mu\)V²/Hz, SE, Hjorth) on a 16-channel OpenBCI dataset and by translating the results into actionable method selection that links preprocessing choices to deployment constraints, which are often under-specified in the ASD literature. We also position the study alongside recent advances in wavelets/filtering and channel relevance for EEG, including optimal time-frequency localized wavelet filters for AD EEG24, efficient bi-orthogonal wavelets for EEG25, channel/sub-band sensitivity explanations26, Hybrid Reptile-Snake Optimizer-based channel selection27, the spatio-temporal CNN STEADYNet28, and the lightweight CNN LCADNet29, as well as TFD/DL pipelines22,23. Collectively, these points underline that preprocessing selection directly impacts feature quality and downstream classification in ASD EEG.

In this study, we evaluate three preprocessing methods: the Butterworth filter, the Discrete Wavelet Transform (DWT), and Independent Component Analysis (ICA). The Butterworth filter is widely used for EEG processing due to its flat frequency response in the passband, which minimizes amplitude distortion. It effectively preserves essential frequency components within the [0.5, 40] Hz range, corresponding to key brain wave activities such as delta, theta, alpha, and beta waves30. Additionally, it removes low-frequency artifacts and high-frequency noise while maintaining a comparatively smooth, though not strictly linear, phase response, which is important for further analysis, including machine learning-based classification31. On the other hand, DWT is employed to decompose EEG signals into multiple frequency bands, enabling better noise separation while retaining important neural patterns32. Meanwhile, ICA is particularly effective in isolating and removing artifacts such as eye blinks and muscle movements, thereby enhancing the reliability of extracted features33.

We employ multiple evaluation metrics to assess the performance of these preprocessing techniques. Mean Absolute Error (MAE) and Mean Squared Error (MSE) are calculated to quantify the deviation between the processed signals and reference signals, indicating the denoising effectiveness. Power Spectral Density (PSD) analysis is conducted to examine power distribution across different frequency bands, offering insights into neural activity patterns related to ASD. Signal-to-Noise Ratio (SNR) is also measured to evaluate the relative strength of neural signals against background noise, ensuring that preprocessing methods enhance signal clarity.

Furthermore, Spectral Entropy (SE) is utilized to assess the complexity and unpredictability of the EEG power spectrum, which reflects cognitive and neural state variations. We compute Hjorth parameters: activity, mobility, and complexity to gain deeper insights into ASD-related neural characteristics. Activity represents the signal power, mobility describes frequency variability, and complexity captures the irregularity of neural oscillations. These parameters comprehensively characterize EEG dynamics, which can serve as potential biomarkers for ASD identification.

We aim to determine the most effective preprocessing method for ASD-related EEG studies through a comparative analysis of Butterworth filtering, DWT, and ICA. Our key contributions include:

  1. A detailed evaluation of EEG preprocessing techniques, comparing Butterworth, DWT, and ICA regarding their impact on signal quality and feature extraction.

  2. Implementation of MAE, MSE, PSD, SNR, and SE analysis to quantitatively assess the effectiveness of each preprocessing approach.

  3. Extraction and comparison of Hjorth parameters (activity, mobility, complexity) to understand the distinct neural patterns in ASD and neurotypical (normal) individuals.

The findings of this study provide critical insights into selecting optimal preprocessing techniques for ASD-related EEG analysis. By identifying the most effective method for preserving relevant neural information while minimizing artifacts, this research improves the reliability of EEG-based ASD diagnosis and potential therapeutic interventions.

Materials and methods

Our methodology is designed to systematically capture, refine, and analyze EEG signals to explore neural patterns related to ASD. The process unfolds in three main stages: first, acquiring the EEG dataset to ensure high-quality signal collection; next, applying denoising techniques to minimize interference and enhance signal clarity; and finally, employing a range of evaluation metrics to assess and interpret signal characteristics, offering insights into key neural dynamics (see Fig. 1 for an overview of these stages).

Data acquisition

The dataset for this research comprises EEG recordings, a non-invasive method widely used to capture neural oscillations and assess brain activity. EEG is particularly valuable in cognitive neuroscience and clinical studies, including ASD research, for detecting abnormalities in neural patterns. In this study, EEG data were collected from ten participants: five children diagnosed with ASD, aged 6–10 years, from Sekolah Luar Biasa (SLB) Negeri Kota Banda Aceh, and five healthy adults over 17 years old from Universitas Syiah Kuala in Banda Aceh, Indonesia34. Data acquisition was conducted across 16 electrode sites (Fp1, Fp2, F3, F4, F7, F8, C3, C4, T3, T4, T5, T6, P3, P4, O1, and O2) under controlled conditions designed to promote participant comfort. Recordings occurred from 09:00 to 12:00 WIT to ensure participants were alert and calm. Each session lasted approximately 15 min across eight channels and was repeated to cover all 16 channels, yielding a cumulative recording time of about 40 min. A 10-min setup period was included to position the electrodes accurately.

Various tools were employed for EEG data collection, with the primary hardware being the OpenBCI Cyton Biosensing Board, which allows for eight channels of EEG recording per session. The data recording setup also included an OpenBCI USB Dongle to wirelessly transmit data from the Cyton board to a computer. Other essential components included a 3.7V 500 mAh battery to power the Cyton board, an EEG cap to position electrodes on the participant’s scalp, and an EEG electrode gel to improve electrode contact and signal quality. A personal computer equipped with OpenBCI software served as the interface for real-time data visualization and processing.

Regarding physical setup, the data collection environment was carefully configured to ensure both functionality and comfort. The room measured approximately 3–5 m on one side and 2–4 m on the other, spacious enough to accommodate both equipment and participants comfortably. Within this space, a desk and chair were arranged, with the desk placed centrally and the chair 30 cm away from it, so that participants faced a blank wall, minimizing distractions and ensuring consistent orientation across sessions. The Cyton Biosensing Board, designed by OpenBCI, includes a PIC32MX250F128B microcontroller, a 3-axis accelerometer (LIS3DH), a high signal-to-noise ratio (SNR) of 121 dB, and a voltage resolution of 0.298 microvolts per bit. The board also supports multiple connectivity options (USB, Bluetooth 4.0, and WiFi), enabling seamless data transmission to a computer for further processing. This device, compatible with open-source software such as the OpenBCI GUI and BrainFlow, enables detailed analysis and visualization of EEG signals, facilitating comprehensive data assessment and enhancing the overall reliability of the study.

Ethical statement

The dataset was ethically cleared by the Ethical Clearance Committee, adhering to WHO 2011 standards (reference number 117/EA/FK/2024). This dataset serves as a valuable resource for further research, as it provides comprehensive EEG recordings suitable for various applications in ASD studies. This resource can support broader EEG research initiatives, enhancing collaborative efforts across ASD studies and related fields.

Signal denoising

Butterworth filter

The Butterworth filter is a signal processing filter designed to exhibit a maximally flat frequency response in the passband35,36. Unlike Chebyshev or elliptic filters, it shows no ripples in either the passband or stopband, making it highly suitable for applications requiring minimal distortion37,38. Initially introduced by Stephen Butterworth in 193039, this filter was developed to approximate an ideal filter by increasing the filter order and optimizing component selection.

In EEG analysis, the Butterworth filter is widely used to reduce noise while preserving critical signal features18,40. Its smooth response helps maintain the integrity of brainwave signals, which is crucial in clinical assessments of neurological conditions such as ASD41. Digital implementations commonly use the bilinear or matched Z-transform, often realized as cascaded biquad sections to ensure numerical stability42.

The magnitude responses of the low-pass and high-pass Butterworth filters are defined by

$$\begin{aligned} |H_{n,lp}(j\omega )| = \frac{1}{\sqrt{1+(\frac{\omega }{\omega _{c}})^{2n}}}, \quad |H_{n,hp}(j\omega )| = \frac{1}{\sqrt{1+(\frac{\omega _{c}}{\omega })^{2n}}}, \end{aligned}$$
(1)

where \(\omega _c\) denotes the cut-off frequency and n is the filter order, which controls the sharpness of the frequency response.
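As an illustration of Eq. (1) and of the cascaded-biquad realization mentioned above, the following is a minimal sketch using SciPy. The sampling rate of 250 Hz, the filter order of 4, the zero-phase (forward-backward) application, and the helper names `butter_lowpass_gain` and `bandpass_eeg` are assumptions made for this example; they are not settings reported in this study.

```python
import numpy as np
from scipy import signal

FS = 250.0  # assumed sampling rate in Hz (not stated in the text)

def butter_lowpass_gain(freq_hz, cutoff_hz, order):
    """Analog low-pass magnitude |H(jw)| from Eq. (1)."""
    ratio = np.asarray(freq_hz, dtype=float) / cutoff_hz
    return 1.0 / np.sqrt(1.0 + ratio ** (2 * order))

def bandpass_eeg(x, low_hz=0.5, high_hz=40.0, order=4, fs=FS):
    """Zero-phase Butterworth band-pass realized as second-order sections."""
    sos = signal.butter(order, [low_hz, high_hz], btype="bandpass",
                        fs=fs, output="sos")
    # Forward-backward filtering cancels the filter's phase shift.
    return signal.sosfiltfilt(sos, x, axis=-1)

# Synthetic test signal: a 10 Hz "alpha" rhythm plus 50 Hz mains interference.
t = np.arange(int(10 * FS)) / FS
alpha = np.sin(2 * np.pi * 10 * t)
mains = 0.5 * np.sin(2 * np.pi * 50 * t)
filtered = bandpass_eeg(alpha + mains)

gain_at_cutoff = butter_lowpass_gain(40.0, 40.0, order=4)  # 1/sqrt(2)
residual_rms = np.sqrt(np.mean((filtered - alpha) ** 2))
mains_rms = np.sqrt(np.mean(mains ** 2))
```

At the cut-off frequency the gain is \(1/\sqrt{2}\) for any order, while a larger order steepens the roll-off beyond it. The second-order-section output is chosen here because it tends to be numerically better behaved than a single high-order transfer function.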

Given its ability to suppress unwanted frequency components while preserving the signal of interest, the Butterworth filter plays a crucial role in signal preprocessing across various applications, including communication systems43, ECG enhancement44, and EEG-based ASD detection45,46.

Discrete Wavelet Transform (DWT)

DWT is a widely used signal processing technique that decomposes a signal into different frequency components while preserving temporal resolution32. Unlike the Fourier Transform, which provides only frequency information, DWT enables both time and frequency localization, making it particularly effective for analyzing non-stationary signals such as EEG.

DWT operates by passing a signal \(x[n]\) through a series of high-pass and low-pass filters, yielding approximation (\(A\)) and detail (\(D\)) coefficients. The decomposition follows a dyadic structure: each successive level halves the bandwidth of the approximation, providing finer frequency resolution at lower frequencies. Mathematically, the approximation and detail coefficients at level \(j\) are computed as follows:

$$\begin{aligned} & A_j[n] = \sum _{k} A_{j-1}[k] \cdot g[2n - k], \end{aligned}$$
(2)
$$\begin{aligned} & D_j[n] = \sum _{k} A_{j-1}[k] \cdot h[2n - k], \end{aligned}$$
(3)

where \(A_0[k] = x[k]\) is the input signal, \(g[n]\) is the low-pass (scaling) filter, and \(h[n]\) is the high-pass (wavelet) filter. The approximation coefficients capture low-frequency components, while the detail coefficients represent high-frequency components. Applying this decomposition recursively to the approximation coefficients yields a multi-level wavelet decomposition.
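The filter-and-downsample recursion of Eqs. (2) and (3) can be sketched in a few lines of NumPy. For brevity, this example assumes the two-tap Haar wavelet rather than a longer family, and the function names `dwt_step` and `haar_wavedec` are hypothetical; the retained samples are also offset by one index relative to Eqs. (2) and (3), which does not change the structure of the decomposition.

```python
import numpy as np

# Haar analysis filters (assumed here for brevity).
G = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass (scaling) filter g[n]
H = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass (wavelet) filter h[n]

def dwt_step(x):
    """One level: convolve with g and h, then keep every second sample."""
    approx = np.convolve(x, G)[1::2]
    detail = np.convolve(x, H)[1::2]
    return approx, detail

def haar_wavedec(x, levels):
    """Multi-level decomposition; returns [A_J, D_J, ..., D_1]."""
    x = np.asarray(x, dtype=float)
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    approx = x
    for _ in range(levels):
        # Recurse on the approximation only, as in the dyadic scheme.
        approx, detail = dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

rng = np.random.default_rng(0)
x = rng.normal(size=64)
coeffs = haar_wavedec(x, levels=3)
lengths = [len(c) for c in coeffs]
energy_in = float(np.sum(x ** 2))
energy_out = float(sum(np.sum(c ** 2) for c in coeffs))
```

Because the Haar filters are orthonormal, the total energy of the coefficients equals that of the input, which is a convenient sanity check for any implementation of the recursion.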

Alsharabi et al. researched DWT’s effectiveness for EEG feature extraction, demonstrating its ability to isolate significant signal components relevant to neurological conditions47. Their study highlights the advantages of wavelet-based analysis over traditional methods in identifying EEG abnormalities. Similarly, Alturki et al. explored the application of DWT to classify EEG signals in ASD diagnosis, showing improved accuracy compared to conventional features in the time domain48.

For EEG analysis, DWT is particularly useful in extracting relevant characteristics to detect neurological disorders such as ASD. Different wavelet families, such as Daubechies (e.g., db4), Coiflet, and Symlet, are commonly employed to analyze EEG signals in multiple frequency bands, including delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), and gamma (>30 Hz). Feature extraction involves statistical measures such as the mean, variance, entropy, and energy of the wavelet coefficients, which can be utilized for classification tasks using machine learning models.

Independent Component Analysis (ICA)

ICA is a statistical technique applied in the processing of EEG signals to separate mixed signals into statistically independent components33. This method is particularly effective in removing artifacts such as ocular movements, muscle noise, and power line interference, thus improving the quality of EEG signals49. ICA operates under the assumption that the observed EEG signals are linear mixtures of underlying independent sources. Mathematically, the ICA model can be formulated as:

$$\begin{aligned} X = A S, \end{aligned}$$
(4)

where \(X\) represents the observed EEG signal matrix, \(A\) is the unknown mixing matrix, and \(S\) contains the independent source signals. The objective of ICA is to estimate an unmixing matrix \(W\) such that:

$$\begin{aligned} S = W X, \end{aligned}$$
(5)

where \(S\) represents the independent components, including neural activity and unwanted artifacts. The separation of these components is achieved through an optimization process that maximizes the statistical independence of the sources, often utilizing algorithms such as FastICA and Infomax ICA50. Once ICA decomposes the EEG signal, each Independent Component (IC) is analyzed to identify and remove artifacts, thereby improving signal quality for subsequent feature extraction and classification51. In EEG-based ASD studies, ICA plays a crucial role in preprocessing, ensuring that the analysis focuses on neural activity rather than contamination from non-neural sources52.
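The linear algebra behind Eqs. (4) and (5), together with the removal of an artifact component, can be illustrated with a toy example. This sketch does not estimate the unmixing matrix from the data as FastICA or Infomax would; instead, it assumes that the mixing matrix \(A\) is known and takes \(W\) as its pseudo-inverse, purely to show how a component is removed and the channels are reconstructed. All signals, matrix values, and the 250 Hz sampling rate are synthetic assumptions.

```python
import numpy as np

fs = 250.0  # assumed sampling rate in Hz
t = np.arange(int(2 * fs)) / fs

# Two synthetic sources: a 10 Hz rhythm and a blink-like pulse.
neural = np.sin(2 * np.pi * 10 * t)
blink = 5.0 * (np.abs(t - 1.0) < 0.1)
S = np.vstack([neural, blink])           # sources, shape (2, n_samples)

# Hypothetical mixing matrix: 3 channels observing 2 sources (Eq. 4).
A = np.array([[1.0, 0.8],
              [0.6, 0.3],
              [0.9, 0.1]])
X = A @ S                                # observed channels

# Unmixing (Eq. 5). Here W comes from the known A; a real ICA
# algorithm would have to estimate W from X alone.
W = np.linalg.pinv(A)
S_hat = W @ X

# Zero the artifact component and project back to channel space.
S_clean = S_hat.copy()
S_clean[1, :] = 0.0
X_clean = A @ S_clean
```

In practice the recovered components are determined only up to order and scale, so identifying which component is the artifact requires inspection or an automated criterion.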

Evaluation criteria for denoising

Signal-to-Noise Ratio (SNR)

SNR in EEG signals measures signal quality, indicating the ratio between the primary signal’s power and unwanted noise’s power53. Raw EEG signals typically contain high levels of noise that can obscure the actual brain signals, making filtering necessary to reduce this noise for more accurate signal analysis. Noise in EEG signals originates from two main sources: external noise, such as electromagnetic interference from the environment, and internal noise resulting from other brain activities outside the study’s focus. High noise levels in EEG signals present challenges in data analysis and interpretation, so improving SNR is essential for ensuring reliable data quality.

Methods to enhance SNR in EEG signals include reducing external noise and separating internal noise from the relevant signal54. External noise can be minimized by using high-quality equipment, such as wet electrodes that are more resistant to external interference, and by removing devices that produce electromagnetic interference from the recording area. Internal noise, often caused by participant movements like blinking or muscle tension, can be reduced through an experimental protocol that keeps the participant focused on a specific task and provides rest breaks to alleviate physical discomfort during recording.

After EEG data are recorded, various post-recording techniques are employed to further improve SNR, such as Blind Source Separation (BSS) algorithms, which can identify and separate noise from the primary signal55. In modern EEG analysis, this approach is often combined with manual cleaning by EEG experts to ensure that segments of the signal containing artifacts and noise are removed. Mathematically, SNR can be calculated using the following formula:

$$\begin{aligned} SNR=\frac{P_{signal}}{P_{noise}}, \end{aligned}$$
(6)

where \(P_{signal}\) represents the power of the desired signal and \(P_{noise}\) is the power of the noise. A higher SNR value indicates better signal quality, yielding more reliable data for further analysis.
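A minimal sketch of Eq. (6) follows, including the common conversion to decibels, \(10\log_{10}(P_{signal}/P_{noise})\). How the signal and noise estimates are obtained is an assumption of this example: one hypothetical convention treats the denoised output as the signal and the removed residual (raw minus denoised) as the noise. The function names are likewise illustrative.

```python
import numpy as np

def snr_linear(signal_est, noise_est):
    """Ratio of mean signal power to mean noise power (Eq. 6)."""
    p_signal = np.mean(np.square(signal_est))
    p_noise = np.mean(np.square(noise_est))
    return float(p_signal / p_noise)

def snr_db(signal_est, noise_est):
    """The same ratio expressed in decibels."""
    return float(10.0 * np.log10(snr_linear(signal_est, noise_est)))

# Toy values: unit-power signal and a residual with power 0.01.
denoised = np.array([1.0, -1.0, 1.0, -1.0])
residual = 0.1 * np.array([1.0, 1.0, -1.0, -1.0])

ratio = snr_linear(denoised, residual)   # approximately 100
ratio_db = snr_db(denoised, residual)    # approximately 20 dB
```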

Mean Absolute Error (MAE)

MAE is a valuable metric for assessing the accuracy of predictions in EEG signal analysis, offering a measure of the average error between actual EEG signal values and predicted values56. In the context of EEG, MAE quantifies the mean absolute difference between the true signal value at each time point and the value predicted by a model. Unlike other error measures, such as Mean Squared Error (MSE), MAE is less sensitive to extreme outliers, making it particularly useful for EEG data containing abrupt fluctuations due to noise or sudden neural events57. In EEG analysis, MAE helps researchers and clinicians understand the magnitude of prediction errors without factoring in their direction58,59. This is helpful in models where both positive and negative prediction errors are expected.

The MAE formula is:

$$\begin{aligned} \text {MAE} = \frac{1}{n} \sum _{i=1}^{n} |y_i - \hat{y}_i|, \end{aligned}$$
(7)

where \(n\) is the total number of observations or EEG samples, \(y_i\) is the actual value at the \(i\)-th sample, and \(\hat{y}_i\) is the predicted value at the same sample.
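The following sketch computes Eq. (7) and contrasts it with the squared-error average on a toy example, to make the outlier-sensitivity argument concrete. The arrays are synthetic, and the function names are chosen for this illustration only.

```python
import numpy as np

def mean_absolute_error(y, y_hat):
    """Average absolute deviation between reference and estimate (Eq. 7)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y - y_hat)))

def mean_squared_error(y, y_hat):
    """Average squared deviation, shown only for comparison."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.mean((y - y_hat) ** 2))

reference = np.array([1.0, 2.0, 3.0, 4.0])

# Case 1: every sample is off by 1.
uniform = reference + 1.0
# Case 2: only the last sample is off, by 10.
outlier = reference.copy()
outlier[-1] += 10.0

mae_uniform = mean_absolute_error(reference, uniform)  # 1.0
mae_outlier = mean_absolute_error(reference, outlier)  # 2.5
mse_uniform = mean_squared_error(reference, uniform)   # 1.0
mse_outlier = mean_squared_error(reference, outlier)   # 25.0
```

A single large deviation raises the MAE by a factor of 2.5 in this example but raises the squared-error average by a factor of 25, which is why MAE is regarded as the more outlier-tolerant of the two.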

Mean Squared Error (MSE)

MSE is a statistical measure of the average squared difference between estimated and actual values60. In EEG analysis, it quantifies the difference between recorded EEG data and its predicted or expected values, thereby providing insight into the accuracy of the model used for signal analysis. This metric functions as a risk measure, representing the expected value of the squared error loss. In practice, MSE is almost always strictly positive, owing to inherent randomness or limitations in the model’s ability to capture all EEG signal characteristics5.

The MSE formula is:

$$\begin{aligned} \text {MSE} = \frac{1}{n} \sum _{i=1}^{n} {|y_i - \hat{y}_i|}^2, \end{aligned}$$
(8)

where n represents the total number of EEG signal samples, \(y_i\) denotes the actual (observed) EEG signal value at the i-th data point, and \(\hat{y}_i\) refers to the predicted EEG signal value at the same data point, as generated by the model or estimator. Comparing \(y_i\) with \(\hat{y}_i\) at every data point yields the MSE, which measures how closely the predictions match the actual EEG measurements and thus the quality of an estimator in EEG signal processing5.

Because it derives from the squared Euclidean distance, MSE is always non-negative and approaches zero as the error decreases61. It represents the error’s second moment (about the origin), capturing both the estimator’s variance and its bias. For unbiased estimators, the MSE equals the estimator’s variance. Like variance, MSE has the units of the square of the estimated quantity. The square root of MSE gives the Root-Mean-Square Error (RMSE), which has the same units as the original quantity and corresponds to the standard error in the case of an unbiased estimator.
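The relation between MSE, variance, and bias described above can be checked numerically. In this sketch, a hypothetical estimator of a fixed quantity is simulated with a deliberate bias of 0.5 and a standard deviation of 2; these values, like the variable names, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

true_value = 10.0
# Simulated estimates: biased by +0.5, with standard deviation 2.
estimates = true_value + 0.5 + rng.normal(0.0, 2.0, size=10_000)

errors = estimates - true_value
mse = float(np.mean(errors ** 2))      # second moment of the error
bias = float(np.mean(errors))          # systematic offset
variance = float(np.var(estimates))    # spread around the sample mean
rmse = float(np.sqrt(mse))             # same units as the estimates

# Empirical decomposition: MSE = variance + bias^2.
decomposed = variance + bias ** 2
```

With the population variance (NumPy's default, `ddof=0`), the decomposition holds exactly on the sample, not only in expectation. Setting the bias to zero in the simulation makes the MSE coincide with the variance and the RMSE with the standard deviation.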

Spectral Entropy (SE)

SE is a powerful nonlinear method used to analyze the irregularity of an EEG signal’s power distribution across different frequency bands62. Grounded in the principles of Shannon entropy from information theory, SE quantifies the complexity of the EEG signal’s spectral content, providing insights into the underlying neural activities. The spectral entropy is computed by considering the normalized power spectrum of the EEG signal as a probability distribution63, where the spectral entropy SE is given by the formula:

$$\begin{aligned} \text {SE} =-\sum _{f}{p_f\log {(p_f)}}, \end{aligned}$$
(9)

where \(p_f\) represents the normalized power at frequency \(f\), so that \(\sum _f p_f = 1\). A high SE value indicates greater unpredictability and irregularity in the EEG signal, suggesting complex brain activity. In contrast, a low SE value signifies regularity, often associated with more stable states such as deep sleep or relaxation.
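Eq. (9) can be sketched as follows. The spectrum is estimated here with Welch's method from SciPy, and the natural logarithm, the 250 Hz sampling rate, the segment length, and the optional normalization by the logarithm of the number of frequency bins are all assumptions of this example rather than settings taken from the study.

```python
import numpy as np
from scipy import signal

def spectral_entropy(x, fs, nperseg=256, normalize=False):
    """Shannon entropy of the normalized power spectrum (Eq. 9)."""
    _, psd = signal.welch(x, fs=fs, nperseg=nperseg)
    p = psd / psd.sum()        # treat the spectrum as a distribution
    p = p[p > 0]               # skip empty bins to avoid log(0)
    se = -np.sum(p * np.log(p))
    if normalize:
        se /= np.log(psd.size)  # scale to the range [0, 1]
    return float(se)

fs = 250.0  # assumed sampling rate in Hz
t = np.arange(int(8 * fs)) / fs
rng = np.random.default_rng(2)

sine = np.sin(2 * np.pi * 10 * t)   # power concentrated at 10 Hz
noise = rng.normal(size=t.size)     # power spread over all bins

se_sine = spectral_entropy(sine, fs)
se_noise = spectral_entropy(noise, fs)
se_noise_norm = spectral_entropy(noise, fs, normalize=True)
```

The sinusoid, whose power sits in a few adjacent bins, gives a low entropy, whereas white noise approaches the maximum for the chosen number of bins. The absolute value therefore depends on the spectral resolution unless the normalized form is used.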

The application of spectral entropy in EEG analysis is particularly valuable in studying various brain conditions, such as epilepsy and other neurological disorders. By examining the spectral complexity of EEG signals, researchers can gain insights into the dynamic states of the brain, facilitating the detection of abnormal patterns indicative of epileptiform activity64. Additionally, spectral entropy serves as a helpful feature in classifying different cognitive states and assessing the impact of interventions in neurofeedback and brain-computer interface studies.

Power Spectral Density (PSD)

PSD is a fundamental method used to analyze EEG data, providing a spectral representation of how power is distributed across different frequency components of the EEG signal65. It quantifies the frequency content of the brain’s electrical activity, allowing researchers to differentiate between noise and significant features within the EEG recordings12,65. The PSD is typically estimated using Thomson’s multitaper spectral estimation method, which is favored for its ability to produce lower variance and increased frequency resolution compared to other techniques, such as Welch’s method or simple Fast Fourier Transform (FFT)66. The mathematical representation of PSD can be expressed as:

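$$\begin{aligned} PSD(f) = \lim _{N \rightarrow \infty } \mathbb {E}\left[ \frac{1}{N} \left| \sum _{n=0}^{N-1} x[n] e^{-j 2 \pi f n} \right| ^2 \right] , \end{aligned}$$
(10)

where x[n] represents the discrete EEG signal, N is the number of samples in the analysis window, f denotes the normalized frequency in cycles per sample, and \(\mathbb {E}[\cdot ]\) denotes the expectation.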

Segmenting the EEG data into shorter intervals, typically 30 seconds, yields more nearly stationary signals that allow robust feature extraction. PSD can then be computed per segment across multiple frequency bands, such as delta, theta, alpha, and beta, producing a comprehensive feature set that can be employed in machine learning classification tasks. For example, a 5-minute EEG recording yields ten non-overlapping 30-second segments, each contributing its own PSD features, which are instrumental in understanding brain dynamics and potential pathologies67.
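The segmentation and band-wise PSD computation described above can be sketched as follows. This example substitutes Welch's method for the multitaper estimator, since it is readily available in SciPy; the 250 Hz sampling rate, the 4-second Welch window, the band edges, and the helper names `split_segments` and `band_powers` are assumptions of the example.

```python
import numpy as np
from scipy import signal

BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}

def split_segments(x, fs, seg_seconds=30.0):
    """Cut a 1-D recording into non-overlapping fixed-length segments."""
    seg_len = int(round(seg_seconds * fs))
    n_seg = len(x) // seg_len
    return np.reshape(x[: n_seg * seg_len], (n_seg, seg_len))

def band_powers(segment, fs):
    """Integrate the Welch PSD over each band (rectangle rule)."""
    freqs, psd = signal.welch(segment, fs=fs, nperseg=int(4 * fs))
    df = freqs[1] - freqs[0]
    return {name: float(np.sum(psd[(freqs >= lo) & (freqs < hi)]) * df)
            for name, (lo, hi) in BANDS.items()}

fs = 250.0  # assumed sampling rate in Hz
t = np.arange(int(300 * fs)) / fs          # a 5-minute recording
rng = np.random.default_rng(3)
x = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.normal(size=t.size)

segments = split_segments(x, fs)           # ten 30-second segments
features = [band_powers(seg, fs) for seg in segments]
dominant = max(features[0], key=features[0].get)
```

Each segment contributes one power value per band, so the synthetic 5-minute recording yields a 10-by-4 feature table, with the alpha band dominating because of the 10 Hz component.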

Hjorth parameters

Hjorth parameters are statistical indicators that provide insights into the structure and variability of EEG signals by analyzing the data in the time domain68. Developed by Bo Hjorth in 197069, these parameters include Activity, Mobility, and Complexity and are commonly used in EEG signal processing to capture distinct characteristics of brainwave patterns. The parameters, also known as Normalized Slope Descriptors (NSDs), offer a low-complexity method for understanding EEG signal properties, which is helpful in various applications such as epilepsy monitoring and neurological studies70.

The first parameter, activity, represents the signal’s power and is equivalent to the variance of the EEG signal y(t). It quantifies the overall power or energy in the signal and can also indicate the area under the power spectrum in the frequency domain:

$$\begin{aligned} \text {Activity} = \text {var}(y(t)). \end{aligned}$$
(11)

Mobility measures the mean frequency of the signal and is defined as the square root of the variance of the first derivative of y(t) divided by the variance of y(t) itself. This parameter gives insight into the signal’s frequency content, essentially estimating the average rate of change in the signal:

$$\begin{aligned} \text {Mobility} = \sqrt{\frac{\text {var}\left( \frac{dy(t)}{dt}\right) }{\text {var}(y(t))}}. \end{aligned}$$
(12)

Lastly, Complexity compares the Mobility of the original signal with that of its first derivative, estimating how the signal frequency changes over time. A higher Complexity value suggests that the signal diverges more from a simple sine wave, indicating greater irregularity:

$$\begin{aligned} \text {Complexity} = \frac{\text {Mobility}\left( \frac{dy(t)}{dt}\right) }{\text {Mobility}(y(t))}. \end{aligned}$$
(13)

These parameters provide a comprehensive profile of EEG signal characteristics, enabling researchers to extract valuable features related to brain activity, cognitive states, and neural responses.
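Eqs. (11) to (13) translate directly into a few lines of NumPy. In this sketch, the derivative is approximated by a first difference scaled by the sampling rate, which is an assumption of the example; the function name `hjorth` and the test signal are illustrative.

```python
import numpy as np

def hjorth(y, fs=1.0):
    """Return (activity, mobility, complexity) of a 1-D signal."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y) * fs      # first derivative (finite difference)
    ddy = np.diff(dy) * fs    # second derivative
    activity = np.var(y)                                  # Eq. (11)
    mobility = np.sqrt(np.var(dy) / activity)             # Eq. (12)
    mobility_dy = np.sqrt(np.var(ddy) / np.var(dy))
    complexity = mobility_dy / mobility                   # Eq. (13)
    return float(activity), float(mobility), float(complexity)

# Check on a pure 10 Hz sine sampled at 1 kHz.
fs = 1000.0
t = np.arange(int(2 * fs)) / fs
sine = np.sin(2 * np.pi * 10 * t)
activity, mobility, complexity = hjorth(sine, fs=fs)
```

For a pure sinusoid, the activity is half the squared amplitude, the mobility is close to the angular frequency \(2\pi f\) (about 62.8 rad/s here), and the complexity is close to 1, consistent with the interpretation of complexity as deviation from a sine wave. If the scaling by `fs` is omitted, mobility is expressed per sample rather than per second, while complexity is unchanged.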

Results and discussion

In this study, three EEG preprocessing techniques, Butterworth, DWT, and ICA, were independently applied to assess their effectiveness in enhancing EEG signal quality for ASD analysis. Each method was executed separately using the same raw EEG dataset. The performance of each technique was then evaluated using metrics such as SNR, MAE, MSE, SE, PSD, and Hjorth parameters (activity, mobility, and complexity).

The Butterworth filter was employed for its flat frequency response characteristics, effectively attenuating high-frequency noise while preserving the integrity of neural signals. DWT was utilized to decompose EEG signals into multiple time-frequency resolution levels, which is beneficial in isolating non-stationary components typical of brain activity. Meanwhile, ICA was implemented to identify and remove statistically independent noise sources, such as eye blinks and muscle artifacts.

Figure 2. EEG signals before and after preprocessing using Butterworth, DWT, and ICA. (A) Original EEG signals. (B) After Butterworth filtering. (C) After DWT preprocessing. (D) After ICA preprocessing.

Figure 2 illustrates the results of each preprocessing method. Figure 2A displays the original EEG signals, where significant noise and oscillations are visible across multiple channels. Figure 2B shows the output after applying the Butterworth filter, where the signal appears smoother and less noisy. Figure 2C presents the DWT-processed signals, which retain more of the original frequency characteristics but with some noise reduction. Figure 2D depicts the ICA-preprocessed signals, showing improved separation of artifacts in specific channels but with varying residual noise levels.

This comparison provides valuable insight into each preprocessing approach’s individual strengths and limitations, enabling a more informed selection for EEG-based ASD signal analysis.

Results of denoising

To gain deeper insight into the impact of Butterworth filtering, DWT, and ICA on EEG signal characteristics, recordings from individuals with ASD were compared with those from neurotypical (normal) individuals. This analysis evaluated key statistical and signal-based parameters across multiple electrode positions to assess the quality, complexity, and variability of the preprocessed EEG signals.

Results of Butterworth

Table 1 presents the comparative performance of the Butterworth filter applied to EEG data from normal and ASD individuals across multiple electrodes. Normal participants exhibited a higher average SNR (62.48) than ASD participants (57.73), suggesting a clearer signal after filtering for the normal group. The error metrics were nevertheless also higher in the normal group (MAE: 15054.43; MSE: 350,566) than in the ASD group (MAE: 11551.17; MSE: 180,454), which suggests that the larger absolute and squared deviations stem primarily from more pronounced original signal amplitudes in the normal data rather than from poorer filtering.

The SE values are close between the groups, with ASD individuals showing a slightly higher average (3.28 vs. 3.15), implying marginally more complex or irregular frequency components in their brain activity. PSD followed a similar pattern, with ASD subjects showing a higher average value (0.45 vs. 0.41), indicating greater overall power in their EEG signals after preprocessing.
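To make the notion of spectral entropy concrete, the following sketch computes the Shannon entropy of a normalized power spectrum. It deliberately uses a plain FFT periodogram rather than the multitaper estimator used in this study, reports entropy in bits, and drops the DC bin; these choices and the function name `spectral_entropy` are assumptions made for illustration, so the absolute values are not comparable with those in the tables.

```python
import numpy as np

def spectral_entropy(x):
    """Shannon entropy (bits) of the normalized periodogram of x (DC removed)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    power = power[1:]                    # drop the DC bin
    p = power / power.sum()              # treat the spectrum as a probability distribution
    p = p[p > 0]                         # skip empty bins so log2 is defined
    return -np.sum(p * np.log2(p))

fs = 256.0                               # assumed sampling rate in Hz
t = np.arange(1024) / fs
rng = np.random.default_rng(0)
se_sine = spectral_entropy(np.sin(2 * np.pi * 10 * t))    # power in one bin
se_noise = spectral_entropy(rng.standard_normal(1024))    # power spread over all bins
```

A signal whose power sits in a single frequency bin has entropy near zero, whereas broadband noise approaches the maximum of log2 of the number of bins. Higher SE therefore indicates a flatter, less predictable spectrum.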

Hjorth parameters offered more nuanced insights. Activity was higher in the ASD group (192.75) than in the normal group (52.82), suggesting greater signal variation or energy. Similarly, mobility (0.51 vs. 0.43) and complexity (1.69 vs. 1.53) were elevated in the ASD group, reflecting a more dynamic and complex underlying neural signal. This may imply that while the Butterworth filter preserved signal complexity and variation, these characteristics are inherently more pronounced in the ASD EEG data, possibly correlating with atypical neural processing in ASD individuals.
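For readers who wish to reproduce these descriptors, the three Hjorth parameters can be sketched from their standard definitions: activity is the signal variance, mobility is the square root of the ratio between the variance of the first derivative and the variance of the signal, and complexity is the mobility of the derivative divided by the mobility of the signal. The snippet approximates derivatives by first differences without dividing by the sampling interval, which is an assumption that only rescales mobility; the function name is illustrative.

```python
import numpy as np

def hjorth_parameters(x):
    """Return (activity, mobility, complexity) of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                      # first difference as a derivative estimate
    ddx = np.diff(dx)                    # second difference
    var_x, var_dx, var_ddx = np.var(x), np.var(dx), np.var(ddx)
    activity = var_x
    mobility = np.sqrt(var_dx / var_x)
    complexity = np.sqrt(var_ddx / var_dx) / mobility
    return activity, mobility, complexity

t = np.arange(1024) / 256.0              # 4 s at an assumed 256 Hz
sine = 5.0 * np.sin(2 * np.pi * 10 * t)
act, mob, comp = hjorth_parameters(sine)
# Rescaling to unit variance changes activity only.
act_z, mob_z, comp_z = hjorth_parameters(sine / np.std(sine))
```

A pure sinusoid has complexity close to 1, and any signal rescaled to unit variance has activity equal to 1 while mobility and complexity are unchanged. This is why preprocessing that normalizes variance yields identical activity values across groups.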

Table 1 Comparison of average values for EEG signal parameters using Butterworth.

Results of DWT

Table 2 shows that, in terms of SNR, the normal group exhibits a higher average value (3.0103) compared to the ASD group (3.0098), indicating that the quality of the signal is better preserved in the normal group after preprocessing. Although the EEG signals from the normal group may exhibit more distinct activity patterns, this complexity also contributes to a higher MAE value (6235.74) compared to the ASD group (4785.08), indicating greater absolute deviation from the original signal during preprocessing. This trend is also observed in MSE, where the normal group shows a higher value (601468) than the ASD group (309690), reflecting a more significant deviation in the squared errors of the normal group’s data.

For SE, the normal group exhibits a higher average value (2.6736), which indicates that the signal spectrum is more complex or less regular in the normal group compared to the ASD group, which has a lower average SE (1.0417). In contrast, the PSD is significantly lower in the normal group (6.9798) than in the ASD group (206.25), suggesting that the EEG signals from the ASD group contain greater power in their frequency spectrum after DWT processing.

Finally, notable differences between the two groups are observed in the Hjorth parameters. Activity is higher in the ASD group (28190.51) than in the normal group (2372.93), indicating greater signal variation in the ASD group. Mobility is likewise higher in the ASD group (1.7594 vs. 0.8134), whereas complexity is lower (1.0873 vs. 3.3508 in the normal group). These findings suggest that, after DWT processing, the ASD group’s signals carry more energy and fluctuate faster but have a more regular waveform shape, which may reflect underlying differences in brain processing between individuals with ASD and those without.
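The multiresolution idea behind DWT denoising can be sketched with a small NumPy-only implementation. The snippet uses the Haar wavelet, three decomposition levels, and soft thresholding of the detail coefficients; the wavelet family, level, and threshold are assumptions chosen for brevity and are not the configuration used to produce Table 2.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_idwt(approx, detail):
    """Invert one Haar level."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def haar_denoise(x, levels=3, threshold=0.5):
    """Soft-threshold the detail coefficients at each level, then reconstruct."""
    approx = np.asarray(x, dtype=float)
    if approx.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        # Soft thresholding shrinks small (noise-like) coefficients to zero.
        details.append(np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0))
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

ripple = 1.0 + np.tile([0.1, -0.1], 4)   # constant level plus a fast ripple
cleaned = haar_denoise(ripple, levels=2, threshold=0.5)
```

With a threshold of zero the transform reconstructs the input exactly, so all of the denoising effect comes from the thresholding rule applied to the detail coefficients.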

Table 2 Comparison of average values for EEG signal parameters using DWT.

Results of ICA

ICA aims to separate EEG signals into statistically independent components, allowing for isolating and removing artifacts such as eye blinks or muscle noise. Table 3 presents the average results across key signal parameters.

In terms of SNR, ICA improved signal clarity for both groups, with the normal group achieving an average of 86.44 and the ASD group 78.69. This indicates a consistently effective noise separation across both datasets, although the slightly higher SNR in the normal group may reflect cleaner baseline signals or fewer interfering components.

Regarding MAE and MSE, the two metrics diverged: the normal group presented a higher MAE (47142.89) but a lower MSE (506,827) than the ASD group (MAE: 14673.51; MSE: 582,521). The comparatively high MSE in the ASD group despite its lower MAE may suggest that while typical deviations are smaller (reflected in MAE), the ASD signals contain sporadic but large error spikes that inflate the MSE.
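How a lower MAE can coexist with a higher MSE is easiest to see on a toy example. The two synthetic error traces below are invented purely for illustration and are unrelated to the recorded data: one has a constant moderate error, the other is error-free except for a single large spike.

```python
import numpy as np

def mae_mse(errors):
    """Return (MAE, MSE) of an error trace."""
    errors = np.asarray(errors, dtype=float)
    return np.mean(np.abs(errors)), np.mean(errors ** 2)

steady = np.full(100, 2.0)        # constant moderate error on every sample
spiky = np.zeros(100)
spiky[0] = 100.0                  # a single large spike, zero error elsewhere

steady_mae, steady_mse = mae_mse(steady)
spiky_mae, spiky_mse = mae_mse(spiky)
```

The spiky trace has the smaller MAE (1.0 vs. 2.0) but the far larger MSE (100.0 vs. 4.0), because squaring weights rare large deviations much more heavily than frequent small ones.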

For SE, the normal group recorded a notably higher average (3.59) than the ASD group (2.20). This suggests that EEG signals in the normal group have more uniform or complex spectral distributions, whereas the ASD group exhibited more structured or predictable frequency components, possibly due to neural atypicality.

A significant distinction appears in the PSD results, where the ASD group showed a dramatically higher average (12.14) than the normal group (0.0029). This stark contrast likely reflects substantial differences in the energy of retained frequency components post-denoising and might imply hyper-synchronization or abnormal neural oscillations in ASD participants.

The Hjorth parameters further underscore these neural differences. While Activity values were normalized (1.00) across both groups due to preprocessing, Mobility was higher in the normal group (0.4958) than in the ASD group (0.1678), indicating more signal fluctuation or frequency spread in neurotypical EEGs. Complexity showed the opposite trend, with the ASD group showing a higher average (9.8792) than the normal group (3.6789), suggesting less regular signal dynamics and potentially greater instability or non-linearity in the brain signals of ASD individuals.

ICA demonstrates its robustness in artifact removal while preserving critical signal characteristics. The significant disparities in parameters like PSD and Hjorth complexity between ASD and normal groups point to its utility in uncovering meaningful neural differences relevant for classification and further neurological interpretation.

Table 3 Comparison of average values for EEG signal parameters using ICA.

Evaluation criteria

In this study, we utilize a range of evaluation criteria to assess EEG signal quality and characterize distinct neural patterns between ASD and normal groups. By leveraging this multi-faceted approach, we systematically compare EEG signal quality across different electrode positions, revealing underlying neurophysiological differences that may be characteristic of ASD.

Figure 3

Signal-to-Noise Ratio (SNR). (A) SNR Butterworth. (B) SNR DWT Filter. (C) SNR ICA Filter.

Figure 4

Mean Absolute Error (MAE). (A) MAE Butterworth. (B) MAE DWT Filter. (C) MAE ICA Filter.

Figure 5

Mean Squared Error (MSE). (A) MSE Butterworth. (B) MSE DWT Filter. (C) MSE ICA Filter.

Figure 6

Spectral Entropy (SE). (A) SE Butterworth. (B) SE DWT Filter. (C) SE ICA Filter.

Figure 7

Power Spectral Density (PSD). (A) PSD Butterworth. (B) PSD DWT Filter. (C) PSD ICA Filter.

Figure 8

Hjorth Activity. (A) Activity Hjorth Butterworth. (B) Activity Hjorth DWT Filter. (C) Activity Hjorth ICA Filter.

Figure 9

Hjorth Mobility. (A) Mobility Hjorth Butterworth. (B) Mobility Hjorth DWT Filter. (C) Mobility Hjorth ICA Filter.

Figure 10

Hjorth Complexity. (A) Complexity Hjorth Butterworth. (B) Complexity Hjorth DWT Filter. (C) Complexity Hjorth ICA Filter.

SNR

This study computed SNR values across multiple EEG channels for ASD and normal groups after applying Butterworth, DWT, and ICA, as depicted in Fig. 3.

In Fig. 3A, the SNR results for the Butterworth filter reveal moderate variability across channels. The normal group tends to exhibit higher SNR values, particularly in channels such as Fp1, Fp2, and P4, suggesting a relatively strong signal presence compared to background noise. Conversely, the autism group shows slightly lower and more stable SNR values throughout, indicating consistent but less pronounced noise suppression. This pattern implies that while the Butterworth filter retains signal fidelity, its performance may be limited in enhancing signal clarity for ASD-specific EEG data.

Figure 3B presents the SNR outcomes following DWT preprocessing. Interestingly, both groups exhibit very narrow ranges of SNR values, clustered around \(-3.010\) dB. Although the values are consistently low, their uniformity suggests that DWT achieves a balanced but conservative denoising effect. The lack of significant variance implies that signal and noise components may be proportionally attenuated, resulting in stable but suboptimal signal enhancement. While DWT excels in maintaining frequency integrity, it may require further tuning to optimize signal amplification relative to noise.
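The near-constant value has a simple reading if SNR is expressed in decibels as ten times the base-10 logarithm of a power ratio, which is an assumption about the definition used here:

\[
10\log_{10} 2 \approx 3.0103\ \text{dB}, \qquad 10\log_{10}\tfrac{1}{2} \approx -3.0103\ \text{dB}.
\]

An SNR of magnitude 3.010 dB in every channel therefore corresponds to a signal-to-noise power ratio of 2 (or 1/2 for the negative sign) regardless of the absolute signal level, which is consistent with signal and noise being scaled together rather than separated.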

In Fig. 3C, the ICA method demonstrates the highest SNR values across all techniques, with the normal group notably reaching peaks above 100 dB in several channels, such as O1, O2, and T4. The autism group also shows elevated SNRs, although slightly lower in magnitude and more variable across channels. This outcome highlights ICA’s superior capability in isolating and removing artifacts, significantly boosting signal clarity. However, the pronounced differences between groups suggest that ICA’s performance may be dataset-specific, potentially over-amplifying signals in non-ASD recordings.

Overall, the SNR analysis underscores the superior signal enhancement capabilities of ICA, especially in improving signal clarity post-artifact removal. DWT offers a uniform but relatively conservative improvement, while Butterworth shows moderate enhancement with group-dependent performance. These findings support the strategic use of ICA in ASD-focused EEG analysis, mainly when signal clarity is paramount for subsequent feature extraction or classification.

MAE

This study calculates MAE for each channel across autism and normal groups following preprocessing with Butterworth, DWT, and ICA. The resulting patterns are illustrated in Fig. 4, revealing significant differences in error distribution across methods and subject groups.

In Fig. 4A, the Butterworth filter shows considerable channel-wise variation in MAE values. Notably, the normal group exhibits markedly higher MAE in channels such as Fp1, C4, and P4, while the autism group maintains lower and more consistent values across most regions. This observation suggests that Butterworth filtering may be more effective at preserving signal fidelity in the autism group. However, the variability in the normal group indicates inconsistent artifact suppression, potentially due to higher baseline variability in non-ASD EEG recordings.

Figure 4B demonstrates the results of DWT-based preprocessing. Here, MAE values are generally lower for both groups, indicating improved noise reduction compared to the Butterworth approach. Channels such as C4, T6, and T3 reflect a more apparent separation between groups, where the autism group consistently shows slightly higher MAE than the normal group. This suggests that DWT successfully attenuates noise while preserving subtle features related to ASD, facilitating group differentiation and supporting its utility for EEG signal refinement.

The ICA results, presented in Fig. 4C, reveal a distinctive pattern. The autism group maintains low and relatively stable MAE across all channels, while the normal group exhibits extreme peaks, particularly in O1, O2, and P3, where MAE values surpass 120,000. These outliers may indicate an overcorrection effect, where ICA aggressively removes components interpreted as artifacts that may in fact contain relevant signal data in the normal group. Nevertheless, the stable and low error observed in the autism group demonstrates ICA’s potential to enhance ASD-specific signal clarity when adequately calibrated.

The MAE analysis is consistent with the findings from the MSE metric, highlighting DWT as the most balanced preprocessing technique with consistent noise suppression and feature preservation across both groups. ICA proves effective for ASD signals, though caution is advised due to possible overcorrection in non-ASD data. Butterworth filtering, while helpful, displays greater variability, warranting refinement for broader applicability. These results further emphasize the importance of selecting preprocessing methods that align with the neurophysiological characteristics of the target population in EEG-based autism research.

MSE

As illustrated in Fig. 5, the comparative analysis reveals distinct performance trends for each method. In Fig. 5A, the Butterworth filter exhibits considerable variability in MSE values across channels. While the autism group shows lower MSE in channels such as C4, F7, and T3, higher values are observed in regions like Fp1 and C4. This inconsistency suggests that although Butterworth filtering can preserve relevant frequency components, it may not consistently attenuate noise across all channels, highlighting the potential need for channel-specific adjustments or supplemental denoising techniques.

In contrast, Fig. 5B presents results for the DWT approach, demonstrating overall lower and more consistent MSE values across channels. The method effectively reduces signal distortion while maintaining the neural characteristics essential for distinguishing between autism and normal groups. This balance suggests that DWT is well-suited for EEG preprocessing in ASD-related studies, offering robust noise suppression and preservation of diagnostically relevant features.

Meanwhile, Fig. 5C depicts the performance of the ICA method, which shows a striking difference between the two groups. While MSE remains relatively low and uniform in the autism group, exceptionally high values appear in the normal group, particularly in channels O1 and O2. This disparity may indicate the over-removal of signal components in non-ASD subjects, potentially discarding meaningful brain activity along with artifacts. Nonetheless, ICA appears able to isolate artifacts effectively for autism-specific preprocessing when carefully calibrated.

The analysis of MSE across all three methods reveals that DWT offers the most balanced and reliable denoising performance. ICA shows strong potential for autism EEG preprocessing, though caution is warranted due to its impact on non-ASD signals. Butterworth filtering, while effective in specific scenarios, displays significant channel-wise variability. These observations underscore the importance of selecting and tailoring preprocessing strategies based on the target diagnostic context in EEG-based ASD research.

SE

Figure 6A displays the SE outcomes after applying the Butterworth filter. The autism group exhibits relatively high SE values in most channels, with noticeable fluctuations, particularly in O2 and F3. In contrast, the normal group maintains more stable and lower SE values across channels. This suggests that Butterworth filtering preserves a balanced level of frequency complexity in ASD data, potentially highlighting the altered neural dynamics characteristic of autism. The subtle differences between groups imply that while Butterworth filtering is adequate for general signal preservation, its ability to emphasize distinctions in frequency complexity is moderate.

In Fig. 6B, the SE values following DWT reveal a stark contrast between the two groups. The normal group consistently shows higher SE values, often exceeding 3.0, while the autism group remains uniformly low, around 1.0 or below across all channels. This clear separation suggests that DWT significantly enhances the ability to detect entropy-related differences in neural activity. The method’s multiresolution decomposition may effectively isolate frequency components, amplifying the contrast in complexity between autism and normal signals. Thus, DWT appears particularly well-suited for distinguishing reduced spectral variability in ASD populations.

Figure 6C presents the SE results using ICA. Similar to DWT, the normal group exhibits elevated SE values across most channels, frequently surpassing 3.5, whereas the autism group maintains notably lower values, generally ranging between 2.0 and 2.8. ICA’s ability to remove artifacts while preserving subtle frequency dynamics likely contributes to this differentiation. The results suggest that ICA enhances signal clarity and retains sufficient spectral detail to capture entropy differences between groups, making it a reliable method for SE-based analysis.

SE analysis reveals that DWT and ICA provide superior discrimination between autism and normal groups based on spectral complexity, with DWT offering the most pronounced contrast. While effective in maintaining general signal characteristics, Butterworth demonstrates a limited capacity for capturing entropy-based group distinctions. These findings highlight the importance of selecting preprocessing methods that align with the specific EEG features targeted in ASD diagnostics.

Power Spectral Density (PSD)

In Fig. 7A, the Butterworth filter demonstrates moderate differentiation between autism and normal groups. Although it maintains stable power values across EEG channels, the distinction between the two classes is not consistently evident. Specific channels such as O1 and F8 show higher PSD in the normal group, while Fp2 and F7 present higher PSD values for the autism group. This suggests that while the Butterworth filter effectively preserves frequency components, it may not sufficiently enhance discriminative features critical for ASD classification.

Conversely, Fig. 7B reveals that DWT significantly amplifies the PSD values in the autism group across nearly all channels, peaking around 500. In contrast, the normal group maintains consistently low PSD values. This stark contrast suggests that DWT is highly effective in isolating frequency-specific features associated with ASD, potentially due to its capacity to decompose EEG signals into multiple resolution levels, thereby enhancing noise separation and neural pattern retention.

Figure 7C presents the results for ICA, where the normal group exhibits relatively higher PSD values than the autism group, particularly in channels like O2. The overall PSD values are much lower than those obtained through DWT or Butterworth, indicating that ICA might aggressively suppress signal components alongside artifacts. While ICA is advantageous for artifact removal, its tendency to attenuate relevant EEG features could limit its effectiveness in distinguishing ASD-related neural dynamics.

Among the three preprocessing techniques, DWT yields the most pronounced differentiation between autism and normal groups regarding PSD, highlighting its potential as a powerful method for enhancing EEG signals in ASD detection.

Hjorth parameters

Hjorth activity analysis (Fig. 8) reveals that autistic individuals exhibit consistently higher activity levels across most EEG channels than the normal group, with a notable surge around channels F4, F7, and F8. This suggests more intense or unregulated neural firing in these brain areas. Under DWT (Fig. 8B), this separation becomes more pronounced: activity values in the autism group rise significantly, underscoring DWT’s ability to preserve ASD-specific signal features while suppressing noise. However, ICA (Fig. 8C) seems to overly flatten the signals, with activity values approximating 1.0 across all channels. This could reflect signal oversuppression.

For Hjorth mobility (Fig. 9), autism group signals again show higher values, especially in frontal and parietal regions, indicating faster frequency fluctuations. The effect is most evident after DWT processing (Fig. 9B), which nearly doubles mobility in the ASD group. In contrast, ICA (Fig. 9C) reveals higher mobility in the normal group, potentially due to preserved high-frequency noise or artifacts in the control EEG.

Hjorth complexity (Fig. 10) shows more varied patterns. With Butterworth filtering (Fig. 10A), ASD signals are more complex, particularly in the frontal and temporal lobes. DWT processing (Fig. 10B) changes this pattern significantly: ASD complexity becomes uniform and lower, whereas control signals show sharp variations. ICA results (Fig. 10C), however, suggest that ASD signals maintain a higher degree of irregularity, especially at channel P4, indicating ICA’s potential to preserve ASD-relevant signal structure.

Given inter-channel variability and method-specific behavior, we refrain from blanket statements such as ‘ASD universally higher/lower on Hjorth parameters’. Instead, we emphasize patterns contingent on the preprocessing: ICA consistently maximized SNR (dB); DWT minimized error while preserving band structure (reflected in PSD/entropy separability); Butterworth served as a baseline.

Future directions and common challenges

As a significant advancement in early diagnosis research for ASD, brain-signal analysis methods are attracting increasing attention. With the global prevalence of ASD on the rise, the demand for faster and more accurate diagnostic methods has grown more urgent. Traditional approaches, which predominantly rely on behavioral observations, face limitations in identifying ASD characteristics at an early stage, especially in individuals with mild symptoms. In this context, EEG signal analysis presents a promising approach to extracting distinctive brain activity patterns in individuals with ASD.

Table 4 presents a summary of several studies that utilized various denoising techniques to process EEG signals, highlighting the methodologies applied and the key findings from each investigation. The datasets encompassed diverse groups, including individuals with Mild Cognitive Impairment (MCI), Alzheimer’s Disease (AD), and those employed in emotion recognition tasks. A range of denoising methods was applied, including high-pass, low-pass, and Savitzky-Golay filters, as well as more sophisticated techniques such as Discrete Wavelet Transform (DWT), Empirical Mode Decomposition (EMD), and Graph Signal Processing (GSP). These approaches were implemented to mitigate noise and artifacts within the EEG signals, thereby enhancing the accuracy and reliability of subsequent analyses.

Table 4 Applications of denoising methods in EEG signal.

For instance, Ohal et al.71 employed various filtering techniques, achieving 92% classification accuracy for MCI detection, underscoring the efficacy of signal denoising in improving diagnostic outcomes. Pirrone et al.72 applied PSD analysis to EEG data from subjects with Alzheimer’s, obtaining classification accuracies exceeding 87%, which demonstrated the utility of PSD as a robust feature for Alzheimer’s detection. In the present study, Butterworth filtering, DWT, and ICA were each applied to preprocess EEG data from participants diagnosed with ASD and healthy controls. Preprocessing enhanced signal quality and revealed notable differences between the two groups when evaluated through a comprehensive set of criteria, including Hjorth parameters, MAE, MSE, PSD, SNR, and spectral entropy. The results of this study emphasize the critical role of signal preprocessing in improving the quality of EEG data, which in turn facilitates more accurate and reliable diagnostic methodologies for various neurological conditions.

This study presents an in-depth analysis of EEG signal characteristics in individuals with ASD by comparing preprocessing techniques using Butterworth, DWT, and ICA. Quantitative error evaluations were conducted using MAE and MSE; PSD (via Thomson’s multitaper method) was used to examine power distribution across frequency bands; and SNR, SE, and Hjorth parameters were extracted to enhance interpretability. The findings indicate significant differences in signal complexity and neural activity between ASD and control groups, suggesting atypical brain dynamics in ASD.

Beyond these results, our standardized head-to-head comparison provides practical guidance for method selection. DWT showed lower errors (MAE/MSE) while preserving spectral structure (PSD/SE), making it a pragmatic choice for routine or rapid processing; ICA delivered the highest SNR and is best used when maximizing signal clarity is required; and Butterworth remains a useful baseline band-pass stage prior to feature extraction or further denoising. By clarifying when each method is preferable, this study offers a concrete foundation for developing EEG-based tools to support early screening and clinical decision-making in ASD diagnostics.

Feasibility in real-world workflows hinges on computational and operational constraints. For 16-channel recordings, DWT is computationally lightweight and runs on standard workstations without specialized accelerators, and a simple Butterworth stage is effective as a first-pass band-pass before DWT and feature extraction. ICA is most suitable for confirmatory analysis where maximizing signal clarity is prioritized. Although feasible on typical clinical PCs, ICA decomposition is iterative and often benefits from manual review of independent components prior to interpretation. Anticipated implementation challenges include variability in electrode placement/impedance (especially in children), motion/ocular artifacts, multi-session montage effects (two 8-channel recordings to cover 16 electrodes), and site-to-site differences in acquisition protocols and filtering.

To support large-scale screening, we advocate a two-step pathway: a lightweight Butterworth + DWT pipeline for screening/triage, followed by ICA-assisted confirmatory review when needed. This division aligns with contemporary ASD EEG pipelines that consume spectro-temporal representations22,23, and it minimizes hardware burden while preserving interpretability. Finally, standardizing units and metrics (SNR, MAE/MSE, PSD in \(\mu\)V²/Hz, SE, Hjorth) and documenting QC steps (impedance checks, artifact handling, segment length) will facilitate multi-center deployment and reproducibility.

To enhance the precision and applicability of this work, we explicitly acknowledge several limitations and outline concrete steps for replication. The current sample is small (n = 10) and age-imbalanced (ASD 6-10 years vs. controls \(\ge\) 17 years), which may introduce age confounding, and ASD subtypes were not recorded. Acquisition-related sources of bias include covering 16 electrodes via two 8-channel sessions per participant (possible session effects). Clinical covariates (e.g., comorbidities, biomarkers such as CRP/vitamin A, diet) were not collected, precluding neurophysiological-clinical coupling analyses. Results may also be sensitive to preprocessing choices (Butterworth band/order, wavelet family/level, ICA algorithm and component review) and PSD settings (multitaper parameters). To improve generalizability and reproducibility, future work will recruit a larger, age-matched, multi-center cohort, document ASD subtypes and comorbidities, harmonize acquisition protocols (montage size), and evaluate hybrid preprocessing strategies (e.g., ICA-initialized wavelet) alongside formal parameter-sensitivity analyses.

A multimodal approach to EEG analysis presents both a challenge and an exciting avenue for future research. Combining EEG data with other physiological measurements such as Heart Rate Variability (HRV), Galvanic Skin Response (GSR), or respiratory pattern analysis could provide a more holistic understanding of the autism spectrum and its relationship to individual physiological responses. With multimodal data, models could be developed to detect common ASD comorbidities, such as anxiety disorders or sleep disturbances, which are difficult to identify solely from EEG analysis.

However, developing increasingly complex models brings significant challenges. One of the most prevalent challenges lies in EEG acquisition and signal processing, as the data is prone to noise and artifacts. Addressing this issue will require advanced processing techniques to enhance data quality. For example, deep learning methods like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) can be applied in the preprocessing stage to detect and eliminate noise automatically.

In terms of modeling, the diverse characteristics of individuals on the autism spectrum necessitate a personalized approach to detection algorithms. Since not all ASD individuals exhibit uniform EEG patterns, adaptable Machine Learning approaches capable of individualized detection are crucial for ensuring diagnostic accuracy and reliability. Additionally, model interpretability is becoming increasingly important, particularly in making technology-based insights accessible to healthcare professionals in clinical settings. Explainable AI (XAI) models may be employed to enhance transparency and help make analysis results more interpretable for medical stakeholders12.

Moreover, data security and privacy of EEG recordings present major challenges, particularly in data collection involving children and ASD individuals. This includes strict data protection measures that align with international research ethics standards and privacy regulations. Future research should focus on developing stronger encryption techniques and robust security protocols to safeguard this sensitive data effectively.

Overall, research into EEG-based ASD detection faces various technical, methodological, and ethical challenges. Nonetheless, advancements in signal processing, machine learning, and security systems offer promising opportunities to produce more effective, accurate, and secure solutions for early ASD diagnosis.

Conclusion

This study presents a comprehensive evaluation of three EEG preprocessing techniques (Butterworth, DWT, and ICA) to enhance signal quality and extract relevant features for distinguishing ASD and neurotypical EEG patterns. Among the methods, ICA demonstrated superior capability in increasing signal clarity, as reflected in the highest SNR values (normal: 86.44, ASD: 78.69), though accompanied by higher error metrics (MAE: 14,673.51; MSE: 582,521) in the ASD group, indicating residual irregularities in the signal. DWT achieved the most balanced performance, offering low error rates (MAE: 4785.08; MSE: 309,690 for ASD) while maintaining critical temporal and spectral properties. In contrast, Butterworth filtering showed moderate denoising effectiveness but exhibited inconsistent results across different metrics, especially in the normal group. Group differences in Hjorth Activity, Mobility, and Complexity depended on the preprocessing method: all three were higher in the ASD group after Butterworth filtering; Activity and Mobility were higher but Complexity was lower after DWT; and after ICA, Activity was normalized in both groups while Mobility was lower and Complexity higher in the ASD group. This method dependence argues against a single blanket statement about Hjorth parameters in ASD. Overall, ICA is well-suited for applications requiring maximum signal clarity, while DWT provides a robust trade-off between denoising performance and feature preservation. As a practical recommendation, based on our standardized comparison across six metrics (SNR, MAE, MSE, PSD, SE, Hjorth) on a 16-channel dataset, no single method is universally optimal for all purposes. We recommend DWT for real-time/edge pre-screening because it yields lower errors (MAE/MSE) while preserving spectral structure (PSD/SE) that remains stable for feature extraction and time-frequency representations; in contrast, ICA is most appropriate for offline clinical confirmation when signal clarity (SNR) is the priority and independent components can be reviewed. Butterworth remains useful as a baseline band-pass stage before feature extraction or additional denoising.
Future research should aim to expand the dataset with a broader participant demographic, incorporate multimodal neurophysiological data, and explore hybrid preprocessing strategies to further enhance the reliability of ASD detection via EEG signal analysis.