Introduction

The World Health Organization estimated that in 2020 there were over 55 million people worldwide living with dementias, a number projected to reach 78 million by 20301. Identifying cognitive decline at the earliest possible stage increases the likelihood of identifying those at risk for decline, of identifying targets for disease-modifying therapies with the potential to slow progression, of enriching clinical trials, and of providing objective evidence of change with treatment. Dementia biomarkers such as positron emission tomography (PET), volumetric magnetic resonance imaging (MRI), and beta-amyloid and p-tau measured in blood (and more reliably in cerebrospinal fluid, CSF) have been shown to have high diagnostic accuracy for dementia and an acceptable ability to predict progression from mild cognitive impairment (MCI) to dementia1,2,3,4,5. However, prognostic accuracy from the earliest pre-clinical stages has not been reliably investigated. The majority of these studies focused on patients with MCI, who may already have irreversible structural neuronal loss, and have not been demonstrated to be sensitive to the earliest signs of functional abnormality that precede this stage. Further, such biomarkers are invasive and costly, not readily repeated over time to track progression, and some require advanced technological infrastructure that is often unavailable in economically disadvantaged countries. Therefore, there remains an important need for non-invasive, low-cost and widely available biomarkers of Alzheimer’s Disease (AD) and related dementias6 that are sensitive to the earliest stages of subjective cognitive decline, prior to overt synaptic and neuronal loss, and that can predict the progression of the disease.

EEG is uniquely sensitive, with millisecond temporal resolution, to brain changes associated with brain dysfunction. Since the EEG signal is generated at the synaptic level, brain diseases impacting synaptic transmission in their early stages (including the synaptic stage of neurodegeneration) can be explored with remarkable precision. Advances in signal processing, real-time analyses, and the use of machine learning (ML) for the identification of profiles of abnormality distinctive to different disorders have greatly advanced the clinical utility of quantitative EEG (qEEG) at the point of care and the potential of EEG-based biomarkers of disease7,8,9. The rich qEEG feature set includes measures of cortico-cortical connectivity (reflecting disruption of neuronal transmission), measures of complexity (reflecting disorganization of neural networks), and features reflecting perturbations of the frequency spectrum (reactive to changes in neurochemistry, oxygen flow, glucose metabolism, etc.). Significant correlations have been reported between abnormal qEEG features and abnormalities in cerebral blood flow, hippocampal atrophy, changes in glucose metabolism, changes in connectivity (DTI), cerebrospinal fluid biomarkers of AD, and evidence of cortical thinning2,10,11,12,13,14,15,16,17,18. These findings reveal a complex relationship between qEEG features and structural and functional brain modifications. In AD research, reports of a non-linear relationship between amyloid burden and EEG abnormalities suggest that EEG patterns of abnormality may be modulated differently depending on the amyloid burden, and that the initial compensatory mechanisms present in the earliest stages of cognitive decline may become insufficient as the amyloid load increases19. Hallmarks of the neurodegenerative processes of early dementia have also been identified using EEG connectivity measures20. These are important factors related to the pathophysiology of dementia, suggesting an important role for qEEG-based biomarkers in the evaluation and monitoring of the evolution of neurodegenerative disorders6,19,21,22. A recent review emphasizes the role of electrophysiological biomarkers in probing the neurophysiological sources underlying cognitive function and the potential contribution of such biomarkers to testing the clinical efficacy of pharmacological and non-pharmacological treatments in dementia23,24,25. An early foundational publication by Prichep and colleagues22 demonstrated high accuracy for the prediction of future cognitive decline in normal elderly with only subjective cognitive impairment (SCI), using intrinsic brain activity features with input from the 19 electrode locations of the International 10/20 placement system.

These results and the growing evidence in the literature support an important role for qEEG in the evaluation of neurodegenerative disorders and its sensitivity to the earliest signs of brain function impairment. To this point, the majority of research on qEEG biomarkers has focused on the prediction of progression from MCI to AD. In contrast, the focus of the current study was to derive an ML qEEG-based biomarker for the earliest identification of brain dysfunction predictive of decline to MCI or AD. In this study, we used baseline EEG data from patients who manifested only subjective memory complaints (SCI) at the time of the evaluation. We also reduced the number of EEG inputs, improved real-time signal processing, and implemented an ML-derived algorithm to improve usability at the point of care. The relationship between such a qEEG biomarker and the underlying pathophysiology of the neurodegenerative process supports important clinical applications in first-line screening.

Methods

Participants

The database used for the derivation of this algorithm was an expansion of the population from the New York University School of Medicine/Brain Research Laboratories (NYUSOM/BRL) database described in the foundational study22 and consisted of 88 community-residing elderly persons, aged 52–85 years, with self-reported decline in cognitive function, who volunteered to participate in a longitudinal study. All subjects were screened for inclusion in the study at the NYUSOM Silberstein Aging and Dementia Research Center (ADRC); all had only subjective memory complaints and met diagnostic criteria for Subjective Cognitive Impairment (SCI)26. Enrolled subjects were followed for 5–7 years and staged yearly for cognitive decline using the Global Deterioration Scale (GDS)27. All methods were approved at the time of data collection by the New York University School of Medicine Institutional Review Board (Human Research Protections) in accordance with the Declaration of Helsinki, and all subjects signed written informed consent.

Exclusion criteria for this study were as described in Prichep and colleagues22 and included: (a) past history of significant head trauma, seizures or known neurological disorder; (b) any focal signs of significant neuropathology; (c) diagnosis of multi-infarct dementia based on a history of cerebral infarction or transient ischemic attacks; (d) significant history of alcohol or drug abuse; (e) previous history of Axis I psychiatric disorders; (f) cardiac, pulmonary, vascular, metabolic or hematologic conditions of sufficient severity to adversely affect cognition or functioning. Written informed consent was obtained from all study subjects at the time of enrollment and included permission for inclusion in a deidentified research database.

Study sample

The study sample consisted of the 88 SCI participants who met the inclusion/exclusion criteria and were included in the analyses. Table 1 shows the demographics of this population.

Table 1 Demographics for the expanded NYUSOM/BRL dataset.

No significant differences were seen between groups in gender, race, or Mini-Mental State Exam (MMSE) scores. The significant difference in age between classes was addressed by age regression relative to age-expected normal values, removing the effect of age on the EEG features. The significant difference in years of education (~ 1.2 years) was not considered clinically meaningful, as both groups were at a high education level. Differences in length of follow-up between Class 0 and Class 1 were expected and therefore not tested; they reflect the fact that follow-up ended once a participant converted to AD. Follow-up continued for those who declined to MCI, as they could still have converted within the follow-up period, but ended upon conversion to AD for all patients.

Staging for degree of cognitive decline and follow-up

All candidates for enrollment were assessed for degree of cognitive decline at baseline using the Global Deterioration Scale (GDS)27,28 for age-associated cognitive decline and primary degenerative dementia; only those with a GDS score of 2 were considered as candidates for enrollment in the study. GDS 2 subjects were diagnosed as having Subjective Cognitive Impairment (SCI)26, with subjective memory complaints in the absence of objectively manifest deficits. Upon yearly follow-up evaluation for clinical staging, subjects were assessed for decline to Mild Cognitive Impairment (MCI) or conversion to dementia (AD) over a period of 5–7 years, to allow sufficient time for expected rates of cognitive decline. The final study population included only those who were SCI at baseline and were followed for 7 years or until conversion to AD. At each evaluation the MMSE was used to assess change from the baseline value29. The size of the study population reflects the challenges of following a population in this age group, where subjects may relocate, pass away, suffer an unrelated medical condition (e.g., heart attack or stroke), or refuse follow-up.

Outcome groups

For the purpose of algorithm development, and in recognition of the small number of subjects, the study population was divided into two outcome groups based on GDS stage at final follow-up. The non-decliners (Class 0) were those who remained at GDS = 2 after at least 7 years, demonstrating stability in this group. The Decliners (Class 1) included those who declined to MCI (n = 29) at some point in follow-up but showed no further decline within the 7-year period, and those who received a diagnosis of dementia (n = 14) during the 7-year follow-up period. In all cases the maximum severity of deterioration was used for derivation of the prediction algorithm.

EEG data acquisition and reduction of electrode locations

The baseline raw EEG recording consisted of 20 min of eyes-closed resting EEG acquired from the 19 channels of the International 10/20 electrode sites, referenced to linked ears. Cadwell Laboratories data acquisition systems were used. All impedances were < 5 kΩ. Bandpass filters were set from 0.5 to 70 Hz, with a 60 Hz notch filter, and data were down-sampled from 200 to 100 Hz.
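For illustration, a minimal preprocessing sketch consistent with the acquisition parameters above (0.5–70 Hz bandpass, 60 Hz notch, down-sampling from 200 to 100 Hz) is shown below. It assumes Python with SciPy; the filter orders and notch quality factor are illustrative choices, not a description of the Cadwell or study pipeline.

```python
from scipy import signal

def preprocess_channel(x, fs_in=200, fs_out=100):
    """Filter and down-sample one EEG channel (1-D array of samples)."""
    # 0.5-70 Hz bandpass (4th-order Butterworth, zero-phase)
    sos = signal.butter(4, [0.5, 70.0], btype="bandpass", fs=fs_in, output="sos")
    x = signal.sosfiltfilt(sos, x)
    # 60 Hz notch filter (Q = 30 is an illustrative choice)
    b, a = signal.iirnotch(60.0, Q=30.0, fs=fs_in)
    x = signal.filtfilt(b, a, x)
    # Anti-aliased down-sampling from 200 Hz to 100 Hz
    return signal.decimate(x, fs_in // fs_out, zero_phase=True)
```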

These 19-channel EEG raw data files collected at NYUSOM/BRL, and raw independent-validation EEG recordings collected from multiple data acquisition systems, were subjected to the same signal processing pipeline and verified for signal integrity. All further analyses were done using a reduced EEG montage (referenced to linked ears) containing the following channels: FP1, FP2, aFz, F7, F8, A1, A2, and ground. These five recording channels included those used in algorithms previously cleared by the FDA and cover regions of interest identified in published studies deriving biomarkers in SCI and dementia1,30,34. The original dataset was subjected to removal of physiological and non-physiological contamination (e.g., eye movement, electromyographic muscle activity) using expert visual identification combined with an artifacting algorithm for amplitude overrange. This resulted in 1–2 min of artifact-free data for each participant. While more data might have been derived from the 20 min acquired, we used the standard first 1–2 min of artifact-free data for consistency and future usability.

The cleaned EEG recordings were segmented into smaller samples. Waveforms greater than 120 s in duration were split into two 60 s segments; those between 60 and 120 s remained as single segments. These segments were considered independent samples of the same phenotypic outcome. In total, this augmented NYUSOM/BRL dataset included 176 samples (90 non-decliners versus 86 decliners), providing a more representative dataset for classifier development.
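The segmentation rule can be summarized in a few lines; the sketch below assumes cleaned per-channel data at 100 Hz and simply implements the rule stated above.

```python
def segment_recording(x, fs=100):
    """Split an artifact-free recording into independent 60 s samples."""
    n = 60 * fs                      # samples per 60 s segment
    duration = len(x) / fs
    if duration > 120:               # > 120 s: two 60 s segments
        return [x[:n], x[n:2 * n]]
    if duration >= 60:               # 60-120 s: keep as a single segment
        return [x]
    return []                        # below the minimum required duration
```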

EEG features extracted

The power spectrum of the artifact-free EEG data was computed using the Fast Fourier Transform (FFT) in order to extract all qEEG features (power, complexity, and connectivity) in the following frequency bands: delta (1.5–3.5 Hz), theta (3.5–7.5 Hz), alpha (7.5–12.5 Hz), alpha 1 (7.5–10.0 Hz), alpha 2 (10.0–12.5 Hz), beta (12.5–25.0 Hz), beta 2/gamma 1 (25.0–35.0 Hz), and total power (1.5–35.0 Hz). A set of more than 6000 qEEG features was extracted from the FFT of the artifact-free data, age-regressed, and z-transformed (details given elsewhere)31,32. Age regression and z-transformation help remove the effect of age on the EEG features, which is of particular importance in an elderly population. These features quantify characteristics of the intrinsic electrical brain activity of different regions and frequency bands, expressed in measures including connectivity reflecting relationships between brain regions (including asymmetry, coherence, phase lag, and phase synchrony), complexity reflecting disorganization of neural networks (including fractal dimension, scale-free measures, and entropy), and frequency distribution reflecting changes in neurochemistry, neuroinflammation, and oxygen and glucose utilization (including absolute and relative power and mean frequency).
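As an illustration of one family of these features, the sketch below computes absolute and relative band power for one channel from an FFT-based (Welch) power spectral density, using the band edges defined above. It assumes Python with NumPy/SciPy; it is not the authors' feature-extraction code, and the subsequent age regression and z-transformation against normative values are not shown.

```python
import numpy as np
from scipy.signal import welch

BANDS = {
    "delta": (1.5, 3.5), "theta": (3.5, 7.5), "alpha": (7.5, 12.5),
    "alpha1": (7.5, 10.0), "alpha2": (10.0, 12.5), "beta": (12.5, 25.0),
    "beta2_gamma1": (25.0, 35.0),
}

def band_power_features(x, fs=100):
    """Absolute and relative power per band for one channel (1-D array)."""
    freqs, psd = welch(x, fs=fs, nperseg=2 * fs)           # ~0.5 Hz resolution
    total_mask = (freqs >= 1.5) & (freqs <= 35.0)
    total_power = np.trapz(psd[total_mask], freqs[total_mask])
    features = {"total_power": total_power}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs <= hi)
        abs_power = np.trapz(psd[mask], freqs[mask])
        features[f"abs_{name}"] = abs_power
        features[f"rel_{name}"] = abs_power / total_power  # relative power
    return features
```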

Feature reduction

An important step in selecting the pool of qEEG features to enter into the derivation of the ML model was reduction of the large number of extracted features, to minimize possible overtraining and maximize separation between groups. This step included exploration of: (1) replicability of features within the elderly population, to aid in the selection of stable variables; (2) feature significance (F-scores) for group separation; and (3) feature correlation across all subjects, to eliminate redundant features. Features remaining after data reduction using the F-score, repeatability and correlation requirements were used to train and evaluate three classification models with disparate mathematical foundations: logistic regression (LR), support vector machines (SVM), and random forest (RF).
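A hedged sketch of steps (2) and (3) is given below, assuming scikit-learn: features are ranked by univariate F-score, and any feature highly correlated with a better-ranked feature is dropped. The correlation threshold is illustrative, and the repeatability screen of step (1) is not shown.

```python
import numpy as np
from sklearn.feature_selection import f_classif

def reduce_features(X, y, corr_thresh=0.9):
    """X: samples x features (age-regressed z-scores); y: 0/1 outcome labels."""
    f_scores, _ = f_classif(X, y)                   # univariate group separation
    order = np.argsort(f_scores)[::-1]              # best-separating features first
    corr = np.abs(np.corrcoef(X, rowvar=False))     # feature-feature correlations
    kept = []
    for idx in order:
        if all(corr[idx, j] < corr_thresh for j in kept):
            kept.append(idx)                        # keep only non-redundant features
    return kept                                     # indices of retained features
```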

Model derivation

Prediction models were trained on 50 repeated random splits of the training set into 70/30 train and test subsets. Performance from the multiple splits was averaged to determine the best performing models, optimizing the Area Under the ROC Curve (AUC) while minimizing over-training using internal validations. Recursive feature elimination (RFE) was used to further reduce the EEG features. RFE involves training a classification model on a pool of features and ranking the features based on their importance to the model. After ranking, the least important feature is removed from the pool, the model is trained again, and the next least important feature is removed. This process continues until a fixed set of features is reached (25 features). Since there was a significant difference in age between groups (see Table 1), and although all features were age-regressed with the assumption that this would remove the effect of age, age was added as a candidate variable in the feature pool so that the model could further evaluate its significance.
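The following is a minimal sketch of this derivation loop, assuming scikit-learn, with logistic regression standing in for the three model families: 50 stratified 70/30 splits, RFE down to 25 features on each training split, and AUC averaged across splits.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def mean_auc_with_rfe(X, y, n_splits=50, n_features=25):
    """Average test AUC over repeated 70/30 splits with RFE feature selection."""
    aucs = []
    for seed in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.30, stratify=y, random_state=seed)
        selector = RFE(LogisticRegression(max_iter=1000),
                       n_features_to_select=n_features, step=1).fit(X_tr, y_tr)
        probs = selector.predict_proba(X_te)[:, 1]   # RFE exposes the wrapped model
        aucs.append(roc_auc_score(y_te, probs))
    return float(np.mean(aucs))
```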

Once a feature pool of 100 highly ranked features was identified, an additional step was introduced in which a qEEG expert further reviewed and refined the feature pool for model building, balancing data-driven selection with domain expertise. Using this feature pool resulted in a final set of features and the derivation of the prediction algorithms used for classifier training. The projected performance of the three algorithms was estimated using leave-one-out (LOO) cross-validation and a 0.632+ bootstrap procedure33. Two of the three algorithms (LR and SVM) were locked for external validation due to their more stable behavior across the LOOCV procedure. A final external validation of the two locked algorithms included performance values when tested using two independent and unseen cohorts of subjects. Calibration of the outcome probability was also inspected to enable balancing of false positive and false negative predictions.
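A minimal sketch of the LOO estimation step is shown below, assuming scikit-learn; the logistic-regression estimator and the 0.5 decision threshold are illustrative placeholders, matching the default threshold used in the Results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut

def loo_performance(X, y, threshold=0.5):
    """Leave-one-out sensitivity, specificity and accuracy on the final features."""
    probs = np.zeros(len(y), dtype=float)
    for train_idx, test_idx in LeaveOneOut().split(X):
        model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
        probs[test_idx] = model.predict_proba(X[test_idx])[:, 1]
    pred = (probs >= threshold).astype(int)
    sensitivity = np.mean(pred[y == 1] == 1)
    specificity = np.mean(pred[y == 0] == 0)
    return sensitivity, specificity, np.mean(pred == y)
```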

Results

The final models contained 14 features. The key qEEG features contributing the most to the separation between groups spanned multiple measure sets, frequency bands and regions. Connectivity measures reflecting disturbances in neuronal transmission (phase lag (e.g., var14) and asymmetry (e.g., var1)) were highly represented, with abnormalities in the alpha and theta frequency bands dominant. Also contributing to the model were measures of relative power (e.g., var13), mean frequency (e.g., var2) and complexity (e.g., var7). Figure 1 shows the distribution of SHAP (SHapley Additive exPlanations) values for the LR predictions based on the 14 features selected in the model (values for SVM were similar). SHAP values quantify the contribution of each feature to the prediction outcome relative to a baseline random prediction of declining (x-axis value of 0). In this plot, each point represents the SHAP value of a feature for a single observation, illustrating how much that feature pushed the classifier output higher or lower. Blue and red values in the plot correspond to low and high values, respectively, of the product of each predictor and its associated coefficient. Positive SHAP values indicate strength towards a possible dementia outcome (Class 1), whereas negative SHAP values indicate strength in favor of a no-change prediction (Class 0).
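For readers wishing to reproduce this type of plot, the sketch below shows one common way to compute SHAP values for a fitted logistic-regression model using the open-source shap package; the package, explainer choice and variable names are assumptions about tooling, not a description of the authors' code.

```python
import shap
from sklearn.linear_model import LogisticRegression

def shap_beeswarm(X, y, feature_names):
    """Fit an LR model on the final 14 features and draw a SHAP bee-swarm plot."""
    model = LogisticRegression(max_iter=1000).fit(X, y)
    explainer = shap.LinearExplainer(model, X)     # explainer suited to linear models
    shap_values = explainer.shap_values(X)         # per-sample, per-feature contributions
    shap.summary_plot(shap_values, X, feature_names=feature_names)
```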

Fig. 1
figure 1

SHAP bee-swarm plot for the LR model using the 14 features in the final model. The Variable Key describes the measure set and frequency band for the displayed features.

Figure 2 shows the mean z-values of the highest contributing features to the LR model for the No-Change (Class 0) and Decline/Convert (Class 1) groups. As expected, the mean z-values of the features within the Class 0 group are centered around zero (age-expected normal values), with small standard deviations. Mean z-values for Class 1, conversely, showed patterns of deviation (non-zero values), with features reaching significant negative (var2, var6) and positive values (var11, var13). It is also noticeable that the variances for Class 1 are significantly larger than in Class 0 subjects, suggesting heterogeneity within this group beyond that seen in Class 0. It is important to note that it is the unique set of features that defines the qEEG-based biomarker; although individual features can be seen to be significantly different between the groups, no single feature could distinguish between the groups.

Fig. 2
figure 2

Mean z-score for variables with high contribution to the LR classification model. See Key in Fig. 1 for variable descriptions. These variables are a subset of those contributing the most to the classifier algorithm separating the two groups. The associated probability level of a group average z-score is estimated by considering the square root of the size of the group. Thus, for this group size, a z-score of 0.5 is associated with a p < 0.001.
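To make the caption's estimate explicit (a worked illustration, assuming individual z-scores are approximately standard normal and using the per-class subject counts of roughly 43–45 reported in Methods): the standard error of a group-mean z-score is $1/\sqrt{n}$, so

$$ \frac{0.5}{1/\sqrt{45}} \approx 3.35 \;\Rightarrow\; p < 0.001 \ (\text{two-tailed}), $$

which is why a group-average z-score of 0.5 corresponds to p < 0.001 at this group size.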

Figure 3 shows the probability of classification for Class 0 versus Class 1 using the locked LR algorithm. Clear separation can be seen: the bottom right section of the plot shows predominantly green dots (members of Class 0, no change) with a high probability of belonging to Class 0 and a low probability of belonging to Class 1 (decline/convert), while the upper left portion of the plot is populated almost entirely by red dots (members of Class 1) with a low probability of belonging to Class 0 and a high probability of belonging to Class 1.

Fig. 3
figure 3

Plot of the probability of No Change (Class 0) vs. Decline/Convert (Class 1) using LR. Green dots are members of Class 0, and red dots are members of Class 1.

Performance of the prediction models

Leaving-one-out estimator

The LOO cross-validation method was used to provide estimates of the models' predictive performance. As seen in Table 2, the estimated accuracy for all models was between 0.80 and 0.82, with an AUC of ~ 0.90.

Table 2 Sensitivity and specificity estimates using the leave-one-out (LOO) method on the all-in NYU cohort with the 14 final features.

The 0.632 + bootstrap estimator

The Bootstrap 0.632 + method was also used to evaluate the performance of predictive models. This method is an enhancement of the bootstrap method and aims to provide more accurate and less biased estimates of model performance by addressing some of the limitations of other resampling techniques like cross-validation.
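For reference, the standard form of the estimator33 combines the apparent (training-set) error $\overline{\mathrm{err}}$ with the leave-one-out bootstrap error $\mathrm{Err}^{(1)}$, weighting the latter more heavily as the estimated overfitting grows:

$$
\mathrm{Err}^{(0.632+)} = (1-\hat{w})\,\overline{\mathrm{err}} + \hat{w}\,\mathrm{Err}^{(1)},
\qquad
\hat{w} = \frac{0.632}{1 - 0.368\,\hat{R}},
\qquad
\hat{R} = \frac{\mathrm{Err}^{(1)} - \overline{\mathrm{err}}}{\hat{\gamma} - \overline{\mathrm{err}}},
$$

where $\hat{\gamma}$ is the no-information error rate and $\hat{R}$ is the relative overfitting rate; $\hat{w}$ ranges from 0.632 (no overfitting) toward 1 as overfitting increases.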

It is noted that the 0.632+ bootstrap can only produce estimates of the accuracy metric, since a confusion matrix cannot be assembled under this procedure. Average accuracy was ~ 0.81 using this procedure, as shown in Table 3.

Table 3 Accuracy estimate (percent) with 95% confidence interval based on the 0.632+ bootstrap method using the full NYU cohort with the 14 most relevant features.

The performance estimates demonstrated an overall accuracy above 80% for all models.

Performance in the independent validation cohorts

As noted in the Methods section (Model derivation), two of the three algorithms (LR and SVM) were locked for external validation due to their more stable behavior across the LOOCV procedure. The locked LR and SVM algorithms were tested on two small external independent cohorts (Cohort 1, University of Kentucky, Dr. Y. Jiang; and Cohort 2, Italy, Dr. P. M. Rossini). Performance metrics showed variability, with sensitivity and specificity influenced by the characteristics of each dataset. Although small, these independent validation datasets helped assess the generalization capabilities of the algorithms when faced with unseen clinical data from disparate sources. Due to differences in the relative amount of artifact-free data in the two samples, results are reported separately.

For Cohort 1, all research activity was approved by the University of Kentucky Institutional Review Board in accordance with the Declaration of Helsinki, and all participants provided written informed consent. For Cohort 2, all data were collected with the informed consent of each participant or caregiver, in line with the Code of Ethics of the World Medical Association (Declaration of Helsinki) and the standards established by the authors' Institutional Review Board.

Participants cohort 1: University of Kentucky

The University of Kentucky (UOK) cohort (Cohort 1) included 19 subjects (11 no-change vs. 8 decliners), all with sufficient artifact-free data for analyses. Data were acquired with impedances below 5 kΩ, sampled at 500 Hz, and down-sampled to 100 Hz. Following filtering, the data were subjected to a suite of FDA-cleared automatic artifacting algorithms (described in detail elsewhere31). Table 4 shows the demographics for this cohort. The age range of these patients was within that of the NYU population, and the median MMSE was the same. Further details of the human subjects and EEG recordings for this cohort are given elsewhere34.

Table 4 Demographics for the UOK independent cohort 1.

Prediction performance of the locked algorithms for this cohort is shown in Table 5.

Table 5 Prediction performance for both locked algorithms on the University of Kentucky cohort (classification threshold set at 0.5).

In summary, the accuracy for Cohort 1, averaged across the two models, was 0.71, reflecting approximately the 10-point drop expected for such an independent sample. It is noted that this dataset contained normal elderly subjects without a clear distinction of those with subjective memory complaints, rendering these data a conservative test of algorithms developed on SCI patients only.

Participants cohort 2: Italy dataset (Neuroconnect)

In Cohort 2, the Italy dataset, the total number of samples available with the minimum required artifact-free data was n = 28 (10 no-change and 18 declining subjects). Data were acquired with impedances below 5 kΩ, sampled at 256 Hz, and down-sampled to 100 Hz. Following filtering, the data were subjected to a suite of FDA-cleared automatic artifacting algorithms (described in detail elsewhere31). Table 6 shows the demographics for this cohort. The age range of these patients was within that of the NYU population; however, the median MMSE was lower.

Table 6 Demographics for the Italy independent cohort 2.

Table 7 presents the predictive results when this dataset was parsed through the locked algorithms.

Table 7 Prediction performance for both locked algorithms on the Italy cohort 2 (classification threshold set at 0.5).

Potential impact of different classification thresholds on accuracy

In the results presented thus far the classification threshold was set at the default, that is, the middle point of the probability segment (0.5). Changing this threshold can have significant implications for risk stratification of the outcomes: adjusting the classification threshold impacts the sensitivity and specificity of the classification, influencing the overall risk dynamics of the prediction. Table 8 shows three examples of the impact of changing the classification threshold on the NYU population: [1] sensitivity improves at the cost of lowering specificity when the threshold is moved from 0.5 to 0.4 (Table 8 Panel A vs. Panel B); [2] at 0.5 (Table 8 Panel B) sensitivity and specificity are approximately equivalent; and [3] specificity increases at the expense of sensitivity when the threshold is raised to 0.55 (Table 8 Panel C). Parallel changes in NPV (highest at 0.4) and PPV (highest at 0.55) can also be seen.

Table 8 Prediction performance for both locked algorithms on the NYU all-in cohort for different positive-class thresholds: 0.4 (top, Panel A), 0.5 (middle, Panel B) and 0.55 (bottom, Panel C).

With respect to the impact of changing the classification threshold on the independent validation cohorts, caution is warranted as the n is small. However, in general the same overall pattern was observed: as the classification threshold increases, specificity increases and sensitivity decreases (or remains stable across thresholds). For example, for the Italy cohort, when the LR classification threshold goes from 0.4 to 0.5, specificity goes from 50% to 70%, with no impact on sensitivity.
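The threshold adjustment itself requires no retraining: the locked model's predicted probabilities are simply re-thresholded and the confusion-matrix metrics recomputed, as in the sketch below (variable names are illustrative).

```python
import numpy as np

def metrics_at_threshold(probs, y_true, threshold):
    """Sensitivity/specificity/PPV/NPV at a given positive-class threshold."""
    pred = (probs >= threshold).astype(int)
    tp = np.sum((pred == 1) & (y_true == 1))
    tn = np.sum((pred == 0) & (y_true == 0))
    fp = np.sum((pred == 1) & (y_true == 0))
    fn = np.sum((pred == 0) & (y_true == 1))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
    }

# Example: evaluate the three thresholds reported in Table 8
# for t in (0.40, 0.50, 0.55):
#     print(t, metrics_at_threshold(probs, y_true, t))
```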

Discussion

This proof-of-concept (PoC) study demonstrated high accuracy of an ML-derived EEG-based biomarker for the prediction of cognitive decline/conversion over the next 5–7 years in a population of SCI participants. The ML models demonstrated overall prediction accuracy of 80%, with AUCs of 0.90, and robust performance in two independent validation cohorts. As noted in the Methods section, due to small n's, those who declined to MCI were combined with those who converted to dementia over the follow-up period for algorithm development and performance estimation. However, it was observed that those who converted to dementia had a significantly higher predicted probability of future change than those who declined to MCI.

As discussed in the Introduction, the scientific literature reveals a complex relationship between qEEG features and structural and functional brain alterations related to the pathophysiology of dementia. These relationships underscore the important role for qEEG-based biomarkers in the evaluation and prognosis of neurodegenerative disorders, as represented by the prediction biomarker derived in this study. Features contributing most to the final locked models presented herein were dominated by features reflecting disruption of neuronal transmission (phase lag and asymmetry), with abnormalities in the alpha and theta frequency bands. Changes in measures of connectivity in the SCI population provide evidence of changes in neuronal transmission within frontal networks. Studies using qEEG to predict decline from SCI/MCI to AD have specifically reported compatible disruptions in the frontal and executive networks prior to objective cognitive decline30; this work extends these observations to the SCI population.

In addition, measures of relative power and mean frequency (especially in alpha and theta) and of complexity (entropy, total power) contributed to the model. The presence of abnormalities within the theta frequency band is consistent with evidence of increased theta often associated with neurodegenerative disorders, reflecting hippocampal atrophy and cortical thinning16, suggesting that such features reflect the earliest signs of dysfunction before structural damage is detectable. These abnormalities, in the presence of increased activity in the alpha frequency band (mean frequency and relative power), suggest that at this early prodromal phase of SCI alpha is acting as a compensatory mechanism.

It is important to emphasize that most prior reports of algorithms to predict conversion to dementia started from baselines of MCI or combined MCI/SCI with 2–5 year follow-ups, using 19-lead EEG data. In general, reported accuracies range from 69 to 71% using qEEG alone, and some studies have demonstrated improved accuracies when neuroimaging, genotyping or clinical data were added to the ML model12,18,35,36. Accuracy of prediction of progression to AD from MCI was reported to be 77% with EEG alone, 72% with fMRI alone, 72% with CSF alone, and 77% using all three biomarkers together in an ML model37. The addition of ApoE allele genotyping was also reported to improve prognostic ability to well above 90% when combined with connectivity analysis of EEG frequencies36. However, few publications exist with the intent to predict future cognitive decline from prodromal or SCI patients only, as was demonstrated in this study, where the impact of clinical intervention prior to structural damage may be highest. To further put our results in the context of works using clinical and neuroimaging data, a recent ADNI publication38 developed a predictive prognostic model (PPM) in MCI patients with an accuracy of 81.7% for progression to dementia using both cognitive tests and MRI (grey matter atrophy), and reported these findings to outperform either marker alone. In addition, a recent publication on blood biomarkers reports prediction of AD from MCI with an AUC of 0.89 when combining three blood biomarkers3. A recent combination of blood biomarkers with CSF and neuroimaging for prediction of AD from MCI + SCI reported an AUC of 0.8–0.94.

When contemplating integrating this prediction biomarker into clinical evaluations, it is important to note the findings obtained by adjusting the classification threshold. This allows clinicians to strategically manage the balance between the risks of misclassification. For example, lowering the threshold might increase the number of false positives (no-change subjects labeled as probable decliners) returned by the classifier. Conversely, raising the threshold could decrease the number of declining subjects being correctly labelled, aligning with a more stringent policy of very few false positives. In summary, each adjustment impacts the sensitivity and specificity of the classification, influencing the overall risk dynamics of the prediction while using the same prediction biomarker. Such an approach further supports clinical usability.

Limitations of the current study include the small number of subjects, both for derivation of the prediction algorithm and for the independent validation cohorts. While it is recommended that future studies include larger populations, the difficulty of performing such longitudinal studies must be emphasized: accumulating the population in this study, with 5–7 years of clinical follow-up and only SCI at baseline, took over 15 years, and funding sources for such studies are rare. Another limitation was the lack of other baseline biomarker data in these populations, which should be included in future studies to allow correlations to be examined.

This study demonstrates the accuracy of an EEG-based prediction biomarker using reduced EEG inputs for the derivation of a classification model (only frontal/dorsolateral prefrontal cortex electrodes). In fact, the algorithm's performance was at a level equivalent to or higher than others reported in the current literature of EEG predictive algorithms for progression from MCI/MCI + SCI to AD using the full 10–20 electrode set and, in many cases, even using additional neuroimaging and/or clinical data5,12,18,19,35,36,39. The reduced frontal montage, ease of use, rapid results, and high accuracy further support the clinical utility of this biomarker.

Conclusion

This study achieved its primary objective of developing and validating preliminary qEEG-only algorithms for the prediction of future cognitive decline from the earliest stage (SCI). The EEG features contributing most to these models spanned measure sets including those reflecting disruption of neural transmission and complexity, in frequency bands most often associated with neurodegeneration and known to be correlated with other biomarkers of dementia. Accuracy for these qEEG-only algorithms, using a limited frontal montage, was well within the range of those in the current published literature for EEG models that included all electrode locations and additional dimensions (neuroimaging and/or clinical features). This further supports the potential clinical utility of this qEEG biomarker for screening, aiding diagnosis, and tracking the progression of cognitive decline.