Introduction

When learning about a topic, many people turn to educational multimedia, which frequently offers an efficient and convenient solution, especially when we consider time limitations and budget constraints. Teachers also typically incorporate educational multimedia into their lessons, expecting students to learn from the content presented. However, the learner’s experience can vary greatly depending on how the multimedia is designed1, what content it includes2, and how its structure promotes active learning3. In some cases, learners easily follow the multimedia program and use it as a foundation for further learning, engaging with the content and achieving better learning outcomes. Conversely, learners may struggle to engage with poorly designed videos and feel anxious and exhausted, leading to an inability to apply the content later. This raises important questions: what neural mechanisms govern our internal evaluations and performance during and after watching educational multimedia? How do different brain regions interact to facilitate more effective learning while watching a video?

Learning from educational multimedia is a complex cognitive task that occurs in a multidimensional space, resulting in varying levels of cognitive workload in the brain. How the human brain processes and stores information has been discussed in cognitive load theory (CLT)4. In modern education, CLT provides valuable insights for designing efficient educational multimedia that optimizes the cognitive load and improves the learning process5. By understanding how to manage intrinsic and extraneous cognitive load, educators and instructional designers can create content that minimizes unnecessary distractions and fosters deeper learning6.

In 2002, Richard Mayer introduced twelve principles to improve multimedia design and reduce the cognitive load for learners2. Five principles specifically target extraneous processing, which involves cognitive effort on nonessential elements. These include coherence (removing irrelevant content), signaling (highlighting key information), redundancy (excluding on-screen text when graphics and narration are used), spatial contiguity (placing related words and images close together), and temporal contiguity (presenting related words and images simultaneously). Adhering to these 5 principles helps create more effective multimedia learning experiences with optimized cognitive load7,8.

Like other cognitive tasks, assessing the cognitive workload during the learning process requires accurate and reliable measurement techniques1. Researchers commonly employ two main approaches: subjective and objective. Subjective methods involve questionnaires and interviews to evaluate cognitive load based on the subject’s perception of task difficulty9. A widely used questionnaire in this category is the NASA Task Load Index (NASA-TLX), which assesses different dimensions of workload: mental demand, physical demand, temporal demand, performance, effort, and frustration10. Objective methods, on the other hand, utilize physiological and behavioral measures, including electroencephalography (EEG)11, which provides continuous and real-time data12; eye-tracking13, which can assess cognitive load through physiological indicators such as pupil diameter and microsaccades14; and functional magnetic resonance imaging (fMRI)15. By employing both subjective and objective measures, we can gain a more comprehensive view of the complex interplay between neural processes and their manifestations in human experience and behavior16.

Among all these methods, EEG, either alone or in conjunction with other approaches, plays a key role in analyzing brain function because of its real-time monitoring ability, long-term recording potential, and high temporal resolution11. Researchers have applied various statistical analyses17, to ensure robust interpretations of cognitive processes18, and machine learning techniques to EEG data to evaluate different aspects of cognitive load in the learning process, including memory and attention19. Furthermore, extensive research has been conducted on frequency band analysis, particularly focusing on the ratios of the theta and alpha bands20, as well as their individual characteristics21. Changes in theta frequency band activity are associated with working memory performance, particularly in frontal and central brain regions22. Moreover, theta band power increases during successful memory encoding23, and theta band activity is positively correlated with memory retrieval24. In contrast, alpha-band oscillations are correlated with attention and the control of access to stored information, particularly in the parietal and occipital lobes25. EEG alpha power has also been shown to be associated with creative ideation26. However, a crucial aspect that remains unclear is how different brain regions interact with each other across various frequency bands, leading to varying levels of cognitive workload.

Recent research suggests that our understanding of the principles and mechanisms underlying complex brain function can be improved by integrating data from multiple brain regions and examining their functional interactions. Rather than focusing solely on localized changes in brain structure and function27, network neuroscience takes an integrative approach29. It models the brain as a complex network29, with brain regions as nodes and their connections as edges30, allowing the application of graph theory and network analysis to study the brain’s functional mechanism at the whole-brain scale31. In this way, several network measurements have been employed to characterize one or several aspects of global and local functional brain connectivity. Three common types of measurements are as follows: (a) Segregation measurements reveal that the brain network exhibits a modular organization, with densely connected clusters of brain regions responsible for specialized information processing32. (b) Integration measurements: The modular architecture of the brain network resembles a highly efficient “small-world” structure, striking a balance between local clustering and global integration33. The integration measures evaluate the global information processing in the brain32. (c) Centrality measurements: These identify “hub” regions that play crucial roles in integrating information across distributed brain systems34.

Adopting the network neuroscience approach, which models the brain as a complex network of interconnected regions, can provide valuable insights into the neural mechanisms underlying multimedia learning and its cognitive workload. Prior investigations have focused on prominent subnetworks of the brain such as the Default Mode Network (DMN), which is crucial for integrating semantic information during naturalistic experiences, such as watching evolving movie narratives35, and the Salience Network (SN), which enables dynamic attention shifts between introspection and task focus36. Some studies have also explored interactions between these subnetworks37. However, a comprehensive understanding of learning processes requires an investigation of neural mechanisms throughout the whole brain, as well as research focused on specific subnetworks. One such study investigated the dynamic reconfiguration of functional brain networks during skill acquisition and adaptation38. Additionally, research has shown that factors such as screen type39 (receptive vs. interactive) and visual features of multimedia, such as shape and color, influence functional connectivity patterns across the whole brain40. Despite these advancements, a considerable gap still exists in our understanding of the whole-brain neural mechanisms that support learning efficiency, which is crucial for developing reliable metrics to optimize multimedia content and improve educational design.

This study aims to investigate how Mayer’s principles affect functional brain networks in order to facilitate the learning process and optimize cognitive workload. To achieve this goal, we constructed two educational multimedia programs with the same auditory content; in one, the visual design adhered to Mayer’s principles, whereas in the other, it violated these principles. We employed both subjective and objective methods to assess the impact of these multimedia designs on learning outcomes and cognitive load. The subjective measures included a recall test and the NASA-TLX questionnaire, whereas the objective measures involved analyzing brain networks obtained from the EEG data. We examined local and global information processing in the brain across the two conditions using relevant functional brain network measurements, network modularity, and behavioral outcomes, and then explored the relationships between these measures.

Through network neuroscience analysis of EEG data, we found that principal networks are more effective at facilitating local information processing, whereas non-principal networks enhance global processing and hub formation. Modularity analysis supported these observations, revealing increased integration and reduced segregation in non-principal networks, which suggests the activation of compensatory mechanisms that help learners manage cognitive load despite suboptimal instructional designs. Additionally, significant correlations were identified between functional network metrics and behavioral outcomes, such as recall test accuracy, as well as cognitive load assessments measured by NASA-TLX scores. These findings emphasize the importance of integrating neural insights into multimedia design to improve learning outcomes.

Materials and methods

Participants

The dataset included 39 healthy adult volunteers who were local university students, aged between 20 and 29 years, with a mean age of 22.8 years and a standard deviation of 2.5 years. Five participants were excluded because of incomplete recordings (n = 2), noisy data (n = 2), or low post-test scores (n = 1), leaving 34 participants for analysis. For additional details, refer to earlier research on this dataset7,8,41. Two educational multimedia learning videos were designed to investigate the impact of Mayer’s multimedia learning principles on the neural mechanisms of the brain. One of these videos followed Mayer’s principles (principal multimedia or P), whereas the other intentionally violated these principles (non-principal multimedia or NP). Crucially, the auditory content of the two videos was the same, but the visual content was different. Sixteen participants learned the material from the principal video and the remaining 18 participants learned from the non-principal video.

All the participants were right-handed, which helped control for differences in hemispheric lateralization. Additionally, they had normal hearing and normal or corrected-to-normal vision, with no history of head injury. Their primary language was Persian, and they were also proficient in English. Before they participated in the study, they completed a standard pre-task listening assessment and a similar test to familiarize themselves with the main task. Informed written consent was obtained from all participants. The study followed approved protocols from the Iran University of Medical Sciences (IR.IUMS.REC.1397.951) and adhered to the guidelines and regulations outlined by the Declaration of Helsinki.

Computer-based educational multimedia

In this study, one listening chapter, Lesson 11, from Open Forum 3 (ref. 42), was chosen as the core material for developing two educational multimedia programs. This selection is accessible online through Oxford University Press (https://elt.oup.com/student/openforum/3?cc=ir&selLanguage=en). The listening chapter served as the foundation for creating two distinct versions of multimedia: principal and non-principal. A motion graphics expert used Adobe After Effects CC 2017 (version 14.2.1.34) to create the multimedia content. The video duration for Lesson 11 is 342 s. Both multimedia programs used in this experiment are available on GitHub (https://github.com/K-Hun/multimedia-learning-hci).

Experimental task

During the experimental session, participants sat in an adjustable chair positioned 57 cm away from a 17-inch monitor with a refresh rate of 60 Hz, within a partially sound-attenuated and dimly lit room. To minimize head movements and maintain consistent data collection, participants put their heads on a chin rest. Two loudspeakers were placed in front of them, one on the right and one on the left. Subsequently, the EEG cap was fitted, and data recording commenced.

The experiment consisted of four distinct phases:

  • Resting State Phase: Participants fixated on a black-filled circle (r = 5 mm) at the center of a gray screen for 20 s. They were advised to relax and not think about anything specific, while EEG signals were recorded.

  • Multimedia Learning Phase: After the resting state phase concluded, a multimedia video played automatically for 342 s. Participants were instructed to pay close attention to the concepts presented, with no interaction allowed with the computer during this phase.

  • Recall Test Phase: Following a brief pause after the video concluded, a recall test began automatically. This computer-based test used a multiple-choice question (MCQ) format consisting of twelve questions on the multimedia content; the questions were identical for the principal and non-principal groups. Participants had 420 s to respond using a mouse interface, with options to skip questions and navigate between them; however, only one question and its options were visible at any given time. Participants could also terminate the recall test before the timer expired, which automatically concluded this phase of the experiment.

  • NASA Task Load Index (NASA-TLX) Phase: After completing the recall test, EEG recording ceased, and participants filled out a classic paper-based version of the NASA-TLX (Hart 1988) to assess subjective cognitive load across two conditions: low-load and high-load. The NASA-TLX serves as a self-report index that ranges from 0 to 100, and participants had up to 15 min to complete this questionnaire.

This study primarily focused on analyzing functional brain networks during the learning process; therefore, only data from the second phase (the multimedia learning task) were analyzed. Figure 1 illustrates the experimental design. As previously mentioned, participants were randomly divided into two groups: one group was exposed to Lesson 11 NP, while the other group viewed Lesson 11 P.

Fig. 1

Experimental procedure and multimedia design. (a) The experiment was conducted in a systematic sequence (from left to right): Phase I—Resting State, where participants observed a black-filled circle to gather baseline data; Phase II—Multimedia Learning, during which participants viewed multimedia content passively, without interaction; Phase III—Recall Test, where participants completed a recall test using a mouse interface; and Phase IV—NASA Task Load Index (NASA-TLX), where participants filled out the NASA-TLX questionnaire in paper format. EEG signals were recorded during Phases I, II, and III. (b) Example frames from the principal (top row) and non-principal (bottom row) multimedia design conditions. Both conditions used the same auditory content. In the principal condition: the first frame (from left to right) demonstrates the spatial contiguity principle, where related words and images are presented and placed close together; the second frame applies the coherence and redundancy principles, removing irrelevant content and avoiding redundant on-screen text when graphics and narration are present; the third frame illustrates the signaling principle by visually highlighting key information. The corresponding non-principal frames intentionally violate each of these principles.

EEG data collection and preprocessing

For EEG data collection, a portable 32-channel eWave amplifier43,44,45 paired with eProbe v6.7.3.0 software was used. The data were recorded via 29 passive wet electrodes placed on the scalp according to the 10–20 system7,8,41. Additionally, we used bilateral mastoids (M1 on the left and M2 on the right) as reference points for the EEG signals.

The electrode topography was organized as follows (Fig. 2): the prefrontal cortex (PFC) includes Fp1 and Fp2; the midline prefrontal cortex (mPFC) is represented by Fpz; the ventrolateral prefrontal cortex (VLPFC) comprises F7 and F8; the dorsolateral prefrontal cortex (DLPFC) consists of F3 and F4; the frontal cortex (FC) encompasses FC5, FC1, FC2, and FC6; the midline frontal cortex (mFC) includes Fz and Cz; the temporal cortex (TC) contains T7, T8, P7, and P8; the parietal cortex (PC) is made up of C3, C4, CP5, CP1, CP2, CP6, P3, and P4; the midline parietal cortex (mPC) is denoted by Pz; the occipital cortex (OC) involves O1 and O2; and the midline occipital cortex (mOC) is indicated by POz. The system recorded data with 24-bit resolution at a rate of 1000 samples per second. Visual triggers on the monitor were also employed to ensure synchronization. Electrode impedances were kept below 5 \(k\Omega\) at all electrode sites in all recordings.

Fig. 2

Electrode placement using the extended international 10–20 system. The extended international 10–20 system (10% system) was employed to place 32 electrodes on the scalp, with the electrodes distributed across various regions of the cortex. The electrodes were categorized as follows: prefrontal (Fp1, Fp2), medial prefrontal (mPFC) at Fpz, ventrolateral prefrontal (F7, F8), dorsolateral prefrontal (F3, F4), frontal (FC5, FC1, FC2, FC6), midfrontal (Fz, Cz), temporal (T7, T8, P7, P8), parietal (C3, C4, CP5, CP1, CP2, CP6, P3, P4), midparietal (Pz), occipital (O1, O2), and midoccipital (POz). The reference for the system was CPz, with grounding at AFz.

As previously described in the Experimental Task section, EEG signals were recorded continuously for each participant from the onset of the resting-state phase to the conclusion of the recall test phase. The resting-state phase lasted 20 s, the multimedia learning phase had a fixed duration of 342 s, and the recall test phase varied depending on participant response times. In this study, only EEG data corresponding to the multimedia learning phase (342 s) were analyzed. This approach ensured consistency across participants and alignment with the study’s focus on brain activity during the learning process. EEG data analysis and preprocessing were carried out via EEGLAB46 Toolbox version 2020.0 in the MATLAB environment. Prior to preprocessing, missing channels were interpolated using the spherical spline interpolation method47,48. Incorporating this interpolation step early in the preprocessing pipeline maintained the integrity of the EEG data, which is essential for obtaining accurate results49.

To begin the preprocessing of EEG signals, we applied a basic FIR band-pass filter within the 0.5–48 Hz range to eliminate DC and high-frequency noise. Mastoid referencing can introduce external experimental artifacts into EEG signals because of unstable connections with the mastoids. To mitigate this effect, we employed the re-referencing component of the PREP pipeline algorithm50 to estimate the true reference. Additionally, we utilized the artifact subspace reconstruction (ASR) algorithm51 to correct corrupted segments of the EEG data, including the removal of high-amplitude components such as eye blinks, muscle movements, and sensor motion52. We performed ASR using the Clean_Rawdata plug-in with default settings. In the final preprocessing step, we applied independent component analysis (ICA) via the fastICA algorithm to remove the remaining artifacts (specifically, eye movements) from the data.

Surface Laplacian transformation

Source imaging algorithms, such as weighted minimum-norm estimation (wMNE), exact low-resolution electromagnetic tomography (eLORETA), and beamforming, aim to mitigate the effects of volume conduction. However, these methods have limitations. Previous studies have indicated that using fewer electrodes than the minimum requirement (64 electrodes) can lead to inaccurate source reconstruction53. Moreover, the inverse problem lacks a unique solution54,55,56, so source reconstruction algorithms require precise forward and inverse models, a choice of anatomical template and head volume conductor model (including tissue conductivities), and initial assumptions about the sources. The surface Laplacian (SL) provides an alternative approach for estimating current-source density (CSD) that offers several advantages. The SL method is reference-free and does not rely on assumptions about the sources. Additionally, SL can produce reliable results even with a low-density setup of fewer than 64 electrodes47,57. In our study, we applied the SL to the corrected EEG signals to estimate the CSD, aiming to improve topographical localization and minimize volume conduction effects. A similar approach was employed in other studies40,58,59. Finally, for a more detailed examination of EEG signals across diverse spectral ranges, we considered corrected EEG signals in five frequency bands: delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), and gamma (30–48 Hz).

Brain network construction

This study employed the phase slope index \((PSI)\) to quantify both the magnitude and direction of information flow between different brain regions60. The implementation of PSI is publicly available at http://doc.ml.tu-berlin.de/causality/. \(PSI\) was selected for its insensitivity to volume conduction artifacts and its ability to identify non-zero phase delays, facilitating the accurate estimation of effective connectivity networks (ECNs) at the sensor level. This method is based on the slope of the phase of the cross-spectra between two time series. A fixed time delay in an interaction between two systems affects different frequency components differently. This is most easily seen if we assume that the interaction is merely a delay by time \(\tau\), i.e., \({y}_{2}(t) = a{y}_{1}(t - \tau )\), where \(a\) is a constant and \(y(t)\) is the measured data. In the Fourier domain, this relation reads \(\hat{y}_{2}(f) = a\exp ( - i2\pi f\tau )\hat{y}_{1}(f)\), where \(\hat{y}(f)\) denotes the Fourier transform of \(y(t)\). For the cross-spectrum \({S}_{ij}(f)\) between two channels \(i\) and \(j\), one has

$${S}_{12}\left(f\right)= \langle \hat{y}_{1}\left(f\right)\hat{y}_{2}^{*}\left(f\right)\rangle \sim \exp (i2\pi f\tau ) \equiv \exp (i\Phi (f))$$

where \(\langle .\rangle\) denotes expectation value. The phase-spectrum \(\Phi (f) = 2\pi f\tau\) is linear and proportional to the time delay \(\tau\). The slope of \(\Phi (f)\) can be estimated, where a positive slope indicates causality from \({y}_{1}\) to \({y}_{2}\), and a negative slope indicates the reverse.
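The linear phase spectrum of a pure delay can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available; the sampling rate, delay, gain, and white-noise signal are illustrative choices and are not taken from the study. It delays a signal by a fixed number of samples and recovers \(\tau\) from the slope of the cross-spectrum phase.

```python
import numpy as np

fs = 1000.0          # illustrative sampling rate in Hz (assumption)
delay_samples = 5    # illustrative delay: tau = 5 / fs = 5 ms (assumption)
n = 2048

rng = np.random.default_rng(0)
y1 = rng.standard_normal(n)
# y2(t) = a * y1(t - tau), implemented as a circular shift with a = 0.8
y2 = 0.8 * np.roll(y1, delay_samples)

Y1 = np.fft.rfft(y1)
Y2 = np.fft.rfft(y2)
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

# Cross-spectrum S12(f) = y1(f) * conj(y2(f)); its phase should be 2*pi*f*tau.
S12 = Y1 * np.conj(Y2)
phase = np.unwrap(np.angle(S12[1:-1]))   # skip the DC and Nyquist bins

# A positive slope indicates that y1 leads y2.
slope = np.polyfit(freqs[1:-1], phase, 1)[0]
tau_est = slope / (2 * np.pi)
```

In this noise-free toy case the estimated delay matches the imposed one; with real EEG the phase spectrum is noisy, which motivates the coherency-weighted average defined next.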

The idea is to define an average phase slope in such a way that (a) this quantity properly represents relative time delays of different signals and coincides with the classical definition for linear phase spectra, (b) it is insensitive to signals which do not interact regardless of spectral content and superpositions of these signals, and (c) it properly weights different frequency regions according to the statistical relevance. This quantity is termed ’Phase Slope Index’ (PSI) and is defined as

$${\widetilde{\Psi }}_{ij}=\Im \left(\sum_{f\in F}{Coh}_{ij}^{*}\left(f\right){Coh}_{ij}\left(f+\delta f\right)\right)$$

where \(F\) is the set of frequencies over which the slope is summed; \(\Im (\cdot)\) denotes the imaginary part; \({Coh}_{ij}\left(f\right)={S}_{ij}(f)/\sqrt{{S}_{ii}(f){S}_{jj}(f)}\) is the complex coherency; \(\delta f\) is the frequency resolution, with \(\delta f= {F}_{s}/{n}_{FFT}= 0.5\ \mathrm{Hz}\); and \(S\) is the cross-spectral matrix. Finally, the \(PSI\) values are normalized by their standard deviation, which is estimated via the jackknife method60:

$$\Psi =\widetilde{\Psi }/\mathrm{std}(\widetilde{\Psi })$$
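The definition above can be sketched in a few lines of Python. This is a hedged illustration, not the publicly available reference implementation: the function names (`cross_spectra`, `psi_from_cs`, `phase_slope_index`), the Hann window, the non-overlapping segmentation, and the synthetic two-channel data are all assumptions made for the example. The jackknife leaves out one segment at a time to estimate the standard deviation of \(\widetilde{\Psi }\).

```python
import numpy as np

def cross_spectra(x, fs, seg_len):
    """Per-segment cross-spectra of x (n_channels, n_samples).

    Returns an array of shape (n_seg, n_freq, n_ch, n_ch) and the frequency axis.
    """
    n_ch, n_s = x.shape
    n_seg = n_s // seg_len
    segs = x[:, :n_seg * seg_len].reshape(n_ch, n_seg, seg_len)
    segs = segs - segs.mean(axis=2, keepdims=True)
    spec = np.fft.rfft(segs * np.hanning(seg_len), axis=2)
    # S_ij(f) = y_i(f) * conj(y_j(f)) for every segment s and frequency f
    cs = np.einsum('isf,jsf->sfij', spec, np.conj(spec))
    return cs, np.fft.rfftfreq(seg_len, d=1.0 / fs)

def psi_from_cs(cs_mean, band):
    """Unnormalised PSI over contiguous frequency bins `band`."""
    c = cs_mean[band]
    power = np.real(np.einsum('fii->fi', c))
    coh = c / np.sqrt(power[:, :, None] * power[:, None, :])
    # Im( sum_f conj(Coh(f)) * Coh(f + delta_f) )
    return np.imag(np.sum(np.conj(coh[:-1]) * coh[1:], axis=0))

def phase_slope_index(x, fs, seg_len, f_lo, f_hi):
    """Jackknife-normalised PSI; entry [i, j] > 0 suggests flow from i to j."""
    cs, freqs = cross_spectra(x, fs, seg_len)
    band = np.where((freqs >= f_lo) & (freqs <= f_hi))[0]
    n_seg = cs.shape[0]
    total = cs.sum(axis=0)
    psi = psi_from_cs(total / n_seg, band)
    # Leave-one-segment-out estimates for the jackknife standard deviation
    loo = np.array([psi_from_cs((total - cs[k]) / (n_seg - 1), band)
                    for k in range(n_seg)])
    se = np.sqrt((n_seg - 1) / n_seg
                 * np.sum((loo - loo.mean(axis=0)) ** 2, axis=0))
    with np.errstate(invalid='ignore', divide='ignore'):
        out = np.where(se > 0, psi / se, 0.0)
    np.fill_diagonal(out, 0.0)
    return out

# Synthetic check: channel 1 is a delayed, noisy copy of channel 0.
rng = np.random.default_rng(0)
fs, seg_len = 200.0, 400               # 2-s segments give 0.5 Hz resolution
src = rng.standard_normal(int(fs) * 120)
delayed = np.roll(src, 4) + 0.3 * rng.standard_normal(src.size)
psi = phase_slope_index(np.vstack([src, delayed]), fs, seg_len, 8.0, 13.0)
```

Here `psi[0, 1]` is positive and `psi[1, 0]` is its negative, which reflects the skew-symmetry of the measure.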

Importantly, the weighted average of the slope, \(\widetilde{\Psi }\), becomes zero if the imaginary part of the coherency approaches zero. Because it is unaffected by zero phase differences, it is insensitive to mixtures of non-interacting sources, which makes it a robust measure of interactions between brain regions61. In the next step, the normalized \(PSI\) values were arranged in a \(29\times 29\) skew-symmetric matrix containing \(406\) possible pairwise associations (\(({N}^{2}-N)/2\), where \(N=29\)). A statistical threshold \(\left(\left|\Psi \right|> 2\right),\) corresponding to a 95% confidence interval of the \(PSI\) distribution at \(p < 0.05\) (two-tailed), was then applied to the obtained matrix60. This threshold was used to prepare the data for graph theoretical analysis: values falling below the threshold were set to zero, whereas those surpassing the threshold kept their original values. To extract effective connectivity networks and apply graph theoretical analysis, we relied on the resulting directed and weighted adjacency matrix. In this context, electrodes serve as nodes within the graph or network, and the edges represent connections between different brain regions, with each edge corresponding to an entry in the adjacency matrix.
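As a small illustration of the thresholding step, the sketch below (Python with NumPy) zeroes sub-threshold entries of a toy normalized PSI matrix. The helper names and the toy values are hypothetical. The rule for turning the skew-symmetric matrix into a directed adjacency matrix, keeping only positive entries so that `psi[i, j] > 0` becomes an edge from `i` to `j`, is an assumption made for this example and is not stated in the text.

```python
import numpy as np

def threshold_psi(psi, thr=2.0):
    """Set entries with |psi| <= thr to zero; keep the others unchanged."""
    return np.where(np.abs(psi) > thr, psi, 0.0)

def directed_adjacency(psi_thr):
    """Assumption: a positive psi[i, j] is read as a weighted edge i -> j."""
    return np.where(psi_thr > 0, psi_thr, 0.0)

# Number of distinct electrode pairs for N = 29
n_nodes = 29
n_pairs = (n_nodes ** 2 - n_nodes) // 2

# Toy 3-node skew-symmetric matrix of normalised PSI values (made up)
psi_demo = np.array([[0.0, 3.1, -1.2],
                     [-3.1, 0.0, 2.4],
                     [1.2, -2.4, 0.0]])
adj_demo = directed_adjacency(threshold_psi(psi_demo))
```

In the toy case only the pairs (0, 1) and (1, 2) survive the threshold, giving edges 0 → 1 and 1 → 2.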

Measures of brain networks

The topological properties of identified effective connectivity networks can be analyzed via fundamental graph theoretical measures such as the node degree and directionality index30. Additionally, key metrics related to integration, segregation, network sparsity and centrality offer deeper insights into network structure. We computed all these metrics from the directed and weighted adjacency matrices via the Brain Connectivity Toolbox (BCT)29.

Node degree and directionality index

The degree of a node refers to the number of edges connected to that node. Since the adjacency matrix is generated from \(PSI\) values, it reveals the direction of information flow. We therefore divide a node’s degree into two components: the in-degree \(({k}_{i}^{in})\), which represents the strength of the incoming flow, and the out-degree \(({k}_{i}^{out})\), which represents the strength of the outgoing flow. They are computed as follows:

$$\left\{\begin{array}{c}{k}_{i}^{in}=\sum_{j\in N}{A}_{ji}\\ {k}_{i}^{out}=\sum_{j\in N}{A}_{ij}\end{array}\right.$$

Nodes with high out-degree values are regions that can exert influence over others. Conversely, nodes with high in-degree values indicate areas influenced by other regions. The total degree \((TD)\) characterizes the hubness or centrality of a node in the network. For node \(i\), \(TD\) is expressed as

$${TD}_{i}={k}_{i}^{in}+{k}_{i}^{out}$$

To determine the overall direction of the information flow at each node, we can use the directionality index \((DI)\), which is expressed as

$${DI}_{i}={k}_{i}^{out}-{k}_{i}^{in}$$

A positive value of \(DI\) indicates that the electrode behaves as a source or sender of information, actively transmitting data. Conversely, a negative value of \(DI\) denotes the electrode’s role as a sink or receiver, passively accepting incoming information.
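These per-node quantities reduce to row and column sums of the adjacency matrix. The following Python sketch (NumPy assumed) uses a hypothetical helper `degree_measures` and a toy three-node network, and takes `adj[i, j]` to be the edge from node `i` to node `j`, matching the in-degree and out-degree definitions above.

```python
import numpy as np

def degree_measures(adj):
    """Return in-degree, out-degree, total degree and directionality index.

    adj[i, j] is the (weighted) edge from node i to node j.
    """
    adj = np.asarray(adj, dtype=float)
    k_in = adj.sum(axis=0)     # k_i^in  = sum_j A_ji (column sums)
    k_out = adj.sum(axis=1)    # k_i^out = sum_j A_ij (row sums)
    return k_in, k_out, k_in + k_out, k_out - k_in

# Toy network: 0 -> 1, 0 -> 2, 1 -> 2
adj = np.array([[0, 1, 1],
                [0, 0, 1],
                [0, 0, 0]])
k_in, k_out, td, di = degree_measures(adj)
```

In this toy network node 0 has a positive \(DI\) (a source), node 2 has a negative \(DI\) (a sink), and node 1 is balanced.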

Measures of functional segregation

The brain exhibits functional segregation, allowing specialized processing to occur in densely interconnected groups of brain regions. Measures of segregation primarily evaluate the presence of these groups, which are commonly known as clusters or modules, within the network. The clustering coefficient \((CC)\) of a node represents how close its neighbors tend to cluster together62. The global clustering coefficient is computed by averaging the local clustering coefficients across all nodes, yielding a value between 0 and 1. Mathematically, it is computed as follows:

$$CC=\frac{1}{N}\sum_{i\in N}{C}_{i}=\frac{1}{N}\sum_{i\in N}\frac{{t}_{i}}{({k}_{i}^{out}+{k}_{i}^{in})\left({k}_{i}^{out}+{k}_{i}^{in}-1\right)-2\sum_{j\in N}{A}_{ij}{A}_{ji}}$$

where \(N\) is the number of nodes; \({C}_{i}\) is the local clustering coefficient; \({t}_{i}\) is the number of triangles that exist around each node; \({k}_{i}^{in}\) and \({k}_{i}^{out}\) are the in-degree and out-degree of a node, respectively; and \({A}_{ij}\) and \({A}_{ji}\) are entries of the adjacency matrix.
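A minimal Python sketch of this formula is given below (NumPy assumed). It treats the network as binary, which is a simplification relative to the weighted networks analyzed in the study, and it counts triangles as \({t}_{i}=\frac{1}{2}\sum_{j,h}({A}_{ij}+{A}_{ji})({A}_{ih}+{A}_{hi})({A}_{jh}+{A}_{hj})\), the usual definition for binary directed graphs. The function name is hypothetical; this is not the BCT routine.

```python
import numpy as np

def clustering_coef_directed(adj):
    """Local and mean clustering coefficient of a binary directed network."""
    a = (np.asarray(adj) != 0).astype(float)
    np.fill_diagonal(a, 0.0)
    s = a + a.T                                  # A_ij + A_ji
    t = np.diag(s @ s @ s) / 2.0                 # directed triangles around i
    k_tot = a.sum(axis=0) + a.sum(axis=1)        # k_in + k_out
    # Possible triangles, excluding "false pairs" from reciprocal links
    denom = k_tot * (k_tot - 1) - 2.0 * np.diag(a @ a)
    c = np.zeros_like(t)
    ok = denom > 0
    c[ok] = t[ok] / denom[ok]
    return c, c.mean()

# Directed 3-cycle 0 -> 1 -> 2 -> 0
cycle = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]])
# Fully reciprocal triangle
full = np.ones((3, 3)) - np.eye(3)

c_cycle, cc_cycle = clustering_coef_directed(cycle)
c_full, cc_full = clustering_coef_directed(full)
```

The directed cycle yields 0.5 for every node, whereas the fully reciprocal triangle yields 1.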

Local efficiency \((LE)\) quantifies how information is transmitted within local clusters. It reflects the ability of neighboring nodes to communicate efficiently when a specific node is removed. High local efficiency values facilitate parallel processing, allowing effective integration of information.

$$LE=\frac{1}{2N}\sum_{i\in N}\frac{\sum_{j,h\in N,j\ne i}\left({A}_{ij}+{A}_{ji}\right)\left({A}_{ih}+{A}_{hi}\right)\left[{\left({d}_{jh}\left({N}_{i}\right)\right)}^{-1}+{\left({d}_{hj}\left({N}_{i}\right)\right)}^{-1}\right]}{({k}_{i}^{out}+{k}_{i}^{in})\left({k}_{i}^{out}+{k}_{i}^{in}-1\right)-2\sum_{j\in N}{A}_{ij}{A}_{ji}}$$

where \({d}_{hj}\left({N}_{i}\right)\) is the length of the shortest path between \(j\) and \(h\) within the subgraph composed of the neighbors of node \(i\).
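The local efficiency formula can be sketched as follows in Python (NumPy assumed). As in the previous simplifications, the network is binarized, path lengths are hop counts computed with a Floyd–Warshall pass, and unreachable pairs contribute zero. The helper names are hypothetical and the code is an illustration of the equation above, not the toolbox implementation.

```python
import numpy as np

def _path_lengths(a):
    """Hop-count shortest paths (Floyd-Warshall); inf where unreachable."""
    n = a.shape[0]
    d = np.where(a != 0, 1.0, np.inf)
    np.fill_diagonal(d, 0.0)
    for k in range(n):
        d = np.minimum(d, d[:, [k]] + d[k, :])
    return d

def local_efficiency_directed(adj):
    """Per-node and mean local efficiency of a binary directed network."""
    a = (np.asarray(adj) != 0).astype(float)
    np.fill_diagonal(a, 0.0)
    n = a.shape[0]
    s = a + a.T
    le = np.zeros(n)
    for i in range(n):
        nb = np.flatnonzero(s[i])                 # neighbours of node i
        k_tot = s[i].sum()                        # k_in + k_out
        denom = k_tot * (k_tot - 1) - 2.0 * np.sum(a[i] * a[:, i])
        if nb.size < 2 or denom <= 0:
            continue
        # Shortest paths inside the subgraph formed by the neighbours of i
        d = _path_lengths(a[np.ix_(nb, nb)])
        with np.errstate(divide='ignore'):
            inv = 1.0 / d
        np.fill_diagonal(inv, 0.0)
        w = np.outer(s[i, nb], s[i, nb])          # (A_ij + A_ji)(A_ih + A_hi)
        le[i] = 0.5 * np.sum(w * (inv + inv.T)) / denom
    return le, le.mean()

cycle = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]])
full = np.ones((3, 3)) - np.eye(3)
le_cycle, le_cycle_mean = local_efficiency_directed(cycle)
le_full, le_full_mean = local_efficiency_directed(full)
```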

Measure of functional integration

Functional integration refers to the brain’s ability to rapidly merge specialized information from diverse brain regions. Measures of integration define this concept by assessing how easily brain regions communicate. These measures typically rely on the concept of a path, which consists of sequences of distinct nodes and links30. The global efficiency \((GE)\) is a fundamental indicator of a network’s integration and is computed as the average inverse shortest path length63. The shortest path length \(({PL}_{ij})\) is the lowest number of edges that must be traversed between a given pair of nodes \(i\) and \(j\).

$$GE=\frac{1}{N}\sum_{i\in N}\frac{\sum_{j\in N,j\ne i}{\left({PL}_{ij}\right)}^{-1}}{N-1}$$

High \(GE\) values indicate efficient communication and fewer processing steps between network nodes, whereas low \(GE\) values indicate the opposite.
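For illustration, global efficiency can be computed from hop-count shortest paths as in the Python sketch below (NumPy assumed). The function names and toy networks are hypothetical, the network is binarized, and unreachable pairs are given an inverse path length of zero, which is a common convention assumed here.

```python
import numpy as np

def shortest_path_lengths(adj):
    """Directed hop-count shortest paths via Floyd-Warshall."""
    a = np.asarray(adj)
    n = a.shape[0]
    d = np.where(a != 0, 1.0, np.inf)
    np.fill_diagonal(d, 0.0)
    for k in range(n):
        d = np.minimum(d, d[:, [k]] + d[k, :])
    return d

def global_efficiency(adj):
    """Average inverse shortest path length over all ordered node pairs."""
    d = shortest_path_lengths(adj)
    n = d.shape[0]
    with np.errstate(divide='ignore'):
        inv = 1.0 / d              # 1 / inf = 0 for unreachable pairs
    np.fill_diagonal(inv, 0.0)     # exclude j == i
    return inv.sum() / (n * (n - 1))

cycle = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]])      # 0 -> 1 -> 2 -> 0
chain = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]])      # 0 -> 1 -> 2
ge_cycle = global_efficiency(cycle)
ge_chain = global_efficiency(chain)
```

The closed cycle is more efficient than the open chain because every node can reach every other node.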

Measures of centrality and network sparsity

Important brain regions, known as hubs, communicate with many other regions, enabling functional integration and contributing significantly to network resilience against insults. Different measures of centrality assess the significance of individual nodes based on these principles30. Betweenness centrality \((BC)\) represents the proportion of all the shortest paths in a network that pass through a specific node. Nodes with high \(BC\) values serve as hubs, participating in numerous shortest paths. Mathematically, \(BC\) is computed for each node in the network by

$${BC}_{i}=\frac{1}{(N-1)(N-2)}\sum_{h,j\in N, h\ne j, h\ne i, j\ne i}\frac{{sp}_{hj}(i)}{{sp}_{hj}}$$

where \({sp}_{hj}\) is the number of shortest paths between nodes \(h\) and \(j\), and \({sp}_{hj}(i)\) is the number of shortest paths between nodes \(h\) and \(j\) that pass through \(i\)30. Removing nodes with high \(BC\) significantly impacts network performance because these nodes are essential for efficient communication.
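One standard way to evaluate this sum is Brandes' algorithm, sketched below in Python for a binary directed network. The choice of algorithm, the binarization, and the function name are assumptions made for this example; the text does not state how the toolbox computes \(BC\) internally. The network must have at least three nodes for the normalization to be defined.

```python
from collections import deque

import numpy as np

def betweenness_directed(adj):
    """Normalised betweenness centrality of a binary directed network."""
    a = np.asarray(adj) != 0
    n = a.shape[0]
    nbrs = [np.flatnonzero(a[i]).tolist() for i in range(n)]
    bc = np.zeros(n)
    for s in range(n):
        sigma = np.zeros(n)            # number of shortest paths from s
        sigma[s] = 1.0
        dist = [-1] * n
        dist[s] = 0
        preds = [[] for _ in range(n)]
        order = []
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in nbrs[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = np.zeros(n)            # dependency accumulation
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc / ((n - 1) * (n - 2))

chain = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]])          # 0 -> 1 -> 2
cycle = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]])          # 0 -> 1 -> 2 -> 0
bc_chain = betweenness_directed(chain)
bc_cycle = betweenness_directed(cycle)
```

In the chain, only the middle node lies on a shortest path between two other nodes, so it is the only node with non-zero \(BC\).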

The networks in this study are directed and weighted. Each network can have up to \({(N }^{2}-N)/2\) possible edges. The network sparsity, a measure of the density of connections within the network, was calculated by dividing the total number of existing edges by the maximum possible number of edges in the network. This approach allows for the comparison of different networks in terms of their connectivity density.
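The sparsity calculation described above amounts to a single ratio. In the Python sketch below (NumPy assumed), the function name is hypothetical and the adjacency matrix is assumed to have a zero diagonal and at most one directed edge per node pair, as is the case for a thresholded skew-symmetric \(PSI\) matrix when only one direction per pair is kept.

```python
import numpy as np

def network_density(adj):
    """Existing edges divided by the (N^2 - N) / 2 possible edges."""
    adj = np.asarray(adj)
    n = adj.shape[0]
    n_edges = np.count_nonzero(adj)
    return n_edges / ((n * n - n) / 2)

# Toy 4-node network with 3 of the 6 possible edges
adj = np.zeros((4, 4))
adj[0, 1] = 2.5
adj[1, 2] = 3.0
adj[3, 0] = 2.2
density = network_density(adj)

# Maximum number of edges for the 29-electrode montage
max_edges_29 = (29 * 29 - 29) // 2
```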

Network modularity

Modularity analysis is used to evaluate the quality of a network’s community structure. The goal of these methods is to identify communities, or groups of nodes, that have high internal connectivity and low external connectivity to other communities64. In directed networks, a modularity optimization method can be applied to identify communities65. This approach aims to find a division of the network into communities that maximize the benefit function \(Q\), called modularity, which is defined as follows:

$$Q=\frac{1}{m}\sum_{ij}\left[{A}_{ij}-\frac{{k}_{i}^{in}{k}_{j}^{out}}{m}\right]{\delta }_{{c}_{i},{c}_{j}}$$

where \({A}_{ij}\) is defined conventionally: it equals 1 if there is an edge from \(j\) to \(i\), and 0 otherwise. The terms \({k}_{i}^{in}\) and \({k}_{j}^{out}\) represent the in-degree and out-degree of the respective vertices, whereas \(m\) denotes the total number of edges in the network. The Kronecker delta \({\delta }_{ij}\) is equal to 1 if \(i=j\) and \(0\) otherwise. Additionally, \({c}_{i}\) is the label of the community to which the vertex \(i\) is assigned65.
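To make the benefit function concrete, the following sketch evaluates \(Q\) for a given partition of a binary directed network. It only scores a partition and does not perform the optimization itself; the function name and the toy network are hypothetical.

```python
def directed_modularity(A, communities):
    """Evaluate Q = (1/m) * sum_ij [A_ij - k_i^in * k_j^out / m] * delta(c_i, c_j).

    A[i][j] is 1 if there is an edge from j to i (the convention used above).
    communities[i] is the community label of node i.
    """
    n = len(A)
    m = sum(sum(row) for row in A)                               # total edges
    k_in = [sum(A[i][j] for j in range(n)) for i in range(n)]    # edges arriving at i
    k_out = [sum(A[i][j] for i in range(n)) for j in range(n)]   # edges leaving j
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += A[i][j] - k_in[i] * k_out[j] / m
    return q / m

# Hypothetical toy network: two disconnected pairs, each linked in both directions.
pairs = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
q_split = directed_modularity(pairs, [0, 0, 1, 1])   # one community per pair
q_merged = directed_modularity(pairs, [0, 0, 0, 0])  # everything in one community
```

Placing each connected pair in its own community yields a higher \(Q\) than merging all nodes, which is the behavior the optimization exploits.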

To identify stable communities in the brain networks of principal and non-principal groups, we employed a multistep approach. First, for each subject, we computed a 29 × 29 association matrix66,67 via the modularity optimization method described above. The element \({A}_{ij}\) in this matrix represents the number of times the nodes \(i\) and \(j\) were assigned to the same module across 100 runs of the modularity algorithm. Next, we generated a null model by randomly permuting the original partitions 100 times. For each of these 100 random partitions, we reassigned nodes uniformly to the modules present in the partition. This process yields a null model matrix, where the element \({A}_{ij}\) is the number of times nodes \(i\) and \(j\) are randomly assigned to the same community. To remove the effects of randomness, we thresholded the original association matrix by setting any element \({A}_{ij}\) to 0 if its value was less than the maximum value observed in the random association matrix68. This step ensured that only the significant co-assignments were retained in the thresholded matrix. Finally, we computed the mean of the thresholded association matrices for all the subjects within each group (principal and non-principal). This group-level thresholded association matrix served as the input for a final round of modularity optimization via the following modularity method for an undirected network69:

$$Q=\frac{1}{2m}\sum_{ij}\left[{A}_{ij}-{P}_{ij}\right]{\delta }_{{c}_{i},{c}_{j}}$$

where the actual number of edges falling between a particular pair of vertices \(i\) and \(j\) is \({A}_{ij}\), and \({P}_{ij}\) is the expected number of edges between \(i\) and \(j\), a definition that allows for the possibility that there may be more than one edge between a pair of vertices, which happens in certain types of networks69.
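The association-matrix and thresholding steps described in this subsection can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: partitions are given as lists of module labels (one list per run), the diagonal is left at zero, and all function names and the toy partitions are hypothetical.

```python
import random

def association_matrix(partitions):
    """Count how often each node pair shares a module across the given partitions."""
    n = len(partitions[0])
    A = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if i != j and labels[i] == labels[j]:
                    A[i][j] += 1
    return A

def null_association_matrix(partitions, rng):
    """Reassign every node uniformly at random to one of the modules present
    in each partition, then count co-assignments as before."""
    randomized = []
    for labels in partitions:
        modules = sorted(set(labels))
        randomized.append([rng.choice(modules) for _ in labels])
    return association_matrix(randomized)

def threshold_association(A, A_null):
    """Zero out entries that fall below the largest value seen in the null matrix."""
    cutoff = max(max(row) for row in A_null)
    return [[a if a >= cutoff else 0 for a in row] for row in A]

# Hypothetical example: 100 runs that always return the same two modules.
runs = [[0, 0, 1, 1] for _ in range(100)]
A_obs = association_matrix(runs)
A_rand = null_association_matrix(runs, random.Random(0))
A_thr = threshold_association(A_obs, A_rand)
```

Averaging the thresholded matrices over the subjects in a group would then give the group-level matrix that enters the final modularity optimization.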

To quantitatively evaluate the role of each module during task execution, we utilized the integration metric as described in earlier research38,70. This metric measures the interactions among modules within the network. The integration of a specific module is calculated as the average number of connections that each node within that module has with nodes from other modules71,72.
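One possible reading of this integration metric is sketched below. It assumes that a connection is counted once per pair of nodes regardless of direction or weight, which the text above does not specify; the function name and the toy network are hypothetical.

```python
def module_integration(adj, communities, module):
    """Average number of connections per node of `module` to nodes in other modules.

    Assumption: a connection exists between i and j if either directed
    entry adj[i][j] or adj[j][i] is nonzero.
    """
    n = len(adj)
    members = [i for i in range(n) if communities[i] == module]
    external = 0
    for i in members:
        for j in range(n):
            if communities[j] != module and (adj[i][j] or adj[j][i]):
                external += 1
    return external / len(members)

# Hypothetical 5-node network with edges 0->2, 0->3 and 3->1.
net = [
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
labels = [0, 0, 1, 1, 1]
integration_m0 = module_integration(net, labels, 0)
integration_m1 = module_integration(net, labels, 1)
```

Here module 0 has three cross-module connections spread over two nodes, whereas module 1 has the same three connections spread over three nodes, so module 0 receives the higher integration value.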

Statistical analysis

To compare network metrics between the principal and non-principal groups, including the clustering coefficient, local efficiency, global efficiency, betweenness centrality and network sparsity, we used the Wilcoxon Rank-Sum test (also known as the Mann–Whitney U test). First, we computed all network measures in each network obtained from each participant’s data. In the next step, we calculated the mean of each network measure among the subjects who participated in each task (principal and non-principal) separately. The error bars represent the standard error of the mean (SEM). The same statistical analysis was performed for the behavioral data.
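The group comparison can be illustrated with a minimal sketch of the rank-sum statistic. It uses the normal approximation without a tie correction, which may differ from the exact procedure used in the study; the function names and the sample values are hypothetical and are not study data.

```python
import math
import statistics

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (z, p), where z is the standardized rank sum of sample x.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1      # tied values share their mean rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))    # equals 2 * (1 - Phi(|z|))
    return z, p

def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical metric values for two groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [6.0, 7.0, 8.0, 9.0, 10.0]
z_ab, p_ab = rank_sum_test(group_a, group_b)
sem_a = sem(group_a)
```

Because the test relies on ranks rather than raw values, it does not assume normally distributed network metrics.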

Results

In our experiment, two educational multimedia programs with the same audio content were designed. The participants were randomly assigned to either the principal multimedia (P group) or the non-principal multimedia (NP group). Behavioral performance was assessed using the NASA-TLX questionnaire and a recall test. Then, main graph measurements and network modularity were computed for the cortical brain networks extracted from the EEG data. Correlation analysis was also performed between the behavioral results and graph measures to provide a better understanding of the differences in the brain patterns of connectivity between the P and NP groups. In the gamma frequency band, we did not find any statistically significant differences in the graph metrics between the P and NP networks, or any correlations between the graph metrics and the behavioral results. Therefore, results related to the gamma band have not been reported.

NASA-TLX and recall test results

To assess whether the designed multimedia under the P and NP conditions influenced participants’ cognitive load and attentiveness, subjects were required to answer 12 four-option questions about the multimedia content immediately after viewing the video. Subsequently, they completed the NASA-TLX questionnaire to evaluate their perceived cognitive load. We hypothesized that significant differences would emerge between the P and NP conditions.

The results, presented in Fig. 3, confirm this hypothesis. Specifically, Fig. 3a demonstrates that the NASA-TLX scores for the P group were significantly lower than those for the NP group, indicating that participants who engaged with the P multimedia experienced a reduced overall cognitive load compared to those exposed to the NP multimedia. This difference was statistically significant according to the Wilcoxon rank-sum test \((p = 3.2\times {10}^{-8} < 0.0001)\).

Fig. 3

Behavioral analysis: NASA-TLX scores and recall test accuracy scores. (a) NASA-TLX scores of all the subjects after they viewed principal or non-principal multimedia. The non-principal group presented significantly higher NASA-TLX scores. (b) Accuracy scores of the recall tests for all the subjects after they watched principal or non-principal multimedia. The principal group outperformed the non-principal group. The scores were measured on a scale from 0 to 100. The p-value from the Wilcoxon rank-sum test is annotated in both panels.

Additionally, the P group achieved higher accuracy on the recall test (Fig. 3b), calculated as the percentage of correct responses out of total responses, than the NP group. This finding suggests that the NP multimedia increased cognitive load, adversely affecting performance on the recall test. The Wilcoxon rank-sum test confirmed a statistically significant difference between groups \((p = 1.2\times {10}^{-6} < 0.0001)\).

Graph theoretical analysis

Hub centers and the strength of information flow

Figure 4 presents topographical maps of the average Directionality Index (DI) and Total Degree (TD) at each electrode during multimedia learning, across multiple frequency bands, for both the principal and non-principal conditions. DI analysis (Fig. 4a) showed that the NP group exhibited organized posterior-to-anterior information flow in the theta and beta bands, while the P group displayed less structured activity. In the delta band, the NP group demonstrated left-to-right information flow, whereas the P group had dominant left-hemisphere receiver nodes. In the alpha band, receiver nodes were concentrated in the mFC for the P group and in the lateral cortices for the NP group. Moreover, intergroup differences were evaluated by comparing DI values at each electrode between groups. Electrodes with statistically significant differences \((p < 0.05)\), as determined by the Wilcoxon rank-sum test, are highlighted in dark colors in the third column of Fig. 4a.

Fig. 4

Topological maps of directionality index and total degree across frequency bands. (a) Mean directionality index (DI) in principal and non-principal groups across the delta, theta, alpha, and beta frequency bands. Red indicates sender nodes, while blue indicates receiver nodes. In the third column, the DI was calculated for each electrode in each participant within each group, and DI values were compared between groups. The resulting p-values were transformed using \(-\ln(p)\); values exceeding 3 (corresponding to \(p < 0.05\)) were considered statistically significant and are indicated by darker colors. The colorbar represents the \(-\ln(p)\) values. (b) Mean total degree (TD) of all electrodes in the principal and non-principal brain networks across the delta, theta, alpha, and beta bands. Red indicates nodes with a high total degree, while blue indicates nodes with a low total degree. In the third column, TD was computed for each electrode in each participant within each group, and TD values were compared between groups. Electrodes with \(-\ln(p)\) values exceeding 3 (equivalent to \(p < 0.05\)) are indicated by darker colors. All statistical comparisons were performed using the Wilcoxon rank-sum test.

TD analysis (Fig. 4b) highlighted regional differences in connectivity. In the delta band, the NP group showed higher TD in the left PFC, right PC, and TC, while the P group exhibited greater TD in the right frontal and parietal regions. The theta band displayed higher TD in frontal areas for the NP group, and the alpha band revealed a broader TD distribution in the NP group compared to the concentrated activity in the mFC for the P group. In the beta band, the NP group had higher TD in the parietal cortex, whereas the P group demonstrated stronger connectivity in the frontal cortex. To highlight intergroup differences, TD values at each electrode were compared between the two groups. Electrodes with statistically significant differences \((p < 0.05)\) are marked in dark colors in the third column of Fig. 4b. Overall, across the whole brain and all frequency bands, the nodes of the NP networks had higher total degrees than the nodes of the P networks. These results indicate that the principal and non-principal conditions exhibited distinct, frequency-specific connectivity patterns, highlighting differences in information processing during multimedia learning.

Global and local information processing

The results for local information processing or functional segregation, as measured by the clustering coefficient (CC) and local efficiency (LE), are shown in Fig. 5a and b for both the principal and non-principal groups. In the delta, alpha, and beta bands, no significant difference was observed between the mean CC values for the NP group and the P group. However, in the theta band, the mean CC value for the NP group was significantly greater than that for the P group (\(p=0.013\)). Moreover, the mean LE values in all bands in the P group were greater than those in the NP group. Specifically, statistically significant differences were observed in the theta and beta bands (\(p=0.009\) and \(p=0.014\), respectively) (Fig. 5b).

Fig. 5

Measures of functional segregation and integration. (a, b) Functional segregation measures, clustering coefficient and local efficiency in the principal and non-principal groups across the delta, theta, alpha, and beta frequency bands. (a) The clustering coefficient was significantly higher in the non-principal group than in the principal group in the theta band. (b) The local efficiency of the principal group was greater than that of the non-principal group in the theta and beta bands. (c) The functional integration measure, global efficiency, in the principal and non-principal groups. In the delta, theta and alpha frequency bands, the global efficiency increased significantly in the non-principal group compared with the principal group. The p-values are annotated in all figures to indicate the statistical significance of the observed differences between the two groups.

Regarding global efficiency (GE) as a representative of functional integration or global information processing, Fig. 5c shows that the mean GE values in the NP group were significantly greater than those in the P group in the delta, theta, and alpha bands (\(p=0.034\), \(p=0.004\), \(p=0.003\)), whereas no significant difference was found in the beta band.

Measures of centrality and network sparsity

Figure 6a shows that across all frequency bands, the NP networks have higher betweenness centrality (BC) values than the P networks do. Specifically, this difference was significant in the delta, theta, and alpha bands (\(p=0.042, p=0.003, p=0.003\)). The results for network sparsity (NS) exhibit a similar trend. In all frequency bands, the mean value of NS in the NP group was greater than that in the P group, and statistically significant differences were observed in all frequency bands except the beta band (\(p=0.042, p=0.005, p=0.004\)). Network sparsity values, which are indicative of complexity in brain networks73,74, suggest that the NP group exhibits more complex connectivity patterns than the P group, as shown in Fig. 6b.

Fig. 6

Betweenness centrality and network sparsity. (a) The mean betweenness centrality in the P and NP groups across the delta, theta, alpha, and beta frequency bands. Compared with the P group, the NP group presented significantly greater betweenness centrality values in the delta, theta, and alpha bands. (b) Network sparsity comparison. Similar results were obtained for network sparsity: in all frequency bands except the beta band, the NP group demonstrated significantly higher network sparsity values than the P group. The p-values are annotated in all panels.

Network modularity

We clustered the networks into modules or communities to further investigate the differences between the principal and non-principal groups, allowing for a distinct observation of the integration and segregation parameters. We computed network modularity for both groups in the theta and alpha frequency bands, which are particularly relevant in cognitive load studies.

A qualitative visual inspection of the association matrices (Fig. 7a) revealed distinct modular configurations between the two groups, particularly regarding intermodular activity (integration) and intramodular connectivity (segregation).

Fig. 7

Network modularity in the theta and alpha frequency bands. (a) Modular allegiance matrices in the theta and alpha frequency bands for the principal and non-principal groups. (b) The integration values for each module in both conditions in the theta and alpha frequency bands. Compared with the principal group, the non-principal group had more modules and higher integration values in both bands.

The findings also indicate that the non-principal networks exhibited high intermodular activity (high integration) and low intramodular connectivity (low segregation) in the theta and alpha frequency bands (Fig. 7b). This pattern suggests that the non-principal group may exhibit a more complex and interconnected modular structure, potentially reflecting differences in cognitive processing or functional organization between the two groups.

Correlation analysis between graph metrics and behavioral measures

We investigated the correlations between graph metrics and behavioral performance, including NASA-TLX scores and recall test accuracy, across different frequency bands, electrodes, and between the principal and non-principal groups. Significant correlations were found between the NASA-TLX scores and the graph metrics only in the theta band. However, significant correlations were observed between the recall test accuracy and the graph metrics in the alpha and beta bands (Table 1).

Table 1 Correlation analysis between graph metrics and behavioral scores.

In the principal group, in the theta band, there was a negative correlation between local efficiency (LE) values and NASA-TLX scores at electrode C4, as well as a negative correlation between betweenness centrality (BC) values and NASA-TLX scores at electrode FC6. A positive correlation was observed between LE values at electrode CP6 and recall test accuracy in the alpha band in the principal group. Notably, all three electrodes (C4, FC6, CP6) are located in the right hemisphere (Table 1).

In the non-principal group, clustering coefficient values at electrode T8 were negatively correlated with NASA-TLX scores in the theta band, suggesting that cognitive load decreases as local processing increases in this area. Additionally, recall test accuracy was significantly correlated with graph metrics in the alpha and beta bands. In the alpha band, negative correlations between the clustering coefficient and recall test accuracy were observed in the left hemisphere regions (P7, P3). In the beta band, the clustering coefficient values were negatively correlated with recall test accuracy in left hemisphere regions (CP5, T7), while betweenness centrality exhibited significant negative correlations with recall test accuracy in regions FP1, F4, and T8 (Table 1). These findings indicate that variations in functional network measures across certain brain regions and frequency bands are associated with cognitive load and recall test performance.

Figure 8 shows the strongest significant correlations between graph metrics and behavioral measures in the theta and alpha frequency bands. In the alpha band, there was a positive correlation between the local efficiency of CP6 and the accuracy score of the principal group (\(p=0.01, R=0.61\)) (blue line). Conversely, there was a negative correlation between the accuracy score and the clustering coefficient of P3 in the non-principal group (\(p=0.001, R=-0.7\)) (red line). In the theta band, significant negative correlations were observed at FC6 in the principal group, which links betweenness centrality to NASA-TLX scores (\(p=0.002, R=-0.71\)), and at T8 in the non-principal group, which associates the clustering coefficient with NASA-TLX scores (\(p=0.01, R=-0.59\)).
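For readers who wish to reproduce this kind of analysis, a correlation coefficient between a graph metric and a behavioral score can be computed as sketched below. The sketch assumes a Pearson coefficient, which is not confirmed by the text above, and the function name and sample values are hypothetical rather than study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical per-subject values: a graph metric at one electrode and a score.
metric = [1.0, 2.0, 3.0, 4.0]
score_rising = [2.0, 4.0, 6.0, 8.0]
score_falling = [8.0, 6.0, 4.0, 2.0]
r_rising = pearson_r(metric, score_rising)
r_falling = pearson_r(metric, score_falling)
```

A rank-based coefficient such as Spearman's could be obtained by applying the same function to the ranks of the values instead of the raw values.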

Fig. 8

Relationships between graph metrics and behavioral data. (a) The most significant relationships between the graph metrics and the accuracy scores and NASA-TLX scores in the principal group. The x-axis denotes the values of the graph metrics, whereas the y-axis represents the associated behavioral performance scores. In the alpha band, a positive correlation was found between the local efficiency (LE) values of all the subjects in the principal group and their corresponding accuracy scores. Conversely, in the theta band, a negative correlation was observed between the betweenness centrality (BC) values of all principal subjects and their NASA-TLX scores. The third column displays a topographic plot of electrodes showing significant correlations between at least one graph metric and behavioral scores in the principal condition, as detailed in Table 1. These electrodes are colored in blue. (b) The strongest correlations between the graph metrics and accuracy scores, along with the NASA-TLX scores, in the non-principal group. In the alpha and theta bands, significant negative correlations were found between the clustering coefficient (CC) of NP brain networks and both accuracy scores and NASA-TLX scores. The third column presents a topographic plot of electrodes with significant correlations between at least one graph metric and behavioral scores in the non-principal condition, as reported in Table 1. These electrodes are colored in pink.

Discussion

This study investigated how multimedia design principles influence the neural mechanisms underlying educational multimedia learning. In the experiment, two multimedia programs were constructed using the same auditory content but different visual designs. In one task, the visual elements adhered to Mayer’s principles of multimedia learning, whereas the other intentionally violated these guidelines. The participants were randomly divided into two groups, the P group and the NP group, who watched the principal or non-principal multimedia, respectively. EEG signals were recorded from participants while watching the video and the cortical functional networks were constructed from scalp signals via the phase slope index method.

In this study, whole-brain network analysis allowed us to compare functional connectivity patterns evoked by the P and NP educational multimedia programs. We employed several brain network measures aligned with network modularity to explore the relationships between different brain regions, focusing on global and local connectivity measurements as well as those related to the importance of individual regions (hubs). The results indicated that principal networks were superior in local information processing, whereas non-principal networks excelled in global information processing and hub formation across various frequency bands. Modularity analysis also demonstrated that non-principal networks exhibited higher integration and lower segregation compared to principal networks, corroborating earlier findings. Moreover, significant correlations were found between participants’ behavioral performance (measured by NASA-TLX scores and recall test accuracy) and functional network metrics across different brain regions.

Compensatory mechanisms in non-principal brain networks: evidence from integration and the role of hubs

Bassett and Sporns have emphasized the importance of functional segregation, which is associated with efficient local information processing, and functional integration, which is linked to enhanced global information processing28. These concepts are essential for understanding brain function29 within the framework of network science27 and graph theory30. Our brain network analysis indicates that, compared with P networks, NP networks exhibit increased integration, as revealed by higher global efficiency. Moreover, NP networks contain more hubs across the entire brain which facilitates extensive interregional communication and supports functional integration, as demonstrated by higher betweenness centrality and node degree. One possible interpretation of this enhanced information processing—less observed in local processing and more prominent in global processing—is that it reflects a compensatory mechanism for efficient learning when engaging with non-principal designs.

Interestingly, this compensatory mechanism is found across the delta, theta, and alpha frequency bands in the NP networks. In these bands, the performance of NP networks is significantly enhanced by global information processing, which facilitates efficient communication throughout the entire brain. Moreover, various types of hubs, including BC and TD hubs, support this compensatory mechanism. NP networks also contain more connectivity edges, as reflected in higher network sparsity (NS) values in these three frequency bands. In contrast, regarding local information processing, P networks outperform NP networks in the theta and beta bands when examining the ability of neighboring nodes to communicate efficiently after removing a specific node (LE). When considering how closely neighbors tend to cluster together (CC), a significant difference was observed only in the theta band, with higher values in NP networks. This difference may help explain why the neural mechanism underlying P networks leads to lower cognitive load and better performance when participants engage with multimedia designed according to Mayer’s principles.

Brain network modularity: enhanced integration and reduced segregation in the non-principal group

The modular architecture of human brain networks has been extensively reported via various neuroimaging techniques75 across different scales76. Numerous studies have demonstrated that the brain is organized into modules77 which play a crucial role in cognitive processes and are associated with various brain states and diseases78. This modular organization is linked to individual cognitive performance and has been shown to facilitate flexible learning and promote functional specialization79.

In this study, we focused on the functional role of brain network modularity while participants learned from principal and non-principal multimedia. Our findings indicate that these two conditions exhibit distinct modular organizations characterized by differing integration and segregation values. Specifically, compared to principal networks, non-principal networks demonstrated higher integration and lower segregation. These results can be interpreted as an enhancement of communication between modules, driven by the increased cognitive demands of learning from non-principal multimedia. The non-organized visual information presented in NP designs may not be readily associated with familiar patterns in the brain’s cognitive database, prompting participants to engage in broader and more effortful searches for recognizable signals.

The role of frequency bands and brain regions in the compensatory mechanism

Previous studies have demonstrated the pivotal role of theta band activity in cognitive load assessment20,21. Our research further supports these findings through network-based analyses. In the theta band, we observed significant negative correlations between graph metrics (local efficiency, betweenness centrality, and clustering coefficient) derived from three closely situated brain regions in the right hemisphere (C4, FC6 and T8) and NASA-TLX scores, a well-established tool for measuring subjective mental workload10. These negative correlations indicate that higher values of the network metrics are associated with reduced cognitive load. This aligns with previous findings that increased theta power during cognitive tasks is associated with better performance20.

Similarly, research has shown that alpha-band oscillations are correlated with visual attention80, selective attention81 and working memory82. In our study, we observed a positive correlation between the LE value at electrode CP6 and the performance of the subjects (accuracy in the recall tests) in the principal group. This result indicates that higher local efficiency is associated with higher accuracy. This finding is consistent with research suggesting that alpha oscillations are involved in attentional processes81 and cognitive control82. Additionally, we found a negative correlation between the clustering coefficient values at P3 and P7 in the left hemisphere and the accuracy of subjects in the non-principal (NP) group. This suggests that individuals in the non-principal group may require higher levels of integration rather than segregation to achieve better performance. This finding aligns with previous research highlighting the significance of alpha oscillations in integrative brain function83.

Delta band activity is also linked to various cognitive processes, including attention and memory, both of which are critical for effective multimedia learning84. Delta oscillations facilitate temporal and representational integration during sentence processing, which is essential for comprehending and retaining multimedia content85. We found a significant increase in delta band activity within the NP group, as measured by functional integration (GE) and centrality metrics (BC and TD). These findings provide compelling evidence of compensatory mechanisms operating across different frequency bands.

Conclusions

The compensatory mechanism observed in NP brain networks provides valuable insights into how the brain adapts to suboptimal multimedia design. This adaptation aligns with the principles of neural plasticity and the brain’s capacity to manage cognitive challenges or limitations. Additionally, the varying effects across frequency bands highlight the intricate dynamics of brain networks during multimedia learning. Our findings indicate that NP networks exhibit compensatory mechanisms through enhanced global processing and hub formation in the delta, theta, and alpha bands, whereas principal networks demonstrate more efficient local processing, particularly in the theta and beta bands. Regarding network modularity, we also found that non-principal multimedia evoked greater integration and lower segregation in brain networks than did principal multimedia. These results underscore the importance of evaluating both global and local network properties in the study of brain function. Future research should further explore the temporal dynamics of these network changes and investigate how individual differences in cognitive abilities may influence these compensatory mechanisms. Furthermore, integrating graph-theoretical analyses with complementary neuroimaging modalities could lead to a more comprehensive understanding of the neural underpinnings of multimedia learning.