Abstract
Neural interfaces can restore or augment human sensorimotor capabilities by converting high-bandwidth biological signals into control signals for an external device via a decoder algorithm. Leveraging user and decoder adaptation to create co-adaptive interfaces presents opportunities to improve usability and personalize devices. However, we lack principled methods to model and optimize the complex two-learner dynamics that arise in co-adaptive interfaces. Here we present computational methods based on control theory and game theory to analyse and generate predictions for user–decoder co-adaptive outcomes in continuous interactions. We tested these computational methods using an experimental platform in which human participants (N = 14) learn to control a cursor using an adaptive myoelectric interface to track a target on a computer display. Our framework allowed us to characterize user and decoder changes within co-adaptive myoelectric interfaces. Our framework further allowed us to predict how changes in the decoder algorithm impacted co-adaptive interface performance and revealed how interface properties can shape user behaviour. Our findings demonstrate an experimentally validated computational framework that can be used to design user–decoder interactions in closed-loop, co-adaptive neural interfaces. This framework opens future opportunities to optimize co-adaptive neural interfaces to expand the performance and application domains for neural interfaces.
Main
Neural interfaces can restore or augment human capabilities1,2,3,4,5,6. In a neural interface, signals from the user are translated via a decoder algorithm to control a device. Decoder algorithms that adapt to users during interface operations can improve performance and enable personalization2,7,8,9,10,11,12,13,14,15,16,17. Users also adapt as they learn to control the interface because they receive real-time feedback3,18,19,20,21. Introducing adaptive algorithms into neural interfaces, therefore, creates a co-adaptive system. Prior work has demonstrated the benefits of leveraging co-adaptation to guide or assist user adaptation10,22,23 and maintain performance over time10,14,24.
However, co-adaptive systems are challenging to design because they involve dynamic, two-learner interactions: both user and decoder may adapt simultaneously and in response to one another. Experiments have exposed the importance of identifying appropriate decoder parameters, such as the rate of decoder adaptation, to ensure stable and predictable interface control7,25,26. Methods to design and analyse two-learner systems are an active area of research in neural engineering27,28 and dynamic game theory29, but co-adaptive neural interfaces are largely designed and implemented empirically. Theoretically grounded and experimentally validated techniques to model and quantify co-adaptive dynamics will unlock new ways to design robust interfaces that harness the full potential of user and decoder adaptation.
Existing frameworks for designing co-adaptive interfaces highlight the promise of model-based approaches, but they either do not capture the full range of dynamic interactions that can arise between users and decoders or do not make predictions about interaction outcomes. Past work on neural interfaces focused on the user adapting to a leading, optimal (fixed) decoder30 or the decoder adapting to follow the user31, and most experimental neural interfaces are constrained to this limited range of co-adaptive interactions2,10,14,22. We instead seek methods that will be able to accommodate a wide spectrum of user–decoder interactions, where the device may shift from ‘leading’ to ‘following’ the user over time32. Game theory has been used to analyse the joint responses of two learners in human–human motor interactions9,33 and human–machine interactions25,34,35. We built on these ideas, adapting tools from control theory and game theory to create a flexible, experimentally validated framework for the analysis and synthesis of co-adaptive outcomes of continuous interactions in neural interfaces.
We developed a myoelectric interface platform that allowed us to build experimentally validated frameworks for co-adaptive neural interfaces. We validated that users changed their behaviour alongside an adaptive decoder to control a cursor via their muscle activity, creating a co-adaptive system. We modelled this closed-loop system using control theory methods to quantify interactions between users and the device. This experimental platform and analysis toolkit allowed us to test analytical predictions about user–machine behaviours derived from a model treating the user and decoder as two agents in a game. Our experimental results revealed that user learning can be influenced through real-time interface adaptation and that these outcomes can be predicted using game-theoretic models. Together, our findings establish methods to analyse and synthesize co-adaptive systems that can enable next-generation approaches for designing and personalizing neural interfaces.
Results
We created an experimental platform to study co-adaptation using non-invasive myoelectric interfaces, which we subsequently used as a testbed to develop and validate computational frameworks for co-adaptive interactions. This platform created unfamiliar human–machine interactions for our human participants (N = 14, demographics in Table 1) that let us systematically manipulate decoder properties to investigate the effect of decoder adaptation on user adaptation. Here we use the term ‘adaptation’ to describe any change over time in the decoder or user, and do not aim to draw analogies to particular forms of motor learning36. Our myoelectric interface had similar properties to many common motor neural interfaces, including brain–computer interfaces: it used high-dimensional biosignals to control a lower-dimensional movement37,38 and was a closed-loop system that can facilitate user adaptation12. We first analysed the experimental platform to establish that both users and decoders adapt, creating a co-adaptive system (see the ‘An adaptive myoelectric interface created a co-adaptive experimental platform’ section). We then leveraged control theory and game theory to develop computational methods that can measure and predict co-adaptive outcomes (see the ‘Co-adaptation outcomes matched control theory predictions’ and ‘Game theory model generated predictions for co-adaptation outcomes’ sections). Finally, we used our experimental platform to validate model predictions by manipulating the decoder adaptation parameters and demonstrate that our framework could predict the outcomes of co-adaptive interactions (see the ‘Decoder effort penalty influenced user effort without changing performance’ section).
An adaptive myoelectric interface created a co-adaptive experimental platform
Participants completed a two-dimensional trajectory tracking task using a 64-channel adaptive myoelectric interface (Fig. 1a and Extended Data Fig. 1a). The task was chosen for its advantages in analysing user sensorimotor transforms39,40. Participants controlled a cursor using surface electromyography (EMG) signals measured from their dominant forearm via a velocity-controlled Wiener filter decoder (Methods). The decoder was randomly initialized at the beginning of each trial and then updated online every 20 s using previously developed adaptive decoding algorithms (SmoothBatch41). Decoder adaptation conditions varied from trial to trial, and participants completed a total of 16 five-minute trials (arranged in two blocks of eight trials; Fig. 1b and Methods).
a, Experiment schematic. Users tracked a target by controlling the two-dimensional cursor velocity with muscle activity via an adaptive decoder. b, Session schematic. Participants completed two blocks of 8 five-minute trials with different, randomized decoder conditions; decoders were initialized randomly and adapted every 20 s. c, Example cursor (solid blue) and target (dashed black) trajectories. The orange lines denote decoder updates. d, Mean error (N = 14, centre shows median; box shows 25th–75th percentiles; whiskers extend to 1.5× the interquartile range) early (first 30 s) and late (last 30 s) in trials. Error (∥τ − y∥) was averaged across trials for each participant. One-sided Wilcoxon signed-rank test, **P = 0.0001. e, Mean error (N = 14) separated by decoder initialization (D1, D2) early and late in trials. Error (∥τ − y∥) was averaged across trials for each participant. Two-sided Wilcoxon signed-rank test, (early) ns = 0.67, (late) ns = 0.43. f, EMG tuning curve for one EMG channel early (dashed light purple) and late (solid purple) within a single trial. g, Average EMG tuning curve for one channel early (dashed light purple) and late (solid purple) across all trials (N = 16, median; shading shows the 25th–75th percentile range). The preferred directions for each curve are shown with vertical lines. h, Norm difference of the average early versus late EMG tuning curves (∥ΔEMG∥ = ∥EMG(late) − EMG(early)∥) (purple) compared with norm difference of the average consecutive 30-s intervals (∥ΔEMG∥ = ∥EMG(t) − EMG(t + 1)∥) (grey) for all channels, computed for each subject (N = 63, one-sided Wilcoxon signed-rank test with a Bonferroni correction of n = 14; *P < 0.001 for all subjects). 
i, Magnitude of angular change in EMG-preferred direction early versus late (∥ΔPD∥ = ∥PD(late) − PD(early)∥) (purple) compared with angular change in consecutive 30-s intervals (∥ΔPD∥ = ∥PD(t) − PD(t − 1)∥) (grey) for all channels, computed for each subject (N = 63, one-sided Wilcoxon signed-rank test with a Bonferroni correction; *P < 0.001 for all subjects). j, Average change in EMG tuning curves within a trial (∥ΔEMG∥ = ∥EMG(late) − EMG(early)∥) for all subjects as a function of trial number (N = 14, median; shading shows the 25th–75th percentile range). Average change in EMG of the first trial is compared with the last trial (one-sided Wilcoxon signed-rank test, *P = 0.0083). k, Average error as a function of trial time in block 1 (dashed grey) and block 2 (solid black) (N = 14, median; shading shows the 25th–75th percentile range). The black boxes represent comparisons of mid-trial and end-trial task error between block 1 and block 2 (one-sided Wilcoxon signed-rank test, **P = 0.00043, *P = 0.015). All box plots (d, e, h and i) show the median (centre line), 25th–75th percentiles (box; interquartile range) and 1.5× the interquartile range (whiskers). Panel c adapted with permission from ref. 26, Elsevier.
We first analyse the general properties of system behaviour independent of decoder adaptation conditions before examining the influence of decoder conditions (learning rate and cost functions; see the ‘Game theory model generated predictions for co-adaptation outcomes’ and ‘Decoder effort penalty influenced user effort without changing performance’ sections, respectively). The performance improved as the decoder adapted during a trial (Wilcoxon signed-rank test, P < 0.001; Fig. 1c,d), and these performance improvements were not impacted by decoder initialization (Wilcoxon signed-rank test, P > 0.05; Fig. 1e), similar to invasive brain–computer interface adaptive decoder studies27,41,42. These results confirm the expected behaviour of adaptive algorithms.
Although past work suggests that users probably adapt alongside the algorithm, we next aimed to confirm this behaviour in our testbed. System-level performance metrics alone cannot separate the effect of user adaptation from that of decoder adaptation because performance changes could be attributed to either learner. The decoder adapted during each trial by design, leading to shifts in the decoder matrix (D) according to the prescribed cost functions (for example, Fig. 5c, also reported in preliminary studies26). To validate that our myoelectric interfaces created a co-adaptive system, we examined whether users adapted during each trial. We asked whether the user’s behaviour changed as they controlled the interface by estimating the relationship between the user surface EMG activity and the direction of intended movement, that is, an EMG tuning curve43,44,45 (Methods). Variation in electrode placement and user behaviour resulted in a wide range of EMG tuning curves that varied across participants and across electrodes (Extended Data Fig. 2). EMG tuning curves from a sample participant revealed within-trial changes in user behaviour (Fig. 1f,g). We saw a range of changes in the EMG tuning curves in other participants (Extended Data Fig. 3). We quantified EMG changes for each participant by calculating the magnitude of the difference between EMG tuning curves (Fig. 1h) and angular differences in the preferred direction (Fig. 1i; Methods). Because changes in EMG tuning were diverse across electrodes and participants, we assayed whether changes across a trial reflected a directed process. We compared the magnitude of change between the first 30 s (early) and the last 30 s (late) with the magnitude of change between consecutive 30-s intervals. 
All participants showed significantly larger changes in the magnitude and tuning angles of EMG activity over the course of the trial than moment-to-moment fluctuations, suggesting that users adapted their behaviour in a directed fashion during the five-minute trial (Wilcoxon signed-rank test with a Bonferroni correction of N = 14, P < 0.001; Fig. 1h,i).
Interestingly, we observed that within-trial differences in the EMG tuning curves decreased as the trials progressed (Fig. 1j). We hypothesized that these longer timescale changes reflected user learning over the course of the experiment. We tested this hypothesis by comparing performance in the first and second blocks, which contained trials with matched decoder adaptation conditions. Because decoder adaptation parameters were matched across blocks, differences in task performance (determined by the joint user–decoder system) must stem from differences in user behaviour between blocks. We indeed saw that participants’ average mid-trial and final task error was lower in block 2 than in block 1 (Wilcoxon signed-rank test, P < 0.05), suggesting that task performance improved more quickly in block 2 than in block 1 (Fig. 1k). Taken together, the user EMG and system performance suggest that users adapt alongside the decoder in our myoelectric interface, both within and across trials, establishing a co-adaptive system.
Co-adaptation outcomes matched control theory predictions
Distinguishing user and decoder contributions to an interface is key to understanding and designing closed-loop, co-adaptive systems. Although decoders are known exactly, the user model must be estimated. The EMG direction tuning analysis we used above can characterize the user activity but does not provide a model for the user and, therefore, limits our ability to examine user–decoder interactions. We used methods from control theory, a discipline focused on the analysis and synthesis of feedback systems, to model the behaviour of users as well as the closed-loop system. Prior work shows that control theory can estimate changes in user trajectory tracking behaviour by separating their feedback and feed-forward controllers39,46,47,48. We extended these techniques to our multi-input, multi-output system to analyse co-adaptive outcomes. These approaches involve modelling users as performing position-based control with perfect and immediate sensory feedback. These models, therefore, notably simplify computations known to contribute to motor control (such as sensory delays). Yet, these and related simplified models have proven useful for capturing aspects of motor behaviour in humans and other model organisms40,49,50.
We define our model of the user’s encoder as the closed-loop mapping between task information and EMG activity in our myoelectric interface (Fig. 2a; Extended Data Table 1 lists a summary of variables). Our encoder model, \(E\in {{\mathbb{R}}}^{64\times 8}\), considers feed-forward and feedback inputs to the user, which are derived from closed-loop task information presented to the user, namely, target position (\(\tau \in {{\mathbb{R}}}^{2}\)), target velocity (\({\dot{\tau}} \in {{{{\mathbb{R}}}}}^{2}\)), position error (\(\tau -y\in {{\mathbb{R}}}^{2}\)) and velocity error (\({\dot{\tau}}-{\dot{y}}\in {{{{\mathbb{R}}}}}^{2}\)). We formulate an encoder model with these feed-forward and feedback elements that depend linearly on the target position, cursor error and offset β (to capture the resting activity):
where \({1}_{t}^{\top }=\left[1\,1\,\cdots \,1\right]\in {{\mathbb{R}}}^{1\times t}\), and t indicates time.
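For reference, one way to write the linear encoder model described here, consistent with the stated dimensions, is sketched below. The stacking order of the task signals and the treatment of β as a per-channel offset vector in ℝ^64 are a reading of the surrounding definitions, not a quotation of the published equation.

```latex
u \;=\; E \begin{bmatrix} \tau \\ \dot{\tau} \\ \tau - y \\ \dot{\tau} - \dot{y} \end{bmatrix} + \beta\, 1_t^{\top},
\qquad
E \in \mathbb{R}^{64\times 8},\quad
\beta \in \mathbb{R}^{64},\quad
u \in \mathbb{R}^{64\times t}.
```

Each task signal is treated as a 2 × t time series, so the stacked matrix is 8 × t and the product with E has one row per EMG channel.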
a, Block diagram model of the closed-loop system. The user’s encoder E outputs myoelectric activity u to the decoder D to follow the target’s position and velocity. The decoder outputs velocity, which is then integrated by system dynamics M to generate a cursor position. The user sees their cursor moving with some velocity \({\dot{y}}\) to position y. Error in the cursor position and velocity (relative to the target) acts as closed-loop feedback to the user. b, Example reconstructions of cursor position (top) and velocity (bottom) from encoder estimations (dashed black), compared with the actual position and velocity (dashed blue) with a time-shuffled baseline (solid grey). Distributions of average R² values between the actual and reconstructed position (top right) and velocity (bottom right) for actual data versus time-shuffled controls (N = 14, median; shading shows the 25th–75th percentile; two-sided Wilcoxon signed-rank test, (position) **P = 0.0001, (velocity) **P = 0.0001). c, Product of the decoder matrix with feed-forward contributions of the encoder matrix (N = 14, median; shading shows the 25th–75th percentile). The black dashed lines represent values that yield perfect trajectory tracking. d, Product of the decoder matrix with feedback contributions of the encoder matrix (N = 14, median; shading shows the 25th–75th percentile). The black dashed lines represent values that yield closed-loop stability. e, Average change in angle between the final and initial encoders (\(\angle ({\mathcal{R}}({E}_{f}),{\mathcal{R}}({E}_{{\rm{i}}}))\)) (dark purple) and consecutive encoders (\(\angle ({\mathcal{R}}({E}_{t+1}),{\mathcal{R}}({E}_{t}))\)) (light purple). 
Average change in angle between the final and initial decoders (\(\angle ({\mathcal{N}}{({D}_{f})}^{\perp },{\mathcal{N}}{({D}_{{\rm{i}}})}^{\perp })\)) (dark orange) and consecutive decoders (\(\angle ({\mathcal{N}}{({D}_{t+1})}^{\perp },{\mathcal{N}}{({D}_{t})}^{\perp })\)) (light orange) (N = 14, centre shows the median; box shows the 25th–75th percentiles; whiskers extend to 1.5× the interquartile range; two-sided Wilcoxon signed-rank test; (left) **P = 0.00012, (right) **P = 0.00012) f, Average change in angle between final and initial encoders (\(\angle ({\mathcal{R}}({E}_{f}),{\mathcal{R}}({E}_{{\rm{i}}}))\)) (dark purple) and final and initial decoders (\(\angle ({\mathcal{N}}{({D}_{f})}^{\perp },{\mathcal{N}}{({D}_{{\rm{i}}})}^{\perp })\)) (dark orange) as a function of trial number (N = 14, median; shading shows the 25th–75th percentile). g, Average change in angle between consecutive encoders (\(\angle ({\mathcal{R}}({E}_{t+1}),{\mathcal{R}}({E}_{t}))\)) (light purple) and consecutive decoders (\(\angle ({\mathcal{N}}{({D}_{t+1})}^{\perp },{\mathcal{N}}{({D}_{t})}^{\perp })\)) (light orange) over the course of a trial (N = 14, median; shading shows the 25th–75th percentile).
The encoder matrix has both feed-forward (F) and feedback (B) components:
such that each component of task information, τ, \({\dot{\tau}}\), τ − y and \({\dot{\tau}}-{\dot{y}}\) corresponds to an element of the encoder, namely, \({F}_{0}\in {{\mathbb{R}}}^{64\times 2}\), \({F}_{1}\in {{\mathbb{R}}}^{64\times 2}\), \({B}_{0}\in {{\mathbb{R}}}^{64\times 2}\) and \({B}_{1}\in {{\mathbb{R}}}^{64\times 2}\), respectively. The subscripts 0 and 1 represent the zeroth- and first-order dynamics of position and velocity, respectively. The output of the encoder is an EMG activity time series \(u\in {{\mathbb{R}}}^{64\times t}\) that is input to the decoder, \(D\in {{\mathbb{R}}}^{2\times 64}\). The decoder outputs a cursor velocity, which is integrated by the system dynamics M to a cursor position y. We estimated the user’s encoders \(E\in {{\mathbb{R}}}^{64\times 8}\) within a trial with linear regression using 20-s batches of data, corresponding to the time intervals in which the decoder was held constant (Methods). This model appeared to capture useful aspects of user behaviour, yielding reconstructed cursor position and velocity trajectories that were correlated with the actual cursor movements (Fig. 2b).
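The batch-wise regression described above can be illustrated with a minimal sketch on synthetic data. It assumes ordinary least squares with a jointly fitted offset; the function name `estimate_encoder`, the sample count and the noise level are hypothetical, and the actual estimation procedure is the one given in Methods.

```python
import numpy as np

def estimate_encoder(task, emg):
    """Least-squares fit of emg ≈ E @ task + beta for one batch.

    task: (8, t) stacked [tau; tau_dot; tau - y; tau_dot - y_dot]
    emg:  (64, t) EMG activity recorded while the decoder was held constant
    Returns E with shape (64, 8) and beta with shape (64,).
    """
    t = task.shape[1]
    # Append a row of ones so the offset beta is fitted together with E.
    X = np.vstack([task, np.ones((1, t))])            # (9, t)
    # Solve X.T @ W ≈ emg.T; the rows of W are the columns of [E | beta].
    W, *_ = np.linalg.lstsq(X.T, emg.T, rcond=None)   # (9, 64)
    return W[:8].T, W[8]

# Synthetic batch (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
t = 2000
E_true = rng.normal(size=(64, 8))
beta_true = rng.normal(size=64)
task = rng.normal(size=(8, t))
emg = E_true @ task + beta_true[:, None] + 0.01 * rng.normal(size=(64, t))

E_hat, beta_hat = estimate_encoder(task, emg)
# Split the estimate into feed-forward and feedback blocks, each 64 x 2.
F0, F1, B0, B1 = (E_hat[:, i:i + 2] for i in range(0, 8, 2))
```

Fitting one encoder per 20-s batch keeps the decoder fixed within each regression, so changes between successive estimates can be attributed to the user.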
We then used our encoder model to quantify encoder–decoder interactions and analyse the performance of the closed-loop system. One common challenge in co-adaptive systems is convergence to a stable solution or performance25,27,28. Our performance-based analysis above suggested that our co-adaptive interface may have converged (Fig. 1c,d), but this proves neither stability nor ideal performance. We exploited the separation of the feed-forward and feedback pathways in our encoder model to empirically measure the stability and tracking error properties of the closed-loop encoder–decoder system. Perfect trajectory tracking is obtained when the user’s feed-forward input F becomes the pseudo-inverse of the decoder D and system dynamics M, which results in the following conditions: D ⋅ F0 = 0 and D ⋅ F1 = I, where 0 is the zero matrix and I is the identity matrix (Methods and equation (18)). We used known decoder values and estimated user encoders to quantify these products in our experimental data, which revealed that D ⋅ F1 approaches the analytically ideal value on average across all trials (Fig. 2c). Interestingly, we found that D ⋅ F0 deviated slightly from the expected value for perfect tracking (Fig. 2c). These deviations from the analytical ideal correlated with the user tracking error (Extended Data Fig. 4), suggesting that our encoder estimates capture meaningful properties of user behaviour. Similarly, closed-loop stability can be achieved if the feedback components of the encoder–decoder product (D ⋅ B0 and D ⋅ B1) are approximately negative-definite diagonal matrices. In this case, the eigenvalues of the closed-loop system’s dynamics are approximately located at the diagonal entries (that is, in the left-half complex plane), ensuring stability (chapter 5 of ref. 51). Our experimental observations matched the predictions for a stable system (Fig. 2d).
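As a hedged illustration, the conditions stated above can be checked numerically for a given decoder and encoder estimate. The sketch below only forms the four products and tests them against the criteria as worded in the text; the helper names, the tolerance and the synthetic matrices are assumptions, and it does not simulate the closed-loop dynamics.

```python
import numpy as np

def closed_loop_products(D, E):
    """Return (D@F0, D@F1, D@B0, D@B1) for E = [F0 F1 B0 B1] of shape 64 x 8."""
    return tuple(D @ E[:, i:i + 2] for i in range(0, 8, 2))

def check_conditions(D, E, tol=1e-6):
    DF0, DF1, DB0, DB1 = closed_loop_products(D, E)
    # Tracking conditions from the text: D.F0 = 0 and D.F1 = I.
    tracking = bool(np.allclose(DF0, 0, atol=tol)
                    and np.allclose(DF1, np.eye(2), atol=tol))
    # Stability criterion from the text: feedback products with
    # eigenvalues in the left-half complex plane.
    stable = all(bool(np.all(np.linalg.eigvals(P).real < 0)) for P in (DB0, DB1))
    return tracking, stable

# Synthetic example: build an encoder that satisfies both sets of conditions.
rng = np.random.default_rng(1)
D = rng.normal(size=(2, 64))
D_pinv = np.linalg.pinv(D)                        # (64, 2), D @ D_pinv = I
E = np.hstack([np.zeros((64, 2)),                 # F0, so D @ F0 = 0
               D_pinv,                            # F1, so D @ F1 = I
               -D_pinv @ np.diag([1.0, 2.0]),     # B0, so D @ B0 = -diag(1, 2)
               -D_pinv @ np.diag([0.5, 0.5])])    # B1, so D @ B1 = -0.5 I
tracking, stable = check_conditions(D, E)
```

With experimental estimates the products will only approximate these targets, so a looser tolerance or a distance measure would be more informative than a pass/fail check.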
We also used our encoder estimation model to revisit user-learning-related changes in experiments. We found that user encoder subspaces showed larger changes across the five-minute trial than between consecutive time intervals (Wilcoxon signed-rank test, P < 0.001; Fig. 2e (left) and Methods), consistent with directed changes observed in the EMG tuning curves (Fig. 1h,i). As expected, decoder subspaces also showed directed changes within each trial (Wilcoxon signed-rank test, P < 0.001; Fig. 2e (right) and Methods). We further observed that the within-trial change in user encoder subspaces decreased across sequential trials (Fig. 2f), mirroring our EMG and performance-based analyses, showing that users also learn over the course of the experiment (Fig. 1j,k). Decoder changes within each trial, by contrast, did not notably vary across trials in the experiment since the decoders were reinitialized randomly at the beginning of each trial (Fig. 2f). Our models, therefore, revealed evidence of co-adaptation between the encoder and decoder, which was consistent with our EMG-based analyses.
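One common way to compare subspaces such as the encoder range space and the decoder row space is through principal angles. The sketch below is a generic implementation of that idea; how the angles are aggregated into the single value reported in Fig. 2e–g is not specified here, so the function name and the examples are assumptions rather than the Methods procedure.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)   # orthonormal basis for range(A)
    Qb, _ = np.linalg.qr(B)   # orthonormal basis for range(B)
    # Singular values of Qa.T @ Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

rng = np.random.default_rng(2)

# Encoders: compare range spaces of two 64 x 8 matrices.
E_i = rng.normal(size=(64, 8))
E_same = E_i @ rng.normal(size=(8, 8))   # same range space, different basis
angles_same = principal_angles(E_i, E_same)

# Decoders: the orthogonal complement of the null space of D is its row
# space, that is, the column space of D.T.
D_i = np.eye(64)[0:2]                    # rows e1, e2
D_f = np.eye(64)[2:4]                    # rows e3, e4 (orthogonal row space)
angles_orth = principal_angles(D_i.T, D_f.T)
```

Identical subspaces give angles near zero regardless of basis, and mutually orthogonal subspaces give angles of π/2, which makes the measure insensitive to rescaling of E or D.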
Finally, encoder and decoder changes within each trial, paired with system-level analyses, revealed nuances of how the co-adaptive system converged (Fig. 2g). The amount of change between consecutive decoders decreased sharply over the course of a trial, consistent with the algorithm converging towards a solution from its random initialization. User encoders, by contrast, changed by a similar amount throughout the trial. This is consistent with user learning across trials, in which users do not ‘reinitialize’ their encoder on each trial. Importantly, neither encoders nor decoders converge towards zero change within the five-minute trial, even though encoder–decoder pairs (Fig. 2c,d) and task performance (Fig. 1k) appear to converge. Considered together, our analyses suggest that the joint user–decoder system converges, yet neither agent becomes completely stationary. By modelling the user, our control-theoretic encoder model allowed us to more precisely resolve user–decoder interactions within the closed-loop, co-adaptive interface.
Game theory model generated predictions for co-adaptation outcomes
Having established an experimental platform for co-adaptive interfaces and validated tools to analyse user–decoder adaptation, we next aimed to build tools to predict how users and decoders will interact. We leveraged game theory to model the co-adaptive interactions that arise between these two agents optimizing their individual objectives52,53 (Fig. 3a). The two agents in our experiment (the user and decoder) co-adapt to achieve a shared goal: reduce task error. However, the user probably trades off error with effort, preferring strategies that achieve reasonable performance while reducing muscle activity54. The decoder’s adaptation algorithm was analogously designed to trade off error with the magnitude of transformation (or gain) it defines between the EMG inputs and cursor velocity outputs. We, therefore, defined cost functions for the user’s encoder E and the decoder D as a linear combination of task error e and agent effort f:
where λE and λD are effort penalties. For tractability, we limit our analysis in this section to the one-dimensional case, where E and D are scalars, as in ref. 55. In our simplified case, the error in our task is the squared difference between the feed-forward decoder output and target velocities: \(e(E,D)={({\dot\tau}-DE {\dot\tau})}^{2}\). We similarly defined effort as \({f}_{E}(E)={E}^{2}\) and \({f}_{D}(D)={D}^{2}\).
a, Schematic of co-adaptive interfaces in which the user and decoder are modelled as adapting to minimize their own individual cost functions (plots in the inset show data for each agent). b, Visualization of the potential function that describes dynamics within the user–decoder game model (simplified 1D user and decoder; Methods). Two axes represent the scalar values of the user and decoder actions, and the vertical axis represents the value of the potential function. A two-dimensional projection of the potential function is also shown. c, Gradient field of the user and decoder cost functions (with equal penalty terms, λD = λE). The purple (user) and orange (decoder) curves are nullclines (where the agent’s gradient equals 0), which intersect at stationary points (black stars). d, Heat map showing the error decay rate as a function of learning rates in equation (5) with penalty parameters λD, λE = 1/4. e, Error decay rate as a function of decoder learning rate (αD) for two values of encoder learning rate (αE = 3/2 (blue) and αE = 1/2 (red)) corresponding to the horizontal slices in d. f, Stationary values of scalar decoder (orange) and encoder (purple) as a function of decoder penalty parameter (λD) relative to the value of encoder penalty (λE).
The structure of these cost functions defines a potential game56 (Fig. 3b), which can be analysed in terms of a single potential function:
Using this potential function, we can characterize the existence of and convergence to stationary points (E*, D*) defined by the conditions ∂Eϕ(E*, D*) = 0 and ∂Dϕ(E*, D*) = 0. These game-theoretic stationary points represent joint strategies in the co-adaptive system that persist for a prolonged period of time. Such stationary points in a neural interface could correspond to stable neural representations (neural ‘encoding’) and stable decoder parameters, which have been observed in co-adaptive brain–computer interfaces2,10,14,27,41.
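Written out for the scalar case, the costs and a potential function consistent with them take the form below. This is a sketch assembled from the definitions given in the text (error e, efforts fE and fD, penalties λE and λD); the symbols cE and cD for the two costs are labels introduced here, and the expressions are not a quotation of the published equations. The property that makes ϕ a potential is that each agent's cost gradient with respect to its own variable coincides with the corresponding partial derivative of ϕ.

```latex
\begin{aligned}
c_E(E,D) &= e(E,D) + \lambda_E f_E(E) = (\dot\tau - DE\,\dot\tau)^2 + \lambda_E E^2,\\
c_D(E,D) &= e(E,D) + \lambda_D f_D(D) = (\dot\tau - DE\,\dot\tau)^2 + \lambda_D D^2,\\
\phi(E,D) &= e(E,D) + \lambda_E f_E(E) + \lambda_D f_D(D),\\
\partial_E \phi &= \partial_E c_E = -2D\,\dot\tau^2\,(1 - DE) + 2\lambda_E E,\\
\partial_D \phi &= \partial_D c_D = -2E\,\dot\tau^2\,(1 - DE) + 2\lambda_D D.
\end{aligned}
```

Setting both partial derivatives to zero gives the stationarity conditions for (E*, D*).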
We model encoder and decoder adaptation in discrete time using gradient descent for E and smoothed best response for D:
where αE > 0, αD ∈ (0, 1) are learning rate parameters. The update for D is consistent with what is implemented in the experiments (Methods) and was based on the methods in refs. 10,41. We propose a gradient-based update for E as a model for user learning that incrementally improves performance, which is consistent with our findings for how encoders change (Fig. 2e–g).
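The update rules described above can be simulated for the scalar game with a few lines of code. The sketch below rests on several assumptions that are not taken from the source: the target velocity is normalized to 1, the costs carry no additional scaling factor, αD weights the decoder's best response (the opposite weighting convention is equally plausible), and both agents update simultaneously. The function name `simulate` and the parameter values are illustrative.

```python
def simulate(E0, D0, alpha_E, alpha_D, lam_E, lam_D, steps=200):
    """Iterate the scalar user-decoder game with tau_dot normalized to 1."""
    E, D = E0, D0
    history = [(E, D)]
    for _ in range(steps):
        # User: gradient of c_E = (1 - D*E)**2 + lam_E * E**2 with respect to E.
        grad_E = -2.0 * D * (1.0 - D * E) + 2.0 * lam_E * E
        # Decoder: minimizer of c_D = (1 - D*E)**2 + lam_D * D**2 for fixed E.
        D_best = E / (E * E + lam_D)
        # Simultaneous update: gradient step for E, smoothed best response for D.
        E, D = E - alpha_E * grad_E, (1.0 - alpha_D) * D + alpha_D * D_best
        history.append((E, D))
    return history

history = simulate(E0=0.5, D0=0.5, alpha_E=0.5, alpha_D=0.5,
                   lam_E=0.25, lam_D=0.25)
E_final, D_final = history[-1]
```

With these symmetric penalties the iteration settles at E = D = √0.75 ≈ 0.866, the positive stationary point of the assumed potential. Because any overall scaling of the costs rescales the effective user learning rate, the numerical values of αE used here are not directly comparable to those in Fig. 3d,e.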
Analysing game dynamics (Methods provides the calculation details) shows that the existence of stationary points is unaffected by learning rates αE, αD, but these parameters determine the system’s overall rate of convergence—or, indeed, whether convergence occurs at all (Figs. 3d and 4a). When λE and λD are less than unity, there are three stationary points (E*, D*): one at the origin and a mirror pair (Methods provides the analytical expressions). The stationary point at the origin is a saddle, and the other two are local minima of the agents’ shared potential function (Methods). The existence of multiple stationary points in the co-adaptive model suggests the same for the experimental co-adaptive system, and the initialization of either agent could influence the stationary point at which the system ultimately converges. This prediction is tested in Extended Data Fig. 5.
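The three stationary points and their types can be reproduced in a short sketch. The closed-form expressions below were derived for this illustration from the first-order conditions of the assumed potential ϕ = (1 − DE)² + λE E² + λD D² with the target velocity normalized to 1; they are not copied from Methods, and the function names are hypothetical. Under this normalization the nonzero pair exists when the product λE λD is below 1, which holds whenever both penalties are less than unity.

```python
import math

def stationary_points(lam_E, lam_D):
    """Stationary points of phi = (1 - D*E)**2 + lam_E*E**2 + lam_D*D**2.

    Assumes lam_E > 0 and lam_D > 0.
    """
    s = math.sqrt(lam_E * lam_D)
    if s >= 1.0:
        return [(0.0, 0.0)]
    # From the first-order conditions: D*E = 1 - s and lam_E*E**2 = lam_D*D**2.
    E = math.sqrt((1.0 - s) * math.sqrt(lam_D / lam_E))
    D = math.sqrt((1.0 - s) * math.sqrt(lam_E / lam_D))
    return [(0.0, 0.0), (E, D), (-E, -D)]

def gradient(E, D, lam_E, lam_D):
    """Partial derivatives of phi with respect to E and D."""
    return (-2.0 * D * (1.0 - D * E) + 2.0 * lam_E * E,
            -2.0 * E * (1.0 - D * E) + 2.0 * lam_D * D)

def classify(E, D, lam_E, lam_D):
    """Classify a stationary point from the 2 x 2 Hessian of phi."""
    h_EE = 2.0 * D * D + 2.0 * lam_E
    h_DD = 2.0 * E * E + 2.0 * lam_D
    h_ED = 4.0 * D * E - 2.0
    det = h_EE * h_DD - h_ED ** 2
    if det < 0:
        return "saddle"
    if det == 0:
        return "degenerate"
    return "minimum" if h_EE > 0 else "maximum"

points = stationary_points(lam_E=0.25, lam_D=0.04)
kinds = [classify(E, D, 0.25, 0.04) for E, D in points]
```

For these example penalties the origin is classified as a saddle and the mirror pair as minima, matching the qualitative picture described in the text.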
a, Model predictions. All panels show the gradient field of the user and decoder cost functions. The purple (user) and orange (decoder) curves show nullclines (where the agent’s cost gradient equals 0) that intersect at stationary points (black stars). The slow (middle) and fast (right) panels zoom in on one stationary point and illustrate changes in the decoder (shades of orange) and encoder (purple) for different decoder learning rates. b, Average error as a function of time within the five-minute trial for fast (dashed dark orange) and slow (solid light orange) decoder learning rates (N = 14, median; shading shows the 25th–75th percentile). c, Percentage change in error from the start of the trial (first 30 s) to the end of the trial (last 30 s), separated by fast (dashed black) and slow (solid grey) learning rates (N = 14, centre shows the median; box shows the 25th–75th percentiles; whiskers extend to 1.5× the interquartile range; two-sided Wilcoxon signed-rank test, **P = 6.1 × 10⁻⁵). d, Average change in the user’s encoder within trials (∥Ef − Ei∥) for the slow and fast decoder learning rate conditions (N = 14, centre shows the median; box shows the 25th–75th percentiles; whiskers extend to 1.5× the interquartile range; two-sided Wilcoxon signed-rank test, **P = 1.8 × 10⁻⁴). e, Product of the decoder matrix with first-order feed-forward (F1) contributions of the encoder matrix (N = 14, median; shading shows the 25th–75th percentile) for fast (dashed dark pink) and slow (solid light pink) decoder learning rates. The black dashed lines represent values that yield perfect trajectory tracking. f, Product of the decoder matrix with first-order feedback (B1) contributions of the encoder matrix (N = 14, median; shading shows the 25th–75th percentile) for fast (dashed dark blue) and slow (solid light blue) decoder learning rates. The black dashed lines represent values that yield closed-loop stability.
Setting learning rates of adaptive decoders has attracted interest in part because of its notable impact on the overall neural interface performance7,25,27,28,41. Our model allows us to make principled predictions that could be used to design and optimize adaptive algorithms. We assessed the convergence of equation (5) to the stationary points (E*, D*) with E*, D* ≠ 0 by linearizing the dynamics about that point and then computing the eigenvalues of this linearization (chapter 5 of ref. 57); the linearization was the same for both stationary points. We refer to the larger eigenvalue as the error decay rate, as it is the smallest number ρ > 0 such that the error e(i) is bounded above by \(e(0){\rho }^{i}\), where i is the iteration number. Interestingly, this analysis predicts a Goldilocks effect of decoder learning rate on system convergence, consistent with past models25, for one regime of user learning; however, it predicts a monotonic relationship between system convergence and decoder learning rate for other user learning rate regimes (Fig. 3d,e). This complex landscape illustrates that empirical optimization is challenging and demonstrates the value of our computational framework for theory-guided optimization. The prediction that learning rate affects error decay rate is tested in Fig. 4.
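A hedged sketch of this linearization for the scalar game is given below. It assumes the same simplified model as a normalized target velocity of 1, a gradient step for E, and a smoothed best response for D in which αD weights the best response; the Jacobian entries were derived for this illustration and the function name `decay_rate` is hypothetical. The decay rate is taken as the largest eigenvalue magnitude of the 2 × 2 Jacobian.

```python
import cmath
import math

def decay_rate(alpha_E, alpha_D, lam_E, lam_D):
    """Largest eigenvalue magnitude of the linearized update at a nonzero stationary point."""
    s = math.sqrt(lam_E * lam_D)
    E = math.sqrt((1.0 - s) * math.sqrt(lam_D / lam_E))
    D = math.sqrt((1.0 - s) * math.sqrt(lam_E / lam_D))
    # Jacobian of (E, D) -> (E_next, D_next) under the assumed update rules:
    #   E_next = E - alpha_E * dc_E/dE
    #   D_next = (1 - alpha_D) * D + alpha_D * E / (E**2 + lam_D)
    a = 1.0 - alpha_E * (2.0 * D * D + 2.0 * lam_E)          # dE_next/dE
    b = -alpha_E * (4.0 * D * E - 2.0)                       # dE_next/dD
    c = alpha_D * (lam_D - E * E) / (E * E + lam_D) ** 2     # dD_next/dE
    d = 1.0 - alpha_D                                        # dD_next/dD
    # Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial.
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))

# Sweep the decoder learning rate for one user learning rate.
alpha_D_grid = [k / 10 for k in range(1, 10)]
rates = [decay_rate(0.5, a_D, 0.25, 0.25) for a_D in alpha_D_grid]
```

A value below 1 indicates local convergence, and smaller values indicate faster decay. Every Jacobian entry depends only on E², D² and the product DE, so the mirror stationary point gives the same linearization. The scaling of the assumed costs affects the effective learning rates, so this sketch is not expected to reproduce the numerical landscape in Fig. 3d,e.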
We then explored the contribution of penalty terms λE, λD, which influence user–decoder phase portraits (Fig. 5a). Altering these penalties shifts the stationary points (E*, D*) and, thus, affects the user and decoder efforts, fE and fD. We find that an inverse relationship holds between E* and D* for a range of λE and λD: fixing λE and varying λD from 0 to 1/4λE, D* decreases monotonically and E* increases monotonically (Fig. 3f). This prediction is tested in Fig. 5.
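As a hedged illustration of this inverse relationship, the sketch below uses a scalar toy game. The toy game is an assumption made for illustration and is not taken from the model above: the user cost is (1 − DE)^2 + λE E^2 and the decoder cost is (1 − DE)^2 + λD D^2. Under that assumption the non-zero stationary points have a closed form. The sweep also assumes that the upper limit quoted above means 1/(4λE). All function and variable names are hypothetical.

```python
import numpy as np


def stationary_point(lam_e, lam_d):
    """Return the positive stationary point (E*, D*) of an assumed scalar game.

    Assumed costs (illustrative only, not the authors' model):
        c_E = (1 - D * E) ** 2 + lam_e * E ** 2
        c_D = (1 - D * E) ** 2 + lam_d * D ** 2
    Setting both gradients to zero gives lam_d * D**2 = lam_e * E**2 and
    D * E = 1 - sqrt(lam_e * lam_d).
    """
    if lam_e <= 0 or lam_d <= 0:
        raise ValueError("closed form needs positive penalties")
    gain = 1.0 - np.sqrt(lam_e * lam_d)  # loop gain D*E at stationarity
    if gain <= 0:
        raise ValueError("closed form needs lam_e * lam_d < 1")
    ratio = np.sqrt(lam_d / lam_e)       # E* / D*
    d_star = np.sqrt(gain / ratio)
    e_star = ratio * d_star
    return e_star, d_star


lam_e = 0.5
# Sweep the decoder penalty inside (0, 1 / (4 * lam_e)); end points excluded.
lam_d_values = np.linspace(0.01, 0.49, 25)
points = np.array([stationary_point(lam_e, lam_d) for lam_d in lam_d_values])
e_stars, d_stars = points[:, 0], points[:, 1]
```

In this toy setting E* rises and D* falls across the sweep. By symmetry of the assumed costs, the mirrored point (−E*, −D*) is also stationary.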
a, Model predictions. The panels show the gradient fields of the user and decoder cost functions (format as that in Fig. 4a) when the decoder penalty term is equal to (left) or less than (right). The arrows depict the expected stationary points of the user (purple) and decoder (orange). b, Percentage change in error from the trial start (first 30 s) to end (last 30 s), separated by high (dark grey) and low (light grey) decoder penalty term conditions (N = 14, median; shading shows the 25th–75th percentile; two-sided Wilcoxon signed-rank test, ns, P = 0.17). c, Left: average magnitude of the decoder matrix (norm) over time in the trial for low (dashed light orange) and high (solid dark orange) decoder penalty terms (N = 14, median; shading shows the 25th–75th percentile). Right: box plots are average decoder efforts across the trial for each subject (N = 14, centre shows the median; box shows the 25th–75th percentiles; whiskers extend to 1.5× the interquartile range; one-sided Wilcoxon signed-rank test, **P = 6.1 × 10^−5). d, Left: average magnitude of the user encoder matrix (norm) over time in the trial for low (dashed light purple) and high (solid dark purple) decoder penalty terms (N = 14, median; shading shows the 25th–75th percentile). Right: box plots are average effort for each subject across the trial (N = 14, centre shows the median; box shows the 25th–75th percentiles; whiskers extend to 1.5× the interquartile range; one-sided Wilcoxon signed-rank test, *P = 0.018). e, Left: average cursor speed as a function of time in the trial for low (dashed light grey) and high (solid dark grey) decoder penalty terms (N = 14, median; shading shows the 25th–75th percentile). Right: box plots are the average cursor speed across the trial for each subject (N = 14, centre shows the median; box shows the 25th–75th percentiles; whiskers extend to 1.5× the interquartile range; one-sided Wilcoxon signed-rank test, **P = 6.1 × 10^−5).
f, Average difference in cursor speed between the low and high decoder penalty conditions plotted against the average difference in encoder efforts between the low and high decoder penalty conditions for each individual participant (black dots). The shading illustrates qualitatively identified groupings. g, Schematic illustrating the relationship between encoder effort and cursor speed when the decoder effort changes. The dots represent the encoder effort and velocity for high (dark orange) and low (light orange) decoder penalty terms. The arrows depict the potential scenarios of how the decoder penalty can shift the encoder effort and velocity.
To summarize, our game-theoretic model generates the following predictions about user–decoder co-adaptation:
1. Overall system convergence and performance are influenced by both user and decoder learning rates.
2. Decoder penalty terms affect the location of stationary points in the overall system, which determines the user effort. This finding predicts that users and decoders, effectively, ‘trade-off’ effort to maintain system performance.
3. Because multiple stationary points exist, each learner’s initialization will affect the final stationary point reached by the system. This predicts that decoder initialization will influence a user’s final learned encoder.
Decoder learning rates can disrupt co-adaptation
Our game-theoretic model emphasizes the importance of both user and decoder learning rates for convergence to stationary points. One agent adapting faster than the other may limit or prevent convergence to stationarity (Fig. 4a). Our model predictions echo results from previous models25 and experimental findings from both invasive neural and non-invasive body–machine interfaces7,58. We examined the validity of our model’s predictions by testing the effect of decoder learning rates in our co-adaptive myoelectric interface system. Because the user’s learning rate was unknown and could not be estimated a priori, we varied the decoder learning rate, testing two extremes: slow and fast (Methods). Our model makes two alternate predictions for the outcome of this manipulation, depending on user learning rate: if users learn relatively rapidly, these two extremes should yield similar performance; if users learn more slowly, fast decoder learning rates should reduce performance (Fig. 3e). Decoders for both learning rate conditions were initialized randomly, matching all other conditions, and then adapted according to the specified cost function. Task performance was worse for co-adaptive interfaces in which the decoder adapted with the fast learning rate compared with the slow learning rate throughout all trials (Fig. 4b), and performance improved more within the trial in the slow condition compared with the fast condition (Wilcoxon signed-rank test, P < 0.001; Fig. 4c).
We found that the decoder learning rate also affected the user’s encoders. We used our encoder estimates (Fig. 2) to quantify how much users’ encoders changed over the course of the five-minute trial in each decoder condition (Methods). User encoders changed less within trials when the decoder adapted fast compared with slow (Wilcoxon signed-rank test, P < 0.001; Fig. 4d), suggesting that the decoder may have adapted too fast for the user’s encoder to match. We then analysed user–decoder interactions, which confirmed that rapid decoder adaptation disrupted co-adaptation. The encoder–decoder pairs in the slow condition approached the system ideal for perfect trajectory tracking and closed-loop stability, but the fast-condition encoder–decoder pairs deviated from the system’s ideal state in both first-order feed-forward (Fig. 4e) and feedback conditions (Fig. 4f), showing that a fast learning rate negatively impacted the co-adaptive interactions for both trajectory tracking and closed-loop stability. Our experimental results corroborated our game-theoretic prediction that decoder learning rates influence the system behaviour, though precise validation experiments will require improved methods to estimate the user learning rates (see the ‘Discussion’ section).
Decoder effort penalty influenced user effort without changing performance
Our game-theoretic framework includes the regularization terms, ∥D∥ and ∥E∥, which correspond to what we refer to as ‘effort’. Including regularization terms enforces convergence to a stable stationary point (see the ‘Game theory methods’ section). In the model, the scaling of the decoder regularization term, controlled by λD, influences the decoder effort. In turn, the user effort is also influenced by λD due to the linear relationship between the decoder and cursor velocity (equation (6)). As the decoder penalty changes in our model, the stationary point and encoder effort correspondingly shift (Fig. 5a). Our model, therefore, predicts that changing the decoder effort will not impact convergence to stationarity in the co-adaptive interface, but will influence user behaviour by shifting their effort. Past work suggests that users can adapt their encoder to trade off with a decoder in a co-adaptive interface in which the decoder only prioritized optimizing task performance10. Yet, many studies highlight limitations on the degree of user flexibility in a range of neural interfaces59,60. It is, therefore, unknown whether users will flexibly trade off with a decoder that aims to optimize multiple objectives, including its own individual effort.
We tested the impact of two penalty terms—low penalty and high penalty—on encoder–decoder co-adaptation in our myoelectric interface experiments. Decoders for both penalty conditions were initialized randomly, matching all other conditions, and then adapted according to the specified cost function (Methods). Performance was not impacted by the penalty terms (Wilcoxon signed-rank test, P > 0.05; Fig. 5b), suggesting that the co-adaptive systems converged similarly for both decoder conditions. Decoder effort was influenced by the penalty term as expected: higher decoder penalty resulted in decreased decoder effort (Wilcoxon signed-rank test, P < 0.001; Fig. 5c). As predicted by our model, the decoder effort affected the encoder effort to reflect the trade-off in effort between the two learners. Thus, higher decoder penalty terms resulted in increases in the encoder effort (quantified by the norm of the F1 component of the encoder; Wilcoxon signed-rank test, P < 0.05; Fig. 5d). These results held true regardless of the learning rate of the decoders (Extended Data Fig. 6).
Although our experimental results broadly matched model predictions, we noticed that the relative decrease in encoder magnitude with respect to the increase in decoder magnitude was smaller than predicted by the theory (Fig. 5c versus Fig. 5d). Importantly, decoder effort controls the magnitude of D, which defines the gain between the user EMG and cursor velocity. If users do not change their effort to perfectly mirror changes in the decoder effort, the cursor’s speed, an aspect of behaviour that is not captured in our model and is not directly linked to task performance, would change. We found that the decoder penalty terms influenced the cursor speed during tracking, with a lower decoder penalty term leading to faster cursor speeds, on average, across users (Wilcoxon signed-rank test, P < 0.001; Fig. 5e). Interestingly, these trade-offs differed across our participants (Fig. 5f), potentially capturing individual preferences. We noticed clear outlier participants who chose to decrease their effort and maintain a similar cursor speed, or alternately maintain a similar effort to go faster; the remainder of our participants changed both effort and speed to varying degrees. This suggests three broad groups of strategies adopted by users (Fig. 5g). These different strategies, however, did not correlate with the task performance, suggesting that users may identify compromise solutions to trade off with the decoder without sacrificing task performance. These findings highlight experimental deviations from our model predictions for strict encoder–decoder trade-offs. We also found only subtle evidence to support our model’s predictions related to the influence of decoder initialization (prediction 3 from the preceding section, ‘Game theory model generated predictions for co-adaptation outcomes’; Extended Data Fig. 5; see the ‘Discussion’ section).
Despite these deviations, our experimental results broadly confirmed our game-theoretic model predictions and highlighted the potential of decoder adaptation to influence the user’s encoder during closed-loop co-adaptation.
Discussion
We extended quantitative modelling techniques from control theory and game theory to reveal how decoding algorithms can shape user adaptation in neural interfaces. The critical first step was developing a myoelectric interface testbed that enabled systematic experiments, examining how users learn to control an unfamiliar interface alongside different adaptive algorithms. Our platform provides advantages over existing human-in-the-loop simulators and related approaches, which may not fully capture how people respond to adaptive algorithms in closed-loop settings61,62,63,64,65.
We then created control-theory-based estimations of user encoder transformations that allowed us to quantify critical closed-loop properties like trajectory tracking and stability (Fig. 2c,d) and illustrated how decoder design impacted these properties (Fig. 4e,f). Prior work has shown that encoder–decoder stability probably contributes to interface performance and user learning2,7,10, but efforts to quantify stability in adaptive neural interfaces have focused on the decoder alone27,28. By developing methods to analyse encoder–decoder pairs, we were able to characterize stability in the full co-adaptive system (Figs. 2 and 4e,f) and reveal how decoding algorithms can influence user adaptation. Although our approach used a simplified model of user closed-loop control policies, our results highlight the promise of these frameworks for designing and analysing neural interface systems. Expanding the accuracy of our models, by incorporating details like sensory processing delays, will probably lead to further advances.
We also developed a game-theoretic model to generate predictions about the behaviour of different co-adaptive systems. Designing co-adaptive interfaces requires confronting a vast space of algorithm parameters that can influence decoder and interface performance7,25,27,41,66. Our approach provides experimentally validated methods to quantitatively model and predict outcomes of co-adaptive dynamics. These tools will enable principled interface design and optimization by considering multiple objectives and user–decoder interactions. Current neural interfaces largely aim to maximize performance on a task without considering other aspects of interface performance, such as stability or robustness. Our results demonstrate an example of training algorithms to consider multiple objectives, balancing task error with decoder effort, which, in turn, influences user effort (Fig. 5). Future interfaces might be designed to consider additional objectives like stability, robustness or performance across multiple tasks. We foresee our game-theoretic optimization-based framework becoming especially useful as the functionalities of neural interface tasks grow.
Our game-theoretic model is particularly valuable for predicting user–decoder interactions that influence system performance, such as decoder learning rates (Fig. 3d–f). We found that a faster decoder learning rate yielded worse overall performance (Fig. 4c) and altered trajectory tracking and stability in encoder–decoder pairs (Fig. 4e,f) in our experimental myoelectric interface. Our results add to prior experimental observations from co-adaptive kinematic7,25 and brain–computer interfaces25,58. Interestingly, many high-performance adaptive algorithms in invasive neural interfaces adapt rapidly13,28,67. Other co-adaptive interface models and experiments suggest that intermediate learning rates are optimal25. We found that a fast decoder learning rate condition disrupted co-adaptive dynamics in our interface, and our data suggest a more monotonic relationship between decoder learning rates and system performance. Our model highlights that the learning rates of both user and decoder determine system dynamics (Fig. 3d–f) and our experimental data are the most consistent with a regime in which users learned slowly. Although we cannot yet verify user learning rates, it is plausible that our naïve participants may have adapted at different rates than the more experienced participants in invasive brain–computer interface studies and participants using well-learned input interfaces (for example, a computer mouse) in a co-adaptive system25. Indeed, user learning rates may vary across different biosignal modalities20. Our framework provides a path to optimize neural interfaces by tailoring the algorithm adaptation rates based on user behaviour. This raises important future challenges to develop methods that can perform on-the-fly estimates of critical properties of user behaviour, such as learning rate.
Our game theory model and experimental results also revealed user–decoder interactions that influence user behaviour, such as decoder effort (Fig. 5d). This demonstrates the feasibility of using our framework to optimize interfaces for particular goals. In myoelectric interfaces, manipulating decoders to encourage users to learn encoders that use less muscle activity could be desirable to maximize interface usability for extended periods. Alternately, decoders that encourage users to increase muscle activation towards a desired target could be desirable for motor rehabilitation applications23,68. Interestingly, we saw that users responded differently to varying decoder penalty terms, potentially reflecting personal preferences for effort versus cursor speed (Fig. 5f). Customizing interfaces based on dynamic user preferences has been shown to be beneficial in exoskeleton interfaces69. Individual preferences for cursor dynamics have also been seen in invasive neural interfaces70. Extending our game-theoretic frameworks to include user preferences will enable the design of fully personalized, custom neural interfaces, which may make these technologies more widely accessible71.
Our game theory model highlighted that multiple encoder–decoder stationary points exist in neural interfaces. This raises the possibility that the initial decoder can affect the final stationary point and, in turn, the final user encoder, without impacting performance (Extended Data Fig. 5a). Our experimental results confirmed that performance was not impacted by decoder initialization (Fig. 1e), consistent with prior work testing random decoder initializations in neural interfaces41. Although performance was not different, we found that initialization subtly impacted the stationary points as identified by the final encoder–decoder matrices (Extended Data Fig. 5e). Our findings suggested that decoder initialization might bias the encoders learned by users, which could be particularly beneficial for rehabilitation applications, which aim to shape users’ behaviour towards a particular goal. Our results also suggest that it may be critical to carefully consider initial training protocols for any neural interface, as it may influence user learning trajectories.
Our experiments revealed occasional deviations from model predictions, including deviations from strict encoder–decoder trade-offs (Fig. 5f) and only subtle impacts of the decoder initialization (Extended Data Fig. 5). This finding suggests that the actual user’s cost might differ from our model’s assumptions. For instance, our model predicts that decoder initialization would lead users to completely invert the relationship between EMG/movement and cursor velocity (Extended Data Fig. 5a), which ignores the likely possibility that users have some bias in strategies to control the interface. Additionally, there may be multiple ways to represent effort, and these definitions may also become unclear in less embodied invasive neural interfaces. Prior work has considered that users minimize effort54,72 via muscle activation73, metabolic cost74,75 and motor variability76. Studies into how users learn to control neural interfaces suggest that other features, such as correlations between neurons or muscles, may be key factors59,60. Deeper insights into how users learn to control different kinds of interface will improve our ability to model their cost functions to predict user–decoder interactions.
By developing computational and experimental methods to study co-adaptive systems, our results reveal how interface algorithms can influence user behaviour and system performance. Specifically, our model and experimental findings show that the decoder learning rate, penalty terms and initialization influence encoder–decoder dynamics in co-adaptive interfaces. These observations are consistent with recent work suggesting that adaptive algorithms can influence neural representations learned in brain–computer interfaces77. Our framework provides paths to harness the complex interactions between users and algorithms in closed-loop interfaces. For example, our results hint at the eventual possibility of designing a ‘curriculum’ to help naïve users learn to control neural interfaces, which is a task that some find challenging48. Decoder learning rates may need to be slower early in learning when users widely explore control strategies78,79,80 and adapt as users master the interface. Strategic decoder manipulations during a user’s initial exploration could be designed to bias users towards particular strategies. We can borrow concepts from training neural networks81 to envision a decoder adaptation curriculum to intelligently shape user learning, fully harnessing the power of co-adaptive interfaces. Our framework provides a versatile and principled way to model and design the next generation of smart, personalized adaptive neural interfaces.
Methods
Experimental methods
Participants
Fourteen volunteers were recruited for this study and gave their written consent before experiments, according to study procedures approved by the University of Washington’s Institutional Review Board (IRB number STUDY00014060). All participants were compensated monetarily for their time. All participants had no known motor disorders. Participant demographic information (including gender, weight, height, age and handedness) was collected via a demographics survey that participants completed before the experiment (Table 1). Only forearm circumference was measured by the experimenter; all other information was self-reported.
Experimental design
Participants were asked to control a cursor on the screen—using muscle activity from their forearms measured via EMG—to follow a two-dimensional continuous target trajectory as closely as possible (Extended Data Fig. 1a). Participants were told that they might not be able to control the cursor at the beginning of the trial but to expect that their cursor control would improve as the trial progressed. Participants received no other cues or information about the decoder or study conditions. If the participant’s cursor was stuck in the corner or edge of the screen for longer than 3.33 s, the cursor was automatically reset to the centre of the screen. Participants sat in a chair with no restraints facing a computer screen (HP Compaq L2206tm, 1,900 × 1,600 pixels, 46.5 cm × 24.5 cm). The computer screen displayed a red target circle (RGB: 1, 0, 0). The cursor was a blue circle (RGB: 0, 0, 1). The target was three times the area of the cursor (Extended Data Fig. 1a). The target and cursor positions were updated and displayed at 60 Hz. The task was programmed using the pygame version 2.0.1 and LabGraph version 1.0.2 Python packages.
Following prior work39,40,49, the target traversed a pseudorandom trajectory, which was generated by a sum of sinusoids with randomized phases, with frequencies that were prime multiples of 0.05 Hz (x-axis frequencies, 0.10 and 0.25 Hz; y-axis frequencies, 0.15 and 0.35 Hz). We randomized the phase of the sum of sines so that the reference trajectory would be unpredictable to the participants. Distinct prime multiples were chosen in each direction to provide separability in the x and y axes for any frequency-based analysis. To ensure constant signal power, the magnitude of each frequency component was normalized by the frequency squared. The trajectory was different for every trial. Each trial was 5 min with a 5-s ramp period during which the cursor and target speed slowly increased from stationary to the experimentally prescribed speeds. This ramp period followed prior experiments49 and was added to give participants time to recognize the starting cursor and target movements. Participants completed 16 five-minute trials in two blocks; each block consisted of eight trials. Participants were given a five-minute break in between each block but their EMG array and placement did not change throughout the experiment.
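The trajectory construction described above can be sketched as follows. This is a minimal illustration and not the task code: the function name and arguments are hypothetical, the amplitude scaling reads "normalized by the frequency squared" as 1/f^2, and the ramp is modelled as a linear gain, both of which are assumptions.

```python
import numpy as np


def sum_of_sines(freqs_hz, duration_s=300.0, rate_hz=60.0, ramp_s=5.0, rng=None):
    """One axis of a pseudorandom sum-of-sines reference trajectory."""
    rng = np.random.default_rng() if rng is None else rng
    freqs = np.asarray(freqs_hz, dtype=float)
    n_samples = int(round(duration_s * rate_hz))
    t = np.arange(n_samples) / rate_hz
    # Random phases make the reference unpredictable from trial to trial.
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
    # Assumption: amplitude of each component scales as 1 / f**2.
    amplitudes = 1.0 / freqs**2
    waves = amplitudes[:, None] * np.sin(
        2.0 * np.pi * freqs[:, None] * t[None, :] + phases[:, None]
    )
    # Assumption: the 5-s ramp is a linear gain rising from 0 to 1.
    ramp = np.clip(t / ramp_s, 0.0, 1.0)
    return t, ramp * waves.sum(axis=0)


rng = np.random.default_rng(0)
t, target_x = sum_of_sines([0.10, 0.25], rng=rng)   # x-axis frequencies
_, target_y = sum_of_sines([0.15, 0.35], rng=rng)   # y-axis frequencies
target = np.vstack([target_x, target_y])            # shape (2, n_samples)
```

Using distinct frequencies on each axis keeps the x and y components separable in any later frequency-domain analysis, as noted in the text.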
EMG signal collection and preprocessing
EMG signals were obtained using a Quattrocento system (Bioelettronica). A 64-channel high-density surface EMG electrode array (8-mm interelectrode spacing, 5 × 13 electrodes in a rectangular layout) was placed on the dominant forearm of each participant, targeting the extensor carpi radialis (Extended Data Fig. 1b). Once placed, the electrode array was wrapped with Coban self-adherent wrap (3M). The electrode cables from the Quattrocento to the array were secured to minimize motion artefacts.
EMG signals were acquired using Biolite Software (Bioelettronica) at 2,048 Hz in the differential mode with the built-in low-pass (130 Hz) and high-pass (10 Hz) filters. We filtered and rectified the EMG data following prior preprocessing techniques to compute the EMG linear envelope82,83. A moving average filter was applied to downsample EMG signals from 2,048 Hz to 60 Hz (Extended Data Fig. 1c).
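A minimal sketch of the rectify-and-average step is given below. The exact filter and window used in the experiments are not specified above, so the non-overlapping block average, the function name and its arguments are all assumptions made for illustration.

```python
import numpy as np


def emg_envelope(raw, fs_in=2048.0, fs_out=60.0):
    """Rectify EMG and block-average it down to the display rate.

    raw: array of shape (n_channels, n_samples) sampled at fs_in.
    Returns an array of shape (n_channels, n_out) sampled at fs_out.
    """
    rectified = np.abs(np.asarray(raw, dtype=float))
    n_out = int(rectified.shape[1] * fs_out / fs_in)
    # 2,048 / 60 is not an integer, so block edges are rounded to whole samples.
    edges = np.round(np.arange(n_out + 1) * fs_in / fs_out).astype(int)
    rectified = rectified[:, : edges[-1]]
    # Sum each block, then divide by its length to get the block mean.
    sums = np.add.reduceat(rectified, edges[:-1], axis=1)
    return sums / np.diff(edges)


# One second of a synthetic alternating signal on two channels.
raw = np.tile([1.0, -1.0], (2, 1024))
envelope = emg_envelope(raw)
```

Each output sample is the mean of roughly 34 rectified input samples, which smooths the signal and matches the 60 Hz cursor update rate.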
Myoelectric interface decoder
Real-time myoelectric control was implemented using a velocity-controlled Wiener filter that output a cursor velocity \({v}_{t}\in {{\mathbb{R}}}^{2}\):
\[{v}_{t}=D{u}_{t} \tag{6}\]
where \({u}_{t}\in {{\mathbb{R}}}^{N}\) is the processed EMG signals of N = 64 channels and \(D\in {{\mathbb{R}}}^{2\times N}\) is the decoder mapping. The cursor velocity is integrated to output a cursor position \({y}_{t}\in {{\mathbb{R}}}^{2}\):
\[{y}_{t}={y}_{t-1}+{v}_{t}\,\Delta t \tag{7}\]
where Δt is the time between cursor updates.
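The decoding step can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, the decoder values are placeholders, and the forward Euler update is an assumption about how the velocity is integrated.

```python
import numpy as np


def decode_step(D, u_t, y_prev, dt=1.0 / 60.0):
    """One cursor update: velocity from the decoder, then Euler integration."""
    v_t = D @ u_t              # (2,) cursor velocity from (N,) processed EMG
    y_t = y_prev + v_t * dt    # (2,) cursor position after one update
    return y_t, v_t


rng = np.random.default_rng(0)
D = rng.uniform(0.0, 1e-2, size=(2, 64))   # placeholder decoder values
y = np.zeros(2)
for _ in range(60):                        # one second of constant EMG input
    y, v = decode_step(D, np.ones(64), y)
```

Because the decoder output is a velocity, a constant EMG pattern moves the cursor at constant speed rather than holding it at a fixed position.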
Myoelectric interface decoder adaptation
Decoder adaptation consisted of two steps: (1) calculate the optimal decoder D* by minimizing the cost function cD based on the previous 20 s of user and trial data:
\[{D}^{* }=\arg \mathop{\min }\limits_{D}\,{c}_{D}(D) \tag{8}\]
(2) update the next decoder D+ following the SmoothBatch approach41, which uses a weighted combination of the prior decoder D− and the optimal decoder D*:
\[{D}^{+}=\alpha {D}^{-}+(1-\alpha ){D}^{* } \tag{9}\]
where α ∈ (0, 1) is the learning rate. The decoder cost function was minimized using SciPy version 1.5.4 (ref. 84). To satisfy real-time timing constraints on decoder cost minimization, we initialized \({D}_{0}^{* }=(\tau -y)\cdot {u}^{\dagger}\) using the previous 20 s of data26. At the start of each trial, the decoder was initialized by randomizing the decoder weightings from a uniform distribution (using numpy.random.rand) and multiplying by a scalar factor. Each user had two different decoder initializations, referred to as D1 and D2. Initial decoders for each participant’s trial were set to D1 or D2. Note, although we programmed the decoder to update every 20 s, due to slight software imprecision, the decoder update actually occurred approximately every 18 s.
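The update step and the least-squares initialization can be sketched as follows. The function names are hypothetical, and the sketch assumes that α weights the prior decoder, which is consistent with the slow condition using the larger α. The cost minimization itself is left out, and the optimal decoder is passed in as an argument.

```python
import numpy as np


def smoothbatch_update(d_prev, d_opt, alpha):
    """Blend the prior decoder with the newly optimized decoder."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    # Assumption: alpha weights the prior decoder, so larger alpha adapts slower.
    return alpha * d_prev + (1.0 - alpha) * d_opt


def least_squares_init(intended, emg):
    """Starting point for the optimizer: intended (2, T) times pinv of emg (N, T)."""
    return intended @ np.linalg.pinv(emg)


d_prev = np.zeros((2, 64))
d_opt = np.ones((2, 64))
d_slow = smoothbatch_update(d_prev, d_opt, alpha=0.75)   # moves 25% of the way
d_fast = smoothbatch_update(d_prev, d_opt, alpha=0.25)   # moves 75% of the way
```

Under this reading, a single update in the fast condition moves the decoder three times as far towards the optimum as an update in the slow condition.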
Decoder cost
The decoder cost function was formulated in earlier work55 and aims to minimize both tracking error and decoder effort as the decoder adapts. Minimizing velocity error is a common goal in user–machine interfaces, seen widely across neural interfaces10,85, body–machine interfaces86 and myoelectric interfaces44,66. Decoder effort is considered as part of the decoder cost since our prior theoretical analysis suggested that a regularization term in the user and decoder costs is necessary to ensure convergence in the user–decoder co-adaptation game to stable stationary points55.
The decoder cost cD is constructed as a linear combination of the task error and decoder effort:
here ∥x∥2 denotes the 2-norm of signal \(x:[0,t]\to {{\mathbb{R}}}^{d}\), and ∥X∥ denotes the Frobenius norm of matrix \(X\in {{\mathbb{R}}}^{m\times n}\). The decoder cost cD is then
where λD is the penalty term of the decoder effort.
Decoder conditions
We varied the decoder cost function and parameters of decoder adaptation to determine how they influenced system performance and user behaviour. Specifically, we varied the (1) learning rate α, (2) decoder effort penalty term λD and (3) decoder initialization. We tested two learning rates, namely, slow (α = 0.75) and fast (α = 0.25); two penalty terms, that is, low (λD = 10^2) and high (λD = 10^3); and two randomized initializations of the D decoder matrix (D1 and D2), namely, positive (matrix elements chosen uniformly at random in the range [0, 10^−2]) and negative (elements chosen uniformly at random in [−10^−2, 0]). We tested all combinations of learning rates, penalty terms and initializations, leading to a total of eight different decoder conditions.
Data analysis methods
Performance metrics
Our primary metric for quantifying performance was the tracking error, calculated as the Euclidean distance between the target and the cursor: \(\parallel \tau -y{\parallel }_{2}\). We assessed changes in task performance within a trial by comparing the mean tracking error in the first (early) and last (late) 30 s of the trial (excluding the ramp-up time). Because participants’ proficiency in tracking varied, we quantified improvements over time by calculating the relative error: \(\frac{{\mathrm{error}}_{\mathrm{fi}}}{{\mathrm{error}}_{\mathrm{i}}}\times 100\%\), where \({\mathrm{error}}_{\mathrm{fi}}={\mathrm{error}}_{\mathrm{final}}-{\mathrm{error}}_{\mathrm{initial}}\) and \({\mathrm{error}}_{\mathrm{i}}={\mathrm{error}}_{\mathrm{initial}}\).
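The metric can be sketched as below. The function names are hypothetical, and placing the early window immediately after the 5-s ramp is an assumption about how the ramp-up time was excluded.

```python
import numpy as np


def tracking_error(target, cursor):
    """Euclidean distance per sample; inputs have shape (2, n_samples)."""
    return np.linalg.norm(target - cursor, axis=0)


def percent_change_in_error(error, rate_hz=60.0, window_s=30.0, ramp_s=5.0):
    """Relative change from the first to the last window, as a percentage."""
    n = int(round(window_s * rate_hz))
    start = int(round(ramp_s * rate_hz))       # skip the ramp-up period
    initial = error[start:start + n].mean()
    final = error[-n:].mean()
    return 100.0 * (final - initial) / initial


# Synthetic example: error halves between the first and second half of a trial.
target = np.zeros((2, 18000))
cursor = np.zeros((2, 18000))
cursor[0, :9000] = 2.0
cursor[0, 9000:] = 1.0
change = percent_change_in_error(tracking_error(target, cursor))
```

A negative value indicates that the error decreased within the trial. Normalizing by the initial error makes participants with different baseline proficiency comparable.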
Statistical analyses
All analyses treated participants as individual data points and computed the mean across decoder conditions (learning rate, penalty terms, initialization) and trials. To assess statistical significance across time and conditions for each participant, we used a one- or two-sided Wilcoxon signed-rank test (scipy.stats.wilcoxon), which is a paired, non-parametric test. We chose a non-parametric statistical test because of subject-to-subject variability. Figure box plots were plotted using matplotlib.pyplot.boxplot (centre line, median; box limits, upper and lower quartiles; whiskers, 1.5× the interquartile range; fliers not plotted).
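A paired comparison of this kind can be sketched with `scipy.stats.wilcoxon`. The data below are synthetic placeholders, not experimental values, and the variable names are hypothetical.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
# Synthetic placeholder data: one mean value per participant and condition.
slow = rng.normal(loc=-40.0, scale=10.0, size=14)
fast = slow + rng.uniform(5.0, 25.0, size=14)   # every participant shifts upward

# Paired, non-parametric tests on the per-participant differences.
two_sided = wilcoxon(fast, slow, alternative="two-sided")
one_sided = wilcoxon(fast, slow, alternative="greater")
print(two_sided.pvalue, one_sided.pvalue)
```

When all 14 paired differences share one sign and there are no ties, the exact one-sided P value is 2^−14 ≈ 6.1 × 10^−5, which is the smallest value the exact test can return for N = 14.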
EMG analysis
To quantify the relationship between user EMG activity and cursor movement, we computed the EMG direction tuning curves, a well-established analysis used for neural and myoelectric interfaces43,44,45. We first downsampled and grouped intended user cursor velocities into ten equally spaced directions and then fit raw EMG activity into the binned directions to form EMG tuning curves. These EMG tuning curves were created for each participant, each trial and each EMG signal (Fig. 1f). Note, since the EMG channels are differentially recorded, we selected 63 EMG signals for the EMG tuning analysis.
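To quantify the within-trial EMG changes from early to late trial segments (Fig. 1g) we computed the norm difference between the early and late EMG tuning curves:
\[| \Delta \mathrm{EMG}{| }_{2}=\parallel {\mathrm{EMG}}_{\mathrm{dir},i}-{\mathrm{EMG}}_{\mathrm{dir},j}{\parallel }_{2} \tag{12}\]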
where EMGdir is the average EMG activity per direction and i and j indicate time points. All tuning curves were calculated by reconstructing 30-s segments of trial data that equally sampled all cursor directions. To measure how the EMG activity changed across the entire experiment per participant (Fig. 1h), we averaged the ∣ΔEMG∣2 across all channels for each participant to compute a mean ∣ΔEMG∣2 per participant per trial.
To find the preferred direction of the EMG activity, we performed cosine-fitting analysis43. Specifically, average EMG activity for each intended cursor direction was modelled as
where θ represents the intended cursor direction and B1, B2 and B3 are model coefficients. The preferred direction (PD) is
The change between early and late preferred directions (Fig. 1i) was calculated as ∣ΔPD∣ = ∣PDlate − PDearly∣.
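A cosine fit of this kind can be sketched as follows. The parameterization here (offset, cosine and sine coefficients named b0, b_cos and b_sin) is an assumption chosen for illustration and need not match the ordering of B1, B2 and B3 above. Wrapping the angular difference is likewise an added choice, and the tuning data are synthetic.

```python
import numpy as np


def preferred_direction(theta, emg_mean):
    """Fit emg ~ b0 + b_cos*cos(theta) + b_sin*sin(theta); return PD in radians."""
    design = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
    (b0, b_cos, b_sin), *_ = np.linalg.lstsq(design, emg_mean, rcond=None)
    # The fitted curve peaks where the cosine and sine terms align.
    return np.arctan2(b_sin, b_cos)


theta = np.arange(10) * 2.0 * np.pi / 10.0      # ten equally spaced directions
early = 2.0 + 0.5 * np.cos(theta - 1.0)         # synthetic tuning, PD = 1.0 rad
late = 2.0 + 0.5 * np.cos(theta - 1.4)          # synthetic tuning, PD = 1.4 rad
delta = preferred_direction(theta, late) - preferred_direction(theta, early)
delta_pd = np.abs(np.angle(np.exp(1j * delta))) # wrapped to [0, pi]
```

Wrapping keeps a shift across the ±π boundary from being reported as a change of nearly 2π.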
Encoder estimation methods
To model the user’s transformation from task space to user signals, we estimated the user’s encoder E. We model the user’s signal as being produced by a combination of feed-forward (F) control on position (τ) and velocity (\(\dot{\tau}\)) data, as well as feedback (B) control on position error and velocity error data ((τ − y) and \((\dot\tau -\dot{y})\), respectively):
Or, in the matrix formulation,
\[u=EP+\beta {1}_{t}^{\top } \tag{16}\]
where \({1}_{t}^{\top }=\left[1\,1\,\cdots \,1\right]\in {{\mathbb{R}}}^{1\times t}\), \(u\in {{\mathbb{R}}}^{N\times t}\), \(E\in {{\mathbb{R}}}^{N\times 4d}\), \(P\in {{\mathbb{R}}}^{4d\times t}\), N is the number of signals, d is the dimension of task information and β is an offset factor for estimation (\(\beta \in {{\mathbb{R}}}^{N\times 1}\)). In this Article, N = 64 EMG channel inputs, d = 2 (x and y dimensions on a computer display). The selection of t determines the timescale on which E is estimated.
We numerically estimated users’ encoders E using linear regression of experimentally measured data for u and P. We estimated E every 20 s. This corresponds to the timescale on which decoders were updated, thereby generating matched encoder–decoder estimate pairs to analyse co-adaptive dynamics (for example, Fig. 2c,d). Linear regression was done by constructing a u matrix of EMG signals for t = 1,200 samples (20 s of data at 60 Hz) for the 64 EMG electrodes, and a corresponding matrix P of position, velocity, position error and velocity error (\(\tau \in {{\mathbb{R}}}^{2\times t}\), \({\dot{\tau}}\), τ − y and \({\dot{\tau}}-{\dot{y}}\), respectively). These data were used to fit equation (16) using linear regression (sklearn.linear_model.LinearRegression) with an intercept offset.
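The regression can be sketched as follows. To keep the example dependency-light it uses `numpy.linalg.lstsq` with an explicit column of ones as a stand-in for the scikit-learn call named above. The function names, the finite-difference velocities and the row order of P are assumptions for illustration.

```python
import numpy as np


def build_regressors(tau, y, dt=1.0 / 60.0):
    """Stack position, velocity, position error and velocity error: (4d, t)."""
    tau_dot = np.gradient(tau, dt, axis=1)   # finite-difference velocities
    y_dot = np.gradient(y, dt, axis=1)
    return np.vstack([tau, tau_dot, tau - y, tau_dot - y_dot])


def estimate_encoder(u, P):
    """Least-squares fit of u = E @ P + beta; u is (N, t) and P is (4d, t)."""
    t = P.shape[1]
    X = np.hstack([P.T, np.ones((t, 1))])           # (t, 4d + 1), last column = offset
    coef, *_ = np.linalg.lstsq(X, u.T, rcond=None)  # (4d + 1, N)
    E = coef[:-1].T                                 # (N, 4d) encoder estimate
    beta = coef[-1]                                 # (N,) offset
    return E, beta


# Synthetic check: recover a known encoder from noise-free data.
rng = np.random.default_rng(0)
tau = rng.normal(size=(2, 1200))
y = rng.normal(size=(2, 1200))
P = build_regressors(tau, y)
E_true = rng.normal(size=(64, 8))
beta_true = rng.normal(size=64)
u = E_true @ P + beta_true[:, None]
E_hat, beta_hat = estimate_encoder(u, P)
```

With 1,200 samples and only nine regressors per channel, the fit is heavily overdetermined, which is what allows an encoder to be estimated for every 20-s window.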
We estimated the accuracy of the myoelectric encoder model by reconstructing EMG signals from the estimated encoder, decoding cursor velocity and position from the reconstructed EMG, and computing the coefficient of determination (\({R}^{2}\)) against the actual cursor velocity and position recorded in the trial. To establish a baseline for the predictive power of our encoder model, we also computed the accuracy of cursor velocity and position decoded from time-shuffled EMG data (Fig. 2b). The time-shuffled EMG data were created by permuting the EMG signals within each trial, so that the EMG data came from the same subject and trial but different time points.
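The comparison against a shuffled baseline can be illustrated with the sketch below. It assumes that R² is pooled over both cursor dimensions and that shuffling permutes whole time samples (all channels together) within a trial; the source does not specify either detail. The decoder, signals and noise level are synthetic placeholders.

```python
import numpy as np

def r_squared(actual, predicted):
    """Coefficient of determination pooled over all rows (cursor dimensions)."""
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean(axis=1, keepdims=True)) ** 2)
    return 1.0 - ss_res / ss_tot

def time_shuffle(u, rng):
    """Permute the time samples (columns) of one trial, keeping channels aligned."""
    return u[:, rng.permutation(u.shape[1])]

rng = np.random.default_rng(1)
n_channels, t = 8, 1200
D_demo = rng.standard_normal((2, n_channels))                   # hypothetical decoder
u_true = rng.standard_normal((n_channels, t))                   # stand-in for recorded EMG
u_model = u_true + 0.1 * rng.standard_normal((n_channels, t))   # stand-in for reconstructed EMG

v_actual = D_demo @ u_true                                      # velocity decoded from recorded EMG
r2_model = r_squared(v_actual, D_demo @ u_model)
r2_shuffled = r_squared(v_actual, D_demo @ time_shuffle(u_true, rng))
```

A useful encoder model should score well above the shuffled baseline, which preserves the amplitude distribution of the EMG but destroys its alignment with the task.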
Closed-loop encoder–decoder predictions
We used control theory principles to predict relationships between elements of encoder E and decoder D. The decoder is velocity based, meaning that the cursor position y is related to the user’s EMG signal u via \({\dot{y}}=D\cdot u\), where D is the decoder matrix. We model the user’s signal as being produced by a combination of feed-forward and feedback control on the position and velocity data:
This controller can achieve exact tracking, defined as y = τ, if D ⋅ F0 = 0 and D ⋅ F1 = I, where 0 is the 2 × 2 zero matrix and I is the 2 × 2 identity matrix. To see this, suppose at time t that τ(t) = y(t). Then,
Note that M is a simple integrator, so the condition for \({\dot{\tau}}(t)={\dot{y}}(t)\) in equation (18) together with τ(t) = y(t) ensures τ = y at all times. To ensure closed-loop stability, it suffices that D ⋅ B0 and D ⋅ B1 are negative-definite diagonal matrices (chapter 5 of ref. 51).
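The exact-tracking argument can be written out as follows. This is a sketch that assumes the encoder form described in the text, u = F0 τ + F1 τ̇ + B0 (τ − y) + B1 (τ̇ − ẏ), omits the offset β, and assumes that I + D ⋅ B1 is invertible; the published equations may be arranged differently.

```latex
\begin{aligned}
\dot{y} &= D u
         = D F_0\,\tau + D F_1\,\dot{\tau} + D B_0\,(\tau - y) + D B_1\,(\dot{\tau} - \dot{y}) \\
        &= \dot{\tau} + D B_1\,(\dot{\tau} - \dot{y})
           \qquad \text{using } \tau(t) = y(t),\; D F_0 = 0,\; D F_1 = I \\
\Rightarrow\quad (I + D B_1)\,(\dot{y} - \dot{\tau}) &= 0
\quad\Rightarrow\quad \dot{y}(t) = \dot{\tau}(t).
\end{aligned}
```

Under these assumptions, matching positions at time t imply matching velocities at time t, which is the condition the integrator argument relies on.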
Encoder–decoder pairs
We calculated encoder–decoder pairs by multiplying the decoder (\(D\in {{\mathbb{R}}}^{2\times 64}\)) at each update with the feed-forward and feedback contributions to the encoder (\(E\in {{\mathbb{R}}}^{64\times 8}\)) at each update. Each encoder–decoder pair (D ⋅ F0, D ⋅ F1, D ⋅ B0 and D ⋅ B1) resulted in a 2 × 2 matrix representing the dynamics between the encoder and decoder.
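A minimal sketch of the pair computation is given below. It assumes that the columns of E are ordered as the blocks [F0, F1, B0, B1], following the order in which position, velocity, position error and velocity error are listed for P; the actual column layout in the authors' code may differ. The example matrices are synthetic.

```python
import numpy as np

def encoder_decoder_pairs(D, E, d=2):
    """Return D@F0, D@F1, D@B0 and D@B1, each of shape (d, d).

    D: decoder, shape (d, N). E: encoder, shape (N, 4d), with assumed
    column blocks [F0, F1, B0, B1].
    """
    names = ("F0", "F1", "B0", "B1")
    return {f"D{name}": D @ E[:, i * d:(i + 1) * d] for i, name in enumerate(names)}

# Synthetic encoder built to satisfy the exact-tracking conditions.
rng = np.random.default_rng(2)
D_demo = rng.standard_normal((2, 64))
F0 = np.zeros((64, 2))                  # gives D @ F0 = 0
F1 = np.linalg.pinv(D_demo)             # gives D @ F1 = I for a full-row-rank D
B0 = rng.standard_normal((64, 2))
B1 = rng.standard_normal((64, 2))
E_demo = np.hstack([F0, F1, B0, B1])
pairs_demo = encoder_decoder_pairs(D_demo, E_demo)
```

Comparing each 2 × 2 product with the zero and identity matrices indicates how close an estimated encoder–decoder pair is to the exact-tracking conditions.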
Encoder differences
We calculated changes in encoders and decoders over time using the principal angle between subspaces (range space of E and orthogonal complement to the null space of D)87 and the norm difference of encoder matrices: ∥Ei − Ej∥, where i and j indicate the time points. Since we started from a decoder that was initialized with randomized weightings and resulted in poor user control, the details of the first encoder E0 were not readily interpretable. We did not include E0 in computing or plotting the encoder differences in Fig. 2e–g. To compute encoder changes within a trial, we calculated the difference between the initial (Ei) and final (Ef) encoders, estimated by averaging the first three and last three encoders in a trial, respectively (excluding E0). To compare data from consecutive encoders to the average change across a full trial (Fig. 2g), we computed the consecutive encoder difference by averaging across three subsequent, non-overlapping encoders and calculating the norm difference. This method ensured that the same number of encoder data points would be compared.
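Both measures can be sketched with NumPy. The example below computes principal angles with the classical QR-plus-SVD approach and uses the Frobenius norm for the encoder difference; the choice of norm is an assumption, and `scipy.linalg.subspace_angles` is an alternative for the angles. The orthogonal complement of the null space of D is its row space, so the transpose of D is passed as the second basis.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)             # orthonormal basis for range(A)
    Qb, _ = np.linalg.qr(B)             # orthonormal basis for range(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def encoder_norm_difference(E_i, E_j):
    """Frobenius norm of E_i - E_j (assumed norm; the source does not name it)."""
    return np.linalg.norm(E_i - E_j)

rng = np.random.default_rng(3)
E_demo = rng.standard_normal((64, 8))
D_aligned = E_demo[:, :2].T             # row space of D lies inside range(E)
angles_aligned = principal_angles(E_demo, D_aligned.T)

# Orthogonal coordinate subspaces are separated by 90 degrees.
I4 = np.eye(4)
angles_orthogonal = principal_angles(I4[:, :2], I4[:, 2:3])
diff_demo = encoder_norm_difference(E_demo, E_demo + 1.0)
```

Angles near zero indicate that the directions read out by the decoder lie within the subspace spanned by the encoder, whereas angles near π/2 indicate little overlap.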
Game theory methods
We modelled the co-adaptive system in a game theory framework involving two agents: the user and the decoder. The decision variables of the user and decoder were the entries in the matrices E and D, respectively. For tractability, we limit our analysis to the scalar case \(E,D\in {\mathbb{R}}\). Each agent was modelled as having a cost function defined by a linear combination of task error and agent effort. Agent effort was quantified using the square of the scalars E and D. Task error was defined as in equation (10) as the norm squared of D ⋅ u − (τ − y)/Δt. Considering the scalar case and focusing solely on feedback control (that is, setting u = B0(τ − y)) yields error \({(D{B}_{0}\Delta t-1)}^{2}{(\tau -y)}^{2}/\Delta t\). For simplicity in the analysis, we relabel E = B0, normalize by \({(\tau -y)}^{2}/\Delta t\), and choose coordinates for D so that \(e(D,E)={(DE-1)}^{2}={(1-DE)}^{2}\).
Stationary points
In the one-dimensional game formulation, the potential function takes the form
where \(D,E\in {\mathbb{R}}\). We solved the nonlinear equations \({\partial }_{E}\phi (E,D)=0\) and \({\partial }_{D}\phi (E,D)=0\) to compute stationary points as a function of the penalty parameters, \({E}^{* }({\lambda }_{E},{\lambda }_{D})\) and \({D}^{* }({\lambda }_{E},{\lambda }_{D})\), finding a saddle at \({E}^{* }={D}^{* }=0\) and the following mirror pair of local minima:
That these points are local minimizers was verified for \({\lambda }_{E},{\lambda }_{D}\in (0,1]\) by numerically determining that the eigenvalues of the Hessian of the potential function ϕ were positive at these points.
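The stationary-point analysis can be checked numerically. The sketch below assumes the potential ϕ(E, D) = (1 − DE)² + λE E² + λD D², which is consistent with the error and effort terms described above but is not confirmed by the text shown here; the published potential may include different scaling. The closed-form minimum in `positive_minimum` is derived under that assumption and holds for λE λD < 1; the penalty values are illustrative.

```python
import numpy as np

def gradient(E, D, lam_E, lam_D):
    """Gradient of the assumed potential with respect to (E, D)."""
    r = 1.0 - D * E
    return np.array([-2 * D * r + 2 * lam_E * E,
                     -2 * E * r + 2 * lam_D * D])

def hessian(E, D, lam_E, lam_D):
    """Hessian of the assumed potential."""
    off = 4 * D * E - 2
    return np.array([[2 * D**2 + 2 * lam_E, off],
                     [off, 2 * E**2 + 2 * lam_D]])

def positive_minimum(lam_E, lam_D):
    """Positive stationary point of the assumed potential (needs lam_E*lam_D < 1)."""
    s = np.sqrt(lam_E * lam_D)
    E = np.sqrt((1 - s) * np.sqrt(lam_D / lam_E))
    D = np.sqrt((1 - s) * np.sqrt(lam_E / lam_D))
    return E, D

lam_E, lam_D = 0.1, 0.4                                   # illustrative penalty parameters
E_star, D_star = positive_minimum(lam_E, lam_D)
grad_at_min = gradient(E_star, D_star, lam_E, lam_D)
grad_at_mirror = gradient(-E_star, -D_star, lam_E, lam_D)
eig_at_min = np.linalg.eigvalsh(hessian(E_star, D_star, lam_E, lam_D))
eig_at_origin = np.linalg.eigvalsh(hessian(0.0, 0.0, lam_E, lam_D))
```

Under the assumed potential, both Hessian eigenvalues are positive at the mirror pair (±E*, ±D*), while the origin has one positive and one negative eigenvalue, matching the saddle-plus-two-minima structure described above.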
Model for adaptation
We modelled agent adaptation using the discrete-time dynamics in equation (5). To assess the convergence of this model for adaptation to stationary points, we linearize the discrete-time dynamics about a stationary point and compute the eigenvalues of this linearization. A sufficient condition for stability is that the magnitude of all eigenvalues is less than 1.
Calculation of error decay rate
Given a matrix A that is the linearization of the discrete-time gradient descent dynamics in equation (5), we compute the spectral radius as \(\rho =\max \{| \lambda | ,\,\mathrm{where}\,\lambda \,\mathrm{is}\,\mathrm{an}\,\mathrm{eigenvalue}\,\mathrm{of}\,A\}\). Then, the error decay rate is equal to the spectral radius ρ, and satisfies the bound \(e(i)\le e(0){\rho }^{i}\), where i is the iteration number.
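The linearization and decay-rate calculation can be illustrated as follows. Equation (5) is not reproduced in this section, so the sketch assumes simultaneous gradient descent on the potential ϕ(E, D) = (1 − DE)² + λE E² + λD D² with per-agent learning rates, and it uses the distance to the stationary point as a stand-in for the error e(i). All parameter values are illustrative.

```python
import numpy as np

LAM_E, LAM_D = 0.1, 0.4          # illustrative effort penalties
ALPHA_E, ALPHA_D = 0.1, 0.1      # illustrative learning rates for user and decoder

def gradient(z):
    """Gradient of the assumed potential at z = (E, D)."""
    E, D = z
    r = 1.0 - D * E
    return np.array([-2 * D * r + 2 * LAM_E * E, -2 * E * r + 2 * LAM_D * D])

def hessian(z):
    """Hessian of the assumed potential at z = (E, D)."""
    E, D = z
    off = 4 * D * E - 2
    return np.array([[2 * D**2 + 2 * LAM_E, off], [off, 2 * E**2 + 2 * LAM_D]])

def step(z):
    """One simultaneous gradient-descent update for both agents."""
    return z - np.array([ALPHA_E, ALPHA_D]) * gradient(z)

# Positive stationary point of the assumed potential.
s = np.sqrt(LAM_E * LAM_D)
z_star = np.array([np.sqrt((1 - s) * np.sqrt(LAM_D / LAM_E)),
                   np.sqrt((1 - s) * np.sqrt(LAM_E / LAM_D))])

# Linearization of the update map about z_star and its spectral radius.
A = np.eye(2) - np.diag([ALPHA_E, ALPHA_D]) @ hessian(z_star)
rho = np.max(np.abs(np.linalg.eigvals(A)))

# Simulate from a nearby initial condition and track the distance to z_star.
z = z_star + np.array([0.05, -0.05])
dist = []
for _ in range(200):
    dist.append(np.linalg.norm(z - z_star))
    z = step(z)
dist = np.array(dist)
empirical_rate = dist[-1] / dist[-2]   # per-iteration contraction late in the run
```

A spectral radius below 1 indicates local convergence, and the late-run contraction factor of the simulated distance should approach ρ, which is how the decay rate can be compared across learning-rate settings.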
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All data are publicly available in a Code Ocean capsule at https://doi.org/10.24433/CO.4049054.v3 (ref. 88).
Code availability
The data and analysis scripts needed to reproduce all figures and statistical results reported in the Article are publicly available via CodeOcean at https://doi.org/10.24433/CO.4049054.v3 (ref. 88).
References
Serruya, M. D., Hatsopoulos, N. G., Paninski, L., Fellows, M. R. & Donoghue, J. P. Instant neural control of a movement signal. Nature 416, 141–142 (2002).
Taylor, D. M., Tillery, S. I. H. & Schwartz, A. B. Direct cortical control of 3D neuroprosthetic devices. Science 296, 1829–1832 (2002).
Carmena, J. M. et al. Learning to control a brain-machine interface for reaching and grasping by primates. PLoS Biol. 1, e42 (2003).
Hochberg, L. R. et al. Neuronal ensemble control of prosthetic devices by a human with tetraplegia. Nature 442, 164–171 (2006).
Pandarinath, C. & Bensmaia, S. J. The science and engineering behind sensitized brain-controlled bionic hands. Physiol. Rev. 102, 551–604 (2022).
Dadarlat, M. C., Canfield, R. A. & Orsborn, A. L. Neural plasticity in sensorimotor brain-machine interfaces. Annu. Rev. Biomed. Eng. 25, 51–76 (2023).
Danziger, Z., Fishbach, A. & Mussa-Ivaldi, F. A. Learning algorithms for human-machine interfaces. IEEE Trans. Biomed. Eng. 56, 1502–1511 (2009).
DiGiovanna, J., Mahmoudi, B., Fortes, J., Principe, J. C. & Sanchez, J. C. Coadaptive brain-machine interface via reinforcement learning. IEEE Trans. Biomed. Eng. 56, 54–64 (2009).
Jarrassé, N., Charalambous, T. & Burdet, E. A framework to describe, analyze and generate interactive motor behaviors. PLoS ONE 7, e49945 (2012).
Orsborn, A. L. et al. Closed-loop decoder adaptation shapes neural plasticity for skillful neuroprosthetic control. Neuron 82, 1380–1393 (2014).
Shenoy, K. V. & Carmena, J. M. Combining decoder design and neural adaptation in brain-machine interfaces. Neuron 84, 665–680 (2014).
Hahne, J. M., Markovic, M. & Farina, D. User adaptation in myoelectric man-machine interfaces. Sci. Rep. 7, 4437 (2017).
Brandman, D. M. et al. Rapid calibration of an intracortical brain-computer interface for people with tetraplegia. J. Neural Eng. 15, 026007 (2018).
Silversmith, D. B. et al. Plug-and-play control of a brain–computer interface through neural map stabilization. Nat. Biotechnol. 39, 326–335 (2021).
Rizzoglio, F., Casadio, M., De Santis, D. & Mussa-Ivaldi, F. A. Building an adaptive interface via unsupervised tracking of latent manifolds. Neural Netw. 137, 174–187 (2021).
Gigli, A., Gijsberts, A. & Castellini, C. Unsupervised myocontrol of a virtual hand based on a coadaptive abstract motor mapping. In Proc. 2022 International Conference on Rehabilitation Robotics (ICORR) 1–6 (IEEE, 2022).
Hu, X. et al. Bridging human-robot co-adaptation via biofeedback for continuous myoelectric control. IEEE Robot. Autom. Lett. 8, 8573–8580 (2023).
Ganguly, K. & Carmena, J. M. Emergence of a stable cortical map for neuroprosthetic control. PLoS Biol. 7, e1000153 (2009).
Fetz, E. E. Volitional control of neural activity: implications for brain-computer interfaces. J. Physiol. 579, 571–579 (2007).
Jackson, A. & Fetz, E. E. Interfacing with the computational brain. IEEE Trans. Neural Syst. Rehabil. Eng. 19, 534–541 (2011).
Albert, S. T. & Shadmehr, R. The neural feedback response to error as a teaching signal for the motor learning system. J. Neurosci. 36, 4832–4845 (2016).
Oby, E. R. et al. New neural activity patterns emerge with long-term learning. Proc. Natl Acad. Sci. USA 116, 15210–15215 (2019).
De Santis, D. & Mussa-Ivaldi, F. A. Guiding functional reorganization of motor redundancy using a body-machine interface. J. NeuroEng. Rehab. 17, 61 (2020).
Benabid, A. L. et al. An exoskeleton controlled by an epidural wireless brain-machine interface in a tetraplegic patient: a proof-of-concept demonstration. Lancet Neurol. 18, 1112–1122 (2019).
Müller, J. S. et al. A mathematical model for the two-learners problem. J. Neural Eng. 14, 036005 (2017).
Madduri, M. M. et al. Co-adaptive myoelectric interface for continuous control. IFAC-PapersOnLine 55, 95–100 (2022).
Dangi, S., Orsborn, A. L., Moorman, H. G. & Carmena, J. M. Design and analysis of closed-loop decoder adaptation algorithms for brain-machine interfaces. Neural Comput. 25, 1693–1731 (2013).
Hsieh, H.-L. & Shanechi, M. M. Optimizing the learning rate for adaptive estimation of neural encoding models. PLoS Comput. Biol. 14, e1006168 (2018).
Chasnov, B., Ratliff, L., Mazumdar, E. & Burden, S. Convergence analysis of gradient-based learning in continuous games. In Conference on Uncertainty in Artificial Intelligence Vol. 115 (eds Adams, R. P. & Gogate, V.) 935–944 (PMLR, 2020).
Merel, J., Pianto, D. M., Cunningham, J. P. & Paninski, L. Encoder-decoder optimization for brain-computer interfaces. PLoS Comput. Biol. 11, e1004288 (2015).
De Santis, D. A framework for optimizing co-adaptation in body-machine interfaces. Front. Neurorobot. 15, 662181 (2021).
Madduri, M. M., Burden, S. A. & Orsborn, A. L. Biosignal-based co-adaptive user-machine interfaces for motor control. Curr. Opin. Biomed. Eng. 27, 100462 (2023).
Braun, D. A., Ortega, P. A. & Wolpert, D. M. Nash equilibria in multi-agent motor interactions. PLoS Comput. Biol. 5, e1000468 (2009).
Li, Y., Carboni, G., Gonzalez, F., Campolo, D. & Burdet, E. Differential game theory for versatile physical human-robot interaction. Nat. Mach. Intell. 1, 36–43 (2019).
Chasnov, B. J., Ratliff, L. J. & Burden, S. A. Human adaptation to adaptive machines converges to game-theoretic equilibria. Sci. Rep. 15, 29364 (2025).
Krakauer, J. W., Hadjiosif, A. M., Xu, J., Wong, A. L. & Haith, A. M. in Motor Learning 613–663 (John Wiley & Sons, 2019).
Seo, G., Kishta, A., Mugler, E., Slutzky, M. W. & Roh, J. Myoelectric interface training enables targeted reduction in abnormal muscle co-activation. J. NeuroEng. Rehab. 19, 67 (2022).
Portnova-Fahreeva, A. A., Rizzoglio, F., Mussa-Ivaldi, F. A. & Rombokas, E. Autoencoder-based myoelectric controller for prosthetic hands. Front. Bioeng. Biotechnol. 11, 1134135 (2023).
Yamagami, M., Steele, K. M. & Burden, S. A. Decoding intent with control theory: comparing muscle versus manual interface performance. In ACM Conference on Human Factors in Computing Systems (CHI) 1–12 (ACM, 2020).
Yang, C. S., Cowan, N. J. & Haith, A. M. De novo learning versus adaptation of continuous control in a manual tracking task. eLife 10, e62578 (2021).
Orsborn, A. L., Dangi, S., Moorman, H. G. & Carmena, J. M. Closed-loop decoder adaptation on intermediate time-scales facilitates rapid BMI performance improvements independent of decoder initialization conditions. IEEE Trans. Neural Syst. Rehabil. Eng. 20, 468–477 (2012).
Gilja, V. et al. A high-performance neural prosthesis enabled by control algorithm design. Nat. Neurosci. 15, 1752–1757 (2012).
Georgopoulos, A. P., Schwartz, A. B. & Kettner, R. E. Neuronal population coding of movement direction. Science 233, 1416–1419 (1986).
Radhakrishnan, S. M., Baker, S. N. & Jackson, A. Learning a novel myoelectric-controlled interface task. J. Neurophysiol. 100, 2397–2408 (2008).
Couraud, M., Cattaert, D., Paclet, F., Oudeyer, P. Y. & de Rugy, A. Model and experiments to optimize co-adaptation in a simplified myoelectric control system. J. Neural Eng. 15, 026006 (2018).
McRuer, D. T. & Jex, H. R. A review of quasi-linear pilot models. IEEE Trans. Hum. Factors Electron. 8, 231–249 (1967).
Drop, F. M., Pool, D. M., Damveld, H. J., van Paassen, M. M. & Mulder, M. Identification of the feedforward component in manual control with predictable target signals. IEEE Trans. Cybern. 43, 1936–1949 (2013).
Zhang, X., Wang, S., Hoagg, J. B. & Seigler, T. M. The roles of feedback and feedforward as humans learn to control unknown dynamic systems. IEEE Trans. Cybern. 48, 543–555 (2018).
Yamagami, M., Peterson, L. N., Howell, D., Roth, E. & Burden, S. A. Effect of handedness on learned controllers and sensorimotor noise during trajectory-tracking. IEEE Trans. Cybern. 53, 2039–2050 (2023).
Cowan, N. J. et al. Feedback control as a framework for understanding tradeoffs in biology. Integr. Comp. Biol. 54, 223–237 (2014).
Åström, K. J. & Murray, R. M. Feedback Systems: An Introduction for Scientists and Engineers (Princeton Univ. Press, 2008).
Başar, T. & Olsder, G. J. Dynamic Noncooperative Game Theory 2nd edn (SIAM, 1999).
Hespanha, J. P. Noncooperative Game Theory: An Introduction for Engineers and Computer Scientists (Princeton Univ. Press, 2017).
Todorov, E. & Jordan, M. I. Optimal feedback control as a theory of motor coordination. Nat. Neurosci. 5, 1226–1235 (2002).
Madduri, M. M., Burden, S. A. & Orsborn, A. L. A game-theoretic model for co-adaptive brain-machine interfaces. In Proc. 2021 IEEE Conference on Neural Engineering (NER) 327–330 (IEEE, 2021).
Monderer, D. & Shapley, L. S. Potential games. Games Econ. Behav. 14, 124–143 (1996).
Strogatz, S. H. Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering 2nd edn (Westview Press, 2015).
Orsborn, A. L., Dangi, S., Moorman, H. G. & Carmena, J. M. Exploring time-scales of closed-loop decoder adaptation in brain-machine interfaces. In 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society 5436–5439 (IEEE, 2011).
Sadtler, P. T. et al. Neural constraints on learning. Nature 512, 423–426 (2014).
Berger, D. J., Gentner, R., Edmunds, T., Pai, D. K. & d’Avella, A. Differences in adaptation rates after virtual surgeries provide direct evidence for modularity. J. Neurosci. 33, 12384–12394 (2013).
Cheng, S. & Sabes, P. N. Modeling sensorimotor learning with linear dynamical systems. Neural Comput. 18, 760–793 (2006).
Chase, S. M., Schwartz, A. B. & Kass, R. E. Bias, optimal linear estimation, and the differences between open-loop simulation and closed-loop performance of spiking-based brain-computer interface algorithms. Neural Netw. 22, 1203–1213 (2009).
Cunningham, J. P. et al. A closed-loop human simulator for investigating the role of feedback control in brain-machine interfaces. J. Neurophysiol. 105, 1932–1949 (2011).
Willett, F. R. et al. Principled BCI decoder design and parameter selection using a feedback control model. Sci. Rep. 9, 8881 (2019).
Awasthi, P., Lin, T.-H., Bae, J., Miller, L. E. & Danziger, Z. C. Validation of a non-invasive, real-time, human-in-the-loop model of intracortical brain-computer interfaces. J. Neural Eng. 19, 056038 (2022).
Hahne, J. M., Dähne, S., Hwang, H.-J., Müller, K.-R. & Parra, L. C. Concurrent adaptation of human and machine improves simultaneous and proportional myoelectric control. IEEE Trans. Neural Syst. Rehabil. Eng. 23, 618–627 (2015).
Shanechi, M. M., Orsborn, A. L. & Carmena, J. M. Robust brain-machine interface design using optimal feedback control modeling and adaptive point process filtering. PLoS Comput. Biol. 12, e1004730 (2016).
Hung, N.-T. et al. Wearable myoelectric interface enables high-dose, home-based training in severely impaired chronic stroke survivors. Ann. Clin. Transl. Neurol. 8, 1895–1905 (2021).
Ingraham, K. A., Remy, C. D. & Rouse, E. J. The role of user preference in the customized control of robotic exoskeletons. Sci. Robot. 7, eabj3487 (2022).
Willett, F. R. et al. Feedback control policies employed by people using intracortical brain-computer interfaces. J. Neural Eng. 14, 016001 (2016).
Yamagami, M., Portnova-Fahreeva, A. A., Kong, J., Wobbrock, J. O. & Mankoff, J. How do people with limited movement personalize upper-body gestures? Considerations for the design of personalized and accessible gesture interfaces. In Proc. 25th International ACM SIGACCESS Conference on Computers and Accessibility 1 (ACM, 2023).
Flash, T. & Hogan, N. The coordination of arm movements: an experimentally confirmed mathematical model. J. Neurosci. 5, 1688–1703 (1985).
Emken, J. L., Benitez, R., Sideris, A., Bobrow, J. E. & Reinkensmeyer, D. J. Motor adaptation as a greedy optimization of error and effort. J. Neurophysiol. 97, 3997–4006 (2007).
Finley, J. M., Bastian, A. J. & Gottschall, J. S. Learning to be economical: the energy cost of walking tracks motor adaptation. J. Physiol. 591, 1081–1095 (2013).
McDonald, K. A., Cusumano, J. P., Hieronymi, A. & Rubenson, J. Humans trade off whole-body energy cost to avoid overburdening muscles while walking. Proc. R. Soc. B Biol. Sci. 289, 20221189 (2022).
Harris, C. M. & Wolpert, D. M. Signal-dependent noise determines motor planning. Nature 394, 780–784 (1998).
Rajeswaran, P., Payeur, A., Lajoie, G. & Orsborn, A. L. Assistive sensory-motor perturbations influence learned neural representations. Preprint at bioRxiv https://doi.org/10.1101/2024.03.20.585972 (2025).
Herzfeld, D. J., Vaswani, P. A., Marko, M. K. & Shadmehr, R. A memory of errors in sensorimotor learning. Science 345, 1349–1353 (2014).
Athalye, V. R., Ganguly, K., Costa, R. M. & Carmena, J. M. Emergence of coordinated neural dynamics underlies neuroprosthetic learning and skillful control. Neuron 93, 955–970.e5 (2017).
Dhawale, A. K., Smith, M. A. & Ölveczky, B. P. The role of variability in motor learning. Annu. Rev. Neurosci. 40, 479–498 (2017).
Bengio, Y., Louradour, J., Collobert, R. & Weston, J. Curriculum learning. In Proc. 26th International Conference on Machine Learning (ICML ’09) 41–48 (ACM, 2009).
Yamagami, M. et al. Assessment of dry epidermal electrodes for long-term electromyography measurements. Sensors 18, 1269 (2018).
Chvatal, S. A. & Ting, L. H. Common muscle synergies for balance and walking. Front. Comput. Neurosci. 7, 48 (2013).
Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).
Wodlinger, B. et al. Ten-dimensional anthropomorphic arm control in a human brain-machine interface: difficulties, solutions, and limitations. J. Neural Eng. 12, 016011 (2014).
Seáñez, I. & Mussa-Ivaldi, F. A. A body-machine interface for the control of a 2D cursor. In Proc. 2013 International Conference on Rehabilitation Robotics (ICORR) 6650508 (IEEE, 2013).
Knyazev, A. V. & Argentati, M. E. Principal angles between subspaces in an A-based scalar product: algorithms and perturbation estimates. SIAM J. Sci. Comput. 23, 2008–2040 (2002).
Madduri, M., Yamagami, M., Li, S. J., Burckhardt, S., Burden, S. A. & Orsborn, A. L. Computational framework to predict and shape human-machine interactions in closed-loop, co-adaptive neural interfaces. CodeOcean https://doi.org/10.24433/CO.4049054.v3 (2025).
Acknowledgements
This work was supported by Meta Reality Labs Research, the National Science Foundation (NSF) award number 2045014 (S.A.B.), NSF award number 2338662 (A.L.O.) and NSF award number 2124608 (A.L.O. and S.A.B.). M.M.M. was funded in part by the Department of Defense, National Defense Science and Engineering Graduate Fellowship (NDSEG). The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the Article. We thank A. X. T. Millevolte for experimental setup support.
Author information
Contributions
M.M.M., M.Y., S.A.B. and A.L.O. were responsible for the experimental design. M.M.M., M.Y. and S.J.L. were responsible for the experimental setup. M.M.M. and S.B. conducted the experiments. M.M.M. analysed the experimental data and prepared the figures. M.M.M., S.A.B. and A.L.O. wrote the paper.
Ethics declarations
Competing interests
A.L.O. declares the following competing interest: employment as a scientific advisor for Meta Reality Labs. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Marco Capogrosso and Noah Cowan for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Experimental testbed.
a. Photograph of experiment. User tracks target (dashed red) by controlling the velocity of a 2D cursor (solid blue) with their forearm muscle activity. Muscle activity is collected via surface EMG electrodes that are placed on the participant’s dominant forearm and wrapped with Coban tape. b. Electrode placement on dominant forearm. Surface EMG electrodes are placed on the user’s dominant forearm to target Extensor Carpi Radialis. c. Schematic of EMG preprocessing and decoding pipeline. EMG signals recorded from the user’s forearm are preprocessed and then input to the adaptive decoder. The decoder output is the cursor velocity, which is integrated to display cursor position (blue) on the screen.
Extended Data Fig. 2 Surface EMG electrode and example tuning curves.
Diagram of surface EMG electrode and sample EMG tuning curves from representative subjects taken from late segments within trial. Black line represents the median across trials, with the shaded area indicating the 25th - 75th percentiles. Pink dashed lines show fitted tuning curves. Vertical lines represent preferred direction from late trial segments.
Extended Data Fig. 3 Example tuning curve changes from participants.
Sample EMG tuning curves from a range of participants taken from early (blue) and late (black) trial segments. For all plots, lines represent the median across trials with shaded area indicating the 25th - 75th percentiles. Vertical dashed lines represent preferred direction (PD) from early (blue) and late (gray) trial segments. Examples were chosen to illustrate the diversity of qualitative changes observed. a. Example EMG tuning curves showing qualitative differences in EMG amplitude and PD from early to late in the trial. b. Example EMG tuning curves showing minimal qualitative differences in EMG amplitude and PD from early to late in the trial. c. Example EMG tuning curves showing qualitative differences in EMG amplitude but minimal to no differences in PD from early to late in the trial.
Extended Data Fig. 4 Effect of error on Decoder-Encoder Feedforward Pairs.
a. Block diagram model of closed-loop system. The user’s encoder E outputs myoelectric activity u to the decoder D to follow the target’s position and velocity. The decoder outputs velocity, which is then integrated by system dynamics M to generate a cursor position. The user sees their cursor moving with some velocity \({\dot{y}}\) to position y. Error in the cursor position and velocity (relative to the target) acts as closed-loop feedback to the user. (Duplicated from Fig. 2a for completeness.) b. Product of the decoder matrix with first-order feedforward contributions of the encoder matrix (N = 14, median; shading shows the 25th - 75th percentile). Black dashed lines represent values that yield perfect trajectory tracking. Purple dashed lines represent values that yield near perfect trajectory tracking. (Duplicated from Fig. 2c, left for completeness). c. Average tracking error (left: x-direction, right: y-direction) vs average D · F0 value for each individual participant (dot).
Extended Data Fig. 5 Decoder initialization slightly influences decoder-encoder pairs.
a. Model predictions. All panels show the gradient field of the user and decoder cost functions. Purple (user) and orange (decoder) curves show nullclines (where the agent’s gradient equals 0) that intersect at stationary points (black stars). Decoder initialization D1 (light orange) and initialization D2 (dark orange) are noted on the vertical axis of the diagram. Assumed user initialization (purple) is noted on the horizontal axis. b. Left: Average magnitude of the decoder matrix (norm) as a function of time in the trial for D1 (solid light orange) and D2 (dashed dark orange) initializations in slow learning rate conditions (N = 14, median; shading shows the 25th - 75th percentile). Right: Boxplots are average decoder effort across the trial for each subject (N = 14, center shows median; box shows 25th-75th percentiles; whiskers extend to 1.5 × this interquartile range; two-sided Wilcoxon signed-rank test, ns = 0.27). c. Left: Average magnitude of the user encoder matrix (norm) as a function of time in the trial for D1 (solid dark purple) and D2 (dashed light purple) decoder initializations (N = 14, median; shading shows the 25th - 75th percentile). Right: Boxplots are average effort for each subject across the trial (N = 14, center shows median; box shows 25th-75th percentiles; whiskers extend to 1.5 × this interquartile range; two-sided Wilcoxon signed-rank test, ns = 0.86). d. Percent change in error from the start of the trial (first 30 seconds) to the end of the trial (last 30 seconds), separated by D1 (dark gray) and D2 (light gray) initialization conditions (N = 14, center shows median; box shows 25th-75th percentiles; whiskers extend to 1.5 × this interquartile range; two-sided Wilcoxon signed-rank test, ns = 0.33). e. Last minute in trial of product of the average decoder matrix of each initialization with first-order feedforward (F1) contributions of the average encoder matrix of each initialization, separated by learning rate.
Fast learning rate conditions are shaded gray. The matched conditions are the decoders and encoders of the same initialization and the mismatched conditions are the decoders and encoders of different initializations (N = 28, center shows median; box shows 25th-75th percentiles; whiskers extend to 1.5 × this interquartile range; two-sided Wilcoxon signed-rank test, ns > 0.05; *p < 0.05; **p < 0.001). f. Last minute in trial of product of the average decoder matrix of each initialization with first-order feedback (B1) contributions of the average encoder matrix of each initialization, separated by learning rate. Fast learning rate conditions are shaded gray. The matched conditions are the decoders and encoders of the same initialization and the mismatched conditions are the decoders and encoders of different initializations (N = 28, center shows median; box shows 25th-75th percentiles; whiskers extend to 1.5 × this interquartile range; two-sided Wilcoxon signed-rank test, ns > 0.05; *p < 0.05; **p < 0.001).
Extended Data Fig. 6 Effect of penalty parameter in slow learning rate condition only.
a. Left: Average magnitude of the decoder matrix (norm) as a function of time in the trial for low (dashed light orange) and high (solid dark orange) decoder penalty terms (N = 14, median; shading shows the 25th - 75th percentile). Right: Boxplots are average decoder effort across the trial for each subject (N=14, center shows median; box shows 25th-75th percentiles; whiskers extend to 1.5 × this interquartile range; one-sided Wilcoxon signed-rank test, **p = 6.1e − 5). b. Left: Average magnitude of the user encoder matrix (norm) as a function of time in the trial for low (dashed light purple) and high (solid dark purple) decoder penalty terms (N = 14, median; shading shows the 25th - 75th percentile). Right: Boxplots are average effort for each subject across the trial (N = 14, center shows median; box shows 25th-75th percentiles; whiskers extend to 1.5 × this interquartile range; one-sided Wilcoxon signed-rank test, *p = 0.018). c. Percent change in error from the start of the trial (first 30 seconds) to the end of the trial (last 30 seconds), separated by high (dark gray) and low (light gray) decoder penalty term conditions (N=14, median; shading shows the 25th - 75th percentile; two-sided Wilcoxon signed-rank test, ns = 0.24). d. Left: Average cursor speed as a function of time in the trial for low (light gray) and high (dark gray) decoder penalty terms (N = 14, median; shading shows the 25th - 75th percentile). Right: Boxplots are average cursor speed across the trial for each subject (N=14, center shows median; box shows 25th-75th percentiles; whiskers extend to 1.5 × this interquartile range; Wilcoxon signed-rank test, **p = 6.1e − 5).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Madduri, M.M., Yamagami, M., Li, S.J. et al. Computational framework to predict and shape human–machine interactions in closed-loop, co-adaptive neural interfaces. Nat Mach Intell 8, 372–387 (2026). https://doi.org/10.1038/s42256-026-01194-z
DOI: https://doi.org/10.1038/s42256-026-01194-z