Abstract
L-DOPA-induced dyskinesia (LID) is a common complication in the treatment of Parkinson’s disease (PD), characterized by involuntary excessive movements. The traditional Abnormal Involuntary Movement Scale (AIMs), used for quantifying abnormal involuntary movements, relies heavily on manual observation and is highly subjective. Unsupervised behavior classification typically requires joint modeling on the entire dataset, making it inflexible when dealing with new samples. Here, we propose an automated behavioral recognition framework integrating multi-view 3D motion reconstruction with a hypergraph self-attention neural network to precisely delineate LID behavioral phenotypes and evaluate pharmacological interventions. Using a synchronized four-camera setup, we collected large-scale motion data from WT, PD, and LID mice, tracking 16 key body points to reconstruct accurate 3D trajectories. By combining unsupervised clustering with manual annotation, we established a standardized behavioral database. We introduced a spatiotemporal hypergraph neural network model incorporating a self-attention mechanism, which demonstrated excellent recognition accuracy across all behaviors and effectively distinguished the behavioral profiles of WT, PD, and LID mice. Based on this, we compared the behavioral differences in treatment effects between amantadine (AMAN) and clozapine (CLZ). Overall, our automated 3D behavioral analysis framework offers a high-throughput, objective, and precise approach to behavioral quantification, presenting a powerful tool for unraveling the mechanisms underlying LID and other movement disorders, as well as for advancing pharmacological research.
Introduction
Parkinson’s disease (PD) is a chronic neurodegenerative disorder, primarily characterized pathologically by the progressive loss of dopaminergic neurons in the substantia nigra, leading to functional impairments in basal ganglia circuits1,2,3. Typical symptoms include resting tremor, muscular rigidity, bradykinesia, and postural instability4. Current clinical treatments primarily rely on levodopa and a variety of adjuvant drugs, such as dopamine receptor agonists and monoamine oxidase inhibitors5. However, with long-term use of dopamine replacement therapy (e.g., levodopa), patients may develop levodopa-induced dyskinesia (LID), characterized by involuntary excessive movements6,7, such as chorea or dystonia8,9. The occurrence of these symptoms is often associated with the duration and dosage of drug treatment, especially in the later stages of treatment, when fluctuations in the efficacy of levodopa become more pronounced, making patients more susceptible to dyskinesia10.
In mice or rats with depleted dopamine, the administration of chronic levodopa induces the manifestation of levodopa-induced dyskinesia (LID), which is currently recognized as a valid animal model of dyskinesia11,12. These movements are primarily assessed using the Abnormal Involuntary Movement Scale (AIMs)13. However, AIMs scoring is labor-intensive, inefficient, and prone to human bias14. Moreover, the lack of standardized protocols, such as varying criteria for movement activities or scales adapted across species, generates fragmented data and complicates cross-study comparisons15. These issues undermine reproducibility, risk drawing erroneous conclusions about drug efficacy, and impede the determination of clinically viable therapies. Therefore, behavioral assessment tools that possess objectivity, quantifiability, and high-throughput capabilities are highly necessary for enhancing experimental efficiency and the reliability of results.
In recent years, numerous behavior recognition frameworks based on unsupervised and supervised learning have been developed and widely applied to tracking behaviors in animals such as primates16,17, rodents18,19,20,21, and insects22,23,24,25. Unsupervised learning extracts and clusters spatiotemporal features without labels (e.g., MoSeq26, Behavior Atlas21), decomposing behaviors into comparable modules that help reveal neural effects of genes, drugs, and environments, and enable cross-laboratory and cross-species analyses. However, such methods require joint modeling across the entire dataset, limiting their flexibility in handling new samples21. Supervised learning identifies visual features and learns spatiotemporal relations between keypoints (e.g., DeepLabCut27, Automated Behavior Recognition System (ABRS)28, Spatial Temporal Graph Convolutional Networks (ST-GCN)29), enabling pose estimation, behavior classification, and quantification in mice, fruit flies, quadrupeds, and zebrafish. However, for disease models with highly heterogeneous manifestations, such as LID dyskinesia, there is currently a lack of analytical strategies that can effectively capture their complex behavioral structures.
To quantitatively characterize the behavioral impairments caused by PD-related conditions and the effects of pharmacological treatments, we propose an automated behavior analysis system based on 3D motion capture, hypergraph self-attention neural network training, and support vector machine classification. We established a motion behavior classification dataset for wild-type (WT), PD, and LID mice, identifying typical specific behaviors characteristic of LID, such as abnormal gait, twisting, or rapid limb flailing. Finally, we evaluated the differences in therapeutic efficacy of amantadine and clozapine on LID behavioral phenotypes.
Results
Behavior quantification framework based on Hypergraph Self-Attention neural networks
To accurately quantify the behavioral characteristics of LID and other Parkinson’s disease-related conditions and to evaluate the efficacy of therapeutic drugs, we propose an automated analysis framework based on 3D motion capture and hypergraph self-attention neural networks (Fig. 1). First, we use DeepLabCut (DLC) to estimate pose from the four-view data obtained by the 3D motion capture system, acquiring keypoint data of mouse movements and reconstructing the 3D body skeleton. We then perform unsupervised classification using the Behavior Atlas, followed by manual verification to obtain a dataset of 15 standard behaviors (Fig. 1 left). We integrate a hypergraph self-attention neural network, which models non-local dependencies between keypoints by introducing hyperedges and integrates spatiotemporal correlation features to achieve end-to-end supervised training. This allows for precise recognition of dyskinesia behavior patterns, such as abnormal gait, twisting, or rapid limb flailing, which are typically associated with complex synchronous or asynchronous changes of multiple keypoints in the spatiotemporal dimensions (Fig. 1 middle). Finally, we perform PCA dimensionality reduction based on the 15 behavioral features and combine support vector machines for classification among different experimental groups (Fig. 1 right).
Left module: Standard behavior dataset construction via unsupervised clustering combined with manual verification. Middle module: The model extracts high-order dynamic behavior features through spatial hyperedges and temporal hyperedges, ultimately outputting the probability distribution of behavior categories. Right module: After extracting the principal components of the behavior data, SVM classification is used to distinguish the behavioral features of mice from different experimental groups.
Behavioral data acquisition via unsupervised clustering
First, we established PD and LID mouse models. By injecting 6-hydroxydopamine (6-OHDA) into the dorsal striatum, we created a unilateral dopamine-depleted model of PD mice30. Two weeks after the dopamine depletion in the striatal region, levodopa combined with benserazide hydrochloride was administered daily to construct the LID mouse model31 (Fig. 2a).
A Schematic of the construction of mouse PD and LID models. B Global AIMs behavioral scoring in LID mice. Each mouse (n = 12) was scored once every 20 min after L-DOPA injection. The score peaked between 20 and 60 min after injection; this period, used for feature analysis, is indicated by the gray shaded area. Data are presented as mean ± SEM. C Workflow for obtaining behavioral categories. After spatial calibration of four cameras with a checkerboard, synchronized multi-view video recording of mice was performed to reconstruct the 3D skeleton. Behavioral feature space was obtained through decomposition clustering. D Dimensionality reduction distribution of behavioral features, where each point represents a temporal motion segment, and the color corresponds to 15 manually annotated behaviors. The horizontal axes are umap1 and umap2, and the vertical axis is movement speed, reflecting the dynamic changes in behavior. E Differences in behavioral features among LID, PD, and WT mice (n = 12 per group, ****p < 0.0001, ***p < 0.001, **p < 0.01, *p < 0.05. Statistics: Two-way ANOVA, Welch’s t test, Bonferroni correction).
To construct a quantifiable behavioral dataset for PD and LID mice, we adapted previously established methods21,32 to build a multi-view synchronized acquisition and 3D reconstruction system for behavioral tracking (Fig. 2b). Four cameras were evenly distributed around the behavior platform at 90-degree intervals. Before data acquisition, each camera was calibrated using a standard 6 × 7 checkerboard pattern to correct lens distortion and ensure spatial coordinate consistency. WT (n = 12), PD (n = 12), and LID (n = 12) mice were recorded separately. The LID model mice underwent standardized AIMs scoring, and based on these scores, we selected the 20–60 min period during which dyskinetic symptoms were most pronounced for inclusion in the dataset (Fig. 2c).
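Once each camera is calibrated, every view yields a 3 × 4 projection matrix, and a keypoint detected in several views can be lifted to 3D by triangulation. Below is a minimal numpy sketch of standard direct linear transform (DLT) triangulation; it illustrates the principle only and is not the exact implementation used in this study:

```python
import numpy as np

def triangulate_dlt(proj_mats, pts_2d):
    """Triangulate one 3D point from N calibrated views via linear DLT.

    proj_mats: list of 3x4 camera projection matrices (from calibration).
    pts_2d: (N, 2) pixel coordinates of the same keypoint in each view.
    """
    rows = []
    for P, (u, v) in zip(proj_mats, pts_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.asarray(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize
```

In practice each of the four views contributes two rows, and low-confidence detections can be dropped before solving.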
Based on the anatomical features of mice, we selected 16 key anatomical points as tracking targets, including the nose tip, both ears, the neck, proximal limb joints, four paws, the back, tail base, mid-tail, and tail tip. Approximately 8000 manually annotated frames were used to train a DeepLabCut (DLC) deep learning model to accurately estimate the positions of these skeletal points. We then reconstructed the complete three-dimensional skeletal motion trajectory of each mouse by fusing coordinate information from the four viewpoints using pose3d. Linear interpolation was used to handle missing values in the trajectory and outperformed Kalman and particle filters33 (RMSE: 2.09, MAE: 0.94) (Supplementary Table 1). Because the multi-view design effectively mitigates severe occlusion, no additional occlusion-handling post-processing was required. We dynamically decomposed and clustered all captured motion segments using the behavioral atlas analysis tool Behavior Atlas. Ultimately, 40 unsupervised behavioral motifs were identified and classified through hierarchical clustering (Supplementary Fig. 1). Based on similarity of posture and kinematic parameters17,18,20 (see Methods for details), these motifs were manually grouped into 15 major behavioral categories, comprising 10 basic behaviors: Grooming, Sniffing in situ, Sniffing while walking, Hunching, Rising, Rearing, Running, Trotting, Walking, and Turn grooming, and five additional behaviors: Axial Twist with Postural Imbalance (AT-PI), Head Deviation with Low-Speed Rotation (HD-L), Head Deviation with High-Speed Rotation (HD-H), Head Deviation with Moderate-Speed Rotation (HD-M), and Axial Twist with Orofacial Dyskinesia (AT-OD) (Table 1, Fig. 2d).
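The gap filling and its evaluation can be sketched as follows; the RMSE/MAE definitions are the conventional ones, while the reported values (RMSE 2.09, MAE 0.94) come from the comparison in Supplementary Table 1:

```python
import numpy as np

def interpolate_gaps(traj):
    """Linearly interpolate NaN gaps in a 1-D coordinate trace,
    as done for missing keypoints in the reconstructed 3D trajectories."""
    traj = np.asarray(traj, dtype=float).copy()
    nans = np.isnan(traj)
    idx = np.arange(len(traj))
    traj[nans] = np.interp(idx[nans], idx[~nans], traj[~nans])
    return traj

def rmse(a, b):
    """Root-mean-square error between a filled trace and ground truth."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def mae(a, b):
    """Mean absolute error between a filled trace and ground truth."""
    return float(np.mean(np.abs(a - b)))
```

Each of the 16 keypoints has three such coordinate traces (x, y, z), which are interpolated independently.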
Thus, leveraging computer vision, we identified distinct behavioral features in mice.
Spontaneous behavioral patterns of WT/PD/LID
In order to assess the spontaneous behavioral characteristics of WT, PD, and LID mice, we performed a combined cluster analysis of behavioral data from these three groups, quantifying the proportion of 15 behavioral patterns across groups. We found that, compared with WT and PD mice, LID mice exhibited significant upregulation in behaviors such as AT-PI, HD-L, HD-H, HD-M, and AT-OD (Fig. 2e). These behaviors are consistent with the typical abnormal movements reported in rodent models of LID induced by L-DOPA in the literature15,34,35, including persistent licking, gnawing, head bobbing, limb torsion, and forelimb hyperkinesia. Such abnormal movements are highly repetitive and rhythmical, primarily occurring within specific time windows following L-DOPA administration34. Intermittent high-concentration dopamine release intensifies overactivation of dopamine D1 receptors in the midbrain-striatal pathway, thereby triggering aberrant activity in cortical–basal ganglia circuits. We conclude that these five abnormal movement patterns represent core diagnostic features for identifying the LID phenotype (Fig. 3a). Other types of behavioral patterns were significantly downregulated in LID compared to both WT and PD mice (Supplementary Fig. 2a). Additionally, behavioral mapping revealed distinctions between PD and WT mice, with PD mice showing significantly reduced Walking and HD-L behaviors (Supplementary Fig. 2b), resembling the bradykinesia characteristic of clinical PD patients4. Furthermore, PD mice exhibited significantly increased Turn Grooming (turning to groom the posterior fur) and AT-OD behaviors (Fig. 3b). Turn Grooming may relate to sensorimotor integration deficits induced by unilateral 6-OHDA lesions, as studies have shown that unilateral nigrostriatal damage leads to contralateral sensory neglect, prompting compensatory increases in ipsilateral grooming36. 
The elevated frequency of AT-OD suggests impairments in postural control and orofacial coordination, possibly arising from dysfunction in basal ganglia–cortical circuits37.
A Motion skeletons of five behavioral features in LID mice. B Motion skeletons of two behavioral features in PD mice. C Five kinematic parameters (speed back, body length, body height, body angle, body angular velocity). Left, example WT movement parameters (blue) within 40 min; Right, PMF curves (yellow). 1–16 represent skeletal points of the mouse. D PMF curves of five kinematic parameters (I, speed back; II, body length; III, body height; IV, body angle; V, body angular velocity) for WT, PD, and LID. Bold traces and shaded areas represent mean ± SEM. (****p < 0.0001, ***p < 0.001, **p < 0.01, *p < 0.05, Kolmogorov–Smirnov test, Bonferroni correction). E Classification results of WT, PD, and LID using SVM in the PC1 and PC2 plane. The background color indicates the model’s classification confidence (Maximum posterior), with darker colors representing higher confidence. Solid contour lines show the probability gradients of different regions. F Projection values calculated by projecting the points in (E) onto the decision boundaries (I, LID vs. PD, p < 0.0001; II, LID vs. WT, p < 0.0001; III, PD vs. WT, p = 0.0002; Mann–Whitney test).
To delineate the locomotion disparities among the three groups of mice, we meticulously analyzed five kinematic parameters: speed back, body length, body height, body angle, and body angular velocity (Fig. 3c). We calculated the probability mass function (PMF) for each parameter to illustrate shifts in distribution across groups. Our results showed that LID mice exhibited smaller body angle and shorter body length compared to PD and WT mice, indicating a pronounced tendency toward body contraction and torsion (Fig. 3d Ⅱ, Ⅳ). Body angular velocity increased significantly in the high-speed range, likely due to a substantial increase in speed during twisting (Fig. 3d Ⅴ). Body height was significantly greater in LID mice relative to the other groups, suggesting more frequent elevation of the torso (Fig. 3d Ⅲ). Back speed showed only minor differences between LID and PD mice, with no significant difference compared to WT (Fig. 3d Ⅰ). When comparing PD and WT mice, PD mice showed significantly reduced body angle and body length, indicating a tendency toward postural contraction even without the development of dyskinesia. No significant differences were observed in body height or body angular velocity.
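The PMF comparison underlying this analysis can be sketched as below; the bin edges are hypothetical, and the two-sample KS statistic is computed directly from the empirical CDFs (in practice scipy.stats.ks_2samp would also supply the p value):

```python
import numpy as np

def pmf(values, bins):
    """Probability mass function of a kinematic parameter over fixed bins."""
    counts, _ = np.histogram(values, bins=bins)
    return counts / counts.sum()

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the two empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    all_v = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, all_v, side="right") / len(a)
    cdf_b = np.searchsorted(b, all_v, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))
```

One PMF per mouse (or per group) is computed for each of the five parameters, and group-wise distribution shifts are then tested pairwise with Bonferroni correction.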
To verify whether these behavioral features could effectively distinguish among the three groups of mice, we embedded the behavioral scores of 15 distinct locomotor patterns from each mouse into a 2D PCA space for visualized analysis. We then employed a SVM classifier to categorize the data. Our results revealed three distinct clusters corresponding to WT, PD, and LID individuals in the 2D PCA space (Fig. 3e), with WT and PD clusters positioned in closer proximity yet still exhibiting clear classification boundaries. To further quantify intergroup differences, we projected all data points in the 2D space onto the decision boundary direction, extracting their 1D representations along this axis. Statistical analyses demonstrated significant differences between each pair of groups along this projection (Fig. 3f I, II, III).
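A sketch of the embedding and boundary projection, assuming a linear decision boundary w·x + b = 0 obtained from a fitted SVM (the fit itself, e.g. a linear-kernel classifier, is omitted here):

```python
import numpy as np

def pca_embed(X, k=2):
    """Project behavior-score vectors (n_mice x 15) onto the top-k PCs."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

def project_on_boundary_normal(points, w, b):
    """1-D representation of 2-D points along an SVM decision axis.

    w, b are hypothetical placeholders for a fitted linear SVM's weight
    vector and intercept; the signed distance to w.x + b = 0 is returned."""
    w = np.asarray(w, dtype=float)
    return (points @ w + b) / np.linalg.norm(w)
```

Group differences along the projected axis can then be tested with a Mann-Whitney test, as in the figure.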
Thus, our findings indicate that unsupervised clustering of 3D-reconstructed spontaneous locomotor behavior based on skeletal points can effectively quantify and differentiate WT, PD, and LID mice. Based on these results, we established a behavioral pattern dataset for the three groups of mice.
Identification of behavioral signatures of the mouse disease model based on hypergraph self-attention networks
Animal behavior manifests as a hierarchical structure of pose dynamics constructed in a bottom-up manner based on temporal changes. The basic movement units of an individual (such as joint angle variations and gait rhythms) are combined and coordinated over time to gradually form higher-level spatial movement patterns (such as walking, exploring, and feeding), ultimately constituting a complete behavioral sequence25,36,37,38. In order to achieve efficient recognition and classification of the spatiotemporal characteristics of complex behavioral dynamics, we employed a hypergraph neural network model incorporating a self-attention mechanism38 to fully explore the higher-order relationships and collaborative structures among multiple nodes during the behavior process.
The architecture of the Hypergraph Self-attention Neural Network is composed of multiple components (Fig. 4a). To capture the collaboration and interaction among multiple body joints, the 3D skeletal point data of the mouse is fed into the Hypergraph Self-Attention (HyperSA) module for spatial feature modeling. We employed the standard Transformer architecture, applying Layer Normalization (LN) before the multi-head hypergraph self-attention (HyperSA) layer and enhancing stability and non-linear expression capability through residual connections and ReLU activation. The internal structure of HyperSA is illustrated in Fig. 4b: the input features are first mapped through three linear transformations to generate query (Q), key (K), and value (V) vectors. During initialization, hyperedges are constructed based on a partition of body parts (Fig. 4e). Subsequently, attention scores are constructed in conjunction with three types of structure-enhanced information: k-hop relative position encoding based on the shortest path between joints (structural distance embedding), hyperedge embedding representation obtained through hypergraph structure aggregation, and global attention bias capturing different structural preferences. These structural pieces of information are weighted and summed with the result of Q·K to form a structure-aware attention matrix, which, after Softmax normalization, is used to weight-sum the V vectors, ultimately yielding a spatial feature representation that integrates local connectivity and global structure awareness (see Methods for details).
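A single-head, numpy-only sketch of the attention computation described above; for illustration, the three structural terms (k-hop distance embedding, hyperedge embedding, and global bias) are reduced to precomputed (J, J) bias matrices, whereas in the full model they are learned:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hyper_self_attention(X, Wq, Wk, Wv, hop_bias, hyperedge_bias, global_bias):
    """Single-head structure-aware self-attention over J joints.

    X: (J, d) joint features. The three (J, J) bias matrices stand in for
    the k-hop distance, hyperedge, and global-preference terms."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Scaled dot-product scores, augmented with structural biases.
    scores = Q @ K.T / np.sqrt(K.shape[1])
    scores = scores + hop_bias + hyperedge_bias + global_bias
    A = softmax(scores, axis=-1)          # structure-aware attention matrix
    return A @ V, A                       # weighted values and attention map
```

The attention map A is the quantity visualized in Fig. 4d, f, g.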
A Workflow and detailed structure of the hypergraph attention network. B Structural details and process of the HyperSA module. C Confusion matrix for 15 behavior categories, with the vertical axis representing the actual categories and the horizontal axis representing the predicted categories. D Attention weight matrix. Distribution of attention weights between queries and keys in the Hyper Self-Attention module. E Hyperedge initialization graph. Hyperedges are constructed by grouping joints within the same semantically defined body region (e.g., head, limbs). F Attention distribution map of the head region with the nose as the query joint. G Distribution of attention in the limb region was mapped with the back as the query joint. Red lines indicate inter-joint attention weights, with darker colors indicating higher weights.
To capture higher-order temporal dependencies between movement sequences, after spatial modeling, the features are fed into the Temporal Conv module. Considering the complex dynamic characteristics of mouse pose sequences in the temporal dimension, we employ a Multi-scale Temporal Convolution (MS-TC) module. This module works through multiple convolutional branches in parallel, each of which first reduces the channel dimension using a 1 × 1 convolution and then employs different kernel sizes and dilation rates to capture motion features at various temporal scales. After L layers of feature extraction, the output undergoes dimensionality reduction and feature integration through Global Average Pooling, is mapped via a fully connected layer, and ultimately yields class probabilities through the Softmax layer for behavior classification tasks.
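The core operation of each MS-TC branch can be sketched as a "same"-padded dilated 1-D convolution; the branch outputs are averaged here as a simple stand-in for the fusion used in the actual module, and the 1 × 1 channel reduction is omitted:

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """'Same'-padded 1-D dilated convolution along time (single channel)."""
    k = len(kernel)
    span = (k - 1) * dilation
    xp = np.pad(x, (span // 2, span - span // 2))
    return np.array([
        sum(kernel[j] * xp[t + j * dilation] for j in range(k))
        for t in range(len(x))
    ])

def ms_tc(x, kernels_dilations):
    """Multi-scale temporal branches run in parallel; larger dilations see
    longer temporal context. Outputs are averaged as a stand-in for fusion."""
    outs = [dilated_conv1d(x, k, d) for k, d in kernels_dilations]
    return np.mean(outs, axis=0)
```

Different (kernel size, dilation) pairs let the branches capture motion features at different temporal scales, as described above.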
The hypergraph self-attention neural network is constructed by alternately stacking HyperSA and temporal convolution layers.
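The overall forward pass can be sketched as alternating spatial and temporal blocks followed by global average pooling, a fully connected layer, and Softmax; spatial_fn and temporal_fn below are placeholders for the HyperSA and MS-TC modules:

```python
import numpy as np

def hypergraph_net_forward(X, blocks, W_fc):
    """X: (T, J, d) pose features over T frames and J joints.

    blocks: list of (spatial_fn, temporal_fn) pairs applied alternately,
    standing in for the stacked HyperSA / MS-TC layers.
    W_fc: (d, n_classes) weights of the final fully connected layer."""
    for spatial_fn, temporal_fn in blocks:
        X = temporal_fn(spatial_fn(X))
    feat = X.mean(axis=(0, 1))            # global average pooling
    logits = feat @ W_fc                  # fully connected layer
    e = np.exp(logits - logits.max())
    return e / e.sum()                    # Softmax class probabilities
```

The output is a probability distribution over the 15 behavior categories.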
We validated the performance of this model using behavioral analysis data from WT, PD, and LID mice. We first performed rigorous data cleaning and preprocessing on 15 predefined behavioral categories, ultimately constructing a behavioral database containing 68,311 behavioral segments. This database was partitioned into training and test sets at an 8:2 ratio for training and validating the model.
We conducted a comprehensive evaluation of the model’s classification performance using the established dataset of behaviors from three types of mice. We employed accuracy, macro-F1, and weighted F1 as evaluation metrics, benchmarking our method against ST-GCN39 and BlockGCN40. The results demonstrate that our method outperforms both baseline models across all metrics, achieving an accuracy of 82.67%, a macro-F1 of 82.20%, and a weighted F1 of 82.61% (Supplementary Table 2). An ablation study was further conducted to demonstrate the effectiveness of the HyperSA module (Supplementary Table 3). The diagonal results of the confusion matrix indicate that although certain behavioral categories, such as Walking and Rising, exhibited slightly lower accuracy, characteristic behaviors of LID, such as the AT-OD category, achieved the highest recognition accuracy of up to 0.95, demonstrating high sensitivity and stability of the algorithm (Fig. 4c). Diagonal entries differed significantly from off-diagonal values, indicating strong specificity and a low misclassification rate across multiple categories. From the confusion matrix, we found that Walking was mainly misclassified as Trotting or Sniffing while walking, likely because Walking itself serves as a foundational movement for both, resulting in structural overlap in locomotor patterns that challenges precise discrimination by the model. Rising was primarily misclassified as Hunching and Sniffing in situ, likely due to similar initial postures based on stationary or standing positions; both Rising and Hunching involve hindlimb support, while head movements during Sniffing in situ might be mistaken for rising movements, thereby increasing classification confusion.
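All three metrics can be computed directly from the 15 × 15 confusion matrix; a minimal sketch, equivalent in spirit to standard macro/weighted F1 averaging:

```python
import numpy as np

def f1_scores(cm):
    """Per-class F1 from a confusion matrix (rows = true, cols = predicted)."""
    tp = np.diag(cm).astype(float)
    prec = tp / np.maximum(cm.sum(axis=0), 1)   # column sums = predicted counts
    rec = tp / np.maximum(cm.sum(axis=1), 1)    # row sums = true counts
    denom = np.maximum(prec + rec, 1e-12)
    return 2 * prec * rec / denom

def summarize(cm):
    """Accuracy, macro-F1 (unweighted mean), and support-weighted F1."""
    f1 = f1_scores(cm)
    support = cm.sum(axis=1)
    return {
        "accuracy": np.diag(cm).sum() / cm.sum(),
        "macro_f1": f1.mean(),
        "weighted_f1": float((f1 * support).sum() / support.sum()),
    }
```

Macro-F1 treats all 15 categories equally, whereas weighted F1 accounts for their unequal segment counts in the 68,311-segment database.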
To elucidate how the hypergraph self-attention neural network captures the spatiotemporal characteristics of mouse movement sequences, we examined the attention weight matrix. The results showed that skeletal points corresponding to the mouse’s head (1, 2) and limbs (7, 8, 9, 10) exhibited higher attention weights during processing, suggesting their critical role in behavioral recognition (Fig. 4d). In the head region, we visualized the attention scores of the mouse during turning using the nose as the query joint. The red lines in the figure represent the attention weights, with darker colors indicating higher weights. In the initial frame (t), the mouse has not yet made a significant turn, and the attention distribution is relatively dispersed. As the behavior progresses, the attention gradually focuses on the joints on the side where the turn will occur. When the turn is completed, a marked asymmetry in the attention distribution between the left and right body joints can be clearly observed—the joints on the inside of the turn are assigned higher attention weights (Fig. 4f). In the limb region, we also visualized the attention scores for the mouse’s transition into an upright posture, using the back as the query joint. During the behavior, the back joint enhances its attention weights with the four limb joints to maintain body balance (Fig. 4g). The above observations are consistent with our understanding of the structure of the behavior itself, indicating that in mouse behavior analysis, the hypergraph attention mechanism can effectively capture the higher-order associations between joints and has good structural perception capabilities.
Taken together, the Hypergraph Self-attention Neural Network demonstrated excellent performance in terms of accuracy, exhibiting strong discriminative power across different behavior categories.
Therapeutic effects of amantadine and clozapine on LID in mice
Amantadine (AMAN) and clozapine (CLZ) are two commonly used pharmacological agents in the treatment of LID41,42,43,44,45. AMAN is believed to alleviate dyskinetic symptoms by suppressing hyperactivity within the basal ganglia circuitry46,47,48, whereas the therapeutic effects of CLZ on motor disturbances are primarily attributed to its antagonistic action on serotonin receptors and dopamine D₂/D₃ receptors8,49,50. Existing studies have shown that these two drugs exhibit potential therapeutic value in improving PD motor complications. However, both drugs have obvious side effects and significant individual differences in efficacy51,52,53,54,55,56. Here, we examined possible differences in the mechanisms of action of these two drugs from the perspective of behavioral quantification.
We administered intraperitoneal injections of AMAN (40 mg/kg) or CLZ (2.5 mg/kg) to LID model mice (Fig. 5a), and continuously recorded their spontaneous behaviors for a 3-h period (Fig. 5b). Based on our trained hypergraph neural network model, we calculated behavioral scores at 5-min intervals and visualized the dynamic changes of the 15 defined behavioral types in LID (n = 12), AMAN-treated (n = 6), and CLZ-treated (n = 6) mice (Fig. 5c).
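The 5-min interval scoring can be sketched as follows, assuming each classified segment carries an onset time in seconds (a simplifying assumption for illustration); the per-window class fractions correspond in spirit to the behavioral scores tracked over the 3-h recording:

```python
import numpy as np

def interval_scores(labels, seg_times, n_classes=15, win=300.0, total=10800.0):
    """Fraction of each behavior class in consecutive 5-min (300 s) windows
    over a 3-h (10800 s) recording.

    labels: predicted class index per segment (from the trained network).
    seg_times: segment onset times in seconds."""
    n_win = int(total // win)
    scores = np.zeros((n_win, n_classes))
    for lab, t in zip(labels, seg_times):
        w = min(int(t // win), n_win - 1)
        scores[w, lab] += 1
    # Normalize each window to a distribution; empty windows stay zero.
    totals = np.maximum(scores.sum(axis=1, keepdims=True), 1)
    return scores / totals
```

Each mouse yields a (36, 15) score matrix, which can then be averaged within groups to plot the dynamic changes per behavior type.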
A AMAN and CLZ dosing schedules. B Time series of mouse behavioral features after AMAN and CLZ administration. C Changes in 15 behavioral features within 3 h after AMAN and CLZ administration (LID = 12, AMAN = 6, CLZ = 6) (****p < 0.0001, ***p < 0.001, **p < 0.01, *p < 0.05, Kolmogorov–Smirnov test, Bonferroni correction). D Three differential efficacy features after AMAN and CLZ treatment among the representative features of LID. E Behavioral transition chord diagrams for mice after AMAN and CLZ administration. Each chord in the diagram represents a behavioral switch from a previous state to a subsequent state, with colors corresponding to the initial state and line thickness reflecting the frequency of that transition type (I, AMAN; II, CLZ; III, LID). F Differences in behavioral transitions between AMAN and LID as well as CLZ and LID. The vertical axis represents the initial state, and the horizontal axis represents the subsequent state. The color intensity indicates the normalized average transition difference (calculated as AMAN minus LID and CLZ minus LID) (I, AMAN-LID; II, CLZ-LID).
Our results showed that AMAN treatment exerted a significant intervention effect on behavioral evolution throughout the 3-hour period. The typical severe dyskinesia phenotype, AT-OD, showed a substantial decline throughout the entire duration, maintaining a low level and demonstrating a strong inhibitory effect and stable control capability (Fig. 5c AT-OD). Notably, the HD-H behavior exhibited a unique phase-specific characteristic under AMAN intervention, with its peak occurrence delayed compared to the LID group (Fig. 5c HD-H).
In contrast, the CLZ-treated group displayed a differentiated intervention pattern. Under CLZ treatment, several non-LID behaviors (such as Grooming, Hunching, Turn grooming, and Rearing) decreased in proportion over the 3 h, reflecting a broad suppression of overall behavioral activity and suggesting a wide-spectrum regulatory effect mediated by central inhibition. CLZ also had a pronounced intervention effect on HD-H behavior, which declined significantly. However, the AT-OD behavior score exhibited a time-dependent increasing trend, which may be related to cumulative side effects during drug metabolism.
The mild dyskinesia behavior HD-M increased in the early, middle, and late stages after intervention with either drug compared with the LID group, suggesting a structural shift of the behavioral spectrum from severe (HD-H) toward milder dyskinesia (HD-M). For the mildest dyskinesia phenotype, HD-L (with obvious rotational manifestations), both drug-treated groups showed higher behavioral levels in the middle and late stages, which may represent a compensatory enhancement of physiological movement (rotation) after severe dyskinesia is inhibited. Moreover, the most severe phenotype, AT-PI, almost completely disappeared after intervention with AMAN or CLZ, with the overall score approaching zero.
Overall, our results showed that, over time, AMAN and CLZ differed in seven features: AT-OD, HD-H, HD-M, Grooming, Rearing, Running, and Trotting, of which AT-OD, HD-H, and HD-M are typical features of LID.
The chord diagram of behavioral transitions and the intergroup heatmap of behavioral transition frequencies further supported the dynamic structural shifts in mouse behavior (Fig. 5e, f). Compared to LID mice, the AMAN-treated group exhibited a significantly reduced transition frequency toward the most severe dyskinetic phenotype, AT-PI, along with an increased frequency of transitions toward the milder HD-M (Fig. 5f Ⅰ). These results suggest that AMAN may inhibit the activity of dyskinesia-related circuits in mice, consistent with its clinical mechanism of enhancing dopamine release, reducing glutamate release, and modulating NMDA receptor activity to suppress overactivation in the basal ganglia region46,47,48. In contrast, although CLZ demonstrated excellent inhibition of body twisting based on behavioral scores, in terms of the transition trends among LID-characteristic movements it only reduced transitions from HD-H to other LID movements and even increased the probability of transitions from other LID-characteristic movements to the more severe AT-PI behavior. Moreover, CLZ administration decreased the transition probabilities among the vast majority of non-LID actions (Fig. 5f Ⅱ). These findings reflect the overall sedative or movement-inhibitory effects of CLZ as an antipsychotic drug and indicate a lack of target specificity in CLZ treatment, consistent with its clinically reported side-effect profile54,55,56.
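The transition structure underlying the chord diagrams and difference heatmaps can be sketched as a row-normalized transition matrix per group, with each heatmap given by the difference of two such matrices (e.g., AMAN minus LID):

```python
import numpy as np

def transition_matrix(labels, n_classes=15):
    """Row-normalized behavior transition frequencies from a label sequence.

    Self-transitions between identical consecutive segments are skipped so
    that only switches between distinct behaviors are counted."""
    M = np.zeros((n_classes, n_classes))
    for a, b in zip(labels[:-1], labels[1:]):
        if a != b:
            M[a, b] += 1
    row = np.maximum(M.sum(axis=1, keepdims=True), 1)
    return M / row
```

For the heatmaps, `transition_matrix(aman_labels) - transition_matrix(lid_labels)` gives the normalized average transition difference, with rows as initial states and columns as subsequent states.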
To evaluate the therapeutic effects of the two drugs, we employed the five kinematic parameters previously mentioned. The results showed that, compared with the LID group, both drugs exhibited varying degrees of recovery trends in multiple indicators (Fig. 6a). In terms of body height, LID mice were significantly higher than the AMAN and CLZ groups, indicating that both drugs alleviated the abnormal behavior of trunk elevation to some extent, with the distribution of CLZ shifting more to the left (Fig. 6a Ⅲ). There was also a significant difference between AMAN and CLZ, suggesting that the two drugs may involve different muscle tone regulation mechanisms in their improvement strategies. Regarding body length, the CLZ group was significantly lower than the LID and AMAN groups, while there was no significant difference between AMAN and LID (Fig. 6a Ⅱ). This indicates that CLZ treatment may lead to noticeable body contraction in mice, which may reflect its potential side effects. However, there were no significant differences among the three groups in body angle, back speed, and body angular velocity (Fig. 6a I, IV, V).
A PMF curves of five kinematic parameters (Ⅰ, back speed; Ⅱ, body length; Ⅲ, body height; Ⅳ, body angle; Ⅴ, body angular velocity) after AMAN and CLZ administration. Bold traces and shaded areas represent mean ± SEM (****p < 0.0001, ***p < 0.001, **p < 0.01, *p < 0.05, Kolmogorov–Smirnov test with Bonferroni correction). B Classification of behavioral outcomes 180 min after AMAN and CLZ administration using SVM in the PC4–PC5 plane. C Statistics of AMAN and CLZ administration: projection values obtained by projecting the points in panel B onto the decision boundary (p = 0.00216, Mann–Whitney test). D Classification of behavioral outcomes 180 min after AMAN and CLZ administration, together with the LID group, using SVM in the PC3–PC5 plane. E Comparison of AMAN and CLZ administration with the LID group: projection values obtained by projecting the points in panel D onto the decision boundary (Ⅰ, AMAN vs. LID, p = 0.00022; Ⅱ, CLZ vs. LID, p = 0.00129; Mann–Whitney test).
To further examine the behavioral differences between AMAN and CLZ treatments, we performed a classification analysis using PCA and SVM, based on each animal’s score across 15 defined behavioral categories. In the 2D PCA space, the AMAN and CLZ groups exhibited a clear distributional shift, forming a distinct boundary in the visualization (Fig. 6b). Projection of the data onto the 1D SVM decision axis revealed a significant difference in behavioral distribution between the two groups (Fig. 6c).
To comprehensively assess the structural differences in behavior between treatment and disease states, we incorporated all three groups (LID, AMAN, and CLZ) into a unified classification model. The resulting PCA embedding revealed three relatively distinct cluster structures in 2D space (Fig. 6d). Although some individuals appeared at cluster boundaries or showed limited overlap, statistical analysis of the projection values indicated significant differences between both AMAN and LID, and CLZ and LID (Fig. 6e I, II). These findings reflect differences in the behavioral features of AMAN and CLZ efficacy.
In summary, we found that AMAN and CLZ treatments can be clearly distinguished in terms of behavioral representation, with differences in efficacy manifested in three behavioral features: AT-OD, HD-H, and HD-M (Fig. 5d).
Discussion
Based on the computer vision approach, we have constructed a standardized dataset of behavioral data for WT, PD, and L-DOPA-induced LID mice using 3D multi-view motion capture technology and machine learning methods.
To account for the spatiotemporal complexity of mouse behaviors, we propose a framework based on Hypergraph Self-attentive Neural Network, which has identified the behavioral features of WT, PD, and L-DOPA-induced LID mice and demonstrated the reliability of this algorithmic framework. We further applied this framework to evaluate behavioral-scale differences between AMAN and CLZ treatments in LID mice, providing a clear algorithmic framework for efficacy quantification.
Traditional animal behavior analysis methods rely mainly on manual observation and subjective scoring (e.g., AIMs13), which is not only time-consuming and labor-intensive but also highly susceptible to the scorer's subjective bias14, severely limiting the objectivity and reliability of such studies. Unsupervised clustering approaches to behavior recognition require joint modeling of the entire dataset, which significantly increases the scale of data that must be processed at once, elevates the demand for computational resources, and limits the applicability and clinical translational potential of large-scale analysis21. The framework we propose combines the advantages of unsupervised clustering and supervised learning, fully exploiting the fine-grained motion information provided by high-precision DeepLabCut pose estimation27. By using unsupervised behavior clustering to generate labels for supervised network training, we achieve fully automatic, efficient, and objective behavior classification. This design not only reduces human interference but also enables rapid and accurate processing of large data batches, significantly enhancing the reproducibility and generalizability of behavior analysis.
However, the framework of this study still faces some pressing challenges. In particular, frequent joint occlusion in complex animal behavior data, especially prominent in movements such as the head-and-neck rotation or limb twisting exhibited by LID mice, can easily lead to errors in skeletal point recognition. Designing occlusion-robust algorithms, or integrating higher-dimensional and more structured spatial information to ensure accurate behavioral tracking, is therefore an important direction for future research. Beyond visible-spectrum video, two complementary sensing modalities are viable alternatives. Radio-frequency (RF) sensing is independent of ambient light, requires only a relaxed line-of-sight, and has already enabled contactless, longitudinal tracking of PD severity, progression, and medication response57. Separately, depth-sensing cameras (e.g., RGB-D/Kinect) maintain accurate pose and kinematic estimation under low-light conditions and have been successfully applied to objectively quantify bradykinesia and related motor impairments58. Future efforts could therefore focus on fusing these technologies to overcome the limitations of any single sensing modality. In addition, although the current hypergraph structure can effectively express high-order relationships between joints, it still falls short in highlighting the importance of local key patterns. Future algorithm development should strengthen the modeling of how local detail features influence global behavior.
How to objectively quantify the changes in posture and time scale caused by drug effects is an urgent issue that needs to be clarified in the clinical treatment of Parkinson’s disease. This study provides a quantitative analysis basis for solving this problem by elaborately depicting the differences in movement behavior characteristics between WT, PD, and LID mice. Based on the behavioral dynamic monitoring data within three hours after drug administration, we found that the two drugs showed differentiated therapeutic effects in the temporal dimension. The behavioral assessment results showed that both AMAN and CLZ could improve specific abnormal movement patterns, with AMAN showing particularly significant improvement in limb movement coordination and effectively promoting the recovery of physiological movement patterns. In contrast, although the CLZ group generally reduced the scores of abnormal movement features in various dimensions, the proportion of dystonia induction significantly increased, characterized by characteristic crawling postures and tail rigidity symptoms, with considerable individual variability in response. It is worth noting that the differences in the behavioral state transition probabilities of the two drugs in LID mice suggest that they may regulate abnormal movements through different neural circuits. The non-specific inhibitory characteristics shown by CLZ also reveal to some extent its greater side effects54,55,56. These quantitative analyses of drug efficacy differences provide reference value for the study of cell biology and neural circuits.
Although the behavior analysis framework of this study has shown excellent performance in identifying motor symptoms and has uncovered some indicators that could not be evaluated by AIM scoring, the clinical symptoms of Parkinson’s disease patients also include non-motor symptoms such as cognitive impairment and emotional abnormalities59,60,61. Future research needs to further expand and improve the current behavior analysis framework by integrating multimodal data such as EEG, cognitive tests, and emotional assessments to develop a more comprehensive and refined disease evaluation system. This will enable a more comprehensive revelation of the pathological mechanisms of Parkinson’s disease and the overall therapeutic effects of interventions. In addition, further improving the interpretability and transparency of the framework is also an important direction for future research. This is not only crucial for enhancing the credibility and clinical acceptance of the model but also beneficial for the translational application in the field of neuro-mechanistic research and medicine.
Methods
Animals
The animals used in this study were C57BL/6N mice, obtained from Shanghai Model Organisms Company. This strain is widely utilized in research due to its well-characterized genetic profile and consistent phenotype. The mice were housed in a controlled environment with a 12-h light/dark cycle, maintained at a temperature of 22 ± 2 °C and 50 ± 10% relative humidity. They were kept in standard plastic cages with unrestricted access to food and water, ensuring optimal conditions for their health and well-being. Prior to the experiments, the mice were allowed to acclimate to the laboratory environment for at least one week. All animal procedures and experiments were performed under guidelines approved by the Institutional Animal Care and Use Committee (IACUC) at Fudan University Shanghai Medical College. At the end of experiments, mice were euthanized by exposure to carbon dioxide (CO₂), in accordance with institutional guidelines, to minimize pain and distress.
6-OHDA lesion
To establish a unilateral dopamine depletion model, mice received injections of 6-hydroxydopamine hydrochloride (6-OHDA-HCl, Sigma-Aldrich) at two sites in the left striatum. The control group received an equivalent volume of saline containing 0.02% ascorbic acid (Sangon Biotech) at the same injection sites. Mice were anesthetized with 3–4% isoflurane (RWD Life Science) during induction and maintained under 1–1.5% isoflurane for the duration of the surgery. The animals were positioned in a stereotaxic frame (Stoelting Ltd.) for accurate injection placement. The 6-OHDA-HCl was dissolved in saline containing 0.02% ascorbic acid, achieving a final concentration of 3.6 mg/mL.
For the injection, 2 µL of the 6-OHDA solution was delivered to each of the two target sites in the sensorimotor region of the left striatum using the following coordinates (from bregma): Site 1: anteroposterior (AP) + 1.00 mm, mediolateral (ML) + 2.10 mm, dorsoventral (DV) −3.4 mm; Site 2: AP + 0.30 mm, ML + 2.30 mm, DV −3.3 mm. The injections were performed at a rate of 0.2 µL/min using a 10 µL Hamilton syringe. After the injection, the glass needle was left in place for an additional 10 min to allow for optimal diffusion of the toxin, before being slowly withdrawn. The scalp was then sutured, and the mice were allowed to recover.
In the postoperative period, mice were carefully monitored, and their cages were placed on heating pads to maintain body temperature.
LID AIM score
To assess the success of the lesion and select well-modeled mice, apomorphine-induced rotation was performed one day prior to virus injection. Mice that exhibited at least 7 full-body contralateral rotations per minute (≥7 rpm)62 after receiving an apomorphine injection (0.5 mg/kg, s.c., Sigma-Aldrich) were considered successfully lesioned and suitable for further testing. Two weeks after the 6-OHDA lesion, daily administration of L-dopa (levodopa, 10 mg/kg, i.p., Sangon Biotech) along with benserazide hydrochloride (8 mg/kg, Sigma-Aldrich) began to evaluate the development of levodopa-induced dyskinesia (LID). Control mice were injected with saline on the same dosing schedule. Before the test, mice were habituated to the experimental setup, including a circular open-field arena (transparent acrylic cylinder, 30 cm diameter), for 30 min per day for three days. On the test day, animal behavior was recorded using four synchronized RealSense D435i cameras.
LID severity was evaluated during a 1-min observational period every 20 min according to the standard AIMs (Abnormal Involuntary Movements) rating scale63. This scale comprises three categories: 1. Axial AIMs: Torsional (dystonic) movements of the neck and upper trunk toward the contralateral side of the lesion. 2. Limb AIMs: Hyperkinetic or mixed dystonic-hyperkinetic movements of the forelimbs. 3. Orofacial AIMs: Involuntary movements of the jaw, tongue protrusion, or severe cases of self-biting. Each body segment was scored on a scale from 0 to 4 based on the duration of AIMs observed. The total AIMs score, the sum of axial, limb, and orofacial scores, could reach a maximum of 12 points in the most severe cases. For all behavioral assessments, AIMs were graded blindly by two independent raters to ensure objectivity and consistency.
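The scoring arithmetic described above (three segments, each graded 0–4, summed to a maximum of 12) can be expressed as a small validation helper. This is a hypothetical illustration of the scale's bookkeeping, not code from the study:

```python
def total_aims_score(axial: int, limb: int, orofacial: int) -> int:
    """Sum per-segment AIMs scores (each 0-4) into a total score (max 12)."""
    for name, s in (("axial", axial), ("limb", limb), ("orofacial", orofacial)):
        if not 0 <= s <= 4:
            raise ValueError(f"{name} score must be in 0..4, got {s}")
    return axial + limb + orofacial
```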
Drug
The following pharmacological agents were commercially acquired: 6-hydroxydopamine hydrochloride (H4381, 100 mg), apomorphine (A43932, 50 mg), 3,4-dihydroxy-L-phenylalanine (L-DOPA; SML0091, 25 mg), and amantadine hydrochloride (A1260, 5 g) from Sigma-Aldrich. Clozapine (0444, 50 mg) was procured from Tocris Bioscience. Vehicle solutions included 0.9% normal saline (NS) for amantadine and 4.8 mg/mL dimethyl sulfoxide (DMSO; Sigma-Aldrich D8418-250ML) for clozapine. Administration protocols comprised intraperitoneal (i.p.) injections of amantadine and clozapine 30 min and 5 min prior to L-DOPA treatment, respectively49. All compounds were prepared in accordance with manufacturer guidelines and administered via sterile techniques.
3D multi-view motion capture system
To achieve high-precision recording of spontaneous mouse behaviors, we constructed a multi-view, synchronized acquisition and 3D reconstruction system for behavioral tracking. The system employed four Intel RealSense D435i cameras, mounted at 90-degree intervals around a 90 × 90 × 90 cm stainless-steel frame, enclosing the behavioral arena. Each camera was connected to a data acquisition workstation via USB 3.0 and synchronized image capture was implemented using custom Python scripts based on the pyrealsense2 SDK. Video streams were recorded at a resolution of 1280 × 720 pixels and a frame rate of 30 frames per second. Prior to data collection, the camera array was calibrated using the Stereo Camera Calibrator toolbox in MATLAB. A 6 × 7 checkerboard pattern was imaged to extract intrinsic and extrinsic camera parameters, correct lens distortion, and optimize spatial alignment across the camera views, thereby ensuring the accuracy and consistency of 3D reconstruction. Behavioral assessments were conducted in a standard open field test (OFT) paradigm, where mice were allowed to explore freely for a fixed duration to capture spontaneous locomotor activity. The arena consisted of a transparent acrylic cylinder with a height of 20 cm and a diameter of 35 cm, mounted on a white plastic baseplate. A 36-inch monitor was suspended centrally above the arena to provide uniform white illumination, minimizing environmental light artifacts. Multiscale behavioral data were recorded from WT, PD, LID, and drug-treated mice, providing a reliable basis for subsequent attitude analysis and dynamics modeling.
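Synchronizing four independent USB cameras ultimately reduces to matching frames across streams by timestamp. The sketch below illustrates one way this could be done, under the assumption of nearest-neighbor matching within half a frame period; the function name and tolerance are ours, not taken from the authors' acquisition scripts:

```python
from bisect import bisect_left

def align_frames(ref_ts, other_ts, tol_ms=1000.0 / 30 / 2):
    """Match each reference-camera frame timestamp to the nearest timestamp
    in another (sorted) camera stream; keep only pairs within tol_ms.
    Returns a list of (ref_index, other_index) pairs."""
    pairs = []
    for i, t in enumerate(ref_ts):
        j = bisect_left(other_ts, t)
        candidates = [k for k in (j - 1, j) if 0 <= k < len(other_ts)]
        best = min(candidates, key=lambda k: abs(other_ts[k] - t))
        if abs(other_ts[best] - t) <= tol_ms:
            pairs.append((i, best))
    return pairs
```

At 30 fps, a half-period tolerance (~16.7 ms) guarantees that each reference frame is paired with at most one frame from each other camera.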
Pose estimation based on DeepLabCut
In this study, we employed DeepLabCut27 for pose tracking of freely moving mice. DeepLabCut is a deep learning-based, markerless pose estimation toolbox that enables precise identification of anatomical landmarks on the animal’s body without the need for external markers.
We used DeepLabCut version 3.0 for pose estimation, running on Python 3.10 with a TensorFlow backend and an NVIDIA GeForce RTX 3090 GPU. A total of 8000 representative frames were extracted from behavior videos across all experimental groups. Each frame was manually annotated with 16 anatomical keypoints, including the nose, neck, left and right forelimbs, left and right hindlimbs, left and right forepaws, left and right hindpaws, back, tail base, mid-tail, and tail tip. These labeled frames were used to construct the training dataset. The dataset was split into training and testing subsets in a 95% to 5% ratio. A ResNet-50 backbone was used for model training, which was conducted for 1,000,000 iterations with the objective of minimizing cross-entropy loss. The resulting model achieved a mean test error of less than 3.5 pixels, indicating high prediction accuracy. In the inference phase, the trained model was applied to all behavioral video frames to extract the 2D coordinates (x, y) and confidence scores of each keypoint (see Supplementary Data 1).
3D pose reconstruction based on Pose3D
To achieve high-precision 3D pose reconstruction of freely behaving animals, we employed Pose3D64 to reconstruct 3D skeletal trajectories from the multi-view 2D keypoints predicted by DeepLabCut. Pose3D is a lightweight, geometry-based multi-view pose estimation tool that supports synchronized multi-camera input and standard camera calibration formats, and is fully compatible with DeepLabCut outputs. During data collection, the four synchronized cameras simultaneously captured mouse behavior from different angles, with any two views sufficient for triangulation. The intrinsic and extrinsic parameters of each camera were obtained using the OpenCV calibration pipeline and imported into Pose3D in JSON format. For each camera view, DeepLabCut was used to extract 2D keypoint coordinates and associated confidence scores for each anatomical landmark in every frame. Pose3D aggregated the multi-view keypoints and computed the 3D positions of each joint using linear least squares triangulation and the Direct Linear Transform (DLT) method. To ensure robustness of pose reconstruction, only keypoints with confidence scores above 0.9 were retained for triangulation. Missing keypoints in the temporal sequence were interpolated using a local temporal window (Supplementary Table 4). The resulting 3D trajectories were subsequently smoothed using Gaussian or low-pass filtering to reduce noise and eliminate frame-to-frame jitter.
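The DLT triangulation step can be illustrated with a minimal numpy sketch: each view contributes two linear constraints on the homogeneous 3D point, and the least-squares solution is the smallest right singular vector. This is a generic textbook implementation for intuition, not Pose3D's internal code:

```python
import numpy as np

def triangulate_dlt(proj_mats, points_2d):
    """Linear least-squares (DLT) triangulation of one 3D point from >=2 views.
    proj_mats: list of 3x4 camera projection matrices; points_2d: list of (u, v)."""
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each observation (u, v) gives two rows of the homogeneous system A X = 0.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.asarray(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

With calibrated projection matrices from all views whose keypoint confidence exceeds the threshold, the same system simply gains two rows per additional camera.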
Validation of 3D reconstruction performance
To validate the robustness of the 3D motion-capture system under conditions of body-part occlusion or viewpoint disappearance, we evaluated reconstruction performance based on multi-camera recordings. Since 3D reconstruction only requires any two cameras to obtain the 2D coordinates of the same point from different perspectives, the number of available cameras determines reconstruction reliability21. We applied a pose estimation likelihood threshold of >0.9 to determine whether a given camera was used and calculated the proportion of available cameras for each body part (Supplementary Fig. 3).
Behavior atlas analysis
In this study, we utilized the machine learning-based behavior analysis tool, Behavior Atlas21, to identify and classify mouse behaviors. Multi-view recordings were first integrated to construct a 3D skeletal model of each mouse. During the behavior analysis phase, both kinematic features (e.g., velocity) and non-kinematic features (e.g., local actions such as sniffing) were extracted to decompose and cluster behavior patterns. Dynamic Time Alignment Kernels (DTAK) were employed to quantify similarity between non-kinematic features, and high-dimensional behavioral representations were embedded into a 2D space using Uniform Manifold Approximation and Projection (UMAP). Based on these high-dimensional features, an unsupervised clustering algorithm was applied to classify spontaneous behaviors into 40 distinct clusters, each representing a characteristic behavioral motif observed during the experiments. To improve clustering accuracy, representative video clips from each cluster were manually reviewed, and clusters with over 80% consistency across annotations were labeled. Finally, the 40 behavioral clusters were consolidated into 15 behavior categories. This approach effectively minimized bias associated with subjective labeling and enhanced the scientific rigor and reproducibility of behavior identification. The resulting behavioral feature space and classification framework enabled precise decomposition of complex mouse behavior, providing a robust foundation for subsequent behavioral phenotyping and comparative analyses.
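To give a concrete sense of the DTAK similarity step, the sketch below implements one common formulation of the dynamic time alignment kernel (a Gaussian local kernel combined with a dynamic-programming alignment, normalized by path length). This is an illustrative variant and not necessarily the exact formulation used inside Behavior Atlas:

```python
import numpy as np

def dtak(X, Y, sigma=1.0):
    """Dynamic time alignment kernel between sequences X (n, d) and Y (m, d).
    Uses a Gaussian local kernel and a DP over alignment paths, normalized
    by path length so that dtak(X, X) == 1."""
    n, m = len(X), len(Y)
    # Local Gaussian kernel between every pair of frames.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma**2))
    G = np.full((n + 1, m + 1), -np.inf)
    G[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = K[i - 1, j - 1]
            # Diagonal steps are weighted twice, as in DTW-style normalization.
            G[i, j] = max(G[i - 1, j] + k, G[i - 1, j - 1] + 2 * k, G[i, j - 1] + k)
    return G[n, m] / (n + m)
```

The resulting pairwise similarity matrix between behavioral segments is what gets embedded by UMAP before unsupervised clustering.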
Evaluation of posture similarity
Video segments corresponding to 40 behavioral motifs were examined and preliminarily labeled according to naming conventions described in the literature (Table 1). Following approaches commonly used in previous studies18,19,21,65,66, we then compared quantitative features, including mean speed, mean body angle, mean body length, and mean body height (Supplementary Fig. 4a), and integrated these with the relative positions of motifs in the three-dimensional UMAP embedding space (Supplementary Fig. 4c) as well as their statistical trends (Supplementary Fig. 4b) across different mouse groups. Motifs were merged only when both quantitative features and statistical trends consistently supported their similarity, and the entire process was cross-validated by multiple researchers to minimize bias. The discriminability among categories was further evaluated using t-SNE visualization and logistic regression analysis (Supplementary Fig. 4d).
Hypergraph self-attention neural network
We adopted a state-of-the-art Hypergraph Self-attention Neural Network38 to learn pose representations from both spontaneous and dyskinetic behaviors across 15 defined behavioral categories, with the goal of identifying complex behavioral phenotypes and predicting behavior patterns. The model takes as input the sequential 3D skeletal coordinates reconstructed from multi-view recordings. It is specifically designed to capture the multi-joint coordination and temporal dynamics exhibited by mice during both normal and abnormal movements.
For each frame of pose data, a hypergraph H = (V, Ɛ) is constructed, where each hyperedge \({e}_{k}\in {\mathcal{E}}\) connects a group of joints \({e}_{k}\subset V\) that share high-order functional synergy (e.g., ipsilateral forelimb and hindlimb, or coordination between neck and forepaw). The hypergraph is represented by a joint–hyperedge incidence matrix \(H\in {{\mathbb{R}}}^{|{\mathcal{E}}|\times |V|}\), where \({H}_{k,i}=1\) indicates that the \(i\)-th joint is a member of the \(k\)-th hyperedge. To compute the feature representation of each hyperedge, the model aggregates the features of all joints connected by that hyperedge using the following formulation:
Here, \(X\in {{\mathbb{R}}}^{|V|\times C}\) denotes the embedded feature representation of each joint, \({W}_{e}\in {{\mathbb{R}}}^{C\times C}\) is the linear projection matrix, and \({D}_{e}\) is the hyperedge degree matrix. The aggregated hyperedge features are then broadcast back to their corresponding joints to construct an enhanced hyperedge-aware representation:
This process is equivalent to assigning each joint the contextual feature information of its associated structural group (i.e., hyperedge), which is then used for subsequent attention computation.
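The aggregate-then-broadcast step can be made concrete in a few lines of numpy, using the symbols defined above (H, X, W_e, D_e). The hyperedge groupings below are hypothetical examples for illustration, not the model's learned structure:

```python
import numpy as np

rng = np.random.default_rng(0)
num_joints, C = 16, 8
# Incidence matrix H (|E| x |V|): each row marks the joints in one hyperedge.
H = np.zeros((3, num_joints))
H[0, [2, 4, 6, 8]] = 1   # hypothetical limb-synergy hyperedge
H[1, [0, 1, 6]] = 1      # hypothetical nose-neck-forepaw hyperedge
H[2, [10, 11, 12]] = 1   # hypothetical trunk-tail hyperedge

X = rng.standard_normal((num_joints, C))   # per-joint embedded features
We = rng.standard_normal((C, C))           # linear projection W_e
De_inv = np.diag(1.0 / H.sum(axis=1))      # inverse hyperedge degree matrix

E = De_inv @ H @ X @ We   # mean-aggregate joint features per hyperedge
X_hyper = H.T @ E         # broadcast hyperedge features back to member joints
```

Each joint thus receives the contextual feature of every structural group it belongs to, which is what the subsequent attention computation consumes.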
To enable the self-attention mechanism to incorporate the topological structure of joint connectivity, the model introduces a k-hop relative positional embedding (k-hop RPE) mechanism based on graph distance. Specifically, the shortest path length \(\phi (i,j)\) between each pair of joints is first computed, and the corresponding positional bias \({R}_{\phi (i,j)}\) is retrieved from a learnable embedding table \(R\in {{\mathbb{R}}}^{K\times C}\). This bias is then added to the attention score computation, allowing the model to perceive the structural distance between any two joints in the skeleton and encouraging attention to focus on topologically adjacent regions.
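The hop distances \(\phi (i,j)\) on a skeleton graph can be precomputed once with a breadth-first search from each joint; a minimal sketch (edge list and names are ours for illustration):

```python
from collections import deque

def hop_distances(edges, num_joints):
    """All-pairs shortest hop distances on the skeleton graph via per-node BFS."""
    adj = [[] for _ in range(num_joints)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = [[-1] * num_joints for _ in range(num_joints)]
    for s in range(num_joints):
        dist[s][s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if dist[s][v] == -1:
                    dist[s][v] = dist[s][u] + 1
                    q.append(v)
    return dist
```

The resulting distance matrix indexes the learnable table R of shape (K, C), with distances larger than K − 1 typically clipped to the last row.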
Building upon the above architectural design, the model extends the conventional self-attention mechanism with a Hypergraph Self-Attention (HyperSA) module, which integrates four types of attention bias terms:
Here, \({q}_{i}\) and \({k}_{j}\) denote the query and key vectors of joints \(i\) and \(j\), respectively, and \(u\) is a learnable global bias vector. The final attention output is computed as:
Here, \({B}_{{ij}}\) denotes a learnable relational bias term, which models the asymmetric interactions between heterogeneous joint pairs.
The complete model consists of multiple layers of alternating HyperSA modules and multi-scale temporal convolution (MS-TC) blocks. After processing, global average pooling is applied for dimensionality reduction, followed by a fully connected layer and a Softmax classifier to output the probability distribution over 15 behavioral categories.
To reduce within-group variability while preserving the original data structure, a data cleaning process was applied. The final dataset consisted of 68,311 samples, randomly split into a training set (54,648 samples) and a test set (13,663 samples). The model learns to discriminate spontaneous from dyskinetic behaviors on the training set, and its predictive performance is verified on the test set. Predictions are evaluated by accuracy, defined as the proportion of predicted labels that match the annotated labels; values range from 0 to 1, with higher values indicating better predictive performance.
ST-GCN and BlockGCN
For comparative analysis, we benchmarked our method against two graph-based baseline approaches: ST-GCN39 and BlockGCN40. ST-GCN (Spatial-Temporal Graph Convolutional Networks) is a pioneering framework that models spatial-temporal dependencies by graph convolutions on human skeleton sequences. BlockGCN extends ST-GCN by introducing hypergraph connections to capture higher-order correlations, thereby enhancing the representation capacity for complex structural relationships. Both baselines were implemented and evaluated under the same experimental settings as our proposed method for fair comparison. The performance was assessed using five standard classification metrics: Accuracy, Macro Precision, Macro Recall, Macro F1 (MF1), and Weighted F1. These metrics are defined as follows:

\(\mathrm{Accuracy}=\frac{{TP}+{TN}}{{TP}+{TN}+{FP}+{FN}},\quad \mathrm{Macro\ Precision}=\frac{1}{C}\sum_{i=1}^{C}\frac{{{TP}}_{i}}{{{TP}}_{i}+{{FP}}_{i}},\quad \mathrm{Macro\ Recall}=\frac{1}{C}\sum_{i=1}^{C}\frac{{{TP}}_{i}}{{{TP}}_{i}+{{FN}}_{i}}\)

\({F1}_{i}=\frac{2\cdot {\mathrm{Precision}}_{i}\cdot {\mathrm{Recall}}_{i}}{{\mathrm{Precision}}_{i}+{\mathrm{Recall}}_{i}},\quad \mathrm{MF1}=\frac{1}{C}\sum_{i=1}^{C}{F1}_{i},\quad \mathrm{Weighted\ F1}=\sum_{i=1}^{C}{w}_{i}\cdot {F1}_{i}\)
where \({w}_{i}=\frac{{n}_{i}}{N}\), \({n}_{i}\) is the number of instances in class i, and N is the total number of instances. \({TP}\), \({TN}\), FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively. \(C\) represents the total number of classes. The macro-averaged metrics treat all classes equally, while the Weighted F1 accounts for class imbalance by weighting each class’s F1-score by its support.
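The five metrics can be computed directly from predicted and ground-truth label arrays; a self-contained sketch (hypothetical function name) that mirrors the definitions above:

```python
import numpy as np

def classification_metrics(y_true, y_pred, num_classes):
    """Accuracy, macro precision/recall/F1, and weighted F1 from label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    prec, rec, f1, support = [], [], [], []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        prec.append(p); rec.append(r); f1.append(f)
        support.append(np.sum(y_true == c))
    w = np.asarray(support) / len(y_true)  # class weights w_i = n_i / N
    return {
        "accuracy": float(np.mean(y_true == y_pred)),
        "macro_precision": float(np.mean(prec)),
        "macro_recall": float(np.mean(rec)),
        "macro_f1": float(np.mean(f1)),
        "weighted_f1": float(np.sum(w * np.asarray(f1))),
    }
```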
Kalman filter and particle filter
To ensure robustness in trajectory reconstruction, we evaluated multiple filtering approaches including linear interpolation, Kalman filtering, and particle filtering. Kalman filtering is an optimal recursive estimation algorithm that uses a series of measurements over time to produce estimates of unknown variables. Particle filtering is a sequential Monte Carlo method particularly effective for non-linear and non-Gaussian estimation problems. In our implementation, we compared these methods by randomly dropping 1% of high-confidence tracking points (confidence > 0.9) and reconstructing them. The performance was quantified using Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) in millimeters (mm), with the respective formulas given as:

\(\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{({y}_{i}-{\hat{y}}_{i})}^{2}},\qquad \mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{y}_{i}-{\hat{y}}_{i}\right|\)
where \({y}_{i}\) represents the observed or ground truth value, \({\hat{y}}_{i}\) represents the predicted value, and \(n\) is the total number of data points.
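The drop-and-reconstruct evaluation can be sketched end to end on a synthetic trajectory, here using linear interpolation as the simplest of the three compared methods (the trajectory, drop fraction, and function names are illustrative):

```python
import numpy as np

def rmse_mae(y, y_hat):
    """RMSE and MAE between ground-truth and reconstructed values."""
    err = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean(err**2))), float(np.mean(np.abs(err)))

rng = np.random.default_rng(1)
t = np.arange(300)
traj = np.sin(t / 20.0) * 50.0                      # synthetic 1D trajectory (mm)
drop = rng.choice(t[1:-1], size=3, replace=False)   # drop ~1% of interior points
kept = np.setdiff1d(t, drop)
recon = np.interp(drop, kept, traj[kept])           # reconstruct dropped points
rmse, mae = rmse_mae(traj[drop], recon)
```

Substituting a Kalman or particle filter for `np.interp` in the reconstruction step yields the corresponding method's error on identical dropped points, making the comparison fair.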
Kinematic parameters and PMF calculation
To comprehensively characterize posture dynamics across treatment groups, five key kinematic parameters were defined and extracted. For each parameter, a probability mass function (PMF) was computed using histogram analysis with 100 bins. The parameters are defined as follows:
back speed: Instantaneous velocity of the back skeletal point between adjacent frames, calculated as the frame-to-frame spatial displacement multiplied by the sampling frame rate.
body length: The Euclidean 3D distance between the nose and tail base keypoints.
body height: The z-coordinate of the back keypoint minus the average z-coordinate of the four paw keypoints, reflecting trunk elevation above the ground.
body angle: The angle formed between two vectors originating from the back keypoint and pointing toward the neck and tail base keypoints, respectively.
body angular velocity: The angular velocity of the directional vector between the tail base and back keypoints, computed as the frame-to-frame change in angle of this vector multiplied by the sampling frame rate.
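The five parameter definitions above translate directly into numpy operations on the reconstructed keypoint trajectories. The sketch below assumes trajectories of shape (T, 3) with z as the vertical axis and angular velocity measured in the horizontal plane; axis conventions, units (degrees), and function names are our assumptions:

```python
import numpy as np

FPS = 30  # sampling frame rate

def back_speed(back):                    # back: (T, 3) keypoint trajectory
    return np.linalg.norm(np.diff(back, axis=0), axis=1) * FPS

def body_length(nose, tail_base):        # Euclidean 3D nose-to-tail-base distance
    return np.linalg.norm(nose - tail_base, axis=1)

def body_height(back, paws):             # paws: (T, 4, 3); trunk elevation
    return back[:, 2] - paws[:, :, 2].mean(axis=1)

def body_angle(back, neck, tail_base):   # angle at the back keypoint, degrees
    v1, v2 = neck - back, tail_base - back
    cos = (v1 * v2).sum(1) / (np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def body_angular_velocity(back, tail_base):
    v = back[:, :2] - tail_base[:, :2]   # heading in the horizontal plane
    ang = np.unwrap(np.arctan2(v[:, 1], v[:, 0]))
    return np.degrees(np.diff(ang)) * FPS

def pmf(values, bins=100):
    """Probability mass function via a 100-bin histogram, as in the analysis."""
    counts, edges = np.histogram(values, bins=bins)
    return counts / counts.sum(), edges
```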
Support vector machine (SVM) classification
We used a Support Vector Machine (SVM) model to classify mouse models based on spontaneous behavior data, including WT, PD, LID, and LID mice treated with either AMAN or CLZ. Behavioral features were represented by a 15-dimensional vector of behavior scores corresponding to predefined behavior categories. Following feature matrix construction, we applied principal component analysis (PCA) for dimensionality reduction. Singular value decomposition (SVD) was used to compute variance-normalized projections, and principal components (PCs) explaining at least 95% of the cumulative variance were retained. These PCs represent linear combinations of the original behavioral features. Subsequently, we identified pairs of PCs with the highest discriminative power and trained multi-class SVM classifiers using linear, polynomial, and radial basis function (RBF) kernels. A grid search was performed to optimize the selection of PC pairs, kernel types, and SVM hyperparameters. Each model configuration was evaluated across 20 random initializations to ensure robustness. Given the relatively small dataset, leave-one-out cross-validation (LOOCV) was used to assess generalization performance and validate the stability and predictive capacity of the classifier under different data partitions.
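The PCA–SVM pipeline with kernel/hyperparameter grid search and leave-one-out cross-validation can be assembled from standard scikit-learn components. The sketch below runs on synthetic stand-in data for the 15-dimensional behavior-score matrix; the scaler step and the specific C grid are our assumptions rather than the study's exact settings:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for two groups of 15-dimensional behavior-score vectors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 15)), rng.normal(2, 1, (20, 15))])
y = np.array([0] * 20 + [1] * 20)

# PCA retains components explaining >= 95% of cumulative variance.
pipe = make_pipeline(StandardScaler(), PCA(n_components=0.95), SVC())
grid = GridSearchCV(
    pipe,
    {"svc__kernel": ["linear", "poly", "rbf"], "svc__C": [0.1, 1, 10]},
    cv=LeaveOneOut(),        # leave-one-out cross-validation
    scoring="accuracy",
)
grid.fit(X, y)
```

Projection values onto the fitted decision boundary (as compared in the figure panels) are then available via the best estimator's `decision_function`.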
Statistical analysis
All data analyses were performed using GraphPad Prism 9.0.0 and Python 3.10. Normality and homogeneity of variance were assessed for all experimental groups prior to statistical testing. Group comparisons were evaluated using two-way analysis of variance (ANOVA) followed by Welch’s t test with Bonferroni correction for multiple comparisons, depending on the experimental design. For comparisons of PMF curves of kinematic parameters and the 3 h behavioral curves, the Kolmogorov–Smirnov test was employed with Bonferroni correction for multiple comparisons. The Mann–Whitney U test was used to compare the projection values on the decision boundary. For all pairwise comparisons, we report effect sizes (Cohen’s d), with 95% confidence intervals (95% CIs), interpreted as small (0.2), medium (0.5), and large (0.8). Cohen’s d with 95% CIs are provided in the Supplementary Data 2. All data are presented as mean ± standard error of the mean (SEM). Statistical significance thresholds were defined as follows: *p < 0.05, **p < 0.01, ***p < 0.001 and ****p < 0.0001.
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
References
Dauer, W. & Przedborski, S. Parkinson’s disease: mechanisms and models. Neuron 39, 889–909, https://doi.org/10.1016/s0896-6273(03)00568-3 (2003).
Shulman, J. M., De Jager, P. L. & Feany, M. B. Parkinson’s disease: genetics and pathogenesis. Annu. Rev. Pathol. 6, 193–222, https://doi.org/10.1146/annurev-pathol-011110-130242 (2011).
Jankovic, J. Parkinson’s disease: clinical features and diagnosis. J. Neurol. Neurosurg. Psychiatry 79, 368–376, https://doi.org/10.1136/jnnp.2007.131045 (2008).
Yasuhara, T. Neurobiology research in Parkinson’s disease. Int. J. Mol. Sci. 21, https://doi.org/10.3390/ijms21030793 (2020).
Kalia, L. V. & Lang, A. E. Parkinson’s disease. Lancet 386, 896–912, https://doi.org/10.1016/S0140-6736(14)61393-3 (2015).
Barbeau, A. Long-term side-effects of levodopa. Lancet 297, 395, https://doi.org/10.1016/S0140-6736(71)92226-4 (1971).
Olanow, C. W. & Stocchi, F. Levodopa: a new look at an old friend. Mov. Disord. 33, 859–866, https://doi.org/10.1002/mds.27216 (2018).
Huot, P., Johnston, T. H., Koprich, J. B., Fox, S. H. & Brotchie, J. M. The Pharmacology of l-DOPA-induced dyskinesia in Parkinson’s disease. Pharmacol. Rev. 65, 171–222, https://doi.org/10.1124/pr.111.005678 (2013).
Espay, A. J. et al. Levodopa-induced dyskinesia in Parkinson disease: current and evolving concepts. Ann. Neurol. 84, 797–811, https://doi.org/10.1002/ana.25364 (2018).
Kwon, D. K., Kwatra, M., Wang, J. & Ko, H. S. Levodopa-induced dyskinesia in Parkinson’s disease: pathogenesis and emerging treatment strategies. Cells 11, https://doi.org/10.3390/cells11233736 (2022).
Lama, J., Buhidma, Y., Fletcher, E. J. R. & Duty, S. Animal models of Parkinson’s disease: a guide to selecting the optimal model for your research. Neuronal Signal 5, Ns20210026, https://doi.org/10.1042/ns20210026 (2021).
Dwi Wahyu, I., Chiken, S., Hasegawa, T., Sano, H. & Nambu, A. Abnormal cortico-basal ganglia neurotransmission in a mouse model of l-DOPA-induced dyskinesia. J. Neurosci. 41, 2668–2683, https://doi.org/10.1523/jneurosci.0267-20.2020 (2021).
Monville, C., Torres, E. M. & Dunnett, S. B. Validation of the l-dopa-induced dyskinesia in the 6-OHDA model and evaluation of the effects of selective dopamine receptor agonists and antagonists. Brain Res. Bull. 68, 16–23, https://doi.org/10.1016/j.brainresbull.2004.10.011 (2005).
Marsh, D. M. & Hanlon, T. J. Seeing what we want to see: confirmation bias in animal behavior research. Ethology 113, 1089–1098 (2007).
Peng, Q. et al. The rodent models of dyskinesia and their behavioral assessment. Front. Neurol. 10, 1016, https://doi.org/10.3389/fneur.2019.01016 (2019).
Mimura, K. et al. Unsupervised decomposition of natural monkey behavior into a sequence of motion motifs. Commun. Biol. 7, 1080, https://doi.org/10.1038/s42003-024-06786-2 (2024).
Wiltshire, C. et al. DeepWild: application of the pose estimation tool DeepLabCut for behaviour tracking in wild chimpanzees and bonobos. J. Anim. Ecol. 92, 1560–1574, https://doi.org/10.1111/1365-2656.13932 (2023).
Tseng, Y. T. et al. Systematic evaluation of a predator stress model of depression in mice using a hierarchical 3D-motion learning framework. Transl. Psychiatry 13, 178, https://doi.org/10.1038/s41398-023-02481-8 (2023).
Liu, J. et al. Mapping the behavioral signatures of Shank3b mice in both sexes. Neurosci. Bull. 40, 1299–1314, https://doi.org/10.1007/s12264-024-01237-8 (2024).
Jhuang, H. et al. Automated home-cage behavioural phenotyping of mice. Nat. Commun. 1, 68, https://doi.org/10.1038/ncomms1064 (2010).
Huang, K. et al. A hierarchical 3D-motion learning framework for animal spontaneous behavior mapping. Nat. Commun. 12, 2784, https://doi.org/10.1038/s41467-021-22970-y (2021).
Kabra, M., Robie, A. A., Rivera-Alba, M., Branson, S. & Branson, K. JAABA: interactive machine learning for automatic annotation of animal behavior. Nat. Methods 10, 64–67, https://doi.org/10.1038/nmeth.2281 (2013).
Wang, S. H. et al. Tracking the 3D position and orientation of flying swarms with learned kinematic pattern using LSTM network. In IEEE International Conference on Multimedia and Expo (ICME) 1225–1230, https://doi.ieeecomputersociety.org/10.1109/ICME.2017.8019406 (2017).
Yin, C., Liu, X., Zhang, X., Wang, S. & Su, H. Long 3D-POT: a long-term 3D drosophila-tracking method for position and orientation with self-attention weighted particle filters. Appl. Sci. 14, 6047 (2024).
Yang, Q. et al. Reynolds rules in swarm fly behavior based on KAN transformer tracking method. Sci. Rep. 15, 6982, https://doi.org/10.1038/s41598-025-91674-w (2025).
Wiltschko, A. B. et al. Revealing the structure of pharmacobehavioral space through motion sequencing. Nat. Neurosci. 23, 1433–1443, https://doi.org/10.1038/s41593-020-00706-3 (2020).
Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289, https://doi.org/10.1038/s41593-018-0209-y (2018).
Ravbar, P., Branson, K. & Simpson, J. H. An automatic behavior recognition system classifies animal behaviors using movements and their temporal context. J. Neurosci. Methods 326, 108352, https://doi.org/10.1016/j.jneumeth.2019.108352 (2019).
Zhao, Y. et al. Automatically recognizing four-legged animal behaviors to enhance welfare using spatial temporal graph convolutional networks. Appl. Anim. Behav. Sci. 249, 105594, https://doi.org/10.1016/j.applanim.2022.105594 (2022).
Lundblad, M., Picconi, B., Lindgren, H. & Cenci, M. A. A model of L-DOPA-induced dyskinesia in 6-hydroxydopamine lesioned mice: relation to motor and cellular parameters of nigrostriatal function. Neurobiol. Dis. 16, 110–123, https://doi.org/10.1016/j.nbd.2004.01.007 (2004).
Lundblad, M. et al. Pharmacological validation of a mouse model of l-DOPA-induced dyskinesia. Exp. Neurol. 194, 66–75, https://doi.org/10.1016/j.expneurol.2005.02.002 (2005).
Han, Y. et al. MouseVenue3D: a markerless three-dimension behavioral tracking system for matching two-photon brain imaging in free-moving mice. Neurosci. Bull. 38, 303–317, https://doi.org/10.1007/s12264-021-00778-6 (2022).
Phu, K.-A., Hoang, V.-D. & Le, V.-T.-L. Predicting occluded skeletal joints via tracking-based feature extraction. Neurocomputing 651, 131004, https://doi.org/10.1016/j.neucom.2025.131004 (2025).
Carta, M. et al. Role of striatal L-DOPA in the production of dyskinesia in 6-hydroxydopamine lesioned rats. J. Neurochem. 96, 1718–1727, https://doi.org/10.1111/j.1471-4159.2006.03696.x (2006).
Caulfield, M. E., Stancati, J. A. & Steece-Collier, K. Induction and assessment of levodopa-induced dyskinesias in a rat model of Parkinson’s disease. J. Vis. Exp. https://doi.org/10.3791/62970 (2021).
Siegfried, B. & Bures, J. Conditioning compensates the neglect due to unilateral 6-OHDA lesions of substantia nigra in rats. Brain Res. 167, 139–155, https://doi.org/10.1016/0006-8993(79)90269-5 (1979).
Crossman, A. R. Functional anatomy of movement disorders. J. Anat. 196, 519–525, https://doi.org/10.1046/j.1469-7580.2000.19640519.x (2000).
Zhou, Y. et al. Hypergraph transformer for skeleton-based action recognition. Preprint at https://arxiv.org/abs/2211.09590 (2022).
Yan, S., Xiong, Y. & Lin, D. Spatial temporal graph convolutional networks for skeleton-based action recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, https://doi.org/10.1609/aaai.v32i1.12328 (2018).
Zhou, Y. et al. BlockGCN: redefine topology awareness for skeleton-based action recognition. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2049–2058, https://doi.org/10.1109/CVPR52733.2024.00200 (2024).
Rascol, O. & Fabre, N. Dyskinesia: L-dopa-induced and tardive dyskinesia. Clin. Neuropharmacol. 24, 313–323, https://doi.org/10.1097/00002826-200111000-00002 (2001).
Durif, F. et al. Clozapine improves dyskinesias in Parkinson disease: a double-blind, placebo-controlled study. Neurology 62, 381–388, https://doi.org/10.1212/01.wnl.0000110317.52453.6c (2004).
Schaeffer, E., Pilotto, A. & Berg, D. Pharmacological strategies for the management of levodopa-induced dyskinesia in patients with Parkinson’s disease. CNS Drugs 28, 1155–1184, https://doi.org/10.1007/s40263-014-0205-z (2014).
Bastide, M. F. et al. Pathophysiology of L-dopa-induced motor and non-motor complications in Parkinson’s disease. Prog. Neurobiol. 132, 96–168, https://doi.org/10.1016/j.pneurobio.2015.07.002 (2015).
Fox, S. H. et al. International Parkinson and movement disorder society evidence-based medicine review: update on treatments for the motor symptoms of Parkinson’s disease. Mov. Disord. 33, 1248–1266, https://doi.org/10.1002/mds.27372 (2018).
Galarraga, E., Bargas, J., Martínez-Fong, D. & Aceves, J. Spontaneous synaptic potentials in dopamine-denervated neostriatal neurons. Neurosci. Lett. 81, 351–355, https://doi.org/10.1016/0304-3940(87)90409-5 (1987).
Parsons, C. G., Panchenko, V. A., Pinchenko, V. O., Tsyndrenko, A. Y. & Krishtal, O. A. Comparative patch-clamp studies with freshly dissociated rat hippocampal and striatal neurons on the NMDA receptor antagonistic effects of amantadine and memantine. Eur. J. Neurosci. 8, 446–454, https://doi.org/10.1111/j.1460-9568.1996.tb01228.x (1996).
Brigham, E. F. et al. Pharmacokinetic/pharmacodynamic correlation analysis of amantadine for levodopa-induced dyskinesia. J. Pharm. Exp. Ther. 367, 373–381, https://doi.org/10.1124/jpet.118.247650 (2018).
Lundblad, M. et al. Pharmacological validation of behavioural measures of akinesia and dyskinesia in a rat model of Parkinson’s disease. Eur. J. Neurosci. 15, 120–132, https://doi.org/10.1046/j.0953-816x.2001.01843.x (2002).
Cenci, M. A. & Crossman, A. R. Animal models of l-dopa-induced dyskinesia in Parkinson’s disease. Mov. Disord. 33, 889–899, https://doi.org/10.1002/mds.27337 (2018).
Hauser, R. A., Lytle, J., Formella, A. E. & Tanner, C. M. Amantadine delayed release/extended release capsules significantly reduce OFF time in Parkinson’s disease. NPJ Parkinsons Dis. 8, 29, https://doi.org/10.1038/s41531-022-00291-1 (2022).
Rascol, O., Fabbri, M. & Poewe, W. Amantadine in the treatment of Parkinson’s disease and other movement disorders. Lancet Neurol. 20, 1048–1056, https://doi.org/10.1016/S1474-4422(21)00249-0 (2021).
Sawada, H. et al. Amantadine for dyskinesias in Parkinson’s disease: a randomized controlled trial. PLoS One 5, e15298, https://doi.org/10.1371/journal.pone.0015298 (2010).
Muench, J. & Hamer, A. M. Adverse effects of antipsychotic medications. Am. Fam. Physician 81, 617–622 (2010).
Lind, P. A., Parker, R. K., Northwood, K., Siskind, D. J. & Medland, S. E. Clozapine efficacy and adverse drug reactions among a nationwide study of 1021 Australians prescribed clozapine: the ClozaGene study. Schizophr. Bull. 51, 458–469, https://doi.org/10.1093/schbul/sbae065 (2025).
Naheed, M. & Green, B. Focus on clozapine. Curr. Med. Res. Opin. 17, 223–229, https://doi.org/10.1185/0300799039117069 (2001).
Liu, Y. et al. Monitoring gait at home with radio waves in Parkinson’s disease: a marker of severity, progression, and medication response. Sci. Transl. Med. 14, eadc9669, https://doi.org/10.1126/scitranslmed.adc9669 (2022).
Wu, Z. et al. Kinect-based objective evaluation of bradykinesia in patients with Parkinson’s disease. Digit Health 9, 20552076231176653, https://doi.org/10.1177/20552076231176653 (2023).
Poewe, W. Non-motor symptoms in Parkinson’s disease. Eur. J. Neurol. 15, 14–20, https://doi.org/10.1111/j.1468-1331.2008.02056.x (2008).
Aarsland, D. et al. Parkinson disease-associated cognitive impairment. Nat. Rev. Dis. Prim. 7, 47, https://doi.org/10.1038/s41572-021-00280-3 (2021).
Batzu, L., Podlewska, A., Gibson, L., Chaudhuri, K. R. & Aarsland, D. A general clinical overview of the non-motor symptoms in Parkinson’s disease: neuropsychiatric symptoms. Int. Rev. Neurobiol. 174, 59–97, https://doi.org/10.1016/bs.irn.2023.11.001 (2024).
da Conceição, F. S., Ngo-Abdalla, S., Houzel, J. C. & Rehen, S. K. Murine model for Parkinson’s disease: from 6-OH dopamine lesion to behavioral test. J. Vis. Exp. https://doi.org/10.3791/1376 (2010).
Cenci, M. A. & Lundblad, M. Ratings of L-DOPA-induced dyskinesia in the unilateral 6-OHDA lesion model of Parkinson’s disease in rats and mice. Curr. Protoc. Neurosci. Chapter 9, Unit 9.25, https://doi.org/10.1002/0471142301.ns0925s41 (2007).
Sheshadri, S., Dann, B., Hueser, T. & Scherberger, H. 3D reconstruction toolbox for behavior tracked with multiple cameras. J. Open Source Softw. 5, 1849 (2020).
Gschwind, T. et al. Hidden behavioral fingerprints in epilepsy. Neuron 111, 1440–1452.e1445, https://doi.org/10.1016/j.neuron.2023.02.003 (2023).
Sebastianutto, I., Maslava, N., Hopkins, C. R. & Cenci, M. A. Validation of an improved scale for rating l-DOPA-induced dyskinesia in the mouse and effects of specific dopamine receptor antagonists. Neurobiol. Dis. 96, 156–170, https://doi.org/10.1016/j.nbd.2016.09.001 (2016).
Acknowledgements
We thank the PKU-Nanjing Joint Institute of Translational Medicine (Nanjing 211800, China) for assistance with mouse behavior analysis. We also extend our gratitude to BayONE Scientific and Dr. Kang Huang for their guidance and support with camera synchronization. This work was supported by grants from the National Science and Technology Innovation 2030 Major Program (2025ZD0217201 to H.F.S.), the Youth Fund of the National Natural Science Foundation of China (82201406 to L.H.G.), the National Natural Science Foundation of China (grants No. 82171421, 82371432 and 92249302 to J.W.), and the National Health Commission of China (grant No. Pro20211231084249000238 to J.W.).
Author information
Contributions
Xiaochen An prepared Figs. 1–3, 5 and 6, some Supplementary Figures, the table, and the Supplementary Materials, and wrote the manuscript. Qi Yang prepared Fig. 4 and some Supplementary Figures. Lu Su performed the animal experiments. Linhua Gan and Bo Shen revised the manuscript. Jiajun Ji provided data visualization. Jian Wang provided professional guidance and funding. Haifeng Su conceived the study, oversaw the presentation of the data and manuscript writing and review, and acquired funding.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
An, X., Su, L., Yang, Q. et al. A spatiotemporal hypergraph self-attention neural networks framework for the identification and pharmacological efficacy assessment of Parkinson’s disease motor symptoms. npj Parkinsons Dis. 11, 338 (2025). https://doi.org/10.1038/s41531-025-01187-6