Abstract
Over the first few months after birth, the typical emergence of spontaneous, fidgety general movements is associated with later developmental outcomes. In contrast, the absence of fidgety movements is a core feature of several neurodevelopmental and cognitive disorders. Currently, manual assessment of early infant movement patterns is time consuming and labour intensive, limiting its wider use. Recent advances in computer vision and deep learning have led to the emergence of pose estimation techniques, computational methods designed to locate and track body points from video without specialised equipment or markers, for movement tracking. In this study, we use automated markerless tracking of infant body parts to build statistical models of early movements. Using a dataset of infant movement videos (n = 486) from 330 infants, we demonstrate that infant movement can be modelled as a sequence of eight motor states using autoregressive, state-space models. Each motor state is characterised by specific body part movements, the expression of which varies with age and differs in infants at high-risk of poor neurodevelopmental outcome.
Introduction
The emergence of motor skills in infants enriches their interaction with the environment1. From lifting their head while prone, to sitting, standing and walking, motor development instigates new skills and opportunities for interaction with a child’s environment and carers, facilitating goal-directed actions and learning2,3.
Early motor development has cascading effects on executive function3. Evidence of goal-directed movement has been observed in the developing fetus from as early as mid-gestation4. From birth, as movements transition from spontaneous general movements to planned motor actions, connections are reinforced between primary motor systems and high-order cortex5, accompanied by changes in functional activity of the cerebral cortex6 and a shift in metabolic activity from primary sensory and motor cortex towards parietal and frontal regions7,8.
In school-age children, motor skills are associated with higher-order cognitive functions and academic achievement9,10,11. Motor trajectories vary significantly across individual infants, however, and longitudinal studies have demonstrated that early postnatal motor control and earlier attainment of developmental milestones in infancy are associated with improved behavioural and educational outcomes2,12,13,14. In contrast, disrupted motor development, in the form of abnormal postnatal general movements, is a core feature of several neurodevelopmental and cognitive disorders15,16,17 and a consequence of disruptions to early development, including preterm birth18. The importance of early motor development in supporting later cognitive functions is highlighted by the promising efficacy of motor interventions in improving outcomes in at-risk populations19,20.
From studies of voluntary movements in humans and primates, it has been posited that movements are couched in a series of motor ‘primitives’, short kinematic elements that can be strung together and combined to form more complex actions21,22,23,24,25,26,27,28,29,30,31. In neonates, motor primitives that are present at birth are mirrored in the basic locomotor elements of walking in both toddlers and adults32. Over the first few months of life, the emergence of spontaneous general movements (trunk rotations and coordinated movements of the arms and legs) follows a well-defined trajectory33.
Between nine and twenty weeks of age, spontaneous movements are characterised by continuous ‘fidgety’ movements of the arms, legs, neck and trunk with moderate speed and variable acceleration34,35. Absence or abnormality of these developmental motor patterns is a core indicator of the General Movements Assessment (GMA), an early screening tool with high predictive validity for neurodevelopmental outcomes and excellent inter-rater reliability34,36,37,38. While the complete absence of typical fidgety movements is a good predictor of severe neurological deficits such as cerebral palsy, abnormal movements exaggerated in amplitude, speed, and jerkiness or limited in repertoire are associated with minor neurological dysfunctions and delayed motor development36,39,40,41. Abnormal or absent fidgety movements in infancy may also be associated with delayed cognitive and language development and have been observed in genetic neurodevelopmental disorders40,42,43,44,45.
Typically, the quality of general movements in infancy can be rated using a trained assessor’s Gestalt perception from observing a supine infant, either in person or via video recording, with no direct handling or interaction34,36,46,47. The most valid and reliable method is Prechtl’s assessment of General Movements37. By contrast, quantitative kinematic analysis of body movements often relies upon specialised technology: 3D cameras, motion capture markers, or sensors48,49,50. This can be burdensome for study participants and families and may preclude multiple, repeated assessments over time. As such, studies of motor development are often limited to relatively coarse measures of motor development (e.g.: milestone attainment)51,52. The development of specialised smartphone apps has enabled at-home, video-based movement assessments46,53. By facilitating remote assessment, these approaches improve accessibility to clinical expertise and allow identification of high-risk infants outside of clinical settings36,46,53,54,55.
Recent advances in computer vision and deep learning have resulted in the emergence of pose estimation techniques for movement tracking, computational methods designed to locate and track body points from video, without specialised equipment or markers56,57. Pose estimation tools have proven to be applicable in a variety of different video acquisitions and tasks outside of controlled laboratory settings, including tracking infant movements54,58,59,60,61. Such models require fine-tuning, or training, on infant data to accommodate significant differences in body segment size and scale compared to adults54,60. Once trained, these tools are able to track body points with the same accuracy as human annotations in infant cohorts, often within 5 pixels. While single-camera video-based tools are unable to fully capture the complexities of 3D movement, we and others have achieved promising results predicting early motor outcomes from automated pose estimation54,58,59,62.
A common criticism of machine learning approaches in clinical settings is the opacity and limited interpretability of automated model predictions63. While attention-based methods are useful to identify important periods or sequences in timeseries data, they are divorced from movement mechanics54,59. To aid interpretation, alternative statistical models, grounded in ethological observations that spontaneous movement can be organised as a set of movement ‘syllables’ or short motor sequences, may hold promise in the study of early human movement64,65,66. These unsupervised, data-driven state-space models frame complex movements as a set of short sequences with unique dynamics across body points that can vary in timing and frequency across individuals, allowing the identification of motor sequences that change with age, or are affected by neurological or developmental disorders65,67,68.
In this study, we use automated markerless tracking of infant movements to build statistical models of early motor development. Using autoregressive, state-space models we test the hypothesis that early movement patterns can be modelled as a progression through a series of discrete dynamic states, the expression of which changes with age and is altered in infants at high-risk of poor neurodevelopmental outcomes.
Results
Autoregressive state-space models capture infant movement patterns
In a set of n = 486 smartphone videos of supine infants in a state of active wakefulness (n = 330 individuals, aged 12 to 18 weeks’ corrected age; n = 151 born preterm; 49.4% female), we used a pre-trained deep learning model to automatically label and extract the position of 18 body points54,69 (Fig. 1a). All videos were 3 min in length and interpolated to 25 frames per second (4500 frames) to accommodate variations in frame rates. In every video frame (n = 4500), each body point was represented by an \(x\) and \(y\) coordinate (normalised to infant position and size). After filtering and downsampling to 10 Hz, we used Principal Component Analysis (PCA) to generate a compact representation of body movement, combining body point data into a set of Principal Movements (PMs)70. Each PM captures a coordinated pattern of movement across body points, reducing the dimensions of the data and accounting for redundancy between correlated body points (Fig. 1b; Figure S1). Overall, 15 PMs explained 89% of variance in pose (Fig. 1b). For each video, body position in each frame was represented by a set of 15 weights (one per PM).
Using the first derivative of PM weight over time, we modelled changes in movement velocity using state-space models, evaluating goodness-of-fit with 5-fold cross-validation (Fig. 1c-d; Figure S2). We performed initial model selection using a random subset of n = 100 videos (ensuring a maximum of one video per participant was selected) as an evaluation set, comparing model performance between hidden Markov models with and without autoregressive terms (ARHMM and HMM, respectively) and between simple Gaussian models without state progression (GMM) (Fig. 1c). These data-driven approaches for clustering multivariate timeseries identify periods with similar movement dynamics across videos.
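The two preparatory steps described above (taking the first derivative of the PM weights, and drawing an evaluation subset with at most one video per participant) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the use of a central-difference gradient and the random seed are all assumptions.

```python
import numpy as np

def pm_velocity(weights, fs=10.0):
    """First derivative of PM weights over time.

    weights: array of shape (frames, n_pms); fs: frame rate in Hz.
    A central-difference gradient is an assumption; a simple frame-to-frame
    difference would serve the same purpose.
    """
    return np.gradient(np.asarray(weights, dtype=float), 1.0 / fs, axis=0)

def one_video_per_participant(participant_ids, n_videos, seed=0):
    """Randomly pick n_videos video indices with at most one per participant."""
    rng = np.random.default_rng(seed)
    ids = np.asarray(participant_ids)
    order = rng.permutation(len(ids))
    # np.unique returns the first occurrence of each id in the shuffled order,
    # which amounts to picking one random video per participant.
    _, first = np.unique(ids[order], return_index=True)
    candidates = order[first]
    return rng.choice(candidates, size=n_videos, replace=False)
```

The returned indices could then be split into five folds for cross-validation; how the folds were actually constructed is not specified in the text.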
State-space modelling of infant movement dynamics. (a) Movement videos (3 min length) of infants aged between 12 and 18 weeks of age were acquired at home using a specialised smartphone app (Baby Moves). Using a custom-trained deep learning algorithm (DeepLabCut), video frames were automatically labelled to track several key body points. (b) Following preprocessing and quality control, movements were represented by a set of Principal Movements (PMs). Plot shows the variance explained (cumulative variance; right axis) by each PM in a random subset of n = 100 videos. Inset: First PM. Principal movement is shown by position of markers at different weights. (c) The dynamic contribution of PMs to bodypoint movement in each video was modelled using state-space models. Graphical model shows dependence of observations, x, on the transition between states, z, and previous values of x at t-1 and t-2. In HMM, autoregressive components are removed. In GMM, state progression is removed. (d) Goodness-of-fit was compared between models using 5-fold cross-validation. Plot shows AIC across folds for ARHMM models with different values of lag (lower is better). (e) The first 500 frames of a randomly selected video are shown (top). Each line shows the first derivative of a given principal movement over time. Bottom: Synthetic data. 500 frames generated from the trained AR(2)HMM model (k = 8) show that observed movement dynamics are captured by the state-space model.
For each model class, we evaluated model fit over a number of states from k = 1 (i.e.: all movement is generated from a single set of autocorrelation dynamics) up to k = 15. For all models, performance in held-out test data (averaged over cross-validation folds) improved when including additional states, with little significant improvement above 5 to 8 states (Figure S2; Table S1). The number of states indicates the timescale of observed temporal structure within the movement data and provides a natural segmentation of continuous behaviour into meaningful components of movement65.
Overall, model performance was higher using ARHMM models compared to HMM and was, in turn, higher than in GMM models (Figure S2). Under the simpler GMM model, infant movement dynamics are modelled as variations of a set of poses or positions with no additional temporal structure. HMM models include additional transition probabilities, testing the hypothesis that behavioural modules, or movements, transition from one to another with a specific probability that is consistent over subjects. The addition of autoregressive (AR) coefficients tests module-specific dependencies of movement along shorter, sub-second timescales. The improved performance of ARHMM models supports the hypothesis that spontaneous infant movements can be described with a hierarchical process, with movements grouped into modules encoding the occurrence of stereotyped behaviours or movement patterns across individuals, the dynamics of which are governed by autocorrelation over a short timescale. Increasing the autoregressive order improved ARHMM model performance moderately (Fig. 1d). Based on average performance over folds in the evaluation set, the best fit model was selected as an 8-state ARHMM with lag = 2. This model was then fit on the full dataset for further analysis. We confirmed that this model was able to generate synthetic data samples with feature distributions and dynamics matched to empirical data (Fig. 1e; Figure S3).
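To make the trade-off between model fit and complexity concrete, the sketch below counts free parameters for an ARHMM and computes the AIC from a log-likelihood. The parameter-counting convention (free transition and initial-state probabilities, full AR matrices, a bias vector and a symmetric covariance per state) is a common one but is an assumption here; the text does not state how parameters were counted.

```python
def arhmm_n_params(n_states, n_features, n_lags):
    """Free parameters of an ARHMM under one common counting convention (assumed)."""
    transitions = n_states * (n_states - 1)      # each row of the transition matrix sums to 1
    initial = n_states - 1                       # initial state distribution
    ar = n_lags * n_features ** 2                # one d x d matrix per lag
    bias = n_features                            # state-dependent bias vector
    cov = n_features * (n_features + 1) // 2     # symmetric covariance matrix
    return transitions + initial + n_states * (ar + bias + cov)

def aic(log_likelihood, n_params):
    """Akaike Information Criterion (lower is better)."""
    return 2 * n_params - 2 * log_likelihood
```

Setting `n_lags=0` gives the count for a Gaussian HMM under the same convention, which allows the model classes compared above to be placed on one scale.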
The expression of motor states in early infancy
Using the fitted model, we estimated the framewise state-membership for each video (Fig. 2a), identifying periods where movements followed a given state dynamic. On average, state transitions were rapid with each state capturing only small segments of each video. Across all videos, the median state length (excluding single frame instances) was 3 frames (0.3 s; Fig. 2a) with the average [IQR] maximum time spent in any given motor state = 19 [15.75-26.0] frames (1.9 s). State occupancy (proportion of time spent in each state) varied across individual videos (Fig. 2a-b).
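Given a framewise state sequence, state occupancy and state durations reduce to simple counting. The sketch below is illustrative only; the function names are hypothetical and states are indexed from 0 rather than 1.

```python
import numpy as np

def state_occupancy(states, n_states):
    """Proportion of frames assigned to each state."""
    counts = np.bincount(np.asarray(states), minlength=n_states)
    return counts / counts.sum()

def run_lengths(states):
    """State label and length (in frames) of each consecutive run."""
    states = np.asarray(states)
    change = np.flatnonzero(np.diff(states) != 0) + 1   # frames where the state switches
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [len(states)]))
    return states[starts], ends - starts

def median_run_length(states, min_length=2):
    """Median run length, excluding runs shorter than min_length frames."""
    _, lengths = run_lengths(states)
    return float(np.median(lengths[lengths >= min_length]))
```

At the 10 Hz frame rate used here, a run of 3 frames corresponds to 0.3 s.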
Using linear mixed effect models, we tested whether the expression of specific movement states changed with age. We found that between 12 and 18 weeks of age, the number of state 5 movements decreased (\(\beta\)age= -9.26; p < 0.001), whereas the number of state 7 movements increased (\(\beta\)age= 7.23, p < 0.001; Fig. 2c; Table S2). A small increase was also observed in state 1 (p = 0.03) and a decrease in state 4 (p = 0.01), though these did not pass correction for multiple comparisons.
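As a worked check of the multiple-comparison threshold, a Bonferroni correction over the 8 states at a nominal level of 0.05 (the values given in the caption of Fig. 2) requires p < 0.05 / 8 = 0.00625. The snippet below applies that threshold to the reported p-values; for states 5 and 7 only the bound p < 0.001 is reported, so 0.001 is used here as an upper bound.

```python
def passes_bonferroni(p_value, alpha=0.05, n_tests=8):
    """True if p_value is below the Bonferroni-adjusted threshold alpha / n_tests."""
    return p_value < alpha / n_tests

# Reported age effects; 0.001 stands in for "p < 0.001" (an upper bound, not an exact value).
reported = {"state 1": 0.03, "state 4": 0.01, "state 5": 0.001, "state 7": 0.001}
for state, p in reported.items():
    print(state, passes_bonferroni(p))
```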
Early movement patterns differ in high-risk and preterm born infants
Participants’ motor development was assessed using the GMA (n = 326/330) by two independent trained assessors. We next tested if state occupancy varied between participants with normal (n = 289) compared to abnormal (sporadic, absent or abnormal, n = 37) fidgety general movements. Where two videos were available, the GMA was based on the video acquired at the later timepoint35,54. As preterm birth is an independent risk factor for poor motor development, we included birth status (preterm normal/abnormal = 117/31; term control normal/abnormal = 172/6) along with age in the model as additional factors.
Movements in state 7 were significantly increased in participants with abnormal or absent GMs (\(\beta\)GM = 23.7, p = 0.003; Table S3). Preterm birth was associated with increased occupancy of states 4 and 5 (\(\beta\)birth = 38.7, 26.8, p < 0.001, = 0.001, respectively) and a decreased count of state 1, 3 and 8 movements (\(\beta\)birth = -31.9, -35.8, -34.4, respectively, p < 0.001; Table S3).
At 2 years of age, follow-up assessments were available for 302/330 (motor and cognitive) and 272/330 (language) participants. We found that lower motor scores at 2 years were associated with increased occupancy of state 7 in infancy after adjusting for preterm birth status, although this association did not pass correction for multiple comparisons (\(\beta\)motor = -0.39, p = 0.014; Table S4). Improved cognitive scores and language scores were associated with lower occupancy of state 4 (\(\beta\)cog = -1.19, \(\beta\)lang = -0.85, p = 0.0012, 0.0051, respectively) and state 1 (\(\beta\)lang = -0.67, p = 0.003; Table S5-6).
Changes in state occupancy during infant movement. (a) Individual movement dynamics were modelled using best-fit ARHMM. Top row: predicted motor states for each video frame in three videos chosen at random. Each frame is represented by a vertical line, coloured according to its predicted motor state. Middle row: Frames are sorted according to state membership. The time spent in each motor state varies across individuals. Bottom row: State occupancy (% frames in each state) for each video is shown as a bar plot. (b) Group average state occupancy. Proportion of time spent in each motor state is shown for all individual videos (n = 486, markers represent individual videos). Boxplots indicate median and interquartile range. (c) Change in state-occupancy with age was modelled using linear mixed effects models. Videos from the same individual are joined by a line. Black line indicates main effect of age with 95% confidence intervals shaded. Significant effects (p < 0.05 after Bonferroni correction over 8 states) are highlighted in bold.
Specific motor states capture high velocity body movements
To examine movement dynamics of each motor state, we identified all frames in each state for every participant video. A random sample of state-specific movements (n = 5000, frame-to-frame movement of each body point) is shown in Fig. 3a. The magnitude of bodypoint movements varies significantly between states, with several states capturing high velocity and large magnitude movements between frames (e.g.: states 4, 7 and 8), and others signifying periods of low motion, or rest (states 2, 3 and 6). State 7 captures high velocity movements of bodypoints in all directions, including large magnitude movements of feet (ankles, toes) and hands (wrist).
State movement dynamics. (a) Direction and magnitude of frame-to-frame movements of each body point in each state, centered on the group average position. Movements were randomly selected (n = 5000) from frames in each state in all videos. (b) For each body point, the top 10% high velocity frames (those containing most movement) were identified and grouped by state membership. Plot shows average % high velocity frames for each group of bodypoints in each state for every video (n = 486). Boxplots show median and interquartile range.
To identify potential specificity of movements across bodypoints in each state, we identified the top 10% of frames and corresponding states with the largest frame-to-frame motion in each bodypoint, grouped into three larger clusters: arms (shoulder, elbow, wrist), legs (hips, knee, ankle, toe) and head (crown, chin, eyes) (Fig. 3b). Of the three high-motion states (4, 7 and 8), we found that each state captured different patterns of movement, with states 4 and 8 preferentially capturing high velocity leg and arm movements, respectively, in contrast to the whole-body movements of state 7. In contrast, states 3 and 5 were associated with smaller magnitude movements of the arms and legs, respectively. States 2 and 6 largely signified periods of low movement, capturing few high velocity frames across body points.
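The selection of high-velocity frames described above can be sketched for a single body point as follows. The function name, the use of Euclidean frame-to-frame displacement as the velocity measure, and the choice to report each state's share of the high-velocity frames are assumptions made for illustration; the exact summary used in Fig. 3b may differ.

```python
import numpy as np

def high_velocity_by_state(xy, states, n_states, top_fraction=0.10):
    """Share of the highest-velocity frames that falls in each state.

    xy: (frames, 2) positions of one body point; states: (frames,) state labels (0-based).
    """
    speed = np.linalg.norm(np.diff(np.asarray(xy, dtype=float), axis=0), axis=1)
    labels = np.asarray(states)[1:]                  # state of the frame being moved into
    cutoff = np.quantile(speed, 1.0 - top_fraction)  # threshold for the top 10% of frames
    fast = speed >= cutoff
    counts = np.bincount(labels[fast], minlength=n_states)
    return counts / counts.sum()
```

Averaging the result over the body points in a cluster (arms, legs, head) would give one value per state, cluster and video.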
Discussion
In this study, we combined deep learning-based video tracking with autoregressive state-space models to identify dynamic movement states in infants aged 12 to 18 weeks. We observed that infant movement patterns can be modelled as a sequence of motor states, each characterised by specific body part movements, with expression that varies with age and differs in infants at high-risk of poor neurodevelopmental outcome.
Early motor development has been extensively studied, with the timing and trajectory of major movement milestones defined across ages and populations52. In the months following birth, several characteristic movement periods have been defined, the occurrence and calibre of which offer prognostic value for later developmental and neurological outcomes16,33,34,35,38,40,41,42,44,45,71. Alongside others, we have recently demonstrated that automated methods, founded upon recent computer vision and machine learning innovations56,69, are able to capture body part movements and quantify risk in infant populations using video alone, without the need for specialised tracking equipment54,58,62. While powerful, a current limitation of these methods is the difficulty interpreting algorithmic model decisions. Here, we combined AI-based video tracking with a statistical modelling framework previously used in ethological studies of spontaneous animal movement64,65,66,68. These data-driven methods are able to parse complex, multivariate movement traces, identifying shared patterns of motor behaviour that are common to study participants. In turn, this allows the analysis of timing, frequency and progression of each motor state over time and the quantification of individual variations in movement65,68.
Hidden Markov Models describe a system where observable, external events depend upon a set of internal, unobservable factors. In the context of this study, we are unable to observe the internal factors that lead to different motor behaviours in the infant; we therefore infer the identity of a series of modules, or states, that occur in sequence and are each associated with a specific type of movement observable in the video data. We find that HMM models outperform simpler GMM models, which do not account for the sequential progression of internal states, and that the addition of autoregression terms, to account for the correlation in movements between adjacent frames, confers a significant benefit. Once fitted to the infant movement data, we find that ARHMMs can produce synthetic multivariate timeseries data that share distributional and dynamic features with the empirical data (Fig. 1e). Our ARHMM models demonstrate that a relatively small number of states (between 5 and 8) is sufficient to explain the full complement of infant movements, outperforming simple models based on covariance and autocorrelation between body parts alone (i.e.: k = 1 state), or more complex models with a higher number of internal states (k = 10, 15).
The notion that complex motor behaviours can be modelled as a combination of simpler movement ‘syllables’ or ‘primitives’ is long-standing25,27,28,32,65. The states defined in this study can be viewed in a similar way: as compositional elements of complex movements. By segmenting multivariate movement traces into a set of modules, we can isolate particular movement patterns, each governed by its own set of autocorrelation dynamics, that reflect a stereotyped body movement – fast, slow, specific or global. An inherent limitation of 2D pose estimation is the inability to fully capture movement across all three body planes, making accurate calculation of typical kinematic metrics such as joint angles or rotations difficult. As an alternative, our data-driven approach takes advantage of statistical spatial and temporal dependencies between limb movements, without the need to fully describe movement patterns across three dimensions. In this study, we describe eight motor states, or coordinated movement patterns, three of which are associated with high velocity, large magnitude, movements between adjacent frames (states 4, 7 and 8). States 4 and 8 capture high velocity movements in specific body parts, the legs and arms respectively. These movements of the legs likely correspond to flexion and extensions of the legs, whereas those of the arms likely correspond to the hands coming towards the midline and out to full extension. State 7 captured high velocity movements of the whole body, including leg and arm movements. Other states are associated with small movements or pauses in movement (Fig. 3). While some states captured up to 40–50% of frames in a single video, the median time spent in a single state was very short with the maximum time spent before switching around 2 sec on average across individuals. 
The frequent transitions between distinct movements provide a granular description of motor behaviours with body part position dictated by the interaction of specific movement types and the location of body points in the preceding frames. These movement states, when strung together, encode the full repertoire of transient, spontaneous infant movements.
We found that large, high velocity, whole-body movements (state 7) increased in frequency with age. This movement type was also associated with neurodevelopmental risk, increased in infants with abnormal or absent fidgety general movements and associated with lower motor scores at 2 years. In contrast, movements in states 4 and 5, involving high-amplitude movements of the legs, were reduced in older infants but increased in those born preterm, suggesting a potential difference in developmental timing in high-risk infants35,72,73,74. Abnormal fidgety movements that are exaggerated in amplitude and speed are a known risk factor for poor neurodevelopmental outcome39,42,75. State 7 movements may reflect such exaggerated whole-body movements, although the alignment between motor state timing and specific pathological movement types requires further validation. We lack the high resolution annotation of movements within each video needed for a direct comparison between motor state and movement type. However, others have demonstrated that automated video tracking methods are able to detect abnormal periods associated with specific types of GM, such as fidgety movements76. While it is unlikely that a single motor state would fully capture specific abnormal movements observed with the GMA, our study indicates that state-space models of infant movement are able to parameterise complex, spontaneous infant movements and are sensitive to abnormalities or absence of typical developmental movements, although they do not yet constitute a clinically interpretable outcome for diagnosing or managing movement disorders. In combination with other, extended motor assessments such as the Motor Optimality Score (revised; MOS-R)77 or BabyOscar78, they may also provide further insight into the importance of specific movement patterns, posture and quality observations during this time period79,80,81.
We propose that this approach will prove useful to the automated detection and improved understanding of atypical motor behaviour in high-risk infant populations.
Materials & methods
Participant information
In total, we acquired video data from parent-participants from the Victorian Infant Collaborative Study (2016/2017 cohort) for n = 341 infants aged between 12- and 18-weeks’ corrected age. This included n = 155 infants born extremely preterm (< 28 weeks’ gestation; 77 [50%] female; mean gestational age ± SD = 26.8 ± 2.0 weeks) and n = 186 term-born control infants (37–42 weeks’ gestation; 91 [49%] female; mean gestational age ± SD = 39.5 ± 1.2 weeks). In 165 infants (78 preterm; 87 term), two videos were acquired, resulting in a total of 506 videos. Following initial quality control after automated labelling, we excluded 18 videos (see Trajectory data processing below) and excluded a further two videos due to missing age data (n = 1) and a high video resolution that precluded automated labelling (n = 1), resulting in a final dataset of n = 486 videos acquired from 330 infants (151 preterm; 163 female). Full details of the study protocol can be found in Spittle et al.46 The study was approved by the Royal Children’s Hospital Ethics Committee (HREC35237) and performed in accordance with the Declaration of Helsinki. Written informed consent was obtained from parent/caregivers of infants prior to inclusion in the study.
Video acquisition
Infant movement videos were captured using a dedicated smartphone app, Baby Moves, by a parent/caregiver at home on their personal device between April 2016 and May 2017, as detailed previously46,54. Briefly, following in-app guidance, infants were recorded lying quietly in a supine position for 3 min. Videos were uploaded to a secure database in MP4 format. Due to differences in device model and settings, videos were acquired at different resolutions with a median frame rate of 30 (range: 15–31) frames per second. Each video was 3 min in length resulting in mean ± S.D frames per video of 5100 ± 497.54
Markerless movement tracking
We implemented a custom keypoint labelling system based on DeepLabCut (v2.1)69 to track infant body part movements from video, for details see: Passmore et al.54. In brief, a deep learning model was trained to identify 18 body points (head, eyes, chin, shoulders, elbows, wrists, hips, knees, heels and halluces) in each video frame. For each frame, the predicted \(x,y\) pixel coordinates of each point are returned along with a measure of prediction confidence69. Points with a confidence < 0.2 were removed on a framewise basis54.
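The framewise confidence filter described above can be expressed as a mask over the tracked coordinates. This is a minimal sketch with assumed array layouts (frames × points × 2 for coordinates, frames × points for confidence); marking removed points as NaN is also an assumption about how missing labels are represented.

```python
import numpy as np

def mask_low_confidence(xy, confidence, threshold=0.2):
    """Set coordinates of low-confidence predictions to NaN.

    xy: (frames, points, 2) pixel coordinates; confidence: (frames, points).
    """
    cleaned = np.array(xy, dtype=float, copy=True)
    cleaned[np.asarray(confidence) < threshold] = np.nan  # drops both x and y of the point
    return cleaned
```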
We have previously shown that this system results in human-level labelling accuracy, with a mean error in unseen data of 6.8 pixels (human inter-rater error = 6.9 pixels) and is robust to different video resolutions and frame rates, as well as infant clothing, background, and lighting54.
Trajectory data processing
Following automated labelling, keypoint data were processed using a custom pipeline consisting of quality control, outlier removal, gap filling, adjustment for camera movement and scaling (Figure S4)54. For initial quality control, videos with less than 70% of keypoints labelled on average across frames were removed, resulting in n = 18 excluded videos. Within each video, keypoint outliers were identified using a two-step process, first identifying labels that lay outside of an ellipse centered and scaled to infant position and size, then identifying labels that lay outside of body part-specific ellipses centred on each keypoint’s median position over time54.
After quality control, gaps due to missing, excluded or occluded labels were filled using linear interpolation (for gaps of < 5 frames), or an iterative multivariate imputation82,83. As videos were acquired from handheld devices, we accounted for camera movement by applying a rotation to all points to align the hips and shoulders across frames. To account for differences in infant size, keypoint coordinates were scaled to infant length (defined as distance from crown to mid-hip).
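The short-gap rule above (linear interpolation only for gaps of fewer than 5 frames) can be sketched for a single coordinate timeseries as follows. The function name is hypothetical, and leaving gaps at the very start or end of a series untouched is an assumption; longer gaps are left as NaN here for a separate imputation step.

```python
import numpy as np

def fill_short_gaps(x, max_gap=5):
    """Linearly interpolate interior NaN runs shorter than max_gap frames."""
    x = np.array(x, dtype=float, copy=True)
    missing = np.isnan(x)
    if not missing.any() or missing.all():
        return x
    idx = np.arange(len(x))
    filled = np.interp(idx, idx[~missing], x[~missing])
    # Locate the start (inclusive) and end (exclusive) of each run of missing samples.
    edges = np.diff(np.concatenate(([0], missing.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    for s, e in zip(starts, ends):
        interior = s > 0 and e < len(x)
        if interior and (e - s) < max_gap:
            x[s:e] = filled[s:e]
    return x
```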
For each frame, we extracted the normalised \(x\) and \(y\) coordinate of each keypoint, resulting in \(d=36\) pose features per frame. The timeseries of each feature was bandpass filtered using a fourth-order, zero-phase Butterworth filter (0.01–5 Hz) to reduce low-frequency drift and the high-frequency jitter commonly associated with automated framewise labelling. After removing drift, timepoints where the \(x\) or \(y\) position of a given keypoint was > 3 standard deviations from the median position of the whole cohort were identified as outliers. To reduce computational burden, each timeseries was then downsampled to 1800 frames (10 Hz frame rate) using a first-order spline interpolation, excluding any outlying points.
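The filtering and downsampling steps can be sketched with SciPy as below. Several details are assumptions: `butter(2, ..., btype="bandpass")` is used because a band-pass design doubles the requested order, giving a fourth-order filter (the authors may have specified the order differently); zero phase is obtained by forward-backward filtering; and first-order spline interpolation is implemented as plain linear interpolation. Outlier exclusion is omitted.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(x, fs=25.0, band=(0.01, 5.0)):
    """Zero-phase Butterworth band-pass filter applied along the first axis."""
    # Order 2 per band edge yields a fourth-order band-pass filter (assumed reading).
    sos = butter(2, band, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, np.asarray(x, dtype=float), axis=0)

def downsample_linear(x, fs_in=25.0, fs_out=10.0):
    """Resample a 1D series to fs_out using linear (first-order) interpolation."""
    x = np.asarray(x, dtype=float)
    n_out = int(round(len(x) * fs_out / fs_in))
    t_in = np.arange(len(x)) / fs_in
    t_out = np.arange(n_out) / fs_out
    return np.interp(t_out, t_in, x)
```

For a 3 min series at 25 frames per second (4500 frames), `downsample_linear` returns 1800 samples, matching the length stated above.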
Principal movements
Pose features data from each video can be represented as a \(t\) timepoint \(\times\:d\) feature (1800 \(\times\) 36) matrix, \(M\). As body part movements are highly correlated over time, we can apply PCA to generate a more compact representation of movement by a smaller number of basis components that each represent a coordinated pattern of movement across body parts, or Principal Movements (PM)70.
To estimate a set of PMs, we randomly selected n = 100 videos (ensuring only one video was selected for any given participant). Each feature timeseries was demeaned to remove body position and normalised to unit standard deviation to equalise the contribution of each feature to the decomposition. Normalised coordinate data were concatenated along the time dimension and decomposed with SVD:
\(M=US{V}^{T}\)
This results in a set of pose eigenvectors, or PMs, \(V=\{{v}_{1},\:\dots\:,{v}_{k}\}\), and weights, \(US=\{{\epsilon}_{1},\dots,{\epsilon}_{k}\}\) that, for a given mode \(k\), quantify the degree to which posture at time \(t\) deviates from the mean pose in the direction of \({v}_{k}\). Unseen data can be projected onto this basis set through multiplication with the set of calculated eigenvectors. Principal movements are shown in Figure S1.
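The decomposition can be sketched with NumPy as follows: each video is standardised per feature, the videos are stacked along time, and an SVD provides the eigenvectors (PMs), the weights and the variance explained. The function names are hypothetical, and standardising each video separately with its own mean and standard deviation is an assumption about how normalisation was applied.

```python
import numpy as np

def standardise(video):
    """Demean each feature and scale it to unit standard deviation."""
    m = np.asarray(video, dtype=float)
    return (m - m.mean(axis=0)) / m.std(axis=0)

def fit_principal_movements(videos, n_components=15):
    """videos: list of (frames, features) arrays.

    Returns the PMs as columns of a (features, n_components) matrix and the
    fraction of variance explained by each retained component.
    """
    stacked = np.concatenate([standardise(v) for v in videos], axis=0)
    _, s, vt = np.linalg.svd(stacked, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return vt[:n_components].T, explained[:n_components]

def project(video, pms):
    """Weights of an unseen video on the PM basis, shape (frames, n_components)."""
    return standardise(video) @ pms
```

With 36 pose features and `n_components=15`, `project` returns the 15 framewise weights per video that are used in the subsequent state-space modelling.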
State-space modelling
Individual movement dynamics were modelled using an autoregressive Hidden Markov Model (ARHMM), a data-driven approach to multivariate timeseries clustering. Using Hidden Markov Models (HMM), a continuous, \(d\)-feature \(\times\:t\) timeseries, \(X\), can be modelled as the output of a Markov process, \(Z\), with a set of \(K\) hidden states, \(\{{z}_{1},\:{z}_{2},\:\dots,\:{z}_{K}\}\), that are not directly observable. Each state is associated with a \(d\)-dimensional mean, \({\mu}_{k}\), and a \(d\:\times\:d\) covariance matrix, \({S}_{k}\), from which the multivariate observations are drawn at each time step. The hidden states are assumed to progress as a Markov process where the next state is only dependent on the current state: \(p\left({z}_{t}\:|\:{z}_{t-1}\right)\). Switches between states are governed by a \(K\:\times\:K\) transition matrix, \(T\), containing state transition probabilities, \({\Phi}\), where \({\Phi}_{zz{\prime}}\) indicates the probability of transitioning from state \(z\) to state \(z{\prime}\) at the next timestep. In HMMs, the outputs, or observations, are determined solely by the given state at each time step and therefore do not capture short-term correlations in timeseries data.
In an ARHMM, the observations at time \(t\), \({x}_{t}\), are dependent on both the hidden state, \({z}_{k}\), and the observations at previous timepoints, \(\{{x}_{t-1},\:{x}_{t-2},\:\dots,\:{x}_{t-l}\}\), with \(l\) determined by the degree of lag, \(L\), specified. As such, the observations for a given state, \({z}_{k}\), are defined as:
$${x}_{t}\sim\mathcal{N}\left(\sum_{l=1}^{L}{A}_{k}^{l}{x}_{t-l}+{\mu}_{k},\:{S}_{k}\right)$$
Where \({x}_{t}\) is a set of observations at time \(t\), \({A}_{k}^{l}\) is a matrix containing the linear dynamics (the relationship between observations at \(t\) and at a given lag, \(l\)) for a given state, \({\mu}_{k}\) is a state-dependent bias (average value of each variable in each state) and \({S}_{k}\) is a state-dependent covariance matrix. The number of hidden states and the degree of lag considered in each state are hyperparameters for the model. To focus on dynamic movement, the ARHMM was applied to the first derivative (i.e., velocity) of each timeseries.
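To make the generative process concrete, the following sketch simulates data from a lag-1 ARHMM using only NumPy. It is illustrative and does not reproduce the fitted model: the function name `simulate_arhmm` and all parameter values (two states, three features, the transition probabilities and dynamics matrices) are hypothetical.

```python
import numpy as np


def simulate_arhmm(trans, A, mu, S, n_steps, seed=0):
    """Sample states z and observations x from a lag-1 ARHMM.

    trans : (K, K) transition matrix, rows sum to 1.
    A     : (K, d, d) state-specific linear dynamics.
    mu    : (K, d) state-specific bias.
    S     : (K, d, d) state-specific covariance.
    """
    rng = np.random.default_rng(seed)
    K, d = mu.shape
    z = np.zeros(n_steps, dtype=int)
    x = np.zeros((n_steps, d))
    # Initial state drawn uniformly; first observation has no history.
    z[0] = rng.integers(K)
    x[0] = rng.multivariate_normal(mu[z[0]], S[z[0]])
    for t in range(1, n_steps):
        # Markov step: next state depends only on the current state.
        z[t] = rng.choice(K, p=trans[z[t - 1]])
        # Autoregressive emission: mean depends on the previous observation.
        mean = A[z[t]] @ x[t - 1] + mu[z[t]]
        x[t] = rng.multivariate_normal(mean, S[z[t]])
    return z, x


# Hypothetical two-state example with three features.
trans = np.array([[0.95, 0.05], [0.10, 0.90]])
A = np.stack([0.9 * np.eye(3), -0.5 * np.eye(3)])
mu = np.zeros((2, 3))
S = np.stack([0.1 * np.eye(3), 0.1 * np.eye(3)])
z, x = simulate_arhmm(trans, A, mu, S, n_steps=500)
```

Setting every \({A}_{k}^{l}\) to zero recovers a standard Gaussian HMM, in which each observation depends only on the current state.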
Cross-validation
Model performance was initially assessed using 5-fold cross-validation across n = 100 randomly selected videos. In each fold, we performed a grid search across ARHMM hyperparameters: \(L=[1,\:2]\) and \(K=[1,\:2,\:5,\:8,\:10,\:15]\), where \(K=1\) represents a special case of a standard \(AR\left(l\right)\) model (i.e., no state progression). As baseline models, we also tested standard HMM models (i.e., no autoregression) across a range of \(K\) as well as Gaussian Mixture Models (GMM) where observations are drawn from a set of \(K\) multivariate distributions (i.e., no state progression or autocorrelation). Assessment of model performance was based on the average log-likelihood of test data samples given the model, averaged over five folds. Models were fit using stochastic expectation-maximisation for a maximum of 500 iterations or until convergence. After cross-validation, the best performing model was fit on all data samples (n = 486 videos; max. 500 iterations). To account for stochasticity in the model fitting procedure, final model fitting was repeated 25 times and the model parameters averaged. All (AR)HMM and GMM models were implemented using the ssm84 and scikit-learn83 packages, respectively.
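The cross-validation scheme can be illustrated with the \(K=1\) special case, where the model reduces to a single AR(\(l\)) process that can be fit by least squares. The sketch below is a simplified stand-in and not the pipeline used in the study, which relied on the ssm package for the full ARHMM. All function names are hypothetical; folds are formed over whole sequences so that held-out log-likelihood is computed on unseen videos.

```python
import numpy as np


def lagged(x, lag):
    """Return regressors [x_{t-1}, ..., x_{t-lag}, 1] and targets x_t."""
    T = x.shape[0]
    cols = [x[lag - l: T - l] for l in range(1, lag + 1)]
    X = np.hstack(cols + [np.ones((T - lag, 1))])
    return X, x[lag:]


def design(seqs, lag):
    """Stack lagged regressors and targets over a list of sequences."""
    X, y = zip(*(lagged(s, lag) for s in seqs))
    return np.vstack(X), np.vstack(y)


def fit_ar(seqs, lag):
    """Least-squares AR(lag) fit: weights W and residual covariance S."""
    X, y = design(seqs, lag)
    W, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ W
    return W, resid.T @ resid / len(y)


def mean_loglik(seqs, lag, W, S):
    """Average Gaussian log-likelihood per sample on the given sequences."""
    X, y = design(seqs, lag)
    resid = y - X @ W
    d = y.shape[1]
    _, logdet = np.linalg.slogdet(S)
    maha = np.einsum("ti,ij,tj->t", resid, np.linalg.inv(S), resid)
    return float(np.mean(-0.5 * (d * np.log(2 * np.pi) + logdet + maha)))


def cross_validate(seqs, lags=(1, 2), n_folds=5, seed=0):
    """Held-out log-likelihood for each lag, averaged over folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(seqs)), n_folds)
    scores = {}
    for lag in lags:
        fold_scores = []
        for test_idx in folds:
            held_out = set(test_idx.tolist())
            train = [s for i, s in enumerate(seqs) if i not in held_out]
            test = [seqs[i] for i in test_idx]
            W, S = fit_ar(train, lag)
            fold_scores.append(mean_loglik(test, lag, W, S))
        scores[lag] = float(np.mean(fold_scores))
    return scores
```

The hyperparameter setting with the highest average held-out log-likelihood would then be selected and refit on all available data.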
Motor assessment and follow-up
To assess motor development at the time of video acquisition, infant videos were assessed using Prechtl’s General Movements Assessment (GMA)34. Each video was viewed by two independent, trained assessors who were blinded to the participant’s neonatal history. General movements (GMs) were classified as normal if fidgety GMs were intermittently or continuously present, absent if fidgety GMs were not observed or were sporadically present, or abnormal if fidgety GMs were exaggerated in speed and amplitude. Disagreements were resolved by a third experienced GMA trainer and assessor who made the final decision. Any videos rated as unscorable were not evaluated in this study. GMA scores were available for 326/330 infants.
Neurodevelopmental follow-up was performed at 2-years’ corrected age using the Bayley Scales of Infant and Toddler Development-3rd edition (Bayley-III) motor, cognitive and language domains. Bayley-III scores were available for 302/330 infants for motor and cognitive domains and 272/330 infants for the language domain.
Statistical analysis
Linear mixed effects models were used to test main effects of age, birth status (preterm or term) and neurodevelopmental outcomes (GMA and Bayley-III) on state occupancy. To account for repeated measures from multiple videos acquired in the same subject, participant ID was included as a random effect. For each analysis, statistical significance was considered at p < 0.05 after Bonferroni correction for multiple comparisons over states (i.e., 0.05 / 8).
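As a worked example of the correction, the snippet below computes the per-state threshold for eight states and applies it to a set of p-values. The p-values are invented for illustration only and are not results from the study; the function names are likewise hypothetical.

```python
N_STATES = 8   # number of motor states tested
ALPHA = 0.05   # family-wise significance level


def bonferroni_threshold(alpha=ALPHA, n_tests=N_STATES):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def significant_states(p_values, alpha=ALPHA):
    """Return 1-based indices of states whose p-value passes the threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values, start=1) if p < threshold]


# Hypothetical p-values, one per state (illustration only).
example_p = [0.001, 0.04, 0.2, 0.006, 0.5, 0.0062, 0.9, 0.03]
passing = significant_states(example_p)
```

Here the threshold is 0.05 / 8 = 0.00625, so an uncorrected p-value of 0.03 or 0.04 would not be considered significant.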
Analyses and visualisations were implemented in Python (3.10) using packages including numpy85, scipy86, scikit-learn83, ssm84, matplotlib87, seaborn88, pandas89 and einops90.
Data availability
Analysis code supporting this study is available at https://github.com/garedaba/state-space. Requests for data that support the findings of this study can be made to the Murdoch Children’s Research Institute Data Office (data.requests@mcri.edu.au). The data are not publicly available due to privacy and ethical restrictions.
References
Massion, J. Postural control systems in developmental perspective. Neurosci. Biobehav Rev. 22, 465–472 (1998).
Adolph, K. E. & Franchak, J. M. The development of motor behavior. Wiley Interdiscip Rev. Cogn. Sci. 8, (2017).
Gottwald, J. M., Achermann, S., Marciszko, C., Lindskog, M. & Gredebäck, G. An embodied account of early executive-function development. Psychol. Sci. 27, 1600–1610 (2016).
Zoia, S. et al. Evidence of early development of action planning in the human foetus: A kinematic study. Exp. Brain Res. 176, 217–226 (2007).
Tau, G. Z. & Peterson, B. S. Normal development of brain circuits. Neuropsychopharmacology 35, 147–168 (2010).
Marshall, P. J., Bar-Haim, Y. & Fox, N. A. Development of the EEG from 5 months to 4 years of age. Clin. Neurophysiol. Off J. Int. Fed. Clin. Neurophysiol. 113, 1199–1208 (2002).
Chugani, H. T., Phelps, M. E. & Mazziotta, J. C. Positron emission tomography study of human brain functional development. Ann. Neurol. 22, 487–497 (1987).
Kostović, I., Judas, M., Petanjek, Z. & Simić, G. Ontogenesis of goal-directed behavior: anatomo-functional considerations. Int. J. Psychophysiol. Off J. Int. Organ. Psychophysiol. 19, 85–102 (1995).
Schmidt, M. et al. Disentangling the relationship between children’s motor ability, executive function and academic achievement. PLoS ONE. 12, e0182845 (2017).
Wassenberg, R. et al. Relation between cognitive and motor performance in 5- to 6-year-old children: Results from a large-scale cross-sectional study. Child. Dev. 76, 1092–1103 (2005).
Zysset, A. E. et al. Predictors of executive functions in preschoolers: Findings from the SPLASHY study. Front. Psychol. 9, (2018).
Ghassabian, A. et al. Gross motor milestones and subsequent development. Pediatrics 138, (2016).
Hitzert, M. M., Roze, E., Braeckel, K. N. J. A. V. & Bos, A. F. Motor development in 3-month-old healthy term-born infants is associated with cognitive and behavioural outcomes at early school age. Dev. Med. Child. Neurol. 56, 869–876 (2014).
Murray, G. K., Jones, P. B., Kuh, D. & Richards, M. Infant developmental milestones and subsequent cognitive function. Ann. Neurol. 62, 128–136 (2007).
Bhat, A. N. Is motor impairment in autism spectrum disorder distinct from developmental coordination disorder? A report from the SPARK study. Phys. Ther. 100, 633–644 (2020).
Hadders-Algra, M. General movements: a window for early identification of children at high risk for developmental disorders. J. Pediatr. 145, S12–18 (2004).
St. John, T. et al. Emerging executive functioning and motor development in infants at high and low risk for autism spectrum disorder. Front. Psychol. 7, (2016).
Spittle, A. J., Cameron, K., Doyle, L. W., Cheong, J. L. & Victorian Infant Collaborative Study Group. Motor impairment trends in extremely preterm children: 1991–2005. Pediatrics 141, (2018).
Dusing, S. C. et al. Supporting play exploration and early developmental intervention versus usual care to enhance development outcomes during the transition from the neonatal intensive care unit to home: A pilot randomized controlled trial. BMC Pediatr. 18, (2018).
Spittle, A., Orton, J., Anderson, P. J., Boyd, R. & Doyle, L. W. Early developmental intervention programmes provided post hospital discharge to prevent motor and cognitive impairment in preterm infants. Cochrane Database Syst. Rev. 11, CD005495 (2015).
Abeles, M. et al. Compositionality in neural control: An interdisciplinary study of scribbling movements in primates. Front. Comput. Neurosci. 7, 103 (2013).
Giszter, S. F. Motor primitives—New data and future questions. Curr. Opin. Neurobiol. 33, 156–165 (2015).
Flash, T. & Hogan, N. The coordination of arm movements: An experimentally confirmed mathematical model. J. Neurosci. 5, 1688–1703 (1985).
Flash, T. & Hochner, B. Motor primitives in vertebrates and invertebrates. Curr. Opin. Neurobiol. 15, 660–666 (2005).
Sosnik, R., Hauptmann, B., Karni, A. & Flash, T. When practice leads to co-articulation: The evolution of geometrically defined movement primitives. Exp. Brain Res. 156, 422–438 (2004).
Sternad, D. et al. Transitions between discrete and rhythmic primitives in a unimanual task. Front. Comput. Neurosci. 7, 90 (2013).
Hogan, N. & Sternad, D. Dynamic primitives in the control of locomotion. Front. Comput. Neurosci. 7, 71 (2013).
Hemeren, P. E. & Thill, S. Deriving motor primitives through action segmentation. Front. Psychol. 1, (2011).
Thoroughman, K. & Shadmehr, R. Learning of action through adaptive combination of motor primitives. Nature 407, (2000).
Bizzi, E., Cheung, V. C. K., d’Avella, A., Saltiel, P. & Tresch, M. Combining modules for movement. Brain Res. Rev. 57, 125–133 (2008).
Mussa-Ivaldi, F. A., Giszter, S. F. & Bizzi, E. Linear combinations of primitives in vertebrate motor control. Proc. Natl. Acad. Sci. U S A. 91, 7534–7538 (1994).
Dominici, N. et al. Locomotor primitives in newborn babies and their development. Science 334, 997–999 (2011).
Hadders-Algra, M. Early human motor development: From variation to the ability to vary and adapt. Neurosci. Biobehav Rev. 90, 411–427 (2018).
Einspieler, C. & Prechtl, H. F. R. Prechtl’s assessment of general movements: a diagnostic tool for the functional assessment of the young nervous system. Ment Retard. Dev. Disabil. Res. Rev. 11, 61–67 (2005).
Kwong, A. K. L. et al. Occurrence of and temporal trends in fidgety general movements in infants born extremely preterm/extremely low birthweight and term-born controls. Early Hum. Dev. 135, 11–15 (2019).
Kwong, A. K. L. et al. Parent-recorded videos of infant spontaneous movement: Comparisons at 3–4 months and relationships with 2-year developmental outcomes in extremely preterm, extremely low birthweight and term-born infants. Paediatr. Perinat. Epidemiol. 36, 673–682 (2022).
Kwong, A. K. L., Fitzgerald, T. L., Doyle, L. W., Cheong, J. L. Y. & Spittle, A. J. Predictive validity of spontaneous early infant movement for later cerebral palsy: A systematic review. Dev. Med. Child. Neurol. 60, 480–489 (2018).
Kanemaru, N. et al. Jerky spontaneous movements at term age in preterm infants who later developed cerebral palsy. Early Hum. Dev. 90, 387–392 (2014).
Prechtl, H. F. et al. An early marker for neurological deficits after perinatal brain lesions. Lancet 349, 1361–1363 (1997).
Spittle, A. J. et al. General movements in very preterm children and neurodevelopment at 2 and 4 years. Pediatrics 132, e452–458 (2013).
Kanemaru, N. et al. Specific characteristics of spontaneous movements in preterm infants at term age are associated with developmental delays at age 3 years. Dev. Med. Child. Neurol. 55, 713–721 (2013).
Butcher, P. R. et al. The quality of preterm infants’ spontaneous movements: An early indicator of intelligence and behaviour at school age. J. Child. Psychol. Psychiatry. 50, 920–930 (2009).
Marschik, P. B., Soloveichick, M., Windpassinger, C. & Einspieler, C. General movements in genetic disorders: A first look into Cornelia De Lange syndrome. Dev. Neurorehabilitation. 18, 280–282 (2015).
Einspieler, C. et al. Highlighting the first 5 months of life: General movements in infants later diagnosed with autism spectrum disorder or Rett syndrome. Res. Autism Spectr. Disord. 8, 286–291 (2014).
Gima, H. et al. Early motor signs of autism spectrum disorder in spontaneous position and movement of the head. Exp. Brain Res. 236, 1139–1148 (2018).
Spittle, A. J. et al. The Baby moves prospective cohort study protocol: Using a smartphone application with the general movements assessment to predict neurodevelopmental outcomes at age 2 years for extremely preterm or extremely low birthweight infants. BMJ Open. 6, e013446 (2016).
Gima, H., Shimatani, K., Nakano, H., Watanabe, H. & Taga, G. Evaluation of fidgety movements of infants based on Gestalt perception reflects differences in limb movement trajectory curvature. Phys. Ther. 99, 701–710 (2019).
Airaksinen, M. et al. Automatic posture and movement tracking of infants with wearable movement sensors. Sci. Rep. 10, 169 (2020).
Tao, W., Liu, T., Zheng, R. & Feng, H. Gait analysis using wearable sensors. Sensors 12, 2255–2283 (2012).
Irshad, M. T., Nisar, M. A., Gouverneur, P., Rapp, M. & Grzegorzek, M. AI approaches towards Prechtl’s assessment of general movements: A systematic literature review. Sensors 20, 5321 (2020).
Goyen, T. A. & Lui, K. Longitudinal motor development of apparently normal high-risk infants at 18 months, 3 and 5 years. Early Hum. Dev. 70, 103–115 (2002).
WHO Multicentre Growth Reference Study Group. WHO motor development study: Windows of achievement for six gross motor development milestones. Acta Paediatr. Oslo nor. 1992 Suppl. 450, 86–95 (2006).
Adde, L. et al. In-motion-app for remote general movement assessment: A multi-site observational study. BMJ Open. 11, e042147 (2021).
Passmore, E. et al. Automated identification of abnormal infant movements from smart phone videos. PLOS Digit. Health. 3, e0000432 (2024).
Svensson, K. A., Örtqvist, M., Bos, A. F., Eliasson, A. C. & Sundelin, H. E. K. Usability and inter-rater reliability of the NeuroMotion app: A tool in general movements assessments. Eur. J. Paediatr. Neurol. 33, 29–35 (2021).
Cao, Z., Hidalgo, G., Simon, T., Wei, S. E. & Sheikh, Y. OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. arXiv:1812.08008 [cs] (2019).
Toshev, A. & Szegedy, C. DeepPose: Human pose estimation via deep neural networks. https://doi.org/10.1109/CVPR.2014.214 (2013). https://arxiv.org/abs/1312.4659v3
Ihlen, E. A. F. et al. Machine learning of infant spontaneous movements for the early prediction of cerebral palsy: A multi-site cohort study. J. Clin. Med. 9, 5 (2019).
Nguyen-Thai, B. et al. A spatio-temporal attention-based model for infant movement assessment from videos. IEEE J. Biomed. Health Inf. 25, 3911–3920 (2021).
Chambers, C. et al. Computer vision to automatically assess infant neuromotor risk. IEEE Trans. Neural Syst. Rehabil Eng. Publ IEEE Eng. Med. Biol. Soc. 28, 2431–2442 (2020).
Reich, S. et al. Novel AI driven approach to classify infant motor functions. Sci. Rep. 11, 9888 (2021).
Groos, D. et al. Development and validation of a deep learning method to predict cerebral palsy from spontaneous movements in infants at high risk. JAMA Netw. Open. 5, e2221325 (2022).
Chen, H., Gomez, C., Huang, C. M. & Unberath, M. Explainable medical imaging AI needs human-centered design: Guidelines and evidence from a systematic review. Npj Digit. Med. 5, 1–15 (2022).
Costacurta, J. C. et al. Distinguishing discrete and continuous behavioral variability using warped autoregressive HMMs. Preprint at https://doi.org/10.1101/2022.06.10.495690 (2022).
Wiltschko, A. B. et al. Mapping sub-second structure in mouse behavior. Neuron 88, 1121–1135 (2015).
Weinreb, C. et al. Keypoint-MoSeq: Parsing behavior by linking point tracking to pose dynamics. https://doi.org/10.1101/2023.03.16.532307 (2023).
Gschwind, T. et al. Hidden behavioral fingerprints in epilepsy. Neuron 111, 1440–1452.e5 (2023).
Wiltschko, A. B. et al. Revealing the structure of pharmacobehavioral space through motion sequencing. Nat. Neurosci. 23, 1433–1443 (2020).
Mathis, A. et al. DeepLabCut: Markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21, 1281–1289 (2018).
Federolf, P. A. A novel approach to study human posture control: Principal movements obtained from a principal component analysis of kinematic marker data. J. Biomech. 49, 364–370 (2016).
Ohmura, Y., Gima, H., Watanabe, H., Taga, G. & Kuniyoshi, Y. Developmental changes in intralimb coordination during spontaneous movements of human infants from 2 to 3 months of age. Exp. Brain Res. 234, 2179–2188 (2016).
Sæther, R. et al. A change in temporal organization of fidgety movements during the fidgety movement period is common among high risk infants. Eur. J. Paediatr. Neurol. 20, 512–517 (2016).
Ferrari, F. et al. The ontogeny of fidgety movements from 4 to 20 weeks post-term age in healthy full-term infants. Early Hum. Dev. 103, 219–224 (2016).
Cioni, G. & Prechtl, H. F. R. Preterm and early postterm motor behaviour in low-risk premature infants. Early Hum. Dev. 23, 159–191 (1990).
Einspieler, C. et al. Are abnormal fidgety movements an early marker for complex minor neurological dysfunction at puberty? Early Hum. Dev. 83, 521–525 (2007).
Gao, Q. et al. Automating general movements assessment with quantitative deep learning to facilitate early screening of cerebral palsy. Nat. Commun. 14, 8294 (2023).
Örtqvist, M. et al. Reliability of the motor optimality score-revised: A study of infants at elevated likelihood for adverse neurological outcomes. Acta Paediatr. 112, 1259–1265 (2023).
Sukal-Moulton, T. et al. Baby Observational Selective Control AppRaisal (BabyOSCAR): Convergent and discriminant validity and reliability in infants with and without spastic cerebral palsy. Dev. Med. Child. Neurol. 66, 1511–1520 (2024).
Kwong, A. K. L. et al. Early motor repertoire of very preterm infants and relationships with 2-year neurodevelopment. J. Clin. Med. 11, 1833 (2022).
Örtqvist, M., Einspieler, C., Marschik, P. B. & Ådén, U. Movements and posture in infants born extremely preterm in comparison to term-born controls. Early Hum. Dev. 154, 105304 (2021).
Einspieler, C. et al. Cerebral palsy: Early markers of clinical phenotype and functional outcome. J. Clin. Med. 8, 1616 (2019).
van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45, 1–67 (2011).
Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Linderman, S., Antin, B., Zoltowski, D. & Glaser, J. SSM: Bayesian learning and inference for state space models. (2020).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Virtanen, P. et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nat. Methods. 17, 261–272 (2020).
Hunter, J. D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).
Waskom, M. L. Seaborn: Statistical data visualization. J. Open. Source Softw. 6, 3021 (2021).
McKinney, W. Data structures for statistical computing in Python. Proc. 9th Python Sci. Conf. 56–61. https://doi.org/10.25080/Majora-92bf1922-00a (2010).
Rogozhnikov, A. Einops: Clear and reliable tensor manipulations with Einstein-like notation. In International Conference on Learning Representations (2022).
Acknowledgements
We would like to acknowledge the parents/families of infants who participated in the study. We would also like to acknowledge the extended Victorian Infant Collaborative Study team for their contribution in collecting infant and 2-year follow up data. We would like to acknowledge funding from the Rebecca L Cooper Medical Research Foundation (PG2019421 to G.B.), National Health and Medical Research Council Investigator Grant (1194497 to G.B., 2016390 to J.L.Y.C.), NVIDIA Corporation Hardware Grant program, The Royal Children’s Hospital Foundation, Melbourne and the Murdoch Children’s Research Institute Clinician Scientist Fellowship.
Author information
Contributions
Recruitment and data acquisition: A.L.K., J.E.O., A.L.E., J.L.Y.C., A.J.S. Data processing: A.L.K., E.P., G.B. Data analysis: E.P., G.B. Resources: E.P., G.B., J.L.Y.C., A.J.S. Supervision: A.J.S., G.B. Writing: E.P., G.B. Revising and editing: all authors.
Ethics declarations
Competing interests
A.S. is a tutor with the General Movements Trust. All other authors have no conflicts of interest to declare.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Passmore, E., Kwong, A.K.L., Olsen, J.E. et al. Quantifying spontaneous infant movements using state-space models. Sci Rep 14, 28598 (2024). https://doi.org/10.1038/s41598-024-80202-x
DOI: https://doi.org/10.1038/s41598-024-80202-x





