Motion impact score for detecting spurious brain-behavior associations

Kay, Benjamin P.; Montez, David F.; Marek, Scott; Tervo-Clemmens, Brenden; Siegel, Joshua S.; Adeyemo, Babatunde; Laumann, Timothy O.; Metoki, Athanasia; Chauvin, Roselyne J.; Van, Andrew N.; Suljic, Vahdeta; Krimmel, Samuel R.; Miller, Ryland L.; Newbold, Dillan J.; Zheng, Annie; Seider, Nicole A.; Scheidter, Kristen M.; Monk, Julia S.; Feczko, Eric; Randolph, Anita; Miranda-Domínguez, Óscar; Moore, Lucille A.; Perrone, Anders J.; Conan, Gregory M.; Earl, Eric A.; Malone, Stephen M.; Cordova, Michaela; Doyle, Olivia; Lynch, Benjamin J.; Wilgenbusch, James C.; Pengo, Thomas; Graham, Alice M.; Roland, Jarod L.; Gordon, Evan M.; Snyder, Abraham Z.; Barch, Deanna M.; Fair, Damien A.; Dosenbach, Nico U. F.

doi:10.1038/s41467-025-63661-2

Article
Open access
Published: 29 September 2025

Nature Communications volume 16, Article number: 8614 (2025)

Abstract

In-scanner head motion introduces systematic bias to resting-state fMRI functional connectivity (FC) not completely removed by denoising algorithms. Researchers studying traits associated with motion (e.g. psychiatric disorders) need to know if their trait-FC relationships are impacted by residual motion to avoid reporting false positive results. We devised Split Half Analysis of Motion Associated Networks (SHAMAN) to assign a motion impact score to specific trait-FC relationships. SHAMAN distinguishes between motion causing overestimation or underestimation of trait-FC effects. We assessed 45 traits from n = 7270 participants in the Adolescent Brain Cognitive Development (ABCD) Study. After standard denoising with ABCD-BIDS and without motion censoring, 42% (19/45) of traits had significant (p < 0.05) motion overestimation scores and 38% (17/45) had significant underestimation scores. Censoring at framewise displacement (FD) < 0.2 mm reduced significant overestimation to 2% (1/45) of traits but did not decrease the number of traits with significant motion underestimation scores.

Introduction

Head motion is the largest source of artifact in structural and functional MRI (fMRI) signals^{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20}. The technical challenge posed by motion cannot be overstated and has motivated the creation of behavioral interventions^{9,21,22,23,24} and real-time motion tracking software to reduce the amount of in-scanner head motion^5,25. Even with highly compliant participants, involuntary sub-millimeter head movements systematically alter fMRI data^{6,14,17,19,20}. Unfortunately, non-linear characteristics of MRI physics make removal of motion artifact during post-processing difficult²⁶. Compared to task fMRI, resting-state functional connectivity (FC) is especially vulnerable to motion artifact because the timing of the underlying neural processes is unknown^{2,4,6,9,10,12,14,19,20,27,28,29,30}. The effect of motion on FC has been shown to be spatially systematic, causing decreased long-distance connectivity and increased short-range connectivity, most notably in the default mode network^19,20,27,28.

Results from early studies of children, older adults, and patients with neurological or psychiatric disorders have been spuriously related to motion^{6,19,29,30,31}. For example, motion artifact systematically decreases FC between distant brain regions²⁸ leading some investigators to conclude that autism decreases long-distance FC when, in fact, their results were due to increased head motion in autistic study participants. These cautionary findings have motivated the creation of numerous approaches to mitigate motion artifact including global signal regression^28,32, motion parameter regression²⁹, spectral filtering, respiratory filtering^25,33,34, principal component analysis^1,11, independent component analysis^7,15,35, multi-echo pulse sequences⁸, despiking of high-motion frames¹³, and combinations thereof^3,4,16,25. However, given the complexity of these approaches, it is difficult to be certain that enough motion artifact has been removed to avoid over- or underestimating trait-FC effects.

It is increasingly common for brain-wide association studies (BWAS) involving many thousands of participants (e.g. HCP, ABCD, UK Biobank) to provide data that have already been processed to remove motion^36,37,38,39; however, even in these cases there is some choice as to how much data to retain or to censor. In such large studies, obtaining the raw data and re-applying a different motion processing method is computationally expensive. Excluding high-motion fMRI frames (timepoints) from analysis by censoring is a post-hoc approach shown to further reduce residual motion artifact^14,20,27,29. Power et al.⁴⁰ and Pham et al.⁴¹ note a natural tension: enough motion-contaminated volumes must be removed to reduce spurious findings (false positive inference), but not so many that the sample distribution of a trait becomes biased by systematically excluding individuals with high motion who may exhibit important variance in the trait of interest (e.g., low scores on attention measures associated with greater motion).

This difficulty in censoring threshold selection arises in part because most approaches for quantifying motion are agnostic to the hypothesis under study^{6,14,19,20,29}. However, some traits or groups of participants are more strongly correlated with motion than others. For example, study participants with attention-deficit hyperactivity disorder or autism have higher in-scanner head motion than neurotypical participants^5,6,9,30,42. Even when much of the overall signal variance associated with motion has been removed, inferences about such motion-correlated traits may still be significantly impacted by motion artifact^{6,9,18,20,29,30}. Therefore, in addition to standard approaches for quantifying motion, methods for quantifying trait-specific motion artifact in FC are needed.

Approaches for quantifying the association of specific trait-FC effects with motion include measuring changes in distance-dependent correlations at different levels of motion censoring^27,28, measuring spatial (i.e., across edges) similarity of trait-FC effects with motion-FC effects^29,30, and measuring differences in trait-FC effects between groups matched according to levels of motion^6,19. However, these measures do not establish a threshold for acceptable or unacceptable levels of trait-specific motion. Nielsen et al.¹² proposed using support vector machines to test whether head motion is significantly predicted by FC, but the method applies only to high-order multivariate models of trait-FC effects. Siegel et al.¹⁸ proposed conceptualizing the relationship between motion and trait-FC effects by comparing within-participant and between-participant variance in the trait-FC effects explained by motion. Siegel’s original method was limited because it required repeated resting-state fMRI (rs-fMRI) scans of the same participant on different days, and it used a simple correlation measure that could not account for covariates or distinguish between motion artifact causing over- or under-estimation of the trait-FC effects.

Thus, we developed a novel method for computing a trait-specific motion impact score that operates on one or more rs-fMRI scans per participant and can be adapted to model covariates. We capitalize on Siegel et al.’s¹⁸ observation that traits (e.g. weight, intelligence) are stable over the timescale of an MRI scan whereas motion is a state that varies from second to second. The proposed Split Half Analysis of Motion Associated Networks (SHAMAN) exploits this relative stability of traits over time by measuring the difference in correlation structure between the high-motion and low-motion halves of each participant’s fMRI timeseries. When the trait-FC effects are independent of motion, the difference in connectivity between the two halves will not be significant because traits are stable over time. A significant difference is detected only when state-dependent differences in motion have an impact on the trait’s connectivity. A direction (positive or negative) of the motion impact score that is aligned with the direction of the trait-FC effect is consistent with motion causing overestimation of the trait-FC effect, a “motion overestimation score.” A motion impact score opposite the direction of the trait-FC effect is consistent with motion causing underestimation of the trait-FC effect, a “motion underestimation score.” Permutation of the timeseries and non-parametric combining^43,44 across pairwise connections yields a motion impact score and a p-value distinguishing significant from not-significant impacts of motion on trait-FC effects.

Recently the Adolescent Brain Cognitive Development (ABCD) Study collected up to 20 minutes of rs-fMRI data on 11,874 children ages 9–10 years^21,45,46 with extensive demographic, biophysical, and behavioral data^47,48. Such large data sets have made it possible to quantify reproducibility in resting-state fMRI, and they have revealed that the true effect sizes of brain-wide association studies (BWAS) are smaller than previously thought due to sampling variability⁴⁹. Failure to consider head motion is another source of inconsistent results. Thus, we first characterized the effectiveness of standard denoising approaches at reducing motion artifact. Then we considered the residual trait-specific impact of head motion on FC in the high-quality ABCD data after denoising and varying levels of post-hoc motion censoring. We performed supplementary analyses on the Human Connectome Project^50,51,52 to demonstrate the generalizability of our results to other denoising methods and data sets.

Results

The Effect of Residual Motion is Large Even After Denoising and Censoring

In order to characterize the impact of residual head motion on trait-FC effects, we first performed preliminary analyses to quantify how much residual motion was left in the data after denoising. Of the 11,874 children recruited into ABCD, n = 9652 children with at least 8 minutes of rs-fMRI data were included in this portion of the analysis. ABCD-BIDS is the default denoising algorithm for pre-processed ABCD data^25,37. It includes global signal regression, respiratory filtering, spectral (low-pass) filtering, despiking, and regressing out the motion parameter timeseries. The relative performance of ABCD-BIDS was evaluated by comparing how much of the between-participant variability in the fMRI timeseries (averaged across regions of interest) was explained by head motion (framewise displacement, FD) in a linear, log-log transformed model before and after applying ABCD-BIDS (Supplementary Fig. 1). After minimal processing³⁹ (i.e. motion-correction by frame realignment only), 73% of signal variance (as estimated by taking the square of Spearman’s rho) was explained by head motion. After further denoising using ABCD-BIDS (respiratory filtering, motion timeseries regression, despiking/interpolation of high-motion frames), 23% of signal variance was explained by head motion. Therefore, ABCD-BIDS achieved a relative reduction in the proportion of signal variance related to motion of 69% compared to minimal processing alone (see Methods).
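The relative reduction quoted above follows from simple arithmetic on the two variance fractions. A minimal sketch, using the rounded percentages from the text (the function name is illustrative and not part of ABCD-BIDS):

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the motion-explained share of variance shrinks."""
    return (before - after) / before

# Rounded values from the text: squared Spearman rho before and after ABCD-BIDS.
r2_minimal = 0.73
r2_denoised = 0.23

reduction = relative_reduction(r2_minimal, r2_denoised)
print(f"relative reduction: {reduction:.3f}")
```

With the rounded inputs the result is slightly below 0.69; the reported 69% presumably reflects unrounded variance estimates.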

However, even after denoising with ABCD-BIDS, it was still possible to detect large between-participant differences associated with head motion. The average FC matrix across all participants is shown in Fig. 1a, b. The residual motion-FC effect was quantified by regressing each participant’s FD (averaged over all resting-state scans) against their FC to generate a motion-FC effect matrix with units of change in FC per mm FD (Fig. 1c, d). The average FC matrix and motion-FC effect matrix were compared by computing the correlation between their edges (pairwise connections). The motion-FC effect matrix had a strong, negative correlation of Spearman ρ = −0.58 with the average FC matrix. This strong, negative correlation persisted even after motion censoring at FD < 0.2 mm (Spearman ρ = −0.51). Thus, across all functional connections, connection strength tended to be weaker in participants who moved more compared to participants who moved less.
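The edgewise regression and matrix comparison described above can be sketched on toy data. Everything here is an assumption for illustration: the simulated FD distribution, the way motion is made to shrink connectivity, and the helper names. The rank-based Spearman helper ignores ties.

```python
import numpy as np

rng = np.random.default_rng(0)
n_participants, n_edges = 500, 300

# Hypothetical toy data: mean FD per participant (mm) and vectorized FC edges.
mean_fd = rng.gamma(shape=2.0, scale=0.15, size=n_participants)
base_fc = rng.normal(0.0, 0.3, size=n_edges)
# Simulated assumption: connectivity shrinks toward zero as FD rises.
fc = (base_fc * (1.0 - 0.5 * mean_fd[:, None])
      + rng.normal(0.0, 0.1, size=(n_participants, n_edges)))

def edgewise_slope(x, y):
    """OLS slope of each column of y on x (here: change in FC per mm FD)."""
    xc = x - x.mean()
    return xc @ (y - y.mean(axis=0)) / (xc @ xc)

def spearman(a, b):
    """Spearman correlation as the Pearson correlation of ranks (no tie handling)."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

motion_fc_effect = edgewise_slope(mean_fd, fc)   # one value per edge
average_fc = fc.mean(axis=0)                     # one value per edge
rho = spearman(motion_fc_effect, average_fc)
print(f"edgewise Spearman rho between motion-FC effect and average FC: {rho:.2f}")
```

In this toy setup the correlation is negative by construction, mirroring the direction (not the magnitude) of the reported result.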

**Fig. 1: Average connectivity matrix, motion-FC effect matrix, and effect sizes.**

The decrease in FC due to head motion was larger than the increase or decrease in FC related to traits of interest (Fig. 1e, f). The largest motion-FC effect size for a single connection was |r| = 0.10. More conservatively, the 98th percentile for the motion-FC effect size was |r| = 0.04. While these effect sizes were small in absolute terms, Marek et al.⁴⁹ have shown that the largest 1% of univariate trait-FC effect sizes in ABCD are on the order of |r| = 0.06. Of the 45 variables we evaluated, parental income bracket had the largest trait-FC effect size at |r| = 0.06 (98th percentile |r| = 0.02). With motion censoring at FD < 0.2 mm the largest motion-FC effect was |r| = 0.08 (98th percentile |r| = 0.03) and the largest effect size for parental income was |r| = 0.06 (98th percentile |r| = 0.02). Therefore, even with motion denoising and censoring, the motion-FC effect was large in relation to trait-FC effects.

Most Traits Were Significantly Correlated with Head Motion

Of the 11,874 total children recruited into ABCD, and 9,652 children with at least 8 minutes of rs-fMRI data, n = 7270 children without missing data for any of the 45 traits we examined were included in this and subsequent analyses.

Surprisingly, many seemingly unrelated traits were shown to be correlated with head motion in the Human Connectome Project (HCP)¹⁸. In ABCD, 87% (39/45) of traits or variables evaluated showed significant (p < 0.05) correlations between the trait (e.g. BMI) and head motion (FD, in mm; Supplementary Data 1). WISC-V (Wechsler Intelligence Scale for Children, 5th edition) matrix reasoning subscore⁵³ had the strongest trait-FD correlation at Pearson r = −0.12, Spearman ρ = −0.12, p < 0.001, and Child Behavioral Checklist (CBCL) somatization subscore had the weakest trait-FD correlation at Pearson r = 0.0004, Spearman ρ = −0.02 (not significant).

Body mass index (BMI)⁵⁴ and matrix reasoning subscore⁵³ were selected as exemplar traits. Both exhibited significant trait-FD correlations (Supplementary Fig. 2). Children with higher BMI exhibited more in-scanner head motion (Spearman ρ = 0.13, p < 0.001). Children with higher matrix reasoning scores had lower head motion (Spearman ρ = −0.12, p < 0.001).

SHAMAN Detected the Impact of Motion on Trait-FC Effects in Simulated Data

To clarify and validate the principle of the SHAMAN approach, a generative model was used to simulate fMRI timeseries data with experimentally controlled amounts of “BRAIN” trait-FC effects, “MOTION” motion-FC effects, and relationship between them. Simulation of non-linear relationships between the trait-FC effects and motion caused denoising to fail and SHAMAN to detect a significant impact of residual motion on trait-FC effects (motion impact score), as depicted in Fig. 2. The SHAMAN algorithm is described further in the Methods, and simulation methods are described further in the Supplementary Material. Briefly, the FC matrix for a simulated participant contained both “BRAIN” and “MOTION” effects (Fig. 2a), reflecting the failure of denoising to completely remove the “MOTION” effects. The participant’s fMRI timeseries was split into high- and low-motion halves based on the FD timeseries. More “MOTION” was visible in the FC matrix from the high-motion half of the timeseries compared to the low-motion half, whereas the amount of “BRAIN” did not vary over time and was therefore equal between halves (Fig. 2b). When the low-motion trait-FC matrix was subtracted from the high-motion trait-FC effect matrix, the identical “BRAIN” trait-FC effects canceled out in the difference FC matrix (Fig. 2d) whereas the “MOTION” did not. Finally, the “BRAIN” trait was regressed against the difference FC matrices to generate a matrix of motion impact scores (Fig. 2e). The motion impact score approximates how estimates of the trait-FC effects are changed by head motion (Supplementary Fig. 16d,f). A motion overestimation score approximates how much the trait-FC effect is overestimated due to motion. A motion underestimation score approximates how much the trait-FC effect is underestimated due to motion.
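The split-half logic can be illustrated with a deliberately simplified simulation. This is a sketch under stated assumptions, not the authors' generative model or the full SHAMAN algorithm: it uses a median split on FD, four toy regions, no covariates, and no permutation testing, and every function name and parameter value is hypothetical. A stable "BRAIN" signal couples regions 0 and 1 in proportion to the trait, while a "MOTION" artifact that scales with FD couples regions 2 and 3.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sub, n_frames, n_regions = 300, 400, 4
iu = np.triu_indices(n_regions, k=1)  # edge order: (0,1),(0,2),(0,3),(1,2),(1,3),(2,3)

def simulate_participant(trait):
    """Toy timeseries: trait-stable BRAIN signal plus FD-scaled MOTION artifact."""
    # Assumption: participants with a higher trait value move more.
    fd = 0.2 * np.exp(0.5 * trait) * rng.exponential(size=n_frames)
    ts = rng.normal(size=(n_frames, n_regions))
    brain = rng.normal(size=n_frames)
    ts[:, 0] += brain
    ts[:, 1] += 0.3 * trait * brain           # BRAIN: coupling does not vary over time
    artifact = 3.0 * fd * rng.normal(size=n_frames)
    ts[:, 2] += artifact                      # MOTION: stronger in high-FD frames
    ts[:, 3] += artifact
    return ts, fd

def fc_edges(ts):
    """Vectorized upper triangle of the correlation matrix."""
    return np.corrcoef(ts, rowvar=False)[iu]

def split_half_difference(ts, fd):
    """FC from the high-motion half minus FC from the low-motion half."""
    order = np.argsort(fd)
    half = len(fd) // 2
    return fc_edges(ts[order[half:]]) - fc_edges(ts[order[:half]])

def edgewise_slope(x, y):
    """OLS slope of each column of y on x."""
    xc = x - x.mean()
    return xc @ (y - y.mean(axis=0)) / (xc @ xc)

traits = rng.normal(size=n_sub)
full_fc = np.empty((n_sub, len(iu[0])))
diff_fc = np.empty_like(full_fc)
for i, trait in enumerate(traits):
    ts, fd = simulate_participant(trait)
    full_fc[i] = fc_edges(ts)
    diff_fc[i] = split_half_difference(ts, fd)

trait_fc_effect = edgewise_slope(traits, full_fc)  # conventional trait-FC effect
motion_impact = edgewise_slope(traits, diff_fc)    # trait regressed on the difference
same_direction = np.sign(trait_fc_effect) == np.sign(motion_impact)
print("edge (0,1) motion impact:", round(float(motion_impact[0]), 3))
print("edge (2,3) motion impact:", round(float(motion_impact[5]), 3))
```

In this toy setting the BRAIN edge (0,1) shows a trait-FC effect but a motion impact near zero, because the coupling is identical in both halves and cancels in the difference. The MOTION edge (2,3) shows a motion impact in the same direction as its apparent trait-FC effect, which is the pattern the text calls overestimation.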

**Fig. 2: Diagram of the SHAMAN (Split-Half Analysis of Motion-Associated Networks) algorithm using simulated data.**

Residual Motion Inflates Functional Connectivity Associations with Biophysical Traits

We found that residual motion-related signal inflated the effect size of many traits, especially the biophysical traits of BMI, age, sex, weight, and height. Trait-FC effects were calculated separately for each of these traits on fMRI data after denoising with ABCD-BIDS using conventional methods with mean framewise displacement (FD) as a covariate to control for between-participant differences in motion (e.g., FC ~ BMI + FD). Without frame censoring, all five biophysical traits had significant motion overestimation scores, meaning that their trait-FC effects were significantly (p < 0.05) greater due to the impact of residual head motion. Three biophysical traits, BMI, sex, and weight, also had significant motion underestimation scores, meaning that motion inflated trait-FC effects in some networks but suppressed them in other networks (Supplementary Data 2, 3).
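The conventional covariate model mentioned above (e.g., FC ~ BMI + FD) can be sketched as an ordinary least squares fit at every edge. The toy data and the helper name `trait_fc_effect` are assumptions for illustration; in this simulation FC depends only linearly on mean FD, which is the case the covariate handles well.

```python
import numpy as np

rng = np.random.default_rng(1)
n_participants, n_edges = 1000, 50

# Hypothetical toy data: a trait correlated with mean FD, and FC driven only by FD.
trait = rng.normal(size=n_participants)
mean_fd = 0.3 + 0.1 * trait + rng.normal(0.0, 0.05, size=n_participants)
fc = -0.5 * mean_fd[:, None] + rng.normal(0.0, 0.1, size=(n_participants, n_edges))

def trait_fc_effect(trait, fc, covariate=None):
    """Per-edge OLS coefficient of the trait, optionally adjusting for one covariate."""
    columns = [np.ones_like(trait), trait]
    if covariate is not None:
        columns.append(covariate)
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, fc, rcond=None)
    return beta[1]  # row 1 holds the trait coefficients, one per edge

naive = trait_fc_effect(trait, fc)                        # FC ~ trait
adjusted = trait_fc_effect(trait, fc, covariate=mean_fd)  # FC ~ trait + FD
print("mean |effect| without FD covariate:", np.abs(naive).mean())
print("mean |effect| with FD covariate:   ", np.abs(adjusted).mean())
```

A between-participant covariate of this kind removes only the linear dependence on mean FD; the results in this section indicate that trait-specific motion impact can remain even after such adjustment.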

BMI was selected as an exemplar trait to illustrate overestimation of the trait-FC effect due to motion. The trait-FC effects of BMI (FD as a covariate, without motion censoring) were compared to its motion impact score at each edge in Fig. 3 and projected onto the cortical surface using root mean square connectivity⁵⁵ in Fig. 4. Many of the network pairs (e.g. on the diagonal: default mode, visual, dorsal attention, and cingulo-opercular) had a negative trait-FC effect and a negative motion impact. Since these trait-FC effects and corresponding motion impacts were both in the same direction (negative in this case), the impact of motion was to inflate the size of the trait-FC effects. An overall motion overestimation score (Stouffer’s Z⁴³) for BMI of 92 was calculated using omnibus combining across the entire connectivity matrix to control for multiple comparisons. An omnibus significance of p < 0.001 was obtained using permutation.
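Omnibus combining with Stouffer's Z and a permutation p-value can be sketched as follows. Several simplifications are assumptions of this sketch rather than the published procedure: edgewise statistics are Fisher-transformed correlations, each is multiplied by the sign of the corresponding trait-FC effect so that aligned impacts count as positive, and the null distribution is built by permuting trait labels across participants instead of permuting the timeseries.

```python
import numpy as np

def edgewise_z(trait, y):
    """Fisher z-statistic of the trait's correlation with each column of y."""
    n = len(trait)
    tc = (trait - trait.mean()) / trait.std()
    yc = (y - y.mean(axis=0)) / y.std(axis=0)
    r = tc @ yc / n
    return np.arctanh(r) * np.sqrt(n - 3)

def stouffer(z):
    """Stouffer's Z: sum of z-scores divided by the square root of their count."""
    return z.sum() / np.sqrt(z.size)

def omnibus_score(trait, diff_fc, effect_sign):
    """Combine edgewise statistics after aligning them with the trait-FC effect sign."""
    return stouffer(edgewise_z(trait, diff_fc) * effect_sign)

def permutation_p(trait, diff_fc, effect_sign, n_perm=999, seed=0):
    """One-sided p-value from a null built by shuffling trait labels (assumption)."""
    rng = np.random.default_rng(seed)
    observed = omnibus_score(trait, diff_fc, effect_sign)
    null = np.array([
        omnibus_score(rng.permutation(trait), diff_fc, effect_sign)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value

# Hypothetical toy data: split-half differences that track the trait at every edge.
rng = np.random.default_rng(7)
n_sub, n_edges = 200, 50
trait = rng.normal(size=n_sub)
effect_sign = rng.choice([-1.0, 1.0], size=n_edges)
diff_fc = 0.2 * trait[:, None] * effect_sign + rng.normal(size=(n_sub, n_edges))

score, p_value = permutation_p(trait, diff_fc, effect_sign)
print(f"omnibus score = {score:.1f}, permutation p = {p_value:.3f}")
```

Because edges share the same participants, the combined statistic does not follow a standard normal distribution under the null, which is one reason a permutation-based p-value is used rather than a lookup of Z in a normal table.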

**Fig. 3: Trait-specific impact of motion on functional connectivity matrices.**

**Fig. 4: Trait-specific impact of motion on functional connectivity (FC).**

Note that regressing against the negation of BMI, a trait for which high values were associated with low in-scanner head motion, caused the trait-FC effects of inverse-BMI and the motion impact to both be positive instead of negative. Since the trait-FC effects and corresponding motion impact were still both in the same direction (positive in this counterexample), the overall motion impact score was still a motion overestimation score.

Residual Motion Can Inflate and Suppress Associations with Demographic and Behavioral Traits

WISC-V matrix reasoning score was selected as an exemplar trait to illustrate a motion underestimation score (Figs. 3 and 4). The overall motion underestimation score for WISC-V of 44 was significant at (omnibus) p < 0.001. The motion overestimation score for WISC-V was not significant. The trait-FC effects for WISC-V (FD as a covariate, without motion censoring) and its respective motion impact scores are shown in Fig. 3 and projected onto the cortical surface in Fig. 4. In networks where the motion impact was significant (e.g. sensorimotor hand), it trended in the opposite direction (positive vs negative) of the corresponding trait-FC effect, thus suppressing the trait-FC effect.

In total, 34% (13/38) of demographic and behavioral traits had significant motion overestimation scores, and 37% had significant motion underestimation scores, prior to frame censoring (Supplementary Data 2, 3). Broken down further by category, there was a significant motion overestimation score for 20% (1/5) of demographic traits, 50% (5/10) of cognitive traits, and 32% (7/22) of personality traits. There was a significant motion underestimation score for 67% (4/6) of demographic traits, 70% (7/10) of cognitive traits, and 14% (3/22) of personality traits. Thus, demographic and behavioral traits were less impacted by residual motion than biophysical traits.

Frame Censoring after Denoising Reduces False Positives due to Motion

Frame censoring is a simple, post-hoc approach to address residual motion artifact after motion processing; however, censoring also has the potential to bias sample proportions through exclusion of high-motion participants^{14,20,27,29,31,41}. The SHAMAN algorithm was used to quantify the tradeoff between reduction in motion impact score and sampling bias due to the exclusion of high motion participants (Supplementary Data 2-4). See the work of Pham et al.⁴¹ for additional approaches to optimal censoring threshold selection.

Prior to frame censoring, 7270 participants had at least 8 minutes of rs-fMRI data and were not missing data on any of the 45 traits studied. Frame censoring at FD < 0.3 mm (filtered for respiratory motion)^25,33,34 and excluding participants with < 8 minutes of data remaining excluded 5% (384/7270) of participants, and censoring at FD < 0.2 mm excluded 11% (818/7270) of participants. Exclusion of these participants at FD < 0.2 mm caused the average values of only two traits, gender and number of MRI runs completed, to shift by more than 1% (Supplementary Figs. 4 and 5) compared to the uncensored sample of participants. On the other hand, censoring at FD < 0.3 mm reduced the proportion of traits with significant motion overestimation scores from 42% (19/45) to 11% (5/45), and censoring at FD < 0.2 mm further reduced the number to just one physical trait, height (Fig. 5; Supplementary Data 2, 3). Similar results using the ABCD data were obtained in our supplementary analyses using DVARS, an alternative to FD for quantifying in-scanner head motion calculated from the root mean square variance of the difference between successive fMRI timepoints. Similar results were also obtained in Human Connectome Project (HCP) data, where motion impact scores likewise improved with motion censoring, especially for biophysical traits. However, the fraction of traits with significant motion overestimation scores was lower in HCP (9/76, 12%) compared to ABCD (19/45, 42%).
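The censoring and exclusion rule can be sketched in a few lines. The 0.8 s repetition time is inferred from the statement elsewhere in the text that 600 frames correspond to 8 minutes; the participant labels, FD traces, and function names are hypothetical, and the respiratory filtering of FD is omitted.

```python
import numpy as np

TR_SECONDS = 0.8                            # assumption: 600 frames = 8 minutes
MIN_FRAMES = round(8 * 60 / TR_SECONDS)     # frames needed for 8 minutes of data

def keep_mask(fd, threshold_mm):
    """Frames retained after censoring: FD strictly below the threshold."""
    return fd < threshold_mm

def retained_participants(fd_by_participant, threshold_mm, min_frames=MIN_FRAMES):
    """IDs of participants with enough frames left after censoring."""
    return [
        pid for pid, fd in fd_by_participant.items()
        if keep_mask(fd, threshold_mm).sum() >= min_frames
    ]

# Hypothetical FD traces (mm) for three participants with 1500 frames each.
fd_by_participant = {
    "calm": np.full(1500, 0.05),
    "moderate": np.full(1500, 0.15),
    "restless": np.concatenate([np.full(1000, 0.25), np.full(500, 0.05)]),
}

counts = {
    threshold: len(retained_participants(fd_by_participant, threshold))
    for threshold in (0.3, 0.2, 0.1)
}
print(counts)
```

Stricter thresholds can only remove frames, so the retained sample shrinks monotonically; whether the remaining participants still represent the original trait distribution has to be checked separately, as described in the text.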

**Fig. 5: Effects of frame censoring on motion impact score.**

More stringent censoring at FD < 0.1 mm (filtered) did not reduce the number of traits in ABCD with significant motion overestimation scores any further, but exclusion of 36% (2712/7270) of participants at FD < 0.1 mm shifted the average values of 24% (11/45) of traits by > 1%, compared to no censoring (Supplementary Data 4, Supplementary Figs. 4 and 5). For example, boys moved their heads 0.07 mm more, on average (mean FD), than girls (r = 0.10, p < 0.001). At a very stringent motion censoring threshold of FD < 0.1 mm, the study population shifted from majority boys (52.6%) to majority girls (51.6%; Supplementary Fig. 4). Boxplots showing trait distributions at different motion censoring thresholds for all 45 traits can be found in Supplementary Fig. 5. Motion censoring at FD < 0.1 mm did reduce significant motion underestimation scores from 38% (17/45) with no censoring and at FD < 0.2 mm to 20% (9/45) at FD < 0.1 mm.

Motion Impact Was Distributed Across Brain Regions

Motion impact scores were computed at multiple levels of granularity: for individual pairwise connections (edgewise), combined (using omnibus Stouffer’s Z) across edges in a region or node (parcelwise), or combined across the whole brain. Significance at the whole-brain level could theoretically be driven by extremely high motion impact scores concentrated in a few regions or by high motion impact scores distributed across multiple brain regions. To quantify the regional distribution of motion impact score, the whole-brain motion impact score was recalculated after iteratively excluding the brain region with the highest motion impact score until the whole-brain motion impact score was no longer significant. The number of regions excluded was the number of regions significantly contributing to the impact of motion across the brain. We found that, especially for motion overestimation scores, significant impacts of motion were distributed across many brain regions. BMI and matrix reasoning were selected as exemplar traits to visualize the number of motion-impacted regions needed to detect a significant impact of motion across the whole brain (Supplementary Fig. 6). For BMI, 39% (155/394) of regions contributed to motion overestimation score whereas <1% of regions contributed to motion underestimation score. For matrix reasoning, 19% (76/394) of regions contributed to motion underestimation score; the whole-brain motion overestimation score for matrix reasoning was not significant.
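The iterative exclusion procedure can be sketched as a loop over regions. This is a toy illustration under several assumptions: edgewise scores are supplied as a symmetric region-by-region matrix of z-like values, parcelwise and whole-brain scores both use Stouffer's Z, and significance is judged against a stack of null matrices standing in for permutation results. All names and numbers are hypothetical.

```python
import numpy as np

def stouffer(z):
    """Stouffer's Z over a vector of z-scores."""
    return z.sum() / np.sqrt(z.size)

def whole_brain_score(zmat, keep):
    """Combine the unique edges among the regions still included."""
    sub = zmat[np.ix_(keep, keep)]
    return stouffer(sub[np.triu_indices(len(keep), k=1)])

def parcel_scores(zmat, keep):
    """Parcelwise score: combine each region's edges to the other included regions."""
    sub = zmat[np.ix_(keep, keep)]
    return (sub.sum(axis=1) - np.diag(sub)) / np.sqrt(len(keep) - 1)

def count_contributing_regions(zmat, null_zmats, alpha=0.05):
    """Drop the top-scoring region until the whole-brain score is not significant."""
    keep = list(range(zmat.shape[0]))
    removed = 0
    while len(keep) > 2:
        observed = whole_brain_score(zmat, keep)
        null = np.array([whole_brain_score(nz, keep) for nz in null_zmats])
        p_value = (1 + np.sum(null >= observed)) / (len(null) + 1)
        if p_value >= alpha:
            break
        keep.pop(int(np.argmax(parcel_scores(zmat, keep))))
        removed += 1
    return removed

# Hypothetical toy input: 20 regions, impact concentrated among the first 5.
n_regions, n_hot = 20, 5
zmat = np.zeros((n_regions, n_regions))
zmat[:n_hot, :n_hot] = 6.0
np.fill_diagonal(zmat, 0.0)

# Stand-in null: only the upper triangle of each matrix is used by the scoring.
rng = np.random.default_rng(3)
null_zmats = rng.normal(size=(199, n_regions, n_regions))

removed = count_contributing_regions(zmat, null_zmats)
print("regions removed before the whole-brain score lost significance:", removed)
```

A small count indicates impact concentrated in a few regions, whereas a large count, as reported for the BMI overestimation score, indicates impact distributed across the brain.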

Trait-FD Correlation Did Not Predict Motion Overestimation Score

Calculating motion impact score is computationally intensive and might be unnecessary if overestimation of spurious trait-FC effects could be detected using the trait-motion (trait-FD) correlation alone. A post-hoc analysis was performed to test whether trait-FD correlation predicted motion impact score. Large trait-FD correlation failed to predict motion overestimation score (r = −0.05, p = 0.74) and therefore did not detect spurious trait-FC correlations without the aid of SHAMAN (Supplementary Fig. 7). This was true even when limiting analysis to behavioral traits, which tend to be of greatest interest to researchers (33 traits, r = 0.05, p = 0.74). Large trait-FD correlation was correlated with motion underestimation score (r = 0.36, p = 0.015).

Data Quantity Affected Motion Impact Score

Motion impact scores without motion censoring were computed for 7270 participants, of whom 1,291 did not complete all four 5-minute-long resting-state fMRI scans. The number of fMRI frames before censoring was not related to FD overall (Spearman ρ = 0.155, p = 0.20), but there was a significant negative association when restricting analysis to participants with mean FD > 1.0 mm (Spearman ρ = −0.188, p < 0.001). Thus, participants with very high amounts of motion had less fMRI data even before motion censoring. A supplementary analysis clamping the number of frames/timepoints to 600 (8 minutes of data) per participant was performed to test whether the ability to detect significant motion impact scores was dependent on participants having unequal amounts of rs-fMRI data. The results are shown in Supplementary Data 5. Clamping the number of timepoints to 600, without motion censoring or exclusion of participants, reduced the number of traits with significant motion overestimation scores from 42% (19/45) to 20% (9/45). All 5 of the physical traits still had significant motion overestimation scores after clamping. Clamping the number of frames decreased the number of traits with significant motion underestimation scores from 38% (17/45) to 20% (9/45). Clamping the number of frames was not as effective as motion censoring at FD < 0.2 mm, which reduced the number of traits with significant motion overestimation scores to 2% (1/45).

Patterns of Motion Impact Replicate Across Data Sets and Motion Measures

The finding of widespread, trait-specific motion impact that primarily affects biophysical variables and is reduced by motion censoring is not unique to the ABCD data set. We observed similar patterns of motion impact using an alternative data set, the Human Connectome Project (HCP); see the Supplementary Material. We also obtained similar results and reduction in motion impact with motion censoring using DVARS^56,57, an alternative to FD for quantifying head motion. The greatest difference between ABCD and HCP was a lower proportion of traits affected by motion overestimation in HCP. Superior robustness to motion impact in the HCP data could be due to using an alternative motion mitigation strategy (ICA-FIX^7,16 for HCP vs ABCD-BIDS for ABCD), differences in participant motion characteristics (FD 0.16 ± 0.06 mm for HCP vs 0.29 ± 0.35 mm for ABCD), or differences in available trait measures for each study.

Testing for Trait-specific Motion Effects Requires Thousands of Participants

Our analyses were conducted on a large sample (n = 7270). We performed bootstrapping (sampling with replacement) at different sample sizes to assess the sensitivity, or statistical power, of SHAMAN to detect significant motion impact scores in our exemplar traits, BMI and matrix reasoning, as a function of sample size. As expected, power increased as the sample size increased (Supplementary Fig. 8). A sample size of about n = 5000 participants was powered to detect a motion overestimation score for BMI in 88% of simulations at a significance threshold of α = 0.05. The motion underestimation score for matrix reasoning was more difficult to detect, with successful detection in only 55% of simulations at n = 5000. The need for thousands of participants to avoid false positive inference is consistent with recent fMRI studies⁴⁹.
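The bootstrap power analysis can be sketched with a stand-in significance test. Here the full SHAMAN test is replaced, as an assumption made for brevity, by a one-sided Fisher z-test on the correlation between a trait and a single scalar split-half difference per participant; the simulated effect size and all names are hypothetical.

```python
import numpy as np

def is_significant(trait, diff, z_crit=1.645):
    """Stand-in test: one-sided Fisher z-test at alpha = 0.05."""
    r = np.corrcoef(trait, diff)[0, 1]
    return np.arctanh(r) * np.sqrt(len(trait) - 3) > z_crit

def bootstrap_power(trait, diff, sample_size, n_boot=200, seed=0):
    """Fraction of bootstrap resamples (with replacement) that reach significance."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_boot):
        idx = rng.integers(0, len(trait), size=sample_size)
        hits += bool(is_significant(trait[idx], diff[idx]))
    return hits / n_boot

# Hypothetical population with a weak trait-related split-half difference.
rng = np.random.default_rng(11)
n_population = 7000
trait = rng.normal(size=n_population)
diff = 0.05 * trait + rng.normal(size=n_population)

power = {n: bootstrap_power(trait, diff, n) for n in (200, 1000, 5000)}
for n, value in power.items():
    print(f"n = {n:5d}: estimated power = {value:.2f}")
```

With small effects of this kind, estimated power climbs slowly with sample size, which is the qualitative pattern behind the need for thousands of participants.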

Motion Impact Score is Specific to Motion-Associated Traits

To be useful, SHAMAN must also be specific, producing not-significant motion impact scores when trait-FC effects are not associated with motion. The randomly-assigned participant ID was used as a trait to simulate a trait-FC effect independent from motion. Participant ID was not significantly correlated with FD (r = 0.0006). We performed bootstrapping with participant ID at different sample sizes to assess SHAMAN’s specificity. SHAMAN did not falsely detect a significant motion impact score at any sample size from 100 to 6,000 participants (Supplementary Fig. 8).

Discussion

The Impact of Motion on Brain-Wide Association Studies is Widespread and Trait Dependent

It has been proposed that head motion likely affects inference about some trait-FC effects more than others^{6,19,27,28,29,30}, but only a few studies^12,18 have explored the differential impact of motion on specific trait-FC effects within a statistical framework. We build upon this prior work¹⁸ by employing a novel metric, motion impact score, to quantify the impact of residual in-scanner head motion on trait-FC effects for 45 demographic, biophysical, intelligence, and personality traits within the ABCD study. Similar to prior findings in the Human Connectome Project (HCP) data¹⁸, we find that, even after denoising with ABCD-BIDS, a large proportion (42%) of trait-FC effects in the ABCD data (n = 7270) are significantly inflated by residual head motion when motion censoring is not used, and that post-hoc motion censoring is an effective strategy for mitigating residual motion.

Motion Impact May Arise from Heterogeneous Sources

The SHAMAN method distinguishes between transient (state) and persistent (trait) changes in fMRI signal related to a nuisance timeseries (head motion), but it does not distinguish between different biophysical sources of transient fMRI signal. Such sources could include not only artifactual signal from proton spin effects, but also fMRI activation in motor cortex, and even the transition between different FC states⁵⁸. Researchers specifically studying the relationship between small head movements and transient changes in neuronal activity might employ our SHAMAN method to discover interesting FC changes related to transient head motion. For more conventional BWAS, any fMRI signal change time-locked with head motion, regardless of source, has the potential to affect reproducibility if not removed or controlled for.

Motion May Negatively Affect the Signal-to-Noise-Ratio (SNR)

We find that, after denoising, the motion-FC effect matrix has a large effect size (max |r| = 0.10) and is strongly anti-correlated with the between-participant average FC matrix (r = −0.63; see Fig. 1 and Supplementary Fig. 10), thus potentially impacting the interpretation of many traits whose FC effects are shaped by the average FC matrix. One parsimonious explanation for this motion-connectivity anti-correlation is that data from participants with higher motion have a lower signal-to-noise ratio (SNR). Decreased SNR could occur due to the biophysical characteristics of head motion increasing the proportion of noise relative to brain signal captured by the MRI scanner. We also observe that participants with very high motion (mean FD > 1.0 mm) completed fewer fMRI scans, leading to a shorter duration of fMRI data, higher sampling variability, and thus lower SNR. It is also possible that denoising algorithms inadvertently remove more brain signal from high-motion data than they do from low-motion data. Designers of denoising algorithms may wish to evaluate not only how much motion is removed from the data, but also how the SNR of the data is affected by processing.

While we hypothesize that motion-related signal loss plays a large role in explaining the similarity between the motion-FC effect matrix and average FC matrix, we cannot definitively prove the association from these analyses. Other factors contributing to the observed motion-FC effect matrix may include distance-dependent attenuation of connectivity by motion artifact⁴⁰. Propensity of an individual to move is itself a stable, heritable trait^59,60,61, and some⁶² have even proposed that intrinsic differences in FC related to movement are encoded in the FC matrix. While our findings do not exclude the possibility that motion has an intrinsic neurobiological basis, we find it surprising that motion’s trait-FC effect would have a larger effect size than any other trait-FC effect. It is more plausible that the motion-FC effect arises from a combination of artifact, signal loss, and biologically-meaningful motion-related traits and FC states.

Biophysical Traits Were Most Likely to be Confounded by Motion Distortions

While we find that many trait-FC relationships (42%) are spuriously inflated by motion, it is remarkable that all five biophysical trait-FC relationships (BMI, age, sex, weight, and height) are significantly inflated by motion. Biophysical traits are also the only class for which we detect significant motion underestimation scores after very stringent censoring at FD < 0.1 mm. As seen in Fig. 3 illustrating BMI, the largest deviations in the trait-FC connectivity matrices for biophysical traits cluster along the diagonal and have a large degree of similarity to both the average (across participants) FC matrix and the motion-FC effect matrix. It is possible that the biophysical traits are directly related to motion (e.g. larger head sizes are capable of greater displacement), affect signal-to-noise ratio in a similar way to motion (e.g. excess adipose tissue attenuates radiofrequency energy, decreasing SNR), or that biophysical traits serve as a proxy for other behavioral traits related to movement. If the relationship between biophysical traits and SNR or behavioral covariates could be modeled, then their true trait-FC relationships could be identified.

Motion Censoring Reduces the Risk of False Positive Inference

Motion censoring at FD < 0.2 mm (filtered for respiratory artifact) is very effective at mitigating motion overestimation score (Fig. 5, Supplementary Data 2, 3) and results in relatively little selection bias due to participant exclusion (Supplementary Fig. 4, 5, Supplementary Data 4) in this neurotypical cohort. Siegel et al.¹⁸ also observed decreased residual impact of motion after framewise censoring in the HCP data. Less stringent censoring at FD < 0.3 mm retains more participants but also leaves more traits with significant motion overestimation scores. Even more stringent censoring at FD < 0.1 mm (filtered) does not further reduce motion overestimation score, has little effect on motion underestimation score, and introduces greater selection bias, for example, reversing the ratio of boys to girls in the censored data compared to the uncensored data. The degree of selection bias would likely be even greater for children with developmental disorders such as autism³¹. Therefore, motion censoring at FD < 0.2 mm (filtered) is a reasonable strategy to avoid making false positive inferences about most trait-FC effects.

The potential for selection bias^31,40 and loss of power from motion censoring might be further addressed through a combination of approaches including: recruiting more participants from high-motion groups (e.g. children with autism), scanning them for longer, adopting behavioral strategies to reduce in-scanner head motion²², employing real-time adaptive quality assurance strategies such as FIRMM⁵ to ensure collection of sufficient data to maintain the desired power after frame censoring, adaptively selecting an optimal motion censoring threshold⁴¹, and employing statistical methods to account for selection bias³¹.

Motion Reduces Statistical Power

Statistical power is the ability to reject the null hypothesis when it is actually false, or the ability to avoid false negative inference. We show that motion predisposes to false negative inference about trait-FC effects (motion underestimation score) for many traits, even after motion censoring. Therefore, motion reduces statistical power for specific traits in a way that is not accounted for by conventional power analysis methods. While researchers may be more attentive to false positive results, false negative results and statistical power are also important. For example, failure to detect an association between lead exposure and neurotoxicity would contribute to incorrect policy decisions about lead mitigation, increased lead exposure, and adverse public health outcomes. False negative results are especially damaging when they arise in large brain-wide association studies (BWAS, e.g., ABCD), which are assumed to be adequately powered.

SHAMAN Facilitates Region of Interest (ROI) Analyses

It is common to select a priori regions or nodes of interest on which to perform FC analyses. ROI analyses may also be beneficial for excluding spatially-structured motion artifacts. Using non-parametric combining, SHAMAN can assign a motion impact score to a single node, a subset of nodes, or the whole brain, reflecting a continuum⁴⁴ from high specificity (at the level of the ROI) and low power to low specificity and high power (across the whole brain) to detect motion impact. ROI analysis of motion impact score can provide insight into whether a priori regions of interest are impacted by residual motion, and facilitate analysis of a few nodes, selected a priori, which do not have a significant motion impact score even when the whole brain does have a significant motion impact score. While ROI analysis is typically used to limit the number of statistical tests or comparisons, as the brain can be subdivided into thousands of combinations of ROIs, the number of possible statistical tests/comparisons is vast. Therefore, caution is advised when interpreting “cherry picked” regions of interest a posteriori with a low motion impact score, especially for traits where motion impact score is widely distributed over a large proportion of nodes (e.g., BMI: 39% of nodes).

SHAMAN Can Quantify Statistical Significance of Motion Distortion

The problem of estimating motion artifact is intractable because the precise relationship between head motion and fMRI signal is not known²⁶. Denoising algorithms must therefore make assumptions about the motion-fMRI relationship. For example, regressing the motion parameters out of the fMRI signal²⁹ makes a strong, linear assumption about the motion-fMRI relationship. Alternative machine learning approaches such as ICA-FIX^7,16 may achieve greater robustness to motion artifact by relaxing these linear assumptions, but no method is assumption-free. Detecting residual motion after denoising is especially challenging because the same assumptions used by the denoising algorithm cannot be reused. For example, regressing the motion parameters out of the fMRI signal a second time will not remove additional motion artifact. SHAMAN approaches this problem by assuming that the high-motion half of the data will have the same correlation structure as the low-motion half except for artifact from motion. Therefore, while quantifying residual motion precisely is impossible, SHAMAN detects when the impact of motion on a trait-FC relationship is significant. In theory the residual impact of motion might be significantly non-zero but small enough that it would not cause false inference. However, consistent with prior work⁴⁹, we show that the motion-FC effect is similar in magnitude to the effect size of most trait-FC effects. Therefore, a significant motion overestimation score is likely to cause false inference.

SHAMAN Predicts Residual Motion Impact Better Than Trait-motion Correlation Alone

This study was motivated by the hypothesis that trait-FC effects for traits (e.g. BMI) correlated with head motion (FD) would be more likely to be spuriously related to motion. Are strong trait-FD correlations sufficient to predict spurious trait-FC effects? Traits highly correlated with motion were more likely to have significant motion underestimation scores. However, strong trait-FD correlations did not predict which trait-FC relationships would be spuriously inflated by motion, biasing toward false positive results. Therefore, SHAMAN provides vital information about the risk of false-positive inference due to residual trait-specific motion impact beyond the trait-motion correlation alone.

Detecting the Impact of Residual Trait-Specific Motion Requires Thousands of Participants

Marek et al.⁴⁹ showed that thousands of participants are needed to measure trait-FC effects reproducibly. Similarly, we find that 2000–3000 participants are needed to detect the motion overestimation score on the BMI-FC effect, and 6000 participants are needed to detect the motion underestimation score on the matrix reasoning-FC effect (Supplementary Fig. 8). We postulate that motion underestimation scores require larger sample sizes for detection because affected trait-FC effects would tend to have smaller effect sizes. Our findings therefore reinforce the benefit of large sample sizes for BWAS. When large samples are not feasible (e.g. in studies of rare diseases), extrapolation of our findings suggests framewise motion censoring is a prudent strategy to avoid false positive inferences. Researchers should also exercise caution when interpreting trait-FC effects for biophysical traits (e.g. BMI) because they are particularly susceptible to spurious inflation by motion artifact.

Quantifying Motion Impact Helps To Avoid False Inference

Many traits of interest to human population neuroscience are significantly correlated with head motion (87%). Motion denoising algorithms successfully remove much of the effect of motion, but trait-specific motion effects remain likely to distort inferences drawn from functional connectivity. The risk of false positives is highest for biophysical traits (e.g., BMI) and false negatives are most likely for cognitive/behavioral traits (e.g. matrix reasoning). Frame censoring is an effective strategy to reduce trait-specific motion distortions and can increase the likelihood of finding a significant, reproducible effect. Methods for finding an optimal frame censoring threshold⁴¹ can be combined with SHAMAN to minimize data loss. Spatial masks generated by our novel SHAMAN method can be used to avoid the most motion-impacted regions during ROI selection. Given that residual head motion is most likely to falsely suppress true associations between FC and cognitive traits, more aggressive motion suppression and denoising techniques should lead to the discovery of new cognition-FC relationships.

Methods

Standard Methods

Ethics

This project used resting-state functional MRI, demographic, biophysical, and behavioral data from 11,572 9–10 year old participants from the ABCD 2.0 release⁶³. The ABCD Study obtained centralized institutional review board (IRB) approval from the University of California, San Diego. Each of the 21 sites also obtained local IRB approval. Ethical regulations were followed during data collection and analysis. Parents or caregivers provided written informed consent, and children gave written assent.

Behavioral

The Adolescent Brain Cognitive Development (ABCD) study participants are well-phenotyped with demographic, physical, mental health⁴⁷, and cognitive⁴⁸ batteries. For the purpose of this report we selected 45 broadly-interesting traits for which complete data were available on a majority of participants: NIH Toolbox, Wechsler Intelligence Scale for Children (WISC-V)⁵³, Child Behavior Checklist (CBCL), Prodromal Psychosis Scale (PPS), Behavioral Inhibition/Activation Scales (BIS/BAS), and Urgency Premeditation Perseverance Sensation Impulsive Scale (UPPS-P), see Supplementary Data 1. We also selected 5 physical (BMI, age, sex, weight, and height) and 5 demographic (number of people living in household, combined parental income bracket, parents’ marital status, highest parental education level, and residential deprivation index) traits commonly used as covariates, as well as 2 study-related variables (number of MRI scans completed and drowsiness), to see if any of these would be highly correlated with motion. Data were downloaded from the NIMH Data Archive (ABCD Release 2.0), and the traits of interest were extracted using the ABCDE software, which we developed and have made available here: https://gitlab.com/DosenbachGreene/abcde.

Exemplar Traits

Two exemplar traits were selected from the ABCD study a posteriori to illustrate the SHAMAN method. Body mass index (BMI) is measured in units of kg/m² with a normative range of about 14–22 for children ages 9–10 years⁵⁴. The Wechsler Intelligence Scale for Children, 5th edition (WISC-V) matrix reasoning subscore has a mean of 10, standard deviation of 3, and maximum value of 19⁵³, with higher scores indicating superior performance on the test.

MR Imaging

Functional magnetic resonance imaging (fMRI) was acquired at 21 sites using a protocol harmonized for 3 Tesla GE, Philips, and Siemens scanners with multi-channel receive coils²¹. In addition to anatomical and task-fMRI, each participant had up to four 5-minute-long resting-state scans (TR = 800 ms, 20 min total). Participants with less than 8 min of resting-state data, the minimum duration needed for high-quality estimation of connectivity⁶⁴, were excluded from analysis. A subset of sites using Siemens scanners used FIRMM motion tracking software⁵ that allows extending the scan on the basis of on-line measurement of motion.

Following acquisition, fMRI data were processed using standardized methods including correction for field distortion, frame-by-frame motion co-registration, alignment to standard stereo-tactic space, and extraction of the cortical ribbon³⁹. Resting-state data were further processed to remove respiratory and motion artifact by temporal bandpass filtering, global signal regression, and regression against the rigid-body motion parameters using the ABCD-BIDS motion processing pipeline, a derivative of the Human Connectome Project (HCP) processing pipeline³⁸ described by Fair et al.^25,37. Processing dependencies include FSL⁶⁵ and FreeSurfer⁶⁶.

Parcellation

It is possible to compute functional connectivity between each voxel or vertex. However, this approach is burdened by a high proportion of unstructured noise and large computer memory requirements. We therefore adopted a ROI-based approach based on the 333 cortical parcels described by Gordon et al.⁶⁷ augmented by the 61 subcortical parcels described by Seitzman et al.⁶⁸ for a total of 394 parcels, or nodes.

Quantifying Motion

Motion in fMRI studies is typically estimated using spatial co-registration of each fMRI volume (or frame) to a reference frame, or temporal analysis of variance¹⁴. In this study we quantify motion using framewise displacement, FD (L1-norm), in mm after filtering for respiratory artifact^33,34 because it is the default metric used in the ABCD-BIDS processing pipeline^25,37.

Assessing the Performance of ABCD-BIDS

The primary aim of this study was to quantify trait-specific residual motion, but we also quantified trait-agnostic residual motion for comparison to our trait-specific findings and to assess the overall performance of the ABCD-BIDS motion processing tool (Supplementary Fig. 1). Multiple methods have been developed to quantify trait-agnostic residual motion after the fMRI data have been denoised or “processed.”⁴⁰ We employed a straightforward approach of measuring the proportion of between-participant variation in fMRI signal variance explained by motion. FD was averaged over each participant’s resting-state fMRI scans. The fMRI signal variance within each parcel was computed for each participant before and after ABCD-BIDS (without frame censoring). The fMRI signal variance was averaged across parcels to generate a single measure of fMRI signal variance for each participant. The relationship between fMRI signal variance and framewise displacement was fit with a log-log model. Assumptions of linearity were further relaxed by using the square of Spearman’s ρ instead of the coefficient of determination R². The relative reduction in the proportion of signal variance related to motion achieved by ABCD-BIDS was computed as 1 - (ρ_after/ρ_before)².
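The relative reduction described above can be sketched in a few lines. This is a minimal illustration and not the study's analysis code: the function names are hypothetical, ties in the ranks are ignored for simplicity, and the log-log fit is omitted because Spearman’s ρ is unchanged by monotonic transformations such as the logarithm.

```python
import numpy as np

def rank(x):
    """Rank values 0..n-1 (illustrative helper; assumes no tied values)."""
    order = np.argsort(x)
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(len(x))
    return ranks

def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return np.corrcoef(rank(x), rank(y))[0, 1]

def relative_reduction(mean_fd, var_before, var_after):
    """1 - (rho_after / rho_before)^2, with one value per participant in each input."""
    rho_before = spearman_rho(mean_fd, var_before)
    rho_after = spearman_rho(mean_fd, var_after)
    return 1.0 - (rho_after / rho_before) ** 2
```

Here `mean_fd`, `var_before`, and `var_after` are assumed to be one-dimensional arrays holding each participant’s mean FD and parcel-averaged signal variance before and after denoising.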

Correlation of Demographic & Behavioral Traits with Motion

We compare continuous traits of interest with motion (FD, in mm) using the linear product-moment correlation coefficient, r. This quantity reflects the degree to which in-scanner head motion is related to a trait of interest and is reported in Supplementary Fig. 2.

Frame Censoring

Frame censoring excludes spurious variance at the cost of data loss⁶⁹. Therefore, determining the optimal frame censoring threshold is an empirical question. We considered the effect of no motion censoring (i.e. all frames included in analysis), censoring at an FD cutoff of 0.3 mm, censoring at a stringent FD cutoff of 0.2 mm, and censoring at a very stringent FD cutoff of 0.1 mm. Participants with less than 8 min of resting-state data were excluded from further analysis at each censoring threshold, including the no motion censoring threshold. This resulted in a different number of participants at each censoring threshold (Supplementary Fig. 4). The 8 min cutoff was selected due to the difficulty of estimating stable values for functional connectivity with shorter scan duration⁶⁴.
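As an illustration of the censoring and exclusion rules above, the following sketch builds a frame mask at a given FD threshold and checks the 8 min minimum. The function names are hypothetical, and the TR of 0.8 s is taken from the MR Imaging section; whether the pipeline applies additional rules (e.g. removing short isolated segments) is not stated here and is not modeled.

```python
import numpy as np

TR_SECONDS = 0.8     # repetition time reported in the MR Imaging section
MIN_MINUTES = 8.0    # minimum retained resting-state data per participant

def censor_frames(fd, threshold_mm=None):
    """Return a boolean mask of frames to keep; None means no censoring."""
    fd = np.asarray(fd, dtype=float)
    if threshold_mm is None:
        return np.ones(fd.shape, dtype=bool)
    return fd < threshold_mm

def has_enough_data(keep_mask, tr=TR_SECONDS, min_minutes=MIN_MINUTES):
    """True if the retained frames amount to at least min_minutes of data."""
    return bool(keep_mask.sum() * tr >= min_minutes * 60.0)
```

Applying `has_enough_data` separately at each threshold (none, 0.3, 0.2, 0.1 mm) reproduces the property noted above that the retained sample size differs by threshold.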

Functional Connectivity

We begin with the standard approaches for computing and making inferences about functional connectivity, which are familiar to all fMRI researchers. To avoid ambiguity, the methods are briefly summarized here, and diagrammed in Supplementary Fig. 9 for reference. By convention, each brain region or parcel is referred to as a “node” and the functional connections between nodes, which are referred to as “edges,” are computed as the pairwise linear correlation coefficients between nodes. A vector of edges is typically visualized in an edge, correlation, effect, or “connectivity” matrix. As correlations are constrained to vary from −1 to 1, the correlation coefficients are atanh (Fisher Z) transformed prior to regression. Inferences about trait-FC effects are made by modeling the atanh-transformed edges as a linear combination of some participant-wise trait of interest (e.g. matrix reasoning score) and some covariates (e.g. FD) using least-squares regression. The estimate of the trait-FC effect at each edge is divided by the standard error of the estimate to obtain a t-value at each edge.
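The conventional pipeline summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names are hypothetical, an intercept column is assumed in the design matrix, and edges are stored as the upper triangle of the correlation matrix.

```python
import numpy as np

def fc_edges(timeseries):
    """Upper-triangle, atanh (Fisher Z) transformed correlations.

    timeseries: array of shape (frames, nodes).
    """
    r = np.corrcoef(timeseries, rowvar=False)
    iu = np.triu_indices(r.shape[0], k=1)
    return np.arctanh(r[iu])

def edgewise_t(edges, trait, covariates=None):
    """t-value of the trait at each edge from least-squares regression.

    edges: (participants, n_edges); trait: (participants,);
    covariates: optional (participants,) or (participants, k), e.g. mean FD.
    """
    n = len(trait)
    cols = [np.ones(n), np.asarray(trait, dtype=float)]
    if covariates is not None:
        cols.extend(np.asarray(covariates, dtype=float).reshape(n, -1).T)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, edges, rcond=None)
    resid = edges - X @ beta
    dof = n - np.linalg.matrix_rank(X)
    sigma2 = (resid ** 2).sum(axis=0) / dof      # residual variance per edge
    xtx_inv = np.linalg.pinv(X.T @ X)
    se = np.sqrt(sigma2 * xtx_inv[1, 1])         # standard error of the trait term
    return beta[1] / se
```

Stacking `fc_edges` across participants gives the `edges` matrix, and `edgewise_t(edges, trait, covariates=mean_fd)` returns one t-value per edge.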

Methods for SHAMAN

Rationale

We begin by assuming traits and their effect on resting-state functional connectivity are stable over time whereas states vary with time⁷⁰. Here a trait is a variable such as height, intelligence score, or favorite color, which is stable over the timescale of an fMRI scan. Extrinsic variables like parental income are also considered traits for this purpose. State is a variable such as instantaneous respiratory rate or head motion that varies during an fMRI scan. Although it has been observed that the propensity to move is a trait which is stable over months to years⁶⁰, here we consider second-to-second variations in the FD timeseries.

It follows from this assumption that if we split an individual participant’s fMRI BOLD timeseries in half, we will draw the same individual- and group-level inferences on a trait from each half of the data (to within sampling error). If trait-specific connectivity is not associated with head motion then split-half inference will be invariant to the amount of motion in each half of the data. Thus, we can detect trait-specific motion-associated connectivity by comparing inferences when the data is split according to the FD timeseries (see Supplementary Fig. 13 for distribution of median FD) vs split at random.

Algorithm Steps

The SHAMAN (Split-Half Analysis of Motion-Associated Networks) algorithm is diagrammed in Fig. 2. Its steps are enumerated below.

1. An individual participant’s resting-state BOLD timeseries from one or more fMRI scans are concatenated (Fig. 2a).
2. The individual participant’s fMRI BOLD timeseries is split in half (Fig. 2b). One half (left) contains the half of the timeseries with the lowest FD (motion) values, and the other half (right) contains the half with the highest FD values.
3. A connectivity/edge matrix is computed separately for each half of the individual participant’s data (Fig. 2b). See Supplementary Fig. 10 for an illustration using real connectivity data.
4. Average (over time) FD is computed for each half of the data (Fig. 2c).
5. Steps 1-4 are repeated for each participant in the study.
6. The between-participant effect of motion is accounted for by regressing (across participants) mean FD out of each half of the connectivity data (Fig. 2c).
7. The difference in residuals between low- and high-motion halves from Step 6 is computed for each participant (Fig. 2d).
8. The trait/variable of interest is regressed against the difference in residuals from Step 7 (Fig. 2d), optionally in the presence of covariates.
9. The resultant connectivity matrix is the motion impact for that trait (Fig. 2e).

If motion processing works perfectly, or if participants do not move at all, then each half of the data will yield identical connectivity matrices, and the difference in residuals in Step 7 above (Fig. 2d) will be zero to within sampling error. Subsequently, the result of regression in Steps 8 and 9 will also be zero (Fig. 2e).

If spurious effects of motion remain in the data then the difference in residuals in Step 7 (Fig. 2d) will not be all zeros. If the trait/variable of interest is associated with head motion then the regression in Steps 8 and 9 will reveal the trait-specific motion impact (Fig. 2e).
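The enumerated steps can be sketched compactly in code. This is a hedged illustration rather than the released SHAMAN implementation: the function names are hypothetical, the difference in Step 7 is taken as high minus low (the sign convention is an assumption), an intercept is included in each regression, and covariates in Step 8 are omitted.

```python
import numpy as np

def fc_edges(ts):
    """Upper-triangle, atanh-transformed correlations for a (frames, nodes) array."""
    r = np.corrcoef(ts, rowvar=False)
    iu = np.triu_indices(r.shape[0], k=1)
    return np.arctanh(r[iu])

def residualize(y, x):
    """Regress x (plus an intercept) out of every column of y, across participants."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def motion_impact(ts_list, fd_list, trait):
    """ts_list: per-participant (frames, nodes) arrays, already concatenated (Step 1).
    fd_list: per-participant FD timeseries; trait: one value per participant."""
    low_edges, high_edges, low_fd, high_fd = [], [], [], []
    for ts, fd in zip(ts_list, fd_list):            # Step 5: loop over participants
        order = np.argsort(fd)                      # Step 2: split frames by FD
        half = len(fd) // 2
        low, high = order[:half], order[half:]
        low_edges.append(fc_edges(ts[low]))         # Step 3: connectivity per half
        high_edges.append(fc_edges(ts[high]))
        low_fd.append(fd[low].mean())               # Step 4: mean FD per half
        high_fd.append(fd[high].mean())
    res_low = residualize(np.array(low_edges), np.array(low_fd))      # Step 6
    res_high = residualize(np.array(high_edges), np.array(high_fd))
    diff = res_high - res_low                       # Step 7 (sign convention assumed)
    X = np.column_stack([np.ones(len(trait)), trait])                 # Step 8
    beta, *_ = np.linalg.lstsq(X, diff, rcond=None)
    return beta[1]                                  # Step 9: motion impact per edge
```

Replacing the FD-sorted split in Step 2 with a random split yields the permuted version of the same quantity used for inference.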

Statistical Inference

The SHAMAN algorithm reveals the trait-specific impact of motion. This connectivity matrix will be zero under the null hypothesis of zero motion impact. To detect significant motion impact, we test the omnibus alternative hypothesis that one or more edges in this matrix is non-zero. To avoid making strong parametric assumptions about the data, we use a permutation scheme followed by the non-parametric combining method described by Winkler et al.⁴³.

The permutation scheme is diagrammed in Supplementary Figs. 11 and 12 and described in greater detail in the following section. In Step 2 of the algorithm (Supplementary Fig. 11b) the data are split in half at random, without respect to the FD timeseries. The rest of the algorithm proceeds as before. The algorithm is repeated many times with different random splits (permutations) of the data to obtain a null distribution for the motion-associated connectivity matrix (Supplementary Fig. 11e).

Once a null distribution is generated, we use non-parametric combining⁴³ to test the null hypothesis that all edges in the motion-associated connectivity matrix are zero. Each edge is treated as a separate “modality” to avoid assumptions about exchangeability or clustering of edges. We selected Stouffer’s combining function^71,72 because Stouffer’s Z-score is well-known and has an intuitive interpretation. The non-parametric combining process yields a single, omnibus Stouffer’s Z-score for the entire connectivity matrix, which we call the motion impact score. Comparing the motion impact score to its null distribution yields an omnibus p-value that controls for family-wise error rate across the entire motion-associated connectivity matrix. When the p-value is small a significant amount of motion-associated connectivity is present. When the motion impact score (Stouffer’s Z-score) is large, a large amount of motion associated connectivity is present.
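A simplified sketch of non-parametric combining with Stouffer’s function is given below. It is an illustration under several assumptions and may differ from the procedure of Winkler et al.⁴³ in its details: the function name is hypothetical, the test is made two-sided by taking absolute values, per-edge p-values are computed from ranks with a denominator of (number of rows + 1) to avoid infinite Z-scores, and the same permutations are assumed to be shared across all edges.

```python
import numpy as np
from statistics import NormalDist

def stouffer_npc(observed, null):
    """observed: (n_edges,) motion impact; null: (n_perm, n_edges) permuted values.

    Returns (omnibus Stouffer Z, omnibus p-value).
    """
    stats = np.abs(np.vstack([observed, null]))      # row 0 is the observed data
    n_rows, n_edges = stats.shape
    # Per-edge permutation p-value for every row (ties broken arbitrarily).
    ranks = stats.argsort(axis=0).argsort(axis=0)    # 0 = smallest statistic
    p_edge = (n_rows - ranks) / (n_rows + 1.0)       # strictly inside (0, 1)
    z_edge = np.vectorize(NormalDist().inv_cdf)(1.0 - p_edge)
    # Stouffer's combining function applied to each row.
    combined = z_edge.sum(axis=1) / np.sqrt(n_edges)
    # Omnibus p-value: how often a row is at least as extreme as the observed row.
    p_omnibus = float(np.mean(combined >= combined[0]))
    return float(combined[0]), p_omnibus
```

Because the combined statistic is recomputed for every permutation, dependence between edges is carried into the null distribution rather than assumed away.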

Motion Blocks Permutation

Functional MRI timeseries are known to be weakly non-stationary and autocorrelated⁷³, therefore the assumption of exchangeability and unconstrained permutation of timepoints may not be appropriate. Specifically, consecutive timepoints could be randomly assigned to opposite halves of the data, making it more likely that the randomly (during permutation) split halves of the timeseries data would be similar. This would have the effect of making the difference between the randomly split halves smaller, thereby shrinking the null distribution toward zero and overestimating the significance of the motion impact score.

The problem of inference on weakly-dependent data is well-known in the statistical literature. Politis & Romano⁷⁴ describe the block and stationary bootstrap procedures for use on weakly-dependent data. Essentially, rather than resample individual timepoints with replacement, the Politis & Romano procedure resamples blocks of consecutive timepoints with replacement, where the block lengths are drawn from a geometric distribution. However, resampling the data into randomly-sized blocks makes it difficult to optimally separate the high- and low-motion halves of the data, since the randomly-sized blocks might not conform to the natural divisions of the motion timeseries.

Therefore, we devised a scheme we call “motion blocks” permutation that allows for optimal separation of high- and low-motion data while accounting for weak temporal dependency within the timeseries. The motion blocks permutation scheme is depicted in Supplementary Fig. 12. For the un-permuted split, the data are divided into high- and low-motion halves as usual, resulting in “blocks” of consecutive high- and low-motion data points. For the permuted split, the blocks, rather than individual data points, are randomly assigned to each half of the data. This strategy preserves the same amount of weak temporal dependence in the permuted data as in the un-permuted data.
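One way the motion blocks idea might be sketched is shown below. The function names are hypothetical, and the rule for balancing the two halves (filling the “high” half with randomly ordered blocks until it reaches its original size) is an assumption made for illustration; the text above does not specify how half sizes are equalized.

```python
import numpy as np

def find_blocks(high_mask):
    """Return (start, stop) pairs for runs of consecutive frames in the same half."""
    high_mask = np.asarray(high_mask, dtype=bool)
    change = np.flatnonzero(np.diff(high_mask.astype(int))) + 1
    starts = np.concatenate([[0], change])
    stops = np.concatenate([change, [len(high_mask)]])
    return list(zip(starts.tolist(), stops.tolist()))

def permute_blocks(high_mask, rng):
    """Randomly reassign whole blocks (not single frames) to the two halves."""
    high_mask = np.asarray(high_mask, dtype=bool)
    blocks = find_blocks(high_mask)
    target = int(high_mask.sum())        # size of the original high-motion half
    new_mask = np.zeros(len(high_mask), dtype=bool)
    filled = 0
    for i in rng.permutation(len(blocks)):
        if filled >= target:
            break
        start, stop = blocks[i]
        new_mask[start:stop] = True      # the whole block goes to one half
        filled += stop - start
    return new_mask
```

With `high_mask = fd > np.median(fd)`, each call to `permute_blocks(high_mask, rng)` gives one permuted split in which block boundaries, and hence the temporal dependence within blocks, are unchanged.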

The SHAMAN p-value Relates to Motion, Not Broader Significance

The p-value generated by the SHAMAN algorithm tests the omnibus null hypothesis that no edge (pairwise connection) in a variable/trait’s FC effect matrix is significantly obscured by or inflated by head motion. It does not test the hypothesis that the variable itself has a significant relationship with connectivity. On the contrary, a small p-value from SHAMAN (or a correspondingly large motion score) indicates that variable’s relationship with FC may be spuriously associated with head motion.

Motion Overestimation vs Underestimation Score

The main objective of SHAMAN is to detect how motion impacts trait-FC effect for specific traits. When motion impact has the same sign as the trait-FC effect (both positive or both negative) then motion causes overestimation of the size of the trait-FC effect and would tend to cause false positive inference (concluding the trait-FC effect is significant when it is not). When motion impact has the opposite sign of the trait-FC effect then motion causes underestimation of the size of the trait-FC effect and would tend to cause false negative inference (failing to detect a significant trait-FC effect when it really is significant). We can then compute separate motion impact scores, a motion overestimation score and a motion underestimation score, by counting only those edges whose motion impact is the same (overestimation) or opposite (underestimation) sign as the trait-FC effect. Distinguishing between over- and under-estimation requires us to know with some degree of certainty whether the trait-FC effect is positive or negative. We use a heuristic approach in which the trait-FC effect at a given edge is positive if its t-value is > 2 and negative if its t-value is < −2. If the t-value is near zero then we ignore the motion impact at that edge.
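The sign heuristic above can be expressed as a pair of edge masks. This is a minimal sketch with a hypothetical function name; the threshold of |t| > 2 follows the text, and edges with a motion impact of exactly zero are assigned to neither mask.

```python
import numpy as np

def split_over_under(impact, trait_t, t_threshold=2.0):
    """Return boolean masks (overestimation, underestimation) over edges.

    impact: motion impact at each edge; trait_t: trait-FC t-value at each edge.
    """
    impact = np.asarray(impact, dtype=float)
    trait_t = np.asarray(trait_t, dtype=float)
    # Sign of the trait-FC effect: +1, -1, or 0 when |t| is too small to call.
    trait_sign = np.where(trait_t > t_threshold, 1,
                          np.where(trait_t < -t_threshold, -1, 0))
    usable = trait_sign != 0
    over = usable & (np.sign(impact) == trait_sign)     # same sign
    under = usable & (np.sign(impact) == -trait_sign)   # opposite sign
    return over, under
```

The overestimation and underestimation scores would then be obtained by running the combining step only over the edges selected by each mask.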

Timeseries Simulation

In order to test the validity of the SHAMAN algorithm we devised a generative model for fMRI BOLD timeseries data for which the parameters of brain signal and motion can be experimentally controlled. Such a model necessarily requires simplifying assumptions since the true biophysical properties of the brain are not known. Several fMRI data simulators exist⁷⁵, but none allows for realistic simulation of head motion artifact. Therefore, for our simulator, we worked backwards from the assumptions of the conventional, massively univariate, resting-state functional connectivity regression model (Supplementary Fig. 9) in which between-participant variables of interest are predictors (the design matrix) and atanh (Fisher Z) transformed linear correlation coefficients between brain regions are the observed data.

We generated two basis correlation matrices with the same dimensions (394 × 394 nodes) as the Gordon-Laumann-Seitzman^67,68 brain parcellation containing bitmap representations of the text “brain” and “motion,” see Fig. 2 and Supplementary Fig. 14. The Cholesky decomposition of each basis correlation matrix was used to simulate the timeseries of brain and motion signals. The process for generating these simulated timeseries is diagrammed in Supplementary Fig. 15. Normally distributed random data were multiplied by the Cholesky decomposition of the respective basis correlation matrix to generate simulated “BRAIN” and “MOTION” timeseries for each simulated participant. We simulated 394 nodes x 1024 fMRI frames (time points) for 1024 simulated participants. The variance, across the 394 nodes, at each of 1024 time points in the motion timeseries was used as the FD timeseries for each participant.

The brain and motion timeseries were mixed together in controlled proportions. Working backwards, the 1024 participant x 1 predictor “brain” column of the across-participants design matrix (i.e. the trait variable) was drawn at random from the standard normal distribution and then shifted to obtain all positive numbers. The 1024 × 1 “motion” column of the design matrix was taken as the mean FD timeseries of each participant. The brain and motion timeseries for each participant were then mixed by addition in the time domain according to their proportions (brain, motion) of variance in the design matrix. For example, if a participant’s row in the design matrix were (1.6329, 0.2216) then the mixed timeseries was generated by mixed = sqrt(1.6329) * brain + sqrt(0.2216) * motion. An additional step detailed in the Supplementary Material was used to correct the variance of the mixed timeseries to satisfy the assumptions of linear regression on correlation matrices.
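The generation and mixing steps can be sketched as follows. The function names are hypothetical, the basis matrices are assumed to be valid (positive-definite) correlation matrices, and the additional variance correction described in the Supplementary Material is not reproduced.

```python
import numpy as np

def simulate_timeseries(basis_corr, n_frames, rng):
    """Draw a (frames, nodes) timeseries whose expected correlation is basis_corr."""
    chol = np.linalg.cholesky(basis_corr)            # lower-triangular factor
    white = rng.standard_normal((n_frames, basis_corr.shape[0]))
    return white @ chol.T

def simulated_fd(motion_ts):
    """Variance across nodes at each time point, used as the simulated FD trace."""
    return motion_ts.var(axis=1)

def mix(brain_ts, motion_ts, w_brain, w_motion):
    """Mix in the time domain; weights are variance proportions, hence the sqrt."""
    return np.sqrt(w_brain) * brain_ts + np.sqrt(w_motion) * motion_ts
```

For the design-matrix row in the example above, `mix(brain, motion, 1.6329, 0.2216)` gives the mixed timeseries for that participant.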

Node Analysis

We can quantify motion impact within individual nodes/parcels of the brain network. This may be advantageous when planning a region of interest analysis or to help visualize motion-associated connectivity in anatomical space. Instead of performing omnibus non-parametric combining over the entire connectivity matrix, we perform non-parametric combining to obtain a Stouffer’s Z-score and p-value over the edges connected to a single node or subset of nodes.

The process for node analysis is best illustrated in Supplementary Fig. 6. First, an omnibus p-value is computed across all edges. Then a Stouffer’s Z-score (motion impact score) is computed for each individual node using the edges connected to that node. The node with the largest motion impact score is excluded by deleting its connecting edges from the connectivity matrix. The whole process is then repeated until all nodes have been excluded. The thick red/black line in Supplementary Fig. 6 shows the omnibus p-value increasing (becoming less significant) from left to right as each successive node is excluded. Portions of the line below p = 0.05 are highlighted in red. The point at which the line intersects p = 0.05 corresponds to the number of nodes affected by motion-associated connectivity.
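The exclusion loop can be sketched in Python as follows. This is a simplified illustration, not the SHAMAN implementation: it takes a hypothetical symmetric matrix of edge-wise z-values, and it swaps in a one-sided normal p-value computed from Stouffer's Z as a parametric stand-in for the permutation-based omnibus p-value. All names are illustrative.

```python
import numpy as np
from math import erfc, sqrt


def stouffer(z):
    """Stouffer's combined Z; returns 0.0 when there are no edges to combine."""
    return z.sum() / np.sqrt(z.size) if z.size else 0.0


def p_from_z(z):
    """One-sided upper-tail normal p-value (parametric stand-in, an assumption)."""
    return 0.5 * erfc(z / sqrt(2))


def exclusion_curve(edge_z):
    """Repeatedly drop the node with the largest score, recording the omnibus p each time."""
    n = edge_z.shape[0]
    active = np.ones(n, dtype=bool)
    rows, cols = np.triu_indices(n, k=1)
    order, pvals = [], []
    while active.sum() >= 2:
        # Omnibus statistic over edges whose endpoints both remain.
        keep = active[rows] & active[cols]
        pvals.append(p_from_z(stouffer(edge_z[rows, cols][keep])))
        # Node-wise scores over each remaining node's remaining edges.
        scores = np.full(n, -np.inf)
        for node in np.flatnonzero(active):
            others = active.copy()
            others[node] = False
            scores[node] = stouffer(edge_z[node, others])
        worst = int(np.argmax(scores))
        order.append(worst)
        active[worst] = False
    return order, pvals


# Toy example: only node 0 carries signal (z = 3 on each of its edges).
n = 5
edge_z = np.zeros((n, n))
edge_z[0, 1:] = 3.0
edge_z[1:, 0] = 3.0

order, pvals = exclusion_curve(edge_z)
# Number of exclusions needed before the omnibus p-value reaches 0.05.
n_affected = next((i for i, p in enumerate(pvals) if p >= 0.05), len(pvals))
```

In this toy case node 0 is excluded first, after which the omnibus p-value is no longer below 0.05, so one node is counted as affected.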

Node analysis is also used to visualize motion-associated connectivity anatomically in Fig. 4c,d. The Stouffer’s Z-score, or “motion Z-score,” is computed for each individual node as above. The resultant motion scores are visualized on the cortical surface.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Participant level data from ABCD are openly available pursuant to consortium-level data access rules. The ABCD data repository grows and changes over time (https://nda.nih.gov/abcd). The ABCD data used in this study came from ABCD Annual Release 2.0 (https://doi.org/10.15154/1503209).

Code availability

Analysis code specific to this study can be found at https://github.com/DosenbachGreene/shaman. Code for processing ABCD and UKB data can be found at https://github.com/DCAN-Labs/abcd-hcp-pipeline.

References

Behzadi, Y., Restom, K., Liau, J. & Liu, T. T. A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. NeuroImage 37, 90–101 (2007).
Bolton, T. A. W. et al. Agito ergo sum: correlates of spatio-temporal motion characteristics during fMRI. NeuroImage 209, 116433 (2020).
Burgess, G. C. et al. Evaluation of denoising strategies to address motion-correlated artifacts in resting-state functional magnetic resonance imaging data from the human connectome project. Brain Connect. 6, 669–680 (2016).
Ciric, R. et al. Benchmarking of participant-level confound regression strategies for the control of motion artifact in studies of functional connectivity. NeuroImage 154, 174–187 (2017).
Dosenbach, N. U. F. et al. Real-time motion analytics during brain MRI improve data quality and reduce costs. NeuroImage 161, 80–93 (2017).
Fair, D. A. et al. Distinct neural signatures detected for ADHD subtypes after controlling for micro-movements in resting state functional connectivity MRI data. Front Syst Neurosci. 6, https://doi.org/10.3389/fnsys.2012.00080 (2013).
Griffanti, L. et al. ICA-based artefact removal and accelerated fMRI acquisition for improved resting state network imaging. NeuroImage 95, 232–247 (2014).
Kundu, P. et al. Integrated strategy for improving functional connectivity mapping using multiecho fMRI. Proc. Natl. Acad. Sci. USA 110, 16187–16192 (2013).
Makowski, C., Lepage, M. & Evans, A. C. Head motion: the dirty little secret of neuroimaging in psychiatry. J. Psychiatry Neurosci. 44, 62–68 (2019).
Mowinckel, A. M., Espeseth, T. & Westlye, L. T. Network-specific effects of age and in-scanner subject motion: a resting-state fMRI study of 238 healthy adults. NeuroImage 63, 1364–1373 (2012).
Muschelli, J. et al. Reduction of motion-related artifacts in resting state fMRI using aCompCor. NeuroImage 96, 22–35 (2014).
Nielsen, A. N. et al. Evaluating the prediction of brain maturity from functional connectivity after motion artifact denoising. Cereb. Cortex. 29, 2455–2469 (2019).
Patel, A. X. et al. A wavelet method for modeling and despiking motion artifacts from resting-state fMRI time series. NeuroImage 95, 287–304 (2014).
Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L. & Petersen, S. E. Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. NeuroImage 59, 2142–2154 (2012).
Pruim, R. H. R. et al. ICA-AROMA: A robust ICA-based strategy for removing motion artifacts from fMRI data. NeuroImage 112, 267–277 (2015).
Salimi-Khorshidi, G. et al. Automatic denoising of functional MRI data: Combining independent component analysis and hierarchical fusion of classifiers. NeuroImage 90, 449–468 (2014).
Satterthwaite, T. D. et al. Impact of in-scanner head motion on multiple measures of functional connectivity: relevance for studies of neurodevelopment in youth. NeuroImage 60, 623–632 (2012).
Siegel, J. S. et al. Data quality influences observed links between functional connectivity and behavior. Cereb. Cortex. 27, 4492–4502 (2017).
Van Dijk, K. R. A., Sabuncu, M. R. & Buckner, R. L. The influence of head motion on intrinsic functional connectivity MRI. NeuroImage 59, 431–438 (2012).
Yan, C. G. et al. A comprehensive assessment of regional variation in the impact of head micromovements on functional connectomics. NeuroImage 76, 183–201 (2013).
Casey, B. J. et al. The Adolescent Brain Cognitive Development (ABCD) study: Imaging acquisition across 21 sites. Dev. Cogn. Neurosci. 32, 43–54 (2018).
Greene, D. J. et al. Behavioral interventions for reducing head motion during MRI scans in children. NeuroImage 171, 234–245 (2018).
Pagliaccio, D. et al. Anomalous functional brain activation following negative mood induction in children with pre-school onset major depression. Dev. Cogn. Neurosci. 2, 256–267 (2012).
Vanderwal, T., Kelly, C., Eilbott, J., Mayes, L. C. & Castellanos, F. X. Inscapes: A movie paradigm to improve compliance in functional magnetic resonance imaging. NeuroImage 122, 222–232 (2015).
Fair, D. A. et al. Correction of respiratory artifacts in MRI head motion estimates. NeuroImage 208, 116400 (2020).
Friston, K. J., Williams, S., Howard, R., Frackowiak, R. S. J. & Turner, R. Movement-related effects in fMRI time-series. Magn. Reson. Med. 35, 346–355 (1996).
Power, J. D. et al. Methods to detect, characterize, and remove motion artifact in resting state fMRI. NeuroImage 84, 320–341 (2014).
Power, J. D. et al. Ridding fMRI data of motion-related influences: removal of signals with distinct spatial and physical bases in multiecho data. Proc. Natl. Acad. Sci. USA 115, https://doi.org/10.1073/pnas.1720985115 (2018).
Satterthwaite, T. D. et al. An improved framework for confound regression and filtering for control of motion artifact in the preprocessing of resting-state functional connectivity data. NeuroImage 64, 240–256 (2013).
Tyszka, J. M., Kennedy, D. P., Paul, L. K. & Adolphs, R. Largely typical patterns of resting-state functional connectivity in high-functioning adults with autism. Cereb. Cortex. 24, 1894–1905 (2014).
Nebel, M. B. et al. Accounting for motion in resting-state fMRI: What part of the spectrum are we characterizing in autism spectrum disorder?. NeuroImage 257, 119296 (2022).
Murphy, K. & Fox, M. D. Towards a consensus regarding global signal regression for resting state functional connectivity MRI. NeuroImage 154, 169–173 (2017).
Gratton, C. et al. Removal of high frequency contamination from motion estimates in single-band fMRI saves data without biasing functional connectivity. NeuroImage 217, 116866 (2020).
Havsteen, I. et al. Are movement artifacts in magnetic resonance imaging a real problem?—A narrative review. Front Neurol. 8, 232 (2017).
Griffanti, L. et al. Hand classification of fMRI ICA noise components. NeuroImage 154, 188–205 (2017).
Alfaro-Almagro, F. et al. Image processing and Quality Control for the first 10,000 brain imaging datasets from UK Biobank. NeuroImage 166, 400–424 (2018).
Feczko, E. et al. Adolescent Brain Cognitive Development (ABCD) Community MRI Collection and Utilities. Neuroscience; https://doi.org/10.1101/2021.07.09.451638 (2021).
Glasser, M. F. et al. The minimal preprocessing pipelines for the Human Connectome Project. NeuroImage 80, 105–124 (2013).
Hagler, D. J. et al. Image processing and analysis methods for the Adolescent Brain Cognitive Development Study. NeuroImage 202, 116091 (2019).
Power, J. D., Schlaggar, B. L. & Petersen, S. E. Recent progress and outstanding issues in motion correction in resting state fMRI. NeuroImage 105, 536–551 (2015).
Phạm, D. Đ., McDonald, D. J., Ding, L., Nebel, M. B. & Mejia, A. F. Less is more: balancing noise reduction and data retention in fMRI with data-driven scrubbing. NeuroImage 270, 119972 (2023).
Kay, B. P. et al. Stimulant medications affect arousal and reward, not attention. bioRxiv https://doi.org/10.1101/2025.05.19.654915 (2025).
Winkler, A. M. et al. Non-parametric combination and related permutation tests for neuroimaging. Hum. Brain Mapp. 37, 1486–1511 (2016).
Noble, S., Mejia, A. F., Zalesky, A. & Scheinost, D. Improving power in functional magnetic resonance imaging by moving beyond cluster-level inference. Proc. Natl. Acad. Sci. USA 119, e2203020119 (2022).
Jernigan, T. L. & Brown, S. A. Introduction. Dev. Cogn. Neurosci. 32, 1–3 (2018).
Volkow, N. D. et al. The conception of the ABCD study: From substance use to a broad NIH collaboration. Dev. Cogn. Neurosci. 32, 4–7 (2018).
Barch, D. M. et al. Demographic, physical and mental health assessments in the adolescent brain and cognitive development study: Rationale and description. Dev. Cogn. Neurosci. 32, 55–66 (2018).
Luciana, M. et al. Adolescent neurocognitive development and impacts of substance use: Overview of the adolescent brain cognitive development (ABCD) baseline neurocognition battery. Dev. Cogn. Neurosci. 32, 67–79 (2018).
Marek, S. et al. Reproducible brain-wide association studies require thousands of individuals. Nature 603, 654–660 (2022).
Elam, J. S. et al. The human connectome project: a retrospective. NeuroImage 244, 118543 (2021).
Van Essen, D. C. et al. The WU-Minn human connectome project: an overview. NeuroImage 80, 62–79 (2013).
Smith, S. M. et al. Resting-state fMRI in the human connectome project. NeuroImage 80, 144–168 (2013).
Wechsler, D. & Pearson Education, I. Psychological Corporation. WISC-V: Wechsler Intelligence Scale for Children. NCS Pearson, Inc. PsychCorp; 2014.
Grummer-Strawn, L. M., Reinold, C., Krebs, N. F. & Centers for Disease Control and Prevention (CDC). Use of World Health Organization and CDC growth charts for children aged 0-59 months in the United States. MMWR Recomm. Rep. 59, 1–15 (2010).
Laumann, T. O. et al. Functional system and areal organization of a highly sampled individual human brain. Neuron 87, 657–670 (2015).
Afyouni, S. & Nichols, T. E. Insight and inference for DVARS. NeuroImage 172, 291–312 (2018).
Smyser, C. D., Snyder, A. Z. & Neil, J. J. Functional connectivity MRI in infants: Exploration of the functional organization of the developing brain. NeuroImage 56, 1437–1452 (2011).
Abreu, R., Leal, A. & Figueiredo, P. Identification of epileptic brain states by dynamic functional connectivity analysis of simultaneous EEG-fMRI: a dictionary learning approach. Sci. Rep. 9, 638 (2019).
Couvy-Duchesne, B. et al. Heritability of head motion during resting state functional MRI in 462 healthy twins. NeuroImage 102, 424–434 (2014).
Engelhardt, L. E. et al. Children’s head motion during fMRI tasks is heritable and stable over time. Dev. Cogn. Neurosci. 25, 58–68 (2017).
Hodgson, K. et al. Shared genetic factors influence head motion during MRI and body mass index. Cereb Cortex. Published online October 15, cercor;bhw321v1 (2016).
Zeng, L. L. et al. Neurobiological basis of head motion in brain imaging. Proc. Natl. Acad. Sci. USA 111, 6058–6062 (2014).
Jernigan, T. & Jernigan, T. Adolescent Brain Cognitive Development Study (ABCD) - Annual Release 2.0. Published online https://doi.org/10.15154/1503209 (2019).
Birn, R. M. et al. The effect of scan length on the reliability of resting-state fMRI connectivity estimates. NeuroImage 83, 550–558 (2013).
Jenkinson, M., Beckmann, C. F., Behrens, T. E. J., Woolrich, M. W. & Smith, S. M. FSL. NeuroImage 62, 782–790 (2012).
Fischl, B. Automatically parcellating the human cerebral cortex. Cereb. Cortex. 14, 11–22 (2004).
Gordon, E. M. et al. Generation and evaluation of a cortical area parcellation from resting-state correlations. Cereb. Cortex. 26, 288–303 (2016).
Seitzman, B. A. et al. A set of functionally-defined brain regions with improved representation of the subcortex and cerebellum. NeuroImage 206, 116290 (2020).
Raut, R. V., Mitra, A., Snyder, A. Z. & Raichle, M. E. On time delay estimation and sampling error in resting-state fMRI. NeuroImage 194, 211–227 (2019).
Gratton, C. et al. Functional brain networks are dominated by stable group and individual factors, not cognitive or daily variation. Neuron 98, 439–452.e5 (2018).
Stouffer, S. A., Suchman, E. A., Devinney, L. C., Star, S. A. & Williams, R. M. Jr. The American Soldier: Adjustment during Army Life. (Studies in Social Psychology in World War II), Vol. 1. Princeton Univ. Press; xii 599 (1949).
van Zwet, W. R. & Oosterhoff, J. On the combination of independent test statistics. Ann. Math. Stat. 38, 659–680 (1967).
Guan, S. et al. The profiles of non-stationarity and non-linearity in the time series of resting-state brain networks. Front Neurosci. 14, 493 (2020).
Politis, D. N. & Romano, J. P. The stationary bootstrap. J. Am. Stat. Assoc. 89, 1303–1313 (1994).
Ellis, C. T., Baldassano, C., Schapiro, A. C., Cai, M. B. & Cohen, J. D. Facilitating open-science with realistic fMRI simulation: validation and application. PeerJ 8, e8564 (2020).

Acknowledgements

ABCD Acknowledgement: Data used in the preparation of this article were obtained from the Adolescent Brain Cognitive Development (ABCD) Study (https://abcdstudy.org), held in the NIMH Data Archive (NDA). This is a multisite, longitudinal study designed to recruit more than 10,000 children aged 9–10 and follow them over 10 years into early adulthood. The ABCD Study is supported by the National Institutes of Health and additional federal partners under award numbers U01DA041022, U01DA041028, U01DA041048, U01DA041089, U01DA041106, U01DA041117, U01DA041120, U01DA041134, U01DA041148, U01DA041156, U01DA041174, U24DA041123, U24DA041147, U01DA041093, and U01DA041025. A full list of supporters is available at https://abcdstudy.org/federal-partners.html. A listing of participating sites and a complete listing of the study investigators can be found at https://abcdstudy.org/scientists/workgroups/. ABCD consortium investigators designed and implemented the study and/or provided data but did not necessarily participate in analysis or writing of this report. This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH or ABCD consortium investigators. The ABCD data repository grows and changes over time. The ABCD data used in this report came from Annual Release 2.0, https://doi.org/10.15154/1503209.
This work was supported by NIH grants MH100019 (SM), MH121518 (SM), MH123091 (AZ), NS115672 (AZ), NS133486 (JLR), DA057486 (BT-C), MH129616 (TOL), NS110332 (DJN), MH100019 (NAS), MH129493 (DMB), NS123345 (BPK), NS098482 (BPK), NS098577 (AZS), DA041148 (DAF), DA04112 (DAF), MH115357 (DAF), MH096773 (DAF and NUFD), MH122066 (EMG, DAF, and NUFD), MH121276 (EMG, DAF, and NUFD), MH124567 (EMG, DAF, and NUFD), NS129521 (EMG, DAF, and NUFD), NS140256 (EMG and NUFD), and NS088590 (NUFD); by the National Spasmodic Dysphonia Association (EMG); by Mallinckrodt Institute of Radiology pilot funding (EMG); by the McDonnell Center for Systems Neuroscience (AZ); by the Taylor Family Institute Fund for Innovative Psychiatric Research (TOL); by the Andrew Mellon Predoctoral Fellowship from the Dietrich School of Arts & Sciences, University of Pittsburgh (BTC); and by the Extreme Science and Engineering Discovery Environment (XSEDE) Bridges at the Pittsburgh Supercomputing Center through allocation TG-IBN200009 (BTC).

Author information

Authors and Affiliations

Department of Neurology, Washington University School of Medicine, St. Louis, MO, USA
Benjamin P. Kay, David F. Montez, Babatunde Adeyemo, Athanasia Metoki, Roselyne J. Chauvin, Andrew N. Van, Vahdeta Suljic, Samuel R. Krimmel, Ryland L. Miller, Dillan J. Newbold, Annie Zheng, Nicole A. Seider, Kristen M. Scheidter, Julia S. Monk, Abraham Z. Snyder & Nico U. F. Dosenbach
Department of Psychiatry, Washington University School of Medicine, St Louis, MO, USA
Scott Marek, Joshua S. Siegel, Babatunde Adeyemo, Timothy O. Laumann & Deanna M. Barch
Mallinckrodt Institute of Radiology, Washington University School of Medicine, St. Louis, MO, USA
Scott Marek, Timothy O. Laumann, Evan M. Gordon, Abraham Z. Snyder, Deanna M. Barch & Nico U. F. Dosenbach
Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, MN, USA
Brenden Tervo-Clemmens, Eric Feczko, Anita Randolph, Óscar Miranda-Domínguez, Lucille A. Moore, Anders J. Perrone, Gregory M. Conan & Damien A. Fair
Department of Psychiatry and Behavioral Sciences, University of Minnesota, Minneapolis, MN, USA
Brenden Tervo-Clemmens
Institute for Translational Neuroscience, University of Minnesota, Minneapolis, MN, USA
Brenden Tervo-Clemmens
Department of Psychiatry, NYU Langone Medical Center, New York, NY, USA
Joshua S. Siegel
Department of Biomedical Engineering, Washington University in St Louis, St Louis, MO, USA
Andrew N. Van & Nico U. F. Dosenbach
Department of Neurology, NYU Langone Medical Center, New York, NY, USA
Dillan J. Newbold
Department of Pediatrics, University of Minnesota, Minneapolis, MN, USA
Eric Feczko, Anita Randolph, Óscar Miranda-Domínguez & Damien A. Fair
Data Science and Sharing Team, National Institute of Mental Health, NIH, DHHS, Bethesda, MD, USA
Eric A. Earl
Department of Psychology, University of Minnesota, Minneapolis, MN, USA
Stephen M. Malone
Joint Doctoral Program in Clinical Psychology, University of California San Diego, San Diego, CA, USA
Michaela Cordova
Department of Psychiatry, Oregon Health & Science University, Portland, OR, USA
Olivia Doyle & Alice M. Graham
Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
Benjamin J. Lynch & James C. Wilgenbusch
University of Minnesota Informatics Institute, University of Minnesota, Minneapolis, MN, USA
Thomas Pengo
Taylor Family Department of Neurosurgery, Washington University School of Medicine, St. Louis, MO, USA
Jarod L. Roland
Department of Psychological and Brain Sciences, Washington University in Saint Louis, St Louis, MO, USA
Deanna M. Barch & Nico U. F. Dosenbach
Institute of Child Development, University of Minnesota, Minneapolis, MN, USA
Damien A. Fair
Department of Pediatrics, Washington University School of Medicine, St Louis, MO, USA
Nico U. F. Dosenbach

Authors

Benjamin P. Kay
David F. Montez
Scott Marek
Brenden Tervo-Clemmens
Joshua S. Siegel
Babatunde Adeyemo
Timothy O. Laumann
Athanasia Metoki
Roselyne J. Chauvin
Andrew N. Van
Vahdeta Suljic
Samuel R. Krimmel
Ryland L. Miller
Dillan J. Newbold
Annie Zheng
Nicole A. Seider
Kristen M. Scheidter
Julia S. Monk
Eric Feczko
Anita Randolph
Óscar Miranda-Domínguez
Lucille A. Moore
Anders J. Perrone
Gregory M. Conan
Eric A. Earl
Stephen M. Malone
Michaela Cordova
Olivia Doyle
Benjamin J. Lynch
James C. Wilgenbusch
Thomas Pengo
Alice M. Graham
Jarod L. Roland
Evan M. Gordon
Abraham Z. Snyder
Deanna M. Barch
Damien A. Fair
Nico U. F. Dosenbach

Contributions

B.P.K. is the corresponding author. B.P.K., D.F.M., S.F.M., B.T.C., J.S.S., B.A., T.O.M., A.M., R.J.C., A.N.V., S.R.K., D.J.N., A.Z., N.A.S., E.F., A.R., O.M.D., J.L.R., E.M.G., A.Z.S., D.M.B., D.A.F., and N.U.F.D. made substantial contributions to the conception and design of the analysis. R.L.M., K.M.S., J.M., L.A.M., A.J.P., G.M.C., E.A.E., S.M.M., M.C., O.D., B.J.L., J.C.W., T.P., and A.M.G. made substantial contributions to data acquisition and analysis. V.S. made substantial contributions to the SHAMAN software.

Corresponding author

Correspondence to Benjamin P. Kay.

Ethics declarations

Competing interests

D.A.F. and N.U.F.D. have a financial interest in Turing Medical Inc. and may financially benefit if the company is successful in marketing FIRMM motion-monitoring software products. D.A.F. and N.U.F.D. may receive royalty income based on FIRMM technology developed at Washington University School of Medicine and Oregon Health and Sciences University and licensed to Turing Medical Inc. D.A.F. and N.U.F.D. are co-founders of Turing Medical Inc. These potential conflicts of interest have been reviewed and are managed by Washington University School of Medicine, Oregon Health and Sciences University and the University of Minnesota. The other authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Supplementary Data 8

Supplementary Data 9

Supplementary Data 10

Supplementary Data 11

Supplementary Data 12

Supplementary Data 13

Supplementary Data 14

Reporting Summary

Transparent Peer Review file

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kay, B.P., Montez, D.F., Marek, S. et al. Motion impact score for detecting spurious brain-behavior associations. Nat Commun 16, 8614 (2025). https://doi.org/10.1038/s41467-025-63661-2

Received: 26 December 2023
Accepted: 21 August 2025
Published: 29 September 2025
Version of record: 29 September 2025
DOI: https://doi.org/10.1038/s41467-025-63661-2
