Introduction

Motor skills are essential for school-aged children to successfully perform daily activities, academic tasks, and sports within a school environment1. The Movement Assessment Battery for Children, Second Edition (MABC-2), is a globally recognized tool for assessing motor skills in this population2,3. It consists of eight tasks categorized into three domains: manual dexterity, aiming and catching, and balance4. Due to its brevity and engaging nature, the MABC-2 is widely used in both clinical and research settings as the gold standard to identify motor difficulties in children, and its normative data has been extensively used to identify children with developmental coordination disorder (DCD).

The MABC-2 was developed and normed in the U.K4. , but its applicability to school-aged children in Eastern cultures remains uncertain. In Taiwan, where no Chinese version of the MABC-2 is available, clinical practice relies on the UK norms from the original manual to convert raw scores into standard scores and assess motor skill deficits in children. However, motor skill development is profoundly influenced by social and cultural factors5,6,7, as children’s experiences with physical and manipulative activities may shape their motor abilities. Huang et al. found that preschool children in Taiwan outperformed their UK counterparts in manual dexterity and balance tasks, whereas UK children excelled in aiming and catching5. Similarly, Hua’s study of children in China reported comparable trends8. These findings suggest that lifestyle, educational practices, and cultural factors play a crucial role in shaping motor development. As a result, it is essential to investigate motor differences across different populations to ensure that assessments like the MABC-2 accurately reflect children’s motor abilities in children worldwide and provides valuable insights for researchers and clinicians in establishing reasonable expectations for children’s motor development.

Beyond comparing overall motor performance across cultures, it is also crucial to examine item-level performance on the MABC-2 to determine whether specific tasks are culturally appropriate. This includes investigating ceiling and floor effects, which can significantly impact the assessment tool’s sensitivity by limiting its ability to detect subtle motor difficulties or developmental progress9,10. When too many children score at either the maximum or minimum level, the test fails to capture meaningful individual differences, reducing its clinical and research utility. Huang et al.5 identified ceiling effects in three out of eight tasks when assessing preschool children in Taiwan, suggesting that certain items may not be sufficiently challenging for this population. Given the shared cultural background, similar effects may exist in school-aged children, yet there is currently little evidence in this age group. Without such data, the specific strengths and weaknesses of children in Eastern cultures remain unclear, increasing the risk that some tasks may fail to differentiate motor skills effectively, potentially leading to misclassification or inaccurate assessments. Investigating floor and ceiling effects in school-aged children is therefore essential for ensuring accurate identification of motor skill deficits across diverse populations. Such research would not only enhance the cross-cultural validity of the MABC-2 but also serve as a reference for refining the test to improve its applicability in Eastern cultures.

Furthermore, it is equally important to assess the psychometric properties of the MABC-2 when applied in different cultural settings. While the assessment’s reliability and validity have been extensively examined, there is a lack of evidence regarding its psychometric properties in school-aged children in Taiwan. Existing studies have primarily focused on preschool children or populations with DCD and intellectual disabilities in Taiwan and China8,11. For example, Hua et al. reported good internal consistency, strong inter-rater and test–retest reliability, and satisfactory construct validity in a large sample of preschool children8. Similarly, Wuang’s study on children with DCD demonstrated good internal consistency and excellent test–retest reliability11. However, psychometric properties are highly sample-dependent, and the reliability, validity, and responsiveness of the MABC-2 in school-aged children in Taiwan remain largely unknown. Conducting such evaluations is essential to ensure that the assessment produces accurate and meaningful results for this population.

As children transition into elementary school, the demands on their motor skills increase, making it crucial to understand their motor development and identify motor skill deficits among this period. Although the MABC-2 is widely used for motor skill assessment globally, its applicability and psychometric properties in Eastern cultures, particularly among school-aged children, remain largely unknown. This gap in knowledge limits the accurate interpretation of its results in these populations. Huang et al.5 previously used the MABC-2 to examine motor skill differences between Eastern and Western cultures; however, their study primarily focused on preschool children. Given the substantial differences in educational systems, parental expectations, and daily activities between preschoolers and school-aged children, performance trends may also vary between these age groups. Therefore, it is necessary to comprehensively investigate motor skill differences in school-aged children at both overall and item levels while also examining the psychometric properties of the MABC-2.

This study had two primary aims. First, it sought to evaluate both overall and item-level differences in motor performance between children in Taiwan and those in the U.K. Second, it aimed to examine the psychometric properties of the MABC-2, including test–retest and inter-rater reliability, construct validity, convergent validity, predictive validity, and responsiveness. We hypothesized that significant differences would be observed in certain domains and tasks of the MABC-2, underscoring the importance of considering cultural influences when adopting norms from other countries. Additionally, we expected the MABC-2 to demonstrate acceptable to excellent psychometric properties. The findings of this study will provide valuable insights into the cross-cultural applicability of the MABC-2, enhancing its cultural validity and ensuring its reliability and effectiveness in assessing motor skills across diverse populations.

Methods

Participants

Children in this study were recruited from various sources, including elementary schools, a rehabilitation clinic, and a youth developmental center in Taipei, New Taipei, Tainan, and Changhua. These locations are representative of western Taiwan, where the majority of the population resides. Inclusion criteria were children: (1) enrollment in elementary school (age 6 to 12 years), (2) absence of significant motor disabilities such as cerebral palsy, and (3) ability to follow at least two-step instructions. The exclusion criterion was parents who were unable to read Mandarin Chinese. Because the different age bands of the MABC-2 are applicable to children aged 3 to 6 (age band 1), 7 to 10 (age band 2) and 11 to 17 (age band 3) years, the children recruited in our study were further stratified into three age groups according to the different age bands of the MABC-2. This study received ethical approvals from National Taiwan University Hospital and E-DA Hospital.

Measures

The movement assessment battery for Children, second edition (MABC-2)4

The MABC-2 is a motor skills assessment comprising 8 items grouped into 3 domains: manual dexterity, aiming and catching, and balance. Higher domain scores indicate better motor skills, and the sum of these domain scores provides an overall performance score. These summed scores are further converted into standard scores and percentiles. The mean of the standard score is 10, with SD of 3. The percentiles can be interpreted using a traffic light system. Percentiles equal to or lower than 5% indicate significant motor difficulties (red light), while those between 6% and 15% suggest borderline motor difficulties (yellow light). Percentiles above 15% suggest no significant motor difficulties (green light). The psychometric properties of the MABC-2 have been validated in several countries, ensuring its reliability and accuracy as an assessment tool8,11,12,13,14.

In this study, children administered the MABC-2 had their performance converted to standard scores on each domain and the total scale, based on the U.K. standardized sample. The standard scores were used to examine the floor and ceiling effects, as well as the psychometric properties of the MABC-2 and to compare the motor skills with the standardized sample from the U.K.

In addition, The MABC-2 checklist, a supplementary parent-report questionnaire, was used for examining the convergent validity of the MABC-2. The MABC-2 checklist assesses children’s performance in executing motor tasks in the school context. Each item is rated on a 4-point scale, with higher scores indicating worse motor performance.

The vineland adaptive behavior Scale, third edition (VABS-3)15

The VABS-3 was used as the criterion to assess the predictive validity of the MABC-2. It consists of three subscales: Caring for Self, Caring for Home, and Caring for Community. In this study, the Caring for Self and Caring for Home subscales were selected. Items within these subscales were rated on a 3-point scale. A higher sum score on the caring for self and caring for home subscales indicates a greater proficiency in adaptive behaviors related to self-care and domestic settings. The VABS-3 has good psychometric properties15,16,17.

Procedures

Researchers responsible for data collection underwent rigorous training in administering the MABC-2. They were either occupational therapists or graduates of an occupational therapy program. As all graduates had completed internship training and had experience working with children with health conditions, they met the MABC-2 administration requirements.

Before data collection, researchers administering the MABC-2 received training from a senior occupational therapist (the corresponding author), who had extensive clinical experience with the MABC-2. They practiced administering the assessment with colleagues for at least two weeks, and their performance was evaluated at least twice. Researchers could begin data collection only after demonstrating smooth administration and achieving over 90% consensus in rating scores with other researchers. This training process ensured the quality and reliability of the evaluations.

To recruit participants for the study, we reached out to various sources, including school teachers, clinicians working in a rehabilitation clinic, and administrators at a youth development center. Facilities that expressed willingness to participate in the study were requested to distribute a cover letter explaining our research project and the informed consent forms to parents. Parents who agreed to have their children participate in the study returned the signed informed consent forms to the researchers. The assessment process involved three sessions for each child: the initial evaluation, a retest, and a follow-up at six months. During the initial evaluation, two researchers worked together to administer the MABC-2 assessment to each child. One researcher served as the main evaluator, while the other concurrently rated the child’s performance for inter-rater reliability. In addition to the MABC-2 assessment, parents were also asked to complete the MABC-2 checklist during the initial evaluation. The retest session occurred within the 2 weeks following the first session. Only the main evaluator at the first session administered the MABC-2. Six months later, at the follow-up session, parents were requested to fill out the caring for self and caring for home subscales of the VABS-3.

Statistical analyses

The comparison of motor skills between children in Taiwan and the standardized sample from the U.K.

To compare motor skill differences across countries, we retrieved the raw mean scores and standard deviations for each item from the normative sample aged 6 to 10 years reported by Ke et al.18. Differences in motor skills for children aged 11 to 12 years were not analyzed due to the unavailability of raw mean scores and standard deviations, as well as insufficient sample sizes for these age groups. Tests of differences between two means using published summary statistics were conducted to compare the two samples.

The floor and ceiling effects of the MABC-2

The raw scores were examined using descriptive analyses. We examined the floor and ceiling effects for each item in different age groups (1 year as a group interval). More than 20% of the children achieving the highest or lowest raw scores on a particular item was indicative of a ceiling or flooring effect.

Reliability

Because Taiwan-specific normative data for the MABC-2 were unavailable, raw scores were transformed into age group–based z scores using the means and standard deviations of our sample within each age group, and were subsequently summed to generate domain scores and a total scale score for the examination of psychometric properties.

The intraclass correlation coefficient (ICC) (2,1) was used to evaluate test–retest and inter-rater reliability, which is a two-way random-effects model with a single rating. ICC (2,1) values greater than 0.9 were considered to indicate excellent reliability, while those exceeding 0.7 denoted good reliability, and values above 0.6 were considered acceptable for both test–retest and inter-rater reliability18,19.

To examine the magnitude of random measurement errors for the inter-rater and test–retest reliability of the MABC-2, we applied the standard scores to children in Taiwan and calculated the Minimal Detectable Change (MDC) along with its corresponding MDC percentage. The MDC is calculated by the following formula

$$\:\text{M}\text{D}\text{C}=1.96\times\:\text{S}\text{E}\text{M}\times\:\surd\:2.$$

Where SEM = \(\:1\times\:\sqrt{(1-\text{r}\text{e}\text{l}\text{i}\text{a}\text{b}\text{i}\text{l}\text{i}\text{t}\text{y})},\) and reliability is test-retest or inter-rater reliability. For z-score, an MDC percentage above 100% indicates a notable presence of random measurement error in the assessment and thus potential sources of variability within the results20,21,22.

Validity

Convergent validity was examined with Pearson’s correlation coefficients (r) between the MABC-2 total standard score and the MABC-2 checklist total score. Moderate to high correlations indicate good convergent validity.

Predictive validity refers to the extent to which a test can accurately forecast future performance or outcomes on a related measure. It evaluates how well scores from one assessment correlate with later behavior, performance, or an established criterion. According to the Standards for Educational and Psychological Testing, the criterion variable is a measure of an attribute or outcome that is operationally distinct from the test itself. The test is not a direct measure of the criterion but rather a hypothesized predictor of that targeted outcome. Previous studies have shown an association between motor skills and children’s daily living skills in personal and domestic contexts, so the personal living and domestic living subscales was used as predictive criteria for the MABC-2. Predictive validity was examined with Pearson’s correlation coefficients (r) between the MABC-2 at initial evaluation and the caring for self and caring for home subscales of the VABS-3 at 6-month follow-up. Small to moderate correlations indicate good predictive validity23.

Construct validity was examined using confirmatory factor analyses. We employed two models to explore the factor structure of the MABC-2, as depicted in Fig. 1. The first model assessed whether the 8 tasks individually contributed to the 3 domains, while the second model investigated if the collective influence of the 8 tasks contributed to the overall motor skill performance. The two models helped to justify the appropriateness of using the domain and total scores of the MABC-2. The evaluation of model fit relied on the root mean square error of approximation (RMSEA), chi-squared, comparative fit index (CFI), Goodness-of-fit index (GFI), Adjusted goodness-of-fit (AGFI) and standardized root mean square residual (SRMR). Model performance was deemed satisfactory when meeting the criteria indicative of a well-fitting model: RMSEA < 0.06; CFI, GFI, and AGFI > 0.9; and SRMR < 0.05.

Fig. 1
Fig. 1
Full size image

The factor structure of the MABC-2.

Responsiveness

Paired t-tests were used to investigate the difference in scores between the initial evaluation and the 6-month follow-up assessments. In addition, we calculated the effect sizes, with values of 0.2, 0.5, and 0.8 signifying small, moderate, and large responsiveness, respectively.

Results

Characteristic of participants

A sample of 257 children was recruited in our study, as detailed in Table 1. Of the total sample, 71 children received single assessments, 168 were assessed twice within 2 weeks, and 90 completed follow-up evaluations after a 6-month period. The majority of the children (76.4%) had no diagnosis of developmental disabilities. Table 1 also provides a breakdown of diagnoses among children with developmental disabilities, such as autism spectrum disorder and attention deficit hyperactivity disorder. Additionally, a significant proportion of parents demonstrated a high level of education, with 90.31% of mothers and 84.5% of fathers having completed education beyond the secondary stage.

Table 1 The demographics of the participants (N = 257).

The comparison of motor skills between children in Taiwan and the standardized sample from the UK

Table 2 presents the mean raw scores for each item by boys and girls across the different age groups. Compared with the U.K. standardized sample, Taiwanese children generally performed better on manual dexterity tasks, particularly task 1 (i.e., placing coins/pegs) using the preferred hand, but performed worse on both aiming and catching tasks. In addition, within specific age bands, Taiwanese children showed better performance on balance task 1, namely standing on one foot.

Table 2 The comparison of motor skills for Taiwanese school-aged boys and girls with the standardized sample from the UK.

The floor and ceiling effects of the MABC-2

Table 3 displays the results of the examination of floor and ceiling effects, but the results of the effects at age 12 were not further interpreted due to the limited sample size of only 7 children. Significant floor effects were observed in the task 1 of aiming and catching at ages 7 (22.4%) in the MABC-2. Significant ceiling effects (from 25.0% to 88.4%) were found in 4 of the 8 tasks across all age groups, including the drawing trail in the manual dexterity domain and all three tasks in the balance domain.

Table 3 The ceiling and floor effects of each item of the MABC-2a in each age group.

Reliability of the MABC-2

Tables 4 and 5 list the ICC values and MDC% of the three domains and the total scale. As for test–retest reliability, the ICC values were acceptable to high for the three domains and the overall performance (ICC = 0.619–0.853). Among the three domains, the reliability was relatively low in the domains of aiming and catching and balance (ICC = 0.619–0.764). The MDC% were large across the three age bands (106% to 171%).

Table 4 Test–retest reliability and random measurement error of the MABC-2.
Table 5 Inter-rater reliability and random measurement error of the MABC-2.

As for inter-rater reliability, the ICC values were generally high across the 3 age bands, ranging from 0.839 to 0.985, except the balance domain for Age band 3 (ICC = 0.771). The MDC% of the three domains and total scale were acceptable to large (MDC% =33.9–133).

Validity of the MABC-2

In terms of convergent validity, small correlations were identified between the MABC-2 and the MABC-Checklist within age bands 2, with correlation coefficients − 0.302 (p = .001). No statistically significant correlations were observed in age band 1 and 3 (p > .05).

For construct validity, both the three-factor and one-factor models demonstrated adequate model fit. In the three-factor model, the manual dexterity domain was represented by three tasks, the aiming and catching domain by two tasks, and the balance domain by three tasks. After allowing a correlation between the error terms of Balance 2 and Balance 3, the model-fit indices were RMSEA = 0.058, CFI = 0.961, GFI = 0.972, AGFI = 0.938, and SRMR = 0.044. In the one-factor model, after allowing correlations between Task 1 and Task 2 of manual dexterity and between the two tasks of aiming and catching, all eight tasks loaded onto a single factor representing the total scale. The resulting model-fit indices were RMSEA = 0.052, CFI = 0.965, GFI = 0.973, AGFI = 0.946, and SRMR = 0.038.

As for predictive validity, only a small correlation was found between the total scale and the home subscales of theVABS-3 in age band 1 (r = .388, p < .05). Additionally, the moderate to high correlations between the MABC-2 checklist and the caring for self and caring for home subscales of the VABS-3 were significant in age bands 2 and 3 (r = -.491 to -0.864, p < .05).

Responsiveness of the MABC-2

Table 6 displays the responsiveness of the MABC-2 across the three age bands. In general, for the three age groups, the differences in the mean standard scores on the three domains and total scale were significant (p < .05), except the domain of manual dexterity for age band 1 and 3. Moderate to large positive effect sizes (0.443–2.339) were found across the three age bands. Unexpectedly, negative trivial to small effect sizes were found for the subscale of aiming and catching in age bands 2 (-0.371) and 3 (-0.192).

Table 6 The responsiveness of the MABC-2.

All methods were carried out in accordance with relevant guidelines and regulations.

Discussion

This study aimed to comprehensively evaluate the applicability of the MABC-2 from a measurement perspective in the context of Taiwan. Three key findings emerged from our investigation. First, notable ceiling effects were observed in the items related to the manual dexterity and balance domains. Second, the MABC-2 demonstrated satisfactory psychometric properties, including reliability, validity, and responsiveness. Third, distinct patterns of motor skills were evident among the children in Taiwan compared to the standardized sample from the U.K. These insights can aid clinicians in interpreting MABC-2 results more effectively when assessing school-aged children in Taiwan.

It is important to highlight that within the set of 8 tasks, significant ceiling effects were observed in 4 tasks in our Taiwanese sample. This suggests that these specific tasks might be relatively easy for children in Taiwan. The findings align with previous research indicating that Eastern children tend to excel in domains related to manual dexterity and balance5. Consequently, there is a potential task modification when utilizing the MABC-2 with children in Taiwan. For example, the paper sheet of the drawing trail task for age band 2 could be applied to age band 1. This insight underscores the need for future studies to delve deeper into this observation and its implications.

When comparing our sample’s performance to that of the standardized sample, we identified two key findings. First, children in Taiwan demonstrated superior manual dexterity. Second, their performance in aiming and catching was consistently lower across all three age bands.

The differences in motor performance can likely be attributed to two main factors. First, variations in motor skill development opportunities may play a role. Taiwanese children may have more exposure to activities that enhance manual dexterity but fewer opportunities to practice ball-handling skills24. Due to limited outdoor space, long school hours (8 AM to 5 PM), and additional cram school sessions, children in Taiwan often engage more in handwriting-related tasks than in ball sports. Moreover, although physical education classes include ball sports, they are not commonly practiced outside of school or in extracurricular activities, unlike in Western countries where free time for students is more prevalent. Second, as children progress through elementary school, they may increasingly participate in fine motor activities while reducing their involvement in gross motor tasks25. This shift in activity patterns may contribute to the observed motor performance differences. Overall, our findings highlight distinct motor skill development trends in Taiwanese children, emphasizing the need to establish culturally specific norms for motor performance in Taiwan.

The test–retest reliability was found to be acceptable. However, it was relatively low in the domains of balance and of aiming and catching across the three age bands of the MABC-2. These findings are consistent with previous research, which reported ICC values ranging from 0.23 to 0.9726,27. Two reasons may contribute to the relatively low test–retest reliability. First, a practice effect may be present in tasks in the balance domain. Upon review of the initial and retest mean standard scores on the balance domain, it was observed that the retest scores were consistently higher than the initial evaluation scores across all three age bands. This suggests that the children may have been more acquainted with the tasks during the second evaluation, resulting in improved performance and potentially affecting the test–retest reliability. Second, activities involving aiming and catching inherently entail unpredictable elements, such as the unpredictable trajectory of a thrown beanbag, which is challenging for children to control. Moreover, children in Taiwan may not be adept at such activities, as lower mean scores are evident in age bands 2 and 3. When children face challenges in ball-related activities, replicating their performance consistently at two different time points is difficult. Based on the insights gleaned from our study, we suggest that when administering the domains of balance and of aiming and catching, the practice opportunities outlined in the manual may not be sufficient for children; thus, providing additional practice opportunities may be advantageous in helping the children become familiar with the tasks. Increased familiarity could lead to performances that more accurately reflect their true abilities.

The inter-rater reliability of the MABC-2 demonstrated a high level of consistency, with ICCs ranging from 0.7 to 0.9. These results affirm that the rating instructions provided by the MABC-2 are clear and straightforward, facilitating uniformity in children’s performance assessment across different raters. Additionally, our training process uncovered valuable insights that can enhance rating accuracy. For instance, during the aiming and catching task, it is advisable for examinees to position themselves alongside the mats to mitigate angle-related issues with beanbag throws. Likewise, when observing children walking along a line, examiners should align their steps with those of the children, maintaining a position at the children’s side or back. As a result of these findings, it is evident that the MABC-2 can effectively assess school-aged children, regardless of the specific rater involved in the evaluation process.

It is important to note the large random measurement error of the MABC-2 across the three age bands, especially for test-retest reliability. Caution is warranted, as children’s performance may be largely influenced by random factors, meaning that only sufficiently large change scores can be interpreted as reflecting a true change in ability. This considerable level of measurement error may be explained by two factors. First, the MABC-2 includes only eight items, and a limited number of items can increase measurement error28. Second, while the tasks are reasonably balanced in terms of difficulty and effort required, this balance may contribute to variability in children’s performance. In addition, performance may also be influenced by chance factors.

The presence of considerable random measurement error suggests the need for more practice sessions to help children better understand the tasks and improve their proficiency. This is further supported by the increase in scores during the second evaluation, which may indicate that children possess the necessary abilities but may not fully demonstrate them initially due to task unfamiliarity. With additional practice, children become more comfortable with the tasks, enabling them to perform more consistently. Consequently, incorporating practice sessions could be a valuable strategy for minimizing the impact of random measurement errors on assessment results.

In our study, we have confirmed the presence of both three-factor and one-factor models for the MABC-2, and these models exhibit appropriate model-fit. This reaffirms the validity of the proposed models for the MABC-2. Previous research has generally supported the idea of a three-factor construct for the MABC-2, although some slight modifications may be necessary in certain cases12,13,14. Our study also introduced additional correlations the one-factor and the three-factor model was employed. Specifically, for the one-factor model, we found correlations between the first two tasks of the manual dexterity subscale and between two tasks related to the aiming and catching subscale. For the three-factor model, a correlation was added for the two tasks reflecting dynamic balance. These additional correlations can be justified by the shared fundamental components and similar execution methods inherent in these tasks. In summary, our results indicate that the eight tasks within the MABC-2 can be effectively grouped into three motor-related components or collectively represent an individual’s overall motor skill. In summary, these results also justify the assumption that applying the standard score of the U.K. norm did not affect the validity of the MABC-2. The subscale and total scores of the MABC-2 respectively represented the children’s fine motor, gross motor, and overall motor skills.

The correlations between the MABC-2 and the MABC Checklist showed small associations, and the correlation between the MABC-2 and the VABS-3 was only significant in age band 1. These findings suggest reasonable convergent validity and acceptable predictive validity. The presence of small correlations underscores the divergence in perspectives between caregivers and therapists, as well as the distinction between overall motor performance (the MABC checklist) and motor skills (the MABC-2). Notably, no predictive validity of the MABC-2 was observed in age bands 2 and 3. This phenomenon might be attributed to the fact that children in these age bands have likely reached a level of self-care proficiency where they have developed compensatory strategies for performing self-care tasks. Consequently, motor skill difficulties may not significantly impact their performance in self-care and household chore activities at this stage. However, it is conceivable that these motor skill challenges could influence more advanced skills required in school-related activities.

On the other hand, the MABC checklist exhibited strong correlations with the home and community scales of the VABS-3. The results suggested that motor performance had high correlations with skills in activities of daily living. The other plausible explanation was that both questionnaires were completed by parents.

In our study, we observed moderate to large responsiveness of the MABC-2. This finding aligns well with the natural developmental expectations for children, who are expected to develop their skills further over time. Consequently, the results indicate that the MABC-2 effectively captures and measures children’s progress, making it a valuable tool for assessing outcomes. However, it is important to note an unexpected discovery: negative effect sizes were identified in age bands 2 and 3 in the domain of aiming and catching. This unexpected outcome may be attributed to substantial random measurement error encountered during the baseline assessment, rendering the baseline results somewhat unstable. As a consequence, the follow-up tests were affected by this instability, leading to the observation of negative effects. These findings underscore a key point: when interpreting children’s performance on the domain of aiming and catching, a more conservative approach should be adopted, especially when a child’s performance lags behind that of their peers.

Our study sample was recruited from primary schools and a rehabilitation clinic across northern, central, and southwestern Taiwan, including children with and without diagnoses. These recruitment locations are representative of western Taiwan, where the majority of the population resides. This approach aligns with the sampling strategy of the MABC-2 standardization, which also included children with and without diagnoses. Additionally, approximately 5–10% of our sample comprised children with developmental conditions, consistent with their prevalence in the general Taiwanese population. The majority of our participants exhibited typical development, and those with diagnoses were enrolled in general education classes. Therefore, our findings can be considered generalizable to school-aged children in western Taiwan.

We identified two limitations in our study. First, the limited number of children in age band 3 who participated in the follow-up assessment might have constrained the representativeness of our sample. Second, when assessing construct validity (i.e., the factor structures of the MABC-2), we grouped all children together because the sample size was insufficient to conduct separate analyses for each age band. To enhance the robustness and validity of our findings, future studies should consider increasing the sample size.

Conclusion

Motor skill development varies across cultures. Moreover, the MABC-2 demonstrated appropriate psychometric properties in our sample of school-aged children in Taiwan. The adequate psychometric properties suggest that the MABC-2 provides accurate and reliable assessments of children’s motor skills. However, two important considerations should be noted when using the MABC-2 with school-aged children in Taiwan. First, the presence of significant random measurement errors and performance variability should be considered, particularly in the aiming and catching domain. To address this, we recommend incorporating additional practice sessions or multiple assessments to improve the reliability of results in this specific domain.