Introduction

Autism Spectrum Disorder (ASD), or autism, is a lifelong neurodevelopmental condition characterized by impairments in social communication and the presence of repetitive, unusual sensory-motor behaviors1,2. The global prevalence of ASD is approximately 1% in children3, with different rates reported in specific regions, such as 2.8% in children aged 8 years in the United States4 and 0.7% in children aged 6 to 12 years in China5. Several factors, including improved survey methodologies, diagnostic practices, public awareness and access to services, have contributed to this increase in prevalence3,6. Over time, understanding of ASD has developed from a categorical diagnosis to a dimensional perspective of neurodiversity7. This change redefined autism as a spectrum that includes a wide range of characteristics within a unified framework8. However, the expanded diagnostic criteria may have too low thresholds for diagnosis9, and have introduced significant heterogeneity within ASD. This heterogeneity suggests that the term “spectrum” encompasses multiple subgroups with distinct etiological phenotypes10,11.

Recent neuroimaging studies increasingly highlight ASD’s biological heterogeneity as a key barrier to identifying consistent brain-behavior relationships. Traditional case-control neuroimaging studies, which treat ASD and typically developing groups as homogeneous entities, often fail to account for subgroup differences12. This has led to inconsistent findings across brain regions and participants, further complicating efforts to identify reliable brain-behavior biomarkers. The decreasing effect sizes in group comparison studies over time13 highlight the difficulty of constructing mechanistic models of ASD. This heterogeneity challenge extends to intervention research, where differential treatment responses across ASD subgroups14,15 suggest that precision approaches accounting for neurobiological variation could enhance outcomes. Data-driven clustering techniques have emerged as a promising approach to address this complexity by leveraging cognitive and behavioral profiles alongside patterns of brain morphology and function. Various clustering methods16,17,18,19,20 have been used to identify meaningful subgroups, in order to uncover the neurobiological mechanisms of ASD and provide insights into subgroup-specific traits8,21. Despite decades of effort, inconsistencies in sample selection, classification methods, and feature selection between studies remain significant challenges21. These issues have limited the ability of clustering techniques to consistently identify robust and replicable subgroups.

Inconsistencies in ASD studies may be due to altered neurodevelopmental adaptations22 and participant selection biases23. These limitations highlight the need to group variables with strong biological relevance for the ASD subgroups24. At the same time, individual differences in sex, intelligence, medication use, and comorbidities25,26, along with methodological variability27 in data acquisition and analysis, complicate the interpretation of the results of neuroimaging studies. In response to these challenges, normative modeling28 techniques offer a promising framework to capture the full range of neurobiological variation12. This approach enables standardized assessments by comparing an individual’s brain morphology to population-based references29. In previous ASD studies, references were often built on internal or public datasets18,20,30,31, but the relatively small sample size limits the generalizability and reliability of the results. Recently, the Lifespan Brain Chart Consortium (LBCC, https://github.com/brainchart/lifespan) provided neurodevelopmental variations throughout life from the most inclusive data available32, providing a robust framework for us to achieve percentile-based comparisons of ASD with considerable statistical power.

This study adjusted the LBCC normative models in two large-scale cross-cultural ASD consortia: the China Autism Brain Imaging Consortium (CABIC, https://php.bdnilab.com)33 and the Autism Brain Imaging Data Exchange (ABIDE, https://fcon_1000.projects.nitrc.org/indi/abide)34,35. Focusing on childhood development, we used the largest-scale brain charts32 to explore neurodevelopmental heterogeneity in ASD before adolescence. We classified ASD into subgroups using spectral clustering based on Out-of-Sample (OoS) centile scores, a biological measure which quantified brain morphological deviations of ASD from normative growth. A Support Vector Machine (SVM) with Recursive Feature Elimination and Cross-Validation (RFECV) identified key brain regions driving subgroup classification. The ABIDE-based classifier was then applied to CABIC to identify similar subgroups. Brain-behavior analyses were performed separately for each dataset to identify reproducible subgroup-specific correlations. From a perspective of brain morphology, this work disentangles the mechanisms of mixed neurosubtypes and provides new insight into the biological complexity of ASD.

Results

Brain morphological profiles reveal two distinct autism subgroups

We first analyzed the ABIDE dataset. The distributions of the OoS centile scores across 34 bilateral Desikan-Killiany36 cortical volumes ranged from 0 to 1 (Supplementary Fig. 1a). OoS scores estimate the morphological variations of each participant in 34 brain regions compared to the median regional volumes of the population (50th centile, equivalent to a 0.5 OoS score). Relative to 0.5, individuals with more atypical phenotypes have more extreme scores. A Shapiro-Wilk test on the 34 OoS scores indicated that none followed a normal distribution (Table 1, column 2). Using spectral clustering on the OoS scores, we identified two distinct clusters (ASD subgroups) with the highest silhouette coefficient among cluster solutions ranging from two to ten (Fig. 1a). All p-values for between-subgroup comparisons (two-sample t-tests for the paracentral and pars orbitalis regions, and rank-sum tests for all other regions) were <0.001. One subgroup, referred to as “L”, exhibited generally Lower OoS scores, while the other, referred to as “H”, displayed Higher scores (Table 1, columns 5 and 7).
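The model-selection step described above (spectral clustering evaluated for two to ten clusters, keeping the solution with the highest silhouette coefficient) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the RBF affinity, the random seed, the helper name `pick_n_clusters`, and the simulated centile matrix are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import silhouette_score


def pick_n_clusters(scores, k_values=range(2, 11), seed=0):
    """Return (best_k, silhouette_by_k) for spectral clustering on centile scores."""
    silhouette_by_k = {}
    for k in k_values:
        # Affinity choice is an assumption; the text does not specify it.
        model = SpectralClustering(n_clusters=k, affinity="rbf", random_state=seed)
        labels = model.fit_predict(scores)
        silhouette_by_k[k] = silhouette_score(scores, labels)
    best_k = max(silhouette_by_k, key=silhouette_by_k.get)
    return best_k, silhouette_by_k


# Synthetic stand-in for a participants x 34-region matrix of centile scores:
# one group sits below the 0.5 median, the other above it.
rng = np.random.default_rng(0)
low = np.clip(rng.normal(0.3, 0.05, size=(60, 34)), 0.0, 1.0)
high = np.clip(rng.normal(0.7, 0.05, size=(60, 34)), 0.0, 1.0)
centiles = np.vstack([low, high])

best_k, silhouette_by_k = pick_n_clusters(centiles)
print(best_k, round(float(silhouette_by_k[best_k]), 2))
```

On real data the silhouette curve is usually much flatter than in this toy case, so it is worth inspecting all values in `silhouette_by_k` rather than only the maximum.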

Table 1 Group comparisons of regional morphology between the overall ASD sample and the two ASD subgroups
Fig. 1: Clustering-derived subgroups reveal distinct brain morphological patterns in ASD from ABIDE dataset.

a Silhouette scores across different clustering solutions (2 to 10 clusters). The highest silhouette score indicates two distinct subgroups. b Median OoS scores of 34 cortical regions before clustering. c Median OoS scores of ASD subgroup H (top) and L (bottom). d SHAP summary plot displaying the top 10 brain regions with the highest contributions to the SVM model’s predictions. e The mean SHAP values across the 29 selected cortical regions that contributed to the classification. f Prevalence maps depicting the proportion of participants with extreme (below the 2.5th centile for subgroup L, above the 97.5th centile for subgroup H) structural anomalies. g OoS scores across 34 cortical regions (top) and global measures (bottom) for subgroup H (orange) and L (green). TCV total cortical volume, WMV total white matter volume, GMV total cortical gray matter volume, sGMV total subcortical gray matter volume, CSF total ventricular cerebrospinal fluid volume, mCT mean cortical thickness, tSA total surface area. Dots indicate the mean value of OoS scores; bars indicate the standard deviation. See Supplementary Fig. 1b and Supplementary Fig. 1d for detailed density plots. Distribution of participant ages (h); IQ scales (i); ADI-R (j); PDD category (k, top left), MRI scanner model (k, top middle), manufacturer (k, top right), and data collection site (k, bottom); SRS (l); and ADOS (m) across the two subgroups, with subgroup L participants being significantly younger than those in subgroup H (p = 0.02). Note that only the left hemisphere is plotted in (b, c, e, and f), for visualization purposes. For plots h–j, l, and m, the center line shows the median; the box limits represent the 25th and 75th percentiles; the whiskers show the minimum and maximum values; and the dots represent potential outliers.

Certain regions of the brain showed significant differences in their OoS score distributions between the two subgroups (Fig. 1g; Supplementary Fig. 1b. For CABIC results, please see Supplementary Fig. 2a and 2b). Subgroup H exhibited greater variability in the occipital lobe, particularly in the lingual gyrus, pericalcarine cortex, and cuneus. In contrast, subgroup L showed more pronounced reductions in these regions compared to the normative benchmark (0.5). Other regions, such as the pars opercularis and superior temporal gyrus, also exhibited subgroup-specific changes. The results of the Shapiro-Wilk test for OoS scores (Table 1, columns 4 and 6) indicated that more brain regions in subgroup H followed a normal distribution compared to subgroup L. Broader variability was observed in regions such as the entorhinal cortex in subgroup L and the pericalcarine in subgroup H, highlighting further subgroup-specific clustering.

Distinct structural and age differences between subgroups

The median OoS scores before clustering (Fig. 1b; Table 1, column 3) reveal a mixed pattern of deviations across regions. Most regions exhibit reduced volumes, particularly in the occipital lobe. However, some regions, such as the precuneus, paracentral lobule, transverse temporal gyrus, and insula, show slightly higher OoS scores. This combined pattern reflects overlapping structural deviations from both subgroup H and L (Fig. 1c), obscuring the distinct morphological differences that become evident after clustering. For instance, although subgroup L shows smaller volumes in the insula and precuneus, the larger volumes in subgroup H predominantly drive the combined pattern in the whole population. In contrast, smaller volumes of the lingual gyrus and the middle temporal gyrus in subgroup L have a stronger influence on the overall result. Some regions, such as the pericalcarine cortex, are almost entirely influenced by subgroup L. Meanwhile, regions like the inferior and superior temporal gyrus exhibit volumes close to the normative average because of the contrasting characteristics of both subgroups effectively canceling each other out.

To identify the brain regions most associated with morphological abnormalities, we calculated the prevalence of abnormalities for each region in the two subgroups. The prevalence represents the percentage of participants with OoS scores below 0.025 (2.5th centile) for subgroup L and above 0.975 (97.5th centile) for subgroup H. Subgroup L (Fig. 1f; Table 1, column 8) exhibited the highest prevalence of abnormalities in regions such as the middle and inferior temporal gyrus, pericalcarine cortex, frontal pole, and medial orbital frontal gyrus. In contrast, subgroup H (Fig. 1f; Table 1, column 9) generally exhibited less severe abnormalities, with a concentration in the insula, transverse temporal gyrus, and caudal anterior cingulate. Participants in subgroup L were significantly younger than those in subgroup H (Fig. 1h; p = 0.02, Supplementary Table 1). However, no significant differences were observed in PDD category (p = 0.32), MRI scanner model (p = 0.07), manufacturer (p = 0.86), data collection site (p = 0.16) (Fig. 1k), IQ (Fig. 1i, see Supplementary Table 1 for details), or scores from ADI-R (Fig. 1j), SRS (Fig. 1l), and ADOS-2 (Fig. 1m). The corresponding results for CABIC are shown in Supplementary Fig. 3.
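The prevalence measure defined at the start of this paragraph amounts to a thresholded count per region. A minimal sketch, assuming centile scores are held in a plain list per region and subgroup; the function name `extreme_prevalence` and the example values are hypothetical, while the 0.025 and 0.975 cut-offs come from the text.

```python
def extreme_prevalence(centiles, direction, low=0.025, high=0.975):
    """Percentage of participants whose centile score is extreme.

    direction="low"  counts scores below `low`  (used for subgroup L),
    direction="high" counts scores above `high` (used for subgroup H).
    """
    if not centiles:
        raise ValueError("centiles must not be empty")
    if direction == "low":
        n_extreme = sum(1 for c in centiles if c < low)
    elif direction == "high":
        n_extreme = sum(1 for c in centiles if c > high)
    else:
        raise ValueError("direction must be 'low' or 'high'")
    return 100.0 * n_extreme / len(centiles)


# Hypothetical centile scores for one region in each subgroup.
region_scores_l = [0.01, 0.02, 0.30, 0.45]
region_scores_h = [0.99, 0.60, 0.55, 0.80]
print(extreme_prevalence(region_scores_l, "low"))   # share below the 2.5th centile
print(extreme_prevalence(region_scores_h, "high"))  # share above the 97.5th centile
```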

Machine learning reveals subgroup morphological features

The optimized SVM model for the identification of subgroups demonstrated robust predictive performance. It used a polynomial kernel with a regularization parameter C = 0.1 and a kernel coefficient γ = 1. The model achieved a high classification accuracy (0.95, p < 0.001) and an F1 score (0.94, p < 0.001) across five cross-validation folds. Figure 1d shows the SHapley Additive exPlanations (SHAP) summary plot for the top 10 selected features. Figure 1e and Supplementary Table 2 summarize the mean SHAP values for all 29 selected regions. SHAP values37 quantify the contribution of individual brain regions to the predictions of the model. Among the regions, the isthmus cingulate, entorhinal cortex, precuneus, and middle temporal gyrus emerged as the most predictive features. Higher SHAP values for these regions were strongly associated with subgroup H, while lower values corresponded to subgroup L. These findings indicate different volumetric patterns in the cingulate, temporal, and parietal areas, effectively distinguishing the two subgroups.
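A rough sketch of how feature selection and the final classifier could be combined with scikit-learn is given below. It rests on several assumptions: scikit-learn's `RFECV` needs an estimator that exposes `coef_` or `feature_importances_`, which a polynomial-kernel `SVC` does not, so features are ranked here with a linear `SVC` and only the final model uses the reported polynomial kernel (C = 0.1, γ = 1). How the authors combined the two steps is not stated in this section, and the data are synthetic.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC


def select_regions_and_score(X, y, seed=0):
    """Select regions with RFECV, then cross-validate a polynomial-kernel SVM."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    # Assumption: a linear SVC supplies the coefficients RFECV ranks features by.
    selector = RFECV(SVC(kernel="linear"), step=1, cv=cv, scoring="accuracy")
    selector.fit(X, y)
    mask = selector.support_
    # Final classifier with the hyperparameters reported in the text.
    final_model = SVC(kernel="poly", C=0.1, gamma=1)
    accuracy = cross_val_score(
        final_model, X[:, mask], y, cv=cv, scoring="accuracy"
    ).mean()
    return mask, accuracy


# Synthetic centile-like data: 34 regions, of which the first 10 separate the groups.
rng = np.random.default_rng(0)
n_per_group, n_regions, n_informative = 60, 34, 10
X = rng.normal(0.5, 0.05, size=(2 * n_per_group, n_regions))
y = np.repeat([0, 1], n_per_group)
X[y == 0, :n_informative] -= 0.2
X[y == 1, :n_informative] += 0.2
X = np.clip(X, 0.0, 1.0)

mask, accuracy = select_regions_and_score(X, y)
print(int(mask.sum()), "regions kept; mean CV accuracy:", round(float(accuracy), 2))
```

Because the selection and the evaluation share the same folds in this sketch, the accuracy is optimistic; a nested cross-validation would give a cleaner estimate.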

Distinct structural covariance patterns across subgroups

Structural covariance analysis on OoS scores revealed significant correlations between brain regions, with distinct patterns observed across ASD subgroups and control group (Fig. 2c). These patterns were reproduced in both the ABIDE and CABIC datasets. In subgroup H, enhanced covariation was identified between the isthmus cingulate and the caudal middle frontal region (ABIDE: z = 2.88, p = 0.00; CABIC: z = 2.25, p = 0.02). In subgroup L, the parahippocampal gyrus showed reduced covariation with the entorhinal cortex (ABIDE: z = −2.07, p = 0.04; CABIC: z = − 3.25, p = 0.00). Similarly, reduced covariation was observed between the posterior cingulate and the entorhinal cortex (ABIDE: z = −2.12, p = 0.03; CABIC: z = −2.64, p = 0.01) and between the insula and the entorhinal cortex (ABIDE: z = −3.30, p = 0.00; CABIC: z = −2.18, p = 0.03). An increased covariation was observed in subgroup L between the parahippocampal gyrus and the lateral occipital cortex (ABIDE: z = 1.97, p = 0.05; CABIC: z = 2.04, p = 0.04).
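For readers who want to relate the z and p values above, the sketch below shows one common way to compare a correlation between two groups (Fisher's r-to-z transformation) and how a z statistic maps to a two-sided p-value. Whether the authors computed their z statistics this way is an assumption, since the method is not described in this section; the function names and the example correlations and sample sizes are hypothetical. The reported pairs are at least consistent with two-sided normal tail probabilities (for example, z = 2.25 gives p ≈ 0.02).

```python
import math
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value of a standard-normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def compare_correlations(r_asd, n_asd, r_ctrl, n_ctrl):
    """Fisher r-to-z test for the difference between two independent correlations.

    Positive z means stronger covariance in the ASD group than in controls.
    """
    standard_error = math.sqrt(1.0 / (n_asd - 3) + 1.0 / (n_ctrl - 3))
    z_diff = (math.atanh(r_asd) - math.atanh(r_ctrl)) / standard_error
    return z_diff, two_sided_p(z_diff)


# z statistics quoted in the text, mapped to two-sided p-values.
for z in (2.25, -2.07, 1.97):
    print(z, round(two_sided_p(z), 2))

# Hypothetical region-pair correlations and group sizes.
print(compare_correlations(r_asd=0.60, n_asd=100, r_ctrl=0.20, n_ctrl=100))
```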

Fig. 2: Brain-behavior correlations and structural covariations of Out-of-Sample scores.

a Reproducible correlations between brain region volumes, measured as Out-of-Sample (OoS) scores, and clinical measures across ABIDE and CABIC datasets (subgroup H only). b Mediation models for significant brain-behavior associations identified in subgroup H (ABIDE cohort only). Black solid arrows represent significant effects, while gray arrows indicate non-significant ones. Top: transverse temporal; bottom left: inferior temporal; bottom right: isthmus cingulate. c Significant structural covariations across ABIDE and CABIC datasets. Positive z-values (red lines) indicate stronger covariance in ASD participants compared to controls, while negative z-values (blue lines) reflect weaker covariance.

Brain-behavior correlations in Subgroup H

We examined correlations between cortical region volumes (OoS scores) and clinical measures in the ABIDE and CABIC datasets, identifying significant associations that were consistent between both datasets. All significant correlations were found within subgroup H. The transverse temporal gyrus showed positive correlations with ADOS (ADOS-2 in ABIDE and ADOS in CABIC) Total and Social Affect scores (Fig. 2a). In the CABIC dataset, modest correlations were observed for Total (r(198) = 0.17, p = 0.02) and Social Affect (r(191) = 0.21, p = 0.00) scores. In the ABIDE dataset, the correlation was stronger for Total (r(93) = 0.24, p = 0.01) and comparable for Social Affect (r(93) = 0.20, p = 0.04) scores. The inferior temporal gyrus was positively correlated with RRB (restricted interests and repetitive behaviors), with a slightly stronger correlation in ABIDE (r(94) = 0.19, p = 0.05) than in CABIC (r(189) = 0.17, p = 0.02). The volumetric centile of the isthmus cingulate was correlated with SRS Autistic Mannerisms, showing a higher correlation in ABIDE (r(58) = 0.33, p = 0.01) than in CABIC (r(192) = 0.14, p = 0.04). No significant correlations were identified between structural covariance patterns and cognitive behavior outcomes.

As a common comorbidity of ASD, intellectual disability is closely associated with atypical brain morphological developmental patterns that accompany individual differences in intellectual functioning. To examine whether Full-Scale Intelligence Quotient (FIQ) mediates the relationships between brain morphology and cognitive behaviors within clusters, mediation analyses were conducted on the significant correlations we found in subgroup H. Results revealed direct effects for specific cortical regions (Fig. 2b). The transverse temporal gyrus showed a direct effect on ADOS-2 Total (\(C^{\prime} = 0.21, p = 0.03\)), while the isthmus cingulate exhibited a direct effect on SRS Autistic Mannerisms (\(C^{\prime} = 0.32, p = 0.01\)). No mediation effects through FIQ were observed, either from brain → cognitive behaviors or in reverse (Supplementary Fig. 4). Neither direct nor indirect effects were significant for correlations to ADOS-2 RRB or Social Affect. As our analysis relied on cross-sectional data, the observed brain-behavior relationships represent correlations rather than causal effects. Our mediation hypotheses regarding developmental brain-behavior causality require validation through longitudinal studies.
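For orientation, the direct effect \(C^{\prime}\) reported here can be read against the standard single-mediator regression model. The formulation below is the textbook one and is an assumption about the model form, since the exact specification (covariates, estimation, bootstrap settings) is not given in this section. With \(X\) the regional OoS score, \(M\) the FIQ, and \(Y\) the clinical score:

\[
\begin{aligned}
M &= i_1 + aX + \varepsilon_1,\\
Y &= i_2 + C^{\prime}X + bM + \varepsilon_2,\\
Y &= i_3 + CX + \varepsilon_3,
\end{aligned}
\qquad\text{with}\qquad C = C^{\prime} + ab .
\]

Under this reading, \(ab\) is the indirect effect carried by FIQ and \(C^{\prime}\) is the part of the brain-behavior association that remains after accounting for FIQ. A non-significant \(ab\) together with a significant \(C^{\prime}\) corresponds to the pattern reported for the transverse temporal gyrus and the isthmus cingulate.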

Discussion

This study used the ABIDE and CABIC datasets to discover distinct brain morphological subgroups of ASD within a young male cohort. We identified two subgroups characterized by significant differences in OoS scores in 34 cortical regions. Machine learning models showed a high predictive accuracy in distinguishing these subgroups based on their brain morphology. Furthermore, structural covariance and brain behavior correlation analysis illustrated different patterns of morphological relationships and their associations with clinical measures, especially in subgroup H. The consistent correlations across both the ABIDE and CABIC datasets highlight the robustness and reproducibility of our results across diverse cohorts. We will now discuss these findings from several neurobiological and neurodevelopmental perspectives to enhance our understanding of the distinct morphological patterns identified.

Studies have reported extremely large or small head circumferences in individuals with ASD, possibly related to different subgroups. Early brain overgrowth is one of the most consistent findings in ASD research38. Abnormal increases in brain size during early development suggest that atypical cell proliferation significantly contributes to ASD symptoms39. Ultrasound studies have detected head enlargement during the second trimester in individuals later diagnosed with ASD40. iPSC-based studies reveal enlarged embryonic-stage brain cortical organoids (BCOs) in babies with ASD, with larger BCOs associated with more severe social symptoms41. MRI scans of these individuals show overgrowth in the primary auditory and somatosensory cortices and undergrowth in the visual cortex41. This neuronal overproduction, caused by the acceleration of the cell cycle, results in impaired differentiation and ultimately disrupts neural functions42.

Smaller brain volumes are also frequently reported in ASD. A meta-analysis43 identified volumetric reductions in a large cluster of regions, including the parahippocampal gyrus and entorhinal cortex. Higher levels of autistic traits have been associated with a smaller total cortical volume (TCV), a lower cortical thickness, a smaller surface area and a lower gyrification44,45. One potential explanation for this global brain underdevelopment is impairment caused by insufficient blood circulation and oxygen saturation46. For example, children with complex congenital heart disease (CHD) also demonstrate cognitive difficulties similar to those observed in ASD and have an increased likelihood of developing ASD47. Similarly, head enlargement often coincides with greater increases in height45. Brain overgrowth could therefore be part of a broader dysregulation of physical growth48.

This evidence provides a distinct neurobiological basis for the subgroups identified in our study. The different underlying physiological mechanisms may explain the inconsistent effectiveness of commonly used pharmacological treatments. For example, an unsupervised data-driven cluster analysis of children with ASD revealed an optimum of two subtypes sensitive to intranasal oxytocin intervention49. Our discovery of two distinct ASD subtypes with divergent brain morphologies further supports the necessity for subtype-specific therapeutic strategies. Future research focusing on targeted pharmacological interventions for individual subtypes could not only explain how these treatments modulate brain function and ultimately translate into clinical benefits, but also advance personalized medicine in ASD. Specifically, by mapping neurobiological heterogeneity to differential therapeutic outcomes, this approach could reconcile previous inconsistencies in treatment efficacy while optimizing intervention protocols for mechanistically defined patient subgroups.

Abnormal neural migration and cortical laminar organization in ASD50 suggest that structural abnormalities are unlikely to be localized. Postmortem studies have identified focal laminar disorganization and mis-migrated neurons in the prefrontal and temporal cortex, particularly in layers 2, 3, and 450. Layers 2 and 3 support information exchange between cortical regions, while layer 4 receives sensory inputs51. Disruptions in these layers can cause profound miscommunication between the sensory and higher-order cognitive regions. Key regions in our clustering analysis (such as the precuneus, isthmus cingulate, and entorhinal cortex; Fig. 3a, a–c) are primarily involved in higher-order association networks (Fig. 3a). Stepwise Functional Connectivity (SFC) analysis, introduced to explore the hierarchical information flow of functional connectivity and to disentangle brain functional networks, has revealed complex connectivity transitions from primary sensory to higher-order association regions of the brain52,53. These information flows suggest different pathways of impairment as the basis for the heterogeneity of ASD. Inconsistencies in information flow reported by previous ASD studies21,54 may arise from the two distinct structural impairment patterns identified in our study.

Fig. 3: The theoretical brain functional network impairment model of two ASD subgroups.

a A brain map combining the 34 Desikan regions36 with the latest 15 large-scale brain functional networks estimated from individuals57. a, precuneus; b, isthmus cingulate; c, entorhinal; d, middle temporal; e, inferior temporal gyrus; f, pericalcarine; g, frontal pole; h, caudal anterior cingulate; i, transverse temporal; and j, insula. b A schematic of the potential spatial distributions of distinct functional impairment networks inferred from 34 cortical regions with high abnormal prevalence for two subgroups. VIS Visual (C Central, P Peripheral), SMOT Somatomotor, AUD Auditory, dATN Dorsal Attention, PM-PPr Premotor-Posterior Parietal Rostral, AN Action-mode network, SAL/PMN Salience/Parietal Memory Network, FPN Frontoparietal Network, LANG Language, DN Default-mode Network.

The regional volume abnormalities identified in our study involve distinct functional brain regions in the two subgroups. Subgroup L showed a notable prevalence in the middle temporal gyrus, inferior temporal gyrus, pericalcarine cortex and frontal pole (Fig. 3a, d–g), while subgroup H showed a notable prevalence in the caudal anterior cingulate, transverse temporal cortex, and insula (Fig. 3a, h–j). Drawing on the spatial distributions55 of 7 large-scale functional networks56, we constructed a schematic (Fig. 3b) of the latest 15 large-scale functional networks of the brain estimated from individuals57 (Fig. 3a). This perspective revealed different pathways of functional impairment in the regions with a high prevalence of abnormalities for each subgroup. In subgroup L, atypical morphology mainly affected the Visual – Dorsal Attention – Frontoparietal – Language and Default-mode networks (Fig. 3b, green path), while in subgroup H it affected the Auditory and Somatomotor – Action-mode and Salience/Parietal Memory – Frontoparietal networks (Fig. 3b, orange path), indicating disturbances in specific unimodal sensory integration and multimodal cortical functions. These networks are organized in a three-order hierarchy that agrees well with myelination reference maps57, showing that the cerebral cortex develops sequentially, radiating outward from the motor and sensory cortex58.

This phenomenon may hint at the presence of distinct abnormal brain functional circuits from first- to third-order networks in the subgroups. For example, in adults without ASD59, increased gray matter volume in regions of the somatomotor network was associated with greater attention to detail, while changes in regions of the visual network were associated with poor imagination. These findings suggest potential cognitive patterns in the two subgroups of ASD that have yet to be discovered. At the same time, the two subgroups we found may respond differently to biologically targeted therapies, emphasizing the need for more personalized approaches. For example, functional connectivity-guided continuous theta-burst stimulation (cTBS), applied to the individual dorsolateral prefrontal cortex target with the strongest connectivity to the amygdala, significantly improved social communication skills in minimally verbal children with ASD60. Given the accompanying reduction in brain volume and improvement in behavioral outcomes, this ASD subtype likely fits the characteristics of subgroup H in our study. Targeted modulation of subgroup-specific impairment pathways may yield better therapeutic effects and thereby refine treatment protocols for better translational outcomes. In this case, if the observed social improvements stem from rebalancing the subgroup H pathway, future interventions could achieve enhanced treatment efficacy by employing personalized neuromodulation protocols that precisely target the most morphologically burdened brain regions within this circuitry.

In typically developing children, the principal functional gradient55 of the cortical organization reflects differentiation within unimodal sensory areas, progressing from the somatomotor and auditory cortex to the visual cortex. The transition from childhood to adolescence extends this organization to higher-order association cortices61. Brain maturation appears to involve a shift from local to distributed network architecture. The distinct morphological abnormalities in these two ASD subgroups, observed before this transition begins, raise a critical question: How do these “morphological foundations” disrupt their development from childhood to adolescence? These structural impairment pathways in childhood are likely to contribute further to heterogeneity in ASD during and after adolescence.

Different network-based vulnerabilities suggest that interventions targeting specific impairments within each subgroup could be more effective. Interestingly, we found no significant differences in cognitive or behavioral measures between the subgroups. Although there were distinct patterns of structural covariation between subgroups, these showed no significant correlations with autism-related behaviors. Structural covariance reflects long-term developmental changes. Reduced structural covariance may indicate disruptions in shared developmental pathways, while enhanced covariance could result from compensatory adaptations in response to atypical neurodevelopmental trajectories in ASD. This implies that morphological deficiencies may activate compensatory mechanisms during development, leading to similar cognitive and behavioral outcomes.

These neurodevelopmental compensation mechanisms raise questions about the stability of subgroup morphology distinctions across different developmental stages. To validate the robustness of our ASD clusters across developmental stages, we implemented sliding 4-year age windows (with a predefined threshold of at least 80 participants to ensure statistical power). For each window, we applied the identical spectral clustering pipeline and evaluated the consistency of cluster assignments between each narrow age range and the full age range. Every narrow-age-range sample yielded two distinct brain subtypes with characteristics similar to those in our main results. The assignment consistency was 90.48% (5–8.9 yrs, N = 84), 91.89% (6–9.9 yrs, N = 111), 98.67% (7–10.9 yrs, N = 150), 97.71% (8–11.9 yrs, N = 175) and 88.42% (9–12.9 yrs, N = 190). This high consistency in subgroup classification suggests that the specific impairments in brain morphology are maintained across development.
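The sliding-window check can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions: windows are taken as half-open intervals such as [5, 9) to match the reported "5–8.9 yrs" ranges, consistency is the share of participants assigned to the same cluster in the windowed and full-sample solutions, and the arbitrary 0/1 cluster labels are aligned by taking the better of the two possible matchings. The function names are hypothetical.

```python
def age_windows(min_age, max_age, width=4, step=1):
    """Half-open sliding age windows [start, start + width) within [min_age, max_age]."""
    windows = []
    start = min_age
    while start + width <= max_age:
        windows.append((start, start + width))
        start += step
    return windows


def eligible_windows(ages, windows, min_n=80):
    """Keep windows that contain at least `min_n` participants."""
    kept = []
    for lo, hi in windows:
        n = sum(1 for age in ages if lo <= age < hi)
        if n >= min_n:
            kept.append((lo, hi, n))
    return kept


def assignment_consistency(full_labels, window_labels):
    """Percentage agreement between two binary clusterings, ignoring label names."""
    if len(full_labels) != len(window_labels) or not full_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    agree = sum(1 for a, b in zip(full_labels, window_labels) if a == b)
    share = agree / len(full_labels)
    # Cluster indices are arbitrary, so a fully swapped labelling counts as agreement.
    return 100.0 * max(share, 1.0 - share)


print(age_windows(5, 13))

# Toy check: 84 participants, 8 of whom change cluster in the windowed solution.
full = [0] * 42 + [1] * 42
windowed = list(full)
for i in range(8):
    windowed[i] = 1 - windowed[i]
print(round(assignment_consistency(full, windowed), 2))
```

The label-alignment shortcut only works for two clusters; with more clusters a proper matching (for example, the Hungarian algorithm) or an index such as the adjusted Rand index would be needed.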

The observed age difference between subgroups, though minor, could have significant implications during sensitive developmental periods. Lifespan trajectories32 of 34 brain regions show an early steep increase in volume, followed by a near-linear decline. The middle temporal gyrus peaks at 7.8 years, while the caudal anterior cingulate peaks at 9.2 years32, possibly related to the result that subgroup L is generally younger than subgroup H. However, in the CABIC cohort, we observed an inverse age relationship across both classification methods (whether using the ABIDE-classifier or independent clustering), with subgroup L being significantly older than subgroup H (Supplementary Fig. 3a and 5i). The age difference between subgroups needs validation in future studies with larger sample sizes. Here, we propose some possible explanations for the ABIDE cohort, in which these distinct structural abnormalities likely reflect differences in the timing of atypical developmental processes.

The isthmus cingulate, a key region in subgroup clustering, typically peaks in volume at 3.8 years. The lack of age-related decline in cortical volume was consistently found in subgroup H in both the ABIDE and CABIC datasets (see Supplementary Analysis and Supplementary Fig. 1f). However, subgroup H aligns with normative references before age 5, while subgroup L shows smaller initial volumes (Supplementary Fig. 1e). In subgroup H, an increase in structural covariance between the caudal middle frontal cortex (peaking at 7.75 years) and the isthmus cingulate (Fig. 2c) suggests synchronized developmental trajectories. This may reflect precocious caudal middle frontal development, delayed isthmus cingulate development, or both regions following similarly abnormal patterns22. Correlation analyses link larger isthmus cingulate volumes in subgroup H directly with more pronounced autistic mannerisms (Fig. 2a). This evidence hints at delayed development of the isthmus cingulate in subgroup H. With a normal OoS distribution (p = 0.06; Table 1, column 6) in this subgroup, this morphological change appears to represent a homogeneous and quantifiable pattern. In subgroup L, the parahippocampal gyrus (peaking at 10.63 years) shows reduced covariance with the entorhinal cortex (peaking at 22.67 years) but increased covariance with the lateral occipital cortex (peaking at 5.21 years). This pattern suggests precocious development of the parahippocampal gyrus and/or delayed development of the lateral occipital cortex in this subgroup. Enhanced structural covariances may also reflect compensatory adaptations to mitigate structural disadvantages inherent in abnormal developmental patterns.

Recent genome-wide association studies (GWAS)62 reveal that cortical phenotypes exhibit distinct genetic architectures, with shared genetic variants influencing normative brain size variation. Furthermore, significant genetic correlations between cortical expansion patterns and cognitive measures suggest multi-level mechanisms underlying neurodevelopment. These findings collectively motivate future ASD studies to build longitudinal cohorts that explore how genetically mediated cortical maturation trajectories influence different stages of development in ASD subgroups. Such approaches would bridge genotype-phenotype mapping to clinical biomarkers, advancing precision diagnostics and targeted interventions during sensitive neurodevelopmental windows.

One key contribution of this study is the reproducibility of brain-behavior associations across the ABIDE and CABIC datasets. We did not apply any correction for multiple comparisons in our analysis, for three reasons. First, the dual-dataset cross-validation design required consistent results across two independent cohorts (heterogeneous in culture and age), which substantially reduces the risk of false positives through empirical replication. Second, as this is an exploratory target-discovery study, overly conservative adjustments might obscure biologically meaningful results. Third, we prioritized cross-population reproducibility: we considered that uncorrected but replicated results had greater biological credibility than findings that were statistically corrected but failed to replicate. As a result, this cross-dataset validation supports the view that the findings reflect stable neurobiological characteristics.

In subgroup H, an enlarged transverse temporal gyrus may impair precise auditory processing. This disruption can affect auditory responsiveness and the ability to integrate and understand information during social interactions. It may also impair early language perception and acquisition, particularly before a diagnosis of ASD is made. Research has found that when ASD is accompanied by cognitive learning needs, head enlargement becomes even more pronounced45. This suggests a shared mechanism linking brain overgrowth with impaired intellectual functioning. Alternatively, cognitive impairments might mediate the relationship between this overgrowth and ASD. Although intelligence moderates the relationship between ASD traits, transverse temporal gyrus abnormality remains a direct indicator (Fig. 2b). This suggests impaired primary sensory perceptual function that is directly disrupted by abnormal sensory input. The inferior temporal gyrus (ITG) is also uniquely associated with restricted interests and repetitive behaviors, despite the low prevalence of structural abnormalities in subgroup H. This region plays a crucial role in object recognition63. Disruptions in these functions can lead to a tendency to focus repetitively on objects or details. However, because the RRB scale in ADOS cannot capture all dimensions of repetitive behaviors, these results need to be validated by future studies with more detailed assessments of RRB.

We classified ASD subtypes solely based on brain morphology, assuming a causal relationship between brain structure and cognition. However, this relationship is inherently complex, and interpreting the brain region by region runs counter to its interconnected nature. Although subgroup L exhibited a higher prevalence of structural abnormalities, suggesting more extensive morphological variations, we did not find reproducible brain-behavior associations between cohorts. Several factors may explain these findings. First, cognitive impairments related to structural abnormalities in subgroup L might be more sensitive to cultural differences. Second, the validity of ASD diagnoses in subgroup L might be questionable. The best predictor of a DSM-IV diagnosis has often been found to be the specific clinic attended, rather than any defining characteristic of the individual64. This indicates that current diagnostic tools may not capture the nuanced behaviors directly related to brain abnormalities in subgroup L. Considering global reduction in brain volume, which may result from cardiogenic etiological damage, our findings highlight the urgent need for new methods and paradigms to enable deep phenotyping and reduce the reliance on behavioral scales for differentiation. Third, heterogeneity within subgroups may also contribute to the limited reproducibility of the findings. Behavioral paradigms targeting brain regions with a normal distribution of OoS scores could better reveal subgroup-specific relationships between brain morphology and cognitive behavior.

In our analysis, we rigorously controlled for site effects to minimize their impact on the results. To validate our approach, we performed an independent spectral clustering analysis using only NYU site data from ABIDE. The assignment consistency compared with the main analysis reached 90.48%, demonstrating both the robustness of our clustering framework to site-specific variability and the biological validity of the identified subtypes. This high concordance further supports the clinical translatability of our neurosubtyping scheme, as it persists even when derived from diverse data sources, therefore providing a solid foundation for future multi-site biomarker development targeting subtype-specific mechanisms.

Some limitations of this study should be noted. First, previous research highlights atypical structural connectome asymmetry in ASD65, but we could not fully evaluate hemispheric differences within subgroups using LBCC trajectories. Second, the cross-sectional nature of the data limits causal inferences about how these abnormalities of brain morphology arise and how they relate to ASD symptoms. Future research should thoroughly investigate brain function and cognitive differences between these subgroups while exploring how demographic, environmental, and developmental factors shape brain-based subtypes. Combining longitudinal data in vivo and in vitro42 could expand cohort diversity and provide deeper insights into the neurobiological and molecular mechanisms underlying different subtypes. When designing future research cohorts, focusing on more representative ASD prototypes66 could help reduce heterogeneity within subgroups. Third, the use of ADOS rather than ADOS-2 in the CABIC cohort represents a methodological limitation that future studies with larger datasets should address.

More attention should be paid to people with more severe ASD, such as minimally verbal autism, who remain underrepresented in brain imaging studies67, partly due to the challenges of collecting neuroimaging data, especially in young children. Our research strategy uses structural magnetic resonance imaging, which is feasible during natural sleep or under sedation and has significant clinical potential. However, multimodal approaches with higher ecological validity are essential to identify neurosubtypes that include these samples. Importantly, all current findings are derived from male ASD cohorts; any clinical translation should be cautiously limited to male populations until validated in females. Therefore, future studies should expand ASD cohorts with more female representation to enable comprehensive study of sex-specific brain morphological heterogeneity.

At the same time, replication with different clustering algorithms is required to confirm the stability of these findings. For example, we employed Gaussian mixture modeling (GMM) as a validation framework to assess consistency with our main clustering results. GMM assumes the observed data are generated from a mixture of multiple distinct Gaussian distributions68. Cluster indices were highly consistent across the two methods (83.58%), and the GMM subgroups showed brain morphology and cognitive-behavioral characteristics similar to those from the spectral clustering analysis (Supplementary Fig. 6). Participants in subgroup L were also significantly younger than those in subgroup H (p = 0.03). No significant differences were observed in other comparisons between the two subgroups. This replication with GMM suggests that our ASD neurosubtyping is not method-dependent. Nevertheless, more nuanced clustering approaches and strategies should be explored in the future, as disease variations may exist within the range of normal variation69, and multiple subpopulations may also exist in typically developing populations70. Finally, translational studies using animal models are essential for understanding the mechanisms underlying these different subtypes. Such studies can provide critical information on how abnormalities in brain morphology develop71.

Conclusion

Linking specific brain morphology abnormalities to cognitive or behavioral impairments in ASD is challenging. Different biological mechanisms can lead to similar behavioral outcomes, complicating the identification of neural correlates. This overlap may explain the difficulty in replicating neuroimaging findings between studies or cohorts. Using the largest brain normative datasets and two large-scale cross-cultural consortia, we applied normative modeling to disentangle the heterogeneity of brain morphology in ASD. By breaking the ASD population into smaller and more morphologically homogeneous subgroups, we identified two subtypes of morphological abnormalities. We identified significant correlations between cortical regions and autistic traits across both consortia, with particular emphasis on the isthmus cingulate cortex, as well as the transverse and inferior temporal gyri. Based on the abnormal prevalence of regional volumes, we found two structural impairment pathways that could disrupt functions ranging from sensory processing to higher cognition. These findings highlight the importance of untangling mixed biological mechanisms and offer insight for developing effective subgroup-driven individualized interventions. In conclusion, our study provides a reference for elucidating the etiological mechanisms of neurodevelopmental disorders and advancing future subgroup-driven precision clinical practice.

Methods

Participants and data preprocessing

Participants were selected from the Autism Brain Imaging Data Exchange (ABIDE) cohort34,35. The original dataset includes 1060 autistic and 1166 non-autistic individuals aged 5 to 64 years from 24 different sites. All diagnoses were based on the DSM-IV or DSM-V criteria. We used T1-weighted MRI data from the ABIDE (I and II) cohort. The acquisition parameters are publicly available at https://fcon_1000.projects.nitrc.org/indi/abide/. Data quality was assessed through manual visual inspection aided by visualization outputs from MRIQC (https://mriqc.readthedocs.io/en/latest/)72 (version 22.0.6). We visually rated the quality of 2451 images using a 3-class framework73, with “0” denoting images that suffered from gross artifacts and were considered unusable, “1” denoting images with some artifacts that were still considered usable, and “2” denoting images free from visible artifacts. Images with a “0” score (for example, with severe motion artifacts, segmentation errors, or structural abnormalities) were excluded. A total of 1907 (77.8%) images from 1726 participants passed quality control. For participants who had more than one T1-weighted run, we selected the one with the best imaging quality.

Due to the limited number of female ASD participants, and the consideration that sex-specific developmental trajectories32 and autism-related neuroanatomical differences74,75 may confound data-driven population clustering, our study focuses only on male participants under 13 years old. This age range reflects our interest in brain development during childhood and early adolescence. This yielded 274 ASD participants for downstream analyses. The participant ID list is provided in Supplementary Data 1. Details of the participants excluded at each step are provided in Fig. 4.

Fig. 4: General analytic flow in this study.

Brain chart images are updated from Bethlehem et al.32.

Preprocessing was performed using our in-house MRI data processing platform, the Connectome Computation System (CCS, https://github.com/zuoxinian/CCS) pipeline76,77. CCS optimizes the preprocessing of structural MRI by calling several structural image processing software packages through scripts. During preprocessing, the raw T1-weighted image was first reoriented to RPI orientation and cropped. Skull stripping was then performed using Advanced Normalization Tools (ANTs)78 to generate a brain mask. For cortical surface reconstruction and gray and white matter segmentation, the recon-all pipeline from FreeSurfer (https://surfer.nmr.mgh.harvard.edu/) (version 6.0.0) was used, with the default brain extraction step replaced by the ANTs-derived brain mask. Each T1-weighted image was segmented into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF).

The total cortical volume (TCV) was calculated as the sum of the GM and WM volumes, following the LBCC method32. The mean cortical thickness (CT) was calculated as a weighted average of the thicknesses of the left and right hemispheres. The total surface area (SA) was determined by summing the white matter surface areas of both hemispheres. Regional brain volumes for 34 bilaterally averaged cortical regions were evaluated using the Desikan-Killiany parcellation36. To reduce site-related variability, we applied ComBat harmonization79. This process was implemented using the neuroCombat (https://github.com/Jfortin1/ComBatHarmonization) R package.

Normative scoring and clustering analysis

To assess how the brain morphology of each participant aligned with age-standardized normative trajectories, we used the out-of-sample (OoS) centile score32. First, all non-ASD individuals from the site-harmonized ABIDE dataset were used to estimate cohort-specific statistical offsets. The centile scores of each brain measurement were then estimated for all individuals with ASD relative to the offset trajectory (see Methods: Centile scoring of new MRI data in the original LBCC work32). This approach takes advantage of the generalized additive models for location, scale, and shape (GAMLSS)80 framework to estimate statistical offsets such as mean, variance, and skewness. These offsets capture individual deviations from typical brain development. OoS scores are comparable across ages and cohorts. The scoring process was conducted with the publicly available code from the Lifespan Charts website (https://github.com/brainchart/Lifespan).

Using OoS scores from 34 brain regions as classification features, we applied spectral clustering81 among individuals with ASD in the ABIDE cohort. Spectral clustering was selected for its ability to detect nonlinear data structures, which are common in brain imaging datasets. Unlike k-means clustering, which relies on centroid-based optimization in the original feature space, spectral clustering first embeds the data in a lower-dimensional space using the leading eigenvectors of a matrix derived from pairwise similarities. Before clustering, the Jarque-Bera test was performed to determine whether OoS scores follow a normal distribution.
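As a minimal illustration of the normality check mentioned above, the Jarque-Bera statistic can be computed from the sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4), which is asymptotically chi-square distributed with 2 degrees of freedom under normality. The sketch below is not the authors' code; the function name `jarque_bera` and the toy input are hypothetical, and the asymptotic p-value may be inaccurate for small samples.

```python
import math


def jarque_bera(values):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form (dividing by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # The chi-square survival function with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical toy values standing in for one region's OoS scores.
jb_stat, jb_p = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```

A large p-value here means that normality is not rejected, which is how the "normal OoS distribution" statements in the text would be read under this test.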

Clustering analysis was implemented using the kernlab package in R. We tested solutions with 2 to 10 clusters to identify the optimal one. The quality of clustering was evaluated using the Silhouette Coefficient (SC)82, which assesses both within-cluster cohesion and between-cluster separation. SC values range from -1 to 1, with higher scores indicating better defined clusters. The solution with the highest SC score was taken as the optimal number of clusters.
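The model-selection loop described above can be sketched as follows. The study used kernlab in R; this sketch instead uses scikit-learn in Python purely for illustration, so the RBF affinity, the `gamma` value, the function name `select_n_clusters`, and the synthetic two-blob data are all assumptions rather than the published configuration.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import silhouette_score


def select_n_clusters(features, k_values=range(2, 11), gamma=0.5, seed=0):
    """Fit spectral clustering for each k and return the k with the highest SC."""
    scores = {}
    for k in k_values:
        model = SpectralClustering(
            n_clusters=k, affinity="rbf", gamma=gamma, random_state=seed
        )
        labels = model.fit_predict(features)
        # Silhouette Coefficient: mean over samples, bounded in [-1, 1].
        scores[k] = silhouette_score(features, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical toy data: two well-separated groups in a 4-feature space.
rng = np.random.default_rng(0)
toy_features = np.vstack(
    [rng.normal(0.0, 0.5, size=(40, 4)), rng.normal(3.0, 0.5, size=(40, 4))]
)
# A narrower range than the 2 to 10 used in the study keeps the toy run short.
best_k, sc_by_k = select_n_clusters(toy_features, k_values=range(2, 7))
```

With real OoS scores the input would be a participants-by-34 matrix, and the silhouette values would typically be far lower than on this synthetic example.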

Key brain regions for subgroup classification

We developed a predictive model using machine learning with individual OoS scores from 34 brain regions as input features. We trained a Support Vector Machine (SVM) classifier83 with Recursive Feature Elimination and Cross-Validation (RFECV). This was implemented using the scikit-learn library in Python. Combining recursive feature elimination (RFE)84 with cross-validation allows the model to iteratively identify the most informative features while maintaining generalizability and robustness. The dataset was divided into an 80% training set and a 20% test set. Using a linear SVM as the base estimator, RFECV removed less important features in a stepwise manner, with performance monitored through 5-fold cross-validation. This process ensured that the final feature set balanced predictive power against over-fitting.

Following feature selection, we fine-tuned the SVM model by performing a grid search to optimize key hyperparameters, including regularization strength (C) and kernel coefficients (γ). The model was evaluated across different kernel functions, such as linear, polynomial, and radial basis functions, to determine the best configuration. Once optimized, the model was trained on the selected features to achieve optimal classification performance. SHapley Additive exPlanations (SHAP)37 were applied to quantify the contribution of each brain region to the model predictions.
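The feature-selection and tuning steps above can be sketched with scikit-learn as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: the hyperparameter grid values, the random seed, the stratified split, the helper name `select_and_tune`, and the synthetic data are hypothetical, and the SHAP step is omitted.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.svm import SVC


def select_and_tune(X, y, seed=0):
    """RFECV with a linear SVM, then a grid search over kernels, C and gamma."""
    # 80% training / 20% test split, stratified by subgroup label (assumption).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)

    # Stepwise elimination of the least important feature, scored by 5-fold CV.
    selector = RFECV(SVC(kernel="linear"), step=1, cv=cv, scoring="accuracy")
    selector.fit(X_train, y_train)

    # Hypothetical grid; gamma only applies to the non-linear kernels.
    param_grid = [
        {"kernel": ["linear"], "C": [0.1, 1, 10]},
        {"kernel": ["rbf", "poly"], "C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]},
    ]
    search = GridSearchCV(SVC(), param_grid, cv=cv, scoring="accuracy")
    search.fit(selector.transform(X_train), y_train)

    test_accuracy = search.score(selector.transform(X_test), y_test)
    return selector.support_, search.best_params_, test_accuracy


# Hypothetical toy data: only the first two of ten features carry signal.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 10))
y_toy = (X_toy[:, 0] + X_toy[:, 1] > 0).astype(int)
support, best_params, accuracy = select_and_tune(X_toy, y_toy)
```

Fitting the selector and the grid search on the training split only keeps the held-out 20% untouched until the final accuracy estimate.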

Structural covariance network analysis

To investigate morphological relationships within a specific cluster, we performed structural covariance analysis85 on the OoS scores of each pair of regions. Pearson’s correlation analysis was performed on residuals obtained from linear regression models that controlled for site effects and TCV OoS scores. To enable statistical comparison, correlation coefficients were normalized using Fisher’s r-to-z transformation, defined as:

$$Z=\frac{1}{2}{\log }_{e}\left(\frac{1+r}{1-r}\right),$$
(1)

where r represents the Pearson correlation coefficient between two regions. The difference (cluster i/ii vs. normal control) in transformed correlation coefficients was calculated using:

$$z=\frac{{Z}_{1}-{Z}_{2}}{\sqrt{\frac{1}{{n}_{1}-3}+\frac{1}{{n}_{2}-3}}},$$
(2)

where n1 and n2 are the two sample sizes being compared. The resulting z-values were converted to p-values using the cumulative distribution function of the standard normal distribution.
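Equations (1) and (2) can be checked numerically with a few lines of standard-library Python. The sketch below assumes a two-sided p-value, which the text does not specify; the function names and the example correlations and sample sizes are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher r-to-z transform, Eq. (1): Z = 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def compare_correlations(r1, n1, r2, n2):
    """Return the z statistic of Eq. (2) and a two-sided p-value."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / standard_error
    # Two-sided p from the standard normal CDF: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical example: r = 0.6 in one group vs r = 0.2 in another, 103 each.
z_stat, p_val = compare_correlations(0.6, 103, 0.2, 103)
```

Because the standard error depends only on the two sample sizes, the same difference in correlation becomes harder to detect in the smaller subgroup.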

Cluster differences and brain-behavior analysis

We investigated differences between clusters in demographic, technical and clinical variables. For categorical variables, such as subtypes of pervasive developmental disorders (PDD), scanner models, and manufacturers, Chi-square tests were employed. For data collection sites, Fisher’s exact test was used with Monte Carlo simulations to approximate p values. To ensure robust analysis, sites with fewer than 10 participants were excluded. For continuous variables, statistical methods were selected based on data distribution: the Wilcoxon rank sum test was used for non-normally distributed data, and the two-sample t-test for normally distributed data. The effect sizes were calculated using Cramér’s V for categorical variables and Cohen’s d for continuous variables. Statistical significance was set at p < 0.05.
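For reference, the two effect sizes named above can be written out directly. This is a minimal sketch assuming the pooled-standard-deviation form of Cohen's d and a Pearson chi-square without continuity correction for Cramér's V; the text does not state which variants were used, and the function names and inputs are hypothetical.

```python
import math


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (n_b - 1)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean_a - mean_b) / pooled_sd


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Normalise by n and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))
```

For example, `cohens_d([1, 2, 3], [2, 3, 4])` gives -1.0, and a perfectly diagonal 2 × 2 table such as `[[10, 0], [0, 10]]` gives a Cramér's V of 1.0.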

To further explore brain-behavior relationships within clusters, we performed correlation analyses using Full-Scale Intelligence Quotient (FIQ), ADOS-2 (Social Affect and Restricted and Repetitive Behaviors, RRB), ADI-R (Social Interaction and RRB), and SRS subscales. Pearson’s correlation was applied to most measures, while Spearman’s correlation was used for ADOS-2 RRB scores due to their limited range. In these analyses, the effects of site and the OoS scores of TCV were controlled. Statistical significance was set at p < 0.05, and no correction for multiple comparisons was applied.

For each group, we also examined partial correlations to investigate structural covariance between 34 brain regions and their association with cognitive behaviors. For each participant, the partial covariance scores of two regions were calculated as:

$${\mathrm{partial}}_{p}={\mathrm{residual}}_{\mathrm{region1}}\times {\mathrm{residual}}_{\mathrm{region2}}$$
(3)

The resulting partial covariance scores were then correlated with the same behavioral measures using the same correlation methods.
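The per-participant score in Eq. (3) can be sketched as the element-wise product of two regional residuals. The sketch assumes the residuals come from ordinary least-squares regression on the covariates (for example, site indicators and TCV OoS scores) with an intercept; the helper names and the toy arrays are hypothetical.

```python
import numpy as np


def residualize(values, covariates):
    """Residuals of an OLS regression of `values` on the covariates plus intercept."""
    values = np.asarray(values, dtype=float)
    design = np.column_stack(
        [np.ones(len(values)), np.asarray(covariates, dtype=float)]
    )
    beta, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ beta


def partial_covariance_scores(region1, region2, covariates):
    """Eq. (3): per-participant product of the two regional residuals."""
    return residualize(region1, covariates) * residualize(region2, covariates)


# Hypothetical toy example with a single covariate standing in for TCV.
tcv = np.array([0.0, 1.0, 2.0, 3.0])
noise = np.array([1.0, -1.0, -1.0, 1.0])  # orthogonal to intercept and tcv
region1 = 2.0 * tcv + noise
region2 = -1.0 * tcv + 0.5 * noise
scores = partial_covariance_scores(region1, region2, tcv)
```

A positive score means a participant deviates in the same direction in both regions after adjustment, so the average of these scores tracks the residual covariance between the two regions.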

Finally, we performed mediation analyses to examine whether FIQ mediates the relationships between brain morphology and cognitive behaviors within clusters. Linear models were used to estimate direct and indirect effects, with adjustments for site and OoS scores of TCV. All brain morphology and cognitive behavior variables were standardized to ensure comparability among participants. The significance of indirect effects was assessed using the Sobel test. As brain abnormalities are detectable in early development40, before the emergence of autistic behaviors86, we built a mediation model to examine the brain’s effect on cognitive behaviors. To provide a more comprehensive account, we also analyzed the reverse mediation model.
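The Sobel test used for the indirect effect has a closed form: with a the path from brain morphology to the mediator (FIQ), b the path from the mediator to behavior, and their standard errors, z = ab / sqrt(b²·SE_a² + a²·SE_b²). The sketch below implements only this formula; the coefficient values are hypothetical, and the two-sided p-value is an assumption.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Return the indirect effect a*b, the Sobel z statistic and a two-sided p-value."""
    indirect = a * b
    # First-order (delta-method) standard error of the product a*b.
    standard_error = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return indirect, z, p_value


# Hypothetical standardized path coefficients and standard errors.
indirect, sobel_z, sobel_p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

In practice a and b would be taken from the two covariate-adjusted linear models described above, one predicting the mediator and one predicting the behavioral outcome.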

Detection of reproducible results between datasets

We applied the ABIDE-based classifier to an independent dataset from the China Autism Brain Imaging Consortium (CABIC)33, a multicenter collaboration among clinical and research institutions in China that is designed to collect neuroimaging, demographic, and behavioral data on children and adolescents with ASD. Diagnoses in the CABIC cohort were based on DSM-IV, DSM-V, or ICD-11 criteria and validated using ADOS and ADI-R. The dataset includes 2,656 participants, comprising 1,503 individuals with ASD and 1,153 typically developing children. Detailed MRI acquisition parameters are publicly accessible at https://php.bdnilab.com/sites/. In total, 968 males with ASD (4.56 ± 1.69) and 526 typically developing controls (7.37 ± 2.73) were included in the analysis. The participant ID list is provided in Supplementary Data 2.

CABIC participants, aged 1 to 13 years, tend to exhibit more severe ASD symptoms, resulting in lower variability within clusters. Although this homogeneity is clinically valuable, it poses challenges in detecting subtle brain-behavior relationships. In contrast, the ABIDE cohort provides greater behavioral variability, enhancing the ability to identify nuanced patterns in brain structure. By leveraging these complementary datasets, our goal was to identify results that are reproducible across both cohorts. We repeated the statistical, structural covariance, and correlation analyses on the CABIC cohort. However, the limited FIQ data in CABIC prevented replication of the mediation analysis with sufficient statistical power. Therefore, mediation analysis was conducted on the ABIDE cohort only.

In this study, we applied an SVM classifier trained on the ABIDE cohort to the CABIC cohort. To evaluate the consistency between this transfer approach and an independent clustering approach, we replicated the identical analytical process directly on ASD participants from the CABIC cohort (see details in the Supplementary Information). The optimal number of clusters was also two (Supplementary Fig. 5a). Among 968 participants, 929 (95.97%) exhibited consistent cluster indices across both methods. This high concordance highlights the robustness of our spectral clustering framework across heterogeneous datasets.
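Because cluster indices from an independent clustering run are arbitrary, comparing them with classifier-assigned labels requires aligning the label names first. The sketch below shows one simple way to do this, assuming agreement is defined as the best match over all relabelings; this definition, the function name, and the toy labels are assumptions, and the exhaustive search is only practical for a small number of clusters.

```python
from itertools import permutations


def cluster_concordance(labels_a, labels_b):
    """Fraction of matching assignments under the best relabeling of labels_b."""
    names = sorted(set(labels_a) | set(labels_b))
    best = 0.0
    for perm in permutations(names):
        mapping = dict(zip(names, perm))
        matches = sum(a == mapping[b] for a, b in zip(labels_a, labels_b))
        best = max(best, matches / len(labels_a))
    return best


# Hypothetical toy labels: identical partitions with swapped cluster names.
swapped = cluster_concordance([0, 0, 1, 1], [1, 1, 0, 0])
# Arithmetic check of the figure reported in the text: 929 of 968 participants.
reported_percent = round(100 * 929 / 968, 2)
```

Under this definition two identical partitions with swapped names score 1.0, and the reported 929 of 968 consistent assignments corresponds to 95.97%.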

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.