Introduction

Autism spectrum disorder (ASD) is a complex neurodevelopmental condition characterized by challenges in social interaction and communication, along with repetitive, restricted, and stereotyped behavioral patterns1,2,3,4. While previous studies have documented heterogeneous neurodevelopmental patterns in ASD, a growing consensus suggests that ASD involves complex and dynamic neurodevelopmental mechanisms, with distinct neuroanatomical variations emerging across different developmental stages5,6,7,8,9. Neuroimaging studies have delineated stage-specific neurodevelopmental trajectories in ASD, characterized by transient early brain overgrowth followed by atypical trajectory attenuation10,11,12,13,14. These findings underscore the necessity of considering developmental-stage dependencies when defining neurodevelopmental norms for ASD.

ASD has been widely described as a disorder of connectivity15,16,17, with a large body of evidence supporting atypicalities in both functional and structural brain networks, particularly in regions involved in social cognition18,19,20,21. fMRI studies have identified atypical functional activity in multiple brain regions implicated in ASD, including key components of the social brain network, such as the amygdala, prefrontal cortex, cingulate cortex, and middle temporal areas22, as well as disrupted connectivity among these regions. These connectivity disruptions have been linked to difficulties in mentalizing, social cognition, and higher-order emotional processes in individuals with ASD23,24,25. Atypical neuroanatomy has also been widely reported in ASD. For example, gray matter volume (GMV) measurements have revealed significant volumetric changes in social brain regions, particularly in the cingulate cortex, fusiform gyrus, amygdala, temporal and frontal cortices, and insula26. Emerging evidence underscores the pivotal role of brain network architecture in shaping anatomical development. Accumulating studies demonstrate consistent associations between atypical structural features and network connectivity patterns across both neurotypical and clinical populations27,28. A study of healthy individuals revealed that connectome architecture governs the maturation of cortical thinning trajectories from childhood through adolescence29. Notably, pathological associations between reduced cortical thickness and network organization have been identified in major depressive disorder30. Moreover, transdiagnostic analyses have shown that network epicenters drive transdiagnostic co-alteration patterns across functional and structural domains in six neurodevelopmental conditions, including ASD31. In summary, these findings suggest that brain networks provide an organizational framework for brain development, shaping its structural integrity31,32.
Based on this evidence, we hypothesize that functional network organization constrains atypical neurodevelopmental trajectories in ASD.

Structural neuroimaging studies suggest that key neuroanatomical features, such as GMV, may provide critical insights for the differential diagnosis of ASD or for elucidating its underlying biological mechanisms33,34. However, existing findings remain inconsistent35,36,37, and show small effect sizes38,39. One important reason for these inconsistencies is that most studies have employed a standard case-control analysis approach38,40, which may overlook the heterogeneity introduced by age. To address this limitation, the present study employs a sliding age window approach, conducting stratified analyses across different developmental stages to identify atypical morphological patterns in individuals with ASD. Furthermore, to quantify the distributional deviation (DEV) of GMV between ASD and typically developing controls (TDCs), we introduced the Kullback-Leibler (KL) divergence41. KL divergence has proven effective in characterizing GMV distributions and providing a robust morphological connectivity metric42. Sebenius and colleagues further applied symmetric KL divergence to calculate a morphometric inverse divergence structural similarity metric between regional distributions, showing greater consistency with known cortical symmetry43. Building on this foundation, we applied KL divergence and expected value analyses to compute age-specific DEV scores and examined their variation patterns across age groups, aiming to determine whether overall gray matter density in ASD increases or decreases with age.

Finally, to test whether functional networks constrain these atypical developmental trajectories, atypical morphological brain regions identified at the previous developmental stage were used as input to a network diffusion modeling (NDM) framework, which was then applied to predict deviations in the subsequent stage. The current study tests two hypotheses: (1) atypical morphological development in ASD is primarily concentrated in the social brain, with increases in GMV or density during early development, followed by a reduction during adolescence; (2) functional networks constrain the trajectory of atypical morphological development in ASD. To test these hypotheses, we integrated structural morphology and functional connectomics to quantify DEV in individuals with ASD relative to TDCs, and explored the neural network mechanisms underlying these atypical morphologies. Specifically, we conducted the following analyses: (1) a total of 766 participants aged 8–18 years were selected from the ABIDE dataset and divided into age groups using a sliding window approach; (2) DEV was used to quantify deviations in developmental trajectory across age groups and to identify atypical developmental patterns in individuals with ASD; (3) atypical morphological brain regions identified at the preceding stage were used as input to NDM to predict developmental deviations in the subsequent stage, thereby assessing whether functional networks constrain atypical developmental trajectories.

Results

Distribution deviation (DEV) in brain regions across different age groups

We applied a sliding-window approach to stratify participants by age (Fig. 1A) and calculated DEV of GMV using KL divergence and expected value analysis (Fig. 1B). As shown in Fig. 2, glass brain visualization highlighted the top 10% of DEV regions in each age group, which were defined as atypical brain areas and were mainly located in the frontal, parietal, and temporal lobes. During early adolescence (before the 13–15 age group), individuals with ASD exhibited positive DEV values compared to TDCs (see Supplementary Table S1), suggesting greater GMV density in ASD. Conversely, negative deviation emerged in late adolescence (after the 13–15 age group), indicating reduced GMV density in ASD compared to TDCs. This shift is also evident across the whole brain (see Supplementary Fig. S1) and reflects a nonlinear developmental trajectory in ASD, characterized by early cortical overgrowth followed by delayed or attenuated maturation.
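The age stratification can be sketched as follows. This is a minimal illustration rather than the study's code: it assumes the 8–18 year range and the 2-year windows with a 1-year overlap described for Fig. 1A, and the function names and the closed-interval membership rule are hypothetical choices.

```python
def sliding_age_windows(min_age, max_age, width=2, step=1):
    """Return (start, end) age windows of `width` years, shifted by `step` years."""
    windows = []
    start = min_age
    while start + width <= max_age:
        windows.append((start, start + width))
        start += step
    return windows


def assign_to_windows(ages, windows):
    """Map each window to the indices of participants whose age falls inside it.

    Assumption: windows are closed on both ends, so a participant aged exactly
    10.0 belongs to the 8-10, 9-11, and 10-12 windows.
    """
    return {w: [i for i, a in enumerate(ages) if w[0] <= a <= w[1]] for w in windows}


windows = sliding_age_windows(8, 18)
print(len(windows), windows[5])  # 9 (13, 15)
```

Under these assumptions the scheme yields nine overlapping groups, and the sixth window is 13–15 years, which is the group used as the turning point in the text.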

Fig. 1: Flow chart illustrating data analysis pipeline.

A Sliding window approach for age grouping: Participants from the ASD and TDC groups were divided into overlapping age bins, each spanning 2 years with a 1-year overlap. For each age group, regional gray matter volume (GMV) maps were extracted separately for ASD and TDC individuals. B Estimation of distribution deviation (DEV): For each region of interest (ROI), the probability density functions (PDFs) of GMV values were estimated for both ASD and TDC groups, and the Kullback–Leibler (KL) divergence and expected values were used to quantify DEV between the two distributions. The top 10% of regions with the highest DEV values were identified as atypical regions for each age group, forming age-specific variation maps. C Functional connectivity-constrained DEV diffusion modeling: To examine whether observed developmental changes in atypicality patterns are shaped by intrinsic brain connectivity, a network-based diffusion model (NDM) was applied. Using functional connectivity (FC) networks, atypical regions from a given age window served as seed inputs to simulate the spread of atypicality over time. Model fitting was performed by comparing simulated maps to observed DEV patterns in the subsequent age window. yrs, years.
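As a rough illustration of panel B, a signed DEV score for a single region might be computed as below. This is a sketch under stated assumptions, not the study's implementation: the PDFs are estimated with smoothed equal-width histograms (the PDF estimator is not specified here), the magnitude is KL(ASD || TDC), and the sign is taken from the difference in expected values, which is one plausible reading of "KL divergence and expected values". All function names are hypothetical.

```python
import math


def histogram_pdf(values, lo, hi, n_bins, eps=1e-6):
    """Discrete PDF on n_bins equal-width bins over [lo, hi], lightly smoothed."""
    counts = [0.0] * n_bins
    for v in values:
        b = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[b] += 1.0
    total = sum(counts) + eps * n_bins
    return [(c + eps) / total for c in counts]  # eps keeps every bin positive


def kl_divergence(p, q):
    """KL(P || Q) in nats for two strictly positive discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def signed_dev(asd, tdc, n_bins=10):
    """Signed deviation of one region's GMV distribution (ASD relative to TDC).

    Assumption: magnitude = KL(ASD || TDC); sign = sign of mean(ASD) - mean(TDC).
    """
    asd, tdc = list(asd), list(tdc)
    lo, hi = min(asd + tdc), max(asd + tdc)
    if hi == lo:  # all values identical: the two distributions coincide
        return 0.0
    p = histogram_pdf(asd, lo, hi, n_bins)
    q = histogram_pdf(tdc, lo, hi, n_bins)
    mean_diff = sum(asd) / len(asd) - sum(tdc) / len(tdc)
    return math.copysign(kl_divergence(p, q), mean_diff)
```

With this convention a positive score means the ASD distribution is shifted toward larger GMV than the TDC distribution, and a negative score means the opposite, which matches the way positive and negative DEV are interpreted in the text.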

Fig. 2: The glass brain map illustrates the spatial distribution of deviation values in regions with atypical development.

Red areas indicate regions where deviation values are higher in the ASD group compared to TDCs (with color bar label 1 representing the top 6–10% of positive DEV values, and label 2 representing the top 0–5%). Blue areas indicate regions where deviation values are lower in the ASD group (with color bar label –1 representing the top 6–10% of negative DEV values, and label –2 representing the top 0–5%). Label 0 denotes regions with DEV values outside these thresholds. Darker colors indicate larger DEV scores.

To evaluate the robustness of these findings, we repeated the analyses using different thresholds (top 5% and 15%) (see Supplementary Figs. S2 and S3). The spatial distribution of atypical regions remained largely consistent across thresholds. Although minor variations in extent were observed, the core regions, predominantly localized in the frontal, parietal, and temporal lobes, were consistently preserved. Moreover, the developmental trajectories of DEV values across age groups remained stable under different thresholds. During early adolescence (before the 13–15 age group), atypical regions in individuals with ASD exhibited positive deviations (Supplementary Tables S2 and S3), whereas during late adolescence (after the 13–15 age group), the DEV values became negative.
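The threshold sensitivity check can be illustrated with a short sketch. It assumes that regions are ranked by absolute DEV and that the number of retained regions is rounded up; both choices, and the function names, are illustrative assumptions rather than details taken from the study.

```python
import math


def top_fraction_regions(dev_scores, fraction=0.10):
    """Indices of the regions with the largest |DEV|, returned in ascending order.

    Assumption: ranking uses the absolute DEV value, and the number of regions
    kept is rounded up.
    """
    n_keep = max(1, math.ceil(len(dev_scores) * fraction))
    ranked = sorted(range(len(dev_scores)), key=lambda i: abs(dev_scores[i]), reverse=True)
    return sorted(ranked[:n_keep])


def overlap_ratio(regions_a, regions_b):
    """Share of the smaller region set that is also contained in the other set."""
    a, b = set(regions_a), set(regions_b)
    return len(a & b) / min(len(a), len(b))
```

Because the regions are ranked once and only the cut-off changes, a stricter threshold always selects a subset of a more lenient one under this scheme; differences across thresholds therefore concern the extent of the atypical map rather than its core.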

Atypically developing brain regions and meta-analysis

To better characterize the DEV of atypical brain regions in ASD, we superimposed the atypical brain regions from the nine age groups onto a single map (Fig. 3A). We found these regions in the superior temporal sulcus (STS), cingulate gyrus (CG), insula, and superior parietal lobule (SPL) across different age groups. When these regions were mapped onto a normative network, we found that 28.26% of them were located within the salience/ventral attention networks (SN/VAN) (Fig. 3B), suggesting that atypical development in ASD may be related to dysregulated attentional control functions. To further explore the functional significance of these atypical regions, we conducted meta-analytic decoding using the Neurosynth database (Fig. 3C). The results indicated that these regions are predominantly associated with cognitive control, response-inhibition, and inhibitory-control. Additionally, when mapped onto the brain’s functional gradient space, these atypical ROIs were predominantly situated at the transmodal end of Gradient 1 (Fig. 3D), suggesting a topographical bias toward higher-order associative cortices. This spatial preference may reflect the influence of large-scale functional network architecture in constraining atypical developmental trajectories.

Fig. 3: Spatial distribution and functional profiling of atypical brain regions in ASD.

A A glass brain map displaying the spatial distribution of the top 10% most atypical brain regions, aggregated across all age groups. The color bar represents the frequency with which each region was identified as atypical across age groups. B Proportion of atypical regions assigned to each functional network. The bar plot shows the proportion of atypical regions located in each network. C Potential psychological processes associated with atypical regions, decoded using Neurosynth. D Atypical regions projected onto the principal functional gradients of the cerebral cortex (Gradient 1 and Gradient 2; Margulies et al.100). The color bar indicates the number of times these atypical regions appeared across the 9 age groups.

Distribution deviation links with FC connections

Next, we tested whether the DEV was associated with the functional network architecture. To this end, we first constructed group-averaged functional connectomes, consisting of 246 nodes, based on resting-state fMRI images from each ASD age group. Next, we retained the top 10% of connections to generate a binary group-level connectome (Fig. 4A). Then, we estimated the relationship between the DEV of a node and the mean DEV of its directly connected neighbor nodes within the backbone (Fig. 4B). Our analysis revealed a significant spatial correlation between the DEV of a node and the mean DEV of its directly connected neighbors (Fig. 4C; r = 0.517, p < 0.001). To further evaluate the significance of this spatial correlation, we compared the observed result to a spin-null model, aiming to determine whether the observed correlation was driven by regional correspondence rather than the spatial autocorrelation in the DEV values. Specifically, we generated 1000 surrogate maps by randomly rotating region-level DEV values across the cortical surface. Figure 4A, C display results for the 6th age group (ages 13–15); results for other age groups are provided in Supplementary Fig. S4.
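The node-neighbor analysis can be sketched in a few lines. The example below is illustrative only: it assumes that the "top 10% of connections" are the strongest off-diagonal entries of the group-averaged FC matrix by raw value, it drops nodes without neighbors, and it does not implement the spin-based null model, which requires the spherical coordinates of the parcels. Function names are hypothetical.

```python
import math


def binarize_top_connections(fc, density=0.10):
    """Keep the strongest `density` share of off-diagonal edges as a binary graph."""
    n = len(fc)
    edges = sorted(((fc[i][j], i, j) for i in range(n) for j in range(i + 1, n)), reverse=True)
    n_keep = max(1, round(len(edges) * density))
    adj = [[0] * n for _ in range(n)]
    for _, i, j in edges[:n_keep]:
        adj[i][j] = adj[j][i] = 1
    return adj


def neighbor_mean_dev(adj, dev):
    """Mean DEV of each node's directly connected neighbors (None if isolated)."""
    means = []
    for row in adj:
        nbrs = [dev[j] for j, connected in enumerate(row) if connected]
        means.append(sum(nbrs) / len(nbrs) if nbrs else None)
    return means


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def node_neighbor_correlation(fc, dev, density=0.10):
    """Correlate each node's DEV with the mean DEV of its backbone neighbors."""
    adj = binarize_top_connections(fc, density)
    pairs = [(d, m) for d, m in zip(dev, neighbor_mean_dev(adj, dev)) if m is not None]
    return pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
```

A high positive correlation from `node_neighbor_correlation` indicates that strongly connected regions tend to deviate in the same direction, which is the pattern the analysis above is designed to detect.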

Fig. 4: Associations of regional DEV with the functional network architecture.

A Age group-level connectome backbone at a 246-node resolution in ASD. B A schematic diagram illustrating the relationship between functional network and associated DEV. The extent of DEV for a given node (red) was correlated with the mean DEV extent of its directly connected neighbors (blue) to examine whether the DEV was associated with the functional network. C A significant correlation was observed between the nodal DEV and the mean of its directly connected neighbors (Pearson correlation, r = 0.517, p < 0.001, two-sided). The scatter plot shows the result at 246-node resolution (linear fit, central line in blue, with a 95% confidence interval, shaded in gray). D The observed Pearson correlations (shown as red circles) were compared against a spin-test null distribution (blue boxes) obtained from 1000 surrogate maps generated through random rotations.

The NDM of the functional connectome fitted the DEV in the next age group

Next, we used NDM to simulate the dynamic spread of DEV values in each atypical brain region across age stages (see Fig. 1C). The NDM was applied to the atypical brain regions from the previous age stage, and the optimal values derived from this analysis were then used to fit the DEV in GMV for the subsequent age stage (Fig. 5A). Our analysis revealed that DEV values of atypical brain regions in one age group, after being processed through NDM, could significantly predict DEV values in the next age group (Fig. 5B). Figure 5C illustrates the network diffusion pattern of DEV across three adjacent age groups (ages 12–16 years), based on the average diffusion values from t = 1 to t = 10. Red and blue nodes represent positive and negative DEV, respectively, with the pattern predominantly involving seven nodes located within the SN, which persist as atypical across all three age stages. The results show that diffusion values were largely concentrated in spatially adjacent regions and within the SN. Notably, in the 13–15 age group, the CG exhibited lower DEV values in the ASD group compared to the TDC group. As diffusion progressed, several atypical regions in the 14–16 age group demonstrated a transition from positive to negative DEV, suggesting a negative developmental trend with increasing age. Network diffusion patterns for the remaining age groups are provided in Supplementary Fig. S5.
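To make the diffusion step concrete, the sketch below simulates a generic network diffusion process from a single seed region. It assumes the common heat-equation form dx/dt = -beta * L x with the combinatorial graph Laplacian L = D - A and integrates it with forward Euler steps; the exact NDM formulation, the value of beta, and the fitting procedure used in the study are not given in this section, so all of these are illustrative assumptions, as are the function names. The step that combines the per-seed diffusion maps into fitted DEV scores is not shown.

```python
def graph_laplacian(adj):
    """Combinatorial Laplacian L = D - A of a symmetric graph with zero diagonal."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)] for i in range(n)]


def diffuse(adj, seed, beta=0.1, t_max=10, dt=0.01):
    """Integrate dx/dt = -beta * L x from a unit seed; return x at t = 1..t_max."""
    n = len(adj)
    lap = graph_laplacian(adj)
    x = [1.0 if i == seed else 0.0 for i in range(n)]
    steps_per_unit = round(1.0 / dt)
    snapshots = []
    for _ in range(t_max):
        for _ in range(steps_per_unit):
            flow = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
            x = [xi - dt * beta * fi for xi, fi in zip(x, flow)]  # forward Euler step
        snapshots.append(list(x))
    return snapshots


def mean_diffusion_map(adj, seed, **kwargs):
    """Average the diffusion values over the snapshots at t = 1..t_max."""
    snaps = diffuse(adj, seed, **kwargs)
    return [sum(s[i] for s in snaps) / len(snaps) for i in range(len(adj))]
```

On a three-node chain seeded at one end, the averaged map decreases with distance from the seed, mirroring the observation that diffusion values concentrate in adjacent regions. For a 246-node connectome the pure-Python loop is slow; the closed-form solution x(t) = exp(-beta * L * t) x(0), for example via `scipy.linalg.expm`, would be the more practical route.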

Fig. 5: Network Diffusion Modeling (NDM) of atypical brain regions in ASD across age groups.

A Schematic illustration of the NDM framework applied to atypical brain regions identified at each age stage. For a given age window, n atypical regions were used as source nodes in the NDM. Each source region generated a 246-dimensional diffusion map, and these n diffusion maps were subsequently used to estimate the fitted DEV scores. Pearson correlation analysis was conducted between the predicted DEV scores and the observed DEV scores from the subsequent age group. The color bar denotes both the intensity and polarity of the DEV values. B Summary table showing the performance of NDM-based model fitting between adjacent age groups. Each row presents the explanatory power () and r between predicted and actual DEV scores from one age group to the next (permutation test, p < 0.001). C Spatiotemporal diffusion patterns of seven representative nodes across three consecutive age groups. Red and blue nodes denote regions with positive and negative DEV values, respectively. Nodes boxed in black indicate the key regions referenced among the seven representative nodes.

Robustness and replication analyses

In the current study, to further assess the robustness of the KL divergence model, we introduced three widely used probability distribution distance metrics for comparative analysis: ED (Supplementary Fig. S6A), MMD (Supplementary Fig. S6B), and TVD (Supplementary Fig. S6C). The results demonstrated a significant correspondence between the atypical brain regions identified by KL divergence and those detected using alternative metrics. Specifically, regions identified through ED showed a correlation of r = 0.577 (p < 0.001) with those derived from KL divergence, while MMD and TVD yielded correlations of r = 0.827 (p < 0.001) and r = 0.683 (p < 0.001), respectively. These findings support the stability and reliability of KL divergence in characterizing atypical brain regions.

To validate the robustness and reproducibility of our findings, we conducted several complementary analyses. First, we repeated all primary analyses using alternative brain parcellation schemes comprising 200 and 400 cortical regions (see Supplementary Fig. S7). The results revealed a consistent spatial distribution of atypical regions across parcellation schemes, with convergence observed primarily in the insula, cingulate gyrus, and temporal lobe. Second, we replicated all primary analyses using an independent dataset from the NYU site (r = 0.725, p < 0.001) (Supplementary Fig. S8B) and a dataset comprising male-only participants (r = 0.963, p < 0.001) (Supplementary Fig. S8A). The replication results were highly consistent with those obtained from the discovery dataset. Taken together, these converging lines of evidence, across alternative metrics, parcellation schemes, and independent samples, highlight the stability and reliability of our findings and support the presence of a nonlinear trajectory of brain development from childhood through adolescence.

Discussion

Here, we introduce an approach to assessing DEV of GMV using KL divergence combined with expected value analysis, revealing distinct developmental trajectories in ASD. Our results show that individuals with ASD exhibit positive DEV values relative to TDCs during early adolescence (before the 13–15 age group), suggesting a potentially greater GMV density in ASD. However, this trend reverses during late adolescence (after the 13–15 age group). Notably, most atypical brain regions were located in the SN/VAN, which is associated with inhibitory control, emotion, motivation, empathy, perception, risk assessment, and decision-making. Additionally, the significant correlation between the DEV of brain nodes and their FC underscores the role of network dynamics in atypical development in ASD. Furthermore, the NDM highlights how FC influences the progression of these atypical brain regions across different age stages, providing further insights into developmental mechanisms underlying ASD.

Using KL divergence and expected value, we calculated the DEV of GMV in individuals with ASD, revealing a dynamic trajectory of brain development. Our findings show that during early adolescence (before the 13–15 age group), DEV values in individuals with ASD are positive relative to TDCs. However, this trend reverses in late adolescence (after the 13–15 age group), where DEV values in individuals with ASD become negative relative to TDCs. These results align with previous studies, many based on the ABIDE dataset, which have reported widespread increases in cortical thickness and GMV during early development that attenuate later in development34,38,44. For example, Chen et al. observed that GMV increases during early adolescence in ASD but declines with age, sometimes resulting in GMV reductions in specific brain regions10. Similarly, another study demonstrated that individuals with ASD exhibit relative GMV decreases through adolescence and young adulthood45. Our findings support this nonlinear trajectory and may help explain the inconsistent GMV findings reported across the literature. Notably, this trajectory has also been observed in FC, where several studies have reported age-related increases in FC during childhood (ages 5–12 years), followed by gradual declines beginning in adolescence46,47. Further research has suggested that this pattern may reflect a developmental sequence in ASD, where early uncoordinated structural growth among brain regions is followed by increasingly atypical functional synchronization48. In summary, this “overgrowth-to-slowed-development” trajectory likely reflects neurodevelopmental dysregulation across distinct stages in ASD.

In the current study, we defined the top 10% of DEV as atypical brain regions and consistently found these regions in the superior temporal sulcus (STS), cingulate gyrus (CG), insula, and superior parietal lobule (SPL) across different age groups. This finding is consistent with previous research reporting alterations in GMV in ASD, primarily in the insula, CG, cerebellum, SPL, and STS12,49,50. These results highlight the recurrent structural abnormalities in key brain regions associated with ASD. Specifically, the insula has been consistently implicated in ASD, particularly in relation to emotional processing and empathy51. Dysfunction in this region may contribute to the social cognition difficulties often observed in individuals with ASD52. Similarly, the STS plays a critical role in language, sensory integration, and the processing of social information, with deficits in this area linked to challenges in social communication53. Furthermore, the CG, which is involved in cognitive control and social cognition, exhibits underactivation in ASD, potentially explaining impairments in executive functioning and emotional regulation54. Despite these recurring findings, inconsistencies in the literature remain. For example, a previous study reported reduced GMV in the STS among individuals with ASD aged 8–50 years compared to TDCs55, whereas another study found increased GMV among younger individuals with ASD aged 0–12 years12. These conflicting results can be attributed to the developmental trajectory of ASD, characterized by early brain overgrowth followed by delayed maturation or even reductions in GMV later in life, which aligns with our findings of an initial increase in GMV deviations during early adolescence, followed by a reduction in later stages. A recent study also revealed that the GMV of atypical brain regions in the 6–12 age group was significantly higher than in the 12–18 and 18+ groups9.
Moreover, differences in sample characteristics, including the age range of participants and the severity of ASD, may further contribute to these inconsistencies. Understanding these dynamic developmental patterns is critical for identifying age-specific neural biomarkers of ASD and could provide valuable insights into the timing and nature of potential interventions.

We further identified that a significant proportion (28.26%) of these atypical developmental regions were localized within the SN. The SN plays a crucial role in both social and non-social functions, including emotion, motivation, empathy, perception, risk assessment, decision-making, and sensorimotor integration56,57. The SN includes the insula and dorsolateral cingulate cortex, and the insula plays a central role within the SN, acting as a gatekeeper of executive control58. Recently, a hypothesis of SN dysfunction in ASD has been proposed59. This theory, supported by neuroimaging studies51,60, posits that the insula, which processes the subjective evaluation and emotional salience of sensory experience, functions as a key regulator of interactions between the executive control network (ECN) and the default mode network (DMN). Our findings provide further evidence supporting this hypothesis, as the top three atypical networks we identified were the SN, DMN, and frontoparietal network (FPN). Based on extensive research, these three networks (DMN, FPN, and SN) are often referred to as canonical networks61. The atypical functional organization within these networks and their dynamic cross-network interactions may underlie a wide range of psychiatric symptoms, as described in the “triple-network model of psychopathology”62,63. The SN, in particular, functions as a dynamic switch between the inward-focused, self-referential processes mediated by the DMN and the task-related attention directed at outside stimuli maintained by the FPN64. Meta-analytic results from Neurosynth further reinforce these findings, linking the atypical regions in ASD to terms related to “cognitive-control” and “response-inhibition” processes. When the identified atypical brain regions were mapped onto the cortical surface, they displayed a gradient-like distribution.
This gradient projection suggests that the disruption of cortical organization in ASD is not random but follows a specific hierarchical pattern. This organization provides further insights into how disruptions in cortical structure may underpin the heterogeneity of ASD symptoms.

The results of the current study reveal a significant relationship between the DEV of GMV and the functional network architecture in individuals with ASD. By constructing group-averaged FC matrices and examining the correlation between the DEV of each node and its directly connected neighbors, we observed significant spatial correlations across all age groups (Supplementary Fig. S4). These findings indicate that the atypical development in the brains of individuals with ASD is not randomly distributed; rather, it is constrained by FC. This conclusion aligns with previous research. For instance, prior work has shown that cortical thinning during childhood and adolescence is primarily observed in lateral frontal and parietal nodes, with structural constraints shaped by white matter network architecture29. Similarly, one investigation found that regional deformation is correlated with the deformation of structurally and functionally connected neighbors65. Their follow-up study further demonstrated that regional atrophy is correlated with the deformation of structurally and functionally connected neighbors66. Collectively, these findings underscore the critical role of brain network organization in neurodevelopmental disorders, including ASD67,68. Moreover, atypical FC has been linked to core symptoms of ASD, such as social interaction and communication impairments69. Atypical structural patterns are also associated with clinical dysfunctions, particularly social difficulties70. Importantly, the interplay between atypical structure and FC often manifests as both weakened connectivity and compensatory increases in specific regions. A multimodal meta-analysis found that decreased spontaneous functional activity in the left insula is associated with increased GMV12.
Reductions in gray or white matter integrity in depressed patients often correspond to increased neural activity or connectivity compared to healthy controls71, highlighting the broader relevance of structure-function interactions. To confirm the robustness of our findings, we employed a spin-based null model, which indicated that the observed correlations are not driven by mere spatial proximity but reflect genuine functional network dependencies30. These results suggest that the spatial distribution of atypical structure in ASD is constrained by functional networks, providing further insights into the neurodevelopmental mechanisms underlying ASD.

Through the application of the NDM, we provide evidence for the propagation of brain structural deviations in individuals with ASD across different developmental stages. The NDM results show that DEV in each age group, after processing, can effectively predict the DEV in the next age group, supporting the dynamic transmission of atypical brain development over time72. This age-related dynamic is supported by a wealth of research indicating that atypical brain features in ASD are developmentally dependent. For example, a longitudinal study reported a decrease in structural connectivity within the frontoparietal network in ASD during adolescence and early adulthood, and connectivity in this subnetwork could predict reduced symptoms73. Similarly, prior research highlighted significant functional and structural differences among children, adolescents, and adults with ASD9. These structural and functional changes likely contribute to ASD-related atypical behaviors, particularly imbalances in excitation and inhibition, which predominantly impact regions involved in attention and social processing74. To further validate these findings, we examined diffusion maps across different age groups and observed that the propagation of DEV follows a symmetric pattern and occurs more frequently within the SN. This suggests that deviations within the SN are more likely to propagate within the network and influence adjacent brain areas. These alterations may further impact attention and social behaviors. Our findings provide strong support for the hypothesis of SN dysfunction51,59,75. Previous studies have reported that individuals with ASD exhibit atypical activation patterns in SN-related brain regions. For instance, children with ASD showed significantly greater activation in the right anterior insula compared to TDCs during both social and non-social attention tasks76.
This finding not only offers new evidence for understanding atypical neurodevelopment in ASD but also lays a foundation for further exploration of their associations with behavioral and clinical features. These insights are crucial for guiding the early identification of ASD and highlight the need for age-specific intervention strategies.

In the current study, we assessed the stability and robustness of the KL divergence computation model by introducing three widely used alternative measures of distributional distance: ED, TVD, and MMD. These metrics were chosen for their complementary theoretical properties and their relevance to the statistical comparison of probability distributions. Specifically, ED captures sample-wise differences based on pairwise Euclidean distances77, TVD quantifies the maximum difference in probability mass between two distributions78, and MMD assesses discrepancies between distributions in a reproducing kernel Hilbert space using a kernel-based framework79. By calculating the spatial correlation patterns between the three alternative metrics and KL divergence, we observed a high degree of spatial and statistical concordance in the overall atypicality maps. This strongly supports the robustness of KL divergence in detecting meaningful group-level distributional deviations and reinforces its reliability in identifying atypical morphological patterns in ASD, independent of the specific choice of distance metric.
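Two of these alternative metrics can be written compactly, as sketched below. The code is illustrative and not the study's implementation: TVD is computed on discrete (binned) distributions as half the L1 distance, which equals the largest difference in probability mass over any set of bins, and MMD is shown as the biased squared estimate with a Gaussian kernel whose bandwidth is an arbitrary assumption. ED is not sketched here.

```python
import math


def total_variation_distance(p, q):
    """TVD between two discrete distributions on the same bins (range 0 to 1)."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between samples x and y (Gaussian kernel).

    Assumption: the bandwidth is fixed by hand; in practice it is often set
    from the data, for example with a median-distance heuristic.
    """
    def kernel(a, b):
        return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))

    kxx = sum(kernel(a, b) for a in x for b in x) / len(x) ** 2
    kyy = sum(kernel(a, b) for a in y for b in y) / len(y) ** 2
    kxy = sum(kernel(a, b) for a in x for b in y) / (len(x) * len(y))
    return kxx + kyy - 2.0 * kxy
```

Both functions return zero for identical inputs and grow as the two groups separate, so a regional map built from either metric can be correlated with the KL-based map to check whether the same regions stand out.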

To further assess reproducibility, we replicated the primary analyses using an independent dataset from the NYU site and a separate subsample consisting of male participants. The replication dataset demonstrated high correlation and consistency with the original dataset, not only confirming the reliability of the KL divergence model but also enhancing the generalizability of the results across different samples and subgroups.

In addition, we tested whether our findings were sensitive to the choice of brain parcellation scheme by re-running the primary analyses using alternative cortical parcellations of 200 and 400 regions (see Supplementary Fig. S7). The spatial distribution of atypical regions remained consistent across parcellations, with major effects consistently localized to the insula, cingulate gyrus, and temporal cortex. This parcellation-invariant result further underscores the stability of the observed atypicality patterns.

Collectively, these converging lines of evidence, across multiple distance metrics, independent replication datasets, and harmonization-free analyses, highlight the methodological stability, robustness, and reproducibility of the KL divergence framework. The model’s consistent ability to identify atypical brain regions across varied conditions supports its utility in characterizing spatiotemporal trajectories of brain development in ASD. Furthermore, this suggests that the use of KL divergence can reliably capture structural brain changes in ASD, offering a promising tool for understanding the neurobiological mechanisms and for potential therapeutic applications.

Limitations

Despite the valuable insights provided by this study, several limitations should be acknowledged. First, although our diffusion modeling approach revealed that atypical structural development may be constrained by functional network architecture, the analysis was limited to FC. This choice was based on both the spatial correspondence of maturational deviations with known functional gradients and the limited availability of high-quality diffusion MRI data in the ABIDE cohort, which prevented robust reconstruction of structural connectivity. Future work integrating SC-based network diffusion models will be essential for validating and extending the current results. Second, the multi-site nature of the ABIDE dataset introduces considerable heterogeneity in data acquisition protocols, scanner types, and demographic distributions across sites and age groups. Although we used ComBat harmonization to reduce site-related variance and included biological covariates to preserve meaningful effects, we acknowledge that residual confounding factors may still exist, especially in underrepresented age groups from certain sites. Finally, the study primarily focuses on structural brain development and functional connectivity, and future research should explore how other neurobiological factors, such as genetic or environmental influences, contribute to atypical brain development in ASD.

Conclusion

The current study provides further insights into the dynamic developmental trajectories of atypical brain development in individuals with autism spectrum disorder (ASD) by assessing the distribution deviations (DEV) of gray matter volume (GMV) using Kullback-Leibler divergence and expected value. The results revealed a nonlinear pattern of brain development in ASD, suggesting that early intervention during the critical window of early adolescence may be essential to mitigating the long-term effects of these deviations. Furthermore, the involvement of specific functional networks such as the salience network (SN) provides potential targets for network-based interventions aimed at improving cognitive and behavioral outcomes. Finally, the predictive power of network diffusion modeling (NDM) underscores that atypical brain development is constrained by functional networks, indicating that structural deviations in ASD are not randomly distributed but are shaped by the spatial dependencies of functional connectivity (FC) architecture. This finding not only deepens our understanding of the mechanisms underlying atypical development in ASD but also provides important insights for age-specific intervention strategies.

Materials and methods

Participants

All analyses in this study were based on publicly available MRI datasets from the Autism Brain Imaging Data Exchange initiative (ABIDE-I and ABIDE-II). All data were collected with informed consent in accordance with procedures approved by institutional review boards for human subject research. Detailed information about diagnostic protocols and ethical statements is available on the ABIDE website (https://fcon_1000.projects.nitrc.org/indi/abide/). In the current study, inclusion required a minimum of 20 subjects per year of age in both the ASD and TDC groups. The selection criteria for individual subjects were as follows: (1) subjects with minimal head motion (maximum translation <2 mm and rotation <2°, with less than 50% of frames showing high mean framewise displacement); (2) subjects with complete cortical coverage during scanning; and (3) subjects who underwent both resting-state functional magnetic resonance imaging (rs-fMRI) and T1-weighted structural brain imaging. Based on these criteria, we included participants with ASD (N = 301, aged 12.82 ± 2.70 years; 49 females) and TDC (N = 375, aged 12.45 ± 2.73 years; 63 females) from 18 sites. Further demographic details are provided in Table 1. A detailed breakdown of participant numbers stratified by site and age (8–18 years) is provided in Supplementary Table S4 for both the ASD and TDC groups.

Table 1 Subject demographics

Gray matter volume analysis

All T1-weighted images were visually evaluated for motion artifacts by three experienced raters. Image quality was scored on a 5-point scale (1 = poor, 5 = excellent), and participants with an average score ≤2 were excluded from further analysis. MRI data were processed with the Computational Anatomy Toolbox (CAT12; http://www.neuro.uni-jena.de/cat/) and Statistical Parametric Mapping (SPM12)80, using a standard pipeline: (1) global intensity correction; (2) modulation and segmentation of the images into gray matter, white matter, and cerebrospinal fluid (CSF) after blood vessel correction and skull stripping81; (3) registration to standard space using the diffeomorphic anatomical registration through exponentiated Lie algebra (DARTEL) algorithm; (4) thorough quality control of the structural images, including automated image quality metrics provided by CAT12 (such as noise-to-contrast ratio and inhomogeneity); and (5) co-registration to Montreal Neurological Institute (MNI) space82 and spatial smoothing with a 6-mm FWHM kernel. All procedures were carried out following the guidelines provided in the CAT12 software manual83.

To address inter-site variability inherent in the multi-site ABIDE dataset, we applied ComBat harmonization to the regional GMV estimates. This method corrects for site-specific effects while preserving biologically meaningful variability. Consistent with recommendations from previous studies84, age, sex, and diagnostic group were included as biological covariates in the ComBat model to ensure that developmental and group-related effects were retained following harmonization.

Functional data pre-processing

All resting-state fMRI images were preprocessed using the DPARSF toolkit (http://rfmri.org/DPARSF)85. The preprocessing steps were as follows: (1) removal of the first 10 volumes to allow for magnetic field stabilization; (2) slice timing correction; (3) head motion realignment, with participants showing maximum head movement >2.0 mm translation or >2° rotation excluded; (4) spatial normalization to Montreal Neurological Institute (MNI) space (3 × 3 × 3 mm³ resolution); (5) spatial smoothing using an isotropic Gaussian kernel with a full-width at half-maximum (FWHM) of 8 mm, and detrending; (6) nuisance covariate regression, with covariates comprising the Friston 24 motion parameters86 and the white matter and cerebrospinal fluid (CSF) signals; and (7) removal of linear trends and band-pass filtering (0.01–0.08 Hz).

Functional connectivity data analysis

The fMRI volumes were registered to the MNI template and partitioned into 246 regions of interest (ROIs) based on the Brainnetome Atlas87. FC between regions was then quantified by computing Pearson correlation coefficients between the resting-state time series of all possible pairs of the 246 ROIs. This process generated a 246 × 246 symmetric FC matrix for each participant, in which each element represented the strength of functional coupling between two regions. To focus on the strongest and most robust connections, the FC matrix was thresholded. Specifically, for each sparsity value over the full range from 0 (no node pairs connected) to 1 (all node pairs connected), elements above the corresponding threshold were set to 1 and those below it were set to 0, yielding a binary matrix88. For example, densities of 10%, 20%, 30%, 40%, and 50% corresponded to retaining the top 10%, 20%, 30%, 40%, and 50% of connections in the original FC matrix. This thresholding method has been extensively utilized in previous research on functional brain networks88,89. In the current study, we report primary results at a sparsity level of 10%, with additional findings available in the supplementary materials (Supplementary Table S5).
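The density-based thresholding described above can be illustrated with a minimal sketch. It assumes that connections are ranked by their raw Pearson coefficient and that random time series stand in for real ROI signals; the function name `binarize_by_density` is hypothetical and is not taken from the study's code.

```python
import numpy as np

def binarize_by_density(fc, density):
    """Keep the strongest `density` fraction of off-diagonal connections."""
    n = fc.shape[0]
    iu = np.triu_indices(n, k=1)            # upper triangle, diagonal excluded
    weights = fc[iu]
    n_keep = int(round(density * weights.size))
    binary = np.zeros((n, n), dtype=int)
    if n_keep == 0:
        return binary
    keep = np.argsort(weights)[::-1][:n_keep]   # indices of the strongest edges
    binary[iu[0][keep], iu[1][keep]] = 1
    return binary + binary.T                    # restore symmetry

# Toy data only: random signals in place of preprocessed ROI time series.
rng = np.random.default_rng(0)
timeseries = rng.standard_normal((150, 246))    # time points x ROIs
fc = np.corrcoef(timeseries, rowvar=False)      # 246 x 246 Pearson FC matrix
adj = binarize_by_density(fc, density=0.10)     # 10% sparsity level
```

Ranking on the signed coefficient keeps only the strongest positive couplings; ranking on absolute values would be a different, equally simple choice.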

Distribution deviation estimation based on Kullback-Leibler divergence

Traditional ROI-based statistical methods primarily rely on averaging or summing voxel-level values, which overlooks the underlying morphological distribution patterns across different cortical regions90. To overcome these limitations, Kong and colleagues proposed an approach to estimating interregional morphological connectivity using KL divergence41. Wang and colleagues further validated the effectiveness of this method in constructing morphological similarity networks42. However, a key limitation of KL divergence is its lack of directionality: it yields only non-negative values and therefore cannot capture the direction of morphological deviation. To address this issue, we proposed a DEV method that integrates expected value computations (see Supplementary Fig. S9) with KL divergence to quantify developmental changes in brain morphology. Specifically, by computing the expected values of the two probability density functions (PDFs) and taking their difference, we obtained an indicator that reflects the direction of deviation relative to TDCs. This enabled us to detect whether brain development in individuals with ASD deviates positively or negatively from that of TDCs.

The methodological workflow is shown in Fig. 1. Participants were first divided into age groups, with the first group spanning 8–10 years. Each age group covered a 2-year range and overlapped with the adjacent group by 1 year, resulting in a total of 9 age groups ranging from 8 to 18 years (Fig. 1A). Within each age group, for each participant, we first extracted the GMV values of all voxels within each ROI, as defined by the brain parcellation scheme. Subsequently, the group mean GMV map was computed for the TDCs, and the PDF of these values was estimated using kernel density estimation with bandwidths chosen automatically91 (Fig. 1B). Next, we calculated the expected value of the PDF in both the ASD and TDC groups. Using these distributions, we computed, for each ROI, the KL divergence between the ASD and TDC PDFs together with their expected values. KL divergence is a concept in probability theory that quantifies how one probability distribution diverges from a second, reference distribution43. From an information theory perspective, it can be interpreted as the amount of information lost when approximating one distribution with another42. Formally, the KL divergence from distribution Q to P is defined as:

$${D}_{{KL}}\left(P\parallel Q\right)={\sum }_{i=1}^{n}P\left(i\right)\log \frac{P\left(i\right)}{Q\left(i\right)},$$

where P and Q are two PDFs and n = 27 is the number of sample points, following previous work42. For a given pair of PDFs P and Q, we estimate \({D}_{{KL}}\left(P,Q\right)\), the KL divergence between them. It is important to note that \({D}_{{KL}}\left(P\parallel Q\right)\) is not symmetric, meaning that \({D}_{{KL}}\left(P\parallel Q\right)\ne {D}_{{KL}}\left(Q\parallel P\right)\). To address this asymmetry, we use a commonly used symmetric version of the metric, in line with previous work43,92, computed as follows:

$${D}_{{KL}}\left(P,Q\right)={\sum }_{i=1}^{n}\left(P\left(i\right)\log \frac{P\left(i\right)}{Q\left(i\right)}+Q\left(i\right)\log \frac{Q\left(i\right)}{P\left(i\right)}\right),$$

Finally, because the KL divergence is a non-directional and strictly non-negative measure, it cannot capture the direction of deviations. To address this limitation, we introduced the expected value to compute the DEV. The DEV is computed as:

$${DEV}\left(i\right)={sign}\left({E}_{{Ai}}-{E}_{{Ti}}\right)\cdot {D}_{{KL}}\left(i\right),$$

where i indexes brain regions, \({E}_{{Ai}}\) and \({E}_{{Ti}}\) denote the expected values of the PDF for brain region i in the ASD and TDC groups, respectively (detailed calculations are provided in Supplementary Fig. S9), \({sign}\left({E}_{{Ai}}-{E}_{{Ti}}\right)\) is the sign of \(\left({E}_{{Ai}}-{E}_{{Ti}}\right)\), and \({D}_{{KL}}\left(i\right)\) is the KL divergence between the ASD and TDC groups for brain region i. DEV reflects the GMV deviation of ASD relative to TDCs, where negative values indicate \({AS}{D}_{{GMV}} < {TDC}{s}_{{GMV}}\), and positive values indicate \({AS}{D}_{{GMV}} > {TDC}{s}_{{GMV}}\).
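To make the DEV definition concrete, the following is a minimal sketch for a single ROI, not the study's implementation. It assumes that the two PDFs are estimated with SciPy's `gaussian_kde` (whose default bandwidth rule stands in for the automatic bandwidth selection cited above), that both are sampled on a shared grid of n points, and that synthetic values replace real voxel-wise GMV; the function name `signed_deviation` is hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde

def signed_deviation(asd_values, tdc_values, n_points=27, eps=1e-12):
    """DEV for one ROI: sign(E_A - E_T) times the symmetric KL divergence."""
    asd_values = np.asarray(asd_values, dtype=float)
    tdc_values = np.asarray(tdc_values, dtype=float)
    # Shared grid of sample points covering both sets of values.
    lo = min(asd_values.min(), tdc_values.min())
    hi = max(asd_values.max(), tdc_values.max())
    grid = np.linspace(lo, hi, n_points)
    # Kernel density estimates, discretised on the grid and normalised to
    # sum to 1; eps avoids log(0) and division by zero.
    p = gaussian_kde(asd_values)(grid) + eps
    q = gaussian_kde(tdc_values)(grid) + eps
    p, q = p / p.sum(), q / q.sum()
    d_kl = np.sum(p * np.log(p / q) + q * np.log(q / p))   # symmetric KL
    e_asd, e_tdc = np.sum(grid * p), np.sum(grid * q)      # expected values
    return np.sign(e_asd - e_tdc) * d_kl

# Synthetic example: the "ASD" values are shifted towards lower GMV.
rng = np.random.default_rng(0)
asd_roi = rng.normal(0.45, 0.05, size=500)
tdc_roi = rng.normal(0.50, 0.05, size=500)
dev = signed_deviation(asd_roi, tdc_roi)   # negative: ASD below TDC
```

Swapping the two inputs leaves the symmetric divergence unchanged and flips only the sign, which is the directional information the expected values add.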

Based on the aforementioned computations, we obtained a 9 × 246 matrix of DEV values, which was reshaped into a single column and normalized (Fig. 1B). ROIs with DEV values in the top 10% were identified as atypical regions93. To assess the robustness of this threshold, we also tested alternative cutoffs at 5% and 15%; results were qualitatively consistent across thresholds. Finally, these atypical ROIs were mapped back to their respective age groups.

In the current study, to rigorously evaluate the reliability of the KL divergence computation model, we implemented three established probability distribution distance metrics for comparative validation: Energy Distance (ED)77, Total Variation Distance (TVD)94, and Maximum Mean Discrepancy (MMD)95 (see Supplementary Divergence Measures: ED, TVD, and MMD). Specifically, we calculated the spatial correlation (Pearson correlation) between the frequency matrix obtained with each alternative metric and that obtained with KL divergence. Here, the frequency matrix was defined as a vector representing the number of developmental time windows in which each brain region was identified as atypical.
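The flagging and frequency-map comparison can be sketched as follows. This is an illustration under stated assumptions: normalization is taken to be a z-score over all entries, the top 10% is taken on the signed normalized value (ranking absolute values would be another plausible reading), and random matrices stand in for the DEV values of two metrics. The function name `frequency_map` is hypothetical.

```python
import numpy as np

def frequency_map(dev, top_fraction=0.10):
    """Count, per ROI, the age windows in which its DEV is in the top fraction."""
    z = (dev - dev.mean()) / dev.std()           # normalise over all entries
    cutoff = np.quantile(z, 1.0 - top_fraction)
    atypical = z >= cutoff                       # age windows x ROIs, boolean
    return atypical.sum(axis=0)                  # one count per ROI

# Placeholder DEV matrices (9 age windows x 246 ROIs) for two metrics.
rng = np.random.default_rng(1)
dev_kl = rng.standard_normal((9, 246))
dev_alt = dev_kl + 0.1 * rng.standard_normal((9, 246))

freq_kl = frequency_map(dev_kl)
freq_alt = frequency_map(dev_alt)
r = np.corrcoef(freq_kl, freq_alt)[0, 1]         # spatial (Pearson) correlation
```

Thresholding the flattened matrix and thresholding the 9 × 246 array directly give the same flags, so the reshape step is omitted here.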

Association between distribution deviation and the connectome

To test whether the DEV of a node was related to that of its direct connectome, which is derived from FC, we first employed the following model to evaluate the cross-node relationship between the DEV of a given node and that of its directly connected neighboring nodes29:

$${\hat{T}}_{i}=\frac{1}{{N}_{i}}{\sum }_{j\ne i,j=1}^{{N}_{i}}{T}_{j},$$

where \({\hat{T}}_{i}\) represents the estimated DEV of node i based on its directly connected neighbors, \({T}_{j}\) represents the DEV of the jth neighbor, and \({N}_{i}\) is the number of directly connected neighbors of node \(i\).

Specifically, we used group-level binary FC networks to identify the FC neighbors of each cortical node within each age group. Then, we calculated the spatial correlation between DEV (nodal t-value) and its estimated value (\({\hat{T}}_{i}\)). The correlation coefficient was then used to quantify the association between the FC edges and the DEV.
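The neighbor-based estimate and its spatial correlation with the observed DEV can be sketched in a few lines. The example below uses a four-node chain as a toy network in place of the group-level binary FC matrix; the function name `neighbour_estimate` is hypothetical.

```python
import numpy as np

def neighbour_estimate(dev, adjacency):
    """Mean DEV of each node's directly connected neighbours."""
    adjacency = np.array(adjacency, dtype=float)   # copy; input stays untouched
    np.fill_diagonal(adjacency, 0.0)               # exclude self-connections
    degree = adjacency.sum(axis=1)                 # number of neighbours N_i
    estimate = np.full(len(dev), np.nan)           # NaN for isolated nodes
    connected = degree > 0
    estimate[connected] = adjacency[connected] @ dev / degree[connected]
    return estimate

# Toy network: a chain 0-1-2-3 with one DEV value per node.
dev = np.array([1.0, 2.0, 3.0, 4.0])
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
estimate = neighbour_estimate(dev, adjacency)

# Spatial correlation between observed and neighbour-estimated DEV.
valid = ~np.isnan(estimate)
r = np.corrcoef(dev[valid], estimate[valid])[0, 1]
```

Isolated nodes have no neighbors to average over, so the sketch leaves them undefined and drops them before computing the correlation.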

Null models

We evaluated a null model using a spatial permutation test (“spin test”) to determine whether the observed correlation was specific to the actual DEV pattern rather than being influenced by the spatial autocorrelation of DEV. A spin test is a spatial permutation method based on the angular permutations of spherical projections at the cortical surface29. Specifically, we used the spherical projection of the average surface to define spatial coordinates for each parcel. These parcel coordinates were then randomly rotated, and the original parcels were reassigned the value of the closest rotated parcel96. This process was repeated 1000 times to generate surrogate brain maps. The p value was determined as the proportion of correlations in the null models that were greater than the observed correlation.
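A simplified sketch of this permutation scheme is given below. It assumes a single sphere (published implementations typically rotate the two hemispheres separately), random points in place of real parcel centroids, and rotations drawn via QR decomposition of a Gaussian matrix; the function names `random_rotation` and `spin_test` are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def random_rotation(rng):
    """Draw a random 3-D rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))          # fix column signs for a uniform draw
    if np.linalg.det(q) < 0:             # enforce a proper rotation (det = +1)
        q[:, 0] *= -1
    return q

def spin_test(x, y, coords, n_perm=1000, seed=0):
    """Return the spin-test p value and the null correlations for corr(x, y)."""
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(x, y)[0, 1]
    null = np.empty(n_perm)
    for k in range(n_perm):
        rotated = coords @ random_rotation(rng).T
        # Each original parcel takes the value of its closest rotated parcel.
        _, nearest = cKDTree(rotated).query(coords)
        null[k] = np.corrcoef(x[nearest], y)[0, 1]
    return np.mean(null > observed), null

# Toy data: 100 random points on the unit sphere with a smooth map x.
rng = np.random.default_rng(0)
coords = rng.standard_normal((100, 3))
coords /= np.linalg.norm(coords, axis=1, keepdims=True)
x = coords[:, 2] + 0.1 * rng.standard_normal(100)
y = x + 0.5 * rng.standard_normal(100)
p_spin, null = spin_test(x, y, coords, n_perm=200, seed=1)
```

Because rotation preserves the spatial autocorrelation of x, the null correlations reflect what smoothness alone would produce.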

Network-based diffusion model

We used the NDM to investigate whether the DEV of atypical developmental brain regions spreads throughout the brain via a diffusion process. NDM simulates the dynamic propagation of pathology across nodes in a weighted network through a diffusion process. The diffusion process is defined by the following formula97:

$$f\left(t\right)={e}^{-{aHt}}{f}_{0},$$

Here, t represents the diffusion time in the model, in arbitrary units, and f(t) is the vector of diffusion amounts across regions at time t. The strength of the diffusion process is controlled by the constant a, H is the Laplacian operator of the weighted connectivity matrix (FC in the current study), and f0 represents the initial distribution of the pathology98. We initialized the model repeatedly, using each atypical brain region in each age group as a starting seed and setting the initial state of the seed region to 1 and that of all other regions to 0. For each initialization, with a constant a = 1, the NDM was used to estimate the diffusion to all other regions over time from \(t=0\) to 100. Finally, an optimal parameter was obtained to fit the DEV values for the next age group (see Fig. 1C). The NDM provides a more direct test of the spreading hypothesis by simulating a passive diffusion process to model how DEV alterations propagate from specific seed regions. Consequently, the NDM evaluates both the mechanism of spread and the probable source, or epicenter, from which the spread originates. More information about the NDM can be found in the supplementary materials (see Supplementary Development of a Network Diffusion Model and Supplementary Fig. S10).
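The diffusion equation can be evaluated directly through an eigendecomposition of H. The sketch below makes several assumptions that the text does not specify: H is taken to be the degree-normalized Laplacian, the fitted parameter is taken to be the diffusion time, and the fit is scored by the Pearson correlation with the target DEV map. A random symmetric matrix stands in for the connectivity matrix, and all function names are hypothetical.

```python
import numpy as np

def normalised_laplacian(weights):
    """H = I - D^{-1/2} W D^{-1/2} for a symmetric, non-negative matrix W."""
    degree = weights.sum(axis=1)
    inv_sqrt = np.zeros_like(degree, dtype=float)
    inv_sqrt[degree > 0] = 1.0 / np.sqrt(degree[degree > 0])
    return np.eye(len(degree)) - inv_sqrt[:, None] * weights * inv_sqrt[None, :]

def simulate_diffusion(weights, seed, times, a=1.0):
    """Rows are f(t) = exp(-a H t) f0, with f0 equal to 1 at the seed region."""
    eigval, eigvec = np.linalg.eigh(normalised_laplacian(weights))
    f0 = np.zeros(weights.shape[0])
    f0[seed] = 1.0
    coeff = eigvec.T @ f0                            # f0 in the eigenbasis of H
    decay = np.exp(-a * np.outer(times, eigval))     # time points x eigenmodes
    return (decay * coeff) @ eigvec.T

def best_fit_time(weights, seed, target, times, a=1.0):
    """Diffusion time whose simulated pattern correlates best with `target`."""
    f = simulate_diffusion(weights, seed, times, a)
    r = np.array([np.corrcoef(ft, target)[0, 1] for ft in f])
    best = int(np.nanargmax(r))
    return times[best], r[best]

# Toy network: 12 regions with random symmetric weights (not real FC).
rng = np.random.default_rng(0)
w = rng.random((12, 12))
w = (w + w.T) / 2
np.fill_diagonal(w, 0.0)
times = np.linspace(0, 100, 201)
f = simulate_diffusion(w, seed=3, times=times)
target = f[4]                                # pattern at t = 2, used as the target
t_best, r_best = best_fit_time(w, 3, target, times)
```

In an analysis of the kind described, `target` would be the DEV map of the next age group and the search would be repeated for every seed region; comparing the best correlations across seeds then points to likely epicenters.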

Meta-analytic decoding of the atypical developmental brain regions

In the current study, we used Neurosynth to decode the potential cognitive processes associated with atypical developmental brain regions across the developmental period. First, we generated a composite map by overlapping the brain regions identified for each age group and input it into the Neurosynth system. Next, we ranked all correlation coefficients for each state in descending order and identified the terms most strongly associated with each state99.

Gradient analysis

To investigate the functional distribution patterns of atypical developmental brain regions, we performed a gradient analysis. Specifically, we constructed a matrix capturing the frequency with which each brain region was identified as atypical across developmental time windows. We then mapped these regions into the functional gradient space, using a standardized cortical gradient provided by Margulies et al.100, as implemented in the neuromaps toolbox101, which allows standardized gradient mapping across brain parcellations. This reference gradient reflects the principal axis of functional organization across the cortex, enabling spatial interpretation of atypicality patterns in relation to large-scale cortical hierarchies. We retained the first two gradient dimensions for visualization and interpretation, as they captured the most prominent axes of functional differentiation.

Statistics and reproducibility

All variables were carefully examined for outliers, data entry errors, and missing values. For demographic data, group differences in age were analyzed using an independent two-sample t-test, while chi-square tests were applied to assess differences in sex and handedness. Pearson correlation coefficients were calculated to examine the relationships between DEV and FC edges. Statistical significance was set at p < 0.05, and all tests were two-tailed unless otherwise specified.

To validate the robustness and reproducibility of our findings, we conducted several complementary analyses. First, we repeated all primary analyses using alternative brain parcellation schemes consisting of 200 and 400 cortical regions, and confirmed the consistency of results. Second, to assess the generalizability of our results, we repeated the primary analyses in two subsamples: (1) a single-site cohort from New York University (NYU), which offers one of the largest and most balanced datasets in ABIDE (ASD = 55, TDC = 62; full demographic details for the NYU cohort are provided in Supplementary Table S6), and (2) a male-only subsample to control for potential sex-related effects. Third, to quantify the reproducibility of the spatial patterns of atypical development, we focused on the frequency maps of atypical regions across developmental time windows. For each group, we computed a 246 × 1 vector representing the frequency with which each brain region was identified as atypical. We then calculated the Pearson correlation coefficient (r) between the frequency maps derived from each subsample and those from the full dataset.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.