Introduction

Autism spectrum disorder (ASD) is a heterogeneous neurodevelopmental disorder characterized by two core symptoms: difficulties in social communication and interaction, and restricted and repetitive behaviors (Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, DSM-5), with an estimated prevalence of around 2.76% in the U.S. [1] and around 1% in China [2]. Currently there is no approved pharmacological intervention for the core symptoms of ASD, particularly social dysfunction [3]. While the hypothalamic neuropeptide oxytocin (OXT) has shown great translational promise in improving social function in animal models of autism [4] and following acute dosing in both typically developing and autistic individuals [5, 6], findings from a large number of clinical trials over the past decade using chronic intranasal treatment protocols have been inconsistent. Thus, while positive effects of repeated daily doses of intranasal OXT have been observed on social responsivity [7,8,9,10,11,12,13] and/or on repetitive or adaptive behaviors [9, 14, 15], other studies found no significant overall effects of OXT [16,17,18,19,20].

A number of factors may contribute to inconsistent findings in clinical trials using chronic intranasal OXT. To date there is some evidence for dependency on dose magnitude [12, 15], dose frequency and treatment duration [9]. In terms of demographic and physiological measures, one study on children reported stronger effects in younger children [20] and others have reported associations between improvement on the social responsivity scale and basal and/or treatment-dependent changes in plasma OXT concentrations [9, 10]. Treatment context may also play a role, with several trials reporting positive effects when OXT is given prior to positive social interactions [9] or psychosocial training [8]. However, no study has adopted a data-driven approach to assess whether there may be an autism subtype in which individuals are more likely to respond to chronic OXT treatment, on the basis of multiple clinical, physiological and behavioral measures taken prior to treatment.

Recently, a number of studies have used phenotypic/genetic [21, 22], biological [23, 24], neuronal [25] and behavioral assessments [26] to determine ASD subtypes, both to aid earlier and more accurate detection and to inform treatment/intervention selection, providing more evidence for individualized precision medicine [27]. To date these different approaches have yielded a variable number of subtypes and tend to include only one or a few measures. A subtype approach using multiple measures and data-driven clustering analysis may offer a more robust way to identify autism subtypes [28, 29] and hopefully also to help predict which individuals may respond best to specific types of treatment. For example, a recent study used this approach to determine subgroups of individuals with migraine without aura who were more responsive to electroacupuncture treatment [30].

In the current study we therefore firstly aimed to classify autism subtypes using unsupervised data-driven clustering analysis including all of the multiple clinical, physiological, behavioral and eye-tracking measures taken at baseline, to avoid any kind of preselection bias, and secondly to investigate whether individuals identified as exhibiting reliable reductions in symptom severity in our recent intranasal OXT clinical trial [9] were more likely to be from a specific subtype.

Methods

Participants

Data were used from an original published trial [9] including a final sample of 41 children diagnosed with ASD (mean age ± SD = 5.0 ± 1.3 years, 3 girls) who received a 6-week course of intranasal OXT (24 IU every other day; Sichuan Defeng Pharmaceutical Co, Sichuan, China; each dose followed by a 30 min period of positive social interaction) and placebo (PLC; same ingredients but without the peptide, also followed by a 30 min period of positive social interaction; see Fig. 1 for details of protocol and measures). These data have been re-analyzed to determine whether distinct subtypes, differentially responsive to the subsequent 6-week intranasal OXT intervention, could be identified using pre-treatment baseline measures. The original study used a computer-randomized, double-blind, placebo-controlled, cross-over design, but we primarily focused on the data before and after receiving 6 weeks of intranasal OXT every other day followed by positive social interaction. This study was conducted in the Chengdu Maternal and Children’s Central hospital (CMCCH) between June 2019 and July 2021 in accordance with the latest Declaration of Helsinki and was approved by the Ethics Committee of CMCCH affiliated to the University of Electronic Science and Technology of China (number 201983) as well as the general ethics committee of the University (number 1420190601). The trial was pre-registered (Chinese Clinical Trial Registry: ChiCTR1900023774). Written informed consent was provided by the parents or legal guardians of all participants. Inclusion criteria were as follows: (1) age range 3–8 years; (2) diagnosed with ASD according to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) and Autism Diagnostic Observation Schedule-2 (ADOS-2), with scores of 5–10 on the ADOS-2 Comparison scale (i.e. moderate to severe autism) [31]; (3) free of any chromosomal abnormalities, neurological diseases (e.g., epilepsy, Rett syndrome) or other psychiatric disorders; (4) without any severe respiratory, hearing or visual impairments.

Fig. 1: Data inclusion.

The schematic shows the data (a subset of Le et al. [9]) from the cross-over trial that were included in the clustering analyses, from a total of 41 children with ASD. Subjects were required to complete a range of assessments, caregiver-based questionnaires and eye-tracking tasks as well as biological sample collection (saliva and blood) before and after receiving oxytocin (OXT) for 6 weeks. If the treatment order was OXT first, the outcome measures averaged across W1 and W2 were defined as the baseline (pre-treatment) data and those at the end of W8 as the post-OXT data; if the treatment order was PLC first, the measures averaged across W11 and W12 were defined as the baseline (pre-treatment) data and those at the end of W18 as the post-OXT data. Outcome measures taken at each point are shown. Autism Diagnostic Observation Schedule-2 (ADOS-2); Social Responsivity Scale-2 (SRS-2); Adaptive behavior assessment system-II global adaptive composite (ABAS-II GAC); Social communication questionnaire (SCQ); Repetitive behavior scale-revised (RBS-R); caregiver strain questionnaire (CSQ); Eye-tracking, Tasks 1 and 2. Points where blood or saliva samples were taken are also indicated.

Assessment and questionnaires

During the trial [9], severity of autistic symptoms was assessed using the gold-standard objective assessment (ADOS-2 Comparison, total scores and subscale scores including social affect, SA, and restrictive and repetitive behavior, RRB) and the caregiver-based social responsivity scale-2 (SRS-2 total score). These were the primary outcome measures. Additionally, a wide range of secondary questionnaire-based measures of adaptability, social communication and repetitive behavior were taken before and after treatment, including the adaptive behavior assessment system-II, global adaptive composite (ABAS-II GAC) [32], social communication questionnaire (SCQ) [33] and repetitive behavior scale-revised (RBS-R) [34]. Social subtype scores were measured using the Beijing Autism Subtype questionnaire (BASQ [35], only before treatment) and stress in caregivers was measured using the caregiver strain questionnaire (CSQ) [36] before and after treatment. Biological samples, including blood (6 ml, collected into EDTA tubes, only before treatment) and saliva (1 ml using Salivette, both before and after treatment), were collected from each participant for measurement of OXT concentrations using a standard ELISA (ENZO Life Sciences, New York, USA; Kit ADI 900-153A) incorporating an extraction step.

Eye-tracking tasks

In the trial [9] all children were instructed to watch a monitor passively during two eye-tracking tasks, a dynamic social interest task and a static face emotion expression task, neither of which required a response. The tasks were given before, during (3 weeks) and after (6 weeks) the OXT and PLC treatments using different stimulus sets (see [9]). The social interest task displayed dynamic videos of dancing Chinese individuals paired with dynamic geometric patterns, shown simultaneously over 40 s (20 × 2 s videos shown in two separate clips). The face emotion task consisted of 4 children’s faces and 2 adult faces of both genders expressing four different emotions (neutral, happy, angry and fear). Each face was randomly presented for 2 s with a 500 ms interval between faces. In the social interest task, the primary measure was the proportion (%) of total time spent viewing the dynamic social stimuli relative to the dynamic geometric stimuli. In the face emotion task, the primary measures were the proportion of time (%) spent viewing the eye, nose and mouth regions relative to the rest of the face for each of the four face emotions. Only data from the eye and nose regions were included since these were either weakly correlated or uncorrelated with each other, implying that they would contribute differently to the cluster analysis (all |rs| < 0.45, ps from 0.01–0.299).

Data analyses

Characterizing autism subtypes based on basal individual clinical severity and eye-tracking performance

A data-driven k-means clustering algorithm with Manhattan (L1) distance [37] was used to determine autism subtypes. Specifically, demographic information (age, gender), clinical variables (ADOS-2 total and comparison scores, ADOS-2 SA and RRB) [38], assessment questionnaires (ABAS-II GAC, SCQ, RBS-R, CSQ, BASQ-aloof, passive, active but odd), eye-tracking performance (task 1: proportion of time, i.e. % of total duration of gaze at stimuli, viewing dynamic social as opposed to geometric stimuli; task 2: proportion of time, i.e. % gaze duration relative to the whole face, spent viewing the eyes and nose for each of four different emotions: neutral, anger, fear, happy) and endocrine data (basal saliva OXT and basal plasma OXT concentrations) of each of the 41 children with autism were defined as the clustering features (in total 23 features). Notably, previous eye-tracking studies in autistic individuals have consistently demonstrated a prominent trend of spending less time viewing the eyes but also altered viewing of other face regions when scanning faces [39], which may be important for conveying social cues for interpersonal communication [40] and recognition of the underlying emotions [41]. Altered patterns of gaze can also vary across individual emotions in autistic compared to neurotypical individuals [42]. In the current study we included the proportion of time spent viewing the eye and nose regions, which were found to be the main features viewed consistently in the trial [9], as separate items for 4 different face emotions (neutral, angry, fear, happy). We varied the number of clusters (subtypes) from 2 to 10 and repeated the clustering algorithm 20 times with different random initial cluster centroids to help find a lower local minimum of the total sum of distances. 
The optimal number of subtypes was determined by the variance ratio score (VRS), which is calculated as the ratio of the between-cluster variance to the within-cluster variance [30] where a higher VRS indicates better clustering performance. In addition, adjusted cosine similarity can remedy the potential drawback of two vectors with very different attribute values and can check whether there is a high or low similarity between clusters or subtypes [43]. This was calculated for each individual across the determined autism subtypes to validate the k-means clustering results. Additionally, an ablation analysis [44] was conducted, where each single measure was removed systematically from all included variables to verify the contributing role in the identification of autism subtypes via cluster analysis.
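The clustering and model-selection steps described above can be sketched as follows. This is a minimal illustration on toy data rather than the analysis code used in the study. It assumes that cluster centres under L1 distance are updated as component-wise medians and that the VRS takes the Calinski-Harabasz form (between-cluster over within-cluster dispersion, each divided by its degrees of freedom). Neither detail is stated in the text, and the names `kmeans_l1`, `variance_ratio_score` and `toy` are hypothetical.

```python
import random
from statistics import mean, median

def manhattan(a, b):
    """L1 (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def kmeans_l1(points, k, n_restarts=20, max_iter=100, seed=0):
    """Cluster with L1 distance; keep the restart with the lowest total distance."""
    rng = random.Random(seed)
    best_labels, best_cost = None, float("inf")
    for _ in range(n_restarts):
        # Random initial centroids drawn from the data (one draw per restart).
        centroids = [list(p) for p in rng.sample(points, k)]
        labels = None
        for _ in range(max_iter):
            new_labels = [
                min(range(k), key=lambda c: manhattan(p, centroids[c]))
                for p in points
            ]
            if new_labels == labels:
                break  # assignments are stable: converged
            labels = new_labels
            for c in range(k):
                members = [p for p, lab in zip(points, labels) if lab == c]
                if members:  # an emptied cluster keeps its previous centroid
                    # Assumption: component-wise median as the L1 centre.
                    centroids[c] = [median(col) for col in zip(*members)]
        cost = sum(manhattan(p, centroids[lab]) for p, lab in zip(points, labels))
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost

def variance_ratio_score(points, labels):
    """Between-cluster over within-cluster dispersion (Calinski-Harabasz form)."""
    n, clusters = len(points), sorted(set(labels))
    k = len(clusters)
    overall = [mean(col) for col in zip(*points)]
    between = within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centre = [mean(col) for col in zip(*members)]
        between += len(members) * sum((a - b) ** 2 for a, b in zip(centre, overall))
        within += sum(sum((a - b) ** 2 for a, b in zip(p, centre)) for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Toy data: two well-separated groups of five "participants" with two features.
toy = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
       (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels_k2, cost_k2 = kmeans_l1(toy, 2)
scores = {k: variance_ratio_score(toy, kmeans_l1(toy, k)[0]) for k in range(2, 5)}
best_k = max(scores, key=scores.get)  # number of clusters with the highest VRS
```

On real data the features would normally need to be put on a common scale before clustering; whether and how this was done is not described in the text, so the sketch leaves the toy values unscaled.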

Identifying responders during OXT or PLC treatment across autism subtypes

Additionally, changes in ADOS-2 total scores between pre- and post-OXT or PLC treatment were evaluated in individual participants using the Reliable Change Index (RCI) [45], a common approach for assessing the clinical significance (i.e. improvement, no change or deterioration) of an intervention, here the 6-week OXT and PLC interventions followed by positive social interactions. The RCI is calculated by subtracting an individual’s baseline score (i.e. before treatment) from their score after the intervention (i.e. intranasal OXT or PLC), and then dividing the result by the standard error of the difference of the measure. An absolute RCI value of 1.96 or greater indicates a reliably significant change (for the ADOS-2 total score, RCI ≥ +1.96 = “deterioration” and RCI ≤ −1.96 = “improvement”). Where the absolute RCI value is less than 1.96, the change is considered not significant and categorized as “no change” [20, 46]. Individuals demonstrating a significant “improvement” using RCI criteria were considered as OXT+ (RCI-based OXT responders, n = 18) and those classified as exhibiting “no change” were defined as OXT− (RCI-based OXT non-responders, n = 23, response rate = 43.9%, Fig. 3A [9]). There were no individuals classified by the RCI analysis as showing a “deterioration”. For the PLC condition, using the same RCI criteria, 6 (14.6%) individuals showed significant “improvement” and 4 (9.8%) significant “deterioration”.
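The RCI classification can be illustrated with a short sketch. It assumes the widely used Jacobson-Truax form, in which the standard error of the difference is derived from the baseline standard deviation and a test-retest reliability coefficient, and it uses the post-minus-baseline sign convention so that negative values on the ADOS-2 total score indicate improvement. The standard deviation (4.0) and reliability (0.9) below are hypothetical placeholders, not values from the trial.

```python
import math

def reliable_change_index(pre, post, sd_baseline, reliability):
    """RCI = (post - pre) / SE_diff, with SE_diff = sqrt(2) * SD * sqrt(1 - r)."""
    sem = sd_baseline * math.sqrt(1.0 - reliability)  # standard error of measurement
    se_diff = math.sqrt(2.0) * sem                    # standard error of the difference
    return (post - pre) / se_diff

def classify_change(rci, threshold=1.96):
    """Lower ADOS-2 scores mean fewer symptoms, so a negative RCI is an improvement."""
    if abs(rci) < threshold:
        return "no change"
    return "improvement" if rci < 0 else "deterioration"

# Hypothetical scores for three children (illustration only).
SD_BASELINE, RELIABILITY = 4.0, 0.9
examples = {"child_a": (17, 13), "child_b": (17, 16), "child_c": (15, 19)}
outcomes = {
    name: classify_change(reliable_change_index(pre, post, SD_BASELINE, RELIABILITY))
    for name, (pre, post) in examples.items()
}
```

With these placeholder values a 4-point drop is classed as an improvement while a 1-point drop is not, which shows why the RCI is stricter than simply checking whether a score went down.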

To identify OXT and PLC responders across subtypes based on clustering analyses, the responder rate was measured as the ratio of the number of responders to the total number of participants. The responder rate of each subtype was compared with the null distribution of the permutation responder rate (10,000 permutations). In addition, the difference in the responder rate between subtypes was compared using a chi-square test. Given the lower frequency of improvement on the less objective, caregiver-completed SRS-2 following the 6-week OXT treatment based on the RCI (27% for OXT and 0% for PLC) [9], we did not include SRS-2 scores in the further analyses.
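One way to build such a permutation null distribution is sketched below. The text does not describe the shuffling scheme, so this assumes that responder labels are shuffled across all participants while subtype sizes are held fixed, and that a one-sided p value is taken in the direction of the observed effect. The counts (41 participants, 18 responders, 2 of 15 and 16 of 26 responders in the two subtypes) are those reported in the Results; the function name is hypothetical.

```python
import random

def permutation_p_value(n_total, n_responders, subgroup_size, observed_count,
                        tail="upper", n_perm=10_000, seed=0):
    """One-sided permutation p value for the responder count in one subgroup."""
    rng = random.Random(seed)
    flags = [1] * n_responders + [0] * (n_total - n_responders)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(flags)                    # break any link between subtype and response
        count = sum(flags[:subgroup_size])    # responders landing in the subgroup
        if tail == "upper" and count >= observed_count:
            hits += 1
        elif tail == "lower" and count <= observed_count:
            hits += 1
    return (hits + 1) / (n_perm + 1)          # add-one correction avoids p = 0

# Subtype 2: 16 responders out of 26; subtype 1: 2 responders out of 15.
p_subtype2_higher = permutation_p_value(41, 18, 26, 16, tail="upper")
p_subtype1_lower = permutation_p_value(41, 18, 15, 2, tail="lower")
```

Because the total number of responders is fixed, the two tails describe the same event seen from each subtype, so the two p values should be close to each other.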

Evaluation of clustering-based subtypes on responses to OXT or PLC treatment

Based on the clustering analysis, each individual with autism was assigned to either subtype 1 or subtype 2. Paired t-tests with FDR-corrected p values were then performed within each subtype to compare pre- vs. post-treatment scores following the 6-week OXT treatment (each dose followed by a 30 min period of positive social interaction) on all primary and secondary outcome measures. Furthermore, as a control for the specificity of the effects of OXT treatment, we performed a similar analysis for when the same individuals received 6 weeks of PLC treatment followed by 30 min of positive social interaction.
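The two ingredients of this analysis, a paired t statistic per outcome and an FDR adjustment across outcomes, can be sketched as below. The specific FDR procedure is not named in the text, so the Benjamini-Hochberg step-up adjustment is assumed here. The scores and p values are made-up illustration data, and converting t to a p value (which needs the t distribution) is deliberately left out.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(pre, post):
    """t = mean(diff) / (sd(diff) / sqrt(n)) for paired pre/post scores."""
    diffs = [a - b for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):              # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min             # enforce monotonicity
    return adjusted

# Made-up example: four children, one outcome measure.
t_example = paired_t_statistic([10, 12, 14, 16], [9, 10, 13, 13])
# Made-up raw p values for four outcome measures.
p_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

In this toy case the raw p values 0.03 and 0.04 both adjust to about 0.053, illustrating how outcomes that pass an uncorrected 0.05 threshold can fail after correction.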

Results

Two autism subtypes identified by clustering analysis

As shown in Fig. 2A, the VRS reached its maximum value, indicating optimal clustering performance, with two subtypes and decreased monotonically as the number of subtypes increased. We also calculated the inter-subject adjusted cosine similarity [43] across all clustering features, and the result revealed a high degree of similarity within each subtype (Fig. 2B). Paired t-tests showed that similarity within subtypes (mean ± SD = 0.106 ± 0.095) was higher than that between subtypes (mean ± SD = −0.150 ± 0.094; t = 9.30, 95% CI = [0.200, 0.312]).
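The within- versus between-subtype similarity comparison can be sketched as follows. The exact form of the adjustment is not given in the text; this sketch assumes that each feature is centred on its mean across participants before the cosine between two participants' feature vectors is taken. The toy feature matrix and labels are hypothetical.

```python
import math
from statistics import mean

def adjusted_cosine_matrix(features):
    """Cosine similarity between subjects after removing each feature's mean."""
    col_means = [mean(col) for col in zip(*features)]
    centered = [[x - m for x, m in zip(row, col_means)] for row in features]
    norms = [math.sqrt(sum(x * x for x in row)) for row in centered]
    n = len(centered)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(centered[i], centered[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim

def mean_similarity(sim, labels, same_cluster):
    """Average similarity over subject pairs within (or between) clusters."""
    n = len(labels)
    values = [sim[i][j] for i in range(n) for j in range(i + 1, n)
              if (labels[i] == labels[j]) == same_cluster]
    return mean(values)

# Toy data: four subjects, two features, two clusters.
toy_features = [[1, 2], [2, 3], [8, 9], [9, 8]]
toy_labels = [0, 0, 1, 1]
sim = adjusted_cosine_matrix(toy_features)
within = mean_similarity(sim, toy_labels, same_cluster=True)
between = mean_similarity(sim, toy_labels, same_cluster=False)
```

Centring is what allows negative between-cluster similarities such as the value reported above; plain cosine similarity on all-positive scores would stay positive.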

Fig. 2: The two ASD subtypes identified and their corresponding baseline characteristics.

A Selection of the optimal number of subtypes in k-means clustering based on the variance ratio score (VRS). B The inter-subject adjusted cosine similarity across all clustering features. The responder rates of subtype 1 (C) and subtype 2 (D) compared with the null distributions of the permutation responder rate (10,000 permutations). E The significant differences between the two subtypes in the 11 outcome measures including clinical variables, questionnaires and eye-tracking performance. ADOS-SA Autism Diagnostic Observation Schedule-2-Social affect, SCQ Social communication questionnaire, Aloof subscale of Beijing Autism Subtype Questionnaire. *p < 0.05; **p < 0.01; ***p < 0.001; FDR-corrected.

No differences between the two subtypes were observed in terms of age, gender ratio and oxytocin plasma or saliva concentrations (ps > 0.226). Importantly, the number of OXT responders was 2 (13.3%) in subtype 1 (n = 15, 2 girls) and 16 (61.5%) in subtype 2 (n = 26, 1 girl; Fig. 3B, top), rates which were significantly lower (Fig. 2C) and higher (Fig. 2D), respectively, than the permutation responder rate. A chi-square test indicated that a significantly greater proportion of OXT responders was found in subtype 2 (chi-square = 8.975, p = 0.003, effect size phi = 0.468, a large effect size). Analysis using t-tests with FDR correction showed that individuals in subtype 2 (the subtype with the higher OXT responder rate) had significantly different scores on 11 different measures, including greater interest in the eyes and reduced interest in the nose while viewing face emotions and less clinical severity (ADOS-2 SA subscale, SCQ and aloof social subscale scores), as compared with subtype 1 (the subtype with the lower OXT responder rate, see Fig. 2E).
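As a worked check, the chi-square statistic and phi coefficient reported above can be recomputed from the 2 × 2 table of responders and non-responders by subtype. The sketch below assumes a Pearson chi-square test without continuity correction, which is an inference from the reported value rather than something stated in the text, and uses the identity that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(chi2 / 2)).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction), p value and phi for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    phi = math.sqrt(chi2 / n)             # effect size for a 2 x 2 table
    return chi2, p, phi

# Rows: subtype 1, subtype 2. Columns: responders, non-responders (OXT condition).
chi2_oxt, p_oxt, phi_oxt = chi_square_2x2(2, 13, 16, 10)
```

The same function applied to the placebo responder counts reported for the control analysis (1 of 15 and 5 of 26) gives a chi-square of about 1.20.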

Fig. 3: Responders to oxytocin (OXT) identified using the reliable change index (RCI).

A In the original study (Le et al. [9]) without cluster analysis. B In the two identified subtypes of autism (ASD) based on all 23 baseline measures and following ablation analysis only using ADOS-2 (Autism diagnostic observation schedule-2) and eye tracking measures for time spent looking at the eyes and nose of emotional faces. C Box plots showing comparison of responses to OXT for the two autism subtypes showing greater combined improvements in ADOS-2, ABAS-GAC (adaptive behavior assessment system-II, global adaptive composite scale) and RBS (Repetitive behavior scale) scores and time spent looking at dynamic social stimuli and the eyes of angry faces and nose of neutral ones.

For the control analysis using the PLC condition, the proportion of responders in the two subtypes did not differ (subtype 1 = 1/15 (6.7%) and subtype 2 = 5/26 (19.2%); chi-square = 1.20, p = 0.273). Similarly, the proportion of individuals showing deterioration under PLC did not differ between subtypes (subtype 1 = 1/15 (6.7%) and subtype 2 = 3/26 (11.5%); chi-square = 0.256, p = 0.613).

Ablation analysis of variables contributing most to the accuracy of cluster analysis identification of subtypes

The results of the ablation analyses, where each single measure was removed systematically from the analysis, revealed that face eye-tracking performance (angry eyes, angry nose, fear eyes, neutral eyes and neutral nose) and ADOS-2 assessment (SA and RRB subscale scores) contributed most to subtype clustering, since removing any one of them reduced the maximum OXT responder rate (by >4%; see Table 1 for details). The contributions of the proportion of gaze time on the eyes and nose varied across the different emotions, with neutral expression eyes and angry and neutral expression nose contributing most (each −11.4%), angry eyes and fear eyes contributing less (−4.4 and −5.9%) and happy eyes and nose not contributing at all, further supporting the conclusion that the different emotions and face features contributed differentially to subtyping. Notably, if only these seven ablation-based outcomes from ADOS-2 and eye tracking for emotional faces were included as features, the numbers of OXT responders for the two subtypes were 3 (18.8%) for subtype 1 (n = 16) and 15 (60%) for subtype 2 (n = 25) (chi-square = 6.740, p = 0.009, phi = 0.405; Fig. 3B, bottom).
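The leave-one-feature-out logic of the ablation analysis can be sketched as below. This is only an outline of the procedure under stated assumptions: the clustering step is passed in as a function, and the `toy_cluster` stand-in (a simple split on each subject's mean feature value) replaces the k-means step purely to keep the example short. The feature names, data and responder flags are all hypothetical.

```python
from statistics import mean

def max_responder_rate(labels, responders):
    """Highest responder rate found in any cluster."""
    rates = []
    for c in set(labels):
        flags = [r for lab, r in zip(labels, responders) if lab == c]
        rates.append(sum(flags) / len(flags))
    return max(rates)

def ablation(features, names, responders, cluster_fn):
    """Change in the maximum responder rate when each feature is dropped in turn."""
    full = max_responder_rate(cluster_fn(features), responders)
    changes = {}
    for j, name in enumerate(names):
        reduced = [[x for i, x in enumerate(row) if i != j] for row in features]
        changes[name] = max_responder_rate(cluster_fn(reduced), responders) - full
    return changes

def toy_cluster(rows):
    """Hypothetical stand-in for the clustering step: split on the mean row value."""
    row_means = [mean(r) for r in rows]
    cut = mean(row_means)
    return [int(m > cut) for m in row_means]

# Hypothetical data: one informative feature, one uninformative feature.
names = ["eyes_gaze", "noise"]
features = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.1],
            [0.2, 0.2], [0.1, 0.1], [0.15, 0.2]]
responders = [1, 1, 1, 0, 0, 0]
changes = ablation(features, names, responders, toy_cluster)
```

A negative entry in `changes` marks a feature whose removal lowers the best responder rate, which is the criterion the text uses to call a feature a contributor.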

Table 1 Ablation protocol applied to clustering analysis.

Detailed evaluation of responses to OXT and PLC treatment in the two autism subtypes

To confirm that individuals with autism subtype 2 (n = 26) were more sensitive to OXT, we compared all outcome measures before and after the 6-week OXT treatment in each of the two subtypes. Paired t-tests with FDR correction indicated that OXT selectively improved individuals in subtype 2 in terms of autism severity (ADOS-2 total score, Mpre = 16.54 ± 4.23, Mpost = 14.77 ± 4.25, t = 5.29, p = 0.009, Cohen’s d = 0.417; comparison score, Mpre = 6.85 ± 1.12, Mpost = 6.42 ± 0.99, t = 4.28, p = 0.009, Cohen’s d = 0.407; SA, Mpre = 11.62 ± 3.44, Mpost = 10.69 ± 3.37, t = 3.21, p = 0.017, Cohen’s d = 0.273; RRB, Mpre = 4.92 ± 1.79, Mpost = 4.08 ± 2.04, t = 3.20, p = 0.017, Cohen’s d = 0.438) and adaptive and repetitive behaviors (ABAS-II GAC, Mpre = 68.58 ± 15.11, Mpost = 74.46 ± 17.33, t = 2.97, p = 0.049, Cohen’s d = 0.362; RBS, Mpre = 13.81 ± 8.62, Mpost = 11.69 ± 7.08, t = 2.40, p = 0.049, Cohen’s d = 0.269), and increased the time spent viewing social stimuli (Mpre = 44.76 ± 26.22, Mpost = 52.99 ± 26.13, t = 2.46, p = 0.049, Cohen’s d = 0.362), angry eyes (Mpre = 20.35 ± 10.91, Mpost = 29.49 ± 17.35, t = 2.36, p = 0.049, Cohen’s d = 0.631) and neutral nose (Mpre = 17.49 ± 11.92, Mpost = 25.48 ± 15.47, t = 2.47, p = 0.049, Cohen’s d = 0.578). No such effects were found for subtype 1 following 6 weeks of OXT treatment (see Table 2 and Figs. 3C and 4 for details).
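The reported effect sizes can be reproduced from the summary statistics above. The formula is not given in the text; the values appear consistent with Cohen's d computed as the pre-post mean difference divided by the root mean square of the pre and post standard deviations, so that form is assumed in this sketch.

```python
import math

def cohens_d_from_summary(mean_pre, sd_pre, mean_post, sd_post):
    """Mean difference divided by the root mean square of the two SDs (assumed form)."""
    pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2.0)
    return (mean_pre - mean_post) / pooled_sd

# Subtype 2, ADOS-2 total score and comparison score (values from the text).
d_total = cohens_d_from_summary(16.54, 4.23, 14.77, 4.25)
d_comparison = cohens_d_from_summary(6.85, 1.12, 6.42, 0.99)
```

For measures where higher scores indicate improvement, such as ABAS-II GAC, the sign of the result flips, so the absolute value is what corresponds to the reported figure.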

Table 2 The comparisons between before and after 6-week OXT treatment in all outcomes.
Fig. 4: Box plots showing the evaluation of pre vs. post oxytocin treatment between the two autism subtypes.

The differences for (A) ADOS-2 (Autism diagnostic observation schedule-2) total score, (B) proportion of time spent looking at the eyes of angry faces and (C) proportion of time spent looking at dynamic social compared to dynamic geometric stimuli are shown. *p < 0.05; **p < 0.01; FDR-corrected.

Additionally, the current subtype model was also applied to the PLC condition in the same individuals to determine whether the two subtypes responded differently. Paired t-tests with FDR correction showed no significant differences in any outcome in either subtype 1 (all ps > 0.12) or subtype 2 (all ps > 0.09) following 6 weeks of PLC treatment followed by a period of positive social interaction.

Discussion

In the current study, we used data from our recent intranasal OXT intervention clinical trial [9] to identify two distinct autism subtypes by employing unsupervised data-driven clustering analysis of 23 different baseline measures. The proportion of individuals showing reliable improvements in clinical symptoms (ADOS-2 total score) after the 6-week OXT treatment followed by a period of positive social interaction was considerably and significantly greater in one subtype (16/26 = 61.5%) than in the other one (2/15 = 13.3%). This was not the case for the small number of individuals who showed either reliable improvements (n = 6) or deterioration (n = 4) after the PLC treatment followed by positive social interactions. Notably, ablation analysis revealed that the minimum number of included features for the two autism subtypes could be reduced to two main objective assessments (i.e. ADOS-2 scores and eye-tracking analysis of time spent viewing features of emotional faces; a total of 7 individual features) with 60% of OXT responders still being identified in one of the subtypes. Moreover, the autism subtype with the greater proportion of OXT responders was characterized by exhibiting greater interest in viewing the eyes and less interest in viewing the nose region of faces with different emotional expressions and a lower severity of social symptoms at baseline relative to the other subtype. In addition, during OXT treatment followed by positive social interaction, individuals with the autism subtype with more OXT responders showed significant overall improvement in autistic symptoms, adaptive behaviors and altered proportion of time spent viewing either the eye or nose regions of faces, whereas following the PLC treatment followed by positive social interaction they did not. 
Taken together our findings suggest that there may be a subtype of young children with autism who are more likely to show reduced symptoms in response to chronic OXT treatment followed by a period of positive interaction and provide a promising approach to help identify individuals most likely to show beneficial effects of OXT-based interventions in future studies. However, further validation of this clustering-based subtype established by combined clinical assessment and eye-tracking is required.

In the present study, a data-driven k-means clustering algorithm utilizing Manhattan (L1) distance was employed to determine autism subtypes. This choice of distance metric was deemed appropriate given its superior performance compared to the Euclidean distance metric (L2) in high-dimensional data mining tasks [37]. Notably, our analysis encompassed 23 diverse measures, highlighting the complexity of the data set. Subsequently, an ablation analysis [44] was conducted to systematically assess the contribution of each individual measurement in the oxytocin-sensitive ASD subtype model, thereby enhancing our understanding of the underlying factors that characterize this subtype. This ablation analysis identified 7 features which influenced the subtype analysis: two clinical (ADOS-2 SA and RRB) and five eye-tracking measures from the face emotion task.

Previous research, primarily based on studies using single-dose administration of OXT, has repeatedly shown that responses are often modulated by both context and personal characteristics [47, 48], although less has been established concerning responses to chronic treatment. Some chronic intervention studies have reported associations between social symptom improvements and OXT concentrations [9, 10] or age [20], but interestingly in our current analysis neither of these appeared to be important for subtyping or predicting whether individuals exhibited improved symptoms. OXT receptor genotype has also been considered to be a factor modulating responses to both acute [49] and chronic doses of OXT [9], but even without including such genotype information in the current analysis we could establish autism subtypes with different proportions of OXT responders. One previous study has reported greater improvements in autistic symptoms and increases in plasma OXT following a combined electroacupuncture and behavioral intervention in the aloof and passive social subtypes [50], whereas there was some indication in our current analysis that individuals scoring high on the aloof dimension were actually less likely to respond to OXT.

Interestingly, two main objective assessments of autistic symptoms (ADOS-2 and eye-tracking face task) could still reliably detect 60% OXT responder rate in subtype 2 based on ablation analyses, indicating these two measures played the primary role in distinguishing subtypes. ADOS-2 has long been considered as a gold standard measure of autistic symptom severity with a high sensitivity and specificity [51]. In the current analysis, subtype 1 individuals generally scored higher on both the social affect and repetitive and restrictive behavior scales than subtype 2 individuals suggesting that chronic OXT treatments are more likely to benefit individuals with lower initial overall symptom severity. However, scores on the social affect scale appear to be more informative in this respect suggesting that OXT may particularly benefit more individuals with less severe social symptoms. On the other hand, while eye tracking measures have revealed altered visual preferences in autism and have been proposed as an early biomarker of ASD [23, 24, 52, 53] they have not been extensively included as treatment outcome measures. While some previous studies have included similar eye-tracking measures in trials involving adults rather than children with autism and chronic daily OXT treatments they did not find significant effects relative to PLC [12, 13]. Of the two eye-tracking tasks included in our original study the one exhibiting the greatest utility for both subtyping and predicting OXT responders was the passive face emotion expression paradigm with individuals in the two subtypes typically exhibiting either very low interest in viewing the eye region of all face emotion types and instead greater interest in looking at the nose region (subtype 1) or a greater interest in viewing the eye region and less interest in looking at the nose region (subtype 2). 
Despite the suggestion of a reciprocal relationship between time spent gazing at the eyes relative to the nose, there was only a weak or absent correlation between them, suggesting that they are independent to some extent. Thus, it would appear that individuals who are more likely to respond to OXT generally exhibit more initial interest in looking at the eye region and less at the nose region of faces, even though this is still different from that found in typically developing children. This corresponds to some extent with the pattern of ADOS-2 scores in the two subtypes, since some studies have reported that autistic children who spend more time looking at the eyes of faces are likely to have less severe social symptoms [54, 55]. Interestingly, the ablation analysis revealed that the amount of time spent viewing the eyes of angry, fearful and neutral expression faces and the nose of angry and neutral faces was of greatest importance for discriminating between responders to OXT in the two subtypes. These represent the most potentially threatening or ambiguous face emotion stimuli, and there is some evidence that autistic children are particularly avoidant of the eyes of threatening expressions [56,57,58]. A recent study using eye-tracking data to distinguish between autistic and neurotypical individuals found that reduced fixations on the eyes of angry expression faces was a key factor in both children and adults [42]. Neutral expression faces are also responded to as threatening and evoke negative affect in young children [59], and autistic individuals are more likely to interpret neutral expression faces as threatening [60] and show increased attention [61] and amygdala responses to neutral faces relative to neurotypical individuals [62].

The other eye-tracking task we used assessed visual preference for dynamic social stimuli (dancing children) compared with dynamic geometric patterns. This task was originally developed by Pierce et al. [53]; autistic children generally spend more time looking at the geometric patterns while typically developing (TD) children spend more time looking at the dynamic social stimuli. We had previously found that this task was the most reliable in discriminating TD from ASD in Chinese children [52], and preference for the social stimuli can be facilitated by acute as well as chronic OXT treatment [9, 63]. It has been proposed that this task may identify a specific ASD “GeoPref” subtype (dynamic geometric images vs. social images of children interacting and moving) with higher symptom severity and reduced resting state functional connectivity [23, 24]. However, we did not find any differences in this task between the two autism subtypes identified using multiple measures in our current study, and the ablation analysis did not show that it contributed to identifying OXT responders.

Although neuroimaging data, such as resting state functional connectivity, have also been used to identify autism subtypes to facilitate diagnosis and prediction of response to treatment [64, 65], such data are challenging and relatively costly to collect in a large population of individuals with ASD, especially young children. Performing an MRI scan on such young children with ASD under natural sleep or following extensive training can be difficult, and use of sedative drugs may influence blood oxygen level dependent signals. Given our findings that only two objective and easily administered assessments (ADOS-2 and an eye-tracking task for emotional faces) are needed for potential screening of OXT responders, it may not be necessary to utilize either neuroimaging or more extensive genotyping to do so, although that does not of course mean that these measures could not be informative in the context of other kinds of interventions.

Some limitations of the current study should be acknowledged. (1) The current sample size is comparatively small due to strict inclusion criteria and the long-term intervention, although the observed differences between the effects of OXT on the two identified subtypes are highly significant. (2) External and independent data are needed to validate our findings, although this is currently difficult to do using publicly available data from previous studies since, as a minimum, they would need to have included both the same ADOS-2 measures and the same eye-tracking paradigms we used in young children. It is also unclear whether previous trials identified sufficient individuals showing a greater reliable improvement in their symptoms under OXT relative to PLC as defined by the RCI (see [46] and [20] for example).

In summary, our unsupervised data-driven cluster analysis of a population of autistic children showing variable responses to chronic OXT treatment has revealed that, out of the 23 baseline measures included, 7 measures from two objective assessments (ADOS-2 and eye-tracking) were effective in identifying two subtypes with markedly different responses to OXT. These two assessments may therefore represent a potential screening tool to identify individuals most likely to show improved symptoms following a chronic OXT treatment intervention.