Fig. 1: Overview of the analytic pipeline.

A We employed multivariate-objective optimization independent component analysis with reference (MOO-ICAR) to estimate 105 multi-scale intrinsic connectivity networks (ICNs) at the subject level. The reference for the 105 ICNs was derived from a large sample of over 100,000 participants. We computed subject-level static functional network connectivity (FNC) as the pairwise Pearson correlations between the cleaned ICN time courses, resulting in a 105 × 105 symmetric FNC matrix for each participant. ICNs are grouped according to their anatomical and functional properties. B Principal component analysis plus canonical correlation analysis (PCA-CCA) was fitted on the discovery set, including patients, relatives, and controls, to find FNC features associated with the Brief Assessment of Cognition in Schizophrenia and the Wechsler Memory Scale Backward and Forward Tests. The optimal number of principal components was estimated directly from the data using the elbow criterion. Three canonical correlations remained statistically significant in the replication set, and two remained statistically significant after adjusting for covariates. We conducted a pairwise comparison between patients (N = 256) and controls (N = 126) for the remaining two canonical variates, adjusting for covariates, in the replication set. Similar comparisons including first-degree relatives (N = 93), who make up the rest of the replication set (N = 475), can be found in the Supplementary material. Patients and controls differed significantly in both cognitive canonical variates and in the first FNC canonical variate but not in the second FNC canonical variate; we therefore did not include the second pair in subsequent analyses. C We selected the FNC features most strongly correlated with the first FNC canonical variate in patients in the discovery set, retaining those with an absolute loading above 0.1047, the threshold given by the elbow method (Cognitive FNC features in Psychosis, or CFPs).
Left: Correlations between the 5460 FNC features and the first FNC canonical variate in patients in the discovery set, displayed as a 105 × 105 symmetric matrix. Blue colors: negative correlation. Yellow-red colors: positive correlation. Lighter colors: correlation closer to 0. Middle: elbow plot of the ranked absolute correlations, with a vertical line at 0.1047, the value chosen as the threshold. Right: FNC features with an absolute correlation below 0.1047, which were therefore excluded from k-means clustering, are shown in white. The 1077 FNC features selected for k-means clustering are shown in color. D We conducted k-means clustering on patients from the discovery set to find patient subgroups based on CFPs. The silhouette index for the two-cluster solution was statistically significant (p = 0.0005). E We assigned patients from the replication set to one of the clusters obtained in the discovery set based on the shortest Euclidean distance between each subject’s centroid and the clusters’ centroids for the 1077 CFPs. F We computed the centroid of the 1077 CFPs in the control group to simulate a control cluster. As in E, we assigned first-degree relatives to one of the two patient clusters or to the control cluster based on the shortest distance between the centroid of each subject and the centroids of the clusters.
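
The array operations behind panels A, C, E, and F can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names (`static_fnc`, `upper_triangle`, `select_features`, `assign_to_nearest`) are hypothetical, the data are synthetic, and it assumes each subject is represented by a vector of selected FNC features. MOO-ICAR, PCA-CCA, covariate adjustment, and k-means itself are not reproduced here.

```python
import numpy as np


def static_fnc(time_courses):
    """Pairwise Pearson correlations between ICN time courses.

    time_courses: array of shape (n_timepoints, n_icns).
    Returns a symmetric (n_icns, n_icns) matrix.
    """
    return np.corrcoef(time_courses, rowvar=False)


def upper_triangle(fnc):
    """Vectorize the unique off-diagonal entries of a symmetric matrix."""
    rows, cols = np.triu_indices(fnc.shape[0], k=1)
    return fnc[rows, cols]


def select_features(features, variate, threshold):
    """Indices of features whose |Pearson r| with `variate` exceeds `threshold`.

    features: (n_subjects, n_features); variate: (n_subjects,).
    """
    fz = (features - features.mean(axis=0)) / features.std(axis=0)
    vz = (variate - variate.mean()) / variate.std()
    r = fz.T @ vz / len(vz)  # Pearson r for each feature
    return np.flatnonzero(np.abs(r) > threshold)


def assign_to_nearest(subjects, centroids):
    """Label each subject with the index of the closest centroid (Euclidean).

    subjects: (n_subjects, n_features); centroids: (n_clusters, n_features).
    """
    diffs = subjects[:, None, :] - centroids[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)
    return distances.argmin(axis=1)


# Synthetic example: 105 ICNs give 105 * 104 / 2 = 5460 unique FNC features.
rng = np.random.default_rng(0)
fnc = static_fnc(rng.standard_normal((200, 105)))
print(upper_triangle(fnc).shape)  # (5460,)
```

Under these assumptions, a control centroid as in panel F would simply be one more row appended to `centroids`, so that `assign_to_nearest` can return that row's index for relatives closest to the control group.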