Introduction

In psychosis, knowledge of unique physiology and pathology may aid diagnosis and promote targeting the most effective treatments to the needs of individual patients [1, 2]. Currently, there is substantial neurobiological heterogeneity within and overlap between schizophrenia, schizoaffective disorder, and bipolar disorder with psychosis, the most prominent of the idiopathic psychoses. The Bipolar-Schizophrenia Network for Intermediate Phenotypes (B-SNIP [3,4,5]), hoped to improve the standard of care by identifying neurobiological features common across and unique to each of the prominent psychosis diagnoses.

To achieve these goals, B-SNIP assumed three basic requirements: (i) use of large “trans-diagnostic” samples across diagnoses to capture clinical and neurobiological heterogeneity; (ii) quantification of biological and clinical features at multiple levels of analysis with an emphasis on capturing cognitive and physiological correlates of psychosis; and (iii) integration over biomarkers that index a single construct (e.g., cognition, behavioral inhibition, auditory sensory registration) because no single measure adequately captures any underlying brain deviation. Even using large samples, multiple and multi-level biomarkers, and considering biomarkers both individually and as integrated statistical constructs, B-SNIP found no neurobiological patterns distinguishing schizophrenia, schizoaffective disorder, or bipolar disorder with psychosis at a clinically useful level [6, 7].

Following this realization, B-SNIP modified the target from DSM diagnoses to idiopathic psychosis generally. The approach transitioned to probing whether there are neurobiologically distinct subgroups within the larger trans-diagnostic psychosis sample. The first iteration of this program (B-SNIP1 [6]) used numerical taxonomy (objective methods for identifying individuals with similar features on the units of classification) with integrated behavioral, cognitive, and electroencephalography (EEG) biomarkers (called bio-factors) and identified three “psychosis Biotypes.” A second data collection and analysis effort (B-SNIP2 [7]) re-quantified biomarker data from both B-SNIP1 and B-SNIP2 using updated procedures and correcting previous inconsistencies (these differences were fully described in Clementz et al., 2022 [7]. The outcome (i) replicated all steps in the biomarker and bio-factor quantification process, (ii) replicated psychosis Biotypes, (iii) cross-validated the subgrouping approach between B-SNIP1 and B-SNIP2, and (iv) construct validated Biotypes’ defining physiological features. These outcomes inspired testing whether B-SNIP-derived psychosis subgroups facilitate treatment targeting [8] and three major projects are currently ongoing.

Using data from 3507 individuals, this paper aims to further update and verify the practical utility and robustness of this laboratory-centered approach to stratifying idiopathic psychosis cases. These advancements also support the development of ADEPT [9, 10], a decision tree approach for efficiently diagnosis B-SNIP psychosis Biotypes using limited information. First, a full B-SNIP-type laboratory evaluation is impractical for many settings. Previous psychosis Biotype algorithms combined information from multiple laboratory tests (e.g., multiple EEG/ERP paradigms were previously required to estimate a single defining feature of Biotypes, like N100 ERP magnitude). This means, for instance, to quantify the N100 required two different ERP tasks. Developing an efficient diagnostic procedure, however, requires that each test must be separately evaluated (like paired stimuli ERP separate from oddball ERP) to determine their individual utility and unique contributions. This paper accomplishes that goal.

Second, B-SNIP previously screened biomarkers for usefulness by applying statistical comparisons between DSM psychosis and healthy groups because there was no other generally accepted approach. That is, possibly useful biomarkers (those that entered subsequent analyses) required a DSM group difference. That strategy missed informative biomarkers [e.g., [11]] because DSM psychosis diagnoses do not straightforwardly map to biological reality [12]. In this paper, we devised a strategy to eliminate that requirement, so the biomarker selection process is completely agnostic to clinical diagnoses or any group differentiations. Possibly useful biomarkers are based only on the properties of the tasks themselves. We also expanded the range of physiological assessments to include a direct measure of intrinsic neural activity [13], an important feature for differentiating psychosis cases [7, 14, 15].

Third, a drawback of numerical taxonomy is that it yields solutions regardless of whether subgroups are present and whether the outcomes are consistent. Previously, we estimated the number of subgroups using the gap statistic [16] and the preclustering step of SPSS’s TwoStep cluster analysis algorithm. In this paper, we expand to 24 such estimators of cluster number. In addition, for the best estimate number of clusters, we evaluate the consistency of assigning an individual subject to their modal group (what we call psychosis Biotypes) as a function of sample size used to derive the clustering solution. These analyses probe the robustness of the full sample solution and estimate the number of cases needed to construct a consistent biomarker-based diagnostic system applicable to the individual psychosis case.

Fourth, most psychosis cases in biomarker studies are medicated. Additionally, most have been chronically ill and chronically medicated. These factors create uncertainty whether biomarker differences between groups are related to trait illness, medications, or other effects of living with and adapting to a chronic condition. One way to evaluate such concerns is to study the biological relatives of persons with an idiopathic psychosis. Medication and the consequences of having a chronic medical condition have less explanatory power if the non-psychotic first-degree relatives show the same patterns of biomarker deviations as their ill relatives. In this paper, we show the fine-grained responses across bio-factors in a large sample of first-degree relatives using these updated biomarker quantification procedures.

Methods

In the current B-SNIP database, there are 1907 psychosis cases, 705 nonpsychotic first-degree biological relatives of those cases, and 895 healthy persons recruited from the community. This is an increase of 479 psychosis and 423 healthy subjects over our previous numerical taxonomy paper [7], plus a re-analysis of biological relative data originally published (Clementz et al. [6]). The re-analyses of relative data allow for more fine-grained analyses than were previously reported. The procedures for data collection and pre-processing are the same as Clementz et al. [7], but with updated and modified final quantification approaches. Those approaches are completely described in Supplementary Methods, including how to access the B-SNIP data.

B-SNIP recruitment sites were in Athens GA (University of Georgia and Augusta University Medical College of Georgia), Baltimore MD (Maryland Psychiatric Research Center), Boston MA (Beth Israel Deaconess Medical Center), Chicago IL (University of Illinois-Chicago and University of Chicago), Dallas TX (UT Southwestern Medical Center), Detroit MI (Wayne State University), and Hartford CT (Institute of Living). All recruitments, interviews, and laboratory data collections were completed at those locations. The Institutional Review Board at participating institutions approved the projects and data collection and analysis were completed in accordance with relevant guidelines and regulations. All participants provided informed consent prior to involvement.

Cases were drawn from academic and community mental health centers, small towns with large universities, large cities, inner cities, rural regions, affluent and less affluent areas. B-SNIP recruited a research sample, not an epidemiological sample; nonetheless, the large study numbers and broad geographical recruitment foster generalizability of the outcomes across the range of early onset through midcourse to chronic idiopathic psychosis. See Table 1 for demographic information of probands and healthy comparison subjects. See Table S1 for demographic information of first-degree relatives.

Table 1 Demographic characteristics of Biotypes.

Clinical evaluations

B-SNIP recruitment details and approaches are available in Tamminga et al. [5]. Briefly, clinically stable outpatients were administered the Structured Clinical Interview for DSM diagnosis [DSM-IV-TR [17]]. Psychosis cases were limited to schizophrenia (n = 783), schizoaffective disorder (n = 582), and bipolar I disorder with psychosis (n = 542) because these are the diagnoses with the highest prevalence in most settings. Cases were rated on the Birchwood Social Functioning [SFS [18]], Montgomery-Asberg Depression Rating [MADRS [19]], Positive and Negative Syndrome [PANSS [20, 21]], and Young Mania Rating [YMRS [22]] scales. Healthy persons were free of lifetime psychosis syndromes, recurrent mood syndromes, and a history of psychosis or bipolar disorders in their first-degree relatives. Table S2 shows the clinical information by group. Table S3 provides a summary of medication information. We previously illustrated that medication effects do not significantly account for group differences on biomarker features [see [7]]. As shown in Table S3, participants across all psychosis groups were largely on the same medications, although their biomarker profiles differ.

Biomarker panel

Participants completed comprehensive laboratory evaluations within a few weeks of their clinical assessments. Papers on the individual laboratory paradigms provide extensive data collection and analysis details [11, 23,24,25,26,27,28,29,30,31]. Details of biomarker quantification and numerical taxonomy procedures are in Clementz et al. [7]. See also Supplementary Methods accompanying this manuscript.

The laboratory measures used for Biotypes creation are traditional endophenotypes [32]. Each paradigm has a substantial literature supporting its use as a biomarker of psychosis. Paradigms include (i) the Brief Assessment of Cognition in Schizophrenia [BACS [33, 34]] to test general cognitive performance, (ii) pro- and anti-saccades [saccades [35,36,37]] to assess speed of visual orienting, goal maintenance, and inhibitory control under perceptual conflict, and (iii) a stop signal task [SST [38]] to assess adequacy of inhibitory control using speeded motor responses [24, 25]. See also Supplementary Tables S7S9.

There are also three assessments of brain physiology as measured with dense-array electroencephalography (EEG). Event-related brain potentials (ERP) were measured with (iv) auditory paired stimuli and (v) auditory oddball paradigms [39,40,41,42] to assess preparation for and recovery from sensory activations, neural responses to stimulus salience and relevance, context updating in working memory, and nonspecific (or intrinsic) brain activity during performance (i.e., brain activity not time-locked to stimulus processing). The 9–10 s inter-pair interval of the paired stimuli paradigm was also included as a direct measure of (vi) intrinsic EEG activity, or IEA [i.e., background brain activity not associated with ongoing stimulus processing requirements [11]]. See also Supplementary Tables S10S14.

Data reduction and creation of bio-factors

Within each laboratory measurement domain (BACS, saccades, SST, paired stimuli ERP, oddball ERP, IEA/EEG), principal component analysis (PCA; Covariance Matrix, Promax Rotation, Kaiser Normalization, Kappa = 3) reduced multiple variables within that domain to an efficient and smaller variable set. This was done for two main reasons. First, for estimating the true value on any theoretical construct, multiple independent measures are better than any single variable. For example, neural response to stimulus salience is better estimated by many ERP measures than by a single voltage from a single sensor at one time point. Second, reducing the redundancy of measurements via data integration methods like PCA increases the accuracy of numerical taxonomy [43].

All psychosis and healthy participants were included in PCAs using standardized variables. Age and sex-adjusted biomarker data were used, if such effects were statistically significant, based on the procedure described in Dukart et al. [44], and previously described in Clementz et al. [7]. This approach produced variables integrated over multiple biomarker measurements, which were the units of analysis for numerical taxonomy. We call these PCA variates “bio-factors” since they capture multiple facets of neuro-cognition and physiological responses and labelled them based on their most characteristic biomarker associations. Similar subcomponents of these procedures proved to be stable and replicated with high accuracy in two independent samples, each of which contained >700 psychosis and >200 healthy participants [7]. For additional analysis comparing the updated bio-factors to previous bio-factors, see the supplemental methods and supplemental figures S13 and S14.

Nonpsychotic first-degree relatives’ bio-factor scores were created by applying the PCA coefficients obtained from psychosis and healthy persons.

BACS

The BACS subtests, covering verbal abilities, processing speed, reasoning, problem solving, and working memory, were scored according to standard procedures. PCA of the BACS subtests identified one component (see Figure S1 for subtest means and SDs, and Table S4 for structure matrix). Thus, there is one BACS bio-factor.

Saccades

Participants completed three pro-saccade (gap, synchronous, and overlap) and one overlap anti-saccade condition [28, 31, 36, 45]. Trials were scored for (i) direction (to evaluate correct or error response) and (ii) onset latency. Pro-saccade latencies, anti-saccade latencies, and proportion of correct anti-saccades were included in the PCA (see Figure S2 for variable means and SDs, and Table S4 for structure matrix). The scree identified two bio-factors called “latency” and “antisaccade.”

Stop signal task

A baseline task of go-only trials, with a visual stimulus presented pseudo-randomly to the left or right of central fixation, assessed baseline reaction time. For stop-signal trials, a go cue appeared to the left or right. On 40% of trials, a stop signal was presented at central fixation [24, 25]. Participants were instructed to respond quickly and accurately to the go cue unless they encountered the stop signal. Strategic slowing (difference between response latencies on baseline go trials and go trials during stop signal performance) and proportion of stop signal errors were included in the PCA (see Figure S3 for variable means and SDs, and Table S4 for structure matrix). The scree identified one SST bio-factor.

Auditory ERP tasks

For the paired-stimuli task, participants passively listened through headphones to at least 120 broadband auditory click pairs with 500 msec inter-click interval occurring every 9.5 sec on average (9–10 s inter-pair interval). For the oddball task, participants listened through headphones to 567 standard (1000 Hz) and 100 target (1500 Hz) tones presented in pseudorandom order (1300 msec inter-trial interval) and pressed a button when a target was detected (to maintain vigilance).

Data from trials free of artifacts (±75 mV) were averaged to create 64-sensor ERPs. In addition to analyzing the grand-averaged ERP in the time domain, a frequency-wise PCA of evoked power [23, 26, 29, 30] empirically defined low, beta, and gamma frequency bands. The combination of temporal and frequency information over all sensors and time points maximizes the use of spatial, temporal, and oscillatory information. A spatial PCA [23, 26, 29, 30, 46, 47] on the grand-averaged ERP and each frequency band yielded four waveforms (“virtual sensors”). These virtual sensors were analyzed instead of separate sensors, efficiently summarizing the spatial distributions, minimizing the number of statistical comparisons, and maximizing the signal/noise ratio of the EEG/ERP data [46] (See Figure S4 for paired stimuli and Figures S5S7 for the oddball mean responses across time for each time course). For the paired-stimuli task, the PCA scree identified three bio-factors (see Figure S8 for the structure weights). For the oddball task, the PCA scree identified three bio-factors (see Figure S9 for the structure weights).

Intrinsic EEG activity (IEA)

Data derived from the 9–10 s inter-pair interval of the paired-stimuli task [11]. No stimuli were presented during this period. EEG data were pre-processed following methods described above and in Thomas et al. [11]. Data were transformed into the frequency domain, with frequency bands empirically determined using PCA [11], resulting in four primary bands: delta/theta, alpha, beta, and gamma (see Figure S10 and Table S4). The PCA scree identified one IEA bio-factor.

Data analyses

Clustering

The 11 bio-factors were used to construct psychosis subgroups via k-means clustering [7]. Only psychosis cases were used at this stage since the goal was to determine how to meaningfully parse bio-factor variance within psychosis. The number of clusters given the data were determined by the gap statistic [16] and the 23 estimators in the NBclust package in R [48]. See Supplementary Method for details.

Clustering membership consistency

Bootstrapped samples of psychosis cases were selected at sizes of 500–1800 cases, in 100 case increments, with 1000 pseudo-replicates for each sample size. Each of the clustering solutions were then compared to the raw total sample solution using unadjusted and adjusted Rand Indices for the least and most conservative estimates of cluster membership consistency. The Rand Index evaluates the quality of clustering solutions at the level of the individual case. Rand Indices provide a metric to compare the proportion of agreement between two clustering solutions. The unadjusted Rand Index yields a measure of clustering similarity across cases without considering possible change agreements. The adjusted Rand Index adjusted for possible chance agreements between clustering solutions, so is a more conservative estimate of cluster quality. For additional information examining the overall and by Biotype cluster separation ratios, see supplemental methods and table S17.

Group differences

The 11 bio-factors were compared between-groups using analysis of variance, with Tukey’s method for post-hoc evaluations (HSD or Tukey-Kramer where appropriate). For statistical significance in omnibus tests, the Holm-Bonferroni procedure [49] was used to maintain the family-wise alpha at 0.05. For first-degree relatives’ comparisons, degrees of freedom were adjusted based on the number of unique families included in the analysis since some families in the study had multiple members.

Results

Bio-factors by psychosis and healthy groups

The means, standard deviations, and effect sizes for the total psychosis versus healthy groups are presented in Table 2. Of the 11 bio-factors, those in the cognition set (BACS, antisaccade, SST) differentiated the best (F’s >90.7, p’s < 0.001, Glass ∆’s of −1.04, −1.03, and −0.45). The ERP response magnitude bio-factors also significantly differentiated groups (F’s >33.6, p’s < 0.001), but with considerably less separation (Glass ∆’s of −0.35 and −0.25); the same was true of the intrinsic activity bio-factors (F’s >12.3, p’s < 0.001, Glass ∆’s of 0.19, 0.26, and 0.16). The bio-factors in the stimulus salience set were more modestly differentiating, with the latency bio-factor failing to separate psychosis and healthy groups (F < 1, p >0.770, Glass ∆ = −0.01). The other two stimulus salience bio-factors, P300 complex and paired-stimuli S2, showed significant effects (F’s of 6.2 and 13.8, p’s of 0.013 and < 0.001) of small group differentiations (Glass ∆’s of 0.09 and −0.16).

Table 2 Standard score mean (SD) of bio-factors by psychosis and healthy.

Number of clusters

These analyses addressed the best estimate of the number of clusters given individual participant scores across the 11 bio-factors. Gap statistic figures are presented in Figure S11, and results from the 23 cluster number estimators of the NBclust package are presented in Table S5. Both the Gap statistic and NBclust majority rule indicate the most parsimonious solution is three clusters. Therefore, k-means was obtained requesting a three-cluster solution; the algorithm achieved cluster stability within 43 iterations [see Supplementary Methods and [7]]. The k-means outcome resulted in ~630 observations per cluster (psychosis Biotypes) as described in Table 1 (BT1 n = 630, BT2 n = 631, BT3 n = 646).

Consistency of cluster membership assignment

Figure 1 shows two consistency estimates for k-means membership of individual cases using a sub-sampling approach (1000 iterations at each subsample size from 500–1800 probands). The first Rand Index (upper red line) shows consistency with the full model solution without adjusting for chance assignment. This outcome shows remarkable consistency of >95% agreement for samples of greater than 1400 observations, excellent agreement of >90% for sample sizes of greater than 900, and still good agreement >82% for sample sizes of at least 500. The second Rand Index (lower black line) shows consistency adjusting for the probability of a case being assigned by chance to one of the three groups. This most conservative metric shows remarkable agreement of >95% for samples of greater than 1600, excellent agreement of >90% for samples of greater than 1500, and good agreement of >80% for samples of greater than 1000.

Fig. 1: Cluster consistency.
figure 1

To test cluster member consistency, a bootstrapping approach was used on subsamples in sizes ranging from 500–1800, in steps of 100, of the total sample (1000 iterations each step). A random subsample was selected at each sample number, normalized, and submitted to the k-means clustering algorithm 1000 times. Membership from each subsample clustering solution was compared to the membership of the total sample with an adjusted and non-adjusted rand index. Rand index is a similarity measure between two clustering solutions. Red line is the average of the non-adjusted rand index the 1000 iterations at each subsample. Error bars are the 99% Confidence interval. Gray shading represent the 40th and 60th percentile values. Black line is the adjusted rand index average across the 1000 iterations at each sub-sample. At sample sizes greater 1000, the adjusted rand index shows good and the unadjusted rand index shows excellent consistency across subsamples.

Bio-factors by B-SNIP psychosis Biotypes

Figure 2 (left plot) shows bio-factors plotted by group membership (see also Supplemental Tables 715 for unadjusted individual variables and bio-factors by Biotype and DSM diagnoses). See Figure S12 for a comparison of cases on and off antipsychotics at the time of testing. All bio-factors differentiate psychosis Biotypes (Holm-Bonferroni adjusted significance, F’s = 20.21–703.81, p’s < 0.001). This result is not surprising because numerical taxonomy used these bio-factors to create maximally homogeneous and distinct groups. When adding the healthy group to the models, all comparisons remained significant and of similarly large magnitude (Holm-Bonferroni adjusted significance, F’s = 14.96–442.78, p’s < 0.001). The differences in bio-factor pattern for psychosis groups in comparison to the healthy group is consistent, but more robust than previous reports [7]. Consequently, we maintained the same designations of BT1 (low cognitive performance and low neural response magnitudes), BT2 (low cognitive performance, poor inhibition, accentuated intrinsic brain activity), and BT3 (reasonably normal across most bio-factors but mildly deviant on measures of stimulus salience).

Fig. 2: Bio-factors by B-SNIP psychosis Biotypes and their relatives.
figure 2

Standardized values of each of the 11 bio-factors that were included in the clustering algorithm by Biotype. The average healthy values was subtracted from each bio-factor to show the relative difference across measures. Across all bio-factors there were significant differences betweene psychosis Biotypes (Holm-Bonferroni adjusted significance, F’s = 20.21–703.81, p’s < 0.001). Error bars = 99% Confidence intervals. Standardized values of each of the 11 bio-factors by non-psychotic first-degree relatives separated by their proband Biotype membership. BACS, antisaccade, paired stimuli and oddball ERPs, and IEA bio-factors differentiate the relative groups (Holm-Bonferroni adjusted significance, F’s >4.1, p’s < 0.007). Error bars = 99% Confidence interval.

Biotypes have unique patterns across the 11 bio-factors. They are also distinguished from healthy persons on these variables. To ease visualization of unique group bio-factor patterns, we used canonical discriminant analysis (CDA) with the criterion of group membership (BT1, BT2, BT3, healthy) and the 11 bio-factors as predictors. This is a simplification of group differentiations in the 11-variable space of the bio-factors. The CDA yielded three significant variates (chi-squares >110.7, p’s < 0.001; canonical correlations of 0.69, 0.64, and 0.26, p’s < 0.001; see Fig. 3 and table S5).

Fig. 3: Canonical discriminant analysis variates by B-SNIP psychosis Biotypes.
figure 3

Results of the canonical discriminant analysis which had 3 significant variates (chi-squares >110.7, p’s < 0.001; canonical correlations of 0.69, 0.64, and 0.26, p’s < 0.001). A. CDA variate 1 (Pattern 1) was associated with lower cognitive scores and higher ongoing frequency activity from the Oddball and Paired Stimulus tasks. B. CDA Variate 2 (Pattern 2) was associated with reduced ERP responses and intrinsic neural activity. C. CDA variate 3 (Pattern 3) was associated with the frontal P3a EEG response, Paired Stimulus S2 ERP activity, and pro-saccade Latency. See table S6 for CDA Structure Matrix.

CDA Variate 1, what we termed “Pattern-2” because it captures BT2, has the most significant associations with antisaccades (r = 0.60), BACS (r = 0.56), and ongoing EEG high frequency activities (oddball r = 0.48, paired-stimuli r = 0.47). Lower scores indicate a trio of poor cognitive performance and behavioral inhibition combined with accentuated background brain activity during stimulus processing. Pattern-2 best distinguishes BT2 from the other groups, with post-hoc tests showing a pattern of BT2 < BT1 < (BT3 = HC). CDA Variate 2, termed “Pattern-1,” best separates BT1 from the other groups and is associated with a reduced ERP responses (paired-stimuli r = 0.69, oddball r = 0.72) and reduced intrinsic EEG activity (IEA r = 0.70). Lower scores indicate generally reduced neural activity. Post-hoc comparisons show a pattern of BT1 < (HC = BT2) < (BT2 = BT3). CDA Variate 3, termed “Pattern-3,” best separates BT3 from the other groups and is associated with frontal P3 complex responses (r = 0.61), response to the second stimulus of the paired-stimuli paradigm (r = −0.57), and prosaccade latencies (r = −0.34). Lower scores indicate altered sensitivity to stimulus salience in comparison to healthy persons. Post-hoc comparisons show a pattern of BT3 < (BT2 = BT1) < H.

First-degree relatives by proband cluster membership

Figure 2 (right plot) shows bio-factors plotted for the non-psychotic first-degree relatives by the proband to whom they are related. The BACS, antisaccade, paired stimuli and oddball ERPs, and IEA bio-factors differentiate the relative groups (Holm-Bonferroni adjusted significance, F’s >4.1, p’s < 0.007). The relatives’ patterns of deviation on those bio-factors are like the patterns among their probands. The BACS [(BT1 = BT2) < H < BT3; effect sizes for relatives versus healthy of BT1 = −0.33, BT2 = −0.28, and BT3 = 0.26] and antisaccade bio-factors [(BT1 = BT2) < (H = BT3); effects sizes of BT1 = −0.30; BT2 = −0.44, and BT3 = 0.15] show similar patterns of deviations, except that BT3 relatives had the best general cognitive performance, even in comparison to healthy persons. The patterns for the ERP magnitude measures recapitulate the probands’ patterns with BT1 relatives having lower ERP amplitudes than the other three groups combined (paired stimuli ERP effect sizes of BT1 = −0.29; BT2 = 0.01, and BT3 = 0.01; oddball ERP effect sizes of BT1 = −0.47; BT2 = −0.08, and BT3 = −0.08). Likewise, the IEA bio-factor shows the same pattern as the ERP magnitude measures, with BT1 relatives being lower than the other three groups (effect sizes of BT1 = −0.63; BT2 = −0.16, and BT3 = −0.08).

Discussion

The B-SNIP consortium reported and replicated [7] that DSM schizophrenia, schizoaffective disorder, and bipolar disorder with psychosis are not obviously neurobiologically distinctive, a conclusion also supported by large-scale genetics projects [12]. Such realities muddy efforts to link a specific clinical feature or syndrome to identify a pathology responsive to a specific intervention. Guze [1] believed standardized clinical evaluations in psychiatry are useful for detecting specific disease entities, but only “up to a point.” As the “all psychosis” versus “healthy” comparisons reveal, much is hidden by imprecise groupings. For psychosis, including neuroscience in diagnostic definitions could be beneficial since the brain is the most affected organ. B-SNIP implemented laboratory assessments to stratify idiopathic psychosis cases regardless of their specific clinical diagnoses [6, 7]. We demonstrated that these psychosis Biotypes have unique patterns of clinical features that can assist laboratory diagnosis [9, 10]. This paper modifies and extends our previous efforts. We improved the efficiency and reinforced the robustness of the B-SNIP psychosis Biotypes algorithm and identification in three ways:

  1. (i)

    Previous iterations required multiple laboratory paradigms to assess a single construct (e.g., the n100 response in an auditory ERP). We changed to using paradigms individually. This approach (i) provides internal replication of effects common to multiple paradigms, (ii) aides identification of efficient tests for distinguishing psychosis subgroups by disambiguating all measures, and (iii) facilitates the development of efficient diagnostic algorithms for B-SNIP Biotype assignments [9, 10]. It also could aid in clinically settings where a patient might only complete or have high quality data on one EEG measurement. We confirm that auditory paired-stimuli and oddball ERPs have the same patterns across psychosis Biotypes. Three measures of background or intrinsic brain activity (IEA, and ongoing activity during the paired-stimuli and oddball tasks) also showed the same Biotypes differentiations. These similar laboratory measures, may track differently across interventions, thus providing unique treatment targets and outcome assessments.

  2. (ii)

    In previous iterations, biomarker selection was tied to statistical differences between healthy and DSM diagnostic groups. Here, biomarker inclusion is agnostic to group membership and is instead based on characteristics of the measures themselves (e.g., the prototypical patterns of EEG/ERP responses to auditory oddball stimuli). We expanded the physiological assessments to include a direct measure of intrinsic neural activity (IEA). This updated approach yielded two additional bio-factors beyond the nine used in Clementz et al. (2022 [7]), which had validated and refined the nine bio-factors from Clementz et al. (2016 [6]) (see supplemental methods and figures S13 and 14 for comparison between iterations of bio-factors). The newly added measures, IEA and separate measures of ongoing activity from the paired stimuli and oddball paradigms, and rearrangement of what is measured by individual bio-factors (i.e., paired-stimuli and oddball ERPs disentangled, frontal P3 ERP uniquely quantified), enhanced separations between psychosis Biotypes. We also extracted three components from the 11 bio-factors that uniquely identified the most characteristics features of psychosis Biotypes from each other and from healthy comparison subjects: Pattern-2 for BT2, Pattern-1 for BT1, and deviations of Pattern-3 for BT3. The discrimination of BT3 from healthy persons and other idiopathic psychosis cases was a new addition made possible by this modified approach.

  3. (iii)

    For procedures like k-means, it is crucial to validate the veridicality of the number of subgroups. In comparison to previous psychosis Biotypes iterations, we used one overlapping (gap statistic) and 23 new cluster estimation procedures (from NBclust in R [48]). The most probable outcome was again three subgroups (10 of 24 estimates). We also evaluated the consistency of assigning cases to their modal subgroup across a range of bootstrapped sample sizes. The bio-factors are highly repeatable [8], so the chance group assignment correction of the adjusted rand index is conservative for our case. Nevertheless, even using a conservative approach, individual cases are assigned to their modal Biotype group with exceptional to good consistency with sample sizes greater than 1000. The outcome of this analysis, however, highlights the sample sizes required to construct a robust neuroscience-assisted classification procedure applicable to individual cases. This also may be true for clinically-anchored classification approaches, but we are unaware of a similar demonstration for experiential systems like DSM.

The analysis of biological relatives’ bio-factors probed which neuro-biological features may be crucial to the etiology and pathogenesis of B-SNIP psychosis Biotypes. Bio-factor patterns of probands were replicated among their nonpsychotic first-degree relatives for BACS, antisaccade, auditory ERPs and IEA bio-factors. Given that these variables also show high familial similarity [more liberally called heritability [6]], these measures appear to be endophenotypes [32], but only for specific psychosis Biotypes. IEA is a particularly distinguishing measure between psychosis subgroups, especially between Biotypes 1 and 2. Relatives of BT1 probands showed similarly reduced IEA, although less severe, suggesting a familial component to this deficit. In contrast, BT2 relatives did not differ from healthy controls. Even BT3 psychosis cases, who are statistically equivalent to healthy persons on cognitive performance, are deficient in relation to their non-psychotic family members. This finding may refine previous reports of cognitive endophenotypes for bipolar disorder [50, 51], and also recapitulates the generally lower cognitive performance of psychosis probands in relation to their unaffected first-degree relatives [52]. These outcomes raise the possibility that laboratory measures such as motor inhibition from the SST, induced EEG activity, and stimulus salience assessments may be pathology markers of idiopathic psychosis after it has developed rather than causal markers related to elevated psychosis-risk. Alternatively, the nonfamilial bio-factors may index other acquired deviations that explain why members of the same high-risk family differ in psychosis manifestation.

B-SNIP’s multi-domain, neurobiologically defined, trans-diagnostic psychosis Biotypes offer an alternative to the traditional clinical approach for case stratification. This approach has been galvanized by the extent of unsuccessful research into the biological mechanisms of conventional psychosis diagnoses and the failure to develop new treatments. With our updated biologically based sub-groups, we have attempted to capture homogenous disease groups with common biological dysfunctions. This is an approach we will continue to refine and is being more widely adopted in biological psychiatry research [53,54,55,56]. Critically, it deserves thorough and rigorous testing. The bio-factor underpinnings of B-SNIP Biotypes are stable and replicable, and the Biotypes themselves show excellent cross- and construct validation [7, 14, 15, 57,58,59,60,61,62,63,64,65]. They also have differentiating patterns of clinical features [8, 9, 66].

This does not mean B-SNIP psychosis Biotypes are the final or correct model; the solution is a function of variables used and methods applied. We hope more investigators will probe their usefulness for understanding psychosis etiology and improving treatment targeting. Second, the reason for our selections were described in multiple previous papers and herein. With any clustering solution, there is the possibility of overfitting to the clusters to the data included in the selection. We have attempted to mitigate this with our iterative approach and shown that our clustering solutions are consistent with our sub-sample analyses. This also does not exclude the possibility that other laboratory tests can enhance understanding of idiopathic psychosis. Third, we demonstrate it is unlikely medication effects account for Biotype group differences: cases across Biotypes are on similar medications but have different bio-factor patterns, medications share limited variance with biomarker scores [7] and unmedicated biological relatives show the same patterns of bio-marker deviations as the case to whom they are related. The lack of especially antipsychotic medication effects on cognitive performance is also consistent with recent reports [67]. Nevertheless, like most similar studies in this literature, we do not have information on whether these persons were taking their medications as prescribed or what these persons were like before exposure to psychiatric medications.

The B-SNIP psychosis Biotype approach yields rational, testable hypotheses of pathophysiological theories and treatment targets that are not derivable from any available alternative. It perhaps offers a beginning for transitioning a part of psychiatry to a laboratory discipline. The ability to tailor treatments to individual psychosis patients and improve outcomes, however, will be the ultimate validator of this or any other approach to psychosis diagnosis [68].