Fig. 4: Dark septate endophyte colonization has a direct association with the relative abundance of plant pathogens, the bacterial/fungal ratio, and the diversity of different functional genes.

(a) Structural equation modeling (SEM) of the proposed direct and indirect drivers of the relative abundance (centered log-ratio of metabarcoding reads) of putative fungal plant pathogens in soil (n = 41). (b) SEM of the proposed direct and indirect drivers of the ratio (metagenomic reads) of bacteria/fungi in soil (n = 43). (c) SEM of the proposed direct and indirect drivers of the diversity (Shannon H Index) of bacterial carbohydrate-active enzymes (CAZymes) in soil (n = 43). (d) SEM of the proposed direct and indirect drivers of the diversity (Shannon H Index) of total bacterial functional genes in roots (n = 39). (e) SEM of the proposed direct and indirect drivers of the diversity (Shannon H Index) of nitrogen (N) cycling genes in roots (n = 39). All SEMs were calculated only on data from Sorbus and Alnus trees on which we measured dark septate endophyte (DSE) colonization (n = tree species × site); DSE colonization is indicated by gray boxes. Included environmental variables were mean annual temperature (MAT), indicated by yellow boxes, and the soil carbon/nitrogen (C/N) ratio and soil pH, both indicated by brown boxes. All linear mixed-effects models within the SEMs used plot embedded in site crossed with tree species as the random-effects structure. Only significant (p < 0.05) and marginally significant (p = 0.05) paths are displayed (calculated using the Kenward-Roger approximation). Pseudo-R² values (marginal and conditional) are listed for response variables. Line thickness represents the standardized regression coefficients (Std. coeff., listed in the path diagrams above the p values). Black solid paths indicate a positive relationship, black dashed paths indicate a negative relationship, and red paths indicate correlated errors. Overall model fit was assessed using the Akaike information criterion (AIC), and Fisher’s C values were used to calculate model p values, with p > 0.05 indicating acceptable model fit. The statistical test used was two-sided.
For detailed SEM results see Supplementary Data 3.
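
As a hedged illustration of the model-fit criterion mentioned in the caption: in the directed-separation approach commonly used for piecewise SEMs, Fisher’s C is computed from the p values of the k independence claims as C = −2 Σ ln(p_i) and compared with a chi-squared distribution on 2k degrees of freedom. The sketch below assumes that convention. The function name `fisher_c` and the example p values are hypothetical and are not taken from the study or its software.

```python
import math


def fisher_c(p_values):
    """Return (C, df, model_p) from independence-claim p values.

    Assumes the usual definition C = -2 * sum(ln p_i) with df = 2k.
    """
    if not p_values:
        raise ValueError("need at least one p value")
    if any(not (0 < p <= 1) for p in p_values):
        raise ValueError("p values must lie in (0, 1]")

    k = len(p_values)
    c_stat = -2.0 * sum(math.log(p) for p in p_values)

    # For even degrees of freedom (df = 2k) the chi-squared upper tail
    # has a closed form: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    half = c_stat / 2.0
    model_p = math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))
    return c_stat, 2 * k, model_p


# Hypothetical independence-claim p values, for illustration only.
c_stat, df, model_p = fisher_c([0.42, 0.77, 0.18])
print(f"Fisher's C = {c_stat:.3f}, df = {df}, p = {model_p:.3f}")
```

Under this convention, a model p value above 0.05 means the data give no evidence against the independence claims, which is how the caption's "acceptable model fit" threshold is read.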