Introduction

Conventionally, two types of starter cultures are used in industrial cheese production: defined starter cultures and complex undefined starter cultures1. Defined cheese starter cultures are usually composed of one or more strains of mesophilic lactococci with known characteristics. These individual strains have generally been isolated from undefined complex starter cultures to obtain axenic single strains. The undefined starter cultures currently in use were generally isolated in the 1950s and 1960s from cheese production farms and were kept frozen to retain their original composition. For example, cultures used in the Netherlands to produce Gouda cheese are domesticated undefined cultures stored as frozen stocks and reactivated only when strictly necessary, to minimize the number of propagation cycles and to limit compositional shifts2. Both defined and undefined starter cultures are essential for fermenting lactose into lactate and degrading caseins, resulting in milk acidification and preservation3. However, mixed-strain undefined starter cultures were reported to be more resilient to stress and bacteriophage attack and to display a more robust performance compared with defined low-strain-diversity dairy cultures4. These properties have triggered a renaissance of interest in undefined starter cultures, with the aim of understanding the bacterial interactions and co-evolutionary forces shaping these communities5,6,7,8,9.

In the production of Italian long-ripened hard cheeses such as Parmigiano Reggiano, Grana Padano, and Trentingrana, a third kind of undefined starter culture is produced by incubating cheese whey under conditions that favor the growth of desirable thermophilic lactic acid bacteria (LAB)10. In the case of Parmigiano Reggiano (PR) cheese, natural whey starter (NWS) is produced daily in every farm belonging to the Protected Designation of Origin (PDO) Consortium (Specifications of Parmigiano Reggiano Cheese; https://www.parmigianoreggiano.com/consortium/rules_regulation_2/default.aspx). A part of the whey of the previous cheesemaking round is removed after curd cooking (54–56 °C), progressively cooled until reaching 49–46 °C for approximately 20 h, and subsequently used to inoculate a new milk batch for the following cheese production. Consequently, NWS cultures encounter considerable environmental changes and consist of highly variable bacterial communities, which differ over time and space in species and strain composition depending on the trend in whey cooling, the size of the fermentation bioreactor, and the bacterial quality of the cheese whey. The variability of PR NWS is further increased by the use of unpasteurized cow milk as raw material, which in turn affects the microbiota inhabiting cheese whey. Previous studies on PR NWS identified Lactobacillus helveticus as the predominant member of the microbiota11,12,13,14. Further knowledge on the bacterial composition of PR NWS was provided by Bottari et al.15 and Bertani et al.16 by using culture-independent techniques. These seminal studies demonstrated that PR NWS is composed of comparable percentages of L. helveticus and Lactobacillus delbrueckii, while Limosilactobacillus fermentum (basionym: Lactobacillus fermentum) and Streptococcus thermophilus are present to a lesser extent. All these species are acid-tolerant, microaerophilic, thermophilic, and well adapted to thrive under the selection pressure occurring during whey fermentation17,18,19.

Metabarcoding approaches, based on high-throughput sequencing (HTS) of variable regions of the 16S rRNA gene, have been widely used to describe the composition of bacterial communities in dairy ecosystems, generating a body of knowledge useful to improve the quality and safety of dairy products20,21. Although NWS performance is critical to assure the reliability of PR cheesemaking, there is limited information on NWS variability in terms of species abundance over the PDO production area. Moreover, little is known about the microbial interactions inside PR NWS that can shape microbiota composition.

The aim of this pilot study was to provide a culture-dependent and culture-independent profile of the microbiota inhabiting PR NWS sampled over the PDO production area and to correlate these profiles with physicochemical and technological parameters.

Results

Physicochemical parameters and viable microbial counts

Ten NWS samples, termed C1 to C10, were collected from different dairy farms located in the production area of PR cheese during the autumn. Values of pH ranged from 3.27 to 3.57, while titratable acidity ranged from 27.3 to 33.4 SH°/50 mL (Supplementary Table S1). Fermentative performance of NWS samples, measured as ΔSH°/50 mL, was also highly variable, suggesting differences among NWS samples in milk acidification rate. As expected, lactate was the main organic acid detected in NWS (Fig. 1A), but its content varied among samples, reflecting differences in NWS microbial composition consistent with the home-made nature of the mixed microbiota forming NWS (Supplementary Table S1). Lactate concentrations in samples C5 and C10 were significantly lower than in other NWS (p < 0.05). During lactic fermentation, some organic acids (lactic and acetic acid) increase, while other organic acids (citric acid, etc.) derived from lipolysis, carbohydrate metabolism, or amino acid metabolism decrease22. Under low pH conditions, homofermentative species could undergo a shift from a homofermentative to a mixed-acid profile with the production of acetate and succinate from citrate catabolism23. Furthermore, citric, acetic, and succinic acids are also by-products of yeast sugar catabolism. Citric acid was the second most abundant organic acid in the NWS analyzed, followed by acetic and succinic acids. Sample C10 showed the highest concentration of acetic acid (p > 0.05), followed by C7 and C8. The proportion of acetic acid in the non-dissociated form depends on the external pH and on the pKa of the acid (pKa = 4.76). In acidic environments, like NWS, many organic acids are in non-dissociated forms and can penetrate the cell membrane, accumulating within the cytoplasm and causing loss of viability and cell death24. Accordingly, the acetic acid produced by NWS mixed cultures is mainly in the non-dissociated form at pH < 4.5 and can penetrate the cytoplasm, producing deleterious effects on the cells.
Ethanol was detected at low concentrations (data not shown).
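
The dependence of the non-dissociated fraction of acetic acid on pH can be illustrated with the Henderson–Hasselbalch relation. The following is a minimal sketch, not part of the original analysis: the function name is hypothetical, the acid is treated as an ideal monoprotic weak acid, and activity effects in whey are ignored. Only the pKa (4.76) and the pH range (3.27 to 3.57) are taken from the text.

```python
def undissociated_fraction(ph: float, pka: float = 4.76) -> float:
    """Fraction of a monoprotic weak acid present in the protonated (HA) form.

    From Henderson-Hasselbalch: [A-]/[HA] = 10**(pH - pKa),
    so HA / (HA + A-) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # pH extremes reported for the NWS samples, plus the pH 4.5 threshold
    for ph in (3.27, 3.57, 4.5):
        share = undissociated_fraction(ph)
        print(f"pH {ph:.2f}: {share:.1%} of acetic acid is non-dissociated")
```

Under these assumptions, roughly 94 to 97% of the acetic acid would be non-dissociated over the measured pH range, which is consistent with the statement that the acid is mainly in the non-dissociated form below pH 4.5.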

Figure 1

Organic acids contents and microbial counts of ten NWS samples. Organic acids concentrations are expressed in g/L as means ± standard deviation of four replicates (A). Microbial counts of presumptive lactobacilli (B), streptococci (C), cocci-shaped LAB (D), mesophilic yeasts (E), and lactose-fermenting thermophilic yeasts (F) are expressed as means of Log10 CFU/mL of at least three replicates. Significant differences are indicated with different letters (p < 0.05), as calculated by one-way ANOVA. Plotted with GraphPad Prism v.8.00 software (San Diego, CA, USA, https://www.graphpad.com/).

Presumptive thermophilic LAB were enumerated under three different conditions, namely MRS (pH 5.4) medium incubated anaerobically (for lactobacilli enumeration), M17 medium supplemented with sterile skimmed whey (M17-SSW) incubated aerobically (for streptococci enumeration), and M17-SSW medium incubated anaerobically (Fig. 1B–F). All the samples showed presumptive lactobacilli counts ≥ 7.5 Log10 CFU/mL (Fig. 1B), while cocci-shaped populations enumerated both aerobically and anaerobically on M17-SSW medium ranged from 6.65 ± 0.23 to 8.37 ± 0.17 Log10 CFU/mL (Fig. 1C,D). The samples C2, C7, and C10 showed the lowest streptococci counts as enumerated on M17-SSW medium aerobically (p < 0.05). Less pronounced differences were detected in the plate counts on M17-SSW medium incubated under anaerobic conditions (Fig. 1D).

Mesophilic yeasts were detected in all samples, with C1 showing the highest counts (p < 0.05) (Fig. 1E). Selective enumeration on YPLA medium at 42 °C showed that thermotolerant and lactose-fermenting species represent the dominant yeast fraction (Fig. 1F). These results agreed with data previously collected from other PR-NWS samples25,26. The lack of detectable levels of ethanol in all NWS suggests that yeasts either preferentially consume lactate or engage in respiratory sugar catabolism.

Cultivable bacterial fraction characterization

After several attempts, we successfully isolated 57 Gram-positive bacterial axenic cultures; the catalase reaction was negative for all the isolates, except for T2013. We observed a decline in cultivability of isolates in all three culture conditions. Fifty-three percent of colonies either did not grow after the first purification step or died during the subsequent rounds of purification by streaking on the same isolation medium. The cultures most recalcitrant to isolation were those from M17-SSW medium incubated in anaerobiosis at 42 °C. This apparently disagrees with the assumption that media mimicking environmental conditions should increase and diversify the number of cultivable bacteria.

16S-ARDRA with the selected restriction enzyme MseI discriminated all the species considered, except for L. delbrueckii subsp. bulgaricus and L. delbrueckii subsp. lactis (Supplementary Table S2). Differentiation between L. delbrueckii subsp. bulgaricus and L. delbrueckii subsp. lactis was successfully obtained with the enzyme EcoRI27. Based on the electrophoretic profiles obtained with MseI, 56 out of 57 NWS isolates were classified into three species, namely, L. helveticus (33.90%), L. delbrueckii (22.03%), and St. thermophilus (38.98%). Culture conditions affected species recovery, with MRS (pH 5.4) selecting for L. helveticus and M17-SSW under anaerobiosis selecting for L. delbrueckii. The medium M17-SSW under aerobic conditions was confirmed to be selective for St. thermophilus28. All the L. delbrueckii isolates were assigned to subsp. lactis, according to the EcoRI restriction pattern (Supplementary Table S2).

A total of 27 high-quality sequences, obtained from 13 bacterial isolates representative of the previously established restriction patterns and 14 reference strains retrieved from the GenBank RefSeq database, were used to infer the phylogenetic positions of bacterial cultures (Fig. 2A). Three major clades were represented, corresponding to the genera Streptococcus, Lactobacillus, and Staphylococcus. One strain, T2103, was grouped with Staphylococcus capitis ATCC 19258T, sharing 100% similarity. Two strains, T10105 and T8106, were grouped with St. thermophilus ATCC 19258T, sharing 100% similarity with each other and 99.93% similarity with the reference strain. The remaining 10 isolates were included in the Lactobacillus clade. Within this clade, branches representing species differentiation were supported by high bootstrap values for all the species considered, except for L. delbrueckii. Five bacterial isolates formed a monophyletic cluster with L. helveticus DSM 20075T. The branching was supported by a 100% bootstrap value. The discrimination among L. delbrueckii subspecies was poorly supported, as expected from the high similarity in 16S rRNA gene sequence within the species29. Five strains were grouped within the L. delbrueckii clade (100% bootstrap value), nearer to L. delbrueckii subsp. lactis than to L. delbrueckii subsp. bulgaricus. No isolates belonging to L. fermentum were found.

Figure 2

Characterization of the cultivable microbial fraction of NWS. (A) Evolutionary relationships of thirteen NWS microbial isolates (in bold) inferred using the Neighbor-Joining method. The percentages of replicate trees (values higher than 60%) in which the associated taxa clustered together in the bootstrap test (1000 replicates) are shown as circles next to the branches. The evolutionary distances were computed using the Kimura 2-parameter method; rate variation among sites was modeled with a gamma distribution (shape parameter = 1). The analysis involved 27 nucleotide sequences, and the E. coli 16S rRNA gene partial sequence (NR_114042.1) was used as outgroup. (B) Pie chart depicting microbial species frequencies. (C) Species distribution in each NWS sample. Numbers on the columns represent biotypes scored by UPGMA analysis of (GTG)5 rep-PCR fingerprinting data. H and D agree with NWS types obtained through 16S rRNA metabarcoding analysis. (A) was visualized and edited with iTOL (available at https://itol.embl.de/), while (B,C) were plotted with GraphPad Prism v.8.00 software (San Diego, CA, USA, https://www.graphpad.com/).

To assess the extent of intra-species diversity, we genotyped the isolates using the micro-satellite region (GTG)5 in rep-PCR. After UPGMA clustering analysis, a total of 34 genotypes was found with a reproducibility cut-off of 88% (Supplementary Fig. S1). Of the 34 genotypes, 20 were singletons and accounted for 58.82% of inter-individual variability. We divided L. delbrueckii into 4 subclusters and 3 singletons, L. helveticus into 5 subclusters and 6 singletons, and St. thermophilus into 5 subclusters and 10 singletons. More than one biotype was present for every species within each sample. The Simpson (1-D) α-diversity index obtained for the total population of 34 genotypes was 0.68. Based on the overall results, the cultivable fraction was composed of St. thermophilus as the dominant species, followed by L. helveticus (Fig. 2B). Species distribution was variable across the samples, with L. helveticus accounting for 50% of isolates in only three samples (C6, C9, and C10) (Fig. 2C). Staphylococcus capitis was found only in C7 and appeared to be an occasional contaminant, possibly related to low milk quality.
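
As an illustration of how a Simpson (1-D) index is obtained from genotype frequencies, the sketch below uses the common form 1 − Σ pᵢ². This is an assumption: the text does not state which variant or software was used, and the counts in the example are hypothetical, not the rep-PCR data of this study.

```python
def simpson_diversity(counts):
    """Simpson diversity (1 - D), with D = sum of squared relative frequencies.

    `counts` holds the number of isolates assigned to each genotype.
    The value is 0 when a single genotype is present and approaches 1
    as genotypes become more numerous and more evenly represented.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one isolate is required")
    return 1.0 - sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    # Hypothetical example: 10 isolates distributed over 4 genotypes
    hypothetical_counts = [4, 3, 2, 1]
    print(f"1-D = {simpson_diversity(hypothetical_counts):.2f}")
```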

Bacterial community profiling

Problems in LAB cultivability prompted us to attempt 16S rRNA metabarcoding as an alternative approach for profiling the bacterial community of NWS samples. A total of 4,325,460 raw paired-end sequences were obtained from the 10 NWS samples considered in this study. After denoising, 2,162,730 paired-end reads were retained with an average of 40,210.52 reads per sample (range 23,726.0 to 66,119.0). A total of 55 ASVs having more than 4 reads were identified, of which 45 were further selected at 0.001% of read frequency to eliminate low-abundance ASVs. The rarefaction curves reached the plateau for all NWS samples, suggesting that complete coverage of the NWS bacterial community had been reached with the sequencing depth used (Supplementary Fig. S2).
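
The two-step ASV filtering described above can be sketched as follows. This is only an illustration under stated assumptions: the function name and the example table are hypothetical, "more than 4 reads" is interpreted as the total count of an ASV across samples, and the 0.001% threshold is interpreted as the share of all reads remaining after the first step. The actual pipeline may define these thresholds differently.

```python
def filter_asvs(table, min_reads=5, min_rel_freq=1e-5):
    """Return the sorted ids of ASVs that pass both abundance filters.

    table: dict mapping an ASV id to its list of per-sample read counts.
    Step 1 keeps ASVs with at least `min_reads` reads in total.
    Step 2 keeps ASVs whose share of the remaining reads is at least
    `min_rel_freq` (1e-5 corresponds to 0.001%).
    """
    totals = {asv: sum(counts) for asv, counts in table.items()}
    step1 = {asv: t for asv, t in totals.items() if t >= min_reads}
    remaining_reads = sum(step1.values())
    if remaining_reads == 0:
        return []
    return sorted(
        asv for asv, t in step1.items() if t / remaining_reads >= min_rel_freq
    )


if __name__ == "__main__":
    # Hypothetical feature table with two samples
    hypothetical_table = {
        "ASV_a": [400000, 300000],  # dominant ASV, kept
        "ASV_b": [3, 2],            # 5 reads, but below 0.001% of reads
        "ASV_c": [2, 1],            # fewer than 5 reads, dropped in step 1
        "ASV_d": [10, 0],           # 10 reads, just above 0.001% of reads
    }
    print(filter_asvs(hypothetical_table))
```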

The ASVs passing the quality control were aligned and subjected to taxonomic profiling. Based on Silva database annotation, 6 taxa were observed and most 16S rRNA gene sequences (95.7%) were classified at species level. Four species were identified, namely L. helveticus, L. delbrueckii, Streptococcus spp., and L. fermentum (Fig. 3A). Manual BLAST search revealed that the remaining two taxa were ascribed to L. delbrueckii and S. salivarius subsp. thermophilus.

Figure 3

Relative abundance and alpha diversity analysis of NWS samples. (A) Cumulative bar chart representing the relative abundance on the vertical axis and the NWS samples (treatments; n = 50) on horizontal axis. Only taxa contributing to more than 0.1% of the total abundance in at least one sample are shown. (B) Observed features, Shannon index, and Faith’s PD index. Plotted with GraphPad Prism v.8.00 software (San Diego, CA, USA, https://www.graphpad.com/).

To evaluate taxa abundance, alpha diversity metrics (richness, Shannon index, and Faith’s PD) were calculated for the replicates of each sample (n = 5), regarded as a community (Fig. 3B). The richness index was significantly affected by the dairy farm (H = 41.15, p < 0.001), with the highest bacterial richness in samples C2, C3, and C10, and the lowest richness in C1, C6, C8, and C9 (Fig. 3B). The Shannon index accounts for both richness and evenness, with higher values indicating more diverse and uniformly distributed ASVs. The site of production did not significantly influence the Shannon index (H = 15.15; p > 0.05). Sample C1 significantly differed from samples C3, C5, and C6 in Shannon index. Although C1 and C6 had similarly low richness, they differed in Shannon index, suggesting that the few observed features of C6 were more evenly distributed than those of C1 (Fig. 3B). Faith’s PD, which additionally incorporates the phylogenetic distance of the ASVs, was strongly affected by the cheese factory (H = 33.75; p < 0.001), with NWS C2, C6, and C9 showing the lowest phylogenetic diversity (Fig. 3B).
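
The distinction between richness and the Shannon index can be made concrete with a short sketch. The function names and the two example communities are hypothetical, and a base-2 logarithm is assumed here; the text does not specify the base used. The example shows how two communities with identical richness can differ markedly in Shannon index when their evenness differs.

```python
import math


def observed_richness(counts):
    """Number of features (ASVs) with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon index H = -sum(p_i * log2(p_i)) over features with p_i > 0."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("the community must contain at least one read")
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical communities with the same richness but different evenness
    even_community = [25, 25, 25, 25]
    skewed_community = [97, 1, 1, 1]
    for name, community in (("even", even_community), ("skewed", skewed_community)):
        print(
            f"{name}: richness = {observed_richness(community)}, "
            f"Shannon = {shannon_index(community):.3f}"
        )
```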

Beta diversity provides an overview of the similarities in the bacterial communities between dairy farms. Principal coordinate analysis (PCoA) plots of the Bray–Curtis and weighted UniFrac indexes showed two significant groups of samples, termed type-H (C2, C7, C9, and C10) and type-D (samples C1, C3, C4, C5, C6, and C8) (Fig. 4A,B, respectively). No significant differences were found among cheese factories in Jaccard and unweighted UniFrac distances (data not shown). Since the weighted UniFrac distance accounts for ASVs abundance and unweighted UniFrac distance for the presence/absence of ASVs, we supposed that the main source of differences among NWS samples was the individual microbial abundance rather than species composition. PERMANOVA analysis was applied to the distance dataset in order to find significant differences among the treatments. The test revealed that cheese factory strongly affects microbial population diversity among the samples (Bray–Curtis PERMANOVA: pseudo-F = 80.53; p-value = 0.001).
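
The reasoning above, that abundance-weighted distances separate the samples while presence/absence distances do not, can be illustrated with the non-phylogenetic pair of metrics, Bray–Curtis and Jaccard. The sketch below is a simplified illustration with hypothetical abundance vectors; it is not the pipeline used in this study and it does not implement UniFrac, which additionally requires a phylogenetic tree.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("both communities are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - |shared| / |union|."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        raise ValueError("both communities are empty")
    return 1.0 - len(present_u & present_v) / len(union)


if __name__ == "__main__":
    # Hypothetical percentages for the same four taxa in two communities
    community_1 = [61, 25, 13, 1]
    community_2 = [20, 45, 34, 1]
    print(f"Bray-Curtis: {bray_curtis(community_1, community_2):.2f}")
    print(f"Jaccard:     {jaccard_distance(community_1, community_2):.2f}")
```

In this hypothetical case the two communities contain exactly the same taxa, so the Jaccard distance is zero, while the Bray–Curtis dissimilarity is substantial because the proportions differ.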

Figure 4

Beta diversity plots of NWS samples based on Principal coordinates analysis (PCoA) of (A) Bray–Curtis and (B) weighted UniFrac distances. Points represent samples, while colors indicate dairy farms. The percentages reported on the axes represent the amount of total variance explained by each axis. (A,B) were plotted with R v.4.1.1, while the ellipses were calculated and drawn at a 0.95 confidence level using ggplot2 v.3.3.5 (stat_ellipse).

Bacterial signature detection

Considering only the ASVs whose relative abundance in each sample was greater than 0.001%, Firmicutes was the dominant phylum, accounting for more than 99% of the bacterial microbiota in all the samples. Lactobacillus helveticus and L. delbrueckii were the dominant taxa, followed by Streptococcus spp. and L. fermentum. We also observed that L. delbrueckii and Streptococcus spp. represented 79.7% of the total average relative abundance in NWS type-D, while L. helveticus had an average relative abundance of 60.9% in NWS type-H. L. fermentum was detected only in NWS type-H at low relative abundance (0.24%) (Fig. 5A).

Figure 5

(A) Species-level composition of NWS type-D and type-H. The average relative abundances of each bacterial taxon and the clusters are reported on the vertical and horizontal axes, respectively. (B) ANCOM differential abundance volcano plot. (C) sPLS-DA of the NWS samples (n = 50) showing discrimination between samples H (blue circles) and D (orange triangles). (D) Loading plot showing the discriminant power of species in explaining differences between groups on component PLS1. The direction of the bars (left or right) relates to the direction of the loadings in panel A. The higher the absolute value, the greater the discriminative power. Orange and blue bars indicate a higher abundance in the D or H group, respectively. (A) was plotted with GraphPad Prism v.8.00 software (San Diego, CA, USA, https://www.graphpad.com/), while (B–D) were built using mixOmics R package v.6.16.0 (https://CRAN.R-project.org/package=mixOmics).

To determine significant compositional differences across groups, we used the statistical framework ANCOM30, which infers absolute abundance from relative abundance data and can detect which taxa are differently abundant across groups. ANCOM analysis identified Streptococcus spp. (w = 5) and L. delbrueckii (w = 5) as different between groups. These species were more abundant in cluster D than in cluster H (Fig. 5B).

To detect specific ASVs which contribute to separate NWS samples into type-D and type-H, sPLS-DA was performed. sPLS-DA is a multivariate method performed on the clr-transformed microbiome data to identify microbial drivers or biomarkers discriminating samples groups. sPLS-DA plot showed that PLS1 and PLS2 explain 39 and 22% of the variation in NWS microbiota composition, respectively, and effectively separate cluster H from cluster D (Fig. 5C). To further identify the specific taxa that were predominant as the biomarkers between the groups, the contribution to PLS1 was calculated. Streptococcus spp. was the most significant taxon in NWS type-D followed by L. delbrueckii, while L. helveticus characterized NWS type-H (Fig. 5D).
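
The centered log-ratio (clr) transformation applied before sPLS-DA can be sketched as below. This is a minimal illustration, not the mixOmics implementation: the function name is hypothetical, and adding a pseudocount of 1 to handle zero counts is an assumption that the text does not specify.

```python
import math


def clr_transform(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's taxon counts.

    Each value becomes log(x_i + pseudocount) minus the mean of those logs,
    i.e. the log-ratio of the taxon to the geometric mean of the sample.
    The transformed values of a sample always sum to zero.
    """
    if not counts:
        raise ValueError("the sample must contain at least one taxon")
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


if __name__ == "__main__":
    # Hypothetical read counts for four taxa in a single sample
    hypothetical_sample = [12000, 8000, 450, 0]
    print([round(v, 3) for v in clr_transform(hypothetical_sample)])
```

Because clr values are expressed relative to the geometric mean of each sample, they are not constrained to sum to a constant in the way raw proportions are, which is the usual motivation for applying the transformation before multivariate methods.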

Differential analysis of predicted functional content

To provide a preliminary functional insight into the taxonomic profiles observed in our study, we used the PICRUSt2 software31 and performed the differential abundance analysis with three independent tools, namely ALDEx2 (v.1.26.0)32, ANCOM-BC (v.1.4.0)33, and the MaAsLin2 package34. The PICRUSt2 analysis predicted the presence of 1345 functional gene orthologs, 589 enzymes, and 118 metabolic pathways across all NWS samples. Fifty-three metabolic pathways reported in the MetaCyc database were significant according to the ALDEx2, ANCOM-BC, and MaAsLin2 analyses. The heat map of the top 15 pathways showed that NWS type-H and type-D differed from each other in some metabolic functions (Fig. 6). In detail, samples were assigned to 4 clusters based on the composition of predicted metabolic pathways. One cluster included almost all type-H samples, except for C9 and one replicate of sample C10. Four pathways were enriched in cluster H; among them, we found the galactose degradation I pathway and the N10-formyl-tetrahydrofolate biosynthesis pathway, involved in purine biosynthesis. NWS type-D appeared enriched in pathways related to aromatic and branched chain amino acid (BCAA) biosynthesis (Fig. 6).

Figure 6

Functional heatmap. Hierarchical clustering heatmap visualized with pheatmap R package v.1.0.12 shows samples in columns and the 15 most characterizing MetaCyc pathways in rows. The PICRUSt2-predicted abundance levels are represented by the background color, where blue means low and red means high abundance. The main experimental conditions are shown in green and violet, representing NWS type-D and NWS type-H, respectively.

Spearman correlation analysis

Spearman correlation analysis assessed the relationships between taxa, plate counts, and physicochemical/technological outcomes (Fig. 7). As expected, lactic acid concentration showed a strong positive correlation with titratable acidity (r = 1.0; p = 5.511 × 10⁻⁷), in agreement with the high amount of this acid detected in NWS. Lactobacilli counts were positively correlated with lactate concentration and titratable acidity (r = 0.721; p = 0.023) and negatively correlated with the abundance of Streptococcus spp. (r = − 0.661; p = 0.044). This negative relationship agreed with the previous sPLS-DA analysis. As expected, the abundance of L. delbrueckii was negatively correlated with the abundance of L. helveticus (r = − 0.981; p = 0.044), whilst the relative abundance of Streptococcus spp. was negatively correlated with the abundance of L. helveticus (r = − 0.721; p = 0.023) and positively correlated with counts of the presumptive streptococci population (r = 0.709; p = 0.027) (Fig. 7). This last result further confirms that M17-SSW medium under aerobiosis represents the appropriate cultivation condition for detecting St. thermophilus, as previously reported28.
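
For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values receiving their average rank. The sketch below is a self-contained illustration with hypothetical values; it does not reproduce the coefficients reported above and omits the p-value calculation.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("two sequences of equal length (>= 2) are required")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical values: relative abundance (%) versus plate counts (Log10 CFU/mL)
    abundance = [5.0, 12.0, 20.0, 31.0, 44.0]
    plate_counts = [6.7, 7.1, 7.0, 7.9, 8.3]
    print(f"rho = {spearman_rho(abundance, plate_counts):.3f}")
```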

Figure 7

Spearman’s correlation analysis of the relative abundance of 4 selected ASVs with plate counts, physicochemical parameters, and organic acid concentrations. The blue-to-red scale denotes positive to negative associations. Spearman’s correlations were employed in agreement with data distribution and verified by Kruskal–Wallis test. *p < 0.05, **p < 0.01, ***p < 0.0001 following the Spearman’s correlations. Plotted with GraphPad Prism v.8.00 software (San Diego, CA, USA, https://www.graphpad.com/).

Remarkably, the mesophilic yeast population was positively related to the abundance of L. delbrueckii (r = 0.661; p = 0.044) and negatively related to that of L. helveticus (r = − 0.685; p = 0.035). This result suggests that NWS type-D could be more prone to yeast contamination than NWS type-H. Furthermore, mesophilic yeasts were positively correlated with citrate concentration (r = 0.709; p = 0.031) (Fig. 7). Yeasts inhabiting NWS are generally Krebs-positive species able to use both lactate and Krebs cycle intermediates as carbon and energy sources35. Although not statistically significant (r = 0.697; p = 0.067), there was a positive correlation between the abundance of L. fermentum and acetic acid concentration, in agreement with the heterofermentative catabolism of this species22.

Acidification assays in milk

Difficulty in cultivating axenic cultures and the co-occurrence of St. thermophilus and L. delbrueckii in NWS type-D suggested that mutualistic interactions could take place between NWS species and that these bacteria can grow better together than alone. To test this hypothesis, we assessed the milk acidification performance of three randomly selected tester strains representative of the species L. delbrueckii, L. helveticus, and St. thermophilus. All tester strains exhibited slow acidification curves as monocultures, with L. delbrueckii CBB09 showing the slowest acidification trend (p < 0.05) (Fig. 8). When St. thermophilus was cocultured with L. delbrueckii, the acidification rate increased significantly (p < 0.05). Tricultures of St. thermophilus RBC06, L. delbrueckii CBB09, and L. helveticus RBB04 resulted in a faster acidification trend compared to both monocultures and cocultures (Fig. 8). Notably, the NWS sample outperformed both tricultures and cocultures in pH decrease.

Figure 8

Acidification curves of St. thermophilus, L. helveticus, and L. delbrueckii subsp. lactis axenic cultures isolated from NWS. Tester strains were inoculated in milk as monocultures (grey), coculture (St. thermophilus RBC06 × L. delbrueckii subsp. lactis CBB09) (orange), and triculture (St. thermophilus RBC06 × L. delbrueckii subsp. lactis CBB09 × L. helveticus RBB04) (green). Milk inoculated with fresh NWS (light grey) and uninoculated milk (black) were used as positive and negative controls, respectively. Values are means of at least three replicates. Bars, when visible, represent standard deviation values. Plotted with GraphPad Prism v.8.00 software (San Diego, CA, USA, https://www.graphpad.com/).

Discussion

Undefined cultures are complex bacterial communities in which there are usually multiple strains per species and only a few dominant species36. Here, we proposed the existence of two main NWS community types in PR cheesemaking, named NWS type-H and NWS type-D. Notably, the distinctive feature that characterizes NWS type-H and NWS type-D is the dominance of the NWS microbiota by L. helveticus and L. delbrueckii/St. thermophilus, respectively. Intriguingly, co-occurrence of L. delbrueckii and St. thermophilus in NWS type-D has also been documented in a Swiss hard cheese starter, where these species form a stable community and engage in mutualistic interactions through metabolite exchange37. As in yogurt cultures, they metabolically complement each other: L. delbrueckii is a protease-competent species which provides amino acids to the poorly proteolytic species St. thermophilus38, while St. thermophilus possesses metabolic pathways for folate, lactate, and formate production39. In our case, the higher relative abundance of St. thermophilus in samples dominated by L. delbrueckii suggests that similar cross-feeding relationships could exist in PR NWS too. The observed reduced cultivability of axenic cultures obtained from PR NWS samples supported this hypothesis. Application of metagenomic approaches and reconstruction of metagenome-assembled genomes (MAGs) from PR NWS will better elucidate this metabolic complementation.

Unlike in the Swiss hard cheese starter37, in PR NWS communities L. helveticus plays a key role in addition to L. delbrueckii and St. thermophilus. Coexistence of these species makes PR NWS very similar to the natural starter used for Swiss Gruyère-type cheese40. Our results showed a differential presence of L. helveticus in NWS type-H compared to NWS type-D. This species was inversely related to St. thermophilus, but no samples were found with L. helveticus as the only species present. L. helveticus is generally considered beneficial in dairy fermentation, as this bacterium overcomes its multiple amino acid auxotrophies through a large set of proteases and peptidases which break down proteins and peptides into amino acids41. The proteolytic system of L. helveticus has been associated with important dairy traits, such as fast cheese ripening, enhanced flavor development, and reduced bitterness42. However, the lack of PR NWS samples with L. helveticus as the only dominant species could be positive, as overdominance of L. helveticus reduces rheological properties and cheese elasticity due to excessive proteolysis and carbon dioxide production40,43. The correct balance between the LAB species in dairy ecosystems supports the view that inter-species interactions are responsible for both the development of the expected organoleptic complexity and texture, and the resilience toward stress conditions and colonization by spoilage microbes8. Compared to previous studies12, no L. fermentum was detected except for sample C10. In Trentingrana cheese NWS, more species were detected in addition to L. helveticus, including Lacticaseibacillus paracasei, Lactiplantibacillus plantarum, and L. fermentum44.

Among spoilage microbes, yeasts can utilize lactose and lactate, reducing NWS acidifying performance. Among yeasts isolated from PR NWS, only Kluyveromyces marxianus breaks down lactose, whereas all the other species, such as Torulaspora delbrueckii, Wickerhamiella pararugosa, and Saccharomyces cerevisiae, are galactose-fermenting yeasts26. The inverse relationship between yeast contamination and L. helveticus could depend on the ability of L. helveticus to ferment both glucose and galactose, leaving no sugar moieties available for yeast respiration. In contrast, L. delbrueckii is generally described as a Gal- species45, while in the metadatabase BacDive (http://bacdive.dsmz.de) St. thermophilus has been reported as variable in galactose fermentation. A lower ability to consume galactose in PR NWS type-D could explain the higher yeast contamination observed in these samples. In an attempt to provide a preliminary functional insight into the taxonomic profiles observed in our study, we used PICRUSt2 software and three different algorithms for the differential abundance analysis of predicted pathways. The results suggest that bacterial pathways involved in purine biosynthesis and galactose catabolism were significantly represented in NWS type-H, while branched chain and aromatic amino acid biosynthetic pathways were overrepresented in NWS type-D. Tryptophan and BCAA are important precursors of flavor compounds46, suggesting that NWS type-D could differentially impact the quality of PR cheese compared to NWS type-H, both in terms of residual amount of galactose and flavor compound precursors. In accordance with our data, Santarelli et al.47 divided PR cheeses during the moulding phase into two groups, one characterized by a high lactate level and dominated by L. helveticus and the other with a low lactate level and a high level of St. thermophilus.

Remarkably, uncultivability of individual single-colony strains did not allow us to collect enough isolates to depict a complete picture of the intra-species diversity in PR NWS samples. Poor cultivability also resulted in little consistency between culture-dependent data and the 16S rRNA metabarcoding profiles. There could be several reasons for this uncultivability, from phage infection to the lack of the metabolic complementation that occurs in the undefined starter community. Although partial and preliminary, genotyping data collected from the surviving single colonies showed a high degree of genetic heterogeneity at strain level rather than at species level, with more than one biotype per species within each sample and different biotypes among samples. Similar results have been reported in other undefined starter communities6,37,48. Intraspecies diversity is thought to be responsible for resilience against environmental uncertainty49 and is linked to functionally adaptive traits encoded by genomic islands and mobile genetic elements50. Bacteriophages could have an additional role in regulating population diversity through density-dependent predation51. The ‘Kill-the-winner’ model predicts that phage predation ensures diversity by suppression of the more abundant strains52. Unfortunately, the 16S rRNA metabarcoding approach adopted in this study neither characterized phages nor determined CRISPR array variability in L. helveticus, St. thermophilus, and L. delbrueckii MAGs. However, undefined bacterial communities similar to PR NWS, like Trentingrana NWS, were shown to be contaminated by lysogenic phages44. Somerville et al.37 also demonstrated that genomic differences among isolates from Swiss hard cheese starter are mainly due to CRISPR spacers. All these findings support the view that phage-bacteria interactions shape bacterial diversity in undefined NWS.

Our growth experiments showed that monocultures of St. thermophilus, L. delbrueckii subsp. lactis, and L. helveticus isolated from PR NWS have poor acidifying ability, with L. delbrueckii subsp. lactis exhibiting the worst performance, followed by St. thermophilus. This agrees with previous observations in yogurt, where monocultures of St. thermophilus or L. delbrueckii subsp. bulgaricus grow slowly when milk is not supplemented with amino acids and formate, respectively39. In our experiment L. helveticus also grew slowly in milk. This disagrees with the general assumption that L. helveticus is a fast-acidifying species42. L. helveticus strains with a slow milk-coagulating phenotype have been previously isolated from undefined starters53 and Mongolian fermented milk54. This phenotype is linked to the loss of plasmids harboring prt genes53, loss of aminopeptidase-encoding genes55, and deficiency in purine biosynthesis56. Gene decay and genome reduction have been frequently documented in microbes inhabiting nutritionally rich environments57. According to the Black Queen hypothesis58, protease-negative strains can invade the population and dominate over the protease-positive strains59. The progressive increase in milk acidification rate of co-cultures and tri-cultures suggests that L. helveticus could also benefit from cross-feeding relationships with St. thermophilus and L. delbrueckii subsp. lactis. These results agree with the previous observation that individual strains of NWS thermophilic lactobacilli grown in whey can benefit from the presence of cell-free supernatant fluids from whey cultures of other strains60. Larger-scale milk acidification tests with a greater number of strains per species will be needed to confirm these preliminary observations and to decipher the mechanisms underpinning these metabolic interactions.

In conclusion, the present study demonstrated that PR NWS are bacterial communities that can be clustered into two types based on differences in species abundance and that they are shaped by complex relationships at the inter- and intra-species level. Further studies based on metagenomic high-throughput sequencing (HTS) of a broader assortment of NWS samples will be of interest to confirm whether metabolite complementation and phage infection are the driving forces of these interactions. The functional and genomic characterization of PR NWS strains isolated in this study will also shed light on the intra-species distribution of relevant ecological and technological phenotypes, such as proteolytic ability, galactose fermentation, and folate and purine biosynthesis.

Materials and methods

Reference strains and culture conditions

Type strains and tester strains used in this study were St. thermophilus DSM 2061T and RBC06; L. helveticus DSM 20075T and LBB04; L. delbrueckii subsp. lactis DSM 20072T and CBB09; L. delbrueckii subsp. delbrueckii DSM 20074T; and L. fermentum DSM 20052T. All the type strains were from the DSMZ collection and were cultivated according to DSMZ growth conditions. Tester strains St. thermophilus RBC06, L. helveticus LBB04, and L. delbrueckii subsp. lactis CBB09 were previously isolated from PR NWS and their 16S rRNA genes sequenced (GenBank accession numbers OM891849, ON936798, and ON936797, respectively).

Sampling and physicochemical analyses

Ten NWS samples were collected in the PDO area of PR cheese. All the samples were collected in October 2021 to minimize seasonal variations. Geographical localization and details on the dairy farm management system are reported in Supplementary Table S3, while the temperature curves used for NWS production are reported in Table S4. When necessary, samples were collected from multiple fermentation units in each dairy farm and then mixed. Samples were immediately stored at 4 °C and transported to the laboratory for further analyses. pH was measured directly, without dilution, using a pH meter (Crison Instruments, Barcelona, Spain). Titratable acidity was determined in 50 mL of NWS using the Soxhlet–Henkel method with 0.25 N NaOH and recorded in Soxhlet–Henkel degrees (°SH/50 mL)26. The acidification rate of NWS samples was assessed as previously reported61 and expressed as Δ°SH/50 mL. Lactic, succinic, acetic, and citric acids were enzymatically determined according to the manufacturer’s instructions (Megazyme, Wicklow, Ireland).

Microbial enumeration and viable fraction characterization

Microbiological analyses were carried out by ten-fold diluting the samples in physiological water (9 g/L NaCl) and spreading them on: de Man-Rogosa-Sharpe (MRS; Oxoid, Milan, Italy) medium (brought to pH 5.4 with 1 N HCl) for the enumeration of presumptive lactobacilli; M17 medium62 supplemented with SSW (Morga AG, Ebnat-Kappel, Switzerland) (M17-SSW) at the final concentration of 7% v/v for the enumeration of presumptive cocci LAB; YPDA medium for generalist yeast enumeration; and YPLA for the enumeration of lactose-fermenting yeasts. The SSW was prepared according to Fornasari et al.28. MRS and M17 media were supplemented with cycloheximide at the final concentration of 20 mg/L, while YPDA and YPLA media were supplemented with chloramphenicol at the final concentration of 10 mg/L. MRS plates were incubated at 42 °C for 48 h under anaerobiosis (Oxoid, Milan, Italy); M17-SSW plates at 42 °C for 72 h under both aerobic and anaerobic conditions; YPDA and YPLA plates at 28 °C and 42 °C for 48 h, respectively. For anaerobic conditions, the Oxoid AnaeroGen system (Thermo Fisher Scientific, Waltham, MA, USA) was used. Viable cell counts were recorded as the number of colony forming units (CFU)/mL recovered from plates with 20 to 200 colonies and expressed as Log10 CFU/mL means of at least three replicates.
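The plate-count arithmetic described above can be illustrated with a minimal Python sketch. The plated volume (0.1 mL), the function name, and the choice to average the counts before taking the logarithm are assumptions made here for illustration only; they are not stated in the text.

```python
import math

def log10_cfu_per_ml(colony_counts, dilution_factor, plated_volume_ml=0.1):
    """Return log10 CFU/mL from replicate plates of a single dilution.

    colony_counts: colonies counted on each replicate plate
    dilution_factor: e.g. 1e-5 for the fifth ten-fold dilution
    plated_volume_ml: volume spread per plate (0.1 mL is an assumption)
    """
    # Keep only plates within the countable range used in the text (20-200).
    countable = [c for c in colony_counts if 20 <= c <= 200]
    if not countable:
        raise ValueError("no plate falls within the 20-200 colony range")
    mean_count = sum(countable) / len(countable)
    cfu_per_ml = mean_count / (dilution_factor * plated_volume_ml)
    return math.log10(cfu_per_ml)

# Three hypothetical replicate plates of the 10^-5 dilution.
print(round(log10_cfu_per_ml([85, 92, 78], 1e-5), 2))
```

Plates outside the countable range are simply ignored in this sketch; a laboratory protocol may instead move to the next dilution.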

Bacterial colonies were submitted to at least two rounds of streaking and then characterized by micromorphology, Gram staining, and catalase test. All the clones collected in this work were conserved ex situ at −80 °C in liquid medium supplemented with 25% (v/v) glycerol. Bacterial cultures were submitted to DNA extraction as previously reported63 and preliminarily distinguished into clusters by Amplified 16S Ribosomal DNA Restriction Analysis (16S-ARDRA) using the diagnostic endonuclease MseI (Thermo Scientific, Waltham, MA, USA). This endonuclease was selected based on an in silico analysis of the 16S rRNA gene sequences of the most frequently recognized LAB species in NWS, carried out with SnapGene software (www.snapgene.com). Briefly, the 16S rRNA gene was PCR amplified from wild isolates and reference strains using the primers 27f (5′-CTGGGATCCATTTACTCGAGAGTTTGATCCTGGCTCAG-3′) and 1490r (5′-GGTTCCCCTAAGCTTACCTTGTTACGACTTC-3′)64 under the following conditions: 5 min at 95 °C; 30 cycles of 1 min at 95 °C, 2.5 min at 58 °C, and 2 min at 72 °C; and a final extension of 5 min at 72 °C. PCR amplicons were digested with the MseI restriction enzyme for 3 h according to the manufacturer’s instructions, and the resulting DNA restriction fragments were separated by electrophoresis in a 2% (w/v) agarose gel with ethidium bromide (0.5 mg/mL) in 0.5X Tris–borate-EDTA (pH 8.0) buffer at 90 V for 90 min and visualized under a UV source. Each gel was documented with a GelDoc apparatus (Biometra, Göttingen, Germany). When required, 16S rDNA amplicons were also digested with EcoRI27. For all the investigated bacterial strains (reference and wild), restriction fragment sizes were measured (in bp) by comparison with the GeneRuler 100 bp Plus DNA Ladder (Thermo Scientific, Waltham, MA, USA).
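The in silico step behind the choice of a diagnostic endonuclease can be sketched as follows. This is not the SnapGene workflow used in the study; it is a minimal Python illustration that cuts a linear sequence at every MseI site (recognition sequence TTAA, cleaved after the first T) and returns the expected fragment sizes. The function name and the toy sequence are hypothetical.

```python
def digest_fragment_sizes(sequence, site="TTAA", cut_offset=1):
    """Return fragment lengths (bp) after cutting a linear sequence at every site.

    MseI recognizes TTAA and cuts after the first T (T^TAA), hence cut_offset=1.
    """
    seq = sequence.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    # Fragment boundaries are the sequence ends plus every cut position.
    boundaries = [0] + cut_positions + [len(seq)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]

# Toy 24-bp sequence with two MseI sites (not a real 16S rRNA gene).
toy_amplicon = "GGCC" + "TTAA" + "ACGTACGTAC" + "TTAA" + "GG"
print(sorted(digest_fragment_sizes(toy_amplicon), reverse=True))  # [14, 5, 5]
```

Comparing such predicted fragment lists across reference species indicates whether an enzyme yields distinguishable ARDRA profiles; fragments of similar size would not be resolved on a 2% agarose gel.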

At least one amplicon for each 16S-ARDRA profile was Sanger sequenced with both amplification primers by BMR Genomics (Padua, Italy). Trimmed sequences were annotated by MegaBLAST search against the NCBI RefSeq database using a minimum identity cutoff of 98%. Phylogenetic placement of query and reference sequences was conducted in MEGAX65 (Biomatters, Auckland, NZ) using the neighbor-joining method66. The rate variation among sites was modeled with a gamma distribution (shape parameter = 1)67. The branches of the inferred unrooted tree were assessed using bootstrap analysis with 1000 replicates68. The sequences obtained from bacterial cultures were deposited in the GenBank NCBI database with the accession numbers ON755074-ON755086. Genotyping of isolates was carried out by (GTG)5 repetitive bacterial DNA elements PCR (rep-PCR) as previously reported63. All trees were visualized using Interactive Tree of Life (iTOL)69.

Total microbial DNA extraction and quantification

Total bacterial DNA was extracted with the DNeasy PowerSoil DNA kit (Qiagen, Valencia, CA, USA). Briefly, NWS samples were thawed on ice, divided into five aliquots of 1 mL each, and centrifuged at 16,000×g for 5 min. Cells were resuspended in 800 μL of CD1 buffer, mixed with 200 μL of zirconium beads (0.1 mm), and incubated at 65 °C for 10 min. After shaking with a TissueLyser (Qiagen) at 25 Hz for 10 min, 550 μL of lysate was mixed with 250 μL of CD2 buffer and DNA was extracted according to the manufacturer’s instructions. All DNA samples were diluted to 120 μL. DNA integrity was checked by electrophoresis in a 1.5% (w/v) agarose gel, while quantity was verified by absorbance measurements at 260 and 280 nm using a NanoDrop 2000 (Thermo Fisher Scientific, Waltham, Massachusetts, USA). DNA was stored at −80 °C before further processing.
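For readers unfamiliar with absorbance-based quantification, the following Python sketch shows the standard conversion for double-stranded DNA, where an A260 of 1.0 at a 1 cm path length corresponds to about 50 ng/μL, together with the A260/A280 purity ratio. The function names and the example readings are hypothetical, and the instrument software normally performs this calculation itself.

```python
def dsdna_concentration_ng_per_ul(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Estimate dsDNA concentration from absorbance at 260 nm.

    Uses the conventional factor of 50 ng/uL per absorbance unit for dsDNA.
    """
    return a260 * 50.0 * dilution_factor / path_length_cm

def purity_ratio(a260, a280):
    """A260/A280 ratio; values around 1.8 are usually taken as clean DNA."""
    return a260 / a280

# Hypothetical readings for one extract.
print(dsdna_concentration_ng_per_ul(0.9))
print(round(purity_ratio(0.9, 0.5), 2))
```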

Illumina sequencing and data processing

Total DNA samples were used as a template for the preparation of 16S rRNA gene amplicon libraries following the standard Illumina library preparation procedure70. Briefly, the V3-V4 hypervariable regions of the 16S rRNA gene were initially amplified using a universal primer pair, 341F (5′-CCTACGGGNBGCASCAG-3′) and 805R (5′-GACTACNVGGGTATCTAATCC-3′), resulting in an amplicon length of ~464 bp71. A second PCR was then carried out to attach dual Illumina barcode indices and adapters using the Nextera XT library preparation kit (Illumina, San Diego, CA). PCR products were purified and normalized using a SequalPrep Normalization Plate and sequenced by 2 × 300 bp paired-end sequencing on an Illumina MiSeq V3 platform (Illumina, San Diego, CA, USA) at BMR Genomics (Padua, Italy).

Microbial sequences were processed in the Quantitative Insights into Microbial Ecology 2 (QIIME 2) bioinformatics platform, version qiime2-202172. Primers were removed using the q2-cutadapt plugin. Paired-end sequences were then subjected to quality control, including denoising, merging, and chimera removal, using the DADA2 plugin73 implemented in QIIME 2 (dada2 denoise-paired with the following parameters: trunc_len_f:260, trunc_len_r:245). The resulting table of amplicon sequence variants (ASVs)74 was subsequently filtered at 0.001% to remove singletons and very rare ASVs. Taxonomic classification of ASVs was carried out with the q2-feature-classifier plugin75 using a classifier trained on 99% OTUs from the Silva database version 13876. The taxonomic assignment of poorly classified ASVs was manually verified using NCBI blastn77. Samples were rarefied to 23,726 sequences by random subsampling in QIIME 2 before downstream alpha and beta diversity analyses. No samples were excluded by the rarefaction step.
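The rarefaction step can be illustrated with a short, self-contained Python sketch. It is not the QIIME 2 implementation; it only shows the idea of randomly subsampling each sample's reads to a common depth and dropping samples that fall below it. Sampling without replacement, the function name, and the toy counts are assumptions made for illustration.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample one sample's ASV counts to a fixed depth without replacement.

    counts: dict mapping ASV identifier -> read count
    Returns a new dict of counts, or None if the sample has fewer reads
    than the requested depth (such a sample would be excluded).
    """
    if sum(counts.values()) < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` reads.
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for asv in picked:
        rarefied[asv] = rarefied.get(asv, 0) + 1
    return rarefied

# Toy sample with hypothetical ASV labels, rarefied to 40 reads.
sample = {"asv1": 50, "asv2": 30, "asv3": 20}
print(rarefy(sample, 40))
```

Choosing the depth of the shallowest sample, as with the 23,726 sequences used here, means that no sample is lost at this step.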

Ecological and statistical analysis

Statistical computing and graphical generation were performed using the R programming environment (r-project.org) unless otherwise indicated, while alpha and beta diversity analyses were done using the tools of the QIIME 2 diversity plugin. To evaluate alpha diversity, for each sample we computed several indices, including observed features, Shannon, evenness, and Faith’s phylogenetic diversity (Faith's PD). The non-parametric Kruskal–Wallis test was used to test pairwise differences (https://docs.qiime2.org/2022.2/plugins/available/diversity/alpha-group-significance/). The distance matrices of the main beta diversity metrics (Bray–Curtis, Jaccard, weighted UniFrac, and unweighted UniFrac) and their corresponding principal coordinate analyses (PCoA) were also calculated to investigate the dissimilarity in bacterial communities between dairy farms. Statistical analyses of beta diversity were conducted using the PERMANOVA test with 999 permutations (beta-group-significance). Beta diversity PCoA were projected onto 2D ordination plots using ggplot2, and ellipses were drawn around the 95% confidence interval of each group, assuming a multivariate normal distribution, with the stat_ellipse function included in the same package. Differential abundance of taxa was tested at different levels (family, genus, species, and ASV) using the ANCOM30 plugin implemented in QIIME 2. To avoid errors during the data normalization step, pseudocounts were added to the ASV table using “qiime composition add-pseudocount” before running ANCOM.
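As an illustration of the non-phylogenetic metrics named above, the following Python sketch computes the Shannon index, Pielou's evenness, and the Bray–Curtis dissimilarity from raw count vectors. It is a didactic sketch rather than the QIIME 2 code; the base-2 logarithm and the value returned for single-feature samples are assumptions made here.

```python
import math

def shannon(counts, base=2):
    """Shannon diversity index of a vector of feature counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)

def pielou_evenness(counts, base=2):
    """Shannon index divided by its maximum for the observed richness."""
    observed = sum(1 for c in counts if c > 0)
    if observed < 2:
        return 0.0  # convention chosen here; evenness is undefined for one feature
    return shannon(counts, base) / math.log(observed, base)

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples over the same features."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator

# Two hypothetical samples over the same three features.
sample_a = [120, 30, 0]
sample_b = [40, 90, 20]
print(round(shannon(sample_a), 3), round(pielou_evenness(sample_a), 3))
print(round(bray_curtis(sample_a, sample_b), 3))
```

A Bray–Curtis value of 0 means identical count profiles and 1 means no shared features, which is the scale on which the PCoA and PERMANOVA described above operate.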

We then applied Partial Least Squares Discriminant Analysis (PLS-DA) with mixOmics78 to discriminate the samples according to the predefined types H and D. Finally, sparse PLS-DA (sPLS-DA) was used to select the most discriminative features able to separate the two main groups.

Parametric statistics (one-way ANOVA and related post-hoc tests) on chemical data and Spearman correlation coefficient analysis with a two-sided 95% confidence interval were computed using GraphPad Prism 8 software (San Diego, CA, USA).

The overall level of significance was set at p < 0.05.

Functional analysis

Functional abundances were inferred from the frequency-filtered ASV table using the PICRUSt2 pipeline (v2.4.1)31. Subsequently, three tools available as Bioconductor R packages79 (ALDEx2 1.26.032, ANCOM-BC 1.4.033, and MaAsLin2 1.8.034) were applied to perform the differential analysis of PICRUSt2-predicted pathway relative abundances and to identify the most characteristic pathways in the two main groups. In all methods a minimum prevalence cutoff of 0.1 was set as an analysis parameter. Moreover, since ALDEx2 uses a centered log-ratio (clr) transformation and PICRUSt2 sometimes outputs non-integer values, it was necessary to round the values (https://forum.qiime2.org/t/aldex2-error-when-processing-picrust2-output/14104). Significant ALDEx2 results were filtered using a q-value cutoff of 0.1 on the expected Benjamini-Hochberg (BH) corrected P value of the Wilcoxon test. A q-value cutoff of 0.1 on the adjusted p-value was also applied to retain only the most significant results from the ANCOM-BC and MaAsLin2 outputs. A consensus method was finally applied to mark as significant only those pathways that were selected by all three methods. The top 15 most significant pathways were visualized using the pheatmap function of the pheatmap R package v.1.0.1280.
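The filtering and consensus logic can be sketched in Python as follows. This is a simplified illustration: it applies a plain Benjamini-Hochberg correction to a list of p-values and intersects the sets of pathways passing the 0.1 cutoff, whereas the actual analysis relied on the adjusted values reported by the three R packages. The function names, pathway labels, and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant(pathways, p_values, cutoff=0.1):
    """Pathways whose BH-adjusted p-value is below the cutoff."""
    q_values = benjamini_hochberg(p_values)
    return {name for name, q in zip(pathways, q_values) if q < cutoff}

def consensus(significant_sets):
    """Keep only pathways flagged by every method (needs at least one set)."""
    return set.intersection(*significant_sets)

# Hypothetical pathway labels and per-method p-values.
pathways = ["PWY-A", "PWY-B", "PWY-C", "PWY-D"]
method_1 = significant(pathways, [0.01, 0.04, 0.03, 0.50])
method_2 = significant(pathways, [0.02, 0.30, 0.01, 0.60])
method_3 = significant(pathways, [0.03, 0.02, 0.04, 0.70])
print(sorted(consensus([method_1, method_2, method_3])))
```

Requiring agreement among all methods is conservative: it trades sensitivity for a lower risk of reporting pathways that only one tool's assumptions would flag.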

Milk acidification assays

Every strain was precultured in 35 mL of the corresponding growth medium until the late exponential phase at 42 °C, under anaerobiosis for L. delbrueckii and L. helveticus and under aerobiosis for St. thermophilus tester strains. After assessment of optical density at 600 nm, cells were washed with saline water (9 g/L NaCl) and used to inoculate 50 mL of UHT skimmed milk in sterile glass bottles at the final concentration of 2 × 10^7 CFU/mL as monocultures. In coculture (L. delbrueckii × St. thermophilus) and triculture (L. delbrueckii × St. thermophilus × L. helveticus) experiments the strains were mixed with each other at the final concentration of 2 × 10^7 CFU/mL. Milk samples were incubated at 42 °C and fermentation trials were monitored in triplicate as pH decrease over time with a pH meter (XS Instruments, Carpi, Italy). Uninoculated milk was used as a negative control. A 10 mL aliquot of freshly prepared NWS was sampled from a dairy farm in the PDO production area of PR cheese, immediately transferred to the laboratory, and used to inoculate milk as a positive control. NWS cell concentration was assessed with a Bürker glass chamber in order to inoculate UHT skimmed milk with 2 × 10^7 cells/mL.
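The inoculum calculation implied above follows the usual dilution relation C1 × V1 = C2 × V2. The Python sketch below shows it under the simplifying assumption that the 50 mL refers to the final volume after inoculation; the function name and the example stock density are hypothetical.

```python
def inoculum_volume_ml(stock_cells_per_ml, target_cells_per_ml=2e7, final_volume_ml=50.0):
    """Volume of cell suspension needed to reach the target density (C1*V1 = C2*V2)."""
    if stock_cells_per_ml <= target_cells_per_ml:
        raise ValueError("stock must be more concentrated than the target")
    return target_cells_per_ml * final_volume_ml / stock_cells_per_ml

# A washed suspension counted at 1e9 cells/mL (hypothetical value).
print(inoculum_volume_ml(1e9))  # 1.0
```

For mixed cultures the same function could be applied per strain, once it is decided whether the target density refers to each strain or to the total.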