Introduction

Pseudomonas putida KT2440 is a versatile and metabolically robust biotechnological workhorse1,2. It thrives in diverse environments3,4, degrades a wide range of organic compounds4,5, and as a result, is employed to produce a variety of bulk and fine chemicals6. Recent studies have focused on harnessing the potential of P. putida KT2440 and on understanding and manipulating its metabolic network7. Genome-scale metabolic models (M-models) have long been used in metabolic engineering to identify metabolic bottlenecks and potential improvements in metabolic pathways for bioproduction8,9,10. While M-models provide valuable insights into metabolic capabilities, they do not account for macromolecular expression and the biosynthetic cost of enzymes and, as a consequence, require extensive constraining11,12,13. Thus, predictions using M-models can lack robustness11, making it challenging to predict engineering strategies to improve performance14.

Models of metabolism and gene expression (ME-models) mechanistically describe gene expression pathways and their intertwined role with metabolic pathways to achieve optimal resource allocation for growth12. As a result, ME-models make predictions beyond the scope of traditional M-models, including unconstrained by-product secretion11, overflow metabolism15, cofactor usage15, protein overproduction14, and proteomic responses to stress conditions16. However, the reconstruction of ME-models is time-intensive and requires extensive manual curation, which has led to a reduced number of reconstructed ME-models, only available for Bacillus subtilis14, Clostridium ljungdahlii15, Escherichia coli12,17, and Thermotoga maritima11.

Here, we reconstructed an ME-model for P. putida KT2440, iPpu1676-ME, offering an unprecedented level of detail in the cellular function and proteome allocation of this bacterium. We show the improved predictive capabilities of proteome limitation in P. putida KT2440. Furthermore, we interrogated the gene expression of P. putida KT2440 using transcriptomics (RNA-Seq) and translatomics (Ribo-Seq) data. We analyzed the translational prioritization of pathways, as well as pathways that were significantly less prioritized for translation. When contrasting the model predictions against these sequencing datasets, we found stronger agreement with the ME-model compared to the M-model. Thus iPpu1676-ME represents a valuable asset for accurate predictive modeling as a tool for bioprocess design and optimization and metabolic engineering8,9,10 in this industrially important strain.

Results

iPpu1676-ME predicts proteome limitation and overflow metabolism in P. putida KT2440

The ME-model of Pseudomonas putida KT2440, iPpu1676-ME, was reconstructed based on a previous genome-scale metabolic model (M-model), iJN14621. The gene expression machinery was integrated into the ME-model following available ME-model reconstruction protocols12,18. Protein complex stoichiometries, function, localization, translocation pathways, and transcriptional unit compositions were retrieved and mapped from the genome of P. putida KT2440 (AE015451.219) as well as the strain-specific database in BioCyc20. iPpu1676-ME consists of 7526 metabolites, 14,414 reactions, and 1676 genes. Compared to the original M-model template this represents an increase of 250% in metabolite, 392% in reaction, and 15% in gene coverage (Fig. 1a–c). The expression machinery (E-matrix) adds up to 5443 metabolites, including types of RNA (mRNA, tRNA, and rRNA), proteins with and without modifications, and complexes (Fig. 1a). In addition, the E-matrix contains 5040 reactions, including translation and modification, translocation, transcription, and tRNA charging (Fig. 1b). Finally, the added 214 genes in iPpu1676-ME (Fig. 1c) correspond to mapped gene expression machinery including tRNA ligases, ribosomal proteins, RNA polymerase subunits, transcription factors, and protein modification machinery (including translocation machinery) (Supplementary Fig. 1).

Fig. 1: Properties and predictions of the ME-model of P. putida KT2440.
figure 1

a and b Breakdown of metabolites (a) and reactions (b) in iPpu1676-ME as compared to the template M-model, iJN14621. c Genome coverage of iPpu1676-ME and iJN1462. d Comparison of flux variability analysis of the M- and ME-models of P. putida KT2440. The distribution of the flux ranges is bimodal due to the tolerance of the QuadMINOS solver of 10−16, below which fluxes can vary at a negligible level. e Prediction of the maximum growth rate of P. putida KT2440 by the ME-model as opposed to the overestimation in the M-model, compared to reported maximum growth rates6,21,22,23,24. Area plots show the predicted secretion rates of overflow metabolites acetate and 2-ketogluconate by the ME-model, which the M-model does not predict.

As previously reported, the integration of the E-matrix provides the ME-model with a quantitative description of the biosynthetic cost of biochemical reactions12. Reaction fluxes are limited by the underlying gene expression cost, resulting in reduced flux variability11 and higher certainty in model predictions. The flux variability analysis of iPpu1676-ME showed reduced flux ranges (Fig. 1e). The majority of the M-model ranges are above 100, while ME-model ranges are mostly below it, increasing the certainty of predictions in the latter. Moreover, the ME-model readily recapitulates the maximum growth rate of P. putida KT2440 in a glucose-containing minimal medium (Fig. 1d), which has been reported at 0.58 (σ = 0.02) h−1 and at a glucose uptake rate of 8.15 (σ = 2.00) mmol/gDW/h from five different studies6,21,22,23,24. As opposed to the M-model, whose metabolic reactions are unconstrained by biosynthetic costs, the ME-model reaches proteome limitation when in nutrient excess and is capable of generating biologically relevant simulations without any additional constraints (Fig. 1d).

We further assessed the accuracy of intracellular flux predictions by contrasting them to previously published reports of metabolic flux analysis (MFA) in P. putida KT244025,26. Notably, both M- (Supplementary File 1) and ME-models (Supplementary File 2) reproduce the overall activity of the core metabolism assessed in the experimental MFA studies. At a glucose uptake rate of 2.21 mmol/gDW/h25 (simulation constraints and flux distributions are provided in Supplementary Data 1), glucose is converted to 6-phospho-d-gluconate, which is then assimilated and conveyed to glycerol-3-phosphate, bypassing the upper half of glycolysis. Interestingly, there is minimal but nonzero flux through part of the pentose phosphate pathway predicted by both M- and ME-models and observed in MFA26. The second half of glycolysis is active and feeds into the tricarboxylic acid cycle, with all enzymatic steps being active25,26. However, only the M-model incorrectly predicted isocitrate lyase and malate synthase (the glyoxylate shunt) to be active, which were observed to be inactive in MFA25,26. Furthermore, we found that the ME-model correctly predicts the activity of pyruvate kinase25,26, which is inactive in the M-model simulations. In the latter, phosphoenolpyruvate is converted to pyruvate through dGTP:pyruvate 2-O-phosphotransferase.

Another improvement in ME-models is the prediction of proteome limitation, which leads to the mechanistic prediction of overflow metabolites in ME-models15. In iPpu1676-ME, this leads to predicting 2-ketogluconate and acetate secretion in excess of glucose (Fig. 1e). While 2-ketogluconate is a known secreted metabolite by P. putida, acetate was predicted as a minor by-product, which has been shown in oxygen-limited conditions22,27. Notably, iPpu1676-ME maintains the same 85% accuracy in gene essentiality prediction1 of 54 metabolic gene knockout strains in M9 minimal medium28.

Multi-omic data reveal translational prioritization in P. putida KT2440

Translational efficiency (TE) refers to the rate of protein synthesis per unit of mRNA transcript29. The translation of a group of genes is said to be prioritized if their TE (see the “Methods” section) is high relative to other genes29. Translational prioritization readily informs the resource allocation strategies of an organism30,31,32. It can be used to infer specific objectives30,33, such as maximizing growth or uptaking a substrate. Thus, we aimed to interrogate the proteome allocation of P. putida and its translational prioritization using RNA-Seq and Ribo-Seq. Then, we contrasted it with the predictions by iPpu1676-ME.

P. putida was grown in glucose-containing M9 minimal medium in three biological replicates, and paired RNA-Seq and Ribo-Seq were performed. The three samples yielded a wide range of gene activity in both datasets with four and six orders of magnitude differences, as observed in the raw read counts from RNA-Seq (Supplementary Fig. 2a) and Ribo-Seq (Supplementary Fig. 2b), respectively, reinforcing the disparity in resource allocation throughout the genome. In order to discard technical artifacts34 as a confounding factor, we performed two-tailed t-tests (p < 0.05) and calculated Pearson correlation coefficients of the replicates. No significance in the means of the read count distributions was observed, and all replicates were very strongly correlated (PCC > 0.9) in both RNA-Seq and Ribo-Seq (Fig. 2a).

Fig. 2: Multi-omic data of P. putida KT2440 growth in glucose.
figure 2

a Pearson correlation coefficient (PCC) of and two-tailed t-test (p-values) for the difference in mean of the CPM distributions in the RNA-Seq and Ribo-Seq replicates. Strong correlation and no significant mean difference across replicates indicate there are no replicate outliers. b Correlation of RNA-Seq and Ribo-Seq CPM in the three samples. c Metabolic pathways with higher rank in the Ribo-Seq dataset against the RNA-Seq dataset (p < 0.05 as calculated by the right-tailed Mann–Whitney U-test). Translational efficiencies (TEs) are shown for reference. d Metabolic pathways with lower rank in the Ribo-Seq dataset against the RNA-Seq dataset (p < 0.05 as calculated by the left-tailed Mann–Whitney U-test).

When contrasting across datasets, RNA-Seq and Ribo-Seq counts-per-million (CPM) show significant correlations, with a PCC of 0.78 (p = 0.0) in the three samples (Fig. 2b). Despite the significant correlation at the whole-dataset level, there are variations in the resource allocation at the pathway level in the transcriptome and the translatome, which shows differential translational prioritization across the genome of P. putida. Thus, we assessed whether metabolic pathways were observed with high or low translational prioritization. We measured the translational prioritization of a pathway using the average TE of its associated genes (see the “Methods” section) and calculated the significance of the prioritization through a one-tailed Mann–Whitney U (MWU) test. As part of the MWU test, we sorted and ranked the genes in the RNA-Seq and Ribo-Seq datasets and contrasted if their ranks differed significantly in either dataset. A high TE and a low right-tailed MWU test p-value for a significant increase in rank in the Ribo-Seq dataset (p < 0.05) support high translational prioritization.

A right-tailed Mann–Whitney U test revealed that there were ten metabolic pathways with significantly higher ranks in the Ribo-Seq dataset (p < 0.05), which can be evidenced by the calculated translational efficiencies (TE)30,35,36 between 1.4 and 2.1 (Fig. 2c). Notably, the highest prioritization (highest TE) was calculated for nicotinamide biosynthesis (TE = 2.1, p = 5.38e−4), which produces the essential cofactor NAD+. Other core metabolic pathways, such as urea cycle (TE = 1.9, p = 0.02), pyrimidine biosynthesis (TE = 1.7, p = 5.04e−3), lipid A biosynthesis (TE = 1.7, p = 0.02), and queuosine biosynthesis (TE = 1.7, p = 0.006), which are directly associated with cell proliferation37, showed significant prioritization. As expected, there was significant prioritization of gene expression, including translation (TE = 1.8, p = 0.03) and tRNA charging (TE = 1.6, p = 0.04).

On the other hand, 11 metabolic pathways had significantly lower ranks as calculated by a left-tailed Mann–Whitney U test (p < 0.05). The lowest prioritization was shown for pathways less necessary for growth in glucose, namely levulinate metabolism (TE = 0.2, p = 7.02e−4), phenylacetyl-CoA catabolon (TE = 0.3, p = 0.03), cellulose metabolism (TE = 0.5, p = 0.02), and starch and sucrose metabolism (TE = 1.0, p = 0.01). Membrane-associated pathways, such as inner membrane transport (TE = 1.0, p = 0.02), outer membrane transport (TE = 0.8, p = 8.5e−3), iron uptake (TE = 0.7, p = 0.03), peptidoglycan biosynthesis (TE = 0.8, p = 0.04), and murein recycling (TE = 0.5, p = 0.01), showed low translational prioritization. It is worth noting that a TE equal to 1.0 can still mean an overall lower prioritization due to the variation in ranks in both datasets, such is the case for starch and sucrose metabolism and inner membrane transport (Fig. 2d).

The ME-model recapitulates optimal proteome allocation

Our translational prioritization analysis highlighted significantly higher TEs for growth-required pathways, both metabolic and gene expression-related (Fig. 2c, d). Most of the transcriptome and translatome are allocated for these pathways (Fig. 3a), led by translation and followed by energy and biomass production. However, multi-omic data does not provide a quantitative understanding of the metabolic and gene expression rates. Predictive metabolic models are used to attain this understanding. Thus, here, we assessed the improved performance of the ME-model of P. putida KT2440 over the M-model when contrasting against the expression levels inferred from multi-omics (simulation constraints and flux distributions are provided in Supplementary Data 2).

Fig. 3: Correlation between model predictions and multi-omics at the pathway level.
figure 3

a and b Accumulated counts per million (CPM) for each pathway in the RNA-Seq (a) and the Ribo-Seq (b) datasets. Only the top 10 largest contributing pathways in either dataset are highlighted. b Correlation between Ribo-Seq and RNA-Seq CPM at the pathway level. c Correlation between multi-omics and M- (iJN1462) or ME-model (iPpu1676-ME) predictions at the pathway level.

We compared M- and ME-model predictions of cumulative pathway-level fluxes of P. putida KT2440 against pathway-level expression from RNA-Seq, Ribo-Seq, and the calculated TE from them. The ME-model significantly outperformed the M-model in all cases (Fig. 3b). As a reference, pathway-level RNA-Seq and Ribo-Seq are correlated with a PCC of 0.94 (Fig. 3c). In the M-model, several subsystems were predicted with low flux, causing relevant data points to be more sparse. The PCCs were 0.50 for RNA-Seq (p = 3.55e−18) and 0.49 for Ribo-Seq (p = 5.50e−17), which signifies an existent but weak correlation between both datasets. On the other hand, the ME-model showed stronger positive correlations with PCCs of 0.71 (p = 2.27e−44) and 0.76 (p = 1.40e−53) for RNA-Seq and Ribo-Seq, respectively. Notably, despite the stronger correlation between the ME-model predictions and RNA-Seq and Ribo-Seq, there was only a weak positive correlation for TE (PCC = 0.46). Therefore, the predicted expression fluxes by the ME-model are predictive of the observed transcriptome and translatome but not of the TE. Thus, the predictive capability of iPpu1676-ME to recapitulate proteome allocation of P. putida KT2440 showcases its potential to be further used to interrogate and optimize the resource allocation in this organism.

Discussion

Pseudomonas putida KT2440 has broad potential for biotechnological applications1,2, and several studies have been focusing on optimizing culture conditions, growth, and metabolic capabilities for this bacterium. While bioinformatics tools and databases have informed potential pathways in P. putida38,39, only predictive metabolic models can quantitatively estimate improvements in metabolism, growth, product rates, and yields40,41,42. Similar to previous studies using ME-models, we here showed the prediction of by-product secretion11, overflow metabolism15, and prediction of proteome allocation16. While previous ME-model reconstructions have widely proven prediction improvements, multi-omics datasets (RNA-Seq and Ribo-Seq) have not been integrated or contrasted with these simulations. Our translational prioritization analysis showed significantly high TE for queuosine biosynthesis alongside core biomass precursor biosynthetic pathways. Interestingly, this pathway has been reported as a cell division regulator in other bacteria37. On the other hand, transport and other membrane-associated functions were found to have low translational prioritization.

The ME-model for P. putida KT2440, iPpu1676-ME, showed significant improvements in the predictive capabilities of transcriptome and translatome over the template M-model, iJN14621. iPpu1676-ME achieved an outstanding PCC of 0.71 against RNA-Seq and 0.76 against Ribo-Seq. As a reference, the multi-omics model and analytics (MOMA), a semi-supervised machine learning pipeline, achieved PCCs between 0.58 and 0.85 with RNA-Seq in E. coli across 16 strains43. It is worth noting that the higher agreement of Ribo-Seq and the predicted proteome allocation by the ME-model underscores the precision of this sequencing technology in identifying the metabolic goal of an organism30. In addition, it highlights the ability of a ME-model to predict optimal proteome allocation in a metabolically diverse organism such as P. putida KT24401,2.

Some limitations affect our analysis. Determination of the active proteome is challenging through sequencing technologies due to various technical limitations. For example, RNA degradation affects measurements by RNA-Seq44, and RNA transcription trends do not always carry over to translation due to translational prioritization effects29,30. On the other hand, direct measurement of the proteome through proteomics is hindered by the difficulty of whole-proteome determinations and the inherent noise in mass spectrometry data45. On the other hand, Ribo-Seq provides an accurate representation of translation in vivo29,30. While it cannot detect protein stability, modification, and folding46, it has been shown to provide a genome-scale understanding of translational prioritization29,30. Furthermore, we noticed variation between the replicates, which can be due to the inherent flexibility of the metabolism of P. putida1,2. However, the correlation was strong enough (PCC > 0.9) to ensure there were no significant outliers in the replicates. Another limitation of this study is that metabolic models, and thus ME-models, are limited to modeling enzymes with either a metabolic or a gene expression function12. However, there is a fraction of the proteome that can have an alternative function, which can be structural or still unknown. For example, this fraction has been estimated to be approximately 36% of the proteome in E. coli12, and it might be the cause of some of the unexplained variances in our comparison between model predictions and Ribo-Seq.

Here, we provide a modeling framework alongside multi-omic datasets that were not available to date, yielding an important resource for further understanding the translational resource allocation in this industrially relevant microorganism. Overall, we envision that the ME-model and multi-omic resources brought forward in this work will serve as powerful tools for the metabolic engineering and optimization of P. putida KT2440.

Methods

Reconstruction and simulation of the ME-model

The ME-model of P. putida KT2440, iPpu1676-ME, was reconstructed using coralME18, with the available genome AE015451.219, M-model (iJN14621), and BioCyc20-derived annotation files as inputs. Input manual curation files of iPpu1676-ME are explained in Table 1. The code and scripts in this work were developed and run in Python 3.10, COBRApy47 version 0.26.3, and coralME version 1.1.5. Simulations were performed using the Quad MINOS software courtesy of Prof. Michael A. Saunders at Stanford University48. The Quad MINOS solver was compiled under Ubuntu 22.04 with gfortran version 5 and Python 3.10 (pip 22.3.1, wheel 0.38.4, numpy 1.21.6, scipy 1.7.3, cython 0.29.32, and cpython 0.0.6). Flux distributions were calculated with a binary search algorithm that looks for the maximum possible growth rate that is feasible.

Table 1 Input and manual curation files and their description

Computations were performed on a 64-bit Ubuntu 22.04.3 LTS (Jammy Jellyfish); AMD Ryzen 9 7900X@4.70 GHz (12 cores, 24 threads); 4 × 32 GB 6000 MHz DDR5 RAM. It is worth noting that iPpu1676-ME is one of the largest ME-models available. As a reference, the E. coli ME-model contains 1678 genes12. As such, optimizing iPpu1676-ME typically takes five minutes in the computer described here. This is several orders of magnitude longer than the associated M-model, iJN14621 (~72 ms). While this computation time still allows for most analyses performed in metabolic modeling, it can hinder its application in large-scale sampling of conditions, e.g., simulating a bioreactor.

Gene essentiality predictions

Gene essentiality was predicted in iPpu1676-ME by closing (setting upper and lower bounds to zero) the translation reaction of a gene and testing for ME-model feasibility. Feasibility is tested at a growth rate of 10−3 with the QuadMINOS solver in quad-precision and a tolerance of 10−16.

Flux variability analysis

Flux variability analysis (FVA) was performed using the built-in method model.fva() in coral ME. In order to generate comparable datasets between the M- and the ME-models, we cast the M-model into an ME-model instance (function from_cobra()). The FVA in both models was performed using the QuadMINOS solver in quad-precision and a tolerance of 10−16.

Culture and sequencing of P. putida KT2440

A preculture of Pseudomonas putida KT2440 was grown in 5 mL of liquid 1× M9 medium (Sigma Aldrich, M6030) with 30 mM glucose under oxic conditions at 37 °C for 2 days. The optical density at 600 nm (OD600) was measured to assess carrying capacity using the Molecular Devices SpectraMax M3 Multi-Mode Microplate Reader (VWR, cat # 89429-536). The preculture was diluted in triplicate to an OD600 of 0.1 in 25 mL of 1x M9 medium and incubated under oxic conditions at 37 °C in a shaking incubator, with optical density measured hourly. Samples were harvested by centrifugation and pellets were saved for multi-omics analysis as detailed below (RNA-, and Ribo-Seq).

Transcriptomic (RNA-Seq) sample preparation

RNA was extracted from P. putida replicates using the RNeasy mini kit (Qiagen), with rRNA removal performed using the QIAseq FastSelect-5S/16S/23S kit (Qiagen). RNA-Seq libraries were constructed using the KAPA RNA HyperPrep kit (Roche) and barcoded with TruSeq indexes (Illumina). Amplification was monitored in real-time using SYBR-Green and halted upon reaching the amplification plateau.

Translatomic (Ribo-Seq) sample preparation

The preparation of Ribo-Seq samples utilized a protocol adapted from previously described methods designed for axenic bacterial cultures33,49. Briefly, the bacterial cultures were treated with chloramphenicol and pelleted. At this point, we modified this protocol to address the chloramphenicol resistance of Pseudomonas putida KT2440 by resuspending the pellet in RNAlater and flash-freezing in liquid nitrogen. Samples were thawed, pelleted, and RNAlater removed, before proceeding to mechanical bacterial lysis. Bacterial lysis was performed in the presence of a lysis solution containing additional chloramphenicol and Guanosine-5′-[(β,γ)-imido] triphosphate (GMPPNP) to inhibit protein elongation. Lysates were treated with MNase and DNase to digest nucleic acids that were not protected by ribosomes. Monosomes were recovered using RNeasy mini spin size-exclusion columns (Qiagen) and RNA Clean & Concentrator-5 kit (Zymo). rRNA was removed with the QIAseq FastSelect-5S/16S/23S kit (Qiagen), and MetaRibo-Seq libraries were constructed using the NEBNext Small RNA Library Prep set for Illumina. Amplification was followed in real-time with SYBR-Green and stopped upon plateau plateau, and PCR products were purified with the Select-a-size DNA Clear & Concentrator kit (Zymo).

Sequencing

Library quantity and average size were assessed with the 4200 TapeStation System (Agilent). Library concentrations were quantified using the Qubit dsDNA HS Assay kit and QuBit 2.0 Fluorometer (Invitrogen). Sequencing was performed by UCSD IGM on the Illumina NovaSeq S4, PE100 platform with a minimum sequencing depth of 50 million reads for transcriptomic samples and 100 million reads for translatomic samples.

Sequence alignment and post-processing of RNA-Seq and Ribo-Seq

Paired-end read sequencing files from RNA-Seq and Ribo-Seq (FASTQ format) were processed using Python 3.7. Reads were trimmed using trim_galore version 0.6.10 (Cutadapt version 2.6)50. Reads were aligned to the genome of P. putida KT2440 using bowtie2 version 2.2.551. For Ribo-Seq, single-end read sequence alignment was performed since read length is short enough that both directions are redundant. For RNA-Seq, reads aligning to ribosomal RNA were discarded. Raw read counts were estimated using Woltka version 0.1.552, with the built-in function “classify”. Finally, cross-sample analyses in this work were performed using the counts per million (CPM) normalization of read counts, as shown in Eq. (1).

$$CPM=\frac{Raw\,read\,count}{\sum Raw\,read\,counts\,in\,sample}\,\ast {10}^{6}$$
(1)

Statistical analysis of translational prioritization from multi-omics

Translational prioritization was inferred from calculating translational efficiency (TE) (Eq. (2))30,35,36, and significance was calculated using the Mann–Whitney U test. Higher translational prioritization was determined if TE ≥ 1.0 and p < 0.05 in a right-tailed Mann–Whitney U test, while lower was determined if TE ≤ 1.0 and p < 0.05 in a left-tailed Mann–Whitney U test. The test was performed using scipy.stats.mannwhitneyu package of scipy version 1.11.153.

$$TE=\frac{Ribo-Seq\,CPM}{RNA-Seq\,CPM}$$
(2)

Benchmarking analysis of model predictions with multi-omics

Benchmarking of the models (M- and ME-models) predictions was performed with a correlative analysis between the pathway activity inferred from RNA-Seq and Ribo-Seq and the predicted pathway fluxes in the M-model (iJN14621) and the ME-model (iPpu1676-ME). For RNA-Seq and Ribo-Seq, gene-level CPM data were grouped and summed by annotated subsystem in the ME-model. For M-model predictions, the reaction fluxes were grouped and summed by annotated subsystem in the M-model. For ME-model predictions, protein translation fluxes were grouped and summed by annotated subsystem in the M-model. We then log-transformed the fluxes as log10(f + 1), where f is the combined flux of a subsystem. The data in all three samples was used in the regression to maintain maximal statistical power. Then, linear regression was performed with the statsmodels version 0.14.1, which returned the regressed linear model and 95% confidence interval. The Pearson correlation coefficient (PCC) and significance (p-value) were calculated using scipy.stats.pearsonr of scipy version 1.11.1.