Fig. 1: Study design and comparative plasma proteomic analysis between IE and non-IE groups. | Nature Communications

From: Integrated plasma and vegetation proteomic characterization of infective endocarditis for early diagnosis and treatment

A Overview of the cohort populations and research strategy. B UMAP of the plasma samples in Cohort 1. C Volcano plot displaying dysregulated plasma proteins in IE at admission. The unpaired two-sided Wilcoxon rank-sum test was used for differential analysis, with p-values adjusted to control the false discovery rate (FDR). The x-axis represents log2 fold-change (FC), and the y-axis represents −log10 FDR. D Bar chart illustrating dysregulated pathways in IE. The x-axis represents −log10 FDR. E Bar plot illustrating the weighted feature importance of the top 20 plasma proteins identified by the ensemble model to distinguish between IE and non-IE samples. The error bars represent the standard error of the mean (SEM) of the weighted feature importance calculated from five-fold cross-validation across six algorithms. Data are presented as mean ± SEM (n = 6). F Scatter line plot illustrating the optimal feature combinations yielding the highest accuracy, determined through recursive feature elimination with cross-validation (RFECV). The error bands represent the 95% confidence interval (CI) from five-fold cross-validation. G Heatmap showing the expression profiles of the 10 proteins identified in the model. Each column represents a patient sample, and each row represents a protein. The color scale represents the row z-score of the normalized protein expression values, ranging from +1.5 (red) to −1.5 (blue). The pathogen identified by metagenomic next-generation sequencing (mNGS) for each sample is annotated above the heatmap, with blanks indicating missing records. H, I Receiver operating characteristic (ROC) curves (H) and confusion matrices (I) illustrating model performance on the hold-out test set of the discovery cohort (Cohort 1) and on two external validation cohorts, Cohort 2 and Cohort 3. Source data are provided as Source Data files.
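The axes of the volcano plot in panel C can be illustrated with a minimal sketch. It assumes that the FDR adjustment is the Benjamini-Hochberg procedure and that fold-change is the ratio of group mean intensities; the caption states neither, and the function names `bh_adjust` and `volcano_coordinates` are hypothetical. The per-protein p-values are taken as given (in the figure they come from the Wilcoxon rank-sum test).

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: the caption only says p-values were "adjusted for FDR";
    Benjamini-Hochberg is a common choice, not a confirmed one.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def volcano_coordinates(mean_ie, mean_non_ie, pvalues):
    """Return (log2 FC, -log10 FDR) per protein.

    Assumption: FC is the ratio of group means on the linear scale.
    """
    fdr = bh_adjust(pvalues)
    return [
        (math.log2(a / b), -math.log10(q))
        for a, b, q in zip(mean_ie, mean_non_ie, fdr)
    ]


if __name__ == "__main__":
    # Toy values, not data from the study.
    for x, y in volcano_coordinates([4.0, 1.0], [1.0, 4.0], [0.001, 0.5]):
        print(f"log2FC={x:+.2f}  -log10FDR={y:.2f}")
```

Proteins far from zero on the x-axis and high on the y-axis are those with both a large change between groups and strong statistical support after multiple-testing correction.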
