Fig. 3: Zyxin, Secretoglobin family 1A member 1, and Leukocyte immunoglobulin-like receptor subfamily A member 3 are associated with ARS.
From: Dynamics of the blood plasma proteome during hyperacute HIV-1 infection

a Flow chart representing the total number of samples used in ARS classification and the exclusion criteria. b Bar graph comparing the distribution of AHI symptoms between participants classified as having or not having ARS (N = 33). ARS was defined from 11 AHI symptoms and the unobserved linkages between them, using Latent Class Analysis. Models with an increasing number of latent groups were compared for goodness of fit. The model with two latent groups was the best fit, with the lowest BIC value (660.5) compared to three (678.6), four (699.2), or five (714.7) groups. Study participants were grouped based on their predicted posterior probabilities into those with ARS (N = 20/33 (60%)) and those without ARS (N = 13/33 (40%)). c Box plots displaying the cross-validated performance measure (accuracy) for the ARS PLS-DA models based on the following datasets: V0 + V1 + V2; V0 + V1-V0 + V2-V0; V1 + V2; V1-V0 + V2-V0; V1-V0; and V2-V0. The models were trained to predict ARS “Yes” or “No” and evaluated in 10 repeats of 5-fold cross-validation, resulting in 50 individual accuracy values from 50 test sets. Each box plot shows the distribution of accuracy values across the 50 cross-validation models. The center line within the box represents the median, the box bounds represent the interquartile range (IQR), and the whiskers extend to the minimum and maximum values within 1.5 × IQR of the box. Any data points beyond the whiskers are considered outliers and are shown as individual points. d Score plot based on the V1-V0 + V2-V0 dataset (with the highest accuracy value) from (c), indicating the group membership of each sample. There was clear discrimination between the ARS-No (orange) and the ARS-Yes (green) samples on the first (x-axis) and second (y-axis) components. Axis labels indicate the percentage of variation explained per component. e Box plot showing the variable importance in projection (VIP) scores in the PLS-DA model based on V1-V0 or V2-V0 for each protein. 
The VIP score summarizes the contribution a variable (protein) makes to the model. This plot identifies the most important proteins for the classification of ARS “Yes” or “No”. Proteins with high VIP scores are more important in providing class separation. Black points represent the full model, and the box plots indicate the distributions across the cross-validation models (10 repeats of 5-fold cross-validation). The sample size corresponds to the 50 VIP scores computed for each protein across the 50 cross-validation models. The center line within the box represents the median, the box bounds represent the interquartile range (IQR), and the whiskers extend to the minimum and maximum values within 1.5 × IQR of the box. Any data points beyond the whiskers are considered outliers and are shown as individual points. Red dots represent the VIP scores for the full PLS-DA model, capturing the importance of each protein feature in the model’s ability to discriminate between groups. These points reflect the average or specific metric of the VIP values used to build the full model. f Forest plots indicating effect sizes (log2 fold change) and 95% confidence intervals for proteins significantly differentially expressed at 2 weeks and 1 month post estimated date of infection (EDI), relative to pre-infection levels (V1-V0 and V2-V0, respectively). Only individuals from the IAVI cohort were included, since ARS data are available only for this cohort. Circles and triangles indicate depleted (depl) and neat plasma, respectively. The statistical analysis was conducted using linear mixed-effects models with a random intercept for each patient, treating visit number as a categorical variable. Differential protein expression was assessed using a global ANOVA, with post hoc tests identifying specific visit comparisons (e.g., V0 vs. V1, V0 vs. V2, and V1 vs. V2). The Benjamini-Hochberg FDR method with a 5% FDR threshold was used to correct for multiple testing, with a fixed p-value cut-off of 0.005. 
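The Benjamini-Hochberg correction mentioned for panel f can be illustrated with a minimal sketch. The p-values below are hypothetical and are not taken from the study; the sketch covers only the FDR adjustment at a 5% threshold and does not model how the fixed p-value cut-off of 0.005 was combined with it, which the legend does not specify.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-protein p-values (illustration only, not study data).
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)

# Proteins passing a 5% FDR threshold.
significant = [qi <= 0.05 for qi in q]
print(significant)
```

With these illustrative inputs, only the two smallest p-values remain significant after adjustment, even though four of the five raw p-values are below 0.05.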
g Heatmap of proteins associated with ARS based on hierarchical clustering of the V1-V0 and V2-V0 expression of the selected proteins. The heatmap provides a visual representation of coordinated changes of the proteins identified through PLS-DA and linear regression in relation to ARS status. h Pirate plots showing the V1-V0 protein expression for the top proteins in participants with and without ARS. i Table representing the longitudinal protein expression profiles for the top proteins associated with ARS. For each profile and protein, the number (%) of patients with or without ARS was recorded. Abbreviations: ARS acute retroviral syndrome, PLS-DA Partial Least Squares Discriminant Analysis, V0 visit 0 (collected before estimated date of infection), V1 visit 1 (collected 10–14 days post estimated date of infection), V2 visit 2 (collected 15–42 days post estimated date of infection), V1-V0 difference between visit V1 and V0, V2-V0 difference between visit V2 and V0, V2-V1 difference between visit V2 and V1, VIP variable importance in projection, Log2FC log2 fold change. The asterisk (*) appended to the end of certain protein names indicates proteins detected in neat plasma, while proteins without an asterisk were identified in depleted plasma samples. Source data are provided as a Source Data file.
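The evaluation scheme described for panels c and e (10 repeats of 5-fold cross-validation, giving 50 test sets) can be sketched as follows. This is an illustrative splitter only, not the authors' code: the function name `repeated_kfold`, the random seed, and the assumption that all 33 classified participants enter the split are hypothetical, and no stratification by ARS status is attempted.

```python
import random


def repeated_kfold(n_samples, n_splits=5, n_repeats=10, seed=0):
    """Yield (train, test) index lists for repeated k-fold cross-validation."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repeat
        for fold in range(n_splits):
            # Every n_splits-th shuffled index goes into this fold's test set.
            test = indices[fold::n_splits]
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield train, test


# Assumed sample count of 33 (the N reported for ARS classification).
splits = list(repeated_kfold(n_samples=33))
print(len(splits))  # number of train/test splits: 10 repeats x 5 folds
```

Fitting one model per split and recording its test accuracy (or per-protein VIP scores) would yield the 50 values summarized by each box plot.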