Fig. 5: Differences in long-term symptom groups across measured serum, cellular and transcriptional variables.

a, Grouping of patients with PS or NPS from hierarchical clustering of symptom severity scores (0, worst; 5, best) across seven symptom categories. The disease severity group (groups B–E) and total symptom score (summation across symptoms) are indicated above the heat map. The distribution of the responses to the follow-up questionnaires at Q1 and Q2 is shown (top). b, PLS-DA of symptom groups performed on immune-cell counts, serum parameters and reticulocyte data collected between days 15 and 30. c, Variables driving differentiation of individuals with NPS and PS on PLS component 1, colored according to the group with the highest mean. d, Unsupervised hierarchical clustering of patient data from days 15–30, using the 15 leading variables as in c. Patient symptom groups, severity groups and symptom severity scores are shown above the heat map. The cluster capturing most PS individuals is outlined by a black box. Missing data are shown in white in the heat map. e, Fold change (log2-transformed) in median serum inflammatory and iron parameters of individuals with PS compared with NPS at different time windows (left). The significance of the symptom group effect was calculated by linear regression of log2-transformed measures corrected for age; no multiple testing correction was applied. Patient-level data for the boxed parameters are shown in more detail (right). The gray band represents the IQR of the HCs; the y axis is shown on a log10 scale. Measures taken at days 0–180 and 181–360 are annotated on the basis of the Q1 and Q2 symptom groups, respectively. f, Volcano plot showing genes that are differentially expressed, from differential gene expression analysis with age correction, between the PS (red) and NPS (green) groups at days 15–30 (left). Normalized expression for EPOR and EPAS1 (right); P values are from differential gene expression analysis before FDR correction. The gray band indicates the IQR of HC expression. CPM, counts per million reads. g, Significantly enriched HALLMARK and iron-homeostasis gene sets from GSEA run on the log2(FC) ranked gene list from a comparison of NPS and PS groups across time windows. FDR-adjusted P values (PFDR) from GSEA are shown, with up- and downregulated gene sets in PS colored red and blue, respectively. a,e,f, Box plots show the minimum value, 25th percentile, median, 75th percentile, maximum value and outliers beyond 1.5× the IQR. ˙P < 0.1, *P < 0.05, **P < 0.005 and NS, not significant; mDCs, myeloid DCs.
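As a minimal sketch of the quantity plotted in e (left), the log2-transformed fold change in medians can be read as log2 of the PS median divided by the NPS median. This reading, the function name and the input values below are assumptions for illustration only; they are not taken from the study, and the age-corrected regression used for significance is not reproduced here.

```python
import math
from statistics import median


def log2_median_fold_change(ps_values, nps_values):
    """Return log2(median(PS) / median(NPS)).

    Positive values mean the parameter is higher in the PS group,
    negative values mean it is higher in the NPS group.
    Hypothetical helper, not the authors' code.
    """
    return math.log2(median(ps_values) / median(nps_values))


# Made-up serum measurements for one parameter in one time window.
ps = [40.0, 80.0, 160.0]   # median 80
nps = [10.0, 20.0, 40.0]   # median 20

print(log2_median_fold_change(ps, nps))  # 2.0, i.e. a fourfold higher median in PS
```

Working on the log2 scale makes increases and decreases symmetric around zero, which is why a doubling and a halving appear as +1 and −1.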