Extended Data Fig. 5: SCAVENGE captures cell states relevant to COVID-19-severity risk.
From: Variant to function mapping at single-cell resolution through network propagation

a, UMAP plots showing the presence of seed cells. b, Box plots depicting SCAVENGE TRS of COVID-19-severity risk for cells from healthy donors (HD) and patients with mild (COVID-Mild) and severe (COVID-Severe) COVID-19. Box plots (n = 21,780 for HD, n = 41,543 for COVID-Mild and n = 27,219 for COVID-Severe) show the median with interquartile range (IQR) (25–75%); whiskers extend to 1.5× the IQR. Significance was calculated using a two-sided Student's t-test (P = 1.5e-116 for COVID-Severe versus HD, P = 6.3e-71 for COVID-Mild versus HD, and P = 9.7e-6 for COVID-Severe versus COVID-Mild). c, An illustration of the method used to calculate empirical P values for cell state determination. The background network propagation score is calculated by running the SCAVENGE analysis with randomly selected seed cells that match the topological attributes of the real seed cells. This generates a null distribution of network propagation scores, and the empirical P value is calculated by comparing the observed network propagation score with this null distribution (Methods). d, Cell number distribution of COVID-19 severity risk variant-enriched and -depleted cell states across all cell types. Bar plots depict cell numbers; the order of cell types is identical to that in Fig. 4c. e, Chromatin activity scores of genes in the vicinity of cell state-associated risk variants. Log-normalized gene scores in COVID-19 severity risk variant-enriched and -depleted cell populations are shown for the indicated genes.
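
The empirical P value step in panel c can be sketched as follows. This is a minimal illustration and not the SCAVENGE implementation: the function name `empirical_p_values`, the one-sided comparison (null score at least as large as the observed score) and the add-one correction are assumptions made here, and the null scores are taken as given rather than produced by topology-matched random seeding (see Methods for the actual procedure).

```python
from typing import List, Sequence


def empirical_p_values(observed: Sequence[float],
                       null_scores: Sequence[Sequence[float]]) -> List[float]:
    """One-sided empirical P value per cell (hypothetical helper).

    observed[i]       : propagation score of cell i using the real seed cells.
    null_scores[k][i] : score of cell i in the k-th run with random seed cells.
    """
    n_runs = len(null_scores)
    if n_runs == 0:
        raise ValueError("need at least one null run")

    p_values = []
    for i, obs in enumerate(observed):
        # Count null runs in which the cell scores at least as high as observed.
        n_extreme = sum(1 for run in null_scores if run[i] >= obs)
        # Add-one correction (an assumption) keeps P strictly above zero.
        p_values.append((1 + n_extreme) / (1 + n_runs))
    return p_values


# Toy input: two cells, three null runs (made-up numbers for illustration only).
observed = [0.9, 0.2]
null_scores = [[0.1, 0.3], [0.5, 0.1], [0.95, 0.4]]
p = empirical_p_values(observed, null_scores)
```

With this convention, a cell whose observed score exceeds most of its null scores receives a small P value and would be a candidate for the variant-enriched state; the threshold used for that call is not stated in this legend.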