Fig. 7: Analysis of physicochemical properties of clean and polluted air molecules.

Structures of molecules in clean and polluted air (i.e. VIPs > 1.00, |p[corr]| > 0.50; Fig. 5) were used to compute physicochemical descriptors of toxicological and environmental relevance. a, b A subset of the WSOC + molecular network (see Fig. 4) highlights the close relationship between structural similarity and physicochemical properties, with features color-coded according to (a) chemical group and (b) predicted lipophilicity (logP). In b, example structures from the same network are shown with their corresponding logP values. c Positive Pearson’s correlation between predicted logP and liquid chromatography retention time (Rt) on a C18 stationary phase for POC and WSOC extracts (ESI+ and ESI− data combined). d Two-dimensional density plots of the distribution of logP and topological polar surface area (TPSA, Ų) in clean and polluted air. e Density distribution and median values of logP in clean and polluted air molecules. f, g Scatterplots with molecular features color-coded according to detection in the analyzed extracts, showing negative Pearson’s correlations between (f) logP and O/C and (g) molar refractivity (MR) and solubility (logS). h Density distribution and median values of the index of refraction (nD) in clean and polluted air molecules. All Pearson’s correlation coefficients (ρ) are shown with p values < 2.2e–16. Significant differences between the clean and polluted air distributions are reported as p values from two-sided unpaired Student’s t-tests.
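
For readers who want to reproduce the type of statistics summarized in this caption, the following is a minimal sketch, not the authors' pipeline, of Pearson's correlation coefficient (as in panels c, f and g) and the unpaired Student's t statistic with pooled variance (as in panels e and h). The function names `pearson_r` and `student_t` and all numeric values are hypothetical and chosen for illustration only; they are not data from the study.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def student_t(a, b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance assumes equal variances in both groups.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


if __name__ == "__main__":
    # Illustrative values only (hypothetical, not taken from the study).
    logp = [0.5, 1.2, 2.0, 2.9, 3.8]          # predicted logP
    rt = [2.1, 3.4, 5.0, 6.8, 8.3]            # retention time, min
    clean_logp = [1.1, 1.8, 2.0, 2.4, 1.6]    # "clean air" group
    polluted_logp = [2.9, 3.5, 3.1, 4.0, 3.3]  # "polluted air" group

    print("Pearson r (logP vs Rt):", pearson_r(logp, rt))
    t, df = student_t(clean_logp, polluted_logp)
    print("Student's t:", t, "with", df, "degrees of freedom")
```

The sketch stops at the test statistic because a two-sided p value requires the t distribution, which the Python standard library does not provide. In practice, `scipy.stats.pearsonr` and `scipy.stats.ttest_ind` return both the statistic and the p value.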