Fig. 2: Geographical sources of PM2.5 and modelling of molecular profiles.

a Relative contribution of air masses to each 48 h sample (n = 40 time-points) originating from the Arabian Sea, Peninsular India, Indo-Gangetic Plain or Indian Ocean, and b the mean paths of these four 10-day back-trajectory clusters around the receptor site MCOH (6.80°N, 73.20°E). The satellite image corresponds to the monitoring campaign day 2 February 2018 (NASA Worldview; https://worldview.earthdata.nasa.gov/). c Scores of the orthogonal partial least squares (OPLS; 2 + 1 + 0) multivariate model correlating the back-trajectories to the nontarget (LC+GC) HRMS chemical profiles of PM2.5 samples, with a cross-validated (CV)-ANOVA p value of 0.0013 for the Indian Ocean back-trajectory cluster. See also the principal component analysis (PCA) in Fig. S10, and details in Supplementary data 3 (Models and statistics, OPLS 2 + 1). Each sample is colored according to the dominant back-trajectory within the 48 h window. d Tropospheric nitrogen dioxide (NO2) levels measured along the back-trajectories by satellite remote sensing (OMI/Aura; NASA). e Polycyclic aromatic compound abundance in GC-HRMS for analytes identified at Level 1 (dibenzo[a,i]pyrene, 4H-cyclopenta[def]phenanthren-4-one, benzo[b]naphtho[1,2-d]thiophene). Error bars indicate the standard error (SE). Significant differences in levels for the Arabian Sea (n = 14), Peninsular India (n = 13), and Indo-Gangetic Plain (n = 7) relative to the Indian Ocean (n = 6) were assessed with a two-sided Welch t-test: *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001; and ****p ≤ 0.0001 (d.f. = 18, 17, 11, respectively).