Introduction

The increase in anthropogenic aerosols affects the radiative properties of liquid clouds by altering their microphysical (cloud droplet concentration, \({N}_{d}\), and effective radius, \({R}_{e}\)) and macrophysical (liquid water path, \({{{\mathrm{LWP}}}}\), and liquid cloud fraction, \(f\)) properties, phenomena generically known as aerosol–cloud interactions (ACI). Additional aerosols cause a monotonic increase in \({N}_{d}\) and, given a constant \({{{\mathrm{LWP}}}}\), a decrease in \({R}_{e}\), leading to a net cooling effect on the earth-atmosphere system, referred to as the Twomey effect1. Subsequently, \({{{\mathrm{LWP}}}}\) and \(f\) respond to changes in \({N}_{d}\), and \({R}_{e},\) known as rapid adjustments of ACI2,3, which, together with the Twomey effect, contribute to estimating the effective radiative forcing due to ACI (\({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\)). An example of such an adjustment is the cloud lifetime effect4,5. The latest report of the Intergovernmental Panel on Climate Change pointed out that ACI is one of the largest sources of uncertainty in current climate assessments6.

In fact, besides \({N}_{d}\), the shape of the cloud particle size distribution (PSD) is also a key factor influencing the Twomey effect. Increased aerosols change the cloud PSD shape, impacting cloud albedo and thus contributing to \({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\), referred to as the dispersion effect (DE). Liu and Daum7 analyzed marine clouds sampled by aircraft and pointed out that the DE can offset the number effect (i.e., considering only the impact of \({N}_{d}\) changes on \({R}_{e}\) in the Twomey effect) by 10–80%. As research progressed, aircraft-based studies found such discrepant results that DE not only offsets ACI8,9 but may also enhance it10,11,12,13 or have no significant impact14,15. Considering the regional nature and significant uncertainty of the DE, this effect is typically ignored when estimating \({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\) based on satellite observations3,16.

By using parameterizations derived from the regional aircraft data in general circulation models (GCMs), the global impact of DE on ACI can be estimated. For instance, modelers applied different parameterizations17,18,19,20,21 in GCMs and found DE can offset the number effect by −13–35%9,17,22; Xie et al.23 incorporated three parameterizations17,19,20 into a GCM and demonstrated that DE could offset ~7–14% of ACI globally. The high uncertainty in model evaluations may be related to the use of regionally dependent cloud PSD parameterizations. Additionally, GCM results can only provide the range of DE variations and generally lack observational validation on a global scale.

DE can be characterized as the sensitivity of the cloud PSD parameter (commonly represented by the ratio of effective radius to volume-mean radius, \(\beta\)) to the increasing aerosol number concentration (\({N}_{a}\))24,25. However, it is challenging to simultaneously obtain both \(\beta\) and \({N}_{a}\) in clouds, so some cloud variables are commonly used as proxies for \({N}_{a}\). In the early stages, \({N}_{d}\) was widely used9,17,20. As research progressed, it was found that changes in the liquid water content per cloud droplet (i.e., \({{{\mathrm{LWC}}}}/{N}_{d}\), hereafter \({{{\mathrm{LN}}}}\)) could better reflect DE, with a power–law relationship19,26:

$$\beta=a({{{LN}}})^{b},$$
(1)

where \(a\) and \(b\) are fitting parameters and DE is closely related to the parameter \(b\)19. Therefore, the key to quantifying DE is obtaining \(\beta\) and \({{{\mathrm{LN}}}}\), thereby determining parameter \(b\).

The data of \(\beta\) and \({{{\mathrm{LN}}}}\) used in previous studies were mainly obtained through regional aircraft observations9,17,19,20,27, which makes DE quantification lack global representativeness. In satellite observations, the cloud PSD is typically described by \({R}_{e}\) and the effective variance (\({V}_{e}\)), with \({V}_{e}\) being often fixed or discrete28,29, which hinders the global analysis of DE. Recently, a global dataset with both \({R}_{e}\) and continuous \({V}_{e}\) available from the POLarization and Directionality of Earth’s Reflectances (POLDER) instrument was established using artificial neural networks (NNs)30 (hereafter, POLDER-NNs, see Methods), allowing us to provide a global estimate of the impact of DE on ACI.

Two steps are proposed for the global quantification (see Methods for more details):

Step 1: Using \({R}_{e}\) and \({V}_{e}\) retrieved by POLDER-NNs over global regions dominated by liquid-phase stratiform clouds, the cloud-top β and \({{{\mathrm{LN}}}}\) can be determined as

$${\beta }_{P}={\left[\left(1-{V}_{e}\right)\left(1-2{V}_{e}\right)\right]}^{-1/3},\, {{LN}}_{P}=A\left(1-{V}_{e}\right)\left(1-2{V}_{e}\right){{R}_{e}}^{3}\cdot$$
(2)

Here, the subscript \(P\) indicates calculations based on the POLDER-NNs dataset, \(A=4\pi {\rho }_{w}/3\), where \({\rho }_{w}\) is the water density. By using \({\beta }_{P}\) and \({{LN}}_{P}\) for parameter fitting in Eq. (1), a cloud PSD parameterization based on global observations can be obtained.

Step 2: In adiabatic clouds, the cloud optical thickness (\({\tau }_{c}\)) depends on cloud-top \(\beta\) and \({N}_{d}\), as well as \({{{\mathrm{LWP}}}}\) (\({\tau }_{c}\propto {{\beta }^{-1}{{{\mathrm{LWP}}}}}^{5/6}{{N}_{d}}^{1/3}\))31. By using the adiabatic liquid water lapse rate (\({c}_{w}\)) to relate \({{{\mathrm{LWP}}}}\) to cloud-top \({{{\mathrm{LWC}}}}\), \({\tau }_{c}\) can be fully expressed in terms of cloud-top variables, leading to \({{{\tau }_{c}\propto \beta }^{-1}{{{\mathrm{LWC}}}}}^{5/3}{{N}_{d}}^{1/3}\). Given that \(\beta\) follows Eq. (1), based on the calculation of \({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\)3, the impacts of DE on the number effect (\({{{{\mathrm{DO}}}}}_{{Nd}}\), i.e., the impact of instantaneous DE) and the \({{{\mathrm{LWP}}}}\) adjustment effect (\({{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}\), i.e., the impact of adjusted DE) can be expressed as

$${{{{\mathrm{DO}}}}}_{{Nd}}=-3b \cdot 100\,\%,{{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}=\frac{3}{5}b \cdot 100\,\%,$$
(3)

where \({{{\mathrm{DO}}}}\) stands for the dispersion offset (in %), and \(b\) can be derived from step 1.

Results and discussion

Global distribution of DE’s impact on ACI

First, we use POLDER-NNs data over global regions dominated by liquid-phase stratiform clouds to calculate the spatial distribution of the parameter \(b\), and based on this, derive the global distributions of \({{{{\mathrm{DO}}}}}_{{Nd}}\) and \({{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}\), as shown in Fig. 1.

Fig. 1: Spatial distribution of annual mean values for variables related to satellite data (POLDER-NNs) used in this study.
figure 1

The first row shows variables directly provided by the POLDER-NNs dataset, including a the effective radius (\({R}_{e}\)) and b the effective variance (\({V}_{e}\)). The second row shows variables calculated from \({R}_{e}\) and \({V}_{e}\), including c the particle size distribution parameter (\({\beta }_{P}\)) and d the liquid water content per cloud droplet (\({{{{\mathrm{LN}}}}}_{P}\)). The third row presents e the parameter \(b\) by fitting \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) within each grid point (1° × 1°). The fourth row indicates variables derived from the parameter \(b\), including the impacts of the dispersion effect on f the number effect (\({{{{\mathrm{DO}}}}}_{{Nd}}\)) and g the liquid water path adjustment effect (\({{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}\)), where the dotted areas indicate the fitting relationships are statistically significant with a 95% confidence level. The global mean and standard deviation are shown in the title of each plot, with those for the dotted areas provided in parentheses.

The variables directly provided by the POLDER-NNs dataset are presented in Fig. 1a, b, including \({R}_{e}\) and \({V}_{e}\). The \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) calculated from \({R}_{e}\) and \({V}_{e}\) are shown in Fig. 1c, d. Overall, the spatial distribution of \({\beta }_{P}\) is similar to that of \({V}_{e}\), while \({{{{\mathrm{LN}}}}}_{P}\) is similar to \({R}_{e}\). Specifically, both \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) exhibit distinct land–ocean distribution characteristics, which align with our general understanding of these variables (see Methods). By fitting a power–law relationship between \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) within each grid point, the parameter \(b\) can be determined (Fig. 1e). The dotted areas indicate that the fitting relationships are statistically significant with a 95% confidence level. Overall, the spatial distribution of the parameter \(b\) exhibits two characteristics: (1) \(b\) values across different grid points are predominantly negative; (2) there is significant spatial variability for the parameter \(b\). Next, we conduct further analysis focusing on the two characteristics.

In detail, the proportion of negative \(b\) values fitted using POLDER data exceeds 97% (with the dotted areas being 100%) (Supplementary Fig. 1). However, the parameter \(b\) calculated through aircraft observations could be negative19 or positive26. We think the discrepancy is likely due to the fact that the previous in situ measurements were primarily of local/regional scale with higher spatial resolution compared to the POLDER-NNs30. Aircraft in situ observations typically cover a range of kilometers, capturing fine-scale variations of \(\beta\) to \({{{\mathrm{LN}}}}\) within clouds. Lu et al.32 and Zhang et al.33 demonstrated, through in situ observations and numerical simulations, that the cloud PSD parameter shows a positive correlation with \({{{\mathrm{LN}}}}\) for small cloud droplets. The relatively coarse spatial grid of the POLDER-NNs can only capture the dominant large-scale relationship, overshadowing the less frequent positive correlations and leading to our generally negative calculated \(b\) values (Fig. 1e). However, since the grid scale used in GCMs is also relatively coarse and reflects the overall conditions within large-scale grids of hundreds of kilometers, these results are appropriately matched for model evaluations.

Additionally, there is significant spatial dependency in the distribution of \(b\) values across different grid points (with a standard deviation of 0.015 and 0.013 within the dotted areas). Overall, more negative \(b\) values are predominantly concentrated in regions heavily influenced by anthropogenic aerosols, indicating that the PSD shape response to aerosol changes is more sensitive in these regions. To explain the spatial distribution of \(b\), we plotted the relative changes of \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) (Supplementary Fig. 2). Analysis revealed that the parameter \(b\) over land is determined by the combined variations of \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\), while over ocean, it is primarily determined by the variations in \({\beta }_{P}\) (see Methods). However, current GCMs do not consider spatial variations in the parameter \(b\), which could introduce biases in simulating the cloud PSD and DE.

The spatial distribution of \(b\) can be used to estimate the spatial distribution of \({{{{\mathrm{DO}}}}}_{{Nd}}\) and \({{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}\) (Fig. 1f, g), indicating that the impacts of instantaneous and adjusted DE also exhibit spatial variability. Considering only regions with high-reliability \(b\) values (dotted areas), DE globally exhibits a 9.31% offset on the number effect and a 1.86% enhancement on \({{{\mathrm{LWP}}}}\) adjustment effect.

Overall assessment of DE’s impact on ACI

Next, we examine all POLDER-NNs data collected throughout the year over global regions dominated by liquid-phase stratiform clouds. By fitting \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) calculated from all the data, we find that they exhibit a clear power–law relationship (Pearson correlation coefficient r = −0.53, p < 0.01), with the fitting equation being \({\beta }_{P}=0.68{{{{{\mathrm{LN}}}}}_{P}}^{-0.024}\) (Fig. 2a). According to Eq. (3) and \(b=-0.024\), DE can offset 7.2% of the number effect but enhance the \({{{\mathrm{LWP}}}}\) adjustment effect by 1.44%.

Fig. 2: Relationships between particle size distribution and liquid water content per droplet across regions.
figure 2

The x-axis represents the particle size distribution parameter (\({\beta }_{P}\)), while the y-axis represents the liquid water content per cloud droplet (\({\rm{LN}}_{P}\)). The first row shows scatter density plots, and the second row displays joint histograms, all plotted with logarithmic scales on both axes. Panels a, d correspond to the global region (60°S–60°N), b, e to land, and c, f to ocean. The solid black lines represent the fitted curves, and the fitting equations along with statistical parameters are labeled in the top right corner, where \(y\) means \({\beta }_{P}\) and \(x\) means \({{{{\mathrm{LN}}}}}_{P}\). For the scatter density plots, the color represents the frequency of data points within each small interval. And \({x\_avg}\) and \({y\_avg}\) means the averages of \({{{{\mathrm{LN}}}}}_{P}\) and \({\beta }_{P}\), respectively. For the joint histograms, the color represents the probability density within this range, wherein each column is normalized so that it sums to 1. The black dot represents the median of \({\beta }_{P}\) with an equal number of samples. The shaded area is the 95% confidence interval (according to a Student’s \(t\)-test) to represent the fitting uncertainty.

Considering the impact of underlying surfaces on the cloud PSD, we further conduct regressions for land and ocean separately (Fig. 2b, c). The \(b\) values are −0.026 for land and −0.028 for ocean. Correspondingly, the values of \({{{{\mathrm{DO}}}}}_{{Nd}}\) and \({{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}\) are 7.8, −1.56% for land and 8.4, −1.68% for ocean. Specifically, the mean \({{{{\mathrm{LN}}}}}_{P}\) for the land (5.2 \({{\rm{ng}}}\)) is lower than that for the ocean (7.3 \({{\rm{ng}}}\)), but the mean \({\beta }_{P}\) for the land is slightly higher (1.098 vs. 1.095). Overall, the impact of underlying surfaces on the parameter \(b\) is insignificant, changing the \(b\) value by only about 0.002.

A linear regression in log-to-log space is applied to all the satellite data directly in the above analyses. However, a more widely used approach for calculating the sensitivity between two variables with huge amounts of satellite data is the pre-binned method16,34,35,36. Here, we also use the pre-binned method to fit the parameter \(b\), as shown in Fig. 2d–f. The values of the parameter \(b\) in global, land, and ocean regions are −0.020, −0.024, and −0.023, with \(\left|r\right|\) not less than 0.87, demonstrating a stronger power–law relationship. Comparing these with Fig. 2a–c, the absolute biases between the two methods for global, land, and ocean regions are 0.004, 0.002, and 0.005, respectively, suggesting that the calculation method of the fitting parameter influences the results, particularly in ocean regions. Additionally, the empirical power–law appears to fit the oceanic data better than the land data in the pre-binned method, which is more susceptible to extreme values. We speculate that this is mainly related to retrieval algorithm challenges in accurately retrieving large cloud droplets. Compared to oceanic regions, retrieval uncertainty for large cloud droplets over land is greater30. This increased uncertainty may cause \({\beta }_{P}\) over land to become less sensitive to \({{{{\mathrm{LN}}}}}_{P}\) with a value greater than 5 ng (see Methods), thereby weakening the fitted correlation coefficient (Fig. 2e).

Quantitative analysis of uncertainty

Currently, the quantifiable sources of uncertainty include: (1) the inherent limitations of the POLDER-NNs30; (2) cloud heterogeneity37; (3) the retrieval method, wavelength, and grid scale29; and (4) the fitting method for parameter \(b\)34. Based on previous studies, we determined the uncertainty range of \(b\) caused by different sources by considering a bias-corrected random Gaussian noise (Fig. 3a). Subsequently, a Monte Carlo method38 was applied to evaluate the overall impact of the four uncertainty sources on the estimation of \(b\). It was assumed that the \(b\) values associated with each source follow a normal distribution. A random sampling process was conducted 10 million times, and for each iteration, the mean \(b\) value influenced by the different sources was calculated, resulting in a probability density distribution of \(b\) (Fig. 3b). The mean of this distribution is taken as the best estimate of \(b\), while the 5–95% confidence interval is used to represent the uncertainty range. A detailed description of the method is provided in the Methods section.

Fig. 3: A framework of uncertainty quantification.
figure 3

a The flowchart of uncertainty quantification, where \({b}_{{s\_F}}\) and \({{{{\mathrm{SE}}}}}_{{s\_F}}\) represent the fitting parameter \(b\) and its standard error (\({{{\mathrm{SE}}}}\)) considering different sources of uncertainty (\(S=s1,{s}2,{s}3\), representing sources 1, 2, and 3) and using different fitting methods (\(F=p,{d}\), representing pre-binned and direct fitting methods). \(N({b}_{{s\_F}},\,{{{{{\mathrm{SE}}}}}_{{s\_F}}}^{2})\) represents a normal distribution with a mean of \({b}_{{s\_F}}\) and a standard deviation of \({{{{\mathrm{SE}}}}}_{{s\_F}}\). b The probability distribution functions of the parameter \(b\) in ocean, land, and global scales, where the point and errorbar represent the best estimate (i.e., the mean value) and its 5–95% confidence interval.

The results show that the best estimate of the \(b\) (with 5–95% uncertainty in parentheses) is −0.024 (−0.026 to −0.022) globally, −0.025 (−0.028 to −0.022) over land, and −0.029 (−0.031 to −0.026) over ocean. The corresponding \({{{{\mathrm{DO}}}}}_{{Nd}}\) values are 7.2% (6.6–7.8%), 7.5% (6.6–8.4%), and 8.7% (7.8–9.3%), and the \({{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}\) values are −1.44% (−1.56 to −1.32%), −1.50% (−1.68 to −1.32%), and −1.74% (−1.86 to −1.56%), respectively, as shown in the Table 1. This indicates that the dispersion effect caused by increased aerosols offsets the number effect by ~7% and enhances the \({{{\mathrm{LWP}}}}\) adjustment effect by about 1.4% globally, with a stronger impact on clouds over the ocean.

Table 1 Comparisons of results from this study, aircraft observations, and general circulation models (GCMs)

Comparison with previous studies

This section compares \(b\) and \({{{\mathrm{DO}}}}\) obtained by POLDER-NNs with those by aircraft and GCMs (Table 1). In most studies, the \(b\) values were derived from aircraft observations and \({{{\mathrm{DO}}}}\) values estimated in GCMs (an exception is the study by Liu and Daum7, which derived \({{{\mathrm{DO}}}}\) from aircraft data and theoretical estimation).

The aircraft data used to fit \(b\) in previous studies were conducted in various regions. For instance, Liu et al. 19 obtained a value of −0.14 by analyzing aircraft observations sampled in North America. Martins and Silva Dias26 fitted the relationship between \(\beta\) and \({{{{\mathrm{LN}}}}}\) by studying clouds in the Amazon, obtaining a value of 0.072. In this study, whether fitting globally different regions as a whole (−0.026 to −0.022) or fitting data separately within each grid and then calculating the global mean (−0.031 \(\pm\) 0.013), the \(b\) values fall within the range of previously fitted \(b\) values using aircraft observations (Table 1). Additionally, the \(b\) values calculated in this study are predominantly negative across various global regions, and their absolute values are smaller compared to previous results.

Previous studies primarily focused on the impact of instantaneous DE (i.e., \({{{{\mathrm{DO}}}}}_{{Nd}}\)), as shown in Table 1. In the early stages, Liu and Daum 7 suggested that \({{{{\mathrm{DO}}}}}_{{Nd}}\) could even reach 80%. However, as research progressed, the values for \({{{{\mathrm{DO}}}}}_{{Nd}}\) consistently decreased. Recent research based on GCMs shows that the impact of instantaneous DE ranges from −13% (enhancement) to 35% (offset)9,17,22. Our results (6.6–7.8%) fall within the range of \({{{{\mathrm{DO}}}}}_{{Nd}}\) calculated by previous studies. However, previous estimations could only provide a range of \({{{{\mathrm{DO}}}}}_{{Nd}}\) calculated from different parameterizations, without validation with global observations. The \({{{{\mathrm{DO}}}}}_{{Nd}}\) obtained in this study could be a validation reference for the assessment of parameterizations, contributing to further improvements and developments in GCMs.

Previous parameterizations derived from aircraft in situ observations are all at a local/regional scale, which partially meets the requirements for global simulations conducted by GCMs. This study utilizes satellite data to obtain a global-scale fitting parameterization (\(\beta=0.68{{{{\mathrm{LN}}}}}^{-0.024}\) or \(\beta=0.73{{{{\mathrm{LN}}}}}^{-0.020}\), see Fig. 2), which better meets the requirements for GCM applications. In the future, the parameterization is expected to enhance the models’ ability to simulate cloud PSD for liquid-phase stratiform clouds, thereby reducing the uncertainty in climate assessments.

To ensure the robustness and applicability of the conclusions, the following sections provide a detailed discussion of the assumptions, causal interpretation, limitations, and implications for future ACI studies.

Assumption of adiabaticity of clouds

The derivation of \({{{\mathrm{DO}}}}\) is dependent on the assumption that the observed clouds are under moist adiabatic conditions, wherein clouds are not subject to the entrainment-mixing process. According to the dependence of optical depth on the adiabatic liquid water lapse rate (\({c}_{w}\)) in Eq. (15) as \({{c}_{w}}^{-1/6}\), variations in sub-adiabaticity due to entrainment-mixing are unlikely to significantly affect the albedo or the results of this study when modifying the overall \({c}_{w}\) in the cloud (Eq. (16)).

Causal analysis of aerosol effects on \({{{{\mathrm{LN}}}}}\) and \({{\boldsymbol{\beta }}}\)

First, this causal relationship aligns with the physical understanding of cloud microphysical processes. The formation of the cloud PSD is primarily controlled by, until collision-coalescence sets in, the condensation growth process19,21. And the negative relationship basically reflects the fact that condensation leads to a narrow size distribution as droplets grow. The empirical relationship between \({{{\mathrm{LN}}}}\) and \(\beta\) was demonstrated by Wood 39, which was later supported by the theoretical analysis of Liu et al. 40 and further validated by aircraft observations from several campaigns by Liu et al. 19. A significant relationship obtained using POLDER-NNs to some extent validates this assumption (Figs. 1, 2). Therefore, we think the co-varying relationship between \(\beta\) and \({{{\mathrm{LN}}}}\) has solid physics behind it, instead of being an observational artifact.

In addition, this causal relationship can also be confirmed through statistical analysis. To examine the relationship between aerosols, we employ a POLDER aerosol product41,42,43,44 to calculate the aerosol index (\({{{\mathrm{AI}}}}\)) and plot joint histograms of \({{{\mathrm{AI}}}}\) versus \({{{\mathrm{LN}}}}\) and \(\beta\) (See Methods). As shown in Fig. 4a, b, \({{{\mathrm{AI}}}}\) and \({{{\mathrm{LN}}}}\) exhibit a negative power–law relationship, whereas \({{{\mathrm{AI}}}}\) and \(\beta\) show an overall positive power–law correlation, which is consistent with theoretical analysis. When aerosol increases and the \({{{\mathrm{LWC}}}}\) in the cloud remains relatively stable, the liquid water per cloud droplet (\({{{\mathrm{LN}}}}\)) decreases. At the same time, \(\beta\) increases overall with \({{{\mathrm{AI}}}}\), aligning with most previous studies that reported aerosol-induced broadening of the cloud PSD7,9,19,20. Given the negative power–law relationship between \({{{\mathrm{LN}}}}\) and \(\beta\) (Fig. 2), it is hypothesized that \({{{\mathrm{LN}}}}\) plays a crucial mediating role in the impact of \({{{\mathrm{AI}}}}\) on \(\beta\).

Fig. 4: Relationships between cloud parameters and aerosol, along with results of mediation analysis.
figure 4

Joint histograms for a the liquid water content per cloud droplet (\({{{{\mathrm{LN}}}}}_{P}\)) and b the particle size distribution parameter (\({\beta }_{P}\)) versus the aerosol index (\({{{\mathrm{AI}}}}\)) over the global region (60°S–60°N), with both axes on logarithmic scales. c The result of mediation analysis, where ACMI and ADI represent the average causal mediation impact and average direct impact, respectively. The impact size means the change in \({\mathrm{ln}}(\beta )\) due to a one-unit increase in \({\mathrm{ln}}({{{\mathrm{AI}}}})\). The bar represents the mean impact size, and the error bar indicates the 95% confidence interval.

To investigate this, we conducted a causal mediation analysis45 to examine the role of \({{{\mathrm{LN}}}}\) as a mediator in \({{{\mathrm{AI}}}}\)’s impact on \(\beta\), with the results shown in Fig. 4c (see Methods). The average causal mediation impact (0.0037) indicates the influence of \({{{\mathrm{AI}}}}\) on \(\beta\) transmitted through the mediator \({{{\mathrm{LN}}}}\), while the average direct impact (0.0041) represents the direct influence of \({{{\mathrm{AI}}}}\) on \(\beta\). The total impact is 0.0078, and all results are statistically significant (p < 0.001). The results indicate that \({{{\mathrm{LN}}}}\) plays a significant mediating role in the impact of \({{{\mathrm{AI}}}}\) on \(\beta\), further supporting the causality of the findings.

Limitations of this study

Some limitations of this study need to be pointed out. First, due to the limited data resolution and the predefined data filtering criteria, this study mainly focuses on liquid-phase stratiform clouds, which account for over 78% of the samples (see Methods and Supplementary Fig. 3). Accordingly, the cloud PSD analysis, quantification of the dispersion effect, and parameterization proposed in this work are primarily applicable to liquid-phase stratiform clouds. While some large-scale cumulus clouds are included, the relevance of our findings to typical cumulus clouds remains uncertain and warrants further investigation using higher-resolution satellite observations46,47,48 (e.g., from the multi-viewing multi-channel multi-polarization imaging49). Given that cumulus clouds generally play a less significant role in ACI50, our focus aligns with the core objectives of current ACI research. Second, the analysis in this study is limited to 1 year of data (2006). Longer-term datasets are needed in the future for further validation and trend analysis. Finally, although the \(\beta\) values calculated based on \({V}_{e}\) fall within the uncertainty range of previous studies, the \({V}_{e}\) in POLDER-NNs has only been compared with synthetic data and has not yet been validated against observational data. Despite these limitations, the quantitative method and analytical framework proposed in this study can still provide valuable insights for future DE research.

Implications for future ACI estimation

Additional aerosols can modify the shape of cloud PSD, thereby modifying the cloud albedo and forcing, a phenomenon known as the dispersion effect (DE). However, this effect is typically ignored when calculating the \({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\), largely due to a lack of global quantitative estimations of DE’s impact on ACI. To address the gap, this study proposed a quantitative method based on physical mechanisms, utilizing physical equations and theoretical derivations. By using POLDER satellite observations, this study quantifies the global impact of DE on ACI over regions dominated by liquid-phase stratiform clouds. Based on a comprehensive analysis from multiple perspectives, it can be concluded that DE offsets the number effect by ~7% but enhances the \({{{\mathrm{LWP}}}}\) adjustment effect by around 1.4% on a global scale. Hence, the DE has a non-negligible impact on ACI, necessitating its consideration in future \({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\) calculation.

Methods

POLDER retrievals

The POLarization and Directionality of Earth’s Reflectances (POLDER) instrument, here in its version mounted on the PARASOL (polarization and anisotropy of reflectances for atmospheric science coupled with observations from a Lidar) microsatellite51, provides global cloud properties by multi-angle polarimetric observations52,53,54. Thanks to the polarimetric measurements, POLDER can retrieve two pieces of information on the cloud PSD at the cloud top, namely, besides the cloud effective radius (\({R}_{e}\)), also the effective variance (\({V}_{e}\))55,56.

Recently, Di Noia et al. 30 introduced a neural network algorithm (NNs) to utilize multi-angle and multi-wavelength polarimetric measurements from POLDER Level-1 data (5 km \(\times\) 6 km) to retrieve high-resolution, numerically continuous \({R}_{e}\) and \({V}_{e}\). Comparisons with currently available POLDER datasets indicate that the algorithm possesses improved capabilities in retrieving \({R}_{e}\). Furthermore, the method can provide continuous values of \({V}_{e}\) from 0.03 to 0.35, which are often fixed or discrete in others28,29. To ensure the accuracy of the retrieval results, eliminate the influence of ice-/mixed-phase clouds, and facilitate subsequent data usage, ref. 30 performed strict data screening and re-gridding on the retrieved data, resulting in a 1° × 1° dataset containing only liquid cloud samples, referred to as L2-REGRID-REFF. The screening and processing of data include:

  1. 1.

    The cloudbow scattering angle range observed by POLDER is between 135° and 165°;

  2. 2.

    The total cloud cover is greater than 0.95;

  3. 3.

    The cloud phase index is less than 50 (excluding ice clouds and mixed-phase clouds);

  4. 4.

    The cloud-top pressure is higher than \(600{{\rm{hPa}}}\) (further excluding the influence of ice);

  5. 5.

    Data over the ocean is not affected by sun glint;

  6. 6.

    The data is from mid-to-low-latitude regions with ample sampling points (60° \({{\rm{S}}} \sim\) 60° N);

  7. 7.

    The data is re-gridded to the 1° × 1° grid with a daily (instantaneous) temporal resolution.

Due to the large volume of Level-1 data (several terabytes) and the high computational burden of using the NNs algorithm, the L2-REGRID-REFF provided by Di Noia et al. 30 only includes results from 2006. To ensure the subsequent analysis results have statistical significance, we further excluded grid points with fewer than ten valid samples throughout the year. Finally, we obtained 1,127,994 valid samples, of which 400,776 are land data and 727,218 are ocean data. This dataset is referred to as POLDER-NNs and has been used in this study.

Cloud types in the POLDER-NNs dataset were classified based on the International Satellite Cloud Climatology Project (ISCCP) cloud classification scheme57. The results show that stratocumulus, altostratus, cumulus, altocumulus, nimbostratus, and stratus account for 51.2, 22.2, 15.8, 5.7, 2.8, and 2.4% of the samples, respectively, as illustrated in Supplementary Fig. 3. Notably, stratocumulus clouds, despite their partly cumulus-like appearance, are typically categorized as stratiform clouds due to their broad horizontal extent and limited vertical development58. In addition, due to the strict cloud cover criterion (cloud cover >0.95 within the 5 km × 6 km grid), the cumuliform clouds included here are limited to those with relatively large horizontal scales, while small, isolated cumulus clouds (e.g., significantly smaller than 5 km × 6 km) are excluded.

In summary, the results in this study are primarily applicable to liquid-phase stratiform clouds (stratocumulus, altostratus, nimbostratus, and stratus, collectively accounting for over 78% of the samples). While some large-scale cumulus clouds are included in the dataset, the applicability of our findings to typical cumulus clouds remains uncertain and warrants further investigation. Additionally, this study specifically focuses on the impact of the dispersion effect on ACI. Since cumulus generally play a less dominant role in ACI due to their short lifetimes, small spatial coverage, and weak coupling with large-scale radiative processes50, our focus on liquid-phase stratiform clouds aligns well with the primary goals of current ACI research.

\({{\boldsymbol{\beta }}}\) and \({{{{\mathbf{LN}}}}}\) derived from POLDER retrievals

In GCMs, \({R}_{e}\) is generally parameterized through a PSD parameter \(\beta\) that relates \({R}_{e}\) to the volume-mean radius (\({R}_{v}\))59,60:

$${R}_{e}=\beta {R}_{v}=\beta {\left(\frac{3{{{\mathrm{LWC}}}}}{4\pi {\rho }_{w}{N}_{d}}\right)}^{1/3},$$
(4)

where \({{{\mathrm{LWC}}}}\) is the liquid water content (in \({{\rm{kg}}}\,{{{\rm{m}}}}^{-3}\)), \({\rho }_{w}\) = \(1000\,{{\rm{kg}}}\,{{{\rm{m}}}}^{-3}\) is the water density, and \({N}_{d}\) is the number concentration of cloud droplets that are assumed spherical. \(\beta\) can be well estimated by the relative dispersion (\(\varepsilon\), defined as the ratio of the standard deviation to the mean radius) by assuming a gamma distribution of the cloud PSD60:

$$\beta=\frac{{\left(1+2{\varepsilon }^{2}\right)}^{2/3}}{{\left(1+{\varepsilon }^{2}\right)}^{1/3}},$$
(5)
$$\varepsilon=\frac{\sigma }{{R}_{m}},$$
(6)

where \(\sigma\) is the standard deviation, and \({R}_{m}\) is the mean radius. However, both \(\sigma\) and \({R}_{m}\) cannot be obtained through prognosis in GCMs. In order to calculate \(\beta\) in global models, different parameterizations were proposed.

In the early 21st century, parameterizations of the cloud PSD only considered the relationship between \(\varepsilon\) or \(\beta\) and \({N}_{d}\)9,17,20. Later parameterizations gradually recognized the importance of \({{{\mathrm{LWC}}}}\) and began linking \(\beta\) with \({{{\mathrm{LWC}}}}/{N}_{d}\) (the liquid water mass per cloud droplet, hereafter \({{{\mathrm{LN}}}}\)), expressing their relationship in power–law form19,26, as shown in Eq. (1). Additionally, two-moment cloud microphysics schemes in GCMs can prognosticate \({{{\mathrm{LWC}}}}\) and \({N}_{d}\) directly61, making this form convenient for use in models.

According to the retrieval algorithm of POLDER, the PSD of liquid clouds (\(\phi\)) is characterized by a gamma distribution52,62:

$$\phi \left(r\right)=C{{r}_{c}}^{(1-3{V}_{e})/{V}_{e}}\,{e}^{\left(-{r}_{c}/({R}_{e}{V}_{e})\right)},$$
(7)

where \(C\) is the intercept parameter, and \({r}_{c}\) is the cloud droplet radius. Correspondingly, \(\sigma\) and \({R}_{m}\) can be expressed in terms of \({R}_{e}\) and \({V}_{e}\)62:

$${\sigma }_{P}=\sqrt{\left(1-2{V}_{e}\right){V}_{e}}{R}_{e},$$
(8)
$${{R}_{m}}_{P}=(1-2{V}_{e}){R}_{e},$$
(9)

where the subscript \(P\) indicates calculation using POLDER data.

Furthermore, \(\varepsilon\) and \(\beta\) can be derived from POLDER retrievals as

$${\varepsilon }_{P}=\frac{{\sigma }_{P}}{{{R}_{m}}_{P}}=\sqrt{\frac{{V}_{e}}{1-2{V}_{e}}},$$
(10)
$${\beta }_{P}=\frac{{\left(1+2{\varepsilon }_{P}^{2}\right)}^{2/3}}{{\left(1+{\varepsilon }_{P}^{2}\right)}^{1/3}}={\left[\left(1-{V}_{e}\right)\left(1-2{V}_{e}\right)\right]}^{-1/3},$$
(11)

It should be noted that the mode radius of the gamma function is \(\left(1-3{V}_{e}\right){R}_{e}\)52, which means that the \({V}_{e}\) should be less than 1/3 to ensure the existence of peaks in the gamma function (i.e., the mode radius is greater than 0). In the data provided by Di Noia et al. 30, the range of \({V}_{e}\) is given as 0.03 to 0.35. When conducting subsequent studies, the parts greater than 1/3 were excluded first.

According to Eqs. (4) and (11), we can obtain cloud-top \({{{\mathrm{LN}}}}\) using \({R}_{e}\) and \({V}_{e}\) provided by POLDER (denoted as \({{{{\mathrm{LN}}}}}_{P}\)):

$${{{LN}}}_{P}=\frac{4\pi {\rho }_{w}}{3}\left(1-{V}_{e}\right)\left(1-2{V}_{e}\right){{R}_{e}}^{3} \cdot$$
(12)

Exponentially fitting the \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) yields parameter \(b\) (Eq. (1)), which can be used to quantitatively assess the global impact of DE on ACI, as described below.

Impacts of DE on ACI

The effective radiative forcing of aerosol–cloud interactions (\({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\)) can be represented as the forcing sum of the Twomey effect (instantaneous effect) and the associated rapid adjustments3,16:

$${{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}={F}_{{Nd}}+{F}_{{{{\mathrm{LWP}}}}}+{F}_{f},$$
(13)

where \({F}_{{{{\mathrm{Nd}}}}}\), \({F}_{{{{\mathrm{LWP}}}}}\), and \({F}_{f}\) are the radiative forcing of the Twomey effect, and the radiative adjustments of \({{{\mathrm{LWP}}}}\), and \(f\), respectively.

Considering the dependence of cloud albedo (\({\alpha }_{c}\)) on cloud optical depth (\({\tau }_{c}\))63 and the relationship between \({\tau }_{c}\) with \({{{{{\mathrm{LWP}}}}}}\) and \({N}_{d}\)31, there are relationships in adiabatic clouds3,64:

$$\frac{{{\rm{dln}}}{\alpha }_{c}}{{{\rm{dln}}}{\tau }_{c}}=1-{\alpha }_{c},$$
(14)
$${\tau }_{c}=\frac{6}{5}\pi {Q}_{{{{\mathrm{ext}}}}}{B}^{\frac{2}{3}}{{\beta }^{-1}{{{\mathrm{LWP}}}}}^{\frac{5}{6}}{{N}_{d}}^{\frac{1}{3}},$$
(15)

where \(B={\left[\left(4/3\right)\pi {\rho }_{w}\right]}^{-1}{\left(2{c}_{w}\right)}^{-1/4}\), \({c}_{w}\) is the adiabatic liquid water lapse rate (\({c}_{w}={{\rm{d}}}{{{\mathrm{LWC}}}}/{{\rm{d}}}z\), \(z\) is the height above the cloud base) and is considered to be constant through the cloud64, \({Q}_{{{{\mathrm{ext}}}}}\) is the Mie efficiency factor and is usually set to 231,65, \(\beta\) is often set as a constant in \({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\) calculation3,16, and \({N}_{d}\) is assumed vertically uniform in adiabatic clouds. Thus, Eq. (15) can be rewritten as

$${\tau }_{c}\propto {{{{\mathrm{LWP}}}}}^{\frac{5}{6}}{{N}_{d}}^{\frac{1}{3}} \cdot$$
(16)

Correspondingly, \({F}_{{Nd}}\) and \({F}_{{{{\mathrm{LWP}}}}}\) can be represented as3

$${F}_{{Nd}}=\frac{1}{3}{\alpha }_{c}\left(1-{\alpha }_{c}\right)\cdot {c}_{{Nd}}\cdot \Delta {\mathrm{ln}}{N}_{d,{ant}},$$
(17)
$${F}_{{{{\mathrm{LWP}}}}}=\frac{5}{6}{\alpha }_{c}\left(1-{\alpha }_{c}\right)\cdot {c}_{{{{\mathrm{LWP}}}}}\cdot \Delta {\mathrm{ln}}{{{{\mathrm{LWP}}}}}_{{{{\mathrm{ant}}}}},$$
(18)

where \(\Delta {\mathrm{ln}}{N}_{d,{{{\mathrm{ant}}}}}\) and \(\Delta {\mathrm{ln}}{{{{\mathrm{LWP}}}}}_{{{{\mathrm{ant}}}}}\) are the anthropogenic perturbations of \({N}_{d}\) and \({{{\mathrm{LWP}}}}\). And \({c}_{{Nd}}\) and \({c}_{{{{\mathrm{LWP}}}}}\) are the effective cloud fractions for \({N}_{d}\) and \({{{\mathrm{LWP}}}}\), respectively. Its “effectiveness” stems not solely from the partial coverage offered by liquid clouds but also from considering the spatial correlations among other pertinent factors in deriving \({F}_{{Nd}}\) and \({F}_{{LWP}}\)3. It should be noted that \({F}_{{Nd}}\) here only considers the impact of \({N}_{d}\) changes on \({R}_{e}\) and consequently \({\alpha }_{c}\), without accounting for \(\beta\). Therefore, it can be regarded as part of the Twomey effect (referred to as the number effect).

In adiabatic liquid clouds, the vertical profile of \({{\rm{LWC}}}\) is termed the adiabatic condensation profile with a constant \({c}_{w}\):

$${{{\mathrm{LWC}}}}(z)={\int }_{0}^{z}{c}_{w}{{\rm{d}}}z={c}_{w}z,$$
(19)

and for \({{{\mathrm{LWP}}}}\):

$${{{\mathrm{LWP}}}}={\int }_{0}^{H}{c}_{w}{zdz}=\frac{1}{2}{c}_{w}{H}^{2},$$
(20)

where \(H\) is the cloud depth and \({c}_{w}\) is approximately \(2\,{{\rm{mg}}}\,{{{\rm{m}}}}^{-3}\,{{{\rm{m}}}}^{-1}\). Considering that satellite observations mainly capture information at the cloud top, here we relate Eq. (16) to the cloud-top \({{{\mathrm{LWC}}}}\) and \({N}_{d}\), denoted as \({{{{\mathrm{LWC}}}}}_{{{{\mathrm{top}}}}}\) and \({N}_{{d\_{{\mathrm{top}}}}}\), respectively. Since \({N}_{d}\) remains constant in the adiabatic cloud, \({N}_{{d\_{{\mathrm{top}}}}}\) equals \({N}_{d}\). Utilizing Eq. (19), we can derive \({{{{\mathrm{LWC}}}}}_{{{{\mathrm{top}}}}}\) as:

$${{{{\mathrm{LWC}}}}}_{{{{\mathrm{top}}}}}\,={c}_{w}H .$$
(21)

Combining Eqs. (20) and (21), we obtain:

$${{{\mathrm{LWP}}}}=\frac{{{{{{\mathrm{LWC}}}}}_{{{{\mathrm{top}}}}}}^{2}}{2{c}_{w}} .$$
(22)

Substituting Eq. (22) and \({N}_{{d\_{{\mathrm{top}}}}}\) into Eq. (16), we get:

$${\tau }_{c}\propto {{{{{\mathrm{LWC}}}}}_{{{{\mathrm{top}}}}}}^{\frac{5}{3}}{{N}_{d{{\rm{\_}}}{{{\mathrm{top}}}}}}^{\frac{1}{3}},$$
(23)

Although Eq. (23) incorporates \(\beta\) in the definition of \({\tau }_{c}\) (Eq. (15)), \(\beta\) is treated as a fixed parameter during practical implementation. As a result, variations in \(\beta\) cannot be accounted for when evaluating \({\tau }_{c}\) under anthropogenic aerosol perturbations (Eq. (16)), thereby neglecting the influence of the dispersion effect in the estimation of \({{{{\mathrm{ERF}}}}}_{{{{\mathrm{aci}}}}}\) (Eqs. (17, 18)). When the dispersion effect is considered (i.e., when \(\beta\) is treated as a variable rather than a constant), \({\tau }_{c}\) can be represented as64:

$${\tau }_{c}\propto {\beta }^{-1}{{{{\mathrm{LWP}}}}}^{\frac{5}{6}}{{N}_{d}}^{\frac{1}{3}}\cdot$$
(24)

Given \(\beta=a{\left({{{{\mathrm{LWC}}}}}_{{top}}/{N}_{{d\_top}}\right)}^{b}\) (Eq. (1)), and utilizing Eq. (23), we now can rewrite \({\tau }_{c}\) in terms of \({{{{\mathrm{LWC}}}}}_{{{{\mathrm{top}}}}}\), \({N}_{{d\_{{\mathrm{top}}}}}\), and the \(b\) parameter as

$${\tau }_{c}\propto {{{{{\mathrm{LWC}}}}}_{{{{\mathrm{top}}}}}}^{\frac{5}{3}-b}{{N}_{d{{\rm{\_}}}{{{\mathrm{top}}}}}}^{\frac{1}{3}+b}\cdot$$
(25)

Correspondingly, \({F}_{{Nd}}\) and \({F}_{{{{\mathrm{LWP}}}}}\), when considering the dispersion effect (i.e., \({F}_{{N}_{d}}^{{{{\mathrm{disp}}}}}\) and \({F}_{{{{\mathrm{LWP}}}}}^{{{{\mathrm{disp}}}}}\)), can be written as

$${F}_{{N}_{d}}^{{{{\mathrm{disp}}}}}=\left(\frac{1}{3}+b\right)\cdot {\alpha }_{c}\left(1-{\alpha }_{c}\right)\cdot {c}_{{{{\mathrm{Nd}}}}}\cdot \Delta {\mathrm{ln}}{N}_{d,{{{\mathrm{ant}}}}}={F}_{{{{\mathrm{Nd}}}}}+3b{F}_{{{{\mathrm{Nd}}}}},$$
(26)
$${F}_{{{{\mathrm{LWP}}}}}^{{{{\mathrm{disp}}}}}=(\frac{5}{6}-\frac{b}{2})\cdot \alpha _{c}(1-{\alpha }_{c})\cdot {c}_{{{{\mathrm{LWP}}}}}\cdot \Delta {\mathrm{ln}}{{{{\mathrm{LWP}}}}}_{{{{\mathrm{ant}}}}}={F}_{{{{\mathrm{LWP}}}}}-\frac{3}{5}b{F}_{{{{\mathrm{LWP}}}}},$$
(27)

where \(3b{F}_{{Nd}}\) and \(-3/5\cdot b{F}_{{{{\mathrm{LWP}}}}}\) represent the radiative forcing caused by the instantaneous DE (\({F}_{{{{\mathrm{DE}}}}\_{Nd}}\)) and the adjusted DE (\({F}_{{{{\mathrm{DE}}}\_{{\mathrm{LWP}}}}}\)). Accordingly, the total radiative forcing caused by the dispersion effect (\({F}_{{{{\mathrm{DE}}}}}\)) can be expressed as:

$${F}_{{{{\mathrm{DE}}}}}={F}_{{{{\mathrm{DE}}}}\_{Nd}}+{F}_{{{{\mathrm{DE}}}}\_{{\rm{LWP}}}}=3b{F}_{{Nd}}-\frac{3}{5}b{F}_{{{{\mathrm{LWP}}}}} \cdot$$
(28)

To quantify the impact of the dispersion effect on ACI, we define the dispersion offset (\({{{\mathrm{DO}}}}\)):

$${{{{\mathrm{DO}}}}}_{{{{\mathrm{Nd}}}}}=-\frac{{F}_{{{{\mathrm{DE}}}}\_{Nd}}}{{F}_{{{{\mathrm{Nd}}}}}} \cdot 100\,\%=-3b \cdot 100\,\%,$$
(29)
$${{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}=-\frac{{F}_{{{{\mathrm{DE}}}}\_{{{\mathrm{LWP}}}}}}{{F}_{{{{\mathrm{LWP}}}}}} \cdot 100\,\%=\frac{3}{5}b \cdot 100\,\%,$$
(30)

where \({{{{\mathrm{DO}}}}}_{{{{\mathrm{Nd}}}}}\) and \({{{{\mathrm{DO}}}}}_{{{{\mathrm{LWP}}}}}\) indicate the impact of DE on the number effect (i.e., the impact of instantaneous DE) and \({{{\mathrm{LWP}}}}\) adjustment effect (the impact of adjusted DE), respectively. Positive values indicate offsetting, while negative values indicate enhancement. The parameter \(b\) can be obtained by fitting the \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) from POLDER-NNs, as described in Eqs. (11) and (12).

Explanation of the spatial distributions of \({{{\boldsymbol{\beta }}}}_{{{\boldsymbol{P}}}}\) and \({{{{{\mathbf{LN}}}}}}_{{{\boldsymbol{P}}}}\)

In the main text, we pointed out that \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\) exhibit distinct land–ocean distribution characteristics. Here, we further discuss these characteristics.

The values of \({\beta }_{P}\) over land are generally higher than those over ocean, reflecting the broadening effect of aerosols on the cloud PSD, which is consistent with previous analyses based on aircraft observations8,9. However, even within the same oceanic or terrestrial region, \({\beta }_{P}\) exhibits regional variations. For instance, over oceans, coastal areas, and regions affected by aerosols (such as the North Pacific) have higher \({\beta }_{P}\) values compared to more open ocean areas (like the Southern Ocean). Similarly, over land, \({\beta }_{P}\) also shows regional differences. For example, \({\beta }_{P}\) values are higher in East Asia and India compared to the relatively cleaner Europe. Notably, \({\beta }_{P}\) is higher over India than over the more polluted regions of China, which we think could be related to the abundant moisture conditions in India, though the specific mechanisms need further investigation. The \({{LN}}_{P}\), which physically represents the amount of water vapor each cloud droplet can obtain, also shows land–ocean distribution characteristics. As shown in Fig. 1d, \({{{{\mathrm{LN}}}}}_{P}\) values are generally lower over land compared to ocean, which is associated with relatively less moisture and higher aerosol concentrations over land.

Explanation of the spatial distribution of the parameter \({{\boldsymbol{b}}}\)

The distribution of the parameter \(b\) varies significantly across different grid points (standard deviation of 0.015, with 0.013 in the dotted areas), exhibiting distinct spatial distribution characteristics. Overall, regions with \(\left|b\right|\) are primarily located in areas heavily influenced by anthropogenic aerosols, such as ocean regions near continental coastlines, the northern Indian Ocean, the North Atlantic, the North Pacific, and low-latitude regions of the South Pacific, as well as land areas including East Asia, South Asia, the Indian subcontinent, central Africa, and southern North America. This indicates that the cloud PSD in these regions is more sensitive to changes in aerosols.

To explain the spatial distribution of the parameter \(b\), we plotted the relative changes in \({{{{\mathrm{LN}}}}}_{P}\) and the corresponding \({\beta }_{P}\) (denoted as \(\Delta {\mathrm{ln}}({{{{\mathrm{LN}}}}}_{P})\) and \(\Delta {\mathrm{ln}}\left(\,{\beta }_{P}\right)\), respectively) as shown in Supplementary Fig. 2a, b. The \(\Delta {\mathrm{ln}}({{{{\mathrm{LN}}}}}_{P})\) represents the difference between the mean \({\mathrm{ln}}({{{{\mathrm{LN}}}}}_{P})\) of the highest 10% and the lowest 10% within a grid point, while \(\Delta {\mathrm{ln}}\left(\,{\beta }_{P}\right)\) represents the corresponding difference in \({\mathrm{ln}}\left(\,{\beta }_{P}\right)\). With a constant \({{{\mathrm{LWC}}}}\), an increase in \({{{\mathrm{LN}}}}\) indicates a decrease in cloud droplet/aerosol number concentration. Mathematically, using the principle of invariance of differential forms, we have \({{\rm{d}}}{\mathrm{ln}}({{{{\mathrm{LN}}}}}_{P})={{\rm{d}}}({{{{\mathrm{LN}}}}}_{P})/{{{{\mathrm{LN}}}}}_{P}\), meaning \(\Delta {\mathrm{ln}}({{{{\mathrm{LN}}}}}_{P})\) represents the relative change in average cloud droplet water content due to a decrease in aerosols. Similarly, \(\Delta {\mathrm{ln}}\left(\,{\beta }_{P}\right)\) indicates the relative change in cloud PSD due to a decrease in aerosols.

By slightly transforming the power–law relationship between \(\beta\) and \({{{\mathrm{LN}}}}\) (Eq. (1)), we get \(b={{\rm{dln}}}({{{\mathrm{LN}}}})/{{\rm{dln}}}(\beta )\). We attempt to approximate \(b\) using \(\Delta {\mathrm{ln}}({{{{\mathrm{LN}}}}}_{P})/\Delta {\mathrm{ln}}({\beta }_{P})\) (Supplementary Fig. 2c), thereby explaining the spatial variation of parameter \(b\) through the spatial distributions of \(\Delta {\mathrm{ln}}({{{{\mathrm{LN}}}}}_{P})\) and \(\Delta {\mathrm{ln}}({\beta }_{P})\). Comparing Supplementary Fig. 2c and Fig. 1e, the ratio of \(\Delta {\mathrm{ln}}\left({{{{\mathrm{LN}}}}}_{P}\right)\) and \(\Delta {\mathrm{ln}}({\beta }_{P})\) shows a similar global mean and spatial distribution to the parameter \(b\), indicating that our analytical method is reasonable.

In general, the variation of \(\Delta {\mathrm{ln}}({{{{\mathrm{LN}}}}}_{P})\) exhibits a clear land–ocean distribution pattern. Compared to land regions, oceanic regions have generally lower values with less pronounced regional distribution features (Supplementary Fig. 2a). However, the variation of \(\Delta {\mathrm{ln}}({\beta }_{P})\) in oceanic regions shows distinct regional distribution characteristics, with larger absolute values mainly occurring in the northern Indian Ocean, the North Atlantic, the North Pacific, and low-latitude regions of the South Pacific, ultimately leading to larger absolute values of \(b\) in these areas. This indicates that the spatial distribution of the fitting parameter \(b\) in ocean regions is primarily determined by the relative variation of \({\beta }_{P}\), while the relative variation of \({{{{\mathrm{LN}}}}}_{P}\) across different regions is not significant.

In contrast to oceanic regions, the spatial distribution of the fitting parameter \(b\) in land areas is determined by the relative changes in both \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\). For instance, in the Indian subcontinent, a relatively small relative change in \({{{{\mathrm{LN}}}}}_{P}\,\)(Supplementary Fig. 2a) leads to a larger relative change in \({\beta }_{P}\) (Supplementary Fig. 2b), resulting in a more negative \(b\) in this region. In Europe, although the relative change in \({{{{\mathrm{LN}}}}}_{P}\) is similar to that in the Indian subcontinent, the relative change in \({\beta }_{P}\) is relatively weaker, leading to a smaller absolute value of the parameter \(b\). Similar results are observed in other land areas. Overall, in land regions, the fitting parameter \(b\) is determined by the combined relative changes in \({\beta }_{P}\) and \({{{{\mathrm{LN}}}}}_{P}\).

It is important to note that due to the limited variables provided by the POLDER-NNs dataset, our analysis remains relatively coarse, and further analysis of the physical mechanisms is necessary in the future.

Factors influencing the \({{\boldsymbol{\beta }}}-{{{{\mathbf{LN}}}}}\) relationship over land and ocean

In this study, we consider \({{{\mathrm{LN}}}}\) to be the dominant factor influencing \(\beta\), and their relationship can be represented by a power–law relationship. However, when \({{{\mathrm{LN}}}}\) is either small or large, the relationship between \(\beta\) and \({{{\mathrm{LN}}}}\) appears to deviate from this power–law fitted using the pre-binned method, particularly over land (Fig. 2e). In other words, when \({{{\mathrm{LN}}}}\) reaches extreme values, \(\beta\) may be significantly affected by other factors. We speculate that this is related to the following three factors:

  1. 1.

    Bias of the retrieval algorithm: Compared to the ocean, the retrieval uncertainty, especially for large cloud droplets over land is greater30. Further analysis suggested that this may be linked to the influence of aerosols above clouds in these regions on the POLDER-NNs method30. This effect may cause \(\beta\) over land to become insensitive to \({{{\mathrm{LN}}}}\) when greater than 5 ng (Fig. 2e), thereby reducing the fitted correlation coefficient.

  2. 2.

    Condensation (evaporation) occurring simultaneously with new activation (deactivation) for small droplets: Lu et al.32 and Zhang et al.33. demonstrated through in situ observations and numerical simulations that \(\varepsilon\), which is proportional to \(\beta\), shows a positive correlation with \({{{\mathrm{LN}}}}\) for small cloud droplets. This may deviate the \(\beta -{{{\mathrm{LN}}}}\) relationship away from a negative correlation. Compared to oceanic clouds, cloud droplets over land tend to be smaller overall, which may contribute to the different fitting performance in land and ocean.

  3. 3.

    Collision-coalescence process for large droplets: When \({{{\mathrm{LN}}}}\) is large enough, the collection process (precipitation initiation) may occur66, thereby disrupting the original characteristics of \(\beta\). Considering that the retrieval of large cloud droplets over land is inherently less accurate, this effect may be further exacerbated.

The above hypotheses regarding the land–ocean differences require further in-depth investigation for validation.

Process of uncertainty quantification

The first source of uncertainty (Src1) arises from the inherent uncertainty in the POLDER-NNs method. The POLDER-NNs dataset employed in this study is generated via a neural network algorithm. Discrepancies between this dataset and the exact solution derived from the radiative transfer model introduce uncertainty in the retrievals of \({R}_{e}\) and \({V}_{e}\). Based on Table 3 from Di Noia et al. 30, Src1 is found to cause a bias (\({{{{{\mathrm{Bias}}}}}}\)) and root mean square error (\({{{{{\mathrm{RMSE}}}}}}\)) in \({R}_{e}\)/\({V}_{e}\) of 0.08/−0.01 and 0.92/0.03, respectively, as shown in Supplementary Table 1. Based on this, we introduce a bias-corrected random Gaussian noise for each \({R}_{e}\)/\({V}_{e}\) value in POLDER-NNs, denoted as \(N(-{{{\mathrm{Bias}}}},{{{\mathrm{RMSE}}}})\), where \(N\) represents a normal distribution with a mean of \(-{{{\mathrm{Bias}}}}\) and a standard deviation of \({{{{\mathrm{RMSE}}}}}^{1/2}\). Using the corrected values, we calculate \(\beta\) and \({LN}\), then fit the data to obtain the parameter \(b\) and its corresponding standard error (\({{{\mathrm{SE}}}}\)) while accounting for the impact of Src1. To ensure the robustness of the results and avoid potential biases from a single random sampling, we repeat this process 10,000 times (Fig. 3a). The final estimates of \(b\) and \({{{\mathrm{SE}}}}\), considering Src1, are obtained by averaging all sampled results and are denoted as \({b}_{s1{{\rm{\_}}}d}\) and \({{{{\mathrm{SE}}}}}_{s1{\_d}}\), where \(s1\) refers to Src1 and \(d\) indicates the use of the direct fitting method (Supplementary Table 1).

The second source of uncertainty (Src2) stems from the impact of cloud heterogeneity. The POLDER-NNs data utilized in this study are characterized by a relatively coarse resolution, which may introduce errors by assuming homogeneity within the retrieval area (~6 km resolution). Therefore, it is essential to account for the effects of cloud heterogeneity on the retrievals of \({R}_{e}\) and \({V}_{e}\). Shang et al.37 evaluated this impact by modeling a cloud field comprising several equal-area subregions with constant cloud optical thickness but varying \({R}_{e}\) and \({V}_{e}\) values. Based on the differences between the actual and retrieved \({R}_{e}\)/\({V}_{e}\) (as shown in Table 2 in Shang et al. 37), Src2 is found to cause the \({{{\mathrm{Bias}}}}\) and \({{{\mathrm{RMSE}}}}\) in \({R}_{e}\)/\({V}_{e}\) of −0.71/0.02 and 0.88/0.04, respectively (Supplementary Table 1). Following the uncertainty quantification framework from Src1, the uncertainty in \(b\) caused by Src2 is then determined, and the \({b}_{s2{\_d}}\) and \({{{{\mathrm{SE}}}}}_{s2{\_d}}\) are provided, as shown in Supplementary Table 1.

The third source of uncertainty (Src3) arises from the use of the POLDER retrieval method, wavelength selection, and grid-scale processing. The POLDER-NNs data are derived by applying machine learning to the multi-angle, multi-wavelength polarimetric measurements with a grid scale of 1° × 1°. Shang et al.29 introduced an enhanced primary cloudbow retrieval (PCR) algorithm to estimate \({R}_{e}\) and \({V}_{e}\) from POLDER, creating a global retrieval dataset for four months (February, May, August, and November 2008) (denoted as POLDER-PCR). Unlike POLDER-NNs, POLDER-PCR employs traditional retrieval methods, clearly differentiating between various retrieval wavelengths (670/865 nm), with a grid resolution of 0.7° \(\times\) 0.7°. Data from POLDER-PCR for low- and mid-latitude regions (60°S–60°N) were selected for fitting parameter \(b\). Based on the fitting outcomes across two wavelengths, the uncertainty in \(b\) due to Src3 is determined, and the corresponding \(b\) and \({SE}\) are provided (denoted as \({b}_{s3{\_d}}\) and \({{{{\mathrm{SE}}}}}_{s3{\_d}}\)), as shown in Supplementary Table 1. Although POLDER-NNs and POLDER-PCR utilize different wavelengths and grid scale configurations, as well as distinct retrieval methodologies, the derived \(b\) values are relatively consistent. Compared to POLDER-NNs, the \(b\) values obtained from POLDER-PCR exhibit larger \({{{\mathrm{SE}}}}\), likely due to the smaller sample size of the POLDER-PCR dataset (68,555 valid samples) and discontinuous \({V}_{e}\) values (intervals of 0.02). However, this convergence in \(b\) values obtained from two independent retrievals provides a basis for evaluating the potential influence of Src3.

The fourth source of uncertainty (Src4) arises from the fitting method used for parameter \(b\). As discussed before, commonly used fitting methods for large satellite datasets include direct and pre-binned fittings, though which of the two is the superior method remains unclear. Therefore, both fitting methods were employed to derive \(b\) and \({SE}\) using Src1–Src3, as shown in Supplementary Table 1.

Finally, the overall impact of four sources on the estimation of \(b\) is quantified using a Monte Carlo method, similar to that of Boucher and Haywood 38 and Bellouin et al. 3. It is assumed that the \(b\) values induced by different sources, as shown in Supplementary Table 1, follow a normal function (i.e., \(b\, \sim \,N({b}_{S{{\rm{\_}}}F},\,{{{{{\mathrm{SE}}}}}_{S{{\rm{\_}}}F}}^{2})\), where \(S\) represents different sources, and \(F\) indicates the fitting method). A random sampling process is then performed 10 million times, and for each random result, the mean value of \(b\) affected by different sources is calculated, resulting in a probability density distribution of \(b\). The mean is then computed as the best estimate of \(b\) (−0.024), and the 5–95% confidence intervals are determined as the uncertainty range for \(b\) (−0.026 to −0.022), as shown in Fig. 3b and Supplementary Table 1.

Similar uncertainty quantification has also been applied to clouds over ocean and land, with the relevant parameters listed in Supplementary Table 2 and the probability density distributions shown in Fig. 3b.

Causal mediation analysis

To examine the relationship between aerosols, \({{{\mathrm{LN}}}}\), and \(\beta\), as well as the mediating effect of \({{{\mathrm{LN}}}}\) on the impact of aerosols on \(\beta\), we use the POLDER Level 3 aerosol products, generated using a generalized retrieval of atmosphere and surface properties “components” approach, gridded at a 1° × 1° resolution (POLDER-3/GRASP, version 1.1)41,42,43,44. This product is officially recommended for studies involving both the Ångström exponent (\({{{\mathrm{AE}}}}\)) and aerosol optical depth (\({{{\mathrm{AOD}}}}\)) and demonstrates a high consistency with the aerosol robotic network (AERONET) on a global scale44,67.

To better characterize the properties of aerosols that can be activated as cloud droplets, this study uses \({{{\mathrm{AE}}}}\) and \({{{\mathrm{AOD}}}}\) at 565 nm from the POLDER-3/GRASP to calculate the aerosol index (\({{{\mathrm{AI}}}}\), \({{{\mathrm{AI}}}}={{{\mathrm{AE}}}}\times {{{\mathrm{AOD}}}}\))34. Before analysis, \({{{\mathrm{AI}}}}\) were matched with POLDER-NNs on a daily basis within a 1° × 1° grid to ensure full spatiotemporal alignment between the two datasets. The results of the matched data analysis are shown in Fig. 4, where the power–law fitting of \({{{\mathrm{LN}}}}\) and \(\beta\) with \({{{\mathrm{AI}}}}\) is performed using the pre-binned method, and the mediation effect analysis is conducted using the R package for a causal mediation analysis45.