Main

Our primary sample consists of 69 central galaxies in the nearby Universe with direct estimates of black hole (BH) masses derived from resolved kinematics of stars or gas11,19,20,21. We have included only central galaxies to avoid any environmental impact on the interstellar medium (ISM) properties of galaxies. The sample spans several galaxy types, including spirals, lenticulars and ellipticals. We obtained the atomic hydrogen (HI) 21-cm emission fluxes, which trace the atomic gas mass MHI, by crossmatching with nearby galaxy databases (Methods and Extended Data Table 1).

We define the HI gas content as the ratio of the HI mass to the stellar mass, μHI = MHI/M⋆. We first examine the relationship between μHI and MBH and compare it with the μHI–M⋆ correlation in Fig. 1. The correlation with MBH is more significant than that with M⋆, with Spearman correlation coefficients of r = −0.49 (\(P = 10^{-4.7}\)) and r = −0.39 (\(P = 10^{-3.0}\)), respectively. More importantly, the partial correlation between μHI and M⋆ while controlling for MBH, that is, after removing the dependence of both on MBH, indicates that μHI shows no dependence on M⋆ (r = −0.13, P = 0.29; Fig. 1 (bottom left)), whereas a strong residual correlation exists between μHI and MBH while controlling for M⋆ (r = −0.41, \(P = 10^{-3.3}\); Fig. 1 (bottom right)). Moreover, although the μHI–M⋆ correlation differs significantly for early- and late-type galaxies, with the early-type galaxies exhibiting systematically lower μHI at fixed M⋆, galaxies with different morphologies follow the same μHI–MBH relation. This suggests that the low HI values of those early-type galaxies on the μHI–M⋆ relation probably only reflect that these galaxies have more massive BHs than late-type galaxies with similar M⋆.

Fig. 1: Comparison between the relations of μHI to M⋆ and μHI to MBH for the BH sample.

a,b, μHI–M⋆ (a) and μHI–MBH (b) correlations. Galaxies are colour-coded by their morphological T types, with smaller values indicating earlier-type and larger values later-type morphologies. The orange lines represent the best-fitted linear relations, taking into account the uncertainties of both variables. c,d, Comparison of the partial correlation of μHI–M⋆ (while controlling for MBH) (c) and μHI–MBH (while controlling for M⋆) (d). The x- and y-axes show the residuals in M⋆ and μHI after removing their dependence on MBH in the left panel, and the residuals in MBH and μHI after removing their dependence on M⋆ in the right panel: Δlog μHI = log μHI − log μHI(MBH) and Δlog M⋆ = log M⋆ − log M⋆(MBH) in the left panel, and Δlog μHI = log μHI − log μHI(M⋆) and Δlog MBH = log MBH − log MBH(M⋆) in the right panel. The horizontal dashed line indicates zero correlation, that is, no intrinsic correlation between the two quantities. The Spearman correlation coefficients between the two corresponding variables are shown in each panel. The error bars refer to 1σ measurement errors.
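The residual-based partial correlation described in this caption can be illustrated with a short script. This is a minimal sketch rather than the analysis code: it assumes ordinary least-squares fits (numpy.polyfit) in place of the fits with uncertainties in both variables, ignores measurement errors and rank ties, and all function names are hypothetical.

```python
import numpy as np


def ranks(a):
    """Ranks 1..N of the values in a (ties are not averaged in this sketch)."""
    a = np.asarray(a, dtype=float)
    order = np.argsort(a, kind="mergesort")
    out = np.empty(len(a))
    out[order] = np.arange(1, len(a) + 1)
    return out


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]


def residuals(y, z):
    """Residual of y after removing its best-fitted linear dependence on z."""
    slope, intercept = np.polyfit(z, y, 1)
    return np.asarray(y) - (slope * np.asarray(z) + intercept)


def partial_spearman(x, y, z):
    """Correlation between x and y after controlling both for z."""
    return spearman(residuals(x, z), residuals(y, z))
```

Called with x = log M⋆, y = log μHI and z = log MBH, partial_spearman would correspond to the quantity shown in the bottom-left panel; swapping the roles of M⋆ and MBH gives the bottom-right panel.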

Although the partial correlation between μHI, MBH and M⋆ offers direct evidence that BHs play a more crucial part than M⋆ in regulating μHI, the heterogeneous nature of this sample makes it challenging to determine how far the resulting relation applies to the broader galaxy population. To validate this relation, we used a large sample of nearby galaxies with deep HI observations (Methods and Extended Data Fig. 1), which comprises 474 group central galaxies with \(10^{9.5}\,M_{\odot} < M_{\star} < 10^{11.5}\,M_{\odot}\) and reliable central velocity dispersion (σ) measurements. Of these, 281 are detected in HI, and HI upper limits are available for the remaining 193 sources. MBH for this galaxy sample is inferred from the MBH–σ relation (Methods). Hereafter we call this enlarged galaxy sample ‘the galaxy sample’ and the sample with directly measured MBH ‘the BH sample’.

The μHI–M⋆ and μHI–MBH relations for the galaxy sample are shown in Fig. 2. Both MBH and M⋆ are found to be tightly correlated with μHI, with r = −0.72 and r = −0.60, respectively. However, the partial correlations show that the μHI–M⋆ correlation almost disappears when controlling for MBH (r = −0.14), compared with r = −0.49 for the μHI–MBH relation when controlling for M⋆. This further suggests that the μHI–M⋆ correlation is mostly driven by the μHI–MBH and M⋆–MBH correlations. As in the BH sample, early- and late-type galaxies follow the same μHI–MBH relation but different μHI–M⋆ relations.

Fig. 2: Comparison between the relations of μHI to M⋆ and μHI to MBH for the galaxy sample.

a,b, μHI–M⋆ (a) and μHI–MBH (b) correlations. Galaxies are divided into early- and late-type galaxies based on their Sérsic indices (separated at n = 2), which are shown in red and blue contours, respectively. The HI-detection rates of galaxies are shown as a function of stellar masses and BH masses. The vertical dashed lines indicate the position at which the HI-detection fraction reaches 60%. c,d, μHI–M⋆ (c) and μHI–MBH (d) relations. The best-fitted relations for the HI-detected galaxy sample and the BH sample are shown by the black and orange lines, respectively. We also show the μHI–MBH relation for the full galaxy sample with the magenta line in d. e,f, The partial correlation between μHI and M⋆ while controlling for MBH (e), and the partial correlation between μHI and MBH while controlling for M⋆ (f). The corresponding Spearman coefficients are shown in each panel. The median 1σ error bars for the galaxy sample are shown in c and d.

The best-fitted μHI–MBH relation for the HI-detected galaxy sample yields a slope of −0.43 ± 0.02 (Fig. 2, black line), which is steeper than that for the BH sample (−0.37 ± 0.06; Fig. 2, orange line). This is most likely driven by the selection biases in the BH sample, which is more complete and representative at large MBH but poorly sampled at low MBH. Moreover, we also derive an inherent μHI–MBH scaling relation (Fig. 2, magenta line) encompassing both HI detections and non-detections, resulting in a steeper slope (−0.59 ± 0.19) than that obtained by fitting the HI detections exclusively (Extended Data Table 2).

Next, we explore further the correlations between μHI and other main galactic parameters22, including stellar surface densities (Σstar), bulge masses (Mbulge) and specific star formation rates (SSFR), to determine whether MBH is the key parameter in determining μHI in galaxies. Figure 3 compares the correlations among μHI, M⋆, MBH, Σstar, Mbulge and SSFR for the HI-detected galaxy sample. Although significant correlations exist between μHI and all these parameters, after removing their dependence on MBH, all the correlations almost disappear, with negligible Spearman coefficients and running medians consistent with zero. We also verify this by using the inherent μHI–MBH relation derived for the full sample in Extended Data Fig. 3, yielding consistent results.

Fig. 3: The impact of MBH on the correlation between μHI and other main galactic parameters.

a–e, The HI-detection fraction as a function of MBH (a) and of other main physical parameters of galaxies, including M⋆ (b), Σstar (c), Mbulge (d) and SSFR (e). The vertical dashed lines indicate the position at which the HI-detection rate reaches 60%. f–j, The relations between μHI and the parameters MBH (f), M⋆ (g), Σstar (h), Mbulge (i) and SSFR (j). The contours denote the distribution of the HI-detected galaxy sample, whereas the filled red circles denote the BH sample with 1σ error bars. The best-fitted μHI–MBH relations for the HI-detected galaxy sample and the BH sample are shown in f by the black and orange lines, respectively. The median 1σ error bars for the galaxy sample are shown. k–n, The relation between the residual in μHI and the residual in other galactic parameters after removing their dependence on MBH: Δlog μHI = log μHI − log μHI(MBH) and Δlog X = log X − log X(MBH), with X representing M⋆ (k), Σstar (l), Mbulge (m) and SSFR (n), and μHI(MBH) and X(MBH) derived from their best-fitted relations with MBH (Extended Data Fig. 2). The solid medium-blue lines in k–n show the running median of the residuals in μHI. The Spearman correlation coefficients for the HI-detected galaxy sample between the corresponding x and y variables are shown in f–n.

Given that MBH, M⋆, Σstar and Mbulge are all highly correlated, as a further test of the fundamental role of MBH in driving the correlation with μHI, we conduct a partial least squares regression between μHI and the parameter set of MBH, M⋆, Σstar and Mbulge for the HI-detected galaxy sample and the BH sample, which shows that MBH is the most significant predictor of μHI (Methods).

As MBH is proportional to the integrated energy of BHs across their accretion history2,13, our findings offer observational evidence that the accumulated energy from BHs is vital in regulating the accretion and/or cooling of cool gas in galaxies. The immense energy released from the accretion of supermassive BHs (SMBHs) in massive galaxies is known to be at least comparable to the binding energy of host galaxies2,7,23. This energy is thought to significantly affect the accretion of gas onto the galaxy and the cooling of the circumgalactic medium (CGM) and ISM. As M⋆ is closely linked with the inner halo binding energy (the total binding energy within the effective radius of a galaxy)24, \({M}_{\star }\propto {E}_{{\rm{b}}}^{\beta }\), the μHI–MBH relation, \({\mu }_{{\rm{HI}}}\propto {M}_{{\rm{BH}}}^{-\alpha }\), means \({M}_{{\rm{HI}}}\propto {M}_{\star }{M}_{{\rm{BH}}}^{-\alpha }\propto {E}_{{\rm{b}}}^{\beta }{M}_{{\rm{BH}}}^{-\alpha }\), where Eb represents the binding energy of the inner dark matter halo. Over the stellar mass range probed by our BH sample, β ≈ 0.6, which is close to the value of α, yielding \({M}_{{\rm{HI}}}\propto {({E}_{{\rm{b}}}/{M}_{{\rm{BH}}})}^{\alpha }\) with α ≈ 0.6.

The analysis above indicates that the HI mass in galaxies is determined by the relative strength of the binding energy of the halo and the energy released from BHs (EBH ∝ MBH). The binding energy of the halo determines how much gas can be accreted onto the dark matter halo, whereas the energy from BHs ejects or heats up the gas, preventing it from further cooling. The contest between the two determines how much accreted gas can eventually cool and settle onto the central galaxies. For such a mechanism to be effective, a negative feedback loop involving gas accretion or cooling and BH accretion or feedback would be required5,13. The fact that the accreted cool gas could feed both star formation and BH accretion makes this possible. When gas accretion or cooling is elevated, stronger BH accretion is also triggered, resulting in more energy ejected into the ISM and CGM, which inhibits further cooling or accretion of the cool gas. This eventually brings down the cool gas content (and also the BH accretion rates). Conversely, a lower cool gas content would generally lead to weaker BH accretion with less energy ejection into the ISM and CGM, which will facilitate further cool gas accretion or cooling and increase μHI until it reaches the average relation. The same physical process applies to both star-forming galaxies (SFGs) and quiescent galaxies. The difference is that although both MBH and M⋆ of SFGs can grow substantially through this process, most quiescent galaxies will probably retain the MBH and M⋆ they had when they were quenched because of their overall low BH accretion rates and star formation rates. This scenario is shown in Fig. 4.

Fig. 4: Schematic of the proposed scenario on how BHs regulate cool gas content in galaxies.

The large arrow indicates the μHI–MBH correlation. The background colour scale indicates the quiescent galaxy fraction as a function of MBH, which shows a sharp increase at \(M_{\rm{BH}} \gtrsim 10^{7.5}\,M_{\odot}\) (Methods and Extended Data Fig. 5), corresponding to μHI < 10%. At fixed MBH, galaxies could maintain their μHI at a certain level determined by the relative strength of the inner halo binding energy and MBH. Once gas accretion is enhanced onto galaxies (and their BHs), which increases μHI, MBH will also grow and release additional heating energy that prevents further gas cooling or accretion. This will bring down μHI, together with increasing M⋆ by star formation, and reach a balance at higher MBH. Although the same process takes place in both SFGs and quiescent galaxies, the growth of MBH or M⋆ should be much less significant in quiescent galaxies than in SFGs, and the large range of MBH among quiescent galaxies (\(M_{\rm{BH}} \sim 10^{7-10}\,M_{\odot}\)) is probably inherited from their different star-forming progenitors when they were quenched.

Under this scenario, the correlation between the total gas fraction (\({\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}\)) and MBH is expected to be even tighter than the μHI–MBH relation. This is because gas cooling from the CGM will probably first cool as HI gas and only later become molecular gas that hosts star formation. In other words, HI gas probes only one phase of the cold gas, whereas AGN feedback should affect the cooling of both atomic and molecular gas in galaxies. This appears to be the case. Although the sample with both HI and CO measurements is small, a reduced scatter for the \({\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}\)–MBH relation is found compared with the μHI–MBH relation (Methods and Extended Data Fig. 4). Apart from the small sample size, most of the galaxies with both HI and CO measurements are SFGs. Future studies with much larger and more representative galaxy samples with both HI and CO measurements will be needed to fully verify the \({\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}\)–MBH relation.

As cool gas is the raw material for star formation, these findings also shed critical light on the intimate connection between the presence of massive BHs and the quiescence of galaxies. They explain well why most quiescent galaxies are present only at \(M_{\rm{BH}} \gtrsim 10^{7.5}\,M_{\odot}\) (refs. 10,11,12,13) (Extended Data Fig. 5), corresponding to a low level of cool gas content (μHI < 10%) and hence minimal star formation rates. The proposed mechanism reconciles the absence of strong instantaneous negative AGN feedback with the tight correlation of MBH with galaxy quiescence. It is also consistent with empirical models indicating that the contest between dark matter halos and BHs governs the quenching of star formation in galaxies based on various observed galactic scaling relations25.

Although current studies have been confined to galaxies in the local Universe, the strong correlation across all redshifts between the quiescence of a galaxy and a prominent bulge, a high central stellar density or high central gravitational potential26,27,28,29,30, all of which suggest a large BH, implies that the same scenario may be applied to galaxies at high redshifts as well. Next-generation facilities, such as the Square Kilometer Array and the Next Generation Very Large Array, would be required to confirm this.

Methods

Cosmology

We adopted a Chabrier initial mass function (IMF)31 to estimate star formation rate and assumed cosmological parameters of H0 = 70 km s−1 Mpc−1, ΩM = 0.3, and ΩΛ = 0.7.

Sample selection

The BH sample

The sample of galaxies with directly measured BH masses is primarily from ref. 11, which includes 91 central galaxies collected from refs. 19,20,21. We excluded 18 sources with BH masses measured by reverberation mapping and kept only those measured with dynamical methods. We then added another 63 galaxies with measured BH masses from recent literature, which were matched with the group catalogue32 of nearby galaxies to select only central galaxies. We obtained the HI flux densities and masses of this sample by crossmatching with the nearby galaxy database HyperLeda33. Our final sample includes 69 central galaxies, with 41 from ref. 11 and the remainder from the compilation of recent literature. The basic properties of our BH sample are listed in Extended Data Table 1.

The galaxy sample

The sample of galaxies with HI measurements and indirect BH mass measurements is drawn from the extended GALEX Arecibo SDSS Survey (xGASS; ref. 34) and the HI-MaNGA programme35,36, which include HI observations towards representative samples of about 1,200 and 6,000 galaxies with \(10^{9}\,M_{\odot} < M_{\star} < 10^{11.5}\,M_{\odot}\), respectively. The depth of the surveys also allows for stringent constraints on the upper limits for the HI non-detections, enabling a comprehensive assessment of μHI for the entire sample. We limited the redshift to z < 0.035 to ensure high HI-detection rates even at the highest stellar masses and BH masses. We selected only group central galaxies, which have at least one satellite galaxy in their groups, based on the crossmatch with the group catalogue37,38,39. Isolated central galaxies lacking any satellites in their groups are discarded because they may have suffered from additional environmental effects40. We derived the BH masses for the xGASS and HI-MaNGA samples from their velocity dispersions, following ref. 21, with the dispersions taken from SDSS DR17 (ref. 37; σSDSS, and we require σSDSS ≥ 70 km s−1):

$$\log \left(\frac{{M}_{{\rm{BH}}}}{{M}_{\odot }}\right)=(8.32\pm 0.04)+(5.35\pm 0.23)\log \left(\frac{{\sigma }_{{\rm{SDSS}}}}{200\,{\rm{km}}\,{{\rm{s}}}^{-1}}\right).$$
(1)
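As a worked illustration, equation (1) can be evaluated directly. The following is a minimal sketch using only the central values of the coefficients (the quoted uncertainties are not propagated); the function name and the choice to raise an error below the σSDSS cut are assumptions made for this example.

```python
import math


def log_mbh_from_sigma(sigma_kms, sigma_min=70.0):
    """log10(MBH / Msun) from equation (1), central coefficients only.

    sigma_min reproduces the sigma_SDSS >= 70 km/s requirement of the
    galaxy sample; raising an error is a choice made for this sketch.
    """
    if sigma_kms < sigma_min:
        raise ValueError("velocity dispersion below the adopted cut")
    return 8.32 + 5.35 * math.log10(sigma_kms / 200.0)
```

For σSDSS = 200 km s−1 the relation returns log(MBH/M⊙) = 8.32 by construction, and halving the dispersion lowers log MBH by about 1.6 dex, which shows how steeply the inferred BH mass depends on σ.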

Physical parameters of the BH and galaxy sample

Stellar masses

The stellar masses for the galaxy sample are taken from the MPA-JHU catalogue41,42, which are derived from SED fitting based on SDSS data. For the BH sample, because most of them lack the same photometric coverage as the galaxy sample, we derive their stellar masses from their K-band luminosity and velocity dispersion-dependent K-band mass-to-light ratio following ref. 21:

$${M}_{\star }/{L}_{{\rm{K}}}=0.1{\sigma }_{{\rm{e}}}^{0.45}.$$
(2)

As an accurate determination of σe is not available for all galaxies, we derived σe for the full BH sample from the tight correlation in ref. 21:

$$\begin{array}{l}\log \left(\frac{{\sigma }_{{\rm{e}}}}{{\rm{km}}\,{{\rm{s}}}^{-1}}\right)=(2.11\pm 0.01)+(0.71\pm 0.03)\log \left(\frac{{L}_{{\rm{K}}}}{1{0}^{11}{L}_{\odot }}\right)\\ \,\,\,\,\,\,\,+(-0.72\pm 0.05)\log \left(\frac{{R}_{{\rm{e}}}}{5\,{\rm{kpc}}}\right).\end{array}$$
(3)
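Equations (2) and (3) can be chained to obtain a stellar mass from the K-band luminosity and effective radius alone. The sketch below uses the central values of the coefficients only; the function names are hypothetical, and no systematic offset relative to the SED-based masses is applied here.

```python
import math


def log_sigma_e(l_k_lsun, r_e_kpc):
    """log10(sigma_e / km s^-1) from equation (3), central coefficients."""
    return (2.11
            + 0.71 * math.log10(l_k_lsun / 1e11)
            - 0.72 * math.log10(r_e_kpc / 5.0))


def stellar_mass_from_k_band(l_k_lsun, r_e_kpc):
    """Stellar mass in Msun from equation (2): M/L_K = 0.1 * sigma_e**0.45."""
    sigma_e = 10.0 ** log_sigma_e(l_k_lsun, r_e_kpc)
    return 0.1 * sigma_e ** 0.45 * l_k_lsun
```

At the pivot values (LK = 10^11 L⊙, Re = 5 kpc) this gives σe ≈ 129 km s−1 and a mass-to-light ratio of about 0.89, that is, M⋆ ≈ 8.9 × 10^10 M⊙.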

To explore whether there are systematic differences between the two methods, we compare the stellar masses of the galaxy sample taken from the MPA-JHU catalogue with those derived from equation (2). A median mass difference of 0.32 dex is found between the two methods (Extended Data Fig. 6), which may be attributed to the tilt of the fundamental plane beyond the mass-to-light ratio, for example, the dark matter component within the effective radius. We corrected this systematic mass difference for the BH sample to match the galaxy sample.

HI fraction and upper limits

The HI-detection limit depends not only on the sensitivity but also on the width of the HI line. To obtain more realistic upper limits, we first derived the expected HI line width for each HI non-detection. The width of the HI line indicates the circular velocity of the host galaxy, which should scale with the stellar mass. We explored this using the HI detections from the xGASS sample. Extended Data Fig. 1 shows the relation between M⋆ and the observed line width, as well as between M⋆ and the inclination-corrected line width. It indicates that the inclination-corrected line width is tightly correlated with M⋆; this correlation is then used to derive the expected line width for the HI non-detections. Combining the sensitivity of the HI observations and the expected line width, we derived the upper limits for all the HI non-detections in our BH and galaxy samples.

Morphology

For the BH sample, the morphology indicator T is obtained from the HyperLEDA database33. It can be a non-integer because, for most objects, the final T is averaged over the various estimates available in the literature. For the galaxy sample, we classified galaxies into early and late types according to whether their Sérsic index (from the NASA-Sloan Atlas catalogue; NSA: Blanton M.; http://www.nsatlas.org) is larger or smaller than 2.

Star formation rates

The specific star formation rates (SSFR) of the galaxy sample are from the MPA-JHU catalogue based on ref. 42. The SSFR for the BH sample is taken from the original reference.

Bulge masses

The bulge information is from refs. 43,44 for the BH sample and galaxy sample, respectively. More specifically, we calculate the bulge mass for the galaxy sample using r-band B/T.

Stellar mass surface density

We calculated the K-band effective radius for both the BH and the galaxy sample according to ref. 21: log Re = 1.16 log RK_R_EFF + 0.23 log qK_BA, where Re is the corrected apparent effective size, RK_R_EFF and qK_BA are K-band apparent effective radius and K-band axis ratio from 2MASS. After converting the apparent sizes to the physical sizes, the stellar mass surface density was derived as \({\varSigma }_{{\rm{star}}}={M}_{\star }/(2{\rm{\pi }}{R}_{{\rm{e}}}^{2})\).
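The size correction and surface density defined above can be sketched as follows. This is an illustrative example only: it assumes that RK_R_EFF is given in arcseconds, and it converts to physical size with a small-angle approximation and a user-supplied distance in Mpc, which are assumptions of this sketch (the conversion actually used is not specified in the text). All function names are hypothetical.

```python
import math

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def corrected_re_arcsec(r_k_eff_arcsec, q_k_ba):
    """log Re = 1.16 log R_K_R_EFF + 0.23 log q_K_BA, returned in arcsec."""
    return 10.0 ** (1.16 * math.log10(r_k_eff_arcsec)
                    + 0.23 * math.log10(q_k_ba))


def re_kpc(re_arcsec, distance_mpc):
    """Small-angle conversion from apparent to physical size (assumption)."""
    return re_arcsec * ARCSEC_TO_RAD * distance_mpc * 1.0e3


def sigma_star(m_star_msun, r_e_kpc):
    """Stellar mass surface density M / (2 pi Re^2) in Msun kpc^-2."""
    return m_star_msun / (2.0 * math.pi * r_e_kpc ** 2)
```

For example, sigma_star(stellar_mass, re_kpc(corrected_re_arcsec(r, q), distance)) chains the three steps for a single galaxy.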

H2 masses

We collected H2 masses from the xCOLD GASS survey18 and ref. 45 for xGASS and MaNGA galaxies, respectively. We note that, at least in the nearby Universe, the molecular-to-atomic gas mass ratio increases only weakly with stellar mass and remains relatively low over a wide stellar mass range, with \(R\equiv {M}_{{{\rm{H}}}_{2}}/{M}_{{\rm{HI}}} \sim 10-20 \% \) at \(10^{9}\,M_{\odot} < M_{\star} < 10^{11.5}\,M_{\odot}\). We calculate the total gas fractions as \({\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}=({M}_{{\rm{HI}}}+{M}_{{{\rm{H}}}_{2}})/{M}_{\star }\). For central galaxies (isolated centrals plus group centrals), we compare the MBH–μHI and MBH–\({\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}\) relations in Extended Data Fig. 4. The MBH–\({\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}\) relation exhibits a stronger correlation and smaller scatter than the MBH–μHI relation. We also note that, based on molecular hydrogen gas content traced through dust extinction, previous studies show an MBH–\({f}_{{{\rm{H}}}_{2}}\) correlation12. Future studies with more direct measurements of molecular hydrogen gas for large samples will be needed to examine in detail whether MBH also plays a fundamental part in regulating the molecular gas content in galaxies.

Quiescent fraction

To estimate the quiescent fraction at different MBH, we selected galaxies from the MPA-JHU catalogue of SDSS galaxies with the same criteria as the galaxy sample, except that we limited the velocity dispersion to greater than 30 km s−1 to cover a broader MBH range and placed no constraint on the HI detection. We classified the sample galaxies into star-forming and quiescent ones, separated at log(SSFR/yr−1) = −11. In each MBH bin, the quiescent fraction was calculated as the ratio of the number of quiescent galaxies to the number of all galaxies. The result is shown in Extended Data Fig. 5 and is consistent with previous work29,46.
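The binned quiescent fraction can be computed in a few lines. The sketch below is illustrative only: the bin edges are left to the caller, the threshold of −11 is taken to apply to the logarithm of the SSFR, and treating galaxies strictly below the threshold as quiescent is an assumption of this example. The function name is hypothetical.

```python
import numpy as np


def quiescent_fraction(log_mbh, log_ssfr, bin_edges, threshold=-11.0):
    """Fraction of galaxies with log SSFR below threshold in each MBH bin.

    Bins without any galaxy return NaN.
    """
    log_mbh = np.asarray(log_mbh, dtype=float)
    quiescent = np.asarray(log_ssfr, dtype=float) < threshold
    n_all, _ = np.histogram(log_mbh, bins=bin_edges)
    n_quiescent, _ = np.histogram(log_mbh[quiescent], bins=bin_edges)
    # Guard against empty bins before dividing.
    frac = n_quiescent / np.maximum(n_all, 1)
    return np.where(n_all > 0, frac, np.nan)
```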

Linear least squares approximation

We implemented linear regression for the BH sample and the galaxy sample using Python package LTS_LINEFIT introduced in ref. 47, which is insensitive to outliers and can give the intrinsic scatter around the linear relation with corresponding errors of the fitted parameters.

Linear fitting including upper limits

To incorporate both detections and upper limits in the galaxy sample, we applied the Kaplan–Meier non-parametric estimator to derive the cumulative distribution function in different MBH bins (with the Python package Reliability48), and performed 10,000 random draws from the cumulative distribution function in each bin to fit the relation between μHI and MBH. The linear relation and its corresponding errors are taken as the best fit and the standard deviation of these fits (Extended Data Table 2). The non-detection rate of HI is relatively low across most of the MBH range and becomes significant only for galaxies with the most massive BHs (reaching about 50% at \(M_{\rm{BH}} > 10^{8}\,M_{\odot}\)).
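To make the treatment of upper limits concrete, the following sketch implements a Kaplan–Meier estimate for data with upper limits and draws random values from it. It is a plain numpy illustration written for this purpose, not the interface of the Reliability package: upper limits are handled with the standard sign flip that turns them into right-censored values, and the function names are hypothetical.

```python
import numpy as np


def km_cdf_below(values, detected):
    """Kaplan-Meier estimate of P(V < v) at each detected value v.

    values   : measurements, or upper limits where detected is False
    detected : boolean flags, True for detections
    """
    values = np.asarray(values, dtype=float)
    detected = np.asarray(detected, dtype=bool)
    t = -values  # flipping the sign turns upper limits into right-censoring
    event_times = np.unique(t[detected])  # sorted ascending
    surv, s = [], 1.0
    for tj in event_times:
        at_risk = np.sum(t >= tj)
        events = np.sum(detected & (t == tj))
        s *= 1.0 - events / at_risk
        surv.append(s)
    # S(t_j) = P(T > t_j) = P(V < v_j) with v_j = -t_j
    return -event_times[::-1], np.array(surv)[::-1]


def draw_from_km(v, cdf_below, size, rng):
    """Random draws from the discrete distribution implied by the estimate."""
    mass = np.diff(np.append(cdf_below, 1.0))
    # If the smallest value is an upper limit, the mass below the lowest
    # detection is undetermined; this sketch simply renormalizes.
    return rng.choice(v, size=size, p=mass / mass.sum())
```

In a full analysis, one would apply these functions to log μHI within each MBH bin, fit a line to each set of draws, and take the spread of the fitted parameters as their uncertainty.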

Partial least squares regression

To identify statistically the most significant physical parameters in determining μHI, we used the partial least squares (PLS) regression function of the Python package Scikit-learn49, which uses a non-linear iterative partial least squares (NIPALS)50 algorithm. The PLS algorithm constructs a few latent variables (or principal components) that summarize the variance of the independent variables and uses them to find the fundamental relation between a set of independent and dependent variables. It is advantageous for regression among highly correlated predictor variables. It calculates the linear combinations of the original predictor datasets (latent variables) and the response datasets with maximal covariance, then fits the regression between the projected datasets and returns the model:

$$Y=XB+F,$$
(4)

where X and Y are predictor and response datasets, B is the matrix of regression coefficients and F is the intercept matrix.

We constructed the X and Y matrices as the set of MBH, M⋆, Σstar and Mbulge and the set of μHI, respectively. For the BH and galaxy samples, this yields sample sizes of 45 and 189, respectively. The optimal number of latent variables (linear combinations of predictor variables) in the PLS regression is determined by the minimum of the mean squared error from cross-validation (using the function cross_val_predict in Scikit-learn) at each number of components. We find that the optimal number of latent variables for both the BH and the galaxy sample converges to one. Further increasing the number of latent variables changes the mean squared errors by only a few per cent, and MBH remains the most significant predictor parameter. Following appendix B in ref. 51, the variance contribution from different parameters to μHI is decomposed as

$${\rm{Var}}(Y)=\mathop{\sum }\limits_{i=1}^{4}{\rm{Var}}({X}_{i}{B}_{i})+{\rm{Var}}(F),$$
(5)

where Var denotes the variance, a measure of the spread of a distribution. The fraction of the variance contributed by each parameter is shown in the last column of Extended Data Table 3, which shows that MBH dominates the variance.