Introduction

Climate change stands as the most urgent and formidable challenge of our era. China has committed to ambitious targets, aiming to achieve “peak carbon emissions” by 2030 and “carbon neutrality” by 2060. Household carbon emissions are intrinsically linked to the realization of these goals. The Opinion of the State Council on Accelerating the Comprehensive Green Transformation of Economic and Social Development points out that by 2030, a green lifestyle should be basically formed, the green transformation of consumption patterns should be promoted, and green consumption should be actively expanded. However, with the development of industrialization and urbanization in China, carbon emissions in 2020 continued to increase1. Historically, China has placed greater emphasis on reducing carbon emissions from the production sector, while largely overlooking emissions from the household sector. Yet household carbon emissions now account for over 50% of the nation’s total emissions2 and serve as the ultimate driver of emissions in the production sector. Therefore, conducting systematic theoretical research and empirical analysis on household carbon emissions holds profound theoretical and practical significance for China’s efforts to reduce carbon emissions and achieve its dual carbon goals. In rural China, characterized by clan-based communities, the influence of peer effects on residents’ consumption behavior is deeply entrenched. The peer effect refers to the phenomenon in which, under bounded rationality, an individual’s behavior is closely related to that of the surrounding group (usually referred to as the reference group)3.
Household carbon emissions among rural residents in China are influenced not only by intrinsic factors such as household income and demographic characteristics but also significantly by the carbon emission behaviors of the peer groups within the community. Firstly, under informational social influence, the impact of the peer effect on household carbon emissions is closely associated with the social status of rural households. Drawing on the theories of social status concern and the relative income hypothesis, household carbon consumption is not an independent decision but is significantly influenced by the carbon consumption behaviors of peer groups. Social status is defined as the ranking of individuals based on socially recognized standards of resource (or power) allocation4. The pursuit of higher social status typically serves two primary purposes. (1) It directly enhances utility: as social beings, individuals inherently possess a preference for status, and higher social status fulfills self-actualization needs more effectively5. (2) It serves as an indirect means to achieve economic gains: social status often implies access to intricate social networks, where robust resource integration capabilities enable individuals to secure greater economic benefits6. Groups with higher socioeconomic status typically exhibit stronger environmental awareness and greater capacity for green consumption. Driven by information integration and the pursuit of social status, households are inclined to align their carbon consumption decisions with those of higher socioeconomic status groups. Secondly, under normative social influence, the impact of the peer effect on rural households’ carbon emissions is closely linked to their sense of social belonging. Low-carbon consumption is widely recognized as a social value in contemporary China.
The rural areas of China function as a "rule-by-ritual" society7, where social order and norms are maintained through traditional customs and moral principles. When low-carbon consumption is endorsed as a prevailing social value, rural households adopt such behaviors to gain approval and recognition from their peers, thereby enhancing their sense of social identity and belonging.

Previous studies have revealed the role of peer effects in the diffusion of photovoltaics across cities8 and the impact of the neighbor effect on household consumption of energy-saving products9. Studies on carbon emissions from Chinese households have mainly focused on the impact of factors such as income, wealth, population structure, household lifestyle, and social networks10,11,12. In recent years, research has begun to explore the impact of peer effects on carbon emissions among Chinese residents12. However, although the clan-based clustering of rural Chinese communities suggests that peer effects play a significant role in rural residents’ carbon emissions, existing studies have not yet examined this relationship. Unlike previous studies, this article investigates the impact of the peer effect on household carbon emissions in rural China based on China Family Panel Studies (CFPS) data.

The potential contributions of this study are threefold: (1) It examines the impact of the peer effect on rural household carbon emissions in China using micro-level data, offering significant insights for reducing carbon emissions and advancing the nation’s goals of achieving “carbon peak” and “carbon neutrality”. (2) It elucidates the mechanisms through which the peer effect influences farmers’ carbon emissions, specifically informational social influence and normative social influence. (3) By employing double machine learning, the study estimates the peer effect on rural household carbon emissions and quantitatively assesses its heterogeneity across income groups and regions.

Literature review and theoretical framework

Driving factors of household carbon emissions

Household carbon emissions account for an important share of global carbon emissions13, so reducing them plays a crucial role in promoting global emission reduction14. A growing body of literature focuses on household carbon emissions and their influencing factors. One strand of literature focuses on the impact of household economic characteristics on carbon emissions. The earliest research on domestic household carbon emissions focused on the impact of income. Zhang et al.10 conducted a systematic empirical study on the relationship between income and household carbon emissions, and their research showed that the reduction in carbon emissions caused by consumption in high-income households was greater than that in low-income households. In recent years, some scholars have noted the impact of household housing wealth. Zhao et al.15 measured household carbon emissions using China Family Panel Studies (CFPS) data from 2012 to 2016 and analyzed the relationship between housing wealth and household carbon emissions. They found that housing wealth has a significant positive impact on household carbon emissions, with an impact almost twice that of financial wealth, and that this impact is heterogeneous across households with different housing ownership situations. More literature has focused on the impact of demographic characteristics, such as population aging and family size, on household carbon emissions. Fan et al.13 used panel data from 30 provinces in China from 1997 to 2017 to empirically study the relationship between population aging and household carbon emissions through threshold regression and two-stage instrumental variable regression.
The results show that the aging rate of the urban population has a positive impact on household carbon emissions both below and above the threshold of 0.083, although this positive impact is relatively smaller when the aging rate is higher than 0.083. When the aging rate of the rural population is below 0.066, the impact on household carbon emissions is not significant. However, when it is above 0.066, population aging has a significant positive impact on household carbon emissions. Zhang et al.16 empirically studied the relationship among age, working hours, and carbon emissions using China Family Panel Studies (CFPS) data from 2010 to 2018. The results show that working hours have a significant positive impact on households’ indirect carbon emissions. After the interaction term of working hours and aging is introduced into the model, the regression coefficient of working hours is significantly negative, indicating that aging can reshape the relationship between working hours and carbon emissions. Chen and Hu17 empirically studied the pathways through which population aging affects household carbon emissions using a parallel multiple mediation model based on the 2018 China Family Panel Studies (CFPS) data. Their research shows that population aging has a positive direct impact on household carbon emissions, and that this impact is closely related to housing area. Another study17 systematically analyzed the relationship between household size and household carbon emissions, using input–output models and two-way fixed effects models combined with Chinese urban household income and consumption expenditure data. The results indicate that household size has a significant negative impact on per capita carbon emissions, and its impact on indirect carbon emissions is much greater than its impact on direct carbon emissions.

Some literature has conducted systematic research on household carbon emissions from the perspective of family lifestyle or social networks. Song and Huang11 focused on the relationship between urban household lifestyle and household carbon emissions, and conducted empirical research using linear regression and ordered Logit models. The results show that in the daily life of urban households, the carbon emissions caused by transportation are much greater than those caused by diet. At the same time, household income, the education level of the head of the household, and household size all have a significant positive impact on carbon emissions. Meng et al.18 empirically studied the relationship between social network expansion and household carbon emissions using China Family Panel Studies (CFPS) data from 2014 to 2018. The results show that the expansion of social networks promotes the upgrading of the consumption structure and thereby increases carbon emissions, driven mainly by social comparison and attention to social status. At the same time, the impact of social network expansion on carbon emissions is heterogeneous across households of different ages, education levels, and income levels. According to status-seeking theory, residents’ consumption is closely tied to their social status. Social status concerns lead people to pursue respect within a group, and individuals often achieve this goal through conspicuous consumption, which often leads to higher household carbon emissions.

Previous research on carbon emissions from Chinese households has mainly focused on the impact of factors such as income, education level, population structure, and family lifestyle. In recent years, attention has been paid to the impact of social networks, but research on the relationship between the peer effect and household carbon emissions remains limited. This article addresses this gap by systematically studying the relationship between the peer effect and rural household carbon emissions and the mechanisms behind it. It also quantitatively measures the heterogeneous impact of the peer effect on the carbon emissions of rural Chinese households across income groups and regions.

The importance of the peer effect in rural household carbon emissions

In the existing literature, studies on the impact of social network relationships on household carbon emissions are closely aligned with the research theme of this paper. Recently, scholars have begun to explore the impact of peer effects on household carbon emissions in China. For instance, He et al.12 utilized panel data from the China Household Finance Survey (CHFS) to empirically demonstrate the presence of peer effects in Chinese household carbon emission behaviors. Their mechanism analysis revealed that peer effects influence household carbon emissions through learning imitation and competitive imitation mechanisms. However, existing research has largely overlooked the unique characteristics of rural residents in China. Rural China is a relational society, in which peer effects are likely to play a particularly significant role in shaping household carbon emission behaviors. Therefore, this study focuses on this area, aiming to fill the research gap on rural household carbon emissions in China while providing practical insights for achieving the nation’s carbon peak and carbon neutrality goals.

Firstly, influenced by information sharing and knowledge dissemination, rural residents often pay attention to the differences between their own carbon consumption and that of their peers. Specifically, the carbon consumption behaviors of surrounding individuals significantly shape the carbon-related decisions of rural households. Secondly, under the influence of social norms, rural residents’ carbon consumption behaviors are profoundly affected by peer effect. The unique social function of interpersonal relationships in rural China fosters a dynamic where consumption decisions are characterized by mutual influence among community members.

In summary, given the unique social structures, cultural contexts, and consumption patterns of rural residents in China, conducting in-depth research on their carbon emission behaviors holds significant theoretical and practical value. Exploring the impact of peer effect on carbon emissions among rural Chinese residents not only helps uncover the social driving mechanisms behind carbon reduction in rural areas but also provides a scientific basis for formulating differentiated and targeted emission reduction policies. This research is of great importance for advancing China’s “Dual Carbon” goals of achieving carbon peak and carbon neutrality.

Theoretical framework

Study on the impact of peer effect on household carbon emissions

According to the relative income hypothesis, individual consumption behavior is not independent but is closely related to the consumption behavior of neighboring individuals; that is, there is a demonstration effect. Under interdependent preferences, individuals often exhibit herd behavior, in which their behavior tends to be consistent with that of their reference group19,20,21. When individuals lack objective standards, they select groups of people who are similar to themselves to obtain valuable reference information, and then adjust their behavior to be consistent with others22. Previous research on the relationship between the peer effect and carbon emissions has mainly focused on low-carbon product purchasing decisions and the low-carbon practices of enterprises, with less attention paid to the relationship between the peer effect and household carbon emissions. Sokolowski23 empirically studied the impact of the peer effect on the adoption of rooftop photovoltaic systems in Polish residential buildings using panel data. The study showed that the peer effect has a positive effect on the installation of residential photovoltaic systems. Jiang et al.24 extended the study of the peer effect to the low-carbon practices of enterprises, studying how enterprises imitate low-carbon practices in upstream and downstream industries and analyzing the impact of environmental regulation intensity on the peer effect. The results show that low-carbon practices exhibit a peer effect among enterprises, and that the carbon emission intensity of enterprises is not only influenced by the low-carbon practices of peers but also closely related to the low-carbon practices of upstream and downstream industries.
Yu and Deng25 empirically studied the spatial correlation of household low-carbon product purchase decisions based on microdata using spatial discrete choice models and Bayesian Markov chain Monte Carlo methods. The results show that household low-carbon purchasing decisions are influenced by the purchasing behavior of surrounding groups, and there is a significant spatial correlation. Lei et al.9 analyzed the positive impact of neighbor effect on household energy-saving product consumption based on comprehensive survey data in China, using binary Logit regression model, generalized structural equation model, and propensity score matching method.

The demonstration effect typically emphasizes the phenomenon where the consumption behaviors of low-income groups align with those of high-income groups, while the peer effect focuses more on the influence of surrounding individuals’ consumption behaviors on one’s own behavior. Due to the pronounced clan-based clustering in rural China, social relationships exhibit a ripple effect that diminishes with distance. This unique characteristic determines that the primary influence on rural residents’ carbon consumption behaviors is the peer effect. From the perspective of interdependent preferences, this study integrates social interaction theory, social norms theory, and the realities of rural China to investigate the impact of peer effect on rural households’ carbon emissions.

Study on the mechanism of peer effect on household carbon emissions

The mechanism by which the peer effect affects individual decision-making mainly includes two aspects: informational social influence and normative social influence. Informational social influence emphasizes the acceptance of information obtained from others as evidence about reality; that is, in uncertain situations, individuals’ decisions depend on information provided by others26 (Deutsch and Gerard 1955). Normative social influence holds that individuals facing group pressure choose to conform to the behavior of others in order to gain recognition; that is, they follow group standards, driven by social norms and expectations, to avoid exclusion or ridicule26,27.

Informational social influence refers to the phenomenon in which individuals’ decisions are influenced by information provided by others when they face uncertain and ambiguous situations, leading to a tendency towards consensus (Sherif 1936)28. Under incomplete information, individuals follow others’ low-carbon consumption decisions because they believe they can obtain better information from their peer group. When the surrounding population practices low-carbon consumption, this affects individual carbon emissions through information sharing and knowledge flow. In this case, the carbon emissions of farmers are closely related to the decisions of groups with higher socioeconomic status. On the one hand, the theories of collective intelligence and social information processing indicate a close correlation between social information and individual decision-making, and individuals reduce decision-making risks through information integration (Surowiecki 2004)29,30,31. On the other hand, groups with higher socioeconomic status have stronger environmental awareness and greater green consumption ability32,33, and people tend to follow the consumption decisions of high socioeconomic status groups in order to demonstrate their identity and status.

Normative social influence mainly stems from individuals’ strong sense of social belonging and fear of negative evaluation. Low-carbon consumption is a widely recognized social value, and practicing it often earns appreciation and recognition from others. When the surrounding population practices low-carbon consumption, individuals also keep their behavior consistent with it under the influence of social norms. Driven by social identity and belonging needs, individuals often choose to abide by social norms and imitate the behavior patterns of the group in order to pursue group acceptance and recognition34. Group pressure can also prompt individuals to adjust their behavior to meet group expectations35. Hogg and Reid36 pointed out that social norms are patterns of thinking, emotion, and behavior shared within groups, and that people refer to these patterns when making decisions. They also studied in detail how social identity theory and self-categorization theory explain individual decision-making and the spread of group norms. It is worth noting that the degree of normative social influence is closely related to cultural and social background, and normative social influence is more pronounced in collectivist cultures37.

Based on the above analysis, this article proposes the following hypotheses.

Hypothesis 1: The peer effect has a positive impact on the carbon emissions of rural residents.

Hypothesis 2: Under informational social influence, the carbon emissions of lower social status groups align with those of higher social status groups.

Hypothesis 3: Under the mechanism of normative social influence, the smaller the gap in carbon emissions between households and surrounding groups, the stronger their sense of social belonging.

Model setting

(1) Model setting of the impact of the peer effect on rural household carbon emissions.

This article mainly studies the impact of the peer effect on rural household carbon emissions. Most related research relies on traditional causal inference models. However, traditional econometric models have shortcomings when the functional form of the model is misspecified or the explanatory variables are high-dimensional. The double/debiased machine learning method can effectively address these problems.

Chernozhukov et al.38 proposed the double/debiased machine learning method. Compared with nonparametric regression methods, the double/debiased machine learning method relies on machine learning and is suitable for high-dimensional settings in which the model is a complex function of the covariates39. Compared with traditional econometric models, it is better suited to the research problem in this paper. Firstly, when studying the carbon emissions of farmers, many variables may have nonlinear effects. If the econometric model is specified in linear form, there may be specification errors, which can affect the statistical properties of the estimates. The double/debiased machine learning method can effectively mitigate this specification bias39,40. Secondly, rural residents’ carbon emissions are influenced by many factors. To avoid endogeneity caused by omitted variables, the relevant factors affecting farmers’ carbon emissions should be controlled for as fully as possible. Traditional nonparametric regression is prone to the “curse of dimensionality”, while the double/debiased machine learning method remains applicable to high-dimensional data. Finally, the double/debiased machine learning method makes the estimating equation satisfy “Neyman orthogonality”, and consistent estimators can be obtained through K-fold cross-fitting.

Firstly, this article constructs a partially linear double/debiased machine learning model.

$${Y}_{it}={\theta }_{0}{zw}_{it}+{g}_{0}\left({X}_{it}\right)+{U}_{it}, E\left({U}_{it}|{X}_{it},{zw}_{it}\right)=0$$
(1)
$${zw}_{it}={m}_{0}\left({X}_{it}\right)+{V}_{it}, E\left({V}_{it}|{X}_{it}\right)=0$$
(2)

Among them, \(i\) is the household, \(t\) is time, \({Y}_{it}\) is the dependent variable, representing the per capita carbon emissions of the household, and \({zw}_{it}\) represents the carbon emissions of surrounding households, which is used to measure the impact of the peer effect on household carbon emissions and is the core explanatory variable of the model. \({X}_{it}\) is the set of control variables, \({g}_{0}\left(\bullet \right)\) and \({m}_{0}\left(\bullet \right)\) are unknown functions, and \({U}_{it}\) and \({V}_{it}\) are random error terms that are uncorrelated with each other.

Given \({X}_{it}\), the conditional expectation on both sides of Eq. (1) can be obtained:

$$E({Y}_{it}|{X}_{it})={\theta }_{0}E({zw}_{it}|{X}_{it})+{g}_{0}\left({X}_{it}\right)+E({U}_{it}|{X}_{it})$$
(3)

Subtracting Eq. (3) from Eq. (1) yields:

$${Y}_{it}-E\left({Y}_{it}|{X}_{it}\right)={\theta }_{0}({zw}_{it}-E({zw}_{it}|{X}_{it}))+{U}_{it}$$
(4)

When \({U}_{it}\) is not correlated with \({V}_{it}\), a consistent estimator of \({\theta }_{0}\) can be obtained through ordinary least squares estimation. The double/debiased machine learning method estimates \(\widehat{E\left({Y}_{it}|{X}_{it}\right)}\) and \(\widehat{E\left({zw}_{it}|{X}_{it}\right)}\) separately using machine learning methods, and then performs an ordinary least squares regression of \({Y}_{it}-\widehat{E\left({Y}_{it}|{X}_{it}\right)}\) on \({zw}_{it}-\widehat{E\left({zw}_{it}|{X}_{it}\right)}\). This approach of taking deviations from conditional means (partialling out) makes the estimating equation satisfy “Neyman orthogonality”.

Let \({l}_{0}\left({X}_{it}\right)\equiv E\left({Y}_{it}|{X}_{it}\right)\), \({m}_{0}\left({X}_{it}\right)\equiv E\left({zw}_{it}|{X}_{it}\right)\), \({\widehat{l}}_{0}\left({X}_{it}\right)=\widehat{E\left({Y}_{it}|{X}_{it}\right)}\), \({\widehat{m}}_{0}\left({X}_{it}\right)=\widehat{E\left({zw}_{it}|{X}_{it}\right)}\), and \({\widehat{V}}_{it}={zw}_{it}-{\widehat{m}}_{0}\left({X}_{it}\right)\); then

$${\widehat{\theta }}_{0}={(\frac{1}{n}\sum {\widehat{V}}_{it}^{2})}^{-1}\frac{1}{n} \sum {\widehat{V}}_{it}({Y}_{it}-{\widehat{l}}_{0}\left({X}_{it}\right))$$
(5)

Next, we will examine the asymptotic properties of double/debiased machine learning estimators.

$$\begin{aligned}\sqrt{n}\left({\widehat{\theta }}_{0}-{\theta }_{0}\right) & ={\left(\frac{1}{n}\sum {\widehat{V}}_{it}^{2}\right)}^{-1}\frac{1}{\sqrt{n}}\sum \left({m}_{0}\left({X}_{it}\right)-{\widehat{m}}_{0}\left({X}_{it}\right)\right){U}_{it} \\ & +{\left(\frac{1}{n}\sum {\widehat{V}}_{it}^{2}\right)}^{-1}\frac{1}{\sqrt{n}}\sum {V}_{it}{U}_{it}-{\left(\frac{1}{n}\sum {\widehat{V}}_{it}^{2}\right)}^{-1}\frac{{\theta }_{0}}{\sqrt{n}}\sum {\left({m}_{0}\left({X}_{it}\right)-{\widehat{m}}_{0}\left({X}_{it}\right)\right)}^{2}\\&-{\left(\frac{1}{n}\sum {\widehat{V}}_{it}^{2}\right)}^{-1}\frac{{\theta }_{0}}{\sqrt{n}}\sum {V}_{it}\left({m}_{0}\left({X}_{it}\right)-{\widehat{m}}_{0}\left({X}_{it}\right)\right)\\&+{\left(\frac{1}{n}\sum {\widehat{V}}_{it}^{2}\right)}^{-1}\frac{1}{\sqrt{n}}\sum \left({m}_{0}\left({X}_{it}\right)-{\widehat{m}}_{0}\left({X}_{it}\right)\right)\left({l}_{0}\left({X}_{it}\right)-{\widehat{l}}_{0}\left({X}_{it}\right)\right)\\&+{\left(\frac{1}{n}\sum {\widehat{V}}_{it}^{2}\right)}^{-1}\frac{1}{\sqrt{n}}\sum {V}_{it}\left({l}_{0}\left({X}_{it}\right)-{\widehat{l}}_{0}\left({X}_{it}\right)\right)\end{aligned}$$
(6)

If \(\left({m}_{0}\left({X}_{it}\right)-{\widehat{m}}_{0}\left({X}_{it}\right)\right)\) and \(({l}_{0}\left({X}_{it}\right)-{\widehat{l}}_{0}\left({X}_{it}\right))\) both converge to 0 faster than \({n}^{-1/4}\), then \({\left({m}_{0}\left({X}_{it}\right)-{\widehat{m}}_{0}\left({X}_{it}\right)\right)}^{2}\) and \(\left({m}_{0}\left({X}_{it}\right)-{\widehat{m}}_{0}\left({X}_{it}\right)\right)\left({l}_{0}\left({X}_{it}\right)-{\widehat{l}}_{0}\left({X}_{it}\right)\right)\) will both converge to 0 faster than \({n}^{-1/2}\), and the corresponding terms will converge to 0 in probability.
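
The rate condition above can be made explicit with the Cauchy–Schwarz inequality. As a sketch, write \({\Vert f\Vert }_{n}={\left(\frac{1}{n}\sum f{\left({X}_{it}\right)}^{2}\right)}^{1/2}\) for the empirical root-mean-square norm (a notation introduced here for illustration only) and assume that both nuisance errors are \({o}_{p}({n}^{-1/4})\) in this norm. Then the cross term in Eq. (6) satisfies

$$\left|\frac{1}{\sqrt{n}}\sum \left({m}_{0}-{\widehat{m}}_{0}\right)\left({l}_{0}-{\widehat{l}}_{0}\right)\right|\le \sqrt{n}\,{\Vert {m}_{0}-{\widehat{m}}_{0}\Vert }_{n}\,{\Vert {l}_{0}-{\widehat{l}}_{0}\Vert }_{n}=\sqrt{n}\cdot {o}_{p}({n}^{-1/4})\cdot {o}_{p}({n}^{-1/4})={o}_{p}(1)$$

The squared term is handled in the same way, since \(\frac{1}{\sqrt{n}}\sum {\left({m}_{0}-{\widehat{m}}_{0}\right)}^{2}=\sqrt{n}\,{\Vert {m}_{0}-{\widehat{m}}_{0}\Vert }_{n}^{2}={o}_{p}(1)\). Provided \({\left(\frac{1}{n}\sum {\widehat{V}}_{it}^{2}\right)}^{-1}\) remains bounded in probability, both terms therefore vanish asymptotically.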

It is worth noting that in the last term, \({\left(\frac{1}{n}\sum {\widehat{V}}_{it}^{2}\right)}^{-1}\frac{1}{\sqrt{n}}\sum {V}_{it}({l}_{0}\left({X}_{it}\right)-{\widehat{l}}_{0}\left({X}_{it}\right))\), \({V}_{it}\) is correlated with \(({l}_{0}\left({X}_{it}\right)-{\widehat{l}}_{0}\left({X}_{it}\right))\) when the same observations are used both to estimate \({\widehat{l}}_{0}\) and to compute the term, which may cause the term to fail to converge to 0. K-fold cross-fitting can effectively solve this problem, and previous studies have shown that five-fold cross-fitting yields relatively better results41.
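
The estimator in Eq. (5) combined with K-fold cross-fitting can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the nuisance learner is a simple binned-mean regression (a stand-in for whichever machine learning method is actually used), the covariate is one-dimensional, the data are simulated rather than taken from CFPS, and all function names are hypothetical.

```python
import math
import random


def fit_binned_mean(x_train, y_train, n_bins=20, lo=0.0, hi=1.0):
    """Stand-in learner: predict y by the training mean of its x-bin."""
    width = (hi - lo) / n_bins

    def bin_of(x):
        # Clamp so that boundary or out-of-range values map to a valid bin.
        return min(n_bins - 1, max(0, int((x - lo) / width)))

    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(x_train, y_train):
        b = bin_of(x)
        sums[b] += y
        counts[b] += 1
    overall = sum(y_train) / len(y_train)
    # Empty bins fall back to the overall training mean.
    means = [sums[b] / counts[b] if counts[b] else overall
             for b in range(n_bins)]
    return lambda x: means[bin_of(x)]


def dml_partially_linear(y, zw, x, n_folds=5, seed=0):
    """Cross-fitted estimate of theta_0 in Y = theta_0 * zw + g_0(X) + U."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    y_res = [0.0] * n   # Y - l_hat(X), predicted out of fold
    v_hat = [0.0] * n   # zw - m_hat(X), predicted out of fold
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Nuisance functions are fitted only on the other folds.
        l_hat = fit_binned_mean([x[i] for i in train], [y[i] for i in train])
        m_hat = fit_binned_mean([x[i] for i in train], [zw[i] for i in train])
        for i in held_out:
            y_res[i] = y[i] - l_hat(x[i])
            v_hat[i] = zw[i] - m_hat(x[i])
    # Residual-on-residual OLS without intercept, as in Eq. (5).
    return sum(v * r for v, r in zip(v_hat, y_res)) / sum(v * v for v in v_hat)


def simulate(n=4000, theta0=0.5, seed=1):
    """Simulated data with nonlinear m_0 and g_0 (illustrative only)."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    zw = [math.sin(2 * math.pi * xi) + rng.gauss(0, 1) for xi in x]
    y = [theta0 * d + xi ** 2 + rng.gauss(0, 1) for d, xi in zip(zw, x)]
    return y, zw, x
```

Calling `dml_partially_linear(*simulate())` should return a value close to the simulated coefficient of 0.5. Because each residual is computed from nuisance functions fitted on the other folds, the estimation error of \({\widehat{l}}_{0}\) is not built from the same observation's \({V}_{it}\), which is the purpose of cross-fitting described above.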

(2) Model setting of the influence mechanism.

(1) According to Hypothesis 2, under informational social influence, the carbon emissions of lower social status groups align with those of higher social status groups. To test Hypothesis 2, this paper uses the double/debiased machine learning method introduced above to empirically study the impact of the carbon emissions of higher social status groups on those of lower social status groups. The constructed model is shown below.

    $${d\_Y}_{it}={\theta }_{0}{g\_zw}_{it}+{g}_{0}\left({X}_{it}\right)+{u}_{it}, E\left({u}_{it}|{X}_{it},{g\_zw}_{it}\right)=0$$
    (7)
    $${g\_zw}_{it}={m}_{0}\left({X}_{it}\right)+{v}_{it}, E\left({v}_{it}|{X}_{it}\right)=0$$
    (8)

Among them, \(i\) is the family, \(t\) is time, \({d\_Y}_{it}\) is the dependent variable, which is the per capita carbon emissions of families with lower social status, and \({g\_zw}_{it}\) represents the average level of per capita carbon emissions of families with higher social status in each region. It is used to study the mechanism by which the peer effect acts on rural residents’ carbon emissions through informational social influence and is the core explanatory variable of the model. \({X}_{it}\) is the set of control variables, \({g}_{0}\left(\bullet \right)\) and \({m}_{0}\left(\bullet \right)\) are unknown functions, and \({u}_{it}\) and \({v}_{it}\) are random error terms that are uncorrelated with each other.
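
To make the construction of \({g\_zw}_{it}\) concrete, the following Python sketch computes the region-by-year mean of per capita emissions among higher-status households and attaches it to each lower-status household. It is an illustration under assumptions: the record keys (`region`, `year`, `high_status`, `pc_emissions`) are hypothetical, and the rule that classifies a household as higher or lower status is taken as given rather than specified here.

```python
from collections import defaultdict


def attach_high_status_mean(records):
    """Return lower-status rows with the regional higher-status mean attached.

    Each record is a dict with hypothetical keys:
    'region', 'year', 'high_status' (bool), 'pc_emissions' (float).
    """
    totals = defaultdict(lambda: [0.0, 0])  # (region, year) -> [sum, count]
    for r in records:
        if r["high_status"]:
            cell = totals[(r["region"], r["year"])]
            cell[0] += r["pc_emissions"]
            cell[1] += 1

    out = []
    for r in records:
        key = (r["region"], r["year"])
        # Keep only lower-status households that have a reference group.
        if r["high_status"] or key not in totals:
            continue
        total, count = totals[key]
        out.append({**r, "d_Y": r["pc_emissions"], "g_zw": total / count})
    return out
```

Each returned row then carries the dependent variable `d_Y` and the core explanatory variable `g_zw` of Eqs. (7) and (8). Lower-status households in a region-year cell with no higher-status household are dropped in this sketch; other treatments of such cells are possible.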

(2) According to Hypothesis 3, under the mechanism of normative social influence, the smaller the gap in carbon emissions between households and surrounding groups, the stronger their sense of social belonging. To test Hypothesis 3, this paper empirically studies the impact of relative carbon emission levels on households’ sense of social belonging using Probit and Logit models. The constructed model is shown below.

    $${gs}_{it}=f(\text{l}{txf\_cj}_{jt},{Z}_{it},{\varepsilon }_{it})$$
    (9)

where \(i\) represents family, \(t\) represents year, and \(j\) represents province. \(\text{l}{txf\_cj}_{jt}\) represents the relative carbon emission level of farmers lagged by one period, and \({Z}_{it}\) represents the control variable. \({gs}_{it}\) is a dummy variable representing the strength of a family’s sense of social belonging. If a family has a strong sense of social belonging in year \(t\), the value is 1; otherwise, the value is 0. \({\varepsilon }_{it}\) is a random perturbation term.

Since the dependent variable of Eq. (9) is a dummy variable, Probit and Logit models can be chosen, and the specific models have the following forms:

$${gs}_{it}^{*}={\beta }_{0}+{\beta }_{1}\text{l}{txf\_cj}_{jt}+{\beta }_{2}{Z}_{it}+{\varepsilon }_{it}$$
(10)

If all explanatory variables in the model are represented by \({M}_{it}\), then

$$E({\varepsilon }_{it}|{M}_{it})=0$$
(11)
$$P({gs}_{it}|{M}_{it})=\Lambda {({M}_{it}^{\prime}\beta )}^{{gs}_{it}}{(1-\Lambda ({M}_{it}^{\prime}\beta ))}^{1-{gs}_{it}}$$
(12)
$$P({gs}_{it}=1|{M}_{it})=E({gs}_{it}|{M}_{it})=\Lambda ({M}_{it}^{\prime}\beta )$$
(13)
$$P({gs}_{it}=0|{M}_{it})=1-\Lambda ({M}_{it}^{\prime}\beta )$$
(14)

Among them, \({gs}_{it}^{*}\) is the latent variable. When \(\Lambda\) is the distribution function of the standard normal distribution, the above equation is the Probit model. When \(\Lambda (z)=\frac{{e}^{z}}{1+{e}^{z}}\), the above equation is the Logit model. Both the Probit and Logit models are estimated by maximum likelihood. Letting \(\lambda (z)=\frac{d\Lambda (z)}{dz}\), for each household in period \(t\) the marginal effect is \(\frac{\partial P({gs}_{it}=1|{M}_{it})}{\partial (\text{l}{txf\_cj}_{jt})}=\lambda ({M}_{it}^{\prime}\beta )\cdot {\beta }_{1}\). The size of the change in the probability that a rural household has a strong sense of social belonging, in response to a change in the one-period-lagged relative carbon emission level, therefore depends on the household characteristics \({M}_{it}\), whereas its sign depends only on the regression coefficient \({\beta }_{1}\), because \(\lambda\) is positive. When the regression coefficient is negative, it indicates that a large gap in carbon emission levels between farmers and surrounding households in the previous period is not conducive to improving their sense of social belonging.
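
The marginal effect formula above can be checked numerically with a short Python sketch. The coefficient and covariate values used with it are hypothetical and chosen only for illustration; they are not estimates from this study, and the function names are likewise illustrative.

```python
import math


def logit_cdf(z):
    """Logistic distribution function, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit_pdf(z):
    """Derivative of the logistic distribution function."""
    p = logit_cdf(z)
    return p * (1.0 - p)


def probit_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_effect(m, beta, k, pdf):
    """d P(gs = 1 | M) / d M_k = pdf(M'beta) * beta_k."""
    index = sum(mi * bi for mi, bi in zip(m, beta))
    return pdf(index) * beta[k]
```

For example, with the hypothetical values `m = [1.0, 2.0]` and `beta = [0.4, -0.2]`, the index \({M}_{it}^{\prime}\beta\) equals 0, so `marginal_effect(m, beta, 1, logit_pdf)` evaluates \(\lambda (0)\cdot {\beta }_{1}\) with \(\lambda (0)=0.25\). The Logit and Probit effects differ in magnitude but share the sign of the coefficient, because both density functions are strictly positive.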

Data sources and variable settings

Data sources

The data used in this article mainly includes: (1) China Family Panel Studies (CFPS) data from 2012 to 2020; (2) The input–output tables of multiple regions in China for 2012, 2015, and 2017 in the Carbon Emission Accounts and Datasets (CEADs)42,43,44,45; (3) The carbon emission data of each province in the Carbon Emission Accounts and Datasets (CEADs) for the years 2011, 2013, 2015, and 2017.

The China Family Panel Studies (CFPS) data are processed as follows. (1) Data filtering. First, some households in the CFPS data report an annual income lower than their food expenses, which may reflect measurement error or large-scale banquets held by the household. Following Yan46 and Hang and Yan47, this article excludes samples in which the household’s annual income is insufficient to cover food expenses. Second, because the household expenditure items in the CFPS base period (2010) differ significantly from those in 2012 to 2020, this article uses the data from 2012 to 2020 for the empirical research. Finally, taking 2012 as the base year, this article excludes households that withdrew from the survey in later waves and constructs a balanced panel for the empirical analysis. (2) Determination of the head of household. The CFPS data contain no question about the head of household, but they do contain a question about "the most familiar member of the family’s finances"; this article identifies that member as the head of household. (3) Definition of rural samples. CFPS adopts an integrated urban–rural sampling frame, so there is more than one criterion for defining urban and rural samples. This article refers to the following four classification criteria provided by CFPS Classroom to define rural households. (a) The CFPS project team defined the urban–rural attributes of households by matching their residential community addresses with the standards published by the National Bureau of Statistics; this article uses this definition to determine the rural sample for the benchmark regression. (b) The CFPS survey records the type of community where the interviewed household is located; on this basis, rural households are those whose community is a village committee. (c) The CFPS personal questionnaire asks respondents about the nature of their household registration; households with a household registration type of "agricultural household registration" are defined as rural samples. (d) Based on the CFPS household economic questionnaire, a household is recognized as rural if it engages in its own agricultural activities.
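The filtering steps above can be sketched in a few lines. This is only an illustrative outline: the field names (`hid`, `year`, `income`, `food`) and the toy records are hypothetical, and the sketch assumes that a household whose record is dropped in any wave leaves the balanced panel entirely, which the text does not state explicitly.

```python
WAVES = (2012, 2014, 2016, 2018, 2020)  # CFPS waves used in this article

def build_balanced_panel(records, waves=WAVES):
    """Filter household-year records and keep households present in every wave."""
    # Step 1: drop records whose annual income does not cover food expenses.
    kept = [r for r in records
            if r["year"] in waves and r["income"] >= r["food"]]
    # Step 2: keep only households that still have a record in every wave.
    years_seen = {}
    for r in kept:
        years_seen.setdefault(r["hid"], set()).add(r["year"])
    complete = {h for h, years in years_seen.items() if years == set(waves)}
    return [r for r in kept if r["hid"] in complete]

# Toy records (hypothetical field names and values).
records = (
    [{"hid": 1, "year": y, "income": 30000, "food": 8000} for y in WAVES]
    + [{"hid": 2, "year": y, "income": 20000, "food": 6000}
       for y in WAVES if y != 2016]
    + [{"hid": 3, "year": y, "income": 5000 if y == 2018 else 25000, "food": 9000}
       for y in WAVES]
)
panel = build_balanced_panel(records)
print(sorted({r["hid"] for r in panel}), len(panel))
```

In this toy example household 2 is missing a wave and household 3 has one year in which income falls below food expenses, so only household 1 remains in the balanced panel.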

The provincial input–output tables and carbon emission data are processed as follows. The multi-regional input–output tables for China cover 2012, 2015, and 2017. Following Zhang et al.10, this article maps the economic sectors in the input–output table to eight major categories of consumption, computes the corresponding input coefficient matrix, and calculates the total output of each region. Given data availability and following the approach in ref.16, this paper matches the 2012 input–output table with the CFPS household consumption data for 2012 and 2014, the 2015 table with the CFPS data for 2016, and the 2017 table with the CFPS data for 2018 and 2020.

Variable setting

Dependent variable

  1. (1)

    Household per capita carbon emissions (Y). Following the approach in ref.16, this article uses indirect carbon emissions to measure household carbon emissions. Household consumption data are taken from the China Family Panel Studies (CFPS) and cover food, clothing, housing, healthcare, transportation and communication, cultural and educational entertainment, household equipment and daily necessities, and other consumer expenditures. Following Zhang et al.10, this paper maps the 42 economic sectors in China’s input–output table into these eight consumption categories. For example, the sectors related to clothing in China’s input–output table are Manufacturing of Textiles; Manufacture of Textile Wearing Apparel, Footwear, Caps; and Manufacture of Leather, Fur, Feather, and Related Products. Carbon emission data for the economic sectors are obtained from the Carbon Emission Accounts and Datasets (CEADs). The indirect carbon emissions of households are then calculated using the input–output method:

    $$C=\frac{{C}_{s}}{X}{(I-A)}^{-1}y$$
    (15)

where A is the direct input coefficient matrix aligned with the eight consumption categories (8 × 8), \(\frac{{C}_{s}}{X}\) is the row vector (1 × 8) of carbon emissions per unit of output mapped to these eight categories, and y is the column vector (8 × 1) of household consumption expenditure. Household per capita carbon emissions are obtained by dividing household carbon emissions by household size.
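To make Eq. (15) concrete, the sketch below evaluates \(C=\frac{{C}_{s}}{X}{(I-A)}^{-1}y\) with NumPy. For brevity it uses two consumption categories instead of eight, and all numbers (the coefficient matrix, sector emissions, total output, expenditure, and household size) are invented for illustration rather than taken from CEADs or CFPS.

```python
import numpy as np

def indirect_emissions(sector_emissions, total_output, A, y):
    """Return C = (C_s / X) (I - A)^{-1} y for one household."""
    intensity = np.asarray(sector_emissions, float) / np.asarray(total_output, float)
    A = np.asarray(A, float)
    # Solve (I - A) x = y rather than forming the inverse explicitly.
    total_requirements = np.linalg.solve(np.eye(A.shape[0]) - A, np.asarray(y, float))
    return float(intensity @ total_requirements)

# Two categories instead of eight, with made-up numbers.
A = [[0.1, 0.2],
     [0.3, 0.1]]
C_s = [100.0, 50.0]      # carbon emissions by category
X = [1000.0, 500.0]      # total output by category
y = [10.0, 20.0]         # household expenditure by category
household_size = 4

C = indirect_emissions(C_s, X, A, y)
per_capita = C / household_size
print(round(C, 4), round(per_capita, 4))
```

Solving the linear system is numerically equivalent to multiplying by the Leontief inverse \({(I-A)}^{-1}\) and avoids computing the inverse for every household.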

  1. (2)

    Per capita carbon emissions of low social status families (Y1). The CFPS asks respondents to rate "Your social status in the local area" on a scale from 1 to 5, from low to high. This article measures the social status of a household by the self-evaluation of the head of the household. Following Hang and Yu48, households whose head reports a social status level greater than 3 are defined as high social status families, and those whose head reports a level of 3 or below are defined as low social status families. When testing hypothesis 2, this article takes the per capita carbon emissions of low social status households as the dependent variable.

  2. (3)

    Dummy variable of household social belonging (Y2). Drawing on the approach of Chen et al.49, this article measures a household’s sense of social belonging by the household head’s self-assessed popularity in the CFPS questionnaire (rated on a scale from 0 to 10) and generates a dummy variable from it. The dummy variable equals 1 when the household head’s self-rated popularity level is above 5, and 0 otherwise.

Core explanatory variables and control variables

  1. (1)

    The core explanatory variable in the benchmark analysis is the average per capita carbon emissions of the reference group. The choice of reference group is crucial when studying the impact of the peer effect on the carbon emissions of rural residents. Previous studies have shown that people tend to compare themselves with similar populations (for example, in age, occupation, or geographic location)50,51,52. Given the characteristics of the CFPS data, this article uses regions (provinces, municipalities, autonomous regions) and counties as reference groups. To mitigate endogeneity, the core explanatory variable is the average carbon emissions of the other households in the same region or county, excluding the household itself. The rationale is that the average carbon emissions of the other households in the reference group can affect the carbon emissions of a single household, whereas the carbon emissions of a single household are not sufficient to affect the average of the other households.

  2. (2)

    The core explanatory variables in the mechanism analysis are the average carbon emissions of high social status households and the relative carbon emission level of households. Based on the preceding analysis, the average per capita carbon emissions of high social status households in each region (province, municipality, autonomous region) and county are used as the core explanatory variable to study their impact on the per capita carbon emissions of low social status households, in order to verify the mechanism of informational social influence (hypothesis 2). The relative carbon emission level of a household is the ratio of its carbon emissions to the carbon emissions of the other households in its region; it measures the gap in carbon emissions between the household and the reference group.
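The two constructed variables described above, the average emissions of the other households in the reference group and the household's relative emission level, can be sketched as follows. The function name, the group labels, and the emission values are hypothetical, and the example covers a single year; in the article the reference group is a region or county.

```python
import numpy as np

def leave_one_out_mean(values, groups):
    """Mean of `values` over the other members of each unit's group."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = np.full(values.shape, np.nan)   # stays NaN for single-member groups
    for g in np.unique(groups):
        mask = groups == g
        n = mask.sum()
        if n > 1:
            out[mask] = (values[mask].sum() - values[mask]) / (n - 1)
    return out

# Hypothetical per capita emissions for five households in two reference groups.
emissions = np.array([2.0, 4.0, 6.0, 10.0, 20.0])
group = np.array(["A", "A", "A", "B", "B"])

peer_mean = leave_one_out_mean(emissions, group)   # core explanatory variable
relative_level = emissions / peer_mean             # gap to the reference group
print(peer_mean, relative_level)
```

Excluding the household's own value means that no household's emissions enter its own regressor, which is the point of the construction described in the text.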

Control variables. Drawing on the approaches in refs.15,16,53, the control variables in this study are: per capita household income; per capita household debt; a dummy variable for the household head’s occupation (1 if the head of household works in a government department, public institution, or state-owned enterprise, and 0 otherwise); the number of housing units owned by the family; a dummy variable for the housing fund (1 if someone in the family has a housing fund, and 0 otherwise); the number of older adults in the family; a dummy variable for family occupation (1 if someone in the family works in a government department, public institution, or state-owned enterprise, and 0 otherwise); the education level of the household head; the age of the household head; the health status of the household head; and the household head’s degree of concern for the environment.

Table 1 reports the descriptive statistical results of the main variables in this article.

Table 1 Descriptive statistics of main variables.

Household per capita carbon emissions are the dependent variable in this study, and the average carbon emissions of other households within the same region and within the same county are the core explanatory variables used to measure the impact of the peer effect on household carbon emissions. According to the descriptive statistics, the mean of household per capita carbon emissions is 7.87, while the means of the average carbon emissions of other households within the same region and county are 7.69 and 7.67, respectively. These values are close, which is consistent with a link between the carbon emissions of the reference group (region or county) and household per capita carbon emissions and motivates the analysis of the peer effect in the following sections.

Empirical analysis

Benchmark regression

This article uses a double/debiased machine learning model to study empirically the impact of the peer effect on rural household carbon emissions. Five-fold cross-fitting is applied, random forests are used for prediction, and robust standard errors are reported. The regression results are shown in Table 2. The observations in Table 2 are the rural samples from the CFPS data which, as mentioned earlier, are first defined by matching the addresses of household residential communities with the standards published by the National Bureau of Statistics. The first column reports the results with the average carbon emissions of other households within the same region as the core explanatory variable, and the second column reports the results with the average carbon emissions of other households within the same county as the core explanatory variable.
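For readers unfamiliar with the procedure, the following sketch shows the cross-fitting logic of a double/debiased machine learning estimator under a partially linear specification \(y=\theta d+g(Z)+\varepsilon\), which is an assumption made here for illustration. To keep the example dependency-free, the nuisance functions are fitted by ordinary least squares rather than the random forest used in this article, and the data are simulated with a known effect of 0.6; none of the names or numbers come from the article's data.

```python
import numpy as np

def fit_predict_ols(Z_train, target_train, Z_test):
    """Stand-in nuisance learner (OLS with intercept); the article uses a random forest."""
    def add_const(Z):
        return np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(add_const(Z_train), target_train, rcond=None)
    return add_const(Z_test) @ coef

def dml_plr(y, d, Z, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(Z) + e."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        # Nuisance functions are fitted on the other folds only.
        y_res[test_idx] = y[test_idx] - fit_predict_ols(Z[train_idx], y[train_idx], Z[test_idx])
        d_res[test_idx] = d[test_idx] - fit_predict_ols(Z[train_idx], d[train_idx], Z[test_idx])
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Heteroskedasticity-robust standard error.
    eps = y_res - theta * d_res
    se = np.sqrt(np.mean(d_res**2 * eps**2) / np.mean(d_res**2) ** 2 / n)
    return theta, se

# Simulated data with a known effect of 0.6 (not the article's data).
rng = np.random.default_rng(42)
n = 2000
Z = rng.normal(size=(n, 3))
d = Z @ np.array([0.5, -0.2, 0.1]) + rng.normal(size=n)
y = 0.6 * d + Z @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

theta_hat, se_hat = dml_plr(y, d, Z)
print(round(theta_hat, 3), round(se_hat, 3))
```

Swapping `fit_predict_ols` for a flexible learner such as a random forest changes only the nuisance step; the five-fold splitting, residualization, and final regression of residual on residual stay the same.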

Table 2 Results of the impact of the peer effect on household carbon emissions.

(1) In Table 2 and the following tables, ***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively. (2) Robust standard errors are reported in parentheses. (3) In Tables 2, 3, 4, 5 and 6 and Tables 8 and 9, the control variables include I, D, HO, N, Z, O, FO, E, A, H, C.

According to the estimation results in Table 2, whether the reference group is defined at the province level or the county level, the average carbon emissions of other households in the reference group significantly affect the carbon emissions of rural households. Holding all else constant, a one-unit increase in the average carbon emissions of other households within the same region is associated with an increase of 0.6070 units in the carbon emissions of rural residents, and a one-unit increase in the average carbon emissions of other households within the same county is associated with an increase of 0.3089 units. This indicates that the peer effect has a significant positive impact on the carbon emissions of rural residents.

Robustness test

Because CFPS offers more than one criterion for defining rural samples, this article redefines the rural samples based on the type of community where the interviewed households are located, the household registration type of the head of the household, and whether the household engages in its own agricultural activities. The double/debiased machine learning model is then re-estimated under each definition.

  1. (1)

    Defining rural samples by community type.

Households whose community type is a village committee are defined as rural samples, and the double/debiased machine learning model is re-estimated. The results are shown in Table 3. Compared with Table 2, the numbers of observations for the province-level and county-level reference groups change considerably, but the regression results remain consistent with the benchmark results. The coefficients on the average carbon emissions of other households within the same region and within the same county are 0.7988 and 0.4784, respectively, which still supports the conclusion that the peer effect has a significant positive impact on rural residents’ carbon emissions.

Table 3 Classification of rural samples by community type.
  1. (2)

    Defining rural samples based on the household registration type of the head of household.

Households with agricultural household registration are defined as rural samples, and the double/debiased machine learning model is re-estimated. The results are shown in Table 4. Compared with Table 2, the numbers of observations for the province-level and county-level reference groups change from 5290 and 5233 to 7410 and 7149, respectively, but the regression results remain consistent with the benchmark results. The coefficients on the average carbon emissions of other households within the same region and within the same county are 0.7361 and 0.4402, respectively, which still supports the conclusion that the peer effect has a significant positive impact on rural residents’ carbon emissions.

Table 4 Sample of urban and rural areas divided by household registration.
  1. (3)

    Defining rural samples based on whether households engage in agricultural activities.

Households engaged in agricultural activities are defined as rural samples, and the double/debiased machine learning model is re-estimated. The results are shown in Table 5. Compared with Table 2, the numbers of observations for the province-level and county-level reference groups change from 5290 and 5233 to 3803 and 3734, respectively, but the regression results remain consistent with the benchmark results. The coefficients on the average carbon emissions of other households within the same region and within the same county are 0.6118 and 0.2923, respectively, which still supports the conclusion that the peer effect has a significant positive impact on rural residents’ carbon emissions.

Table 5 Urban and rural samples divided by whether or not engaged in agricultural activities.

Transmission mechanism testing

  1. (1)

    Informational social influence.

The preceding theoretical analysis shows that the peer effect can affect farmers’ carbon emissions through informational social influence. To test this mechanism, this article takes the per capita carbon emissions of low social status rural households as the dependent variable and the average carbon emissions of high social status rural households in the same province-level or county-level region as the core explanatory variable. A double/debiased machine learning model is estimated with five-fold cross-fitting, random forests for prediction, and robust standard errors. Table 6 shows the corresponding regression results.

Table 6 Transmission mechanism test of informational social impact.

According to the regression results in Table 6, the coefficients on the average carbon emissions of high social status households in province-level and county-level regions are 0.3502 and 0.1722, respectively. This indicates that the peer effect can affect farmers’ carbon emissions through informational social influence, which supports hypothesis 2.

  1. (2)

    Normative social influence.

A sense of belonging is closely tied to a group and is one of the fundamental psychological needs of humans54. It means that individuals feel accepted and needed by the group, or that their own characteristics are similar to those of the group. In line with the preceding theoretical analysis, the peer effect can affect household carbon emissions through the mechanism of normative social influence. This section uses a panel ordered logit model to analyze how the gap in carbon emissions between a household and its reference group affects the household’s sense of social belonging. The dependent variable is the family’s sense of social belonging, and the core explanatory variable is the relative carbon emission level of the family lagged by one period. The regression results are shown in Table 7.

Table 7 Testing the transmission mechanism of normative social impact.

To examine how the peer effect influences household carbon emissions through normative social influence, this paper regresses social belonging on the relative household carbon emission level. Following the approach of Ta and Sun55, in addition to the core explanatory variable, the model controls for the household income group, household social status level, household head education level, and household head health level. The income group variable is constructed by sorting the net income of households in each region from low to high each year and dividing them into five groups. In addition, in the fourth wave of CFPS, the question on self-assessed popularity was answered only by individuals interviewed in early 2016, so the number of observations differs markedly from that of the third, fifth, and sixth waves. This section therefore uses the CFPS data from 2018 to 2020.

According to the regression results in Table 7, the coefficient on the relative carbon emission level of households lagged by one period is significantly negative. The greater the gap between a household’s carbon emission level and that of the reference group, the lower the probability that the household reports a strong sense of social belonging, and this effect operates with a lag. Hypothesis 3 is therefore supported.

Heterogeneity of the peer effect on the carbon emissions of farmers

  1. (1)

    Heterogeneity based on income inequality.

Considering income inequality among rural residents, this article further investigates whether the impact of the peer effect on carbon emissions differs across income groups. Farmers are divided into low-income and high-income groups according to the median household income of rural residents in each province-level region in each year. The estimation results from the double/debiased machine learning model are shown in Table 8.

Table 8 Heterogeneity study based on income.

According to Table 8, the impact of the peer effect on the carbon emissions of both low-income and high-income rural households is significant at the 1% level. The impact is larger for high-income farmers than for low-income households, with a regression coefficient of 0.7233. A likely explanation is that high-income families are, on average, better able to make decisions by integrating information and usually have a higher socioeconomic status. Status comparison in luxury consumption is more pronounced among them, which ties their carbon emission behavior more closely to that of the reference group.

  1. (2)

    Spatial heterogeneity.

Given the regional imbalance in China’s economic development, the carbon emission behavior of farmers differs significantly across regions. This article therefore analyzes the heterogeneity of the peer effect on rural residents’ carbon emissions across the eastern, central, western, and northeastern regions of China. Based on the statistical system and classification standards of the National Bureau of Statistics, this article assigns the 25 provinces, municipalities, and autonomous regions covered by the CFPS sample to economic zones. The eastern region includes Beijing, Tianjin, Hebei, Shanghai, Jiangsu, Zhejiang, Fujian, Shandong, and Guangdong; the central region includes Shanxi, Anhui, Jiangxi, Henan, Hubei, and Hunan; the western region includes Guangxi, Chongqing, Sichuan, Guizhou, Yunnan, Shaanxi, and Gansu; and the northeastern region includes Liaoning, Jilin, and Heilongjiang. After data screening and construction of the balanced panel, only 191 households per year remain in the northeastern region, far fewer than in the other economic zones. This article therefore excludes the northeastern region from the regional heterogeneity analysis. The regression results for the impact of the peer effect on farmers’ carbon emissions in the different economic zones are shown in Table 9.

Table 9 Spatial heterogeneity study.

According to Table 9, the impact of the peer effect on rural household carbon emissions is significant at the 1% level in the eastern and central regions and at the 5% level in the western region. The peer effect is largest in the eastern region, where the estimated coefficients are much higher than those in the central and western regions. A likely reason is that the population of the eastern region is relatively larger, and population agglomeration can amplify the peer effect, which is consistent with the conclusion of Zheng and Lu56.

Research conclusion and implications

This article systematically studies the impact of the peer effect on the carbon emissions of rural residents, and the mechanisms behind it, using data from the China Family Panel Studies (CFPS) from 2012 to 2020. (1) Estimates from the double/debiased machine learning method show that the peer effect has a significant positive impact on rural household carbon emissions: the carbon emissions of farmers are closely related to those of households in their reference group. The robustness tests remain consistent with the benchmark regression when the rural sample is redefined by community type, by the household registration type of the head of household, and by whether the household engages in agricultural activities. (2) The peer effect can affect household carbon emissions through the mechanisms of informational and normative social influence. The carbon emissions of high social status households in province-level or county-level regions have a significant positive impact on the carbon emissions of low social status farmers. The relative carbon emission level of households lagged by one period has a significant negative impact on farmers’ sense of social belonging, which shows that when there is a large gap between farmers’ carbon emissions and those of the reference group, their sense of social belonging is weaker. (3) The heterogeneity analysis shows that the peer effect has a greater impact on the carbon emissions of high-income farmers than on those of low-income households, and a greater impact on the carbon emissions of farmers in the eastern region than on those in the central and western regions.

Guiding the household sector to save energy and reduce emissions from the consumption demand side is of great significance for China’s progress toward carbon peak and carbon neutrality and for the comprehensive green transformation of economic and social development. Firstly, it is essential to establish a communication mechanism that fosters positive interaction and information exchange among residents of different social statuses. This will encourage rural residents to expand their green consumption and adopt low-carbon consumption habits, making full use of the peer effect to drive low-carbon consumption. Secondly, the Villagers’ Committee should give full play to the guiding and restraining role of social norms, actively promote a green lifestyle, and vigorously advocate green and low-carbon consumption. The system of social norms, such as village regulations and conventions, should be improved, and rural families that set an example in energy conservation and green consumption should be commended, so as to inspire farmers to develop the habit of low-carbon consumption. The concept of low-carbon consumption can be disseminated through online promotion, artistic performances, and knowledge competitions, thereby enhancing the positive influence of social norms on the green consumption behavior of rural residents. Thirdly, actively promoting green consumption lifestyles in areas with larger populations can create a ripple effect that helps achieve the carbon peak and carbon neutrality goals. Fourthly, actively promoting green, low-carbon consumption among the high-income group and making it a primary lifestyle will be more conducive to advancing the dual carbon goals.