Introduction

For a long time, the issue of income inequality has aroused widespread concern in society, and has been an important research topic in the economic field. For developing countries, how to effectively narrow the income gap among farmers has always been a daunting task. Rising income inequality means that income flows to a small number of residents, leading to severe class divisions. This not only reduces the sense of achievement, happiness, and security of residents, but also has serious consequences for economic development and social stability (Zhang and Wan, 2006). In the critical period of economic development, narrowing the income gap between different classes and expanding the size of middle-income groups are essential conditions for developing countries to achieve sustainable economic development and improve people’s livelihood.

Farmers’ income inequality is a complex issue affected by a number of factors. With the rapid development of digital technology, its role has attracted growing attention. Due to differences in access to or use of digital technology, a digital divide has emerged among farmers of different genders, ages, education levels and income levels. Existing literature mainly discusses the impact of digital development on farmers’ income inequality from the perspectives of mobile phones, computers, the internet, e-commerce, digital inclusive finance, etc., but has not reached a consistent conclusion. Moreover, the digital revolution is a dynamic process. As digital technologies continue to evolve, the digital divide among farmers is also likely to change accordingly. This requires scholars to keep pace with the times, closely follow the changes in practice, and carry out new theoretical and empirical research.

In recent years, the new generation of digital technologies represented by big data has developed rapidly. Data has become a key element in economic development, which is the fundamental feature that distinguishes the digital economy from the industrial economy and the agricultural economy. As big data analytics continues to evolve, the role of data elements in economic development will continue to strengthen. Big data refers to a dataset that is so large that it significantly exceeds the capabilities of traditional tools in collection, storage, management, and analysis. It is characterized by large data volume, high-speed data transmission, diverse data types, timely data updates, and high data quality. By making rational use of big data, it is possible to achieve high value creation at a low cost (Rodriguez et al., 2017). With the rapid popularization of the internet, all kinds of massive information are constantly being generated, collected, stored, processed, and utilized. The era of big data has arrived, bringing about comprehensive social changes and profoundly affecting people’s lives. At present, big data analytics is being rapidly applied across different industries, such as finance, insurance, online marketing, and scientific research (Cooper et al., 2013; Waga and Rabah, 2014). Governments around the world are also leveraging big data analytics to better address challenges in areas such as the economy, healthcare, job creation, natural disasters, and terrorism (Kim et al., 2014).

With the continuous development of big data, some scholars have begun to focus on the link between big data and farmers. Existing literature mainly explores the potential impacts of big data on farmers, the difficulties faced by developing countries in promoting the use of big data among farmers, and the willingness and driving factors of farmers to use big data (Waga and Rabah, 2014; Lokers et al., 2016; Turland and Slade, 2020; Zeng et al., 2024). However, there is limited literature that examines the impact of big data from a micro perspective, especially focusing on the relationship between big data and farmers’ income inequality.

Over the past two decades, rural e-commerce has developed rapidly in China. More and more farmers are using third-party e-commerce platforms for online marketing; they are called “e-commerce farmers” (Footnote 1). Compared with traditional farmers, e-commerce farmers are more proficient in using the internet and better at learning new knowledge (Zeng et al., 2017). Moreover, e-commerce platforms bring together various participants and accumulate a large amount of data, such as transaction records, comments, search traces, etc. By mining and analyzing these big data, important guidance can be provided for the operation of online businesses. The AliResearch Institute released the “2016 Online Business Data Application Analysis Report”, in which it was pointed out that some online businesses have begun to utilize data consciously, and many excellent online business operators are using data to analyze the market and identify business opportunities. The era of relying on intuition and experience is becoming a thing of the past, and the effective use of data has become the key to the upgrading of e-commerce. In addition, the use of big data products by e-commerce farmers to make business decisions is a new phenomenon that few scholars have studied. This phenomenon provides a good opportunity to discuss the relationship between big data development (Footnote 2) and farmers’ income inequality. As human society enters the era of big data, academia should promptly expand the research perspective on the digital divide of farmers from the development of mobile phones, computers, the internet, e-commerce, and digital inclusive finance to the development of big data. This will help us to form a new understanding of the farmers’ digital divide and to adopt corresponding policies and measures in time.

According to entrepreneurship theory, the main internal factors contributing to income inequality among e-commerce farmers are the differences in entrepreneurial alertness and dynamic capabilities. The former reflects the entrepreneur’s ability to identify new opportunities and respond quickly (Roundy et al., 2017; Tang et al., 2012; Valliered, 2013), and the latter reflects the entrepreneur’s process capability in transforming new opportunities into actual performance (Corner and Wu, 2012; Helfat and Peteraf, 2015; Teece, 2018). These two constitute a “discovery-implementation” framework, and both are indispensable (Wright and Zammuto, 2013; Si et al., 2015). Therefore, if big data can help alleviate income inequality among e-commerce farmers, it must be because it helps to narrow the gap between e-commerce farmers in terms of entrepreneurial alertness and dynamic capabilities.

Additionally, a big data product is essentially a product, and its product attributes are intrinsic determinants of its effectiveness (Kumar et al., 2021). From the perspective of product attributes, we can conduct a more in-depth study on the impact of big data products on the income inequality of e-commerce farmers. This perspective is what existing big data research lacks attention to.

Therefore, this paper aims to reveal how big data products affect e-commerce farmers’ income inequality. Specifically, we attempt to answer three research questions: (1) Can big data products help alleviate e-commerce farmers’ income inequality? (2) How do big data products affect e-commerce farmers’ entrepreneurial alertness gap and dynamic capabilities gap? (3) How do the attributes of big data products affect the income inequality of e-commerce farmers?

To address these questions, this study uses a sample of 418 e-commerce farmers from Taobao villages in China, and employs the Tobit model and entropy balancing method to test the impact of the use of big data products and the attributes of big data products on income inequality among e-commerce farmers. The study found that the use of big data products significantly reduces the income inequality among e-commerce farmers by narrowing the gaps in entrepreneurial alertness and dynamic capabilities among them. In addition, enhancing the usefulness, ease of use, and experience of big data products can help promote the common prosperity of e-commerce farmers.
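To make the idea behind entropy balancing concrete, the following minimal sketch reweights a comparison group so that its weighted covariate mean matches a target mean taken from the group that uses big data products. It is an illustration under simplifying assumptions, not the estimation procedure used in this study: it balances a single covariate on its first moment only, starts from uniform base weights, and the function name, variable names, and numbers are all hypothetical.

```python
import math


def entropy_balance_weights(control_x, target_mean, tol=1e-10, max_iter=200):
    """Return weights on control units whose weighted mean equals target_mean.

    With uniform base weights, the minimum-entropy-distance solution has the
    exponential-tilting form w_i proportional to exp(lam * x_i); lam is found
    by bisection. Assumption: one covariate, first moment only.
    """
    if not (min(control_x) < target_mean < max(control_x)):
        raise ValueError("target_mean must lie strictly inside the control range")

    def weights(lam):
        scores = [lam * x for x in control_x]
        top = max(scores)  # subtract the maximum for numerical stability
        expo = [math.exp(s - top) for s in scores]
        total = sum(expo)
        return [e / total for e in expo]

    def weighted_mean(lam):
        return sum(w * x for w, x in zip(weights(lam), control_x))

    # The weighted mean increases with lam, so widen a bracket around the root.
    lo, hi = -1.0, 1.0
    while weighted_mean(lo) > target_mean:
        lo *= 2.0
    while weighted_mean(hi) < target_mean:
        hi *= 2.0

    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        if weighted_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return weights((lo + hi) / 2.0)


# Hypothetical example: years of e-commerce experience among non-users,
# reweighted to match a hypothetical mean of 4.0 years among users.
non_user_experience = [1.0, 2.0, 3.0, 4.0, 5.0]
balanced = entropy_balance_weights(non_user_experience, target_mean=4.0)
balanced_mean = sum(w * x for w, x in zip(balanced, non_user_experience))
```

After reweighting, outcomes of the two groups can be compared on a sample in which the balanced covariate no longer differs on average. A full application would balance several covariates at once.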

The contributions of this paper are mainly reflected in two aspects. On one hand, it expands the research on the digital divide among farmers. This paper introduces the new phenomenon of Chinese e-commerce farmers using big data products and a new research perspective into the field of digital divide research. On the other hand, it extends the research on the relationship between big data development and farmers. Taking farmers’ income inequality as the research object, a topic that has been neglected in existing literature, this paper reveals that the mechanism by which big data products affect farmers’ income inequality lies in narrowing the gaps in farmers’ entrepreneurial alertness and dynamic capabilities, thereby enriching the theoretical framework of big data development.

The rest of the paper is structured as follows: Section “Literature review” reviews the relevant previous studies; Section “Hypotheses” proposes hypotheses; Section “Methodology” introduces the data and methods; Section “Results” reports the empirical results; Section “Discussion” further discusses the findings; and the last section concludes.

Literature review

Farmers’ income inequality from the digital divide

The digital divide is a global phenomenon, and how to bridge it is a major concern of all countries. Income inequality is a basic lens through which to observe the digital divide among farmers. In general, differences in access to and application of digital technologies have led to differences in income growth among farmers. From the perspective of technology types, scholars have mainly researched the digital divide caused by mobile phones, computers, the internet, e-commerce development, and digital inclusive finance. There are three views in the existing literature on the impact of digital technology on farmers’ income inequality.

The first view is that the digital divide will exacerbate income inequality among farmers. Some scholars divide the differences in access to and use of digital technologies into explicit and implicit digital divides (Jamil, 2021). Firstly, the explicit digital divide directly exacerbates income inequality among farmers. If the low-income group lacks basic physical conditions to access smartphones, computers, or the internet, they will be unable to benefit from technological dividends, thereby directly worsening income inequality (DiMaggio et al., 2004). Secondly, the implicit digital divide will continuously worsen income inequality among farmers. Even if the low-income group has access to smartphones, computers, and the internet, their limited digital skills will restrict their ability to obtain technology dividends, thus perpetuating income inequality. The digital divide is not only about physical constraints, but also about behavioral limitations, and the impact of the implicit digital divide on income inequality is more subtle (Martínez and Mora-Rivera, 2019). It has been found that rural households with younger and better-educated household heads, smaller household size, less initial income, and more cultivated land tend to receive higher dividends from the adoption of e-commerce.

The second view is that the digital dividend can alleviate income inequality among farmers. On one hand, the information effect and employment effect generated by rural internet penetration have a greater impact on low-income groups, thus helping to mitigate income inequality among farmers. On the other hand, the rapid decrease in the cost of internet access is narrowing the digital divide, which is negatively correlated with the Gini coefficient (Zhang, 2013). Low-income groups in rural areas can access the internet through cheaper and more portable mobile phones, which can reduce information search costs and improve the efficiency of agricultural markets (Aker, 2010). Scholars also find that internet use alleviates household income inequality through information effects and employment effects (Zhang, 2022). Internet use significantly reduces income inequality between rural households, contributing to the goal of common prosperity in China (Li et al., 2023). Hao and Zhang (2024) investigate the relationship between digital financial use and residents’ income inequality and find that the use of digital payments, digital currency management, digital funds, and credit cards can reduce residents’ income inequality.
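The Gini coefficient mentioned above can be computed directly from a list of incomes. The sketch below is illustrative only: the function name and the income figures are hypothetical, and it is not necessarily the inequality measure used in the empirical part of this paper. It shows how a gain that is larger for low-income households lowers the coefficient.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 means perfect equality)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or values[0] < 0 or total <= 0:
        raise ValueError("need non-negative incomes with a positive total")
    # Sorted-rank formula: G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n
    ranked_sum = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n


# Hypothetical incomes before and after a change that adds more to the
# three lower-income households than to the highest-income one.
before = [10, 20, 30, 140]
after = [30, 40, 40, 150]
inequality_fell = gini(after) < gini(before)
```

A value of 0 indicates that every household has the same income, while values closer to 1 indicate that income is concentrated in a few households.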

The third view is that digital technology will neither widen nor narrow income inequality among farmers. There are two possible explanations for this. One is that digital technologies will not significantly impact farmers’ income, so it doesn’t matter whether it widens the income gap among farmers or not. Some research has found that digital technology may not necessarily have a significant positive impact on farmers in developing countries (Molony, 2008; Futch and McIntosh, 2009; Fafchamps and Minten, 2012). Couture et al. (2021) performed a randomized controlled trial in eight counties in China to evaluate the effect of Alibaba’s Rural Taobao Program. The results showed that e-commerce expansion through the program had no significant impact on the income of average rural producers. The alternative explanation is that digital dividends are inclusive and equitable. Luo and Niu (2019) examined the role of e-commerce participation in household income growth, based on a survey of Taobao villages in China. They found that the benefits brought by e-commerce seem to be widely shared among participants in a fair way.

Overall, the impact of digital technologies on farmers’ income inequality is complex and needs to be determined according to the type of technology, the degree of technology application, and the stage of regional development. Existing literature has not yet studied the digital income gap from the perspective of big data development, nor has it paid attention to the phenomenon of Chinese e-commerce farmers using big data products to assist in operating online stores. Whether big data products expand or narrow the income inequality of farmers, and how big data products affect the income inequality of farmers, all require further research.

The relationship between big data development and farmers

Whether the development of big data analysis can bring new opportunities for developing countries to address the issue of backwardness among farmers is a very important question. Some scholars have discussed it from the perspective of agricultural big data. Theoretically, by capturing, mining, and analyzing massive amounts of data, farmers and their organizations can develop economic value and improve agricultural productivity (Waga and Rabah, 2014; Lokers et al., 2016). However, in practice, the effectiveness of agricultural big data is not ideal because it still faces several issues, such as security, accuracy, and access (Nandyala and Kim, 2016). Some scholars have pointed out that big data analysis will lead to a new digital divide between farmers in developed countries and those in developing countries. At present, agricultural big data in developing countries is small in scale and lacks diversity, the supply of effective infrastructure for collecting and analyzing big data is insufficient, and professional human resources are scarce (Sawant et al., 2016; Rodriguez et al., 2017). In addition, big data analysis will mainly benefit farmers with large-scale farms and high levels of education (Oluoch-Kosura, 2010).

There is also some research that focuses on the factors affecting farmers’ adoption of farm management information systems (Carrer et al., 2017; Pivoto et al., 2018), the willingness to share farm data with big data platforms (Turland and Slade, 2020), the use of remote sensing data to guide precision agriculture production on farms and its benefits (Coble et al., 2018; Toscano et al., 2019), and the key factors influencing farmers’ usage of agricultural big data services (Chen and Chen, 2019). However, this research does not target the group of e-commerce farmers.

In China, with the rapid development of rural e-commerce, many scholars pay attention to new phenomena and problems in this field. However, only a few institutions and scholars have paid attention to the phenomenon of e-commerce farmers using big data products. In May 2017, AliResearch released the “2016 Online Business Data Application Analysis Report”. The report shows that as annual sales increase, the proportion of online business operators using data tools also continues to rise (AliResearch, 2017). From April to July 2018, the research group of CARD Rural E-commerce Research Center of Zhejiang University conducted a questionnaire survey of 50 villages in China. The data show that about 28% of the sample farmers have used big data products (Guo et al., 2023). Scholars pointed out that the internet platform, by realizing the centralized release of information and the development of data products, brings a large amount of market information to rural residents and promotes the optimization of their production and operation (Li et al., 2024). Another study found that skill training and social communication played an important positive role in driving e-commerce farmers to use big data, and the use of big data significantly improved the income level of e-commerce farmers (Zeng et al., 2019).

Overall, the relevant literature on agricultural big data mainly discusses the impact of big data on farmers and the application of big data in developing countries from the macro level. Literature from the micro perspective of farmers is relatively rare, and survey-based and empirical analyses are especially scarce. The application of big data products based on e-commerce platforms by e-commerce farmers is a new practice in China, which has important demonstration value. The study of e-commerce big data products can provide a useful reference for other big data applications in agriculture and rural areas. Unfortunately, not many scholars have paid attention to this new phenomenon. There are at least two areas where the existing research can be expanded. First, the existing literature has not explored the impact of e-commerce big data products on farmers from the perspective of income inequality. Second, the existing literature has not examined the intrinsic attributes of big data products in depth, nor discussed the importance of product attributes for the effect of big data development.

Hypotheses

The impact of the use of big data products on income inequality among e-commerce farmers

In a general sense, big data products refer to all kinds of tools, platforms or services based on the use and application of big data technologies (Chen and Chen, 2019). By collecting, storing, processing, and analyzing massive amounts of data, they support the decision-making of businesses. In this paper, big data products specifically refer to a system developed by e-commerce platform enterprises that use big data tools to operate and manage e-commerce business (Zeng et al., 2024). Such a system can help online merchants better understand consumer needs, optimize marketing strategies, improve user experience and recommend products accurately, thereby increasing sales and customer satisfaction (Guo et al., 2023; Li et al., 2024).

Market transactions that occur in traditional societies are usually based on face-to-face interactions. Therefore, the ability of farmers to make business decisions requires long-term experience accumulation. The deeper the experience accumulated, the higher the accuracy of business decisions and the greater the income. In other words, the significant differences in experience accumulation largely lead to income disparities. In the era of big data, time no longer maintains the inherent order of events, but becomes non-serialized. The context of events can be segmented, the order of expansion can be disrupted, different events can be interwoven, and they can proceed simultaneously (Li et al., 2024). With the support of big data, the income of e-commerce farmers no longer strictly follows the experience accumulation mechanism over time. Low-income farmers can use big data to make up for their lack of experience and even achieve rapid catch-up in business. Taking the big data product “Business Advisor” developed by Alibaba as an example, it provides one-stop data product services for merchants and displays various data of store operation. The product includes real-time store data, real-time product rankings, industry rankings, store business overview, traffic analysis, product analysis, transaction analysis, service analysis, marketing analysis, and market trends. These auxiliary roles of big data products in business decision-making help narrow the gap in experience accumulation and human capital among e-commerce farmers. And low-income farmers have greater potential for income growth and higher marginal benefits, thus promoting income equality.

In the theory of farmers’ entrepreneurship, the gap in farmers’ income stems from the difference in farmers’ entrepreneurial alertness and dynamic capabilities. The former reflects the farmers’ ability to identify and seize entrepreneurial opportunities, while the latter reflects the farmers’ ability to transform entrepreneurial opportunities into real gain. Therefore, the mechanism of using big data products to alleviate the income inequality of e-commerce farmers can be divided into two parts: using big data products to narrow the entrepreneurial alertness gap and the dynamic capabilities gap.

Entrepreneurial alertness, also known as psychological alertness to entrepreneurial opportunity, significantly influences entrepreneurial behavior and performance (Ardichvili et al., 2003). Kirzner (1978) first defined the concept of entrepreneurial alertness as the ability of entrepreneurs to respond quickly to entrepreneurial opportunities that have previously been overlooked. Entrepreneurial alertness is not just a reaction to the outside world, but is always embedded in an entrepreneur’s ability (Valliered, 2013). In reality, not everyone is able to identify entrepreneurial opportunities and make profits, but individuals with higher levels of entrepreneurial alertness are more likely and quicker to spot new market opportunities and develop strategic plans to drive business growth (Gaglio and Katz, 2001). Empirical studies have shown that in order to achieve good entrepreneurial performance, entrepreneurs need to stay vigilant to business opportunities (Allinson et al., 2000; Ko and Butler, 2003).

Dynamic capability is an extension of the resource-based view. It emphasizes that enterprises should be able to respond keenly to changes in the external environment and reallocate resources in time, which has a very important positive impact on their business performance (Teece, 2018). The rapid development of digital technologies and changes in the economic situation have intensified the shift of the entrepreneurial environment from relatively stable and orderly to highly dynamic and complex. When environmental changes are highly discontinuous and require enterprises to develop multiple capabilities simultaneously, dynamic capability becomes a key capability for coping with volatile environments (Newbert, 2005). For a long time, the lack of effective identification, accurate perception and timely response to external market demand has been an important reason for the difficulty in increasing farmers’ business income. Farmers who rely on experience, imitation and luck to make business decisions are prone to make mistakes and take greater risks (Zeng et al., 2017). Therefore, effectively improving the dynamic capabilities of farmers is an important breakthrough in solving the problem of increasing farmers’ income.

Big data products help narrow the entrepreneurial alertness gap and dynamic capabilities gap among e-commerce farmers. The important logic of this lies in the information gap, more precisely, the gap between the sources of information and the access to it. Information is the premise of entrepreneurial decision-making. The lack of information sources and access has led to middle and lower-class e-commerce farmers being unable to identify entrepreneurial opportunities in a timely manner and constantly adjust in the implementation process. The theory of information poverty points out that the lack of organization-based information sources is an important cause of farmers’ information poverty (Yu, 2012). Among all kinds of information sources, the organization-based information source is one of the best. There is a close correlation between the use of organization-based information sources and people’s information wealth (Savolainen, 2007). In poor rural areas, information is blocked, and farmers lack understanding of external new technological achievements and are in a state of technical information poverty (Shen, 2013). Big data products, acting as an organization-based information source, help to alleviate farmers’ information poverty. This is because big data products developed by platform enterprises are a deep application of the vast amount of electronic transaction data precipitated by these enterprises. The scope of this information collection is extensive, the information is authentic and accurate, and it is obtained in a timely manner. Moreover, platform enterprises provide data services to e-commerce farmers in the form of marketing transactions, which means the information is highly organized. It offers a set of systematic data indicators with a high degree of visualization and intelligent decision-making capabilities.

On one hand, the use of big data helps to raise the entrepreneurial alertness of e-commerce farmers, especially those who lack information and keen intuition. In the pre-internet era, farmers’ business decisions were based on subjective feelings and accumulated experience, with defects such as localism, inertia, and roughness (Li et al., 2024). The entire chain from producers to consumers is long and involves multiple links, and the efficiency of information collection, processing, and transmission is very low, leading to serious information lag and distortion (Zeng et al., 2017). The explosive growth of data and the development of data analysis technologies have greatly improved people’s abilities to process and analyze information. Big data makes information acquisition for e-commerce farmers more timely, comprehensive and accurate, significantly reducing subjective biases. Big data products developed by e-commerce platform companies present rapidly changing data to help e-commerce farmers grasp online market dynamics, constantly stimulate their vision and cognition, and keep them highly alert to market changes (Guo et al., 2023). By continuously paying attention to market changes, e-commerce farmers will increase their probabilities of discovering new opportunities. Especially for those in remote areas with narrow social networks and limited sources of information, big data products have a greater marginal effect, significantly compensating for their disadvantages in identifying entrepreneurial opportunities. For example, ZZ, an e-commerce farmer in Shuyang County, Jiangsu Province, is engaged in dried flower making and online marketing, with an average income. One day, he discovered through a big data product that the sales of simulated peach trees were experiencing an unusual surge. Combining this with data analysis, he speculated that this might be due to the consumption boom driven by the popular TV series “Three Lives, Three Worlds, Ten Miles of Peach Blossoms”. He quickly realized that this was a new product opportunity. Consequently, his entire family began to make artificial peach trees and sell them online. Because their products were lifelike, the transaction volumes of their several family-operated online stores kept rising, and their income for that year reached 5 million yuan.

On the other hand, big data usage helps enhance the dynamic capabilities of e-commerce farmers, especially those with crude management and limited market analysis capabilities. The uncertainties faced by e-commerce farmers include uncertainties in offline production and supply, as well as uncertainties in the online market (Gao and Liu, 2020; Gao et al., 2024). Only by enhancing their dynamic capabilities can they deal with various uncertainties (Cai et al., 2023). Through the use of big data products, e-commerce farmers can obtain a series of optimal parameter combinations related to their own products, including color, weight, flavor, price, logistics, etc., in order to compare and find their disadvantages or deficiencies in operations, so as to make targeted improvements (Zeng et al., 2019). Moreover, artificial intelligence and algorithm models can be applied to assist farmers in analyzing and predicting market changes, thereby avoiding subjective biases in production decisions (Zeng et al., 2024). For e-commerce farmers who originally lagged behind in dynamic capabilities, the use of big data products is even more helpful, because big data products act as their super “military advisor” for business management, helping them catch up with other e-commerce farmers. For example, XBF is an e-commerce farmer in Jieyang City, Guangdong Province, who designs and sells jeans online. He specializes in adjusting his business strategy through big data products. For instance, he uses information about popular elements provided by big data products to design his own jeans, uses information about customer preferences to reduce his return rate and bad reviews, and uses information about the size of each platform’s user base to decide which platform to use as his main sales channel.

Based on the analysis above, this paper proposes the following hypothesis.

H1: The use of big data products can narrow the entrepreneurial alertness gap and the dynamic capabilities gap, thus reducing the income inequality of e-commerce farmers.

Big data product attributes and income inequality among e-commerce farmers

Product attributes are the inherent basic attributes and key features of a product. Essentially, a product is a collection of attributes (Kim and Chhajed, 2002), and the product presented to consumers is the comprehensive result of the interaction and mutual influence of its multidimensional attributes. The state of product attributes fundamentally determines the functional value of the product to the user. Since the use of big data products can alleviate income inequality among e-commerce farmers, the attributes of big data products determine the extent to which they can mitigate income inequality among e-commerce farmers.

In technology adoption theory, one of the most classic references to product attributes is the Technology Acceptance Model (TAM). One of the core ideas of TAM is that usefulness and ease of use reflect the adopters’ subjective psychological evaluations regarding the superiority and ease of use of the technology, respectively, and jointly determine their attitudes towards the new technology (Davis, 1989). TAM has been applied to research on technology acceptance and adoption behavior in various industries, including information technology management, online education, and e-commerce. Meanwhile, some scholars are attempting to extend TAM. They argue that usefulness and ease of use only reflect the product attributes at the purely physical and instrumental levels, but not the attributes at the level of emotional value. Therefore, they added new dimensions such as perceived risk, perceived interest, and compatibility (Elkins et al., 2013; Petersen and Kumar, 2015; Huang et al., 2017; Pan, 2017; He and Huang, 2020; Lutfi et al., 2022). We agree with expanding the TAM from a more comprehensive product attribute framework. Furthermore, we advocate incorporating experiential dimensions to encapsulate these new perspectives. The experiential dimension can systematically encompass the emotional aspects of products, especially in the fields of rural e-commerce and big data products. With the advent of the digital age, the experience economy has developed more rapidly. The younger generation, who grew up in the digital age, not only demands the usefulness and ease of use of products but also pays greater attention to the experiences that products bring them. In developing countries such as China, young people are the main force driving the development of rural e-commerce (Zeng et al., 2019), which determines that the big data products developed by e-commerce platform enterprises must be able to meet the needs of young people. In view of this, this paper defines the attributes of big data products as the usefulness, ease of use and experience of big data products psychologically perceived by online merchants. Specifically, the usefulness of big data products refers to online merchants’ psychological perception of how useful the big data products are in business decision-making. The ease of use of big data products refers to online merchants’ psychological perception of how easy the big data products are to use in business decision-making. And the experience of big data products refers to online merchants’ psychological perception of the degree to which big data products match aspects such as the risk of use, price level, pleasure of use, and user-friendly design. When e-commerce farmers perceive a higher level of usefulness, ease of use and experience of big data products, their willingness to use big data products will be stronger, which in turn helps to improve their entrepreneurial alertness and dynamic capabilities, and ultimately contributes to narrowing the income inequality among e-commerce farmers.

Based on the analysis above, this paper proposes the following hypothesis.

H2: The higher the levels of usefulness, ease of use, and experience of big data products, the lower the degree of income inequality among e-commerce farmers.

Methodology

Data

The data were obtained from a field household survey of e-commerce farmers in Zhejiang province, China, conducted by our research team from July to August 2022. The selection of e-commerce farmers in Zhejiang as the survey subjects was based on a consideration of both representativeness and feasibility. On the one hand, Zhejiang is at the forefront of the development of rural e-commerce in China. Zhejiang has the largest number of Taobao villages in China. Since this paper studies the impact of big data on income inequality among e-commerce farmers, by focusing on Zhejiang, we can identify e-commerce farmers who are able to apply data products at a low cost, and we can better observe the potential impacts of big data. Although there are some e-commerce farmers using big data products in other provinces, the investigation requires a lot of manpower and funds, which is beyond our capacity. We selected 15 typical Taobao villages in Zhejiang Province as our survey subjects. Specifically, the research group conducted a stratified screening based on the “List of Taobao Villages in China in 2014” released by AliResearch. According to the list, there were 62 Taobao villages in Zhejiang in 2014. Considering the prefecture-level cities where these Taobao villages are located, the levels of economic development, and the types of main products, we selected these 15 Taobao villages (a sample ratio of 25%) as the subjects of our survey. These villages have been awarded the title of “Chinese Taobao Village”, whose e-commerce started earlier, developed rapidly, and received widespread attention. These villages are distributed in Hangzhou, Ningbo, Jinhua, Lishui, Huzhou, Jiaxing, Shaoxing, Taizhou and Wenzhou, respectively producing and selling various types of products such as nuts, snacks, outdoor goods, furniture, tea, women’s wear, children’s wear, fur, down jackets, water heaters, shoes, socks, toys, carpet, and automotive supplies. 
Overall, these 15 Taobao villages are representative in terms of both regional distribution and product type. They are located in developed areas, moderately developed areas, and relatively underdeveloped areas, respectively. The types of products sold by these villages account for a major portion of the entire rural e-commerce market in China. Compared with e-commerce merchants scattered in ordinary villages, e-commerce farmers in Taobao villages have richer experience in online operation and a higher probability of using big data products, thus providing us with observable samples. In addition, e-commerce farmers in Taobao villages are highly concentrated in spatial distribution, which facilitates the investigation and allows us to obtain more samples within a limited time. We investigated 30 e-commerce farmers in each village using the incidental random sampling method, and after eliminating the questionnaires with many missing values, the final sample size was 418.

Empirical methods

The study examines the impact of using big data products on income inequality among e-commerce farmers, and uses the Kakwani individual relative deprivation index to measure income inequality among e-commerce farmers. Since the calculation results of the Kakwani individual relative deprivation index fall between 0 and 1, income inequality among e-commerce farmers is a censored dependent variable, which is suitable for regression estimation using the Tobit model. The Tobit model is a regression model specifically designed to handle censored dependent variables. Its core idea is that observations may be restricted to a certain range for some reason, so that values beyond this range are “censored” and recorded at the boundary. The traditional linear regression model cannot accurately describe the relationship for such censored dependent variables, but the Tobit model can effectively address this kind of issue. Therefore, the Tobit model has a wide range of applications in many academic fields. The specific model is constructed as follows:

$$\begin{array}{l}\mathit{Income}_{i}^{\ast }={\phi }_{0}+{\phi }_{1}\mathit{Bigdata}_{i}+{\phi }_{2}\mathit{Controls}_{i}+{\varepsilon }_{i}\\ \mathit{Income}_{i}=\left\{\begin{array}{lll}1 & \text{if} & \mathit{Income}_{i}^{\ast }\ge 1\\ \mathit{Income}_{i}^{\ast } & \text{if} & 0 < \mathit{Income}_{i}^{\ast } < 1\\ 0 & \text{if} & \mathit{Income}_{i}^{\ast }\le 0\end{array}\right.\end{array}\tag{1}$$

In Eq. (1), i represents individual e-commerce farmers, Income* represents the real (latent) income inequality of e-commerce farmers, Income represents the observed income inequality of e-commerce farmers, Bigdata is the core independent variable indicating whether or not e-commerce farmers use big data products, Controls denotes the control variables, ϕ0 is the constant term, ϕ1 and ϕ2 are the coefficients to be estimated, and ε is the random disturbance term. If the use of big data products by e-commerce farmers were random, the estimated coefficient ϕ1 would accurately reflect the effect of big data use. However, in reality, whether to use big data products is a subjective decision made by e-commerce farmers, not a random event. If Eq. (1) is directly estimated using the Tobit model without considering this potential selection process, the estimation results will be biased. This is because the potential selection process and the unobserved factors are correlated (Shaver, 1998), resulting in a correlation between ε and Bigdata. In other words, the use of big data products by e-commerce farmers becomes endogenous.
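The censoring rule in Eq. (1) implies three kinds of likelihood contributions: one for observations at the lower limit, one for observations at the upper limit, and one for interior observations. The following is a minimal sketch of that log-likelihood, assuming a normally distributed disturbance with standard deviation `sigma`; the function names and the `xb` argument (the linear index ϕ0 + ϕ1·Bigdata + ϕ2·Controls) are hypothetical and are not taken from the paper's estimation code.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf


def tobit_loglik_contribution(y, xb, sigma, lower=0.0, upper=1.0):
    """Log-likelihood contribution of one observation in a two-limit Tobit.

    y     : observed (censored) outcome
    xb    : linear index for this observation (assumed already computed)
    sigma : standard deviation of the disturbance, must be positive
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if y <= lower:
        # Latent value at or below the lower limit: P(y* <= lower)
        return math.log(_STD_NORMAL.cdf((lower - xb) / sigma))
    if y >= upper:
        # Latent value at or above the upper limit: P(y* >= upper)
        return math.log(1.0 - _STD_NORMAL.cdf((upper - xb) / sigma))
    # Interior observation: normal density of the scaled residual
    return math.log(_STD_NORMAL.pdf((y - xb) / sigma)) - math.log(sigma)


def tobit_loglik(ys, xbs, sigma):
    """Sample log-likelihood: the sum of the individual contributions."""
    return sum(tobit_loglik_contribution(y, xb, sigma) for y, xb in zip(ys, xbs))
```

Maximizing `tobit_loglik` over the coefficients and `sigma` would give the Tobit estimates; the sketch only evaluates the objective and leaves the optimizer out.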

To address the endogeneity of big data usage behavior, this paper introduces the entropy balancing method to preprocess the data, thereby controlling the estimation bias caused by self-selection to the greatest extent possible. The entropy balancing method was proposed by Hainmueller (2012). It presets a set of balance constraints and normalization constraints on the covariate moments and computes a set of optimal weights that satisfy these constraints, so that the reweighted control group matches the treatment group exactly on the specified moments. Because entropy balancing reweights observations rather than discarding unmatched ones, it retains the useful information of all samples. The standardized mean difference and mean difference tests of the balanced covariates are more robust, and the results are more reliable (Hainmueller and Xu, 2013).
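The reweighting idea can be illustrated with a small sketch. It is a simplified version under stated assumptions: only first moments (means) are balanced, base weights are uniform, covariates are scaled to the interval [0, 1] so that a fixed gradient step is safe, and the dual problem is solved by plain gradient descent. The function name `entropy_balance` and the toy data are hypothetical; this is not the routine used for the estimates in this paper.

```python
import math


def entropy_balance(controls, target_means, step=1.0, iters=5000, tol=1e-10):
    """Reweight control units so their weighted covariate means hit target_means.

    controls     : list of covariate vectors for control units (scaled to [0, 1])
    target_means : covariate means of the treatment group
    Returns weights that are positive and sum to one.
    """
    k = len(target_means)
    lam = [0.0] * k  # one multiplier per balance constraint
    weights = [1.0 / len(controls)] * len(controls)
    for _ in range(iters):
        # Weights are proportional to exp(lam . (x_i - m)) (exponential tilting)
        scores = [sum(l * (x - m) for l, x, m in zip(lam, row, target_means))
                  for row in controls]
        top = max(scores)  # subtract the max for numerical stability
        raw = [math.exp(s - top) for s in scores]
        total = sum(raw)
        weights = [r / total for r in raw]
        # Gradient of the dual objective: weighted control mean minus target
        grad = [sum(w * row[j] for w, row in zip(weights, controls)) - target_means[j]
                for j in range(k)]
        if max(abs(g) for g in grad) < tol:
            break
        lam = [l - step * g for l, g in zip(lam, grad)]
    return weights


# Toy example: two binary covariates for six control units
controls = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
target = [0.6, 0.7]  # hypothetical treatment-group means
w = entropy_balance(controls, target)
balanced = [sum(wi * row[j] for wi, row in zip(w, controls)) for j in range(2)]
```

After reweighting, `balanced` agrees with `target` up to the tolerance, which is the sense in which the two groups are exactly balanced on the constrained moments. Balancing variances and skewness as well would add the squared and cubed covariates as further columns.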

To examine the impact of big data product attributes on income inequality among e-commerce farmers, we will replace the core explanatory variable in Eq. (1) with the usefulness, ease of use, and experiential aspects of big data products, and then conduct Tobit regression estimation for each case after entropy balancing matching.

Variables

The dependent variable in this study is the income inequality of e-commerce farmers, measured using the Kakwani individual relative deprivation index. The Kakwani individual relative deprivation index allows inequality to be decomposed at the individual level and satisfies the properties of dimensionlessness and standardization (Kakwani, 1984). The calculated values of the Kakwani individual relative deprivation index range from 0 to 1. The higher the Kakwani index, the lower the relative income level of an e-commerce farmer, reflecting a greater degree of income disadvantage and deprivation. To calculate the Kakwani individual relative deprivation index for e-commerce farmers, it is necessary to measure their income levels. This study uses a scale-based approach to evaluate the income level of e-commerce farmers (see Table 1). Direct income measurement is not used because, in practice, farmers rarely know exactly what their income is, and direct income measures rely on farmers’ recall, which is inaccurate. Therefore, this paper adopts the psychological perception method to measure the income level of e-commerce farmers. Specifically, the income level of e-commerce farmers is measured using the average score of all measurement items, which ranges from 1 to 5. To improve reliability and validity, the content of the scale on e-commerce farmers’ income was adapted from the literature on farmers’ entrepreneurial performance. The Cronbach’s alpha coefficient of the e-commerce farmers’ income scale was 0.936, well above the 0.60 threshold, showing good reliability. The average variance extracted (AVE) was 0.715, above the 0.50 threshold, showing good convergent validity. The composite reliability (CR) was 0.938, well above the 0.70 threshold, showing good composite reliability.
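The index itself is not written out above. A minimal sketch follows, assuming the commonly cited form of Kakwani's individual relative deprivation measure: for each individual, the sum of the gaps to everyone with a higher score, divided by the group size times the group mean. The function name and the toy scores are hypothetical.

```python
def kakwani_deprivation(scores):
    """Individual relative deprivation for each member of a reference group.

    scores : positive income measures (here, scale scores between 1 and 5)
    Returns one value per individual, between 0 and 1.
    """
    n = len(scores)
    mean = sum(scores) / n
    # Only gaps to better-off individuals count; other gaps contribute zero
    return [sum(max(other - own, 0.0) for other in scores) / (n * mean)
            for own in scores]


# Toy group of three farmers with income scores 1, 3 and 5
rd = kakwani_deprivation([1.0, 3.0, 5.0])
```

In the toy group the farmer with the highest score has an index of 0 and the farmer with the lowest score has the largest index, which matches the interpretation that a higher index means greater relative deprivation.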

Table 1 The scale measurement items of the dependent variable and the mechanism variables.

The mechanism variables are entrepreneurial alertness inequality and dynamic capabilities inequality, which are also measured using the Kakwani individual relative deprivation index. Entrepreneurial alertness and dynamic capabilities are measured in the form of scales, as shown in Table 1, using the average scores of the corresponding items, ranging from 1 to 5. To improve reliability and validity, the content of the scales for entrepreneurial alertness and dynamic capabilities is adapted from existing literature. The Cronbach’s alpha coefficients of entrepreneurial alertness and dynamic capabilities are 0.895 and 0.907, respectively, well above the 0.60 threshold, showing good reliability. The AVE values are 0.743 and 0.651, respectively, above the 0.50 threshold, showing good convergent validity. The CR values are 0.896 and 0.912, respectively, well above the 0.70 threshold, showing good composite reliability.
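For readers who want to see how the three scale statistics are typically obtained, the sketch below uses their standard textbook definitions: Cronbach's alpha from item and total-score variances, AVE as the mean squared standardized loading, and CR from the loadings and their error variances. The function names and inputs are hypothetical, and the factor loadings are assumed to come from a separate confirmatory factor analysis that is not shown.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per scale item, all over the same respondents."""
    k = len(items)
    totals = [sum(resp) for resp in zip(*items)]  # total score per respondent
    item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1.0 - item_var / variance(totals))


def average_variance_extracted(loadings):
    """Mean of the squared standardized factor loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """(sum of loadings)^2 over itself plus the summed error variances."""
    explained = sum(loadings) ** 2
    error = sum(1.0 - l * l for l in loadings)
    return explained / (explained + error)
```

With three items that each load 0.8, for example, these definitions give an AVE of 0.64 and a CR of about 0.84, both above the thresholds of 0.50 and 0.70 used in this paper.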

When empirically testing the impact of big data product usage on income inequality among e-commerce farmers, the core explanatory variable is the usage behavior of e-commerce farmers on big data products, which is measured by “whether the e-commerce farmers used big data products in the past year”, so it is a binary variable. Specifically, we identify whether e-commerce farmers have used big data products in the past year by asking if they have paid to activate big data services. This is because e-commerce farmers can only access more practical data by paying to subscribe to a membership. Big data products based on e-commerce platforms are the products of the combination of e-commerce and big data analysis technology. At present, China’s big data products based on e-commerce platforms mainly include the “Business Advisor” developed by Alibaba Group and the “Jingdong Business Intelligence” developed by JD Group. Therefore, in the questionnaire, we focused on asking e-commerce farmers whether they used these two big data products.

When empirically testing the impact of big data product attributes on income inequality among e-commerce farmers, the core independent variables are the psychological perceptions of big data product attributes, which are examined in three dimensions: the usefulness of big data products, the ease of use of big data products, and the experience of big data products. The psychological perception of big data product attributes is measured in the form of a scale, as detailed in Table 2. The usefulness, ease of use, and experience of big data products are each measured by the mean score of the corresponding measurement items, with values ranging from 1 to 5. To improve reliability and validity, the content of the scales on big data product attributes was adapted from the existing literature. The Cronbach’s alpha coefficients of usefulness, ease of use, and experience of big data products are 0.944, 0.872, and 0.721, respectively, all above the 0.60 threshold, showing good reliability. The AVE values of usefulness, ease of use, and experience are 0.854, 0.694, and 0.694, respectively, all above the 0.50 threshold, showing good convergent validity. The CR values of usefulness, ease of use, and experience are 0.946, 0.872, and 0.851, respectively, all above the 0.70 threshold, showing good composite reliability.

Table 2 The scale measurement items for big data product attributes.

Discriminant validity refers to the degree of difference between the measurement items of different variables, reflecting how closely the measurement items relate to their own variables rather than to others. To test discriminant validity, the correlation coefficient matrix between the variables and the square root of the average variance extracted (AVE) for each variable are calculated first. If the square root of the AVE for a variable is greater than the correlation coefficients between it and the other variables, then the variable has good discriminant validity (Lin et al., 2015). The test results of discriminant validity are shown in Table 3. The square root of the AVE for each variable (the bold numbers on the diagonal in Table 3) is greater than the correlation coefficients between it and the other variables in all but one case (a correlation of 0.887), indicating that the measurement items for the variables generally have good discriminant validity.
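The comparison described above is mechanical and can be sketched in a few lines. The construct labels and numbers below are purely illustrative and are not the values in Table 3; the function name is hypothetical.

```python
import math


def discriminant_validity_violations(ave, correlations):
    """Return the construct pairs that fail the square-root-of-AVE comparison.

    ave          : dict mapping construct name to its AVE
    correlations : dict mapping (construct_a, construct_b) to their correlation
    """
    violations = []
    for (a, b), r in correlations.items():
        # The correlation must stay below the square root of both AVEs
        if abs(r) >= min(math.sqrt(ave[a]), math.sqrt(ave[b])):
            violations.append((a, b))
    return violations


# Illustrative inputs only
ave = {"A": 0.64, "B": 0.49, "C": 0.81}
correlations = {("A", "B"): 0.75, ("A", "C"): 0.50, ("B", "C"): 0.60}
flagged = discriminant_validity_violations(ave, correlations)
```

Here only the pair A and B is flagged, because their correlation of 0.75 exceeds the square root of B's AVE (0.70).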

Table 3 Square root of AVE values and correlation coefficients for variables.

In terms of control variables, this paper refers to relevant literature and includes variables such as gender, age, education level, whether a member of the Communist Party of China (CPC), business operation status, whether holding a registered trademark, years of e-commerce business, main e-commerce formats, and main product types as control variables (Zeng et al., 2018; Zeng et al., 2024). The definitions of the control variables are shown in Table 4. The main e-commerce business formats are divided into two categories: new e-commerce business formats and traditional e-commerce business formats. The former includes live e-commerce, social e-commerce and other new e-commerce business formats, while the latter refers to the traditional e-commerce business format based on static graphic display.

Table 4 Variable descriptions and descriptive statistics.

From the descriptive statistics in Table 4, we can see that the average Kakwani individual relative deprivation indices for the income, entrepreneurial alertness, and dynamic capabilities of e-commerce farmers are all around 0.12, indicating that the overall level of internal inequality among the surveyed e-commerce farmers is relatively low. About 49% of e-commerce farmers surveyed used big data products. The attribute scores of big data products indicate that the current big data products developed by e-commerce platform enterprises are at a moderate to high level. The overall situation is good, but there is still room for improvement. As for the attribute dimensions of big data products, usefulness has the highest score of nearly 4.0, followed by the score of ease of use, about 3.6, and the score of experience is only about 3.4. The surveyed e-commerce farmers are evenly distributed in terms of gender, and most of the surveyed e-commerce farmers are young. The average education level is primarily undergraduate and junior college, with about 20% being members of the CPC, approximately 26% having conducted entrepreneurial operations, and 36% having registered trademarks. Most e-commerce enterprises have been established for less than 2 years, the majority of which are new types of e-commerce, and they mainly sell non-agricultural products.

Results

The impact of big data products use on e-commerce farmers’ income inequality

Table 5 reports the Tobit regression results of the impact of e-commerce farmers’ use of big data products on their income inequality. It can be observed that the use of big data products has a significant negative effect on income inequality, entrepreneurial alertness inequality and dynamic capabilities inequality among e-commerce farmers at the 1% level. This indicates that big data products have a common prosperity effect that helps narrow the income gaps among e-commerce farmers. This common prosperity effect stems from the fact that big data products are more conducive to enhancing the entrepreneurial alertness and dynamic capabilities of e-commerce farmers who are in a disadvantaged position. This is a new mindset of using advanced digital technologies to narrow the implicit digital divide. On the one hand, certain digital technologies inevitably create digital divides within social groups. On the other hand, developing new technologies that cater to low-income groups can help them reap digital dividends faster and more fully. These technologies should effectively address the resource endowment disadvantages faced by low-income groups, especially in terms of information and human capital, thereby achieving significant income growth.

Table 5 Tobit regression results for the impact of big data products use on income inequality.

Since Tobit regression cannot effectively control the differences in covariates between the treatment group (e-commerce farmers using big data products) and the control group (e-commerce farmers not using big data products), the two groups of samples are further adjusted using the entropy balancing approach. This approach sets constraints on the first-order moments (mean), second-order moments (variance), and third-order moments (skewness) of the covariates and uses the automatically calculated optimal weights as balancing weights (Hainmueller, 2012). Thus, the two groups of samples are precisely matched under the constraints, and the selection bias of the samples is controlled to the greatest possible extent (Hainmueller and Xu, 2013). Table 6 reports the means and variances of the covariates before and after the entropy balancing treatment and the results of the matching test. Before matching, the means and variances of the covariates in the treatment and control groups differed significantly; after the entropy balancing treatment, these differences were substantially reduced. To further test the reliability of the entropy balancing results, the standardized mean differences (SMD) between the treatment and control groups were calculated, and t-tests were conducted. The results show that the SMD values between the two groups after treatment are all zero and the p-values of the t-tests for the mean differences of the covariates are all one, indicating that each covariate in the treatment group and the control group has been exactly matched (Zeng et al., 2024).
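The balance check can be sketched as follows. Several conventions exist for the denominator of the SMD; this sketch assumes one common choice, the standard deviation of the covariate in the treatment group, which may differ from the convention behind Table 6. The function names, data, and weights are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Mean of values under non-negative weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def standardized_mean_difference(treated, control, control_weights=None):
    """Treated mean minus (weighted) control mean, in treated-group SD units."""
    if control_weights is None:
        control_weights = [1.0] * len(control)  # unweighted comparison
    mean_t = sum(treated) / len(treated)
    var_t = sum((v - mean_t) ** 2 for v in treated) / (len(treated) - 1)
    mean_c = weighted_mean(control, control_weights)
    return (mean_t - mean_c) / math.sqrt(var_t)


# Hypothetical binary covariate
treated = [1.0, 1.0, 0.0, 1.0]
control = [0.0, 0.0, 1.0, 0.0]
smd_before = standardized_mean_difference(treated, control)
smd_after = standardized_mean_difference(treated, control, [1.0, 1.0, 9.0, 1.0])
```

In this toy case the unweighted SMD is 1.0, and weights that pull the control mean up to the treated mean drive it to zero, which is the pattern reported after entropy balancing.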

Table 6 Matching tests results for covariates after treatment.

Table 7 reports the Tobit regression results after entropy balancing matching. It can be seen that the regression results are robust after addressing the self-selection bias. As before, the use of big data products has a significant negative effect on the income inequality, the entrepreneurial alertness inequality, and the dynamic capabilities inequality of e-commerce farmers at the 1% level. Hypothesis H1 is thus verified. Comparing the estimated coefficients in Tables 5 and 7 shows that, due to self-selection bias, the unweighted Tobit regression slightly overestimates the impact of big data products on the income inequality of e-commerce farmers. Specifically, the estimated coefficient of the Tobit regression is −0.065, while the coefficient from the entropy balancing estimation is −0.056. The entropy balancing estimation shows that the Kakwani relative deprivation index of e-commerce farmers decreases by 0.056 on average after using big data products. In the surveyed sample, the average Kakwani relative deprivation index for e-commerce farmers who did not use big data products was 0.154. The value of 0.056 represents 36.36% of 0.154. It can be seen that the use of big data products plays a substantial role in reducing the income gap among e-commerce farmers.
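The two comparisons in this paragraph can be written as simple ratios. The first reproduces the 36.36% figure. The second is an illustrative calculation that is not reported in the tables; it expresses the gap between the unweighted Tobit coefficient and the entropy balancing coefficient relative to the latter, which suggests an overestimation of roughly 16% in absolute value:

$$\frac{0.056}{0.154}\approx 0.3636=36.36\%,\qquad \frac{0.065-0.056}{0.056}\approx 0.161$$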

Table 7 Entropy balancing estimation results for the impact of big data products use.

Moreover, after controlling for self-selection bias, the estimation results of the control variables are more reasonable. The empirical results show that among the control variables, education, whether the business is operated as an enterprise, and the type of main products have significant and stable effects on the income inequality of e-commerce farmers. As one of the most important ways to cultivate human capital, education has become a key factor affecting income inequality (Gustafsson and Li, 2002). Improving the education level of farmers is a strategic measure to narrow the income gap among farmers (Autor et al., 2003). Some e-commerce farmers will upgrade to enterprise operations after they have developed to a certain extent, which means that their operation scale has been expanded and a gap has emerged between them and other individual e-commerce farmers. From the perspective of industrial and commercial registration, enterprise operation means that farmers register as individual businesses or partnership firms. For e-commerce farmers, the benefits of realizing enterprise operations mainly include three aspects (Zeng et al., 2019). First, by entering more e-commerce platforms, e-commerce farmers can not only realize economies of scale in their online store operations but also further explore the market and increase sales. However, most e-commerce platforms have certain requirements for business registration. For example, registering on platforms such as Tmall and JD.com requires industrial and commercial registration as a company, while registering on the 1688 platform requires at least registration as an individual business. Second, it meets the needs of business development. Transitioning to enterprise operations is not only more attractive in terms of talent recruitment and attracting investment, but also enhances consumer trust and meets the need for invoice issuance. Third, it provides access to policy resources. 
In recent years, many local governments have introduced policies to support e-commerce. Individual businesses and partnership companies are more likely to obtain policy support. In terms of main product types, compared with non-agricultural products, agricultural products are more conducive to narrowing the income gap among e-commerce farmers. In the absence of e-commerce, farmers who specialize in industrial goods are generally richer than farmers who specialize in agriculture. When e-commerce develops, the farmers who are mainly engaged in industrial products are the first to benefit. This is because industrial products are more likely to meet the requirements of e-commerce in terms of standardization, storage, and transportation. Later, agricultural e-commerce also gradually developed. The latecomer advantage enables e-commerce farmers with main agricultural products to catch up quickly and narrow the income gap between them and other e-commerce farmers.

Although the entropy balancing method can help address the self-selection problem, there may still be omitted variable bias arising from questionnaire design, measurement methods, control variable selection, unobserved factors, model specification, and other sources. In this regard, we carried out a sensitivity analysis based on the method proposed by Cinelli and Hazlett (2020). As shown in Table 8, the \({R}_{D \sim W|X}^{2}\) of control variables such as gender, age and education is always less than the RV (0.164), and the \({R}_{Y \sim W|D,X}^{2}\) is always less than the \({R}_{Y \sim D|X}^{2}\) (0.031). This holds whether the strength of the unobservable variables, measured by the partial goodness-of-fit, is assumed to be equal to that of the control variables or whether their explanatory power for both the outcome and treatment variables is assumed to be double or triple that of the control variables. According to Nunn and Wantchekon (2011), this indicates that the model has effectively controlled the key control variables and that omitted variable bias is not significant. In other words, the estimated causal effect is insensitive to plausible omitted variables. This alleviates our concern about omitted variable bias.
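The robustness value (RV) can be recovered from the partial R² of the treatment with the outcome using the closed-form expression in Cinelli and Hazlett (2020). The sketch below assumes that the 0.031 reported above is that partial R² and that the RV refers to reducing the estimate all the way to zero (q = 1); the function name is hypothetical.

```python
import math


def robustness_value(partial_r2_yd, q=1.0):
    """Robustness value for reducing the point estimate by a fraction q.

    partial_r2_yd : partial R^2 of the treatment with the outcome, in [0, 1)
    """
    # Partial Cohen's f of the treatment, scaled by q
    f = q * math.sqrt(partial_r2_yd / (1.0 - partial_r2_yd))
    f2 = f * f
    return 0.5 * (math.sqrt(f2 * f2 + 4.0 * f2) - f2)


rv = robustness_value(0.031)
```

Under these assumptions the formula gives a value of about 0.164, consistent with the RV reported in Table 8: an unobserved confounder would need to explain roughly 16% of the residual variance of both the treatment and the outcome to bring the estimated effect to zero.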

Table 8 Sensitivity analysis based on Cinelli–Hazlett method.

The impact of big data products attributes on e-commerce farmers’ income inequality

Table 9 reports the regression results on the impact of the total attributes of big data products on income inequality among e-commerce farmers. The total attributes level of big data products is the sum of the scale scores of the three dimensional attributes of big data products. It can be observed that the total attributes of big data products are negatively correlated with income inequality, entrepreneurial alertness inequality, and dynamic capabilities inequality among e-commerce farmers. This indicates that the higher the level of big data product attributes, the more effective they are in narrowing the gaps in income, entrepreneurial alertness, and dynamic capabilities. This verifies that product attributes are an important perspective to examine the impact of big data products on income inequality among e-commerce farmers. However, the effect of the overall attribute is the result of the synthesis of the three-dimensional attributes, and the effect of each dimension attribute requires further analysis.

Table 9 Entropy balancing estimation results for the impact of total attributes of big data products.

Table 10 reports the regression results on the impact of the usefulness of big data products on income inequality among e-commerce farmers. It can be observed that the usefulness of big data products is negatively correlated with income inequality, entrepreneurial alertness inequality, and dynamic capabilities inequality among e-commerce farmers. This shows that the higher the level of usefulness of big data products, the more effective they are in reducing the gaps in income, entrepreneurial alertness, and dynamic capabilities. The usefulness of big data products is demonstrated by the improvements in information access, operational efficiency, and income growth among e-commerce farmers after using these products. When e-commerce farmers perceive a higher level of usefulness of big data products, their willingness and behavioral performance in using these products become more active. This, in turn, enhances their entrepreneurial alertness and dynamic capabilities, and ultimately helps improve their entrepreneurial performance.

Table 10 Entropy balancing estimation results for the impact of usefulness of big data products.

Table 11 reports the regression results for the impact of the ease of use of big data products on income inequality among e-commerce farmers. It can be seen that the ease of use of big data products is negatively correlated with income inequality, entrepreneurial alertness inequality, and dynamic capabilities inequality among e-commerce farmers. This indicates that enhancing the ease of use of big data products strengthens their effectiveness in narrowing the income gap, entrepreneurial alertness gap, and dynamic capabilities gap. During the process of using big data products, e-commerce farmers perceive the ease of accessing data indicators, the ease of understanding data indicators, and the ease of data analysis. When e-commerce farmers perceive a higher level of ease of use of big data products, they tend to adopt a more positive and open attitude toward using these products, which in turn enhances their entrepreneurial alertness and dynamic capabilities, and ultimately contributes to the improvement of entrepreneurial performance.

Table 11 Entropy balancing estimation results for the impact of ease of use of big data products.

Table 12 reports the regression results for the impact of the experience of big data products on income inequality among e-commerce farmers. It can be observed that the experience of big data products is negatively correlated with income inequality, entrepreneurial alertness inequality, and dynamic capabilities inequality among e-commerce farmers. This indicates that a higher level of experience quality of big data products enhances its effectiveness in narrowing the income gap, entrepreneurial alertness gap, and dynamic capabilities gap. In terms of experiential factors such as use risk, use cost, use interest and product humanization, the actual supply of big data products can meet the needs and preferences of e-commerce farmers, thereby generating perceived cooperation (Oh and Yoon, 2014), which will help promote the use behavior of e-commerce farmers on big data products. As a result, it can enhance entrepreneurial alertness and dynamic capabilities, ultimately contributing to the improvement of entrepreneurial performance.

Table 12 Entropy balancing estimation results for the impact of experience of big data products.

Based on the estimation results from Tables 10–12, hypothesis H2 is validated. Furthermore, by comparing the sizes of the estimated coefficients in Tables 10–12, it can be seen that the magnitude of the effect of experience quality is the largest among the three product attribute dimensions. Specifically, for every one unit increase in the experience of big data products, the Kakwani relative deprivation index of e-commerce farmers will decrease by 0.168. This suggests that in a digital age dominated by young people, the design of big data products should emphasize the importance of experience quality. It also demonstrates the necessity of extending the product attributes involved in the technology acceptance model from two to three dimensions.

Group regression and comparative analysis

Table 13 reports the estimation results of the regressions grouped by gender, age and education. The results show that the use of big data products has a significant impact on income inequality among male e-commerce farmers, but not among female e-commerce farmers. This may reflect the differences in e-commerce entrepreneurship between men and women in rural areas. In rural areas, entrepreneurs are predominantly male. Rural women are less likely to engage in entrepreneurship because of their responsibilities for household chores and child-rearing, as well as their limited human capital. However, rural women with less access to entrepreneurial opportunities do not show significant differences in entrepreneurial alertness and dynamic capabilities. Thus, the use of big data products does not play a significant role in narrowing the gap in women’s entrepreneurial alertness and dynamic capabilities. By contrast, there are a large number of male entrepreneurs in rural areas, and the internal differentiation among them can be significant. The role of big data products in narrowing inequality among men is therefore more prominent.

Table 13 Entropy balancing estimation results for group regression by gender, age and education.

The empirical results also show that within-group income inequality falls significantly after the use of big data products, both among e-commerce farmers younger than 30 and among those between 30 and 50 years old. A comparison of the estimated coefficients for the two groups shows that the gap-narrowing effect of big data products is stronger among young people than among middle-aged people. This reflects the younger generation’s readiness to understand and absorb new things in the digital age. The younger generation will be better positioned to benefit from the digital revolution.

In terms of educational attainment, the use of big data products has a highly significant impact on reducing income inequality among e-commerce farmers with a moderate level of education. However, the use of big data products has no significant impact on e-commerce farmers with either low or high levels of education. This may be because the gaps in entrepreneurial alertness and dynamic capabilities within the low-education group and within the high-education group are not significant, so the compensating effect of big data products is relatively limited.

In addition, the impact of the overall attributes of big data products and of the three attribute dimensions on income inequality is significant and does not differ among men, women, those under 30 years old, and those aged between 30 and 50. This shows that the importance of big data product attributes is beyond doubt. Improving the attributes of big data products is the foundation for leveraging their role, and this should become an important consensus.

Discussion

The theoretical significance of this study

This study is a proactive response to the issue of income distribution among farmers in the era of big data. Although there has been extensive research on the digital divide affecting farmers, previous studies have primarily focused on the development of mobile phones, computers, the internet, e-commerce and digital inclusive finance (Molony, 2008; Futch and McIntosh, 2009; Aker, 2010; Fafchamps and Minten, 2012; Luo and Niu, 2019; Couture et al., 2021; Zhang, 2022; Li et al., 2023; Hao and Zhang, 2024), without addressing the development of big data. The impact of digital technology on income inequality among farmers has been proven to be very complex, depending on various factors such as the type of digital technology. Although we cannot define the relationship between digital technology and income inequality among farmers in a consistent manner, we should at least keep pace with this evolving relationship. With the advancement of the new technological revolution, big data is gradually intersecting with farmers. In China, an increasing number of rural households are becoming professional e-commerce sellers. Big data products developed by e-commerce platform companies can assist e-commerce farmers in making more scientifically accurate decisions for online store operations. This cutting-edge phenomenon is rare in other countries, but it indicates future development trends. With the continuous development of rural e-commerce and the ongoing improvement of big data analysis technology, the application of big data products developed by e-commerce platforms will gradually enter an accelerated development stage, and the positive effects of big data use will become increasingly evident. The results of this study indicate that the use of big data products can significantly reduce income inequality among e-commerce farmers, playing a role in promoting common prosperity. 
In the era of big data, the operating income of e-commerce farmers no longer strictly follows an experience accumulation mechanism that evolves in tandem with time. With the help of big data, low-income farmers can bridge the gap in entrepreneurial alertness and dynamic capabilities, and obtain higher marginal returns. This finding provides valuable new perspectives and materials for understanding the digital divide among farmers. In terms of types of digital technology, some digital technologies have skill thresholds that may lead to the emergence of a digital divide, while others are inclusive in nature, such as digital inclusive finance and e-commerce big data products, which offer significant assistance to grassroots groups.

The research in this article also helps to expand the study of the relationship between big data development and farmers. The existing literature mainly discusses the possible impact of big data technology on farmers at the qualitative level (Waga and Rabah, 2014; Lokers et al., 2016), as well as the obstacles faced by farmers in developing countries in enjoying the benefits brought by big data (Nandyala and Kim, 2016; Sawant et al., 2016; Rodriguez et al., 2017). However, the existing literature focuses only on the large public agricultural databases developed by governments. It rarely covers the data products developed by platform enterprises, and it does not carry out quantitative research at the micro level of farmers. Although some Chinese scholars have begun to pay attention to e-commerce big data products (AliResearch, 2017; Zeng et al., 2019; Guo et al., 2023; Li et al., 2024), the research is still insufficient. This study demonstrates that the use of big data products affects income inequality among e-commerce farmers by affecting their entrepreneurial alertness and dynamic capabilities, establishing a connection between big data development and farmer entrepreneurship, and enriching our understanding of the impact mechanisms of big data development. Compared with public agricultural databases, these data products are mainly accumulated on digital platforms, with large amounts of data, fast updates, and the ability to reflect the situation of the commodity market. Platform companies make these big data more publicly available through market transactions. To better motivate users to use big data products, platform companies have to pay attention to the attributes of those products. Based on the technology acceptance model, this article constructs a product attribute framework with three dimensions: product usefulness, product ease of use, and product experience. 
This article argues that usefulness and ease of use are crucial product attributes, but that these two do not cover all aspects of product attributes. Even if a product is very useful to users and easy to operate, there is no guarantee that consumers will use it. This is because consumers also consider factors such as riskiness of use, price level, enjoyment, product humanization, human care, green health, etc. E-commerce farmers in particular differ from traditional farmers: they belong to the younger generation, focus more on holistic experiences, and are promoters of the experience economy. Therefore, this article introduces the dimension of experience, making the consideration of the product attributes of big data more comprehensive. Empirical results show that the marginal effect of the experience of big data products is the largest of the three dimensions. This indirectly supports the necessity of adding the experience dimension to the product attributes derived from the technology acceptance model.

The practical significance of this study

The findings of this study remind us to pay more attention to the leading basic data resources that have great potential value for improving people’s livelihood. How to efficiently integrate big data analysis technology and effectively utilize data resources to enhance the digital welfare of all farmers should be one of the most important issues for governments and academia. The government should fully recognize the huge potential economic value of big data, and promote the convenience, efficiency and legitimacy of data resource utilization through diversified and standardized mechanisms, so that the dividends of digital economic development can benefit more people. Although the empirical study in this paper takes e-commerce big data products developed by platform enterprises as an example, these products do not differ in essence from other databases. The logic by which data products bridge the capability gap, as revealed in this paper, is general and does not depend on the specific type of data product. Similarly, although this paper takes China as an example to carry out empirical research, the results are of general significance. In the theoretical analysis section, we did not derive the impact mechanism of big data products on income inequality among e-commerce farmers based on the context of China. Instead, we explained how big data products affect the entrepreneurial alertness gap and dynamic capabilities gap among e-commerce farmers within a general theoretical framework. The effects of the data elements discussed in this paper do not rest on any particularity of China. By taking China as an example, we aim to provide an early case for other developing countries, rather than highlighting China’s uniqueness. The logical relationships among these variables are not unique to China and are also applicable to e-commerce farmers in other developing countries. 
With the advent of the digital age, the role of data will continue to grow, and this paper provides empirical evidence from leading regions. As long as latecomer regions gradually adopt big data technologies, they can also reap data dividends. It is reasonable to believe that the increasing application of big data products will profoundly affect the entrepreneurial behavior and income distribution pattern of e-commerce farmers. In fact, whether in rural e-commerce in China or in similar practices in other developing countries, big data technologies have played a key role in driving farmers’ transition from traditional agriculture to the digital economy (Protopop and Shanoyan, 2016; Zeng et al., 2024). Therefore, the research conclusions of this paper are not only applicable to China, but also provide a theoretical basis and practical reference for other developing countries to alleviate farmers’ income inequality through technical progress in the context of the digital economy. However, we also have to note the limitations of generalizing from the Chinese case. While the mechanisms could theoretically be applied elsewhere, successful outcomes are likely to depend on local conditions, including the availability of digital infrastructure and the broader policy environment. The development of big data products in China’s e-commerce sector has been propelled by the rise of e-commerce. The remarkable success of e-commerce in China stems in large part from substantial government investments in digital infrastructure, platform ecosystems, logistics networks, and related fields. Equally critical has been the government’s open-minded approach. Premature and stringent regulation of emerging industries risks stifling innovation. China avoided this pitfall by adopting a clear strategy: first actively fostering e-commerce growth, then gradually introducing regulatory frameworks. This pragmatic, phased approach offers a valuable model for other developing nations to consider.

The empirical results of this paper also provide important references for platform enterprises to improve data products and for governments to develop public databases from the perspective of data product attributes. Platform companies need to continue to increase their product innovation to further enhance the usefulness, ease of use and user experience of big data products, especially the user experience. By moderately opening more free trial functions, the initial impression of e-commerce farmers on big data products will be enhanced, which will help to further improve the user experience of e-commerce farmers and enhance the driving effect on the use of big data. On the basis of realizing the importance of promoting the integration of data and farmers, the government should translate the awareness into action, that is, accelerate the construction of agricultural databases. In the process of promoting the construction of public databases, government departments should focus on enhancing the data product attributes of databases, including usefulness, ease of use, and user experience.

The research results of this paper also offer useful lessons for e-commerce farmers and for farmers in general. E-commerce farmers need to become more aware of big data products and learn to use them to enhance their entrepreneurial alertness and dynamic capabilities. Those who have not yet accessed and used big data products, in particular, should actively change their way of thinking and try to accept new things. Only in this way can they maintain creativity and competitiveness, keep pace with the times, and upgrade in a timely manner to avoid being eliminated.

Concluding remarks

Conclusions

How to effectively reduce the income gap among farmers has always been a topic of great concern to governments in developing countries. With the advent of the era of big data, the relationship between big data and income inequality among farmers is worth further research. In China, e-commerce platform companies represented by Alibaba and JD.com have developed big data products in the field of electronic transactions, providing data services for a large number of rural e-commerce entrepreneurs. This phenomenon provides an excellent opportunity to explore income inequality among farmers from the perspective of big data. Based on theoretical analysis, this paper proposes research hypotheses and employs the Tobit model and the entropy balancing method to empirically test the impact of the use of big data products and the attributes of big data products on income inequality among e-commerce farmers, using a survey sample of 418 e-commerce farmers in China. The study finds that the use of big data products significantly reduces the degree of income inequality among e-commerce farmers. This is because the use of big data products can reduce the entrepreneurial alertness gap and dynamic capabilities gap among e-commerce farmers. The results of the heterogeneity analysis show that the use of big data products mainly reduces income inequality in the male group and the group with a medium level of education, and that young e-commerce farmers obtain larger digital dividends than middle-aged e-commerce farmers. In addition, product attributes are a prerequisite for big data products to play a role, which is generally recognized by different groups of e-commerce farmers. Enhancing the attributes of big data products, such as usefulness, ease of use and experience, helps to promote the common prosperity of e-commerce farmers.

Policy implications

The research findings have important policy implications for governments in developing countries. First, strengthen government workers’ understanding of and emphasis on big data. It is suggested that the government strengthen the top-level design of big data, conduct training for government staff, and urge them to increase field research on platform enterprises. Second, create a superior institutional environment for the development of data products. Government departments should improve the legal and regulatory framework for data elements, establish data element trading markets, and guide platform companies to develop inclusive and beneficial data products. Third, provide farmers with opportunities to learn big data knowledge. It is suggested that the government take a variety of measures to improve farmers’ big data literacy, including strengthening the publicity of big data knowledge in rural areas, distributing free big data related books to farmers, encouraging social welfare personnel to carry out voluntary activities to serve farmers, providing big data professional training to farmers, and encouraging young people to provide digital intergenerational feedback to the elderly. Fourth, stimulate the willingness of farmers to participate in the construction of public databases. It is recommended that the government stimulate farmers’ extensive participation in the construction of public databases, promptly solicit the needs and opinions of the masses, and continuously promote the iterative upgrading of public databases from the perspective of user experience. Finally, it is suggested that the government strengthen cooperation with platform enterprises in the process of building public databases, draw on the first-mover advantages of platform enterprises in technology development and promotion experience, and reduce deviations in the construction of government public databases.

Limitations and prospects

There are some limitations to our study. First, the scope of the survey is limited to Zhejiang. Although Zhejiang is representative and cutting-edge in China, expanding the scope of the survey would help enhance the reliability of the study and enrich its analysis. In the future, we will carry out more surveys to further verify the conclusions of this paper and conduct more in-depth exploration. Second, this paper uses cross-sectional data. Due to constraints of manpower and funds, we conducted only a one-time questionnaire survey, and cross-sectional data cannot capture changes over time. In the future, follow-up surveys should be carried out as far as possible to obtain panel data. Third, the methodology needs to be improved. Although the entropy balancing method is used to reduce sample selection bias, it cannot rule out estimation bias arising from unobservable factors. Limited by data acquisition and variable design, we could not adopt more effective estimation methods such as instrumental variable models, treatment effect models and endogenous switching regression models. Finding reasonable instrumental variables to address the endogeneity problem more thoroughly and enhance the accuracy of the empirical results is an important direction for future research. Fourth, this paper does not focus on platform diversity. We did not examine the differences between the data products developed by Alibaba and JD.com. In addition, with the rapid rise of short video e-commerce and live streaming e-commerce, e-commerce farmers may also use information and data provided by other platforms, such as Douyin (TikTok), Little Red Book and Kuaishou. Therefore, we need to further broaden our research perspective. In the future, we could repeat the questionnaire survey on a larger scale and explore in depth the differences between the big data products of various platforms and their impact on the usage patterns of e-commerce farmers.
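The limitation concerning unobservable factors follows from how entropy balancing works: it reweights the comparison group so that chosen moments of observed covariates match those of the treated group, and it cannot balance variables that are absent from the data. The sketch below is a simplified illustration in Python, assuming a single covariate and only its mean as the balance condition. The function name and the sample values are hypothetical, and this is not the estimation code used in the paper.

```python
import math


def entropy_balance_weights(control_x, target_mean, tol=1e-10, max_iter=200):
    """Weights for control units whose weighted mean of x equals target_mean.

    With uniform base weights, the entropy-minimizing solution has the form
    w_i proportional to exp(lam * x_i); lam is found here by bisection.
    """
    if not (min(control_x) < target_mean < max(control_x)):
        raise ValueError("target mean must lie strictly inside the control range")

    def weights(lam):
        # Subtract the largest exponent before exponentiating, for stability.
        exps = [lam * x for x in control_x]
        top = max(exps)
        raw = [math.exp(e - top) for e in exps]
        total = sum(raw)
        return [r / total for r in raw]

    def weighted_mean(lam):
        return sum(w * x for w, x in zip(weights(lam), control_x))

    # The weighted mean increases with lam, so widen the bracket until it
    # contains the root, then bisect.
    lo, hi = -1.0, 1.0
    while weighted_mean(lo) > target_mean:
        lo *= 2
    while weighted_mean(hi) < target_mean:
        hi *= 2
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if weighted_mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return weights((lo + hi) / 2)


# Hypothetical example: years of schooling for non-users of big data products,
# reweighted to match an assumed mean of 11 years among users.
non_user_schooling = [6.0, 9.0, 9.0, 12.0, 15.0]
balanced_weights = entropy_balance_weights(non_user_schooling, 11.0)
```

After reweighting, the comparison group matches the treated group on the observed covariate by construction. Any confounder that was never measured is left unbalanced, which is why instrumental variable or endogenous switching approaches are named as future directions.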