Abstract
The ubiquitous monitoring and collection capabilities of the IoE, as well as its innovative scenarios, have led to changes in the content and type of personal data. Personal data sensitivity, as a standard for measuring privacy attitudes, can provide a reference for the design and improvement of privacy systems. This study aims to evaluate individuals’ personal data sensitivity in the IoE context, to better understand individuals’ current privacy attitudes. This study uses a questionnaire survey to study personal data sensitivity and the antecedents affecting personal data sensitivity among 1921 Chinese citizens. Research suggests that, within the spectrum of 41 personal data categories, identifiers such as ID numbers and home addresses are deemed highly sensitive. Furthermore, within the IoE context, emerging types of personal data, including behavioural and facial recognition data, also demonstrate significant sensitivity. With respect to sensitivity levels, personal data can be categorized into four tiers: very highly sensitive data, highly sensitive data, medium sensitive data, and low sensitive data. The study also finds that perceived privacy risks, privacy concerns, and social influences have a significant impact on personal data sensitivity, and there are differences in public perception of personal data sensitivity among different genders, ages, and educational levels.
Introduction
In the Internet of Everything (IoE) environment, the contents and types of personal data are expanding and transforming in unprecedented ways. The essence of IoE is “connecting everything” and “collecting all personal data”. The IoE is considered a huge, complex network ecosystem composed of objects, digital devices, digital individuals, digital enterprises, digital governments, data resources, and other elements connected by digital platforms and digital processes (Li 2020). Compared with the Internet of Things (IoT), which connects aspects such as sensors and devices, the IoE has a wider range of connected objects and can interact strongly with individual and social environments (Martino et al. 2017; Wang et al. 2023), translating the real world ubiquitously and holographically into a digital world. In particular, Everything to Person (E2P) service providers based on various scenarios such as personal work, travel, medical treatment, and entertainment scenes allow for ubiquitous data gathering and combination (Harari et al. 2016; Ioannou et al. 2020). These E2P service providers commit to analysing “who you are, what you are doing, what you think, and what you need” and creating more personalised and targeted user profiles or “digital identities” (Mathews-Hunt 2016), which can provide more customised and tailored services to expand their user base.
While the convenience of various services provided by IoE is built on the wide collection and utilisation of users’ personal data (Li et al. 2017), privacy and security risks become accordingly apparent once data breaches and abuse occur (Wang et al. 2023). Gradually, individuals value the importance of privacy due to mounting, serious privacy concerns and have become more prudent in their online adoption and data provision behaviour (Lyu et al. 2024; Ayaburi and Treku 2020; Hajli and Lin 2016). Individuals’ attitudes and behaviours towards data privacy depend not only on who collects information and why but also on what type of information is being treated as sensitive (Valdez and Ziefle 2018). Therefore, in the IoE environment, the new contents and types of personal data, the type of personal data considered sensitive, and the antecedents of sensitivity require further research. The question then arises: what personal data is deemed sensitive, and what determines the perception of this sensitivity?
To address these issues, regulatory bodies have initiated mechanisms for the classification of personal information. Regulations and laws such as the European General Data Protection Regulation (GDPR), Standards for Privacy of Individually Identifiable Health Information in the US, the Amended Act on the Protection of Personal Information of Japan, and the Personal Information Protection Law of China have been subsequently issued and perceive sensitive personal data as special categories of personal data that are directly related to the significant rights and interests of individuals. These regulations and laws suggest that the distinction between sensitive and general personal data may be on the table in future privacy legislation. However, as sensitive personal data involves the individual’s property rights, dignity, and freedom, the judgement on personal data sensitivity should not merely be determined by laws, regulations, or standards issued by government; the individual’s sensitivity towards their personal data of all types should be considered. Ohlhausen (2014) noted that public policy initiatives regarding privacy choices should incorporate feedback and attitudes from individuals themselves. Furthermore, “sensitivity” varies between individuals and is subjective, being based on individual psychological and cognitive characteristics. Demographic differences (Markos et al. 2017; Kang et al. 2022), perceived privacy risks (Robinson 2017), privacy concerns (Gopal et al. 2018), and social influence may lead to personal data being considered sensitive (Ohm 2014; Rumbold and Pierscionek 2018). These challenges have necessitated an understanding of users’ perspectives on personal data sensitivity, both within local contexts and on a global scale, providing a reference for the design and improvement of privacy systems.
Although recent studies have attempted to conduct such inquiries (Tao et al. 2024; Schomakers et al. 2019; Kang et al. 2022), previous privacy studies have focused more on traditional personal data, primarily addressing basic demographic information such as name and e-mail address, or financial information such as credit card details (Malhotra et al. 2004). Very few studies have focused on the IoE context. The intelligence of IoE technology is changing our lives; in particular, IoE technologies such as deep learning, artificial intelligence, and big data require the sharing of multiple new types of personal data (Wang et al. 2023), such as facial data, online behavioural data, or spiritual data. Research has shown that these new types of personal data pose greater and more insidious risks of privacy breaches, and users are more cautious about them (Wang et al. 2023; Farayola et al. 2024). Additionally, existing studies have primarily focused on user samples from the United States and Europe (e.g., Markos et al. 2017; Schomakers et al. 2019), lacking insights into many other countries and regions with rapidly growing IoE user bases. For instance, in 2023, the number of internet users in China increased to 1.079 billion, with an internet penetration rate of 76.4%, forming the world’s largest and most vibrant digital society (China Internet Development Research Institute 2023). Therefore, considering the IoE context and the global nature of data businesses, clarifying the new contents and types of personal data, and understanding Chinese users’ perceptions of sensitivity towards different personal data, is worthwhile for both regulatory authorities and digital service providers reliant on data flows.
Based on these myriad considerations, this study aims to clarify the content and type of personal data in the IoE environment and further investigate the sensitivity of personal data and the antecedents that contribute to the sensitivity through empirical analysis, attempting to provide a Chinese perspective on personal data sensitivity. The anticipated results of this study are expected to offer novel perspectives and insights in the global field of privacy protection, providing decision-makers with a scientific basis and strategies for managing personal data sensitivity. Additionally, this research will identify endogenous and exogenous factors of individuals’ perceived sensitivity to personal data, enabling data service providers in practical applications to better understand and respect user privacy needs, thereby expanding their user base and market influence (Lappeman et al. 2023).
Theoretical background and related work
The following section defines the concept of information privacy and highlights the importance of sensitivity of information for understanding privacy attitudes and behaviours. Additionally, the following section outlines individual and cultural influences on privacy perceptions.
The content and type of personal data in the IoE era
In the context of the IoE, the content and type of personal data are constantly expanding. Personal data includes raw machine data and metadata, as well as abstract personality characterisation data (Wiese et al. 2017). According to the security level, personal data is divided into general personal data and sensitive personal data. However, there are various opinions on what personal data actually includes, which personal data is sensitive, and what the structure of personal data should be, especially in the IoE context. Defining these terms is an important prerequisite for further analysing personal data security and privacy issues.
From the perspective of information science, personal data can be defined as data that describes the individual’s attributes, movement status, and relations. Personal data itself exists objectively and is increasingly being digitised and stored with the advancement of technology. To some extent, the aggregation of personal data can form a complete ‘digital individual’. This type of personal data is often characterised by the digitisation of personal characteristics and behaviours, meaning that the essence of personal data is ‘digital individual’ data. ‘Digital individual’ data mainly consists of data that characterises the natural and behavioural attributes of individuals.
The data that characterises individual natural attribute characteristics is mainly used to describe “who the person is”, including natural attribute data, spiritual attribute data, and social attribute data (Li, 2022). Natural attributes mainly describe the ‘person’ in the material world; spiritual attributes mainly describe the ‘person’ in the spiritual world; and social attributes mainly describe the ‘person’ in real society. Firstly, natural attribute data mainly describes the physiological characteristics, physical characteristics, and health status of individuals. In the traditional social environment, human natural attribute features are typically classified as private and are rarely collected digitally. However, in the IoE era based on artificial intelligence, individuals’ physiological and physical characteristics are increasingly being digitised. Personal data that has been or is currently being digitised include facial data, human body data, and fingerprint data. In a general sense, all physical human aspects that constitute the ‘physiological person’ have the tendency to be digitised. The development of facial recognition technology and human body recognition technology has enabled humans to perceive and collect more and more facial and human data. Secondly, spiritual attribute data refers to a person’s emotions, psychological status, and ideas. These data are to some extent the most difficult to monitor, but with the development of the IoE and artificial intelligence technology, some new media application platforms are attempting to scientifically monitor individual psychological and emotional data (Wang et al. 2023). Thirdly, social attribute data is increasingly being digitised, collected, stored, and shared on a large scale with the help of various digital application platforms. The social attributes of an individual mainly include personal identity data and relationship data. 
Identity data largely refers to data that can identify or represent identity, including real identity data and online identity data. The real identity data mainly manifests as basic human demographical descriptions, such as age, income, education, and employment. This data may reflect social status. Online identity data typically refers to data that can define and characterise an individual’s identity in a virtual society. The digital form mainly includes login accounts and passwords on various application platforms, as well as network information (IP addresses) bound to online identities. In addition, the social attribute of an individual lies in their social relationships. Differences between individuals can be expressed due to the different social groups and social statuses they belong to. In the IoE era, relationship data can be presented through various social network materials, including friend relationships and communication relationships.
The data that characterises personal behavioural attributes mainly include two aspects: one is the digital record of real social behaviour and the other is the online social behaviour data of individuals as internet users (Li 2022). With the development of the IoE, real space and virtual space are merging with each other, and the boundary between personal real behaviour and online behaviour is gradually blurring, making distinctions between these spaces increasingly difficult. Thus, online social behaviour data can reflect real social behaviour, including individual behaviour data on various online digital application platforms. In the IoE era, the application models of intelligent new media mainly include the artificial intelligence and information acquisition application model; artificial intelligence and e-commerce application model; artificial intelligence and communication and interaction application model; intelligent life and entertainment application model; and smart city and intelligent government application models. Correspondingly, the types of online social behaviour data of individuals include personal information acquisition behaviour data, business behaviour data, life and entertainment behaviour data, and administrative data.
“Privacy datification” and “Data Privacyization” in the IoE era
Privacy is a multifaceted and nebulous concept (Wang et al. 2024), encompassing dimensions such as freedom of thought, control over one’s body, solitude at home, control over personal information, freedom from surveillance, protection of personal reputation, and protection from searches and interrogations (Solove, 2024). Consequently, definitions of privacy generally revolve around the key concepts of freedom, control, and self-determination.
The rapid proliferation of IoE technologies and their widespread adoption has enveloped our lives with a plethora of smart devices, which continuously gather our personal data (Wang et al. 2023), encompassing location information, behavioural patterns, and health status. These data have become commodified, utilized for delivering personalized services, refining product functionalities, and executing precise marketing strategies. However, this practice has also precipitated the risk of privacy breaches, posing a significant threat to individual freedom and dignity (Wang et al. 2023).
In response, the concept of privacy has become a widespread public sentiment, leading to the further development of privacy notions. Concepts such as information privacy (Solove and Schwartz 2020), data privacy (Farayola et al. 2024), smart privacy (Meg 2015), and integrated privacy (Gu 2020) have been introduced.
The history of privacy shows a close relationship with technological advancement; modern information technology has shifted the focus of privacy from being a personal domain free from interference to being about control over personal data. Privacy now exhibits two new characteristics: “privacy datification” and “data privacyization” (Mai 2016). The connotation of privacy has moved from “tolerance” to “sharing,” so addressing the issue of “privacy” is essentially about how individuals disclose and share their personal data to others, and to what extent.
How to measure personal data privacy: personal data sensitivity
Personal data sensitivity serves as a crucial contextual factor, influencing security and privacy concerns as well as individual behaviour. It shapes how individuals disclose and share personal data with others, and also impacts individuals’ willingness to protect their privacy (Kim et al. 2019; Schomakers et al. 2022).
At present, a clear definition of personal data sensitivity is not present in the literature. Personal data sensitivity is typically understood through ideas of public expectation and legal interpretation. It is generally believed that personal data sensitivity is an attribute of personal data (Dinev et al. 2013) which refers to the level of concern individuals feel about providing a certain type of data in a specific situation (Weible 1993); potential psychological, physiological, and material losses (Mothersbaugh et al. 2012); personal perception or evaluation of data value (Wacks 1989); and the negative impact of a data breach (Bansal and Gefen 2010).
The concept of personal data sensitivity has long been at the core of data protection frameworks. Countries generally attach importance to the protection of sensitive personal data and have made different legal interpretations. As early as 1970, the concept of sensitive data appeared in the Personal Information Protection Act of Hesse, Germany. German scholars defined it as information with highly personal attributes, important for identifying an individual, and with the risk of causing harm or discrimination. Afterwards, various laws defined the scope of sensitive personal data types through enumeration. Article 6 of the 1981 Council of Europe Personal Data Convention stipulates that personal data related to race, political views, religion or other beliefs, health, or criminal convictions shall not be automatically processed unless appropriate safeguards are provided by domestic law (Wong 2007). Subsequent legislation, such as the GDPR, Brazil’s General Data Protection Act, the amendments to the California Privacy Rights Act of 2020 (CPRA), and the Personal Information Protection Law of China in 2021, has added new categories such as ‘ethnic origin’, ‘philosophical beliefs’, ‘trade union membership’, ‘the processing of genetic data, biometric data’, and ‘sexual orientation’ with an open scope. Overall, national laws list sensitive personal data but do not define it, and there are differences in the specific types covered.
From a diachronic user perspective, personal identification information is typically perceived as highly sensitive, but this cognition is influenced by the usage environment and evolves with technological advancements. For instance, studies in 1999 (Ackerman et al. 1999), 2004 (Malhotra et al. 2004), and 2007 (Hui et al. 2007) often categorized passwords, financial account numbers, and ID card numbers as sensitive information, while personal preferences such as television show preferences were considered the least sensitive. However, more recent studies from 2017 to 2024 indicate that users are concerned about personal preference data, such as shopping behavior, which they believe could reveal psychological aspects of their privacy (Milne et al. 2017; Markos et al. 2018; Schomakers et al. 2019; Kang et al. 2022; Tao et al. 2024). In conclusion, personal data sensitivity is a subjective perception that stems from the environmental characteristics of an individual at a particular time, and its significance is contingent upon the type of data and individual differences.
The antecedents of personal data sensitivity and research hypotheses
Privacy behavior is affected both by endogenous motivations (for instance, subjective preferences) and exogenous factors (for instance, changes in user interfaces) (Acquisti et al. 2015). Similarly, personal data sensitivity is also affected both by endogenous motivations and exogenous factors (Bansal and Gefen 2010; Dinev et al. 2015; Martino et al. 2017). Endogenous motivations refer to subjective characteristics such as cognitive level and personal experience, while exogenous factors describe the specific environment that an individual relies on, such as their social network relationships (Martin and Zimmermann 2024). This study mainly focuses on endogenous factors (demographic differences, privacy experiences, perceived privacy risks, and privacy concerns) and one exogenous factor (social influence), which are considered important predictive factors in the data sensitivity and privacy literature.
Privacy experience
When faced with similar situations, individuals’ attitudes and values may differ due to differences in previous experiences (Bansal and Gefen 2010; Xu et al. 2011; Wang et al. 2019; Su et al. 2018). Privacy experience is defined as an individual’s experience of privacy infringement. Previous studies have found that previous privacy experiences influence personal data sensitivity (Tao et al. 2024), and considering an individual’s privacy experiences can better explain attitudes and behaviours related to privacy (Xu et al. 2011). For example, some scholars have explored the potential for privacy experiences to break the implicit “social contract” formed between users and personal data service providers (Culnan 2000); additionally, the violation of the “social contract” can cause users to worry about their privacy and the security of their personal data (Pedersen 1982; Pavlou and Gefen 2005), thereby heightening their sensitivity to personal data. Research has shown that the privacy experiences of others can also affect the sensitivity of personal data. For example, mounting facial data breach incidents have led to an increase in individuals’ sensitivity to facial data (Sepas-Moghaddam et al. 2019; Ghaffary 2019; Mehmood and Selwal 2020; Nandakumar and Jain 2015). If individuals experience privacy breaches, or hear of or are exposed to the potential abuse of personal data collected from the internet, they tend to believe that they will also become victims of privacy violations. This belief causes individuals to become more sensitive to relevant personal data. Therefore, we propose the following hypothesis:
H1. Privacy experience positively affects personal data sensitivity.
Perceived privacy risks
The perceived privacy risk mainly derives from individuals’ anxiety regarding potential harm and feelings of being offended rather than real data abuse or economic and reputational losses (Martin et al. 2017). Research has shown that perceived privacy risks can trigger high privacy concerns among individuals, which in turn increases sensitivity to personal data (Martino et al. 2017; Schomakers et al. 2019). Previous studies have confirmed that, in personal data sharing, the greater the perceived risk to individuals the more sensitive the data is considered to be (Mählmann et al. 2017; Milne et al. 2017; Phelps et al. 2000). For example, due to the stronger predictive power of genetic data compared to health data, individuals have a greater perception of privacy risks regarding genetic data and therefore believe that genetic data is more personalised and sensitive. Based on this discrepancy, we believe that the greater an individual’s perception of privacy risks to personal data, the more sensitive the data will be considered. Accordingly, we hypothesise the following:
H2. Perceived privacy risk positively affects personal data sensitivity.
Privacy concern
Privacy concern refers to individuals’ concerns about the potential loss of privacy caused by the disclosure of personal data (Asplund and Nadjm-Tehrani 2016; Ioannou et al. 2020), including unauthorised secondary use and access, information error, breach, and abuse (Smith et al. 1996; Ozturk et al. 2017; Asplund and Nadjm-Tehrani, 2016), and is considered a major influencing factor in personal data sensitivity. Previous studies have shown that when individuals feel threatened and need to protect their privacy, their concern for privacy appears to trigger higher data sensitivity perceptions. Individuals have high concerns about the future use of shared health data, as they are concerned that as increasing amounts of health data are obtained, the data may be used for purposes other than those originally described, making individuals more sensitive to health data (Aitken et al. 2016). In addition, Milne et al. (2017) also confirmed that individuals are concerned about the social risks that may arise from the leakage of their home address and may classify their home address as sensitive. In short, individuals with strong privacy concerns place more emphasis on protecting their information privacy space and are more sensitive to personal data. Accordingly, we hypothesise the following:
H3. Privacy concern has a positive impact on personal data sensitivity.
Demographic differences
Individual differences affect privacy concerns and perceived risks, thereby affecting individuals’ level of privacy sensitivity (Tao et al. 2024; Martino et al. 2017; Schomakers et al. 2019). Previous studies have indicated that age can influence individuals’ tolerance of or thresholds for privacy threats in online environments (Goldfarb and Tucker 2012; DeSilver 2013). Older individuals may develop higher personal data sensitivity in cases where they feel threatened and need to preserve their personal privacy space. Additionally, Wang et al. (2020) claim that women are more concerned than men about the collection of private information, resulting in a higher level of data sensitivity. Moreover, individuals with higher education levels are more protective of their privacy space as they are more sensitive to and cautious with personal data processes and perceive these processes as privacy intrusions. In view of this, we propose the following hypotheses:
H4a. Age positively affects personal data sensitivity.
H4b. Compared to men, women have higher personal data sensitivity.
H4c. Education level positively affects personal data sensitivity.
Social influence
According to behavioural science theory, social influence refers to an individual changing their thoughts, emotions, and behaviours due to the influence of others in their social network (Singh et al. 2024). Social influence is considered one of the key factors in privacy behaviour and attitude research (Mishra et al. 2023; Mendel and Toch 2017; Cheung et al. 2015). The huge and complex network formed by the IoE has greatly changed the methods and scope of personal data processing. Individuals feel anxious and uncertain when their personal data are collected, used, and disseminated. They make judgements by referring to the opinions of members of their social networks, thus creating a bandwagon effect. Many studies have found that social influences have a significant impact on people’s privacy sensitivity, as people typically adjust their attitudes, beliefs, and behavioural patterns in response to their social networks (Singh et al. 2024). Based on the review of relevant literature in sociology and psychology, some scholars believe that social influence mainly consists of two types of influences. The first type of influence involves subjective norms formed by internalising external expert information into one’s own cognitive beliefs through the mechanism of descriptive norms (imitation); Venkatesh et al. (2012) argue that individuals are influenced by society, including family, friends, and experts, which can greatly influence their perceived risk and uncertainty. The second type of influence involves individual image and identity confirmation. For example, individuals who are more concerned about others’ ideas and follow trends find it easier to generate “public conformity” to obtain more positive reviews and maintain their personal image (Singh et al. 2024; Mendel and Toch 2017). Many studies have also confirmed that social influence has a positive impact on privacy sensitivity (Youn and Shin 2019).
Therefore, this study uses subjective norms and individual image to measure social influence and makes the following hypotheses:
H5a. Subjective norms positively affect personal data sensitivity.
H5b. Individual image positively affects personal data sensitivity.
Research methods
The selection of personal data in the IoE era
The measurement of personal data sensitivity has a predetermined premise: the connotation and structure of personal data. Consequently, this study firstly establishes a comprehensive list of personal data and then analyses the public perception of personal data sensitivity. In the IoE era, the data of “digital individuals” mainly consists of data that characterises individual natural attributes and data that characterises individual behavioural attributes (Li 2022). The data that characterises individual natural attributes mainly includes natural attribute data, spiritual attribute data, and social attribute data. The data that characterises personal behavioural attributes includes personal information acquisition behaviour data, business behaviour data, intelligent life and entertainment behaviour data, and administrative data. Therefore, this study categorises personal data in the IoE era into seven subcategories.
The selection of personal data was based on previous literature (Martino et al. 2017; Schomakers et al. 2019; Rumbold et al. 2018; Goodman and Flaxman 2017; Turn 1976; Winegar and Sunstein 2019; Tao et al. 2024) and sensitive data categories listed in data protection regulations such as the GDPR, those of EU countries, the United States, China, and the United Kingdom. Through brainstorming discussions with 17 colleagues, a comprehensive list of 41 personal data items was eventually developed (see Fig. 1). The specific process is as follows:
Fig. 1 Perceived sensitivity of all 41 personal data types.
We conducted three rounds of brainstorming, an open decision-making approach in which a group of people work on a problem together and express their own views without objection or criticism. Hosts and recorders controlled and documented the brainstorming process. During the first round of brainstorming, all members identified the types of personal data and discussed the classification of personal data in the IoE according to their real-life experiences. During the second round of brainstorming, we listed 74 information types on paper cards, according to the prior literature. The participants discussed, evaluated, and improved ideas to achieve continuous creative collaboration. Some personal data was dropped, such as pet names and zip codes. Some personal data was added, such as ‘smart home data’, which is largely collected in the current IoE era. The first and second rounds of brainstorming were held over one day. After one week, we conducted the third round of brainstorming for further adoption and affirmation. After each round of brainstorming, a summary group led by a professor and supplemented by a moderator and recorder organised the discussion results, summarised and evaluated each group’s discussion, and finalised the chosen list of items (Fig. 1 and Appendix 1).
Data collection and procedure
This study adopts a questionnaire survey to collect the required data because of its efficiency in collecting data on a large scale, which is suitable for revealing general patterns of perception. Compared to interviews and experimental methods, questionnaire surveys are also more economical in terms of cost, time, and resources, and they do not impose a significant burden on respondents. This approach to data collection is common for information sensitivity studies and has been proven to be effective (Milne et al. 2017; Tao et al. 2024). Regarding ethical considerations, all procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. To preserve confidentiality and anonymity, the online survey did not collect any identifiable information (e.g., name, personal email, telephone number, or ID).
To ensure good reliability and validity of the questionnaire, 96 samples were selected for pre-testing from November 10th to 17th, 2022 to meet the needs of scale revision. The overall reliability of the questionnaire is high (Cronbach’s α = 0.893). In the pre-testing phase, we prioritized respondent feedback, refining questionnaire ambiguities. Notably, we added clarifications for terms like “DNA” and “voiceprint” to facilitate accurate responses. For assessing privacy concerns, we crafted nine non-redundant items spanning information collection, error/breach risks, unauthorized access, and control issues, ensuring precise measurement of each concern.
The formal survey period was from November 25th to December 10th, 2022. Data were collected through an online questionnaire survey, and the participants were recruited through a commercial online access panel hosted by Beijing Dataway Horizon Co., Ltd (an online user panel of 3 million Chinese adult users) (Xu et al. 2023). Stringent quality control measures were implemented, assigning each respondent a unique access link that could only be used for a single response. IP address and cookie tracking were employed to avoid duplicate entries. Additionally, a minimum completion time of 7 minutes was enforced for each questionnaire to ensure thorough responses.
We utilize the most recent census data and the 2019 1‰ Population Census as a benchmark to calculate quotas for each demographic group (National Bureau of Statistics of China 2020): in terms of gender, the proportion of males is 51.24%, and that of females is 48.76% (gender quotas allow for a ± 2% deviation). In terms of age, the proportions are as follows: 18–24 years old (9%), 25–34 years old (24%), 35–44 years old (20%), 45–54 years old (22%), and 55 years old and above (25%) (age quotas allow for a ± 5% deviation). All respondents are financially incentivized.
We determine the sample size using the statistical formula n = Z² × σ² / d². Given the sensitivity of the data, and to better differentiate data sensitivity, we calculate the sample size at a 99% confidence level (Z = 2.58, σ = 0.5, d = 3%), resulting in a required sample size of 1849. This study collects a total of 2337 survey questionnaires. After excluding questionnaires with conflicting or identical answers, we obtain 1921 valid questionnaires, with a validity rate of approximately 82.20%, meeting the data requirements.
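The sample size arithmetic above can be checked with a short sketch. The function names below are illustrative and not part of the study. The reported figure of 1849 corresponds to Z ≈ 2.58, the commonly used rounded two-sided critical value at the 99% confidence level, together with σ = 0.5 and d = 0.03.

```python
import math


def required_sample_size(z: float, sigma: float, d: float) -> int:
    """Smallest whole n satisfying n >= Z^2 * sigma^2 / d^2."""
    n = (z ** 2) * (sigma ** 2) / (d ** 2)
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(n, 6))


def validity_rate(valid: int, collected: int) -> float:
    """Share of collected questionnaires that were retained, in percent."""
    return round(valid / collected * 100, 2)


if __name__ == "__main__":
    # Parameter values taken from the text (Z rounded to 2.58).
    print(required_sample_size(z=2.58, sigma=0.5, d=0.03))
    # 1921 valid questionnaires out of 2337 collected.
    print(validity_rate(valid=1921, collected=2337))
```

With these inputs the sketch reproduces the required sample size of 1849 and a validity rate of 82.2%.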
Variable measurement
The questionnaire was divided into four parts. The first part measured the privacy experience. The first question asked the following: “Have you ever had personal data leaked in your daily life and work (such as phone numbers, online browsing records, etc.), resulting in frequent receiving of various spam advertisements, harassing phone calls, etc.?” (1 denoted “Yes” and 2 denoted “No”). The second part measured the sensitivity of 41 personal data items. This study follows the method used by previous scholars to measure sensitivity (Schomakers et al. 2019), requiring respondents to evaluate personal data from very insensitive to very sensitive with a score of 1 to 10, respectively. The second question asked the following: “If you are required to provide the following personal data items, how SENSITIVE would you consider this personal data?” (1 denoted “not sensitive” and 10 denoted “very sensitive”). The third part mainly measured the antecedents of sensitivity; this section used a five-point Likert scale (1 denoted “strongly disagree”, 5 denoted “strongly agree”). The perceived privacy risk scale was derived from the study by Zlatolas et al. (2015) to measure overall risk perception related to personal privacy. The measurements of privacy concern were adopted from Korzaan and Boswell (2008), measuring the level of personal attention to privacy when personal data is collected and used. The measurements of social influence were adapted from Venkatesh and Davis (2000), measuring the degree to which individuals are influenced by subjective norms and individual image when perceiving personal data sensitivity. The fourth part mainly investigated the demographic information of the respondents, including six items: gender, age, occupation, monthly income and education level. We have listed the variable scales in Appendix 2.
Sample
The sample quotas are generally aligned with China’s demographic information (National Bureau of Statistics of China 2020). As shown in Table 1, the gender ratio among these 1921 respondents is relatively balanced, with 984 male respondents accounting for 51.2% and 937 female respondents accounting for 48.8%; the age distribution is concentrated between 18 and 34 years old, with a total of 906 respondents accounting for 47.16%. Under the new technological background of the IoE, individuals in this age group are the main active group participating in online behaviour. Regarding education level, the majority of respondents have a high school, vocational college, or undergraduate education, accounting for 20.2%, 26.6%, and 42%, respectively. In addition, 60.54% of respondents have an average monthly income exceeding 5000 RMB.
Data analysis and results
Validity and reliability of constructs
This study employs SPSS 26.0 for descriptive analysis, confirmatory factor analysis, and hypothesis testing. We adopt Cronbach’s α coefficient for reliability analysis of the scale: Cronbach’s α is 0.889 for privacy concerns, 0.737 for social influence, 0.896 for perceived privacy risk, and 0.966 for the 41 personal data items. All values are greater than 0.7, indicating that the scale in this study has good reliability.
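For readers who wish to reproduce the reliability check outside SPSS, Cronbach’s α can be computed directly from its standard definition, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The following is a minimal sketch; the function name and the toy responses are hypothetical and are not the survey data.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """Cronbach's alpha; responses[i][j] is respondent i's score on item j."""
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical five-point Likert answers from five respondents to three items.
    toy = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 3, 2], [1, 2, 1]]
    print(round(cronbach_alpha(toy), 3))
```

Values above 0.7 are conventionally read as acceptable reliability, which is the threshold applied in this section.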
The scales used in this study are all derived from previous mature scales and modified appropriately based on the research context, thus ensuring high content validity of the scales. When conducting construct validity analysis, an exploratory factor analysis (EFA) is performed to test the scale. We select principal axis factoring with Promax rotation and the Kaiser criterion for our analysis. This approach is preferred over principal component analysis as it allows for the extraction of concise factors aligned with current research (Brown 2009a). The choice of Promax rotation, an oblique method, is based on the assumption of significant factor intercorrelation (Brown 2009b). The application of the Kaiser criterion, a standard practice in EFA (Brown 2009c), further supports our methodological decisions. The resultant EFA demonstrates a strong fit, indicated by a KMO value of 0.726 and a chi-square value significant at the 0.05 level (chi-square = 87604.082, Sig < 0.001) (Hair et al. 2010).
Descriptive results on personal data sensitivity in China
The average sensitivity ratings for the 41 personal data types are displayed in Fig. 1, sorted in descending order from the most to the least sensitive. From the Chinese perspective, ID number is the most sensitive data type, with a very high perceived sensitivity (M = 7.65), followed closely by mobile phone number (M = 7.48). The least sensitive items are religion (M = 4.95) and sleep quality (M = 4.93). The average sensitivity of the 41 personal data items is 6.1, and the maximum is 7.65 (below 8), indicating that the Chinese participants generally demonstrated low sensitivity to personal data.
The descriptive analysis results (Table 2) show that participants considered data that characterises individual behavioural attributes (M = 6.21) more sensitive than data that characterises individual natural attributes (M = 5.96). More specifically, participants believe that social attribute data (M = 6.77) and administrative data (M = 6.59) are more sensitive, while spiritual attribute data (M = 5.07) and information acquisition behaviour data (M = 5.78) are not very sensitive.
Cluster analysis
For each of the 41 personal data items, we first calculate the average value (S) of personal data sensitivity. Second, we develop an indicator of personal data sensitivity (PDS) with an average value between 0 and 100, SI = (10 − S)/10 × 100 (according to Markos et al. 2017). In this study, we employ the k-means clustering method and achieve comparable clustering results using hierarchical cluster methods. The four clusters are labelled as very highly, highly, medium, and low sensitive data (see Fig. 2). Three personal data items are perceived as very highly sensitive (M = 7.5, SD = 2.2), 11 as highly sensitive (M = 6.74, SD = 1.94), 20 as medium sensitive (M = 6.1, SD = 1.87), and the remaining 7 as low sensitive (M = 5.1, SD = 2.4) (see Table 3).
Fig. 2 Clustering of the 41 personal data items.
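The clustering step can be illustrated with a minimal one-dimensional k-means (Lloyd’s algorithm) sketch. This is not the SPSS procedure used in the study. The function names are hypothetical, the index function simply restates the formula as written in the text, and the input is limited to the five item means reported in this article, so the sketch uses k = 3 rather than the four clusters obtained from all 41 items.

```python
def sensitivity_index(mean_score: float) -> float:
    """Index as written in the text: SI = (10 - S) / 10 * 100."""
    return (10 - mean_score) / 10 * 100


def kmeans_1d(values: list[float], k: int, max_iter: int = 100):
    """Lloyd's algorithm on scalar values; requires k >= 2.

    Returns (labels, centroids). Labels follow the ascending order of the
    initial centroids, so label 0 starts as the lowest cluster.
    """
    lo, hi = min(values), max(values)
    # Deterministic start: centroids evenly spaced over the data range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels: list[int] = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# Item means reported in the text (a small subset of the 41 items).
ITEM_MEANS = {
    "ID number": 7.65,
    "mobile phone number": 7.48,
    "smart speaker recordings": 6.45,
    "religion": 4.95,
    "sleep quality": 4.93,
}

if __name__ == "__main__":
    labels, centroids = kmeans_1d(list(ITEM_MEANS.values()), k=3)
    for (item, mean), label in zip(ITEM_MEANS.items(), labels):
        print(item, mean, round(sensitivity_index(mean), 1), label)
```

Because the index is a linear transformation of S, clustering on S or on the index yields the same groups.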
Hypothesis testing
We initially employ multiple linear regression in SPSS 26.0 software to examine the relationships between personal data sensitivity and multiple independent variables (privacy experience, privacy concerns, social influence, perceived privacy risks, age, gender, and education level).
Subsequently, we categorize personal data sensitivity into four levels (very high sensitivity, high sensitivity, medium sensitivity, and low sensitivity), and use multinomial logistic regression within SPSS 26.0 to assess the impact of the independent variables on the four clusters. Table 4 reports the regression results.
Privacy experience has no impact on any personal data clusters. Privacy concern and perceived privacy risks have a strong positive impact on all data types, indicating that people who value privacy more believe that most personal data is more sensitive. Subjective norm has a significant impact on all data types except low sensitive data. This suggests that for data that is less sensitive, individuals’ privacy decisions are more likely to be based on personal values and experiences rather than societal expectations. Individual image has a significant impact on medium and low sensitive data, but not on very highly and highly sensitive data. This is likely because for data that is highly sensitive, individuals’ privacy decisions are more influenced by intrinsic values and perceptions of privacy risks.
Additionally, age, gender, and educational level also affect individuals’ perceptions of the sensitivity of data across different levels of sensitivity. Age shows an impact on very highly, highly, and low sensitive data, suggesting that as individuals age, they may become more cautious about privacy. Gender also influences perceived sensitivity: male respondents found medium and low sensitive data to be more sensitive than female respondents did. Education level does not significantly predict perceived sensitivity across all clusters, but those with higher education levels found medium and low sensitive data to be more sensitive than respondents with lower education levels, indicating that individuals with more education are more aware of the potential risks associated with all types of personal data, not just the most sensitive ones.
Discussion
Key findings of personal data sensitivity
This study presents a spectrum of overall sensitivity regarding personal data in the IoE era from the Chinese perspective, categorizing it into four clusters: very highly, highly, medium, and low sensitive data. Each cluster has its own subtleties, with the very highly sensitive data being irrevocable once compromised, the highly sensitive data posing risks of identity theft, the medium sensitive data potentially leading to discrimination, and the low sensitive data varying in sensitivity based on cultural and social contexts. This clustering of personal data into sensitivity levels underscores the varying degrees of concern that Chinese respondents have for different types of personal information in the IoE era (Wang et al. 2024; Kang et al. 2022).
We also find that Chinese respondents appeared to believe that the sensitivities of ID number, mobile phone number, and home address were far higher than religious belief, sleep quality, sexual orientation, and weight, which is consistent with existing research conclusions (Schomakers et al. 2019; Kang et al. 2022). These findings indicate that personal identification information is the most sensitive (Schomakers et al. 2019). Personally identifiable information (PII) refers to information that can be identified or located as belonging to an individual when used alone or in conjunction with other relevant data. If PII contains an attribute that can uniquely identify an individual, this attribute is a unique identifier, such as the national ID number, which is generally considered highly sensitive.
This study also reveals some notable differences. Individuals have shown increased sensitivity towards their home addresses, contrasting with previous beliefs that home addresses were not considered sensitive. This shift in perception may stem from new concerns about personal safety. In contemporary times, more people, particularly women, are living alone than ever before. The advancement of GPS technology now allows for real-time and precise location tracking, enabling easy and direct finding of individuals in the physical world, posing a highly relevant physical risk and instilling a significant sense of insecurity regarding personal safety, thus increasing the caution with which individuals disclose their home addresses (Milne et al. 2017). The study also finds that income levels are perceived as less sensitive in the Chinese cultural context, differing from Western cultural perspectives. In Chinese society, discussions about personal income are common and are typically not considered sensitive data (TC260 2021).
Furthermore, the study notes the heightened sensitivity of new personal data in the IoE environment. The continuous evolution of IoE technology is transforming modern life (Wang et al. 2023), particularly with the widespread adoption of smart home devices that enable the collection and analysis of a vast amount of personal data (Li 2022; Schomakers et al. 2022). Due to the innovative nature of these application scenarios, research on personal data collected by smart products is still lacking. We include these new personal data in our study and find that individuals exhibit a certain degree of sensitivity to them, such as recordings of home conversations by smart speakers (M = 6.45), which is significantly higher than the average (M = 6.1). This reveals public concerns about technology’s deep integration into daily life and the fear of potential misuse of personal privacy (Singh et al. 2024). It reflects the evolving expectations of individuals regarding privacy and their acceptance of data use, as well as the increasing demand for the protection of personal data as technology advances.
Sensitivity ranking of seven types of personal data
This study’s results rank the seven types of personal data by sensitivity, from high to low, as follows: social attribute data (M = 6.78), administrative data (M = 6.59), life and entertainment behaviour data (M = 6.31), business behaviour data (M = 6.15), natural attribute data (M = 6.03), information acquisition behaviour data (M = 5.78), and spiritual attribute data (M = 5.07).
The social attributes of an individual mainly include their real identity data, real relationship data, online identity data, and online relationship data. With the help of various digital application platforms, individuals’ social attribute characteristics continue to be digitally collected, stored, and shared on a large scale. Social attribute data can centrally highlight the status of a person’s social capital and resources, which is to some extent a very important aspect of personal privacy (Li 2022). In addition, with the deepening of smart city and smart government application models, a large amount of personal data is recorded on various government application platforms. Administrative data generally involves personal privacy and has high matching, such as credit data and social security data. If these data are leaked, they can pose a potential threat to personal property or reputation. Behavioural data refers to the operational records and behavioural data of individuals on various online digital application platforms, including life and entertainment behavioural data, business behavioural data, and information acquisition behavioural data. By collecting behavioural data, companies can clearly describe user profiles, provide more relevant information, and provide personalised and customised products and services to meet individual needs and interests; however, these targeted solutions have raised privacy concerns as individuals feel their privacy is violated (Ioannou et al. 2020; Dwivedi et al. 2021; Mathews-Hunt 2016). Biometric data refers to the digital representation of physical features that identify individuals in the IoE environment (Ioannou et al. 2020; Wang et al. 2023). This type of personal data may be more sensitive because of characteristics such as uniqueness, identification, replicability, irreversibility of damage, and relevance of information (Wang et al. 2023; Morosan 2019). 
The sensitivity of spiritual attribute data is relatively low, as digital technology-oriented media accelerates the flow of human emotions and extends the real boundaries of human empathy (Wu and Li 2021). Social media has become a visible space for people to present, record, and share personal, emotional content in their daily lives. As a result, people are less sensitive to data related to spiritual attributes and show a more open and inclusive mind (Li 2022).
The antecedents of personal data sensitivity
This study also explored the antecedents of personal data sensitivity. The results show that perceived privacy risks and privacy concerns significantly affect the level of personal data sensitivity. These results enrich existing privacy literature, confirming that personal characteristics are an important predictor of personal data sensitivity (Ioannou et al. 2020; Baker-Eveleth et al. 2022; Benamati et al. 2017). However, privacy experiences do not significantly affect personal data sensitivity. In the context of numerous privacy incidents, complex provisions (Hargittai and Marwick 2016), a combination of difficult-to-manage technical affordances, network privacy, and constantly changing settings, individuals may express feelings of cynicism (Lyu et al. 2024) or apathy (Hargittai and Marwick 2016) towards privacy concerns, making them more insensitive (Moritz et al. 2021).
At the same time, individual sensitivity is to some extent influenced by demographic characteristics. Through our regression analysis, we demonstrate that age affects personal data sensitivity. As may be expected, older respondents presented a higher perceived sensitivity, especially regarding very highly sensitive data and highly sensitive data. Previous studies have also found that older individuals have a greater risk of information overload and stronger personal data sensitivity (Markos et al. 2017). We also find that, compared to female respondents, male respondents consider medium and low sensitive data to be more sensitive. Medium sensitive data contains more personal data of behavioural types, and females share their personal data more on some shopping and entertainment platforms to obtain customised services or consumption opportunities, which may reduce their sensitivity to medium sensitive data. In addition, educational level can explain some individual differences (Schomakers et al. 2019; Markos et al. 2017). For example, individuals with high education levels generally use the internet for a long time and are exposed to a large amount of information. They have a relatively mature and comprehensive understanding of personal data privacy (DeSilver 2013).
Furthermore, this study confirms that social influence significantly affects personal data sensitivity. We find that the public’s personal data sensitivity is usually affected by their social network. The development of IoE technology and social media has greatly changed the way in which individuals receive information, making individuals consciously or unconsciously form judgements based on the opinions of the majority (Singh et al. 2024). Especially regarding opaque personal data processing on platforms, individuals usually lack sufficient knowledge and equivalent information (Wang et al. 2023), and usually combine the statements of people around them and expert suggestions with their own privacy cognition systems to form judgements about personal data sensitivity. Therefore, in practice, social influence can be actively utilised to improve individuals’ personal data sensitivity and stimulate personal privacy protection behaviours.
Theoretical contributions
This study focuses on the perception of personal data sensitivity among Chinese users in the IoE era, and further explores the differential effects of some antecedents. It is also one of the first studies to extend the perception of user personal data sensitivity in the Asian IoE context.
Firstly, exploring the perception of personal data sensitivity among IoE users requires a comprehensive understanding of the antecedents of endogenous and exogenous factors. This study finds that users’ endogenous factors (age, gender, privacy concerns, perceived privacy risks) and exogenous social influences significantly affect the sensitivity of personal data. On the one hand, endogenous factors significantly influence the sensitivity of personal data. On the other hand, personal cognition is more susceptible to exogenous social influence. Individuals tend to show public consistency in their attitudes towards certain issues due to the expectations of friends, family, relatives, and society, deeply accepting such attitudes. This change is rooted in personal beliefs and values (Singh et al. 2024). Particularly in the context of Chinese collectivism, an important feature of “relationships” is that they are guided and regulated by “public integration” and social pressure (He et al. 2022), thus affecting the perception of personal data sensitivity. These results enrich existing privacy literature, and future research will explore the differential impact of personal and extrinsic factors on the privacy-related perceptions of different users.
Secondly, as in previous studies, personal data privacy is not a single-dimensional concept but encompasses multiple layers and dimensions (Wang et al. 2024; Solove and Schwartz 2020). This study’s findings suggest that categorizing personal data items into low, medium, high, and very high privacy segments is more appropriate, which aligns with the existing literature that highlights the need for a more nuanced approach to data privacy (Tao et al. 2024; Schomakers et al. 2019; Markos et al. 2017). By revealing that different types of personal data may be influenced by different factors, such as perceived privacy risks and concerns, this study deepens the understanding of how personal data is perceived and valued by individuals. Our findings emphasize that in the IoE environment, research on personal data privacy should be more detailed and specific, considering the specific personal data items and their privacy segments, which helps to better understand the complexity and diversity of personal privacy.
Practical implications
We construct a ranking of personal data sensitivity. The ranking can help data governance bodies and data service providers better understand personal data sensitivity and may assist decision-making and privacy protection.
For data governance, the development of privacy policies is a multidimensional decision-making process involving risk assessment, public awareness, cultural differences, legal frameworks, and technological development. Firstly, the development of privacy policies should be based on a thorough assessment of the potential risks of personal data, while also considering the public’s understanding and acceptance of privacy protection (Alemany et al. 2022; Kang et al. 2022). For instance, our study indicates that home addresses are often misclassified in terms of sensitivity, which underscores the need for policymakers to stay attuned to changing public perceptions and update privacy definitions accordingly (TC260 2021). Additionally, privacy policies must be tailored to the cultural and social context. In China, for instance, the public may have lower sensitivity to religion, sexual orientation, and political relationship information compared to other cultures.
Overall, data governance institutions need to establish a flexible mechanism to quickly respond to new privacy risks and challenges. At the same time, the regulatory system should maintain stability and predictability but also be subject to regular review and updates to reflect technological and market changes. For this purpose, regulatory bodies should work closely with industries, academia, and research institutions to better understand technological trends and potential risks. Additionally, data governance institutions must be acutely aware of dynamic changes in social consciousness and assess hidden risks through public opinion surveys and social media analysis. Through this comprehensive and dynamic approach, privacy policies can protect personal privacy while promoting social stability and public security.
For data service providers, this ranking can be used in several ways. Firstly, the ranking can help clarify the varying degrees of sensitivity among different datasets, enabling providers to pinpoint which types of data require more stringent privacy measures (Lappeman et al. 2023). For the most sensitive data, providers should enforce strict prior consent rules, while less sensitive data can be managed with opt-out options. Secondly, the ranking serves as a valuable tool for cross-border data service providers, helping them to reassess and adjust their personal data management policies to comply with local regulations and user expectations in different countries (Kang et al. 2022).
Moreover, with the rapid development of IoE technology, data service providers should pay special attention to managing the collection of data, entertainment data, and lifestyle behavioural data, because the large-scale collection of these data poses a threat to personal privacy (Li 2022). Providers should ensure that the collection, transmission, storage, and use of these data follow appropriate privacy protection measures, especially in smart application scenarios (Kang et al. 2022). Finally, data service providers should establish a dynamic privacy protection strategy and regularly review and update privacy policies to ensure consistency with the latest privacy protection standards and best practices. Through these specific measures, data service providers can not only comply with personal data protection regulations but also foster user trust, gaining an advantage in the competitive market (Lappeman et al. 2023).
Conclusions and limitations
The sensitivity of personal data serves as a critical indicator for measuring public privacy perceptions, providing important guidance for the construction and improvement of privacy protection systems. In this study, we clarify the new contents and characteristics of personal data in the IoE environment and assess users’ perceptions of personal data sensitivity to gain a deeper insight into the public’s current privacy concepts. The results indicate that among the 41 types of personal data, there are significant differences in sensitivity. Sexual orientation and religion exhibit relatively low sensitivity, while information directly related to personal identification, such as ID numbers and addresses, is highly sensitive. Additionally, the study finds that in the IoE environment, new types of personal data, such as behavioural and facial data, also exhibit high sensitivity, reflecting the increasing public concern about the privacy challenges brought about by technological development.
Furthermore, personal data can be categorized into four clusters of sensitivity: very highly sensitive data, highly sensitive data, medium sensitive data, and low sensitive data. The study also reveals that users’ perceptions of personal data sensitivity in the IoE environment are influenced by intrinsic factors (perceived privacy risks, privacy concerns, gender, age, and educational level) and extrinsic factors (social influence). These findings emphasize the need for more detailed and specific research on personal data privacy in the IoE context, considering specific personal data items and their respective privacy segments, which can help to better capture the complexity and diversity of personal data privacy.
Finally, this study is subject to certain limitations that require further investigation. Firstly, the age and education level of the respondents in this study were widely distributed. Future research can focus on specific groups, such as teenagers, middle-aged people, and elderly people, and can also add cultural, geographical, political, and other factors to enrich the heterogeneity of the study (Krasnova et al. 2012). Secondly, this study focused on identifying the perceived sensitivity of different personal data types without considering contexts. In the current complex interactive environment of the IoE, individuals cannot know how personal data is transmitted, and the context in which data is used can change without individuals being made aware of the change. In addition, previous research has confirmed that risk assessment is at the core of perceived sensitivity (Milne et al. 2017; Schomakers et al. 2019). Therefore, we do not consider the context, which is suitable in the current IoE environment. However, many scholars still believe that privacy cognition and behaviour are highly dependent on context (Rohunen et al. 2018; Kokolakis 2017). For example, platform type, as an important contextual factor, has been shown to have a significant impact on users’ privacy perception and disclosure (Tang and Lin 2017; Yu et al. 2020). Therefore, it is worth exploring whether individuals have different sensitivities to different personal data, taking into account factors such as the purpose of the information processor and the processing context.
Data availability
The datasets are available from the corresponding author Meng Wang on reasonable request.
References
Ackerman MS, Cranor LF, Reagle J (1999) Privacy in e-commerce: examining user scenarios and privacy preferences. In Proceedings of the 1st ACM Conference on Electronic Commerce (pp. 1-8)
Acquisti A, Brandimarte L, Loewenstein G (2015) Privacy and human behavior in the age of information. Science 347(6221):509–514. https://doi.org/10.1126/science.aaa1465
Aitken M, de St Jorre J, Pagliari C et al. (2016) Public responses to the sharing and linkage of health data for research purposes: a systematic review and thematic synthesis of qualitative studies. BMC Med Ethics 17(1):1–24
Alemany J, Val ED, García-Fornes A (2022) A review of privacy decision-making mechanisms in online social networks. ACM Comput Surv (CSUR) 55(2):1–32
Asplund M, Nadjm-Tehrani S (2016) Attitudes and perceptions of IOT security in critical societal services. IEEE Access 4:2130–2138
Ayaburi EW, Treku DN (2020) Effect of penitence on social media trust and privacy concerns: The case of Facebook. Int J Inf Manag 50:171–181
Baker-Eveleth L, Stone R, Eveleth D (2022) Understanding social media users’ privacy-protection behaviors. Inf Comput Secur 30(3):324–345
Bansal G, Gefen D (2010) The impact of personal dispositions on information sensitivity, privacy concern and trust in disclosing health information online. Decis Support Syst 49(2):138–150
Benamati JH, Ozdemir ZD, Smith HJ (2017) An empirical test of an Antecedents-Privacy Concerns-Outcomes model. J Inf Sci 43(5):583–600. https://doi.org/10.1177/0165551516653590
Brown JD (2009a) Principal components analysis and exploratory factor analysis - Definitions, differences, and choices. Shiken: JALT Test Evaluation SIG Newsl 13(1):26–30
Brown JD (2009b) Choosing the right type of rotation in PCA and EFA. Shiken: JALT Test Evaluation SIG Newsl 13(1):20–25
Brown JD (2009c) Choosing the right number of components or factors in PCA and EFA. Shiken: JALT Test Evaluation SIG Newsl 13(1):19–23
Cheung C, Lee ZWY, Chan TKH (2015) Self-disclosure in social networking sites: the role of perceived cost, perceived benefits and social influence. Internet Res 25(2):279–299
China Internet Development Research Institute (2023) World Internet Development Report. Beijing: Commercial Press
Culnan MJ (2000) Protecting privacy online: Is self-regulation working? J Public Policy Mark 19(1):20–26
DeSilver D (2013) “Young Americans and Privacy: It’s Complicated,” Pew Research Center (June 20), http://www.pewresearch.org/fact-tank/2013/06/20/young-americans-and-privacy-its-complicated
Dinev T, Xu H, Smith JH et al. (2013) Information privacy and correlates: an empirical attempt to bridge and distinguish privacy-related concepts. Eur J Inf Syst 22(3):295–316
Dinev T, McConnell AR, Smith HJ (2015) Informing privacy research through information systems, psychology, and behavioral economics: Thinking outside the “APCO” box. Inf Syst Res 26(4):639–655. https://doi.org/10.1287/isre.2015.0600
Dwivedi YK, Hughes L, Ismagilova E et al. (2021) Artificial Intelligence (AI): Multidisciplinary perspectives on emerging challenges, opportunities, and agenda for research, practice and policy. Int J Inf Manag 57:101994
Farayola OA, Olorunfemi OL, Shoetan PO (2024) Data privacy and security in it: a review of techniques and challenges. Comput Sci IT Res J 5(3):606–615
Ghaffary S (2019) Amazon is trying to regulate itself over facial recognition software before congress does. Vox. https://www.vox.com/technology/2019/2/7/18216125/amazon-regulation-facial-recognition-software. Accessed 13 May 2023
Goldfarb A, Tucker C (2012) Shifts in privacy concerns. Am Econ Rev 102(3):349–353
Goodman B, Flaxman S (2017) European Union regulations on algorithmic decision-making and a “right to explanation”. AI Mag 38(3):50–57
Gopal RD, Hidaji H, Patterson RA, Rolland E, Zhdanov D (2018) How Much to Share with Third Parties? User Privacy Concerns and Website Dilemmas. MIS Q 42(1):143–164
Gu LP (2020) Integrated privacy: A new type of privacy in the era of big data. Nanjing Soc Sci 04:106-111+122
Hair JF, Black W, Babin B, Anderson R, Tatham R (2010) Multivariate data analysis (6th ed.). Pearson Prentice Hall
Hajli N, Lin X (2016) Exploring the security of information sharing on social networking sites: The role of perceived control of information. J Bus Ethics 133(1):111–123
Harari GM, Lane ND, Wang R et al. (2016) Using smartphones to collect behavioral data in psychological science: Opportunities, practical considerations, and challenges. Perspect Psychol Sci 11(6):838–854. https://doi.org/10.1177/1745691616650285
Hargittai E, Marwick A (2016) What can I really do? Explaining the privacy paradox with online apathy. Int J COMMUN-US 10:3737–3757
He P, Lovo S, Veronesi M (2022) Social networks and renewable energy technology adoption: empirical evidence from biogas adoption in China. Energy Econ 106:105789
Hui K, Teo HH, Lee ST (2007) The value of privacy assurance: An exploratory field experiment. MIS Q 31(1):19–33
Ioannou A, Tussyadiah I, Lu Y (2020) Privacy concerns and disclosure of biometric and behavioral data for travel. Int J Inf Manag 54:102122. https://doi.org/10.1016/j.ijinfomgt.2020.102122
Kang J, Lan J, Yan H et al. (2022) Antecedents of information sensitivity and willingness to provide. Mark Intell Plan 40(6):787–803. https://doi.org/10.1108/MIP-02-2022-0065
Kim D, Park K, Park Y, Ahn JH (2019) Willingness to provide personal information: Perspective of privacy calculus in IoT services. Comput Hum Behav 92:273–281
Kokolakis S (2017) Privacy attitudes and privacy behavior: A review of current research on the privacy paradox phenomenon. Comput Secur 64:122–134
Korzaan ML, Boswell KT (2008) The influence of personality traits and information privacy concerns on behavioral intentions. J Comput Inf Syst 48(4):15–24
Krasnova H, Veltri NF, Günther O (2012) Self-disclosure and privacy calculus on social networking sites: The role of culture. Bus Inf Syst Eng 4(3):127–135
Lappeman J, Marlie S, Johnson T, Poggenpoel S (2023) Trust and digital privacy: willingness to disclose personal information to banking chatbot services. J Financ Serv Mark 28(2):337
Li WD (2022) Privacy security of personal data cloud communication in the internet of everything. Acad Front 7:78–89. https://doi.org/10.16619/j.cnki.rmltxsqy.2022.14.008
Li WD (2020) The connotation, elements and composition of the Internet of Everything. Acad Front 6:40–45
Li H, Luo X, Zhang J, Xu H (2017) Resolving the privacy paradox: Toward a cognitive appraisal and emotion approach to online privacy behaviors. Inf Manag 54(8):1012–1022
Lyu T, Guo Y, Chen H (2024) Understanding people’s intention to use facial recognition services: the roles of network externality and privacy cynicism. Inf Technol People 37(3):1025–1051
Mählmann L, Schee Gen Halfmann S et al. (2017) Attitudes towards personal genomics and sharing of genetic data among older Swiss adults: A qualitative study. Pub Health Genomics 20(5):293–306
Mai JE (2016) Big data privacy: The datafication of personal information. Inf Soc 32(3):192–199
Malhotra NK, Kim SS, Agarwal J (2004) Internet users’ information privacy concerns (IUIPC): The construct, the scale, and a causal model. Inf Syst Res 15(4):336–355
Markos E, Labrecque LI, Milne GR (2018) A new information lens: The self-concept and exchange context as a means to understand information sensitivity of anonymous and personal identifying information. J Interact Mark 42:46–62. https://doi.org/10.1016/j.intmar.2018.01.004
Markos E, Milne GR, Peltier JW (2017) Information sensitivity and willingness to provide continua: a comparative privacy study of the United States and Brazil. J Public Policy Mark 36:79–96. https://doi.org/10.1509/jppm.15.159
Martin K, Borah A, Palmatier RW (2017) Data privacy: Effects on customer and firm performance. J Mark 81(1):36–58
Martin KD, Zimmermann J (2024) Artificial Intelligence and its Implications for Data Privacy. Curr Opin Psychol 101829
Martino BD et al. (eds) (2017) Internet of everything: Algorithms, methodologies, technologies and perspectives. Springer, Singapore
Mathews-Hunt K (2016) CookieConsumer: Tracking online behavioural advertising in Australia. Comput Law Security Rep. 32(1):55–90. https://doi.org/10.1016/j.clsr.2015.12.006
Meg J (2015) Privacy Without Screens & the Internet of Other People’s Things (April 3, 2015). Idaho Law Review, 2015, Available at SSRN: https://ssrn.com/abstract=2614066
Mehmood R, Selwal A (2020) Fingerprint biometric template security schemes: attacks and countermeasures. In: Singh PK, Kar AK, Singh Y, Kolekar MH, Tanwar S (eds) Proceedings of ICRIC. Springer International Publishing, Cham, pp. 455–467
Mendel T, Toch E (2017) Susceptibility to Social Influence of Privacy Behaviors: Peer versus Authoritative Sources. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW ’17). Association for Computing Machinery, New York, NY, USA, 581–593. https://doi.org/10.1145/2998181.2998323
Milne GR, Pettinico G, Hajjat FM et al. (2017) Information sensitivity typology: Mapping the degree and type of risk consumers perceive in personal data sharing. J Consum Aff 51(1):133–161. https://doi.org/10.1111/joca.12111
Mishra A, Baker-Eveleth L, Gala P et al. (2023) Factors influencing actual usage of fitness tracking devices: Empirical evidence from the UTAUT model. Health Mark Q 40(1):19–38. https://doi.org/10.1080/07359683.2021.1994170
Moritz B et al. (2021) Making sense of algorithmic profiling: user perceptions on Facebook. Inf Commun Soc 1–17. https://doi.org/10.1080/1369118X.2021.1989011
Morosan C (2019) Disclosing facial images to create a consumer’s profile. Int J Contemp Hosp Manag (ahead-of-print):3149–3172. https://doi.org/10.1108/ijchm-08-2018-0701
Mothersbaugh DL, Foxx WK, Beatty SE et al. (2012) Disclosure antecedents in an online service context: The role of sensitivity of information. J Serv Res 15(1):76–98
Nandakumar K, Jain AK (2015) Biometric template protection: bridging the performance gap between theory and practice. IEEE Signal Process Mag 32(5):88–100. https://doi.org/10.1109/MSP.2015.2427849
National Bureau of Statistics of China (2020) The Seventh National Population Census Bulletin (No. 5). https://www.stats.gov.cn/sj/pcsj/rkpc/d7c/202303/P020230301403217959330.pdf
Ohlhausen MK (2014) Privacy challenges and opportunities: The role of the federal trade commission. J Public Policy Mark 33(1):4–9
Ohm P (2014) Sensitive information. South Calif Law Rev 88:1125–1196. https://heinonline.org/HOL/LandingPage?handle=hein.journals/scal88&div=39&id=&page
Ozturk AB, Nusair K, Okumus F et al. (2017) Understanding mobile hotel booking loyalty: an integration of privacy calculus theory and trust-risk framework. Inf Syst Front 19(4):753–767
Pavlou PA, Gefen D (2005) Psychological contract violation in online marketplaces: Antecedents, consequences, and moderating role. Inf Syst Res 16(4):372–399
Pedersen DM (1982) Personality correlates of privacy. J Psychol 112(1):11–14
Phelps J, Nowak G, Ferrell E (2000) Privacy concerns and consumer willingness to provide personal information. J Public Policy Mark 19(1):27–41
Robinson C (2017) Disclosure of personal data in ecommerce: A cross-national comparison of Estonia and the United States. Telemat Inf 34(2):569–582
Rohunen A, Markkula J, Heikkilä M (2018) Explaining diversity and conflicts in privacy behavior models. J Comput Inf Syst 60(4):378–393
Rumbold JMM, Pierscionek BK (2018) What are data? A categorization of the data sensitivity spectrum. Big Data Res 12:49–59. https://doi.org/10.1016/j.bdr.2017.11.001
Schomakers EM, Lidynia C, Ziefle M (2022) The role of privacy in the acceptance of smart technologies: Applying the privacy calculus to technology acceptance. Int J Hum–Computer Interact 38(13):1276–1289
Schomakers EM, Lidynia C, Müllmann D, Ziefle M (2019) Internet users’ perceptions of information sensitivity – insights from Germany. Int J Inf Manag 46:142–150. https://doi.org/10.1016/j.ijinfomgt.2018.11.018
Sepas-Moghaddam A, Correia P, Nasrollahi K et al. (2019) A double-deep spatioangular learning framework for light field based face recognition. IEEE Trans Circuits Syst Video Technol 30(12):4496–4512. https://doi.org/10.1109/TCSVT.2019.2916669
Singh G, Bhatt S, Jhamb D (2024) Impact of privacy, technology readiness, and perceived crowding on adoption of telemedicine services. Int J 15(4):455–478
Smith HJ, Milberg SJ, Burke SJ (1996) Information privacy: Measuring individuals’ concerns about organizational practices. MIS Q 20(2):167–196. https://doi.org/10.2307/249477
Solove DJ (2024) Artificial intelligence and privacy. Available at SSRN
Solove DJ, Schwartz PM (2020) Information privacy law. Aspen Publishing
Su P, Wang L, Yan J (2018) How users’ Internet experience affects the adoption of mobile payment: A mediation model. Technol Anal Strategic Manag 30(2):186–197
Tang JH, Lin YJ (2017) Websites, data types and information privacy concerns: A contingency model. Telemat Inf 34:1274–1284
Tao S, Liu Y, Sun C (2024) Understanding information sensitivity perceptions and its impact on information privacy concerns in e-commerce services: Insights from China. Comput Secur 138:103646
TC260 (2021) “Cybersecurity practices guidelines – guidelines for categorisation and classification of network data”, available at: https://www.tc260.org.cn/upload/2021-12-31/1640948142376022576.pdf (accessed 17 March 2023)
Turn R (1976, June) Classification of personal information for privacy protection purposes. In Proceedings of the June 7-10, 1976, national computer conference and exposition (pp. 301-307)
Valdez AC, Ziefle M (2018) The users perspective on the privacy-utility trade-offs in health recommender systems. International Journal of Human-computer Studies
Venkatesh V, Davis FD (2000) A theoretical extension of the technology acceptance model: Four longitudinal field studies. Manag Sci 46(2):186–204
Venkatesh V, Thong JYL, Xu X (2012) Consumer acceptance and use of information technology: extending the unified theory of acceptance and use of technology. MIS Q 36(1):157–178
Wacks R (1989) Personal Information: Privacy and the Law. Clarendon Press, Oxford
Wang M, Qin Y, Liu J, Li W (2023) Identifying personal physiological data risks to the Internet of Everything: the case of facial data breach risks. Hum Soc Sci Commun 10(1):1–15
Wang L, Sun Z, Dai X et al. (2019) Retaining users after privacy invasions: The roles of institutional privacy assurances and threat-coping appraisal in mitigating privacy concerns. Inf Technol People 32(6):1679–1703
Wang L, Wang LY, Sun Z (2020) The mechanism of privacy invasion on experience on internet users self-disclosure. Syst Eng Theory Practical 40(01):79–92
Wang Y, Zhu J, Liu R, Jiang Y (2024) Enhancing recommendation acceptance: Resolving the personalization–privacy paradox in recommender systems: A privacy calculus perspective. Int J Inf Manag 76:102755
Weible RJ (1993) Privacy and Data: An Empirical Study of the Influence of Types of Data and Situational Context upon Privacy Perceptions (doctoral dissertation). Department of Business Administration, Mississippi State University
Wiese J, Das S, Hong JI, Zimmerman J (2017) Evolving the ecosystem of personal behavioral data. Hum–Comput Interact 32(5-6):447–510
Winegar AG, Sunstein CR (2019) How much is data privacy worth? A preliminary investigation. J Consum Policy 42:425–440
Wong R (2007) Data protection online: Alternative approaches to sensitive data. J Int’L Com L Tech 2:9
Wu F, Li JM (2021) Virtual reality: Exploring the technological implementation path of empathy communication. J Southwest Minzu Univ (Humanities Soc Sci Ed) 7:178–184
Xu H, Dinev T, Smith J et al. (2011) Information privacy concerns: Linking individual perceptions with institutional privacy assurances. J Assoc Inf Syst 12(12):1
Xu P, Krueger B, Liang F, Zhang M, Hutchison M, Chang M (2023) Media framing and public support for China’s social credit system: An experimental study. New Media Soc 0(0). https://doi.org/10.1177/14614448231187823
Youn S, Shin W (2019) Teens’ responses to Facebook newsfeed advertising: The effects of cognitive appraisal and social influence on privacy concerns and coping strategies. Telemat Inf 38:30–45. https://doi.org/10.1016/j.tele.2019.02.001
Yu L, Li H, He W, Wang FK, Jiao S (2020) A meta-analysis to explore privacy cognition and information disclosure of internet users. Int J Inf Manag 51:102015
Zlatolas LN, Welzer T, Heričko M, Hölbl M (2015) Privacy antecedents for SNS self-disclosure: The case of Facebook. Comput Hum Behav 45:158–167. https://doi.org/10.1016/j.chb.2014.12.012
Acknowledgements
This research was funded by the National Social Science Foundation of China (Major Programme, Grant No. 22ZDA078).
Author information
Contributions
MW, WL, YQ and CC contributed equally to this research; MW, WL and YQ conceived of the idea, implemented the formula and carried out the case studies; WL and MW contributed to the idea, implementation and case study design; MW and YQ interpreted the results based on the survey; WL, MW and CC coordinated the study and revised it critically for important intellectual content; MW led the writing of the manuscript with contributions from all co-authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
This study adhered to institutional research guidelines and the Declaration of Helsinki ethical principles. Ethical approval was exempted by the School of Journalism and Information Communication, Huazhong University of Science and Technology (November 2022) based on three criteria: (1) deployment of non-invasive, observational methods in public settings involving anonymized data collection, (2) implementation of rigorous data protection protocols following ISO/IEC 27001 standards, and (3) absence of psychological or physical intervention risks. All participants were consenting adults (≥18 years) who provided informed consent through questionnaire completion, with data anonymization ensuring no personal identifiers were retained throughout the research process.
Informed consent
Participants were informed about the aim of the study, the confidentiality of information, the voluntary nature of participation, and their ability to opt out of the study if needed. Participants were informed through the question “Do you accept participation in this survey”. If they chose “I accept to participate”, they could proceed to the next page of the measures. All participants gave their agreement to participate in the study and consented to the processing of their data.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Li, W., Qin, Y., Chen, C. et al. An investigation into personal data sensitivity in the Internet of Everything—insights from China. Humanit Soc Sci Commun 12, 311 (2025). https://doi.org/10.1057/s41599-025-04580-x
DOI: https://doi.org/10.1057/s41599-025-04580-x




