Abstract
Digital heritage experiences increasingly rely on Digital Twin Technology (DTT), yet how perception and emotion translate into behavioral intentions remains insufficiently understood. This study integrates PLS-SEM and fsQCA to examine cognitive–affective mechanisms underlying tourists’ environmental and cultural respect intentions. Structural modeling confirms two key routes: a cognitive path driven by perceived realism and cultural identity, and an emotional path anchored in narrative immersion and place attachment. Cultural capital moderates both paths in opposing directions. Meanwhile, fsQCA reveals multiple sufficient configurations across three mechanism types: narrative-driven emotional resonance, realism-based cognitive framing, and multi-factor collaboration. These findings introduce a novel Perception–Place–Behavior (PPB) framework that bridges technology acceptance and cultural theory in digital heritage contexts. Practically, a dual-mode system is recommended to accommodate users with varying cultural capital and enhance emotional engagement and cultural responsiveness.
Introduction
With the rise of immersive technologies, DTT has become an important tool in cultural heritage preservation and dissemination. Through high-precision 3D modeling, real-time synchronization, and interactive simulation, it functions as a digital “life-support system” for fragile sites such as the Mogao Caves and Pompeii1,2. Beyond preservation, DTT enables tourists to “travel through time and space,” fostering immersive historical experiences grounded in the notion of “virtual uniqueness”3: the idea that cultural symbols acquire exclusive presence and interactivity in digital environments. For example, the Palace Museum in Beijing uses DTT to reconstruct Qing Dynasty court life, integrating role-play and narrative design to transform visitors from passive viewers into embodied participants, shifting the experience from “browsing” to deep cultural understanding4.
However, despite growing investments in immersive technologies, such experiential enhancements have not translated proportionally into sustainable behavioral change among tourists5. This paradox of technological enthusiasm versus behavioral indifference reveals a persistent gap between digital engagement and behavior transformation2. Existing research has primarily emphasized technical aspects such as modeling, visualization, and interface optimization, while paying insufficient attention to the underlying psychological and behavioral mechanisms1,6.
Current DTT literature predominantly bifurcates into two tracks. The first concerns technological realization, focusing on 3D modeling, data integration, and system interactivity7,8. The second, anchored in the Technology Acceptance Model (TAM), evaluates user attitudes through variables such as perceived usefulness and ease of use9,10. Yet TAM, while well established in information systems research, proves inadequate in highly contextualized, affect-rich environments such as cultural heritage tourism. It cannot explain emotional resonance, symbolic interpretation, or narrative immersion, all of which are fundamental to the construction of cultural meaning10,11.
Moreover, models emphasizing surface-level satisfaction or hedonic response fail to capture how visitors internalize cultural content psychologically2,12. In this regard, cultural identity should be reconceptualized as a dynamic process evolving from cognitive understanding to emotional involvement and ultimately behavioral commitment13,14,15,16. However, such internalization is often reduced to simplified indicators such as “pleasure,” neglecting the layered symbolic meaning-making process. Similarly, place attachment theories developed in physical contexts struggle to explain how emotional bonds form in virtual environments driven by symbolic cues17. Mechanisms such as narrative immersion, symbolic reconstruction, and role-playing underpin what we term symbolic embeddedness, yet this remains under-theorized18,19.
Cultural capital adds another layer of complexity. As Bourdieu noted20, it shapes individuals’ interpretive schemas and cultural preferences. Tourists with higher cultural capital may be more attuned to symbolic narratives21,22, but they may also exhibit a critical distance toward hypermediated experiences19, a phenomenon we refer to as the “anti-immersion effect.” This intersection of cultural capital and digital heritage experience remains theoretically underdeveloped.
Most existing studies rely on structural equation modeling (SEM), which estimates linear, population-level relationships23. However, sustainable behavioral intentions often result from the interplay of multiple psychological conditions24. To better address this complexity, we adopt a dual approach, using partial least squares SEM (PLS-SEM) to examine average effects and fuzzy-set qualitative comparative analysis (fsQCA) to uncover diverse causal combinations associated with behavioral outcomes.
In summary, the research questions addressed in this study are as follows:
(1) How do immersive experiences triggered by DTT transform into cultural psychological construction processes through perception (realism and narrativity)?
(2) How do cultural identity and place attachment mediate the relationship between perceptual experiences and tourists’ behavioral intentions?
(3) How does an individual’s cultural capital moderate the pathways through which immersive experiences influence cultural psychological construction?
(4) Under various combinations of perceptual and psychological variables, are there multiple equivalent pathways leading to tourists’ sustainable behavioral intentions?
To address these questions, this study proposes an integrated framework that moves beyond traditional perception–behavior models by incorporating perceptual stimuli, psychological construction, and behavioral responses. Theoretically, it emphasizes the mediating roles of cultural identity and place attachment within the PPB pathway, and introduces the moderating role of cultural capital. Methodologically, it combines PLS-SEM with fsQCA to identify both average effects and heterogeneous causal configurations. Through this theoretical–methodological dual integration, the study seeks to respond to the limitations of current models and better capture the complexity of tourist behavior in digital heritage contexts.
Methods
The dual drivers of digital twin technology
In recent years, DTT has been widely employed in cultural heritage preservation and communication, combining precise visual modeling with immersive interaction2,7,25. By integrating 3D scanning, IoT sensors, and real-time rendering, DTT constructs a “virtual mirror” of heritage sites that enables visitors to explore and engage with cultural resources remotely26. From a perceptual perspective, two experiential drivers are particularly central to the way DTT shapes users’ responses: Perceived Realism (PR) and Narrativity (NC). We argue that both can be directly traced back to identifiable technical inputs in DTT, which provides a strong rationale for our hypotheses.
PR describes the subjective sense of consistency between virtual representations and their real-world cultural counterparts27. In DTT contexts, this sense of realism is not a vague impression but the outcome of several concrete technical inputs. Geometric and material fidelity—for example, dense point clouds, mesh resolution, and physically based rendering—allows surfaces, textures, and architectural details to be discerned without visual artifacts, reinforcing the impression of “being there”28. Physically plausible lighting and physics engines ensure that shadows, reflections, and object behaviors follow natural causal rules, thereby reducing contradictions that might otherwise undermine credibility29. Spatiotemporal registration and real-time data feeds (e.g., synchronizing IoT sensor input with a virtual environment) provide a temporal anchor, so visitors experience events that align with “what is happening now”30. Finally, system performance and latency (adequate frame rate, minimal lag) prevent perceptual breaks that can destroy the illusion of reality.
Together, these inputs converge to strengthen authenticity, presence, and the impression of “experiencing the real site”31,32. Empirical studies show that such realistic rendering significantly contributes to the internalization of cultural meaning and the formation of Cultural Identity (CI). Yet prior work also warns that realism is not linearly positive: too much sensory detail can overwhelm users’ cognitive resources, diverting attention to technical elements rather than cultural significance33,34. Leow and Ch’ng19 similarly note that an excessive focus on sensory fidelity may crowd out deeper cultural interpretation. Thus, while the relationship is contingent on context and individual background, the dominant expectation supported by both technology design and empirical evidence is that greater PR will facilitate the strengthening of cultural identity.
H1: PR positively influences CI.
NC refers to the storyfulness of digital-heritage presentations, capturing the extent to which visitors are immersed in a culturally meaningful storyline through role enactment, scene re-creation, and multisensory cues35,36. In DTT, narrativity is supported by several technical inputs. Interaction granularity, such as object-level inspection, manipulation, and path choice, allows visitors to become active participants rather than passive viewers, which heightens their sense of agency and story involvement. Narrative orchestration through branching triggers, quest-chain logic, and diegetic interfaces maintains dramatic progression and role goals, sustaining narrative transportation. Multisensory delivery, including spatial audio and haptic feedback, enriches the storyworld with embodied cues. Stable system performance prevents rhythm breaks that would disrupt immersion, while spatiotemporal registration with real-time events adds credible “evidence nodes” that make the enacted story feel situated rather than scripted.
Compared with static exhibitions, these narrativity-oriented affordances foster temporal transcendence—the sense of experiencing the past in the present—and situated immersion. Case projects illustrate this vividly: the Mostar Old Bridge combined 360° VR with embodied action to let visitors re-enact intangible rituals37, while the Carignano Palace reconstruction employed dynamic narration to place audiences within parliamentary history, deepening both comprehension and affective engagement38. Converging evidence shows that when visitors are narratively transported, that is, when they feel inside the storyworld, they are more likely to develop place-based bonding and belonging39,40. Nevertheless, narrativity also has boundary conditions. Fragmented plots or excessive branching can lead to cognitive overload and weaken transportation41, and when PR is very high, sensory fidelity may capture attention at the expense of narrative processing19. Acknowledging these risks yet following the dominant empirical pattern, we propose the following hypothesis:
H2: NC positively influences place attachment.
Psychological mechanisms of cultural identity and place attachment
Cultural Identity (CI) refers to the psychological bond that tourists form with a specific culture during heritage experiences. It typically comprises three dimensions: cognitive identity (understanding and internalizing cultural values), emotional identity (a sense of belonging and self-association), and behavioral intention (a commitment to cultural preservation)42,43. This construct not only reflects tourists’ subjective acceptance of culture but is also recognized as a key psychological foundation for fostering sustainable tourism behaviors44.
In virtual cultural contexts enabled by DTT, the formation mechanism of CI is undergoing transformation2. Through immersive experiences and interactive learning, DTT allows tourists to directly engage with the history and preservation values of heritage sites. For example, virtual archeology platforms that simulate restoration processes significantly enhance users’ cognitive identity by deepening their understanding of heritage craftsmanship45. Augmented reality (AR) guided tours reconstruct historical narratives, strengthening tourists’ situational resonance and emotional identification with heritage46.
However, some studies caution that in pursuit of immersion, certain projects adopt overly theatrical or entertaining approaches, which may divert tourists’ attention to the storyline itself while neglecting underlying cultural values. This can result in CI remaining at a superficial level of pleasurable experience39. Therefore, the effectiveness of DTT in fostering CI may vary depending on content design and user background. Nonetheless, there is broad consensus on its potential to influence identity construction. Prior research suggests that stronger CI leads to greater engagement in both EBI (e.g., pollution reduction, low-carbon travel) and CRI (e.g., observing local customs, protecting heritage sites)42. Accordingly, the following hypotheses are proposed to explore the mediating role of CI between PR and behavioral intention:
H3a: CI mediates the relationship between PR and EBI. H3b: CI mediates the relationship between PR and CRI.
Place Attachment (PA) refers to an individual’s emotional bond and psychological connection to a specific location, typically comprising two dimensions: place identity, which reflects tourists’ emotional belonging and sense of self-extension toward a heritage site; and place dependence, which denotes functional reliance and the perceived uniqueness of experiences associated with the site17,47.
In traditional tourism contexts, PA is primarily formed through physical presence and on-site experiences. However, this psychological mechanism is undergoing transformation within digitally constructed environments powered by DTT7,48. On one hand, the high replicability and on-demand accessibility of virtual environments may weaken tourists’ dependence on physical space49. On the other hand, immersive storytelling and role-play experiences may significantly enhance contextual resonance, thereby reinforcing tourists’ sense of PA19,50. For example, the VR reconstruction of Pompeii simulates life before the volcanic eruption, enabling visitors to develop deeper historical identification.
Existing studies suggest that higher levels of PA are positively associated with stronger intentions to engage in EBI51,52. When individuals form deep emotional connections to a heritage site, they are more likely to participate in protective actions, such as minimizing waste or supporting eco-friendly initiatives. Therefore, the following hypothesis is proposed:
H4a: NC influences EBI through PA.
On the other hand, when tourists perceive a heritage site as functionally irreplaceable, they are more inclined to demonstrate culturally respectful behaviors, such as complying with behavioral norms or maintaining the sanctity of the site51,53. This indicates that PA also plays a significant role in guiding behavioral responses. Thus, the following hypothesis is proposed:
H4b: NC influences CRI through PA.
CI and PA are closely linked in tourists’ psychological mechanisms32,54. Existing research has shown that an increase in CI often coincides with a deeper emotional attachment to cultural heritage sites55. In the context of DTT, this relationship is especially pronounced: visitors gain a deeper understanding of the historical significance of cultural heritage through virtual interactions, which not only strengthens their CI but also enhances their emotional connection and attachment to the site1,2. For example, the digital restoration project of the Yuanmingyuan (Old Summer Palace) not only helps visitors recognize its historical destruction but also fosters national cultural consciousness, thereby enhancing visitors’ PA to the site56. Therefore, CI and PA may jointly influence tourists’ behavioral intentions, forming a chain mediation path. Based on this, the following hypotheses are proposed:
H5a: CI and PA form a chain mediation path in the relationship between PR and EBI.
H5b: CI and PA form a chain mediation path in the relationship between PR and CRI.
The role of cultural capital in digital twin experiences
Cultural Capital (CC) refers to the cultural resources, cognitive structures, and aesthetic abilities possessed by individuals, profoundly influencing how people perceive, decode, and respond to cultural information20,57. In cultural heritage tourism, CC not only determines whether visitors can deeply understand the intrinsic meaning of heritage, but also affects their experience quality and psychological engagement pathways in DTT scenarios1,58. However, research on the moderating mechanisms of CC in DTT contexts remains scarce, particularly the lack of systematic empirical testing.
First, CC enhances tourists’ narrative decoding ability. Visitors with high CC typically possess richer historical knowledge and aesthetic literacy, enabling them to interpret complex cultural narratives during immersive experiences21,59. For example, in the AR restoration project at Kyoto’s Kiyomizu-dera, high-CC visitors not only attend to the architectural details but also understand the religious symbolism and cultural context22. In contrast, those with lower CC often react more strongly to the “novelty” of the technological presentation but struggle to understand the cultural context60,61. Therefore, CC may enhance the positive impact of NC on PA.
Bourdieu’s habitus theory emphasizes that the cognitive frameworks and behavioral patterns formed through an individual’s social background and historical experiences profoundly influence their perception and response to the external world20. In the digital cultural heritage experience, tourists’ CC influences their perception and understanding of virtual heritage through habitus. For instance, visitors with higher CC are accustomed to critically examining the presentation of virtual heritage, placing higher demands on the accuracy, historical consistency, and cultural value of the technology2,62. This critical examination may lead high CC visitors to develop a more rational and profound CI in virtual environments, while those with lower CC are more likely to be influenced by sensory stimuli, resulting in a more superficial CI63. Based on this, the following hypotheses are proposed:
H6: CC positively moderates the impact of NC on PA.
H7: CC negatively moderates the impact of PR on CI.
PPB model and theoretical integration
Traditional models such as TAM and emotional theories provide valuable insights into the impact of digital technology on user behavior64, but they fall short in capturing the complex psychological processes in cultural heritage tourism. TAM emphasizes perceived usefulness and ease of use but overlooks identity construction and emotional engagement65, and remains insufficient in addressing culturally embedded behaviors66,67,68,69. Meanwhile, emotional theories often treat affect as an isolated driver, failing to integrate perception and cognition in a systemic way9,10.
The PPB model provides a more integrative framework, positioning CI and PA as mediating variables that bridge perceptual stimuli and behavioral outcomes. Unlike TAM’s linear logic, PPB emphasizes how psychological mechanisms shape behavioral intentions, incorporating elements from emotional and behavior change theories into a multi-layered causal chain. As shown in Fig. 1, this model helps conceptualize the relational structure among PR, NC, CI, PA, and behavioral outcomes such as EBI and CRI.
Solid blue arrows denote hypothesized positive effects; dashed orange arrows denote hypothesized negative effects. Light-blue nodes indicate independent variables, orange nodes indicate mediating variables, green nodes indicate dependent variables, and the yellow node denotes the moderating variable. See hypotheses H6 and H7 for the specific moderated paths.
However, tourist responses in immersive cultural environments are rarely linear or homogeneous. For example, some individuals develop strong attachment under narrative engagement despite low realism, while others bypass affective identification through high cultural capital. These divergent pathways challenge the explanatory power of mean-based models70. To address this, we introduce fsQCA to identify multiple causal configurations underlying behavior. This configurational approach allows us to explore the path heterogeneity within the PPB chain, offering a more nuanced understanding of behavior formation in digital heritage contexts23,24,71.
Study area
This study focuses on three representative heritage sites in Guangzhou, China: the Chen Clan Ancestral Hall, Yongqingfang, and the Nanyue King Palace Ruins (Fig. 2). Site selection was based on three considerations:
First, the sites cover three major heritage types defined by the International Council on Monuments and Sites72—tangible (Chen Clan Ancestral Hall), intangible (Yongqingfang), and archeological (Nanyue King Palace Ruins). This typological diversity enhances the generalizability of the findings73.
Second, the cases reflect a gradient of digital interventions. The Chen Clan Ancestral Hall features high-fidelity AR restoration (0.1 mm accuracy), suitable for testing perceived realism’s effect on cultural identity (CI, H1)74. Yongqingfang employs narrative-driven VR to activate place attachment (PA, H2)75. The Nanyue King Palace Ruins combine LiDAR and AR for guided interaction, allowing investigation of PR–NC synergy on behavioral intention (H3–H5)76. This gradient offers a practical basis for hypothesis testing.
Third, selecting sites within a single urban context controls for regional variation. Guangzhou, a top-ranked cultural tourism destination in China (“China Cultural Tourism Statistical Yearbook”, 2023), ensures cultural consistency and minimizes confounding influences.
Questionnaire design and variable measurement
The structured questionnaire consisted of two parts. The first part collected basic demographic data (e.g., gender, age, education level, and site visited), while the second part measured the core constructs of this study, comprising seven variables in total: two independent variables (PR and NC), two mediating variables (CI and PA), two dependent variables (EBI and CRI), and one moderator (CC). Each variable was measured through a dedicated subscale using a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree). A total of 21 items were used in the main scale.
To ensure theoretical consistency and cross-contextual validity, all scales were adapted from validated instruments in tourism, digital heritage, and immersive experience. The measurement of PR referred to the spatial presence scale by Wagler and Hanus77, with items anchored to the digital-twin experience (e.g., “the visuals and interactions in this digital-twin experience felt lifelike”) to ensure context specificity. NC was grounded in narrativity-oriented frameworks, drawing on Reese et al.78 and Mulholland et al.79, and adapted to highlight story involvement within the digital-twin setting. CI adopted the cognitive–emotional identity structure developed by Fu and Luo16, with explicit referent shifts from physical heritage visits to digital-twin presentations, using anchors such as “in this digital-twin experience” and “as presented in the digital twin” to capture identity formation in virtual contexts. PA was measured using the dual-dimensional model of Williams and Vaske17 (identity and dependence), with item wording similarly adapted to emphasize attachment to the place as represented in the digital twin. EBI and CRI were derived from established environmental-psychology scales80,81,82, and contextualized by framing behavioral intentions after the digital-twin experience rather than in generic terms. CC was measured through indicators of education, aesthetics, and participation, adapted from Bourdieu-inspired empirical models83,84, and treated as a background resource without DTT contextualization.
All English-language items underwent a rigorous translation and localization process, involving two experts—one specializing in digital heritage and the other in applied linguistics. They conducted independent forward translations, followed by consensus synthesis and back-translation. To ensure semantic clarity and contextual adaptability, a pilot test was conducted with 28 participants who had recently visited a cultural site and used DTT tools. Feedback indicated confusion with terms such as “narrative coherence” and “virtual realism,” prompting refinement of item phrasing. Descriptions of example DTT applications (e.g., AR-guided tours, VR immersion theaters) were added to reduce ambiguity.
Following the pilot, minor adjustments were made to item wording to improve comprehension, while preserving conceptual integrity. The final version demonstrated adequate face validity and internal consistency, and a summary of variables, item examples, and sources is presented in Table 1.
Sample selection
This study adopted a stratified convenience sampling method, targeting tourists who had visited three representative cultural heritage sites in Guangzhou—Chen Clan Ancestral Hall, Yongqingfang, and the Nanyue King Palace Ruins. The stratification was based on two dimensions: heritage type (traditional architecture, intangible cultural heritage, and archeological ruins) and mode of technological engagement (VR, AR, and digital interaction), ensuring comprehensive coverage of DTT application scenarios. Respondents were required to meet three core criteria: (1) be at least 18 years old; (2) have physically visited one of the targeted heritage sites within the past 12 months; and (3) have engaged with at least one DTT application during the visit, verified through device usage records or a DTT feature recognition test.
Following the dual criteria for sample size determination under PLS-SEM, the minimum sample size was calculated using both the “10-times rule” of Hair et al.85 (7 paths in the model, minimum 70 samples) and G*Power analysis (effect size = 0.15, α = 0.05, power = 0.8), which yielded a required sample size of at least 395. Allowing for a potential invalid response rate of 20%, a total of 600 questionnaires were distributed. Ultimately, 516 valid responses were collected, yielding a valid response rate of 86%.
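The arithmetic behind these figures can be checked with a short sketch. The function names are hypothetical, and the inputs (7 paths, a requirement of 395, a 20% invalid rate, 600 distributed, 516 valid) are simply the values reported above; the G*Power computation itself is not reproduced here.

```python
import math


def ten_times_rule(n_paths: int) -> int:
    """Minimum sample size under the 10-times rule as applied in this study."""
    return 10 * n_paths


def questionnaires_needed(required_n: int, invalid_rate: float) -> int:
    """Questionnaires to distribute so that required_n remain after invalid ones are dropped."""
    return math.ceil(required_n / (1.0 - invalid_rate))


def valid_response_rate(valid: int, distributed: int) -> float:
    """Share of distributed questionnaires that were retained as valid."""
    return valid / distributed


print(ten_times_rule(7))                        # 7 structural paths
print(questionnaires_needed(395, 0.20))         # requirement inflated for 20% invalid responses
print(round(valid_response_rate(516, 600), 2))  # reported valid response rate
```

Under these assumptions, the 600 questionnaires distributed exceed the inflated minimum, and the 516 retained responses exceed the stated requirement of 395.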
Data collection and quality control
A multimodal approach combining online and offline data collection was employed. Offline responses were collected by trained surveyors stationed at the exits of the heritage sites, with DTT engagement verified via device usage logs. Online questionnaires were distributed through the official heritage site apps to maintain data integrity. To ensure sample diversity and data quality, measures such as IP address filtering and response time monitoring were implemented.
The data cleaning process was conducted in two stages: (1) primary cleaning removed responses with abnormally short completion times (more than three standard deviations below the mean, i.e., under 98 s) or with over 10% missing data; (2) advanced cleaning used the Longstring index (threshold = 0.8) to detect patterned responses and Mahalanobis distance (p < 0.001) to identify multivariate outliers, ensuring robustness and reliability86,87.
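Two of these screening rules can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say how the Longstring index was normalized, so it is assumed here to be the longest run of identical consecutive answers divided by the number of items, with a response flagged when the ratio reaches the threshold. The function names are hypothetical, and the Mahalanobis step is not shown.

```python
import statistics


def short_time_cutoff(times_s):
    """Cutoff located three sample standard deviations below the mean completion time."""
    return statistics.mean(times_s) - 3 * statistics.stdev(times_s)


def longstring_ratio(responses):
    """Longest run of identical consecutive answers as a share of all items.

    Assumes a non-empty list of item responses for one respondent.
    """
    longest = run = 1
    for prev, cur in zip(responses, responses[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest / len(responses)


def is_patterned(responses, threshold=0.8):
    """Flag a respondent whose longstring ratio reaches the threshold (assumed rule)."""
    return longstring_ratio(responses) >= threshold


# A respondent who answers "4" on all 21 items is flagged; a varied pattern is not.
print(is_patterned([4] * 21))
print(is_patterned([1, 2, 3, 4, 5, 4, 3]))
```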
Data analysis method
PLS-SEM was used to assess the hypothesized structural relationships, including mediation and moderation effects among multiple latent constructs. PLS-SEM is particularly suitable for studies with moderate sample sizes, non-normally distributed data, and evolving theoretical frameworks88. It also enables the analysis of complex models without requiring multivariate normality or large samples, distinguishing it from covariance-based SEM89.
The analysis was performed using SmartPLS 4.0. To ensure data quality, diagnostic checks were conducted in SPSS 29.0. All variance inflation factor (VIF) values were below the recommended threshold of 5.0, confirming no multicollinearity88. The Shapiro–Wilk test was used to evaluate normality, and deviations supported the appropriateness of PLS-SEM for this dataset.
Measurement model evaluation followed established criteria85. Internal consistency reliability was assessed using Cronbach’s α and rho_A, both with thresholds of 0.70. Composite Reliability (CR) values above 0.70 and Average Variance Extracted (AVE) values above 0.50 indicated acceptable convergent validity. Discriminant validity was examined using the heterotrait–monotrait (HTMT) ratio, with values below 0.85 deemed acceptable85,88.
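As an illustration of the convergent validity criteria, CR and AVE can be computed from standardized outer loadings using the standard formulas. The loadings below are invented for the example and are not taken from the study; the function names are hypothetical.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: squared sum of loadings over itself plus the summed error variances."""
    total = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)


def meets_convergent_validity(loadings, cr_min=0.70, ave_min=0.50):
    """Apply the CR > 0.70 and AVE > 0.50 thresholds described above."""
    return (composite_reliability(loadings) > cr_min
            and average_variance_extracted(loadings) > ave_min)


example = [0.8, 0.8, 0.8]  # illustrative loadings only
print(round(average_variance_extracted(example), 3))
print(round(composite_reliability(example), 3))
print(meets_convergent_validity(example))
```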
The structural model was evaluated using R² and adjusted R² values to assess explanatory power. Mediation and moderation effects were tested using bias-corrected bootstrap resampling (5000 iterations), with significance assessed based on confidence intervals90. The path weighting scheme was applied with a convergence criterion of 10⁻⁵, following recommendations for stable model estimation88.
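The logic of a bias-corrected bootstrap for an indirect effect can be sketched in plain Python. This is a simplified stand-in, not the SmartPLS procedure: it uses ordinary least squares on three observed variables rather than PLS on latent constructs, the data are synthetic, and all names are hypothetical. The example uses 1000 resamples for speed; the study reports 5000.

```python
import random
from statistics import NormalDist


def _cov(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def indirect_effect(x, m, y):
    """a*b, with a from m ~ x and b the coefficient of m in y ~ x + m (OLS)."""
    vx, vm = _cov(x, x), _cov(m, m)
    cxm, cxy, cmy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = cxm / vx
    b = (cmy * vx - cxy * cxm) / (vx * vm - cxm ** 2)
    return a * b


def bc_bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Bias-corrected bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimate = indirect_effect(x, m, y)
    boots = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        boots.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    boots.sort()
    # Bias correction: shift the percentile points by z0, derived from the share
    # of bootstrap estimates that fall below the original estimate.
    prop = sum(b < estimate for b in boots) / n_boot
    prop = min(max(prop, 1 / n_boot), 1 - 1 / n_boot)  # keep inv_cdf finite
    nd = NormalDist()
    z0 = nd.inv_cdf(prop)
    lo_p = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    hi_p = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo = boots[min(n_boot - 1, int(lo_p * n_boot))]
    hi = boots[min(n_boot - 1, int(hi_p * n_boot))]
    return estimate, lo, hi


# Synthetic data with a built-in indirect path x -> m -> y (illustrative only).
_rng = random.Random(42)
x = [_rng.gauss(0, 1) for _ in range(200)]
m = [0.6 * xi + _rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + _rng.gauss(0, 1) for mi in m]
est, lo, hi = bc_bootstrap_ci(x, m, y, n_boot=1000)
print(est, lo, hi)
```

An indirect effect is treated as significant when the resulting interval excludes zero, which is the decision rule the confidence-interval approach relies on.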
To complement the linear estimations of PLS-SEM and uncover causal asymmetry and conjunctural patterns, this study employed fsQCA using fsQCA 3.1b software. FsQCA is particularly suitable for exploring equifinal mechanisms where multiple configurations of conditions can lead to the same outcome, thereby offering a richer causal interpretation beyond net effects24.
For five-point Likert scales, previous studies recommend direct calibration with values of 4, 3, and 2 to align thresholds with the semantic meaning of the scale and to avoid misclassifying mid-range responses as full members24,91. Percentile-based thresholds may distort meaning when distributions cluster near the midpoint. As shown in Table 2, our data exhibit moderate negative skewness and clustering around scale points 3–4, making percentile calibration particularly prone to inflating membership scores. In this study, we first computed continuous composite scores for each reflective construct by averaging its items (for example, PR1 to PR3). These scores were then calibrated through direct anchors. To preserve discrimination under the slightly skewed distributions in our data, we adopted a stricter 5–3–1 scheme, with 5 representing full membership, 3 the crossover point, and 1 full non-membership. Specifically, we employed the calibrate function in fsQCA 3.1b, which mapped the 1–5 composite scores onto fuzzy set membership values between 0 and 1 according to these anchors.
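The 5–3–1 calibration can be illustrated with a small sketch. It assumes the direct method commonly associated with fsQCA, in which the three anchors correspond to log-odds of +3, 0, and −3 before a logistic transformation; the software's internal implementation is not reproduced here, and the function names are hypothetical.

```python
import math


def composite(items):
    """Composite score of a construct: the mean of its item responses."""
    return sum(items) / len(items)


def calibrate(score, full=5.0, crossover=3.0, non=1.0):
    """Direct calibration: anchors map to log-odds +3, 0, -3, then a logistic curve."""
    if score >= crossover:
        log_odds = 3.0 * (score - crossover) / (full - crossover)
    else:
        log_odds = -3.0 * (crossover - score) / (crossover - non)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Membership at the three anchors and for one illustrative three-item composite.
for value in (5, 3, 1, composite([4, 4, 5])):
    print(round(value, 2), round(calibrate(value), 3))
```

Under this assumption, a composite of exactly 3 sits at the point of maximum ambiguity (0.5), while scores of 5 and 1 map to roughly 0.95 and 0.05 rather than to exactly 1 and 0.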
The analysis followed standard fsQCA procedures. A necessity analysis was first conducted using a consistency threshold of 0.90 to identify individually indispensable conditions. This was followed by truth table construction, applying a frequency threshold of 1 and a consistency cutoff of 0.80 to determine sufficient condition sets. Among the solution types produced, the parsimonious solution was selected for interpretation, as it retains only the most essential causal paths while minimizing redundancy24. This approach ensures theoretical clarity and facilitates robust cross-model comparison with the PLS-SEM results.
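The consistency and coverage measures that underlie the necessity and sufficiency thresholds can be written out directly. The sketch below uses the standard fuzzy-set formulas; the membership scores are invented for illustration and the function names are hypothetical.

```python
def _overlap(x, y):
    """Sum of case-wise minima of condition and outcome memberships."""
    return sum(min(a, b) for a, b in zip(x, y))


def sufficiency_consistency(x, y):
    """Degree to which membership in X is a subset of membership in Y."""
    return _overlap(x, y) / sum(x)


def sufficiency_coverage(x, y):
    """Share of the outcome Y accounted for by X."""
    return _overlap(x, y) / sum(y)


def necessity_consistency(x, y):
    """Degree to which the outcome Y is a subset of the condition X."""
    return _overlap(x, y) / sum(y)


def configuration_membership(*conditions):
    """Fuzzy AND: membership in a configuration is the minimum across conditions."""
    return [min(values) for values in zip(*conditions)]


# Illustrative memberships for one condition and one outcome across three cases.
cond = [0.8, 0.6, 0.2]
outcome = [0.9, 0.5, 0.4]
print(round(sufficiency_consistency(cond, outcome), 4))  # compare with the 0.80 cutoff
print(round(necessity_consistency(cond, outcome), 4))    # compare with the 0.90 threshold
```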
Results
Descriptive analysis
To examine the overall distribution of the measured variables, descriptive statistics including means, standard deviations, skewness, and kurtosis were calculated (see Table 2). The mean scores for all items ranged from 3.5 to 4.2, with standard deviations between 0.6 and 0.9, indicating a moderately high and concentrated level of agreement among respondents. The absolute values of skewness and kurtosis were all below 1.0, suggesting no significant deviations from normality. These results support the assumption of approximate normal distribution, confirming the data’s suitability for subsequent PLS-SEM analysis involving path estimation and mediation testing.
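The skewness and kurtosis screening can be illustrated with moment-based formulas. This sketch uses population moments without the small-sample corrections that statistical packages typically apply, so values will differ slightly from package output; the data and function names are illustrative only.

```python
def central_moment(values, k):
    """k-th central moment using the population (divide-by-n) convention."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** k for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the variance."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def within_bounds(values, limit=1.0):
    """Check the |skewness| < 1 and |kurtosis| < 1 screening rule used above."""
    return abs(skewness(values)) < limit and abs(excess_kurtosis(values)) < limit


responses = [1, 2, 3, 4, 5]  # a flat, symmetric toy distribution
print(skewness(responses), round(excess_kurtosis(responses), 2), within_bounds(responses))
```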
Reliability analysis of constructs
According to Hair et al.88, internal consistency is acceptable when Cronbach’s α and rho_A exceed 0.70, while convergent validity is supported when CR > 0.70 and AVE > 0.50. Multicollinearity concerns are ruled out when all VIFs remain below 5.0. As shown in Table 3, all constructs met these standards: Cronbach’s α ranged from 0.822 to 0.906, rho_A from 0.839 to 0.925, CR from 0.893 to 0.940, and AVE from 0.735 to 0.840. VIF values (1.774–2.950) indicated no collinearity issues.
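For reference, Cronbach’s α can be computed from raw item scores with the standard formula. The responses below are toy data, not the study’s, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one list of respondent scores per item, all of equal length.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent scale totals
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Two toy items answered by four respondents.
toy_items = [[1, 2, 3, 4], [2, 1, 4, 3]]
print(round(cronbach_alpha(toy_items), 3))
```

A value above 0.70, as reported for every construct in Table 3, indicates that the items of a subscale vary together closely enough to be treated as measuring one construct.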
The explanatory power of the model was evaluated using R² values for the four endogenous constructs. Hair et al.88 propose interpretive benchmarks of 0.75 (substantial), 0.50 (moderate), and 0.25 (weak). Against these benchmarks, PA (R² = 0.301) shows weak-to-moderate explanatory power, EBI (R² = 0.219) approaches the weak benchmark, and CRI (R² = 0.150) and CI (R² = 0.084) fall below it, indicating modest yet still meaningful predictive relevance.
Construct validity verification
To assess the suitability of the dataset for factor analysis, we applied the KMO and Bartlett’s test of sphericity (see Table 4). A KMO value above 0.80 is generally considered meritorious92; the obtained value of 0.846 thus indicates meritorious sampling adequacy. Bartlett’s test yielded χ² = 5793.101 with df = 210 and p < 0.001, which is well below the conventional significance threshold of 0.05, rejecting the null hypothesis that the correlation matrix is an identity matrix. Together, these results confirm that the data meet the statistical prerequisites for factor analysis.
To verify the construct validity of the measurement model, principal component analysis with varimax rotation was conducted. This method is commonly employed to extract orthogonal factors and examine whether items cluster around theoretically expected dimensions. According to Hair et al.88, a factor loading of 0.70 or above is considered strong, while 0.60 is still acceptable in exploratory contexts. As shown in Table 5, seven components were extracted, matching the seven predefined constructs of the model.
All measurement items exhibited high loadings on their respective factors (range = 0.765–0.913), with no significant cross-loadings observed, indicating a clearly differentiated structure. Specifically, items for CC loaded on Component 1 (0.880–0.913), PA on Component 2 (0.793–0.852), EBI on Component 3 (0.800–0.847), CRI on Component 4 (0.790–0.872), CI on Component 5 (0.801–0.864), PR on Component 6 (0.765–0.858), and NC on Component 7 (0.769–0.810). The clean structure and absence of cross-loading further confirm the construct discriminant validity and theoretical coherence of the measurement model.
Following standard PLS-SEM criteria, indicator reliability was assessed via standardized outer loadings. All items exceeded the recommended threshold of 0.70, with values ranging from 0.833 to 0.933. The standard errors were narrow, and the bootstrapped t-values fell between 32.026 and 109.477, all significant at p < 0.001 as reported in Table 6. No indicator was found within the 0.40 to 0.70 interval, so no removal was required. The interaction terms CC×NC and CC×PR were modeled as single-indicator constructs in the two-stage procedure; their loadings were fixed at 1.000 and were therefore excluded from inferential testing. Taken together with the CR and AVE values reported above, these results confirm satisfactory indicator reliability and convergent validity for all reflective constructs.
Discriminant validity was further assessed using the HTMT, a method proposed by Henseler et al.93 that offers higher sensitivity in detecting discriminant validity issues compared to the traditional Fornell–Larcker criterion or cross-loading analysis. HTMT estimates the ratio of between-construct item correlations (heterotrait–heteromethod) to within-construct item correlations (monotrait–heteromethod), with values below 0.85 generally considered acceptable. As shown in Table 7, all HTMT values ranged from 0.017 to 0.605, remaining well below the conservative threshold of 0.85, thus confirming adequate discriminant validity among the latent constructs.
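The ratio can be sketched as follows, assuming the original formulation in which the mean between-construct item correlation is divided by the geometric mean of the two mean within-construct item correlations. The correlation matrix and item assignments are hypothetical; software implementations may differ in details such as the use of absolute correlations.

```python
from itertools import combinations
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def htmt(corr, items_a, items_b):
    """HTMT for two constructs, given an item correlation matrix and item indices."""
    # Correlations between items of different constructs.
    hetero = [corr[i][j] for i in items_a for j in items_b]
    # Correlations between distinct items of the same construct.
    mono_a = [corr[i][j] for i, j in combinations(items_a, 2)]
    mono_b = [corr[i][j] for i, j in combinations(items_b, 2)]
    return mean(hetero) / sqrt(mean(mono_a) * mean(mono_b))

# Hypothetical correlations: items 0-1 belong to construct A, items 2-3 to B.
corr = [
    [1.0, 0.8, 0.3, 0.3],
    [0.8, 1.0, 0.3, 0.3],
    [0.3, 0.3, 1.0, 0.7],
    [0.3, 0.3, 0.7, 1.0],
]
ratio = htmt(corr, items_a=[0, 1], items_b=[2, 3])
print(round(ratio, 3))
```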
PLS-SEM results
To assess the structural relationships among the latent variables, PLS-SEM was employed, and path significance was evaluated via bootstrapping with 5000 resamples85. As shown in Table 8, the evaluation involved four statistical indicators: standardized path coefficients (reflecting the magnitude and direction of effects), standard errors (STDEV), t-values, and p values. A path is considered statistically significant if its t-value exceeds 1.96 and its p-value falls below 0.05 under a two-tailed test assumption.
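The decision rule can be made concrete with a short sketch. It assumes a normal approximation for the bootstrap t-statistic, which is reasonable for 5000 resamples but may differ marginally from the distribution used by the software; the function names are illustrative.

```python
import math

def t_value(coefficient, stdev):
    """Bootstrap t-statistic: path coefficient divided by its bootstrap standard error."""
    return abs(coefficient) / stdev

def two_tailed_p(t):
    """Two-tailed p-value under a standard normal approximation (an assumption)."""
    return math.erfc(abs(t) / math.sqrt(2.0))

def is_significant(t, alpha=0.05):
    return two_tailed_p(t) < alpha

# The critical value 1.96 corresponds to p of roughly 0.05.
print(round(two_tailed_p(1.96), 3))
# A t-value of 2.143 corresponds to p of roughly 0.032.
print(round(two_tailed_p(2.143), 3))
```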
All hypothesized paths were statistically supported. Specifically, PR positively influenced CI (β = 0.230, t = 3.998, p < 0.001), and NC significantly enhanced PA (β = 0.275, t = 4.805, p < 0.001), confirming H1 and H2. Regarding moderation, CC strengthened the effect of NC on PA (H6: β = 0.154, t = 3.738, p < 0.001), while it negatively moderated the effect of PR on CI (H7: β = −0.115, t = 2.143, p = 0.032), indicating that the realism–identity link weakens as cultural capital increases.
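One way to read the two moderation coefficients is through conditional (simple-slope) effects. The sketch below assumes that the moderator is standardized, so that each interaction coefficient is the change in the focal path per one standard deviation of CC; this is a common reading of two-stage interaction terms but is an interpretive assumption rather than a result reported in the tables.

```python
def conditional_effect(beta, beta_interaction, moderator_z):
    """Effect of the predictor at a given standardized level of the moderator."""
    return beta + beta_interaction * moderator_z

# NC -> PA (beta = 0.275) with the CC x NC interaction (0.154).
nc_pa_low_cc = conditional_effect(0.275, 0.154, -1.0)
nc_pa_high_cc = conditional_effect(0.275, 0.154, 1.0)

# PR -> CI (beta = 0.230) with the CC x PR interaction (-0.115).
pr_ci_low_cc = conditional_effect(0.230, -0.115, -1.0)
pr_ci_high_cc = conditional_effect(0.230, -0.115, 1.0)

print(round(nc_pa_low_cc, 3), round(nc_pa_high_cc, 3))
print(round(pr_ci_low_cc, 3), round(pr_ci_high_cc, 3))
```

Under this reading, the narrative path is roughly three to four times stronger at high CC than at low CC, whereas the realism path at high CC is about one third of its strength at low CC.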
The mediating roles of CI and PA were also confirmed. CI significantly predicted both PA (β = 0.338, t = 6.346, p < 0.001) and CRI (β = 0.243, t = 4.227, p < 0.001), while PA positively influenced both CRI (β = 0.219, t = 4.032, p < 0.001) and EBI (β = 0.356, t = 6.591, p < 0.001). Additionally, CI had a direct effect on EBI (β = 0.191, t = 3.247, p = 0.001), and CC directly enhanced PA (β = 0.182, t = 4.353, p < 0.001), confirming the parallel pathways proposed in the model.
This study employed bootstrapping to examine mediation effects. As shown in Table 9, all 95% confidence intervals for indirect effects excluded zero, indicating the significance of the mediation paths. Specifically, in H3a, PR significantly influences EBI via CI (indirect effect = 0.045, t = 2.107, p = 0.035). In H3b, PR significantly promotes CRI through CI (indirect effect = 0.056, t = 2.870, p = 0.004). H4a reveals that NC promotes EBI through PA (indirect effect = 0.099, t = 3.461, p = 0.001), and H4b indicates that NC significantly impacts CRI through PA (indirect effect = 0.060, t = 2.972, p = 0.003). For H5a, the chain mediation path PR → CI → PA → EBI is confirmed to be significant (indirect effect = 0.028, t = 3.039, p = 0.002), and H5b demonstrates that the same chain pathway also significantly influences CRI (indirect effect = 0.017, t = 2.510, p = 0.012).
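As a consistency check, each indirect effect should be close to the product of the direct path coefficients along its route. The sketch below uses the coefficients reported in the text; small discrepancies from the reported values are expected, since the reported estimates presumably come from the software output rather than from multiplying rounded coefficients.

```python
from math import prod

# Direct path coefficients as reported in the structural model results.
paths = {
    ("PR", "CI"): 0.230, ("NC", "PA"): 0.275, ("CI", "PA"): 0.338,
    ("CI", "CRI"): 0.243, ("CI", "EBI"): 0.191,
    ("PA", "CRI"): 0.219, ("PA", "EBI"): 0.356,
}

def indirect_effect(*route):
    """Product of the path coefficients along a route such as PR -> CI -> EBI."""
    return prod(paths[(a, b)] for a, b in zip(route, route[1:]))

# Hypothesis label -> (route, indirect effect reported in the text).
reported = {
    "H3a": (("PR", "CI", "EBI"), 0.045),
    "H3b": (("PR", "CI", "CRI"), 0.056),
    "H4a": (("NC", "PA", "EBI"), 0.099),
    "H4b": (("NC", "PA", "CRI"), 0.060),
    "H5a": (("PR", "CI", "PA", "EBI"), 0.028),
    "H5b": (("PR", "CI", "PA", "CRI"), 0.017),
}

for label, (route, value) in reported.items():
    print(label, " -> ".join(route), round(indirect_effect(*route), 3), value)
```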
Figure 3 illustrates the PPB model developed in this study, highlighting the path relationships and corresponding coefficients among key variables, including PR, NC, CI, and PA. This diagram provides a clear visual representation of the interactions and strengths of influence between constructs, shedding light on their critical roles in shaping tourists’ behavioral intentions within the context of digital heritage experiences.
Solid blue arrows indicate positive effects; dashed orange arrows indicate negative effects. Light-blue nodes = independent variables; orange = mediators; green = dependent variables; yellow = moderating variable. Coefficients are standardized; asterisks denote significance (*P < 0.05, **P < 0.01, ***P < 0.001).
fsQCA results
A necessity analysis was conducted to identify whether any single condition is indispensable for achieving the outcomes of interest. As shown in Table 10, the analysis examined both positive outcomes (EBI and CRI) and their negations (~EBI and ~CRI). A condition is considered necessary only if its consistency exceeds the threshold of 0.90, indicating that the outcome almost never occurs without the presence of that condition94.
For EBI, the highest consistency values were observed in CI (0.897), PR (0.893), and PA (0.874), all falling short of the 0.90 benchmark. Similar patterns were found for CRI and its negation, where no condition met the threshold of necessity. These results confirm that none of the antecedents function as necessary conditions on their own.
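The necessity measures can be sketched with the standard fuzzy-set formulas, in which consistency of necessity is the overlap between condition and outcome divided by total outcome membership. The membership values below are hypothetical and serve only to show how a condition can come close to, yet miss, the 0.90 benchmark.

```python
def necessity_consistency(condition, outcome):
    """Share of outcome membership that is covered by the condition."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(outcome)

def necessity_coverage(condition, outcome):
    """Share of condition membership that overlaps with the outcome (relevance)."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    return overlap / sum(condition)

# Hypothetical calibrated memberships for four cases.
condition = [0.9, 0.7, 0.6, 0.3]
outcome = [0.8, 0.9, 0.7, 0.2]

consistency = necessity_consistency(condition, outcome)
coverage = necessity_coverage(condition, outcome)
print(round(consistency, 3), round(coverage, 3), consistency >= 0.90)
```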
Note: ● indicates the presence of a core condition; ● (non-bold) indicates a peripheral condition; ⨂ indicates the absence (negation) of a condition; blank cells indicate an irrelevant or “do not care” status in the configuration.
Table 11 presents six sufficient configurations for achieving high levels of EBI. The overall solution demonstrates high consistency (0.916) and substantial coverage (0.829), both exceeding the recommended thresholds of 0.80 for consistency and 0.45 for coverage71, confirming the reliability of the solution.
Configuration 1 consists of the presence of NC and PA as core conditions, with PR and CI as peripheral conditions. Configuration 2 includes PR and CRI as core conditions, while NC and CI act as peripheral conditions. Configuration 3 features NC and CRI as core conditions, with PR and PA included as peripheral conditions. Configuration 4 is composed of PR and NC as core conditions, and PA and CC as peripheral conditions. Configuration 5 contains no core conditions; PR, PA, CC, and CRI are present as peripheral conditions. Configuration 6 also presents no core conditions; NC, CI, PA, CC, and CRI are all included as peripheral conditions.
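To show how solution-level consistency and coverage are obtained from such configurations, the sketch below takes the conditions listed for Configurations 1 and 4 and applies them to three hypothetical cases. Membership in a configuration is the minimum across its conditions, membership in the solution is the maximum across configurations, and the "~" prefix for negation is a notational choice of this sketch. The numbers are illustrative and do not reproduce Table 11.

```python
def term_membership(case, term):
    """Membership in one configuration: minimum over its (possibly negated) conditions."""
    return min(1 - case[c[1:]] if c.startswith("~") else case[c] for c in term)

def solution_fit(cases, outcome, terms):
    """Return (consistency, coverage) of a solution made of alternative configurations."""
    solution = [max(term_membership(case, t) for t in terms) for case in cases]
    overlap = sum(min(s, y) for s, y in zip(solution, outcome))
    return overlap / sum(solution), overlap / sum(outcome)

# Conditions present in Configurations 1 and 4 as described in the text.
config_1 = ["NC", "PA", "PR", "CI"]
config_4 = ["PR", "NC", "PA", "CC"]

# Hypothetical calibrated memberships for three cases and the EBI outcome.
cases = [
    {"PR": 0.9, "NC": 0.8, "CI": 0.7, "PA": 0.9, "CC": 0.4},
    {"PR": 0.8, "NC": 0.9, "CI": 0.3, "PA": 0.7, "CC": 0.8},
    {"PR": 0.2, "NC": 0.3, "CI": 0.6, "PA": 0.4, "CC": 0.5},
]
ebi = [0.9, 0.7, 0.3]

consistency, coverage = solution_fit(cases, ebi, [config_1, config_4])
print(round(consistency, 3), round(coverage, 3))
```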
Table 12 presents seven configurations that are sufficient for achieving high levels of CRI. The overall solution demonstrates a high consistency of 0.914 and substantial solution coverage of 0.883, indicating that these configurations collectively account for the majority of cases with high CRI.
Configuration 1 consists of the presence of PR as a core condition, with CC and the absence of CI (i.e., ~CI) as peripheral conditions. Configuration 2 includes no core conditions; PR, NC, and CI appear as peripheral conditions. Configuration 3 features PR and NC as core conditions, with PA as a peripheral condition. Configuration 4 includes PA and CC as core conditions, and CI as a peripheral condition. Configuration 5 presents no core conditions; PR appears as a peripheral condition, alongside the absence of NC (i.e., ~NC), PA (i.e., ~PA), and CC (i.e., ~CC). Configuration 6 contains no core conditions; CC is included as a peripheral condition, with the absence of NC (i.e., ~NC), CI (i.e., ~CI), and PA (i.e., ~PA). Configuration 7 features no core conditions; CI and PA are included as peripheral conditions, and both PR and NC are absent (i.e., ~PR and ~NC).
Discussion
This study employed a dual-method strategy, combining PLS-SEM and fsQCA, to examine how DTT relates to tourists’ sustainable behavioral intentions. By analyzing both net effects and configurational sufficiency, we identified dominant pathways and additional bundles that suggest robustness within our sample across heterogeneous subgroups. This section integrates the two sets of results to provide a layered interpretation of the perception–emotion–behavior chain and the moderating role of cultural capital.
The PLS-SEM analysis indicated two primary routes: a cognitive pathway (PR → CI) and an emotional pathway (NC → PA). Both were statistically significant, with PR associated with higher CI (β = 0.230, p < 0.001) and NC associated with higher PA (β = 0.275, p < 0.001). This pattern is consistent with Fan, Jiang, and Deng’s meta-analytic evidence that immersive AR/VR experiences improve appraisals and cognitive alignment in tourism contexts31. The fsQCA results mirror this distinction. For EBI, narrative-driven configurations (Configurations 1 and 3) consistently placed NC as a core condition, paralleling the PLS pattern that NC predicts PA and thereby supports environmental engagement. This result aligns with Chrysanthi et al.’s argument that when narrative structures are spatially embedded within heritage settings, they foster stronger affective bonding and situated emotional immersion35. For CRI, realism-driven and CC-stabilized configurations (Configurations 1, 3, and 6) emphasized PR and CC, with PR appearing as a core condition in Configurations 1 and 3, resonating with the PLS pattern that cognitive alignment through realism is linked to normative respect. Thus, both methods converge on the pattern that emotional immersion tends to anchor environmental intentions, whereas cognitive recognition tends to underpin cultural respect intentions in this dataset.
While the net-effect model highlighted the prominence of NC → PA and PR → CI, fsQCA revealed additional sufficient combinations. For EBI, a realism–cognition pattern (Configurations 2 and 4) showed that PR, combined with ethical attitudes or weaker narrativity, could still be sufficient for behavioral activation. Similarly, multi-factor collaborations (Configurations 5 and 6) indicated that a mix of comparatively weaker conditions (e.g., PR, PA, CC) can jointly cross the sufficiency threshold. For CRI, emotion–cognition collaboration (Configurations 2, 4, and 7) suggested that CI and PA can compensate for limited PR or NC, while CC-stabilized pathways indicated that cultural capital may support respect intentions even when immersive cues are weak. Taken together, these findings are consistent with causal plurality: beyond dominant net-effect routes, redundant and compensatory bundles can also produce the outcome under our calibration. This helps explain why digital heritage interventions may remain effective even when technical realism or narrative intensity is modest—other conditions can fill the gap.
A notable result concerns the directionally opposed moderation of CC. In the PLS-SEM model, CC negatively moderated the PR → CI path (β = −0.115, p = 0.032) but positively moderated the NC → PA path (β = 0.154, p < 0.001). This asymmetry is consistent with museum and heritage learning studies showing that technology-forward presentations can induce technology overload and critical distancing when provenance or interpretation is opaque19. A broader systematic review likewise cautions that preservation technologies may distort intended meanings unless balanced by interpretability and authenticity scaffolds62. The fsQCA solutions reinforce this duality: in the CC-stabilized configurations for CRI (Configurations 1, 4, and 6), the presence of CC was associated with higher CRI even when dominant perceptual or affective cues were limited.
Why CC weakens the realism–cognition route. Psychological accounts suggest that higher CC shifts audiences from heuristic acceptance to accuracy-oriented, analytic processing (dual-process and elaboration-likelihood perspectives). Confronted with photorealistic scenes, high-CC visitors engage epistemic vigilance and authenticity norms: they check provenance, compare rendered details with prior knowledge, and probe gaps between technical realism and historical/cultural authenticity. In this effortful mode, realism functions less as a persuasive shortcut and more as a credibility test, attenuating PR’s ability to translate perception into identity unless paired with verifiability scaffolds (e.g., source annotations, version histories, uncertainty cues). Expertise research also indicates that knowledgeable audiences prioritize evidential grounding over surface spectacle, which can produce aesthetic distance when verification is costly or opaque.
Why CC strengthens the narrative–affect route. Conversely, CC provides denser schema networks that make plots, symbols, and rituals easier to decode. When narrative cues fit these schemata, coherence increases and ambiguity resolves with less effort, facilitating narrative transportation and empathic involvement. In Bourdieu’s account20, cultural capital operates as a meaning “decoder,” enabling efficient parsing of symbols and plots and thereby amplifying narrative effects. Hence, CC does not globally suppress affect; it re-channels affect away from surface realism toward meaning-laden stories, deepening place attachment when narrativity is strong.
The PLS-SEM model also indicated a sequential mediation (PR → CI → PA → behavioral intention; H5a, H5b). Although the indirect effects were modest (H5a = 0.028, p = 0.002; H5b = 0.017, p = 0.012), the sequence is consistent with a layered process whereby technical perception aligns identity, which then consolidates into emotional attachment and, in turn, relates to intention. This mediating role of place attachment echoes evidence that attachment transmits perceptual appraisals to pro-environmental behavior in tourism settings48. This resonates with accounts of multi-layered place attachment and situates technology as a cognitive primer. fsQCA complements this by showing that such layering is not the only viable mechanism: configurations indicating the absence of CI (~CI) were still sufficient for high CRI when PR was strong, and emotion–cognition bundles could substitute for missing realism.
The necessary-condition check (consistency threshold = 0.90) did not identify PR, NC, CI, PA, or CC as necessary for either EBI or CRI under our calibration. This aligns with the sufficiency-focused results: no single must-have factor was detected. Instead, higher levels of respect intentions arose through alternative combinations. Theoretically, this supports an equifinality view in which multiple distinct pathways can reach the same outcome. Practically, it cautions against single-cue optimization (e.g., hyper-realism) and favors orchestration of complementary features that can substitute for or reinforce one another.
Although modeled within the same PPB framework, the mechanisms behind EBI and CRI differ in emphasis. EBI is more closely tied to the narrative–attachment route, where NC fosters PA and affective bonds support environmental action. CRI is more closely linked to the realism–identity route, where PR enhances CI and cognitive alignment supports normative respect. The fsQCA evidence strengthens this differentiation, and the narrative emphasis is consistent with findings on place-based digital storytelling that highlight the affective power of spatially grounded narratives32. In this study, these patterns support retaining the two intentions as distinct outcomes rather than collapsing them into a higher-order construct. In practice, DTT platforms may emphasize immersive storytelling and emotional transport to cultivate EBI, while strengthening verifiable realism and provenance cues to foster CRI. These design implications are also in line with cautions on authenticity management under technological intensification19,62.
These findings elaborate on TAM by proposing a PPB lens that captures both linear net-effect patterns and configurational sufficiency. Unlike TAM’s traditional focus on perceived ease and usefulness, the PPB perspective highlights that realism, narrativity, emotion, and symbolic identity jointly shape sustainable heritage intentions, showing that affective bonding and symbolic coherence can be as influential as utilitarian assessments. The study also refines Bourdieu’s20 notion of CC by demonstrating its dual role: as a decoder along narrative routes, CC strengthens schema-based comprehension and symbolic immersion, while as a filter along realism-to-cognition routes, it raises verification demands and reduces reliance on visual fidelity, making realism less persuasive unless supported by provenance cues. Finally, combining PLS-SEM and fsQCA illustrates methodological complementarity: PLS identifies dominant associations and mediated sequences such as PR → CI → PA → intention, whereas fsQCA reveals equifinal and compensatory configurations such as CC-stabilized pathways that sustain CRI even when affective or perceptual cues are weak. Together, these insights provide a richer account of how heterogeneous visitors interpret and act on digital heritage experiences and underscore the importance of mixed-method validation in contexts where behavioral causality is distributed across multiple routes.
From a practical standpoint, the results suggest that digital heritage platforms should adopt adaptive presentation strategies. At the technical application level, offering a dual-mode system—“expert mode” for high-CC users (featuring in-depth historical content, source annotations) and “story mode” for low-CC users (featuring gamified interactions, simplified storylines)—could enhance both engagement and educational outcomes. Additionally, integrating PPB-based behavior prediction models into smart heritage site management systems can support dynamic resource allocation and tailored communication. The fsQCA findings on “multi-factor collaboration” highlight that even weak individual signals can collectively generate behavioral outcomes. Hence, designing composite interventions that combine modest realism, engaging narrative, and symbolic cues may be more effective than over-reliance on any single element.
This study has several limitations that should be acknowledged. First, the data are based on cross-sectional, self-reported surveys, which restrict causal inference. Accordingly, the directional relationships discussed in this study should be interpreted as theoretically informed associations rather than definitive causal effects. Such designs may also be influenced by social desirability and common method variance. Longitudinal or experimental designs would provide stronger evidence of temporal dynamics and causal mechanisms. Second, the survey was conducted in technologically advanced Lingnan heritage sites. These contexts may amplify the salience of digital twin technology, limiting the transferability of findings to regions with less advanced infrastructure, oral traditions, or different cultural norms. Third, the fsQCA findings are sensitive to calibration and threshold choices. While we adopted a widely used 5–3–1 direct calibration and reported solution consistency and coverage, alternative anchors could alter peripheral configurations, especially those involving weaker conditions. Fourth, the dependent variables captured behavioral intentions rather than actual behaviors. Linking survey data with behavioral traces (e.g., digital interaction logs, donation or volunteering records) would improve ecological validity. Finally, the moderation patterns of cultural capital, though statistically supported, were not tested for measurement invariance across subgroups. Future research using multi-group models, longitudinal validation, or complementary qualitative inquiry could further confirm the stability and interpretive depth of these effects.
The study also opens several directions for future research. Because it focused on Lingnan heritage in urbanizing China, the findings may not generalize to oral cultures or low-tech settings. Cross-cultural studies could assess whether the same configurations hold in contexts with different cultural norms or infrastructural conditions. Additionally, with the rise of Artificial Intelligence Generated Content (AIGC), the boundaries of authorship, authenticity, and interaction are shifting. Future work could investigate how AI-generated narratives alter the perception–emotion–behavior chain, whether algorithmic personalization reshapes place attachment, and how narrative “truth” is negotiated in human–machine collaborations. These questions could further expand the PPB model’s relevance in an age of intelligent cultural mediation.
Data availability
The data are available from the corresponding author upon reasonable request.
References
Dang, X., Liu, W., Hong, Q., Wang, Y. & Chen, X. Digital twin applications on cultural world heritage sites in China: a state-of-the-art overview. J. Cult. Herit. 64, 228–243 (2023).
Lucchi, E. Digital twins and cultural heritage conservation: challenges and prospects. Autom. Constr. 157, 105073 (2023).
Mystakidis, S. Metaverse. Encyclopedia 2, 486–497 (2022).
Ghani, I., Rafi, A. & Woods, P. The effect of immersion towards place presence in virtual heritage environments. Pers. Ubiquitous Comput. 24, 861–872 (2019).
Florido-Benítez, L. The use of digital twins to address smart tourist destinations’ future challenges. Platforms 2, 234–254 (2024).
Zhao, L. et al. How psychological and cultural factors drive donation behavior in cultural heritage preservation: construction of the cultural generativity behavior mode. npj Herit. Sci. 13, 28 (2025).
Tao, F. et al. Digital twin-driven product design framework. Int. J. Prod. Res. 57, 3935–3953 (2019).
Vuoto, A. & Funari, M. F. Shaping digital twin concept for built cultural heritage conservation: a systematic literature review. Int. J. Archit. Herit. https://doi.org/10.1080/15583058.2023.2258084 (2023).
Marto, A., Gonçalves, G., Bessa, M. & Vasconcelos, L. ARAM: a technology acceptance model to ascertain the adoption of augmented reality in cultural heritage. J. Imaging 9, 140 (2023).
Mogaji, E., Viglia, G., Madzokere, T. & Nguyen, N. Is it the end of the technology acceptance model in the era of artificial intelligence? A bibliometric and content analysis. Int. J. Contemp. Hosp. Manag. https://doi.org/10.1108/IJCHM-08-2023-1271 (2024).
Hanji, S. V., Hungund, S., Hanji, S. S., Desai, S. & Tapashetti, R. B. Augmented reality immersion in cultural heritage sites: analyzing adoption intentions. In Proc. Transfer, Diffusion and Adoption of Next-Generation Digital Technologies (TDIT 2023) (eds Sharma, S. K., Dwivedi, Y. K., Metri, B., Lal, B. & Elbanna, A.) (Springer, 2024).
Almeida, D., Brito e Abreu, F. & Boavida-Portugal, I. Digital twins in tourism: a systematic literature review. Tour. Manag. https://doi.org/10.1016/j.tourman.2025.106231 (2025).
Zou, C., Rhee, S.-Y., He, L., Chen, D. & Yang, X. Sounds of history: a digital twin approach to musical heritage preservation in virtual museums. Electronics 13, 2338 (2024).
Luther, W., Baloian, N., Biella, D. & Sacher, D. Digital twins and enabling technologies in museums and cultural heritage: an overview. Sensors 23, 1583 (2023).
Ashworth, G. J., Graham, B. & Tunbridge, J. E. Pluralising Pasts: Heritage, Identity and Place (Routledge, 2007).
Fu, Y. & Luo, J. M. An empirical study on cultural identity measurement and its influence mechanism among heritage tourists. Front. Psychol. 13, 1032672 (2022).
Williams, D. R. & Vaske, J. J. The measurement of place attachment: validity and generalizability of a psychometric approach. For. Sci. 49, 830–840 (2003).
Oleksy, T., Lassota, I., Wnuk, A. & Wcześniak, R. Virtual changes in real places: understanding the role of place attachment in augmented reality adoption. J. Environ. Psychol. https://doi.org/10.1016/j.jenvp.2024.102386 (2024).
Leow, F. T. & Ch’ng, E. Analysing narrative engagement with immersive environments: designing audience-centric experiences for cultural heritage learning. Museum Manag. Curatorsh. 36, 342–361 (2021).
Bourdieu, P. The forms of capital. In Handbook of Theory and Research for the Sociology of Education (ed. Richardson, J. G.) 241–258 (Greenwood, 1986).
Fang, L., Liu, Z. & Jin, C. How does the integration of cultural tourism industry affect rural revitalization? The mediating effect of new urbanization. Sustainability 15, 10824 (2023).
Ramtohul, A. & Khedo, K. K. Augmented reality systems in the cultural heritage domains: a systematic review. Digit. Appl. Archaeol. Cult. Herit. 32, e00317 (2024).
Seyfi, S., Rasoolimanesh, S. M., Vafaei-Zadeh, A. & Esfandiar, K. Can tourist engagement enhance tourist behavioural intentions? A combination of PLS-SEM and fsQCA approaches. Tour. Recreat. Res. 49, 63–74 (2024).
Pappas, I. O. & Woodside, A. G. Fuzzy-set qualitative comparative analysis (fsQCA): guidelines for research practice in information systems and marketing. Int. J. Inf. Manag. 58, 102310 (2021).
Shabani, A. et al. 3D simulation models for developing digital twins of heritage structures: challenges and strategies. Procedia Struct. Integr. 37, 314–320 (2022).
Zhao, J., Guo, L. & Li, Y. Application of digital twin combined with artificial intelligence and 5G technology in the art design of digital museums. Wireless Commun. Mob. Comput. 2022, 8214514 (2022).
Alcindor, M., Jackson, D. & Alcindor-Huelva, P. Heritage places as the settings for virtual playgrounds: perceived realism in videogames as a tool for the re-localisation of physical places. Int. J. Herit. Stud. 28, 865–883 (2022).
Hu, Q. et al. Point cloud enhancement optimization and high-fidelity texture reconstruction methods for air material via fusion of 3D scanning and neural rendering. Expert Syst. Appl. 242, 122736 (2024).
Cummings, J. J. & Bailenson, J. N. How immersive is enough? A meta-analysis of the effect of immersive technology on user presence. Media Psychol. 19, 272–309 (2016).
Jiang, Y. et al. Digital twin-enabled real-time synchronization for planning, scheduling, and execution in precast on-site assembly. Autom. Constr. 141, 104397 (2022).
Fan, X., Jiang, X. & Deng, N. Immersive technology: a meta-analysis of augmented/virtual reality applications and their impact on tourism experience. Tour. Manag. 91, 104534 (2022).
Li, M., Sun, X., Zhu, Y. & Qiu, H. Real in virtual: the influence mechanism of virtual reality on tourists’ perceptions of presence and authenticity in museum tourism. Int. J. Contemp. Hosp. Manag. 36, 3651–3673 (2024).
Sweller, J. Cognitive load theory. In Psychology of Learning and Motivation Vol. 55, 37–76 (Academic Press, 2011).
Albus, P., Vogt, A. & Seufert, T. Signaling in virtual reality influences learning outcome and cognitive load. Comput. Educ. 166, 104154 (2021).
Chrysanthi, A., Katifori, A., Vayanou, M. & Antoniou, A. Place-based digital storytelling: the interplay between narrative forms and the cultural heritage space. In Proc. International Conference on Emerging Technologies and the Digital Transformation of Museums and Heritage Sites, 127–138 (Springer International Publishing, 2021).
Podara, A., Giomelakis, D., Nicolaou, C., Matsiola, M. & Kotsakis, R. Digital storytelling in cultural heritage: audience engagement in the interactive documentary “New Life”. Sustainability 13, 1193 (2021).
Selmanovic, E. et al. VR video storytelling for intangible cultural heritage preservation. In Proc. Eurographics Workshop on Graphics and Cultural Heritage https://doi.org/10.2312/gch.20181341 (The Eurographics Association, 2018).
Bevilacqua, M. G., Russo, M., Giordano, A. & Spallone, R. 3D reconstruction, digital twinning, and virtual reality: architectural heritage applications. In Proc. IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), 92–96 (IEEE, 2022).
Gökcü Baz, M., Başoğlan Avşar, Ö, Özbek, Ç, Baz, M. & Söğüt, B. Understanding the role of local stories in living archaeological heritage sites: the case of Stratonikeia. Int. J. Herit. Stud. 30, 937–953 (2024).
Chen, X., Lee, T. J. & Hyun, S. S. Visitors’ self-expansion and perceived brand authenticity in a cultural heritage tourism destination. J. Vacat. Mark. https://doi.org/10.1177/13567667241309122 (2025).
Kartiko, I., Kavakli, M. & Cheng, K. Learning science in a virtual reality application: the impacts of animated-virtual actors’ visual complexity. Comput. Educ. 55, 881–891 (2010).
Zhang, G., Chen, X., Law, R. & Zhang, M. Sustainability of heritage tourism: a structural perspective from cultural identity and consumption intention. Sustainability 12, 9199 (2020).
Liu, J., Li, A., Zhu, Y. & Zhang, L. How do external environmental stimuli and internal psychological cultural identity affect tourists’ behavioral intentions in public cultural spaces? BMC Psychol. 12, 596 (2024).
Camuñas-García, D., Cáceres-Reche, M. P. & Cambil-Hernández, M. D. L. E. Maximizing engagement with cultural heritage through video games. Sustainability 15, 2350 (2023).
Bozzelli, G. et al. An integrated VR/AR framework for user-centric interactive experience of cultural heritage: the ArkaeVision project. Digit. Appl. Archaeol. Cult. Herit. 15, e00124 (2020).
Schaper, M. M., Santos, M., Malinverni, L., Berro, J. Z. & Parés, N. Learning about the past through situatedness, embodied exploration and digital augmentation of cultural heritage sites. Int. J. Hum. Comput. Stud. 114, 36–50 (2018).
Ramkissoon, H., Weiler, B. & Smith, L. D. G. Place attachment, place satisfaction and pro-environmental behaviour: a comparative assessment of multiple regression and structural equation modelling. J. Policy Res. Tour. Leis. Events 5, 215–232 (2013).
Wan, J., Zhang, J., Lu, S. & Li, L. Relationship between specific attributes of place, tourists’ place attachment and pro-environment behavioral intentions in Jiuzhaigou. Progress Geogr. 33, 411–421 (2014).
Pantelidis, C., tom Dieck, M. C., Jung, T. H., Smith, P. & Miller, A. Place attachment theory and virtual reality: the case of a rural tourism destination. Int. J. Contemp. Hosp. Manag. 36, 3704–3727 (2024).
Škola, F. et al. Virtual reality with 360-video storytelling in cultural heritage: study of presence, engagement, and immersion. Sensors 20, 5851 (2020).
Zhou, B., Wang, L., Huang, S. S. & Xiong, Q. Impact of perceived environmental restorativeness on tourists’ pro-environmental behavior: examining the mediation of place attachment and the moderation of ecocentrism. J. Hosp. Tour. Manag. 56, 398–409 (2023).
Peng, X., Liu, M., Hu, Q. & He, X. A multiscale perspective on place attachment and pro-environmental behavior in hotel spaces. J. Hosp. Tour. Manag. 55, 435–447 (2023).
Uslu, F. et al. The perception of cultural authenticity, destination attachment, and support for cultural heritage tourism development by local people: the moderator role of cultural sustainability. Sustainability 15, 15794 (2023).
Aggarwal, A., Lim, W. M., Dandotiya, R. & Kukreja, V. Dark tourism through the lens of attachment theory and domestic tourists. Int. J. Tour. Res. 26, e2609 (2024).
Yang, Y., Wang, Z., Shen, H. & Jiang, N. The impact of emotional experience on tourists’ cultural identity and behavior in the cultural heritage tourism context: an empirical study on Dunhuang Mogao Grottoes. Sustainability 15, 8823 (2023).
Boffi, M., Rainisio, N. & Inghilleri, P. Nurturing cultural heritages and place attachment through street art—a longitudinal psycho-social analysis of a neighborhood renewal process. Sustainability 15, 10437 (2023).
Gogishvili, D. & Müller, M. Culture goes East: mapping the shifting geographies of urban cultural capital through major cultural buildings. Urban Stud. https://doi.org/10.1177/00420980241289846 (2024).
Liu, Z., Zhang, M. & Osmani, M. Building information modelling (BIM) driven sustainable cultural heritage tourism. Buildings 13, 1925 (2023).
Sun, D., Wong, I. A., Huang, G. I., Kim, J.-H. & Liu, M. T. From savoring past trips to craving future journeys: the role of destination cultural capital and enjoyable reminiscence. J. Travel Res. 63, 1913–1932 (2024).
Wang, J. & Wu, Y. Income inequality, cultural capital, and high school students’ academic achievement in OECD countries: a moderated mediation analysis. Br. J. Sociol. 74, 148–172 (2023).
Bates, G. & Connolly, S. Exploring teachers’ views of cultural capital in English schools. Br. Educ. Res. J. 50, 1350–1366 (2024).
Mendoza, M. A., De La Hoz Franco, E. & Gómez Gómez, J. E. Technologies for the preservation of cultural heritage: a systematic review of the literature. Sustainability 15, 1059 (2023).
Innocente, C., Ulrich, L., Moos, S. & Vezzetti, E. A framework study on the use of immersive XR technologies in the cultural heritage domain. J. Cult. Herit. 62, 268–283 (2023).
Davis, F. D. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 13, 319–340 (1989).
Venkatesh, V. & Bala, H. Technology acceptance model 3 and a research agenda on interventions. Decis. Sci. 39, 273–315 (2008).
King, W. R. & He, J. A meta-analysis of the technology acceptance model. Inf. Manag. 43, 740–755 (2006).
Jia, W., Li, H., Jiang, M. & Wu, L. Melting the psychological boundary: how interactive and sensory affordance influence users’ adoption of digital heritage service. Sustainability 15, 4117 (2023).
Rasoolimanesh, S. M. & Lu, S. Enhancing emotional responses of tourists in cultural heritage tourism: the case of Pingyao, China. J. Herit. Tour. 19, 91–110 (2024).
Yung, R., Khoo-Lattimore, C. & Potter, L. E. Virtual reality and tourism marketing: conceptualizing a framework on presence, emotion, and intention. Curr. Issues Tour. 24, 1505–1525 (2021).
Fiss, P. C. Building better causal theories: a fuzzy set approach to typologies in organization research. Acad. Manag. J. 54, 393–420 (2011).
Ragin, C. C. Redesigning Social Inquiry: Fuzzy Sets and Beyond (University of Chicago Press, 2009).
ICOMOS International Scientific Committee. ICOMOS Charter on Cultural Heritage Classification and Typology (International Council on Monuments and Sites, 2021).
Vecco, M. A definition of cultural heritage: from the tangible to the intangible. J. Cult. Herit. 11, 321–324 (2010).
Ren, Z., Tang, Z. & Li, B. Construction and geo-distribution of the architectural characteristics of clan ancestral halls along the Yile–Xijing historical trail in Lechang. Buildings 14, 1550 (2024).
Hou, H., Lai, J. H., Wu, H. & Wang, T. Digital twin application in heritage facilities management: systematic literature review and future development directions. Eng. Constr. Archit. Manag. 31, 3193–3221 (2024).
Li, Y. et al. 3D LiDAR and multi-technology collaboration for preservation of built heritage in China: a review. Int. J. Appl. Earth Obs. Geoinf. 116, 103156 (2023).
Wagler, A. & Hanus, M. D. Comparing virtual reality tourism to real-life experience: effects of presence and engagement on attitude and enjoyment. Commun. Res. Rep. 35, 456–464 (2018).
Reese, E. et al. Coherence of personal narratives across the lifespan: a multidimensional model and coding method. J. Cogn. Dev. 12, 424–462 (2011).
Mulholland, P., Wolff, A., Zdrahal, Z. & Collins, T. Blending coherence and control in the construction of interactive educational narratives from digital resources. Interact. Learn. Environ. 16, 283–296 (2008).
Su, L., Hsu, M. K. & Boostrom, R. E. Jr. From recreation to responsibility: increasing environmentally responsible behavior in tourism. J. Bus. Res. 109, 557–573 (2020).
Juvan, E. & Dolnicar, S. Measuring environmentally sustainable tourist behaviour. Ann. Tour. Res. 59, 30–44 (2016).
Lee, C. K. et al. Role of cultural worldview in predicting heritage tourists’ behavioural intention. Leis. Stud. 40, 645–657 (2021).
Hyyppä, M. T. How does cultural participation contribute to social capital and well-being? In Healthy Ties: Social Capital, Population Health and Survival (ed. Hyyppä, M. T.) 43–53 (Springer, 2010).
Garner, R. Insecure positions, heteronomous autonomy and tourism-cultural capital: a Bourdieusian reading of tour guides on BBC Worldwide’s Doctor Who Experience Walking Tour. Tour. Stud. 17, 426–442 (2017).
Hair, J. & Alamer, A. Partial least squares structural equation modeling (PLS-SEM) in second language and education research: guidelines using an applied example. Res. Methods Appl. Linguist. 1, 100027 (2022).
Meade, A. W. & Craig, S. B. Identifying careless responses in survey data. Psychol. Methods 17, 437–455 (2012).
Tabachnick, B. G. & Fidell, L. S. Using Multivariate Statistics 6th edn (Pearson Education, 2013).
Hair, J. F., Hult, G. T. M., Ringle, C. M. & Sarstedt, M. A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM) 3rd edn (SAGE Publications, 2021).
Chin, W. W. The partial least squares approach to structural equation modeling. In Modern Methods for Business Research (ed. Marcoulides, G. A.) 295–336 (Lawrence Erlbaum Associates, 1998).
Preacher, K. J. & Hayes, A. F. Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behav. Res. Methods 40, 879–891 (2008).
Ordanini, A., Parasuraman, A. & Rubera, G. When the recipe is more important than the ingredients: a qualitative comparative analysis of service innovation configurations. J. Serv. Res. 17, 134–149 (2014).
Kaiser, H. F. An index of factorial simplicity. Psychometrika 39, 31–36 (1974).
Henseler, J., Ringle, C. M. & Sarstedt, M. A new criterion for assessing discriminant validity in variance-based structural equation modeling. J. Acad. Mark. Sci. 43, 115–135 (2015).
Schneider, C. Q. & Wagemann, C. Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative Comparative Analysis (Cambridge University Press, 2012).
Acknowledgements
This research was supported by the 2024 General Project of the Guangdong Provincial Philosophy and Social Science Planning, titled “Innovative Design Research on the Digital Revitalization of Lingnan Traditional Chinese Medicine Classics” (Grant No. GD24CYS17), hosted at Guangdong University of Finance and Economics and led by the corresponding author, Wei Bi.
Author information
Contributions
Z.D. drafted the manuscript, assisted with data analysis, and contributed to visualization. Q.D. participated in field investigation, data curation, and visual presentation. B.L. contributed to data analysis and literature review. W.B., as the corresponding author, conceived and supervised the study, secured the funding, and revised the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Deng, Z., Du, Q., Lei, B. et al. Unpacking digital heritage experiences using PLS SEM and fsQCA through a perception-place behavior model. npj Herit. Sci. 14, 65 (2026). https://doi.org/10.1038/s40494-026-02345-6
DOI: https://doi.org/10.1038/s40494-026-02345-6


