Abstract
The COVID-19 pandemic has had a profound impact on almost all aspects of society. Cultural heritage sites, which are deeply intertwined with the tourism industry, are no exception. The direct impacts of the virus on the population, as well as indirect impacts, such as government-mandated measures including social distancing, face coverings, and frequent temporary closures of sites, have greatly affected visitor experiences at heritage sites. To quantitatively evaluate the impact of these measures from the perspective of visitors, we collected 1.4 million visitor reviews from the Google Maps platform for 775 heritage sites. We analyzed visiting rates using the number of online reviews as a proxy and adopted state-of-the-art natural language processing techniques to more deeply understand visitor perception of preventive measures put in place to control the spread of COVID-19. Our findings reveal that even if visitor focus on COVID-19 has significantly decreased, there may still be a notable difference between the actual and expected numbers of reviews, suggesting that visitor involvement (e.g., number of visitors) for cultural heritage sites, especially urban indoor sites, needs more time to recover. Our findings further show that most comments by visitors to sites were associated with negative sentiment toward restricted access, but recognized the necessity of other safeguarding measures (e.g., social distancing and the requirement for face coverings). Moreover, visitors exhibited negative sentiment towards staff or other visitors who did not adhere to these measures. We make specific recommendations for heritage sites to adapt to the COVID-19 pandemic and a more general observation that the method used to gather information from online reviews in this paper will be effective in measuring visitor perceptions towards specific aspects of heritage sites, particularly in capturing changes in perception before and after unexpected or disruptive events at heritage sites.
Introduction
Cultural heritage-based tourism is regarded as one of the most significant and fastest growing areas within the industry (World Tourism Organization, 2015). The appearance of the coronavirus in 2019 (COVID-19) has caused global health and economic crises. Among the industries affected by the pandemic in terms of economic losses, the tourism industry has been one of the hardest hit (Abbas et al., 2021; Gössling et al., 2020). The pandemic has substantially reshaped the global tourism landscape, transforming both the perceptions of risk and the behaviors of tourists and industry stakeholders alike (Cooper et al., 2022; Park et al., 2020; Rather, 2021; Zheng et al., 2020). From the perspective of tourists, for example, health safety concerns and inconvenience during travel to and from sites, both triggered by the pandemic, have created significant hesitation regarding travel decisions (Chan, 2021); concurrently, the pandemic has shifted tourism industry stakeholders from viewing global tourists as ‘ambassadors’ to viewing them as ‘undesired guests’, as the primary focus has become implementing stringent preventive measures to minimize the transmission of the virus within destinations (Cairns and Clemente, 2023; Çakmak et al., 2023; Erayman and Çağlar, 2022; Korstanje and George, 2021). Such changes in perceptions and behaviors have likely had a substantial impact on the visiting experience of cultural heritage-based tourism, and thus warrant further investigation.
Despite the COVID-19 pandemic being a global health crisis, geographical factors have contributed to variable impacts of COVID-19 (Read, 2022; Sigala, 2020). While the pandemic has had significant negative impacts on a global scale, which have been documented in several countries including India, Australia, and Malaysia (Flew et al., 2021; Foo et al., 2021; Jaipuria et al., 2021; Kumar et al., 2020; Sah et al., 2020), it should be noted that the impact of COVID-19 on the tourism industry varied between countries, particularly dependent on their level of economic development. Specifically, while developed economies have experienced significant losses in the tourism industry due to the pandemic, developing economies may have faced even more devastating consequences, regardless of the scale of loss of life due to COVID-19. This paradoxical phenomenon is due to the fact that some developing economies have become overly dependent on the tourism industry, making them more vulnerable to the negative secondary consequences of the pandemic, such as the closure of tourism-related businesses, job losses and interrupted transport and supply chains (Barbosa et al., 2021; Brunn et al., 2022; Ranasinghe et al., 2021).
The impact of the pandemic on heritage organizations and tourism has varied with geography and the characteristics of individual sites. For example, Vaishar and Šťastná (2022) argued that economic uncertainty and travel restrictions that limit international tourism during COVID-19 have caused severe negative impacts on urban tourism in the Czech Republic; however rural tourism that is predominantly focused on domestic tourists was less affected. Also, outdoor sites, such as open-air museums and parks and gardens, were more favorable to visitors due to perceived lower associated risks of COVID-19 transmission and thus likely to be more resilient during pandemic measures compared to indoor sites (Landry et al., 2021). In the UK, the closure of heritage sites during the peak season (i.e. Easter until September) where organizations earn up to 70% of their annual turnover, resulted in a difficult situation (UK Parliament, 2020) where 82% of heritage organizations reported high or moderate risk to their organization’s long-term viability (UK Parliament, 2020b). Concurrently, some researchers and professionals believe that the pandemic was an opportunity to reshape and reinvent the sector and its practices (Smith, 2021). During COVID-19 there was also a visible surge in digital tools for audience engagement (Samaroudi et al., 2020). Heritage organizations had to adopt a proactive approach where various stages of the pandemic required a distinct approach (Cui et al., 2023). This illustrates some of the management challenges the cultural organizations faced due to the pandemic.
Three years on from the initial outbreak of the COVID-19 pandemic, although the long-term impact of COVID-19 is still unclear, this crisis is reaching a more mature, yet still threatening stage (Assaf et al., 2022). For instance, a UNESCO (2020) report estimated that about 10% of museums globally may never reopen. More recently, an Art Fund report (2022) found that income and visitation levels have increased but remain at 68% and 61% of pre-pandemic levels, respectively. Against this background, strategies to speed up the recovery of the tourism industry and prepare for the future should be proposed and adopted (Assaf et al., 2022; Korstanje et al., 2022). While the impact of the pandemic on the tourism industry in the UK may not be as devastating to the national economy as the proportional impact in many developing economies, the tourism industry in the UK was severely affected by the pandemic (Office for National Statistics, 2021). Thus, it is still an important issue that requires attention since heritage tourism is a vital component of the UK economy that provides significant economic value and employment opportunities (Oxford Economics, 2016). This is highlighted by the statistics that, in 2019 (pre-COVID), the heritage sector in the UK provided a total gross value added of £36.6 billion and over 550,000 jobs (Historic England, 2020). Moreover, the abundance of online social media data pertaining to UK heritage sites generated during the pandemic provides a valuable opportunity to examine the broader social media responses towards disruptive events like COVID-19. The COVID-19 pandemic is therefore a suitable case study for demonstrating the effectiveness and limitations of the methodology proposed in this study, which employs advanced machine learning models to effectively extract information from noisy social media data.
In this paper, we first quantify the impact of COVID-19 in terms of the number of visitors to heritage sites, using the number of comments from an online review platform as the proxy. Secondly, we investigate how the pandemic has affected different dimensions of the visitor experience. To that end, we apply sentiment analysis to measure visitor perceptions towards the impact of COVID-19. The negative impact caused by COVID-19 may arise from two causes: the mental and physical health consequences of being affected by the disease, and the inconvenience during travel caused by policies imposed by local governments to stop the spread of the virus. In this paper, we focus on the latter, namely the changes in experience resulting from non-pharmaceutical preventive measures related to COVID-19.
Our study aims to achieve two main goals: (1) To analyze impacts related to COVID-19 on visitor numbers and experiences at heritage sites in the UK. (2) To exemplify the efficacy of sophisticated machine learning techniques in interpreting visitor sentiments based on their social media comments, serving as an illuminating framework for potential future application in socio-cultural research. To accomplish this, we deploy a blend of deep learning-based weakly supervised natural language processing (NLP), zero-shot learning NLP, and computer vision (CV) models. We apply these tools to an expansive dataset of social media content, containing over 1 million review comments for over 750 heritage sites in the UK, to conduct an empirical evaluation of the COVID-19 pandemic’s effect on UK heritage-related tourism.
To our knowledge, this empirical study with state-of-the-art machine learning models on such an extensive dataset is both novel and unique. However, it is important to acknowledge the inherent challenges in this approach, and thus potential limitations:
1. Language ambiguity: complex, polysemous natural language can cause ambiguity in interpretation.
2. Uneven topic coverage: unstructured data, like user reviews, may show disproportionate topic representation.
3. Passively collected data: mining user reviews limits purposeful questioning compared to structured surveys.
More detailed discussions regarding the limitations can be found in the section “Limitations”. Despite these, this exploration yields valuable insights to facilitate the recovery of the relevant sectors from the pandemic’s repercussions, and provides evidence-based recommendations for effective site management strategies amid the ‘living with COVID-19’ policy, shedding light on the use of such methods to conduct relevant social studies in future research.
Background
Coronavirus disease (or COVID-19 for short) is an infectious disease caused by the SARS-CoV-2 virus. COVID-19 was declared a pandemic by the World Health Organization (WHO) on March 11, 2020, and has since spread globally. COVID-19 was confirmed to be spreading in the UK by the end of January 2020 with the first confirmed deaths in March of the same year. Since then, it has resulted in more than 22 million confirmed cases in the UK and was associated with 211,684 deaths by the end of 2022 (UK Government, 2022).
To prevent the spread of the virus, the UK had three national lockdowns, in which people were required to “stay at home” as much as possible. Most public places were therefore affected by these lockdowns, although to varying degrees (i.e., closed or restricted to public visiting). The first national lockdown came into effect on 26 March 2020 and was gradually relaxed in June 2020. Measures, including social distancing and mandatory face coverings, were introduced after the first national lockdown. The second national lockdown was enforced for 4 weeks between 5 November 2020 and 2 December 2020, aiming to contain the outbreak and flatten the curve of cases during winter. The third national lockdown started on 6 January 2021. The exit from the third lockdown was gradual and multi-staged: on 8 March 2021 recreation in an outdoor public space was permitted between two people; on 29 March 2021 outdoor gatherings of either six people or two households were permitted; and on 12 April 2021, public buildings and outdoor venues, including museums, libraries, zoos, and theme parks, were allowed to reopen. Finally, on 19 July 2021, the third national lockdown ended. However, several measures, such as face covering requirements, were kept until February 2022 when the government’s plan for “living with COVID-19” was issued (UK Government, 2022b).
Alongside lockdown measures, there were several partial or full closures of heritage sites, which in many cases have continued following the end of lockdown measures. According to one survey (VisitEngland, 2021), over a third of attractions were unable to open for their typical visitor seasons once national lockdown measures were lifted. This was predominantly driven by continued restrictions due to regional lockdowns and difficulties in meeting the requirements to open safely during the pandemic, although profitability and a lack of staff volunteers were also important factors.
As previously mentioned, the UK government has adopted several measures since the outbreak of COVID-19 to curb the spread of the virus in public places during and after national lockdowns. Guidance for preventing the spread of the virus in public places was issued by the UK Government (2021), in which three measures were specifically mentioned:
1. Social distancing: encouraged the public to keep apart (2 m or over 1 m) from people they do not live with in public spaces.
2. Face coverings: encouraged the public to wear a face covering in crowded and enclosed spaces where they may come into contact with other people they do not normally meet.
3. Cleaning and hygiene: requested the owners and operators to implement cleaning protocols to limit coronavirus transmission in public places with a particular focus on touch points. In addition to the suggested cleaning protocols and posters, cleaning equipment such as hand gel was usually provided.
These measures have been implemented for most of the coronavirus pandemic before they were withdrawn in February 2022. As the operating models of many heritage sites depend on visitors, these measures likely had a profound impact on visits during the pandemic.
In addition to these general measures, there were also short-lived policies such as requiring an NHS COVID-19 passport, or proof of a negative test result within a certain recent timeframe, for entrance to public spaces. As these policies were in force only for short periods (especially during spikes in COVID-19 cases), they are unlikely to account for a significant portion of visitors’ opinions regarding COVID-19 during their visit. Therefore, in measuring visitors’ sentiment towards the impact of COVID-19 on their travel experiences, we selected social distancing, face coverings, hygiene (cleanliness, provided equipment such as hand sanitizer) and restrictions (restricted or closed areas) in the heritage sites.
Many studies have leveraged social media data to analyze the impact of COVID-19 on different aspects of society. For example, Ginzarly and Srour (2022) examined discourse emerging from cultural heritage content shared online during the COVID-19 pandemic. Through analyzing hashtag data collected from Instagram using latent Dirichlet allocation (LDA) and sentiment analysis package (syuzhet), the paper concluded that positive topics of social values, including safety, inclusion, participation and resilience, and diverse cultural expressions were the most shared by the users during the COVID-19 pandemic. Their results also showed that users approach the virtual space as a substitute for the loss of physical access through terms like ‘home’, ‘virtual’, ‘online’, ‘travel tomorrow’, and ‘museums from home’.
In examining the impact of COVID-19 on Singapore, Ridhwan and Hargreaves (2021) used VADER, a Lexicon-based method, for sentiment analysis and a recurrent neural network model pre-trained on the emotion classification dataset for emotion analysis. The study used data collected from Twitter between 1 February 2020 and 31 August 2020. The results of the study reveal that nearly half (45%) of all tweets expressed joy, while 30% expressed fear. The topic of staying at home pertaining to COVID-19 was the dominant topic in the tweets. Public health topics mainly expressed positive sentiment, including topics of social distancing, the encouragement to stay at home and stay safe, as well as the wearing of face coverings, while travel and border restrictions caused by the pandemic situation were dominated by negative sentiments.
Sanders et al. (2021) used tweets collected from March to July 2020 to illustrate public attitudes toward the use of face coverings during the COVID-19 pandemic. The study performed clustering by applying a k-means algorithm in the embedded space to find semantically distinguished topics. Simultaneously, each tweet is labeled by a sentiment score using VADER. They also used a text summarization model to process each cluster (and its subclusters) using tweets at the centroid of each cluster and conducted a qualitative analysis based on the model’s outputs. The study found a consistently polarized Twitter discourse surrounding face coverings and an accompanying overall increase in negative sentimentality.
Lyu et al. (2021) also utilized Twitter data from 11 March 2020 to 31 January 2021 to investigate topics, sentiment and emotion expressed in COVID-19 vaccine-related content. LDA, syuzhet and a lexicon-based method are used to perform topic modeling, sentiment analysis and emotion analysis, respectively. The study found the topic about vaccination progress around the world was mostly discussed and was often driven by the key milestone steps in vaccination. The sentiment of vaccination was increasingly positive in general, and emotion analysis further showed that trust was the most predominant emotion expressed in tweets regarding vaccination.
These studies and the broader corpus of the literature suggest that visitor perception towards COVID-19 and associated measures can be understood by applying NLP techniques to data collected from social media. Methodologically, this paper follows this framework by adopting NLP methods to decode semantic information expressed in user comments towards COVID-19 measures implemented at UK heritage sites, and at the same time, advances the previous studies by applying state-of-the-art weakly supervised and zero-shot learning language models which are more capable of capturing semantic information from sentence contexts.
Data and method
This section will start by introducing the data included in this study. Afterward, we introduce the method used to categorize heritage sites into urban and rural and indoor and outdoor. This is followed by an introduction to the NLP models used to detect comments related to COVID-19. Then, we introduce how visitor involvement was measured, especially in terms of passenger flow, and based on this, how to estimate the degree of tourism recovery using the number of online reviews as the proxy. Lastly, sentiment analysis methods will be introduced to dig deeper into visitors’ sentiments towards different preventive measures used by heritage sites.
Data
This research aims to reveal, at the national level, the impact of COVID-19 measures on visitor numbers and visitor experience at UK heritage sites. Thus, we intended to include a large and diverse cohort of sites while ensuring that sufficient data was available to undertake robust analysis. Accordingly, the following inclusion criteria were set: (1) the site has received more than 100 reviews, to reduce the variance in estimation; (2) the site is primarily associated with cultural heritage and tourism; (3) places sharing a very similar role to heritage sites in accounting for people’s leisure time (e.g., national park or London Eye). To find relevant sites, we included properties managed by large heritage organizations, including English Heritage, the National Trust, the National Trust for Scotland and Historic Environment Scotland. We also included relevant sites from compilation lists, including the most visited attractions in England in 2019 (published by VisitBritain.org), a list of ‘A History of England in 100 Places’ published by Historic England, and the most visited museums in membership with the Association of Leading Visitor Attractions (ALVA) in 2019. We collected review data for 775 sites from Google Maps using the specified criteria and sources, starting from February 2006 (the earliest available data on Google Maps for the included sites) until April 2022. We selected Google Maps because its data is site-focused and each review is associated with a rating score given by the user, which can be used to complement the measurements of user sentiment.
We collected both textual comment data and imagery data from each site. For the collected textual review data, we excluded non-English data and non-text data (reviews that only contain ratings without textual comments), which resulted in ~1.4 million reviews included in the analyzed set. Each comment has a corresponding user-given rating score, ranging from 1 (worst) to 5 (best). 100 photos were collected for each site, with each photo uploaded by a different user; we randomly chose a single photo if multiple photos were uploaded by a user. Table 1 shows the management organizations and sources of selected sites with the corresponding number of comments from different sources. Places categorized as ‘Other’ in the table are mostly abstract and do not belong to any organizations or lists, such as London’s Chinatown or the River Thames. However, as they possess similar significance to managed substantive heritage properties, they have been included in this study. More details regarding which sites are included can be found at https://zenodo.org/record/8130804 (Liu et al., 2023).
Outdoorness and urbanness
To quantify the within-variation of the impact of COVID-19 on the cultural heritage tourism industry, we classify sites in two ways: (1) indoor and outdoor and (2) urban and rural. Firstly, in order to separate indoor and outdoor sites, we apply places365 (Zhou et al., 2017), a convolutional neural network (CNN) CV model that is able to classify photos as being inside or outside, to calculate the percentage of photos that are outdoors in the collected 100 photos per site. Based on this measurement, we assign a site as an indoor or outdoor site. A first approximation to classify sites as either indoor or outdoor was done with a 50% threshold of ‘outdoorness’; however, as some sites do not allow photos to be taken in their interior spaces or of specific collections, visitors may not be able to fully share photos to show their indoor experiences. In addition, visitors may share photos taken on arrival, on departure or en route that only include exterior scenes. These factors bias the photo sample towards outdoor scenes, so a 50% threshold would not be representative. We therefore select several known outdoor sites (e.g., national parks) and calculate the 1% confidence interval of the distribution of their outdoorness. The lower bound of the confidence interval is used as the threshold for outdoor sites, i.e., sites with a proportion of outdoor photos below the threshold of 0.83 will be classified as indoor sites.
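The thresholding step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-photo indoor/outdoor labels have already been produced by a scene classifier, and the function names are hypothetical. Only the 0.83 threshold comes from the text.

```python
from typing import Sequence

# Threshold reported in the text; sites below it are treated as indoor.
OUTDOOR_THRESHOLD = 0.83


def outdoorness(photo_is_outdoor: Sequence[bool]) -> float:
    """Return the fraction of a site's photos labelled as outdoor scenes."""
    if not photo_is_outdoor:
        raise ValueError("at least one photo label is required")
    return sum(photo_is_outdoor) / len(photo_is_outdoor)


def classify_site(photo_is_outdoor: Sequence[bool],
                  threshold: float = OUTDOOR_THRESHOLD) -> str:
    """Label a site 'indoor' when its outdoorness falls below the threshold."""
    return "indoor" if outdoorness(photo_is_outdoor) < threshold else "outdoor"


# Hypothetical labels for two sites with 100 photos each.
park_labels = [True] * 95 + [False] * 5
museum_labels = [True] * 60 + [False] * 40
print(classify_site(park_labels), classify_site(museum_labels))
```

Note that under this rule a site where 60% of photos show exterior scenes is still treated as indoor, which reflects the correction for the outdoor bias in shared photos.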
Secondly, to classify sites as urban or rural, we used the Code Point dataset (Ordnance Survey, 2022) as the proxy for population density. More specifically, we calculated the density of existing postcodes surrounding a site (within a radius of 10 km). Sites in high (low) density regions are considered as urban (rural) sites. Similar to the previous strategy, given known sites that can be classified as urban sites (e.g., sites in the central area of Greater London), we calculate the threshold (24 postcodes per km²) and obtain urban/rural classification for all sites.
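A minimal sketch of the postcode-density calculation is given below. It assumes that site and postcode locations are available as projected easting/northing pairs in metres, so that straight-line distance is adequate; the function names and the treatment of a density exactly at the threshold are assumptions, while the 10 km radius and the threshold of 24 postcodes per km² come from the text.

```python
import math
from typing import Iterable, Tuple

RADIUS_M = 10_000.0      # 10 km search radius, as stated in the text
URBAN_THRESHOLD = 24.0   # postcodes per square kilometre, as stated in the text

Point = Tuple[float, float]  # (easting, northing) in metres; an assumption


def postcode_density(site: Point, postcodes: Iterable[Point],
                     radius_m: float = RADIUS_M) -> float:
    """Count postcodes within radius_m of the site, per km^2 of the search disc."""
    sx, sy = site
    n_inside = sum(
        1 for (x, y) in postcodes if math.hypot(x - sx, y - sy) <= radius_m
    )
    area_km2 = math.pi * (radius_m / 1000.0) ** 2
    return n_inside / area_km2


def classify_urbanness(density: float,
                       threshold: float = URBAN_THRESHOLD) -> str:
    """Label a site 'urban' at or above the density threshold, else 'rural'."""
    return "urban" if density >= threshold else "rural"


# Hypothetical coordinates: two postcodes inside the disc, one far outside.
site = (530_000.0, 180_000.0)
postcodes = [(530_000.0, 180_000.0), (533_000.0, 184_000.0), (560_000.0, 180_000.0)]
print(classify_urbanness(postcode_density(site, postcodes)))
```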
As discussed in the introduction, the UK government considered indoor and outdoor spaces differently during the pandemic in terms of implemented preventive measures, and heritage sites in urban and rural areas are likely to be affected by COVID-19 to varying degrees. Thus, by classifying heritage sites into indoor and outdoor and urban and rural sites, we are able to investigate, at a much finer level, how the impact of COVID-19 may vary across different types of heritage sites.
COVID-19 topic detection
We used keywords associated with COVID-19 (‘covid’, ‘coronavirus’, ‘social distance’, ‘social distancing’, ‘pandemic’, ‘delta’, ‘omicron’) to find topics related to COVID-19 among the collected user comments. A comment is classified as being related to COVID-19 if at least one of the keywords appears in the comment. In addition to this, a pre-trained language model with a natural language inference strategy is also applied. Specifically, a pre-trained language model is a neural network that has been trained on large natural language datasets, which gives the model a generalized off-the-shelf ability to understand natural language in other circumstances. In this paper, as we are aiming to detect comments related to COVID-19, we used BERTweet-large (Nguyen et al., 2020), which is based on the RoBERTa architecture (Liu et al., 2019) and pre-trained on a dataset of 850 million tweets in English, containing 845 million tweets streamed from January 2012 to August 2019 and 5 million tweets related to COVID-19 from January 2020 to March 2020 (Nguyen et al., 2020).
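The keyword rule can be illustrated with a short sketch. The keyword list is taken from the text; case-insensitive substring matching is an assumption, since the matching details are not specified here, and the function name is hypothetical.

```python
import re

# Keyword list as given in the text.
COVID_KEYWORDS = (
    "covid", "coronavirus", "social distance", "social distancing",
    "pandemic", "delta", "omicron",
)

# Assumption: case-insensitive substring matching, so 'COVID-19' matches 'covid'.
_COVID_PATTERN = re.compile(
    "|".join(re.escape(keyword) for keyword in COVID_KEYWORDS),
    re.IGNORECASE,
)


def mentions_covid(comment: str) -> bool:
    """Return True if at least one COVID-19 keyword appears in the comment."""
    return _COVID_PATTERN.search(comment) is not None


print(mentions_covid("Half of the rooms were closed due to COVID-19."))
print(mentions_covid("Lovely gardens and a great cafe."))
```

A rule of this kind is cheap but ambiguous for words such as ‘delta’, which is one reason to complement it with a language model.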
The natural language inference strategy refers to the fact that the natural language model will be used to predict the relationship (e.g., entailment) between two given sentences. For example, if the following two sentences are given to the model: (1) ‘a museum with multiple tourists visiting’ (premise) and (2) ‘some people are visiting a place’ (hypothesis), the model will be trained to predict the relationship between the two sentences as ‘contradiction’, ‘neutral’ or ‘entailment’ (in this example the correct label is ‘entailment’). We fine-tuned BERTweet on the multi-genre natural language inference (MNLI) task of the GLUE dataset (Williams et al., 2018) and prepared a template of ‘This sentence has user’s review about COVID-19’ as the hypothesis. We then take the estimated probability of ‘entailment’ for each comment paired with the hypothesis as the probability that the comment is related to the COVID-19 topic. To reduce misclassification, particularly false positives, instead of the threshold of 0.5 commonly used in classification tasks, we selected a threshold (0.92) that minimizes the appearance of COVID-19 topics before 2020. By detecting comments related to COVID-19, we are able to track, over time, visitors’ perception of the disruption to their experience caused by COVID-19 during their visits, and therefore estimate the degree of recovery of heritage tourism in the UK.
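The decision rule applied to the model output can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: it assumes the fine-tuned model returns three raw scores (logits) per premise-hypothesis pair, that the entailment class sits at index 2, and that the entailment probability is obtained with a softmax over all three classes. Only the 0.92 threshold comes from the text.

```python
import math
from typing import List, Sequence

ENTAILMENT_THRESHOLD = 0.92  # threshold reported in the text


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to one."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def is_covid_related(nli_logits: Sequence[float],
                     entailment_index: int = 2,  # assumed label order
                     threshold: float = ENTAILMENT_THRESHOLD) -> bool:
    """Flag a comment when the entailment probability reaches the threshold."""
    return softmax(nli_logits)[entailment_index] >= threshold


# Hypothetical logits in the order (contradiction, neutral, entailment).
print(is_covid_related([0.0, 0.0, 5.0]))  # confident entailment
print(is_covid_related([0.0, 0.0, 2.0]))  # passes 0.5 but not 0.92
```

The second example shows the effect of the stricter threshold: a comment that a default 0.5 cut-off would accept is rejected, trading some recall for fewer false positives.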
Impact of COVID-19 on visitor involvement using number of online reviews as proxy
In leveraging social media data to account for the impact of COVID-19 on heritage sites, we use the number of online comments on Google Maps as the proxy for measuring visitor involvement. This involvement can be separated into two aspects. First, it represents the willingness of visitors to share their experiences. It is possible that during the pandemic, visitors may be more or less willing to leave comments online (i.e., a higher or lower probability of leaving an online comment after visiting). We collected the monthly actual numbers of visitors to the museums and galleries sponsored by the Department for Digital, Culture, Media & Sport (DCMS) of the UK government from January 2016 to March 2022 (UK Government, 2022c), and their corresponding number of online comments in the same period. Figure 1 shows how many visitors are needed, on average, for one online comment to be given for these government-sponsored sites. We observe that, except for extremely unusual periods when these sites were closed due to COVID-19 in most of 2020 and early 2021 (where the lines are disconnected), there is no significant change in the ratio between the number of visitors and the number of comments after 2016. Therefore, we assume that visitor willingness to leave an online comment does not change throughout the duration of this study for the included sites.
The shaded red area shows the 99% confidence interval. Break points of the line correspond to the closure of the sites during the pandemic due to lockdown, and the spike before the break point in 2020 is likely due to a significant decrease in the number of visitors, rather than a significant increase in the number of comments.
Secondly, and more importantly, this finding indicates that the number of comments is also an effective indicator of the actual number of visitors. Figure 2 shows the linear relationship between the log number of actual visitors and the log number of online reviews on Google Maps with an R² of 0.73. Thus, assuming that visitors’ willingness to leave online comments does not change significantly compared to periods without heightened health risks, the number of online comments can reflect the actual passenger flow volume of the sites and is therefore an important metric.
A regression line depicting the relationship between the log number of reviews on Google Maps (x-axis) and the log number of visitors (y-axis) for museums and galleries sponsored by the UK’s Department for Digital, Culture, Media & Sport (DCMS). The coefficient of determination (R²) for the regression line is 0.73.
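The log-log relationship behind the reported R² can be reproduced with an ordinary least-squares fit, sketched below. The function name and the example counts are hypothetical, and natural logarithms are an assumption (the value of R² does not depend on the base).

```python
import math
from typing import Sequence, Tuple


def loglog_fit(reviews: Sequence[float],
               visitors: Sequence[float]) -> Tuple[float, float, float]:
    """Fit log(visitors) = intercept + slope * log(reviews); return slope, intercept, R^2.

    All counts must be strictly positive so that the logarithm is defined.
    """
    x = [math.log(value) for value in reviews]
    y = [math.log(value) for value in visitors]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Hypothetical monthly counts for one site: 500 visitors per review.
slope, intercept, r_squared = loglog_fit([10, 100, 1000], [5000, 50000, 500000])
print(round(slope, 3), round(r_squared, 3))
```

For simple linear regression with one predictor, R² equals the squared correlation between the two logged series, which is what the last line of the function computes.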
We collected inbound visitor numbers from a dataset of UK monthly overseas travel and tourism (Office for National Statistics, 2022). This dataset includes the number of inbound visitors to the UK subdivided by region of origin (Europe, North America and other countries) and purpose (holiday, business, visiting friends or relatives, and miscellaneous). We merged the original classifications ‘North America’ and ‘Other countries’ into ‘Non-Europe’, as they are highly correlated.
In this paper, we quantify the impact of COVID-19 on heritage sites using trends in the number of online comments and in the number of comments related to COVID-19, in order to estimate the degree of recovery of heritage tourism at both the passenger flow level and the perception level. We also measured the correlation between the reduction (compared to 2019) in the number of online comments in both urban and rural heritage sites and the reduction in the number of inbound visitors by source region and purpose since the outbreak of COVID-19. This will reveal whether there are significant differences in the impacts of the COVID-19 pandemic on heritage sites in urban or rural areas.
Sentiment analysis
Sentiment analysis, a technique to systematically extract and quantify emotional and subjective information from textual data using NLP methods, is used to measure visitor perception towards the impact of COVID-19 on their experiences. As discussed above, in this paper we focus on measuring the change in visitor experiences caused by the policies and restrictive measures taken to stop the spread of COVID-19. We do not focus on visitors’ feelings towards their own physical and mental health, as most visitors will not travel when they feel unwell and these experiences are unlikely to be expressed in review comments.
The rating score given by the visitor for each comment is used as the expression of sentiment. However, five-point rating scores are usually highly skewed in distribution (Hu et al., 2006). This also applies in this study, where the average score is 4.39 and more than 62% of the comments are 5-star rated. Thus, we fold the 5-star rating scores into a dichotomous classification, namely positive (rating score above the mean value) and negative (rating score below the mean value). Because ratings are whole numbers of stars, this means that 5-star comments are labelled positive and comments rated 1 to 4 stars are labelled negative. We then use the binarized rating as the dependent variable.
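The binarization described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `binarize_ratings` and the toy ratings are hypothetical, and the treatment of a rating exactly equal to the mean (labelled negative here) is an assumption that the text does not need to address, since the reported mean of 4.39 is not a whole number.

```python
from statistics import mean

def binarize_ratings(ratings):
    """Label each star rating 1 (positive) if it is above the sample mean, else 0.

    Assumption: a rating exactly equal to the mean is labelled 0 (negative).
    """
    threshold = mean(ratings)
    return [1 if r > threshold else 0 for r in ratings]

# Hypothetical toy ratings, skewed towards 5 stars like the real data.
toy_ratings = [5, 5, 5, 4, 5, 3, 5, 1, 5, 4]
labels = binarize_ratings(toy_ratings)
print(labels)
```

With a skewed distribution like this one, the mean sits between 4 and 5, so only 5-star ratings end up in the positive class.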
We conduct sentiment analysis at two levels: document level and word level. At the document level, we further classify the comments detected as related to COVID-19 into four finer-grained subtopics regarding COVID-19 measures, namely face covering, social distancing, restrictions and closure of areas, and hygiene equipment, as introduced in the background section. Similarly to the strategy used in detecting topics related to COVID-19, we apply a combination of rule-based (keyword search) algorithms and likelihood-based models (language models). We use keywords related to the four subtopics, which can be found in Table 2; a subtopic is classified as appearing in a comment if keywords belonging to that subtopic are included in the comment. In addition, we draw on BFV (Liu et al., 2022), a weakly supervised text classification model, with the keywords above as input to further model the likelihood of subtopics appearing in each comment. We use a fuzzy classification strategy to fuse the two results: when both agree that a subtopic appears (does not appear), the corresponding label is 1 (0), whereas when they disagree, the corresponding label is 0.5. Using this strategy, we obtain the subtopic-document matrix, which is then used as the independent variable to build logit models for both indoor and outdoor sites to investigate how each subtopic is related to visitor sentiment.
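The fuzzy fusion rule can be illustrated with a short sketch. Several things here are assumptions made for illustration only: the keywords are placeholders rather than those in Table 2, a subtopic is flagged when any one keyword occurs as a substring, and the model output is taken to be an already thresholded 0/1 flag standing in for the BFV prediction.

```python
# Hypothetical keyword lists (placeholders, NOT the keywords from Table 2).
SUBTOPIC_KEYWORDS = {
    "face_covering": ["mask", "face covering"],
    "social_distancing": ["social distancing", "one-way"],
}

def keyword_flag(comment, keywords):
    """Rule-based detector: 1 if any keyword occurs in the comment, else 0."""
    text = comment.lower()
    return 1 if any(k in text for k in keywords) else 0

def fuse(rule_flag, model_flag):
    """Fuzzy fusion: agreement keeps the shared label, disagreement gives 0.5."""
    return float(rule_flag) if rule_flag == model_flag else 0.5

def subtopic_row(comment, model_flags):
    """One row of the subtopic-document matrix for a single comment."""
    return {
        name: fuse(keyword_flag(comment, kws), model_flags[name])
        for name, kws in SUBTOPIC_KEYWORDS.items()
    }

# model_flags stands in for thresholded classifier output (assumed given).
row = subtopic_row(
    "Staff wore masks but the queue was crowded.",
    {"face_covering": 1, "social_distancing": 1},
)
print(row)
```

In this toy case both detectors agree on face covering (label 1.0), while only the model flags social distancing, so that entry becomes 0.5. Stacking such rows over all comments gives a matrix that can serve as the regressor in a logit model.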
For word-level sentiment analysis, we draw on a language model from NLP. Specifically, we train a simple sentiment language model using DistilBERT (Sanh et al., 2019) as a backend, in combination with a binary classification head, on the existing comments related to COVID-19 and their corresponding dichotomous sentiment ratings. Then, we use Integrated Gradients (IG) (Sundararajan et al., 2017) to calculate the attribution of the predicted sentiment to each word, which represents the importance of the word in predicting the sentiment. We then aggregate the attribution of each word across all documents to represent the overall sentiment of each word. Compared to traditional methods (e.g., bag of words or lexicon-based methods), the advantage of this approach is that the language model can predict the sentiment from each word as well as its surrounding words (context) in a comment, since BERT is a context-aware language model. This dynamic information between sentiment and each word, recorded in the parameters of the trained language model, can then be extracted by IG. Therefore, it can more accurately capture the semantic information of each word.
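To make the attribution step concrete, the sketch below applies Integrated Gradients to a toy logistic classifier over word-presence features instead of DistilBERT. The vocabulary, weights and bias are invented for illustration, the all-zeros baseline is an assumption, and the path integral is approximated with a midpoint Riemann sum. It is not the pipeline used in the study, only a small instance of the same attribution technique.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability of positive sentiment under a toy logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def integrated_gradients(weights, bias, x, baseline, steps=200):
    """Approximate IG along the straight path from baseline to x."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint rule
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        p = predict(weights, bias, point)
        # d/dx_i sigmoid(w.x + b) = p * (1 - p) * w_i
        for i, w in enumerate(weights):
            avg_grad[i] += p * (1.0 - p) * w / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

# Hypothetical vocabulary and parameters (not fitted to any data).
vocab = ["closed", "restrictions", "friendly"]
weights = [-2.0, -1.5, 1.0]
bias = 0.5
x = [1.0, 0.0, 1.0]          # comment contains "closed" and "friendly"
baseline = [0.0, 0.0, 0.0]   # assumed baseline: no words present

attr = integrated_gradients(weights, bias, x, baseline)
for word, a in zip(vocab, attr):
    print(f"{word}: {a:+.4f}")
```

A useful sanity check is the completeness property of IG: the attributions should sum to the difference between the model output at the input and at the baseline. Averaging a word's attribution over every comment in which it appears gives a word-level score of the kind aggregated in the text.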
The sentiment analysis enables this study to quantify visitor sentiment towards different preventive measures used in heritage sites and therefore provide useful evidence and feedback to inform their use in future heritage site management.
Results
Using the method introduced above, we detected 15,300 comments related to COVID-19 among the ~1.4 million reviews. These comments cover 689 sites: of the 775 sites, 86 did not have any detected COVID-19-related comments and were therefore excluded from the subsequent analysis. Following the classifications of outdoorness and urbanness, we classify the 689 sites into indoor/outdoor and urban/rural sites, as shown in the cross-tabulation in Table 3.
Figure 3 shows the trends of comments on Google Maps from January 2016 to April 2022 for indoor and outdoor sites. From the end of 2021 onward, the number of comments related to COVID-19 starts to fall for both indoor and outdoor sites, showing that visitors gradually mention COVID-19 less frequently at heritage sites, despite the Omicron variant of COVID-19 spreading rapidly across the UK. However, the figure shows that, up until mid-2022, large differences between the expected and actual numbers of visitor comments still persist, indicating that the recovery of heritage tourism is still lagging significantly behind pre-pandemic levels, specifically in terms of passenger flow. In particular, the differences between the expected and actual numbers of comments are larger for indoor sites than for outdoor sites, consistent with the expectation that, during the pandemic, outdoor sites were likely to be more popular than indoor sites (Landry et al., 2021).
The blue bars show the number of comments for each month; the light blue bars show the expected number of comments calculated by ARIMA using the previous trends; and the red bars show the number of comments relating to COVID-19. The numbers above the light blue bars are the ratios (displayed as percentages) between the actual number of comments and the expected number of comments, while the numbers above the red bars are the ratios (also as percentages) between the number of comments related to COVID-19 and the actual number of comments.
We then calculate the correlation between the reduction in the number of monthly comments and the reduction in the number of inbound visitors (benchmarked to 2019) for urban and rural sites, as shown in Fig. 4. Consistent with previous research suggesting that COVID-19 is likely to have had negative impacts on international holiday travel (Vaishar et al., 2022), for urban sites there are significant positive correlations between the reduction in the number of comments and the reduction in the number of inbound visitors from Europe, which is the largest source of inbound visitors to the UK, and of inbound visitors travelling for holiday purposes. These correlations are not significant for rural sites, showing that urban sites are more severely affected than rural sites by the reduction in inbound visitors, especially visitors from Europe or visitors travelling for holidays.
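The calculation behind this comparison can be sketched as follows. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here, as are the function names and the toy monthly figures. A significance test, which the analysis above relies on, is omitted for brevity.

```python
import math

def reduction_vs_2019(monthly, baseline_2019):
    """Fractional reduction of each month relative to the same month of 2019."""
    return [(base - value) / base for value, base in zip(monthly, baseline_2019)]

def pearson(xs, ys):
    """Pearson correlation coefficient (assumed measure, not confirmed by the text)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical monthly counts for four matching months.
comments_2019 = [100, 120, 150, 130]
comments_2021 = [40, 60, 120, 117]
visitors_2019 = [1000, 1100, 1500, 1400]
visitors_2021 = [300, 550, 1200, 1260]

comment_drop = reduction_vs_2019(comments_2021, comments_2019)
visitor_drop = reduction_vs_2019(visitors_2021, visitors_2019)
r = pearson(comment_drop, visitor_drop)
print(round(r, 3))
```

Benchmarking each month against the same month of 2019 removes the seasonal pattern that both series share, so the correlation reflects the pandemic-era shortfall rather than ordinary seasonality.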
The next step is the analysis of visitor sentiment towards COVID-19 at the document level and the word level. We chose four subtopics connected with COVID-19 that are of special interest and significance in the dataset. Figure 5 is a Venn diagram showing the number of mentions of the four subtopics and their intersections among comments related to COVID-19. Since the frequency of mentions of the four subtopics does not differ significantly across the indoor/outdoor and urban/rural classifications, further details regarding the distribution of the four subtopics by these classifications are not presented.
Figure 6 shows the correlations between sentiment (positive emotion or negative emotion) and the four subtopics of COVID-19 for both indoor and outdoor sites. The figure shows an interesting difference in visitor attitudes towards sanitization and social distancing. Specifically, in indoor sites, mentions of sanitization equipment (e.g., hand gel and sanitization stations) are significantly associated with positive emotion, whereas this association is not significant in outdoor sites. This suggests that the provision of hygiene equipment is welcomed by visitors in indoor sites, as it represents safety, while in outdoor places visitors may have fewer safety concerns and are thus indifferent to the provision of this equipment. Conversely, social distancing is significantly related to positive emotion in outdoor sites, but not in indoor sites. This could be explained by the fact that, although social distancing measures (and one-way systems and related queuing) are welcomed by visitors because they reduce the risk of the virus spreading, indoor places where space is more constrained may have difficulty implementing them effectively. Complaints may therefore arise and the sentiment becomes mixed. Finally, restrictions in general, the closure of specific areas, and wearing face coverings are significantly associated with negative emotion in both indoor and outdoor sites, suggesting that they are consistently disliked by visitors.
Figure 7 shows the sentiment analysis at the word level for all comments. Consistent with the document-level sentiment analysis, ‘closed’ and ‘restrictions’ are strongly negative, showing that they attract most of the complaints from visitors. On the other hand, surprisingly, the term ‘COVID’ is strongly associated with positive emotion. This may be because most comments mentioning COVID-19 express visitor excitement at returning to normal life after COVID-19 measures were lifted (e.g., the reopening of public places and the end of lockdown), so that ‘COVID’ acts as an indicator of positive emotion.
However, this analysis at both levels of granularity does not differentiate between negative emotion towards face covering and social distancing that results from the discomfort of being forced to follow the rules and negative emotion that results from displeasure at other visitors’ disregard for restrictions. Some people may appreciate the added safety and security that face covering requirements and social distancing can provide, especially during the ongoing COVID-19 pandemic, and thus disapprove of others who do not follow the measures; others, however, may find wearing face coverings and social distancing inconvenient or unnecessary, and may feel that these measures interfere with their ability to enjoy their visits. Both attitudes cause displeasure but reflect opposite views of the anti-COVID-19 measures. Thus, following the approach utilized by Sanders et al. (2021), we implemented the same document summarization strategy but with a different model, Google’s Pegasus (Zhang et al., 2020), fine-tuned on the reddit_tifu dataset, which is more suited to the context of social media. Using the 25 negative comments closest to the centroid of each of the face covering and social distancing topics, we carried out a pseudo-qualitative analysis:
Summary generated from the model for negative comments of wearing face coverings: “had to wear a mask to escape the crowds of people not wearing masks.”
Summary generated from the model for negative comments of social distancing: “didn’t follow COVID-19 social distancing guidelines and had to walk around with no social distancing.”
From the two summarized sentences, we can observe that the main causes for complaint are the lack of strict management and enforcement of the measures and the failure of other visitors to follow them, rather than subjective discomfort. This indicates that visitors generally accept the measures and expect other visitors to follow them to ensure their personal safety.
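The selection step that precedes summarization, choosing the negative comments closest to a topic centroid, can be sketched as below. The text does not specify how comments are embedded or which distance is used, so the two-dimensional toy vectors and the Euclidean distance are assumptions, and `closest_to_centroid` is a hypothetical helper name. The study uses the 25 closest comments; the toy example uses 2.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def closest_to_centroid(vectors, k):
    """Indices of the k vectors nearest (Euclidean, assumed) to their centroid."""
    c = centroid(vectors)
    ranked = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], c))
    return ranked[:k]

# Hypothetical 2-D embeddings of five negative comments on one topic.
vectors = [[0, 0], [1, 1], [2, 2], [10, 10], [1, 2]]
selected = closest_to_centroid(vectors, 2)
print(selected)
```

The selected comments would then be concatenated and passed to the summarization model. Picking comments near the centroid favours typical complaints over outliers, which is why the resulting summary can be read as representative of the topic.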
Discussion
Implications for management
This paper reveals several empirical findings that can inform the management of heritage sites during the recovery from COVID-19 and other pandemics in the future. Further, it can also inform the recovery plans being prepared at the national level to understand the efficient allocation of resources. Although visitor perception towards COVID-19 has been diminishing, the recovery of visitor involvement as reflected by the actual number of online comments (which is also associated with the actual passenger flow volume of the sites) is much slower. From the urban/rural perspective, this loss of visitor involvement is more obvious in sites in urban areas and is more strongly associated with international tourism. From the indoor/outdoor perspective, indoor sites have been more severely impacted compared to outdoor sites. Therefore, more supporting policies are needed to help the recovery of urban indoor cultural heritage sites that heavily rely on international travelers (e.g., museums and galleries in cities), even when the perceived impact of COVID-19 is not obvious among existing visitors. Also, these sites should have a financial plan in place to manage any financial losses that may occur due to unexpected closures or reduced international visitor numbers.
Through a pseudo-qualitative analysis and fine-grained sentiment analysis at both document level and word level, this paper reveals visitor sentiment towards different measures taken in response to the COVID-19 pandemic. More specifically, the sentiment information and pseudo-qualitative analysis suggest that the measures are generally welcomed by visitors but need to be implemented effectively for all visitors and staff, and that visitors are disappointed by areas being inaccessible due to COVID-19. The provision of hygiene equipment is also welcomed, but it is only perceived positively when it is considered essential (in indoor settings where frequent touching of surfaces might occur). Thus, two practices are crucial for improving visitor experience while pandemic measures are in place: maintaining order on site, for example by ensuring that staff (especially front-of-house staff) and visitors follow the COVID-19 prevention measures when it is crowded, and regularly reviewing and updating the emergency plan to keep visiting areas as accessible as possible while ensuring safety.
This study provides an example of how to extract useful information from visitor feedback as an alternative to traditional visitor surveys. Compared to these methods, collecting data from social media reduces human capital costs in distributing questionnaires and is contact free. Leveraging social media data can provide larger sample sizes from different sites and therefore reduce sampling bias. Also, given that writing online reviews is similar to answering open-ended questionnaires (Pietsch et al., 2018) and that the data obtained from open-ended questionnaires exhibit the same level of richness and similar ranking of importance compared to that obtained from close-ended questionnaires (Krosnick, 2018; Reja et al., 2003), the information contained in social media data can provide a comprehensive representation of visitor opinions that is comparable to traditional surveying methods. With the advancement of machine learning, especially NLP techniques, we will be able to mine increasingly diverse and valuable information from user-published social media data to help refine sustainable heritage management strategies. Lastly, this method allows us to take surveys and make analyses retrospectively. For example, even if a survey was not conducted before an unexpected event (e.g., earthquake), we are able to convert user comments from social media before and after the unexpected event into two equivalently structured surveys and compare them to investigate whether there is a significant difference that reflects the effects of the event on the visitor experience.
Limitations
In this paper, we quantitatively analyzed the impact of COVID-19 on cultural heritage sites based on visitor involvement (number of online comments) and sentiment towards COVID-19 measures using state-of-the-art NLP models that convert unstructured visitors’ reviews into structured questionnaire-like results with corresponding sentiment scores. The current breakthroughs in the deep learning realm that brought significant improvements to neural network models in terms of human language understanding provide the basis for the method proposed in this paper, which has several advantages such as larger sample sizes, lower costs and allowing for retrospective analysis as discussed above. However, there are still some limitations inherent to this method:
1. Ambiguity in language is a concern. Natural language is complex and non-homogeneous among individuals, making it difficult to extract consistent and accurate meaning from unstructured text. Despite the limitation of ambiguity in language, it is important to note that this challenge is not unique to the method proposed in this paper. In fact, it is a prevalent issue in many other tasks, as long as they involve human language interpretation. For example, even in traditional surveys using questionnaires with a structured format, the language used in presenting the questions can lead to biases in answers (Fowler, 1995). However, this challenge could be especially prominent for studies involving social media data, where people tend to use informal, non-rigorous language, hindering the accuracy of expressing their views.
2. The diversity of topics is also an important consideration. In traditional surveys typically used in heritage, researchers can include questions that would provide a sufficient basis on which to address the questions at hand. However, with unstructured data like user reviews, some topics may be more prevalent than others. This can result in an uneven distribution of data by topic, leading to higher statistical uncertainty for topics mentioned less frequently by reviewers. Additionally, whereas traditional surveying methods have well-established statistical tools for verifying, calibrating and adjusting survey results, the same may not necessarily be true for results converted by NLP methods from unstructured data like user reviews.
3. Lastly, the passive nature of user reviews means that researchers cannot ask specific questions that may be important for their analysis. This can be a limitation, particularly if the research is hypothesis-driven and requires specific data to test a hypothesis. On the other hand, data-driven analysis allows researchers to explore the data and identify patterns and trends that may not have been initially considered (Tansley et al., 2009).
Beyond the generic limitations associated with the methods used in this paper, there are also limitations that are specific to applying the method to measuring visitors’ perceptions of COVID-19 measures at UK heritage sites. First, from the perspective of data, we collected online visitors’ opinions from the Google Maps platform, so the data may over-represent young visitors or people who are proficient in using social media platforms. In addition, as there is no strict curation with approval criteria for reviews uploaded to Google Maps, the quality of the social media data may be low. Lastly, we only used English data in our analysis. This may cause a bias toward opinions from native English speakers and ignore international non-native English speakers, who account for a significant part of the consumers of heritage tourism in the UK. Thus, the results given by the models should be evaluated critically and with caution.
Conclusion
In this study, we collected user review data for 775 sites on Google Maps and analyzed it using state-of-the-art machine learning methods to detect and quantify the impact of COVID-19 on heritage tourism in the UK, aiming to (1) help the recovery of heritage tourism in the UK during the “post-COVID” era and summarize lessons for heritage tourism in preparation for the next potential severe public health emergency, and (2) showcase the efficacy of advanced machine learning techniques in interpreting unstructured data for potential relevant future research.
From the managerial implication perspective, this research provides critical insights for heritage site management. Notably, it reveals that although visitor perception towards COVID-19 (represented by reviews related to COVID-19) has significantly decreased, the actual number of comments had not yet (as of April 2022) returned to the level expected from pre-pandemic trajectories, suggesting that visitor involvement with heritage sites needs more time to recover. In particular, it underscores the need for enhanced support policies and financial planning for urban, indoor sites that heavily rely on international tourism, as these sites have seen a slower recovery of visitor involvement. Furthermore, the effective and consistent enforcement of COVID-19 preventive measures among both staff and visitors is crucial in maintaining a positive visitor experience during a pandemic. Areas should be kept as accessible as possible while ensuring safety to reduce visitor disappointment. Additionally, the value of mining social media data for visitor feedback is highlighted. This provides a cost-effective and contact-free alternative to traditional visitor surveys for heritage site managers. These practical implications can support a more robust and resilient response to crises, safeguarding the sustainability of the heritage tourism industry.
From a methodological development standpoint, the advanced machine learning techniques used in this study to extract information from online reviews have demonstrated their effectiveness in determining visitor perceptions towards specific facets of heritage sites. These methods can be harnessed to capture shifts in perception chronologically, particularly before and after unexpected or disruptive events such as conflicts, economic downturns, natural disasters, and disease outbreaks. These events often lead to profound impacts on the operations of heritage sites, which may or may not result in significant changes in the visiting experience. Therefore, the use of these advanced machine learning methods, as showcased in this study, highlights their potential for being adopted in future research. By leveraging these techniques, researchers can gain a deeper understanding of visitor sentiment and experiences, providing critical insights that can further enhance the management and preservation of heritage sites.
Data availability
The research data generated and analyzed during the study are available from https://zenodo.org/record/8130804 (Liu et al., 2023).
References
Abbas J, Mubeen R, Iorember PT et al. (2021) Exploring the impact of COVID-19 on tourism: transformational potential and implications for a sustainable recovery of the travel and leisure industry. Curr Opin Behav Sci 2:100033. https://doi.org/10.1016/j.crbeha.2021.100033
Art Fund (2022) Looking ahead: insights from our museum directors’ research. https://www.artfund.org/professional/news-and-insights/looking-ahead-insights-from-our-museum-directors-research
Assaf AG, Kock F, Tsionas M (2022) Tourism during and after COVID-19: an expert-informed agenda for future research. J Travel Res 61(2):454–457. https://doi.org/10.1177/00472875211017237
Association of Leading Visitor Attractions (2020) Visits made in 2019 to visitor attractions in membership with ALVA. https://www.alva.org.uk/details.cfm?p=610
Barbosa RB, Costa JH, Handayani B et al. (2021) The effects of COVID-19 in the tourist society: an anthropological insight of the trivialisation of death and life. Int J Tour Anthropol 8(2):179–192. https://doi.org/10.1504/IJTA.2021.116094
Brunn SD, Gilbreath D (2022) Covid-19 and a world of ad hoc geographies. Springer Nature, Berlin
Cairns D, Clemente M (2023) The immobility turn: mobility, migration and the COVID-19 pandemic. Policy Press, Bristol
Çakmak E, K Isaac R, Butler R (2023) Changing practices of tourism stakeholders in COVID-19 affected destinations. Multilingual Matters, Bristol
Chan C (2021) Developing a conceptual model for the post-COVID-19 pandemic changing tourism risk perception. Int J Environ Res Public Health 18:9824. https://doi.org/10.3390/ijerph18189824
Cooper MA, Buckley R (2022) Tourist mental health drives destination choice, marketing, and matching. J Travel Res 61(4):786–799. https://doi.org/10.1177/00472875211011548
Cui T, Kumar P, Orr SA (2023) Connecting characteristics of social media activities of a heritage organisation to audience engagement. Digit Appl Archaeol Cult Herit 28:e00253. https://doi.org/10.1016/j.daach.2022.e00253
Erayman IO, Çağlar AB (2022) Hospitality in times of COVID-19: an evaluation in the context of the Baumanian concept of hospitality. Hosp Soc 12(1):73–94. https://doi.org/10.1386/hosp_00048_1
Flew T, Kirkwood K (2021) The impact of COVID-19 on cultural tourism: art, culture and communication in four regional sites of Queensland, Australia. Media Int Aust 178:16–20. https://doi.org/10.1177/1329878X20952529
Foo LP, Chin MY, Tan KL et al. (2021) The impact of COVID-19 on tourism industry in Malaysia. Curr Issues Tour 24(19):2735–2739. https://doi.org/10.1080/13683500.2020.1777951
Fowler FJ (1995) Improving survey questions: design and evaluation. Sage Publications, London
Ginzarly M, Srour FJ (2022) Cultural heritage through the lens of COVID-19. Poetics 92:101622. https://doi.org/10.1016/j.poetic.2021.101622
Gössling S, Scott D, Hall CM (2020) Pandemics, tourism and global change: a rapid assessment of COVID-19. J Sustain Tour 29(1):1–20. https://doi.org/10.1080/09669582.2020.1758708
Historic England (2020) Heritage and the economy 2020. https://historicengland.org.uk/content/heritage-counts/pub/2020/heritage-and-the-economy-2020/
Hu N, Pavlou PA, Zhang J (2006) Can online reviews reveal a product’s true quality? Empirical findings and analytical modeling of online word-of-mouth communication. In: Proceedings of the 7th ACM conference on electronic commerce, Ann Arbor, Michigan, USA. EC '06. Association for Computing Machinery, New York, NY, USA, pp 324–330
Jaipuria S, Parida R, Ray P (2021) The impact of COVID-19 on tourism sector in India. Tour Recreat Res 46(2):245–260. https://doi.org/10.1080/02508281.2020.1846971
Korstanje ME, George B (2021) Mobility and globalization in the aftermath of COVID-19: emerging new geographies in a locked world. Springer, Heidelberg
Korstanje ME, Séraphin H, Maingi SW (2022) Tourism through troubled times: challenges and opportunities of the tourism industry in the 21st century. Emerald Publishing Limited, Bingley
Krosnick JA (2018) Questionnaire design. In: Vannette D, Krosnick J (eds) The Palgrave handbook of survey research. Palgrave Macmillan, Cham, pp. 439–455
Kumar S, Nafi SM (2020) Impact of COVID-19 pandemic on tourism: perceptions from Bangladesh. Available at SSRN 3632798. https://doi.org/10.2139/ssrn.3632798
Landry CE, Bergstrom J, Salazar J et al. (2021) How has the COVID-19 pandemic affected outdoor recreation in the US? A revealed preference approach. Appl Econ Perspect Policy 43(1):443–457. https://doi.org/10.1002/aepp.13119
Liu Y, Ott M, Goyal N et al. (2019) Roberta: a robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692. https://doi.org/10.48550/arXiv.1907.11692
Liu Z, Grau-Bove J, Orr SA (2022) BERT-Flow-VAE: a weakly-supervised model for multi-label text classification. Paper presented at the 29th international conference on computational linguistics, Gyeongju, Republic of Korea, October 2022
Liu Z, Orr SA, Kumar P et al. (2023) Measuring the impact of COVID-19 on heritage sites in the UK using social media data—data repository. https://doi.org/10.5281/zenodo.8130804
Lyu JC, Han EL, Luli GK (2021) COVID-19 vaccine-related discussion on Twitter: topic modeling and sentiment analysis. J Med Internet Res 23(6):e24435. https://doi.org/10.2196/24435
Nguyen DQ, Vu T, Nguyen AT (2020) BERTweet: a pre-trained language model for English Tweets. Paper presented at the 2020 conference on empirical methods in natural language processing: system demonstrations, Online, October 2020
Office for National Statistics (2021) Coronavirus and the impact on the UK travel and tourism industry. https://www.ons.gov.uk/businessindustryandtrade/tourismindustry/articles/coronavirusandtheimpactontheuktravelandtourismindustry/2021-02-15
Office for National Statistics (2022) Overseas travel and tourism, monthly. https://www.ons.gov.uk/peoplepopulationandcommunity/leisureandtourism/datasets/monthlyoverseastravelandtourismreferencetables
Ordnance Survey (2022) Code-point open. https://www.data.gov.uk/dataset/c1e0176d-59fb-4a8c-92c9-c8b376a80687/code-point-open
Oxford Economics (2016) The impact of heritage tourism for the UK economy. https://www.oxfordeconomics.com/resource/the-impact-of-heritage-tourism-for-the-uk-economy/
Park SC, Park YC (2020) Mental health care measures in response to the 2019 novel coronavirus outbreak in Korea. Psychiatry Investig 17(2):85. https://doi.org/10.30773/pi.2020.0058
Pietsch AS, Lessmann S (2018) Topic modeling for analyzing open-ended survey responses. J Bus Anal 1(2):93–116. https://doi.org/10.1080/2573234X.2019.1590131
Ranasinghe R, Karunarathne A, Herath J (2021) After corona (COVID-19) impacts on global poverty and recovery of tourism based service economies: an appraisal. Int J Hosp Tour 1(1):52–64. https://ssrn.com/abstract=3775395
Rather RA (2021) Monitoring the impacts of tourism-based social media, risk perception and fear on tourist’s attitude and revisiting behaviour in the wake of COVID-19 pandemic. Curr Issues Tour 24(23):3275–3283. https://doi.org/10.1080/13683500.2021.1884666
Read C (2022) In crisis risk management across nations: COVID-19 wins when trust is low. In: Global risk and contingency management research in times of crisis. IGI Global, pp. 1–14
Reja U, Manfreda KL, Hlebec V et al. (2003) Open-ended vs. close-ended questions in web questionnaires. Dev Appl Stat 19(1):159–177
Ridhwan KM, Hargreaves CA (2021) Leveraging Twitter data to understand public sentiment for the COVID-19 outbreak in Singapore. Int J Inf Manag Data Insights 1(2):100021. https://doi.org/10.1016/j.jjimei.2021.100021
Sah R, Sigdel S, Ozaki A et al. (2020) Impact of COVID-19 on tourism in Nepal. J Travel Med 27(6):taaa105. https://doi.org/10.1093/jtm/taaa105
Samaroudi M, Echavarria KR, Perry L (2020) Heritage in lockdown: digital provision of memory institutions in the UK and US of America during the COVID-19 pandemic. Mus Manag Curatorsh 35(4):337–361. https://doi.org/10.1080/09647775.2020.1810483
Sanders AC, White RC, Severson LS et al. (2021) Unmasking the conversation on masks: Natural language processing for topical sentiment analysis of COVID-19 Twitter discourse. AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science, 2021:555–564
Sanh V, Debut L, Chaumond J et al. (2019) DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108. https://doi.org/10.48550/arXiv.1910.01108
Sigala M (2020) Tourism and COVID-19: impacts and implications for advancing and resetting industry and research. J Bus Res 117:312–321. https://doi.org/10.1016/j.jbusres.2020.06.015
Smith CS (2021) The pandemic is an opportunity for museums to reinvent themselves. https://www.museumsassociation.org/museums-journal/opinion/2021/05/the-pandemic-is-an-opportunity-for-museums-to-reinvent-themselves/
Sundararajan M, Taly A, Yan Q (2017) Axiomatic attribution for deep networks. In: Precup D, Teh YW (eds) Proceedings of the 34th international conference on machine learning, 6–11 August 2017. Proceedings of machine learning research, vol 70. PMLR, pp. 3319–3328
Tansley DSW, Tolle KM (2009) Scientific infrastructure. In: Hey AJ (ed) The fourth paradigm: data-intensive scientific discovery, vol 1. Microsoft Research, Redmond, WA
UK Government (2021) Coronavirus (COVID-19): Safer public places—managing public outdoor settings. https://www.gov.uk/guidance/coronavirus-covid-19-safer-public-places-managing-public-outdoor-settings#principles-for-individuals
UK Government (2022a) Coronavirus (COVID-19) in the UK. https://coronavirus.data.gov.uk/details/cases?areaType=nation&areaName=England
UK Government (2022b) COVID-19 response: living with COVID-19. https://www.gov.uk/government/publications/covid-19-response-living-with-covid-19
UK Government (2022c) Museums and galleries monthly visits. https://www.gov.uk/government/statistical-data-sets/museums-and-galleries-monthly-visits
UK Parliament (2020a) DCMS Select Committee Inquiry: the impact of COVID 19 on the heritage sector. https://committees.parliament.uk/writtenevidence/6909/pdf/#:~:text=What%20has%20been%20the%20immediate,the%20majority%20of%20their%20incomes
UK Parliament (2020b) Written evidence submitted by The National Lottery Heritage Fund/National Heritage Memorial Fund. https://committees.parliament.uk/writtenevidence/7318/html/
United Nations Educational, Scientific and Cultural Organization (2020) Museums around the world in the face of COVID-19. https://unesdoc.unesco.org/ark:/48223/pf0000373530
Vaishar A, Šťastná M (2022) Impact of the COVID-19 pandemic on rural tourism in Czechia: preliminary considerations. Curr Issues Tour 25(2):187–191. https://doi.org/10.1080/13683500.2020.1839027
VisitBritain (2020) Most visited places in the UK in 2019. https://www.visitbritain.org/sites/default/files/vb-corporate/Documents-Library/documents/England-documents/full_attractions_listing_2019_final.xlsx
VisitEngland (2021) Visitor attraction trends in England 2020. Full report. https://www.visitbritain.org/sites/default/files/vb-corporate/Domestic_Research/2022-09-06_england_attractions_2021_trends_report.pdf
Wilkinson P (2018) Irreplaceable: a history of England in 100 places. https://historicengland.org.uk/campaigns/100-places/
Williams A, Nangia N, Bowman S (2018) A broad-coverage challenge corpus for sentence understanding through inference. Paper presented at the 2018 conference of the North American Chapter of the Association for Computational Linguistics: human language technologies, New Orleans, LA, June 2018
World Tourism Organization (2015) Tourism at world heritage sites—challenges and opportunities. https://doi.org/10.18111/9789284416608
Zhang J, Zhao Y, Saleh M et al. (2020) PEGASUS: pre-training with extracted gap-sentences for abstractive summarization. In: Daumé III H, Singh A (eds) Proceedings of the 37th international conference on machine learning, 13–18 July 2020. Proceedings of machine learning research, vol 119. PMLR, pp. 11328–11339
Zheng Y, Goh E, Wen J (2020) The effects of misleading media reports about COVID-19 on Chinese tourists’ mental health: a perspective article. Anatolia 31(2):337–340. https://doi.org/10.1080/13032917.2020.1747208
Zhou B, Lapedriza A, Khosla A et al. (2017) Places: a 10 million image database for scene recognition. IEEE Trans Pattern Anal Mach Intell 40(6):1452–1464. https://doi.org/10.1109/TPAMI.2017.2723009
Author information
Contributions
All authors contributed substantially to the conception and design of this study. ZL conducted data acquisition and analysis and prepared the draft manuscript. The other authors commented and gave feedback on the draft and revised the manuscript. All authors read and approved the final manuscript and agreed to be accountable for all aspects of the work.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Informed consent
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Liu, Z., Orr, S.A., Kumar, P. et al. Measuring the impact of COVID-19 on heritage sites in the UK using social media data. Humanit Soc Sci Commun 10, 537 (2023). https://doi.org/10.1057/s41599-023-02022-0
DOI: https://doi.org/10.1057/s41599-023-02022-0