Introduction

Anthropogenic noise is a major source of pollution affecting urban and natural landscapes around the world, and was recognised as an emerging issue of environmental concern by the United Nations Environment Programme (UNEP) in 2022 (ref. 1). While this issue is typically associated with urban areas, Protected Natural Areas (PNAs) are disrupted by noise as well2. Spots of outstanding natural beauty are their most fragile parts when exploited as tourist attractions, which often results in degraded biodiversity3. Beyond preserving endangered landscapes and enabling biodiversity conservation, PNAs are an essential resource that allows visitors to experience the positive well-being effects of being in nature4,5,6, which makes them of prime interest for soundscape research as understood in the ISO 12913 Acoustics - Soundscape series (refs. 7,8,9).

Part 1 of the ISO 12913 series defines soundscape as an acoustic environment as perceived by people, in context7. This has, in a way, set up soundscape research as a human perception-focused, mixed methods-based discipline, developed around questionnaire tools and/or interviews and environmental acoustics measurements, investigating how people perceive the sounds of a place. However, research and policy in PNAs have traditionally approached sound-related human activity primarily from a noise mitigation perspective, mostly focusing on traffic-related noise reduction and mainly relying on physical metrics such as sound pressure levels. This approach overlooks a critical dimension: how humans perceive and respond to sound in context. Growing evidence suggests that natural sounds can enhance well-being10,11,12,13,14,15 and that perceptual outcomes from noise depend not only on energy-related metrics but also on context and meaning. Despite this, perceptual and experiential aspects of the acoustic environment remain underexplored in natural settings, especially in comparison to the extensive body of urban soundscape research, which highlights a notable research gap.

A holistic investigation of environmental sounds is characteristic of the soundscape approach outlined in the ISO 12913, which was implemented in this study by conducting participative socio-acoustic surveys and binaural acoustic measurements to characterise an acoustic environment in PNAs and observing its effect on human perception. In the two subsections below, we first briefly examine how soundscape issues in PNAs are addressed in policy and research, followed by introducing the ISO 12913 soundscape framework, which provides the conceptual and methodological foundation for this study’s pioneering application in mountainous protected areas.

Acoustic quality in PNAs: positive soundscape in policy and research

International institutions such as the United Nations Educational, Scientific and Cultural Organisation (UNESCO) and the International Union for Conservation of Nature (IUCN) have developed protection guidelines to be applied to valuable natural areas around the world. These areas require management strategies and often share the risk of overtourism16,17,18,19. The associated management plans, usually built on historical field data on the physical characteristics of an area and its social, cultural and economic significance, include aspects related to aesthetics and visitors’ experience20. Regarding the appraisal of positive sound sources in the management documents, natural sounds and noise are occasionally mentioned, but those mentions usually provide few or no actionable points. This issue will be briefly illustrated later in this study in the description of the case study sites (see Methods). This implies that more research is needed to characterise the acoustic environments and soundscapes of PNAs so that they can be implemented in the protection documentation in a meaningful way, informing strategies to manage visitors’ behaviour and the risks of overtourism.

Within the European Noise Directive published in 2002 (ref. 21) and the subsequent European Environment Agency Technical report No 4/2014 Good Practice on Quiet Areas22, PNAs are treated together with other exurban areas, sharing criteria for categorisation as quiet areas and the associated ‘quiet targets’, where soundscape is one of the key perceptual indicators alongside the environmental acoustic measurements. It is important to note that, in general, exurban areas receive less attention than urban ones and, while acknowledged as very important, soundscape criteria are mentioned in a very vague manner. This is due to a lack of comparable perceptual data between studies, as many different approaches have been used to characterize the soundscape construct, such as tranquillity and wildness23,24 or the perceived affective quality25.

Studies investigating environmental sounds in PNAs are often focused on reporting sound pressure level-derived metrics26,27,28,29 and sound source type characterization as the main qualitative feature30,31. Various level-based indices have been employed from the fields of environmental acoustics and acoustic ecology to explain the frequency content and characterize the temporal changes of the audio signal with the aim of assessing noise pollution levels and detecting the presence of species32,33. These studies, usually based on long-term measurements by sensor networks deployed in PNAs and noise propagation models, rely on sound pressure level (SPL)-based indices, such as LAeq and Lden for cumulative noise exposure over a whole day. These are often calculated at the sensor node, but raw audio can also be collected for subsequent analyses. Despite numerous studies showing evidence that an audio signal analysis-only approach cannot explain perceptual and behavioural outcomes of the human experience in sufficient detail34,35,36, the number of studies employing the ISO 12913 Acoustics: Soundscape framework in PNAs or similar mixed methods approaches is extremely limited.

Ferrari et al.37 have found that anthropogenic sounds have a negative influence on the perceived recreational quality in PNAs. The same holds for a noise level increase beyond 38 dBA37, which is a very conservative value compared to urban areas, where a typical threshold for acoustic comfort is considered to be around 65 dBA38. This implies that an increase in the popularity of a site and in the number of visits can have an adverse effect, not only on the natural habitats but also on the visitors themselves, by further contributing to noise pollution. It also points to the role of context, understood as people's sense of what kind of place they find themselves in and what it means to them.

Measuring soundscapes: the ISO 12913 series

The environmental acoustic metrics required by Parts 2 and 3 of the ISO/TS 12913 (refs. 8,9) include the psychoacoustic measurements, or sound quality metrics, developed by Zwicker & Fastl39 initially for the purpose of evaluating auditory characteristics of machinery and products, and defined by the respective international standards as shown in Table 1. Regarding the qualitative data, Annex C of the ISO/TS 12913-2 features three different tools: two questionnaires (Method A and Method B) and a narrative interview (Method C). The Method B questionnaire was designed for use in soundwalks, while Method A can be deployed as a traditional on-site survey, in a soundwalk or in laboratory settings. In the 6 years since the publication of the ISO/TS, Method A has become the most widely accepted approach40 and is the one adopted in this study. It features the assessment of the perceived affective quality (PAQ), based on the circumplex model featuring a two-dimensional perceptual space defined by the orthogonal main axes, labelled as Pleasant and Eventful34, as shown in this study’s Results (Figs. 2 and 3).
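As an illustration of how the two circumplex coordinates can be obtained from the eight PAQ attribute ratings, the sketch below follows the projection formula given in ISO/TS 12913-3. It assumes 5-point ratings (1 to 5) and rescales the result to the range -1 to 1. The function name `iso_coordinates`, the dictionary keys and the scaling are illustrative assumptions and not necessarily the exact processing used in this study.

```python
import math

COS45 = math.cos(math.radians(45))
# Largest attainable raw coordinate with 1-5 ratings: 4 + sqrt(32)
SCALE = 4 + math.sqrt(32)


def iso_coordinates(paq):
    """Project eight PAQ ratings (1-5) onto ISO Pleasantness and ISO Eventfulness."""
    pleasantness = (
        (paq["pleasant"] - paq["annoying"])
        + COS45 * (paq["calm"] - paq["chaotic"])
        + COS45 * (paq["vibrant"] - paq["monotonous"])
    )
    eventfulness = (
        (paq["eventful"] - paq["uneventful"])
        + COS45 * (paq["chaotic"] - paq["calm"])
        + COS45 * (paq["vibrant"] - paq["monotonous"])
    )
    # Rescale so that both coordinates lie between -1 and 1 (assumed convention)
    return pleasantness / SCALE, eventfulness / SCALE


# Hypothetical response describing a calm soundscape
calm_response = {
    "pleasant": 5, "annoying": 1, "calm": 5, "chaotic": 1,
    "vibrant": 3, "monotonous": 3, "eventful": 1, "uneventful": 5,
}
p, e = iso_coordinates(calm_response)
```

With this convention, a response falls in the calm quadrant when the first coordinate is positive and the second is negative, as in the hypothetical example.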

Table 1 Environmental acoustic measures required and recommended per ISO/TS 12913-28.

Soundwalk is the recommended method for obtaining human responses based on a participatory listening walk along a (predetermined) route, featuring a number of listening stops (measurement points) and a number of participants gathered at the location for the specific purpose of the soundwalk8. However, most of the research that fed into the ISO 12913 Acoustics – Soundscape series was conducted in urban environments, with an urban setting in mind, where a tolerance to certain noise sources is perhaps an integral part of the urban soundscape aesthetics. Mlynarczyk & Wiciak47 have compared urban soundscape data48 with the perceptual data from a national park in laboratory conditions using the “virtual soundwalk approach”49, showing that the majority of recordings from the national park were mapped in the pleasant and uneventful space. However, while there is a growing number of studies exploring soundscape pleasantness and eventfulness in various urban settings and laboratory conditions40, to the best of the authors’ knowledge there are no available studies conducting soundscape investigations in PNAs in a way compliant with the ISO recommendations for assessments in situ.

Study objectives

This study, based on the five expeditions conducted by the Silenzi in Quota initiative50,51, aims to address the identified research gap by providing evidence about the application of the ISO 12913 framework in PNAs and deepening the understanding of the effect of environmental sounds on human perception in PNAs. This is achieved by gathering perceptual in situ data at hard-to-reach locations and investigating the associations between the key (psycho)acoustic metrics and perceptual measurements. The manuscript is structured to answer the following Research Questions:

  1. How are the perceptual, context-related measurements (perceived sound sources dominance and overall perceived visual quality of the environment) influencing the perceived soundscape quality (pleasantness and eventfulness) in PNAs? (RQ1)

  2. What are the (psycho)acoustic features influencing perceived soundscape quality (pleasantness and eventfulness) in PNAs? (RQ2)

Results

Acoustic measurements

The range of acoustic conditions observed across all the measurement points is described in Table 2 in terms of both acoustic and psychoacoustic variables. The investigated sites ranged from very quiet to rather loud environments, with an overall range of nearly 45 dB. The full details of all the acoustic measurements taken, per site, are available at the online repository52.

Table 2 The range of acoustic conditions across all the measurement points, based on five expeditions in four PNAs (five in Italy, one in the United Kingdom (UK)) and 23 audio recordings.

Perceptual measurements

The perceived dominance of sound sources is illustrated in Fig. 1, highlighting the character of the study locations covered by the soundwalks. These areas are characterized by the dominance of human sounds (e.g., voices, moderately, a lot, or completely dominating in 51% of cases, overall N: 435) and natural sounds, such as those produced by animals (dominating in 48% of cases, N: 438), water (44%, N: 439), and wind (33%, N: 435). Traffic noise and other noises (e.g., sirens or industrial sounds) are generally not heard (traffic: moderately, a lot, or completely dominating in 11% of cases, N: 439; other noise: 5%, N: 436).

Fig. 1 Perceived dominance of different sound types, based on a varying number of observations (N = 435–439) across 28 listening stops.

Regarding the visual landscape, the evaluations are, as expected, very positive. In 94% of the evaluations the visual landscape is rated as good or very good (N: 439).

Relationship between sound sources dominance, overall visual quality, soundscape pleasantness and eventfulness (RQ1)

The results of LMM1_P for ISO Pleasantness show a significant effect of the dominance of traffic noise (χ2 (4) = 15.105, p = 0.005, η2 = 0.14; the degrees of freedom are reported in brackets), other sounds (e.g., sirens, construction, industry, loading of goods) (χ2 (1) = 4.036, p = 0.045, η2 = 0.04), sounds generated by other human beings (χ2 (1) = 53.663, p < 0.001, η2 = 0.49), water sound (χ2 (1) = 4.327, p = 0.037, η2 = 0.04), and the quality of the visual landscape (χ2 (1) = 21.693, p < 0.001, η2 = 0.20). Specifically, greater ISO Pleasantness is associated with less traffic noise, construction noise and human voices, more dominant sound produced by water features, and better landscape quality (see Table 3). Gender, age, mountain sports habits, and the dominance of animal and wind sounds are not found to be significantly associated with the ISO Pleasantness of the sound environment.

Table 3 Results of LMM1_P and LMM1_E models reporting estimates, p-values and VIF/GVIF values for each fixed effect within the computed models for ISO Pleasantness and ISO Eventfulness.

As regards ISO Eventfulness, LMM1_E indicates a significant main effect of the dominance of traffic noise (χ2 (4) = 7.203, p = 0.045, η2 = 0.03), human voices (χ2 (1) = 74.099, p < 0.001, η2 = 0.91), and water sounds (χ2 (1) = 4.390, p = 0.036, η2 = 0.05). Higher eventfulness is associated with more dominant traffic noise, human sounds and water sounds (see Table 3).

The soundscape assessments are represented in Fig. 2 with evaluations divided into two groups based on the perceived dominance of sounds (low or high, for traffic noise in Fig. 2a, other noise in Fig. 2b, human sounds in Fig. 2c, animal sounds in Fig. 2d, wind sounds in Fig. 2e, and water sounds in Fig. 2f) or the perceived quality of the landscape (low or high, as in Fig. 2g).

Relationship between the (psycho)acoustic features and soundscape pleasantness and eventfulness (RQ2)

The single-parameter models (LMM2_P to LMM10_P) for ISO Pleasantness show a significant association with the A-weighted continuous equivalent sound pressure level LAeq,T (χ2 (1) = 6.789, p = 0.009), the level variability LAF,5 - LAF,95 (χ2 (1) = 8.765, p = 0.003), tonality (χ2 (1) = 27.332, p < 0.001), and fluctuation strength (χ2 (1) = 27.230, p < 0.001). Higher sound levels, sound level variation over time, tonality, and fluctuation strength values correspond to less pleasant and more annoying soundscapes (see Table 4).
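For readers less familiar with the two level-based indices above, the following minimal sketch shows how an equivalent continuous level and a percentile-level difference can be computed from a series of short-term A-weighted levels. It assumes equally spaced samples and a simple nearest-rank convention for the exceedance levels; the function names and the sample values are hypothetical, and the measurement chain used in this study may differ.

```python
import math


def laeq(levels_db):
    """Equivalent continuous level: energy average of short-term levels in dB."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


def exceedance_level(levels_db, percent):
    """Level exceeded for `percent` % of the samples (nearest-rank convention)."""
    ordered = sorted(levels_db, reverse=True)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def level_variability(levels_db):
    """Difference between the 5 % and 95 % exceedance levels."""
    return exceedance_level(levels_db, 5) - exceedance_level(levels_db, 95)


# Hypothetical short-term levels in dB(A)
samples = [38.0, 40.0, 41.0, 39.0, 55.0, 42.0, 40.0, 39.0, 41.0, 43.0]
print(round(laeq(samples), 1), level_variability(samples))
```

Because the average is taken over sound energy rather than over decibel values, a single loud event (55 dB in the hypothetical series) pulls the equivalent level well above the typical short-term level.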

Table 4 Results of LMM models reporting estimates, and p-values for each fixed effect within the computed models for ISO Pleasantness and ISO Eventfulness.

Regarding the modelling of ISO Eventfulness, the single-parameter models (2 to 10) exhibit a significant correlation with the A-weighted continuous equivalent sound pressure level LAeq,T (χ2 (1) = 20.328, p < 0.001), the level variability LAF,5 - LAF,95 (χ2 (1) = 8.7652, p = 0.003), loudness (χ2 (1) = 5.6013, p = 0.018), tonality (χ2 (1) = 28.068, p < 0.001), roughness (χ2 (1) = 4.979, p = 0.026), and fluctuation strength (χ2 (1) = 19.454, p < 0.001). Specifically, more eventful soundscapes are associated with higher sound levels, level variation over time, loudness values, tonality, roughness, and fluctuation strength values.

The effects of sound pressure level (Fig. 3a), sound level variability (Fig. 3b), loudness (Fig. 3c), tonality (Fig. 3d), roughness (Fig. 3e) and fluctuation strength (Fig. 3f) on soundscape are illustrated in Fig. 3, where the dataset is divided into two sub-samples based on the median value of each (psycho)acoustic variable (see Table 2). This allows for a comparison of soundscape contours (i.e., the curves representing the 50th percentiles) according to high vs. low levels of sound, loudness, and tonality. We can notice that responses scoring high in these psychoacoustic values are generally neutral in terms of pleasantness and more eventful. In quieter locations, with less sound level variation, lower roughness, tonality and fluctuation strength the soundscape contours are generally positioned in an area of greater pleasantness and lower eventfulness, thus resulting in a calmer soundscape. Moreover, it can be noticed that the two soundscape contours based on the median value of tonality are particularly distinct and separate, clearly defining an eventful zone with high tonality values and a calm zone with low tonality.
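The median split used for this comparison can be sketched as follows. This is only an illustration under stated assumptions: the record structure, the `tonality` key and the rule that values equal to the median go to the low group are hypothetical choices, not a description of the exact procedure used for Fig. 3.

```python
from statistics import median


def median_split(responses, key):
    """Split records into low and high groups around the median of `key`."""
    cut = median(record[key] for record in responses)
    low = [record for record in responses if record[key] <= cut]
    high = [record for record in responses if record[key] > cut]
    return cut, low, high


# Hypothetical questionnaire responses paired with a measured tonality value (tu)
responses = [
    {"tonality": 0.05, "pleasantness": 0.6, "eventfulness": -0.3},
    {"tonality": 0.10, "pleasantness": 0.4, "eventfulness": -0.1},
    {"tonality": 0.25, "pleasantness": 0.0, "eventfulness": 0.2},
    {"tonality": 0.40, "pleasantness": -0.2, "eventfulness": 0.5},
]
cut, low, high = median_split(responses, "tonality")
```

Each group can then be summarised separately in the pleasantness and eventfulness plane to compare the two contours.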

The AIC, the Rm2 and Rc2 coefficients are reported in Table 5, with lower AIC values corresponding to higher predictive power of the model, and higher R2 associated with a higher proportion of variance in the dependent variable explained by the independent variables.

Table 5 AIC, marginal and conditional R2 of the LMM for each dependent variable.

For both ISO Pleasantness and ISO Eventfulness, perceptual models (LMM1_P and LMM1_E) outperform psychoacoustic models, resulting in considerably lower AIC values, especially for pleasantness. Among psychoacoustic ones, single-parameter models based on tonality (LMM7_P) and fluctuation strength (LMM10_P) are the most effective for predicting pleasantness, corresponding to lower AIC values. Regarding eventfulness, the tonality parameter (LMM7_E) has a similar performance in predicting eventfulness compared to perceptual models (i.e., within 2 AIC units).

Interestingly, the marginal (R2 m) coefficients of determination are considerably lower than the conditional (R2 c) ones for each model. This outcome suggests that a greater proportion of the variance was accounted for by random effects related to the experimental design (i.e., participants, locations nested in sites) rather than by fixed effects (i.e., perceptions and measurements).
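The relationship between the three model-comparison quantities can be made concrete with a short sketch. It uses the standard definition of AIC and the commonly used variance-partitioning definitions of marginal and conditional R2 for mixed models; the function names and the variance components in the example are hypothetical and are not taken from Table 5.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_parameters - 2 * log_likelihood


def r2_marginal_conditional(var_fixed, var_random, var_residual):
    """Marginal R2 (fixed effects only) and conditional R2 (fixed plus random effects).

    var_random is the sum of the variances of all random-effect terms.
    """
    total = var_fixed + var_random + var_residual
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components: random effects dominate the explained variance
r2_m, r2_c = r2_marginal_conditional(var_fixed=1.0, var_random=2.0, var_residual=1.0)
print(r2_m, r2_c)
```

A large gap between the two coefficients, as in the hypothetical example, arises whenever the random-effect variance is large relative to the variance explained by the fixed effects.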

Discussion

Interpretation

RQ1—How are perceived sound source dominance and overall perceived visual quality of the environment influencing the perceived soundscape pleasantness and eventfulness in PNAs?

The effects of the perceived sound source dominance and the overall perceived visual quality of the environment on ISO Pleasantness and ISO Eventfulness were explored using the questionnaire results only. The questionnaire item investigating the composition of natural sound source types, from the ISO/TS 12913-2 (ref. 8), was expanded into three additional questions to capture animal, wind-driven and water sounds. This more detailed sound source dominance questionnaire revealed that different types of natural sounds contribute to ISO Eventfulness in different ways. Namely, the animal (Q4.4) and wind (Q4.5) sounds showed no significant effect, but dominance of water sounds (Q4.5) exhibited a positive correlation with ISO Eventfulness. This implies that in natural areas, a more detailed sound source type appraisal is useful.

Watts et al.53 explored the combined effect of the acoustic environment, as captured by microphone-based sensors, together with the content of environmental sounds and the context they are experienced in, and demonstrated that in urban environments the presence of visible vegetation can increase human tolerance to noise53. In urban parks, it was found that higher human presence under a certain threshold would increase both auditory and visual satisfaction with an environment54. In “more extreme” urban environments, such as central business districts, the dominance of human sounds has also been found to be associated with higher ISO Pleasantness55. However, this study indicated that an increase in dominance of human sounds leads to a decrease in ISO Pleasantness. This difference is most likely driven by the expectations people have when visiting PNAs, which are different than in cities, both in urban parks and central business areas. Visiting a natural site is an effort implying both planning and financial cost, aimed at escaping everyday urban environments and achieving a connection with nature. Not meeting such expectations likely results in a feeling of disappointment. Indeed, this is similar to the findings by Pérez-Martínez et al.56 who reported a decrease in pleasantness associated with human sounds at a cultural heritage site with a strong (over)tourism component in Granada, Spain.

Papadakis et al.57 have looked into the influence of different expectations driving ISO Pleasantness and ISO Eventfulness, namely the residence and participants’ background as a proxy for familiarity with certain urban acoustic environments. Indeed, familiarity was the third dimension, following valence and arousal, recognized by Axelsson et al.34. In this study, Q3 (Do you often (at least once a month) practice mountain sports?) was used as the proxy for familiarity with natural areas similar to the ones investigated, but no effect was found through the analysis. This is in line with Yang et al.58, who looked at the effect of tourism and showed that both residents and visitors display equal appreciation of natural sounds.

The questionnaire-based models LMM1_P and LMM1_E demonstrated the highest predictive power, as assessed by observing the AIC, the Rm2 and Rc2 coefficients (see Table 5). This speaks for the potential of using crowd-sourced questionnaire data from soundwalks or equivalent smartphone-based applications, such as those described in refs. 59,60, over traditional sound level monitoring stations for predicting soundscape quality. This is in line with other similar studies comparing the physiological and psychophysical models61. Additionally, the higher LMM1_P and LMM1_E performance implies the benefit of accounting for the types of sources which are audible, highlighting the potential application of machine learning-based automatic source recognition methodologies62 to characterize soundscapes in natural areas. While the focus of this study was to observe the effect of human activity on the soundscape of PNAs, this finding is in line with other studies investigating the effect of traffic noise on annoyance, where perceptual models tend to outperform the ones based on psychoacoustic features only63. LMM1_P performed significantly better than LMM1_E, confirming the higher difficulty in predicting eventfulness/content compared to pleasantness/comfort already found for urban34 and indoor soundscapes64.

Regarding the effect of the visual context, it is important to note that the distribution of Q8 (Overall, how would you describe the present surrounding visual environment?) responses is skewed towards very positive. This was expected, given that all the soundwalks took place in areas that are tourist attractions. A positive correlation was found between the overall visual quality and ISO Pleasantness, in line with the findings from other studies in urban parks where it was found that a more attractive natural scene can improve soundscape65. However, the number of negative soundscape quality assessments in this study still proves that not even the very high visual attractiveness of a site is sufficient to ensure a high-quality natural environment and its soundscape.

RQ2—What are the (psycho)acoustic features influencing perceived soundscape pleasantness and eventfulness in PNAs?

The (psycho)acoustic measurements that displayed the strongest effect on ISO Pleasantness were tonality (T), fluctuation strength (F), LAF5,T - LAF95,T and LAeq,T. Those with the strongest effect on ISO Eventfulness were T, F, LAeq,T, roughness (R) and loudness (Nrmc). Tonality emerged as the main psychoacoustic feature affecting both perceived soundscape pleasantness and eventfulness. The model reveals negative coefficients for ISO Pleasantness (i.e., higher tonality leads to higher annoyance) and positive coefficients for ISO Eventfulness; hence, following the structure of the soundscape circumplex model, one could infer that higher tonality in the acoustic environment of the PNAs included in this study is related to a higher perceived sense of chaos (i.e., a soundscape that features negative ISO Pleasantness and positive ISO Eventfulness can be defined as chaotic).

At the sites investigated in this study, higher tonality (between 0.1 and 0.4 tu) seems to be associated with higher perceived dominance of human sounds (voices from people in this case), as shown in Table 6. This is in line with findings by Yang & Kang66, where it was observed that a high presence of human speech can result in tonality around 0.1 tu. While they66 observed birdsong to be usually more tonal (between 0.5 and 0.8 tu), in this study it (Q4.4) did not result in tonality higher than 0.4 tu. It is important to note that such psychoacoustic measures are highly dependent on the overall acoustic context, and that all the measurements were performed on samples of complex environments containing a multitude of sound sources in random relationship, including their random relative distances. In an urban context, due to the presence of more dominant anthropic sound sources (e.g., traffic noise, mechanical sounds) that are not present in PNAs, human voices do not stand out as particularly tonal sound sources as they are “masked” by the urban noise background. In such a context tonality often reaches higher values, above 0.4 tu in cases of acoustic environments containing anthropogenic sounds such as church bells or music66,67. Therefore, the range of tonality values observed in this study still falls in the ‘low tonality range’, demonstrating the importance of considering context when assessing complex auditory environments.

Table 6 Spearman correlation coefficients and p-values between the psychophysical measures and perceived sound source type dominance.

Other studies looking at the effects of psychoacoustic measures on ISO Pleasantness and ISO Eventfulness performed in urban context, including large urban parks, have found a strong effect of loudness, sharpness and LAeq, while the effect of tonality was noted but was found to be less important than in this study46.

While the association between the dominance of human sounds and annoyance is clear, it is important to note that human sounds are in fact the most frequent sound source type observed across the sample (Fig. 1). Indeed, up to a certain threshold, Ednie et al.68 have found that urban visitors still prefer to experience urban noises in protected areas. Taking tonality as a proxy for human sound presence (see Table 6), we can derive threshold values for ISO Pleasantness and ISO Eventfulness based on linear regression models, taking as the threshold the tonality at which the fitted score crosses zero. These are T = 1.248 tu for ISO Pleasant (ISO Pleasantness = 42.653 − 34.17 T, p < 0.001, R2adj = 0.53) and T > 0.021 tu for ISO Eventful (ISO Eventfulness = −0.503 + 23.777 T, p < 0.001, R2adj = 0.45). Therefore, a tonality threshold indicating chaotic soundscapes (i.e., both unpleasant and eventful) in PNAs could be as low as 0.021 tu.
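The threshold values quoted above can be reproduced as the zero crossings of the fitted lines, as in the sketch below. It assumes a negative tonality slope for ISO Pleasantness and a negative intercept for ISO Eventfulness, which is what the reported thresholds and the coefficient signs stated earlier in this section imply; the helper name `zero_crossing` is hypothetical.

```python
def zero_crossing(intercept, slope):
    """Predictor value at which the fitted line y = intercept + slope * x equals zero."""
    return -intercept / slope


# Coefficients as reported in the text, with the signs assumed as described above
t_pleasant = zero_crossing(intercept=42.653, slope=-34.17)
t_eventful = zero_crossing(intercept=-0.503, slope=23.777)

print(round(t_pleasant, 3))  # tonality (tu) above which predicted pleasantness turns negative
print(round(t_eventful, 3))  # tonality (tu) above which predicted eventfulness turns positive
```

Since the eventfulness line crosses zero at a much lower tonality than the pleasantness line, the first of the two conditions for a chaotic soundscape is met at very low tonality values.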

Fluctuation Strength (F) is a psychoacoustic measure indicating the presence of low modulation frequencies in an audio signal (around 4 Hz). Typically, F is associated with the presence of sound sources such as wind farm noise, but also human speech69. In this study it was tied to human sounds, similarly to tonality. This is not uncommon66 and it is a feature that was found to be positively associated with ISO Eventfulness and negatively associated with ISO Pleasantness in urban context as well. Based on linear regressions on the collected data, a fluctuation strength of F > 1.78 vacil is likely to cause negative ISO Pleasantness (ISO Pleasantness = 40.283 − 34.183 F, p < 0.001, R2adj = 0.53), while an indicative threshold for ISO Eventful is 0.011 vacil (ISO Eventfulness = −0.269 + 23.796 F, p < 0.001, R2adj = 0.45). Therefore, a fluctuation strength indicating chaotic soundscapes in PNAs would be F > 0.011 vacil. On the other hand, Pheasant et al.70 have reported thresholds of LAmax < 55 dB and LAeq < 42 dB to achieve a high tranquillity score. These findings were obtained in laboratory settings, based on 32-s-long audio samples. While this study does not attempt to make a direct connection between the dimensions present in the soundscape circumplex model and the tranquillity construct, in Fig. 3a it can be observed that the threshold for a calm and pleasant soundscape lies somewhat higher, at around LAeq < 48 dB. This can probably be explained by the following facts: (1) this study was conducted on-site, where a wider range of sound sources is present in their ecologically true setting, and (2) dominance of water sounds proved to be associated with a pleasant soundscape, yet a number of sites close to waterfalls that were captured in this study feature LAeq values > 42 dB. Most importantly, our study revealed tonality to be a better perceptual predictor than LAeq.

A practical implication for monitoring and assessment of soundscape in PNAs is that both subjective and objective measurements are necessary for accurate characterisation following the ISO 12913 framework7,8,9, and that the ability to accurately monitor tonality and fluctuation strength on-site is more important than controlling sound pressure levels only. Moreover, applying management policies to improve the sound-related behaviour of visitors, i.e. lowering their “noise footprint”71, such as the one demonstrated by Stack et al.72, is crucial for ensuring a positive experience of natural areas.

Limitations and future pathways

PNAs are expected to feature a very high variability in human presence from overcrowded beauty spots and the associated walking paths and roads, to the parts that almost never get visited. Both types of sites can suffer from anthropogenic noise. This study is biased towards capturing the effect of overcrowding. However, even in such conditions, recruitment and obtaining consistent data can pose a challenge when compared to urban conditions. Method A presented in the Annex C of the ISO/TS 12913-28 was considered to provide a solid solution to characterize soundscape in PNAs using subjective questionnaire data and objective acoustic measurements. The large spread of responses within the two-dimensional circumplex space, and the large spread of measured (psycho)acoustic indices confirm that such conditions can be captured via this type of soundwalks.

However, it must be noted that conducting a soundwalk in a remote area brings up several challenges: the size of the area that can be covered is limited, the duration of the walk must be manageable for most participants, the number of participants cannot be too large without starting to bias the results, and the data are limited to the accessible hiking paths.

While it can be argued that leading a soundwalk with a group of participants represents a less ecologically valid approach to characterizing soundscapes due to the bias of ‘participants’ presence’ and the fact that participants at the last stop are likely more attentive to the whole procedure than at the first stop, the authors argue that this approach still ensures the following key advantages compared to different sampling strategies, such as the one employed by Ferrari et al.37: (1) all the ratings from each listening stop relate to the same environmental conditions, (2) a number of questionnaire responses can be collected in one day characterising a hiking path of up to 12 km length. Also, we believe our characterization is relevant for the typical hiking experience (in a group), as lone hiking is neither typical nor recommended for safety reasons. The bias of experiencing a site within a group of people, compared to the experience of a lone visitor, was also not considered significant, as it is not uncommon to encounter other visitors in these popular mountain environments.

The questionnaire tool chosen for this study based on its popularity for soundscape research40 was developed by using sample locations characteristic of urban environments. Studies exploring the applicability of that tool in different contexts, such as indoor residential environments, have suggested some modifications to the attributes used but have confirmed the underlying structure of a valence-arousal circumplex model. Therefore, it was considered adequate for this study and has provided meaningful results that can be interpreted in a logical way. However, as most of the responses are gathered along the diagonal between chaotic and calm soundscapes, future research might be needed to properly address the state of excitement while exploring wilderness, which might be different from the calm, pleasant or vibrant dimensions. Moreover, before the establishment of the ISO 12913 series, tranquillity was one of the perceptual constructs that received more consistent attention from the research community when it comes to using mixed methods approaches to explain perceptual outcomes of exposure to an environment. Herzog and Barnes used it to characterize quietness and quiet places73. It was redefined and extensively studied in both urban and natural areas by Watts and Pheasant24,70,74,75. Contextual features, such as the presence of visual natural features in a scene, were established as key factors contributing to the construct70, but the association with quietness and calmness was kept. A complementary construct introduced by Pheasant and Watts, aimed at providing a more detailed characterization of natural settings, was wildness23. It included considerations of felt remoteness and naturalness23,76, which provides a possible direction for further exploration of the optimal attributes for assessing soundscape in PNAs.

A negligible number of participants used the opportunity to provide more information in the open-ended Q9 (Do you have any comment on this listening point?). This is most likely because writing during a soundwalk in such locations could be considered impractical, which speaks for the use of box-ticking questionnaires. For that reason, the use of short, structured interviews after the soundwalk sessions should be considered in future work to provide richer data sets and more opportunities to interpret the questionnaire data accurately.

Regarding the statistical analysis strategy, the two multivariable models (LMM1_P and LMM1_E) were built to evaluate the effects of the perceptual, questionnaire-based variables (RQ1). However, multiple models were built for different (psycho)acoustic sensor-based variables (RQ2). We observed a certain degree of collinearity, which contributed to the decision not to build a single model featuring all the variables, nor a model with a subset of them, but rather to explore the effect of the various (psycho)acoustic variables independently, in an exploratory manner. This was done for the sake of interpretability and to avoid standard error inflation, despite not accounting for the shared variance between the predictors. Moreover, such a choice was considered suitable for the exploratory nature of this study and the aim to evaluate the impact of specific variables suggested by the ISO 12913 series7,8,9, taking into account the multidimensional nature of soundscape-driven problems. Indeed, future research, based on a larger data set and featuring a greater variety of environments, will enable the development of soundscape predictive models for PNAs, such as those demonstrated by Mitchell et al. or Ooi et al.46,77.

The first soundwalks organised within the Silenzi in Quota initiative took place in 2022. This study reports on the implementation of the ISO 12913 framework, not previously tested in mountainous and natural exurban areas to this extent. This work paved the way for future standardisation of soundscape investigations in PNAs and provided evidence for a sustainable approach to visitors’ numbers and behaviour. The importance of investigating the influence of the exurban context on soundscape has been highlighted, together with some limitations of the current ISO 12913 framework when applied in large PNAs. Sound type categories and psychoacoustic features displayed a clearly different pattern from those found in urban contexts, as visitors can easily become the most critical noise source themselves.

Methods

This study is based on a mixed methods approach featuring five participatory walks conducted on-site, during which subjective data was collected from the participants via a questionnaire tool simultaneously with short-term environmental acoustic measurements.

Sites

Five walking routes located within PNAs in the north of Italy (N = 4) and Scotland, United Kingdom (N = 1) were investigated on a one-session-per-route basis, over a period of 14 months between April 2022 and June 2023. The protection status of the natural areas investigated includes inscription on the UNESCO World Heritage List78 and National Park status79. The four walking routes in Italy are located within the following three natural areas, all within the zones inscribed in the Dolomites UNESCO World Heritage property: Parco naturale Fanes-Sennes Braies (session Lago di Braies), Parco naturale Paneveggio – Pale di San Martino (sessions Val Venegia and Passo Rolle) and Parco naturale Tre Cime (session Tre Cime di Lavaredo). The walking route in the United Kingdom is within the Cairngorms National Park (session Glen Lui). Throughout the text, the five routes are referred to by their respective session names in Table 7, which are similar to the names chosen in the calls for participation via the webpage50.

Table 7 List of the five soundwalk sessions with route characteristics.

None of the UNESCO documents related to the Dolomites World Heritage Property, available online at the corresponding UNESCO-managed webpage78, mention any of the following keywords: sound, noise and/or acoustic. The Cairngorms National Park Authority documentation mentions the dominance of natural sounds within the section on Special Landscape Qualities – Visual and Sensory Qualities and provides brief descriptions of the auditory experiences characteristic of specific types of landscapes within the Park80. The section Good Design in National Park81 mentions the potential of a well-designed development to reduce overall emissions, including noise, but the good design case studies provide no further details, according to a brief review by the authors.

Questionnaire

The questionnaire was structured as per the Method A of the Annex C8 as follows: (1) basic demographic information, including familiarity with hiking, (2) sound source identification per sound type (sounds of technology, sounds of nature, sounds of human beings), (3) perceived affective quality of the present sound environment, (4) overall quality of the surrounding sound environment, (5) appropriateness of the surrounding sound environment to the present place. The Method A-type questionnaire was then expanded to capture a more nuanced characterization of the sounds of nature, the perceived overall visual quality of the present place, and participants’ experience in mountain sports to account for the possible effect of familiarity. The questionnaire was administered in Italian and English, referring to Aletta et al.82 for the translation of perceptual attributes. Questionnaire items are described in Table 8, while the complete questionnaire in Italian and English is provided in Supplementary Material.

Table 8 Questionnaire items in English and Italian.

A total of 443 questionnaires was submitted in paper form. Data was cleaned during the manual entry into a digital form. No full questionnaire was discarded but occasional missing data was observed, i.e. for certain questionnaire items, there are no more than 435 responses available.

Participants

A total of 88 participants (Lago di Braies (N = 14), Val Venegia (N = 6), Passo Rolle (N = 18), Glen Lui (N = 25), Tre Cime di Lavaredo (N = 25)) attended the five walks. The reported mean age was 35.6 years; the youngest participant was 19 and the eldest 77, giving an age range of 58 years. Four participants did not report their age but were not excluded from the sample. 40 (45%) participants reported their gender as female, 45 (51%) as male and two (4%) preferred not to answer the question. 59 (67%) participants reported that they often practice mountain sports such as hiking, outdoor climbing or skiing, while 29 (33%) participants reported that they do not practice those activities often. The majority of participants across the five walks were different, with a small possibility that a few attended multiple walks in Italy. This was not controlled for in the analysis due to the data anonymization process. The participants were recruited from the general public, usually 1–2 months ahead of the soundwalk, via public calls posted on social networks such as LinkedIn, Facebook and X. Data about the walking route, elevation, length and duration were advertised in the call, allowing for a fitness self-assessment.

As the research involved human participants, the study design was reviewed and approved by the Ethics Committee at the Bartlett School of Environment, Energy and Resources, University College London (registered under Z6364106/2023/05/08 social research), while procedures in place at the Institutional Research Offices at EURAC Research and University of Trento were followed for questionnaire administration based on the principle of informed consent. The participants provided their informed consent in written form following the online distribution of the Participation Information Sheet prior to each soundwalk. Additionally, for all the soundwalks, a written informed consent for publication was provided by participants to show individual images in the research publications and social media, including online open access publications. All the methods were performed in accordance with the Declaration of Helsinki83.

Audio recordings and environmental acoustic measurements

All audio recordings and measurements were performed by an operator wearing the head-mounted binaural microphone kit (BHS II by HEAD acoustics) during the questionnaire, as shown in Fig. 4b. During some sessions a head and torso simulator was present as well, as shown in Figs. 4a and 6, but that data was not used in this manuscript as priority was given to the head-mounted kit for consistency. The front-end devices varied between the sessions (SQuadriga III and SQobold by HEAD acoustics), but all the systems were Class 1 compliant, with the BHS II-specific equalisation engaged and set to ID84, and were calibrated following the same procedure using the 94 dB 1 kHz sine wave generator for all sessions.
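The role of the 94 dB, 1 kHz calibration tone can be illustrated with a minimal sketch. It assumes the recorder stores unitless digital samples and that levels are expressed in dB re 20 µPa; the function names and the synthetic signals are hypothetical, and this is not the HEAD acoustics workflow used in the study.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascals (20 µPa)


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibration_factor(cal_samples, cal_level_db=94.0):
    """Pascals per digital unit, derived from a recorded calibrator tone."""
    target_rms_pa = P_REF * 10 ** (cal_level_db / 20)  # about 1 Pa for 94 dB
    return target_rms_pa / rms(cal_samples)


def level_db(samples, factor):
    """Unweighted equivalent level of a calibrated signal, in dB re 20 µPa."""
    return 20 * math.log10(rms(samples) * factor / P_REF)


# Synthetic 1 kHz calibrator tone, one second at 48 kHz (illustrative only).
fs = 48000
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
factor = calibration_factor(tone)

# A signal with one tenth of the amplitude sits 20 dB below the calibrator.
quieter = [0.1 * s for s in tone]
cal_level = level_db(tone, factor)
quiet_level = level_db(quieter, factor)
```

Applying the same factor to every recording of a session is what makes levels comparable across front-end devices, under the assumption that the gain does not change after calibration.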

Procedure

Each of the five routes featured a number of listening stops. A total of 28 evaluation points (listening points) were recorded altogether (Lago di Braies (N = 7) – Fig. 5a, Val Venegia (N = 8) – Fig. 5b, Passo Rolle (N = 4) – Fig. 5c, Glen Lui (N = 6) – Fig. 5d, Tre Cime di Lavaredo (N = 3) – Fig. 5e). The exact locations of the listening stops, shown in Fig. 5, were recorded with the GPS tool integrated in the binaural measurement kit and added manually where the measurement device lost connection with the satellites.

All five walking routes were selected so that most of the stops are within the administrative borders of a protected natural area. It was expected that in a protected natural area whose management is focused on protection and tourism, visitors’ expectations of the overall sensory experience would be higher, so the message about possible issues with environmental noise would be received more strongly. Moreover, one of the walks (Lago di Braies, Fig. 5a) was selected in the knowledge that there is a high chance of encountering crowds. Further considerations included accessibility by transport and the walkability of the trail for inexperienced hikers, for risk management purposes. All the routes were formed of existing hiking trails, following recommendations from the official guides. The locations of the listening spots were decided ahead of the walks by observing two key criteria: (1) distance in relation to the whole walk, for pragmatic reasons, (2) diversity of sonic experiences that were to be expected during the walk, based on scouting. The authors believe this kind of sampling is inevitable in studies that combine research with public engagement and that the research focus is not jeopardized in any way, i.e. completely random location sampling would not improve the quality with which the research questions are answered.

Participants and researchers walked along the predefined route as a group. While walking, participants were free to talk and interact with each other as typical visitors would. To minimise the disturbance to other visitors and the environment, we either sought advice from local guides or had them accompany us on the walks, following their recommended behaviour patterns, i.e. walking in line, making space for other visitors, not damaging the undergrowth, and giving particular attention to specific species. At each listening stop, researchers invited participants to choose a spot where they felt comfortable in relation to the walking path, topography, other participants and other visitors, and then to face the same view as the researcher handling the binaural recording system or the head and torso simulator (Fig. 6), followed by listening in silence for a minute and filling in a questionnaire (Fig. 4c). Meanwhile, the researchers collected at least 3 min of calibrated binaural recordings before proceeding to the next listening point. This method aimed to ensure that the audio recorded by the operator corresponds to what participants heard while completing the questionnaire, allowing for some small variability between the participants. During the expedition, team members also collected photos and video footage of the soundwalk for social media and outreach activities. However, care was taken not to disturb the listening moments, avoiding noise from cameras, operator movements, and drones.

Data analysis

Data cleaning

A total of 28 audio recordings was made. A data cleaning protocol was performed in which two researchers independently listened to each of the recordings and visually inspected spectrograms using the software package ArtemiS SUITE 12.985. Five recordings were discarded due to excessive wind noise and were not included in further acoustic analyses, as suggested by Lyons et al.86. During the same listening sessions, 1-minute excerpts were selected for the analysis from the recordings made on-site, which were usually 3 minutes long. This proved to be a duration that could be consistently applied to all 23 recordings after discarding the parts affected by wind or handling noise.

Acoustic analysis

ArtemiS SUITE 12.9 software package85 was employed to calculate environmental acoustic metrics, following the recommendations from the ISO/TS 12913-2 and ISO/TS 12913-38,9 as per Table 1.

Perceptual data

Following the recommendations from the Part 3 of the ISO/TS 129139, the following formula has been applied to calculate coordinates of the perceptual outcomes of the eight attributes in the Q5 and enable interpretation within the two-dimensional perceptual space defined by the axes representing “ISO Pleasantness” and “ISO Eventfulness”:

$$\text{ISO Pleasantness} = \frac{(p - a) + \cos 45^\circ \,(ca - ch) + \cos 45^\circ \,(v - m)}{4 + \sqrt{32}}$$
$$\text{ISO Eventfulness} = \frac{(e - u) + \cos 45^\circ \,(ch - ca) + \cos 45^\circ \,(v - m)}{4 + \sqrt{32}}$$

where a is annoying, ca is calm, ch is chaotic, e is eventful, m is monotonous, p is pleasant, u is uneventful, v is vibrant.
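The projection of the eight attribute ratings onto the two axes can be sketched as follows. This is a minimal illustration that assumes each attribute is rated on a 5-point scale (1 to 5), in which case the divisor 4 + √32 maps both coordinates to the range from -1 to 1; the function name and the example ratings are hypothetical.

```python
import math

COS45 = math.cos(math.radians(45))
SCALE = 4 + math.sqrt(32)  # normalising divisor for 1-5 ratings (assumed scale)


def iso_coordinates(p, a, ca, ch, v, m, e, u):
    """Return (ISO Pleasantness, ISO Eventfulness) from eight attribute ratings.

    p: pleasant, a: annoying, ca: calm, ch: chaotic,
    v: vibrant, m: monotonous, e: eventful, u: uneventful.
    """
    pleasantness = ((p - a) + COS45 * (ca - ch) + COS45 * (v - m)) / SCALE
    eventfulness = ((e - u) + COS45 * (ch - ca) + COS45 * (v - m)) / SCALE
    return pleasantness, eventfulness


# Invented responses, for illustration only.
# Maximally pleasant response with balanced eventfulness.
best = iso_coordinates(p=5, a=1, ca=5, ch=1, v=5, m=1, e=3, u=3)
# A response that rates every attribute at the scale midpoint.
neutral = iso_coordinates(p=3, a=3, ca=3, ch=3, v=3, m=3, e=3, u=3)
```

Under the 5-point assumption, the first response lands at the positive end of the pleasantness axis with zero eventfulness, and the all-midpoint response lands at the origin.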

Statistical analysis

Ten Linear Mixed-Effects Models (LMM) were computed, as shown in Table 4, with the following aims: LMM1 to explore associations between soundscape perception and the perceived sound source dominance, perceived visual quality and soundscape, while accounting for individual age, gender, and habit of experiencing the mountains (regular vs. occasional visitor) (RQ1); LMM2 to LMM10 were designed as single-parameter models and computed to test the ability of a set of nine acoustic and psychoacoustic metrics to predict soundscape perception. This approach was preferred to using multivariable models since the aim of RQ2 was to provide findings that are easy to interpret and easy to implement in the monitoring of PNAs at the sensor node, where simplicity and efficiency are critical, minimising storage and post-processing issues87. Models are described in Table 9.

Table 9 Specification of model equations. Equal models were considered for both ISO Pleasantness (LMM_P) and ISO Eventfulness (LMM_E) scores.

The experimental activity employed two independent factors with different levels each: Site (five levels) as a between-subject factor, and Evaluation Point (between 3 and 7 levels depending on the Site) as a within-subject factor.

Considering the repeated-measures nature of the experimental design, the authors adopted Linear Mixed-Effects Models (LMM) using the statistical software R88 and the R package lme489, considering multiple LMMs for each dependent variable. The basic theory of the LMM is that subjects’ responses are the sum of fixed factors, which are the variables of interest controlled during the study, and random factors that can influence the covariance of the data.

Concerning the generation of the model, the independent variables used as fixed effects were survey scores and measured acoustic variables. Participants were treated as a random factor. A random intercept varying among Sites and Evaluation Points was included in each model concerning the nested random effects (i.e., Evaluation Points nested in Sites). In addition, a by-subject random intercept was added to estimate the variance in the outcomes related to the different individuals90. The specification of the general final model was as follows:

Dependent Variable ~ Independent Variable + (1|SiteID /EvaluationPointID) + (1|Participant_SiteID).
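Written out as an equation, the specification above can be read as the following sketch. The indices and symbols are introduced here for illustration only: s denotes the site, p the evaluation point within a site, and i the participant. In lme4 formula syntax, the nested term (1|SiteID/EvaluationPointID) expands to a random intercept for each site plus a random intercept for each evaluation point within its site.

$$y_{spi} = \beta_0 + \beta_1 x_{spi} + a_s + b_{sp} + c_i + \varepsilon_{spi}$$
$$a_s \sim \mathcal{N}(0, \sigma_a^2), \quad b_{sp} \sim \mathcal{N}(0, \sigma_b^2), \quad c_i \sim \mathcal{N}(0, \sigma_c^2), \quad \varepsilon_{spi} \sim \mathcal{N}(0, \sigma^2)$$

Here y is the dependent variable (ISO Pleasantness or ISO Eventfulness), x is the independent variable with fixed-effect coefficient β1, a, b and c are the random intercepts for site, evaluation point within site, and participant, and ε is the residual error.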

Ten models were created and tested for each dependent variable, i.e., ISO Pleasantness (LMM_P) and ISO Eventfulness (LMM_E) scores, thus resulting in a total of twenty computed LMMs.

LMMs were computed after verifying the assumption of normality and homogeneity of residual data distributions. Variance Inflation Factor (VIF) or Generalized VIF (GVIF), in case of categorical predictor, were computed to diagnose collinearity for each predictor.
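As an illustration of the collinearity diagnostic, the sketch below computes the VIF for the simplest case of two continuous predictors, where it reduces to 1 / (1 - r²) with r the Pearson correlation between them. The function names and data values are invented for the example; the study itself computed VIF and GVIF in R on the fitted models.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values (not data from the study).
metric_a = [1.0, 2.0, 3.0, 4.0, 5.0]
metric_b = [2.0, 4.0, 5.0, 4.0, 5.0]
vif = vif_two_predictors(metric_a, metric_b)  # r^2 = 0.6, so VIF is 2.5 up to rounding
```

A VIF of 1 indicates no linear association between the predictors; larger values indicate increasingly inflated standard errors for the corresponding coefficients.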

Once the models were computed, it was of interest to carry out a comparison to select the one(s) with the highest predictive power given the data, especially within the (psycho)acoustic-based models (LMM_P and LMM_E 2 to 10) and between the perceptual-based models (LMM1_P and LMM1_E). The Akaike Information Criterion (AIC) was used to compare the quality of the hypothesised models. The model with the smallest AIC has the highest predictive power, and a two-unit difference in AIC (ΔAIC = 2) is usually considered a threshold for evidence of a difference between the models91. In addition, to compare the accuracy of the tested models and represent the proportion of the total variance explained by the fixed effects and by both fixed and random effects, the marginal (R2m) and conditional (R2c) coefficients of determination were generated for each model. Indexes were estimated using the function r.squaredGLMM from the MuMIn package88,92 and interpreted using the recommended thresholds for a minimum (0.20), moderate (0.50), and strong (0.80) effect size93.
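The comparison logic can be sketched in a few lines. The example assumes the standard definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L the maximised log-likelihood; the model names, log-likelihood values and helper functions are invented for illustration and are not results from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def rank_by_aic(models):
    """Sort models by AIC and report each model's distance from the best one.

    models: dict mapping a name to (log_likelihood, n_params).
    Returns a list of (name, aic, delta_aic), best model first.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    best = min(scores.values())
    return sorted(
        ((name, score, score - best) for name, score in scores.items()),
        key=lambda item: item[1],
    )


def effect_size_label(r2):
    """Label an R2 value using the 0.20 / 0.50 / 0.80 thresholds."""
    if r2 >= 0.80:
        return "strong"
    if r2 >= 0.50:
        return "moderate"
    if r2 >= 0.20:
        return "minimum"
    return "below minimum"


# Hypothetical fits: (log-likelihood, number of estimated parameters).
candidates = {
    "model_a": (-120.0, 5),
    "model_b": (-118.5, 5),
    "model_c": (-119.9, 5),
}
ranking = rank_by_aic(candidates)
# Models whose delta AIC is at least 2 show evidence of a poorer fit than the best.
clearly_worse = [name for name, _, delta in ranking if delta >= 2]
```

In this invented example model_b has the lowest AIC, and both other candidates differ from it by more than two units.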

Fig. 2

Comparison of soundscapes based on the dominance of (a) traffic noise, (b) other noise, (c) human beings, (d) animals, (e) wind, (f) water sounds, and (g) quality of landscape. The curves represent the 50th percentile contour, and the bivariate distributions of ISO Pleasantness and ISO Eventfulness are plotted on the two axes. "L" represents low dominance (not at all, a little) or poor quality (very bad; bad) group, while "H" represents the high dominance (moderately, a lot, dominates completely) or high quality (neither good nor bad, good; very good) subsample.

Fig. 3

Comparisons of soundscapes based on the values of (a) LAeq, (b) L AF5,T-L AF95,T, (c) Nrmc, (d) T, (e) R and (f) F. The dataset was divided into two subsamples based on the median value of the three parameters. The curves represent the 50th percentile contour, and the bivariate distributions of pleasantness and eventfulness are plotted on the two axes.

Fig. 4

Data collection during soundwalks: (a) binaural recordings using a head and torso simulator in Glen Lui (data not used in this study), (b) recordings with a binaural headset at Tre Cime di Lavaredo, (c) completion of the questionnaire in paper format at Tre Cime di Lavaredo. Picture (b) and (c) credit: Mario Pedron.

Fig. 5

Source: OpenStreetMap through Outdooractive: https://www.outdooractive.com/en/94. All routes began and concluded at the same location.

Overview of the soundwalks: (a) Lago di Braies (Italy), (b) Val Venegia (Italy), (c) Passo Rolle (Italy), (d) Glen Lui (Scotland, UK), (e) Tre Cime di Lavaredo (Italy). Numbers indicate listening stops. The scale is provided by the rulers. Dark green line represents the administrative borders of the protected area, dark red line represents the walking route, while the yellow line represents the main road.

Fig. 6

The operator with the head and torso simulator (data not used in this study – see Methods, Audio recordings and environmental acoustic measurements) and the participants in the same position, looking in the same direction, listening, then filling in the questionnaire, during the session in Glen Lui, Scotland. Picture credit: Mario Pedron.