Introduction

Building on research initially rooted in organizational psychology, which has demonstrated quantifiable associations between levels of psychological safety and team performance under pressure, as well as wellbeing and job satisfaction in individual team members1,2,3,4, a growing body of literature discusses the application of the concept in sports settings5,6. An increasing number of studies suggest that psychological safety is associated with adaptive outcomes in sports, for example, higher quality coach-athlete relationships, resilience, and mental health7,8,9. Most studies investigating psychological safety have utilized Edmondson’s Team Psychological Safety Scale (TPSS)3,10, which was developed with a focus on performance development in professional teams. According to the operational definition of the TPSS, psychological safety is ‘a shared belief that the team is safe for interpersonal risk taking’[10, p. 354]. This definition characterizes psychological safety as mutual respect and trust among team members, where speaking up or being oneself does not lead to negative consequences. The emphasis is on open communication without fear of embarrassment or punishment, which has also been linked to factors such as leadership and organizational policies10,11,12,13.

In 2021, the International Olympic Committee (IOC) defined psychological safety in sports as ‘the creation of an athletic environment where athletes feel comfortable being themselves, can take necessary interpersonal risks, have the knowledge and understanding of mental health symptoms and disorders, and feel supported and comfortable in seeking help if needed’[14, p. 34]. In addition to interpersonal coherence and open communication, this definition encompasses the individual’s understanding of mental health and readiness to seek help. Given the contextual differences between organizational (e.g., business, healthcare) and sports settings, as well as the semantic gap in the interpretation of psychological safety across contexts6, it is unsurprising that a systematic review of the literature on psychological safety in sports reported that only 30% (n = 67) of articles investigating the concept provided a clear definition5. In sports, the term psychological safety was often used as a broad label to describe phenomena ranging from threat and harm to general impressions of inclusivity, equality, and respect. Based on their review, Vella and colleagues proposed defining psychological safety in sports as ‘the perception that one is protected from, or unlikely to be at risk of, psychological harm in sport’[5, p. 15].

One of the few instruments adapted for the measurement of psychological safety in sports is the Sport Psychological Safety Inventory (SPSI), which includes three subscales: mentally healthy environment, mental health literacy, and low self-stigma15. The initial validation study, conducted among Australian elite athletes and coaches, supported a three-factor correlated structure. Low scores on the mentally healthy environment subscale and high scores on the low self-stigma subscale were associated with caseness for moderate mental health distress, whereas scores on the mental health literacy subscale were not predictive of such distress15.

Although both the TPSS and SPSI have been developed to measure the concept of psychological safety and both scales have been applied in sports7,8,9,15,16, they diverge in their operational definitions. It remains unclear how these scales conceptually relate to each other and to important endpoints (e.g., health, coach-athlete relationship, performance) in sports, which has implications for the interpretation and conclusions of studies. In addition, the data collection for the SPSI validation was performed during the early stages of the COVID-19 pandemic, a period characterized by exposure to strong yet transient psychological stressors. The authors therefore called for further validation of the psychometric properties of the inventory, as well as replication studies in diverse samples and cross-cultural settings15. In response to this call, the aim of this study was to investigate the psychometric properties of the TPSS and the SPSI, including internal consistency, factorial validity, construct validity and measurement invariance, in a Swedish elite sport context.

Methods

Participants

Swedish Athletics (track and field) athletes and orienteers, ranging from junior national sub-elite to senior international elite categories and aged ≥ 15 years, were invited to participate. A total of 371 athletes (Athletics: n = 233, females = 125; orienteering: n = 138, females = 73) completed the questionnaire. The mean age was 18.72 years (SD = 4.73) in the Athletics sample and 18.93 years (SD = 3.90) in the orienteering sample. Table 1 presents descriptive statistics for participants’ mean age, the age at which they began training in their sport, training hours per week, and the number of coaches they were currently trained by, categorized by competitive level. Together, the participants represented a broad range of competitive levels within these sports.

Table 1 Descriptive statistics of participants’ competitive levels, age, age when they started training in the sport, training hours/week and number of coaches.

Study design and data collection

This study employed a cross-sectional design using an online survey. The data utilized in this study are part of a larger data collection, which included standardized questionnaires related to mental health, psychological safety, and other environmental or health prerequisites in elite Athletics and orienteering. With support from the Swedish Athletics Federation and the Swedish Orienteering Federation, an invitation containing a QR code and a weblink to the survey was distributed to all National Sports High Schools, elite clubs, high-performance environments, and national teams within the respective federations. Data were collected from April 2023 to March 2024, and the survey was completed anonymously. Data collection for Athletics was conducted using the Lynes platform (lynes.io), while data collection for orienteering was conducted using the Artologik Survey&Report platform (artologik.com). The same survey was administered on both platforms. The change of platform was driven by technical considerations and was not considered to affect the quality of data collection.

Measures

Demographics collected included age, the age at which participants had started training in the sport, self-assigned gender, number of training hours/week, and the number of coaches they were currently trained by.

The Team Psychological Safety Scale (TPSS)10 consists of seven items and was translated from English to Swedish using a back-translation procedure. Originally developed for use in organizations, the scale assesses team psychological safety, that is, the extent to which team members feel safe taking interpersonal risks such as admitting mistakes or asking for help. Respondents rate each item on a 7-point scale, ranging from 1 (“strongly disagree”) to 7 (“strongly agree”). Three items (items 1, 3 and 5) are reverse scored. Total scores range from 7 to 49, with higher scores indicating greater perceived psychological safety. While support for the reliability and validity of the TPSS has been reported in both non-sports and sports contexts9,10,16, one study found potential problems related to item 6 when the scale was used in sports7.

The eleven-item Sport Psychological Safety Inventory (SPSI)15 was translated from English to Swedish using a back-translation procedure. The SPSI operationalizes psychological safety into three subscales: mentally healthy environment (four items), mental health literacy (four items), and low self-stigma (three items). Respondents rate their answers on a five-point scale ranging from 0 (“strongly disagree”) to 4 (“strongly agree”). Three items (items 9, 10 and 11) are reverse scored. Higher scores on the subscales indicate a higher level of perceived mentally healthy environment (total score range: 0–16), higher mental health literacy (total score range: 0–16), and lower self-stigma (total score range: 0–12). The initial validation provided support for the scale’s internal consistency and a three-factor correlated structure15.
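For illustration, a minimal Python scoring sketch for the two scales is given below. The item column names (tpss_1–tpss_7, spsi_1–spsi_11) and the assumption that the SPSI items are ordered by subscale (items 1–4 mentally healthy environment, items 5–8 mental health literacy, items 9–11 low self-stigma) are hypothetical conventions for this sketch, not specifications taken from the original questionnaires.

```python
import pandas as pd

def score_tpss(df: pd.DataFrame) -> pd.Series:
    """Total TPSS score (range 7-49); items rated 1-7, items 1, 3 and 5 reverse scored."""
    items = df[[f"tpss_{i}" for i in range(1, 8)]].copy()
    for i in (1, 3, 5):
        items[f"tpss_{i}"] = 8 - items[f"tpss_{i}"]      # reverse on a 1-7 scale
    return items.sum(axis=1)

def score_spsi(df: pd.DataFrame) -> pd.DataFrame:
    """SPSI subscale totals; items rated 0-4, items 9-11 reverse scored.
    Subscale membership below is an assumption about item order."""
    items = df[[f"spsi_{i}" for i in range(1, 12)]].copy()
    for i in (9, 10, 11):
        items[f"spsi_{i}"] = 4 - items[f"spsi_{i}"]      # reverse on a 0-4 scale
    return pd.DataFrame({
        "mentally_healthy_environment": items[[f"spsi_{i}" for i in (1, 2, 3, 4)]].sum(axis=1),
        "mental_health_literacy": items[[f"spsi_{i}" for i in (5, 6, 7, 8)]].sum(axis=1),
        "low_self_stigma": items[[f"spsi_{i}" for i in (9, 10, 11)]].sum(axis=1),
    })
```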

A Swedish version of the fourteen-item Hospital Anxiety and Depression Scale (HADS) was utilized to assess anxiety (seven items) and depression (seven items)17,18. Responses are scored on a four-point scale (ranging from 0 to 3), with total scores for each subscale ranging from 0 to 21. Higher scores indicate greater levels of anxiety and depression symptoms, with a cut-off score of ≥ 11 recommended to identify probable cases of clinically significant anxiety or depression disorders19. The HADS is widely used in Swedish healthcare and has been extensively validated, demonstrating good psychometric properties17,18,19,20,21.

A Swedish version of the 11-item Coach-Athlete Relationship Questionnaire (CART-Q)22,23 was used to assess the coach-athlete relationship in terms of commitment, closeness, and complementarity. Respondents rated their responses on a seven-point scale ranging from 1 (“strongly disagree”) to 7 (“strongly agree”). The scale has been validated in various languages, demonstrating adequate psychometric properties22,23. In the present study, linguistic problems with the Swedish wording of item 2 (“I feel committed to my coach”) were identified during data screening, resulting in an unacceptably low McDonald’s omega. Problems with this item in the Swedish version of the CART-Q have also been identified previously23,24. Consequently, this item was removed in this study, while the remaining ten items were retained. The total scores for the 10-item version of the CART-Q in this study ranged from 10 to 70. A high score indicates a good quality coach-athlete relationship.

Statistical analyses

The sample characteristics, including means and standard deviations (SD), were analyzed using descriptive statistics, and scale reliability was calculated using McDonald’s omega (ω). Mann–Whitney U tests were conducted to explore differences between sports (Athletics athletes and orienteers) as well as between female and male athletes. The effect size for the Mann–Whitney U test (r) was calculated, with r < 0.3 representing a small effect and 0.3 and 0.5 the thresholds for medium and large effects, respectively25. To evaluate the construct validity of the TPSS and SPSI, Spearman rank-order correlations were calculated with scores from instruments measuring the coach-athlete relationship (CART-Q) and mental health (HADS for anxiety and depression). Given that psychological safety as a construct has been suggested to be associated with higher quality coach-athlete relationships and favorable conditions to support athletes’ mental health7,8,9, we hypothesized that psychological safety scores on both scales would be positively related to CART-Q scores and negatively related to HADS scores. Both the Mann–Whitney U test and the Spearman rank-order correlation are non-parametric tests appropriate for the ordinal data used in this study. Neither test assumes normally distributed data, as both are based on ranks of scores. However, the Mann–Whitney U test assumes similar distribution shapes across independent groups, while the Spearman rank-order correlation assumes independent observations between pairs of variables25. Descriptive analyses and non-parametric tests were performed using SPSS Statistical Package version 29.
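As an illustration of the non-parametric analyses described above, the sketch below computes a Mann–Whitney U test with the effect size r = |Z|/√N and a Spearman rank-order correlation using scipy. The analyses in this study were performed in SPSS; the code is only an approximate re-expression, the variable names are placeholders, and the Z statistic is derived from the normal approximation of U without a tie correction.

```python
import numpy as np
from scipy import stats

def mann_whitney_with_r(x, y):
    """Mann-Whitney U test plus effect size r = |Z| / sqrt(N).

    Z is obtained from the normal approximation of U (no tie correction),
    so r is approximate when many tied ranks are present.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    u, p = stats.mannwhitneyu(x, y, alternative="two-sided")
    n1, n2 = len(x), len(y)
    mu_u = n1 * n2 / 2
    sigma_u = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu_u) / sigma_u
    r = abs(z) / np.sqrt(n1 + n2)   # < 0.3 small, >= 0.3 medium, >= 0.5 large
    return u, p, r

# Construct validity: Spearman rank-order correlation between scale totals,
# e.g. TPSS total vs. CART-Q total (variable names are placeholders).
# rho, p = stats.spearmanr(tpss_total, cartq_total, nan_policy="omit")
```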

Confirmatory factor analyses (CFA) were conducted using MPlus version 8.8 to validate the factor structure of the measurement models for the TPSS and SPSI. Before conducting the CFAs, tolerance and the variance inflation factor (VIF) were investigated to diagnose collinearity. Multicollinearity is indicated by a VIF above 4 or a tolerance below 0.25; no indication of collinearity was found in the data. Additionally, Mahalanobis distance was explored to detect multivariate outliers. The Mahalanobis distance measures the distance of a case from the centroid of the other cases, with the centroid being the point where the means of all variables intersect. A case is considered a multivariate outlier if its distance exceeds the chi-square (χ²) critical value, with degrees of freedom equal to the number of variables, at a significance level of p < .00126. To prevent multivariate outliers from disproportionately influencing the results and distorting the overall model fit, which could lead to misleading conclusions about the model’s adequacy, multivariate outliers were removed prior to conducting the CFAs. Missing data were handled using pairwise deletion.
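The collinearity and outlier screening can be illustrated with the following Python sketch (numpy, scipy, statsmodels). The study used SPSS and MPlus, so this is an approximation of the procedure described above rather than the authors’ implementation, and the data frame of item responses is assumed to contain numeric item scores.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import chi2
from statsmodels.stats.outliers_influence import variance_inflation_factor

def flag_multivariate_outliers(items: pd.DataFrame, alpha: float = 0.001) -> pd.Series:
    """Flag cases whose squared Mahalanobis distance exceeds the chi-square
    critical value with df = number of items at the chosen alpha (.001 here)."""
    complete = items.dropna()
    X = complete.to_numpy(float)
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)   # squared Mahalanobis distances
    cutoff = chi2.ppf(1 - alpha, df=X.shape[1])          # e.g. chi2(7) at .999 is about 24.32
    return pd.Series(d2 > cutoff, index=complete.index)

def vif_table(items: pd.DataFrame) -> pd.DataFrame:
    """VIF and tolerance (= 1/VIF) per item; VIF > 4 or tolerance < 0.25
    would indicate problematic collinearity."""
    X = sm.add_constant(items.dropna()).to_numpy(float)
    vif = [variance_inflation_factor(X, i) for i in range(1, X.shape[1])]  # skip the constant
    return pd.DataFrame({"item": list(items.columns), "VIF": vif,
                         "tolerance": [1 / v for v in vif]})
```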

The four a priori hypothesized measurement models tested are displayed in Fig. 1. For the TPSS, and based on Edmondson’s original scale10, a one-factor a priori measurement model was tested (Fig. 1a).

Fig. 1

A priori hypothesized measurement models tested for the TPSS (a) and the SPSI (b–d).

For the SPSI, three a priori hypothesized measurement models were tested, based on findings in the initial validation study conducted by Rice et al.15:

  1. A first-order measurement model with one latent factor (Fig. 1b).

  2. A first-order measurement model with three correlated latent factors (mentally healthy environment, mental health literacy, low self-stigma) (Fig. 1c).

  3. A higher-order measurement model with one higher-order factor (psychological safety) and three latent factors (mentally healthy environment, mental health literacy, low self-stigma) (Fig. 1d).

To examine the interrelationship between the latent factors in the TPSS and the SPSI, a post hoc analysis was performed to analyze the scales together. The measurement models that displayed the most acceptable model fit for each scale were combined into a comprehensive model, with the latent factors from the two scales specified as correlated (see Fig. 2).

To assess the model fit of the hypothesized models, and because ordinal data were used, mean- and variance-adjusted weighted least squares (WLSMV) estimation was adopted to provide robust parameter estimates and standard errors27. Model fit was evaluated using the comparative fit index (CFI) and the root mean square error of approximation (RMSEA)28. A good model fit is indicated by CFI > 0.95 and RMSEA < 0.06. For RMSEA, values between 0.08 and 0.10 indicate a mediocre fit, while values > 0.10 indicate a poor-fitting model28,29,30.
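To make the hypothesized model specifications concrete, the sketch below writes the one-factor TPSS model (Fig. 1a) and the three-factor correlated SPSI model (Fig. 1c) in lavaan-style syntax and fits them with the open-source semopy package. The authors used MPlus with WLSMV; semopy’s diagonally weighted least squares (DWLS) objective is only a rough analogue, and the item names, the availability of DWLS in a given semopy version, and the exact contents of the calc_stats output are assumptions. The combined post hoc model (Fig. 2) could be specified analogously by concatenating the two model strings and, if needed, adding explicit covariance terms with the `~~` operator.

```python
import pandas as pd
import semopy

# One-factor TPSS model (Fig. 1a); item names are placeholders.
TPSS_ONE_FACTOR = """
psychological_safety =~ tpss_1 + tpss_2 + tpss_3 + tpss_4 + tpss_5 + tpss_6 + tpss_7
"""

# Three-factor correlated SPSI model (Fig. 1c); subscale membership assumed from item order.
SPSI_THREE_FACTOR = """
mentally_healthy_environment =~ spsi_1 + spsi_2 + spsi_3 + spsi_4
mental_health_literacy =~ spsi_5 + spsi_6 + spsi_7 + spsi_8
low_self_stigma =~ spsi_9 + spsi_10 + spsi_11
"""

def fit_cfa(model_desc: str, data: pd.DataFrame) -> pd.DataFrame:
    """Fit a CFA with a diagonally weighted least squares objective (a rough
    analogue of MPlus's WLSMV) and return fit statistics (chi2, df, CFI, RMSEA, ...)."""
    model = semopy.Model(model_desc)
    model.fit(data, obj="DWLS")   # fall back to the default objective if DWLS is unavailable
    return semopy.calc_stats(model)

# Interpretation used in the paper: CFI > 0.95 and RMSEA < 0.06 indicate good fit;
# RMSEA between 0.08 and 0.10 a mediocre fit, and RMSEA > 0.10 a poor fit.
```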

Measurement invariance was tested to evaluate the equivalence of the scales (TPSS and SPSI) across gender. Measurement invariance evaluates whether a construct is interpreted and assessed similarly across groups and is a prerequisite for group mean comparisons31,32. This involves analyses of increasingly constrained, nested models. First, configural invariance is established by analyzing the model fit achieved with only the factorial structure constrained across the groups of females and males. This step assesses the invariance of the dimensional model’s configuration across both groups and also serves as the baseline for further steps in the measurement invariance tests. In the second step, metric invariance is tested by constraining the factor loadings across gender. The third step focuses on scalar invariance, which requires the item thresholds to be identical for both genders. We adopted the MPlus shortcut option that automatically runs multiple group models to test measurement invariance, using the settings configural, metric and scalar33. To test whether successively more constrained models differ significantly (p <.05), and are thereby non-invariant, the shortcut option provides chi-square difference testing with scalar corrections for WLSMV33,34,35,36. The chi-square difference test is an exact fit approach, but a limitation is that the test can be overly sensitive, particularly when using large samples37. Indices of approximate fit have been discussed as a solution, usually by calculating differences in CFI (∆CFI) or RMSEA (∆RMSEA)32,37,38. However, these indices are descriptive, and there is no clear consensus on which fit indices and cut-offs should be used to assess misspecification under various conditions32,37,38. For example, simulation analyses show that ∆CFI may retain both well-fitting and poor-fitting models, creating uncertainty regarding the appropriate cut-off32. The ∆RMSEA has been reported to lack sensitivity and could therefore potentially mask misfit, particularly for models with large initial degrees of freedom37. While additional indices have been proposed (e.g., RMSEAD), they have also met objections37,39. A discrepancy between modification index values and chi-square difference tests in MPlus can also be observed when using WLSMV for ordinal data, due to the adjustments made in the chi-square difference test to accommodate this type of estimation34. Given the controversies surrounding the interpretation of various indices of approximate fit in measurement invariance testing, and considering our use of WLSMV to account for ordinal data, we decided to evaluate measurement invariance using the chi-square difference test provided in the MPlus shortcut option. We judged this method to be more reliable than the use of approximate fit indices, particularly because our sample size was not overly large. Statistical significance in all analyses was determined by a p-value < 0.05.
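As a simplified illustration of the nested-model comparison logic, the sketch below performs a naive chi-square difference test between two nested invariance models (e.g., metric vs. configural). With WLSMV the raw difference is not chi-square distributed; MPlus applies a scaling correction in its difference testing, which this sketch deliberately omits, so it should be read as a conceptual outline only.

```python
from scipy.stats import chi2

def chi2_difference_test(chi2_constrained: float, df_constrained: int,
                         chi2_baseline: float, df_baseline: int):
    """Naive chi-square difference test between nested models.

    NOTE: with WLSMV estimation the raw chi-square difference is not
    chi-square distributed; MPlus uses a scaled difference test instead,
    which is not reproduced here.
    """
    d_chi2 = chi2_constrained - chi2_baseline
    d_df = df_constrained - df_baseline
    p = chi2.sf(d_chi2, d_df)
    return d_chi2, d_df, p

# p < .05 would indicate that the added equality constraints (loadings at the
# metric step, thresholds at the scalar step) significantly worsen model fit,
# i.e., that the corresponding level of invariance does not hold.
```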

Ethics statement

The study was approved by the Swedish Ethical Review Authority (2022–03327-01). All participants were 15 years or older, and in accordance with Swedish ethical regulations, parental consent was not required. Participants provided informed consent in response to the initial survey question.

Results

Demographics

Mean and standard deviations of all scales for the two sports (Athletics and orienteering) as well as for self-assigned gender (female and male athletes) are shown in Table 2. Gender differences related to low self-stigma (SPSI) and anxiety (HADS) were revealed, with female athletes reporting lower self-stigma and higher anxiety scores than males. No other significant differences in the assessments across sports or gender were found. Table 2 also displays skewness, kurtosis, and McDonald’s omega (ω) for the scales. All scales demonstrated acceptable ω values (> 0.70) and were, except for the CART-Q, approximately normally distributed.

Table 2 Descriptive statistics of all scales for the two sports (Athletics and orienteering) and self-assigned gender. Skewness, kurtosis and McDonald’s omega (ω) for all scales are also displayed.

Construct validity

The strongest positive correlations between the psychological safety inventories and the validation instruments were observed for the TPSS and the SPSI subscale mentally healthy environment (Table 3). Although all psychological safety scales (TPSS, SPSI subscales) were significantly and negatively correlated with anxiety and depression scores, the TPSS and the subscale mentally healthy environment (SPSI) showed the strongest negative correlations.

Table 3 Associations between psychological safety inventories and validation instruments (Spearman rank-order correlations).

Confirmatory factor analyses (CFA)

Because no significant differences were found between Athletics athletes’ and orienteers’ mean scores on the psychological safety inventories (Table 2), the study participants were analyzed as one sample in the CFA. Data screening with Mahalanobis distance identified 12 multivariate outliers (χ2(7) ≥ 24.32, p ≤.001) for the TPSS. One additional case had incomplete data, and these 13 cases were excluded from further analyses, resulting in a final sample of 358 cases (females: n = 192; males: n = 166) used in the measurement invariance analyses of the TPSS. For the SPSI, eight cases were identified as multivariate outliers and four had incomplete data. The final sample used for the SPSI included 359 cases (females: n = 192; males: n = 167).

Results from all CFAs are presented in Table 4. The CFA conducted on the TPSS with one latent factor indicated a good model fit across all fit indices, while the one-factor model of the SPSI revealed a poor model fit. Analyses of the proposed SPSI three-factor correlated model and the higher-order model, in which the three latent factors were specified to load on a higher-order factor, showed acceptable model fit (with the CFI indicating an excellent fit and the RMSEA suggesting a mediocre fit).

Table 4 Confirmatory factor analyses for a priori hypothesized models of the TPSS and SPSI with chi-square (χ²) and degrees of freedom (df). Model fit was evaluated by the comparative fit index (CFI) and the root mean square error of approximation (RMSEA) with 90% confidence interval (CI).

Figure 2 presents the post hoc analysis in which the one-factor solution of the TPSS and the three-factor measurement model of the SPSI were analyzed within the same model. Mahalanobis distance identified 13 multivariate outliers (χ2(18) ≥ 42.31, p ≤.001), and four cases had incomplete data. Analyses were performed on 354 cases (females: n = 188; males: n = 166). The one latent factor of the TPSS was specified to correlate with the three latent factors of the SPSI. As shown in Table 4, this combined model demonstrated an acceptable model fit, with all fit indices reaching acceptable levels. The strongest relationship between the latent factors of the TPSS and SPSI was found between psychological safety (TPSS) and mentally healthy environment (SPSI), while the relationships between psychological safety (TPSS) and both mental health literacy (SPSI) and low self-stigma (SPSI) were weaker.

Fig. 2

Post hoc confirmatory factor analysis with the one latent factor measurement model of the TPSS and the three-factor correlated measurement model of the SPSI analysed in one model. Standardized correlations between the TPSS and the SPSI latent variables and standardized factor loadings.

Measurement invariance across gender

When females and males were analyzed separately, the TPSS (the first order model with one latent factor) displayed an acceptable to mediocre model fit for both genders (females: χ2 = 32.36(14), p <.001, CFI = 0.98, RMSEA = 0.08; males: χ2 = 30.11(14), p <.001, CFI = 0.98, RMSEA = 0.08). The SPSI (the first order model with three correlated latent factors) displayed an acceptable to poor model fit (females: χ2 = 138.74(41), p <.001, CFI = 0.97, RMSEA = 0.11; males: χ2 = 110.23(41), p <.001, CFI = 0.97, RMSEA = 0.10).

Measurement invariance tests were performed for the TPSS and the SPSI, respectively (Table 5). The shortcut chi-square difference testing suggested the TPSS to be metric and scalar invariant across genders. For the SPSI, the shortcut chi-square difference testing suggested the model to be metric but not scalar invariant across genders. To explore the invariance of individual thresholds, they were constrained one by one. The scalar-metric comparisons showed all single thresholds tested to be significant (p <.001), indicating that they were non-invariant.

Table 5 Measurement invariance (configural, metric and scalar) of the TPSS and the SPSI across genders. Configural invariance denotes the model fit achieved with only the factorial structure constrained across the groups of females and males, also serving as the baseline. Metric invariance is tested by constraining the factor loadings across gender. Scalar invariance requires the item thresholds to be identical for both genders.

Discussion

This validation study of instruments measuring psychological safety confirmed the internal consistency of the investigated scales and their proposed factor structures: a one-factor solution for the TPSS and a three-factor correlated solution for the SPSI. Consistent with the findings of Rice et al.15, a one-factor solution for the SPSI was not supported and a higher order model was not found superior to the three-factor correlated solution. The TPSS was found to be fully invariant across genders, while scalar invariance was not supported for the SPSI. Indications of non-invariance across gender present a significant challenge for researchers aiming to conduct gender comparisons with the scale. When invariance is questionable, any observed score differences may reflect measurement bias rather than true differences in the construct, rendering such comparisons scientifically meaningless32,40. Further research is desirable to investigate the measurement invariance of the scales across genders, sports, cultures and other groups that may be of interest for comparisons. Our results, however, suggest that if researchers are faced with the choice between the scales for studying gender differences related to psychological safety in sports, the TPSS may be preferable to the SPSI.

Regarding construct validity, the TPSS correlated with the indicators of mental health and the quality of the coach-athlete relationship in the theoretically expected directions. The mentally healthy environment subscale of the SPSI exhibited a pattern similar to that of the TPSS. Overall, the moderate strength of the correlation between the TPSS and the SPSI mentally healthy environment subscale, when the two scales were jointly analyzed in the post hoc CFA, suggests that these two scales partly, but not entirely, target a similar concept. The other two subscales, mental health literacy and low self-stigma, exhibited a divergent pattern, suggesting that they measure constructs that are conceptually distinct from both the TPSS and the mentally healthy environment subscale. These findings are important, yet anticipated, given the semantic differences in the definition of psychological safety across organizational and sports contexts2,3,14,15. Psychological safety has been extensively investigated in organizational settings, with several theoretical perspectives proposed to explain its mechanisms at different levels (individual, team, or organizational) and its influence on work outcomes3. In comparison, an aim of introducing the concept in sports appears to have been the identification of predictors of future mental health, as reflected in both the definition proposed by the IOC and the SPSI developed from this perspective14,15. However, the specific purpose of applying the psychological safety concept in sports remains unclear, which is also reflected in the fact that the transfer of the organizational meaning to sports settings has been contested6. The IOC publication14 that presents the definition of psychological safety on which the SPSI builds offers limited guidance because references to empirical scientific studies are lacking. This raises the question of whether describing the SPSI as a ‘sport psychological safety scale’ is constructive. Despite an acceptable model fit and internal consistency, the SPSI seems to lack a clear, empirically supported definition or theoretical foundation to guide researchers’ interpretation of scores obtained with the scale. In other words, it is unclear what the scale truly measures.

The empirical knowledge on how psychological safety in sports is perceived and influenced by various factors, as well as its relationship to different outcomes (e.g., performance, health, long-term development, motivation), is currently limited. The diverse and vague descriptions pose a risk of constraining scientific progress and practical assessments of psychological safety in sports5,6. Experiences from outside the sports domain suggest that researchers need to study not only benefits but also potential drawbacks related to psychological safety in various settings3. It is essential to ensure that recommendations related to psychological safety in sports are founded on empirical studies of high methodological quality, including valid assessments. This implies that continued research is warranted on what the SPSI measures, for example by comparing its subscales to existing scales, such as those for mental health literacy41,42 and stigma43,44. The domain (i.e., the target concept, attribute, unobserved behavior, etc.) should be clearly articulated and defined. A well-defined, theoretically supported domain is crucial for establishing construct validity and the boundaries of the construct that the scale should assess45. Moreover, the existing literature should be reviewed to establish whether existing instruments could serve the same purpose as the intended new scale. If similar scales exist, a justification for developing a new scale is required, along with an explanation of how it differs from existing instruments45. Finally, when adopting the TPSS and SPSI in sports, researchers should be cautious of the jingle fallacy, which occurs when two different scales are assumed to assess the same construct because they share the same name but in fact assess different constructs46. Jingle fallacies can lead to confusion and misinterpretation, making it challenging to compare and integrate findings across studies. When transferring a concept from one setting to another, as applies to the TPSS and SPSI in the sports setting, researchers should also be actively aware of the risk of concept creep, which can distort the original meaning of the term through semantic shifts and subsequently undermine the scientific and practical value of the construct47.

This study offers new insights into the psychometric properties of two scales used to measure psychological safety in sports. However, some limitations should be noted when interpreting the results. Although the sample comprised elite athletes across a range of ages, from junior elite to senior elite levels, it was predominantly composed of young developing athletes. Additionally, the study included only individual-sport athletes, specifically Athletics athletes and orienteers. It is possible that psychological safety, when assessed according to its organizational meaning, is a more relevant construct for athletes participating in team sports than in individual sports. This hypothesis could not be tested in this study. The population studied was from a single Scandinavian country, and cultural and educational background may also influence the results. In addition, the study did not include any coaches, support staff or other groups involved in sports environments. Therefore, future research should include both individual and team sports, as well as coaches and staff from various countries and sporting levels, to further evaluate the psychometric properties of the scales.

In conclusion, the results of this study underscore that psychological assessments used in sports should be based on judiciously developed operational definitions and carefully validated. The TPSS exhibited acceptable psychometric properties for assessing psychological safety in an elite sports context. While the SPSI three-factor correlated model demonstrated a robust factor structure and internal consistency, it was not invariant across genders. Concerns about its construct validity were also raised. These findings underscore a need for caution when using the SPSI as a measure of psychological safety in sports settings.