Introduction

People working in helping professions, such as healthcare providers, teachers, clergy, and police officers, are constantly exposed to the distress of others1,2. Such exposure can negatively influence their personal well-being and professional performance, which can lead to higher turnover rates, decreased empathy, and a reduced ability to help others3. The most widely used instrument to measure the positive and negative aspects of working in helping professions is the Professional Quality of Life scale (ProQOL), with the latest version, ProQOL 5, available in 28 languages1,4,5.

According to the definition and conceptual framework1, professional quality of life has two main aspects: Compassion Satisfaction (CS) and Compassion Fatigue (CF). The latter includes two facets: (1) Secondary Traumatic Stress (STS), arising from constant exposure to the traumatic experiences of others, and (2) Burnout (BO), characterized by feelings of frustration, depression, anger, and emotional exhaustion. As such, Compassion Fatigue can be seen as an indicator of the perceived negative impact of caregiving on health and well-being. In contrast, Compassion Satisfaction refers to the perceived positive impact of working in a helping profession, in terms of feelings of satisfaction related to being able to help others and to the helping work itself. Compassion Satisfaction and Compassion Fatigue are regarded as related, yet distinct constructs. This means that some people feel fulfilled by their helping work even while being emotionally exhausted, whereas others might not find their work meaningful, yet also do not feel overwhelmed or stressed by it. Reviews of studies that have used the scale showed that the level of compassion fatigue has increased worldwide in the past decade, and especially after the COVID-19 pandemic, with many health care professionals experiencing moderate to severe compassion fatigue and a reduced sense of compassion satisfaction derived from work5,6.

Despite the popularity of the ProQOL and its use in research and practice, multiple studies have highlighted psychometric challenges, particularly related to the assessment of compassion fatigue in terms of symptoms of burnout and secondary traumatic stress7,8. Therefore, a unidimensional bifactor structure has been proposed9, with Compassion Fatigue and Compassion Satisfaction regarded as the low and high ends of one and the same construct, rather than two separate dimensions. Researchers have also criticized the reliability of the Compassion Fatigue scale8. A recent systematic meta-analysis of 27 factor-analytic studies confirmed these concerns, identifying nine ProQOL items (including five reverse-scored items) that consistently failed to load as intended and noting high inter-factor correlations of burnout with both compassion satisfaction and secondary traumatic stress10. These findings suggest that the three ProQOL subscales are not always clearly separable and may require revision. Additionally, the authors highlighted that several ProQOL items lack direct reference to caregiving; one of the clearest examples is the burnout item “I am happy”, which does not explicitly mention work or helping. These items tend to cross-load onto other factors and do not meet face validity requirements10. The aforementioned concerns have led to two revised and shortened scales: the 9-item Short ProQOL11 and the 12-item Brief ProQOL-1212. Both preserve the three original dimensions (Compassion Satisfaction, Burnout, Secondary Traumatic Stress), with preliminary findings demonstrating adequate psychometric properties11,12.

The Short ProQOL, designed for rapid screening, retained the nine items with the strongest factor loadings on each theoretical construct11. The authors also selected items on the basis of conceptual coverage, retaining those consistent with the theoretical definitions of BO, CF, and CS. They confirmed its factor structure and measurement invariance across samples of Spanish, Argentinian, and Brazilian palliative care professionals. The Brief ProQOL-12 validation followed a different refinement route that integrated exploratory and confirmatory factor analysis with Rasch item-response modelling and multi-group tests12. It was validated across eight intercultural samples of helping professionals from Australia, Asia, North America, and Europe, followed by an independent validation in a separate ethnically and occupationally diverse sample. The resulting 12-item scale preserved the original theoretical structure while improving fit and internal consistency and meeting face validity requirements. It is also important to consider cultural context when using and assessing ProQOL tools. Scales originally developed in Western contexts may not fully align with other cultures. ProQOL factor structures can vary across countries, reflecting differences in healthcare systems and cultural norms13. For example, expressions of personal work satisfaction or distress may map differently onto “compassion” constructs in collectivist vs. individualist cultures14; thus, even well-designed short forms require local validation.

Both the Short ProQOL and Brief ProQOL-12 have been validated11,12 and used in several studies so far15,16,17,18, despite the lack of robust evidence showing that these shorter versions are as good as, or even better than, the longer ProQOL 5 version. To provide greater insight into the psychometric qualities of the Short ProQOL and Brief ProQOL-12, the aim of this study was to compare the psychometric validity, reliability, and pre-defined factor structure of the ProQOL 5 with the proposed theoretical models of the Short ProQOL and Brief ProQOL-12 in a large sample of helping professionals from Slovakia.

Method

Research sample

The sample consisted of 639 helping professionals from Slovakia, with 79% identifying themselves as Slovak (N = 506) and 15% as Czech (N = 94). Czech and Slovak people are highly similar linguistically, culturally, and historically, which minimizes the risk of cultural bias in the responses19. To increase the generalizability of the findings, the sample included a wide range of helping professionals, including psychologists (16%, N = 100), nurses (14%, N = 90), social workers (11%, N = 68), teachers and pedagogues (9%, N = 58), paramedics (9%, N = 56), police officers (6%, N = 38), and clergy (6%, N = 38). The majority were female (78%, N = 500), and about one-fifth were male (22%, N = 139). Their mean age was 38 years (SD = 11.3). The duration of work experience in helping professions ranged from less than 5 years (35%, N = 223) to 16 years and more (32%, N = 204). The largest group of participants was married (42%, N = 270), with 21% in a partnership (N = 132) and 26% single (N = 168).

Procedure

We conducted a cross-sectional survey of helping professionals, distributed online via social media platforms (Facebook, Instagram, Reddit, etc.) and through helping professionals’ associations, using a snowball technique. Data collection followed the ethical principles outlined in the 1964 Helsinki Declaration, including its subsequent amendments. Approval was granted by the university’s relevant ethics committee. To ensure an adequate sample size for each Confirmatory Factor Analysis (CFA) model, we both followed an established rule of thumb (a minimum of 10 observations per item) and conducted a power analysis using the semPower package in RStudio, with the expected Root Mean Square Error of Approximation (RMSEA) set below 0.0820. Based on the results of this analysis, the minimum required sample sizes were 82 participants for the ProQOL 5 model (30 items, 3 factors), 129 for the Brief ProQOL-12 model (12 items, 3 factors), and 148 for the Short ProQOL model (9 items, 3 factors). To conduct separate CFAs for each version of the scale, we randomly divided the full dataset (N = 639) into three independent subsamples using a computer-generated random number sequence, so that the subsamples were independent and representative of the full sample: 300 participants for the ProQOL 5, 169 for the Brief ProQOL-12, and 170 for the Short ProQOL.
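The random split described above can be illustrated with a minimal Python sketch. It is not the procedure actually used in the study: the seed, the function name `split_sample`, and the use of Python's `random` module are assumptions made for illustration. Only the sample sizes, item counts, and minimum sample sizes are taken from the text.

```python
import random

# Values reported in the text.
N_TOTAL = 639
SUBSAMPLE_SIZES = {"ProQOL 5": 300, "Brief ProQOL-12": 169, "Short ProQOL": 170}
ITEM_COUNTS = {"ProQOL 5": 30, "Brief ProQOL-12": 12, "Short ProQOL": 9}
MIN_N_POWER = {"ProQOL 5": 82, "Brief ProQOL-12": 129, "Short ProQOL": 148}


def split_sample(n_total, sizes, seed=2024):
    """Shuffle participant indices once and cut them into disjoint blocks.

    The seed is an arbitrary, hypothetical choice that only makes the
    illustration reproducible.
    """
    if sum(sizes.values()) != n_total:
        raise ValueError("subsample sizes must add up to the full sample")
    ids = list(range(n_total))
    random.Random(seed).shuffle(ids)
    subsamples, start = {}, 0
    for name, size in sizes.items():
        subsamples[name] = ids[start:start + size]
        start += size
    return subsamples


subsamples = split_sample(N_TOTAL, SUBSAMPLE_SIZES)

# Check each subsample against both sample-size criteria named in the text.
checks = {}
for name, ids in subsamples.items():
    meets_rule_of_thumb = len(ids) >= 10 * ITEM_COUNTS[name]
    meets_power_minimum = len(ids) >= MIN_N_POWER[name]
    checks[name] = (meets_rule_of_thumb, meets_power_minimum)
    print(name, len(ids), meets_rule_of_thumb, meets_power_minimum)
```

Because every participant index appears in exactly one block, the three subsamples are disjoint by construction. Whether they are representative of the full sample still has to be checked on the actual data.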

Research measures

Professional quality of life scale (ProQOL)

The 30-item ProQOL 51 is a self-report tool assessing three dimensions: Secondary Traumatic Stress (STS), Burnout (BO), and Compassion Satisfaction (CS). Each of the dimensions has 10 items rated on a 5-point Likert scale (1 = “never” to 5 = “very often”), indicating experiences over the last month. The general instruction is: “When you [help] people you have direct contact with their lives. As you may have found, your compassion for those you [help] can affect you in positive and negative ways. Below are some questions about your experiences, both positive and negative, as a [helper]. Consider each of the following questions about you and your current work situation. Select the answer that honestly reflects how frequently you experienced these things in the last 30 days. When answering feel free to replace the word “help” with another word that better reflects your work”.

Example items are “I feel worn out because of my work as a helper.” (burnout), “I am preoccupied with more than one person I help.” (secondary traumatic stress), and “I get satisfaction from being able to help people.” (compassion satisfaction). The Slovak version of the ProQOL 5 underwent a double back-translation process and an evaluation for conceptual equivalence, and has been validated and published on the ProQOL website21.

As described in the Introduction, the Short ProQOL retained nine items from the original ProQOL: 3 items (10, 19, 21) for burnout; 3 items (9, 13, 25) for secondary traumatic stress; and 3 items (12, 18, 30) for compassion satisfaction. It uses exactly the same response format and time frame as the original11: a 5-point Likert scale (1 = “never” to 5 = “very often”), referring to the past month.

The Brief ProQOL-12 includes all nine items of the Short ProQOL, plus one extra item for each scale: item 26 for burnout; item 24 for compassion satisfaction; and item 14 for secondary traumatic stress12 (see Supplementary Table S1). Item selection was carried out based on a systematic review and factor analyses across multiple international samples, ensuring both sound psychometric properties and clear face validity10. In the introductory guidance, participants are asked to reflect on how frequently they experienced these things in the last 7 days, and items are rated with additional time refinements (1 = “never (0 days)”, 2 = “rarely (1 day)”, 3 = “sometimes (2–3 days)”, 4 = “often (4–5 days)”, 5 = “Always (6–7 days)”). The time frame was shortened from one month to one week, and the answer options (originally from 1 = “never” to 5 = “very often”) were revised, to allow a more realistic self-assessment and to better differentiate between “Often” and “Very Often”. The Brief ProQOL-12 was validated in an ethnically diverse sample and across a wide and balanced range of helping professionals.
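The item composition of the two short forms can be written down as a simple mapping from subscale to original ProQOL 5 item numbers. The following Python sketch is only an illustration: the item numbers come from the text, whereas the function name `subscale_scores`, the example respondent, and the choice to score each subscale as the sum of its item ratings are assumptions not stated in this section.

```python
# Item numbers (from the original ProQOL 5) as listed in the text.
SHORT_ITEMS = {"BO": [10, 19, 21], "STS": [9, 13, 25], "CS": [12, 18, 30]}
BRIEF_EXTRA = {"BO": 26, "STS": 14, "CS": 24}

# The Brief ProQOL-12 is the Short ProQOL plus one extra item per subscale.
BRIEF_ITEMS = {
    scale: sorted(items + [BRIEF_EXTRA[scale]])
    for scale, items in SHORT_ITEMS.items()
}


def subscale_scores(responses, item_map):
    """Return one score per subscale.

    `responses` maps an item number to a rating from 1 to 5.
    Summing the ratings is an assumed scoring rule for this sketch.
    """
    scores = {}
    for scale, items in item_map.items():
        ratings = [responses[item] for item in items]
        if any(rating not in range(1, 6) for rating in ratings):
            raise ValueError("ratings must be integers from 1 to 5")
        scores[scale] = sum(ratings)
    return scores


# Hypothetical respondent: rating 3 on every item except item 26.
example = {item: 3 for item in range(1, 31)}
example[26] = 5

print(subscale_scores(example, SHORT_ITEMS))
print(subscale_scores(example, BRIEF_ITEMS))
```

Note that the two forms differ in time frame and response labels, so in practice the same set of ratings would not be scored under both mappings; the shared example is used here only to show how the extra items enter the Brief ProQOL-12 subscales.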

Compassion

We included two scales for measuring compassion in order to test the convergent validity of the ProQOL scales. First, the Forms of Self-Criticising/Attacking & Self-Reassuring Scale (FSCRS)22 has 22 items measuring self-criticism and the use of self-reassurance when the person faces failure. Rated on a 5-point Likert scale (1 = “not at all like me” to 5 = “extremely like me”), the FSCRS measures three aspects of self-compassion: Inadequate Self (IS), which describes the feeling of personal inadequacy (e.g., “There is a part of me that feels I am not good enough”), Hated Self (HS), which measures hostile feelings toward oneself (e.g., “I stop caring about myself”), and Reassuring Self (RS), which measures the capacity for self-compassion and the ability to self-reassure (e.g., “I find it easy to forgive myself”). For the FSCRS, we calculated a composite of the HS and IS subscales23. Reliability calculated for our sample was adequate: Composite (α = 0.800), IS (α = 0.861), HS (α = 0.703), and RS (α = 0.801).

The second scale was the Sussex-Oxford Compassion for the Self/Others scales (SOCS-S/SOCS-O)24. The SOCS scales assess compassion towards self and others, each comprising 20 items and five dimensions: Recognizing suffering (awareness of distress), Understanding suffering’s universality (viewing suffering as a shared human experience), Empathy (connecting emotionally with distress), Tolerating uncomfortable feelings (accepting emotions that arise in response to suffering), and Motivation to alleviate suffering (acting to reduce suffering). Both scales are rated on a 5-point Likert scale (1 = “not at all true” to 5 = “always true”). Example items include: SOCS-O (“When someone is going through a difficult time, I feel kindly towards them”) and SOCS-S (“I notice when I’m feeling distressed”). Reliability for our sample was adequate for both scales and all their subscales, with overall SOCS-O α = 0.906 and SOCS-S α = 0.896.

Additionally, a single self-assessment question was used to capture the respondent’s subjective experience of compassion fatigue. We formulated it in line with Stamm’s Professional Quality of Life model, which defines compassion fatigue as the combined effects of burnout and secondary traumatic stress, resulting from prolonged exposure to others’ suffering1. The single self-assessment item asked: “How often have you experienced fatigue and being too immersed in the events of the person you were trying to help, leading to a desire to avoid that person?” This was rated on a 7-point scale (1 = “a couple of times in my life”, 2 = “several times a year”, 3 = “once a month”, 4 = “a couple of times a month”, 5 = “once a week”, 6 = “several times a week”, 7 = “every day”). The single-item measure was designed as a brief indicator of compassion fatigue intensity, consistent with the conceptualization of emotional exhaustion and distress described in the ProQOL framework.

Analyses

To compare the validity and reliability of all three scales, we first conducted confirmatory factor analyses (CFA) to test the 3-factor structure. The CFAs were conducted without the use of correlated residuals or modification indices, to avoid overfitting and to maintain the integrity of the theoretical model. For model fit comparison and assessment, we used the root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), comparative fit index (CFI), and Tucker-Lewis index (TLI). Good fit was indicated by CFI and TLI values above 0.95, and an RMSEA or SRMR below 0.08. We acknowledge that RMSEA confidence intervals can be wide in smaller samples, so point estimates should be interpreted with caution in moderate samples25. In addition, we used Cronbach’s alpha and McDonald’s omega to evaluate internal consistency, with values above 0.7 indicating acceptable and values above 0.8 indicating good reliability. We assessed convergent validity by examining correlations between the three ProQOL versions and the two measures of compassion. We evaluated and compared the magnitude of correlations, with r = 0.1, 0.3, and 0.5 interpreted as small, medium, and large effect sizes, respectively26. Additionally, we checked correlations between the three ProQOL versions and the single self-assessment question that evaluated how often participants experienced fatigue, being too immersed in the events of the person they were trying to help, and a desire to avoid that person. We used average variance extracted (AVE) to measure the proportion of variance in the observed indicators that is accounted for by the latent construct and thereby to check convergent validity. An AVE value exceeding an arbitrary threshold of 0.5 indicated that the latent construct captured at least 50% of the variance in its associated indicators27, thus supporting convergent validity. Afterwards, we applied the Fornell-Larcker criterion, which states that, to support discriminant validity, the AVE value of each of any two factors must be larger than the squared correlation between those two factors28.
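To make three of the quantities above concrete, the following Python sketch computes Cronbach’s alpha from raw item ratings, AVE as the mean of squared standardized loadings, and the Fornell-Larcker comparison for a pair of factors. It is an illustration only, not the software used in the study: all function names, ratings, loadings, and correlations are hypothetical and are not taken from the study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k ratings."""
    k = len(rows[0])
    item_columns = list(zip(*rows))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


def average_variance_extracted(std_loadings):
    """AVE as the mean of the squared standardized factor loadings."""
    return sum(loading ** 2 for loading in std_loadings) / len(std_loadings)


def fornell_larcker_met(ave_a, ave_b, correlation):
    """True if both AVE values exceed the squared inter-factor correlation."""
    return min(ave_a, ave_b) > correlation ** 2


# Hypothetical ratings of five respondents on a three-item subscale.
ratings = [[4, 5, 4], [2, 2, 3], [3, 3, 3], [5, 4, 5], [1, 2, 1]]
alpha = cronbach_alpha(ratings)

# Hypothetical standardized loadings for two three-item factors.
ave_bo = average_variance_extracted([0.80, 0.70, 0.75])
ave_sts = average_variance_extracted([0.70, 0.72, 0.68])

print(round(alpha, 3), round(ave_bo, 3), round(ave_sts, 3))
print(fornell_larcker_met(ave_bo, ave_sts, correlation=0.60))
print(fornell_larcker_met(ave_bo, ave_sts, correlation=0.75))
```

In this made-up example, the second factor has an AVE just below 0.5, so it would miss the convergent validity threshold even though the pair still passes the Fornell-Larcker check at a correlation of 0.60. At a correlation of 0.75 the squared correlation exceeds the smaller AVE and the criterion is no longer met.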

Results

Dimensionality of the ProQOL 5, Brief ProQOL-12 and Short ProQOL

The ProQOL 5 demonstrated a poor fit, with all indices outside acceptable levels (CFI = 0.704, TLI = 0.679, SRMR = 0.127, RMSEA = 0.097). In contrast, both the Brief ProQOL-12 and the Short ProQOL showed good fit, with higher fit indices and lower error indices (Brief ProQOL-12: CFI = 0.957, TLI = 0.944, SRMR = 0.044, and RMSEA = 0.068; Short ProQOL: CFI = 0.976, TLI = 0.964, SRMR = 0.049, and RMSEA = 0.061) (Table 1). Notably, the RMSEA confidence interval was narrow but indicated poor fit for the ProQOL 5, was moderately wide for the Brief ProQOL-12, and was very wide for the Short ProQOL. See Supplementary Tables S2, S3, and S4 for the factor loadings of the ProQOL 5, Brief ProQOL-12, and Short ProQOL.

Table 1 Goodness-of-fit comparison statistics for ProQOL 5, Brief ProQOL-12, and Short ProQOL.

Reliability of the ProQOL 5, Brief ProQOL-12 and Short ProQOL

For all three scales we found similar and sufficient reliability of all three subscales (Table 2), except for burnout (α = 0.731; ω = 0.661) in the ProQOL 5.

Table 2 Reliability coefficients of the ProQOL 5, Brief ProQOL-12 and Short ProQOL.

Convergent validity of the ProQOL 5, Brief ProQOL-12, and Short ProQOL

All three versions were significantly and moderately strongly related to the FSCRS total score (Table 3). In addition, the ProQOL 5 and Brief ProQOL-12 (especially the facets of Burnout and Compassion Satisfaction) showed similar correlations with the SOCS (for others and for the self). The Short ProQOL showed fewer significant associations with the SOCS for others. The pattern of correlations generally confirms the hypothesized correlation between compassion fatigue and measures of self-criticism, as well as the association of compassion satisfaction with measures of compassion for self and others.

Additionally, we assessed convergent validity through correlations with a single screening question about how often participants experienced fatigue and a desire to avoid the person they were trying to help. Results indicated significant correlations between this single-item screening question and the ProQOL subscales, with no significant differences among the three versions. Specifically, the answer to the screening question was significantly and positively related to Burnout (ProQOL 5: rho = 0.453; Brief ProQOL-12: rho = 0.501; Short ProQOL: rho = 0.417) and Secondary Traumatic Stress (ProQOL 5: rho = 0.481; Brief ProQOL-12: rho = 0.458; Short ProQOL: rho = 0.454), and negatively and to a lesser extent related to Compassion Satisfaction (ProQOL 5: rho = −0.178; Brief ProQOL-12: rho = −0.286; Short ProQOL: rho = −0.222).

Table 3 Correlation coefficients for ProQOL 5, Brief ProQOL-12, and Short ProQOL.

Furthermore, we examined AVE values to assess both convergent and discriminant validity (see Table 2). The ProQOL 5 showed lower AVE values for all three facets (i.e., burnout, secondary traumatic stress, compassion satisfaction) than the two shortened versions. AVE values for the shortened versions indicated that at least 50% of the variance in the observed items of the Brief ProQOL-12 and Short ProQOL is explained by the latent variables, suggesting a strong relationship between the items and their constructs. These results supported the convergent validity of both the Brief and Short versions, but not of the original ProQOL 5.

The Fornell-Larcker criterion for discriminant validity was met for all pairs of subscales in all three ProQOL scales, except for the ProQOL 5 subscales of Burnout and Secondary Traumatic Stress. This suggests that the ProQOL 5 has weaker discriminant validity than the Brief and Short ProQOL, due to substantial overlap between the STS and BO scales. All items in the Brief ProQOL-12 and Short ProQOL had significant loadings on their respective factors, while for the ProQOL 5 two burnout items (4 and 29) did not reach the significance level.

Discussion

The current study evaluated and compared the psychometric properties of the ProQOL 5, the 12-item Brief ProQOL-12, and the 9-item Short ProQOL in a large sample of helping professionals from Slovakia. The main result is that both the Brief and the Short ProQOL showed an excellent model fit and a more distinct factor structure than the original ProQOL 5, together with adequate Cronbach’s α for all subscales (> 0.8). Both the Brief and Short ProQOL separated the distinct facets of Compassion Satisfaction, Secondary Traumatic Stress, and Burnout more clearly than the original 30-item ProQOL 5. Our results corroborate previous reports that the 12-item Brief ProQOL-12 achieved an excellent CFA fit and improved reliability over the 30-item measure12, and that the 9-item Short ProQOL showed robust CFA fit and invariance11. Our results also confirm the reliability and validity of the 9-item Short version. When interpreting these findings, it should be taken into account that the time frame of the 12-item version (i.e., one week) is shorter than that of the short and full versions. While this may have affected the responses to the items, we did not find evidence that the difference in time frames influenced the psychometric properties.

All items of the Brief and Short scales had significant loadings on their intended factors, while in the ProQOL 5, two Burnout items (4 and 29) failed to reach significance and there was a strong overlap between the BO and STS subscales. This pattern mirrors prior findings that failed to confirm Stamm’s three-factor model and showed that the factors were not clearly distinct in CFA9. It is also supported by a meta-analysis in which multiple BO and STS items misloaded in many samples and which concluded that the scale needs revision10. In contrast, the Brief and Short versions had higher AVE values and clear factor distinctions, indicating better convergent and discriminant validity overall.

One likely reason for the short forms’ success is the improved face validity of their items. In both the Short and Brief versions, item selection was driven by clarity and relevance to the measured constructs, while the original ProQOL 5 includes items that make no direct reference to helping and thus do not evaluate Compassion Satisfaction, Burnout, or Secondary Traumatic Stress directly (e.g., “I am happy”, “I have beliefs that sustain me”)12. By contrast, the Short and Brief ProQOL intentionally omit such items and, as we observed, these improvements appear to produce cleaner factor solutions.

Another important consideration is the trade-off between brevity and coverage. Shortened scales reduce respondent burden, because longer surveys require more time, produce more missing data, and have higher refusal rates11. Thus, they can improve completion rates and data quality, but at the same time reducing the number of items risks construct underrepresentation29: fewer items mean that potentially fewer aspects of burnout or satisfaction are covered. Our results show that the 12- and 9-item forms maintained high reliability (α > 0.80), but the Brief ProQOL-12, with its additional item per subscale, may offer a better balance between brevity and construct coverage.

Finally, our study underscores the importance of cultural context. Differences in healthcare systems and national culture can affect how burnout and compassion-related constructs manifest, with the possibility of non-invariance across countries or cultures13. Our positive results suggest that both the Brief and Short ProQOL function well psychometrically in the context of Slovakia, although additional studies are needed across different language and cultural groups. Cultural norms such as emotional expression, work-related stress, and the value placed on compassion itself could influence how these constructs are reported and experienced, and might affect scale performance30.

Overall, our results add to the growing evidence of psychometric problems with the ProQOL 5, especially concerns related to its factor structure7,8,9,12,31,32, and support the use of the Brief ProQOL-12 and Short ProQOL.

Practical implications

The Brief ProQOL-12 and the Short ProQOL can be regarded as valid, reliable, and robust questionnaires for assessing compassion fatigue, burnout, and compassion satisfaction in helping professionals. Their brief formats may support their implementation in practice to monitor the presence of compassion fatigue and satisfaction, to identify people at risk of elevated levels of fatigue and/or reduced levels of satisfaction, and to offer them support.

Limitations and future research

While this study enhances understanding of the psychometric properties of the ProQOL, several limitations need to be considered. First, the cross-sectional design did not allow us to assess the test-retest reliability of the ProQOL scales. Nor did we include scales that can be assumed to be unrelated (or only weakly related) to compassion fatigue and satisfaction, which would have allowed a more rigorous test of the discriminant validity of the scales. Second, the online distribution of the survey may have introduced volunteer and self-selection biases; respondents who chose to participate may differ systematically (e.g., in distress level or motivation) from those who did not, potentially limiting generalizability. Third, relying on self-report measures can introduce social desirability bias and inaccurate self-assessment, especially for sensitive or negatively framed items. Additionally, the use of different time frames across the ProQOL versions (i.e., one month for the ProQOL 5 and Short ProQOL and one week for the Brief ProQOL-12) may have introduced method variance, as the time reference can influence how respondents recall and report their experiences. Future research should include established measures such as the Maslach Burnout Inventory33 and the Secondary Traumatic Stress Scale34 to further establish convergent and discriminant properties, and might additionally re-evaluate the possibility of a unidimensional bifactor Compassion Fatigue/Satisfaction scale. Future intervention studies should also examine the sensitivity of the Brief and the Short ProQOL to meaningful levels of, and changes in, compassion fatigue and satisfaction over time. Our findings should be generalized only to the context of Slovak helping professionals, and further validation is needed across different cultural and professional contexts.

Conclusion

In conclusion, the study findings support the Brief ProQOL-12 and the Short ProQOL as more reliable and valid tools than the ProQOL 5 for assessing the professional quality of life of helping professionals in Slovakia. These two shorter scales omit problematic items and focus on core symptoms, which likely contributes to their superior performance. The Brief and the Short ProQOL may therefore be valuable for both research and practical applications and have the potential to enhance professionals’ well-being by allowing for quick screening, continuous monitoring, and tailored interventions for helping professionals. Finally, the Brief ProQOL-12, with an additional item per subscale, may offer a better balance between brevity and construct coverage.