Introduction

Regular exposure to traumatic or stressful events can cause distress and precipitate the development of psychiatric disorders such as posttraumatic stress, depression and anxiety disorders1,2. This is especially the case within high-risk professions such as firefighting, paramedicine, and policing, where exposure to potentially traumatic events is routine3. As such, rates of psychiatric disorder are elevated in these workforces4,5. In addition, symptoms of depression, anxiety, and posttraumatic stress may be high even if they are not at a level to warrant diagnosis6. These “sub-clinical” symptoms often escalate into disorder7, and targeting them presents an opportunity for early intervention. Effective mental health intervention is critical to alleviate sub-clinical symptoms and mitigate the risk of developing a diagnosable disorder8,9. Unfortunately, those in high-risk professions often do not receive timely mental health support due to barriers including mental health stigma, low mental health literacy, and concerns surrounding privacy10,11.

A contemporary solution to these challenges has been the use of smartphone apps to support mental health12,13. High-risk professionals have expressed willingness to engage in mental health support through smartphone applications11. Digital interventions have been shown to be an acceptable way of delivering mental health support for first responders14,15 and have been used in other populations with positive (but often small) effects on depression, stress, and anxiety symptoms16. The rationale for using smartphone apps among high-risk workforces is strong as they offer access to an intervention which (i) maintains their confidentiality and avoids stigma related barriers; (ii) is easily accessible in rural or remote regions where services are limited; and (iii) allows users to engage at their own convenience, which is particularly valuable for those in shift-work roles. However, less than 10% of mental health apps that are currently available have published evidence to support their efficacy17. There is a great need to test mental health apps in rigorously designed efficacy studies.

Few smartphone mental health apps have taken a trauma-informed approach or targeted symptoms of posttraumatic stress14,18. One of the few transdiagnostic examples is a trauma-informed Mindfulness Coach app, which was piloted with military veterans with a self-reported prior diagnosis of post-traumatic stress disorder (PTSD)19. Compared to waitlist control, medium effects were found for PTSD symptoms and depression; however, study attrition was high at 68%, making interpretation of the results difficult. Similarly, the “PTSD Coach” mobile application provides assistance in managing PTSD symptoms. A systematic review and meta-analysis of the efficacy of the “PTSD Coach” app, however, found no significant pooled reduction in posttraumatic stress symptom severity in the PTSD Coach group relative to the comparison group20. This speaks to the challenges for apps in delivering an efficacious, trauma-informed approach to target symptoms of posttraumatic stress.

The Skills for Life Adjustment and Resilience (SOLAR) program is a brief, trauma-informed, skills-based intervention designed to address mental health symptoms following trauma and adversity. The SOLAR program is transdiagnostic in that it targets shared mechanisms or underlying psychological processes hypothesized to underpin depression, anxiety, and posttraumatic stress, rather than focusing on a single disorder21. It was developed to address subclinical symptoms and, therefore, to be considered as an early intervention7. The SOLAR program has growing evidence, including randomized controlled trials, supporting its efficacy21,22,23,24. In these studies, participants with subclinical symptoms reported significant reductions in posttraumatic stress, anxiety and depression symptoms, and improvements in general functioning. Participants in these trials with disorder-level symptoms also benefited significantly. Given its transdiagnostic approach and growing evidence base, the SOLAR program is a prime candidate for translation to a smartphone application.

This study aimed to test the efficacy of a smartphone app version of the SOLAR program (SOLAR-m) in a high-risk workforce: firefighting. Firefighting is a profession with high exposure to traumatic events, with ~60% of firefighters experiencing traumatic events during their work that have deeply affected them and impacted their mental health25,26. In this randomized controlled trial, firefighters were randomized to receive the SOLAR-m app or an active control app (mood monitoring). We hypothesized that participants randomized to the SOLAR-m condition would report greater reductions in depression and anxiety (total score; primary outcome) relative to the mood-monitoring app at 8 weeks (primary endpoint) and at the 3-month post-intervention follow-up (secondary endpoint). Secondary outcomes included depression (subscale score), anxiety (subscale score), posttraumatic stress symptoms, work-family conflict, and work-related performance.

Results

Figure 1 presents the CONSORT diagram. Of 567 individuals screened, N = 163 participants (29%) were enrolled and randomized. Most participants were current firefighters (96/163, 58.9%), with similar numbers from metropolitan (78/163, 47.9%) and regional or rural locations (85/163, 52.1%). On average, participants had extensive professional experience (mean service = 17.7 years, SD = 11.6 years). Table 1 shows the pre-treatment characteristics of the complete sample and the two study conditions. Baseline characteristics and credibility expectations of SOLAR-m were comparable between groups (Table 1). One participant was randomized but had no outcome data at any time point and so could not be included in the analyses (Fig. 1).

Fig. 1: CONSORT diagram.

CONSORT diagram depicting the flow of participants through the study.

Table 1 Baseline demographics of the complete sample and by intervention condition

Table 2 presents the mean and SD by intervention condition, as well as estimated between-condition differences in mean change in total depression and anxiety symptoms (primary outcome), depression, anxiety and PTSD symptom scores, at each follow-up time point. SOLAR-m resulted in a greater decrease in depression and anxiety symptoms (Hospital Anxiety and Depression Scale total score, HADS-Total) compared to the mood monitoring control app at 8 weeks (mean difference in score: –2.64, 95% CI: –4.64, –0.63, p = 0.01). This represented a small to moderate effect size (–0.33, 95% CI: –0.59, –0.08). This result held after adjusting for PCL-5 severity scores at baseline (mean difference in score: –2.48, 95% CI: –4.49, –0.48, p = 0.02). Findings were also similar in the complier average causal effect (CACE) analysis (Supplementary Information Table 1) and under the assumption of a different missing data mechanism (Supplementary Information Table 1). Results were comparable between current (mean difference in score: –2.49, 95% CI –4.72, –0.25) and former firefighters (–2.76, 95% CI –5.57, 0.06) (Supplementary Information Table 1). Considering the HADS subscales (secondary outcomes), there was a greater decrease in HADS-Depression (mean difference in score: –1.51, 95% CI: –2.72, –0.30, p = 0.01; effect size: –0.34, 95% CI: –0.62, –0.07) and HADS-Anxiety (mean difference in score: –1.18, 95% CI: –2.23, –0.13, p = 0.03; effect size: –0.30, 95% CI: –0.56, –0.03) at 8 weeks among SOLAR-m participants compared to those in the mood monitoring app group.
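As a reading aid, the relationship between a reported confidence interval and its p-value can be checked approximately. The sketch below assumes a symmetric, normal-approximation (Wald-type) interval; the trial's mixed models may have used t-based intervals, so only rough agreement should be expected. The function name `p_from_ci` is illustrative and not part of the trial's analysis code.

```python
from statistics import NormalDist


def p_from_ci(estimate, lower, upper, level=0.95):
    """Approximate two-sided p-value from an estimate and its CI.

    Assumption: the interval is symmetric and based on a normal
    approximation, so its half-width equals z_crit * standard error.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = (upper - lower) / (2 * z_crit)
    z = estimate / standard_error
    return 2 * NormalDist().cdf(-abs(z))


# Primary outcome at 8 weeks, values as reported in the text above.
p_primary = p_from_ci(-2.64, -4.64, -0.63)
print(f"approximate p = {p_primary:.3f}")
```

Under these assumptions the result is close to the reported p = 0.01, which suggests the interval and p-value are internally consistent.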

Table 2 Study outcomes for total depression and anxiety, depression, anxiety and posttraumatic stress symptoms (Intention-To-Treat population)

At the 3-month post-intervention follow-up, the estimated mean difference in scores (–1.70) indicated a greater reduction in HADS-Total in the SOLAR-m group than in the mood monitoring app group, but the CI included the null (95% CI –3.59, 0.19, p = 0.08). The mean HADS-Total scores by timepoint and group are shown in Supplementary Information Fig. 1.

SOLAR-m resulted in a greater decrease in HADS-Depression at the 3-month post-intervention follow-up (–1.32, 95% CI: –2.49, –0.15, p = 0.03; effect size: –0.28, 95% CI –0.53, –0.03) compared to the mood monitoring control group (Table 2), but not in HADS-Anxiety (–0.36, 95% CI: –1.38, 0.67, p = 0.49; effect size: –0.09, 95% CI –0.34, 0.16). The mean HADS-Depression, HADS-Anxiety and PCL-5 scores by timepoint and group are shown in Fig. 2.

Fig. 2: Mean (95% confidence interval) HADS-Depression, HADS-Anxiety and PCL-5 score by group and time.

This figure shows the mean (95% confidence interval) HADS-Depression, HADS-Anxiety and PCL-5 score by group and time (T1 = baseline, T2 = 8 weeks, T3 = 3-month post-intervention follow-up). There was a greater decrease in HADS-Depression and HADS-Anxiety at 8 weeks among SOLAR-m participants compared to those in the mood monitoring app group. SOLAR-m resulted in a greater decrease in HADS-Depression at the 3-month post-intervention follow-up. SOLAR-m resulted in a greater decrease in PTSD symptoms (PCL-5) at 8 weeks compared to the control app.

SOLAR-m resulted in a greater decrease in PTSD symptoms (PCL-5) at 8 weeks compared to the control app (mean difference in score: –5.17, 95% CI: –8.51, –1.83, p < 0.01; effect size: –0.32, 95% CI –0.52, –0.11). At the 3-month post-intervention follow-up, the CI included the null (mean difference: –2.26, 95% CI –5.52, 1.01, p = 0.18; effect size: –0.14, 95% CI –0.35, 0.06).

Participants randomized to SOLAR-m had lower mean scores for work-to-family conflict (WFC) than those using the mood monitoring app at 8 weeks (mean difference: –0.80, 95% CI: –1.58, –0.01, p = 0.05). The estimate was similar at the 3-month post-intervention follow-up, although the CI narrowly included the null (–0.84, 95% CI: –1.69, 0.01, p = 0.05). CIs for family-to-work conflict (FWC) were wide and included the null (Supplementary Information Table 3).

Results for the measures of absolute and relative absenteeism and presenteeism are shown in Supplementary Information Table 3. There were no apparent differences between intervention groups in any of these measures.

Among the 85 participants allocated to the SOLAR-m condition, 55% met criteria for treatment adherence and 44% for treatment completion. On average, users of the SOLAR-m app completed 67% of the core intervention component of the app. A sensitivity CACE analysis, which examined the intervention effect among those who adhered, found an estimated mean difference of –5.6 in change in HADS-Total from baseline to 8 weeks between intervention groups (95% CI –9.4, –1.8, p < 0.01; see Supplementary Information Table 2). That is, among participants who would adhere to SOLAR-m (complete at least five of the six active modules), allocation to SOLAR-m was associated with an estimated 5.6-point greater reduction in HADS-Total than allocation to the control app.

The majority of participants chose telephone facilitation with their app. A total of 495 facilitation telephone calls were conducted across all participants. A slightly greater proportion of participants in the SOLAR-m group (70%) than in the mood monitoring app group (55%) engaged in facilitation calls, defined as having at least two telephone calls with a facilitator. The mean duration of facilitation calls was 14.1 min (SD = 7.3 min). Among those in the SOLAR-m group, the number of facilitation sessions was positively correlated with the number of app modules completed (r = 0.61). Given that most participants in the SOLAR-m condition engaged with facilitation, outcome analyses for facilitation were not conducted.

There were no study-related adverse events. A fifth of participants continued to receive other mental health care during the course of the study—SOLAR-m (12/51; 24%) and control (9/53; 17%) participants (Supplementary Information Table 4).

Discussion

This is the first randomized controlled trial testing the efficacy of a trauma-informed, transdiagnostic mobile app targeting depression, anxiety and posttraumatic stress in a high-risk organisation. The trial used a rigorous design, including blinding and an active control condition. Our choice of an active control condition was in recognition of the “digital placebo effect”, and is in contrast to the majority of mental health app trials, which have a waitlist or inactive control27. Consistent with such an effect, we found that both the SOLAR-m app and the mood-monitoring app improved symptoms over time.

SOLAR-m was designed to target subclinical symptoms, which aligns with the baseline severity of depression and anxiety symptoms reported by participants in our study (i.e., mild/moderate range on the HADS). SOLAR-m was associated with lower depression and anxiety symptoms, overall and separately, at 8 weeks relative to the active control condition, with differences maintained at the 3-month post-intervention follow-up for depression symptoms. Interestingly, at baseline the average score for posttraumatic stress symptoms was close to the threshold level for disorder and would not be considered “subclinical”. Despite this, SOLAR-m was associated with a greater improvement in posttraumatic stress symptoms at 8 weeks compared to the control app.

The observed effect size of –0.33 for anxiety and depression (HADS-Total) for SOLAR-m was considerably larger than those reported in other studies. Meta-analyses of depression and anxiety apps have reported only small effect sizes when compared to active control conditions28,29,30. Our finding that the SOLAR-m app was associated with significant decreases in posttraumatic stress with a small-to-moderate effect size (d = –0.32) is in contrast to meta-analytical results that report no significant difference between PTSD app-based treatment and comparison groups30, or very small effect sizes20,27. It is somewhat consistent with recently published studies testing the efficacy of the Mindfulness Coach app in samples of military veterans with PTSD19,31. While methodological differences limit direct comparison (e.g., these studies involved samples with current or prior PTSD diagnosis and did not utilize an active app control condition), the finding does suggest that app delivery of interventions targeting posttraumatic stress symptoms warrants further study.

There was a comparatively high level of engagement with SOLAR-m, with 55% of participants meeting our criteria for treatment adherence (completing 5 of the 6 active modules). This rate of engagement is far higher than in similar app-based studies, where average usage rates sit around 15–30% (refs. 32,33). Observed rates of engagement with SOLAR-m may be due to the use of co-design in developing the app34 and the incorporation of key values important to firefighters, such as flexibility, security, and confidentiality10. The high engagement may also be associated with the use of a telephone facilitation model, which is consistent with systematic reviews showing that human feedback is associated with significantly higher engagement in digital app studies35,36. This is consistent with our finding that there was a strong correlation between the number of facilitation sessions and the number of modules completed. In recognition that human facilitation can add cost barriers which impact scalability, future research could test models that use artificial intelligence to serve a similar role.

The study results should be considered alongside its limitations. The final sample size (N = 163) was smaller than the original target (N = 240), which reduced the trial's ability to detect small between-group effects. Additionally, the majority of the sample were male, and it is unclear how these findings might generalise to female first responder populations. There were no exclusion criteria at baseline for higher levels of depression, anxiety or PTSD symptoms. As a result, some participants who met criteria for disorder may have enrolled in the study, conflating the concepts of early intervention and treatment. Finally, sensitivity analyses adjusting for baseline PTSD symptoms were conducted only for the primary outcome (HADS-Total), not for secondary outcomes.

This is the first RCT to test the efficacy of the SOLAR-m app with a population working in a high-risk profession and an important innovation in improving the mental health in high-risk populations. The findings support the use of SOLAR-m so first responders can address mental health symptoms in an independent, confidential, low-cost, and accessible way.

Methods

Study design

This trial was a two-arm, parallel group, triple-blind superiority randomized controlled trial, meaning that participants, investigators, and statisticians were not aware of which intervention group participants were assigned to prior to analysis. Researchers undertaking facilitation calls were not blinded as they needed to know a participant’s treatment allocation to provide facilitation support. To ensure participant blinding, all participants downloaded the same app, which contained both the SOLAR content and the mood monitoring content, with the visible content restricted according to allocated group. The randomization process determined which component of the app participants could access (SOLAR or mood-monitoring). Participants were recruited from a fire and rescue government agency in New South Wales, Australia. Full study details can be found in the trial protocol34. The study was approved by the University of Melbourne Human Research Ethics Committee (protocol number 2021-20632-18826-5) and registered with the Australian New Zealand Clinical Trials Registry (ANZCTRN12621001141831, registered on 23rd August 2021). The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.

Participants

Participants were recruited between June 2022 and May 2024 and were included if they (i) were aged above 18 years; (ii) were currently working, or had previously worked, as a career or on-call firefighter with the fire and rescue agency; (iii) self-identified as experiencing distress or difficulty managing emotions; (iv) scored ≥7 on the Kessler 6 (ref. 37); and (v) owned a smartphone. Participants were recruited through advertisements at local firefighter stations, targeted emails to firefighters, and online advertisements via social media and newsletters. A third-party recruitment service (TrialFacts) was also used to promote the study to potential participants via social media.

Procedure

Interested participants were directed to the study website for eligibility screening. On this website, participants completed the Kessler 6 (ref. 37), a brief measure of distress. Eligible participants provided written informed consent online via digital signature and completed a baseline survey. They were then provided with instructions on how to download the study app, which included in-built software that used a simple randomization sequence to allocate participants to either SOLAR-m or the mood-monitoring control according to a 1:1 allocation ratio. Participants in either group were offered weekly check-ins with a trained facilitator via telephone. The facilitator’s role was to engage with participants to ensure that the app was used as intended, to troubleshoot app-related difficulties and to re-engage disengaged users. Facilitators did not deliver SOLAR-related content but did clarify intervention-related questions. Facilitators were instructed to aim to keep calls to a maximum of 15 min. The facilitation process was structured and manualized for consistency.
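To illustrate what simple (unrestricted) randomization with a 1:1 ratio implies, the sketch below allocates each participant independently with equal probability. It is not the trial's in-app implementation, which is not described here; the function name, arm labels and seed are assumptions made for illustration only.

```python
import random

# Arm labels are illustrative; the app's internal identifiers are unknown.
ARMS = ("SOLAR-m", "mood-monitoring")


def simple_randomize(n_participants, seed=None):
    """Allocate each participant independently with probability 0.5 per arm.

    Simple randomization uses no blocking or stratification, so the
    final group sizes are not guaranteed to be equal.
    """
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    return [rng.choice(ARMS) for _ in range(n_participants)]


allocations = simple_randomize(163, seed=2021)
counts = {arm: allocations.count(arm) for arm in ARMS}
print(counts)
```

Because each allocation is independent, modest imbalance between arms is expected by chance, which is in keeping with 85 of the 163 randomized participants being allocated to SOLAR-m.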

Participants completed a post-intervention survey, which was delivered 7–9 weeks post-randomization, and a follow-up survey delivered 3 months after the post-intervention survey. For simplicity, throughout this manuscript, the 7–9-week time point is referred to as “8 weeks”. Study data were collected and managed using Research Electronic Data Capture (REDCap) tools hosted at The University of Melbourne38,39.

Interventions

Participants randomized to the intervention group received access to the smartphone-based version of the SOLAR program, referred to as SOLAR-m, which had been co-designed with firefighters. SOLAR-m consisted of eight modules of content that were delivered through animations, videos, audio, activities, and notifications. Participants could complete the modules at their own pace, although they were encouraged to complete them within a 5-week period (to be consistent with previous SOLAR trials). The 8-week assessment point was chosen to allow extra time to complete SOLAR-m; however, no facilitation was provided after the 5-week point. The modules covered a range of topics, including healthy living, emotional regulation strategies, values-based behavioral activation, relationship management, cognitive strategies to manage rumination and worry, and expressive writing for processing difficult experiences. The final module helped participants create a personalized plan for incorporating the skills learned throughout the program into their daily lives.

We chose an active control condition—mood monitoring—which has empirical support for improving mental health40. Mood monitoring acts by increasing emotional self-awareness which is often low in people experiencing mental health distress40. Participants randomized to the active control condition received access to a custom-designed mood monitoring control app specifically developed for this trial. The control app provided daily mood monitoring and psychoeducation on stress and trauma. This control app had the same look and feel as SOLAR-m. Participants were prompted to record their mood at least once per day throughout the five-week intervention period.

Measures

The HADS (ref. 41) total score (HADS-Total) was the primary outcome measure. The anxiety (HADS-Anxiety) and depression (HADS-Depression) subscales were considered as secondary outcomes. Other secondary outcomes included PTSD symptoms (measured using the PTSD Checklist for DSM-5 [PCL-5; ref. 42]), WFC and FWC (measured using the Work-Family Conflict Scale43), and absolute and relative absenteeism and presenteeism (measured using the Health and Work Performance Questionnaire [HPQ; ref. 44]). Finally, a single item from the Credibility Expectations Questionnaire (CEQ)45 was used to assess treatment expectancy at baseline. App use metrics were captured automatically and stored on a secure server. Treatment adherence was defined as completing at least five of the six active SOLAR-m modules, with Module 1 (Introduction to SOLAR) and Module 8 (Planning for the future) defined as non-active modules. Treatment completion was defined as completing all six active SOLAR-m modules.
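The adherence and completion definitions above can be expressed as a small classification rule. The sketch below assumes that the six active modules are Modules 2 to 7 (inferred from Modules 1 and 8 being non-active) and that module completion is available as a set of module numbers; the function and field names are hypothetical.

```python
# Assumption: active modules are numbered 2-7, since Modules 1 and 8
# are described as non-active among the eight SOLAR-m modules.
ACTIVE_MODULES = frozenset(range(2, 8))


def classify_engagement(completed_modules):
    """Classify one participant's engagement from completed module numbers."""
    active_done = len(ACTIVE_MODULES & set(completed_modules))
    return {
        "active_modules_completed": active_done,
        # Adherence: at least five of the six active modules.
        "adherent": active_done >= 5,
        # Completion: all six active modules.
        "completer": active_done == len(ACTIVE_MODULES),
    }


print(classify_engagement({1, 2, 3, 4, 5, 6}))  # adherent, not a completer
```

Non-active modules are ignored by the rule, so a participant who finishes only Modules 1 and 8 counts as neither adherent nor a completer.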

Data preparation and analyses

Data analysis was conducted by statisticians using a statistical analysis plan prepared while blinded to treatment allocation (see Supplementary Information: Statistical Analysis Plan). Changes made post unblinding are clearly outlined in the methods and results as post-hoc analyses. Target recruitment was 240 participants (120 per group) based on detecting a difference of 0.4 in effect size (small-to-moderate effect) in the primary outcome (HADS-Total) between groups at 8 weeks, with 80% power, two-sided alpha of 0.05 and 20% attrition34. Constrained longitudinal data analysis (cLDA) models were used to analyze primary (HADS-Total) and secondary continuous outcomes (HADS-Anxiety, HADS-Depression, PCL-5, WFC, FWC, absolute absenteeism, relative absenteeism, absolute presenteeism, relative presenteeism). These models included all scores (baseline, 8 weeks [primary endpoint], 3-month post-intervention follow-up) and factors representing intervention (SOLAR-m vs. control app [reference group]), time (categorical), and an intervention-by-time interaction. The model assumed a common baseline mean score across interventions (i.e., did not estimate a treatment effect at baseline), under the assumption that there were no differences in mean baseline outcome score (i.e., randomization was effective in creating comparable groups). The primary hypothesis was evaluated using the estimate, two-sided 95% confidence interval (CI), and p-value of the mean difference in change in HADS-Total from baseline to 8 weeks between SOLAR-m and the mood monitoring app. To enable comparison with the effect specified in the sample size calculation, standardized effect size estimates (Cohen’s d) were obtained by dividing the difference in means by the pooled standard deviation (SD) across groups. Effect size estimates were also calculated for key secondary outcomes.
Secondary analyses examined the maintenance of intervention effects using the same parameters for the change from baseline to the 3-month post-intervention follow-up. In sensitivity analyses, the model of HADS-Total was adjusted for baseline PTSD symptoms, given known associations between these symptoms and anxiety and depression outcomes.
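The sample size reasoning and the standardized effect size described above can be sketched with standard formulas. The example below uses the normal-approximation formula for comparing two means and a conventional pooled-SD Cohen's d; the trial protocol's exact method and rounding are not stated here, so these functions (whose names are illustrative) should not be expected to reproduce the target of 120 per group exactly.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Completers needed per group to detect standardized difference d.

    Normal approximation for a two-sided, two-sample comparison of means.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)


def inflate_for_attrition(n, attrition):
    """Enrolment needed so that n participants remain after attrition."""
    return ceil(n / (1 - attrition))


def cohens_d(mean_diff, sd_a, n_a, sd_b, n_b):
    """Difference in means divided by the pooled SD of the two groups."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return mean_diff / sqrt(pooled_var)


n_complete = n_per_group(0.4)                       # completers per group
n_enrol = inflate_for_attrition(n_complete, 0.20)   # allowing 20% attrition
print(n_complete, n_enrol)
```

Under these assumptions the approximation gives a per-group enrolment figure in the same range as the stated target, with the small gap plausibly reflecting a different inflation or rounding convention.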

An exploratory subgroup analysis was conducted to examine whether the treatment effect for the primary outcome differed by employment status (current worker/former worker) by including interaction terms between intervention group and employment status in the primary analysis model. Descriptive analyses were conducted for baseline CEQ and for additional mental health or pharmacological treatments received since starting the trial. All analyses followed intention-to-treat principles, evaluating differences between intervention and control conditions according to treatment allocation, irrespective of intervention adherence, for both primary and secondary outcomes.

A supplementary analysis including all randomized participants provided an estimate of the treatment effect on the primary outcome (HADS-Total) among participants who adhered to the intervention (completed at least 5 of 6 active SOLAR-m modules) using a CACE analysis. This was estimated using a latent class regression model, with two latent classes for compliers and non-compliers46. As the primary analysis model (cLDA) provides valid inference assuming missing outcome data are missing at random, multiple imputation was performed before completing the CACE analysis to ensure consistency with the missing data assumption used in the primary analysis. Sensitivity analysis using the delta adjustment method enabled assessment of differences in findings should outcome data be missing not at random.
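The idea behind a delta-adjustment sensitivity analysis can be shown on a toy example: values imputed under missing at random are shifted by a fixed amount (delta) before the treatment effect is re-estimated. The sketch below is a simplification under stated assumptions: it uses a single imputed dataset rather than pooling across multiple imputations, it applies delta only to the SOLAR-m arm, and all data values are invented for illustration. The trial's actual specification is given in its statistical analysis plan, not here.

```python
from statistics import mean


def delta_adjusted_difference(observed, imputed, delta, shifted_arm="SOLAR-m"):
    """Mean difference (SOLAR-m minus control) after shifting imputed values.

    observed / imputed: dicts mapping arm name to lists of change scores.
    delta: amount added to imputed values in `shifted_arm` only, to mimic
    missing participants doing worse (delta > 0) than MAR would predict.
    """
    arm_means = {}
    for arm in observed:
        shift = delta if arm == shifted_arm else 0.0
        values = list(observed[arm]) + [v + shift for v in imputed[arm]]
        arm_means[arm] = mean(values)
    return arm_means["SOLAR-m"] - arm_means["control"]


# Hypothetical change-from-baseline scores (not trial data).
observed = {"SOLAR-m": [-4.0, -3.0, -5.0], "control": [-1.0, -2.0, 0.0]}
imputed = {"SOLAR-m": [-4.0], "control": [-1.0]}

for delta in (0.0, 2.0):
    print(delta, delta_adjusted_difference(observed, imputed, delta))
```

Repeating the estimate over a range of delta values shows how far the missing-data assumption would have to depart from missing at random before the conclusion changes.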