Introduction

Chronic low back pain (cLBP) impairs physical, emotional, and social well-being. cLBP ranks among the top three causes of disability-adjusted life year impact in North America and its prevalence is increasing due to the aging population and rise in obesity rates1,2. The National Health and Nutrition Examination Survey found that opioids are the most prescribed pain medication taken by patients with cLBP (18.8%), followed by antidepressants (17.8%)3. Yet, patients with cLBP often discover that opioids fall short in delivering meaningful pain reduction, fail to improve health-related quality of life, and present significant side effects4. It is vital to optimize benefits for patients while minimizing harm when treating cLBP.

Medical extended reality (Med XR), as defined by the U.S. Food and Drug Administration (FDA)5, refers to devices that use immersive technologies to produce therapeutic effects6. Among these technologies is virtual reality (VR), which employs stereoscopy and motion tracking to deliver vivid, immersive, and emotionally evocative environments through head-mounted displays7. VR-based therapy is an evidence-based, non-pharmacological approach for managing cognitive, affective, and sensory aspects of pain8,9,10,11,12,13,14,15,16,17,18,19,20. By stimulating the visual cortex and engaging multiple sensory pathways, VR-based pain interventions distract users from processing painful stimuli and may also employ principles of cognitive behavioral therapy (CBT) to enhance attitudes, beliefs, and cognitions about pain in a manner that persists post-treatment20.

In November 2021, the FDA authorized a VR program called RelieVRx21, which delivers an 8-week, skills-based VR treatment for management of cLBP and was among the first device clearances in the FDA Med XR program5. In a double-blind, randomized, placebo-controlled trial by Garcia and colleagues, RelieVRx was evaluated against a sham VR condition in 179 patients with cLBP and showed superiority across all primary outcomes, with clinically meaningful effect sizes (Cohen d of 0.40 to 0.49)22. The trial concluded that the program had greater effectiveness in reducing pain and associated impairments than a control condition, demonstrating the potential of home-based VR to provide accessible nonpharmacologic management of cLBP.

In another large, sponsor-funded trial, the program was tested against an active sham control for cLBP management among 1067 individuals with cLBP23. RelieVRx was again found to be significantly more effective than Sham VR in reducing pain intensity and pain interference over 8 weeks.

Building on this foundation, the present study is one of 13 federally funded trials within the Back Pain Consortium (BACPAC) Research Program, part of the National Institutes of Health (NIH) Helping to End Addiction Long-term® (HEAL) Initiative, which seeks to evaluate novel therapies for cLBP. Our unique three-arm design compared Skills-Based VR to both a 2D Sham VR control and a 3D distraction-based VR. The primary objective was to assess the efficacy of immersive Skills-Based VR and Distraction VR in decreasing pain interference compared to Sham VR among patients with cLBP. Secondary objectives included evaluating the impact of the VR interventions on additional patient-reported outcomes (PROs), such as sleep quality, anxiety, pain catastrophizing, and reductions in opioid use. Tertiary objectives extended to assessing depression and physical function, as well as collecting biometric variables through wearable sensors to explore correlates of therapeutic response. Finally, the study aimed to identify patient-level predictors of VR efficacy, offering insights into which individuals may derive the most benefit from VR.

Results

Descriptive statistics

A total of 385 participants were randomized to the Sham VR (n = 127), Distraction VR (n = 127), or Skills-Based VR (n = 131) arms (Table 1), exceeding the minimal sample size requirement. The mean age of participants was 54.1 years (SD 15.6, range 19-85). Nearly one-third (30.4%) of the sample reported daily opioid use at baseline. Baseline pain measures, including 7-day average pain (mean 6.1, SD 1.9) and daily pain intensity (mean 5.8, SD 1.9), were comparable across groups. Sociodemographic characteristics demonstrated a diverse study population, with 19.5% identifying as Black or African American, 4.7% as Asian, and 3.1% as multiple races, while 23.9% identified as Hispanic or Latino. The majority of participants were female (62.6%). Participant flow through the trial is summarized in Fig. 1.

Fig. 1

CONSORT diagram describing study participant flow. Created in Lucid (lucid.co).

Table 1 Participant characteristics

Primary outcome: change in PROMIS-PI score

The primary outcome, change in PROMIS-PI T-score from baseline to Day 30, showed reductions within each of the three treatment arms (p < 0.05), but no statistically significant differences were observed between Skills-Based VR or Distraction VR vs Sham VR in ITT analysis (Fig. 2). In the Skills-Based VR group, PROMIS-PI scores decreased by an average of 1.63 points at Day 30 compared to baseline, while the Sham VR group experienced a reduction of 1.79 points—both statistically significant but numerically modest. The adjusted mean difference between Skills-Based VR and Sham VR was 0.16 points (95% CI: -1.12 to 1.43; p = 0.81), indicating no significant effect. The Distraction VR group demonstrated a reduction of 1.25 points at Day 30. Compared to Sham VR, the adjusted mean difference was 0.47 points (95% CI: -0.87 to 1.81; p = 0.49), which was also not statistically significant. Although the reduction in the Skills-Based group was greater than that in the Sham group at Days 60 and 90, no significant group differences were observed (Fig. 2).
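As a reading aid, the reported p-values can be approximately recovered from the adjusted mean differences and their 95% confidence intervals. The sketch below assumes a normal approximation and a symmetric interval; the trial's mixed model most likely used t-based inference, so small discrepancies are expected. The function name `p_from_ci` is illustrative and does not come from the study code.

```python
from statistics import NormalDist


def p_from_ci(estimate, lower, upper, level=0.95):
    """Approximate two-sided p-value from an estimate and its symmetric CI.

    Assumption: normal sampling distribution (not the model's exact t-test).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    se = (upper - lower) / (2 * z_crit)           # implied standard error
    z = abs(estimate) / se
    return 2 * (1 - std_normal.cdf(z))


# Adjusted mean differences vs Sham VR at Day 30, as reported in the text.
skills_p = p_from_ci(0.16, -1.12, 1.43)
distraction_p = p_from_ci(0.47, -0.87, 1.81)
print(round(skills_p, 2), round(distraction_p, 2))
```

Under these assumptions both values land close to the reported p = 0.81 and p = 0.49, which suggests the intervals and p-values in the text are internally consistent.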

Fig. 2

Changes in PROMIS-PI between baseline and Days 30, 60, and 90. Because higher PROMIS-PI scores indicate greater pain interference, negative changes represent a reduction in pain interference. Created in R/ggplot2 (ref. 35).

Among participants who adhered to the intervention protocol using the PP definition, changes in PROMIS-PI were more pronounced over time compared to the ITT analysis, but group differences remained non-significant. At Day 30, reductions in pain interference were observed across all groups in PP analysis, with mean reductions of 2.12 (SD 5.57) for Sham VR, 1.22 (SD 6.72) for Distraction VR, and 1.76 (SD 5.88) for Skills-Based VR. By Day 60, reductions reached 3.21 (SD 5.85) for Sham VR, 1.48 (SD 6.75) for Distraction VR, and 3.02 (SD 7.25) for Skills-Based VR. While the Sham VR group showed the largest mean reductions at Days 60 and 90, the observed differences between groups did not achieve statistical significance.

Secondary outcomes

There were no statistically significant differences among groups across the secondary outcomes, including the pain numeric rating scale, PROMIS Anxiety, PROMIS Depression, PROMIS Sleep Disturbance, PROMIS Physical Function, Pain Catastrophizing, Patient Global Impression of Change, and Fitbit-measured daily steps and sleep efficiency.

However, when using a generalized linear mixed-effects model, a significant interaction effect emerged in the Distraction VR vs. Sham VR comparison for the change in self-reported daily opioid use between baseline and day 90. The odds of opioid use in the Distraction VR group decreased significantly more over time compared to the Sham VR group (interaction term ratio of ORs [95% CI] = 0.00 [0.00–0.17]; p = 0.009). In contrast, no significant differences were observed in the Skills-Based VR vs. Sham VR comparison (interaction term ratio of ORs [95% CI] = 0.73 [0.04–12.62]; p = 0.830). This finding suggests a notable reduction in opioid use over time in the Distraction VR group relative to Sham VR, though the effect was not observed in the Skills-Based VR group. For a detailed description of all secondary outcome analyses, refer to the Supplemental Material.
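For readers unfamiliar with the "ratio of ORs" reported above, one way to read it is sketched here. This assumes the generalized linear mixed-effects model is a logistic model with fixed effects for arm, time, and their interaction; the exact specification is given in the Supplemental Material and is not reproduced here. The subscripts D and S (Distraction VR and Sham VR) and 0 and 90 (baseline and Day 90) are notation introduced for this illustration.

```latex
\[
\text{ratio of ORs}
= \exp\!\left(\beta_{\text{arm}\times\text{time}}\right)
= \frac{\text{odds}_{D,90} \,/\, \text{odds}_{D,0}}
       {\text{odds}_{S,90} \,/\, \text{odds}_{S,0}}
\]
```

Under this reading, a ratio below 1 means the odds of daily opioid use fell more (or rose less) from baseline to Day 90 in the Distraction VR arm than in the Sham VR arm, and a ratio near 1 means the two arms changed similarly.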

Adherence to intervention

VR headset usage data were successfully extracted for 337 participants, or 87.5% of the sample (Sham: 109, 85.8%; Distraction: 110, 86.6%; Skills-Based: 118, 90.1%). We observed high variance in dose among participants, with mean total minutes of use of 270.4 (SD 244.6) for Sham VR, 400.5 (SD 483.1) for Distraction VR, and 335.8 (SD 293.7) for Skills-Based VR. Total minutes of use did not differ significantly between groups (Kruskal-Wallis p = 0.276). Prior research with the Skills-Based VR therapeutic defined treatment completion as 24 sessions23. Using this threshold, treatment completion was achieved by 59 (54.1%) participants in the Sham VR group, 63 (57.3%) in the Distraction VR group, and 80 (67.8%) in the Skills-Based VR group. Adjusting our primary and secondary MMRM analyses of PROMIS-PI to individuals who completed at least 24 sessions, we found that treatment completion predicted pain reduction in both the Sham and Skills-Based arms at 90 days (estimate = -1.68 [-3.31, -0.05], p = 0.044).
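The completion percentages above use participants with extracted headset data as the denominator, not all randomized participants. A minimal sketch of that arithmetic follows; the helper names `completed` and `completion_rate` are illustrative and not taken from the study code, and the counts are those reported in the text.

```python
COMPLETION_THRESHOLD = 24  # sessions, per the prior-trial definition cited in the text


def completed(sessions_viewed, threshold=COMPLETION_THRESHOLD):
    """Return True if a participant meets the treatment-completion threshold."""
    return sessions_viewed >= threshold


def completion_rate(n_completers, n_with_usage_data):
    """Percentage of participants with headset data who completed treatment."""
    return round(100 * n_completers / n_with_usage_data, 1)


# (completers, participants with extracted headset data), as reported above.
arms = {
    "Sham VR": (59, 109),
    "Distraction VR": (63, 110),
    "Skills-Based VR": (80, 118),
}
rates = {arm: completion_rate(*counts) for arm, counts in arms.items()}
print(rates)
```

Using the randomized counts (127, 127, and 131) as denominators would give lower percentages, so the choice of denominator matters when comparing these figures with other trials.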

Exploratory outcomes: predictors of response

Exploratory, baseline-adjusted linear regression analyses to detect patient-level predictors of VR efficacy, which included demographic variables, baseline health measures, and minutes of VR use, were largely not statistically significant. However, we found an interaction between baseline anxiety levels and response to Skills-Based VR at Day 60, when individuals were expected to have completed the 56-session programs. Higher baseline anxiety, as measured by PROMIS Anxiety at screening, was associated with a greater likelihood of response to the Skills-Based VR intervention. Specifically, a significant interaction was observed between baseline PROMIS anxiety levels and Day 60 PROMIS-PI in the Skills-Based group (p = 0.025), as shown in Fig. 3, indicating that baseline anxiety has a stronger association with the Day 60 change score in Skills-Based VR compared to Sham VR. This interaction suggests that participants with elevated baseline anxiety experienced larger reductions in pain interference scores at Day 60 when treated with Skills-Based VR compared to Sham VR. No significant interaction was observed for Distraction VR relative to Sham VR. In contrast, baseline depression scores did not predict response.

Fig. 3

Interaction between baseline PROMIS Anxiety and the change in PROMIS-PI recorded at Day 60. The slope difference was significant between the Skills-Based VR and Sham VR groups (p = 0.025) and not significant between the Distraction VR and Sham VR groups. Created in R/ggplot2 (ref. 35).

Safety

The majority of reported adverse events (AEs) were mild in severity and were predominantly related to cybersickness, a well-documented phenomenon associated with VR use. Importantly, the total number of AEs reflects any participant-reported issue throughout the 90-day study period, as participants were prompted weekly to provide input about any AEs they experienced, regardless of severity or perceived relevance to the intervention. Across all study arms, AEs were most frequently reported in the Skills-Based VR group, where 26.7% of participants reported at least one AE of any kind, compared to 19.7% in the Distraction VR group and 15.0% in the Sham group. Cybersickness was the most common AE overall, accounting for 68 out of 117 total events. These cybersickness events included symptoms of nausea, vertigo, and other disorientation (50 events), eye strain (11 events), and headaches (7 events). Other reported events included neck pain (15 events), rash or allergy from the silicone VR face mask or Fitbit watch strap (9 events), VR-induced anxiety (6 events), pain from the VR headset on the face (5 events), and other occurrences of headaches and life events not associated with the study interventions. When stratifying by severity, most AEs were mild: 80% in the Sham group, 71% in the Distraction VR group, and 74% in the Skills-Based VR group. Moderate AEs were slightly more frequent in the Skills-Based group (21%) compared to the Distraction VR group (20%) and Sham group (16%). Severe AEs were rare (3 events in total) and were all deemed unrelated to VR use, as they involved hospitalizations for unrelated medical conditions during study participation.

Discussion

This large, federally funded trial demonstrated that participants across all three arms—Skills-Based VR, Distraction VR, and Sham VR—experienced modest improvements in chronic pain over the course of the study, with high adherence across interventions. As the first randomized trial to directly compare a Skills-Based VR program with a distraction-based VR program and an immersive sham control, these findings provide important context for the growing field of Med XR, an FDA-recognized area of therapeutic development.

Although no significant differences emerged in the primary outcomes and numerical changes in PROMIS-PI were modest, the results suggest that distraction-based VR—which is widely available to patients and consumers outside of FDA-cleared channels—may offer improvements in pain comparable to the FDA-cleared intervention for many users. Moreover, secondary analyses revealed that the non-FDA Distraction VR arm demonstrated a greater reduction in opioid use over time compared to Sham VR, whereas the Skills-Based VR intervention did not show this effect, a finding worth examining in future research. Together, these findings highlight the feasibility and acceptability of home-based Med XR interventions for cLBP while emphasizing the need to consider individual patient characteristics and usage patterns when interpreting therapeutic outcomes.

Importantly, we found that baseline anxiety significantly predicted a response to Skills-Based VR: individuals with higher levels of anxiety, as measured by PROMIS Anxiety scores, were more likely to respond (p = 0.025, Fig. 3). This association was not observed for depression, highlighting the distinction between anxiety-related and depression-related chronic pain. Anxiety-related pain has been shown to respond to interventions that reduce physiological arousal and provide cognitive distraction, mechanisms that align with the Skills-Based VR’s interoceptive and distraction components24,25. In contrast, depression-related pain is often sustained by negative cognitive patterns, low motivation, and pain catastrophizing26,27,28—mechanisms which may be treated effectively with structured CBT interventions29,30 that are not included in the Skills-Based VR tested in this study. Thus, while Skills-Based VR may provide meaningful benefits for individuals with elevated anxiety, its limited psychoeducation (comprising less than 5% of the program’s total minutes of content) and absence of a formal CBT component likely reduced its efficacy for depression-related pain.

The study population and outcome measures in this trial differ in several ways from those of the prior, positive studies of the RelieVRx Skills-Based VR intervention conducted in 2021 (ref. 22) and 2023 (ref. 23). Our trial enrolled a more racially and ethnically diverse cohort than these RelieVRx trials, as measured by the proportion of individuals identifying as Hispanic or Latino and as a race other than white or Caucasian, and we did not use online advertisements for recruitment. These RelieVRx studies also used 11-point pain numeric rating scales with 24-hour recall (the Defense and Veterans Pain Rating Scale and the Brief Pain Inventory scales) to measure the primary endpoints of pain intensity/interference, whereas our primary endpoint was PROMIS-PI, which may have captured subtler variations in pain-related functioning over the 7-day recall period. These differences highlight the importance of considering population heterogeneity and outcome measure sensitivity when interpreting results and designing future trials.

This study has several strengths, including its rigorous design, diverse participant population, federal funding source, and comprehensive exploration of treatment effects across multiple secondary endpoints. The study is the first of its kind to incorporate a Distraction VR arm to isolate the possible therapeutic benefits of distraction-based 360-degree videos alone. However, the study also has limitations. The software used in the Skills-Based VR intervention was first developed and tested more than five years before this publication. Rapid advances in virtual reality software and hardware since then have led to higher-resolution, more immersive, and AI-enabled VR experiences that may yield different outcomes. Additionally, while previous VR experience did not predict treatment response in this study, the increased familiarity and comfort with VR in the general population over recent years could have influenced patient perceptions of the three VR interventions and affected subsequent PROs. The reliance on self-reported measures may have introduced bias, and missing data, particularly for wearable outcomes, limit the interpretability of some exploratory analyses.

Finally, although the FDA-cleared program is described as “skills-based,” it does not include the core cognitive restructuring techniques that define traditional CBT—such as identifying automatic thoughts, labeling cognitive distortions, or practicing step-by-step cognitive restructuring. In addition, the program does not incorporate graded exposure to pain-related sensations, movements, or feared activities, a technique frequently used to reduce fear avoidance and improve functional outcomes in chronic pain. Rather, the program primarily provides psychoeducation, mindfulness exercises, and distraction-oriented experiences. Given this overlap with the distraction VR arm, and the finding that neither active intervention outperformed Sham VR, it is possible that the absence of dedicated cognitive restructuring practice contributed to the similar pattern of outcomes observed across the two active conditions. Future research should evaluate the impact of incorporating traditional CBT skills into immersive VR environments to determine whether fully immersive CBT programs outperform clinician-delivered CBT or Sham VR for chronic low back pain.

In conclusion, while this trial did not demonstrate superiority of either Skills-Based VR or Distraction VR over Sham VR for the primary pain outcomes, and overall reductions in pain were modest, several important insights emerged that advance the field of Med XR. Notably, the Skills-Based VR program appeared to offer greater benefit for individuals with higher baseline anxiety, suggesting that anxiety-related pain may be particularly responsive to interoceptive and skills-oriented approaches. Additionally, secondary analyses revealed a reduction in opioid use over time in the Distraction VR group relative to Sham VR, indicating that different forms of VR may confer distinct advantages depending on the clinical target. These signals require confirmation in future studies that are prospectively powered to evaluate such endpoints as primary outcomes. Together with the strong feasibility and high adherence observed across all arms, these findings underscore the potential of home-based Med XR interventions and highlight the importance of aligning VR therapeutic design with patient characteristics and engagement patterns. Future work should further examine how structured CBT content, enhanced personalization strategies, and more targeted dosing could optimize outcomes across diverse subgroups of individuals with chronic low back pain.

Methods

The detailed methods for this study have been previously published in a peer-reviewed study protocol and are available in BMJ Open31. This report provides an abbreviated summary of the key methodological elements to orient the reader, while referring to the original protocol for comprehensive details.

Design

This was a prospective, three-arm randomized controlled trial conducted at Cedars-Sinai Health System (CSHS). The trial compared two VR interventions for pain reduction with a placebo VR control program in individuals with cLBP. Participants were recruited through CSHS physician referrals supplemented by AI cohort-building software (Deep6 AI®, Pasadena, CA), which applies natural language processing to electronic health record (EHR) data to match study inclusion and exclusion criteria. Participants were randomized 1:1:1 across the three arms; randomization and staff blinding are described in the published protocol31. A set of standardized demographic instruments based on NIH HEAL Common Data Elements and other cLBP outcomes were incorporated in study follow-up at baseline and at 90 days to support harmonization across participating BACPAC trials32.

Ethics approval and registration

The study was approved by the Cedars-Sinai Institutional Review Board (IRB; STUDY00000631). Research was conducted in accordance with the Declaration of Helsinki, and participants provided informed consent and HIPAA authorization prior to engagement with study activities. The trial was registered at clinicaltrials.gov (NCT04409353) on May 14, 2020; this study record contains the latest IRB-approved protocol and informed consent form.

Setting and sample

This study was conducted remotely, including participants’ use of VR therapy and a Fitbit wearable device (Mountain View, CA) for measurement of steps and sleep efficiency. Patient-reported data were collected electronically via REDCap (Nashville, TN)33. Additional details on the remote study procedures are available in the published protocol.

Participants were eligible if they were aged 13 years or older, had cLBP persisting for at least 3 months with symptoms occurring on at least half the days over the past 6 months, were able to provide consent, and had access to a smartphone, laptop, or desktop computer to complete study activities. Participants needed to comprehend written and spoken English, be willing to comply with study procedures, and have access to email. Individuals who were pregnant or planning to become pregnant were eligible.

Exclusion criteria included conditions interfering with VR use (e.g., history of seizures, significant visual or hearing impairment, or facial injury precluding safe headset use), prior participation in a VR clinical trial, anticipated long-term hospitalization (>3 weeks), recent surgery (within 8 weeks), planned surgery within 3 months, use of a spinal cord stimulator, or cLBP due to specific pathologies such as infection, cancer, or inflammatory spondylopathies, consistent with the NIH Task Force Research Standards for cLBP. Further details on inclusion and exclusion criteria are provided in the protocol paper31.

Screening process

Screening was conducted remotely and included physician referrals, identification via the Cedars-Sinai EHR with the assistance of Deep6 AI software, and subsequent eligibility verification. Initial eligibility was assessed using EHR data, including age, language, cLBP status, and exclusion criteria. Eligible participants were contacted by email with an IRB-approved recruitment letter and informational brochure, with the option to opt out. Further details on the screening process are available in the published protocol31, and our efforts to improve the racial and ethnic diversity of the study population are published in the Journal of Racial and Ethnic Health Disparities34.

Administration of interventions

Following successful completion of a screener week, which required participants to complete at least five of seven daily baseline questionnaires to enroll in the trial, participants were randomized and mailed an all-in-one VR headset (PICO G2 4K; Shenzhen, China), a biometric wristband (Fitbit Charge 4), and printed instruction manuals for each device. Headsets were sanitized prior to mailing using standardized cleaning procedures, including UV treatment (Cleanbox, Nashville, TN). On the day devices were shipped, participants were emailed tracking information and video tutorials for their assigned allocation. Following delivery of study equipment, a study coordinator held an onboarding phone call with the participant to familiarize them with the Fitbit Charge 4 and assigned VR software and to describe the upcoming 90-day follow-up protocol.

Each intervention included 56 sequential VR modules with a prescribed daily schedule, with additional access to an on-demand library in the Skills-Based VR arm. Participants could view additional sessions each day and repeat the 56-session program as desired, and they were encouraged to take breaks after 10 to 15 minutes of use to minimize side effects. Adherence was assessed through self-reported surveys, with reasons for non-use collected as applicable. Biometric data, including sleep and motion tracking, were collected from the Fitbit Charge 4 and securely stored and aggregated via Fitabase (San Diego, CA), a Health Insurance Portability and Accountability Act (HIPAA)-compliant and IRB-approved platform. Nonidentifiable data from VR headsets, including which modules were watched and when they were viewed, were uploaded to AppliedVR (Los Angeles, CA) servers throughout participants’ completion of the study protocol and provided to the study team in aggregated reports. Participants could pursue concomitant treatments and clinical trials outside the exclusion criteria and could discontinue the intervention at any time. Further details on intervention components, device management, and adherence monitoring are provided in the published protocol.

Intervention 1: skills-based VR (EaseVRx+)

The Skills-Based VR program, EaseVRx+, was developed by AppliedVR and is fully described in the published protocol31. It is identical in content and sequencing to the first version of the FDA-authorized program, now called RelieVRx, but EaseVRx+ also includes a separate, on-demand, searchable library of content that lets patients return to the modules they prefer. Briefly, the program incorporates evidence-based principles of mindfulness meditation, relaxation therapy, and biofeedback techniques. It combines psychoeducation, pain education, relaxation exercises, breathing training, and executive functioning games into a standardized, prescriptive 56-day course. Each VR session lasts 2–16 minutes (average duration: 6 minutes) and delivers scheduled daily virtual experiences designed to provide a mind-body approach to chronic pain management.

Intervention 2: distraction VR

The novel distraction intervention featured the same VR hardware, the same number of experiences, the same approximate duration of experiences, and a user interface identical to that of Skills-Based VR, with a linear, prescribed sequence of experiences. The key difference is that instead of offering a variety of VR experiences (including education, games, and breath biofeedback), the distraction intervention included only 360-degree videos (many of which were also present in the Skills-Based VR program). This removed the effect of education and skills-based training while preserving the immersive experience of 360-degree VR.

Placebo control: sham VR

The Sham VR condition is described in full in the published protocol31. It consists of 2D nature footage accompanied by emotionally neutral music, designed to simulate a VR experience without therapeutic components. Unlike the immersive and interactive content of the other arms, Sham VR mimics the experience of watching a large screen in a dark room. The intervention maintains the same VR hardware, the same number and approximate duration of experiences, and the same user interface as the other two conditions. It includes further modifications to eliminate therapeutic effects, such as solid gray and black backgrounds behind the menu and the 2D video content. The manufacturer developed and used this control condition in its previous clinical trials.

Study outcome measures

The study’s outcome measures are fully detailed in the published protocol, including specific instruments and time frames31. Briefly, the primary outcome was the change from baseline in the PROMIS Pain Interference (PROMIS-PI) T-score, assessed at baseline and days 7, 15, 21, and 30, and specified time points thereafter. Secondary outcomes included measures of pain catastrophizing, anxiety, depression, sleep disturbance, self-reported opioid use, physical function, patient-reported impressions of change, and biometric data (Fitbit-measured steps and sleep efficiency).

Additional VR experience-specific measures included baseline assessments of immersive tendencies and motion sickness propensity, assessment of presence in the VR setting during initial use, and adverse events over the course of the study. Full details, including the case report forms and consent materials, are provided in the supplemental materials of the original protocol31.

Data collection and monitoring

Fitbit data, including daily step count and sleep efficiency scores, were aggregated by Fitabase, and VR usage data were aggregated by AppliedVR. PROs were collected electronically during a 7-day baseline “screener week” and over 90 days following randomization. Participants received up to $225 in Amazon gift cards as incentives for survey completion and equipment return. Full details of the measurement schedule, including all PRO instruments and questionnaires, are provided in the published protocol31. Recruitment spanned from October 21, 2020, to October 17, 2023, and all 90-day follow-up concluded on February 1, 2024.

Study monitoring was overseen by a data safety monitoring board (DSMB) established by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) and facilitated by Navitas Life Sciences. Adverse events were collected and assessed weekly during regular electronic survey follow-up, with enrollment and safety data reported biannually to the DSMB.

Statistical analysis

All statistical analyses were conducted using R (v4.3.1) with study statisticians blinded to group assignments. Full details of the statistical plan are provided in the published protocol31.

For the primary endpoint (PROMIS-PI T-score), treatment effects of Skills-Based VR and Distraction VR were compared to Sham VR using a specific type of linear mixed model, the mixed model for repeated measures (MMRM), which treats time as a categorical factor and estimates marginal means at each time point. This approach does not assume a particular underlying trajectory or linear slope over time; instead, it focuses on time-specific mean differences, allowing us to directly estimate the group difference at the final assessment. The model included fixed effects for treatment, time, and their interaction, with baseline PROMIS-PI T-score as a covariate. Primary analyses were conducted at Day 30. Similar models were applied to secondary endpoints, at days 60 and 90, including PROMIS-PI, PROMIS anxiety, PROMIS sleep disturbance, pain catastrophizing, and opioid use, with estimates of least squares means, standard errors, and 95% confidence intervals reported for each group. Analyses were performed on the intent-to-treat (ITT) population, with a per-protocol (PP) analysis defined as participants self-reporting intervention use on ≥50% of days within the follow-up period.
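The per-protocol rule described above can be expressed as a simple flag. This is a minimal sketch, not the study code: the function name `is_per_protocol` and its arguments are hypothetical, and the length of the follow-up window used as the denominator is left as a parameter because the text does not state how it was counted.

```python
def is_per_protocol(days_with_reported_use, follow_up_days, min_fraction=0.5):
    """Flag a participant as per-protocol under the >=50%-of-days rule.

    days_with_reported_use: days on which the participant self-reported VR use.
    follow_up_days: length of the follow-up window (an assumption of this sketch).
    """
    if follow_up_days <= 0:
        raise ValueError("follow_up_days must be positive")
    return days_with_reported_use / follow_up_days >= min_fraction


# Illustrative values only: with a 90-day window the cutoff falls at 45 days.
print(is_per_protocol(45, 90), is_per_protocol(44, 90))
```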

To explore patient-level predictors of VR efficacy at Day 60, additional baseline-adjusted linear regression analyses tested treatment-by-factor interactions. Baseline-adjusted linear regression analyses included factors such as baseline mood disturbances (e.g., anxiety and depressive symptoms), VR usage (minutes per week based on headset data), sociodemographics (age, sex, race/ethnicity, and education), pain characteristics (severity, duration, radicular vs. non-radicular), patient comorbidities, and treatment expectations. Each model included treatment, a factor of interest, their interaction, and baseline PROMIS-PI T-score, with Day 60 change in PROMIS-PI as the outcome variable. These analyses allowed for a targeted examination of predictors, including baseline biopsychosocial distress, to identify which patients were most likely to have improved pain interference at the end of the 56-day VR treatment.
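The exploratory models described above can be written out as follows. This is a sketch that assumes indicator coding for the two active arms with Sham VR as the reference and a single continuous factor of interest F; the symbols are introduced here for illustration and are not taken from the protocol.

```latex
\[
\Delta\mathrm{PI}_{60,i}
= \beta_0
+ \beta_1\,\mathrm{Skills}_i
+ \beta_2\,\mathrm{Distraction}_i
+ \beta_3\,F_i
+ \beta_4\,(\mathrm{Skills}_i \times F_i)
+ \beta_5\,(\mathrm{Distraction}_i \times F_i)
+ \beta_6\,\mathrm{PI}_{0,i}
+ \varepsilon_i
\]
```

Under this coding, the coefficient on each interaction term is the difference in the slope of F between that active arm and Sham VR, so a test of the Skills-Based interaction coefficient corresponds to a Skills-Based versus Sham VR slope comparison for the factor in question.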

The full sample size calculation is described in detail in the published protocol31. Briefly, assuming a standard deviation (SD) of 11.83 to account for correlation of repeated measures, a minimum sample size of 120 analyzable participants per arm (360 total) was determined to provide 84% power to detect a clinically meaningful 5-point change in PROMIS Pain Interference scores, using a two-sided significance level of 0.025.
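The stated power can be roughly reproduced from the parameters above. The sketch below assumes a two-sample comparison of means with equal arm sizes and a normal approximation; the protocol's exact method is not reproduced here, so the result should be read as a plausibility check only. The function name `two_sample_power` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sd, n_per_arm, alpha):
    """Approximate power of a two-sided, two-sample z-test of means.

    Assumptions: equal arm sizes, common SD, normal approximation.
    """
    std_normal = NormalDist()
    se = sd * sqrt(2 / n_per_arm)                # SE of the difference in means
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = abs(delta) / se
    # Probability of rejecting in either tail under the alternative.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Parameters as stated in the text: 5-point difference, SD 11.83,
# 120 analyzable participants per arm, two-sided alpha of 0.025.
power = two_sample_power(delta=5, sd=11.83, n_per_arm=120, alpha=0.025)
print(round(power, 3))
```

With these inputs the approximation gives a power of roughly 85%, in line with the 84% reported; a t-based calculation would be expected to come out slightly lower.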