Introduction

Chronic nonspecific low back pain (CLBP) is a primary pain that lasts for more than three months1. The prevalence of CLBP is 19.6% in individuals aged 20–59 years2, and 36.1% in those aged \(\ge\) 60 years3. CLBP is framed in a multidimensional conceptual model that informs assessment and management by relating behavioral, psychological, and social factors to the future persistence of pain and disability4. The severity of chronic pain is commonly recorded as pain intensity1, which was adopted as the primary outcome in this study. Additionally, emotional distress5,6,7,8 and pain-related interference with functional ability9,10 are commonly assessed.

Pharmacological treatment is recommended following an inadequate response to non-pharmacological treatments4. Recent guidelines for pharmacological treatment of CLBP commonly recommend the use of nonsteroidal anti-inflammatory drugs (NSAIDs)4,11,12,13. The therapeutic effects of conventional NSAIDs are derived from the inhibition of cyclooxygenase (COX)-2, whereas adverse effects on the upper gastrointestinal tract arise from inhibition of COX-1 activity. The risk of developing upper gastrointestinal bleeding and peptic ulcer disease is lower with coxibs than with conventional NSAIDs14,15. Coxibs selectively inhibit COX-2, thereby inhibiting inflammation while preserving most of the homeostatic functions of COX-derived prostaglandins16. No difference in analgesic efficacy has been identified between different NSAIDs, including selective versus non-selective COX-2 inhibiting NSAIDs13. The current recommended dose of celecoxib is 200 mg/day for long-term use in Japan and 400 mg/day in Western countries.

Despite its widespread use, only very low-quality evidence exists for paracetamol as a treatment option for CLBP; therefore, recommendations are contradictory among guidelines12,17. A nonrandomized observational study suggested that paracetamol has analgesic effects equivalent to NSAIDs in patients with CLBP18. Only two trials have compared the effects of paracetamol with other drugs for CLBP19,20. First, in 1982, Hickey compared treatment efficacy over 4 weeks between paracetamol (1,000 mg 4 times daily) and diflunisal (500 mg twice daily) in 30 participants with CLBP. The study reported significantly superior treatment efficacy for diflunisal compared to paracetamol; however, the outcome relied on patient-reported ratings of treatment efficacy, ranging across four categories from poor to excellent19. A small number of patients in each group experienced mild adverse effects. NSAIDs act by inhibiting COX-1 and COX-2, which decreases the synthesis of pro-inflammatory prostaglandins, but adverse effects do exist. Second, in 2016, Bedaiwi showed superior efficacy of celecoxib (200 mg twice daily) compared with paracetamol (500 mg twice daily) over 4 weeks of treatment in 50 participants with CLBP20. Demonstrating the superiority of celecoxib had significant implications; however, this study had some limitations. First, the administered dose of paracetamol (500 mg twice daily) was relatively low compared to that used in real-world practice20. The recommended maximum daily dose of paracetamol in adults is less than 4,000 mg17. Among the elderly, treatment with high-dose (> 3,000 mg per day) paracetamol is associated with a greater risk of hospitalization as a result of gastrointestinal perforation, ulceration, or bleeding compared to low-dose paracetamol (≤ 3,000 mg per day)21. Second, the trial investigated only the superiority of NSAIDs over paracetamol in CLBP. A non-inferiority trial22,23 has not been conducted to determine whether paracetamol is an alternative to NSAIDs for CLBP.

Thus, this randomized controlled study aimed to compare the analgesic effects of paracetamol (1,000 mg three times daily) and celecoxib (100 mg twice daily) in a non-inferiority trial. This non-inferiority trial tested the hypothesis that paracetamol is as effective as celecoxib for the treatment of CLBP within a pre-specified non-inferiority margin.

Methods

Participants

Participants were consecutively recruited from seven outpatient clinics in Japan (Nakae Hospital, Yukioka Hospital, Hayaishi Hospital Center for Pain Management, Hayaishi Hospital Orthopedics, ISEIKAI International General Hospital, Matsubara Mayflower Hospital, and Yoshida Medical Clinic) between June 2023 and 2024. Patients were eligible if they (1) were older than 20 years and (2) had persistent low back pain for a minimum duration of 6 months prior to enrollment in the study. Exclusion criteria were as follows: (1) not interested in participating; (2) tumor-related pain, presence of neurological symptoms, or C-reactive protein level > 10 mg/dL; (3) within 3 months after surgery; and (4) taking medication associated with dementia. Participants were allowed to continue their regular long-term medication or treatment for certain diseases, but not for pain. All the inclusion and exclusion criteria were assessed by the referring physicians.

This study was approved by the Ethics Committee of Hayaishi Hospital (Mar 17th, 2019) (reference number, R010517-01), and all study participants provided written informed consent. The trial was registered (UMIN000039719, Mar 7th, 2020, https://center6.umin.ac.jp/cgi-open-bin/ctr/ctr_view_his.cgi). This study was performed in accordance with the Consolidated Standards of Reporting Trials (CONSORT) statement and the Declaration of Helsinki.

Randomization and treatment

This was a multicenter, open-label, randomized, assessor-unblinded, non-inferiority clinical trial. The patients were randomly divided into 2 equal groups using a computer-generated sequence: the paracetamol group received 1,000 mg paracetamol tablets 3 times daily (3,000 mg per day) for 4 weeks (Calonal Tablet, Asympti Pharmaceutical Corp., Tokyo, Japan), and the celecoxib group received 100 mg celecoxib tablets 2 times daily (200 mg per day) for 4 weeks (Celecox Tablet; Astellas Pharmaceutical Inc., Tokyo, Japan). Treatment was performed by orthopedic doctors throughout the study, and a person who was not involved in treating the patients performed the randomization. The physicians instructed the patients that they could stop the medication if their symptoms improved or if any adverse effects, such as those related to the gastrointestinal tract or liver toxicity, occurred during the study period.

Primary outcome

Outcome measures were assessed at baseline and at weeks two and four after starting the medication. The Numeric Rating Scale (NRS) was used as the primary outcome measure of pain severity in each assessment, where 0 = no pain and 10 = worst imaginable pain.

Secondary outcome

Secondary outcomes were measured using the following questionnaires: the Pain Catastrophizing Scale (PCS), Hospital Anxiety and Depression Scale (HADS), Pain Disability Assessment Scale (PDAS), and EuroQol-5 Dimensions-3 level (EQ-5D-3L). All of these questionnaires were previously validated in Japanese and have shown good reliability among patients with CLBP24.

(i) PCS

The PCS consists of 13 items in which respondents rate the frequency of pain-related thoughts and emotions they have experienced5,6. The total PCS score ranges from 0 to 52, with higher scores indicating higher levels of catastrophizing.

(ii) HADS

The HADS is designed to assess two separate dimensions: anxiety and depression7,8. The HADS consists of 14 items with anxiety (HADS-Anxiety) and depression (HADS-Depression) subscales, each including 7 items. A four-point response scale (from 0 representing absence of symptoms to 3 representing maximum symptoms) is used, with possible scores for each subscale ranging from 0 to 21.

(iii) PDAS

The PDAS assesses the degree to which chronic pain interfered with various daily activities during the past week9. The PDAS includes 20 items reflecting pain interference in a broad range of daily activities, and respondents indicate the extent to which pain interferes. The total PDAS score ranges from 0 to 60, with higher scores indicating higher levels of pain interference.

(iv) EQ-5D-3L

The EQ-5D-3L assesses self-reported health-related quality of life (QOL)10. The EQ-5D-3L defines health according to 5 dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. The EQ-5D-3L index score ranges from −0.111 to 1.000, where negative scores indicate health states worse than death, 0 represents death, and 1.000 represents full health.

(v) Adverse effects

Adverse events were assessed at weeks two and four. When subjects reported possible adverse events while taking the medication, they were instructed to withdraw or decrease the medication dosage depending on their condition.
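The score ranges described above can be collected into a small data-checking sketch. This is an illustration only, not part of the trial's analysis: the dictionary name `SCORE_RANGES` and the helper functions are hypothetical, and the ranges are simply those stated in the text (the NRS range comes from the primary outcome description).

```python
# Possible score ranges for each outcome measure, as stated in the text.
SCORE_RANGES = {
    "NRS": (0, 10),
    "PCS": (0, 52),
    "HADS-Anxiety": (0, 21),
    "HADS-Depression": (0, 21),
    "PDAS": (0, 60),
    "EQ-5D-3L": (-0.111, 1.000),
}


def in_range(instrument, score):
    """Return True if a recorded score lies within the instrument's range."""
    low, high = SCORE_RANGES[instrument]
    return low <= score <= high


def hads_subscale_total(item_scores):
    """Sum one HADS subscale: 7 items, each scored 0 to 3 (total 0 to 21)."""
    if len(item_scores) != 7:
        raise ValueError("a HADS subscale has exactly 7 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each HADS item is scored 0, 1, 2, or 3")
    return sum(item_scores)


# Hypothetical responses, not trial data.
print(hads_subscale_total([1, 0, 2, 1, 3, 0, 1]))
print(in_range("PCS", 53))
```

A check of this kind is mainly useful for catching data-entry errors before analysis.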

Statistical analysis

Prior to sample size estimation, we determined the non-inferiority margin in the NRS between paracetamol and celecoxib. According to previous studies on CLBP20, the mean change in pain intensity over a 4-week period was 1.6, with a standard deviation of 2.3. Since the NRS was used to assess the primary outcome in this study, the non-inferiority margin was defined as 0.3 points13. Based on these values, at least 51 subjects per arm were required, with a type I error of 5% and 80% power. Given the expected dropout rate25, 156 patients with CLBP were randomly allocated to receive treatment with paracetamol or celecoxib.

We initially tested whether continuous variables were normally distributed using the Shapiro–Wilk test. Between-group comparisons at each time point were conducted using either unpaired t-tests or Mann–Whitney U tests, depending on the data distribution. Analyses were adjusted for baseline covariates and reported with means and 95% confidence intervals (CIs). Categorical variables were represented as the number and percentage of patients, and were analyzed using the χ² test. Changes over time were analyzed using a repeated measures ANOVA and post hoc tests. We then tested the following null (H0) and alternative (H1) hypotheses, where MP is the mean change in NRS from baseline to weeks two and four with paracetamol, MC is the mean change in NRS from baseline to weeks two and four with celecoxib, and δ is the non-inferiority margin:

$$\text{H}_0: \text{MP} - \text{MC} > \delta \; (0.3)$$
$$\text{H}_1: \text{MP} - \text{MC} \le \delta \; (0.3)$$

Data were analyzed using SPSS (version 28.0, for Microsoft Windows; SPSS, Chicago, IL, USA). Statistical significance was set at p < 0.05.
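The non-inferiority hypotheses above can be illustrated with a minimal sketch. It rests on several assumptions not stated in the text: change scores are taken as follow-up minus baseline (so negative values mean less pain), the confidence bound uses an unadjusted normal approximation rather than the covariate-adjusted analysis performed in SPSS, and the function name and example data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def noninferiority_check(change_p, change_c, margin=0.3, alpha=0.05):
    """Return (MP - MC, one-sided upper confidence bound, non-inferior?).

    change_p, change_c: per-patient NRS changes (follow-up minus baseline)
    for paracetamol and celecoxib. H0 (MP - MC > margin) is rejected when
    the upper bound of the difference lies below the margin.
    """
    diff = mean(change_p) - mean(change_c)
    # Standard error of the difference between two independent means.
    se = sqrt(variance(change_p) / len(change_p)
              + variance(change_c) / len(change_c))
    # Normal approximation (an assumption; a t-based bound is slightly wider).
    z = NormalDist().inv_cdf(1 - alpha)
    upper = diff + z * se
    return diff, upper, upper < margin


# Hypothetical change scores, not trial data.
paracetamol = [-2, -1, -2, -1] * 10
celecoxib = [-2, -1, -2, -1] * 10
diff, upper, non_inferior = noninferiority_check(paracetamol, celecoxib)
print(f"difference = {diff:.2f}, upper bound = {upper:.2f}, "
      f"non-inferior = {non_inferior}")
```

The decision depends only on where the upper confidence bound falls relative to δ, which is why a plot of the mean difference with its confidence interval is a natural way to present the result.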

Results

A flowchart of the dispositions of the participants is shown in Fig. 1. Overall, 156 consecutive patients with CLBP were randomized equally to receive either paracetamol or celecoxib. Of the 156 patients, 17 patients (10.9%) were lost to follow-up (paracetamol: 8, versus celecoxib: 9) and 139 patients (89.1%) completed the study (paracetamol: 70, versus celecoxib: 69). The dropout rate was not significantly different between the two groups (p = 0.797). As shown in Table 1, the NRS, PCS, HADS-Depression, and EQ-5D-3L variables showed no significant differences between the two medication groups at baseline. The paracetamol group had significantly worse PDAS and HADS-Anxiety scores than did the celecoxib group at baseline.
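As a worked check of the dropout comparison, the counts above (8 of 78 versus 9 of 78 lost to follow-up) can be entered into a 2 × 2 table. The sketch below assumes an uncorrected Pearson χ² test with one degree of freedom, consistent with the χ² test named in the statistical analysis; whether a continuity correction was applied is not stated, and the function name is hypothetical.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for observed, row, col in ((a, row1, col1), (b, row1, col2),
                               (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Rows: paracetamol, celecoxib; columns: lost to follow-up, completed.
chi2, p_value = pearson_chi2_2x2(8, 70, 9, 69)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")  # p rounds to 0.797
```

Under these assumptions the result agrees with the reported p = 0.797.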

Fig. 1

Flow diagram of the randomized trial comparing paracetamol and celecoxib. Of 156 patients, 139 completed the study.

Table 1 Patient characteristics between the paracetamol and celecoxib groups at baseline.

Figure 2 shows the mean and its 95% CI for the difference in changes in NRS from baseline to each follow-up point between the two medication groups. Based on these analyses, the null hypothesis that paracetamol is less effective than celecoxib by more than 0.3 points on the NRS was rejected at each follow-up point. Therefore, we concluded that paracetamol was non-inferior to celecoxib within the predefined 0.3-point non-inferiority margin on the NRS.

Fig. 2

Mean differences in NRS change scores from baseline to each follow-up between the paracetamol and celecoxib groups. The mean and its 95% confidence interval show the treatment difference of paracetamol compared to celecoxib. Paracetamol was non-inferior to celecoxib within the 0.3-point margin on the NRS.

Table 2 shows changes in values of the primary and secondary outcomes. In the paracetamol group, scores at weeks two and four were significantly improved compared with baseline for each outcome. In the celecoxib group, scores for NRS and PCS improved significantly, whereas PDAS, HADS-Anxiety, HADS-Depression, and EQ-5D-3L values showed no significant changes from baseline. There were no statistically significant differences in changes in NRS or PCS between the two medications at weeks two and four, even after covariate adjustment. The PDAS, HADS-Anxiety, and HADS-Depression values in the paracetamol group were significantly decreased compared with the celecoxib group at weeks two and four. The EQ-5D-3L value in the paracetamol group was significantly increased compared with the celecoxib group at week four.

Table 2 Changes in values of primary and secondary outcome from baseline to each follow-up.

Although adverse effects were observed more frequently in the celecoxib group (6 of 69, 8.6%) than in the paracetamol group (2 of 70, 2.8%), the overall incidence did not differ significantly between the 2 medications (Table 3). No patients discontinued or adjusted their medication due to adverse events.
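With counts this small (6 of 69 versus 2 of 70), an exact test is one way to examine the comparison. The sketch below uses a two-sided Fisher's exact test, which is a substitute chosen for illustration and not necessarily the test the authors applied; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Rows: celecoxib, paracetamol; columns: adverse event, no adverse event.
p_value = fisher_exact_two_sided(6, 63, 2, 68)
print(f"two-sided p = {p_value:.3f}")
```

Under this assumption the p-value stays above 0.05, in line with the reported absence of a significant difference.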

Table 3 Adverse events in patients with chronic low back pain receiving paracetamol or celecoxib.

Discussion

Overview

The present randomized open-label non-inferiority trial investigated whether paracetamol was as effective as celecoxib for CLBP within a specified non-inferiority margin over four weeks of treatment in outpatient clinics. We found that the mean difference in changes in NRS between the two medication groups did not statistically exceed the non-inferiority margin, suggesting that paracetamol has a comparable analgesic effect on CLBP within a 0.3 point margin on a 0–10 point NRS compared with celecoxib.

Clinical trials of paracetamol for CLBP

The efficacy of individual drugs in the treatment of CLBP remains uncertain13. Pharmacological therapy for CLBP often uses NSAIDs, paracetamol, opioids, antidepressants, and skeletal muscle relaxants; however, guideline recommendations are contradictory12,17. Paracetamol, one of the oldest and most commonly used analgesics, was introduced to the pharmacological market in 195526. Nevertheless, findings regarding paracetamol for CLBP are few and contradictory17,18,19,20. Previous studies have suggested a significantly superior treatment efficacy of NSAIDs compared with paracetamol17,18,19,20. In contrast, the present study suggests that paracetamol has analgesic effects comparable to those of celecoxib. The dose of paracetamol (3,000 mg/day) in the present study was higher than that used in a previous trial (1,000 mg/day)20.

Mechanisms and adverse events of paracetamol

The mechanism of action of paracetamol is complex and includes the effects of both peripheral and central antinociceptive processes and redox mechanisms17,26. Recent studies emphasize that paracetamol may reduce pain through neurochemical changes in the central nervous system27,28. Interestingly, the present study suggested that paracetamol has a superior effect on pain-related psychological variables and QOL in the secondary outcomes compared with celecoxib at weeks 2 and 4. However, there is a lack of research on the efficacy and mechanisms of action of paracetamol17,26,28.

Paracetamol is known to have few adverse effects in the gastrointestinal tract17,26. However, cases of paracetamol-induced hepatotoxicity26 and hospitalizations21 have been reported, and there is a risk of serious adverse effects when regular and large doses of paracetamol are administered26. In the present study, the incidence of gastrointestinal adverse events was low (2 of 70, 2.8%) with administration of paracetamol at 3,000 mg per day.

Several clinical trials have suggested a limited analgesic effect of paracetamol for patients with various diseases12,17,29,30. For acute low back pain, there is high-quality evidence that paracetamol (4,000 mg per day) is no better than placebo for relieving pain in either the short or long term12,17. For knee or hip osteoarthritis, the efficacy of paracetamol (4,000 mg per day) was equal to or less than that of celecoxib (200 mg per day)29,30. These studies suggest that the efficacy of paracetamol could be disease-specific, even at maximal doses (4,000 mg/day)15.

Limitations

This study has several limitations. First, the open-label and assessor-unblinded design of this clinical trial may have increased the risk of bias. Second, efficacy was compared between the paracetamol and celecoxib groups, but not against a placebo group. Third, this study included patients with both primary and secondary low back pain. Therefore, these findings should be interpreted with caution, and further studies with rigorous patient selection may provide more useful findings.

Conclusions

The present randomized controlled clinical trial suggested that the mean difference in changes in NRS over four weeks between the two medication groups did not statistically exceed the specified non-inferiority margin, suggesting that paracetamol provides analgesic effects comparable to celecoxib for CLBP.