Introduction

Major depressive disorder (MDD) is a complex mental disorder that significantly affects psychosocial functioning and overall quality of life, and contributes to disability-adjusted life-years (DALYs) [1]. Antidepressants such as selective serotonin reuptake inhibitors (SSRIs) are commonly used to treat MDD [2], but individual responses to these medications vary widely [3]. Clinical guidelines recommend a waiting period of four to six weeks before evaluating antidepressant efficacy [4], which can lead to unnecessary drug exposure, compromised compliance, and prolonged patient morbidity [3]. Reducing ineffective trials and improving antidepressant therapy requires effective models that predict antidepressant efficacy in the early stage of MDD [5].

Paroxetine, a commonly prescribed SSRI for MDD, has acceptable efficacy but moderate tolerability [6]. Prediction models built with the same method exhibited varying performance across different drugs [7, 8], suggesting that drug-specific models might yield better performance. One study predicted treatment remission for eleven antidepressants, including paroxetine, but it did not restrict medication changes during treatment and showed suboptimal performance (area under the curve = 0.67) [9]. Effective models targeting treatment remission in paroxetine monotherapy are still lacking.

Paroxetine is primarily metabolized by cytochrome P450 2D6 (CYP2D6) in the liver [10]. CYP2D6 metabolizer status can be predicted from the CYP2D6 genotype through the activity score (AS) system [11]. A previous study used a physiologically based pharmacokinetic model to predict the steady-state concentration (Css) of paroxetine from CYP2D6 AS, but it sourced concentration-time profiles from literature involving mixed populations, which introduced heterogeneity in CYP2D6 allele frequencies [12]. Further research is therefore needed to develop a model that predicts paroxetine Css using CYP2D6 AS as a predictor and to clarify the role of AS in determining paroxetine Css in the Han Chinese population.

Therapeutic drug monitoring (TDM) is a crucial tool for optimizing the pharmacotherapy of individual patients [13]. Low drug concentrations may result in insufficient efficacy, while high concentrations may cause adverse drug reactions (ADRs) [14]. The Arbeitsgemeinschaft für Neuropsychopharmakologie und Pharmakopsychiatrie (AGNP) has proposed a therapeutic reference range (TRR) of 20–65 ng/ml for paroxetine, based on its efficacy after six weeks of treatment in European and Japanese populations [13]. Considering the diverse frequencies of pharmacokinetic and pharmacodynamic genetic variants among different ethnicities [15], it is imperative to further validate this TRR in the Han Chinese population. Additionally, the association between paroxetine concentrations and ADRs is controversial; some studies report no correlation, while others suggest a positive link [16,17,18]. Given this discrepancy, the relationship between ADRs and Css warrants further investigation. In addition to the TRR proposed by the AGNP, our prior study in a Han Chinese population identified a TRR of 31.55–52.5 ng/ml for patients with MDD, accounting for both treatment responses at the fourth week and ADRs [19]. Another European study suggested a range of 20–70 ng/ml to support long-term treatment efficacy [20]. Given the variations in previous TRRs, it is essential to integrate these ranges to offer a reference for paroxetine TDM that balances treatment efficacy and ADRs.

Therefore, we conducted this study with the following objectives: (1) develop prediction models for treatment remission in the early stage of paroxetine monotherapy, (2) develop prediction models for Css of paroxetine using CYP2D6 AS and confirm its role in Css determination, (3) validate the TRR proposed by AGNP guidelines, (4) explore the association between Css and ADRs, and (5) update the TRRs to incorporate both treatment remission and ADRs, thereby enhancing paroxetine treatment and TDM, particularly in patients with MDD from the Han Chinese population.

Methods

Study design and participants

This post-hoc analysis was derived from an 8-week, multi-center, prospective cohort study of paroxetine monotherapy, conducted within the Precision Medicine to Enhance Depression and Anxiety Outcome (PMEDA) consortium, to evaluate the impact of CYP2D6 genotype and phenotype on paroxetine plasma concentrations and treatment outcomes (Supplementary Materials) [19]. The consortium, comprising members from seventeen Chinese hospitals, recruited participants between March 2021 and April 2023, with follow-up and data collection extending to June 2023.

Psychiatric clinicians diagnosed patients using the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5). Eligible participants were aged 18–65 years, of Han Chinese ancestry, with a Hamilton Depression Rating Scale-17 (HAMD-17) score ≥ 17 and a Hypomania Checklist-32 (HCL-32) score ≤ 13. Patients had not received systemic antidepressant treatment or used CYP2D6-inducing or -inhibiting drugs within two weeks prior to enrollment. We excluded patients with other mental disorders, pregnancy or lactation, severe suicidal or self-harm tendencies, severe or unstable physical illnesses, secondary MDD resulting from somatic diseases, or concurrent participation in another clinical trial.

Patients started paroxetine hydrochloride (Seroxat) monotherapy at 10 mg/day, titrated to 20–40 mg/day after the first week. Patients who took less than 80% of the prescribed medication were considered to have poor adherence and were excluded. Patients meeting exclusion criteria, experiencing intolerable ADRs, or requesting withdrawal were withdrawn, with alternative plans provided by PMEDA consortium clinicians.

Out of 713 patients who met the inclusion criteria, 550 patients completed assessments at the eighth week and 530 patients without missing features were finally included in the analysis (Fig. 1a and Supplementary Methods).

Fig. 1: Flow chart of this study and identification of CYP2D6 genotype.

a Flow chart of this 8-week study of paroxetine monotherapy and prediction model development. b CYP2D6 copy number detection and SNP genotyping. For example, if a patient has a CYP2D6 gene copy number of 2N, a genotype of A.DEL at rs5030655 (indicating a deletion variant), and a genotype of GT at rs5030867, the patient is considered to carry one *6 and one *7 allele, classifying their CYP2D6 genotype as *6/*7. Css steady-state concentration, MDD major depressive disorder, ROC analysis receiver operating characteristic curve analysis, SNP single nucleotide polymorphism.

Ethics approval and consent to participate

This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) and the Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) reporting guidelines for investigation and modeling (Supplementary Materials) [21, 22]. The research protocol was approved by the Ethics Committee of Peking University Sixth Hospital (chictr.org: ChiCTR2000038462; Approved ID: 2020–49), and written informed consent was obtained from all participants.

Treatment outcomes

Patients were evaluated with clinical scales at baseline, the fourth week, and the eighth week. Patients with a HAMD-17 score ≤ 7 at assessment were classified as achieving remission, while all others were classified as non-remission. ADRs were assessed using the Treatment Emergent Symptom Scale (TESS), with a score greater than two on any item at any timepoint indicating the occurrence of ADRs. Blood samples were collected at the fourth week, following at least ten days of stable daily dosing, 20–24 h after the last paroxetine dose and before the morning dose. Dried blood spots (DBS) of plasma and whole blood were then prepared. The Css of paroxetine in plasma was measured using liquid chromatography-tandem mass spectrometry (LC-MS/MS, AB5500, AB Sciex) (Supplementary Methods). In this study, we treated Css as a binary variable and classified it according to the TRR of paroxetine described by the AGNP [13]. Css within the range of 20–65 ng/ml was defined as ‘within TRR’, while Css above 65 ng/ml or below 20 ng/ml was defined as ‘outside TRR’.
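The outcome definitions above can be sketched in a few lines of Python. This is only an illustration of the stated cutoffs, not the analysis code used in the study (which was written in R); the function and constant names are hypothetical, and the range bounds are treated as inclusive because the text defines ‘outside TRR’ as strictly above 65 ng/ml or strictly below 20 ng/ml.

```python
# Cutoffs quoted in the text: AGNP TRR of 20-65 ng/ml, remission at
# HAMD-17 <= 7, and an ADR flagged when any TESS item scores above two.
# All names below are hypothetical and chosen for illustration.
TRR_LOW_NG_ML = 20.0
TRR_HIGH_NG_ML = 65.0
REMISSION_CUTOFF = 7
TESS_ITEM_CUTOFF = 2


def css_within_trr(css_ng_ml, low=TRR_LOW_NG_ML, high=TRR_HIGH_NG_ML):
    """Return True when Css lies inside the reference range (bounds inclusive)."""
    return low <= css_ng_ml <= high


def is_remission(hamd17_score):
    """Return True when the HAMD-17 total score meets the remission cutoff."""
    return hamd17_score <= REMISSION_CUTOFF


def has_adr(tess_item_scores_by_visit):
    """Return True if any TESS item exceeds the cutoff at any visit.

    tess_item_scores_by_visit is an iterable of per-visit lists of item scores.
    """
    return any(
        score > TESS_ITEM_CUTOFF
        for visit in tess_item_scores_by_visit
        for score in visit
    )
```

Passing different `low` and `high` values to `css_within_trr` would allow the same helper to be reused for alternative reference ranges.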

CYP2D6 genotyping and activity score

The selection criteria for CYP2D6 genotyping were informed by prior pharmacogenomics research [23, 24]. We excluded the *2 and *9 alleles because they did not affect CYP2D6 enzyme function (*2) or had low prevalence (*9) in the Chinese population [25,26,27]. We included the *114 allele, which was recognized as a loss-of-function variant [28]. The CYP2D6 *5 allele (CYP2D6 whole-gene deletion) had a frequency of over 0.03 in East Asian populations [29], but some studies overlooked copy number variants (CNVs) when determining AS [30]. When performing CYP2D6 genotyping and AS calculation, we considered both single nucleotide polymorphisms (SNPs) and CNVs, thereby enhancing the precision of our calculations.

Genomic DNA was extracted from the DBS samples using the Mag-MK Blood Spot DNA Extraction Kit (QIAGEN, Hilden, Germany). CNVs of the CYP2D6 gene were identified using TaqMan real-time fluorescent quantitative PCR and calculated using the ΔΔCt relative quantification method in CopyCaller V2.3.1 software (Thermo Fisher, Waltham, USA) (Supplementary Methods). Genotyping of SNPs was performed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). Based on the results of CYP2D6 copy number detection and SNP genotyping, the types and quantities of CYP2D6 star alleles were identified (Fig. 1b). Each star allele was assigned an activity value, and the CYP2D6 AS for each patient was the sum of all the activity values (Supplementary table 1). CYP2D6 metabolizer status was categorized into poor metabolizers (PMs, AS = 0), intermediate metabolizers (IMs, 0 < AS < 1.25), normal metabolizers (NMs, 1.25 ≤ AS ≤ 2.25) and ultrarapid metabolizers (UMs, AS > 2.25) [31].
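As a minimal sketch of the scoring rule described above, the following Python snippet sums per-allele activity values and maps the resulting AS to a metabolizer category using the cutoffs given in the text. The activity values in `EXAMPLE_ACTIVITY_VALUES` are placeholders chosen for illustration; the values actually used in this study are those in Supplementary table 1, which is not reproduced here, and the function names are hypothetical.

```python
# Placeholder activity values for illustration only; the study's own
# assignments are listed in its Supplementary table 1.
EXAMPLE_ACTIVITY_VALUES = {"*1": 1.0, "*10": 0.25, "*41": 0.5, "*4": 0.0, "*5": 0.0}


def activity_score(star_alleles, activity_values=EXAMPLE_ACTIVITY_VALUES):
    """Sum the activity values of every star allele a patient carries."""
    return sum(activity_values[allele] for allele in star_alleles)


def metabolizer_status(score):
    """Map an activity score to PM / IM / NM / UM using the cutoffs in the text."""
    if score < 0:
        raise ValueError("activity score cannot be negative")
    if score == 0:
        return "PM"  # poor metabolizer, AS = 0
    if score < 1.25:
        return "IM"  # intermediate metabolizer, 0 < AS < 1.25
    if score <= 2.25:
        return "NM"  # normal metabolizer, 1.25 <= AS <= 2.25
    return "UM"      # ultrarapid metabolizer, AS > 2.25


# Example: a diplotype of *1/*10 under the placeholder values above.
example_score = activity_score(["*1", "*10"])
example_status = metabolizer_status(example_score)
```

Because a gene duplication adds a third allele to the list and a whole-gene deletion (*5) contributes zero, copy number variants enter the score through the same sum.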

Model for treatment remission

Previous studies suggest that the factors influencing antidepressant efficacy remain controversial [32, 33]. Therefore, we only selected baseline HAMD-17 score, age, sex, and Css (within/outside TRR) as potential predictors. To identify the optimal predictors, we conducted least absolute shrinkage and selection operator (LASSO) regression analysis, which identifies the variables most strongly associated with the response variable by shrinking certain regression coefficients to zero. We selected the optimal lambda value (‘lambda min’) for the best performance. In the absence of an external validation dataset, cross-validation can enhance the evaluation of model generalizability [34]. The dataset was divided into ten equal-sized subsamples for cross-validation, where each subsample was used once as the test sample for model evaluation, with the remaining nine serving as training data. After conducting ten LASSO analyses, we counted how often each variable had a non-zero coefficient. Variables with non-zero coefficients in more than six of the ten analyses were included in the final prediction model [35, 36]. We then developed a prediction model for treatment remission using logistic regression analysis and 10-fold cross-validation, incorporating predictors selected by the LASSO analysis.
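The fold construction and the "more than six of ten" selection rule can be illustrated with the short Python sketch below. It does not fit LASSO models itself: the per-fit coefficient dictionaries passed to `stable_predictors` are assumed to come from whatever LASSO implementation is used, and all function names and the shuffling seed are hypothetical.

```python
import random
from collections import Counter


def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def train_test_splits(folds):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def stable_predictors(coefficients_per_fit, min_count_exclusive=6):
    """Keep predictors whose coefficient is non-zero in more than six fits.

    coefficients_per_fit is a list with one {predictor_name: coefficient}
    dictionary per LASSO fit (one fit per training split).
    """
    counts = Counter(
        name
        for coefficients in coefficients_per_fit
        for name, value in coefficients.items()
        if value != 0.0
    )
    return sorted(name for name, n in counts.items() if n > min_count_exclusive)
```

With 530 patients, `ten_fold_indices(530)` yields ten folds of 53 indices each, so every training split contains 477 patients.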

The discriminative ability between remission and non-remission was evaluated using receiver operating characteristic (ROC) curve analysis. We assessed sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, balanced accuracy, and area under the curve (AUC) on each test fold, and calculated their mean and standard deviation (SD). We conducted the Hosmer-Lemeshow test on each test fold to evaluate the calibration performance. We further used decision curve analysis (DCA) to assess clinical benefits under different threshold probabilities. Predictions from the ten test folds were then pooled. We assessed the AUC and conducted the DCA on the pooled data, and the 95% confidence interval (CI) of the AUC was estimated using the bootstrap method with 1000 replicates. To evaluate the stability of model performance, we conducted the ROC analysis separately in the subgroups of first-onset and relapse patients. We further conducted logistic regression analyses on each training sample and on the entire dataset to investigate the relationship between predictors and remission, calculating the odds ratio (OR) and CI for each predictor.
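For readers less familiar with these metrics, the Python sketch below shows how they can be computed for a single test fold. It is an illustration under stated assumptions rather than the study's R code: labels are coded 1 for the event and 0 otherwise, the AUC is computed as the proportion of correctly ordered event/non-event pairs (ties counted as one half), and `net_benefit` uses the usual unstandardized decision-curve formula, TP/n − (FP/n) × pt/(1 − pt). All function names are hypothetical.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels coded 1 (event) and 0 (no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, len(pairs)),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def auc(y_true, scores):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


def net_benefit(y_true, probabilities, threshold):
    """Unstandardized net benefit at one threshold probability (0 < threshold < 1)."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probabilities) if t == 1 and p >= threshold)
    fp = sum(1 for t, p in zip(y_true, probabilities) if t == 0 and p >= threshold)
    return tp / n - (fp / n) * threshold / (1 - threshold)
```

Per-fold values can then be summarised with `statistics.mean` and `statistics.stdev`; one common way to obtain a standardized net benefit is to divide the net benefit by the outcome prevalence.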

Model for paroxetine Css

Based on previous research [37, 38], we selected factors related to paroxetine Css as potential predictors, including age, sex, body mass index (BMI), smoking habit, drinking habit, CYP2D6 AS, CYP2D6 genotype-predicted metabolizer phenotype, paroxetine daily dose, and onset status. The feature selection process was conducted in the same way as for the remission model. We then developed a prediction model for Css (within/outside TRR) using logistic regression analysis and 10-fold cross-validation, incorporating predictors selected by the LASSO analysis. The discriminative ability, calibration, and clinical benefit of this model were assessed in the same way as for the remission model, including ROC analysis, Hosmer-Lemeshow tests, and DCA. Subgroup analysis was also performed for first-onset and relapse patients. Logistic regression analysis was further conducted to investigate the relationship between predictors and Css (within/outside TRR).

Association between ADRs and Css

We conducted logistic regression to explore the association between ADRs and Css, using both Css (within/outside TRR) and its natural log transformation (to adjust for skewness) as independent variables [39]. For the multivariable analysis, given the low frequency of ADRs, only sex and age were included as covariates to maintain adequate statistical power. Corrected odds ratios were calculated using a penalized maximum likelihood logistic regression model (Firth’s method, via the logistf R package) [40].

Therapeutic reference range

We further updated the TRR through ROC analysis, considering both treatment remission at the eighth week and ADRs. We used the Youden index to determine the optimal threshold. We first conducted ROC analysis to determine the upper limit of the updated TRR required for reducing ADRs. Subsequently, among patients whose Css was below this upper limit, we performed ROC analysis to identify the lower limit of the TRR required for improving treatment remission at the eighth week.
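The two-step threshold search described above can be sketched as follows. This Python illustration makes several assumptions that are not stated in the text: candidate cutoffs are the observed Css values themselves (ROC software may instead use midpoints between observed values), higher Css is taken to predict ADRs in the first step and remission in the second, and the first cutoff reaching the maximum Youden index J = sensitivity + specificity − 1 is returned. The function names are hypothetical.

```python
def youden_threshold(values, labels):
    """Return (cutoff, J) maximising J, where value >= cutoff predicts label 1."""
    positives = sum(1 for label in labels if label == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, l in zip(values, labels) if v >= cutoff and l == 1)
        fp = sum(1 for v, l in zip(values, labels) if v >= cutoff and l == 0)
        # sensitivity - (1 - specificity) is the Youden index J
        j = tp / positives - fp / negatives
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


def two_step_reference_range(css, adr, remission):
    """Upper limit from ADRs first, then lower limit from remission below it."""
    upper, _ = youden_threshold(css, adr)
    below = [(c, r) for c, r in zip(css, remission) if c < upper]
    lower, _ = youden_threshold([c for c, _ in below], [r for _, r in below])
    return lower, upper


# Toy data (not study data), only to show the call pattern.
toy_css = [10, 15, 25, 30, 40, 50, 60, 70, 80]
toy_adr = [0, 0, 0, 0, 0, 0, 1, 1, 1]
toy_remission = [0, 0, 1, 1, 1, 1, 0, 1, 0]
toy_range = two_step_reference_range(toy_css, toy_adr, toy_remission)
```

Restricting the second step to patients below the upper limit means the lower limit is estimated only among those not already flagged for elevated ADR risk, which mirrors the order of the procedure in the text.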

Patients were then categorized based on whether their Css fell within the updated TRR, and differences in ADRs and remission incidence were compared between groups using the Chi-square or Fisher’s exact test. Logistic regression analysis was further conducted to investigate relationships between Css (within/outside updated TRR) and treatment remission or ADRs. For each TRR (20–65 ng/ml, 20.8–52.5 ng/ml, and 31.55–52.5 ng/ml), patients with Css within these ranges were identified, and differences in ADR and remission incidences were compared across groups using the Chi-square or Fisher’s exact test.

Statistical analysis

All statistical analyses were performed using R software (version 4.3.1). Missing data were handled with a complete-case analysis approach. A two-tailed P-value < 0.05 indicated statistical significance.

Results

Characteristics of participants

Table 1 lists demographic and clinical characteristics at baseline. Of 713 enrolled patients, 550 completed all assessments and 530 were included in the complete-case analysis (Fig. 1a). The average follow-up time was 6.3 weeks. At the fourth week of treatment, the median and interquartile range (IQR) of paroxetine Css in plasma were 39 (20.73–77.18) ng/ml. Among the patients, 230 (43.40%) had a Css within the TRR. 359 patients (67.74%) achieved treatment remission by the eighth week, and 34 patients (6.42%) had ADRs during treatment. Statistical power of the sample size was sufficient (Supplementary Materials).

Table 1 Demographic and clinical characteristics at baseline.

Model for treatment remission

We first performed LASSO analysis to identify predictors strongly associated with treatment remission. Baseline HAMD-17 scores, age, sex, and Css (within/outside TRR) were all selected for logistic model construction with non-zero coefficients in all of the ten LASSO analyses (Supplementary table 2). In the 10-fold cross-validation, we conducted ROC analysis on each test fold (Fig. 2a and Supplementary table 3). The mean and SD of accuracy, balanced accuracy, and AUC for the ten test folds were 0.723 ± 0.057, 0.719 ± 0.041, and 0.707 ± 0.063 (Table 2). The Hosmer-Lemeshow test on each test fold demonstrated good calibration, with all P-values > 0.05 (Supplementary table 4). DCA revealed that the model offered higher standardized net benefits when predicting remission compared to situations without prediction models (Fig. 2b). After pooling the ten test folds, we conducted ROC and DCA analysis on the pooled data (Fig. 2a and b), and AUC was 0.709 (95% CI: 0.658–0.757). We also conducted ROC analysis in the subgroup of first-onset or relapse patients, respectively (Table 2). The mean and SD of AUC for the first-onset subgroup were 0.714 ± 0.104, while for the relapse subgroup, they were 0.723 ± 0.057, suggesting stability between different onset statuses.

Fig. 2: Receiver operating characteristic curve and decision curve analysis for two prediction models.

a Receiver operating characteristic curves on each test fold and the pooled test sample for the model predicting treatment remission of paroxetine. b Clinical decision curves on each test fold and the pooled test sample for the model predicting treatment remission of paroxetine. c Receiver operating characteristic curves on each test fold and the pooled test sample for the model predicting Css of paroxetine. d Clinical decision curves on each test fold and the pooled test sample for the model predicting Css of paroxetine. AUC area under the curve, Css steady-state concentration.

Table 2 Performance of two prediction models.

To further explore the relationship between predictors and remission, we performed logistic regression on each training sample (Supplementary Fig. 1a) and the entire dataset (Fig. 3a and Supplementary table 5). We found that age, baseline HAMD-17 scores, and Css (within/outside TRR) were significantly associated with treatment remission, with no evidence of multicollinearity (VIF < 1.01). Across the entire dataset, patients with Css within the TRR had a higher likelihood of achieving remission (OR = 1.655, 95% CI: 1.109–2.489). Patients with Css below, within, or above the TRR demonstrated remission frequencies of 62.2, 73.0, and 64.7%, respectively (Supplementary Fig. 2a).

Fig. 3: Logistic regression of Css and identification of updated TRR.

a The forest plot shows logistic regression results between Css (within/outside TRR of 20–65 ng/ml) and treatment remission, adjusting for sex, age, and baseline HAMD-17 score. b The forest plot shows logistic regression results between activity score and Css (within/outside TRR of 20–65 ng/ml), adjusting for sex, daily dose of paroxetine, and smoking and drinking status. c The forest plot shows logistic regression results between Css (log-transformed Css or within/outside TRR of 20–65 ng/ml) and ADRs, adjusting for sex and age using the Firth’s method. d Receiver operating characteristic curves for paroxetine steady-state concentration to identify the upper limit of the updated TRR, considering adverse drug reactions. The columns display frequencies of ADRs for subgroups with Css below, within, or above the updated TRR. e Receiver operating characteristic curves for paroxetine steady-state concentration to identify the lower limit of the updated TRR, considering treatment remission. The columns display frequencies of remission for subgroups with Css below, within, or above the updated TRR. ADR adverse drug reaction, CI confidence interval, Css steady-state concentration, HAMD-17 Hamilton Depression Rating Scale-17, OR odds ratio, TRR therapeutic reference range (20–65 ng/ml), updated TRR updated therapeutic reference range (20.8–52.5 ng/ml); *p < 0.05, **p < 0.01, and ***p < 0.001.

Model for paroxetine Css

LASSO analysis identified potential predictors for Css with non-zero coefficients in more than six of the ten LASSO analyses (Supplementary table 2), including sex, smoking habit, drinking habit, CYP2D6 AS, and daily dose of paroxetine. In the 10-fold cross-validation, we conducted ROC analysis on each test fold (Fig. 2c and Supplementary table 3). The mean and SD of accuracy, balanced accuracy, and AUC for the ten test folds were 0.627 ± 0.056, 0.629 ± 0.054, and 0.615 ± 0.069 (Table 2). The Hosmer-Lemeshow test on each test fold demonstrated good calibration, with all P-values > 0.05 (Supplementary table 4). DCA revealed that the model offered higher standardized net benefits (Fig. 2d). We further conducted ROC and DCA analysis on the pooled data (Fig. 2c and d), and AUC was 0.607 (95% CI: 0.559–0.656). ROC analysis in the subgroup of first-onset or relapse patients indicated stability between different onset statuses (Table 2).

To further clarify the role of CYP2D6 AS in predicting Css, we excluded CYP2D6 AS from the predictors and performed ROC analysis to assess model performance without it. The mean AUC (± SD) for the model without AS was 0.578 ± 0.049. A ROC test on the pooled data revealed a significant difference between the AUCs of the models with and without AS (P = 0.031). The model with AS showed an AUC improvement of 0.042 (95% CI: 0.001–0.085) over the model without AS, as estimated by bootstrap.

In the logistic regression analysis (Fig. 3b, Supplementary table 6, and Supplementary Fig. 1b), we found that sex, daily dose of paroxetine, and CYP2D6 AS were significantly associated with Css (within/outside TRR), with no evidence of multicollinearity (VIF < 1.57). Across the entire dataset, patients with lower CYP2D6 AS were more likely to maintain their Css within the TRR (OR = 0.638, 95% CI: 0.461–0.878).

Association between ADRs and Css

We investigated the effects of Css on ADRs (Fig. 3c and Supplementary table 7). Both univariable and multivariable logistic regression analyses showed statistically significant positive associations between log-transformed Css and ADRs (OR = 1.469, 95% CI: 1.008–2.159; OR = 1.490, 95% CI: 1.009–2.228, respectively). These associations remained significant in the penalized maximum likelihood logistic regression (OR = 1.463, 95% CI: 1.007–2.144; OR = 1.481, 95% CI: 1.007–2.205, respectively). Additionally, patients with Css within the TRR were less likely to experience ADRs in the univariable and multivariable regressions (OR = 0.448, 95% CI: 0.194–0.947; OR = 0.443, 95% CI: 0.191–0.942, respectively), which was further supported in the penalized model (OR = 0.464, 95% CI: 0.205–0.966; OR = 0.460, 95% CI: 0.203–0.961, respectively). Across the entire dataset, patients with Css below, within, or above the TRR demonstrated ADR incidence of 5.5, 3.9, and 10.5%, respectively (Supplementary Fig. 2b).

Therapeutic reference range

ROC analysis identified 52.5 ng/ml as the optimal upper limit of paroxetine Css to reduce ADRs (Fig. 3d). Among those with Css below 52.5 ng/ml, 20.8 ng/ml was determined as the lower threshold for achieving remission (Fig. 3e). Patients with Css within the updated TRR (20.8–52.5 ng/ml) showed significantly higher remission rates compared to those with Css below this range (P = 0.034, Fig. 3e). Conversely, patients with Css above the updated TRR showed significantly higher ADR rates than those within it (P = 9.02 × 10⁻⁵, Fig. 3d). Patients with Css above the updated TRR showed lower remission rates, while those with Css below the updated TRR exhibited higher ADR rates compared to patients within the updated TRR, although these differences were not statistically significant (Fig. 3d and e). After adjustment for sex, age, and baseline HAMD-17 score, patients with Css within the updated TRR were more likely to achieve remission (OR = 1.644, 95% CI: 1.078–2.536, Supplementary table 8 and Fig. 3). Additionally, after adjusting for sex and age and applying Firth’s method, these patients showed a reduced likelihood of experiencing ADRs (OR = 0.190, 95% CI: 0.051–0.513, Supplementary table 8 and Fig. 3). For the comparison across the three TRR standards, patients with Css within 20.8–52.5 ng/ml demonstrated higher remission rates and lower ADR rates compared to those with Css within 20–65 ng/ml or 31.55–52.5 ng/ml, though the differences did not reach statistical significance (Fig. 3d, e, Supplementary table 9, Fig. 2, and Fig. 4).

Discussion

In this study, we developed models for treatment remission and Css of paroxetine, demonstrating good performance. Age, baseline HAMD-17 scores, and Css (within/outside TRR) significantly influenced treatment remission, whereas sex, daily dose of paroxetine, and CYP2D6 AS significantly influenced Css. We validated the TRR of 20–65 ng/ml in the Han Chinese population and confirmed associations between ADRs and Css. Additionally, by considering both treatment remission at the eighth week and ADR occurrence, we established an updated TRR of 20.8–52.5 ng/ml. Patients with Css within this updated TRR showed significantly reduced ADR frequencies and higher remission rates compared to those outside the updated TRR.

We constructed a model to predict treatment remission of paroxetine monotherapy using baseline and early-stage information. The model showed good performance as well as clinical net benefits, demonstrating potential clinical utility. Consistent with our results, previous studies have reported accuracies of about 0.7 for predicting antidepressant efficacy [8]. However, few studies have specifically examined paroxetine efficacy. One such study predicted treatment remission for 11 antidepressants including paroxetine and achieved an AUC of 0.67, lower than that of our model; it also did not restrict medication switching, so patients did not necessarily remain on the target drug [9]. Compared to previous models for other antidepressants, our model demonstrated superior [9, 41, 42] or equivalent performance [7] and required fewer predictors, enhancing clinical applicability and reducing overfitting risks.

Allele frequencies of genetic variants related to pharmacokinetics and pharmacodynamics differ among ethnicities [15]. However, the AGNP established the TRR for paroxetine at 20–65 ng/ml based on studies that did not include Han Chinese populations [13, 43,44,45]. Therefore, it is crucial to validate whether the AGNP’s TRR applies to the Han Chinese population before using it in clinical practice. In the present study, maintaining Css within the AGNP’s TRR significantly correlated with higher remission rates after eight weeks of paroxetine monotherapy, validating the AGNP’s TRR in the Han Chinese population. Furthermore, in line with previous studies, the baseline HAMD-17 score was identified as a risk factor for remission, while age was a protective factor [32, 46].

We developed a model to predict whether the Css of paroxetine falls within the TRR of 20–65 ng/ml based on variables collected at baseline. The model demonstrated moderate performance, surpassing chance levels and offering clinical net benefits. Compared to prior models that predicted if concentrations were within specific ranges, our model exhibited superior [47] or equivalent [48] performance. CYP2D6 AS is a crucial predictor that significantly improves our model’s performance, with higher AS linked to a decreased likelihood of Css falling within the TRR. Given that higher AS indicates increased CYP2D6 metabolic activity [31], this accelerates paroxetine metabolism in plasma, potentially resulting in Css falling below the TRR’s lower limit. Male patients were more likely to maintain Css within the TRR compared to females. Consistently, female patients were found to exhibit significantly higher paroxetine concentrations than males [49], with males displaying lower relative bioavailability [50]. Collectively, these findings suggest female patients are more likely to have their Css exceeding the TRR. Patients receiving higher doses were less likely to maintain Css within the TRR. As paroxetine acts as a CYP2D6 inhibitor, with higher doses (e.g., 30 mg) enhancing this inhibition [10], patients might have their Css exceeding the TRR following dosage increases in this study.

Previous research has underscored the importance of CYP2D6 AS-dependent metabolism in predicting paroxetine concentrations [12], but it relied on concentration-time curves from studies spanning diverse populations, which introduced heterogeneity in CYP2D6 allele frequencies. Our previous findings demonstrated a significant negative correlation between CYP2D6 AS and paroxetine Css in female patients with MDD, generalized anxiety disorder, or panic disorder [19]. In this study, we examined the roles of CYP2D6 AS and metabolizer status in predicting paroxetine Css in patients with MDD. In the LASSO analysis and cross-validation for predictor selection, AS consistently showed non-zero coefficients across all folds. Conversely, CYP2D6 metabolizer status only did so in one fold. This aligns with prior research indicating that AS offers a more detailed assessment than traditional metabolizer status systems [51]. Our findings suggest the potential value of using AS to predict plasma concentrations.

The relationship between ADRs and paroxetine concentration is still debated. Previous research found no link between paroxetine plasma levels and ADRs [17, 18]. However, another study suggested that higher steady-state brain paroxetine concentrations were associated with increased risk of adverse events [16]. These studies were based on small European samples (n < 100), underscoring the necessity of investigating the association between Css and ADRs in the Han Chinese population with a more substantial sample size. In this study, we confirmed a significant positive association between log-transformed paroxetine Css and ADRs, aligning with previous research linking brain paroxetine levels and ADRs [16]. Furthermore, patients with Css within the TRR of 20–65 ng/ml had a lower likelihood of experiencing ADRs, which remained significant in the penalized model adjusted for sex and age. These results underscore the utility of the TRR in assessing ADR risks.

The plasma concentrations, treatment efficacy, and ADRs of antidepressants vary across ethnicities [52,53,54]. These variations are driven by genetic polymorphisms in drug-metabolizing enzymes like CYP2D6, population-specific metabolizer status, differences in drug transport and receptor function, and societal or environmental factors such as adherence, lifestyle, diet, and overall health [53, 55, 56]. Consequently, there is a need for TRRs tailored to specific populations to optimize treatment outcomes. Considering both treatment remission at the eighth week and the occurrence of ADRs, we have refined the TRR to 20.8–52.5 ng/ml for the Han Chinese population. The upper limit (52.5 ng/ml) aligns with findings from our prior research, where we set a TRR of 31.55–52.5 ng/ml [19]. The lower limit was updated in this study based on ROC analysis, which identified the threshold distinguishing patients who achieved clinical remission after 8 weeks of treatment from others. Patients with Css below, within, or above the TRR of 31.55–52.5 ng/ml showed no significant differences in eighth-week remission rates, whereas a significant difference was observed between patients with Css below and within the TRR of 20.8–52.5 ng/ml. The lower limit (20.8 ng/ml) aligns with the TRR of the AGNP clinical guidelines (20–65 ng/ml) [13], which is based on treatment efficacy observed at the sixth week [44, 45], close to the remission assessment timeline in our study. Furthermore, another study tracking patients with MDD treated with paroxetine over a year found that patients with paroxetine plasma levels between 20 and 70 ng/ml exhibited greater long-term treatment efficacy [20]. We found that patients with Css within 20–65 ng/ml had higher ADR incidence (3.9%) compared to those with Css within 20.8–52.5 ng/ml (1.7%). Our updated TRR integrates prior TRRs and is expected to provide a reference for paroxetine TDM that balances long-term efficacy and ADRs.

Our study presents several strengths and clinical implications. First, to our knowledge, few studies have specifically developed a model for predicting treatment efficacy of paroxetine in monotherapy. Our model exhibited good performance in forecasting treatment remission. Second, we predicted Css of paroxetine in monotherapy samples from the same ancestry (Han Chinese), involving over 500 participants, which confirmed the effects of CYP2D6 AS on Css and underscored its potential in predicting Css. Third, we validated the TRR of 20–65 ng/ml in the Han Chinese population. Patients with Css within the TRR were more likely to achieve remission and avoid ADRs. Finally, we consolidated prior research and adjusted the TRR to 20.8–52.5 ng/ml, optimizing the balance between long-term treatment efficacy and ADRs.

We acknowledge several limitations in our study. First, our prediction models were tailored specifically to paroxetine monotherapy in the Han Chinese population and lacked external validation. Therefore, caution is warranted when applying these results to other contexts, and the findings should be verified in external cohorts using consistent predictors. Second, we categorized the Css of paroxetine as a binary variable defined by the TRR of 20–65 ng/ml, which potentially loses information compared to treating Css as a continuous variable; for example, a binary classification cannot show whether a patient’s Css exceeds laboratory alert levels. Future research should develop models predicting Css or log-transformed Css to enhance TDM precision. Lastly, we used the latest standards that combine both SNPs and CNVs to improve the accuracy and reliability of determining CYP2D6 AS beyond traditional methods; however, the inference of CYP2D6 AS might still carry uncertainty because not all variants were detected.

Overall, we predicted treatment remission and Css of paroxetine in patients with MDD from the Han Chinese population, achieving good performance. Our analysis confirmed significant correlations between CYP2D6 AS and Css, as well as between Css and both treatment remission and ADRs. We validated the TRR proposed by previous clinical guidelines in the Han Chinese population and refined it to 20.8–52.5 ng/ml. Our results underscore the potential of CYP2D6 AS and Css as predictors for Css and treatment remission, respectively. Independent cohort validation is essential to confirm these results.