Introduction

Treatment-induced peripheral neuropathy (TIPN) is one of the most complex toxicities to diagnose and manage in cancer patients, thus limiting optimal therapy1,2. Chemotherapy-induced peripheral neuropathy (CIPN) is the most widely recognized form of TIPN. Notably, TIPN can also be caused by other cancer therapies, such as molecular therapies, radiation, and surgery3. TIPN is characterized by pain, tingling, numbness, heightened sensitivity to temperature variations, and impaired fine motor skills4, which impair daily function and reduce quality of life5,6. Severe TIPN can necessitate altering therapy dosage, disrupting clinical treatment plans, thus impacting treatment outcomes7. TIPN is often cumulative and persistent8. The number of patients experiencing TIPN may increase since surgery, radiation, and targeted therapy are the standard treatments for multiple types of cancer. Therefore, a systematic evaluation instrument is needed to assess TIPN and understand the increasing burden of residual treatment-related neuropathic effects.

The symptomatology of peripheral neuropathy is diverse and depends on the treatment method and dose9. Research on neuropathy symptoms has expanded from an initial focus on pain to multidimensional symptoms of sensory, motor, and autonomic changes10,11. Currently, objective neurobiological tests, clinician evaluation, and subjective patient reports are used to detect the multifaceted nature of TIPN. However, the equipment used for neurophysiological testing is expensive and requires professional knowledge and training to operate, which limits its use in multicenter pilot studies12. Furthermore, clinician evaluation shows wide score variance due to subjective differences among evaluators, making it difficult to capture changes in clinical symptoms. Previous studies have demonstrated a moderate correlation between the findings of neurobiological tests and patient reports, as well as a moderate correlation between patient reports and clinician evaluation13,14. The effects of symptoms are experienced subjectively. Therefore, patient-reported outcomes (PROs) are more appropriate for clinical practice than objective neurobiological tests when monitoring adverse events and symptom perception. Also, PROs are more accurate and reliable in capturing a wide range of TIPN symptoms than clinician evaluation15.

PROs have been widely used to collect peripheral neuropathy symptom information. Existing patient-reported outcome measures (PROMs), such as the European Organization for Research and Treatment of Cancer Quality of Life Chemotherapy-Induced Peripheral Neuropathy Questionnaire (EORTC QLQ-CIPN20)16, the Functional Assessment of Cancer Therapy/Gynecologic Oncology Group Neurotoxicity subscale (FACT/GOG-Ntx)17, and the National Cancer Institute Patient-Reported Outcome Common Terminology Criteria for Adverse Events-Numbness & tingling (PRO-CTCAE), can capture several aspects of neuropathy18. Neuropathy symptoms develop progressively during neurotoxic chemotherapy administration. Therefore, existing PROMs should follow a rigorous scientific approach during development to identify key symptoms, discriminate between agents associated with the development of neuropathy, and detect neurological dysfunction and change in its severity over time13,15.

The US Food and Drug Administration (FDA) issued the guidance “Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims”19, which states that the development of PROMs should address issues related to the conceptual framework, item development, endpoint models, and psychometric properties. The guidance explicitly states that the content validity (items and dimensions) of PROMs should be supported by evidence from qualitative studies. However, the content domains of the most widely used neuropathy measures, EORTC QLQ-CIPN20 and FACT/GOG-Ntx, were not developed and validated through qualitative patient interviews. Additionally, most existing PROMs for neuropathy, including EORTC QLQ-CIPN20 and FACT/GOG-Ntx, are based on the quality of life (QOL) paradigm. Symptoms are the most direct indicators of physiological changes caused by disease and treatment, as perceived and reported by patients. Although QOL is an important indicator in assessing the side effects of cancer treatment, symptom burden is more proximal to the physiological changes that cause side effects (peripheral neuropathy). Therefore, symptom burden is a more sensitive measure in research and practice than QOL.

Mendoza et al. from the University of Texas MD Anderson Cancer Center developed a Treatment-Induced Neuropathy Assessment Scale (TNAS), a PROM of TIPN symptom burden, to address issues related to the adequacy of currently available measures20. Symptom items for the initial versions of TNAS (v1.0 and v2.0) were generated through consultation with clinical expert panels, literature reviews, and reports of the initial set of items by patients with multiple types of cancer21. Preliminary psychometric evaluation indicated that TNAS showed good responsiveness, validity, and reliability. Besides, the specific sensory and motor deficits were more bothersome to patients than pain. These measures should be revised as new relevant information is generated since the development and validation of PROMs should be an iterative process22.

Furthermore, Mendoza et al. conducted one-on-one qualitative interviews and cognitive debriefing with patients receiving treatments known to induce TIPN to appropriately refine the tool to form the final version of the TNAS (TNAS v3.0), ensuring that the TNAS strictly complies with FDA guidance on establishing the content validity of PROMs used in labeling claims23. The developed TNAS v3.0 was then validated in psychometric evaluation using patients with colorectal cancer, multiple myeloma, or gynecologic cancer receiving oxaliplatin, bortezomib, or taxane-platinum anticancer therapies. TNAS v3.0 includes nine items, two dimensions (sensory and interference), and has shown robust psychometric properties. Multiple studies have confirmed that TNAS v3.0 is an informative, practical PROM that imposes little burden on patients and can be used in clinical trials at multiple sites23. Although the English version of TNAS was translated into Chinese and Hebrew by the MD Anderson Cancer Center, psychometric verification has only been performed on the English version of TNAS.

Although several studies have shown that TNAS is a reliable and valid instrument, future research should focus on cultural validation and include samples from non-Western countries to test the psychometric properties of TNAS. Thus, one of the purposes of this study is to test the reliability and validity of TNAS in China. Furthermore, given the importance of longitudinal assessment for accurately capturing the dynamic development of TIPN, this study also evaluated a characteristic of the TNAS that has not been tested in the Western literature: in addition to testing the factor structure proposed by the original TNAS, we also analyzed whether this factor structure is invariant over time, that is, longitudinal measurement invariance24. Longitudinal measurement invariance is of equal importance to reliability and validity because if the factor structure of an instrument changes over time, inferences based on the results of follow-up studies may be inaccurate25. Longitudinal measurement invariance ensures that the structure of the scale remains stable over time, so that results can be reliably compared across different time points. Therefore, this study aimed to measure the reliability, validity, and temporal stability of TNAS in Chinese cancer patients receiving chemotherapy. The findings may provide evidence of good psychometric properties and stability of TNAS in non-Western countries, promoting its application in Chinese clinical settings.

Methods

Design

A quantitative survey with a longitudinal design was used to determine the psychometric properties and longitudinal measurement invariance of TNAS. Questionnaires were distributed to patients at three time points within 3 months after the start of chemotherapy (1 month, 2 months, and 3 months).

Setting and participants

This study used convenience sampling in two centers. Patients were recruited from the outpatient and inpatient oncology departments of two tertiary hospitals in Tianjin and Jiangxi, China, from February 2024 to July 2024. The inclusion criteria were: (1) patients diagnosed with cancer (head and neck cancer, multiple myeloma, colorectal cancer, breast cancer, ovarian cancer, lung cancer) who were receiving bortezomib, oxaliplatin, or taxane-platinum-based chemotherapy for the first time; (2) patients aged above 18 years; (3) patients who could complete the questionnaire independently; and (4) patients willing to participate. The exclusion criteria were: (1) patients with cognitive or psychiatric impairments; and (2) patients with severe heart, liver, or kidney disease or other serious complications.

Sample size

The sample size for factor analysis should be 5–10 times the number of items on the scale26. Herein, a minimum of 108 samples was required, allowing for 20% invalid questionnaires. In addition, confirmatory factor analysis requires a minimum of 200 samples26. In this study, 361 patients were included.
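The arithmetic behind the 108 figure can be sketched as follows. This is only an illustration: the function name and parameters are hypothetical, and it assumes the upper bound of 10 cases per item for the 9-item TNAS, inflated by 20% (9 × 10 × 1.2 = 108), which appears to be how the stated minimum was obtained.

```python
def minimum_sample_size(n_items: int, cases_per_item: int = 10, invalid_pct: int = 20) -> int:
    """Items-to-cases rule of thumb, inflated for expected invalid questionnaires.

    Hypothetical helper; the 10:1 ratio and the multiplicative 20% inflation
    are assumptions chosen to match the figure reported in the text.
    """
    base = n_items * cases_per_item
    inflated_times_100 = base * (100 + invalid_pct)
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-inflated_times_100 // 100)


if __name__ == "__main__":
    print(minimum_sample_size(9))
```

Note that the separate requirement of at least 200 cases for confirmatory factor analysis is the binding constraint here, and the enrolled sample of 361 satisfies both.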

Procedures

Before the start of this study, our research team contacted MD Anderson Cancer Center to obtain the Chinese version of TNAS and the authorization to test its psychometric properties and longitudinal measurement invariance in China. Two PhD nurses with overseas study backgrounds were invited to back-translate the Chinese version of TNAS and compare it with the original scale to ensure the cross-cultural validity of the scale. An expert panel (10 oncology nurse specialists and five oncologists) then reviewed the scale for cultural adaptation in terms of content relevance, clarity of expression, humanistic background, etc. Finally, pre-testing was conducted on 15 cancer patients to assess the readability and comprehension of the Chinese version of TNAS. The back-translated version was identical to the English version. In addition, the panel endorsed the content, clarity, and cultural appropriateness of the Chinese version of TNAS. The patients in the pre-test expressed satisfaction with the readability and comprehensibility of the Chinese version of TNAS.

The research team consisted of an associate professor (leader), two master's students, and two registered nurses (research assistants). The team leader had extensive quantitative research experience. The study was conducted simultaneously in two medical centers. The four research assistants received systematic training before the formal investigation, including standardized subject recruitment and data collection procedures, and all members passed the examination.

Patient recruitment was conducted by research members who approached the potential participants when they entered the hospitals. Research members presented the content and purpose of the research to potential participants, assessed them to determine whether they met the inclusion and exclusion criteria, and assured them that participation was voluntary. The researcher distributed the questionnaires to the participants after obtaining written consent. The researcher immediately checked their questionnaire after completion to ensure the integrity of the data. The questionnaires were returned to the participants if there were missing values to fill in the missing items. Considering that the chemotherapy cycle is generally 21 days, and combined with the time, manpower, and material resources of this study, we chose the following three time points for data collection: questionnaires were distributed to patients at baseline T1 (1 month after the start of the first chemotherapy), T2 (2 months after the start of the first chemotherapy), and T3 (3 months after the start of the first chemotherapy).

Measures

Demographics and clinical information

Demographic and clinical information, including age, gender, diagnosis, cancer stage, and type of chemotherapy, was obtained using a questionnaire.

The treatment-induced neuropathy assessment scale (TNAS)

The TNAS is a 9-item, 2-factor patient-reported outcome measure to assess the severity and course of neuropathy across various cancer treatments20. Each TNAS item is scored on a 0 to 10 scale, where 0 indicates no symptom and 10 indicates the most severe symptom. The overall score is the mean of all items. Two subscale scores can also be calculated to assess symptoms related to the sensory and interference dimensions. The sensory subscale score is the mean score of six sensory items, including numbness, tingling, pain, heat or burning, cold sensation, and disturbed sleep. The interference subscale score is the mean score of three items, including difficulty walking, difficulty balancing, and difficulty using the hands23. In the original study, the TNAS showed robust psychometric properties, and subsequent studies also reported strong internal consistency of the TNAS (Cronbach alpha = 0.86 for the total score with individual subscales ranging from 0.82 to 0.85).
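The scoring rule described above can be sketched in a few lines. The item keys below are hypothetical labels chosen for illustration (they are not the official TNAS variable names), and the sketch assumes complete responses with no missing items.

```python
from statistics import mean

# Hypothetical item keys; the grouping into subscales follows the text.
SENSORY = ("numbness", "tingling", "pain", "heat_or_burning",
           "cold_sensation", "disturbed_sleep")
INTERFERENCE = ("difficulty_walking", "difficulty_balancing",
                "difficulty_using_hands")


def score_tnas(responses: dict) -> dict:
    """Return the sensory, interference, and overall mean scores (0-10)."""
    for item in SENSORY + INTERFERENCE:
        value = responses[item]  # raises KeyError if an item is missing
        if not 0 <= value <= 10:
            raise ValueError(f"{item} must be between 0 and 10, got {value}")
    return {
        "sensory": mean(responses[i] for i in SENSORY),
        "interference": mean(responses[i] for i in INTERFERENCE),
        "total": mean(responses[i] for i in SENSORY + INTERFERENCE),
    }
```

Because the overall score is the mean of all nine items rather than the mean of the two subscale means, the six sensory items carry twice the weight of the three interference items in the total.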

European organization for research and treatment of cancer quality of life chemotherapy-induced peripheral neuropathy questionnaire (EORTC QLQ-CIPN20)

The EORTC QLQ-CIPN20 is a 20-item measure consisting of sensory, motor, and autonomic domains. All items are rated on a 4-point Likert scale (1 = “not at all” to 4 = “very much”)16. The total score and three subscale scores were calculated and linearly transformed into a 0–100 scale according to the scoring manual, with higher scores indicating more severe CIPN symptoms. The Chinese version of the EORTC QLQ-CIPN20 showed good internal consistency (Cronbach alpha = 0.90 for the total score with individual subscales ranging from 0.70–0.87)27.
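For readers unfamiliar with the linear transformation, a minimal sketch is given below. The text only states that scoring followed the manual, so the formula here is an assumption: it is the usual symptom-scale form, in which the mean raw item score is shifted by the lowest response option and divided by the response range. The function name is hypothetical, and handling of missing items is deliberately left out.

```python
def to_0_100(raw_items: list, low: int = 1, high: int = 4) -> float:
    """Linearly rescale the mean of Likert items from [low, high] to [0, 100].

    Assumed formula: (mean raw score - low) / (high - low) * 100.
    """
    if not raw_items:
        raise ValueError("at least one item response is required")
    raw_score = sum(raw_items) / len(raw_items)
    return (raw_score - low) / (high - low) * 100
```

Under this transformation, a patient answering "not at all" to every item scores 0 and a patient answering "very much" to every item scores 100, which is consistent with higher scores indicating more severe symptoms.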

Ethical considerations

This study followed the Declaration of Helsinki and was approved by the Ethics Committees of Nanjing Medical University (Ethics No. 2024-697). The participants signed the informed consent form and had the right to withdraw from the study at any time without prejudice.

Statistical analyses

R software (version 4.3.1; R Core Team, 2024) and Mplus Version 8.3 were used for all data analyses. Multivariate normality was assessed using the ‘MVN’ package (version 5.9.1). The Henze-Zirkler test28 yielded a test statistic of 0.995 with a corresponding p-value of 0.574, which was not statistically significant (p > .05), indicating that the data conformed to a multivariate normal distribution. The demographic characteristics of participants were described using descriptive statistics (frequency, percentages, means, and standard deviations).

Confirmatory factor analysis (CFA) was used to examine the factorial structure of the TNAS. Four fit indices were employed to examine the adequacy of model fit: a chi-square to degrees of freedom ratio (χ2/df < 3)29, the comparative fit index (CFI ≥ 0.90)29, the Tucker-Lewis index (TLI ≥ 0.90)29, and the root mean square error of approximation (RMSEA ≤ 0.05)29. In CFA, the convergent validity of the TNAS was verified through the factor loading of each item (≥ 0.50), the composite reliability (CR) value (≥ 0.70), and the average variance extracted (AVE) of the two factors (≥ 0.50)30.
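The convergent validity criteria can be illustrated with the commonly used formulas for AVE and CR computed from standardized factor loadings. This sketch assumes uncorrelated errors and standardized loadings (so each error variance is 1 − λ²); the function names are hypothetical and the loadings in the usage note are invented, not the study data.

```python
def average_variance_extracted(loadings: list) -> float:
    """AVE: mean of the squared standardized loadings of one factor."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings: list) -> float:
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_variance)


def convergent_validity_ok(loadings: list) -> bool:
    """Apply the thresholds from the text: loadings >= 0.50, CR >= 0.70, AVE >= 0.50."""
    return (
        min(loadings) >= 0.50
        and composite_reliability(loadings) >= 0.70
        and average_variance_extracted(loadings) >= 0.50
    )
```

For example, three items that each load 0.80 on a factor give an AVE of 0.64 and a CR of about 0.84, so that factor would meet all three criteria.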

Longitudinal measurement invariance was used to determine the configural (similar factor structure), metric (similar factor loadings), and scalar invariance (similar intercepts) of the TNAS over time. Invariance was established by comparing these models based on the following criteria: changes in RMSEA (Δ < 0.015), CFI (Δ < 0.01), and TLI (Δ < 0.01)31,32.
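The decision rule for comparing nested invariance models can be sketched as below. The class and function names are hypothetical, the fit values in the usage note are invented for illustration, and the sketch assumes the criteria are applied to the absolute change in each index between a less constrained and a more constrained model.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    """Fit indices of one fitted model (hypothetical container)."""
    cfi: float
    tli: float
    rmsea: float


def invariance_holds(less_constrained: Fit, more_constrained: Fit,
                     max_d_cfi: float = 0.01, max_d_tli: float = 0.01,
                     max_d_rmsea: float = 0.015) -> bool:
    """True if adding constraints changes each index by less than its threshold."""
    return (
        abs(more_constrained.cfi - less_constrained.cfi) < max_d_cfi
        and abs(more_constrained.tli - less_constrained.tli) < max_d_tli
        and abs(more_constrained.rmsea - less_constrained.rmsea) < max_d_rmsea
    )
```

The rule is applied stepwise: the metric model is compared with the configural model, and the scalar model with the metric model. With invented values such as Fit(0.980, 0.978, 0.030) for the configural model and Fit(0.978, 0.977, 0.031) for the metric model, all changes fall below the thresholds and metric invariance would be accepted.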

A criterion validity analysis was also conducted by determining the value of the Pearson correlation coefficient between the TNAS and the EORTC QLQ-CIPN20.
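For completeness, the Pearson coefficient used here can be computed as in the following sketch. The function name is hypothetical, and the inputs are assumed to be two equal-length lists of total scores (one TNAS score and one EORTC QLQ-CIPN20 score per patient) with no missing values.

```python
from math import sqrt


def pearson_r(x: list, y: list) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

Because the coefficient is invariant to linear rescaling, the 0 to 10 TNAS scores and the 0 to 100 transformed EORTC QLQ-CIPN20 scores can be correlated directly without putting them on a common scale.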

The reliability of the TNAS was determined based on internal consistency, evaluated using Cronbach’s alpha. An α of ≥ 0.70 was considered adequate33. Multiple imputation was used to handle missing data.
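Cronbach’s alpha can be computed from item-level responses as sketched below, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The function name is hypothetical, and the input is assumed to be a complete matrix with one row per patient and one column per item.

```python
from statistics import variance


def cronbach_alpha(rows: list) -> float:
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) and of the summed score (row totals).
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Applied to the nine TNAS items this gives the total-scale alpha; applying it separately to the six sensory columns and the three interference columns gives the subscale values.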

Results

Sample characteristics

A total of 400 potential participants were approached, of which 26 did not meet the inclusion criteria, 13 declined to participate for various reasons, and 361 were eligible and consented to participate (The flowchart of patient screening and inclusion is shown in Fig. 1). A total of 361, 354, and 335 patients completed data collection at T1, T2, and T3, respectively. The average age of the patients at T1 was 54.34 years (SD = 8.25, range = 33–77). Most patients at T1 were females (64.0%) with breast cancer (40.4%). The characteristics of the participants at T1 are presented in Table 1.

Fig. 1

The flowchart of patient screening and inclusion.

Table 1 Demographic and clinical characteristics of the samples (N=361).

CFA

The factorial validity of the TNAS at each time point was explored by estimating the proposed two-factor model. Observed items were used as indicators for the latent factor. No items were removed. The CFA model with two latent subscales demonstrated an adequate fit across multiple fit indices. The results showed that the model fitted the data well at Time 1 [(χ2 /df = 2.137, p < .001), TLI = 0.983, CFI = 0.982, RMSEA = 0.032], Time 2 [(χ2 /df = 1.637, p < .001), TLI = 0.987, CFI = 0.989, RMSEA = 0.017], and Time 3 [(χ2 /df = 2.245, p < .001), TLI = 0.973, CFI = 0.974, RMSEA = 0.027], respectively.

The convergent validity of TNAS was examined using the CFA factor loadings together with the AVE and CR at the three time points. The results showed that at T1, the factor loadings of all items ranged from 0.765 to 0.879, the AVE values of the 2 factors were 0.610 and 0.746, and the CR values were 0.901 and 0.890. At T2, the factor loadings of all items ranged from 0.750 to 0.855, the AVE values of the 2 factors were 0.623 and 0.728, and the CR values were 0.911 and 0.889. At T3, the factor loadings of all items ranged from 0.732 to 0.843, the AVE values of the 2 factors were 0.595 and 0.697, and the CR values were 0.897 and 0.872. All values met the prespecified criteria at the three time points, indicating that the TNAS has satisfactory convergent validity (Table 2).

Table 2 Factor structures by confirmatory factor analysis (N = 361).

Longitudinal measurement invariance and mean comparisons

The temporal equivalence of TNAS was evaluated through longitudinal measurement invariance. The results showed that configural, metric, and scalar invariance were established (Table 3): the changes in CFI (Δ < 0.01), TLI (Δ < 0.01), and RMSEA (Δ < 0.015) between the configural, metric, and scalar invariance models were all below the prespecified thresholds. The findings indicate that the TNAS is a consistent measure over time, and mean comparisons can be made. Mean comparisons, with TNAS at T1 as the reference point, showed that TNAS increased at T2 and T3 (p < .001). The results of the metric and scalar invariance models of the longitudinal measurement invariance are shown in Fig. 2.

Table 3 Longitudinal invariance of the TNAS.
Fig. 2

Metric and scalar invariance models of the longitudinal measurement invariance. Note: xa = Sensory_T1; xb = Sensory_T2; xc = Sensory_T3; ma = Interference_T1; mb = Interference_T2; mc = Interference_T3. xa1 ~ xa6, xb1 ~ xb6, and xc1 ~ xc6 are the sensory subscale items at the three time points; ma1 ~ ma3, mb1 ~ mb3, and mc1 ~ mc3 are the interference subscale items at the three time points.

Criterion validity

The criterion validity was assessed with Pearson correlations between the TNAS and the EORTC QLQ-CIPN20. Given that the dimensionality of the EORTC QLQ-CIPN20 has been questioned in several recent studies34,35 and there is a lack of evidence for longitudinal measurement invariance of the EORTC QLQ-CIPN20, we only measured the correlation between TNAS and EORTC QLQ-CIPN20 at T1. The result showed that TNAS had a significant positive correlation with EORTC QLQ-CIPN20 (r = .502, p < .001), indicating adequate criterion-related validity of the TNAS.

Internal consistency

Cronbach’s alpha values were calculated for the TNAS and its subscales at the three time points to determine internal consistency. The Cronbach’s alpha values of TNAS at T1, T2, and T3 were 0.880, 0.873, and 0.886, respectively, indicating that TNAS had good internal consistency. The Cronbach’s alpha values of the two subscales are shown in Table 4.

Table 4 Mean (SD) and Cronbach’s alpha of TNAS and its subscales at three time points.

Discussion

TNAS is widely used to assess the severity and course of neuropathy across various cancer treatments. This is the first study to evaluate the psychometric properties of TNAS and establish the longitudinal invariance of TNAS outside Western countries. The results showed that TNAS has adequate reliability and validity, as well as temporal stability, in a Chinese population receiving cancer chemotherapy.

The Chinese version of TNAS contained two factors (sensory and interference), which is consistent with the English version of TNAS20. The results showed that all items loaded significantly and sufficiently on their respective factors at each time point. All standardized factor loadings were statistically significant (p < .001) and exceeded the threshold of 0.70, as recommended by Kline (2023)36. Further, the AVE and CR values of the two factors were ≥ 0.50 and ≥ 0.70, respectively, implying that all items belonged to their respective factors and that the TNAS had good convergent validity.

Herein, the sensory subscale contained six items related to sensory symptoms, including numbness, tingling, pain, heat or burning, cold sensation, and disturbed sleep. Numbness and pain are the symptoms most commonly described by patients with TIPN37,38. The interference subscale contained three items related to symptoms that interfere with daily activities, including difficulty walking, difficulty balancing, and difficulty using the hands. Bennett et al.39 and Tofthagen et al.40 found that patients had problems walking on uneven terrain. Furthermore, Bakitas et al.41 showed that TIPN was associated with difficulty driving among patients.

Among other existing tools, EORTC QLQ-CIPN20 contains three factors: sensory, motor, and autonomic16, while FACT/GOG-Ntx contains four factors: sensory, motor, hearing, and dysfunction17. The autonomic side effects were not common during the development of TNAS, according to the results of qualitative interviews with patients, and thus, no items to measure these side effects were included in the TNAS21. However, previous studies did not confirm the three-factor structure of EORTC QLQ-CIPN20 and the four-factor structure of FACT/GOG-Ntx27,34,35. In contrast, the two-factor structure of TNAS was confirmed at the three time points, which indicated the good construct validity of TNAS.

Furthermore, the results provided support for configural, metric, and scalar invariance when considering the temporal stability or the factorial equivalence of the TNAS over time. This implied that the two-factor structure (sensory and interference) of TNAS was measured equally and consistently across time. Moreover, the TNAS showed similar factor structures, factor loadings, and intercepts at three time points. These results support the temporal stability of the instrument, implying that latent mean differences indicate actual temporal changes in the factors over time and not changes in the meaning of constructs. Therefore, the TNAS can be used for mean comparisons in longitudinal studies.

The results also indicated adequate criterion validity of the TNAS against another established measure of neuropathy, the EORTC QLQ-CIPN20. TNAS was moderately correlated with EORTC QLQ-CIPN20, indicating that there was some overlap between the two measures, consistent with a previous study23. EORTC QLQ-CIPN20 includes sensory, motor, and autonomic domains, while TNAS does not include any items for autonomic side effects based on the qualitative interview results. However, TNAS is brief and less burdensome (only 2 min to complete), providing a convenient and rapid measure of neuropathy in clinical practice. Finally, the level of internal consistency for all constructs at the three time points suggested that TNAS is a reliable measure of TIPN. Similarly, other studies20,23 showed high levels of internal consistency for the TNAS.

Overall, the findings confirmed that the Chinese version of TNAS has good reliability, validity, and longitudinal measurement invariance. From a practical clinical perspective, the items on this scale are concise, clear, and easy to understand. Furthermore, the number of items is moderate, with a completion time of about 2 min. Therefore, this tool is less burdensome and can be used in multiple centers. Importantly, the longitudinal invariance analysis demonstrated that the factorial structure of TNAS was stable across the three time points, reinforcing its validity for clinical use in repeated measurements. This suggests that TNAS can be used to accurately track changes in TIPN among cancer patients over the course of treatment, thus supporting the formulation of multimodal and personalized intervention plans to prevent or alleviate the impact of TIPN on treatment effects and quality of life.

Nonetheless, this study has some limitations. First, the sample was limited to the urban areas of Tianjin and Jiangxi, China, so the results may not be generalizable to other regions in China or to other countries. Second, only three time points were selected within 3 months after the first chemotherapy due to time, manpower, and material constraints. TIPN often lasts for a long time after the end of chemotherapy. Therefore, future studies should extend the follow-up period to fully validate the stability of the scale. Third, given that the EORTC QLQ-CIPN20 was developed without patient interviews and there is insufficient evidence of its temporal stability, we only examined the correlation between TNAS and the EORTC QLQ-CIPN20 at one time point. In the future, a more reliable and temporally stable instrument should be selected to longitudinally investigate the criterion validity of the TNAS. Fourth, the collected neuropathic symptoms were patients’ perceptions of the symptoms within the past 24 h, which may be subject to recall bias. Therefore, future research should adopt ecological momentary assessments to capture patients’ current symptoms in real time. Finally, more diverse populations and multicultural settings should be considered in future validation studies to determine whether the TNAS measures the same neuropathic symptom perceptions across different ethnic groups and cultures.

Conclusion

In conclusion, these results support that the TNAS is a valid and reliable measure for TIPN within the current context. Additionally, this is the first study to empirically test the longitudinal measurement invariance of the TNAS. Accurate measurement and tracking of TIPN are valuable for designing interventions that can prevent or reduce exacerbations caused by TIPN. Therefore, this study contributes to future research and clinical practice on TIPN.