Introduction

According to Alfaro-LeFevre1, critical thinking (CT) is a key trait required in nursing education, knowledge, and practice. Although there is no universally accepted definition, this skill can be defined as a regular, self-correcting assessment process intended to achieve continuous improvement. It is a transformative process that requires skills, know-how, and attitudes and contributes to the professional’s self-improvement2. The development of critical thinking skills increases diagnostic accuracy, as accurate interpretation of patient information requires high levels of thinking skills. In nursing practice, critical thinking is a dimension of intelligence, essential for carrying out the diagnostic process and supporting the validity of diagnoses3,4,5. There is still no consensus on the scope of CT in nursing practice, perhaps because CT requires several types of knowledge, is abstract and generalised, and depends on experience and on contextual factors5,6. Nurses need to use CT constantly to decide on the best evidence available and to make clinical and management decisions in complex settings with varying demands7. For this reason, the American Association of Colleges of Nursing8 and the Rede Iberoamericana de Investigação em Educação em Enfermagem9 have strongly recommended defining CT skills as a primary component of nursing curricula. At present there is abundant literature on CT in nurses and nursing students, but only a small percentage of it refers to nurse educators5.

Background

In Spain, clinical nurse educators are professional nurses who perform clinical work in health institutions and who have completed university nursing studies. Nurses with the necessary background are appointed to this position by collaborating hospitals. These educators actively instruct, monitor, and evaluate students, together with other clinical professionals and academic faculty. Clinical nurse educators take the lead during the students’ practicum training. In this context, they play an essential role in guiding students, building good learning environments, and conveying the practical curricular knowledge, an understanding of the physical and social environment, and the values of the nursing profession10. Consequently, it is particularly important for clinical nurse educators to possess a high level of CT, not only in healthcare work, but also in pedagogy.

Since 2010, the nursing studies curriculum in Spain has changed from a three-year diploma to a four-year Bachelor’s Degree consisting of 240 credits, which represent around 6000–7200 h (one credit represents between 25 and 30 h of student work). Clinical Practicum subjects account for 84 of these credits (2100–2520 h)11,12.
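
The workload figures quoted for the degree and for the Clinical Practicum subjects follow directly from the credit-to-hour conversion. A minimal Python sketch of that arithmetic is given here; the function name `credit_hours` is illustrative only and does not come from any curriculum specification.

```python
from typing import Tuple

# Hours of student work per credit, as stated in the text (25 to 30 h).
HOURS_PER_CREDIT = (25, 30)


def credit_hours(credits: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) workload in hours for a number of credits."""
    low, high = HOURS_PER_CREDIT
    return credits * low, credits * high


degree_hours = credit_hours(240)     # whole Bachelor's Degree
practicum_hours = credit_hours(84)   # Clinical Practicum subjects
print(degree_hours, practicum_hours)
```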

Measuring CT has been the focus of a number of studies over recent decades13,14, and several instruments for measuring CT have been described in the literature. The review by Carter, Creedy & Sidebotham15 reported four standardised, commercially available measures of CT that had been utilised in different studies. These were the California CT Disposition Inventory (Cronbach’s alpha across studies ranging from 0.67 to 0.90), the California CT Skills Test (reliability from 0.55 to 0.83), the Watson–Glaser CT Appraisal (alpha coefficient of 0.77), and the Health Sciences Reasoning Test (Kuder-Richardson estimate of internal consistency of r = 0.70). All of these tools have reported psychometric reliability and validity, allowing comparison across settings, disciplines, and time15. The Nursing Critical Thinking in Clinical Practice Questionnaire (N-CT-4 Practice) is a recently developed self-administered questionnaire designed to measure the CT capacity of nurses working in clinical areas. The N-CT-4 Practice questionnaire was designed based on the conceptual framework of the 4-Circle Critical Thinking model of Alfaro-LeFevre. Unlike the frameworks underlying the other available questionnaires, this model describes the construct of CT as the integration of four components: personal characteristics, intellectual or cognitive skills, interpersonal and self-management skills, and technical skills1.

The original version of the questionnaire in Spanish has been shown to have good psychometric characteristics for application in clinical practice14. The instrument has been translated into several languages16,17,18,19,20. With respect to reliability, the instrument showed an internal consistency (Cronbach’s alpha) of 0.96 in clinical nurses14, and a recent study in nursing students also reported an alpha coefficient of 0.9619. Nurse educators are a key part of the teaching and learning process of nursing students. Although different instruments have been developed to assess CT in the different agents involved in the teaching and learning process, none of them has specifically assessed the CT of clinical nurse educators. In view of this gap, the main aim of the study was to examine the psychometric properties of the Nursing Critical Thinking in Practice Questionnaire (N-CT-4 Practice) in clinical nurse educators.

Methods

Study design and participants

We used a psychometric quantitative method, based on a descriptive cross-sectional design. Study participants were clinical nurse educators who oversaw student clinical practicums at hospitals that have cooperative agreements with the School of Nursing under the Faculty of Medicine and Health Sciences at the University of Barcelona, a leading public institution with the highest number of enrolled students and clinical nurse educators in Spain’s Catalonia region. The sample size was calculated from the number of items comprising the N-CT-4 Practice questionnaire. Following the recommendation of five participants per item21, and given that the scale has 109 items, the necessary sample size was 545. Sampling was non-probabilistic and proceeded consecutively. Although the sampling method could affect the generalisability of the results, the research team increased the number of participants to mitigate this potential bias and come closer to representativeness. Data were collected until the required number of nurses fulfilling the following inclusion criteria was reached: clinical nurse educator employed in one of the collaborating hospitals and active during the data collection period.

Data collection and procedure

Recruitment was carried out by the research team teaching the practicum. After reviewing the selection criteria, the candidates were invited to participate during the regular follow-up meetings held in the practicum.

The survey consisted of two sections: an information form to determine nurses’ personal (sex, age), professional (hospital unit, job category, contract type, work shift, seniority, and years worked in the unit) and academic characteristics (academic level, specific training), and the N-CT-4 Practice Questionnaire in its original version. All items of the scale were transferable to the nurse educators without the need for editing.

The instrument, originally created in Spanish, was designed and validated by Zuriguel-Pérez14 and has been shown to have good psychometric characteristics for application in clinical practice. The questionnaire consists of 109 items and has four dimensions: (1) Personal characteristics (pattern of intellectual attitudes, beliefs and values that could activate thinking ability elements, items 1–39); (2) Intellectual and cognitive abilities (knowledge and competencies linked to the nursing process and decision-making, items 40–83); (3) Interpersonal abilities and self-management domain (therapeutic communication and obtaining information relevant to the patient, items 84–103); and (4) Technical abilities (knowledge and expertise in technical procedures of nursing care, items 104–109). The large number of items makes it possible to describe the construct in a comprehensive way, even though the administration time is long for the participants. Each item is rated on a 4-point Likert-type response scale, where 1 = never or almost never and 4 = always or almost always. The instrument’s overall score is obtained by adding the scores of all items (range: 109–436 points). The higher the score, the higher the nurse’s self-assessed CT skill level14. Because of its characteristics, the instrument does not have a cut-off point, but a recent study in clinical nurses showed a mean total score of 327 (SD 38.10) points19. Regarding the questionnaire’s psychometric properties, the total Cronbach’s alpha coefficient was 0.96 (ranging from 0.78 for the technical indicator to 0.94 for the intellectual indicator), and the intraclass correlation coefficient was 0.77. The construct validity analysed by confirmatory factor analysis demonstrated the presence of the four indicators proposed in Alfaro-LeFevre’s CT theoretical model14. The instrument has been translated into several languages16,17,18,20,22.
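
The scoring rule described for the instrument (four item blocks, responses from 1 to 4, scores obtained by summation) can be illustrated with a short Python sketch. This is not the authors’ scoring software: the names `DIMENSIONS` and `score_questionnaire` are hypothetical, and the example respondent is invented for illustration.

```python
from typing import Dict, Sequence

# Item ranges per dimension as described in the text (1-based, inclusive).
DIMENSIONS = {
    "personal": (1, 39),
    "intellectual": (40, 83),
    "interpersonal": (84, 103),
    "technical": (104, 109),
}


def score_questionnaire(responses: Sequence[int]) -> Dict[str, int]:
    """Sum 109 Likert responses (1-4) into dimension scores and a total score."""
    if len(responses) != 109:
        raise ValueError("expected 109 item responses")
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be an integer from 1 to 4")
    scores = {
        name: sum(responses[first - 1:last])  # convert 1-based range to a slice
        for name, (first, last) in DIMENSIONS.items()
    }
    scores["total"] = sum(responses)
    return scores


# Hypothetical respondent who answers 3 on every item.
example = score_questionnaire([3] * 109)
print(example)
```

Answering 1 or 4 on every item reproduces the bounds of the total score stated above (109 and 436 points), which is a quick way to check that the item ranges cover all 109 items exactly once.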

The questionnaire was distributed to each nurse along with a cover letter that listed the authors, explained the purpose of the study, emphasised the voluntary nature of participation, and guaranteed data confidentiality. Data collection took place between January 2020 and June 2021. Critical thinking is a construct that evolves over a long period. To assess temporal stability, test–retest reliability was determined over a 14-day interval, and participants were invited to fill in the questionnaire a second time23. The two questionnaires were linked with unique codes for pairing.

Ethical considerations

The study was conducted in accordance with the principles of the Declaration of Helsinki. Permissions were obtained to use the scale in the study, as required. The study was approved by the management at participating sites and by the clinical research ethics committee of the sponsoring and participating sites. All participants voluntarily agreed to participate in the study, signed the informed consent form, were assigned alphanumeric codes to conceal their identity, and had complete freedom to leave the study at any time.

To preserve confidentiality, data were dissociated and records inserted into the database were encoded.

Data analysis

We carried out a descriptive analysis to present the sample and record the scores of each item. Analysis of the items included calculation of the mean, standard deviation, and corrected item-total correlation. Construct validity was analysed through confirmatory factor analysis (CFA), with parameters estimated using the diagonally weighted least squares method and a polychoric correlation matrix. This method offers properties similar to those of maximum likelihood estimation but relies on less stringent assumptions, making it suitable for analysing ordinal data such as those used in this study24. The model scale was not explicitly fixed by setting a factor variance to 1 in the CFA model specification. However, the factor loadings were standardized to have a mean of 0 and a variance of 1. The analysis was performed in RStudio (Boston, MA, USA) using the lavaan package for R25. The following fit indices were calculated to determine the overall fit of the model26: (i) the comparative fit index (CFI), with values > 0.90 suggesting an acceptable fit and values of 0.95 or higher an excellent fit27; (ii) the normed fit index (NFI), with recommended values > 0.9528; and (iii) the root mean square error of approximation (RMSEA), with appropriate values lower than 0.0829. Convergent validity was analysed using the Spearman correlation between the N-CT-4 Practice Questionnaire dimensions, based on the hypothesis that the correlation between each dimension and the general instrument should be higher than the correlations between the factors30. Reliability was assessed through the internal consistency of the items, for the questionnaire as a whole and for each of its dimensions, using Cronbach’s alpha, with α ≥ 0.80 established as acceptable31. Test–retest reliability was evaluated by calculating the intraclass correlation coefficient (ICC). ICC values below 0.70 were considered to indicate weak concordance32.
Descriptive statistics (number, percentage, mean, standard deviation) were used to report participants’ personal and professional characteristics and scale scores. CT levels were calculated globally and within subcategories. Statistical significance was established at a probability of p < 0.05. Data processing and analysis were performed using R statistical software (release 4.1.0 for Windows, https://www.r-project.org/).
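
As an illustration of the item-level statistics named above, the following Python sketch computes Cronbach’s alpha and the corrected item-total correlation from a respondents-by-items score matrix. It uses the standard textbook formulas and only the Python standard library; it is not the R code used in the study. The function names and the small data set are hypothetical.

```python
from statistics import mean, variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per respondent."""
    k = len(rows[0])                               # number of items
    items = list(zip(*rows))                       # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(rows: Sequence[Sequence[float]]) -> List[float]:
    """Correlate each item with the sum of all the other items."""
    items = list(zip(*rows))
    totals = [sum(row) for row in rows]
    return [
        pearson(col, [t - v for t, v in zip(totals, col)])  # total minus the item
        for col in items
    ]


# Hypothetical responses of five people to four items (scale 1-4).
data = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
]
print(round(cronbach_alpha(data), 3))
print([round(r, 3) for r in corrected_item_total(data)])
```

The item is removed from the total before correlating so that an item does not inflate its own correlation, which is what "corrected" refers to.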

Results

Participant characteristics

Eleven hospitals in Catalonia’s public health network participated in the study. The total number of participants was 639, representing a response rate of 66.56%. The majority of the sample were women (84.5%, n = 540), and the average age was 38.9 years (standard deviation, SD = 9.65). Fewer than half of the participants had a permanent contract (43.5%, n = 278), most had more than 10 years of experience (60.9%, n = 389), and a small share had only the nursing degree (17.4%, n = 111). The remaining 82.6% (n = 528) had a postgraduate certificate, a master’s or doctoral degree, or a specialty (Table 1).

Table 1 Academic and professional characteristics (N = 639).

The participants obtained a mean of 363.21 (SD 33.44; possible range 109–436) on the CT scale. The mean (SD) scores for the four factors of the original version of the questionnaire were 125.13 (12.69) for the personal factor (possible range 39–156), 150.32 (14.99) for the intellectual and cognitive factor (possible range 44–176), 67.11 (7.82) for the interpersonal and self-management factor (possible range 20–80), and 20.65 (2.32) for the technical factor (possible range 6–24).

Construct validity

CFA was used to verify the internal structure of the instrument, testing a four-dimension model identical to the structure of the original instrument14. The factorial structure was analysed using the chi-square test, CFI, NFI and RMSEA. The result of the chi-square test was significant (χ2 = 9155.104; p < 0.0001), indicating that the hypothesis of a perfect model fit needed to be rejected. However, because this test is sensitive to sample size, we relied on additional fit indices to evaluate the theoretical model. All goodness-of-fit indicators from the CFA reached the established minimum values. Overall, the goodness-of-fit indices showed that the structure of the proposed questionnaire is acceptable (Table 2). All items had a factor loading > 0.3, except for items 3 and 35, which had loadings of 0.22 and 0.29, respectively (Fig. 1).

Table 2 Goodness-of-fit indices for the confirmatory model N-CT-4 Practice.
Fig. 1 Confirmatory factor analysis. tcn technical dimension, prs personal dimension, atg interpersonal and self-management dimension, int intellectual dimension.

Convergent validity

The hypothesis was confirmed in the analysis of the correlations between the factors and the general instrument: for the majority of factors, the strongest correlations were those with the general instrument. The intellectual factor showed the strongest correlation with the N-CT-4 Practice instrument (rho = 0.944), and the technical factor the weakest (rho = 0.740). Table 3 shows the correlation of all the factors with the N-CT-4 Practice score.

Table 3 Correlations between N-CT-4 Practice factors and total instrument.

Reliability

There was no floor effect in any of the factors, while the ceiling effect was negligible in all factors (Table 4). The internal consistency of the questionnaire and its dimensions was measured using Cronbach’s alpha. The total Cronbach’s alpha for the N-CT-4 Practice Questionnaire was 0.97, which qualifies as excellent33. The four factors had an internal consistency that ranged from 0.7 for the technical factor to 0.95 for the intellectual factor (Table 4). The Cronbach’s alpha values were then recalculated excluding each of the items in turn. Total internal consistency did not improve when any item was excluded. The threshold of 0.7 was always reached, except for the technical dimension, which had a lower Cronbach’s alpha when items 106 and 107 were excluded. This may be due to the fact that this dimension only includes 6 items, while the rest include between 20 and 44 items. Temporal stability, or test–retest reliability, was analysed through the ICC in 85.9% (n = 549) of the sample. The mean (SD) of the test was 363.2 (33.4) and that of the retest was 366.39 (35.06). The ICC values were 0.78 [95% CI 0.75–0.81] for the whole instrument and 0.70 to 0.75 for the four factors, and all were statistically significant (p < 0.05), indicating good stability over a two-week period (Table 5).

Table 4 Critical thinking levels and the four factors.
Table 5 Test–retest analysis of the dimensions of critical thinking questionnaire.
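
The test–retest coefficient can be illustrated with a short Python sketch. The text does not state which form of the ICC was computed, so the sketch assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); this choice, the function name `icc_absolute_agreement`, and the example scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def icc_absolute_agreement(pairs: Sequence[Tuple[float, float]]) -> float:
    """ICC(2,1) for paired (test, retest) scores; the ICC form is an assumption."""
    n = len(pairs)   # respondents
    k = 2            # occasions: test and retest
    grand = sum(a + b for a, b in pairs) / (n * k)
    row_means = [(a + b) / k for a, b in pairs]
    col_means = [sum(p[j] for p in pairs) / n for j in range(k)]

    # Two-way ANOVA sums of squares: respondents, occasions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for p in pairs for x in p)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total scores of five respondents at test and retest.
scores = [(350, 355), (362, 360), (401, 398), (330, 341), (377, 380)]
print(round(icc_absolute_agreement(scores), 3))
```

Under this form, a systematic shift between test and retest lowers the coefficient even when respondents keep exactly the same rank order, which is why it is a stricter measure of stability than a simple correlation.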

Discussion

This study is the first to evaluate the psychometric properties of the N-CT-4 Practice applied to nurse educators in the context of clinical practicums. The results demonstrate that the N-CT-4 Practice has good psychometric properties for measuring nurse educators’ self-evaluation of their CT skills in the context of clinical practicums. This evaluation is necessary to determine the level of critical thinking of educators and thus to ensure that students gain the competencies of the practicum subjects. Cronbach’s alpha was used to assess the internal consistency of the questionnaire, and the values obtained were consistent with those reported for other instruments normally used to measure CT, such as the Critical Thinking Diagnostic34 and the California Critical Thinking Disposition Inventory (CCTDI)35, although these have not previously been used to study CT in nurse educators in the context of clinical practicums. The values were also consistent with those reported in previous studies of the N-CT-4 Practice Questionnaire, none of which specifically involved clinical nurse educators14,17,18,22. In this study, temporal stability was assessed by the ICC (range 0.73–0.84), with similar values to those observed in previous studies, indicating good temporal stability of the instrument.

Most instruments for measuring CT capacity are designed and validated in the context of training future nursing professionals, although some are used for professionals in clinical practice. There are very few data on nurse educators. One exception is the study by Raymond and Profetto-McGrath, which used the CCTDI to measure CT in nurse educators36. This instrument has also been used to measure CT in nursing students37, in nurses38 and in supervisory nurses39.

The CFA indicated that the initial four-indicator hypothetical model is a good fit for the data, although there may be room for improvement. The goodness-of-fit and correlation values showed that the structure of the proposed questionnaire is acceptable. As in the original instrument, the items match the 4 dimensions of the Alfaro-LeFevre model (personal, intellectual, interpersonal and technical) and are applicable to both practice and education. Clinical educators work in both of these areas: practice, when carrying out their care work, and education, when guiding nursing students during their acquisition of knowledge in clinical practice.

The multidimensional concept of CT has been defended by most theoreticians in the field35, who argue that CT comprises a series of skills that should be understood to be interrelated. However, to date there are no other validated instruments that approach this multidimensional perspective in a clear and adequate manner. The N-CT-4 Practice includes this multidimensional approach and, therefore, represents a new, important instrument to measure this construct14.

The N-CT-4 Practice instrument has been validated in different languages and regions (e.g. Vietnam, Brazil), as well as with different agents involved in clinical practice subjects (students, nurses working at different care services or shifts, public or private hospitals, etc.)17,18,19. The N-CT-4 Practice has shown good psychometric properties in the populations in which it has been used, such as registered nurses and nurse managers, as well as in clinical nurse educators in this study. This is the first study to use this instrument in clinical nurse educators, addressing this gap in the literature.

This study has several limitations. First, the characteristics of our sample make it difficult to generalise the results to other populations of interest. The participants were not recruited through random sampling from hospitals across the different areas of Spain but rather by convenience sampling from 11 hospitals in the northeast of Spain. Therefore, caution must be used in extrapolating these results. Second, the limitations of self-administered questionnaires must be considered when interpreting the results. Self-report may yield inaccurate information, which could lead to misleading results and conclusions, or Type I research bias. To mitigate this risk, the sample size was increased. Participants were encouraged to respond openly, reflecting their personal views; however, this information lacked external validation through other means, such as direct observation. It is recommended that future studies incorporate this verification process. At the same time, the study has several strengths, such as the size and specificity of the sample in terms of clinical nurse educators and the robustness of the values obtained. The sample size of this study is larger than that of most other published studies that have used this questionnaire. Additionally, while the sample was not representative of nurse educators throughout Spain, the design included a large number of hospitals, adding rigour to the study.

Conclusions

We have shown the N-CT-4 Practice Questionnaire to be useful for measuring self-assessed CT levels in clinical nurse educators, a novel group for CT skill analysis in professionals actively participating in the university training of nursing students. The study results also show that the N-CT-4 Practice structure is consistent with its theoretical basis, as the proposed indicators behave adequately to analyse CT. Therefore, the N-CT-4 Practice allows CT to be evaluated based on four interrelated dimensions.

Future studies are needed to investigate the tool’s usefulness in measuring patient care quality and outcomes, as well as students’ learning quality based on the CT level of clinical nurse educators. Longitudinal studies could also be conducted to follow the development of critical thinking in nursing students over the course of their academic curriculum, or to assess the impact of specific training interventions on the level of critical thinking. Studies are also needed to establish the relationship between CT levels and the various occupational and training characteristics of clinical nurse educators.

Relevance for clinical practice

The available empirical evidence supports the use of this instrument for investigating the critical thinking abilities of clinical nurse educators. The application of the N-CT-4 Practice questionnaire provides a valuable opportunity to assess critical thinking proficiency among clinical nurse educators and opens further avenues for cross-cultural comparative studies with international counterparts. Consequently, the use of this valid instrument facilitates additional exploration and training related to critical thinking.

The findings of the study will facilitate the creation of training programmes that will enhance critical thinking skills in nursing students. Additionally, the development of practicum subjects will be enriched by the implementation of a unified instrument to assess the critical thinking abilities of students and clinical nurse educators.

Promoting critical thinking skills allows for reflection on the care model and the development of strategies to enhance the quality of healthcare and nursing work processes.