Introduction

The clinical assessment of nociception relies primarily on quantitative sensory testing, i.e., pain perception ratings. Neurophysiological measures such as pain-related evoked potentials (PREPs) may be of value in this context as they allow for objective, bias-minimized assessment of nociceptive processing and quantification of subtle alterations within the nociceptive system1,2.

Most clinical studies to date have focused on the integrity of the thermo-nociceptive pathways, which can be objectively assessed using laser or contact heat evoked potentials (LEPs, CHEPs)3,4,5,6,7,8. Additionally, pinprick evoked potentials (PEPs) were introduced as a promising tool in assessing the integrity of mechano-nociceptive pathways9,10. Indeed, previous studies have suggested the assessment of multi-modal PREPs in response to different types of noxious stimuli to measure the integrity of alternative aspects of nociception, allowing a more in-depth characterization of certain pathologies affecting the nociceptive system11,12,13,14. Another type of PREPs is intra-epidermal electrically evoked potentials (IEEPs), which are obtained in response to noxious intra-epidermal electrical stimulation (IES)12,14,15,16.

IES enables rapid and synchronous activation of nociceptive fibers through the direct depolarization of free intra-epidermal nerve endings, resulting in short and robust IEEP latencies while maintaining low pain perception. Its electrical properties enable the activation of a broad range of nociceptor subtypes, potentially assessing the functional integrity of both thermo- and mechano-nociceptive pathways in contrast to CHEPs and PEPs26.

Various electrode designs, including concentric needle electrodes, concentric planar electrodes, and micropattern electrodes, have been employed with different stimulation parameters, all aiming to depolarize free intra-epidermal nerve endings directly12,15,17,18,19. However, a major limitation of IES is the lack of standardized electrodes, stimulation protocols, and peripheral and central characterization of IEEPs. To address these limitations, our laboratory follows the most frequently published stimulation parameters, applying a pulse width of 0.5 ms with a pulse frequency of 200 Hz16,18,20,21,22,23,24,25. Some adaptations were made, including the use of a triple concentric planar electrode and the application of a triple pulse stimulation to enhance spatial and temporal summation, thereby improving the signal-to-noise ratio of IEEPs, as suggested in previous studies18,20. Additionally, we have characterized the peripheral activation by IES, confirming with psychophysical assessments the activation of both mechano- and thermo-nociceptors26. Lastly, in two separate studies, we applied IES in patients with centromedullary lesions, demonstrating spinothalamic propagation of IEEPs and highlighting its potential for clinical applications12,14. However, as a prerequisite for using IES and IEEPs as a diagnostic tool in clinical practice, the test–retest reliability of IEEPs must be established. Hence, the aim of Experiment 1 was to assess the test–retest reliability of IEEPs using our planar, concentric triple electrode12,14,26 compared to other PREPs such as CHEPs and PEPs.

Additionally, the optimal stimulation intensity for IES remains unclear. While 1.5 × and 2 × the electrical detection threshold (EDT)18,20,21,22,23,24,25 or the pain threshold15,21,23,24,27,28 have been commonly used, we previously demonstrated spinothalamic propagation of IEEPs at 0.5 mA14. Therefore, Experiment 2 aimed to assess the test–retest reliability of IEEPs across different IES intensities, including 1.5×, 2×, and 4 × EDT, as well as 0.5 mA.

Methods

Participants

Participants aged between 20 and 40 years were enrolled after providing written informed consent. Exclusion criteria were any neurological or psychological condition, experiencing any acute or chronic pain, pregnancy, and intake of any medication other than birth control. Additionally, to minimize psychological confounding factors related to pain perception, participants with scores exceeding the established cut-off values of 30 on the Pain Catastrophizing Scale (PCS)29 and 21 on the Beck Depression Inventory-II (BDI)30 were excluded. Participants were recruited through flyers, social media, and the online marketplace of the University of Zurich and ETH Zurich. They were compensated with 20 CHF per hour, and transportation costs to the study site (Balgrist University Hospital) were reimbursed. All procedures within the study were approved by the research ethics board, including the Kantonale Ethikkommission Zurich (EK-04/2026, PB_2016-02051), and the study was registered at ClinicalTrials.gov (Registration number: NCT06443281, registered on 17/04/2025; https://clinicaltrials.gov/study/NCT06443281). All procedures followed the principles of the Declaration of Helsinki.

Study design

The study consisted of two separate experiments each including two visits, which were 2 weeks apart (Fig. 1). Each experiment was conducted independently with different participant groups. Since both followed a similar overall design, the shared methodology is described first, with specific differences detailed in the following sections. The study design was based on a previously published protocol, and the data from Experiment 1 were extracted from that study26.

Fig. 1

Experimental design of Experiment 1 and Experiment 2. In Experiment 1, 26 participants were recruited, while in Experiment 2, 30 participants were recruited. Both experiments consisted of two identical visits two weeks apart. Additionally, pain-related evoked potentials (PREPs), pain perception in a numerical rating scale, and quality of sensation in response to noxious stimuli were assessed in both experiments. In Experiment 1, the volar forearm was stimulated with contact heat, pinprick, and intra-epidermal electrical stimulation (IES) in a randomized order. In Experiment 2, the volar forearm was stimulated with IES applying different stimulation intensities in a randomized order. EDT electrical detection threshold, IES intra-epidermal electrical stimulation, PREPs pain-related evoked potentials.

Visit 1 started with the completion of the PCS29 and the BDI30 to account for potential psychological confounders of pain ratings and processing. Sensory functions were semi-quantitatively assessed by light touch, pinprick, and thermal testing (25 °C and 40 °C thermorollers, Rolltemp II, Somedic SenseLab AB, Sweden), according to the grading system of the International Standards for Neurological Classification of Spinal Cord Injury (0 = absent, 1 = impaired, 2 = normal)31.

During the experimental procedures, participants lay in a supine position. The testing site was the volar forearm, and the testing side (left/right) was identical between the two visits and randomized for every participant.

Each visit comprised a familiarization procedure, the assessment of the EDT in response to IES (see “Recording and analyses of pain-related evoked potentials for both experiments”), and the assessment of PREPs (see “Data analyses and statistics for both experiments”), pain ratings, and quality of sensation.

The familiarization of the EDT assessment and noxious stimuli protocol was performed by applying three to four noxious stimuli on the dorsal aspect of the contralateral hand to the testing site (for specific stimulation parameters, see “Experiment 1” and “Experiment 2”).

As IES intensity was gradually increased, participants were instructed to indicate the first detectable sensation, defined as the EDT. Following EDT determination, either three different stimulation modalities were applied (Experiment 1: contact heat, pinprick, and IES) or IES was applied at different stimulation intensities (Experiment 2: 1.5×, 2×, 4 × EDT, and 0.5 mA). The order of stimulation modalities or stimulation intensities was randomized for each participant and identical between the two visits. Each stimulation block (one per stimulation modality or intensity) consisted of 15–20 stimuli with a random interstimulus interval of 8–12 s. An acoustic cue, occurring 4 s after each stimulus, prompted participants to rate their perceived pain intensity on a numeric rating scale (NRS) from 0 (“no pain”) to 10 (“most intense pain imaginable”). Each stimulation block was followed by a break of approximately 2 min. During this time, participants were asked to verbally describe the quality of sensation evoked by the preceding stimuli using one or more terms from the predefined list of eleven descriptors for noxious or innocuous stimulation: “shooting”, “stabbing”, “sharp”, “shock”, “pricking”, “hot/burning”, “warm”, “throbbing”, “light touch”, “touch”, “tingling”16,32.
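The block structure described above can be illustrated with a short Python sketch. It is not the software used in the study: the function names, the per-participant seed, and the choice of a uniform distribution for the interstimulus interval are assumptions made only for illustration. It shows how a per-participant order can be randomized once and reused at both visits, and how stimulus and rating-cue times follow from the stated 8–12 s interval and 4 s cue delay.

```python
import random

def randomized_order(conditions, participant_seed):
    """Shuffle conditions reproducibly, so Visit 2 can reuse the Visit 1 order.
    The seed-per-participant scheme is a hypothetical choice."""
    order = list(conditions)
    random.Random(participant_seed).shuffle(order)
    return order

def build_block(n_stimuli=15, isi_range_s=(8.0, 12.0), cue_delay_s=4.0, rng=None):
    """Return stimulus and rating-cue times (in seconds) for one stimulation block.
    A uniform draw over the ISI range is assumed; the paper only states 'random'."""
    rng = rng or random.Random()
    t = 0.0
    events = []
    for _ in range(n_stimuli):
        events.append({"stimulus_s": t, "rating_cue_s": t + cue_delay_s})
        t += rng.uniform(*isi_range_s)
    return events

modalities = ["contact heat", "pinprick", "IES"]
order_visit1 = randomized_order(modalities, participant_seed=7)
order_visit2 = randomized_order(modalities, participant_seed=7)  # identical by construction
block = build_block(rng=random.Random(1))
```

Reusing the same seed at both visits is one simple way to guarantee an identical order; storing the Visit 1 order and reading it back at Visit 2 would serve equally well.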

Experiment 1

Heat stimulation: the TCS II thermode (QST Lab, Strasbourg, France) was used for heat application. The initial temperature was set at 35 °C, and the target temperature at 60 °C. Heating and consequent cooling of the thermode occurred at rates of 250 °C/s and 300 °C/s, respectively, with a total heat pulse of 180 ms. The thermode was repositioned after each stimulus to prevent adaptation33.

Pinprick stimulation: a 256 mN modified pinprick stimulator equipped with a contact trigger (MRC Systems, Heidelberg, Germany) was used. The stimulator was positioned perpendicularly to the skin with a total contact time of 1 s. This duration was chosen to favor Aδ- over Aβ-fiber activation34. To mitigate peripheral adaptation effects, the pinprick was repositioned after every stimulus.

Intra-epidermal electrical stimulation (IES): IES was applied using a triple planar concentric electrode employed in prior studies12,14,26. The electrode configuration comprised three blunt pin electrodes arranged concentrically (MN3512P150, Spes Medica, Genoa, Italy) within individual gold ring electrodes (DEGM102600, Spes Medica, Genoa, Italy). These pin electrodes were positioned in a triangular layout, with separations of 7 mm, 7 mm, and 11 mm, respectively. The central cathodes (blunt steel pins, diameter: 0.35 mm) protrude 1 mm beyond the external anodal gold ring, which has a diameter of 5 mm. Each electrical stimulus comprised a sequence of three pulses (pulse width: 0.5 ms, inter-pulse interval: 4.5 ms, total stimulus duration: 10.5 ms, pulse frequency: 200 Hz). IES intensity was set at twice the individual EDT. This intensity level was chosen to activate nociceptors selectively16. The electrode was fixed to the skin to ensure consistent contact between the electrode and the skin.
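The timing of the triple-pulse stimulus can be checked with a few lines of Python. The sketch below only reproduces the arithmetic implied by the stated parameters; the function name is hypothetical, and the inter-pulse interval is assumed to be measured from the end of one pulse to the start of the next, which is the reading under which the reported total duration and pulse frequency are mutually consistent.

```python
def pulse_train(n_pulses=3, pulse_width_ms=0.5, inter_pulse_interval_ms=4.5):
    """Return (onset, offset) times in ms for each pulse, plus the total
    stimulus duration and the pulse frequency.
    Assumes the inter-pulse interval runs from pulse offset to next onset."""
    period_ms = pulse_width_ms + inter_pulse_interval_ms  # onset-to-onset time
    pulses = [(i * period_ms, i * period_ms + pulse_width_ms) for i in range(n_pulses)]
    total_ms = pulses[-1][1]            # offset of the last pulse
    frequency_hz = 1000.0 / period_ms   # pulses per second
    return pulses, total_ms, frequency_hz

pulses, total_ms, frequency_hz = pulse_train()
print(pulses)        # onsets at 0, 5, and 10 ms
print(total_ms)      # 10.5 ms, matching the stated total stimulus duration
print(frequency_hz)  # 200.0 Hz, matching the stated pulse frequency
```

Three 0.5 ms pulses plus two 4.5 ms gaps give 3 × 0.5 + 2 × 4.5 = 10.5 ms, and an onset-to-onset period of 5 ms corresponds to 200 Hz.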

Experiment 2

Intra-epidermal electrical stimulation (IES): IES was applied using the triple planar concentric electrode described in Experiment 1. IES intensity was set at 1.5×, 2×, and 4× the individual EDT, and at a fixed intensity of 0.5 mA. The fixed intensity of 0.5 mA was included to explore its potential for clinical application, particularly in cases where the EDT cannot be determined due to sensory dysfunction. The order of stimulation intensities was randomized in the first visit (Visit 1) and kept the same in the second visit (Visit 2). The electrode was fixed to the skin to ensure consistent contact between the electrode and the skin.

Recording and analyses of pain-related evoked potentials for both experiments

PREPs were recorded using the electroencephalographic (EEG) system of the Keypoint Workstation G4 (Dantec Medical, UK). The EEG recording sites were prepared with alcohol and Nuprep, an abrasive skin preparation gel (D.O. Weaver & Co., Aurora, Colorado, USA), to reduce impedance. Following this preparation, two 9 mm Ag/AgCl cup electrodes filled with conductive adhesive gel were placed at the vertex (Cz) and referenced to the earlobe (A1–A2) according to the international 10–20 system. Signals were sampled at a rate of 2000 Hz. Electrooculogram (EOG) was obtained to identify artifacts due to eye blinks and ocular movements. EOG recording involved the use of surface electrodes (BlueSensor ECG electrodes, Ambu, Denmark) placed above and below one eye.

All electroencephalographic (EEG) signals were imported into MATLAB (MathWorks R2024a), where a custom-made algorithm was applied to identify N- and P-peaks of PREPs, as described elsewhere35. Briefly, the EEG data were downsampled to 1000 Hz and filtered with a bandpass filter of 0.1–100 Hz and a notch filter of 59–61 Hz. An offset correction based on the 500 ms pretrigger window was applied. Eye blinks and recording artifacts were removed based on visual inspection, resulting in a total of 15 artifact-free trials. The custom-made algorithm served to minimize investigator bias during the selection of N- and P-peaks of PREPs by suggesting N- and P-peaks within a certain time window (CHEPs: [200, 600] ms, PEPs: [0, 500] ms, IEEPs: [111, 383] ms), and below or above ± 2 standard deviations of the 500 ms pretrigger window35. Peaks identified by the algorithm underwent verification and, if necessary, manual adjustment by two independent investigators. For absent PREPs, the N-latency was marked as “not available (NA)”, and the NP-amplitude was set as the difference between the maximum and the minimum noise values within the respective time windows. Lastly, since the pinprick stimulus has a delay of 16.7 ms between its first skin contact and the generated trigger signal, this delay was added to the PEP N-latencies36.
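The peak-suggestion logic summarized above can be sketched in Python as follows. This is an illustration only and not the MATLAB algorithm of reference 35: the function names, the synthetic waveform, the convention that the N-peak appears as a negative value, and the rule that the P-peak is the largest value following the N-peak are all assumptions. The sketch applies the pretrigger offset correction, searches the modality-specific time window, accepts peaks only beyond ± 2 SD of the pretrigger baseline, and otherwise falls back to the max-minus-min noise amplitude with a missing latency.

```python
import math
import statistics

def find_np_peaks(epoch_uv, fs_hz, pre_ms, window_ms, n_sd=2.0):
    """Suggest N-latency and NP-amplitude for one averaged epoch.
    epoch_uv starts pre_ms before the trigger; window_ms is (start, end) after it."""
    n_pre = round(pre_ms * fs_hz / 1000)
    offset = statistics.fmean(epoch_uv[:n_pre])            # pretrigger offset
    corrected = [v - offset for v in epoch_uv]
    threshold = n_sd * statistics.stdev(corrected[:n_pre])  # +/- 2 SD criterion
    start = n_pre + round(window_ms[0] * fs_hz / 1000)
    stop = n_pre + round(window_ms[1] * fs_hz / 1000) + 1
    segment = corrected[start:stop]
    # Assumed convention: N is the most negative sample, P the largest sample after N.
    n_idx = min(range(len(segment)), key=segment.__getitem__)
    p_idx = max(range(n_idx, len(segment)), key=segment.__getitem__)
    n_val, p_val = segment[n_idx], segment[p_idx]
    if n_val < -threshold and p_val > threshold:
        latency_ms = (start + n_idx - n_pre) * 1000 / fs_hz
        return {"n_latency_ms": latency_ms, "np_amplitude_uv": p_val - n_val}
    # Absent potential: latency not available, amplitude = noise range in the window.
    return {"n_latency_ms": None, "np_amplitude_uv": max(segment) - min(segment)}

def synthetic_epoch(with_response):
    """Hypothetical test signal: 500 ms pretrigger + 600 ms post-trigger at 1000 Hz."""
    epoch = []
    for i in range(1100):
        t = i - 500
        value = 0.5 * (-1) ** i  # deterministic stand-in for background noise
        if with_response and t >= 0:
            value += -10.0 * math.exp(-((t - 200) / 20) ** 2)  # N-like deflection
            value += 15.0 * math.exp(-((t - 320) / 30) ** 2)   # P-like deflection
        epoch.append(value)
    return epoch

present = find_np_peaks(synthetic_epoch(True), 1000, 500, (111, 383))
absent = find_np_peaks(synthetic_epoch(False), 1000, 500, (111, 383))
```

In practice such suggestions would still be verified and, where needed, adjusted by the investigators, as described above; for PEPs, the 16.7 ms trigger delay would be added to the returned latency.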

Data analyses and statistics for both experiments

Statistical analyses were carried out within the R computing environment (R Version 4.0.5, RStudio version 2024.04.2 + 764), compatible with Windows. The distribution characteristics of the data were visually inspected with histograms and Q-Q plots (function qqnorm() from R package “stats”).

Demographics of the participants and the parameters associated with the PREPs (proportions of evoked potentials, N-latencies, NP-amplitudes, pain ratings) were presented using descriptive statistical measures, including the mean and standard deviation (SD).

Test–retest reliability of N-latency, NP-amplitude of PREPs, and pain ratings was examined with the intra-class correlation coefficient (ICC) and Bland–Altman analyses. ICC values (function icc(model = “twoway”, type = “agreement”, unit = “average”) from R package “irr”) were characterized as “poor” (< 0.40), “fair” (0.40–0.59), “good” (0.60–0.74), and “excellent” (0.75–1.00)37. The ICC for IEEP N-latency in response to IES at 1.5 × EDT was found to be negative (ICC = − 0.56). This result likely reflects high within-subject variability, a low sample size at this stimulation intensity (n = 16), and deviations from normality in the data, all of which can lead to poor model fit. For the purpose of interpretation and visualization, we adopted a value of zero, since ICC values are typically interpreted within a 0–1 range38. Additionally, Bland–Altman analyses started with the determination of whether the differences between Visit 1 and Visit 2 of N-latency, NP-amplitude of PREPs, and pain ratings deviated significantly from zero. To this end, a Wilcoxon signed-rank test was performed (function wilcox.test() from R package “stats”)39. Furthermore, reliability was analyzed using Bland–Altman plots (function bland.altman.stats() from R package “BlandAltmanLeh”), including the limits of agreement (mean difference ± 1.96 SD).
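For readers who want to see the arithmetic behind these reliability measures, the following Python sketch computes a two-way, absolute-agreement, average-measures ICC from the usual ANOVA mean squares, maps it to the categories given above, and derives the Bland–Altman bias and limits of agreement. It is a re-implementation for illustration and not the R code used in the study; the function names and all data values are hypothetical, and the difference is taken as Visit 2 minus Visit 1 by assumption.

```python
import statistics

def icc_agreement_average(visit1, visit2):
    """Two-way ICC, absolute agreement, average of k = 2 measurements."""
    rows = list(zip(visit1, visit2))
    n, k = len(rows), 2
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between visits
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)

def classify_icc(icc):
    """Categories as used in the text; negative values are treated as zero."""
    icc = max(icc, 0.0)
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"

def bland_altman(visit1, visit2):
    """Return (bias, lower limit, upper limit) for Visit 2 minus Visit 1."""
    diffs = [b - a for a, b in zip(visit1, visit2)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical latencies (ms): a constant shift of 1 ms between visits.
v1, v2 = [10, 12, 14, 16], [11, 13, 15, 17]
icc_shift = icc_agreement_average(v1, v2)
limits_shift = bland_altman(v1, v2)

# Hypothetical data with more within-subject than between-subject variability.
icc_noisy = icc_agreement_average([1, 2, 3, 4, 5], [3, 2, 4, 1, 3])
```

The second example yields a negative coefficient, which illustrates how an ICC below zero arises when the error mean square exceeds the between-subject mean square, the situation described above for IEEP N-latency at 1.5 × EDT.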

Results

Participants

Experiment 1: All 26 participants showed normal sensory function and comprised 12 females and 14 males (25.3 ± 4.6 years). Almost all participants were right-handed (n = 24), while only two were left-handed (n = 2). None of the PCS scores (8.9 ± 6.3, ranging from 0 to 22 points) was above the cut-off level of 30 points29. BDI-II scores (4.2 ± 4.7, ranging from 0 to 18 points) were all below the cut-off level of 21 points for clinical depression30.

Experiment 2: Of the 35 recruited participants, five had to be excluded for the following reasons: (1) technical issues with the concentric planar electrode (n = 3), and (2) exceeding the cut-off value of 30 points on the PCS questionnaire (n = 2). Normal sensory function was observed in the remaining 30 participants, including 20 females and 10 males (27.7 ± 3.7 years). Additionally, one participant misunderstood the instructions on how to rate their perceived pain intensity and was therefore excluded from all analyses related to pain ratings. The average PCS and BDI-II scores were 9.6 ± 5.8 points and 2.6 ± 2.6 points, respectively.

Electrical detection threshold

While participants in Experiment 1 exhibited a mean EDT of 0.16 ± 0.04 mA in Visit 1 and 0.17 ± 0.05 mA in Visit 2, participants in Experiment 2 showed a mean EDT of 0.11 ± 0.04 mA (Visit 1) and 0.13 ± 0.03 mA (Visit 2).

N-latencies and NP-amplitudes of pain-related evoked potentials, and pain ratings

The grand averages of multi-modal PREPs for both experiments are shown in Fig. 2. N-latencies and NP-amplitudes of multi-modal PREPs and pain ratings are presented in Tables 1 and 2 as mean ± SD.

Fig. 2

Grand averages of multi-modal pain-related evoked potentials for Visit 1 (dark grey) and Visit 2 (light grey). Experiment 1: pain-related evoked potentials in response to contact heat (CHEPs) and pinprick (PEPs), and intra-epidermal electrical stimulation (IES, IEEPs) applied at 2 × electrical detection threshold (EDT), were recorded. Experiment 2: IEEPs were recorded in response to IES at different stimulation intensities (1.5×, 2×, 4 × EDT, and 0.5 mA). CHEPs contact heat evoked potentials, EDT electrical detection threshold, IEEPs intra-epidermal electrically evoked potentials, PEPs pinprick evoked potentials.

Table 1 Characteristics of multi-modal pain-related evoked potentials in Experiment 1.
Table 2 Characteristics of intra-epidermal electrically evoked potentials in Experiment 2.

Test–retest reliability

ICCs and Bland–Altman coefficients are summarized in Tables 3 and 4.

Table 3 Test–retest reliability of multi-modal pain-related evoked potentials in Experiment 1.
Table 4 Test–retest reliability of intra-epidermal electrically evoked potentials in Experiment 2.

Experiment 1 (Fig. 3, Table 3): ICCs indicated “excellent” reliability between Visit 1 and Visit 2 for CHEP and PEP N-latencies, while IEEP N-latencies in response to an IES intensity of 2 × EDT demonstrated “poor” reliability. The NP-amplitude of all multi-modal PREPs showed “excellent” reliability between Visit 1 and Visit 2. In contrast to the “excellent” reliability of pain ratings in response to contact heat and pinprick stimuli between visits, pain ratings in response to IES at 2 × EDT demonstrated a “fair” reliability.

Fig. 3

Bland–Altman plot of N-latency of pain-related evoked potentials in response to contact heat (CHEPs), pinprick (PEPs), and intra-epidermal electrical stimuli (IES, IEEPs) in Experiment 1. The dotted lines indicate the limits of agreement. No significant N-latency differences were found between Visit 1 and Visit 2 of pain-related evoked potentials (continuous line). CHEPs contact heat evoked potentials, IEEPs intra-epidermal electrically evoked potentials, PEPs pinprick evoked potentials.

In agreement with the ICC, the Bland–Altman coefficients of the N-latency are the widest for IEEPs in contrast to CHEPs and PEPs, indicating a poor agreement between Visit 1 and 2 for IEEPs. NP-amplitude showed similar Bland–Altman coefficients across all multi-modal PREPs demonstrating a high agreement between visits which is in accordance with the “excellent” ICC reliability. Additionally, Bland–Altman analyses showed no significant bias between Visit 1 and 2 (p > 0.05) for N-latency and NP-amplitude. In contrast to the ICCs for pain ratings, Bland–Altman coefficients of pain ratings were similar across all stimulation modalities. However, pain ratings in response to contact heat and pinprick stimuli showed a significant positive bias indicating lower pain ratings at Visit 2 compared to Visit 1.

Experiment 2 (Fig. 4, Table 4): As stimulation intensities increased, ICCs of IEEP N-latencies improved from “poor” at 1.5 × EDT and 2 × EDT, to “good” at 4 × EDT, and reached “excellent” at 0.5 mA. The NP-amplitude of IEEPs showed the best reliability (“good”) at an IES intensity of 0.5 mA. As stimulation intensities increased, ICCs of pain ratings in response to IES improved from “fair” (1.5 × EDT) to “good” (2 × and 4 × EDT) and reached “excellent” (0.5 mA) reliability between Visit 1 and Visit 2.

Fig. 4

Bland–Altman plot of N-latency differences of intra-epidermal electrically evoked potentials in response to intra-epidermal electrical stimuli across different stimulation intensities in Experiment 2. N-latency differences were calculated by subtracting Visit 1 from Visit 2 (y-axis), while the x-axis represents the mean N-latency of both visits. The dotted lines indicate the limits of agreement. No significant N-latency differences were found between Visit 1 and Visit 2 of intra-epidermal electrically evoked potentials (continuous line). EDT electrical detection threshold.

The Bland–Altman coefficients of the N-latency for IEEPs get narrower with increasing IES intensities, indicating an improved agreement between Visit 1 and 2 for IEEPs applying an intensity of 4 × EDT and 0.5 mA. Both NP-amplitude of IEEPs and pain ratings in response to IES showed similar Bland–Altman coefficients across all IES intensities demonstrating a high agreement between visits. Additionally, Bland–Altman analyses showed no significant differences between the mean differences of Visit 1 and 2 throughout all the readouts (N-latency, NP-amplitude, and pain ratings) and zero (p > 0.05). These findings indicate no significant bias between Visit 1 and Visit 2.

CHEPs contact heat evoked potentials, ICC intraclass correlation coefficient, IEEPs intra-epidermal electrically evoked potentials, PEPs pinprick evoked potentials, SD standard deviation.

Discussion

To our knowledge, this is the first study assessing the test–retest reliability of IEEPs compared to multi-modal PREPs such as CHEPs and PEPs and across different IES intensities up to 0.5 mA. Only at high IES intensities (4 × EDT and 0.5 mA) was the N-latency reliability of IEEPs comparable to that of CHEPs and PEPs. Additionally, NP-amplitude demonstrated high reliability at 2 × EDT in Experiment 1 and at 0.5 mA in Experiment 2. Hence, IEEPs may serve as a valuable addition to clinical neurophysiology for assessing the functional integrity of the nociceptive system. Notably, this study suggests that applying IES at an intensity of 4 × EDT or 0.5 mA is more likely to yield reliable IEEPs.

Improved test–retest reliability of IEEP N-latencies with increasing IES intensities

IES is known for its synchronous direct electrical depolarization of nerve fibers in contrast to other stimulation modalities. Hence, we expected “good” to “excellent” N-latency reliability for IEEPs, as previously reported by Özgül et al.28. Surprisingly, IEEPs recorded in response to IES at 2 × EDT during Experiment 1 showed low N-latency reliability (“poor” ICC and wide Bland–Altman coefficients) in contrast to CHEPs and PEPs, which both showed high reliability (“excellent” ICCs and narrow Bland–Altman coefficients). A previous study reported “fair” rather than “excellent” ICCs for CHEP N-latencies in response to contact heat stimulation, which might be due to a slower heating ramp5. While the heating ramp of the present study was 250 °C/s, Kramer et al. had a heating ramp of 70 °C/s, potentially leading to a less synchronous nociceptor activation and an increased N-latency jitter1,40. In addition, previous studies demonstrated “good” instead of “excellent” ICCs of PEP N-latencies9, which might be due to the precise manual application of the pinprick by our investigator. Overall, differences in study design and stimulus application could account for the differences in N-latency reliability between CHEPs, PEPs, and IEEPs. However, in contrast to the present study, Özgül et al., 2017 demonstrated “excellent” reliability of IEEP N-latencies. While in the present study (Experiment 1), a stimulation intensity of 2 × EDT (~ 0.32–0.34 mA) was applied, Özgül’s study applied 2 × pain threshold (~ 5 mA)28. Such high intensities may lead to an increased electrical field of stimulation, i.e., spatial summation, thereby increasing the recruitment and synchronicity of nociceptor activation, which in turn could reduce the latency jitter and thereby improve the reliability of IEEP N-latencies. This assumption was directly tested in Experiment 2 of the present study, where IEEPs were assessed in response to IES applied at different stimulation intensities.
Indeed, Experiment 2 demonstrated an improvement in the reliability of IEEP N-latencies when applying increasing IES intensities such as 4 × EDT and 0.5 mA. The combination of the findings from Experiment 2 with the “excellent” reliability of IEEP N-latencies reported in Özgül’s study supports the assumption that higher IES intensities lead to better reliability of IEEP N-latencies. However, high IES intensities have been associated with the co-activation of fast-conducting Aβ-fibers in addition to slow-conducting Aδ-fibers by IES, possibly generating better synchronized IEEPs and thereby improving the reliability of IEEP N-latencies. If this were the case, IEEPs in the study by Özgül might be confounded by non-nociceptive activation, reflecting primarily the integrity of the tactile system and not of the nociceptive system, limiting their clinical use to test the spinal conduction along the spinothalamic tract8,27.

Until recently, it was assumed that nociceptive specific (i.e. exclusive nociceptor) activation is essential for PREPs to propagate via the spinothalamic tract in order to reflect the integrity of the nociceptive system. Several studies have concluded that IES is not nociceptive specific at higher intensities (1.3–3.4 mA), as IEEPs were still detectable after skin denervation, unlike LEPs8,27. Other studies argued that IES is likely nociceptive specific at lower intensities (2 × EDT, ~ 0.18 mA)16. However, recent findings showed evidence of spinothalamic propagation by IEEPs at intensities as high as 4 × EDT or 0.5 mA, leading to the hypothesis that predominant, rather than exclusive, nociceptive activation by IES may be sufficient for IEEPs and can reflect the functional integrity of the nociceptive system14. For this reason, Experiment 2 assessed the test–retest reliability of IEEPs at increasing IES intensity up to 0.5 mA and demonstrated a “good” IEEP N-latency reliability applying an IES intensity of 4 × EDT and 0.5 mA.

Methodological factors influencing the reliability of IEEP NP-amplitudes

Experiment 1 demonstrated “excellent” ICCs and narrow Bland–Altman coefficients likewise for NP-amplitudes of CHEPs, PEPs, and IEEPs. These results are supported by prior research, substantiating the robustness of these PREPs5,9,28. In Experiment 2, the ICCs of IEEP NP-amplitudes improved with higher stimulation intensities, possibly due to the recruitment of more nociceptors (i.e. spatial summation), while the Bland–Altman coefficients remained consistently narrow. As a result of spatial summation, the pain perception was heightened with increasing IES intensities likely enhancing the salience of the noxious stimulus, generating a more synchronized afferent signal, and leading to an increased NP-amplitude reliability1.

Notably, the ICC of IEEP NP-amplitudes elicited by IES at 2 × EDT was “excellent” in Experiment 1, whereas it was only “fair” in Experiment 2. Although the study designs of Experiment 1 and 2 were very similar, there is one key difference that might explain the discrepancy in IEEP NP-amplitude reliability. The experiments were conducted by two different investigators, which may have led to differences in electrode–skin contact forces during the taping of the triple concentric planar electrode to the skin. The way of taping the electrode to the skin is a relevant limitation and should be optimized in future studies, as variations in skin properties and contact forces can affect stimulation consistency.

Pain ratings are maintained low up to 0.5 mA

Pain ratings in response to contact heat and pinprick stimuli had an “excellent” ICC in contrast to pain ratings in response to IES which showed a “poor” ICC. While rating the pain intensity of a noxious stimulus on an NRS from 0 to 10, the participant needs to understand the difference between non-painful and painful quality of sensation. If no painful quality such as “burning”, “stabbing”, or “pricking” is felt, an NRS of 0 should be rated irrespective of the perceived stimulus intensity. If a painful quality is felt, then the rated number should represent the intensity of the perceived pain (1–10 NRS). To ensure that participants correctly understood the instructions for rating pain intensity and the distinction between 0 and 1 on the NRS, they were asked to verbally describe the quality of sensation after each block of stimuli by selecting one or more terms from a predefined list of eleven descriptors. Therefore, we are confident that our participants were rating pain intensity rather than the overall unpleasantness of the stimulus. Mouraux et al., 2010 as well as Júlio et al., 2025 reported that the quality of sensation of IES at 2 × EDT is “pricking”, “stabbing”, “sharp”, “tingling”, “light touch”, and “touch”16,26. If the difference between perceiving the stimulus as painful (NRS ≥ 1) or non-painful (NRS = 0) was difficult to determine at low intensities, this might explain the “poor” reliability of pain ratings in response to IES at 2 × EDT. In accordance, Experiment 2 showed improved ICCs (“good” to “excellent”) when applying higher stimulation intensities (4 × EDT and 0.5 mA) where the difference between NRS = 0 and NRS ≥ 1 might be clearer. However, more important than having reliable pain ratings in a clinical setting is to ensure that a noxious stimulation remains tolerable for patients to minimize eye muscle artifacts, which could interfere with the assessment of PREPs.

Clinical applicability of IEEPs

For future clinical applications, it is crucial to have reliable N-latencies and NP-amplitudes of IEEPs and tolerable IES to minimize muscle artifacts. Both IES intensities of 4 × EDT and 0.5 mA resulted in reliable IEEPs (N-latencies, NP-amplitudes) as well as reliable and low pain ratings. When stimulating at 4 × EDT, we account for inter-visit variabilities by adapting the intensity to the EDT, which helps control for differences between visits, such as arousal levels. For patients with clear sensory deficits, where the EDT cannot be reliably assessed, a fixed stimulation intensity of 0.5 mA can be used instead. These findings support the potential clinical application of IES and IEEPs as an objective, reliable, and feasible tool for assessing the functional integrity of the nociceptive system, particularly in neurological conditions such as small fiber neuropathy or spinal cord pathologies. The ability of IES to elicit robust and reliable IEEPs with low perceived pain makes it well-suited for repeated assessments and longitudinal monitoring in diagnostic contexts. Indeed, IEEPs were already shown in a previous study to be sensitive in detecting impaired spinothalamic conduction in patients with centromedullary lesions and may offer complementary diagnostic value alongside CHEPs14. Additionally, in contrast to CHEPs and LEPs, which require specialized setups, IES is technically more feasible and can be implemented in standard clinical neurophysiology laboratories.

Limitations

One limitation of this study is the variability in manually applied pinprick stimuli, particularly in terms of stimulation duration and velocity. Additionally, impedance during IES may differ across participants and application sites due to variations in skin properties and contact forces of the electrode, which can further vary systematically between investigators. Future research should explore the influence of factors such as pulse frequency, stimulation area, inter-investigator variability, and electrode–skin contact forces on the test–retest reliability of IEEPs. Further studies in larger and more diverse populations are needed to confirm the generalizability of the present findings on IEEP reliability.

Conclusions

IEEPs represent a promising method to assess the functional integrity of the nociceptive system. Previous studies hypothesized that IEEPs might allow more in-depth characterization of certain spinal pathologies, improving the delineation of the shape and trajectory of the spinal lesion12,14. As a prerequisite for using IES and IEEPs as a diagnostic tool in clinical practice, it is essential to establish the test–retest reliability of IEEPs. To this end, this study assessed the test–retest reliability of IEEPs in response to IES compared to CHEPs and PEPs (Experiment 1), and across different IES intensities up to 0.5 mA (Experiment 2). IEEPs demonstrated high test–retest reliability for N-latencies, NP-amplitudes, and pain ratings, particularly when applying relatively high IES intensities (Experiment 2), such as 4 × EDT and 0.5 mA. While stimulating at 4 × EDT could be advantageous to minimize inter-visit variability by accounting, e.g., for day-to-day fluctuations in perception, 0.5 mA may serve as a practical choice for patients with elevated or absent detection thresholds.