Introduction

Stroke remains the second leading cause of death worldwide and is a major driver of disability1. With demographic changes leading to an aging population, its incidence is projected to rise significantly in the coming decades1. Although acute stroke treatments have advanced, many survivors face chronic neurological deficits that substantially affect their independence and quality of life2. Patients with significant functional impairments typically receive intensive multidisciplinary rehabilitation, including speech and language therapy, in specialized neurorehabilitation clinics3.

Telerehabilitation, using communication technologies to deliver remote rehabilitation, tackles challenges such as travel logistics, high costs, and restricted access to quality care4. This approach enables therapy continuation after hospital discharge and supports patients in rural or remote regions5. Speech and language telerehabilitation, a subtype of telerehabilitation, can be administered synchronously, asynchronously, or via a hybrid model6. Evidence suggests that these remote interventions are feasible, safe, and can match in-person outcomes while boosting overall therapy doses7,8.

Around 30–60% of stroke survivors develop communication disorders9. Conditions such as aphasia, apraxia, dysphagia, and dysarthria disrupt daily activities and well-being10,11. Dysarthria, affecting 26–44% of stroke survivors, compromises respiration, phonation, resonance, articulation, and prosody12,13,14,15. Speech-language pathologists (SLPs) target various facets—articulation, voice, fluency, and swallowing—to improve function16. Yet SLPs face critical hurdles in remote monitoring, mainly due to subjective evaluations and manual, paper-based documentation17,18.

A pressing need exists for technology-driven tools that not only standardize speech and language assessments but also address the practical challenges faced by SLPs19,20,21. In many clinical settings, evaluations rely heavily on subjective judgments and manual data entry, which can lead to inconsistencies, extended assessment times, and increased workload22. The absence of robust automated systems—capable of providing real-time, objective scoring and seamless data capture—restricts the ability of clinicians to monitor patient progress effectively and make evidence-based decisions23. Although language analysis has benefited from resources such as TalkBank, the lack of analogous comprehensive databases for pathological speech means that SLPs are missing critical benchmarks for assessing and tracking patient performance24. By integrating automated scoring and digital data management tools, clinicians can reduce manual tasks, minimize documentation errors, and tailor interventions more precisely, ultimately enhancing patient outcomes and workflow efficiency25.

Recently developed digital tools migrate traditional speech-language assessments online, offer automated scoring or analytics, and have undergone initial validation across diverse age groups26,27,28,29,30. However, none of these platforms specifically targets post-stroke dysarthria: they lack the specialized modules for articulatory precision, real-time motor speech control assessment, and stroke-adapted task designs needed for objective dysarthria evaluation in stroke survivors31,32. Furthermore, established digital tools often suffer from critical limitations: many offer only a cumbersome user experience or limited remote assessment capabilities, lack integrated automated scoring, or force clinicians to navigate multiple disjointed platforms19,33,34,35. Data fragmentation across multiple platforms further hampers remote speech-language pathology, leading to inefficiencies and lost opportunities for integrated care35,36,37. These weaknesses translate into extended documentation times, inconsistent data capture, and, ultimately, poor uptake in daily clinical practice. To address these gaps, our study aims to develop a digitized assessment tool that automatically captures assessment data. By minimizing manual entry and siloed record-keeping, we seek to reduce documentation workload and improve data consistency. Simultaneously, we propose a needs-based interface designed to shorten assessment time and enhance usability. Automated summaries of assessment scores will streamline SLPs’ workflows and cut reliance on manual calculations, while voice-recording options will enable future data analytics for deeper clinical insights38,39.

To achieve these objectives, this study adopted a user-centred design (UCD) framework. UCD involves ongoing user participation throughout development to ensure that final products align with real-world requirements. We investigated key stakeholder requirements for a remote speech and voice assessment tool, addressing two sub-questions:

What features and functionalities do SLPs and other stakeholders require?

How do stakeholders perceive the tool’s usability, effectiveness, and overall utility?

By answering these questions, our work aims to advance innovative tools that enhance the accessibility and quality of speech-language therapy—particularly crucial for stroke survivors and others with significant communication needs.

Methods

User centred design methodology

This formative user study followed the ISO 9241-210:2019 User-Centred Design (UCD) framework40,41, augmented by an additional prototype development phase to facilitate iterative user involvement. The five core phases were:

  • Understand and Specify the Context of Use.

  • Specify the User and Organizational Requirements.

  • Produce Design Solutions.

  • Develop Prototype.

  • Evaluate Prototype Against Requirements.

All procedures adhered to relevant guidelines and regulations. The study received jurisdictional clearance from the Ethics Commission of Northwestern and Central Switzerland (EKNZ; Req-2024-00103). All participants provided written informed consent, including consent for publication of de-identified data. Sample size followed ISO guidance on the ergonomics of human-system interaction and classical empirical research guidelines42,43,44,45.

Participants and setting

Study participants included six speech-language pathologists (SLPs) with experience in online or inpatient neurorehabilitation, one SLP researcher specializing in telerehabilitation, two clinical IT specialists, and two lay users. SLPs were required to be actively delivering rehabilitation services, researchers to be employed in telerehabilitation-oriented institutions, IT experts to be engaged in clinical support or telehealth solutions, and lay users to have no involvement in SLP practice or research. Detailed demographic information is presented in Table 1.

Table 1 Participant characteristics.

Data collection and analysis

Sampling strategy

A purposive sampling method was employed, predicated on the assumption that each participant would offer unique insights46. Because participants’ roles were not interchangeable, sample size was guided by data saturation rather than statistical power analysis46.

Qualitative data collection

To capture user needs and workflows for a tele-SLP application, data were gathered through:

  • Focus Group Interview (1 h): Three online SLPs discussed online assessment challenges, device preferences, and usability issues.

  • Individual Interviews (1 h each): One SLP researcher (requirements for data collection, collaboration features) and two clinical IT professionals (integration, compliance, performance, cost).

  • Work Shadowing (two sessions): Observations with one online SLP and one clinical SLP focused on internet stability, audio/video quality, and ancillary tools used.

All sessions were audio-recorded and transcribed; relevant themes were extracted using MAXQDA to identify user, organizational, and technical needs.

Development process

Understand and specify the context of use

Focus group and expert interviews, each lasting one hour, were conducted via Microsoft Teams, recorded with noScribe (an AI-based, open-source transcription tool), and analysed in MAXQDA. Work shadowing sessions, each lasting three hours, were conducted via customized Telerehab applications and onsite in a neurorehabilitation clinic. We identified user goals, tasks, resources, and environmental constraints, then mapped current (“as-is”) and potential (“to-be”) processes with Business Process Model and Notation (BPMN). Three online SLPs participated in the focus group, one researcher and two IT professionals in the individual expert interviews, and one online plus one clinical SLP in the work shadowing sessions.

Specify the user and organizational requirements

User needs tagged in MAXQDA were converted into structured user requirements (e.g., “The <user group> needs to know <information>…”). Functional vs. non-functional requirements were established by evaluating feasibility and importance. Needs met by existing systems or beyond the prototype’s scope were excluded; remaining requirements formed use cases defining tasks and expected outcomes.

Prototype design

A mid-fidelity mock-up was developed in Figma, incorporating key features. Iterative feedback from an online SLP and two lay users addressed usability issues and refined the interface.

Develop prototype

Building upon the developed Figma design, an initial web application prototype was implemented using React.js, a popular library for building user interfaces, with Node.js providing the JavaScript runtime and npm modules (small, reusable code packages) managing dependencies. Git version control was integrated to track development progress and facilitate collaboration. The prototype followed a component-based architecture, using React hooks (simple functions that let individual parts of the page keep track of data and respond to changes) for local state management and a context provider for global state sharing. To ensure accessibility and continuous feedback from stakeholders, the production build was deployed on GitHub Pages, enabling rapid updates and broader accessibility.
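
To make the described pattern concrete, the sketch below shows a hooks-plus-context setup of the kind outlined above. It is a minimal illustration, not the actual iSpeak-Tele code: component and field names (SessionProvider, NotePad, patientId) are assumptions.

```jsx
// Minimal sketch of the hooks + context pattern described above.
// All names are illustrative, not taken from the iSpeak-Tele codebase.
import React, { createContext, useContext, useState } from "react";

// Context provider for global state shared across pages
// (e.g., patient ID, session notes).
const SessionContext = createContext(null);

export function SessionProvider({ children }) {
  const [session, setSession] = useState({ patientId: "", notes: [] });
  return (
    <SessionContext.Provider value={{ session, setSession }}>
      {children}
    </SessionContext.Provider>
  );
}

// A sidebar-style component using a hook for local state (the draft note)
// and the shared context for global state (the saved notes).
export function NotePad() {
  const { session, setSession } = useContext(SessionContext);
  const [draft, setDraft] = useState("");

  const saveNote = () =>
    setSession({ ...session, notes: [...session.notes, draft] });

  return (
    <div>
      <textarea value={draft} onChange={(e) => setDraft(e.target.value)} />
      <button onClick={saveNote}>Save note</button>
    </div>
  );
}
```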

Evaluate prototype against requirements

The iSpeak-Tele prototype was evaluated through a structured three-stage process aimed at verifying essential functionality, identifying usability gaps, and assessing overall user satisfaction. The first stage, Use Case Testing, ensured that core features—such as ID format validation, note saving, and audio/video recording—functioned correctly and aligned with predefined requirements. In the second stage, Lay User Testing, two lay users explored the prototype freely to identify interface ambiguities and potential usability issues; minor text and interface adjustments were made based on their feedback.
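
As an illustration of the first stage, an ID format check reduces to a small, easily testable function. The pattern below is an assumption for demonstration; the study does not specify the actual format, and the check was later relaxed to accommodate variable ID lengths (see Testing overview).

```jsx
// Hypothetical patient-ID validation; the real format is an assumption.
const PATIENT_ID_PATTERN = /^P-\d{4}$/; // e.g., "P-0042"

export function validatePatientId(id) {
  return PATIENT_ID_PATTERN.test(id.trim());
}

console.log(validatePatientId("P-0042")); // true
console.log(validatePatientId("42"));     // false
```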

Finally, Expert User Testing involved five SLPs over two iterative cycles (four tested version 1.0; all five tested version 1.1). Participants received a customized questionnaire, patient information (English/German), a user manual, and a usability test form. SLPs completed real or simulated patient sessions, providing qualitative and quantitative feedback through open questions and ratings on 5-point Likert scales (1 = strongly disagree, 5 = strongly agree), focusing on general usability, features & functionalities, the FDA-2 assessment, and patient interaction, and filled out the System Usability Scale (SUS) (Supplementary Tables 1 and 2)47. In one instance, an SLP compared the time required to assess a patient with and without the prototype, indicating possible efficiency gains.
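
The SUS itself follows Brooke’s standard scoring: odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is scaled by 2.5 to a 0–100 range. A minimal implementation of this published rule (not project-specific code) is sketched below.

```jsx
// Standard SUS scoring (0-100) from ten 1-5 Likert responses.
export function susScore(responses) {
  if (responses.length !== 10) throw new Error("SUS requires ten responses");
  const sum = responses.reduce(
    (acc, r, i) => acc + (i % 2 === 0 ? r - 1 : 5 - r), // odd items: r-1, even: 5-r
    0
  );
  return sum * 2.5;
}
```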

Results

Understand and specify the context of use

Focus groups, individual interviews, and work-shadowing sessions revealed a predominance of manual, paper-based workflows among SLPs. Commonly cited challenges included:

  • Long Assessment Times: Manual documentation and scanning of notes.

  • Connectivity and Privacy Issues: Unstable internet disrupting online sessions, legal constraints on recording.

  • Tool Fragmentation: SLPs often used multiple, disjointed platforms for therapy tasks.

SLPs, researchers, and IT professionals converged on the need for streamlined documentation, a user-friendly interface, and robust data-security measures.

Specify the user and organizational requirements

A total of 34 user needs emerged, showing minimal overlap across groups:

  • 7 needs from IT professionals.

  • 9 from the researcher.

  • 15 from SLPs.

  • 2 overlapping between researcher and SLPs.

  • 1 overlapping between researcher and IT professionals.

Six were excluded (five already covered by existing systems, one deemed technically infeasible), leaving 28 needs (Supplementary Table 3). From these, the team derived 40 functional requirements—20 of which were deferred to future iterations due to data dependency, low priority, additional backend demands, frontend-only constraints, redundancy, security concerns, assessment validity issues, or data retention policies. The remaining 20 functional requirements are summarized in Table 2, alongside seven non-functional requirements (Table 3). These informed nine use cases (Table 4), each aligning with the prototype’s initial design objectives. The findings also informed the choice of assessment to digitize: SLPs preferred a web-based synchronous application over a tablet-based or asynchronous solution, and the FDA-2 was chosen for its inclusion of free-speech tasks, its wide distribution, and its therapist-administered format48.

Table 2 Functional requirements.
Table 3 Nonfunctional requirements.
Table 4 Use cases defined for final tests of the functional requirements.

Produce design solution

Based on the refined requirements, a Figma mock-up was created (Fig. 1). A fixed sidebar provided constant access to tools (stopwatch, recording, note-taking) across all pages, supporting key tasks:

Fig. 1

Figma page design for FDA-2 cough reflex task with instruction text, grading, and navigation menu reflecting SLP feedback on its clear layout and high usability.

  • Patient Data Information (patient/case IDs).

  • Influencing Factors (FDA-2-specific variables).

  • Task Layout for FDA-2 (main assessment flow).

  • FDA-2 Intelligibility Task (patient speaks, SLP interprets).

  • Patient Window (plain display for patient prompts).

  • Note-Taking (sidebar module).

  • Stopwatch (timing tasks).

The clickable prototype underwent two feedback cycles. Early feedback led to adjustments in logo placement, hover icon information, and instruction text structure (on-page vs. info button). The intelligibility task was revised to include a short training phase (two words) before the primary 10-word set, with the option for SLPs to edit entries afterward. Having resolved minor design issues, the team proceeded to the development stage (Figs. 1 and 2).

Fig. 2

React Prototype of FDA-2 cough reflex task with instruction text, grading, and navigation menu demonstrating the intuitive feature design.

Develop prototype

A frontend-only React.js application was built to store user input locally and enable end-of-session data exports (PNG, CSV, WAV, WEBM). This offline approach addressed immediate privacy concerns and simplified early deployment.
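
A minimal sketch of this frontend-only flow is shown below, assuming hypothetical storage keys and field names: scores are held in browser storage during the session and exported client-side as a CSV download, so no data leaves the device.

```jsx
// Illustrative local-storage persistence and client-side CSV export;
// the storage key and record shape are assumptions.
export function saveResultLocally(taskId, score) {
  const results = JSON.parse(localStorage.getItem("fda2Results") || "{}");
  results[taskId] = score;
  localStorage.setItem("fda2Results", JSON.stringify(results));
}

export function exportResultsAsCsv() {
  const results = JSON.parse(localStorage.getItem("fda2Results") || "{}");
  const rows = ["task,score", ...Object.entries(results).map(([t, s]) => `${t},${s}`)];
  const blob = new Blob([rows.join("\n")], { type: "text/csv" });
  const link = document.createElement("a");
  link.href = URL.createObjectURL(blob);
  link.download = "fda2_results.csv"; // one of the end-of-session export formats
  link.click();
  URL.revokeObjectURL(link.href);
}
```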

Component-based architecture

The application was divided into modular React components (Fig. 2), illustrated in UML diagrams and the application workflow (Supplementary Figs. 1, 2, and 3); each part of the page is thus a self-contained, reusable piece, which makes development faster and more reliable. Key component groups included layout & navigation, session setup & patient data, assessment execution, and utility tools & data export.

This architecture also facilitates easy updates to assessment-specific texts (e.g., FDA-2) and supports multilingual functionality. With a focus on accessibility, the website is designed to be fully responsive for both desktop and laptop users.
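
One plausible way to realize swappable assessment texts and multilingual support is a per-language dictionary consumed by a presentation component, as sketched below; the structure, keys, and wording are illustrative assumptions.

```jsx
// Hypothetical per-language text dictionary for assessment tasks.
import React from "react";

const LOCALES = {
  en: { coughReflex: { title: "Cough reflex", instruction: "Please cough once." } },
  de: { coughReflex: { title: "Hustenreflex", instruction: "Bitte husten Sie einmal." } },
};

// Rendering a task's instruction text for the selected language.
export function TaskInstruction({ lang, taskId }) {
  const { title, instruction } = LOCALES[lang][taskId];
  return (
    <section>
      <h2>{title}</h2>
      <p>{instruction}</p>
    </section>
  );
}
```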

Deployment and documentation

To facilitate testing in clinical environments, the application was containerized using Docker, ensuring secure local hosting and easy integration into existing IT infrastructures. A README file, including a concise user guide, assisted developers and testers in configuring and navigating the application. Given that typical SLP hardware setups favour desktop usage, full mobile optimization was not prioritized.
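
Containerizing a React production build typically uses a multi-stage Dockerfile like the sketch below; the base images, paths, and exposed port are assumptions rather than the project’s actual configuration.

```dockerfile
# Illustrative multi-stage build for locally hosting the frontend.
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Serve the static production build (e.g., inside the clinic network).
FROM nginx:alpine
COPY --from=build /app/build /usr/share/nginx/html
EXPOSE 80
```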

Evaluate prototype against requirements

Testing overview

After verifying use-case functionality (e.g., ID format checks, note saving, AV recording), two lay users tested the prototype without identifying major bugs. Minor wording and button-placement changes were adopted prior to expert user testing. Expert users (SLPs) received a reference manual and completed a questionnaire after their sessions. A temporary adjustment was also made to accommodate variable ID lengths (use case 9). Patients did not participate at this stage, since only SLPs interact with the front end.

Expert user testing cycles

First Cycle (v1.0).

  • Participants: Four online SLPs testing for ~48 min each. One used a real clinical patient (offline); the others simulated an online session with a stand-in patient.

  • Issues & Requests (n = 31): Backend needs (e.g., pausing assessments, longitudinal comparisons), single-screen or mobile adaptation, additional language options, layout adjustments (e.g., graph contrast), missing or unclear app or user manual details, recording compatibility issues on Firefox, and text reduction (deferred for assessment validity).

  • Implemented: “Download All” button, stopwatch-to-notes integration, an intelligibility-task restart button, minor assessment-flow tweaks.

Second Cycle (v1.1).

  • Participants: Five SLPs (two repeated from v1.0 for direct comparison).

  • Issues & Requests (n = 16): Ongoing backend needs (file storage, pausing, analysis integration), expanded assessment support, text reduction requests (still pending SLP consultation).

  • Implemented: Task counter, recording pause function, .wav encoder for Praat compatibility (see the sketch below), and various minor bug fixes (e.g., window-size display issues).
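
For context, Praat expects uncompressed PCM audio, so a .wav export amounts to writing a RIFF/WAVE header around 16-bit samples. The sketch below shows one way to do this from Web Audio Float32 data; it is an illustration, not the project’s actual encoder.

```jsx
// Minimal 16-bit PCM mono WAV encoder (illustrative).
export function encodeWav(samples, sampleRate) {
  const buffer = new ArrayBuffer(44 + samples.length * 2);
  const view = new DataView(buffer);
  const writeStr = (offset, s) =>
    [...s].forEach((c, i) => view.setUint8(offset + i, c.charCodeAt(0)));

  writeStr(0, "RIFF");
  view.setUint32(4, 36 + samples.length * 2, true); // remaining chunk size
  writeStr(8, "WAVE");
  writeStr(12, "fmt ");
  view.setUint32(16, 16, true);             // fmt subchunk size
  view.setUint16(20, 1, true);              // audio format: PCM
  view.setUint16(22, 1, true);              // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate
  view.setUint16(32, 2, true);              // block align
  view.setUint16(34, 16, true);             // bits per sample
  writeStr(36, "data");
  view.setUint32(40, samples.length * 2, true);

  // Clamp floats to [-1, 1] and convert to signed 16-bit integers.
  samples.forEach((s, i) => {
    const v = Math.max(-1, Math.min(1, s));
    view.setInt16(44 + i * 2, v < 0 ? v * 0x8000 : v * 0x7fff, true);
  });
  return new Blob([view], { type: "audio/wav" });
}
```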

User ratings

Mean ratings for meeting expectations rose from 4.25 in v1.0 to 4.8 in v1.1, while design appeal and navigation both improved from 4.25 to 4.6. Core features—stopwatch, note-taking, recording, and patient window—were rated highly useful (4.6–5.0) and intuitive (4.0–5.0) in v1.1. The FDA-2 assessment workflow remained easy to use (4.6 vs. 4.75), and preference for the digital version increased from 4.25 to 4.6, supporting the prototype’s promise for streamlined, efficient assessments (Table 5).

Table 5 Expectations, design properties, core features, assessment workflow, and user preferences (1 = strongly disagree, 5 = strongly agree).

SUS results

Four SLPs tested v1.0 (mean SUS: 87.5) and five tested v1.1 (mean SUS: 82.5), yielding a combined score of 84.7, above the standard benchmark of 68 (Supplementary Tables 1 and 4), as the following exemplary quote from one SLP highlights:

Keep most of it. It is clean and easy to use.

Minor score differences likely reflect sample size rather than true changes in usability, given minimal workflow alterations between versions.
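
For clarity, the combined score of 84.7 is the participant-weighted mean of the two cycles rather than a simple average of the cycle means:

```latex
\overline{\mathrm{SUS}} = \frac{4 \times 87.5 + 5 \times 82.5}{4 + 5} = \frac{762.5}{9} \approx 84.7
```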

One SLP reported a reduction in total assessment time from 50 min (without the app) to 35 min (with the app). Though the sample size is limited, this suggests potential efficiency gains, as highlighted by an example quote from one SLP:

Clear instructions, only stuff that you need is shown, all options are useful, graphs are made automatically and can be saved so I can upload it into patient system - huge timesaver!

Discussion

This study tackled the lack of digitized, standardized tools for remote speech and language assessments—a pressing issue given the growth of telerehabilitation and the persistent reliance on manual, paper-based methods17,18,35,37. We found that SLPs require intuitive, integrated functionalities (e.g., time measurement, note-taking, automated scoring) while researchers and IT professionals emphasize secure data handling and interoperability. Overall, stakeholders perceived the prototype as highly usable and effective—reflected in its strong SUS rating—indicating that a carefully tailored, user-centred solution can meet the demands of remote speech assessment.

By developing a frontend application focused on the FDA-2 assessment, we addressed core functional requirements such as “patient data register”, “take notes”, “time measurement”, “recording”, “store data locally”, “data transfer”, and “visualization”, alongside key use cases like “recording assessment”, “recording session”, “multiple windows”, and “input patient’s data” that emerged from our user-centred design (UCD) process. Specifically, the application supports synchronous online therapy by enabling SLPs to record audio/video, measure time, and maintain concise session notes, thereby reducing manual documentation. The prototype thus represents an advance on several levels in the rapidly evolving field of speech and language telerehabilitation, particularly through its synchronous and integrated approach. These outcomes align with the global push to improve the quality of life for stroke survivors through accessible telerehabilitation1,4.

A core strength of our study is the rigorous UCD approach, directly informed by the diverse needs of SLPs, researchers, and clinical IT professionals. Iterative input from clinicians led to design features such as error checks for patient data (patient data register), a separate window for patient display (multiple windows), and one-click recording at any time (recording). These integrated functions helped reduce assessment time from 50 to 35 min and yielded a mean System Usability Scale (SUS) of 84.7—well above the industry average of 68 and indicative of high usability—highlighting both efficiency gains and high user acceptance. This echoes prior calls for integrated digital solutions that streamline remote speech therapy while maintaining effectiveness19,33,34,35,36,49,50.

SLPs consistently emphasized time measurement, note-taking, and automated scoring (the “scoring” requirement) as central to easing their workload. By combining these in a single interface, the prototype demonstrably improved workflow efficiency—one SLP recorded a 15-minute reduction in assessment time. Although further validation with a larger sample is needed, these indications suggest that digitized tasks can significantly reduce administrative overhead, which is especially valuable in tele-SLP settings7,8.

Meanwhile, researchers highlighted the importance of automated scoring, data anonymization, and exchangeability to support broader analyses, while IT specialists prioritized secure data handling compliant with health-data regulations. Although the current prototype addresses many security issues by storing data locally on the device, a robust backend is crucial for real-time data processing, comprehensive interoperability, and adherence to regulatory requirements23,24,51. This will allow data to flow seamlessly to clinical systems, safeguarding patient information under frameworks like the GDPR, MDR, and ISO 27001/270025,17. Furthermore, these needs indicate the potential for a data ecosystem to serve as a connection and analysis hub, integrating various data streams and supporting clinical decision-making51. They likewise indicate the potential for seamless data sharing with electronic health record (EHR) systems and national rehabilitation platforms, enabling clinicians to access patient progress directly within their healthcare workflows.

To support a wider range of speech and voice disorders, additional language options, further visualization features, and expansion to other assessments beyond FDA-2 have been requested9,12,37. In addition, amassing labelled audio data over time could lead to robust pathological speech datasets, advancing automated diagnosis models and complementing existing linguistic repositories such as TalkBank24,38,39,52,53.

Although participants typically preferred synchronous, supervised sessions, future iterations might incorporate semi-supervised or self-assessment modes for less complex patient needs, especially where mobility or scheduling is limited54. Given that user requirements evolve, ongoing multi-stakeholder collaboration—including potential early engagement with policy-makers—remains pivotal for balancing advanced features (e.g., full automation, broader assessment coverage) with practical clinical and regulatory constraints. Future iterations will include tests with people with post-stroke dysarthria to gain more ecologically valid insights from their perspective; their involvement was kept minimal at this development stage because only therapists interact with the application. Future iterations may also include a wider selection of assessments. For the first iteration, the FDA-2 was chosen based on user preferences and the existing gap in digitized assessments, specifically its inclusion of free-speech tasks, wide distribution, and therapist-administered format31,32.

This formative evaluation involved six SLPs—sufficient for early-stage usability debugging but under-powered for hypothesis-driven statistics. In the next development cycle we will introduce a secure backend for data storage and user management, add automated scoring and objective voice analytics that return phonation, fluency, and other speech metrics, and then conduct a powered confirmatory study with 20–25 SLPs, collecting information on patients’ specific conditions and dysarthria severity once patients are included in testing. That trial will test pre-specified hypotheses on diagnostic accuracy, efficiency gains, and user satisfaction. Consistent with ISPOR ePRO migration guidance, we did not repeat a reliability study here because the underlying scoring algorithm remains manual and unchanged55; formal equivalence testing is planned once automated scoring is implemented.

Conclusion and outlook

In summary, our UCD-driven, frontend-first prototype successfully meets critical requirements for patient data capture, timekeeping, recording and note-taking, among others, thereby reducing the administrative burden for SLPs and achieving high acceptance scores in remote speech therapy settings. The next decisive step is integrating a secure backend to support real-time data analysis, automated scoring, and seamless interoperability with existing healthcare systems. Such enhancements will help overcome the fragmentation in patient data, promote evidence-based decision-making, and foster a more robust ecosystem for speech pathology research24,36,37. Ultimately, expanded, digitized assessments stand to improve tele-SLP services for an aging population of stroke survivors and other individuals who depend on accessible, high-quality speech therapy1,2.