Main

Every embodiment requires distinct capabilities for various processes. Robots embody complex mechanical and software components. To assess suitability, these capabilities must be quantified comparably. However, given the complexity of a robotic system, performance analysis is challenging. Accurate quantification clarifies fitness for processes and enables meaningful comparisons across systems, aiding classification for various use cases. Although current classification focuses on morphological characteristics like kinematics1,2,3,4, we need to assess robots’ intrinsic and holistic features. For this, we introduce the tree of robots, analogous to Darwin’s tree of life, a morphology-capability-based taxonomy linking robots to processes. This tree branches from all machines into different robot types refined by principles, scales, application environments and mobility. Robot morphology groups families for specific domains into genera based on fitness for applications. However, fitness-based classification requires extensive investigation and quantification of robot fitness for processes, not yet systematically done for robot applications.

Considering the well-studied group of industrial robot manipulators and manipulation as a core application, we derive fundamental capabilities and fitness metrics for robotics. These include metrics for the robot’s interaction capabilities, such as force control accuracy, precision and resolution, and motion fitness, such as pose repeatability (see the ‘Process analysis and robot fitness metrics’ section). Robot manipulation aims for natural, intelligent and gentle handling of objects in dynamic environments in which factors vary5. This requires a sense of touch and sophisticated control of contacts6. Multiple different principles for force sensing and force control strategies that may fundamentally affect the robot’s process fitness were introduced7,8,9,10,11,12,13 and implemented in service and industrial robotics. Examples include programming by demonstration (kinaesthetic teaching)14,15,16,17; telemanipulation for space, prosthetic or medical applications using force sensing and feedback18,19,20,21,22; and safe physical human–robot interaction via contact force recognition and reaction23,24,25. Despite these approaches, scalable task execution in service and industrial robotics is yet to be achieved5,6. Automating processes such as fitting26, force-sensitive processing (grinding)27,28,29, haptic telemedical services30 or housing construction31 requires powerful yet precise physical interaction between the system and its environment.

To achieve a generalized and comparable quantification of fitness for robot manipulation, we must recognize that any manipulation task requires physical interaction with the environment. Complexity increases with flexible processes needing reactive behaviour under unknown conditions. Advanced motion and force/impedance control equips robots with the required interaction capabilities. Thus, objective, process-based metrics are needed to describe a robot’s fitness for motion, especially physical interaction, referred to as tactility. Existing motion performance metrics in industrial standards assess, for example, the positioning accuracy32. Our additional metrics now comprehensively measure the physical interaction quality, including force control accuracy.

By combining existing motion performance metrics with our new tactility metrics, we can now evaluate the overall fitness of robotic systems for processes involving physical interaction and categorize them accordingly.

This paper details the tree of robots and the procedure for determining process-based fitness metrics using industrial manipulators. For this purpose, we define the required fitness metrics for this family of robots based on individual previous works32,33,34,35 and connect them to our holistic framework. Fundamental to this framework is that any embodiment’s sole purpose is successful process execution and it is a unique instance created from specific cyber-physical components, which determine its fitness for a particular domain of processes. Our contributions in this paper are structured as follows.

  1. Introduce the synthesis and holistic concept of the tree of robots as a fundamental tool for robot classification, available as an online tool and a code base for worldwide research contribution;

  2. Provide an overview of the required metrics and their origin to establish the tree of robots;

  3. Measure and provide an overview of the tactility fitness of 11 commercially available industrial robot manipulators;

  4. Introduce the first established branch of the tree of robots and show how its genus classifies using a significance-based clustering approach.

Results

A taxonomy for robot fitness on top of morphology

The tree of robots is a taxonomy structuring the process-centric fitness of an artificial embodiment in connection to its morphology. The classification in its current form focuses on its ability to perform certain movements and physical interactions with the environment and humans. To combine an embodiment’s basic morphological and fitness information, this tool uses a tree-like graphical representation similar to the tree of life36, where the tree’s trunk represents the morphological categories, and the top branches refer to its process-based fitness. To obtain this fitness and classify the systems, we follow a structured, repeatable procedure originating from the desired processes, as conceptualized in Fig. 1 and summarized in this section. Forks in the tree represent differences in basic morphological features or substantial variations in the robot’s overall fitness. In this representation, the morphological characteristics (Fig. 1b) remain constant and are determined by the fundamental system design, meaning alterations in the system design result in a new embodiment. To provide an example for this morphological categorization, we divide the tree trunk first into domains based on existing machine categories, robots being one of them, which is the focus of our analysis. Kingdoms within the robot domain are defined based on the scale of the robot, explicitly being at the macro-/mesoscale. At the next level, the fundamental building material, for example, inorganic structures, defines the phylum. Depending on the basic building principle, we categorize the class, for example, mechatronic. The order refines the field of operation by considering the robot’s mobility, for example, a fixed base in our case. The definition of a robot’s family is based on its specific kinematic structure, such as serial manipulators. 
The system design’s main characteristics can be summarized graphically in a morphological box or formally in an extended unified robotics description format. Morphological features can be categorized using various approaches3,4,37 and describe the robot capabilities, for example, kinematics4 or controller design37. However, these methods overlook the final performance of a robot’s capabilities for desired processes. The tree of robots bridges the gap between morphology and a robot’s fitness for specific processes using process-centric fitness metrics.
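The morphological trunk described above can be pictured as a small record type. The following is only an illustrative sketch: the rank names and example values are taken from the text, whereas the class name `Morphology`, its field names and the `path` helper are hypothetical and not part of the published tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morphology:
    """Morphological ranks of the tree trunk (hypothetical field names)."""
    domain: str
    kingdom: str
    phylum: str
    robot_class: str  # "class" is a reserved word in Python
    order: str
    family: str

    def path(self) -> str:
        """Return the ranks as a root-to-family path."""
        ranks = (self.domain, self.kingdom, self.phylum,
                 self.robot_class, self.order, self.family)
        return " > ".join(ranks)


# Example values as given in the text for the robots studied here.
serial_manipulator = Morphology(
    domain="robots",
    kingdom="macro-/mesoscale",
    phylum="inorganic structures",
    robot_class="mechatronic",
    order="fixed base",
    family="serial manipulators",
)
print(serial_manipulator.path())
```

Because the record is frozen, changing any rank means creating a new instance, which mirrors the statement that alterations in the system design result in a new embodiment.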

Fig. 1: The tree of robots.

A taxonomy consisting of the components based on system architecture (grey), embodiment performance (green) and process (blue). a, The robot-fitness-based classification: (i) the fitness metrics that define the green spider plots, namely, the robot fitness spectrum; (ii) the species of an embodiment defined by the overall fitness score plotted over motion and tactility fitness; (iii) genus groups clustering the robot species into process-based fitness categories. b, Hierarchical morphology representation. c, Requirements for process automation that are transferred into the required robot fitness metrics by (i) identification of domain-specific processes; (ii) deduction of the required basic operations; (iii) quality metrics describing the process that define the required embodiment fitness metrics.

We evaluate a system’s fitness for a specific domain, independent of its morphological features (Fig. 1a). This involves a structured process analysis as the basic requirements (Fig. 1c) by (1) extracting processes, (2) deducing basic operations and (3) defining process quality metrics. These metrics are transformed into relevant robot fitness metrics, such as tactility and motion fitness. Specific measurement procedures and reference benchmark setups for these metrics (Fig. 1a(i)) ensure reproducible evaluation. This creates a unique performance spectrum, mapping the embodiment into the fitness space and defining its fitness scores as the species (Fig. 1a(ii)). Assessing significant fitness differences and indifference classifies robots into genus groups (Fig. 1a(iii)).

For the industrial task domain, historically the best studied, we systematically defined metrics for tactility and motion fitness and evaluated existing serial manipulators. A publicly available software tool provided with this paper accesses the tree of robots database for graphical display, allowing fitness classification testing, comparison and user contributions38. Adding new robot systems to the database updates the displayed fitness, shapes and clusters. As the process and metrics database grows, the tree of robots offers a comprehensive fitness overview across automation domains. This work forms the basis for systematizing, categorizing and quantifying robot capabilities through inherent fitness.

Process analysis and robot fitness metrics

A key challenge in establishing relevant robot fitness metrics is identifying proven solutions for industrial manufacturing processes as a foundational process database. Human vocational training curricula, such as those provided by the Chambers of Commerce and Trade (IHK) for mechatronics apprenticeships, offer an ideal source. The process database (Fig. 2a) was assembled using this curriculum, enhanced by reviewing industrial processes and analysing robot application videos, available surveys and research application literature. The full list of analysed videos and literature is available in Supplementary Tables 2–10. The final process list consists of 17 identified processes clustered into five process groups, namely, (1) assembly/disassembly, (2) dispensing, (3) welding and soldering, (4) handling and (5) processing. From the application videos, we derived the basic operations and general necessities for every process and summarized them into eight basic operations (go to pose, follow path, establish contact, follow contour, apply material, manipulate workpiece, sense force and apply force) and three necessities (efficiency, safety and usability), as listed in Fig. 2b.

Fig. 2: Industrial process analysis results and robot fitness metrics on the example of a painting process.

a, Identified relevant industrial processes and groups. b, Derived basic operations. c, Quality metrics.

Every basic operation considers features such as the desired position, motion pattern, or contact force, which need to be fulfilled with specific quality metrics (Fig. 2b). These metrics include constraints on motion (displacement tolerance and path tolerance), pure interaction forces (force estimation error and force tolerance), and combinations of motion and interaction forces (disturbance force tolerance, dispensing flow error tolerance, process cycle time, fault reaction quality, manual manoeuvrability and fitting tolerance), as shown in Fig. 2c. These fundamental process quality metrics establish the requirements for successful process execution and form the basis for robot fitness metrics (Fig. 3). They are divided into two main categories: motion fitness metrics, defined by previous standardization work32, and tactile fitness metrics, which we introduce here to complete a meaningful, process-based set of robot fitness metrics.

Fig. 3: Robot fitness metrics for tactility and motion resulting from the industrial process analysis.

The light-grey metrics are the newly established metrics for robot tactility clustered by force sensing, human safety, force control, manual manoeuvrability and force reaction. The dark-grey area indicates the existing motion fitness metrics. MAF, minimum applicable force.

In summary, the main groups of tactile fitness metrics described in more detail in the Methods are as follows.

  1. Force sensing metrics measure the system’s capability to sense forces accurately (AF) and precisely (PF), with a certain resolution (RSF) and consistently over time (SD1–4), using the maximum force deviations from a reference mass attached to the robot end-effector.

  2. Human safety metrics measure the physical contact safety concerning the current state of safety regulations, and these provide references for acceptable safety thresholds in collision scenarios based on the contact duration. Therefore, using a standardized test device, we measure conformance to transient (St) and quasi-static contact force thresholds (Sqs) according to another work39.

  3. Force control metrics measure the quality of controlling the end-effector interaction forces, which is essential for manipulating objects and interacting with the environment. They quantify the accuracy (AcF) of the applied forces compared with a desired force, the applied force precision (PcF), resolution (RScF), settling time (TS), overshoot forces (OV), bandwidth the force control can achieve (Bc), minimal force that the system can apply (MAF) and the stability under changing environmental conditions such as materials (MVC) and impact forces (IS).

  4. Manual manoeuvrability metrics define, measure and rate the quality and smoothness of the hand-guiding functionality by using the metrics of minimum motion force (MF), guiding force (GF), guiding force deviation (GD), required guiding energy (GE) and manoeuvre effort (ME).

  5. Force reaction metrics measure the robot’s capability to detect and react to contacts and its contact sensitivity. They quantify the force-sensitive detection of obstacles or contacts that can be desired or accidental during motion and differentiate between heavy (CS) and light (CSt) contacts.

Extended Data Fig. 2 shows a comprehensive overview of the derived basic operations/necessities, process quality metrics, and the relation and definition of the robot fitness metrics. The tree of robots online tool integrates all this knowledge into a comprehensive process database. This database allows for viewing and extending these metrics and their interconnections.

Tactile fitness atlas

The developed metrics were applied to 11 state-of-the-art serial manipulators using the testing procedures described in the Methods: Franka Emika robot (FE), Franka Research 3 (FR3)40, LBR iiwa 14 R820 (LBR)41, Universal Robots UR5e (UR5e), Universal Robots UR10e (UR10e), Yuanda Yu+ (Yu+), Kinova Gen3 (Gen3), Yaskawa HC10 (HC10), ABB GoFa (GoFa), Techman TM5-700 (TM5) and Doosan M0609 (M0609). Details of the manipulators are listed in Supplementary Table 30 and the results for all the robots are listed in Extended Data Tables 2–6. For comparison, example industrial manipulators, namely, Kuka KR 120, Agilus 4 and DR4020C-AX0.5, were added to the representation later on. Five newly developed reference benchmark setups and 25 experimental procedures yielded ~3,000 datasets from single experiments, which are available on the accompanying GitLab repository38. As a result, we present the fitness spectrum for each tested robot in Fig. 4, the first robot tactile fitness atlas, which defines the robot genus of serial manipulators based on the physical interaction fitness.

Fig. 4: Tactile fitness spectra of state-of-the-art commercial manipulators.

a, Spectral structure and detailed results, using the FR3 robot as an example. NA, not available. b, Tactile fitness spectra and individual pictures of all the other tested manipulators of the referred robot types.

In Fig. 4, the outer circle represents a theoretical optimum for each metric result. The inner circle represents the data-based worst-case results rationalized in the Methods. If the metric results are displayed at the origin of the radar chart, either the fitness measured is beyond the worst-case assumption or the metric was not available. Such out-of-range results have been excluded from the following overview. We obtained the force sensing fitness results between 0.81 N (FE) and 2.73 N (Yu+). The resolution of the robots’ force measurements lies between 0.06 N (LBR) and 4.23 N (Yu+). The force sensing precision ranges between 0.07 N (M0609) and 0.52 N (UR5e). For the sensing drift over 1 min, the results range from 0.02 N (FE, Gen3, FR3 and M0609) to 0.24 N (UR5e). Over 10 min, the measured drift is between 0.02 N (FR3) and 0.25 N (Yu+). For a duration of 1 h, the drift ranges from 0.03 N (FE) to 0.52 N (Gen3). Last, the robot force sensing drift results over 8 h are between 0.05 N (FR3) and 0.43 N (Yu+).
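The reading rule for the radar charts (outer circle as optimum, inner circle as worst case, origin for out-of-range or unavailable results) can be sketched as a small mapping function. This is an assumption-laden illustration: the linear interpolation, the function name `radar_radius` and the inner radius of 0.2 are hypothetical choices, and the actual worst-case bounds are those rationalized in the Methods, not the placeholder values used in the usage note below.

```python
def radar_radius(value, optimum, worst, inner=0.2, outer=1.0):
    """Map one metric result to a plot radius.

    ASSUMPTION: linear interpolation between the worst-case bound (inner
    circle) and the theoretical optimum (outer circle). `optimum` and
    `worst` must differ. Works for lower-is-better and higher-is-better
    metrics alike.
    """
    if value is None:  # metric not available
        return 0.0
    low, high = sorted((optimum, worst))
    if not low <= value <= high:  # beyond the worst-case assumption
        return 0.0
    fraction = (worst - value) / (worst - optimum)  # 0 at worst, 1 at optimum
    return inner + fraction * (outer - inner)
```

For instance, with a placeholder worst-case bound of 5 N and an optimum of 0 N, `radar_radius(2.5, 0.0, 5.0)` lands halfway between the two circles, whereas `radar_radius(6.0, 0.0, 5.0)` and `radar_radius(None, 0.0, 5.0)` both collapse to the origin.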

Seven of our manipulators provide state-of-the-art force control fitness, two of which have strong outliers. The accuracy of the remaining manipulator force controllers ranges between 0.55 N (FE) and 1.20 N (LBR), with resolution measurements being 0.16 N (FR3) to 0.35 N (M0609). The precision is scored between 0.06 N (FR3) and 0.31 N (M0609). The minimum applicable force ranges between 0.57 N (FR3) and 3.94 N (UR5e). The controller bandwidth could solely be measured for FE and FR3 and ranged between 10 Hz (FR3) and 13 Hz (FE). Force overshoot ranges between 1.75 N (FE) and 17.55 N (GoFa), and settling times are measured between 0.45 s (FR3) and 3.72 s (M0609). The material variation consistency is available for five robots, and the impact stability of the three tested robots ranges between 40% (M0609 and LBR) and 100% (FR3 and FE).

Robot force reaction capabilities vary between 3% (UR10e) and 100% (FE, LBR, FR3, M0609 and GoFa) considering impacts with speeds above 250 mm s–1 and between 0% (UR10e, UR5e, TM5 and HC10) and 100% (FR3) considering impacts with contact speeds of 50–250 mm s–1. The overview of human safety fitness shows that transient force limits for collisions with different body parts and contact speeds of 50–540 mm s–1 can be adhered to in 22.5% (HC10) to 83.75% (FR3) of cases. For quasi-static contacts, the adherence to safety thresholds lies between 46.25% (HC10) and 100% (FR3 and FE).

Last, for manual manoeuvrability, the minimum motion force required to manually guide a robot end-effector for all systems is between 0.94 N (LBR) and 4.14 N (UR10e and TM5). Guiding force and guiding force deviation range from 4.38 N (LBR) to 26.28 N (UR10e) and from 2.17 N (FE) to 17.71 N (UR10e). The guiding energy for a sideways end-effector motion of 200 mm ranges from 2.13 J (FE) to 9.18 J (UR10e). The manoeuvre effort to accelerate the system from 0 to 250 mm s–1 is between 0.56 J (FE) and 1.96 J (UR10e).

Overall, we observe that the metric results of different manipulators can be at least twice as high compared with the best-measured fitness, for example, force sensing accuracy, but can, in some cases, even be 10 times or 20 times as high, for example, for robot force overshoot or sensing drift. When considering the outlying results, the largest ratio between the two measured robot fitness results is 1,802 (between FR3 and UR10e in sensing drift fitness over 1 h). These significant differences necessitate a differentiation between systems based on their tactility fitness.
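The quoted factors can be checked against the in-range results reported above with a few lines of arithmetic. The sketch below copies the best and worst values from the preceding paragraphs; treating the first reported force sensing range (0.81 N to 2.73 N) as the accuracy metric is an assumption, and the dictionary and function names are illustrative.

```python
# (best, worst) in-range results in newtons, copied from the ranges above.
# ASSUMPTION: the first force sensing range reported is the accuracy metric.
ranges = {
    "force sensing accuracy": (0.81, 2.73),
    "force overshoot": (1.75, 17.55),
    "sensing drift, 1 min": (0.02, 0.24),
    "sensing drift, 1 h": (0.03, 0.52),
}


def worst_to_best_ratio(best, worst):
    """How many times larger the worst in-range result is than the best."""
    return worst / best


ratios = {name: worst_to_best_ratio(*pair) for name, pair in ranges.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

These in-range ratios span roughly 3 to 17, which is in line with the factors of 2, 10 and 20 mentioned in the text; the ratio of 1,802 arises only when the excluded out-of-range results are included.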

Fitness classification

We use the tactile fitness results in combination with motion fitness from the manufacturers’ information listed in their respective datasheets and linked in Supplementary Table 30 to classify our tested manipulators based on fitness. Despite motion performance metrics being defined by EN ISO 9283:1998, most manufacturers only provide pose repeatability. Thus, only the tactility metrics currently serve as a valid basis for statistical classification. To perform automated, robust classification on our dataset, which can be extended with more data, we assume (1) all metrics are of equal interest and weighted equally and (2) fitness metrics form a tactility feature vector representing the intrarobot fitness distribution. The boxplots in Extended Data Fig. 1 show that the features are not normally distributed within a robot for our dataset. Therefore, we use a non-parametric Mann–Whitney–Wilcoxon significance test to determine significant differences between the tactility fitness of each pair of robots. We aim to group similar robots based on U statistics in Extended Data Fig. 1a. We observe no significant differences in tactility fitness between FE and FR3 (U(FE, FR3) = 0.849), between LBR and M0609 (U(LBR, M0609) = 0.258), between LBR and GoFa (U(LBR, GoFa) = 0.278), and between GoFa and M0609 (U(GoFa, M0609) = 0.776). UR5e clusters with UR10e (U(UR5e, UR10e) = 0.409), Yu+ (U(UR5e, Yu+) = 0.385) and HC10 (U(UR5e, HC10) = 0.248). Yu+ and Gen3 (U(Yu+, Gen3) = 0.109) form another cluster, including TM5 and HC10 (U(Gen3, HC10) = 0.206 and U(Gen3, TM5) = 0.446).
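A pairwise Mann–Whitney–Wilcoxon comparison of tactility feature vectors could look roughly like the sketch below. It is not the authors' implementation: it uses a plain normal approximation with tie correction and no continuity correction, the 0.05 threshold is an assumed significance level, and the three feature vectors are made-up placeholders rather than measured results.

```python
import math
from itertools import combinations


def midranks(values):
    """Rank values from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U of x, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    u1 = sum(midranks(pooled)[:n1]) - n1 * (n1 + 1) / 2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # all observations identical
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical normalized feature vectors, NOT measured data.
vectors = {
    "robot_a": [0.90, 0.80, 0.85, 0.95, 0.70],
    "robot_b": [0.88, 0.82, 0.81, 0.91, 0.75],
    "robot_c": [0.20, 0.10, 0.30, 0.25, 0.15],
}
ALPHA = 0.05  # assumed significance level
for a, b in combinations(vectors, 2):
    u, p = mann_whitney(vectors[a], vectors[b])
    verdict = "different" if p < ALPHA else "no significant difference"
    print(f"{a} vs {b}: U={u:.1f}, p={p:.3f} ({verdict})")
```

Pairs without a significant difference are the candidates for a common group; for small samples an exact test would be preferable to the normal approximation used here.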

On the basis of the groups identified by the significance analysis, we used the expectation-maximization method to classify robots into four distinct groups based on the initial mean values of tactility and motion fitness and an initial covariance. The four groups are (1) industrial manipulators ([0, 0.29]), (2) cobots ([1.74, 0.15]), (3) soft robots ([3.42, 0.09]) and (4) tactile robots ([4.48, 0.04]), as shown in Extended Data Fig. 1b. Supplementary Section 3 provides the full algorithm details. We then created branches within the serial manipulators’ family, exemplified for TM5, GoFa and FE, based on physical and software characteristics. These include electromechanical actuation (for example, brushless direct-current motors), sensory architectures (for example, joint torque sensing in LBR and FE; wrist-wrench sensors in UR5e), mechanical structures (six- or seven-degrees-of-freedom exo-tube/exo-shell designs) and controller architecture (impedance, force or unified force impedance control13). Although we chose examples in which the controller is specified, information on control architectures is usually proprietary. The graph shown in Fig. 5 finally illustrates the relationship between the robot morphological features and their fitness, which can be further extended and created using the provided tool and code base38.
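To make the role of the initial means concrete, the sketch below evaluates a single expectation step of a Gaussian mixture initialized with the four group means quoted above. It is a simplified illustration only: the shared diagonal covariance (`ASSUMED_STD`), the equal mixing weights and the function names are assumptions, and the full algorithm, including the maximization step, is the one described in Supplementary Section 3.

```python
import math

# Initial group means, with coordinates in the order given in the text.
INITIAL_MEANS = {
    "industrial manipulators": (0.0, 0.29),
    "cobots": (1.74, 0.15),
    "soft robots": (3.42, 0.09),
    "tactile robots": (4.48, 0.04),
}
# ASSUMPTION: placeholder per-axis standard deviations shared by all groups.
ASSUMED_STD = (0.5, 0.05)


def log_gaussian(point, mean, std):
    """Log-density of an axis-aligned (diagonal covariance) Gaussian."""
    return sum(
        -0.5 * ((p - m) / s) ** 2 - math.log(s * math.sqrt(2 * math.pi))
        for p, m, s in zip(point, mean, std)
    )


def responsibilities(point, means=INITIAL_MEANS, std=ASSUMED_STD):
    """E-step with equal mixing weights: posterior probability per group."""
    logs = {group: log_gaussian(point, mean, std) for group, mean in means.items()}
    peak = max(logs.values())  # subtract the maximum for numerical stability
    weights = {group: math.exp(value - peak) for group, value in logs.items()}
    total = sum(weights.values())
    return {group: weight / total for group, weight in weights.items()}


def classify(point):
    """Assign the point to the group with the highest responsibility."""
    resp = responsibilities(point)
    return max(resp, key=resp.get)


print(classify((4.4, 0.05)))
```

In a complete expectation-maximization loop, these responsibilities would then be used to re-estimate the means, covariances and mixing weights until the assignments stop changing.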

Fig. 5: Results of the fitness-based robot classification.

Final integration of all the robots with their morphological data into the tree of robots with selected robots as examples. ufic, unified force impedance control; IMU, inertial measurement unit. This display can be generated by the user tool for any tested robot at treeofrobots.com.

Discussion

In this study, we introduce the tree of robots, a taxonomy to evaluate the fitness of artificial embodiments for desired processes, focusing on physical interaction. Each embodiment, generated from its unique combination of cyber-physical components, differs in how well it fits various processes. Fitness measurement follows a reproducible pipeline, from domain knowledge (for example, higher education materials) to process analysis and extraction of basic operations and relevant process quality metrics. Our initial analysis identified 17 processes, 8 basic operations and 3 necessities for physical interaction, deducing 11 process quality metrics and establishing 25 tactility metrics. These metrics were measured for 11 state-of-the-art serial manipulators using standardized test stands and procedures, revealing significant differences in tactility fitness and classifying them into 4 categories (genera). An online tool allows access to the tree of robots database, enabling interaction and classification updates with new metrics. Overall, this work allows progress in robot capability and achievable fitness to be measured in a structured way and this development to be steered more systematically. By organizing robot morphology and capability in a reusable, modular and extendable manner, our work allows for assembling an ever-growing taxonomic atlas. It offers researchers worldwide a collaboration tool that can be accessed online and is progressively advanced at treeofrobots.com.

We present results for a single robot of a manipulator type (for example, LBR (serial no. 984012)) at the central point of the ISO 9283:1998 reference cube for robot motion performance testing using metrics derived from industrial processes. This does not yet cover the distribution of robot fitness within the same type or device. Manufacturers and researchers that adopt these metrics could use them to improve the system information and verify the production quality as well as the objectivity of capabilities. Detailed information on morphological features, such as actuation power or sensor precision, is not included in the current tree of robots. Future analyses can extend the tree with sub-branches for specific design aspects.

The proposed system can extend the tree of robots to other application domains. Although we focused on physical interactions and introduced metrics like contact sensitivity, force sensing accuracy or minimum applicable force, additional metrics will certainly emerge. For example, energy-related metrics, contact reflexes or workspace-level dexterity can be added to the tree of robots. It is also evident that additional performance metrics axes will emerge, particularly when dealing with close human–robot interaction. Take, for example, the care sector. Using a reduced sample of applications, such as shaving, cream application and ultrasonic gel application, we already performed a structured metric development pipeline for these applications. It becomes evident that beyond mere motion and tactility requirements, these robots need to be fit for exteroceptive perception tasks and non-tactile interaction, including natural communication skills, allowing the fluency of processes. The importance of fluency for ensuring efficiency is currently measured in proposed fluency metrics42. These metrics encompass a combination of all our proposed relevant physical interaction metrics and expand them to include perception and interaction metrics, crucial for smooth transitions between actions with other entities and within their own operations. Additionally, great strength may be required for lifting persons. We further elaborate on these developments in Extended Data Fig. 3.

Finally, the current tree of robots has not yet considered more complex fitness metrics that cover dynamic interaction processes and skills, such as manipulation, locomotion or interaction. Their incorporation needs further research on future conceptual, algorithmic and experimental levels. The DIN EN ISO 9283:1998 standard outlines various measures used to evaluate the performance of robotic movements. These measures include how accurately a robot can achieve a specific pose or follow a given path, the extent of any drift from the intended pose, any excess movement beyond the target position, the time it takes for the robot to stabilize and the characteristics of the robot’s speed along a path. However, industrial manufacturers have only provided pose repeatability so far. Fully assembling this ever-growing tree of robots requires global collaborative efforts to integrate all past, present and future robot developments into the framework.

Methods

In the following, we give insights into the process isolation procedure by surveying and summarize the metric definitions, reference benchmark setups and measurement procedures. Supplementary Section 1 provides a detailed breakdown of the process analysis, and Supplementary Section 2 provides more detailed instructions on the metric measurement procedures.

Process isolation by surveying

An extensive literature study, market research and task analysis were conducted to analyse and formalize the industrial manufacturing process. We synthesized reports on industrial robotics use cases reported by the International Federation of Robotics and market analysts like McKinsey & Company, among others. We analysed the standardized industrial process curricula from the German Chamber of Commerce and Trade that are taught at all professional vocational schools to industrial apprentices nationwide. These harmonized curricula provide descriptions of groups of all the relevant single processes such as screwdriving or assembly. Still, even this information is incomplete as it fails to provide single, machine-interpretable basic operations. Thus, to deduce these basic operations, we analysed a large set of research use cases from EU Horizon 2020 projects (Extended Data Table 1), trade show demonstrations, factory implementations and expert assessments, collected through video analysis of approximately 870 min of use-case video footage. Supplementary Tables 2–10 provide an additional summary of the isolated use cases that are integrated into the tree of robots website and included in the collection. For the video analysis, we referred to the available videos of trade shows, real use cases or advertised use cases on the official online video platforms of well-known robot and automation system manufacturers, including Kuka AG, Universal Robots and FANUC. We included all applications that (1) are in the industrial sector and (2) are standard industrial procedures, require human interaction, involve sensitive objects or must be flexibly programmed. For our domain of industrial manufacturing, this structured process led to the main process groups, also referred to in the World Robotics Reports43,44 and the German apprenticeship programs, of assembly and disassembly, dispensing, welding and soldering, handling and processing44.
The list of collected industrial robot processes was clustered (Fig. 2).

Tactile metrics definitions

The following paragraphs summarize the reproducible definitions, the experimental setups depicted in Fig. 6 and the measurement procedures used to obtain the developed tactility fitness metrics from Fig. 2. For each metric group, the reference system for accurately and objectively measuring every metric is first described, followed by the metric definition and the detailed experiment and evaluation protocol. For detailed graphical measurement procedure descriptions and more detailed information regarding the reference benchmark setups, please refer to Supplementary Section 2. In all the following definitions, force vectors refer to forces in Cartesian x, y and z directions, \(\bar{\mathbf{F}}\) is the robot force measurement, Fs is the force measured by an external force–torque sensor (K6D80, ME-Meßsysteme), Fr is the external reference force, Fd,z is the desired commanded force in the z direction, t denotes the measurement duration and N marks the number of experiment repetitions.

Fig. 6: Metric test setups.

a–e, Schematics and photographs of the reference benchmark setups used in the conducted tests with different manipulators for force sensing metrics (a), force control metrics (b), force reaction metrics (c), manual manoeuvrability metrics (d) and human safety metrics (e).

All the robot metric results were derived using individual robot systems available at the laboratory and shown in Fig. 4. Only the proprietary robot controllers were used, accessed via the standard user interface unless indicated otherwise. The test position, the ambient temperature of 20 ± 2 °C, the humidity of 40 ± 5% and other relevant ambient conditions are in line with the EN ISO 9283:1998 standard32. Specifically, the test location is chosen at the centre of the robot reference cube.

Force sensing fitness metrics

We quantify the robot’s internal force sensing fitness in terms of the sensing accuracy, resolution, precision and sensing deviation over time.

Reference systems: all the force sensing metrics are measured using a normalized weight with mass mL of 800 g, which is attached to the robot flange via a steel hook with mass mH of 182 g (Fig. 6a).

Force sensing accuracy, resolution and precision: these metrics quantify the robot’s capability to sense external forces on the end-effector, which is fundamental for all physical interaction. Accuracy compares the actual force with the measured one, resolution describes the measurement fluctuation within one measurement and precision defines the repeatability of the measurement among repeated trials. The experimental procedure is as follows. First, the hook is attached to the robot flange, and mH is compensated for by using the corresponding robot settings. Then, for existing systems, the load mL is fixed to the hook, and after a settling time of 1 s, the external force at the robot’s end-effector is recorded at the maximum frequency that the robot allows (aiming for 1 kHz); the recording finishes after t = 3 s. The weight is then removed, and the robot remains without a load for a minimum of 3 s. This procedure is repeated N = 30 times, as suggested by the ISO 9283:1998 standard32, to define a statistically relevant precision metric.
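The evaluation of the recorded trials can be sketched as follows. The exact mathematical definitions are given in the Supplementary Information and are not reproduced here, so this sketch rests on assumptions: accuracy is taken as the absolute difference between the mean sensed force and the reference force Fr = mL·g, resolution as the mean within-trial standard deviation, and precision as the standard deviation of the trial means. The function name and the value of g are illustrative.

```python
import statistics

G = 9.81  # m/s^2, assumed gravitational acceleration


def sensing_metrics(trials, load_mass_kg=0.8):
    """trials: list of N recordings, each a list of sensed z-forces in newtons.

    Returns (accuracy, resolution, precision) under the assumed definitions
    stated in the lead-in, not the definitions of the Supplementary tables.
    """
    f_ref = load_mass_kg * G  # reference force of the attached load
    trial_means = [statistics.fmean(t) for t in trials]
    # Accuracy: offset between the overall mean reading and the reference force.
    accuracy = abs(statistics.fmean(trial_means) - f_ref)
    # Resolution: average fluctuation within a single recording.
    resolution = statistics.fmean([statistics.pstdev(t) for t in trials])
    # Precision: repeatability of the mean reading across trials.
    precision = statistics.stdev(trial_means)
    return accuracy, resolution, precision
```

With N = 30 recordings of 3 s each, `trials` would hold 30 lists of force samples.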

Force sensing drift: to ensure reliable task execution in physical contact situations, the robot force sensing quality must be time independent. Thus, we measure the robot force sensing drift to understand the changes in force sensing that may occur between the typical cycle time of 1 s and (1) a typical task duration of 1 min, (2) a repetition of ten task executions after 10 min, (3) 90 min and (4) one workday (8 h). To measure the force sensing drift, the robot is first turned off for 6 h. The hook is attached, and the robot end-effector load mH is set. The robot force measurement is started at most 10 min after switching on the controller box. Then, the load mL is fixed to the hook. After a set time tset = 1 s, the robot end-effector force recording is started at 1 Hz measurement frequency and finishes after 8.5 h. The sensing drift metrics are defined in Supplementary Table 14. Whereas SD1 and SD2 address task-relevant information for a short loading duration of the robot, SD3 and SD4 are concerned with long-term effects of the robot load on the sensed external force. In particular, SD4 measures robot force sensing quality over an entire workday.
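A minimal sketch of how the four drift values could be read from the 1 Hz recording is given below. The actual definitions are those of Supplementary Table 14. Here we assume that each drift value is the absolute difference between the first sample (taken 1 s after loading) and the sample at the respective horizon, and that sample index i corresponds to i seconds after the start of the recording.

```python
# Assumed horizons in seconds: 1 min, 10 min, 90 min and 8 h.
HORIZONS_S = {"SD1": 60, "SD2": 600, "SD3": 5400, "SD4": 28800}


def sensing_drift(forces_1hz):
    """forces_1hz: sensed z-force samples recorded at 1 Hz, in newtons.

    Returns the assumed drift per horizon; horizons beyond the end of the
    recording are skipped.
    """
    f0 = forces_1hz[0]  # reference reading at the start of the recording
    return {
        name: abs(forces_1hz[t] - f0)
        for name, t in HORIZONS_S.items()
        if t < len(forces_1hz)
    }
```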

Force control fitness metrics

Nine metrics measure the tactile controller fitness: force accuracy, force resolution, force precision, step response force settling time, step response force overshoot, minimum applicable force, material variation consistency, impact stability and force control bandwidth against the ground. All the mathematical definitions are listed in Supplementary Table 17.

Reference system: to establish a meaningful reference setup for describing the robot force control quality, we refer to our use-case analysis for the choice of contacting materials. This analysis includes multiple material pairs, such as wood and stainless steel, stainless steel on steel, aluminium/copper, flexible materials like cables or food bags, or even stainless steel with glass. However, the most prominent use cases are those in which Duroplast workpieces are handled, such as printed circuit boards, transport boxes or screen testing. Consequently, we chose a contact material pairing of stainless steel with hard plastics for our reference test rig.

The contact force is measured using a K6D80 sensor (ME-Meßsysteme) with a maximum measurement frequency of 300 Hz. The sensor is covered with high-density polyethylene with Hc ≈ 63–67 Shore hardness type D (ShD); the robot end-effector is a stainless steel sphere with Hs ≈ 250 Brinell hardness (HB) with 25-mm radius and total additional weight of ms = 0.2 kg (Fig. 6b, left).

Force control accuracy, resolution and precision: the force control accuracy, resolution and precision describe the robot’s capability to reliably and accurately apply forces to object surfaces. Using the apply force process, a contact with Fd,z = 8 N (ref. 35) is established for 5 s. After contact is made and a settling time of 1 s has passed, the sensor captures the applied force for Δt = 3 s. This is repeated N = 30 times, as suggested by the ISO 9283:1998 standard32, to define a statistically relevant precision metric.

Force settling time and overshoot: the force settling time and overshoot describe the robot’s capability to control the contact force when establishing contact with an object’s surface. The force overshoot metric indicates whether a robot can handle certain fragile materials. To measure the overshoot and settling time of the force applied by the robot controller, the desired force is set via the apply force and move to contact processes (if available) as Fd,z = 8 N applied for t = 20 s. The sphere is placed just above the sensor plate at ds ≈ 0.5 mm. First, the measurement of the force sensor is started. Then, the robot is commanded to establish contact and to keep the contact force for the desired duration of t = 10 s. After the sensor finishes the measurement, the contact can be released again. The procedure is repeated N = 3 times, as suggested by the ISO 9283:1998 standard32 for position stabilization and overshoot.
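The step response evaluation can be illustrated with the following sketch. The definitions used in this work are listed in Supplementary Table 17. The version below uses common control-engineering conventions as assumptions: overshoot is the peak excess over the desired force in percent, and settling time is the time after which the force stays within a tolerance band around the desired force. The 5% band and the choice of the first sample as the contact instant are illustrative.

```python
def step_response_metrics(times, forces, f_desired=8.0, band=0.05):
    """times: sample times in s (first sample assumed at contact onset).
    forces: externally measured contact forces in newtons.

    Returns (overshoot in percent, settling time in s or None if the
    force never stays inside the band).
    """
    peak = max(forces)
    overshoot_pct = max(0.0, (peak - f_desired) / f_desired * 100.0)

    tol = band * f_desired
    last_outside = -1  # index of the last sample outside the band
    for i, f in enumerate(forces):
        if abs(f - f_desired) > tol:
            last_outside = i

    settling_time = None
    if last_outside + 1 < len(forces):
        settling_time = times[last_outside + 1] - times[0]
    return overshoot_pct, settling_time
```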

Minimum applicable force: to define which minimal forces can be used to establish contact with an object and, thus, how delicately a robot can contact an object that may be fragile, we use the robot apply force capability and search for its most sensitive settings. First, the robot apply force feature is used to contact the sensor plate with Fd,z = 0 N for 5 s. If no contact is established, Fd,z is increased in increments of ΔFd,z = 0.01 N. This procedure is repeated until contact is successfully established and measured by the sensor. After contact is made and a settling time of t = 1 s has passed, the sensor captures the applied force for Δt = 3 s. Using N = 3 measurements with the final setting for Fd,z, the minimum applicable force is determined via the external measurement Fs.
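The incremental search described above can be sketched as a simple loop. The callback `contact_established` is hypothetical: it stands for commanding the robot with the given Fd,z for 5 s and reporting whether the external sensor registered contact. The upper bound `f_max` is an assumed safety limit that is not part of the described procedure.

```python
def find_minimum_force_setting(contact_established, step=0.01, f_max=10.0):
    """Increase the commanded force from 0 N in fixed increments until the
    (hypothetical) callback reports contact.

    Returns the first commanded force that produced contact, or None if
    the assumed upper bound f_max is reached without contact.
    """
    n = 0
    while n * step <= f_max:
        # An integer step counter avoids accumulating floating-point error.
        f_cmd = round(n * step, 6)
        if contact_established(f_cmd):
            return f_cmd
        n += 1
    return None
```

The returned value is only the controller setting. The metric itself is then determined from N = 3 external measurements Fs taken at that setting.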

Force controller bandwidth: we apply the force controller bandwidth metric to understand how well the force control can react and adapt to new desired contact forces. We define the force controller bandwidth under contact with a rigid surface. If available, the robot research interfaces are used for this metric to define the system’s frequency response. Initially, the robot is brought into contact with the high-density polyethylene surface of the sensor. Then, the robot’s apply force process is used to produce a sinusoidal force profile with an amplitude of 8 N and the frequency ω starting from 0.5 rad s−1 to 66 rad s−1 increasing by a factor of 0.001. After recording the actual forces, the cut-off frequency is calculated, and the metric is defined.
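One way to obtain a cut-off frequency from the recorded sweep is sketched below. We assume the conventional −3 dB criterion, that is, the first frequency at which the measured force amplitude falls below 1/√2 of the commanded amplitude, with linear interpolation between neighbouring sweep points. The definition used for the metric itself is the one in Supplementary Table 17.

```python
import math


def cutoff_frequency(freqs, measured_amps, commanded_amp=8.0):
    """freqs: excitation frequencies in rad/s, ascending.
    measured_amps: measured force amplitude in newtons at each frequency.

    Returns the interpolated frequency at which the amplitude first drops
    below commanded_amp / sqrt(2), or None if it never does.
    """
    threshold = commanded_amp / math.sqrt(2.0)
    for i in range(1, len(freqs)):
        a0, a1 = measured_amps[i - 1], measured_amps[i]
        if a0 >= threshold > a1:
            # Linear interpolation between the two bracketing sweep points.
            frac = (a0 - threshold) / (a0 - a1)
            return freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
    return None
```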

Material variation consistency and impact stability: these two metrics measure the force control consistency and impact stability over material variation. We consider five different material types that are being contacted by the stainless steel sphere Hs ≈ 250 HB attached to the end-effector (Supplementary Table 19).

For the impact stability measurement, the robot is controlled to contact the material at two different velocities, first with low velocity (\(\dot{x}=50\,{{\rm{mm}}\,{\rm{s}}^{-1}}\)) and then at \(\dot{x}=250\,{{\rm{mm}}\,{\rm{s}}^{-1}}\). The contact velocities are verified using a photo-interrupter (precision light barrier LS 203.1 and speed counter MZ 373, Hentschel) and a 10-mm-thick interrupting structure attached to the robot end-effector. The contact forces are measured via the internal force measurement. First, we check whether the force settles within a 3% error bound around the overall force average within 3 s. Then, if this settled force measured by the robot internal sensing \({\bar{F}}_{{{\rm{z}}}}\) is within ±10% of the desired value, we accept the resulting force for benchmarking. The metric is then split into two parts, where (1) the number of materials for which the contact force settled within the defined boundaries ns,m at 50 mm s–1 and (2) the number of cases in which the contact under dynamic conditions resulted in acceptable forces ns,i are considered. Then, both numbers are compared with the number of experiments (N = 5).
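The two-stage acceptance check and the resulting ratio can be sketched as follows. Several details are assumptions: the force is treated as settled if every sample from 3 s onwards lies within ±3% of the overall average, and the settled force is taken as the mean of those samples. The function names are illustrative.

```python
import statistics


def contact_accepted(times, forces, f_desired, settle_by=3.0, band=0.03, tol=0.10):
    """times: sample times in s since contact; forces: internally sensed
    z-forces in newtons.

    Stage 1: the force must stay within +/-band of the overall average
    from settle_by onwards. Stage 2: the settled force must lie within
    +/-tol of the desired force.
    """
    avg = statistics.fmean(forces)
    tail = [f for t, f in zip(times, forces) if t >= settle_by]
    if not tail:
        return False
    if not all(abs(f - avg) <= band * abs(avg) for f in tail):
        return False
    return abs(statistics.fmean(tail) - f_desired) <= tol * abs(f_desired)


def stability_ratio(accepted_flags):
    """Share of experiments (for example, N = 5 materials) that were accepted."""
    return sum(accepted_flags) / len(accepted_flags)
```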

Force reaction fitness metrics

The contact sensitivity is derived as a metric to examine the robot force reactions. Contact detection may have two different goals: (1) preventing secondary hazards by stopping at contacts or (2) detecting objects to be handled. Consequently, both purposes need to be considered. We divide contact sensitivity into two scenarios: (1) robot motion without expected contact at higher velocities (CS) and (2) robot motion in a confined space at low velocity that aims for tactile search of an object (CSt), as defined in Supplementary Table 21.

Reference systems: for both metrics, unconstrained collision scenarios are established using the setup shown in Fig. 6c. It comprises a passive pendulum with adaptable effective masses and the contact stiffness of 70 Shore hardness type A. CS considers collisions with velocities of more than 250 mm s–1, whereas CSt considers collisions at low velocities too. The robot is programmed to move in a straight line towards the pendulum at the desired velocity \({\dot{{{\bf{x}}}}}_{{{\rm{d}}}}\) for both metrics. During the motion, the robot end-effector collides with the pendulum. The effective mass me is modified for different contact situations. We apply the most sensitive safety settings to the desired trajectory and observe whether the robot stopped on contact. Stopping was observed visually using a scale, and a braking distance of up to 10 mm behind the collision point was considered successful contact detection. For each contact, this procedure is repeated N = 3 times. Afterwards, the velocity is increased. When all the velocities are tested, the pendulum’s mass is increased until all conditions are examined. The exact setup and testing conditions are listed in Supplementary Table 22. The number of repeatable stops at each velocity–mass pair is determined, where we distinguish between the safe stopping behaviour with higher robot velocities nd and lower robot velocities nd,t. It is then related to the total number of contact situations, namely, the number of different masses (\({N}_{{{{\rm{m}}}}_{{{\rm{e}}}}}\)) and the number of tested robot velocities (\({N}_{\dot{{{\bf{x}}}}}\) and \({N}_{\dot{{{\bf{x}}}},t}\)). Overall, Nc = 35 and Nc,t = 25 contact scenarios are evaluated for the respective metrics.
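The ratio of detected contacts to tested contact situations can be computed as in the following sketch. We assume that a velocity–mass pair counts as a repeatable stop only if all N = 3 repetitions stopped within the 10 mm limit. The data layout and function name are illustrative, and the definitions used in this work are those of Supplementary Table 21.

```python
def contact_sensitivity(braking_distances_mm, max_distance_mm=10.0):
    """braking_distances_mm: dict mapping (effective_mass_kg, velocity_mm_s)
    to the list of observed braking distances of the repetitions, with
    None for a repetition in which the robot did not stop.

    Returns the share of velocity-mass pairs with a repeatable stop.
    """
    n_detected = sum(
        all(d is not None and d <= max_distance_mm for d in reps)
        for reps in braking_distances_mm.values()
    )
    return n_detected / len(braking_distances_mm)
```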

Manual manoeuvrability fitness metrics

The manual manoeuvrability metrics45 have one additional metric called the manoeuvre effort. They reflect three different phases felt by the user during hand guiding, which are defined as start, acceleration and steady motion. Two setups are required to measure the metrics (Supplementary Table 24).

Reference system: the reference system consists of a three-dimensional linear slider with a K6D40 force–torque sensor (ME-Meßsysteme) integrated via a CompactRio system (National Instruments). The sensor is attached to the top of the z axis of the gantry (Fig. 6d). The robot end-effector is attached to the sensor, and the proprioceptive hand-guiding function is activated. The linear axes guide the robot end-effector along a defined trajectory. During motion, the occurring interaction forces Fs are measured, where we seek to understand the required force magnitude in the direction of motion Fs,x.

Minimum motion force: the minimum motion force corresponds to the acceleration phase of the robot kinaesthetic guidance. First, the static friction of the robot joints is evaluated as a metric. This breakout force is described by force and displacement over time. To measure the minimum motion force, we first command the gantry to place the sensor in its initial measurement position. Then, we activate the robot’s proprioceptive guiding function and attach the robot end-effector to the sensor via an adaptor plate. For robots with redundant kinematic structures, the initial robot configuration is adjusted to the desired value or noted down to keep all the measurements comparable. Then, the force and position measurements are started using LabVIEW (v.2020 SP1), and the linear units are controlled to move with \(\ddot{{{\bf{x}}}}=1\,{{\rm{mm}}\,{\mathrm{s}}^{-2}}\) for x = 20 mm in the x direction and back. Once the initial position is reached, the force and position measurements stop. Afterwards, the initial robot configuration is checked, and the process is repeated N = 10 times.

Guiding force and guiding force deviation: the metrics guiding force and respective guiding force deviation quantify the force profile required from a human operator to guide the robot end-effector at constant speed along the desired path.

First, the robot proprioceptive hand-guiding function is activated using the free guidance feature to determine the guiding force and the guiding force deviation. Then, the sensor is placed in its initial measurement position, and the robot end-effector is attached to the sensor via an adaptor plate (Fig. 6e). Next, the force and position measurements are activated. The linear units are controlled to perform a motion with constant velocity \(\dot{x}=250\,{{\rm{mm}}\,{\mathrm{s}}^{-1}}\) (corresponding to the maximum guiding velocity suggested by the ISO/TS 15066:2016(E) standard39), acceleration \(\ddot{x}=1\,{{\rm{mm}}\,{\mathrm{s}}^{-2}}\) and x = 400 mm along the y axis of the gantry. Then, the trajectory is reversed. When the initial start position of the sensor is reached again, the force recording stops. The configuration is checked and corrected, and the measurement procedure is repeated N = 30 times.
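A minimal sketch of the evaluation of one recorded run is given below. It assumes that the guiding force is the mean magnitude of Fs,x during the constant-velocity phase and that the guiding force deviation is the corresponding standard deviation. The velocity tolerance used to select the steady phase is a hypothetical parameter.

```python
import statistics


def guiding_force(forces_x, velocities, v_target=250.0, v_tol=1.0):
    """forces_x: measured interaction forces in the direction of motion (N).
    velocities: gantry velocities in mm/s at the same sample instants.

    Returns (mean, standard deviation) of the force magnitude over the
    samples whose speed is within v_tol of v_target.
    """
    steady = [
        abs(f)
        for f, v in zip(forces_x, velocities)
        if abs(abs(v) - v_target) <= v_tol
    ]
    return statistics.fmean(steady), statistics.pstdev(steady)
```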

Manoeuvre effort: to quantify the effort required from a human operator to manually set the robot end-effector in motion, this metric measures the energy required to accelerate the robot end-effector from 0 to 250 mm s–1 (as the maximum guiding velocity suggested by the ISO/TS 15066:2016(E) standard39).

Kinaesthetic guidance energy: the kinaesthetic guidance energy metric quantifies the effort required from the human operator to move the robot end-effector from one side of the workspace to the other. It can be applied to workplace ergonomic evaluations.

For the kinaesthetic guidance energy metric, the gantry position shown in Fig. 6d and the same experimental procedure explained for the GF metrics are used. Unlike for the GF metric, we evaluate the entire motion, including the phase in which \(\ddot{x} > 0\,{{\rm{mm}}\,{\mathrm{s}}^{-2}}\).
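Both energy-type metrics can be illustrated by integrating the measured force over the travelled distance. The sketch assumes that the energy is the mechanical work, approximated with the trapezoidal rule, and that force magnitudes are used so that the forward and return paths both contribute positively. For the manoeuvre effort, only the samples up to the point where 250 mm s–1 is reached would be passed in; for the kinaesthetic guidance energy, the entire motion.

```python
def guidance_energy_joules(positions_mm, forces_n):
    """positions_mm: gantry positions along the motion axis in mm.
    forces_n: measured interaction forces in newtons at the same instants.

    Returns the trapezoidal approximation of the work in joules.
    """
    energy = 0.0
    for i in range(1, len(positions_mm)):
        dx_m = abs(positions_mm[i] - positions_mm[i - 1]) / 1000.0
        mean_force = 0.5 * (abs(forces_n[i]) + abs(forces_n[i - 1]))
        energy += mean_force * dx_m
    return energy
```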

Human safety fitness metrics

For the human safety fitness metric set, we consider the conformance of the resulting collision forces to the ISO/TS 15066:2016 standard. Here, transient and quasi-static contact forces are distinguished. Thus, we obtain two metrics: conformance to the ISO-constrained transient collision force St and conformance to the ISO-constrained quasi-static clamping force Sq.

To measure the constrained collision forces, we use the Pilz robot measurement system for biomechanical force and pressure measurements, which records the force in the z direction during robot contact for 1,000 ms, starting with the initial contact. Using the device, multiple spring stiffness and damping materials can be combined. The applied combinations refer to the ISO/TS 15066:2016 standard39 and are listed in Supplementary Table 28. The achieved number of safe transient contacts ns,t or quasi-static contacts ns,q is then divided by the number of conducted constrained collision tests Nc, based on the number of tested body-part representations Nb and robot contact velocities \({N}_{\dot{{{\rm{x}}}}}\). We obtain the human safety metrics as the percentage of tests that resulted in forces complying with the ISO/TS 15066 thresholds (Supplementary Table 27).
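The percentage can be computed as in the sketch below. The force limits are passed in as a parameter because the thresholds are those of the ISO/TS 15066 standard and are not reproduced here; the body-part labels and data layout are placeholders.

```python
def safety_conformance(peak_forces_n, limits_n):
    """peak_forces_n: dict mapping (body_part, velocity_mm_s) to the measured
    peak force in newtons.
    limits_n: dict mapping body_part to its permissible force in newtons
    (to be taken from the standard; not provided here).

    Returns the percentage of collision tests at or below the limit.
    """
    n_safe = sum(
        force <= limits_n[part]
        for (part, _velocity), force in peak_forces_n.items()
    )
    return 100.0 * n_safe / len(peak_forces_n)
```

Here, `len(peak_forces_n)` plays the role of Nc, the product of the number of body-part representations and the number of tested contact velocities.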

Finally, all the results are listed in Extended Data Tables 2–6.

The obtained datasets give the worst-case assumptions for the tactile fitness atlas. The mean and standard deviation are obtained for every absolute numeric metric, excluding obvious outliers. The worst-case assumption is derived as the sum of the mean and standard deviation, rounded to one or the next multiple of 5. All means and standard deviations are listed with the corresponding metrics in Extended Data Tables 2, 5 and 6.
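The derivation of a worst-case value can be sketched as follows. The rounding rule is interpreted here as an assumption: sums of at most 1 are rounded up to 1, and larger sums are rounded up to the next multiple of 5. Outlier exclusion is not shown and is assumed to have been done beforehand.

```python
import math
import statistics


def worst_case(values):
    """values: repeated measurements of one absolute numeric metric,
    with outliers already removed.

    Returns mean + standard deviation, rounded up under the assumed rule.
    """
    raw = statistics.fmean(values) + statistics.stdev(values)
    if raw <= 1.0:
        return 1.0
    return 5.0 * math.ceil(raw / 5.0)
```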