Table 2. Comparison between the proposed HA4M dataset and existing vision-based datasets on assembly actions.
| Dataset | Visual Sensors | Environment | Data Modalities | Task |
|---|---|---|---|---|
| Assembly101 [41] | Eight RGB cameras mounted on a scaffold around a table and four monochrome cameras mounted on a headset | Laboratory | RGB frames, 3D hand poses | Assembly and disassembly of toy vehicles |
| Meccano [42] | One Intel RealSense SR300 camera mounted on a headset | Laboratory | RGB videos | Assembly of a toy motorbike |
| IKEA-ASM [43] | Three Microsoft Kinect v2 | Offices, labs and family homes | RGB videos, depth videos, 3D skeleton joints | Furniture assembly |
| HA4M | Microsoft Azure Kinect | Laboratory | RGB frames, depth maps, IR frames, RGB-depth-aligned frames, point clouds, skeleton data | Assembly of an EGT |