Fig. 5: Performance using fractions of the training data on the PKU-MMD dataset. a Cross-subject evaluation. b Cross-view evaluation. PKU-MMD: Peking University Multi-Modal Dataset.

We observe that our approach remains robust even when trained on only half of the training data. Remarkably, with just 10% of the training data, it still achieves 91.9% in the comparatively easy cross-view evaluation. In contrast, the performance of Beyond-Joints deteriorates as the training fraction shrinks. This experiment highlights the scalability of our approach, demonstrating its efficacy even when only a limited amount of labeled data is available. It also underscores the low intra-class variance among the action categories of the PKU-MMD dataset; a new, more challenging 3D action localization benchmark is therefore needed to drive future research in this domain.
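As a point of reference, such a data-fraction ablation is typically run by randomly subsampling the training split with a fixed seed before each training run. Below is a minimal sketch assuming a PyTorch-style dataset; subsample_fraction and the PKUMMDTrainSet placeholder are illustrative names only, not the authors' actual code.

    import random
    from torch.utils.data import Subset

    def subsample_fraction(dataset, fraction, seed=0):
        """Return a reproducible random subset holding `fraction` of the samples."""
        rng = random.Random(seed)  # fixed seed keeps the subset identical across runs
        n = max(1, int(len(dataset) * fraction))
        indices = rng.sample(range(len(dataset)), n)
        return Subset(dataset, indices)

    # Hypothetical usage: PKUMMDTrainSet stands in for the real dataset class.
    # train_set = subsample_fraction(PKUMMDTrainSet(split="cross-view"), fraction=0.10)

Fixing the seed ensures that all methods compared at a given fraction (e.g., 50% or 10%) see the same subset, so differences reflect the models rather than the sampling.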