Extended Data Fig. 5: Ablations on components of the hybrid network.

(a-c) Comparison of network performance between two frames, on a subset of DSEC-Detection. (a) Performance of our method with ResNet-18, ResNet-34 and ResNet-50 backbones, with events (green) and without events (red). Methods without events propagate detections from the image at t = 0 to the current time. (b) Comparison of our method to Events+YOLOX34 (blue), a baseline that takes as input concatenated images and events up to time t. (c) Drop in mean average precision (mAP) over time for each method. (d-e) Comparison of network performance on the full DSEC-Detection test set. (d) Ablation on the fusion strategies between GNN-based detections from events and CNN-based detections from images. (e) Ablation on CNN pretraining and feature concatenation.