Fig. 2: VespAI model architecture and functionality. | Communications Biology


From: VespAI: a deep learning-based system for the detection of invasive hornets


a Illustration of the motion detection and video pre-filtering process used by ViBe (ref. 50). This ensures that the system remains passive until motion is detected and that only ‘hornet-sized’ objects, determined from a known reference range for each species (Fig. S1), are extracted from videos and passed on to the detection algorithm.

b Diagram detailing the algorithm for hornet detection, classification, and confidence assignment. This model is built on the YOLOv5s architecture, utilising a ResNet-50 (ref. 53) backbone with a PaNet (ref. 71) neck, and applies a single F-CNN to the whole image to rapidly detect and classify hornets. To optimise performance, the algorithm downscales images to a resolution of 640 × 640 and applies letterboxing during detection. Each detection is then reported as a class prediction and a confidence value between 0 and 1, attached to a bounding box that is projected back onto the original image, as detailed in the diagram.

c Examples of successful detections in a range of common scenarios, including target saturation and overlap, class co-occurrence, and the presence of non-target insects.

Dashed boxes denote the discrete modules: ViBe motion detection and background subtraction, YOLOv5s object detection and classification, and example outputs when these processes are combined.
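The letterboxing and back-projection steps described in panel b can be sketched in a few lines of Python. This is a minimal illustration rather than the VespAI or YOLOv5 code: the function names `letterbox_params` and `box_to_original` are hypothetical, the padding is assumed to be split evenly on both sides, and boxes are assumed to be `(x1, y1, x2, y2)` pixel coordinates in the 640 × 640 letterboxed frame.

```python
def letterbox_params(width, height, target=640):
    """Return (scale, pad_x, pad_y) for fitting an image into a square canvas.

    The image is scaled uniformly so its longer side equals `target`, then
    padded equally on both sides of the shorter dimension (assumption).
    """
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x = (target - new_w) / 2  # padding added to the left and right
    pad_y = (target - new_h) / 2  # padding added to the top and bottom
    return scale, pad_x, pad_y


def box_to_original(box, scale, pad_x, pad_y, width, height):
    """Map a box from letterboxed coordinates back to the original image."""
    x1, y1, x2, y2 = box

    def clamp(value, upper):
        # Keep the projected coordinate inside the original image bounds.
        return max(0.0, min(float(upper), value))

    return (
        clamp((x1 - pad_x) / scale, width),
        clamp((y1 - pad_y) / scale, height),
        clamp((x2 - pad_x) / scale, width),
        clamp((y2 - pad_y) / scale, height),
    )


if __name__ == "__main__":
    # Hypothetical 1920 x 1080 video frame and an example detection box.
    scale, pad_x, pad_y = letterbox_params(1920, 1080)
    print(scale, pad_x, pad_y)
    print(box_to_original((320, 320, 400, 380), scale, pad_x, pad_y, 1920, 1080))
```

Under these assumptions, a 1920 × 1080 frame is scaled by one third to 640 × 360 and padded with 140 pixels above and below, so the vertical padding has to be subtracted before a box is rescaled to the original resolution.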
