Fig. 1: Overview of the GRAMME conceptual framework and architecture.

From: Deep learning-based robust positioning for all-weather autonomous driving

a, The publicly available independent AV datasets are collected using multiple sensors such as camera, lidar and radar under diverse settings such as variable ambient illumination and precipitation. Example multimodal measurements from the RADIATE dataset (ref. 17) are shown to illustrate the data types and the degradation in sensor measurements caused by adverse conditions. b, Architecture overview for self-supervised estimation of scene geometry and ego-motion. The DepthNet and VisionNet modules predict the pixel-wise depth map of each camera frame and the ego-motion between consecutive camera frames, respectively. In parallel, the RangeNet and MaskNet modules operate on range sensors (that is, lidar and radar) to predict ego-motion and input masks, respectively. FusionNet takes the unaligned individual motion predictions as input and produces the final ego-motion estimate. Finally, the spatial transformer module uses the multimodal predictions to geometrically reconstruct the scene, creating a supervisory signal (L).
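To make the supervisory signal concrete, the sketch below shows a common way such a spatial-transformer reconstruction loss is built in self-supervised depth/ego-motion pipelines: predicted depth and relative pose are used to inverse-warp a source frame into the target view, and the photometric difference serves as the loss. This is a minimal, generic PyTorch illustration, not the authors' released code; the function and argument names (`warp_to_target`, `pose_tgt_to_src`, `reconstruction_loss`, the optional `mask` standing in for the MaskNet output) are hypothetical.

```python
import torch
import torch.nn.functional as F

def warp_to_target(src_img, depth_tgt, pose_tgt_to_src, K):
    """Inverse-warp a source frame into the target view.

    src_img:         (B, 3, H, W) source camera frame
    depth_tgt:       (B, 1, H, W) predicted target-frame depth (DepthNet)
    pose_tgt_to_src: (B, 4, 4) predicted relative motion (VisionNet/FusionNet)
    K:               (B, 3, 3) camera intrinsics
    """
    B, _, H, W = src_img.shape
    # Target pixel grid in homogeneous coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=src_img.dtype, device=src_img.device),
        torch.arange(W, dtype=src_img.dtype, device=src_img.device),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(1, 3, -1)

    # Back-project target pixels to 3D, then move them into the source frame.
    cam = torch.linalg.inv(K) @ pix                     # camera rays, (B, 3, H*W)
    cam = cam * depth_tgt.reshape(B, 1, -1)             # 3D points in target frame
    ones = torch.ones(B, 1, H * W, dtype=cam.dtype, device=cam.device)
    cam_src = (pose_tgt_to_src @ torch.cat([cam, ones], dim=1))[:, :3]

    # Project into the source image plane and normalize for grid_sample.
    proj = K @ cam_src
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)
    u = 2.0 * uv[:, 0] / (W - 1) - 1.0
    v = 2.0 * uv[:, 1] / (H - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).reshape(B, H, W, 2)
    return F.grid_sample(src_img, grid, padding_mode="border", align_corners=True)

def reconstruction_loss(tgt_img, src_img, depth_tgt, pose, K, mask=None):
    """Photometric supervisory signal L: compare the target frame with the
    source frame warped into the target view. `mask` stands in for a
    MaskNet-style weighting that downweights unreliable pixels."""
    warped = warp_to_target(src_img, depth_tgt, pose, K)
    err = (tgt_img - warped).abs()
    if mask is not None:
        err = err * mask
    return err.mean()
```

Because every step of the warp is differentiable, gradients from this loss flow back through the depth, pose and mask predictions, which is what lets the whole pipeline train without ground-truth labels.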