Table 5 Quantitative comparison of 3D object detection performance on the nuScenes validation set. ‘C’, ‘R’, and ‘L’ denote input from Camera, Radar, and LiDAR sensors, respectively.

From: FDSNet: dynamic multimodal fusion stage selection for autonomous driving via feature disagreement scoring

| Method | Input | NDS\(\uparrow\) | mAP\(\uparrow\) | mATE\(\downarrow\) | mASE\(\downarrow\) | mAOE\(\downarrow\) | mAVE\(\downarrow\) | mAAE\(\downarrow\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CenterFusion [62] | C+R | 45.3 | 33.2 | 0.649 | 0.263 | 0.535 | 0.540 | 0.142 |
| CRAFT [63] | C+R | 51.7 | 41.1 | 0.494 | 0.276 | 0.454 | 0.486 | 0.176 |
| RCBEVDet [16] | C+R | 56.3 | 45.3 | 0.492 | 0.269 | 0.449 | 0.230 | 0.188 |
| RCBEV4D [36] | C+R | 49.7 | 38.1 | 0.526 | 0.272 | 0.445 | 0.465 | 0.185 |
| CRN [34] | C+R | 54.3 | 44.8 | 0.518 | 0.283 | 0.552 | 0.279 | 0.180 |
| CR3DT [64] | C+R | 45.6 | 35.1 | - | - | - | 0.47 | - |
| BEVDet [65] | C | 39.2 | 31.2 | 0.691 | 0.272 | 0.523 | 0.909 | 0.247 |
| BEVDepth [66] | C | 47.5 | 35.1 | 0.639 | 0.267 | 0.479 | 0.428 | 0.198 |
| SOLOFusion [67] | C | 53.4 | 42.7 | 0.567 | 0.274 | 0.411 | 0.252 | 0.188 |
| StreamPETR [68] | C | 54.0 | 43.2 | 0.581 | 0.272 | 0.413 | 0.295 | 0.195 |
| RCBEVDet [16] | C+R | 56.8 | 45.3 | 0.486 | 0.285 | 0.404 | 0.220 | 0.192 |
| PolarFusion [69] | C+L | 75.1 | 73.3 | - | - | - | - | - |
| IS-Fusion [70] | C+L | 74.0 | 72.8 | - | - | - | - | - |
| ProFusion3D [71] | C+L | 73.6 | 71.1 | - | - | - | - | - |
| FDSNet (Ours) | C+R | 58.2 | 47.9 | 0.468 | 0.251 | 0.319 | 0.270 | 0.140 |
| FDSNet (Ours) | C+L | 76.5 | 74.4 | 0.398 | 0.228 | 0.288 | 0.240 | 0.110 |
| FDSNet (Ours) | C+L+R | 78.1 | 75.9 | 0.385 | 0.219 | 0.275 | 0.229 | 0.105 |
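
The NDS column is a composite of mAP and the five true-positive error metrics (mATE, mASE, mAOE, mAVE, mAAE). As a minimal sketch, assuming the standard nuScenes definition \(\mathrm{NDS} = \frac{1}{10}\left[5\,\mathrm{mAP} + \sum_{e}\left(1 - \min(1, e)\right)\right]\), the score can be recomputed from a row's other columns. The function name `nds` is hypothetical and not taken from the paper, and the example uses the CenterFusion row with mAP rescaled from percent to the range 0 to 1.

```python
def nds(map_score, tp_errors):
    """Return the nuScenes detection score as a fraction in [0, 1].

    map_score: mAP as a fraction (e.g. 0.332 for 33.2).
    tp_errors: the five mean true-positive errors, in the order
               mATE, mASE, mAOE, mAVE, mAAE.
    Assumes the standard nuScenes NDS definition (not stated in the table).
    """
    if len(tp_errors) != 5:
        raise ValueError("expected exactly five true-positive error metrics")
    # Each error is clipped at 1 and turned into a score where higher is better.
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries weight 5; each of the five TP scores carries weight 1.
    return (5.0 * map_score + sum(tp_scores)) / 10.0


# CenterFusion row: mAP 33.2, then mATE, mASE, mAOE, mAVE, mAAE.
centerfusion = nds(0.332, [0.649, 0.263, 0.535, 0.540, 0.142])
print(round(100 * centerfusion, 1))  # prints 45.3
```

Rows with missing error columns (shown as "-") cannot be recomputed this way, and a recomputed value need not match a reported one exactly because the table lists rounded metrics.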