Table 4. Inference latency in milliseconds for selected models in three hardware configurations

From: Neonatal pose estimation in the unaltered clinical environment with fusion of RGB, depth and IR images

  

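| Backbone | Model | Apple M2 (B=1) | RTX 3060Ti (B=1) | RTX 3060Ti (B=10) | NVIDIA A100 (B=1) | NVIDIA A100 (B=10) | NVIDIA A100 (B=100) |
|---|---|---|---|---|---|---|---|
| HRNet-W32-256 | Depth | 129 | 11.4a | 3.79a | 29.4a | 2.94a | 0.956a |
| HRNet-W32-256 | IIF-2 | 316 | 18.2a | 14.7a | 29.6a | 3.96a | 3.20a |
| HRNet-W32-384 | LIF-3 | 208 | 12.5a | 5.87a | 36.2 | 3.64a | 1.52a |
| HRNet-W48-384 | RGB | 629 | 31.7a | 24.6a | 91.8 | 8.87a | 6.53a |
| HRNet-W48-384 | LIF-3 | 1070 | 51.7 | 44.8 | 89.9 | 11.9a | 9.59a |
| HRFormer-S | IIF-2 | 432 | 23.3a | 16.5a | 55.9 | 7.60a | 6.49a |
| HRFormer-B | Depth | 784 | 52.4 | 40.0 | 47.0 | 16.6a | 15.0a |
| HRFormer-B | IIF-2 | 1090 | 59.3 | 52.0 | 57.0 | 21.2a | 19.2a |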

  1. The three hardware configurations represent, respectively, a cot-side small PC (Apple M2), an on-site desktop PC (RTX 3060Ti), and an off-site server (NVIDIA A100).
  2. Bold indicates suitability for 1 fps inference.
  3. B: batch size.
  4. a (suffix on a value): suitable for 30 fps inference.
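
The frame-rate notes above can be read as simple thresholds on latency. The following is a minimal sketch, assuming the tabulated values are per-image latencies and that the sustainable frame rate is just 1000 ms divided by that latency. This ignores image capture, preprocessing, and the delay of accumulating a batch, none of which the table describes. The function names `max_fps` and `supports` are hypothetical helpers introduced only for illustration.

```python
def max_fps(latency_ms: float) -> float:
    """Frame rate sustainable at a given per-image latency (assumed model)."""
    return 1000.0 / latency_ms


def supports(latency_ms: float, target_fps: float) -> bool:
    """True if the per-image latency fits within one frame period."""
    return latency_ms <= 1000.0 / target_fps


# A few latencies (ms) copied from the table, keyed by
# (backbone, model, hardware, batch size).
samples = {
    ("HRNet-W32-256", "Depth", "Apple M2", 1): 129.0,
    ("HRNet-W32-256", "Depth", "RTX 3060Ti", 1): 11.4,
    ("HRNet-W32-384", "LIF-3", "NVIDIA A100", 1): 36.2,
    ("HRFormer-B", "IIF-2", "Apple M2", 1): 1090.0,
}

for key, ms in samples.items():
    print(
        key,
        f"{max_fps(ms):.1f} fps",
        "1 fps ok" if supports(ms, 1) else "below 1 fps",
        "30 fps ok" if supports(ms, 30) else "below 30 fps",
    )
```

Under this assumption the 30 fps threshold is roughly 33.3 ms per image, which agrees with the `a` suffixes on the sampled rows: 11.4 ms qualifies and 36.2 ms does not.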