Table 3 Comparison of the different detection methods, using different window sizes \(d\)

From: A deep learning-based approach to enhance accuracy and feasibility of long-term high-resolution manometry examinations

 

| \(d\) | Method | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|
| 100 | Non-ML baseline | n/a | n/a | n/a |
| | ViMeDat | **79.62 ± 5.12** | 50.28 ± 14.05 | 57.23 ± 14.36 |
| | Ours | 77.50 ± 7.63 | **84.68 ± 6.49** | **80.62 ± 7.18** |
| 400 | Non-ML baseline | 29.58 ± 6.46 | 76.70 ± 9.77 | 38.70 ± 7.10 |
| | ViMeDat | 85.73 ± 4.49 | 54.18 ± 15.86 | 61.59 ± 16.19 |
| | Ours | **86.13 ± 2.98** | **94.07 ± 2.46** | **89.57 ± 2.59** |
| 800 | Non-ML baseline | 37.76 ± 7.04 | 85.94 ± 9.56 | 48.22 ± 7.16 |
| | ViMeDat | 88.47 ± 3.86 | 58.46 ± 16.95 | 65.37 ± 16.66 |
| | Ours | **89.04 ± 1.92** | **97.65 ± 1.19** | **92.82 ± 1.36** |

  1. Using \(d \in \{100, 400, 800\}\), a detection by the MobileNet or ViMeDat approach is counted as correct if the predicted swallow start lies within \(\pm \frac{d}{2}\) measurements of the true swallow start. The non-ML baseline does not detect the start of a swallow but the swallow event itself, so reducing the allowed distance to \(d = 100\) is not applicable to it; for the other values of \(d\), a detection by the baseline is counted as correct if the predicted swallow event lies in the range \([y + 200 - \frac{d}{2},\, y + 200 + \frac{d}{2}]\), where \(y\) is the true swallow start. We report the average metrics over a fivefold cross-validation along with their respective standard deviations (±). Bold numbers indicate the highest performance in the respective category.
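The matching rule in the footnote can be sketched as a small predicate. This is a minimal illustration, not the authors' implementation; the function name, the `non_ml_baseline` flag, and the assumption that positions are integer measurement indices are all hypothetical, while the 200-measurement offset and the \(\pm \frac{d}{2}\) tolerance follow the footnote.

```python
def is_correct_detection(pred, true_start, d, non_ml_baseline=False):
    """Hypothetical sketch of the detection-counting rule from the table footnote.

    pred        -- predicted position (in measurements)
    true_start  -- true swallow start (in measurements)
    d           -- window size, one of 100, 400, 800
    """
    if non_ml_baseline:
        # The baseline predicts the swallow event itself, which the footnote
        # places 200 measurements after the true swallow start.
        if d == 100:
            raise ValueError("d = 100 is not applicable to the non-ML baseline")
        return true_start + 200 - d / 2 <= pred <= true_start + 200 + d / 2
    # MobileNet / ViMeDat: predicted start within +/- d/2 of the true start.
    return abs(pred - true_start) <= d / 2
```

For example, with \(d = 400\) a predicted start at most 200 measurements from the true start counts as correct, while the baseline's window is centred 200 measurements after the true start.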