Table 1 Evaluation metrics for the two different temporal model types: MS-TCN and LTContext

Model	wAP	wAR	wAA	wAF1
MS-TCN	74.51 ± 1.86	69.59 ± 1.45	87.81 ± 0.79	70.29 ± 1.65
LTContext	77.15 ± 1.93	70.31 ± 0.96	88.32 ± 0.64	71.54 ± 1.07

The evaluation metrics are computed class-wise, for each instrument and working trocar individually, and then averaged over all surgeries in the test set, yielding the weighted-averaged precision (wAP), recall (wAR), accuracy (wAA), and F1 score (wAF1). Each weighted-averaged metric is the mean of the per-class scores, with every class weighted by its number of actual occurrences in the test data set. All values are reported in % as the mean over the ten folds with the corresponding standard deviation (±).
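The weighting described in the caption can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names `weighted_metrics` and `summarize_folds` are hypothetical, the per-class accuracy is assumed to be the one-vs-rest accuracy (TP + TN) / N, the averaging over surgeries is omitted, and the use of the sample standard deviation across folds is an assumption, since the caption does not say which estimator was used.

```python
from collections import Counter
from statistics import mean, stdev


def weighted_metrics(y_true, y_pred):
    """Support-weighted mean of per-class precision, recall, accuracy, and F1.

    Assumption: per-class accuracy is one-vs-rest, i.e. (TP + TN) / N.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of actual occurrences of each class
    totals = {"wAP": 0.0, "wAR": 0.0, "wAA": 0.0, "wAF1": 0.0}

    for cls, count in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = count - tp
        tn = n - tp - fp - fn

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / count  # count > 0 because cls occurs in y_true
        accuracy = (tp + tn) / n
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)

        weight = count / n  # class weight = share of actual occurrences
        totals["wAP"] += weight * precision
        totals["wAR"] += weight * recall
        totals["wAA"] += weight * accuracy
        totals["wAF1"] += weight * f1

    return totals


def summarize_folds(fold_scores):
    """Mean and sample standard deviation of one metric across folds, in %."""
    return 100 * mean(fold_scores), 100 * stdev(fold_scores)


# Toy example with two classes; the labels are made up for illustration.
scores = weighted_metrics(["a", "a", "a", "b"], ["a", "a", "b", "b"])
```

In this sketch a class with many occurrences dominates the average, which is why the weighted scores can differ noticeably from an unweighted (macro) mean when the class distribution is imbalanced.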
