Table 2 Model comparisons within land cover problems.

From: Meta-learning to address diverse Earth observation problems across resolutions

  

Number of shots (training examples per class):

| Model | Avg. rank | 1 | 2 | 5 | 10 | 15 |
|---|---|---|---|---|---|---|
| SSL4EO | 2.51 | 56.4 ± 9.7 | 72.8 ± 11.8 | 79.4 ± 10.2 | 80.5 ± 10.6 | 82.4 ± 10.3 |
| METEOR | 2.84 | 61.5 ± 10.7 | 69.2 ± 11.9 | 78.6 ± 11.2 | 81.5 ± 10.4 | 81.7 ± 11.9 |
| MOSAIKS | 2.86 | 61.3 ± 11.5 | 68.7 ± 14.8 | 77.3 ± 11.5 | 81.3 ± 10.3 | 84.7 ± 9.2 |
| BASELINE | 2.99 | 58.1 ± 11.9 | 71.2 ± 9.9 | 79.4 ± 7.9 | 81.0 ± 8.0 | 82.7 ± 7.9 |
| SSLTRANSRS | 5.70** | 51.2 ± 10.5 | 61.4 ± 6.4 | 71.4 ± 8.4 | 74.2 ± 10.4 | 75.9 ± 11.0 |
| SWAV | 6.51** | 46.5 ± 8.8 | 60.1 ± 13.0 | 67.6 ± 14.4 | 69.3 ± 14.4 | 72.1 ± 14.5 |
| DINO | 6.73** | 45.4 ± 11.8 | 58.4 ± 13.7 | 66.9 ± 15.7 | 69.1 ± 14.9 | 71.6 ± 14.9 |
| SECO | 6.83** | 49.1 ± 12.4 | 55.9 ± 13.2 | 66.5 ± 15.5 | 66.6 ± 16.8 | 67.5 ± 19.0 |
| SCRATCH | 9.00** | 36.7 ± 16.8 | 46.7 ± 13.2 | 49.9 ± 12.9 | 51.6 ± 16.4 | 54.6 ± 15.3 |
| IMAGENET | 9.03** | 42.3 ± 12.8 | 48.7 ± 12.1 | 59.0 ± 15.2 | 59.9 ± 14.4 | 62.8 ± 15.8 |

  1. We report accuracies averaged over the seven DFC2020 regions. Each model is fine-tuned to the 5–7 classes of each DFC region individually, using increasingly large support sets of 1, 2, 5, 10, and 15 training examples per class (shots), and is then tested on a query set containing all remaining images. We report the average rank (lower is better) to compare models across all shots simultaneously. We further test the significance of the differences to METEOR with a Wilcoxon signed-rank test and mark significant deviations with **.
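
The following is a minimal sketch, not the authors' code, of how the average-rank comparison and the Wilcoxon signed-rank test against METEOR described in the footnote could be computed with NumPy/SciPy. The model list, the placeholder accuracy values, and the p < 0.05 threshold for ** are illustrative assumptions only.

```python
# Minimal sketch (assumed setup, not the authors' code): average ranks across
# region/shot evaluations and a paired Wilcoxon signed-rank test against METEOR.
import numpy as np
from scipy.stats import rankdata, wilcoxon

# One accuracy per (DFC2020 region, shot setting): 7 regions x 5 shot counts.
# Placeholder values; in practice these come from the fine-tuning experiments.
rng = np.random.default_rng(0)
models = ["METEOR", "SSL4EO", "MOSAIKS", "BASELINE"]
accuracies = {m: rng.uniform(50.0, 90.0, size=7 * 5) for m in models}

# Rank the models within each individual evaluation (rank 1 = highest accuracy),
# then average the ranks over all region/shot evaluations.
scores = np.stack([accuracies[m] for m in models])  # shape: (n_models, n_evals)
ranks = rankdata(-scores, axis=0)
for model, avg_rank in zip(models, ranks.mean(axis=1)):
    print(f"{model}: average rank = {avg_rank:.2f}")

# Paired Wilcoxon signed-rank test of each model against METEOR, pairing the
# accuracies obtained on the same region/shot evaluation.
for model in models[1:]:
    _, p_value = wilcoxon(accuracies[model], accuracies["METEOR"])
    flag = "**" if p_value < 0.05 else ""  # assumed significance level
    print(f"{model} vs. METEOR: p = {p_value:.3f} {flag}")
```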