Fig. 2

Overview of the proposed SAINet. SAINet first downsamples the input image and feeds it into the backbone network to extract global features. The spatial refined module (SRM) identifies salient regions within the global features and produces the salient regions index (SRI). Local features are extracted from the cropped patches. The spatial interaction module (SIM) performs spatially adaptive interaction between the global and local features. The images in this figure are from the DeepGlobe dataset, available at: https://www.kaggle.com/datasets/balraj98/deepglobe-land-cover-classification-dataset.
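
The data flow in the caption (downsample, select salient regions, crop patches) can be illustrated with a toy sketch in plain Python. This is not the SAINet implementation. The function names are hypothetical, average pooling stands in for the backbone, the downsampled intensities stand in for the SRM saliency scores, and cropping full-resolution patches at the SRI locations is an assumption about how the index is used. The SIM fusion step is omitted because the caption does not specify it.

```python
def downsample(image, factor):
    """Average-pool a 2D list-of-lists image by an integer factor.

    Stand-in for "downsample + backbone": the pooled values play the
    role of global features in this toy example.
    """
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - h % factor, factor):
        row = []
        for j in range(0, w - w % factor, factor):
            block = [image[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def salient_region_index(score_map, k):
    """Return the (row, col) cells with the k highest scores.

    Hypothetical stand-in for the SRM output (the SRI); ties are broken
    by row, then column, so the result is deterministic.
    """
    cells = [(score_map[i][j], i, j)
             for i in range(len(score_map))
             for j in range(len(score_map[0]))]
    cells.sort(key=lambda c: (-c[0], c[1], c[2]))
    return [(i, j) for _, i, j in cells[:k]]


def crop_patches(image, index, factor):
    """Crop the full-resolution patch behind each selected cell (assumed use of the SRI)."""
    return [[row[j * factor:(j + 1) * factor]
             for row in image[i * factor:(i + 1) * factor]]
            for i, j in index]


# Toy 4x4 "image" with one bright quadrant.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 9, 9],
    [2, 2, 9, 9],
]
global_features = downsample(image, factor=2)
sri = salient_region_index(global_features, k=2)
patches = crop_patches(image, sri, factor=2)
```

In this sketch only the `k` highest-scoring cells are revisited at full resolution, which is the motivation the caption suggests for computing an index of salient regions before extracting local features.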