Abstract
Few-shot learning has increasingly been explored in remote sensing scene classification as an effective approach to learning from limited examples. However, existing methods suffer from several limitations, including the need for well-annotated auxiliary datasets, limited generalisation and a tendency to overfit the training data. In this paper, a novel semi-supervised self-organised prototype tree-based method (S3OPT) is proposed for few-shot remote sensing scene classification. S3OPT progressively constructs a hierarchical prototype tree from image embeddings in a top-down, discriminative manner across multiple levels of granularity, capturing inter-class similarities and intra-class variations to enable automated class separation. By using pretrained convolutional neural networks for feature extraction and exploiting pseudo-labelling, S3OPT reduces the need for extensive manual labelling and enhances generalisation through self-training on unlabelled samples. Thanks to its prototype-based nature, S3OPT is highly transparent: its reasoning is based on the mutual similarity between images, which makes its internal reasoning and decision-making explainable. Extensive experiments on four widely used benchmark datasets demonstrate the high classification accuracy of S3OPT under standard few-shot learning protocols. It achieves results superior to or on par with state-of-the-art methods for few-shot remote sensing scene classification, delivering an accuracy increase of up to 15% without requiring computationally expensive training or fine-tuning.