AerialFormer: Multi-Resolution Transformer for Aerial Image Segmentation

Hanyu, Taisei; Yamazaki, Kashu; Tran, Minh; McCann, Roy A; Liao, Haitao; Rainwater, Chase; Adkins, Meredith; Cothren, Jackson; Le, Ngan

doi:10.3390/rs16162930

Remote sensing image segmentation poses several challenges: strong foreground–background imbalance, tiny objects, high object density, intra-class heterogeneity, and inter-class homogeneity. To address them, this paper introduces AerialFormer, a hybrid model that combines the strengths of Transformers and Convolutional Neural Networks (CNNs). A CNN Stem module preserves low-level, high-resolution features, improving the model’s ability to process fine details in aerial imagery. AerialFormer has a hierarchical structure in which a Transformer encoder generates multi-scale features and a multi-dilated CNN (MDC) decoder aggregates information from these multi-scale inputs. Information is thus captured in both local and global contexts, yielding powerful representations and high-resolution segmentation. AerialFormer was benchmarked on three datasets: iSAID, LoveDA, and Potsdam. Comprehensive experiments and extensive ablation studies show that it outperforms previous state-of-the-art methods.
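The idea behind a multi-dilated convolution can be illustrated with a toy example. The sketch below is not the paper's MDC decoder: it is a minimal 1-D, pure-Python illustration that assumes a block applies the same small kernel at several dilation rates in parallel and keeps one output per rate. The function names and the rates (1, 2, 3) are hypothetical choices made for illustration only.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D dilated convolution (cross-correlation)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def multi_dilated_block(signal, kernel, dilations=(1, 2, 3)):
    """Apply the kernel at each dilation rate; return one output row per rate.

    Assumption: the rates and the 'one row per rate' layout are illustrative,
    not taken from the AerialFormer paper.
    """
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for d, row in zip((1, 2, 3), multi_dilated_block(impulse, [1, 1, 1])):
        print(f"dilation={d} receptive_field={receptive_field(3, d)} output={row}")
```

With a fixed kernel size, a larger dilation rate widens the receptive field without adding weights, which is one way a decoder can mix local detail with wider context.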

Award ID(s): 2345176

PAR ID: 10556613

Author(s) / Creator(s): Hanyu, Taisei; Yamazaki, Kashu; Tran, Minh; McCann, Roy A; Liao, Haitao; Rainwater, Chase; Adkins, Meredith; Cothren, Jackson; Le, Ngan

Publisher / Repository: MDPI

Date Published: 2024-08-09

Journal Name: Remote Sensing

Volume: 16

Issue: 16

ISSN: 2072-4292

Page Range / eLocation ID: 2930

Subject(s) / Keyword(s): remote sensing; semantic segmentation; transformers; dilated convolution

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.3390/rs16162930
