Performance of xLSTM for Semantic Segmentation of Remotely Sensed Images

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

1 Citation (Scopus)

Abstract

Recent advances in autoregressive networks with linear complexity have driven significant research progress, with strong results in large language models. A representative model is the Extended Long Short-Term Memory (xLSTM), which incorporates gating mechanisms and memory structures and performs comparably to Transformer architectures on long-sequence language tasks. Autoregressive networks such as xLSTM can use image serialization to extend their application to visual tasks such as classification and segmentation. Although existing studies have demonstrated Vision-LSTM's impressive results in image classification, its performance in image semantic segmentation remains unverified. Our study is the first attempt to evaluate the effectiveness of Vision-LSTM in the semantic segmentation of remotely sensed images. The evaluation is based on a specifically designed encoder-decoder architecture named Seg-LSTM and on comparisons with state-of-the-art segmentation networks. We found that Vision-LSTM's performance in semantic segmentation was limited and generally inferior to Vision Transformer-based and Vision Mamba-based models in most comparative tests. Future research directions for enhancing Vision-LSTM are recommended. The source code is available from https://github.com/zhuqinfeng1999/Seg-LSTM.

Original language: English
Title of host publication: 7th International Conference on Sensors, Signal and Image Processing, SSIP 2024 - Proceedings
Publisher: Association for Computing Machinery
Pages: 90-96
Number of pages: 7
ISBN (Electronic): 9798400717420
Publication status: Published - 7 Jul 2025
Event: 7th International Conference on Sensors, Signal and Image Processing, SSIP 2024 - Shenzhen, China
Duration: 22 Nov 2024 → 24 Nov 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 7th International Conference on Sensors, Signal and Image Processing, SSIP 2024
Country/Territory: China
City: Shenzhen
Period: 22/11/24 → 24/11/24

Keywords

  • High-resolution
  • Image
  • Remote Sensing
  • Semantic Segmentation
  • Vision-LSTM
  • xLSTM
