IndoorMS: A Multispectral Dataset for Semantic Segmentation in Indoor Scene Understanding

Qinfeng Zhu, Jingjing Xiao, Lei Fan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Indoor scene understanding is a critical task in computer vision, traditionally relying on RGB data for deep learning-based semantic segmentation to achieve pixel-level understanding. However, indoor environments provide valuable information beyond the visible light spectrum, which has been largely overlooked in existing research. To address this gap, we introduce IndoorMS, a comprehensive multispectral dataset specifically designed for the semantic segmentation of indoor scenes. The dataset comprises images captured using a multispectral sensor in 17 buildings across diverse indoor settings, including meeting rooms, halls, lounges, offices, corridors, and classrooms. With 19 finely annotated semantic categories, IndoorMS enables robust evaluation of indoor scene segmentation. Benchmark experiments are performed using several leading semantic segmentation frameworks, followed by a thorough analysis of their performance. The results indicate that the best-performing model combination, namely ConvNeXt-s with UperNet, achieved an mF1 score of 82.38 and an mIoU score of 72.90. Despite these promising results, IndoorMS still poses challenges for segmentation networks, such as class distribution imbalance and domain gaps between RGB and multispectral data. This work marks the first effort to support multispectral indoor scene understanding with a dedicated dataset, offering new opportunities for research in this domain. Potential avenues for future research are also presented. The project page for the IndoorMS dataset is available at https://zhuqinfeng1999.github.io/IndoorMS/. (The dataset will be publicly available for download after peer review.)
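For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how mIoU and mF1 are commonly computed from a per-class confusion matrix. It is not the authors' evaluation code: the abstract does not state the exact protocol (for example, how ignored labels or absent classes are handled), so the choice to skip classes that never appear, and the function names `confusion_matrix` and `mean_iou_and_f1`, are assumptions made for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou_and_f1(cm):
    """Return (mIoU, mF1) in percent, averaged over classes that occur.

    Assumption: a class with no ground-truth and no predicted pixels is
    skipped rather than counted as zero.
    """
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # missed pixels of class c
        fp = sum(cm[r][c] for r in range(n)) - tp     # pixels wrongly labeled c
        if tp + fp + fn == 0:
            continue
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    if not ious:
        raise ValueError("confusion matrix contains no pixels")
    return 100 * sum(ious) / len(ious), 100 * sum(f1s) / len(f1s)


# Toy example with two classes and four pixels (flattened label maps).
cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
miou, mf1 = mean_iou_and_f1(cm)
```

Per class, the two scores are related by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU for the same class; this is consistent with the reported mF1 being higher than the reported mIoU.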

Original language: English
Journal: IEEE Sensors Journal
Publication status: Accepted/In press - 2025

Keywords

  • Dataset
  • Image
  • Indoor
  • Multispectral
  • Scene Understanding
  • Semantic Segmentation
