Enhanced multi-scale feature adaptive fusion sparse convolutional network for large-scale scenes semantic segmentation

Lingfeng Shen, Yanlong Cao*, Wenbin Zhu, Kai Ren, Yejun Shou, Haocheng Wang, Zhijie Xu

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

1 Citation (Scopus)

Abstract

Semantic segmentation has made notable strides in analyzing homogeneous large-scale 3D scenes, yet applying it to varied scenes with diverse characteristics remains challenging. Traditional methods depend on resource-intensive neighborhood search algorithms, which raises their computational cost. To overcome these limitations, we introduce MFAF-SCNet, a novel and computationally efficient network based on voxel-based sparse convolution. Our key innovation is the multi-scale feature adaptive fusion (MFAF) module, which applies a range of convolution kernel sizes at the network's entry point to extract multi-scale features, then adaptively calibrates the feature weighting so that each object is represented at a suitable scale. The method also includes LKSNet, an original sparse convolutional backbone designed to handle the uneven distribution of points in a point cloud. LKSNet combines inverted bottleneck structures with large-kernel convolutions, which strengthens the network's feature extraction and its ability to capture spatial correlations. We evaluated MFAF-SCNet on three large-scale benchmark datasets: ScanNet and S3DIS for indoor scenes, and SemanticKITTI for outdoor scenes. The experimental results show that our method is competitive, achieving strong benchmark performance while remaining computationally efficient.
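
The multi-scale fusion idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a sparse voxel grid stored as a Python dictionary, replaces learned sparse convolutions with plain mean pooling over occupied voxels, and uses a hypothetical fixed `gate` vector per scale where a real network would learn the weighting. The names `box_aggregate`, `adaptive_fuse`, and `gate` are illustrative only.

```python
import math
from itertools import product


def box_aggregate(voxels, kernel_size):
    """Mean feature of the occupied voxels inside a cubic window around each voxel.

    Neighbours are found by dictionary lookup on integer coordinates,
    so no nearest-neighbour search over the point set is needed.
    """
    r = kernel_size // 2
    offsets = list(product(range(-r, r + 1), repeat=3))
    out = {}
    for (x, y, z) in voxels:
        acc, count = None, 0
        for dx, dy, dz in offsets:
            feat = voxels.get((x + dx, y + dy, z + dz))
            if feat is None:
                continue  # empty voxel: skipped, as in sparse convolution
            acc = list(feat) if acc is None else [a + f for a, f in zip(acc, feat)]
            count += 1
        # count >= 1 because the window always contains the centre voxel
        out[(x, y, z)] = [a / count for a in acc]
    return out


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(voxels, kernel_sizes=(3, 5, 7), gate=None):
    """Fuse per-scale features with per-voxel softmax weights.

    `gate` holds one scoring vector per scale; it is a hypothetical
    stand-in for weights that a trained network would learn.
    """
    per_scale = [box_aggregate(voxels, k) for k in kernel_sizes]
    dim = len(next(iter(voxels.values())))
    if gate is None:
        gate = [[1.0] * dim for _ in kernel_sizes]
    fused, weights = {}, {}
    for key in voxels:
        branch = [scale[key] for scale in per_scale]
        scores = [sum(g * f for g, f in zip(gs, b)) for gs, b in zip(gate, branch)]
        w = softmax(scores)  # one weight per scale, summing to 1
        weights[key] = w
        fused[key] = [sum(wi * b[d] for wi, b in zip(w, branch)) for d in range(dim)]
    return fused, weights


if __name__ == "__main__":
    demo = {(0, 0, 0): [1.0, 0.0], (1, 0, 0): [0.0, 1.0], (3, 0, 0): [2.0, 2.0]}
    fused, weights = adaptive_fuse(demo)
    for key in demo:
        print(key, fused[key], weights[key])
```

In this sketch each voxel receives its own set of scale weights, so a voxel in a dense region and an isolated voxel can favour different window sizes. That per-voxel weighting is the property the abstract attributes to the MFAF module, here reduced to its simplest form.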

Original language: English
Article number: 104105
Journal: Computers and Graphics (Pergamon)
Volume: 126
Publication status: Published - Feb 2025

Keywords

  • Multi-scale feature
  • Point cloud
  • Semantic segmentation
  • Sparse convolutional
