TY - JOUR
T1 - ClassWise-CRF
T2 - Category-specific fusion for enhanced semantic segmentation of remote sensing imagery
AU - Zhu, Qinfeng
AU - Jiang, Yunxi
AU - Fan, Lei
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2026/5
Y1 - 2026/5
N2 - With the continuous development of visual models such as Convolutional Neural Networks, Vision Transformers, and Vision Mamba, the capabilities of neural networks in semantic segmentation of remote sensing images have seen significant progress. However, these networks exhibit varying performance across different semantic categories, making it challenging to find a single network architecture that excels in all categories. To address this, we propose a result-level category-specific fusion architecture called ClassWise-CRF. This architecture employs a two-stage process: first, it selects expert networks that perform well in specific categories from a pool of candidate networks using a greedy algorithm; second, it integrates the segmentation predictions of these selected networks by adaptively weighting their contributions based on their segmentation performance in each category. Inspired by Conditional Random Field (CRF), the ClassWise-CRF architecture treats the segmentation predictions from multiple networks as confidence vector fields. It leverages segmentation metrics (such as Intersection over Union) from the validation set as priors and employs an exponential weighting strategy to fuse the category-specific confidence scores predicted by each network. This fusion method dynamically adjusts the weights of each network for different categories, achieving category-specific optimization. Building on this, the architecture further optimizes the fused results using unary and pairwise potentials in CRF to ensure spatial consistency and boundary accuracy. To validate the effectiveness of ClassWise-CRF, we conducted experiments on two remote sensing datasets, LoveDA and Vaihingen, using eight classic and advanced semantic segmentation networks. The results show that the ClassWise-CRF architecture significantly improves segmentation performance: on the LoveDA dataset, the mean Intersection over Union (mIoU) metric increased by 1.00% on the validation set and by 0.68% on the test set; on the Vaihingen dataset, the mIoU improved by 0.87% on the validation set and by 0.91% on the test set. These results fully demonstrate the effectiveness and generality of the ClassWise-CRF architecture in semantic segmentation of remote sensing images. The full code is available at https://github.com/zhuqinfeng1999/ClassWise-CRF.
KW - Conditional random field
KW - Deep learning
KW - Fusion
KW - Images
KW - Remote sensing
KW - Semantic segmentation
UR - https://www.scopus.com/pages/publications/105027127748
U2 - 10.1016/j.neunet.2025.108485
DO - 10.1016/j.neunet.2025.108485
M3 - Article
C2 - 41483572
AN - SCOPUS:105027127748
SN - 0893-6080
VL - 197
JO - Neural Networks
JF - Neural Networks
M1 - 108485
ER -