TY - JOUR
T1 - MSVM-UNet
T2 - Multi-Scale Spatial Attention Enhanced Vision Mamba U-Net for Agricultural Disease Segmentation
AU - Shi, Lin
AU - Liu, Xinyu
AU - Zhao, Li
AU - Zhang, Haiyang
AU - Ji, Zhanlin
N1 - Publisher Copyright: © 1994-2012 IEEE.
PY - 2026
Y1 - 2026
N2 - Agricultural diseased leaf image segmentation is a critical technology for precision agriculture and intelligent crop protection. Current segmentation methods suffer from imprecise leaf edge extraction, difficulty in detecting small disease lesions, and insufficient robustness in complex backgrounds. To overcome these limitations, this paper proposes an agricultural diseased leaf image segmentation method based on an enhanced visual state space model, named MSVM-UNet (Multi-Scale Spatial Attention Vision Mamba U-Net). The method employs an encoder-decoder framework and integrates improved Visual State Space (VSS) modules in both the encoder and decoder, enhancing long-range dependency modeling and local-global feature fusion. In addition, a Multi-Scale Spatial Attention (MSSA) module is introduced in the skip connections to enhance cross-scale feature representation and capture fine boundary details of disease spots. To simulate real field imaging conditions, we apply random horizontal or vertical flips to the images and randomly adjust hue, saturation, and brightness before training. Experimental results demonstrate that, compared with mainstream methods, MSVM-UNet achieves significant performance improvements in agricultural diseased leaf segmentation, reaching 80.44% mIoU and 92.56% Dice on the validation set and providing a solution for intelligent agricultural disease monitoring.
AB - Agricultural diseased leaf image segmentation is a critical technology for precision agriculture and intelligent crop protection. Current segmentation methods suffer from imprecise leaf edge extraction, difficulty in detecting small disease lesions, and insufficient robustness in complex backgrounds. To overcome these limitations, this paper proposes an agricultural diseased leaf image segmentation method based on an enhanced visual state space model, named MSVM-UNet (Multi-Scale Spatial Attention Vision Mamba U-Net). The method employs an encoder-decoder framework and integrates improved Visual State Space (VSS) modules in both the encoder and decoder, enhancing long-range dependency modeling and local-global feature fusion. In addition, a Multi-Scale Spatial Attention (MSSA) module is introduced in the skip connections to enhance cross-scale feature representation and capture fine boundary details of disease spots. To simulate real field imaging conditions, we apply random horizontal or vertical flips to the images and randomly adjust hue, saturation, and brightness before training. Experimental results demonstrate that, compared with mainstream methods, MSVM-UNet achieves significant performance improvements in agricultural diseased leaf segmentation, reaching 80.44% mIoU and 92.56% Dice on the validation set and providing a solution for intelligent agricultural disease monitoring.
KW - Agricultural diseased leaf segmentation
KW - Mamba U-Net
KW - multi-scale attention
KW - visual state-space model
UR - https://www.scopus.com/pages/publications/105030389689
U2 - 10.1109/LSP.2026.3664272
DO - 10.1109/LSP.2026.3664272
M3 - Article
AN - SCOPUS:105030389689
SN - 1070-9908
JO - IEEE Signal Processing Letters
JF - IEEE Signal Processing Letters
ER -