TY - JOUR
T1 - Enhancing Light Field Salient Object Detection with Variance-Maximized Key Focal Slice Selection
AU - Han, Jiaxin
AU - Li, Feng
AU - Li, Anqi
AU - Zhang, Mengmeng
AU - Bai, Huihui
AU - Xiao, Jimin
AU - Zhao, Yao
N1 - Publisher Copyright: © 1999-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Light field salient object detection (LF SOD) methods have made significant progress recently. Most of them exploit the abundant multi-modal information in the all-focus image and the focal stack at all focal planes to enrich scene details and depth perception. However, in light field images, the spatial and depth information varies only slightly across slices, which introduces redundancy within focal stacks. In addition, the same noise can recur in multiple slices of a focal stack, causing interference. To address these issues, we propose VMKNet, an approach that combines variance-maximized key focal slice selection with interaction with the all-focus image to improve LF SOD. Specifically, we measure the consistency difference between the all-focus image and each focal slice within the salient region and use it as a saliency score. We then randomly assemble sets of these scores, where each score corresponds to one slice. The set exhibiting the highest variance is selected, and its slices are taken as the key focal slices, since they reveal the diversity of the salient objects. Next, a bidirectional guidance module (BGM) learns attentive features of the all-focus image and the selected key slices in a mutual guidance manner, producing enhanced and holistic features. With hierarchical BGMs, our model progressively aggregates common salient semantics and meaningful contextual details, generating more discriminative representations. Moreover, we introduce an edge enhancement module in conjunction with the BGM to improve the sharpness of saliency maps. Extensive experiments on common light field datasets demonstrate that VMKNet outperforms recent state-of-the-art LF, RGB-D, and RGB methods.
AB - Light field salient object detection (LF SOD) methods have made significant progress recently. Most of them exploit the abundant multi-modal information in the all-focus image and the focal stack at all focal planes to enrich scene details and depth perception. However, in light field images, the spatial and depth information varies only slightly across slices, which introduces redundancy within focal stacks. In addition, the same noise can recur in multiple slices of a focal stack, causing interference. To address these issues, we propose VMKNet, an approach that combines variance-maximized key focal slice selection with interaction with the all-focus image to improve LF SOD. Specifically, we measure the consistency difference between the all-focus image and each focal slice within the salient region and use it as a saliency score. We then randomly assemble sets of these scores, where each score corresponds to one slice. The set exhibiting the highest variance is selected, and its slices are taken as the key focal slices, since they reveal the diversity of the salient objects. Next, a bidirectional guidance module (BGM) learns attentive features of the all-focus image and the selected key slices in a mutual guidance manner, producing enhanced and holistic features. With hierarchical BGMs, our model progressively aggregates common salient semantics and meaningful contextual details, generating more discriminative representations. Moreover, we introduce an edge enhancement module in conjunction with the BGM to improve the sharpness of saliency maps. Extensive experiments on common light field datasets demonstrate that VMKNet outperforms recent state-of-the-art LF, RGB-D, and RGB methods.
KW - key focal slices
KW - Light field salient object detection
KW - variance-maximized
UR - https://www.scopus.com/pages/publications/105010343049
U2 - 10.1109/TMM.2025.3586131
DO - 10.1109/TMM.2025.3586131
M3 - Article
AN - SCOPUS:105010343049
SN - 1520-9210
VL - 27
SP - 6555
EP - 6567
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -