TY - GEN
T1 - CNC
T2 - 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
AU - Wang, Xiaolei
AU - Wang, Xiaoyang
AU - Bai, Huihui
AU - Lim, Eng Gee
AU - Xiao, Jimin
N1 - Publisher Copyright:
Copyright © 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2025/4/11
Y1 - 2025/4/11
N2 - Existing unsupervised distillation-based methods rely on the differences between encoded and decoded features to locate abnormal regions in test images. However, the decoder trained only on normal samples still reconstructs abnormal patch features well, degrading performance. This issue is particularly pronounced in unsupervised multi-class anomaly detection tasks. We attribute this behavior to 'over-generalization' (OG) of decoder: the significantly increasing diversity of patch patterns in multi-class training enhances the model generalization on normal patches, but also inadvertently broadens its generalization to abnormal patches. To mitigate 'OG', we propose a novel approach that leverages class-agnostic learnable prompts to capture common textual normality across various visual patterns, and then apply them to guide the decoded features towards a 'normal' textual representation, suppressing 'over-generalization' of the decoder on abnormal patterns. To further improve performance, we also introduce a gated mixture-of-experts module to specialize in handling diverse patch patterns and reduce mutual interference between them in multi-class training. Our method achieves competitive performance on the MVTec AD and VisA datasets, demonstrating its effectiveness.
AB - Existing unsupervised distillation-based methods rely on the differences between encoded and decoded features to locate abnormal regions in test images. However, the decoder trained only on normal samples still reconstructs abnormal patch features well, degrading performance. This issue is particularly pronounced in unsupervised multi-class anomaly detection tasks. We attribute this behavior to 'over-generalization' (OG) of decoder: the significantly increasing diversity of patch patterns in multi-class training enhances the model generalization on normal patches, but also inadvertently broadens its generalization to abnormal patches. To mitigate 'OG', we propose a novel approach that leverages class-agnostic learnable prompts to capture common textual normality across various visual patterns, and then apply them to guide the decoded features towards a 'normal' textual representation, suppressing 'over-generalization' of the decoder on abnormal patterns. To further improve performance, we also introduce a gated mixture-of-experts module to specialize in handling diverse patch patterns and reduce mutual interference between them in multi-class training. Our method achieves competitive performance on the MVTec AD and VisA datasets, demonstrating its effectiveness.
UR - https://www.scopus.com/pages/publications/105004285153
U2 - 10.1609/aaai.v39i8.32856
DO - 10.1609/aaai.v39i8.32856
M3 - Conference Proceeding
AN - SCOPUS:105004285153
T3 - Proceedings of the AAAI Conference on Artificial Intelligence
SP - 7943
EP - 7951
BT - Special Track on AI Alignment
A2 - Walsh, Toby
A2 - Shah, Julie
A2 - Kolter, Zico
PB - Association for the Advancement of Artificial Intelligence
Y2 - 25 February 2025 through 4 March 2025
ER -