CNC: Cross-modal Normality Constraint for Unsupervised Multi-class Anomaly Detection

Xiaolei Wang; Xiaoyang Wang; Huihui Bai; Eng Gee Lim; Jimin Xiao

doi:10.1609/aaai.v39i8.32856

Xiaolei Wang, Xiaoyang Wang, Huihui Bai, Eng Gee Lim, Jimin Xiao^*

^*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Existing unsupervised distillation-based methods rely on the differences between encoded and decoded features to locate abnormal regions in test images. However, a decoder trained only on normal samples still reconstructs abnormal patch features well, degrading performance. This issue is particularly pronounced in unsupervised multi-class anomaly detection tasks. We attribute this behavior to 'over-generalization' (OG) of the decoder: the significantly increased diversity of patch patterns in multi-class training enhances the model's generalization on normal patches, but also inadvertently broadens its generalization to abnormal patches. To mitigate OG, we propose a novel approach that leverages class-agnostic learnable prompts to capture common textual normality across various visual patterns, and then applies them to guide the decoded features towards a 'normal' textual representation, suppressing over-generalization of the decoder on abnormal patterns. To further improve performance, we also introduce a gated mixture-of-experts module that specializes in handling diverse patch patterns and reduces mutual interference between them in multi-class training. Our method achieves competitive performance on the MVTec AD and VisA datasets, demonstrating its effectiveness.
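
As a rough illustration of the feature-difference scoring that the abstract attributes to distillation-based methods, the sketch below computes a per-patch anomaly map as one minus the cosine similarity between encoded and decoded features. The function name `anomaly_map`, the array shapes, the toy data, and the choice of cosine distance are all assumptions made for illustration; the abstract does not specify the distance measure or the architecture, and this is not the authors' implementation.

```python
import numpy as np


def anomaly_map(encoded, decoded, eps=1e-8):
    """Per-patch anomaly score: 1 - cosine similarity along the channel axis.

    encoded, decoded: arrays of shape (H, W, C) holding patch features.
    Both the shapes and the cosine distance are illustrative assumptions.
    """
    encoded = np.asarray(encoded, dtype=np.float64)
    decoded = np.asarray(decoded, dtype=np.float64)
    dot = np.sum(encoded * decoded, axis=-1)
    norms = np.linalg.norm(encoded, axis=-1) * np.linalg.norm(decoded, axis=-1)
    return 1.0 - dot / (norms + eps)


# Toy data: the hypothetical decoder reproduces every patch except one.
rng = np.random.default_rng(0)
enc = rng.normal(size=(4, 4, 16))
dec = enc.copy()
dec[2, 3] = -enc[2, 3]  # a patch the decoder fails to reconstruct

scores = anomaly_map(enc, dec)   # shape (4, 4), one score per patch
image_score = scores.max()       # one common way to get an image-level score
```

In this toy setup the poorly reconstructed patch receives the highest score. The over-generalization problem described above corresponds to the opposite case: if the decoder also reconstructs abnormal patches well, their scores stay close to zero and the defect goes undetected.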

Original language	English
Title of host publication	Special Track on AI Alignment
Editors	Toby Walsh, Julie Shah, Zico Kolter
Publisher	Association for the Advancement of Artificial Intelligence
Pages	7943-7951
Number of pages	9
Edition	8
ISBN (Electronic)	157735897X, 9781577358978
DOIs	https://doi.org/10.1609/aaai.v39i8.32856
Publication status	Published - 11 Apr 2025
Event	39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States; Duration: 25 Feb 2025 → 4 Mar 2025

Publication series

Name	Proceedings of the AAAI Conference on Artificial Intelligence
Number	8
Volume	39
ISSN (Print)	2159-5399
ISSN (Electronic)	2374-3468

Conference

Conference	39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
Country/Territory	United States
City	Philadelphia
Period	25/02/25 → 4/03/25

Access to Document

10.1609/aaai.v39i8.32856

Cite this

Wang, X., Wang, X., Bai, H., Lim, E. G., & Xiao, J. (2025). CNC: Cross-modal Normality Constraint for Unsupervised Multi-class Anomaly Detection. In T. Walsh, J. Shah, & Z. Kolter (Eds.), Special Track on AI Alignment (8 ed., pp. 7943-7951). (Proceedings of the AAAI Conference on Artificial Intelligence; Vol. 39, No. 8). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v39i8.32856

@inproceedings{94b447d193fb4b1ebb65f01cae83af90,
title = "CNC: Cross-modal Normality Constraint for Unsupervised Multi-class Anomaly Detection",
author = "Xiaolei Wang and Xiaoyang Wang and Huihui Bai and Lim, {Eng Gee} and Jimin Xiao",
note = "Publisher Copyright: Copyright {\textcopyright} 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.; 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 ; Conference date: 25-02-2025 Through 04-03-2025",
year = "2025",
month = apr,
day = "11",
doi = "10.1609/aaai.v39i8.32856",
language = "English",
series = "Proceedings of the AAAI Conference on Artificial Intelligence",
publisher = "Association for the Advancement of Artificial Intelligence",
number = "8",
pages = "7943--7951",
editor = "Toby Walsh and Julie Shah and Zico Kolter",
booktitle = "Special Track on AI Alignment",
edition = "8",
}

TY - GEN
T1 - CNC
T2 - 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
AU - Wang, Xiaolei
AU - Wang, Xiaoyang
AU - Bai, Huihui
AU - Lim, Eng Gee
AU - Xiao, Jimin
PY - 2025/4/11
Y1 - 2025/4/11
UR - http://www.scopus.com/inward/record.url?scp=105004285153&partnerID=8YFLogxK
U2 - 10.1609/aaai.v39i8.32856
DO - 10.1609/aaai.v39i8.32856
M3 - Conference Proceeding
AN - SCOPUS:105004285153
T3 - Proceedings of the AAAI Conference on Artificial Intelligence
SP - 7943
EP - 7951
BT - Special Track on AI Alignment
A2 - Walsh, Toby
A2 - Shah, Julie
A2 - Kolter, Zico
PB - Association for the Advancement of Artificial Intelligence
Y2 - 25 February 2025 through 4 March 2025
ER -
