TY - GEN
T1 - The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples
T2 - 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2023
AU - Chen, Ziheng
AU - Silvestri, Fabrizio
AU - Wang, Jia
AU - Zhang, Yongfeng
AU - Tolomei, Gabriele
N1 - Funding Information:
This work was partially supported by projects FAIR (PE0000013) and SERICS (PE00000014) under the MUR National Recovery and Resilience Plan funded by the European Union - NextGenerationEU; by the XJTLU Research Development Fund under grants RDF-21-01-053 and TDF21/22-R23-160; by the Ningbo 2025 Key Scientific Research Programs under Grant/Award Numbers 2019B10128 and S10120220021; and by National Science Foundation grants 2127918, 2046457, and 2124155.
Publisher Copyright:
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2023/7/19
Y1 - 2023/7/19
N2 - Deep learning-based recommender systems have become an integral part of several online platforms. However, their black-box nature emphasizes the need for explainable artificial intelligence (XAI) approaches to provide human-understandable reasons why a specific item gets recommended to a given user. One such method is counterfactual explanation (CF). While CFs can be highly beneficial for users and system designers, malicious actors may also exploit these explanations to undermine the system's security. In this work, we propose H-CARS, a novel strategy to poison recommender systems via CFs. Specifically, we first train a logical-reasoning-based surrogate model on training data derived from counterfactual explanations. By reversing the learning process of the recommendation model, we thus develop a proficient greedy algorithm to generate fabricated user profiles and their associated interaction records for the aforementioned surrogate model. Our experiments, which employ a well-known CF generation method and are conducted on two distinct datasets, show that H-CARS yields significant and successful attack performance.
AB - Deep learning-based recommender systems have become an integral part of several online platforms. However, their black-box nature emphasizes the need for explainable artificial intelligence (XAI) approaches to provide human-understandable reasons why a specific item gets recommended to a given user. One such method is counterfactual explanation (CF). While CFs can be highly beneficial for users and system designers, malicious actors may also exploit these explanations to undermine the system's security. In this work, we propose H-CARS, a novel strategy to poison recommender systems via CFs. Specifically, we first train a logical-reasoning-based surrogate model on training data derived from counterfactual explanations. By reversing the learning process of the recommendation model, we thus develop a proficient greedy algorithm to generate fabricated user profiles and their associated interaction records for the aforementioned surrogate model. Our experiments, which employ a well-known CF generation method and are conducted on two distinct datasets, show that H-CARS yields significant and successful attack performance.
KW - Counterfactual explanations
KW - Explainable recommender systems
KW - Model poisoning attacks
UR - http://www.scopus.com/inward/record.url?scp=85167893762&partnerID=8YFLogxK
U2 - 10.1145/3539618.3592070
DO - 10.1145/3539618.3592070
M3 - Conference Proceeding
AN - SCOPUS:85167893762
T3 - SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 2426
EP - 2430
BT - SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
PB - Association for Computing Machinery, Inc
Y2 - 23 July 2023 through 27 July 2023
ER -