TY - GEN
T1 - Enhancing Object Detection in Adverse Weather Conditions Through Entropy and Guided Multimodal Fusion
AU - Zhang, Zhenrong
AU - Gong, Haoyan
AU - Feng, Yuzheng
AU - Chu, Zixuan
AU - Liu, Hongbin
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Integrating diverse representations from complementary sensing modalities is essential for robust scene interpretation in autonomous driving. Deep learning architectures that fuse vision and range data have advanced 2D and 3D object detection in recent years. However, these modalities often degrade in adverse weather or lighting conditions, reducing detection performance. While domain adaptation methods have been developed to bridge the gap between source and target domains, they typically fall short because of the inherent discrepancy between these domains, which can manifest as differing data distributions and feature spaces. This paper introduces a comprehensive domain-adaptive object detection framework. Built on deep transfer learning, the framework is designed to generalize robustly from labelled clear-weather data to unlabelled adverse-weather conditions, enhancing the performance of deep learning-based object detection models. Central to our approach is the novel Patch Entropy Fusion Module (PEFM), which dynamically integrates sensor data, emphasizing critical information while suppressing background distractions. It is complemented by a Weighted Decision Module (WDM) that adjusts the contribution of each sensor according to its efficacy under specific environmental conditions, thereby optimizing detection accuracy. Additionally, we apply a domain alignment loss during transfer learning to ensure effective domain adaptation by regularizing feature-map discrepancies between clear- and adverse-weather datasets. We evaluate our model on diverse datasets, including ExDark (unimodal), Cityscapes (unimodal), and Dense (multimodal), and it ranked first on all of them at the time of our evaluation.
AB - Integrating diverse representations from complementary sensing modalities is essential for robust scene interpretation in autonomous driving. Deep learning architectures that fuse vision and range data have advanced 2D and 3D object detection in recent years. However, these modalities often degrade in adverse weather or lighting conditions, reducing detection performance. While domain adaptation methods have been developed to bridge the gap between source and target domains, they typically fall short because of the inherent discrepancy between these domains, which can manifest as differing data distributions and feature spaces. This paper introduces a comprehensive domain-adaptive object detection framework. Built on deep transfer learning, the framework is designed to generalize robustly from labelled clear-weather data to unlabelled adverse-weather conditions, enhancing the performance of deep learning-based object detection models. Central to our approach is the novel Patch Entropy Fusion Module (PEFM), which dynamically integrates sensor data, emphasizing critical information while suppressing background distractions. It is complemented by a Weighted Decision Module (WDM) that adjusts the contribution of each sensor according to its efficacy under specific environmental conditions, thereby optimizing detection accuracy. Additionally, we apply a domain alignment loss during transfer learning to ensure effective domain adaptation by regularizing feature-map discrepancies between clear- and adverse-weather datasets. We evaluate our model on diverse datasets, including ExDark (unimodal), Cityscapes (unimodal), and Dense (multimodal), and it ranked first on all of them at the time of our evaluation.
KW - Domain adaptation
KW - Entropy fusion
KW - Multimodal fusion
UR - http://www.scopus.com/inward/record.url?scp=85213372201&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-0972-7_2
DO - 10.1007/978-981-96-0972-7_2
M3 - Conference Proceeding
AN - SCOPUS:85213372201
SN - 9789819609710
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 22
EP - 38
BT - Computer Vision – ACCV 2024 - 17th Asian Conference on Computer Vision, Proceedings
A2 - Cho, Minsu
A2 - Laptev, Ivan
A2 - Tran, Du
A2 - Yao, Angela
A2 - Zha, Hongbin
PB - Springer Science and Business Media Deutschland GmbH
T2 - 17th Asian Conference on Computer Vision, ACCV 2024
Y2 - 8 December 2024 through 12 December 2024
ER -