ML-VehicleDet: A Unified Multi-label Vehicle Detection Framework

Runwei Guan, Shanliang Yao, Xiaohui Zhu, Jeremy Smith, Ka Lok Man, Yutao Yue*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

Abstract

Vehicle detection based on deep learning has developed rapidly and settled into an established pattern: almost all work concentrates on single-label object detection. In the real world, however, a human observer sees a vehicle as having multiple attributes. When we look at a car, we tend to perceive its type, color, orientation, and other attributes. The categories of these labels are neither related to one another nor hierarchical. For a neural network, this means it must output multiple labels for a vehicle while regressing its bounding box. For traffic supervisors, a multi-label vehicle detection system could help locate a targeted vehicle efficiently. Moreover, there has been no research on multi-label vehicle detection so far. We therefore design and develop a unified multi-label vehicle detection framework called ML-VehicleDet, which detects the location (bounding box), type, color, and orientation of a vehicle at the same time. First, we design a hybrid one-stage object detection neural network with a ViT encoder and a CNN decoder called Swin Only Look Once, abbreviated as SOLO. SOLO is an anchor-free detector. Second, we design a practical loss function framework called MLC Loss, which comprises two loss functions, MLC-OM and MLC-OO, for two different annotation schemes of multi-label detection; it is specialized for alleviating the mutual inhibition problem in multi-label classification. Third, we design a low-cost NMS algorithm called ML-NMS, specialized for merging bounding boxes that carry multiple labels for one vehicle. Furthermore, we reconstruct UA-DETRAC as a multi-label vehicle detection dataset (benchmark) called UA-DETRAC-ML. To the best of our knowledge, UA-DETRAC-ML is the first unified multi-label vehicle detection dataset. On UA-DETRAC-ML, ML-VehicleDet achieves 70.23% mAP, outperforming YOLOv5-M and YOLOX-M.
To promote the development of the community, we release UA-DETRAC-ML at https://github.com/GuanRunwei/UA-DETRAC-ML.
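
The idea of attaching several independent labels to a single bounding box can be illustrated with a minimal sketch. It is not the paper's SOLO network, MLC Loss, or ML-NMS, whose details the abstract does not give. The label vocabularies, the `Detection` class, the way the score is combined, and the plain greedy class-agnostic NMS below are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical label vocabularies; the actual UA-DETRAC-ML classes may differ.
ATTRIBUTES = {
    "type": ["car", "bus", "van", "other"],
    "color": ["white", "black", "red", "blue"],
    "orientation": ["front", "rear", "side"],
}


@dataclass
class Detection:
    box: tuple    # (x1, y1, x2, y2)
    labels: dict  # one label per attribute group
    score: float  # objectness times the top score of each group (an assumption)


def decode(box, objectness, group_scores):
    """Pick the top-scoring label in each attribute group independently."""
    labels, score = {}, objectness
    for name, classes in ATTRIBUTES.items():
        scores = group_scores[name]
        best = max(range(len(classes)), key=scores.__getitem__)
        labels[name] = classes[best]
        score *= scores[best]
    return Detection(box, labels, score)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy class-agnostic NMS: keep one box per vehicle, labels travel with it."""
    kept = []
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        if all(iou(det.box, k.box) < iou_threshold for k in kept):
            kept.append(det)
    return kept
```

Because suppression here ignores the labels, two overlapping boxes for the same vehicle collapse to one detection that still carries a type, a color, and an orientation. A per-class NMS, by contrast, could leave several boxes on one vehicle, one for each label combination.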

Original language: English
Title of host publication: Proceedings of the 15th International Conference on Digital Image Processing, ICDIP 2023
Publisher: Association for Computing Machinery
ISBN (Electronic): 9798400708237
Publication status: Published - 19 May 2023
Event: 15th International Conference on Digital Image Processing, ICDIP 2023 - Nanjing, China
Duration: 19 May 2023 to 22 May 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 15th International Conference on Digital Image Processing, ICDIP 2023
Country/Territory: China
City: Nanjing
Period: 19/05/23 to 22/05/23

Keywords

  • Multi-label object detection
  • hybrid neural network
  • multi-label classification
  • non-maximum suppression
