TY - JOUR
T1 - Uni-EPM
T2 - A Unified Extensible Perception Model Without Labeling Everything
AU - Gao, Yilin
AU - Mu, Shiyi
AU - Xu, Shugong
N1 - Publisher Copyright:
© 2000-2011 IEEE.
PY - 2024
Y1 - 2024
N2 - A multi-task perception system that simultaneously perceives various kinds of objects is essential for autonomous driving. Existing perception frameworks rely on multi-labeled datasets, which contain labels for all pertinent objects, and this constrains their ability to leverage specialized, task-oriented datasets and hinders the efficient use of abundant but focused data. Furthermore, stacking multiple expert networks to address these perception objectives inevitably introduces additional computational overhead. To address these limitations, we propose Uni-EPM (Unified Extensible Perception Model), with a novel training framework for multi-task perception that uses task prompt selection to decouple tasks, which enables perceiving traffic signs and traffic lights in addition to lane lines and traffic elements from existing task-specific datasets without re-labeling. To the best of our knowledge, Uni-EPM is the first model that can do this in the field of autonomous driving. By introducing a parameter-sharing decoder among tasks, we alleviate the problems of stacking task heads, such as a significant increase in parameters. Uni-EPM achieves state-of-the-art results among multi-task algorithms without a substantial increase in parameters and demonstrates performance comparable to existing standalone models. The efficiency of the design is validated through comprehensive ablation experiments.
AB - A multi-task perception system that simultaneously perceives various kinds of objects is essential for autonomous driving. Existing perception frameworks rely on multi-labeled datasets, which contain labels for all pertinent objects, and this constrains their ability to leverage specialized, task-oriented datasets and hinders the efficient use of abundant but focused data. Furthermore, stacking multiple expert networks to address these perception objectives inevitably introduces additional computational overhead. To address these limitations, we propose Uni-EPM (Unified Extensible Perception Model), with a novel training framework for multi-task perception that uses task prompt selection to decouple tasks, which enables perceiving traffic signs and traffic lights in addition to lane lines and traffic elements from existing task-specific datasets without re-labeling. To the best of our knowledge, Uni-EPM is the first model that can do this in the field of autonomous driving. By introducing a parameter-sharing decoder among tasks, we alleviate the problems of stacking task heads, such as a significant increase in parameters. Uni-EPM achieves state-of-the-art results among multi-task algorithms without a substantial increase in parameters and demonstrates performance comparable to existing standalone models. The efficiency of the design is validated through comprehensive ablation experiments.
KW - Multi-task
KW - panoptic driving perception
KW - unified framework
UR - http://www.scopus.com/inward/record.url?scp=85208755622&partnerID=8YFLogxK
U2 - 10.1109/TITS.2024.3487962
DO - 10.1109/TITS.2024.3487962
M3 - Article
AN - SCOPUS:85208755622
SN - 1524-9050
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
ER -