
Achelous++: Power-oriented multi-task panoptic waterway perception framework based on vision-radar fusion

  • Runwei Guan
  • Haocheng Zhao
  • Shanliang Yao
  • Limin Yu
  • Xiaohui Zhu
  • Ryan Wen Liu
  • Eng Gee Lim
  • Weiping Ding
  • Yutao Yue*
  • Hui Xiong

*Corresponding author for this work

  • The Hong Kong University of Science and Technology (Guangzhou)
  • Wuhan University of Technology
  • Xi'an Jiaotong-Liverpool University
  • Yancheng Institute of Technology
  • Nantong University

Research output: Contribution to journal › Article › peer-review

Abstract

Multi-task panoptic perception leveraging multi-sensor fusion is crucial for comprehensively understanding waterway environments, which enhances the robust monitoring and autonomous navigation of unmanned surface vessels. However, the fragmented design inherent in multi-modal and multi-task neural networks inevitably leads to decreased inference speed and increased energy consumption. Therefore, we focus on developing a low-power, lightweight multi-task panoptic perception framework with high flexibility for development. In this paper, we propose an end-to-end framework named Achelous++, capable of executing five perception tasks concurrently with high speed and low power consumption: object detection, semantic segmentation, drivable-area segmentation, waterline segmentation, and radar point cloud semantic segmentation. Notably, we design an efficient vision-radar fusion module, termed Gating Adaptive Fusion (GAF), to enhance fusion-based perception tasks cost-effectively within a shared computational space. Moreover, we design a dynamic feature routing module called Edge-Context Weighting (ECW) for feature selection in multi-segmentation tasks. Building on this, we also design a series of metrics to evaluate the energy consumption of multi-task perception. Overall, our Achelous++ framework achieves state-of-the-art performance on the WaterScenes benchmark. Specifically, the best-performing model of the Achelous++ framework outperforms other models by approximately 5% mAP and 7% mIoU in object detection and multiple semantic segmentation tasks, while maintaining over 20 FPS and power consumption under 20 W on Orin. To the best of our knowledge, Achelous++ is the first fusion-based framework for panoptic perception that integrates five perception tasks.
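
The speed and power figures in the abstract can be combined into a rough per-frame energy bound. The sketch below is only an illustration: the function name `energy_per_frame_joules` is hypothetical, and energy per frame is a generic quantity (average power divided by throughput), not necessarily one of the metrics the authors define. The abstract does not specify those metrics.

```python
def energy_per_frame_joules(power_watts: float, fps: float) -> float:
    """Average energy per processed frame.

    Watts are joules per second, so dividing by frames per second
    yields joules per frame. Hypothetical helper for illustration only.
    """
    if power_watts < 0 or fps <= 0:
        raise ValueError("power must be non-negative and fps must be positive")
    return power_watts / fps


# Bounds quoted in the abstract: power under 20 W, throughput over 20 FPS.
# Plugging in the two limits gives an upper bound on energy per frame.
upper_bound = energy_per_frame_joules(20.0, 20.0)
print(f"Energy per frame is below {upper_bound:.2f} J")
```

Because the reported power is below 20 W and the reported throughput is above 20 FPS, the actual energy per frame implied by the abstract is below this bound.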

Original language: English
Article number: 114787
Journal: Applied Soft Computing
Volume: 192
Publication status: Published - Apr 2026

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • Low-power model
  • Multi-task perception framework
  • Vision-radar fusion
  • Water-surface perception
