TY - GEN
T1 - NanoMVG
T2 - 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025
AU - Guan, Runwei
AU - Liu, Jianan
AU - Jia, Liye
AU - Zhao, Haocheng
AU - Yao, Shanliang
AU - Zhu, Xiaohui
AU - Man, Ka Lok
AU - Lim, Eng Gee
AU - Smith, Jeremy
AU - Yue, Yutao
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Recently, visual grounding and multi-sensor settings have been incorporated into perception systems for terrestrial autonomous driving and Unmanned Surface Vessels (USVs), yet the high complexity of modern learning-based, multi-sensor visual grounding models prevents such models from being deployed on USVs in real-world settings. To this end, we design a low-power multi-task model named NanoMVG for waterway embodied perception, guiding both camera and 4D millimeter-wave radar to locate specific object(s) through natural language. NanoMVG can perform box-level and mask-level visual grounding tasks simultaneously. Compared to other visual grounding models, NanoMVG achieves highly competitive performance on the WaterVG dataset, particularly in harsh environments. Moreover, real-world experiments deploying NanoMVG on the embedded edge device of a USV demonstrate its fast inference speed for real-time perception and its ultra-low power consumption for long endurance.
AB - Recently, visual grounding and multi-sensor settings have been incorporated into perception systems for terrestrial autonomous driving and Unmanned Surface Vessels (USVs), yet the high complexity of modern learning-based, multi-sensor visual grounding models prevents such models from being deployed on USVs in real-world settings. To this end, we design a low-power multi-task model named NanoMVG for waterway embodied perception, guiding both camera and 4D millimeter-wave radar to locate specific object(s) through natural language. NanoMVG can perform box-level and mask-level visual grounding tasks simultaneously. Compared to other visual grounding models, NanoMVG achieves highly competitive performance on the WaterVG dataset, particularly in harsh environments. Moreover, real-world experiments deploying NanoMVG on the embedded edge device of a USV demonstrate its fast inference speed for real-time perception and its ultra-low power consumption for long endurance.
UR - https://www.scopus.com/pages/publications/105029916835
U2 - 10.1109/IROS60139.2025.11246532
DO - 10.1109/IROS60139.2025.11246532
M3 - Conference Proceeding
AN - SCOPUS:105029916835
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 11789
EP - 11796
BT - IROS 2025 - 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, Conference Proceedings
A2 - Laugier, Christian
A2 - Renzaglia, Alessandro
A2 - Atanasov, Nikolay
A2 - Birchfield, Stan
A2 - Cielniak, Grzegorz
A2 - De Mattos, Leonardo
A2 - Fiorini, Laura
A2 - Giguere, Philippe
A2 - Hashimoto, Kenji
A2 - Ibanez-Guzman, Javier
A2 - Kamegawa, Tetsushi
A2 - Lee, Jinoh
A2 - Loianno, Giuseppe
A2 - Luck, Kevin
A2 - Maruyama, Hisataka
A2 - Martinet, Philippe
A2 - Moradi, Hadi
A2 - Nunes, Urbano
A2 - Pettre, Julien
A2 - Pretto, Alberto
A2 - Ranzani, Tommaso
A2 - Ronnau, Arne
A2 - Rossi, Silvia
A2 - Rouse, Elliott
A2 - Ruggiero, Fabio
A2 - Simonin, Olivier
A2 - Wang, Danwei
A2 - Yang, Ming
A2 - Yoshida, Eiichi
A2 - Zhao, Huijing
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 October 2025 through 25 October 2025
ER -