TY - GEN
T1 - AlignBot: Aligning VLM-Powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots
AU - Zhaxizhuoma,
AU - Chen, Pengan
AU - Wu, Ziniu
AU - Sun, Jiawei
AU - Wang, Dong
AU - Zhou, Peng
AU - Cao, Nieqing
AU - Ding, Yan
AU - Zhao, Bin
AU - Li, Xuelong
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task planning for household robots by effectively aligning with user reminders. In domestic settings, aligning task planning with user reminders poses significant challenges due to the limited quantity, diversity, and multimodal nature of the reminders. To address these challenges, AlignBot employs a fine-tuned LLaVA-7B model, functioning as an adapter for GPT-40. This adapter model internalizes diverse forms of user reminders-such as personalized preferences, corrective guidance, and contextual assistance-into structured instruction-formatted cues that prompt GPT-40 in generating customized task plans. Additionally, AlignBot integrates a dynamic retrieval mechanism that selects task-relevant historical successes as prompts for GPT-40, further enhancing task planning accuracy. To validate the effectiveness of AlignBot, experiments are conducted in real-world household environments, which are constructed within the laboratory to replicate typical household settings. A multimodal dataset with over 1,500 entries derived from volunteer reminders is used for training and evaluation. The results demonstrate that AlignBot significantly improves customized task planning, outperforming existing LLM- and VLM-powered planners by interpreting and aligning with user reminders, achieving 86.8% success rate compared to the vanilla GPT-40 baseline at 21.6%, reflecting a 65% improvement and over four times greater effectiveness. Supplementary materials are available at: https://yding25.com/AlignBot/
AB - This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task planning for household robots by effectively aligning with user reminders. In domestic settings, aligning task planning with user reminders poses significant challenges due to the limited quantity, diversity, and multimodal nature of the reminders. To address these challenges, AlignBot employs a fine-tuned LLaVA-7B model, functioning as an adapter for GPT-40. This adapter model internalizes diverse forms of user reminders-such as personalized preferences, corrective guidance, and contextual assistance-into structured instruction-formatted cues that prompt GPT-40 in generating customized task plans. Additionally, AlignBot integrates a dynamic retrieval mechanism that selects task-relevant historical successes as prompts for GPT-40, further enhancing task planning accuracy. To validate the effectiveness of AlignBot, experiments are conducted in real-world household environments, which are constructed within the laboratory to replicate typical household settings. A multimodal dataset with over 1,500 entries derived from volunteer reminders is used for training and evaluation. The results demonstrate that AlignBot significantly improves customized task planning, outperforming existing LLM- and VLM-powered planners by interpreting and aligning with user reminders, achieving 86.8% success rate compared to the vanilla GPT-40 baseline at 21.6%, reflecting a 65% improvement and over four times greater effectiveness. Supplementary materials are available at: https://yding25.com/AlignBot/
UR - https://www.scopus.com/pages/publications/105016602083
U2 - 10.48550/arXiv.2409.11905
DO - 10.48550/arXiv.2409.11905
M3 - Conference Proceeding
AN - SCOPUS:105016602083
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 12549
EP - 12556
BT - 2025 IEEE International Conference on Robotics and Automation, ICRA 2025
A2 - Ott, Christian
A2 - Admoni, Henny
A2 - Behnke, Sven
A2 - Bogdan, Stjepan
A2 - Bolopion, Aude
A2 - Choi, Youngjin
A2 - Ficuciello, Fanny
A2 - Gans, Nicholas
A2 - Gosselin, Clement
A2 - Harada, Kensuke
A2 - Kayacan, Erdal
A2 - Kim, H. Jin
A2 - Leutenegger, Stefan
A2 - Liu, Zhe
A2 - Maiolino, Perla
A2 - Marques, Lino
A2 - Matsubara, Takamitsu
A2 - Mavromatti, Anastasia
A2 - Minor, Mark
A2 - O'Kane, Jason
A2 - Park, Hae Won
A2 - Park, Hae-Won
A2 - Rekleitis, Ioannis
A2 - Renda, Federico
A2 - Ricci, Elisa
A2 - Riek, Laurel D.
A2 - Sabattini, Lorenzo
A2 - Shen, Shaojie
A2 - Sun, Yu
A2 - Wieber, Pierre-Brice
A2 - Yamane, Katsu
A2 - Yu, Jingjin
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE International Conference on Robotics and Automation, ICRA 2025
Y2 - 19 May 2025 through 23 May 2025
ER -