Cross-Scenario Foundation Localization Models: Architecture, Key Technologies, and Challenges

Haonan Si, Xiansheng Guo*, Gordon Owusu Boateng, Huang Xia

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

As a core technology in ubiquitous mobile intelligence, indoor localization is shifting from the traditional “single scenario-driven,” static paradigm to a “cross-scenario,” generalized dynamic localization paradigm. However, high dynamics and uncertainty across scenarios, sensor heterogeneity, and the lack of standardized training data pose severe challenges to dynamic localization. Recently, foundation models (e.g., DeepSeek, GPT, and LLaMA) have attracted an unprecedented wave of research interest by virtue of their ability to adapt seamlessly to multiple downstream tasks across diverse scenarios. Inspired by these capabilities and by the demand for cross-scenario dynamic localization, we introduce the concept of Foundation Localization Models (FLMs), which aim to achieve cross-scenario and cross-modality generalized localization. We then propose an FLM architecture and three key techniques: geometry-aligned backbone components, multi-modality co-temporal-spatial attention, and a multi-dimensional data generation framework. Together, these lay a theoretical foundation for cross-scenario generalized localization and clarify the coupling mechanisms between localization tasks and their environments in complex scenarios. Experimental results are presented to demonstrate the feasibility and efficiency of the proposed FLM concept. Finally, several challenges and future research directions for implementing the FLM framework are highlighted.

Original language: English
Journal: IEEE Wireless Communications
Publication status: Accepted/In press - 2025