Abstract
Visual tracking is important for enabling diverse practical applications, yet critical challenges persist in two key aspects: target characterization and motion dynamics. Foreground-background discrimination becomes problematic under real-world complexities such as occlusion and scale variation, necessitating highly discriminative feature extraction. Moreover, appearance changes during target motion render static template strategies insufficient, demanding dynamic template updates to ensure continuity and prevent tracker drift. In this paper, we present FGTrack, a novel single object tracker that addresses these challenges from two perspectives. First, Distribution-Aware Mask Modeling (DMM) enhances feature discriminability by leveraging the Transformer attention distribution in conjunction with GridShift clustering to generate a nuanced foreground mask. Building upon token candidate elimination from the one-stream training process, this approach employs a simple yet efficient adaptive clustering step to achieve precise foreground localization without the need for manual threshold adjustment. It effectively suppresses background interference by utilizing the token correlation between the template and search regions. Second, Temporal Feature Propagation (TFP) ensures motion consistency by integrating autoregressive queries with spatio-temporal features. The TFP module maintains a dynamically updated query queue and aggregates historical features through a temporal attention mechanism. The spatio-temporal fusion maintains adaptive template updates and correlates the current frame's spatially encoded features with historical queries, capturing target evolution patterns through multi-head cross-attention.
Experiments across five short-term and two long-term benchmarks demonstrate FGTrack's superiority over state-of-the-art trackers, particularly in occlusion and deformation scenarios, validating its balanced approach to spatial discrimination and temporal coherence. The code will be released at https://github.com/BroCome25/FGTrack.
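The two mechanisms described in the abstract can be illustrated with a minimal sketch. It is not the FGTrack implementation: the function `adaptive_foreground_mask` and the class `QueryQueue` are hypothetical names, a plain one-dimensional two-cluster split stands in for GridShift clustering, and a single-head dot-product attention without learned projections stands in for the multi-head cross-attention. The sketch only shows how a foreground mask might be derived from attention scores without a hand-set threshold, and how a bounded queue of historical queries might be aggregated against the current frame's feature.

```python
import math
from collections import deque


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_foreground_mask(attn, iters=20):
    """Mark high-attention tokens as foreground without a manual threshold.

    Assumption: a 1-D two-cluster split (2-means) replaces GridShift here.
    The threshold is the midpoint between the two cluster means.
    """
    lo, hi = min(attn), max(attn)
    if lo == hi:
        # Uniform attention: no separable foreground cluster.
        return [False] * len(attn)
    for _ in range(iters):
        thr = (lo + hi) / 2
        low = [a for a in attn if a <= thr]
        high = [a for a in attn if a > thr]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # cluster means stopped moving
        lo, hi = new_lo, new_hi
    thr = (lo + hi) / 2
    return [a > thr for a in attn]


class QueryQueue:
    """Bounded queue of historical query vectors (hypothetical helper)."""

    def __init__(self, max_len=4):
        # deque with maxlen drops the oldest query when a new one arrives.
        self.queries = deque(maxlen=max_len)

    def push(self, query):
        self.queries.append(list(query))

    def aggregate(self, feature):
        """Attend from the current feature over the stored queries.

        Single-head scaled dot-product attention, no learned weights.
        """
        if not self.queries:
            return list(feature)
        d = len(feature)
        scores = [
            sum(f * q for f, q in zip(feature, qv)) / math.sqrt(d)
            for qv in self.queries
        ]
        weights = softmax(scores)
        return [
            sum(w * qv[j] for w, qv in zip(weights, self.queries))
            for j in range(d)
        ]


# Toy usage: seven search-region tokens, three of which attend strongly.
mask = adaptive_foreground_mask([0.02, 0.03, 0.01, 0.30, 0.35, 0.04, 0.25])

queue = QueryQueue(max_len=2)
for q in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    queue.push(q)  # the first query is evicted once the queue is full
fused = queue.aggregate([0.0, 1.0])
```

In this toy setting the split separates the three high-attention tokens from the rest, and the queue keeps only the two most recent queries. A real tracker would operate on learned token embeddings and per-head projections rather than raw lists.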
| Original language | English |
|---|---|
| Article number | 114208 |
| Journal | Knowledge-Based Systems |
| Volume | 328 |
| DOIs | |
| Publication status | Published - 25 Oct 2025 |
Keywords
- Distribution-aware mask modeling
- Temporal feature propagation
- Visual object tracking
Title: Fine-grained visual tracking via distribution-aware mask modeling and temporal propagation