TY - JOUR
T1 - Counting with ease
T2 - Class-agnostic counting via one-shot detection across diverse domains
AU - Peng, Zhongxing
AU - Guo, Bohui
AU - Xu, Shugong
N1 - Publisher Copyright: © 2025
PY - 2026/1
Y1 - 2026/1
AB - Class-agnostic counting is increasingly prevalent in industrial and agricultural applications. However, most deployable methods rely on density maps, which (1) struggle with background interference in complex scenes, and (2) fail to provide precise object locations, limiting downstream usability. The advancement of class-agnostic counting is hindered by suboptimal model designs and the lack of datasets with bounding box annotations. While some studies explore text-guided methods using multimodal models, they remain impractical for edge deployment and are beyond the scope of this study. To address these limitations, we diverge from traditional counting paradigms and propose a novel Class-Agnostic Counting and Localization (CACAL) framework, which performs accurate object counting and localization using a single query image, streamlining the process for real-world use. First, we introduce a Sampling-Aware Feature Enhancement module to improve feature discriminability and mitigate confusion in shared-encoder settings. Second, we design a Split-and-Assemble Feature Matching strategy to produce structurally aware similarity maps, boosting performance in cluttered and occluded scenarios. To further advance the field, we introduce the LOCO dataset, a large-scale benchmark with both point and bounding box annotations across industrial, agricultural, and daily-life domains. CACAL consistently outperforms existing methods across multiple benchmarks and demonstrates strong generalization across diverse domains. Our dataset will be released at: https://github.com/imMid-Star/CACAL.
KW - Class-agnostic counting
KW - Feature enhancement and matching
KW - One-shot
KW - Sampling-aware mechanism
UR - https://www.scopus.com/pages/publications/105013786469
U2 - 10.1016/j.neunet.2025.107961
DO - 10.1016/j.neunet.2025.107961
M3 - Article
AN - SCOPUS:105013786469
SN - 0893-6080
VL - 193
JO - Neural Networks
JF - Neural Networks
M1 - 107961
ER -