TY - JOUR
T1 - Hybrid Multi-Class Token Vision Transformer Convolutional Network for DOA Estimation
AU - Xie, Yuxuan
AU - Liu, Aifei
AU - Lu, Xinyu
AU - Chong, Dufei
N1 - Publisher Copyright: © 1994-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - In this letter, we propose an efficient hybrid model, named HMC-ViT, that combines a convolutional neural network (CNN) with a multi-class token vision transformer (ViT) to address the problem of direction of arrival (DOA) estimation. HMC-ViT integrates the local feature extraction capability of CNN with the global feature extraction capability of ViT to enhance DOA estimation performance and improve the computational efficiency of ViT. Additionally, the ViT component employs multiple class tokens in parallel to generate spatial spectra for sub-regions, further enhancing the model's performance. Simulation results demonstrate that the proposed method outperforms existing approaches under low signal-to-noise ratio (SNR) scenarios.
KW - Convolutional neural network (CNN)
KW - deep learning
KW - direction of arrival (DOA) estimation
KW - vision transformer (ViT)
UR - http://www.scopus.com/inward/record.url?scp=105006801068&partnerID=8YFLogxK
U2 - 10.1109/LSP.2025.3573949
DO - 10.1109/LSP.2025.3573949
M3 - Article
AN - SCOPUS:105006801068
SN - 1070-9908
VL - 32
SP - 2279
EP - 2283
JO - IEEE Signal Processing Letters
JF - IEEE Signal Processing Letters
ER -