TY - GEN
T1 - Audio-Radar SMC-PHD Filtering for Indoor Multi-Speaker Tracking
AU - Zhou, Yi
AU - Lopez-Benitez, Miguel
AU - Yu, Limin
AU - Yue, Yutao
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - High-resolution millimetre-wave (mmWave) radar sensors have become increasingly popular in consumer markets. This study addresses the challenge of tracking multiple active speakers in indoor environments using a high-resolution radar and a microphone array. In our experiments, we observed that the Sequential Monte Carlo Probability Hypothesis Density (SMC-PHD) filter, when given point cloud data from a high-resolution radar as input, provides promising tracking performance. In this work, we add a second modality, audio, to the radar SMC-PHD filtering framework for the active speaker tracking task. Specifically, we use the audio Direction of Arrival (DoA) to guide the particle birth and relocation processes in the SMC-PHD filtering framework. Furthermore, we propose a likelihood function that jointly considers the spatial and angular estimates from radar and audio. Experimental results on the RAV4D dataset demonstrate that our audio-radar SMC-PHD filtering approach produces reliable trajectories, especially in challenging cases such as a varying number of speakers.
KW - audio-radar fusion
KW - object tracking
KW - PHD filtering
UR - http://www.scopus.com/inward/record.url?scp=85206071760&partnerID=8YFLogxK
U2 - 10.1109/ICSIP61881.2024.10671483
DO - 10.1109/ICSIP61881.2024.10671483
M3 - Conference Proceeding
AN - SCOPUS:85206071760
T3 - 2024 9th International Conference on Signal and Image Processing, ICSIP 2024
SP - 282
EP - 286
BT - 2024 9th International Conference on Signal and Image Processing, ICSIP 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th International Conference on Signal and Image Processing, ICSIP 2024
Y2 - 12 July 2024 through 14 July 2024
ER -