TY - GEN
T1 - Game Engine Based Multi-View Video Dataset Synthesis for Pedestrian Detection and Tracking
AU - Pan, Xiaonan
AU - Sun, Qilei
AU - Wang, Jia
AU - Lim, Eng Gee
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Multi-view deep learning models have demonstrated significant promise in addressing pedestrian detection and tracking challenges, such as heavy occlusion in monocular cameras and their restricted field of view. However, these models demand a considerable volume of training data, the acquisition of which is time-consuming, labour-intensive, and further complicated by privacy and ethical concerns. The currently available public multi-view datasets are insufficient to support the extensive training required for these models. To address the paucity of multi-view training data, this paper presents a novel multi-view synthetic dataset pipeline, named WildPerception, built on the integrated techniques of Unity Perception, the SyntheticHumans Package and MultiviewX. WildPerception simulates pedestrians in a photo-realistic scene observed from multiple overlapping views, enabling instant generation of large-scale, labeled video training datasets in WILDTRACK format. Our pipeline is modular and can be easily tailored to the demands of diverse multi-view tasks. Experiments were carried out to validate the efficiency of this pipeline. Moreover, models trained on these synthesized datasets also exhibit robust adaptability when deployed on datasets gathered from novel environments. The code for the pipeline is publicly available on GitHub at https://github.com/TsingLoo/com.tsingloo.wildperception to facilitate reproducibility and further research.
AB - Multi-view deep learning models have demonstrated significant promise in addressing pedestrian detection and tracking challenges, such as heavy occlusion in monocular cameras and their restricted field of view. However, these models demand a considerable volume of training data, the acquisition of which is time-consuming, labour-intensive, and further complicated by privacy and ethical concerns. The currently available public multi-view datasets are insufficient to support the extensive training required for these models. To address the paucity of multi-view training data, this paper presents a novel multi-view synthetic dataset pipeline, named WildPerception, built on the integrated techniques of Unity Perception, the SyntheticHumans Package and MultiviewX. WildPerception simulates pedestrians in a photo-realistic scene observed from multiple overlapping views, enabling instant generation of large-scale, labeled video training datasets in WILDTRACK format. Our pipeline is modular and can be easily tailored to the demands of diverse multi-view tasks. Experiments were carried out to validate the efficiency of this pipeline. Moreover, models trained on these synthesized datasets also exhibit robust adaptability when deployed on datasets gathered from novel environments. The code for the pipeline is publicly available on GitHub at https://github.com/TsingLoo/com.tsingloo.wildperception to facilitate reproducibility and further research.
KW - computer graphics
KW - computer vision
KW - dataset synthesis
KW - multi-view
KW - pedestrian detection and tracking
UR - http://www.scopus.com/inward/record.url?scp=85211444441&partnerID=8YFLogxK
U2 - 10.1109/MetaCom62920.2024.00049
DO - 10.1109/MetaCom62920.2024.00049
M3 - Conference Proceeding
AN - SCOPUS:85211444441
T3 - Proceedings - 2024 IEEE International Conference on Metaverse Computing, Networking, and Applications, MetaCom 2024
SP - 259
EP - 264
BT - Proceedings - 2024 IEEE International Conference on Metaverse Computing, Networking, and Applications, MetaCom 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd IEEE International Conference on Metaverse Computing, Networking, and Applications, MetaCom 2024
Y2 - 12 August 2024 through 14 August 2024
ER -