Study on Adoption of 3D Computer Vision and Speech Recognition in Intelligent Manufacturing

Quan Zhang*, Mark Leach, Eng Gee Lim

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding, Conference Proceeding, peer-review


This work aims to adopt state-of-the-art artificial intelligence (AI) based technology in intelligent manufacturing applications, which involve using 3D computer vision and speech recognition (voice commands) to control typical intelligent manufacturing systems such as robotic workstations. As a first attempt, we consider a relatively simple manufacturing scenario, i.e. object grasping, with a 3-axis Cartesian robot arm. On this workstation, we have developed a 3D vision system that uses the Intel RealSense D455 depth camera for object detection and capture, with the captured data processed by the SSD algorithm. We have also developed a speech (audio or voice) recognition system based on our proposed algorithm, which combines LSTM and GMM-HMM audio recognition models. Good performance is achieved in terms of both efficiency and accuracy in controlling normal operation, as well as the emergency stop functionality, of our robotic workstation. Possible future improvements to our system and the relevant methodology are discussed.
Original language: English
Title of host publication: Proceedings of the 28th IEEE International Conference on Automation and Computing (ICAC2023)
Publication status: Published - 2023


