Study on Adoption of 3D Computer Vision and Speech Recognition in Intelligent Manufacturing

Quan Zhang; Mark Leach; Eng Gee Lim

doi:10.1109/ICAC57885.2023.10275158

Quan Zhang^*, Mark Leach, Eng Gee Lim

^*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

This work aims to adopt state-of-the-art, artificial intelligence (AI) based technology in intelligent manufacturing applications, which involve using 3D computer vision and speech recognition (voice commands) for control of typical intelligent manufacturing systems such as robotic workstations. As a first attempt, we consider a relatively simple scenario in manufacturing, i.e. object grasping, with a 3-axis Cartesian robot arm. On this workstation, we have developed a 3D vision system using the Intel RealSense D455 depth camera for object detection and capture, which is processed by the SSD algorithm. We have also developed a speech (audio or voice) recognition system with our proposed algorithm, combining both LSTM and GMM-HMM audio recognition models. Good performance is achieved in terms of both efficiency and accuracy in controlling normal operation, as well as emergency stop functionality for our robotic workstation. Possible future improvements to our system and relevant methodology are discussed.

Original language	English
Title of host publication	Proceedings of the 28th IEEE International Conference on Automation and Computing (ICAC2023)
DOIs	https://doi.org/10.1109/ICAC57885.2023.10275158
Publication status	Published - 2023

Cite this

@inproceedings{e20f6fd87a0a4f40addd9ddee5831b28,

title = "Study on Adoption of 3D Computer Vision and Speech Recognition in Intelligent Manufacturing",

abstract = "This work aims to adopt state-of-the-art, artificial intelligence (AI) based technology in intelligent manufacturing applications, which involve using 3D computer vision and speech recognition (voice commands) for control of typical intelligent manufacturing systems such as robotic workstations. As a first attempt, we consider a relatively simple scenario in manufacturing, i.e. object grasping, with a 3-axis Cartesian robot arm. On this workstation, we have developed a 3D vision system using the Intel RealSense D455 depth camera for object detection and capture, which is processed by the SSD algorithm. We have also developed a speech (audio or voice) recognition system with our proposed algorithm, combining both LSTM and GMM-HMM audio recognition models. Good performance is achieved in terms of both efficiency and accuracy in controlling normal operation, as well as emergency stop functionality for our robotic workstation. Possible future improvements to our system and relevant methodology are discussed.",

author = "Quan Zhang and Mark Leach and Lim, {Eng Gee}",

year = "2023",

doi = "10.1109/ICAC57885.2023.10275158",

language = "English",

booktitle = "Proceedings of the 28th IEEE International Conference on Automation and Computing (ICAC2023)",

}

TY - GEN

T1 - Study on Adoption of 3D Computer Vision and Speech Recognition in Intelligent Manufacturing

AU - Zhang, Quan

AU - Leach, Mark

AU - Lim, Eng Gee

PY - 2023

Y1 - 2023

N2 - This work aims to adopt state-of-the-art, artificial intelligence (AI) based technology in intelligent manufacturing applications, which involve using 3D computer vision and speech recognition (voice commands) for control of typical intelligent manufacturing systems such as robotic workstations. As a first attempt, we consider a relatively simple scenario in manufacturing, i.e. object grasping, with a 3-axis Cartesian robot arm. On this workstation, we have developed a 3D vision system using the Intel RealSense D455 depth camera for object detection and capture, which is processed by the SSD algorithm. We have also developed a speech (audio or voice) recognition system with our proposed algorithm, combining both LSTM and GMM-HMM audio recognition models. Good performance is achieved in terms of both efficiency and accuracy in controlling normal operation, as well as emergency stop functionality for our robotic workstation. Possible future improvements to our system and relevant methodology are discussed.

AB - This work aims to adopt state-of-the-art, artificial intelligence (AI) based technology in intelligent manufacturing applications, which involve using 3D computer vision and speech recognition (voice commands) for control of typical intelligent manufacturing systems such as robotic workstations. As a first attempt, we consider a relatively simple scenario in manufacturing, i.e. object grasping, with a 3-axis Cartesian robot arm. On this workstation, we have developed a 3D vision system using the Intel RealSense D455 depth camera for object detection and capture, which is processed by the SSD algorithm. We have also developed a speech (audio or voice) recognition system with our proposed algorithm, combining both LSTM and GMM-HMM audio recognition models. Good performance is achieved in terms of both efficiency and accuracy in controlling normal operation, as well as emergency stop functionality for our robotic workstation. Possible future improvements to our system and relevant methodology are discussed.

U2 - 10.1109/ICAC57885.2023.10275158

DO - 10.1109/ICAC57885.2023.10275158

M3 - Conference Proceeding

BT - Proceedings of the 28th IEEE International Conference on Automation and Computing (ICAC2023)

ER -
