Yin Cao

Phone: +86 (0)512 88970370
Email: Yin.Cao@xjtlu.edu.cn

Personal profile

I am an Associate Professor in the Department of Intelligent Science at Xi’an Jiaotong-Liverpool University. I hold a Ph.D. in Acoustics from the Chinese Academy of Sciences and have previously worked at Brigham Young University, the University of Surrey, and Qualcomm. My research focuses on machine learning and signal processing for audio, speech, and acoustics. Since 2020, I have served as an Associate Editor for the Noise Control Engineering Journal and regularly review for top journals in the field.

Research interests

My research interests include machine learning for audio and speech, spatial audio modeling, intelligent sound event detection and localization, active noise control, and data-driven acoustic signal processing. I am particularly interested in developing deep learning approaches, Audio and Speech Large Language Models (LLMs), and generative models to enhance sound environment analysis and control.

Experience

I completed my Ph.D. in Acoustics at the Chinese Academy of Sciences in 2013. Since then, I have held research and engineering positions at Brigham Young University, the Institute of Acoustics (CAS), the University of Surrey, and Qualcomm. Currently, I work in the Department of Intelligent Science at Xi’an Jiaotong-Liverpool University and as a Visiting Scholar at the University of Surrey.

Teaching

INT402, Data Mining and Big Data Analytics

INT403, Spoken Language Processing

Awards and honours

I have received several awards for my research contributions in sound event localization and detection, including:

First place in the L3DAS22 Challenge Task 2 (“3D Sound Event Localization and Detection”)
Third place in the DCASE 2023 Challenge Task 3B (“Sound Event Localization and Detection Evaluated in Real Spatial Sound Scenes”)
Second place in the DCASE 2022 Challenge Task 3 (“Sound Event Localization and Detection Evaluated in Real Spatial Sound Scenes”)
First place in the DCASE 2020 Challenge Task 5 (“Urban Sound Tagging with Spatiotemporal Context”)
Second place in the DCASE 2019 Challenge Task 3 (“Sound Event Localization and Detection”)

I have also received the “Judges’ Award” and “Reproducible System Award” at the DCASE 2019 and 2020 workshops, recognizing the impact and reproducibility of my research.

In addition, during my doctoral research, I was awarded the National Scholarship and the Zhu-Li-Yue-Hua Outstanding Doctoral Scholarship for academic excellence.

Education/Academic qualification

PhD, Ph.D. in Signal Processing and Acoustics, University of Chinese Academy of Sciences (UCAS)

2008 → 2013

Award Date: 1 Jun 2013

Bachelor, BSc in Electrical and Electronics Engineering, Nanjing University

2004 → 2008

Award Date: 1 Jun 2008

External positions

Visiting Scholar, University of Surrey

Jun 2025 → …

Associate Editor for Noise Control Engineering Journal

Jan 2020 → …

Research areas

Machine Learning
Audio Signal Processing
Speech Processing
Spatial Audio Modeling
Sound Event Detection and Localization (SELD)
Generative Audio Models
Audio and Speech Large Language Models (LLMs)

Keywords

QC Physics
Acoustics
QA75 Electronic computers. Computer science
Machine learning

Person Types

Staff


Projects

1 Finished
3 Active

Development of Sound Event Recognition Algorithms
Cao, Y.
2/06/25 → 30/06/28
Project: Collaborative Research Project
Implementation of Sound Source Detection and Localization Algorithm
Cao, Y. & Liang, S.
15/12/24 → 14/12/25
Project: Collaborative Research Project
Methods Study on Multi-Task Learning for 3D Computational Environmental Audio Analysis
Cao, Y.
1/01/23 → 31/12/25
Project: Internal Research Project
Development of sound detection simulation software
Cao, Y.
1/06/24 → 30/09/24
Project: Collaborative Research Project

Research output

19 Conference Proceeding
18 Article
5 Paper
1 Chapter
1 Conference article

Face2VoiceSync: Lightweight Face-Voice Consistency for Text-Driven Talking Face Generation
Kang, F. & Cao, Y., May 2025.
Research output: Contribution to conference › Paper › peer-review
WavJourney: Compositional Audio Creation with Large Language Models
Liu, X. & Cao, Y., May 2025, In: IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP).
Research output: Contribution to journal › Article › peer-review
EDTC: enhance depth of text comprehension in automated audio captioning
Tan, L. & Cao, Y., Feb 2024.
Research output: Contribution to conference › Paper
POWER CUE ENHANCED NETWORK AND AUDIO-VISUAL FUSION FOR SOUND EVENT LOCALIZATION AND DETECTION OF DCASE2024 CHALLENGE
Guan, X. & Cao, Y., 2024, POWER CUE ENHANCED NETWORK AND AUDIO-VISUAL FUSION FOR SOUND EVENT LOCALIZATION AND DETECTION OF DCASE2024 CHALLENGE.
Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review
Selective-Memory Meta-Learning with Environment Representations for Sound Event Localization and Detection
Hu, J. & Cao, Y., Aug 2024, In: IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP).
Research output: Contribution to journal › Article › peer-review

Activities

2 Publication Peer-review
2 Completed SURF Project
2 Master Dissertation Supervision
1 Editorial work

Language-queried audio source separation
Yin Cao (Supervisor)
2024
Activity: Supervision › Completed SURF Project
Deep source separation for speech and music
Yin Cao (Supervisor)
2023
Activity: Supervision › Completed SURF Project
Multimodal Sound Event Localization and Detection
Yin Cao (Supervisor)
2023 → 2024
Activity: Supervision › Master Dissertation Supervision
Audio Deepfake Detection
Yin Cao (Supervisor)
2022 → 2023
Activity: Supervision › Master Dissertation Supervision
Noise Control Engineering Journal (Journal)
Yin Cao (Reviewer)
2021 → …
Activity: Peer-review and editorial work of publications › Editorial work
