TY - GEN
T1 - A Robust Deep Audio Splicing Detection Method via Singularity Detection Feature
AU - Liang, Shan
AU - Zhang, Kanghao
AU - Nie, Shuai
AU - He, Shulin
AU - Pan, Jiahui
AU - Zhang, Xueliang
AU - Ma, Haoxin
AU - Yi, Jiangyan
N1 - Publisher Copyright: © 2022 IEEE
PY - 2022
Y1 - 2022
N2 - Many methods exist for detecting forged audio produced by voice conversion and speech synthesis. Splicing, although a simpler form of forgery, has attracted far less attention. Because a tampering operation causes singularities in the high-frequency components of a signal, we propose a high-frequency singularity detection feature obtained by wavelet transform. The proposed feature explicitly shows the location of the tampering operation on the waveform. Moreover, long short-term memory (LSTM) is introduced into the CNN-architecture LCNN so that sequence information can be fully learned. The proposed feature is fed to the improved RNN-architecture LCNN together with the widely used linear frequency cepstral coefficients (LFCC), which serve as a supplement, to learn forgery characteristics. Systematic evaluation and comparison show that the proposed method greatly improves accuracy and generalization.
KW - forged audio
KW - high frequency
KW - singularity detection feature
KW - tampering
UR - http://www.scopus.com/inward/record.url?scp=85131258816&partnerID=8YFLogxK
U2 - 10.1109/ICASSP43922.2022.9746596
DO - 10.1109/ICASSP43922.2022.9746596
M3 - Conference Proceeding
AN - SCOPUS:85131258816
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 2919
EP - 2923
BT - 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Y2 - 23 May 2022 through 27 May 2022
ER -