APSIPA Transactions on Signal and Information Processing > Vol 14 > Issue 2

ASVSpoof 2021: Detecting Spoofed Utterances Through Hybrid Features

Ramesh K. Bhukya, Department of Electronics and Communication Engineering, Indian Institute of Information Technology, India, rkbhukya@iiita.ac.in; Aditya Raj, Department of Information Technology-Business Informatics, Indian Institute of Information Technology, India; Anshul Kumar, Department of Electronics and Communication Engineering, Birla Institute of Technology Mesra, India
 
Suggested Citation
Ramesh K. Bhukya, Aditya Raj and Anshul Kumar (2025), "ASVSpoof 2021: Detecting Spoofed Utterances Through Hybrid Features", APSIPA Transactions on Signal and Information Processing: Vol. 14: No. 2, e19. http://dx.doi.org/10.1561/116.20250026

Publication Date: 22 Jul 2025
© 2025 R. K. Bhukya et al.
 

Open Access

This is published under the terms of CC BY-NC.


In this article:
Introduction 
Literature Review 
Database Description 
Feature Extraction 
Methodology 
Experimental Results and Analysis 
Conclusion and Future Directions 
References 

Abstract

ASVspoof is a series of challenges intended to advance research into spoofing threats to automatic speaker verification (ASV) systems. Spoofing is the practice of impersonating another speaker; a common way to deceive an ASV system is to present a falsified speech signal that mimics the characteristics of genuine speech. ASVspoof 2021 comprises three evaluation tasks, Logical Access (LA), Physical Access (PA), and DeepFake (DF), which are used to assess the effectiveness of spoofing countermeasures developed for ASV systems. In this study, we evaluate spoofing detection on the ASVspoof 2021 datasets using seven Machine Learning (ML) models, namely k-Nearest Neighbour (k-NN), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting (GB), AdaBoost, XGBoost, and Multi-Layer Perceptron (MLP), and four Deep Learning (DL) models, namely DNN-single, DNN-CNN, DNN-convLSTM, and DNN-BiLSTM. DL transforms manually crafted feature vectors (FVs) into larger, dense FVs via matrix multiplication. A DL model's architecture can be adapted to the particular application, offering flexibility in the number of layers, the hidden layer dimensions, the transformation (activation) functions, and the loss function. We designed specialized DL architectures suited to the ASVspoof dataset, with attention to both computational and time efficiency. The ML models achieve accuracies of 90% for k-NN, 96% for SVM, 95% for RF, 95% for GB, 92% for AdaBoost, 96% for XGBoost, and 95% for MLP. The DL models (DNN-single, DNN-CNN, DNN-convLSTM, and DNN-BiLSTM) each exceed 99% accuracy. These results indicate that the DL models achieve higher accuracy than the ML models on the ASVspoof 2021 data.
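
The abstract's description of DL as turning hand-crafted FVs into dense FVs through matrix multiplication can be illustrated with a minimal sketch of a fully connected network's forward pass. This is not the authors' architecture: the layer sizes (20, 64, 32, 1), the random weights, the ReLU and sigmoid activations, and all function names below are assumptions chosen only for illustration, and the input batch is random placeholder data rather than ASVspoof features.

```python
import math
import random

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(v):
    return max(0.0, v)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def dense(batch, weights, bias, activation):
    """One fully connected layer: activation(batch @ weights + bias)."""
    return [[activation(v + b) for v, b in zip(row, bias)]
            for row in matmul(batch, weights)]

def init_layer(n_in, n_out, rng):
    """Random weights with He-style scaling and zero bias (illustrative only)."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

def forward(batch, layers):
    """ReLU hidden layers followed by a sigmoid output score per input row."""
    h = batch
    for weights, bias in layers[:-1]:
        h = dense(h, weights, bias, relu)
    weights, bias = layers[-1]
    return dense(h, weights, bias, sigmoid)

# Hypothetical sizes: 20-dim input FVs, two hidden layers, one output score.
rng = random.Random(0)
sizes = [20, 64, 32, 1]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

# Placeholder batch of 4 random feature vectors (not real speech features).
batch = [[rng.gauss(0.0, 1.0) for _ in range(sizes[0])] for _ in range(4)]
scores = forward(batch, layers)
print(len(scores), len(scores[0]))
```

Each hidden layer is one matrix multiplication plus a bias and a nonlinearity, so changing the entries of `sizes` or swapping the activation functions changes the architecture without touching the rest of the code, which is the kind of flexibility the abstract refers to. The weights here are untrained, so the scores carry no meaning.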

DOI:10.1561/116.20250026