This project implements the research paper:
"Deepfake Audio Detection via MFCC Features Using Machine Learning"
We detect audio deepfakes using classical ML models and a hybrid deep learning model. The project covers:
- Data Cleaning & Preprocessing
- MFCC and Spectral Feature Extraction
- Dimensionality Reduction via PCA
- ML Models: SVM, RF, MLP, Gradient Boosting
- Deep Learning: VGG16 + LSTM Fusion
- Hyperparameter tuning with RandomizedSearchCV
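
The PCA and hyperparameter-tuning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the feature matrix is synthetic (a stand-in for extracted MFCC/spectral features), and the 95% variance threshold, the search ranges, and the choice of an RBF SVM are all assumptions made for the example.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a (clips x features) matrix of audio features.
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale, reduce dimensionality, then classify. Putting PCA inside the
# pipeline means it is re-fitted on each cross-validation training fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=0.95)),  # assumed: keep 95% of the variance
    ("clf", SVC(kernel="rbf")),
])

# Assumed search ranges; sampled on a log scale.
param_distributions = {
    "clf__C": loguniform(1e-2, 1e2),
    "clf__gamma": loguniform(1e-4, 1e0),
}

search = RandomizedSearchCV(
    pipeline,
    param_distributions,
    n_iter=10,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
n_components_kept = search.best_estimator_.named_steps["pca"].n_components_
print(search.best_params_, n_components_kept, round(test_accuracy, 3))
```

The same pipeline shape would apply to the other classical models (RF, MLP, Gradient Boosting) by swapping the `clf` step and its parameter ranges.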
The dataset is Fake-or-Real (FoR), which includes 4 subsets: for-original, for-2sec, for-norm, and for-rerec.
- PCA helped reduce training time without losing much accuracy.
- VGG16+LSTM gave the best performance for robust detection.
- Evaluation used accuracy, the confusion matrix, and ROC-AUC.
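
The three evaluation metrics named above can be computed with Scikit-learn along these lines. The labels and scores are made-up toy values, and both the convention that 1 means "fake" and the 0.5 decision threshold are assumptions for the example.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

# Toy ground truth and predicted probabilities (assumed: 1 = fake, 0 = real).
y_true = np.array([0, 0, 1, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.2])

# Hard predictions at an assumed 0.5 threshold.
y_pred = (y_score >= 0.5).astype(int)

accuracy = accuracy_score(y_true, y_pred)
# Rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_true, y_pred)
# ROC-AUC uses the continuous scores, not the thresholded predictions.
auc = roc_auc_score(y_true, y_score)

print(accuracy)  # 5 of 6 correct
print(matrix)    # [[3 0], [1 2]]
print(auc)       # 8 of 9 positive/negative pairs ranked correctly
```

Accuracy and the confusion matrix depend on the chosen threshold, while ROC-AUC summarizes ranking quality across all thresholds, which is why the three are usually reported together.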
The full code is in this notebook: deepfake_audio_detection.ipynb (also on Kaggle).
- Python, NumPy, Pandas
- Librosa, Matplotlib
- Scikit-learn
- TensorFlow / Keras
git clone https://github.com/<your-username>/deepfake-audio-detection.git
cd deepfake-audio-detection
jupyter notebook deepfake_audio_detection.ipynb
