This project demonstrates different data-driven approaches for the detection of coughs in patients with respiratory illnesses. The dataset used is available here. It consists of 135 cough files and 52 non-cough files from Google's AudioSet, 40 cough files and 1,960 non-cough files from the ESC-50 dataset, and 256 cough files and 10,801 non-cough files from the FSDKaggle2018 dataset. The `utils/download_audioset.py` script can be used to download the files from AudioSet. The following approaches were investigated:
- MobileNetV2: This implementation uses MobileNetV2. In `classifiers/mobilenetv2/main.py`, the `AudioDataset` class from `classifiers/mobilenetv2/AudioDataset.py` loads the data, the `train` function from `classifiers/mobilenetv2/train.py` trains the network, the `validate` function from `classifiers/mobilenetv2/validate.py` validates it, and the `test` function from `classifiers/mobilenetv2/test.py` tests it.
- SVM: The `Net` class in `feature_extractor/net.py` creates the network architecture detailed in this paper and loads the parameters from `feature_extractor/mx-h64-1024_0d3-1.17.pkl`. `feature_extractor/extract_features.py` then uses this network to extract a 1024-dimensional feature vector from each file in the dataset. The vectors are saved in `features/X.npy` and their associated labels in `features/labels.pkl`. This implementation is loosely based on this one. Finally, `classifiers/svm/main.py` uses the feature vectors and labels to train a support vector machine with a radial basis function kernel to classify coughs and non-coughs.
- Gaussian Naïve Bayes, Independent Components, and AdaBoost: The same feature extractor in the `feature_extractor` directory computes the 1024-dimensional feature vectors, which are stored in `features/X.npy` with their associated labels in `features/labels.pkl`. Next, in `classifiers/gnb/main.py`, 512 independent components are extracted from each of the 1024-dimensional feature vectors. Five Gaussian Naïve Bayes classifiers are then combined and trained using the AdaBoost algorithm to classify coughs and non-coughs.
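
The SVM approach can be illustrated with a minimal sketch. It is not the code in `classifiers/svm/main.py`. It assumes scikit-learn, assumes that `features/labels.pkl` holds a pickled sequence with one label per row of `features/X.npy`, and uses hypothetical helper names (`load_features`, `make_rbf_svm`). Feature scaling and `class_weight="balanced"` are also assumptions, chosen because the file counts above add up to 431 cough files against 12,813 non-cough files. Synthetic data stands in for the real features so the example runs on its own.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def load_features(x_path, labels_path):
    """Load a feature matrix and its labels.

    Assumption: the labels file is a pickled sequence, one label per row of X.
    """
    X = np.load(x_path)
    with open(labels_path, "rb") as f:
        y = np.asarray(pickle.load(f))
    if len(X) != len(y):
        raise ValueError("X and labels must have the same number of rows")
    return X, y


def make_rbf_svm():
    """SVM with an RBF kernel.

    Scaling and balanced class weights are assumptions, not taken from the repo.
    """
    return make_pipeline(
        StandardScaler(),
        SVC(kernel="rbf", class_weight="balanced"),
    )


# Synthetic, imbalanced stand-in for features/X.npy and features/labels.pkl.
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.normal(0.0, 1.0, (90, 16)),  # majority class (non-cough, label 0)
    rng.normal(2.0, 1.0, (10, 16)),  # minority class (cough, label 1)
])
y_demo = [0] * 90 + [1] * 10

with tempfile.TemporaryDirectory() as tmp:
    x_path = Path(tmp) / "X.npy"
    labels_path = Path(tmp) / "labels.pkl"
    np.save(x_path, X_demo)
    with open(labels_path, "wb") as f:
        pickle.dump(y_demo, f)
    X, y = load_features(x_path, labels_path)

svm = make_rbf_svm().fit(X, y)
predictions = svm.predict(X)
```

With the real data, `load_features("features/X.npy", "features/labels.pkl")` would replace the temporary files, and predictions should be evaluated on a held-out split rather than on the training rows.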
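
The third approach, independent components followed by boosted Gaussian Naïve Bayes classifiers, might look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of `classifiers/gnb/main.py`: it assumes scikit-learn, uses `FastICA` as the independent component method (the description above does not name one), and the function name `make_ica_gnb_adaboost` is hypothetical. The defaults mirror the 512 components and five classifiers described above. The demo uses small synthetic data with fewer components so that it runs quickly.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.ensemble import AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline


def make_ica_gnb_adaboost(n_components=512, n_estimators=5):
    """ICA projection followed by AdaBoost over Gaussian Naive Bayes learners.

    Assumption: FastICA stands in for whichever ICA method the project uses.
    """
    ica = FastICA(n_components=n_components, max_iter=1000, random_state=0)
    # The base learner is passed positionally because its keyword name
    # differs between scikit-learn versions.
    boosted = AdaBoostClassifier(
        GaussianNB(), n_estimators=n_estimators, random_state=0
    )
    return make_pipeline(ica, boosted)


# Synthetic stand-in for the 1024-dimensional feature vectors.
rng = np.random.default_rng(0)
y_demo = np.array([0] * 150 + [1] * 50)
X_demo = rng.laplace(size=(200, 32))
X_demo[y_demo == 1] += 1.5  # shift the minority class so it is separable

model = make_ica_gnb_adaboost(n_components=8, n_estimators=5)
model.fit(X_demo, y_demo)
predictions = model.predict(X_demo)
components = model.named_steps["fastica"].transform(X_demo)
```

For the real features, calling `make_ica_gnb_adaboost()` with its defaults would reduce each 1024-dimensional vector to 512 components before boosting. AdaBoost may stop before fitting all five learners if an early learner already fits the training data perfectly.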