You can choose between features `mfcc`, `gfcc`, `spectral`, `chroma` or a comma separated combination of those, example `gfcc,mfcc,spectral,chroma`, to extract from your audio files.

Classifier options :
You can choose between `svm`, `svm_rbf`, `randomforest`, `logisticregression`, `knn`, `gradientboosting` and `extratrees`.

Hyperparameter tuning using grid search is included in the code for each classifier.
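
To make the idea of grid search concrete, here is a minimal, standard-library-only sketch. It is not pyAudioProcessing's implementation: the toy k-nearest-neighbour classifier, the helper names `knn_predict`, `leave_one_out_accuracy` and `grid_search`, and the 2-D sample points are all assumptions made up for illustration. Each candidate hyperparameter value is scored by leave-one-out cross-validation and the best-scoring value is kept.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, k):
    """Fraction of samples classified correctly when each is held out in turn."""
    hits = 0
    for i, (point, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += knn_predict(rest, point, k) == label
    return hits / len(samples)


def grid_search(samples, candidate_ks):
    """Score every candidate k and return (best_k, scores_by_k)."""
    scores = {k: leave_one_out_accuracy(samples, k) for k in candidate_ks}
    best_k = max(scores, key=scores.get)  # first best value wins on ties
    return best_k, scores


# Hypothetical 2-D feature vectors standing in for extracted audio features.
samples = [
    ((0.0, 0.1), "music"), ((0.2, 0.0), "music"), ((0.1, 0.3), "music"),
    ((1.0, 1.1), "speech"), ((0.9, 1.2), "speech"), ((1.2, 0.8), "speech"),
]
best_k, scores = grid_search(samples, candidate_ks=[1, 3, 5])
print(best_k, scores)  # 1 {1: 1.0, 3: 1.0, 5: 0.0}
```

With this toy data `k=5` scores 0.0: once a sample is held out, only two points of its own class remain against three of the other class, so the majority vote is always wrong. That kind of failure is what a grid search is meant to screen out.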

### Training and Testing Data structuring

Suppose you have training data for two classes, music and speech, and want to use pyAudioProcessing to train a model with the available feature options. Create one directory per class and place all of that class's training .wav files in it. Example:

```bash
.
├── training_data
    ├── music
    │   ├── music_sample1.wav
    │   ├── music_sample2.wav
    │   ├── music_sample3.wav
    │   ├── music_sample4.wav
    ├── speech
    │   ├── speech_sample1.wav
    │   ├── speech_sample2.wav
    │   ├── speech_sample3.wav
    │   ├── speech_sample4.wav
```

For any test data with known labels that you want to pass through the classifier, use the same structure:

```bash
.
├── testing_data
    ├── music
    │   ├── music_sample5.wav
    │   ├── music_sample6.wav
    ├── speech
    │   ├── speech_sample5.wav
    │   ├── speech_sample6.wav
```
If you want to classify audio samples without any known labels, structure the data similarly:

```bash
.
├── data
    ├── unknown
    │   ├── sample1.wav
    │   ├── sample2.wav
```
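
Before training, it can help to confirm that a data directory really follows the one-directory-per-class layout shown above. The sketch below uses only the Python standard library; `list_labeled_wavs` is a hypothetical helper written for this illustration and is not part of pyAudioProcessing. The demo builds a throwaway directory tree so that it runs anywhere.

```python
import tempfile
from pathlib import Path


def list_labeled_wavs(root):
    """Map each class directory name under `root` to its sorted .wav paths."""
    root = Path(root)
    labeled = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        wavs = sorted(class_dir.glob("*.wav"))
        if wavs:  # skip class directories that contain no .wav files
            labeled[class_dir.name] = wavs
    return labeled


# Demo on a temporary tree that mirrors the training_data layout.
layout = {
    "music": ["music_sample1.wav", "music_sample2.wav"],
    "speech": ["speech_sample1.wav"],
}
with tempfile.TemporaryDirectory() as tmp:
    for label, names in layout.items():
        class_dir = Path(tmp, "training_data", label)
        class_dir.mkdir(parents=True)
        for name in names:
            (class_dir / name).touch()  # empty placeholder files, not real audio
    found = list_labeled_wavs(Path(tmp, "training_data"))
    summary = {label: [p.name for p in paths] for label, paths in found.items()}

print(summary)
```

For unlabeled data, the same helper would return a single key such as `unknown`.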

### Examples

-Command line example of using `gfcc,spectral,chroma` feature and `svm` classifier.
Code example using `gfcc,spectral,chroma` features and the `svm` classifier. Sample data can be found [here](https://github.com/jsingh811/pyAudioProcessing/tree/master/data_samples).
```
from pyAudioProcessing.run_classification import train_and_classify
```
The above logs the filename where the classification results are saved, along with details about the testing files and the classifier used.


If you cloned the project via git, the following command line example of training and classification with `gfcc,spectral,chroma` features and the `svm` classifier can be used as well. Sample data can be found [here](https://github.com/jsingh811/pyAudioProcessing/tree/master/data_samples).

This feature lets the user extract data features calculated on audio files.
This feature lets the user extract aggregated data features calculated per audio file.

### Choices

Feature options :
-You can choose between features `mfcc`, `gfcc`, `spectral`, `chroma` or a comma separated combination of those, example `gfcc,mfcc,spectral,chroma`, to extract from your audio files.
-To use your own audio files for feature extraction and pass in the directory containing .wav files as the `-d` argument. Please refer to the format of directory `data_samples/testing`.
You can choose between features `mfcc`, `gfcc`, `spectral`, `chroma` or any combination of those to extract from your audio files.

### Examples

-Command line example of for `gfcc` and `mfcc` feature extractions.
Features extracted get saved in `audio_features.json`.
A code example performing `gfcc` and `mfcc` feature extraction can be found below. To extract features from your own audio data, pass its path to `get_features` in place of `data_samples/testing`. Please refer to the format of directory `data_samples/testing`.

-Code example of performing `gfcc` and `mfcc` feature extraction.
```
from pyAudioProcessing.extract_features import get_features

# Feature extraction
features = get_features("data_samples/testing", ["gfcc", "mfcc"])
# features is a dictionary that will hold data of the following format
```
If you cloned the project via git, the following command line example for `gfcc` and `mfcc` feature extraction can be used as well. The features argument should be a comma separated string, for example `gfcc,mfcc`.
To use your own audio files for feature extraction, pass in the directory containing the .wav files as the `-d` argument. Please refer to the format of directory `data_samples/testing`.
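
The command line takes features as one comma separated string, while the Python example passes a list such as `["gfcc", "mfcc"]`. The sketch below shows one way to convert between the two and reject unknown names. `parse_feature_string` and `ALLOWED_FEATURES` are hypothetical names introduced for this illustration, not part of pyAudioProcessing; the allowed values are the four feature options listed above.

```python
# Feature names taken from the "Feature options" list in this README.
ALLOWED_FEATURES = ("mfcc", "gfcc", "spectral", "chroma")


def parse_feature_string(value):
    """Split a comma separated feature string and reject unknown names."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in ALLOWED_FEATURES]
    if unknown:
        raise ValueError(f"unsupported feature(s): {', '.join(unknown)}")
    return names


print(parse_feature_string("gfcc,mfcc"))  # ['gfcc', 'mfcc']
```

The resulting list has the same shape as the second argument to `get_features` in the code example above.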