Skip to content

Commit ee464da

Browse files
committed
Updated README and added aptfile with demo screenshots
1 parent 0af2112 commit ee464da

File tree

4 files changed

+25
-10
lines changed

4 files changed

+25
-10
lines changed

Aptfile

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
libsndfile1
2+
libsndfile1-dev

README.md

Lines changed: 23 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -2,18 +2,31 @@
22

33
This project promulgates a `pipeline` that `trains` end-to-end keyword spotting models using input audio files, `tracks` experiments by logging the model artifacts, parameters and metrics, `build` them as a web application followed by `dockerizing` them into a container and deploys the application containing trained model artifacts as a docker container into the cloud server with `CI/CD` integration.
44

5-
## Keyword Spotter in Heroku - Demo
6-
![]()
7-
8-
_**Disclaimer:** This app is just a demo and not for realtime usage. The main objective is to get ML models into production in terms of deployment and CI/CD, from MLOps paradigm._
95
## Author
106

117
- [@Jithin Sasikumar](https://www.github.com/Jithsaavvy)
128

139
## Languages and Tools
1410

1511
<div align=""><a href="https://www.python.org" target="_blank" rel="noreferrer"><img src="https://raw.githubusercontent.com/devicons/devicon/master/icons/python/python-original.svg" alt="python" width="40" height="40"/></a><a href="https://www.tensorflow.org" target="_blank" rel="noreferrer"><img src="https://www.vectorlogo.zone/logos/tensorflow/tensorflow-icon.svg" alt="tensorflow" width="40" height="40"/></a><a href="https://www.docker.com/" target="_blank" rel="noreferrer"><img src="https://raw.githubusercontent.com/devicons/devicon/master/icons/docker/docker-original-wordmark.svg" alt="docker" width="40" height="40"/></a><a href="https://flask.palletsprojects.com/en/2.2.x/" target="_blank" rel="noreferrer"> <img src="https://banner2.cleanpng.com/20180704/sv/kisspng-flask-python-web-framework-bottle-microframework-django-5b3d0ba62504c0.3512153115307273341516.jpg" alt="flask" width="95" height="43"/></a><a href="https://github.com/features/actions" target="_blank" rel="noreferrer"> <img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2mFgk66y--/c_limit,f_auto,fl_progressive,q_80,w_375/https://dev-to-uploads.s3.amazonaws.com/uploads/badge/badge_image/78/github-actions-runner-up-badge.png" alt="actions" width="52" height="49"/></a><a href="https://numpy.org/" target="_blank" rel="noreferrer"> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/NumPy_logo_2020.svg/768px-NumPy_logo_2020.svg.png?20200723114325" alt="numpy" width="95" height="43"/></a> <a href="https://hydra.cc/docs/intro/" target="_blank" rel="noreferrer"> <img src="https://raw.githubusercontent.com/facebookresearch/hydra/master/website/static/img/Hydra-Readme-logo2.svg" alt="hydra" width="98" height="37"/></a> <a href="https://www.mlflow.org/docs/latest/python_api/mlflow.html" target="_blank" rel="noreferrer"> <img src="https://www.mlflow.org/docs/latest/_static/MLflow-logo-final-black.png" alt="mlflow" width="98" height="44"/></a> <a href="https://www.heroku.com/platform" target="_blank" rel="noreferrer"> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Heroku_logo.svg/2560px-Heroku_logo.svg.png" alt="heroku" width="107" height="43"/></a> </div>
12+
13+
<br>
14+
15+
## Keyword Spotter in Heroku - Demo
16+
17+
_**Link**: Will be updated. Please check the `Disclaimer` below the screenshot for more !!!_
18+
19+
| ![input](./images/heroku_in.PNG) |
20+
|:--:|
21+
| <b>Figure 1a: App demo - Audio input to app for predicting keyword from trained model artifact</b>|
22+
23+
| ![input](./images/heroku_out.PNG) |
24+
|:--:|
25+
| <b>Figure 1b: App demo - Predicted keyword with probability</b>|
1626

27+
_**Disclaimer:**_ <br>
28+
_1. This app is just a demo and not for realtime usage. The main objective is to get ML models into production in terms of deployment and CI/CD, from MLOps paradigm_. <br>
29+
_2. Additionally, due to some technical issues in the Heroku backend, the app currently crashes, so the Heroku app link is not provided as of now. It will be updated once the issues are solved and when the app is up and running_.
1730

1831
## Motivation
1932

@@ -64,7 +77,7 @@ Every application or project comprises of multiple configuration settings. The m
6477

6578
Firstly, the audio has to be embedded into the vector space which constitutes the features to learn. To facilitate that, [Mel-Frequency Cepstral Coefficients](https://en.wikipedia.org/wiki/Mel-frequency_cepstrum) (MFCC) is the most common widely used, feature extraction technique for audio data. MFCCs are derived using `Fourier transform` and `log-Mel spectrogram`. More detailed mathematical explanation can be found [here](https://en.wikipedia.org/wiki/Mel-frequency_cepstrum). In order to extract these features, `librosa` is used. [data.py](./src/data.py) contains the code for preprocessing audio and extracting features from them. It reads the audio file, compute MFCC and pad them for fixed-sized vector for all audio files as CNN cannot handle sequential data. In order to avoid any hassles in loading and processing plethora of audio files, it's a good practice to dump them to `.npy` arrays, which makes it easier for further usage.
6679

67-
_**Note:** Due to large file size, the training data (.npy) files are uploaded to shared folder. Download it from []() and make sure that, the downloaded files are placed in [this directory](./dataset/train/). [Test directory](./dataset/test/) contains some sample audio files for local inferencing._
80+
_**Note:** Due to large file size, the training data (.npy) files are uploaded to shared folder. Download it from [here](https://www.dropbox.com/sh/4wjo8e8h4cg4xlo/AAAC3yR_kj5oq-ZcJopBosYYa?dl=0) and make sure that, the downloaded files are placed in [this directory](./dataset/train/). [Test directory](./dataset/test/) contains some sample audio files for local inferencing._
6881

6982
### CNN-LSTM Model
7083

@@ -117,7 +130,7 @@ The aforementioned, same functionality is also implemented in the code as well.
117130

118131
Now, the model artifacts are ready and are built into an web API, it's the time for deployment to host this application. To facilitate them a step further, `docker` would be a great tool. [Docker](https://www.docker.com/) eases the developers to package applications or software which can be easily reproduced on another machine. It uses containers to pack any applications with its dependencies to deploy in another environment. Generally, it is not a mandatory tool or step for deployment as it can also be done without dockers but they serve many purpose like portability, scalability, version control, no dependency hassles etc. Thus, docker is a great tool in the deployment cycle.
119132

120-
The main idea of using docker in this project is, to package and build a `docker image` from the FLASK application with necessary files and containerize them into a `docker container` which can be deployed in any server (in this case - Heroku cloud server). [Dockerfile](./Dockerfile) contains all the commands needed to build an image. It serves as a bridge in the `CI/CD` pipeline.
133+
The main idea of using docker in this project is, to package and build a `docker image` from the FLASK application with necessary files and containerize them into a `docker container` which can be deployed in any server (in this case - Heroku cloud server). [Dockerfile](./Dockerfile) contains all the commands needed to build an image. The command to install external packages for any `Debian or Ubuntu` based systems are also added. The `docker` serves as a bridge in the `CI/CD` pipeline between the web app and cloud server.
121134

122135
### GitHub Actions
123136

@@ -129,7 +142,7 @@ The main idea of using docker in this project is, to package and build a `docker
129142

130143
[Heroku](https://www.heroku.com/) is a container-based cloud Platform as a Service (PaaS) to deploy, manage, and scale modern apps. It accounts for the CD pipeline. As a result of CI, when docker container is build, CD deploys it into `Heroku` cloud which hosts the application and can be accessed via `URL`. _In layman terms, the application is on the internet, up and running and can be accessed with website or URL_. The command for Heroku is included in the [Dockerfile](./Dockerfile) itself.
131144

132-
As a result, the application can be accessed @ [here]() and snapshot of application UI is depicted in the [Demo](#KeywordSpotterinHeroku-Demo) section.
145+
As a result, the application will be deployed and the snapshot of application UI is depicted in the [Demo](#KeywordSpotterinHeroku-Demo) section.
133146

134147
## Run locally
135148

@@ -151,9 +164,9 @@ Install dependencies
151164
pip install -r requirements.txt
152165
```
153166

154-
Download `.npy` dataset from [here](). Make sure to put them in [./dataset/train/](./dataset/train/) directory. If not, it is fine to use a different directory but, make sure to specify the valid directory name or path in the [config.yaml](./config_dir/config.yaml) file.
167+
Download `.npy` dataset from [here](https://www.dropbox.com/sh/4wjo8e8h4cg4xlo/AAAC3yR_kj5oq-ZcJopBosYYa?dl=0). Make sure to put them in [./dataset/train/](./dataset/train/) directory. If not, it is fine to use a different directory but, make sure to specify the valid directory name or path in the [config.yaml](./config_dir/config.yaml) file.
155168

156-
Train model
169+
Train the model
157170

158171
```bash
159172
python3 main.py
@@ -171,7 +184,7 @@ Use audio files from this [test directory](./dataset/test/) for local inferencin
171184

172185
_**Note:** Assign necessary parameter variables and path in the [config.yaml](./config_dir/config.yaml). If it throws any error, please ensure that valid `PATH_NAMES` and `parameter` values are used._
173186

174-
Additionally, to run via docker container , build image from [Dockerfile](./Dockerfile) and run the container using `docker build` and `docker run` commands. As this is not a docker tutorial, it is not necessary to go more in-depth into dockers.
187+
Additionally, to run locally via docker container , build image from [Dockerfile](./Dockerfile) and run the container using `docker build` and `docker run` commands. As this is not a docker tutorial, it is not necessary to go more in-depth into dockers.
175188

176189
## What's next?
177190

images/heroku_in.PNG

944 KB
Loading

images/heroku_out.PNG

1.04 MB
Loading

0 commit comments

Comments
 (0)