The world’s leading publication for data science, AI, and ML professionals.

Music Genre Recognition Using Convolutional Neural Networks- Part 2

Learn how to build an app for music genre recognition using Streamlit and deploy it on Amazon EC2.

Photo by Austin Neill on Unsplash

Any roles involved in a project that do not directly contribute toward the goal of putting valuable software in the hands of users as quickly as possible should be carefully considered. – Stein Inge Morisbak

Hello again! In the first part of this article, we developed a Convolutional Neural Network for recognising the genre of a piece of music. In this part, we will build a web app around that model using the awesome Streamlit library.

Most people studying deep learning or machine learning master how to build and train a model, but never learn how to deploy one (myself included, before this project). Deploying a model is as important as training it: deployment is what lets other people actually use your work. Wouldn’t it be cooler to show your friends an app you built around your model, instead of running it in a Jupyter notebook?

I will use the Streamlit library to build the app and then deploy it on an Amazon EC2 instance.

Building the app

First, if you don’t have Streamlit installed, run the following command in a terminal to install it:

pip install streamlit

Streamlit is a very easy-to-use framework, designed especially for people who are not web developers. I’m going to build a basic web app here; you can improve it further by adding your own style and creativity.

First, create a Python script and write all the code for the app in it.

import streamlit as st
st.write("Music Genre Recognition App") 
st.write("This is a Web App to predict genre of music")
file = st.sidebar.file_uploader("Please Upload Mp3 Audio File Here or Use Demo Of App Below using Preloaded Music",type=["mp3"])

Now, to see how this actually looks on a webpage, you need to run the script. Type the following command in your terminal:

streamlit run app.py
Source: Image by Author

So, this is a very basic web app with a file uploader for the mp3 file. Next, we need to write code that takes the file from file_uploader, preprocesses it, and produces a prediction using our CNN model.

First, let’s import all the libraries we will need:

from PIL import Image
import numpy as np
import librosa
import librosa.display
from pydub import AudioSegment
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from matplotlib.colors import Normalize
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from tensorflow.keras.preprocessing.image import load_img, img_to_array

Now, we will create a function which will convert mp3 audio files into .wav files, because Librosa works only with .wav files.

def convert_mp3_to_wav(music_file):  
  sound = AudioSegment.from_mp3(music_file)  
  sound.export("music_file.wav",format="wav")

If you remember, we trained our model on audio clips of 3 seconds. Keeping that in mind, we need to write a function that extracts 3 seconds of audio from our music.

def extract(wav_file,t1,t2):  
  wav = AudioSegment.from_wav(wav_file)  
  wav = wav[1000*t1:1000*t2]  
  wav.export("extracted.wav",format='wav')

The above function segments out the audio between times t1 and t2 (in seconds).

Now, the most important part: creating a mel spectrogram of the extracted audio. This mel spectrogram will be fed to the model for prediction.

def create_melspectrogram(wav_file):  
  y,sr = librosa.load(wav_file,duration=3)  
  mels = librosa.feature.melspectrogram(y=y,sr=sr)
  # Render the spectrogram (in dB) and save it as an image for the CNN
  fig = plt.Figure()
  canvas = FigureCanvas(fig)
  plt.imshow(librosa.power_to_db(mels,ref=np.max))
  plt.savefig('melspectrogram.png')

Next, we create a function that predicts the genre of the music from the mel spectrogram generated above.

def predict(image_data,model):
  # Convert the spectrogram image to an array and add a batch dimension
  image = img_to_array(image_data)
  image = np.reshape(image,(1,288,432,4))
  # Scale pixel values to [0, 1] before prediction
  prediction = model.predict(image/255)
  prediction = prediction.reshape((9,))
  class_label = np.argmax(prediction)
  return class_label,prediction

class_labels = ['blues', 'classical', 'country', 'disco', 'hiphop', 'metal', 'pop', 'reggae', 'rock']

Use the model you trained on the GTZAN dataset for the prediction here. If you don’t know how to save and load a model in Keras, refer to How to Save and Load Your Keras Deep Learning Model (machinelearningmastery.com).
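As a rough sketch of what loading the saved model can look like (the filenames model.json and model_weights.h5 are hypothetical here; use whatever names you chose when saving your own model):

```python
from tensorflow.keras.models import model_from_json

def load_genre_model(json_path="model.json", weights_path="model_weights.h5"):
    # Rebuild the architecture from its JSON description...
    with open(json_path) as f:
        model = model_from_json(f.read())
    # ...then restore the trained weights into it.
    model.load_weights(weights_path)
    return model

# model = load_genre_model()  # call once when the app starts
```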

class_label is the index of the label with the highest probability, and prediction holds the probability distribution over all classes (i.e., genres).

class_labels is a list that maps numbers (i.e., list indexes) back to genres.
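To make the mapping concrete, here is a tiny sketch of how class_label and class_labels fit together; the probabilities below are made up for illustration, not real model output:

```python
import numpy as np

class_labels = ['blues', 'classical', 'country', 'disco', 'hiphop',
                'metal', 'pop', 'reggae', 'rock']

# A made-up probability distribution over the 9 genres (not real model output)
prediction = np.array([0.01, 0.02, 0.05, 0.02, 0.60, 0.10, 0.10, 0.05, 0.05])

class_label = np.argmax(prediction)   # index of the highest probability
print(class_labels[class_label])      # -> hiphop
```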

Now, we will merge all the functions we wrote to display the final output on our web app.

if file is None:  
  st.text("Please upload an mp3 file")
else:  
  convert_mp3_to_wav(file)  
  extract("music_file.wav",40,50)  
  create_melspectrogram("extracted.wav")  
  image_data = load_img('melspectrogram.png',color_mode='rgba',target_size=(288,432))
  class_label,prediction = predict(image_data,model)  
  st.write("## The Genre of Song is "+class_labels[class_label])

We are almost done now; we just need to see how this simple web app looks in a browser. Note that you can also add other features to your web app, such as a sidebar, radio buttons, and background images, using Streamlit; you can explore these on your own.

Source: Image by Author

This is how our app looks. It is very crude, but you can explore on your own how to add more features; it’s really easy.

Now that we have built our app, we will deploy it on an Amazon EC2 instance.

Deploying on Amazon EC2

  1. Create a free account on Amazon Web Services and fill in all the details, such as name, address, and postal code. You will also need to provide debit card information, but no fee will be charged.
  2. After creating the account, browse to the AWS Management Console.
Source: Image by Author

Select Launch a virtual machine with EC2, and then select Amazon Linux 2 AMI (HVM), SSD Volume Type.

  3. Select the instance type t2.micro, then keep clicking Next until you reach the Configure Security Group page. Here we need to add a custom TCP rule, because Streamlit serves on port 8501: select Add Rule, enter 8501 in Port Range, and select Anywhere as the Source.
  4. After this, you will be prompted to create a key pair. Choose to create a new key pair, give it a name, and download it. Store it safely, as you will need it to access your virtual machine.
  5. Now, open a terminal in the directory where you stored the key pair (the file with the .pem extension) and type the following:
chmod 400 name.pem

This restricts the permissions on the .pem file, which ssh requires before it will use the key.

  6. Now, SSH into the EC2 instance you created:
ssh -i your_key_pair_name.pem ec2-user@<your-public-dns-address>

You can find your public DNS address in the EC2 console. You have now successfully launched and connected to your Amazon EC2 instance.

  7. Install all the packages required by our app and model:
sudo python3 -m pip install librosa llvmlite==0.33.0 numba==0.49.1
sudo yum install libsndfile

This will install Librosa and libsndfile. Next, we need to install FFmpeg, whose installation is a little involved, so refer to How to install FFMPEG on EC2 running Amazon Linux? | by Vivek Maskara | Medium.

After installing FFmpeg, add it to your PATH:

export PATH=$PATH:<path where you installed ffmpeg>

Now, install TensorFlow:

sudo python3 -m pip install --no-cache-dir tensorflow==2.2.0 grpcio==1.24

We have installed all the required packages.

  8. Create a GitHub repository and add all the required files to it: the model weights, the model JSON, and the Python script for the app. Then clone the repository to your Amazon EC2 instance:
git clone https://github.com/username/repository_name.git
  9. Change directory into the repository you just cloned and run the Python script containing the code for the app:
streamlit run script_name.py --server.port 8501

Access your web app using the External URL. Voilà! You have just deployed an app on an Amazon EC2 instance.

  10. You will observe that as soon as you close the terminal, the web app becomes inaccessible. To keep it running even after the terminal is closed, we need to use tmux:
tmux new -s StreamSession

This creates a tmux session; inside it, enter the command from step 9. Then press Ctrl+b followed by d to detach from the tmux session. Now you can access the web app anytime, even after closing the terminal :).

You have just deployed your app, and now anyone can access it and make use of it. And of course, you can show it to your friends 😃.

Conclusion

This was the last part of Music Genre Recognition Using Convolutional Neural Networks: an end-to-end tutorial on how to develop a model, train it, build an app around it, and then deploy it. I hope it is helpful to all the deep learning enthusiasts out there.

Following are the screenshots of the Web App I developed.

Source: Image by Author
Image by Author

You can visit the web app at music-genre-recognition-app · Streamlit (ec2-3-134-82-239.us-east-2.compute.amazonaws.com)

Please let me know whether you find it helpful. I always appreciate honest feedback!

Find the GitHub repository linked to this article here

KunalVaidya99/Music-Genre-Classification: Music Genre Recognition App With Accuracy of 89%. (github.com)

References

  1. How to Deploy Streamlit Apps on AWS Ec2 – JCharisTech
  2. How to install FFMPEG on EC2 running Amazon Linux? | by Vivek Maskara | Medium
  3. API reference – Streamlit 0.76.0 documentation
