
Build a Movie Recommendation Engine backend API in 5 minutes (Part 2)

Create a simple backend Flask API for an impressive Data Science portfolio project

Photo by Lauren Mancke on Unsplash

Differentiating yourself as a Data Scientist is challenging because everyone has done Machine Learning courses and is familiar with concepts such as classification and regression. To truly stand out from the crowd you need to showcase a real-world product that can quickly be grasped by recruiters.

In this post, we’ll create a Flask API which recommends movies to the user based on other people’s ratings. We’ll use the open-source MovieLens dataset and implement the item-to-item collaborative filtering approach.

The goal of this series (Parts 1–4) is to provide you with a step-by-step guide on how to build a Movie Recommendation Engine which you can then put on your GitHub & Resume to improve your chances of landing your dream Data Science job.

In the previous tutorial, we went through what item-to-item collaborative filtering is. In future posts, we’ll go through:

  • How to deploy the Flask API (which we’ll build in this post) on AWS
  • How to create a frontend using Vue.js.

The end-product that you’ll build through this series can be tried out on:

and will look something like this

End-product: Movie Recommendation Engine

Let’s get started!


The first step is to download the data from:

https://grouplens.org/datasets/movielens/

I used the dataset that MovieLens recommends for "education & development".

User Ratings Data Source: MovieLens

After you’ve downloaded & unzipped the "ml-latest-small" folder, let’s load the relevant files into a Jupyter notebook:

import pandas as pd
df_movies = pd.read_csv('~/Downloads/ml-latest-small/movies.csv')
df_ratings = pd.read_csv('~/Downloads/ml-latest-small/ratings.csv')
df_merged = pd.merge(df_movies, df_ratings, on='movieId', how='inner')
df_merged.head()

Next, let’s create a dataframe with one row per user and one column per movie, where each cell holds that user’s rating of that movie. We drop movies that have fewer than 8 ratings and fill the remaining missing ratings with 0.

df = df_merged.pivot_table(index='userId', columns='title', values='rating')
# Keep only movies that had at least 8 ratings
df = df.dropna(thresh=8, axis=1)
df.fillna(0, inplace=True)
df.head()
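To see what the pivot and the threshold do, here is a minimal sketch on made-up ratings (the toy values and the threshold of 2 are assumptions for illustration; the tutorial itself uses the MovieLens data and a threshold of 8):

```python
import pandas as pd

# Toy long-format ratings (made-up values, not taken from MovieLens)
ratings = pd.DataFrame({
    "userId": [1, 1, 2, 2, 3, 3],
    "title":  ["A", "B", "A", "B", "A", "C"],
    "rating": [4.0, 3.0, 5.0, 2.0, 1.0, 5.0],
})

# One row per user, one column per movie; unrated movies become NaN
wide = ratings.pivot_table(index="userId", columns="title", values="rating")

# Keep only movies (columns) with at least 2 non-missing ratings
kept = wide.dropna(thresh=2, axis=1)

# Replace the remaining gaps with 0, as in the tutorial
kept = kept.fillna(0)
print(kept)
```

Movie "C" has a single rating, so it is dropped, while user 3’s missing rating for movie "B" becomes 0.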

Now we can use the Pearson correlation to calculate the similarity between movies. The Pearson method treats each movie (i.e. each column) as a vector of user ratings and measures how closely its ratings move together with those of every other movie. In the similarity matrix below, each movie is perfectly correlated with itself (+1). Scores close to +1 indicate movies that users tend to rate alike, scores close to -1 indicate opposite rating patterns, and scores near 0 indicate no linear relationship.

df_similarity = df.corr(method='pearson')
# Store the similarity matrix so the API can load it later
df_similarity.to_csv('movie_similarity.csv')
df_similarity.head()
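To make the similarity scores concrete, here is a small sketch on made-up ratings. The toy values and the `pearson` helper are assumptions for illustration and are not part of the original notebook. It computes the coefficient by hand and compares it with what `DataFrame.corr` returns:

```python
import math
import pandas as pd

def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Each column is one movie, each row one user (made-up ratings)
toy = pd.DataFrame({
    "Movie A": [5.0, 4.0, 1.0, 0.0],
    "Movie B": [4.0, 5.0, 0.0, 1.0],
    "Movie C": [1.0, 0.0, 5.0, 4.0],
})

sim = toy.corr(method="pearson")
manual = pearson(toy["Movie A"].tolist(), toy["Movie B"].tolist())

print(sim.round(3))
print("manual A vs B:", round(manual, 3))  # 15/17, about 0.882
```

Movies A and B are rated alike and score about 0.88, while B and C have exactly opposite patterns and score -1. One thing to keep in mind: `corr` skips missing values pair by pair, so filling gaps with 0 beforehand (as done above) means an unrated movie is treated as a rating of 0 in the calculation.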

Suppose you like the movie "Heat (1995)", as I do. To get recommendations, we access the corresponding movie column and sort the similarity scores from highest to lowest, which gives us the top 50 movie recommendations:

movieLiked = 'Heat (1995)'
similarityScores = df_similarity[movieLiked]
# Skip position 0 (the movie itself) and keep the next 50 titles
similarityScores.sort_values(ascending=False).iloc[1:51]

It seems that the movies "The Rock" and "Casino" are our top 2 recommendations.
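The lookup above can be wrapped in a small helper. This is only a sketch: the `recommend` function, its `top_n` parameter and the toy matrix are hypothetical names introduced here. Instead of relying on the liked movie sorting into first place, it removes that movie by label before ranking:

```python
import pandas as pd

def recommend(df_similarity, movie_liked, top_n=50):
    """Return the top_n titles most correlated with movie_liked, excluding itself."""
    if movie_liked not in df_similarity.columns:
        raise KeyError(f"{movie_liked!r} is not in the similarity matrix")
    # Drop the liked movie explicitly rather than assuming it ranks first
    scores = df_similarity[movie_liked].drop(labels=movie_liked, errors="ignore")
    return scores.sort_values(ascending=False).head(top_n)

# Toy similarity matrix (made-up scores) to show the call
toy_similarity = pd.DataFrame(
    [[1.0, 0.9, -0.2],
     [0.9, 1.0, 0.1],
     [-0.2, 0.1, 1.0]],
    index=["A", "B", "C"],
    columns=["A", "B", "C"],
)
print(recommend(toy_similarity, "A", top_n=2))
```

With the real data, the equivalent call would be `recommend(df_similarity, 'Heat (1995)')`.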

Time to create the Flask API which can be deployed and used to make movie recommendations in real-time.

If you’re already familiar with Flask, skip the ‘Getting started with Flask’ section below and go directly to the ‘Movie Recommendation Engine Flask API’ section.


Getting started with Flask

In case you don’t have Flask, you can install it using pip:

pip install Flask

Let’s start simple by creating a Python file, flask_ex.py, with the following code:
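The code listing embedded at this point is not reproduced in the text. The following is a minimal sketch consistent with the curl commands used later in this section, not the author’s original file. The "/" and /find_friend resources and the "person" field come from those commands; the `BEST_FRIENDS` table, the response wording and the 404 handling are assumptions, so the output will likely differ from the original.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical lookup table; the original tutorial's data is not shown
BEST_FRIENDS = {"Joey": "Chandler", "Chandler": "Joey"}

@app.route("/", methods=["GET"])
def index():
    # Default resource: a simple message to confirm the server is up
    return "Flask API is running\n"

@app.route("/find_friend", methods=["POST"])
def find_friend():
    # Read the JSON body, e.g. {"person": "Joey"}
    data = request.get_json(silent=True) or {}
    person = data.get("person")
    friend = BEST_FRIENDS.get(person) if isinstance(person, str) else None
    if friend is None:
        return jsonify({"error": f"no friend on record for {person!r}"}), 404
    return jsonify({"person": person, "best_friend": friend})

# To serve it with `python flask_ex.py`, add at the bottom:
# if __name__ == "__main__":
#     app.run(host="0.0.0.0", port=80)  # port 80 may need elevated privileges
```

The `app.run` call is left commented out so the module can be imported and exercised with Flask’s test client (`app.test_client()`) without starting a server.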

Running this file from your terminal starts a Flask server:

python flask_ex.py

You should see the following displayed:

In a separate terminal, you can test it using the curl command to send HTTP requests to your default "/" Flask API resource:

curl http://0.0.0.0:80

This should output:

We can access the /find_friend resource to find out who Joey’s best friend is through the following command:

curl -X POST http://0.0.0.0:80/find_friend -H 'Content-Type: application/json' -d '{"person":"Joey"}'

This should output:


Movie Recommendation Engine Flask API

Now let’s create our Flask API file, application.py, which will make the movie recommendations:

Note: make sure your application.py file is in the same directory where you saved the movie_similarity.csv file from the Jupyter notebook.

As you can see, we created a resource called /recms. When it receives a "POST" request, it calls the make_rec() function, which reads the "movie_title" of the user’s favourite movie, stores it in the "movie" variable, and finds similar movies to serve as recommendations.

The resulting api_recommendations list is then sent from the Flask API to the frontend, where the movie recommendations can be displayed to the user.
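The application.py listing itself is not reproduced in this text. As a rough sketch of what the description above implies (not the author’s original code), the API could look like the following. The names /recms, make_rec, "movie_title", movie, api_recommendations and movie_similarity.csv come from the article; the `create_app` factory, the `top_n` limit of 50, the JSON array response and the 404 for unknown titles are assumptions.

```python
import pandas as pd
from flask import Flask, jsonify, request

def create_app(df_similarity, top_n=50):
    """Build the Flask app around a movie-by-movie similarity DataFrame."""
    app = Flask(__name__)

    @app.route("/recms", methods=["POST"])
    def make_rec():
        # Expect a JSON body such as {"movie_title": "Heat (1995)"}
        data = request.get_json(silent=True) or {}
        movie = data.get("movie_title")
        if not isinstance(movie, str) or movie not in df_similarity.columns:
            return jsonify({"error": f"unknown movie: {movie!r}"}), 404
        # Rank all other movies by their similarity to the requested one
        scores = df_similarity[movie].drop(labels=movie, errors="ignore")
        api_recommendations = (
            scores.sort_values(ascending=False).head(top_n).index.tolist()
        )
        return jsonify(api_recommendations)

    return app

# To serve it with `python application.py`, add at the bottom
# (assumes movie_similarity.csv sits in the same directory):
# if __name__ == "__main__":
#     df_similarity = pd.read_csv("movie_similarity.csv", index_col=0)
#     create_app(df_similarity).run(host="0.0.0.0", port=80)
```

Passing the DataFrame into a factory keeps the CSV loading separate from the routing, so the endpoint can be tried against a small in-memory matrix with `create_app(toy).test_client()` before wiring in the real file.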

Let’s test it out again with the movie "Heat (1995)".

First, to run the API, execute the following in the terminal:

python application.py

Then in a separate terminal execute the following:

curl -X POST http://0.0.0.0:80/recms -H 'Content-Type: application/json' -d '{"movie_title":"Heat (1995)"}'

If everything is working properly, you should see the following output:

The API returned a list of movie recommendations which can then be consumed by the frontend.

And that’s it, we have successfully built our Movie Recommendation Engine Flask API!


Final Words

In the next post, we’ll go through how we can dockerize this API and deploy it on AWS for others to consume.

