How To Deploy A Neural Network From Beirut

Nadim Kawwa
Towards Data Science
17 min read · Aug 31, 2019


Beirut is Lebanon’s gorgeous capital and comes with the typical problems of a bustling city. On top of that it suffers from frequent power cuts and one of the slowest internet connections in the world.

It is also where I spent my summer vacation and an ideal testing ground for the purpose of this article: how to deploy a neural network in the form of a web app with Amazon’s SageMaker and PyTorch.

Before moving on, it is advisable to clone the repository into SageMaker so you can follow along with this lengthy process. The entire project, from start to finish, is hosted in this GitHub repository.

Download the Data

For this task we will use the IMDb dataset by Maas et al., which contains 25,000 highly polar movie reviews for training and 25,000 for testing. We create a directory and download the data:
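The original notebook does this in a shell cell; a minimal Python equivalent, assuming the standard aclImdb_v1.tar.gz archive hosted by Stanford and a ../data working directory, might look like this:

```python
import os
import tarfile
import urllib.request

DATA_DIR = "../data"  # local working directory (an assumption, matching the notebook layout)
URL = "http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
ARCHIVE = os.path.join(DATA_DIR, "aclImdb_v1.tar.gz")

os.makedirs(DATA_DIR, exist_ok=True)

# Download the archive only if it is not already cached locally.
if not os.path.exists(ARCHIVE):
    urllib.request.urlretrieve(URL, ARCHIVE)

# Extract into DATA_DIR, producing DATA_DIR/aclImdb/{train,test}/{pos,neg}.
with tarfile.open(ARCHIVE, "r:gz") as tar:
    tar.extractall(path=DATA_DIR)
```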

Processing & Cleaning

Photo by Pablo Lancaster Jones on Unsplash

With the data downloaded we need to transform it into a readable format. The read_imdb_data function below reads each of the reviews and combines them into a single input structure.

We then combine the positive and negative reviews, and shuffle the resulting records with the prepare_imdb_data function.
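A sketch of what these two helpers might look like, given the aclImdb directory layout from the download step and using scikit-learn's shuffle utility (the implementation in the repository may differ in details):

```python
import os
import glob
from sklearn.utils import shuffle

def read_imdb_data(data_dir="../data/aclImdb"):
    """Read the raw review files into nested dicts keyed by split and sentiment."""
    data, labels = {}, {}
    for split in ["train", "test"]:
        data[split], labels[split] = {}, {}
        for sentiment in ["pos", "neg"]:
            reviews, sentiments = [], []
            for path in glob.glob(os.path.join(data_dir, split, sentiment, "*.txt")):
                with open(path, encoding="utf-8") as review_file:
                    reviews.append(review_file.read())
                    sentiments.append(1 if sentiment == "pos" else 0)
            data[split][sentiment] = reviews
            labels[split][sentiment] = sentiments
    return data, labels

def prepare_imdb_data(data, labels):
    """Combine positive and negative reviews and shuffle each split."""
    train_X = data["train"]["pos"] + data["train"]["neg"]
    train_y = labels["train"]["pos"] + labels["train"]["neg"]
    test_X = data["test"]["pos"] + data["test"]["neg"]
    test_y = labels["test"]["pos"] + labels["test"]["neg"]
    train_X, train_y = shuffle(train_X, train_y)
    test_X, test_y = shuffle(test_X, test_y)
    return train_X, test_X, train_y, test_y
```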

Before moving on let’s check how we are doing. The review below has a value of 1, meaning that it is positive.

Contains *spoilers* - also, my quotes may not be exact.<br /><br />Everyone always notes the satire in social commentary and economic parallels - how true. But to me, I see this movie as much more than that. I love the symbolism of this guy in a glowing white suit. There is so much confusion and filth in the world around him, but it won't stick. Alec Guiness was the perfect guy to play this - his boyish grins and eternal curiousity are so appropriate:<br /><br />"That's ingenious - can you tell me, what is the ratio of ink to petrol?"<br /><br />The only moment of defeat is when he realizes that his invention hasn't worked after all - standing there almost naked. Yet, more than shame is the simple disappointment that "it didn't work." He's never really intimidated by people. Remember,<br /><br />"But Sidney, we want to stop it too."<br /><br />Barely a moments hesitation before he's off trying to get away again. Does he show any sign of the pain such a betrayal must've caused? No.<br /><br />Also notable is Dapne's role. She is sick and tired of money and power. She thinks she's finally found love, outside of her father's company. At first she doesn't really care about Sidney anymore than anyone else. But that moment when he falls off her car and she goes back to see if maybe she killed him - and yet he is still thinking only of the beauty of his invention. She's finally found something she thinks is worth living for. The funny thing is that it's not even romance. It is friendship, but of such an ephemeral nature that the title almost doesn't fit. It's more admiration, and perhaps even inspiration.<br /><br />Upon her discovery that Michael has no real love for her, and that her father is completely incompetent to take care of her, she gives into cynicism and tries to temp Sidney. Fortunately she finds that there really are people in this world living for more than power, money and lust. What a refreshment:<br /><br />"Thank you Sidney. If you would've said 'yes' I think I'd have strangled you."<br /><br />I love the very end, when all of this crazy business seems to have come to nothing. But then, the bubbly, quirky beat starts up and Sidney goes off, his stride matching the tune: dauntless. Where is Daphne? We don't really know - but they weren't really in love and she wasn't really a scientist. He got help escaping and she got "a shot in the arm of hope." (Pollyanna) A cont'd relationship would've been nice, but as Billy Joel says "it's more than I'd hoped for..."<br /><br />

It happens to be a very long review, but skimming it confirms that it is positive. It also contains a lot of HTML tags, which we will remove because they add no sentiment value. We also want to stem our input so that words such as entertained and entertaining are treated as the same word when it comes to sentiment analysis.

The review_to_words helper function does just that, and relies on the NLTK library.
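A sketch of such a helper, assuming BeautifulSoup is available to strip the HTML and NLTK provides the stopword list and Porter stemmer:

```python
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
from bs4 import BeautifulSoup

nltk.download("stopwords", quiet=True)

def review_to_words(review):
    """Strip HTML, lowercase, drop stopwords, and stem the remaining tokens."""
    text = BeautifulSoup(review, "html.parser").get_text()   # remove the <br /> tags
    text = re.sub(r"[^a-zA-Z0-9]", " ", text.lower())        # keep only alphanumerics
    words = text.split()
    words = [w for w in words if w not in stopwords.words("english")]
    words = [PorterStemmer().stem(w) for w in words]          # entertained -> entertain
    return words
```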

We use review_to_words inside the preprocess_data function below, which reads in the data and caches the results. We cache because this processing step can take a long time; if we cannot complete the notebook in the current session, we can come back later without needing to process the data a second time.

Transform the Data

Photo by Quino Al on Unsplash

For the model we are going to implement in this notebook, we will construct a feature representation that maps each word appearing in the reviews to an integer. However, some of the words in the reviews occur very infrequently and so likely don’t carry much information for the purposes of sentiment analysis.

We will deal with this by fixing the size of our working vocabulary and including only the words that appear most frequently. All of the infrequent words will then be combined into a single category, which in our case we label as 1.

Since we will be using a recurrent neural network, it will be convenient if the length of each review is the same. To do this, we will fix a size for our reviews and then pad short reviews with the category ‘no word’ (which we will label 0) and truncate long reviews.

The build_dict function implements the feature transformation on the data. Note that even though the vocab_size is set to 5000, we only want to construct a mapping for the 4998 most frequently appearing words. This is because we want to reserve the special labels 0 for 'no word' and 1 for 'infrequent word'.
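A sketch of build_dict using collections.Counter, assuming each review has already been reduced to a list of stemmed words:

```python
from collections import Counter

def build_dict(data, vocab_size=5000):
    """Map the (vocab_size - 2) most frequent words to the integers 2, 3, ...

    Index 0 is reserved for 'no word' (padding) and 1 for 'infrequent word'.
    """
    word_count = Counter()
    for review in data:                 # each review is a list of stemmed words
        word_count.update(review)

    # Most frequent words first; keep only vocab_size - 2 of them.
    sorted_words = [word for word, _ in word_count.most_common(vocab_size - 2)]

    word_dict = {}
    for idx, word in enumerate(sorted_words):
        word_dict[word] = idx + 2       # 0 and 1 are reserved
    return word_dict
```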

What are the five most common words in our reviews? Unsurprisingly, they are: movi, film, one, like, time.

Later on when we construct an endpoint which processes a submitted review, we will need to make use of the word_dict which we have created. Therefore we save it to a file now for future use.

Now that we have our word dictionary which allows us to transform the words appearing in the reviews into integers, we use it to convert reviews to their integer sequence representation, making sure to pad or truncate to a fixed length, which in our case is 500.

We first create the convert_and_pad function that pads a single review.

We then call it inside convert_and_pad_data to apply it to the entire dataset.
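The two functions might be sketched as follows, with 0 reserved for padding and 1 for words outside the vocabulary:

```python
import numpy as np

NOWORD = 0   # reserved label for padding ('no word')
INFREQ = 1   # reserved label for out-of-vocabulary ('infrequent') words

def convert_and_pad(word_dict, sentence, pad=500):
    """Convert one tokenized review to a fixed-length integer sequence."""
    working_sentence = [NOWORD] * pad
    for i, word in enumerate(sentence[:pad]):
        working_sentence[i] = word_dict.get(word, INFREQ)
    return working_sentence, min(len(sentence), pad)

def convert_and_pad_data(word_dict, data, pad=500):
    """Apply convert_and_pad to every review, returning sequences and lengths."""
    result, lengths = [], []
    for sentence in data:
        converted, length = convert_and_pad(word_dict, sentence, pad)
        result.append(converted)
        lengths.append(length)
    return np.array(result), np.array(lengths)
```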

Let’s recap what we did in the steps above. To deal with both short and very long reviews, we pad or truncate all of our reviews to a specific length. For reviews shorter than the pad length, we pad with 0s. This may increase memory use, but it is a necessary step to standardize the reviews.

When we built the word_dict variable that contains our vocabulary, we used only the training data. This means that no data leakage occurs between the training and testing datasets. However, if the training dataset is not exhaustive, we will run into limitations down the line.

Upload to S3

We need to upload the training dataset to Amazon’s Simple Storage Service (S3) so that our training code can access it. For now we will save it locally, and we will upload it to S3 later on.

It is important to note the format of the data that we are saving as we will need to know it when we write the training code. In our case, each row of the dataset has the form label, length, review[500], where review[500] is a sequence of 500 integers representing the words in the review.
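One way to write the data in that layout, sketched with pandas (the function name save_training_csv and the ../data/pytorch staging directory are illustrative, not from the repository):

```python
import os
import pandas as pd

def save_training_csv(train_y, train_X_len, train_X, data_dir="../data/pytorch"):
    """Write rows of the form: label, length, review[500], with no header or index."""
    os.makedirs(data_dir, exist_ok=True)
    df = pd.concat(
        [pd.DataFrame(train_y), pd.DataFrame(train_X_len), pd.DataFrame(train_X)],
        axis=1,
    )
    df.to_csv(os.path.join(data_dir, "train.csv"), header=False, index=False)
```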

Next, we need to upload the training data to the SageMaker default S3 bucket so that we can provide access to it while training our model.

The code block below uploads the entire contents of our data directory. This includes the word_dict.pkl file. This is fortunate as we will need this later on when we create an endpoint that accepts an arbitrary review. For now, we will just take note of the fact that it resides in the data directory (and so also in the S3 training bucket) and that we will need to make sure it gets saved in the model directory.
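The upload can be done with the SageMaker session object; the key prefix below is an assumption:

```python
import sagemaker

session = sagemaker.Session()
bucket = session.default_bucket()        # SageMaker's default S3 bucket
prefix = "sagemaker/sentiment_rnn"       # hypothetical key prefix

# Uploads everything in the local data directory (train.csv and word_dict.pkl).
input_data = session.upload_data(path="../data/pytorch", bucket=bucket, key_prefix=prefix)
```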

Build & Train PyTorch Model

Credits: Udacity

A model on SageMaker comprises three objects:

  • Model Artifacts
  • Training Code
  • Inference Code

These components interact with one another. Here we will use containers provided by Amazon and write our own custom training and inference code.

We will start by implementing our own neural network in PyTorch along with a training script. For the purposes of this project we have provided the necessary model object in the model.py file, inside of the train folder. The code block below shows the implementation.
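The model in model.py is a recurrent network built from an embedding layer, an LSTM, and a dense output layer. A sketch along those lines (details may differ from the file in the repository):

```python
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """Simple sentiment classifier: embedding -> LSTM -> dense -> sigmoid."""

    def __init__(self, embedding_dim, hidden_dim, vocab_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        self.dense = nn.Linear(in_features=hidden_dim, out_features=1)
        self.sig = nn.Sigmoid()
        self.word_dict = None  # attached later so the inference code can preprocess raw text

    def forward(self, x):
        # Each batch row is: length, then the 500 word indices (see the data format above).
        x = x.t()                      # shape: (501, batch_size)
        lengths = x[0, :]
        reviews = x[1:, :]
        embeds = self.embedding(reviews)
        lstm_out, _ = self.lstm(embeds)
        out = self.dense(lstm_out)
        out = out[lengths - 1, range(len(lengths))]  # output at each review's last word
        return self.sig(out.squeeze())
```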

There are three parameters that we may wish to tweak to improve the performance of our model:

  • Embedding Dimension
  • Hidden Dimension
  • Size of Vocabulary

We will likely want to make these parameters configurable in the training script so that if we wish to modify them we do not need to modify the script itself. To start we will write some of the training code so that we can more easily diagnose any issues that arise.

First we will load a small portion of the training data set to use as a sample. It would be very time consuming to try and train the model completely in the notebook as we do not have access to a GPU and the compute instance that we are using is not particularly powerful. However, we can work on a small bit of the data to get a feel for how our training script is behaving.

Next we need to write the training code itself. We will leave complex aspects such as model saving / loading and parameter loading until a little later.
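A minimal version of such a training loop, leaving out model saving and loading as noted above (the argument names are assumptions; the loss function would be binary cross-entropy to match the sigmoid output):

```python
def train(model, train_loader, epochs, optimizer, loss_fn, device):
    """Basic training loop: one pass over train_loader per epoch, reporting the mean loss."""
    for epoch in range(1, epochs + 1):
        model.train()
        total_loss = 0
        for batch_X, batch_y in train_loader:
            batch_X = batch_X.to(device)
            batch_y = batch_y.to(device)

            optimizer.zero_grad()
            output = model(batch_X)
            loss = loss_fn(output, batch_y)
            loss.backward()
            optimizer.step()

            total_loss += loss.item()
        print("Epoch: {}, Loss: {}".format(epoch, total_loss / len(train_loader)))
```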

Assuming we have the training method above, we will test that it is working by writing code that executes our training method on the small sample training set that we loaded earlier. The reason for doing this early on is so that we have an opportunity to fix any errors that arise early when they are easier to diagnose.

In order to construct a PyTorch model using SageMaker we must provide SageMaker with a training script. We may optionally include a directory which will be copied to the container and from which our training code will be run. When the training container is executed it will check the uploaded directory (if there is one) for a requirements.txt file and install any required Python libraries, after which the training script will be run.

When a PyTorch model is constructed in SageMaker, an entry point must be specified. This is the Python file which will be executed when the model is trained. Inside of the train directory is a file called train.py which contains most of the necessary code to train our model. The only thing that is missing is the implementation of the train() method which we wrote earlier.

The way that SageMaker passes hyperparameters to the training script is by way of arguments. These arguments can then be parsed and used in the training script. To see how this is done, feel free to take a look at the provided train/train.py file.
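Putting this together, constructing and fitting the estimator might look roughly like the following. The framework version and instance parameters reflect the SageMaker SDK conventions of the time and may need adjusting for newer SDK releases; input_data is the S3 URI returned by upload_data above:

```python
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorch

role = get_execution_role()

# Hyperparameters are passed to train.py as command-line arguments, e.g. --epochs 10,
# and parsed there with argparse.
estimator = PyTorch(
    entry_point="train.py",
    source_dir="train",
    role=role,
    framework_version="0.4.0",           # version used at the time of writing
    train_instance_count=1,              # renamed to instance_count in newer SDK versions
    train_instance_type="ml.p2.xlarge",
    hyperparameters={"epochs": 10, "hidden_dim": 200},
)

estimator.fit({"training": input_data})
```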

Deploy Model for Testing

Now that we have trained our model, we would like to test it to see how it performs. Currently our model takes input of the form review_length, review[500], where review[500] is a sequence of 500 integers which describe the words present in the review, encoded using word_dict. Fortunately for us, SageMaker provides built-in inference code for models with simple inputs such as this.

We need to provide a function which loads the saved model. This function must be called model_fn() and takes as its only parameter a path to the directory where the model artifacts are stored. This function must also be present in the Python file which we specified as the entry point. In our case the model loading function has been provided and so no changes need to be made.

Note that when the built-in inference code is run it must import the model_fn method from the train.py file. This is why the training code is wrapped in a main guard (i.e., if __name__ == '__main__':).

Since we don’t need to change anything in the code that was uploaded during training, we can simply deploy the current model as-is.
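Deployment is then a single call on the estimator (the instance type below is an assumption):

```python
# Launch a dedicated inference instance for testing; it keeps running (and billing)
# until the endpoint is explicitly deleted.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")

# When finished testing:
# predictor.delete_endpoint()
```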

When deploying a model we are asking SageMaker to launch a compute instance that will wait for data to be sent to it. As a result, this compute instance will continue to run until it is shut down. This is important to know since the cost of a deployed endpoint depends on how long it has been running.

Use the model for testing

Once deployed, we can read in the test data and send it off to our deployed model to get some results. Once we collect all of the results we can determine how accurate our model is.
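One way to do this, sketched below, is to send the test matrix to the endpoint in batches; the 512-row batch size is an assumption chosen to stay under the request size limit, and test_X and test_y are assumed to be the processed test arrays from the earlier steps (each row being length followed by the 500 word indices):

```python
import numpy as np
from sklearn.metrics import accuracy_score

def predict_in_batches(predictor, data, batch_size=512):
    """Send rows to the endpoint in chunks and round the outputs to 0/1 sentiments."""
    results = []
    for chunk in np.array_split(data, max(1, data.shape[0] // batch_size)):
        results.append(predictor.predict(chunk))
    return np.concatenate(results).round()

predictions = predict_in_batches(predictor, test_X)
print("Accuracy:", accuracy_score(test_y, predictions))
```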

We now have a trained, deployed model to which we can send processed reviews and get back a predicted sentiment. Ultimately, however, we would like to be able to send our model an unprocessed review. That is, we would like to send the review itself as a string. For example, suppose we wish to send the following review to our model.

The question we now need to answer is, how do we send this review to our model?

Recall that in the previous sections we did two things:

  • Removed any html tags and stemmed the input
  • Encoded the review as a sequence of integers using word_dict

To process the review we need to repeat these two steps. Using the review_to_words and convert_and_pad methods from before, we convert test_review into a numpy array called test_data, suitable to send to our model. Recall that our model expects input of the form review_length, review[500]. We can then use the predictor object to predict the sentiment.
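A sketch of that conversion wrapped in a helper (the function name predict_raw_review is illustrative; it reuses the preprocessing helpers and word_dict from earlier):

```python
import numpy as np

def predict_raw_review(predictor, word_dict, review):
    """Preprocess a raw review string and send it to the deployed endpoint."""
    words = review_to_words(review)                   # strip HTML, remove stopwords, stem
    seq, length = convert_and_pad(word_dict, words)   # fixed-length integer sequence
    data = np.hstack(([length], seq)).reshape(1, -1)  # row format: length, review[500]
    return predictor.predict(data)
```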

Deploy Model for Web App

Photo by SpaceX on Unsplash

Now that we know that our model is working, it’s time to create some custom inference code so that we can send the model a review which has not been processed and have it determine the sentiment of the review.

As we saw above, by default the estimator which we created, when deployed, will use the entry script and directory which we provided when creating the model. However, since we now wish to accept a string as input and our model expects a processed review, we need to write some custom inference code.

We will store the code that we write in the serve directory. Provided in this directory is the model.py file that we used to construct our model, a utils.py file which contains the review_to_words and convert_and_pad pre-processing functions which we used during the initial data processing, and predict.py, the file which will contain our custom inference code. Note also that a requirements.txt file is present, which tells SageMaker what Python libraries are required by our custom inference code.

When deploying a PyTorch model in SageMaker, we are expected to provide four functions which the SageMaker inference container will use.

  • model_fn: This function is the same function that we used in the training script and it tells SageMaker how to load our model.
  • input_fn: This function receives the raw serialized input that has been sent to the model's endpoint and its job is to de-serialize and make the input available for the inference code.
  • output_fn: This function takes the output of the inference code and its job is to serialize this output and return it to the caller of the model's endpoint.
  • predict_fn: The heart of the inference script, this is where the actual prediction is done.

For the simple website that we are constructing in this project, we only need to accept a string as input and return a single value as output. We can imagine that in a more complex application the input or output may be image data or some other binary data, which would require more effort to serialize.

Inside serve/predict.py, we write the inference code as follows:
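A sketch of what serve/predict.py might contain, assuming the training script saved model.pth, model_info.pth, and word_dict.pkl into the model directory (these file names are assumptions carried over from the training code):

```python
import os
import pickle

import numpy as np
import torch

from model import LSTMClassifier
from utils import review_to_words, convert_and_pad

def model_fn(model_dir):
    """Load the trained model and its word_dict from the model directory."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    with open(os.path.join(model_dir, "model_info.pth"), "rb") as f:
        model_info = torch.load(f)                    # hyperparameters saved during training
    model = LSTMClassifier(model_info["embedding_dim"],
                           model_info["hidden_dim"],
                           model_info["vocab_size"])
    with open(os.path.join(model_dir, "model.pth"), "rb") as f:
        model.load_state_dict(torch.load(f))
    with open(os.path.join(model_dir, "word_dict.pkl"), "rb") as f:
        model.word_dict = pickle.load(f)

    model.to(device).eval()
    return model

def input_fn(serialized_input_data, content_type):
    """The web app sends the review as plain text."""
    if content_type == "text/plain":
        return serialized_input_data.decode("utf-8")
    raise Exception("Unsupported content type: " + content_type)

def output_fn(prediction_output, accept):
    """Return the 0/1 sentiment as a string."""
    return str(prediction_output)

def predict_fn(input_data, model):
    """Preprocess the raw review, run the model, and round to a 0/1 sentiment."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    words = review_to_words(input_data)
    data_X, data_len = convert_and_pad(model.word_dict, words)

    # Row format expected by the model: length, then the 500 word indices.
    data = np.hstack(([data_len], data_X)).reshape(1, -1)
    data = torch.from_numpy(data).long().to(device)

    with torch.no_grad():
        output = model(data)
    return int(np.round(output.cpu().numpy()))
```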

Now that the custom inference code has been written, we will create and deploy our model. To begin with, we need to construct a new PyTorch model object which points to the model artifacts created during training and also points to the inference code that we wish to use. Then we can call the deploy method to launch the deployment container.
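A sketch of that construction and deployment, reusing the role and the artifacts from the estimator trained above (the instance type is an assumption):

```python
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data=estimator.model_data,   # S3 location of the artifacts from training
    role=role,
    framework_version="0.4.0",
    entry_point="predict.py",
    source_dir="serve",
)

predictor = model.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")
```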

Now that we have deployed our model with the custom inference code, we should test that everything works. Here we test our model by loading the first 250 positive and negative reviews, sending them to the endpoint, and collecting the results. The reason for only sending some of the data is that the amount of time our model takes to process the input and perform inference is quite long, so testing the entire dataset would be prohibitive.

Use Model for Web App

Now that we know our endpoint is working as expected, we can set up the web page that will interact with it.

So far we have been accessing our model endpoint by constructing a predictor object which uses the endpoint and then just using the predictor object to perform inference. What if we wanted to create a web app which accessed our model?

The way things are currently set up makes that impossible, since in order to access a SageMaker endpoint the app would first have to authenticate with AWS using an IAM role that includes access to SageMaker endpoints. However, there is an easier way! We just need to use some additional AWS services.

Credits: AWS

The diagram above gives an overview of how the various services will work together. On the far right is the model endpoint which we trained above and which is deployed using SageMaker. On the far left is our web app that collects a user’s movie review, sends it off, and expects a positive or negative sentiment in return.

In the middle we will construct a Lambda function, which we can think of as a straightforward Python function that can be executed whenever a specified event occurs. We will give this function permission to send and receive data from a SageMaker endpoint.

Lastly, the method we will use to execute the Lambda function is a new endpoint that we will create using API Gateway. This endpoint will be a URL that listens for data to be sent to it. Once it gets some data it will pass that data on to the Lambda function and then return whatever the Lambda function returns. Essentially it will act as an interface that lets our web app communicate with the Lambda function.

Set Up Lambda Function

The Lambda function will be executed whenever our public API has data sent to it. When it is executed it will receive the data, perform any sort of processing that is required, send the data (the review) to the SageMaker endpoint we’ve created and then return the result.

Since we want the Lambda function to call a SageMaker endpoint, we need to make sure that it has permission to do so. To do this, we will construct a role that we can later give the Lambda function.

Using the AWS Console, navigate to the IAM page and click on Roles. Then, click on Create role. Make sure that the AWS service is the type of trusted entity selected and choose Lambda as the service that will use this role, then click Next: Permissions.

In the search box type sagemaker and select the check box next to the AmazonSageMakerFullAccess policy. Then, click on Next: Review.

Lastly, give this role a name. Make sure to use a name that we will remember later on, for example LambdaSageMakerRole. Then, click on Create role.

Now we can create the Lambda function. Using the AWS Console, navigate to the AWS Lambda page and click on Create a function. When we get to the next page, make sure that Author from scratch is selected.

Now, name the Lambda function, using a name that we will remember later on, for example sentiment_analysis_func. Make sure that the Python 3.6 runtime is selected and then choose the role that was created in the previous part. Then, click on Create Function.

On the next page we can see some information about the Lambda function we just created. Scroll down to see an editor in which we can write the code that will be executed when the Lambda function is triggered. In our example, we will use the code below.
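The handler is, roughly, a thin wrapper around the SageMaker runtime client; a sketch is shown below (the endpoint name placeholder must be filled in with the value retrieved in the next step):

```python
import boto3

def lambda_handler(event, context):
    # Invoke the SageMaker endpoint, passing the review text through unchanged.
    runtime = boto3.Session().client("sagemaker-runtime")

    response = runtime.invoke_endpoint(
        EndpointName="**ENDPOINT NAME HERE**",   # replace with the name from the notebook
        ContentType="text/plain",
        Body=event["body"],                      # the review forwarded by API Gateway
    )

    result = response["Body"].read().decode("utf-8")

    # With Lambda proxy integration the return value must be a full response object.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/plain", "Access-Control-Allow-Origin": "*"},
        "body": result,
    }
```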

We need to add the endpoint name to the Lambda function. Inside the SageMaker notebook we can get this endpoint name:
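It can be printed from the predictor object (in newer SageMaker SDK versions the attribute is endpoint_name):

```python
# Run inside the SageMaker notebook; paste the printed name into the Lambda code above.
print(predictor.endpoint)
```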

Once we have added the endpoint name to the Lambda function, click on Save. The Lambda function is now up and running. Next we need to create a way for our web app to execute the Lambda function.

Set Up API Gateway

Now that our Lambda function is set up, it is time to create a new API using API Gateway that will trigger the Lambda function we have just created.

Using the AWS Console, navigate to Amazon API Gateway and then click on Get started.

On the next page, make sure that New API is selected and give the new API a name, for example sentiment_analysis_api. Then, click on Create API.

We have now created an API; however, it doesn’t do anything yet. What we want it to do is trigger the Lambda function that we created earlier.

Select the Actions dropdown menu and click Create Method. A new blank method will be created, select its dropdown menu and select POST, then click on the check mark beside it.

For the integration point, make sure that Lambda Function is selected and check the Use Lambda Proxy integration option. This option ensures that the data sent to the API is passed directly to the Lambda function with no processing. It also means that the return value must be a proper response object, as it will not be processed by API Gateway either.

Type the name of the Lambda function created earlier into the Lambda Function text entry box and then click on Save. Click on OK in the pop-up box that then appears, giving permission to API Gateway to invoke the Lambda function you created.

The last step in creating the API Gateway is to select the Actions dropdown and click on Deploy API. We will need to create a new Deployment stage and name it something relevant, for example prod.

We have now successfully set up a public API to access our SageMaker model! Make sure to copy or write down the URL provided to invoke the newly created public API, as it will be needed in the next step. This URL can be found at the top of the page, highlighted in blue next to the text Invoke URL. A sample URL looks like:

The link associated with the endpoint is: https://ltii177nx3.execute-api.us-west-2.amazonaws.com/prod

Deploy the Web App

Now that we have a publicly available API, we can start using it in a web app. For our purposes, we have a simple static HTML file which can make use of the public API created earlier.

In the website folder there is a file called index.html. Download the file to your computer and open it in a text editor of your choice. There should be a line which contains **REPLACE WITH PUBLIC API URL**. Replace this string with the URL from the last step and then save the file.

Now, if you open index.html on your local computer, your browser will render the page locally and you can use the provided site to interact with your SageMaker model.

To go further, you can host this html file anywhere you’d like, for example using a static site on Amazon’s S3.

Results

Let’s see how our model behaves with sample positive and negative reviews.

Congratulations! You now have a neural network powered web app!

This project was made possible thanks to Udacity’s Deep Learning nanodegree. I highly recommend this course as a way to gain a solid understanding of artificial intelligence!
