Expose endpoints using Jupyter Kernel Gateway

Published in

Towards Data Science

6 min read · Feb 7, 2019

While browsing through Medium articles, I came across Jupyter Kernel Gateway, which was completely new to me. I explored it, read its documentation, and decided to implement a project to understand it better.

Jupyter Kernel Gateway is a web server that provides headless access to Jupyter kernels. — Jupyter Kernel Gateway Repo

Basically, the service lets us expose the cells of any given Jupyter notebook as HTTP endpoints and use their output elsewhere. It supports GET, POST, PUT and DELETE. Let’s begin this journey together. You can check my repository here.

About the project

In this project, we’ll use the California Housing dataset from sklearn, train a Random Forest Regressor model on it and then predict house prices. We’ll set up a GET call to fetch statistics about our dataset and a POST endpoint to get the predicted price of a house with a given set of features.

As this article is about Kernel Gateway and not the Machine Learning Project on California Housing itself, I’ll skip the details of it but will explain the relevant cells of the notebook. The notebook is located here.

Basics

Let’s go through the basics of Jupyter Kernel Gateway first.

Define endpoint

We mark a cell as an endpoint by starting it with a comment that gives the HTTP method and the path. A GET endpoint with the path /housing_stats is defined as:

# GET /housing_stats

Then we write our Python code and work on the data. Once the work in the cell is done, we define the response of the endpoint by printing a JSON string with print(). So, to return the total number of houses in the dataset under the key total_houses, we write:

import json

print(json.dumps({
  'total_houses': 20640
}))

It’s that simple. We can extend this to more complex solutions if needed. Each endpoint cell would thus be similar to the GitHub Gist below:
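As a rough illustration, a complete endpoint cell might look like the sketch below. Only the comment line and the print(json.dumps(...)) pattern come from the text above; the houses list is a made-up stand-in for the real dataset.

```python
# GET /housing_stats
import json

# Hypothetical stand-in data; the real notebook would use the housing dataset.
houses = [{'id': 1}, {'id': 2}, {'id': 3}]

# Whatever the cell prints becomes the body of the HTTP response.
response = json.dumps({'total_houses': len(houses)})
print(response)
```

Keeping the JSON string in a variable before printing it is just a convenience here; printing json.dumps(...) directly works the same way.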

Start the server

Starting the server takes a single command. We will use the Jupyter notebook named House Price Prediction, and the server will be available on 0.0.0.0:9090. The command is as follows:

jupyter kernelgateway --api='kernel_gateway.notebook_http' --seed_uri='House Price Prediction.ipynb' --port=9090

Just change the port number and/or the notebook name whenever you work on a different project.
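If you switch between projects often, one option is to build the command from Python. The sketch below assumes a hypothetical helper, gateway_command, which is not part of Kernel Gateway; it only reassembles the flags shown above.

```python
import shlex


def gateway_command(notebook, port=9090):
    """Build the kernel gateway launch command for a given notebook and port."""
    return [
        'jupyter', 'kernelgateway',
        '--api=kernel_gateway.notebook_http',
        f'--seed_uri={notebook}',
        f'--port={port}',
    ]


cmd = gateway_command('House Price Prediction.ipynb', port=9090)
# shlex.join quotes the notebook name so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The same list could be passed to subprocess.run(cmd) to launch the server from a script instead of a terminal.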

GET and POST endpoints

Let’s set up our endpoints in the notebook.

GET endpoint

The first line includes the word GET to define that it is a GET endpoint, followed by the path /housing_stats. We extract the total number of houses in the dataset, along with the maximum and minimum house values. Also, as per my analysis in the notebook, I identified that the most important feature is the Median Income of the Block. I dump all these values into a JSON string and pass it to print(), which defines the response of this endpoint.
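A minimal sketch of such a cell is shown below. The response keys and the stand-in numbers are assumptions for illustration; in the actual notebook the prices would come from the dataset’s target values and the importances from the trained model.

```python
# GET /housing_stats
import json

# Stand-in values (assumed for illustration), in units of 100,000.
prices = [4.526, 0.14999, 5.00001, 3.585]
feature_names = ['MedInc', 'HouseAge', 'AveRooms']
importances = [0.52, 0.05, 0.04]

# Pick the feature whose importance score is highest.
top_feature = max(zip(importances, feature_names))[1]

response = json.dumps({
    'total_houses': len(prices),
    'max_value': max(prices),
    'min_value': min(prices),
    'most_important_feature': top_feature,
})
print(response)
```

If the values come from NumPy arrays, converting them with int() or float() before json.dumps avoids serialization errors on NumPy scalar types.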

POST endpoint

Now, I’d like to use my trained Machine Learning model to predict the price of any new house with a given set of features. Thus, I use this endpoint to post my features and get the response as the predicted price.

I start the cell by defining it as a POST endpoint with the path /get_price. The request data arrives in the object REQUEST, under the key body. So I first load the request, read all the values from body and convert them into a NumPy array. Its shape is not what the model expects, so I correct it with NumPy’s reshape function. I then predict the price. The model returns an array holding a single value, which I read into the variable predicted_price. I format it to two decimal places and scale it by 100000, as the values are in units of 100,000. Finally, I return the value by appending it to a string and passing it to print().
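The steps above can be sketched roughly as follows. StandInModel is a hypothetical placeholder for the trained Random Forest Regressor, the REQUEST literal only mimics what the gateway would inject, and the feature values and exact output format are assumptions.

```python
# POST /get_price
import json

import numpy as np


class StandInModel:
    """Hypothetical placeholder for the trained Random Forest Regressor."""

    def predict(self, X):
        # Like scikit-learn estimators, expect a 2-D array: (n_samples, n_features).
        assert X.ndim == 2
        return np.array([2.10424])


model = StandInModel()

# The gateway injects REQUEST as a JSON string; this literal only mimics it
# so the cell can run on its own.
REQUEST = json.dumps({'body': {
    'MedInc': 3.87, 'HouseAge': 28.0, 'AveRooms': 5.4, 'AveBedrms': 1.1,
    'Population': 1425.0, 'AveOccup': 3.0, 'Latitude': 35.6, 'Longitude': -119.5,
}})

req = json.loads(REQUEST)
features = np.array(list(req['body'].values()))  # shape (8,)
features = features.reshape(1, -1)               # shape (1, 8): one sample

predicted_price = model.predict(features)[0]
# Targets are in units of 100,000, so scale up and round to two decimals.
price = round(float(predicted_price) * 100000, 2)

response = json.dumps({'result': f'${price:,.2f}'})
print(response)
```

reshape(1, -1) tells NumPy to produce a single row and infer the number of columns, which is the layout a model expects for one sample.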

Making requests

Let’s follow the steps to interact with our endpoints:

Start your Jupyter notebook. I named my notebook House Price Prediction, so that is the one I started.
Start the server. Use the command defined above and your server will start running at http://0.0.0.0:9090.
Finally, decide where you will call your endpoints from. You could use a tool such as Postman, create a webpage that makes those calls, or simply create another notebook to call these endpoints.

Here, we will create a notebook named Requests and use the requests package in Python to call these endpoints and get the results.

Set up

Let’s create a new Jupyter notebook named Requests. We will import the requests package, which is used to make calls to endpoints in Python. Then, we’ll specify the base URL in the variable URL as http://0.0.0.0:9090.

Make the GET request

We make the GET request using requests.get(), passing the complete URL, http://0.0.0.0:9090/housing_stats, and save the response in the variable stats. Then, we load the JSON from that variable and print each key-value pair.

stats holds the response object. To get its content, which is encoded, we use content followed by decode('UTF-8'). We then iterate over the result.

I’ve added the result as comments in the GitHub Gist above. The endpoint responds with the total number of houses, their maximum and minimum price, and the most important factor for predicting the price of any house.
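A minimal sketch of this client code might look like the following. The helper names decode_stats and fetch_stats are hypothetical, and the sample bytes are made up to resemble the endpoint’s reply so the decoding step can be tried without a running server.

```python
import json

import requests

URL = 'http://0.0.0.0:9090'


def decode_stats(content):
    """Turn the raw response bytes into a Python dict."""
    return json.loads(content.decode('UTF-8'))


def fetch_stats(base_url=URL):
    """Call the GET endpoint; requires the kernel gateway to be running."""
    stats = requests.get(base_url + '/housing_stats')
    return decode_stats(stats.content)


# Offline demonstration with made-up bytes shaped like the endpoint's reply.
sample = b'{"total_houses": 20640}'
for key, value in decode_stats(sample).items():
    print(key, value)
```

With the server running, fetch_stats() would return the same kind of dictionary straight from the endpoint.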

Make the POST request

Here, we’ll use requests.post() to make a POST request. We first specify the complete URL which is http://0.0.0.0:9090/get_price. We send the features in the form of JSON. You can change the values as you like and see the effect on the predicted price. We then load the result from the endpoint and print the result we get.

expected_price holds the response object. To get its content, which is encoded, we use content followed by decode('UTF-8'). We then read the result, whose result field has our actual response.

From the commented response above, you can see that for the given set of features, the predicted price of the house is $210,424.
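For reference, the POST call can be sketched as below. The feature values are illustrative, the keys assume the dataset’s eight feature names, and get_price is a hypothetical helper. Preparing the request without sending it shows what would go over the wire even when no server is running.

```python
import json

import requests

URL = 'http://0.0.0.0:9090'

# Illustrative feature values; the keys assume the dataset's eight feature names.
features = {
    'MedInc': 3.87, 'HouseAge': 28.0, 'AveRooms': 5.4, 'AveBedrms': 1.1,
    'Population': 1425.0, 'AveOccup': 3.0, 'Latitude': 35.6, 'Longitude': -119.5,
}

# Build the request without sending it, to inspect the headers and body.
prepared = requests.Request('POST', URL + '/get_price', json=features).prepare()
print(prepared.headers['Content-Type'])
print(prepared.body)


def get_price(payload, base_url=URL):
    """Send the features to the endpoint; requires the server to be running."""
    expected_price = requests.post(base_url + '/get_price', json=payload)
    return json.loads(expected_price.content.decode('UTF-8'))['result']
```

Passing the dictionary through the json argument lets requests serialize it and set the Content-Type header, so the notebook cell receives the features under the body key.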

Conclusion

In this article, we discussed Jupyter Kernel Gateway, which allows us to convert Jupyter notebook cells into REST endpoints that we can call and get responses from. We then explored its use through an example project. For further information, you should check the documentation for Jupyter Kernel Gateway.

Please feel free to share your thoughts, ideas and suggestions. Your feedback is always welcome.