Expose endpoints using Jupyter Kernel Gateway
While browsing through Medium articles, I came across Jupyter Kernel Gateway which was completely new to me. So, I explored the same. I discovered its documentation and decided that I’ll implement a project to understand it better.
Jupyter Kernel Gateway is a web server that provides headless access to Jupyter kernels. — Jupyter Kernel Gateway Repo
Basically, the service allows us to interact with Jupyter cells of any given Jupyter notebook and then use the information accordingly. It supports GET, POST, PUT and DELETE. Let’s begin this journey together. You can check my repository here.
About the project
In this project, we’ll use the California Housing dataset from sklearn, train a Random Forest Regressor model on it and then predict house prices. We’ll set up a GET call to fetch statistics about our dataset and a POST endpoint to get the predicted price of a house with a given set of features.
As this article is about Kernel Gateway and not the Machine Learning Project on California Housing itself, I’ll skip the details of it but will explain the relevant cells of the notebook. The notebook is located here.
Basics
Let’s go through the basics of Jupyter Kernel Gateway first.
Define endpoint
A cell becomes an endpoint when it starts with a comment that names the HTTP method and the path. If we want to create a GET endpoint with the path /housing_stats, it is defined as:
# GET /housing_stats
Then, we define our Python code and work on the data. After all the work is done in the cell, we define the response of the endpoint. It is defined as a print() command with a JSON value. So, if we have to return the total number of houses in the dataset in a parameter total_houses, we define it as:
print(json.dumps({
    'total_houses': 20640
}))
It’s that simple. We can extend the functionality to more complex solutions if needed. Thus, each endpoint Jupyter cell would be similar to the Github Gist below:
Start the server
Starting the server takes a single command. We will serve the Jupyter notebook named House Price Prediction, and its endpoints will be available on 0.0.0.0:9090. The command is as follows:
jupyter kernelgateway --api='kernel_gateway.notebook_http' --seed_uri='House Price Prediction.ipynb' --port 9090
Just change the PORT number and/or the Jupyter notebook name whenever you plan to work on a different project.
GET and POST endpoints
Let’s set up our endpoints in the notebook.
GET endpoint
The first line includes the word GET to define that it is a GET endpoint and then the path /housing_stats. We extract the total houses in the dataset, the maximum value of all the houses and the minimum value of all the houses. Also, as per my analysis in the notebook, I identified that the most important feature is the Median Income of the Block. Thus, I dump all these values in a JSON and put it inside the print command. The print command defines the response of this endpoint.
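Since the embedded cell is not reproduced here, the following is only a rough sketch of what such a GET cell could look like. The list house_values, the helper build_stats and the JSON key names are my own placeholders, not the names used in the notebook, and the real cell would read the values from the sklearn dataset instead of a hard-coded list.

```python
# GET /housing_stats
import json

# Hypothetical stand-in for the dataset's target column (house values in
# units of 100,000). The real notebook would take these from the dataset.
house_values = [4.526, 3.585, 0.14999, 5.00001]


def build_stats(values, most_important_feature):
    """Collect the statistics that the endpoint returns (key names are assumed)."""
    return {
        'total_houses': len(values),
        'max_value': max(values),
        'min_value': min(values),
        'most_important_feature': most_important_feature,
    }


# Whatever the cell prints becomes the response body of the endpoint.
print(json.dumps(build_stats(house_values, 'Median Income')))
```

Keeping the statistics in a small helper is just a convenience for reading; the same values could be computed inline before the print command.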
POST endpoint
Now, I’d like to use my trained Machine Learning model to predict the price of any new house with a given set of features. Thus, I use this endpoint to post my features and get the response as the predicted price.
I start the cell by defining it as a POST endpoint with the path /get_price. The request data is contained inside the object REQUEST, inside the key body. Thus, I first load the request and then read all the values from the body key and convert them into a NumPy array. However, the shape is not correct and hence, I correct it using NumPy’s reshape function. I then predict the price. The model returns an array with only one value, the predicted price, so I read that value into the variable predicted_price. I multiply it by 100000, as the values are in units of 100,000, and format it to two decimal places. Finally, I return the value by appending it to a string and putting it inside the print command.
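The steps above could be sketched roughly as follows. This is not the notebook's actual cell: handle_request, fake_predict, the sample REQUEST and the key names in the body are all assumptions made so the sketch runs on its own, and the path follows the URL used later when making the request. In the notebook, the gateway injects REQUEST for you and the trained model's predict method would take the place of fake_predict.

```python
# POST /get_price
import json

# Feature names of the California Housing dataset, in column order.
FEATURE_NAMES = ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms',
                 'Population', 'AveOccup', 'Latitude', 'Longitude']


def handle_request(request_json, predict):
    """Turn the raw request string into the JSON response of the endpoint."""
    body = json.loads(request_json)['body']
    features = [float(body[name]) for name in FEATURE_NAMES]
    # The model expects one row per house, so the flat list is wrapped in
    # another list; this plays the role of NumPy's reshape(1, -1).
    predicted_price = predict([features])[0]
    # Values are in units of 100,000, so scale up and keep two decimals.
    message = 'Predicted price: $%.2f' % (predicted_price * 100000)
    return json.dumps({'result': message})


# Sample request for running the sketch standalone. Remove this assignment in
# the real notebook, where the gateway provides REQUEST itself.
REQUEST = json.dumps({'body': {
    'MedInc': 3.5, 'HouseAge': 25, 'AveRooms': 5.2, 'AveBedrms': 1.1,
    'Population': 1200, 'AveOccup': 3.0, 'Latitude': 34.2, 'Longitude': -118.4,
}})


def fake_predict(rows):
    """Hypothetical stand-in for model.predict; returns a fixed value per row."""
    return [2.0 for _ in rows]


print(handle_request(REQUEST, fake_predict))
```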
Making requests
Let’s follow the steps to interact with our endpoints:
- Start your Jupyter notebook. I named my notebook House Price Prediction, so I started the same.
- Start the server. Use the command I defined above and your server will start running at http://0.0.0.0:9090.
- Finally, decide where you will call your endpoints from. You could use a tool called Postman, create a webpage that makes those calls, or simply create another notebook to call these endpoints.
Here, we will create a notebook named Requests and use the requests package in Python to call these endpoints and get the results.
Set up
Let’s create a new Jupyter notebook named Requests. We will import the requests package into our notebook, which is used to make calls to endpoints in Python. Then, we’ll specify the base URL in the variable URL as http://0.0.0.0:9090.
Make the GET request
We make the GET request using requests.get(), specify the complete URL, which is http://0.0.0.0:9090/housing_stats, and save the response in the variable stats. Then, we load the JSON from that variable and print each key-value pair.
stats holds the response object. To get the content, which is encoded, we use content followed by decode('UTF-8'). The result is then iterated.
I’ve added the result as comments in the Github Gist above. The endpoint responds with total houses, their maximum and minimum price and the most important factor for predicting the price of any house.
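As a hedged sketch of the GET call described in this section, assuming the server from the earlier step is running: decode_json and print_stats are helper names I made up for illustration, and the requests import sits inside the function only so that the decoding helper can be tried without the package installed. No output is shown because it depends on what the endpoint returns.

```python
import json

URL = 'http://0.0.0.0:9090'


def decode_json(content):
    """Decode the raw response bytes as UTF-8 and parse them as JSON."""
    return json.loads(content.decode('UTF-8'))


def print_stats(base_url=URL):
    """Call the /housing_stats endpoint and print every key-value pair."""
    import requests  # third-party package; only needed for the real call

    stats = requests.get(base_url + '/housing_stats')
    for key, value in decode_json(stats.content).items():
        print(key, ':', value)


# With the Kernel Gateway server running, call:
# print_stats()
```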
Make the POST request
Here, we’ll use requests.post() to make a POST request. We first specify the complete URL, which is http://0.0.0.0:9090/get_price. We send the features in the form of JSON. You can change the values as you like and see the effect on the predicted price. We then load the result from the endpoint and print it.
expected_price holds the response object. To get the content, which is encoded, we use content followed by decode('UTF-8'). The result is then parsed, and its result field has our actual response.
From the commented response above, you can see that for the given set of features, the predicted price of the house is $210,424.
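The POST call could be sketched along the same lines. The feature values and their key names below are placeholders of mine, not the ones that produced the price quoted above, and extract_result and request_price are hypothetical helper names. The key names would have to match whatever the POST cell reads from the request body.

```python
import json

URL = 'http://0.0.0.0:9090'

# Hypothetical feature values; the key names are assumed to match what the
# POST cell in the served notebook expects.
features = {
    'MedInc': 3.5, 'HouseAge': 25, 'AveRooms': 5.2, 'AveBedrms': 1.1,
    'Population': 1200, 'AveOccup': 3.0, 'Latitude': 34.2, 'Longitude': -118.4,
}


def extract_result(content):
    """Decode the raw response bytes and return the value of its result field."""
    return json.loads(content.decode('UTF-8'))['result']


def request_price(payload, base_url=URL):
    """Send the features as JSON to /get_price and return the endpoint's answer."""
    import requests  # third-party package; only needed for the real call

    expected_price = requests.post(base_url + '/get_price', json=payload)
    return extract_result(expected_price.content)


# With the Kernel Gateway server running, call:
# print(request_price(features))
```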
Conclusion
In this article, we discussed Jupyter Kernel Gateway, which allows us to turn Jupyter notebook cells into REST endpoints that we can call and get responses from. We then explored its use through an example project. For further information, you should check the documentation for Jupyter Kernel Gateway.
Please feel free to share your thoughts, ideas and suggestions. Your feedback is always welcome.