How to Create a Docker Image with Jupyter Notebook and Kotlin

Set up a custom Jupyter Notebook environment with a Kotlin kernel in 3 steps using Docker.

Jupyter + kotlin kernel sample (Image by Author)

Computational notebooks, or simply notebooks, are flexible and interactive tools that allow scientists to combine software code, computational output and explanatory resources (like text, charts and any other media content) within the same document. Notebooks are used for a wide variety of purposes, including data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning and much more.

Computational notebooks are not new; they have been around for quite some time. However, the rise of web-based development environments, together with the growing interest in data science disciplines (like exploratory data analysis, machine learning or deep learning, among others), has made notebooks in general, and [Jupyter](https://jupyter.org/) notebooks in particular, the preferred tool for scientists and researchers all around the world.

There are plenty of ways to install Jupyter on your local environment (pip, Anaconda, …), or you can work directly on cloud-powered notebook environments (like Google Colab). Nevertheless, most of these approaches are Python oriented, which means that if you want to use Jupyter with any other language (like Kotlin, Scala or Java) you will have to install additional "kernels". In this post we explain an extremely simple approach to setting up your own Jupyter environment compatible with the Kotlin language, with only one tool installed on your laptop: Docker.


Environment

This article is written based on the following platform:

Objectives

This article illustrates in detail the steps to create a custom Docker image with the following components: Jupyter Notebook and the Kotlin kernel. Once the environment is set up, we will show how to access it and how to work with it. Finally, after confirming that everything works, we will upload our image to a container image registry like Docker Hub, so it can be easily accessed by the community.

Let’s briefly discuss the technologies and products we are going to use:

Docker is a software platform designed to make it easier to create, deploy, and run applications by using containers. Docker lets developers package an application along with all its dependencies in a container, and then ship it out as one package. This means you can stop worrying about installing components and libraries and just focus on your work.

Jupyter is a web-based interactive development environment for creating and managing Jupyter notebooks. A Jupyter notebook is an open, interactive document format based on JSON, used to combine software source code, narrative text, media content and computational outputs in a single document.

Kotlin is a general-purpose, free, open source, statically typed "pragmatic" programming language created by JetBrains. Kotlin combines object-oriented and functional programming features and is designed to interoperate fully with Java; the JVM version of Kotlin’s standard library depends on the Java Class Library.

It is beyond the scope of this post to discuss the reasons for using Kotlin in your data science projects, but if you want to read about the topic, I recommend the following resource from JetBrains (here).

Step 1: Create a Dockerfile

A Dockerfile is a text document that contains the commands a user could call on the command line to build and assemble a Docker image. Since we aim to create a Docker image, our first step consists of creating a Dockerfile. So go to your project directory and create a new file named Dockerfile. Once created, we can edit it to add three instructions: FROM, LABEL and ENV.
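As a minimal sketch, the Dockerfile for this step can be written from the terminal as shown below. The maintainer value is a placeholder, and the variable name JUPYTER_ENABLE_LAB is an assumption about how the base image was told to start JupyterLab; newer versions of the base image may behave differently.

```shell
# Write a minimal Dockerfile into the current (project) directory.
# NOTE: the maintainer value and JUPYTER_ENABLE_LAB are assumptions for illustration.
cat > Dockerfile <<'EOF'
# Start from the Jupyter base image
FROM jupyter/base-notebook

# Optional: who to contact about this image (placeholder value)
LABEL maintainer="Your Name <you@example.com>"

# Assumed switch that asks the base image to launch JupyterLab
ENV JUPYTER_ENABLE_LAB=yes
EOF
```

Writing the file through a quoted here-document keeps the example copy-pasteable; creating the same file in any text editor works just as well.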

Let’s explain the content of the file:

  • FROM jupyter/base-notebook: This is the first line of our Dockerfile. A Dockerfile usually begins with the FROM instruction, which receives a pre-existing Docker image as its argument. The idea is to use the services provided by this image and extend it by adding new layers on top. The base image passed as argument is jupyter/base-notebook, which contains a basic version of Jupyter already installed.
  • LABEL Miguel Doctor <[email protected]>: The LABEL instruction is optional. It is used here to identify the maintainer of the image. If included, it lets people interested in your image contact you in case they need to ask something or report a problem with the image.
  • ENV defines an environment variable. In this case, we indicate that we want to use the full JupyterLab solution, which allows us to handle several notebooks and provides a nice web browser interface.

Step 2: Adding openjdk-8-jre and Kotlin kernel to our Dockerfile

As mentioned earlier, the default configuration for Jupyter is Python oriented; therefore, in order to use a different language like Kotlin, we need to install and configure a specific kernel. Since Kotlin is a JVM language, the first thing to install is the Java Runtime Environment (JRE).

So, let us update our Dockerfile to include the installation of openjdk-8-jre. Once the JRE is installed, we need to add the Kotlin kernel so it can be connected to Jupyter. To achieve this, we just need to open the Dockerfile and extend it with a few new instructions.
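A sketch of the extended Dockerfile, again written from the terminal, could look like the following. Treat it as an illustration under assumptions: the jetbrains conda channel, the apt cleanup step, and the switch back to the unprivileged notebook user (via the base image's NB_UID variable) are choices made for this example, and the availability of openjdk-8-jre depends on the base image's distribution.

```shell
# Rewrite the Dockerfile with the JRE and Kotlin kernel layers added.
# The quoted 'EOF' keeps the shell from expanding ${NB_UID} while writing the file.
cat > Dockerfile <<'EOF'
FROM jupyter/base-notebook
LABEL maintainer="Your Name <you@example.com>"
ENV JUPYTER_ENABLE_LAB=yes

# Installing system packages with apt requires root
USER root
RUN apt-get update && \
    apt-get install -y --no-install-recommends openjdk-8-jre && \
    rm -rf /var/lib/apt/lists/*

# Assumption: switch back to the image's unprivileged notebook user
USER ${NB_UID}

# Install the Kotlin kernel with conda (channel name assumed to be jetbrains)
RUN conda install -y -c jetbrains kotlin-jupyter-kernel
EOF
```

Running conda as the notebook user rather than root is a design choice meant to keep file ownership inside the image consistent; leaving everything under root would also build.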

As you can see, we have added several lines to our file, and these lines make use of new Dockerfile keywords like USER and RUN. So let us explain what these instructions are used for.

  • The USER instruction changes the user account in charge of executing the instructions that follow it in the Dockerfile. Since installing the JRE on the container is something only the root user can do, we need to indicate so in the file.
  • The RUN instruction executes commands at build time. By using RUN you can customize the underlying image by updating software packages, adding new applications or arranging security rules, among others. In our case, we call RUN twice in the file: first to run the apt package manager so that OpenJDK 8 (the openjdk-8-jre package) is installed on the container, and then to run the conda command line tool to install kotlin-jupyter-kernel.

Step 3: Create a Makefile to build and run the image

So that is pretty much all as regards the Dockerfile! Now we need to use Docker to take the Dockerfile we just created and build the Docker image, so it can be run as a container. To do so, go to the terminal, navigate to your project folder (where the Dockerfile is located) and create a new text file named Makefile. Then add a run target containing a docker build command and a docker run command.
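A minimal sketch of such a Makefile, generated from the shell, is shown below. It reuses the image name, container name and host folder that appear in this article; the use of printf is only a convenience for this example, chosen because make requires each recipe line to start with a real tab character.

```shell
# Values taken from the article; change them to match your own setup.
IMAGE="migueldoctor/docker-jupyter-kotlin-notebook"
HOST_DIR="/Users/mdoctor/Documents/dockerVolumes/kotlin-jupyter"

# The first target in a Makefile is the default, so a plain `make` runs it.
printf 'run:\n' > Makefile
# Recipe lines: \t emits the leading tab that make expects.
printf '\tdocker build -t %s .\n' "$IMAGE" >> Makefile
printf '\tdocker run --name my-jupyter-kotlin -v %s/:/home/jovyan/work -p 8888:8888 %s\n' \
  "$HOST_DIR" "$IMAGE" >> Makefile
```

The trailing dot in the docker build line tells Docker to look for the Dockerfile in the current directory.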

Let us explain what we have just added to the file:

1) run defines a target: a section with the commands to be executed when running the Makefile.

2) docker build -t <name of the image> <path to the Dockerfile>: It creates the Docker image, taking as arguments a name for the image and the path where the Dockerfile is located.

Important: The name "migueldoctor/docker-jupyter-kotlin-notebook" is the identifier we have assigned to the image. It is an arbitrary value, so you can modify it as you wish.

3) docker run: This command actually starts up the Docker container. It takes some parameters, explained as follows:

  • The --name option is used to assign a specific name to the container.
  • The -v option establishes a mapping between a folder located on the host (/Users/mdoctor/Documents/dockerVolumes/kotlin-jupyter; you have to create the folder if it does not exist yet) and a specific path within the container (/home/jovyan/work). This is what Docker defines as a volume, and it allows sharing information between the host and the container. Both paths need to be separated by a colon (:) when passed as an argument to the docker run command. If you want to know more about volumes, you can check a post about the topic here.
  • The -p option maps host ports to container ports. It is followed by the port number of the host (8888), which will be redirected to the port number of the container (8888).
  • Finally, we need to pass as argument the image migueldoctor/docker-jupyter-kotlin-notebook created by the previous docker build command.

With the Makefile explained, we just need to save the file and type the following command from the project folder:

$ make

For a while, the terminal will display the list of instructions under execution (building the image, starting up the container…) and eventually you will see something like the following log trace:

Jupyter server ready to be used (Image by Miguel Doctor)

That means the Jupyter server is up and running, and you can access it by opening your web browser and typing the last URL displayed in the logs (http://127.0.0.1:8888/lab?token=de4d7f250f18430848ad1b40bb84a127d558968907cb10a6).

Jupyter Lab with Kotlin Kernel installed (Image by Author)

Congratulations if you can see what is indicated in the picture! That means the container is running with Jupyter and the Kotlin kernel. Now you just need to click on the Kotlin icon within the Notebook section to create your notebook and start working with it. Below you can see what a simple notebook with some Kotlin code looks like.

Jupyter + kotlin kernel sample (Image by Author)

Important: Note that running the make command will build the Docker image every time you execute it, which wastes time and resources if you haven’t made any modifications to the Dockerfile. Therefore, unless you edit the Dockerfile, you should run the container using the following command (already included in the Makefile and explained above).

$ docker run --name my-jupyter-kotlin -v /Users/mdoctor/Documents/dockerVolumes/kotlin-jupyter/:/home/jovyan/work -p 8888:8888 migueldoctor/docker-jupyter-kotlin-notebook
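One practical detail: docker run with --name fails if a container with that name already exists, for example a stopped one left over from a previous run. The helper below is a small sketch of one way to handle both cases. The function name start_notebook and the DOCKER override variable are hypothetical conveniences for this example, not part of Docker.

```shell
# Values taken from the article; adjust them to your setup.
NAME="my-jupyter-kotlin"
IMAGE="migueldoctor/docker-jupyter-kotlin-notebook"
HOST_DIR="/Users/mdoctor/Documents/dockerVolumes/kotlin-jupyter"

# Hypothetical convenience: set DOCKER=echo to print commands instead of running them.
DOCKER="${DOCKER:-docker}"

start_notebook() {
  if "$DOCKER" container inspect "$NAME" >/dev/null 2>&1; then
    # A container with this name already exists: restart it and attach to its output.
    "$DOCKER" start -a "$NAME"
  else
    # No such container yet: create it from the image.
    "$DOCKER" run --name "$NAME" -v "$HOST_DIR/:/home/jovyan/work" -p 8888:8888 "$IMAGE"
  fi
}
```

After defining the function in your shell, call start_notebook whenever you want the notebook server back without rebuilding the image.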

Optional: Push the image to Docker Hub

In some situations we might be interested in releasing our Docker image to the community, so everybody can benefit from our work. In our case, we have decided to use Docker Hub as our central registry. Below we describe how to upload the image we just created to Docker Hub:

  • Docker Hub requires an account to upload images, so you need to register on the website if you have not done so yet.
  • Once registered, open a terminal and connect to the Docker registry using the docker login command. You will be prompted for your username and password.
$ docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username:
  • Finally, use docker push followed by the name of the image to upload. If you have not used your Docker ID as the prefix, you would first need to tag the image, as indicated in step 5 of the following post. Since we used our Docker ID username (migueldoctor) as the prefix when building the image, we can push it directly with the command below:
$ docker push migueldoctor/docker-jupyter-kotlin-notebook
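For the case where the image was built without the Docker ID prefix, the tagging step can be sketched as follows. The Docker ID and the local image name are hypothetical placeholders; the commands are printed rather than executed so they can be reviewed first.

```shell
# Hypothetical values for illustration: replace with your own Docker ID and local image name.
DOCKER_ID="your-docker-id"
LOCAL_IMAGE="docker-jupyter-kotlin-notebook"
REMOTE_IMAGE="$DOCKER_ID/$LOCAL_IMAGE"

# Print the two commands for review; run them once the names look right.
echo "docker tag $LOCAL_IMAGE $REMOTE_IMAGE"
echo "docker push $REMOTE_IMAGE"
```

docker tag does not copy the image; it only adds a second name that points to the same image, which is the name docker push then uploads.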

Once the image is successfully pushed to Docker Hub, a new URL is created to provide public access to the image. This URL will contain the current version of the image as well as any new version/tag we might want to push in the future. For our example, the URL is the following:

https://hub.docker.com/r/migueldoctor/docker-jupyter-kotlin-notebook

Conclusion

In this post we have described how to create your own custom Docker image that includes a fully functional Jupyter Notebook server with the Kotlin kernel. We also explained the parameters involved so you can customize the image by adding or removing kernels. Finally, we explained how to run the image and the steps to push it to the Docker Hub registry.

Please feel free to leave any comment, insight, or error you might have detected in the article. Your feedback is very welcome and will always help improve the quality of the article. As always, thank you for reading. I sincerely hope this tutorial helps you get started on data science projects using Jupyter, Kotlin and Docker.
