Introducing Jupytext

Marc Wouts
Towards Data Science
5 min read · Sep 8, 2018

Jupyter notebooks are interactive documents that contain code, narrative, and plots. They are an excellent place for experimenting with code and data. Notebooks are easily shared, and the 2.6M notebooks on GitHub show just how popular they are!

Jupyter notebooks are great, but they are often huge files, in a very specific JSON format. Let us introduce Jupytext, a Jupyter plugin that reads and writes notebooks as plain text files: Julia, Python, or R scripts, or Markdown or R Markdown documents.

Jupyter notebooks as text files

We wrote Jupytext to work on Jupyter notebooks just like we work on text files. With Jupytext,

  • refactoring a notebook (represented as e.g. a plain Python script) in your favorite text editor or IDE becomes a real option,
  • writing notebooks directly as scripts or Markdown is another option, and
  • collaborating on Jupyter notebooks with Git becomes straightforward.

The text representation of a notebook focuses on the part that we have actually written: cell inputs. We value inputs more than outputs. Often they are the only part of the notebook that we want to have under version control. Inputs are incomparably lighter than outputs (commonly kilobytes versus megabytes).

We also value outputs. Preserving outputs is possible with paired notebooks. In that configuration, Jupyter saves the notebook as a traditional .ipynb file, in addition to the script or Markdown document. The text representation can be edited outside of Jupyter. When reloading the notebook in Jupyter, cell inputs are taken from the text file, and matching outputs are loaded from the .ipynb file.
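
The pairing idea can be sketched in a few lines of Python. This is only an illustration of the principle, not Jupytext's actual implementation: cells are represented as plain dictionaries, the function name `combine_inputs_and_outputs` is hypothetical, and the matching rule (an output is reused only when the cell source is identical) is an assumption made for the example.

```python
# Illustrative sketch only: NOT Jupytext's actual implementation.
# Cells are plain dicts; matching on identical source text is an assumption.

def combine_inputs_and_outputs(text_cells, ipynb_cells):
    """Take inputs from the text file, and reuse the outputs stored in the
    .ipynb file for every code cell whose source is unchanged."""
    # Index the saved outputs by cell source (a list handles duplicate cells).
    saved = {}
    for cell in ipynb_cells:
        if cell["cell_type"] == "code":
            saved.setdefault(cell["source"], []).append(cell.get("outputs", []))

    combined = []
    for cell in text_cells:
        new_cell = dict(cell)
        if cell["cell_type"] == "code":
            candidates = saved.get(cell["source"], [])
            # Unchanged input: restore its outputs. Edited input: no outputs.
            new_cell["outputs"] = candidates.pop(0) if candidates else []
        combined.append(new_cell)
    return combined


# Inputs as found in the text file (the last cell was edited outside Jupyter)
text_cells = [
    {"cell_type": "markdown", "source": "# World population"},
    {"cell_type": "code", "source": "1 + 1"},
    {"cell_type": "code", "source": "df.plot()  # edited outside Jupyter"},
]
# Cells as found in the paired .ipynb file, with their outputs
ipynb_cells = [
    {"cell_type": "markdown", "source": "# World population"},
    {"cell_type": "code", "source": "1 + 1", "outputs": ["2"]},
    {"cell_type": "code", "source": "df.plot()", "outputs": ["<plot>"]},
]

notebook = combine_inputs_and_outputs(text_cells, ipynb_cells)
```

In this toy example the unchanged cell keeps its output, while the edited cell comes back without one until it is run again.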

Jupyter notebook edited as a script

In this first animation we show how your favorite text editor or IDE can be used to edit your Jupyter notebooks. IDEs are more convenient than Jupyter for navigating through code, editing and executing cells or fractions of cells, and debugging.

Animation script:

  • We start with a Jupyter notebook.
  • The notebook includes a plot of the world population. The plot legend is not in order of decreasing population; we’ll fix this.
  • We want the notebook to be saved as both a .ipynb and a .py file: we add a "jupytext_formats": "ipynb,py", entry to the notebook metadata.
  • The Python script can be opened with PyCharm:
      • Navigating in the code and documentation is easier than in Jupyter.
      • The console is convenient for quick tests. We don’t need to create cells for this.
      • We find out that the columns of the data frame were not in the correct order. We update the corresponding cell, and get the correct plot.
  • The Jupyter notebook is refreshed in the browser. Modified inputs are loaded from the Python script. Outputs and variables are preserved. We finally rerun the code and get the correct plot.

Scripts and Markdown as Jupyter notebooks

With Jupytext, every Julia, Python or R script, R Markdown or Markdown document becomes a potential Jupyter notebook. Write your notebooks as text, and render them in Jupyter when desired.

In the animation below,

  • Jupyter Notebook (not JupyterLab yet, stay tuned) opens our plain Python script as a Jupyter notebook.
  • Saving from Jupyter adds a YAML header to the otherwise unchanged file.
  • Adding a cell to the notebook contributes a very simple diff.
  • Refreshing the notebook preserves the variables, but not the outputs. Outputs are not stored in text files.
  • We pair the script with a traditional Jupyter notebook by adding a "jupytext_formats": "ipynb,py", entry to the notebook metadata. When we save, a new ipynb file is created.
  • Thanks to the ipynb file, outputs are preserved when the notebook is refreshed or reloaded.

Collaborating on Jupyter notebooks

Have you ever tried to merge Jupyter notebooks? Either use nbdime, or be prepared for an "Unreadable Notebook: NotJSONError" if a comma or bracket is missing in the merged JSON!
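
To see how fragile the JSON format is, here is a minimal sketch using only Python's standard `json` module. The two notebook strings are made up for the example, and the helper `is_valid_json` is hypothetical; this is not the check that Jupyter itself performs, only an illustration of why a single lost comma makes the whole file unreadable.

```python
import json

# A minimal, empty notebook document
valid = '{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 2}'

# The same document after a hypothetical merge lost the comma after "metadata"
broken = '{"cells": [], "metadata": {} "nbformat": 4, "nbformat_minor": 2}'


def is_valid_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(valid))
print(is_valid_json(broken))
```

A plain text script has no such all-or-nothing behavior: a bad merge leaves a file that can still be opened and fixed in any editor.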

With Jupytext, collaborating on notebooks becomes just as easy as collaborating on scripts.

Check in the text version only. Enjoy easy merges and meaningful diffs!

Installing Jupytext

Jupytext is available on PyPI. Install the Python package and configure Jupyter to use Jupytext’s contents manager:

# Get Jupytext from pip
pip install jupytext --upgrade
# Append this to .jupyter/jupyter_notebook_config.py
c.NotebookApp.contents_manager_class="jupytext.TextFileContentsManager"
# And restart your notebook server
jupyter notebook

Associate a Python script with your Jupyter notebook, or an ipynb file with your Python script (for the convenience of preserving cell outputs), by adding "jupytext_formats": "ipynb,py", to the notebook metadata (replace py with your favorite extension). If you plan to keep Jupyter open while you edit the text file outside of Jupyter, turn off Jupyter's autosave by running %autosave 0 in a cell.
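
If you prefer to set the metadata entry from a script rather than from Jupyter, something like the following sketch should work. It edits the .ipynb file directly with the standard `json` module; the helper name `pair_notebook` and the file name `demo.ipynb` are hypothetical, and the sketch assumes the notebook is not open in Jupyter while the file is rewritten.

```python
import json
import tempfile
from pathlib import Path


def pair_notebook(path, formats="ipynb,py"):
    """Add the "jupytext_formats" entry to the metadata of an .ipynb file.

    Hypothetical helper: it rewrites the file in place, so run it only
    while the notebook is closed in Jupyter.
    """
    path = Path(path)
    notebook = json.loads(path.read_text(encoding="utf-8"))
    notebook.setdefault("metadata", {})["jupytext_formats"] = formats
    path.write_text(json.dumps(notebook, indent=1) + "\n", encoding="utf-8")
    return notebook


# Demonstration on a throwaway, empty notebook
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.ipynb"
    empty = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 2}
    demo.write_text(json.dumps(empty), encoding="utf-8")

    pair_notebook(demo)
    reloaded = json.loads(demo.read_text(encoding="utf-8"))
```

The next time the notebook is saved from Jupyter with Jupytext's contents manager active, the paired text file should be written alongside the .ipynb file.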

References

The idea of working on Jupyter notebooks as text is not new. Alternative converters implemented in Python include:

  • notedown: Jupyter notebooks as Markdown documents,
  • ipymd: Jupyter notebooks as Markdown documents, Python scripts, and OpenDocument files,
  • A fork of ipymd that adds the support of R Markdown and R HTML notebooks,
  • pynb: Jupyter notebooks as Python scripts.

We are following with much interest the Hydrogen plugin for Atom, and the Jupyter extension for Visual Studio Code. These extensions turn scripts (with explicit cell markers which we hope to support in Jupytext at some point) into interactive notebook-like environments.

Acknowledgments

Jupytext is my first significant open source contribution. Working on an open source project was a great experience. I asked many questions along the way and really appreciated the helpful answers, suggestions, and collaborations.

In particular, I want to thank Gregor Sturm for his great idea that we could pair text notebooks with the traditional Jupyter notebooks, and for his feedback on the project. Eric Lebigot and François Wouts gave very helpful advice on how to advance the project and communicate about it. Finally, I’d like to thank the early beta testers for the time they spent experimenting with new ways of collaborating on Jupyter notebooks with Jupytext.

Feedback

Jupytext owes much to the feedback of its users. Suggestions and questions are welcome: please use the issue tracker on our GitHub project to suggest improvements to either the program or the documentation. See you there!

Author of Jupytext and ITables. I love maths, data visualization and programming in mixed languages