Overview of Your Journey
- Setting the Stage
- Installing Mito
- Basics of Mitosheets
- Cleaning Data with Mito
- Filtering Data with Mito
- Visualizing Data with Mito
- Wrapping Up
1 – Setting the Stage
I’m constantly on the lookout for new tools that can help speed up the exploratory phase of data analysis. Although you should be confident in the tools you choose, it is also good to keep up to date with new tools that might improve your process.
A few weeks ago I came across Mito, a tool that gives you Excel-style spreadsheet capabilities within Jupyter notebooks. My initial reaction was that this seemed like a step backwards; if you need Excel, then use Excel. Otherwise use more code-heavy tools like Pandas to do the job. However, I was pleasantly surprised with how seamlessly Mito integrated with the notebook environment 😃
In this blog post, I want to show you some of the basics of what Mito can do as a convenience tool in your data analysis toolchain. Specifically, I will focus on the following three tasks using Mito:
- Cleaning Data,
- Filtering Data,
- Visualizing Data.
I will contrast Mito with how you do things in Pandas. As you probably suspect, both tools have their advantages and disadvantages. After reading this blog post, it will be up to you whether you want to incorporate some of Mito’s capabilities into your daily workflow.
Declaration: I am not sponsored or affiliated with Mito, Pandas, or any other software in the same domain as these in any way.
2 – Installing Mito
Installing Mito is fairly straightforward: You first need to run the following command in a terminal or command prompt:
python -m pip install mitoinstaller
Now you have downloaded the Mito installer. To run the Mito installer, use the command:
python -m mitoinstaller install
If something did not work out, there is a page on Common Installation Errors that you can check out.
Important: You need Python 3.6 or higher to run Mito. If your Python version is 3.5 or below, then you are seriously missing out on newer Python features like f-strings (added in 3.6) and dataclasses (added in 3.7). Upgrade your Python distribution to the newest stable release if possible.
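If you are unsure which Python version your notebook is running, a quick check like the following can help before you install anything. This is only a minimal sketch: the helper name python_is_supported and the constant MIN_VERSION are made up for this example, and the minimum of 3.6 is simply the requirement stated above.

```python
import sys

# Minimum version as stated in this post (an assumption for newer Mito releases)
MIN_VERSION = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Please upgrade Python before installing Mito.")
```

Comparing tuples like this is safer than comparing version strings, since "3.10" sorts before "3.6" as text.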
3 – Basics of Mitosheets
To get started with Mito, open a JupyterLab notebook and write the following piece of code:
import mitosheet
mitosheet.sheet()
If the installation in the previous step went correctly, then you should see the following GUI (graphical user interface) pop up in your notebook:

You have now opened an empty Mito sheet. Before moving on, you might want to take a look at the headers in the Mito GUI. There are buttons for importing and exporting data, adding and deleting columns, graphing data, merging data, and more 💪
So one way to get data into Mito is, as you probably already guessed, to use the import button in the GUI. This opens up a file explorer where you can select files (e.g. a CSV file) to import into Mito.
A more common approach is to use import functions in Pandas such as read_csv() to import the data. Afterwards the dataframe is passed into the Mito spreadsheet. Let us use the classic Titanic dataset to explore how to work with data in Mito. You can download the Titanic dataset here, but you don’t need to download it manually! Simply run:
import pandas as pd
titanic = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
titanic.head()
Cool, right? Not everyone knows that the Pandas function read_csv() can take in URLs 🔥
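URLs are not the only thing read_csv() accepts besides file paths: it also reads from file-like objects. Here is a small sketch using an in-memory string, which is handy when you want to try things out without a network connection. The CSV text below is a tiny made-up sample in a Titanic-like layout, not the real file.

```python
import io

import pandas as pd

# Made-up three-row sample (illustrative only, not the real dataset)
csv_text = """PassengerId,Survived,Pclass,Fare,Embarked
1,0,3,7.25,S
2,1,1,71.2833,C
3,1,3,7.925,S
"""

# read_csv accepts a local path, a URL, or a file-like object such as StringIO
sample = pd.read_csv(io.StringIO(csv_text))
print(sample.shape)
```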
Now all you need to do is to write:
import mitosheet
mitosheet.sheet(titanic, view_df=True)
You should see the following:

You have successfully loaded a dataset into Mito. Don’t be afraid to click around and see if you can understand by yourself what some of the features in Mito do. In fact, one appeal of Mito is how self-explanatory many of the features it provides really are.
In the next sections, you will work with the Titanic dataset in Mito to learn some cool features!
4 – Cleaning Simple Data with Mito
There is some missing data in the Titanic dataset. Let us focus on the column Embarked, which is missing two values. To find the missing values, you can sort the Embarked column by clicking on the filtering symbol in the header of the Embarked column. Then you should get the following menu:

You can now choose the sort option Ascending and press the highlighted button Filter/Sort. Then the Embarked column should be sorted. More importantly, you will now find the two rows with missing Embarked values at the top:

The two passengers that are missing Embarked values are Miss. Amelie Icard and Mrs. George Nelson Stone. By following the links to their Titanic survivor pages you can see that both of them boarded the Titanic at Southampton. In fact, Miss. Amelie Icard served as a maid to Mrs. George Nelson Stone 😮
How can you change the two missing entries in Embarked to incorporate this new information? Simply click on the entries like you would do in an Excel sheet and update the information with an S for Southampton! This is quicker than writing Pandas queries manually when there are only a few entries that need to be changed.
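For contrast, here is roughly what finding those rows looks like in plain Pandas. This is a sketch on a toy dataframe with made-up rows (the names and values are placeholders, not real passengers), so you can run it without loading the full dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Titanic dataframe (made-up rows)
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Embarked": ["S", np.nan, "C", np.nan],
})

# Boolean mask: True where Embarked is missing
missing_mask = toy["Embarked"].isna()
missing_rows = toy[missing_mask]
print(missing_rows)

# Sorting with missing values first puts the gaps at the top of the table
sorted_toy = toy.sort_values("Embarked", na_position="first")
print(sorted_toy)
```

Note that sort_values() places missing values last by default, which is why na_position="first" is passed here.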
The coolest thing? If you look in the cell below the Mito sheet, then pandas code has been automatically generated for reproducibility:
# Set column Embarked at index 829 in titanic to S
titanic.at[829, 'Embarked'] = "S"
# Set column Embarked at index 61 in titanic to S
titanic.at[61, 'Embarked'] = "S"
Mito has given us the Pandas accessor .at[], which is faster for setting a single value than the more commonly used .loc[] indexer. It’s awesome that code generated by a graphical tool like Mito can be more performant than the Pandas code that many developers would write on the go.
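To see that the two approaches give the same result for a single cell, you can try a small sketch like this one. The toy dataframe and its index labels are made up for illustration; only the row label 61 mirrors the generated code above.

```python
import pandas as pd

# Toy dataframe with one missing value (made-up data)
toy = pd.DataFrame({"Embarked": ["S", None, "C"]}, index=[10, 61, 829])

# .at[] is a scalar accessor: exactly one row label and one column label
via_at = toy.copy()
via_at.at[61, "Embarked"] = "S"

# .loc[] is the general label-based indexer (it also handles slices and masks)
via_loc = toy.copy()
via_loc.loc[61, "Embarked"] = "S"

print(via_at.equals(via_loc))
```

Because .loc[] has to support many more kinds of input, it carries some extra overhead per call; .at[] skips that when you only touch one cell.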
Other columns like Age have many missing values. In this case it would probably be easier to use Pandas to find and fill in these values in an appropriate way using methods like .isna() and .fillna(). Using both Pandas and Mito for data cleaning is an efficient combination 😃
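As a rough sketch of what that could look like in Pandas, the snippet below counts the missing values and fills them with the column median. Filling with the median is just one possible choice made for this example, not a recommendation from Mito or a rule for the Titanic data, and the ages are made up.

```python
import numpy as np
import pandas as pd

# Toy Age column with gaps (made-up values)
toy = pd.DataFrame({"Age": [22.0, np.nan, 30.0, np.nan, 40.0]})

# Count the missing entries
n_missing = toy["Age"].isna().sum()

# median() skips NaN by default, so this is the median of 22, 30 and 40
median_age = toy["Age"].median()

# fillna returns a new Series; the original column is left untouched
filled = toy["Age"].fillna(median_age)
print(n_missing, median_age)
print(filled)
```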
5 – Filtering Data with Mito
While you can use Mito to fix simple missing values as you saw in the previous section, the main advantage of Mito is, in my opinion, in data exploration. Want to check out certain values of a column based on a condition? This is very quick and simple to do in Mito.
Say you are interested in only viewing the passengers that paid more than $20 for their trip. What you can then do is click the filtering symbol in the Fare column header to get the filtering menu:

Here you can add a filter and specify how the filtering should take place. As such, you can create the following filter:

Press the highlighted button Filter/Sort to apply the filter! Below the Mito spreadsheet cell you can see the nice Pandas Boolean indexing code:
# Filtered Fare in titanic
titanic = titanic[titanic['Fare'] > 20]
There are plenty of filtering options you can explore with the Titanic dataset. I suggest that you try out a few filters to see how easy it is to do with Mito.
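If you want to write such filters by hand, Boolean indexing extends naturally to several conditions. The sketch below uses a toy dataframe with made-up values, and it stores the result under a new name so the original dataframe is not overwritten (unlike the generated line above, which reassigns titanic).

```python
import pandas as pd

# Toy stand-in for the Titanic dataframe (made-up values)
toy = pd.DataFrame({
    "Pclass": [1, 3, 1, 2],
    "Fare": [71.28, 7.25, 15.0, 26.0],
})

# Single condition, same pattern as the generated code
expensive = toy[toy["Fare"] > 20]

# Two conditions: wrap each in parentheses and combine with & (and) or | (or)
first_class_expensive = toy[(toy["Fare"] > 20) & (toy["Pclass"] == 1)]

# The same filter written with DataFrame.query
same_with_query = toy.query("Fare > 20 and Pclass == 1")

print(expensive)
print(first_class_expensive)
```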
6 – Visualizing Data with Mito
As a final showcase, it is really easy to visualize the various features in a dataset by using Mito. Press the Graph button at the top of the Mito GUI to get started. Then you will see the following window pop up:

You can now select the Chart Type and the values for the X axis and Y axis. As an example, let us try out a bar chart (the default) and pick the Pclass (passenger class) for the X axis:

The picture above is not the most beautiful in the world. But keep in mind that it should be primarily used for data exploration, not end-result data visualizations. You can easily see that most passengers traveled in third class.
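The numbers behind a bar chart like this are simply the counts per passenger class. If you want to check them in Pandas, a sketch along these lines works; the toy column below contains made-up values, so the counts are not the real Titanic figures.

```python
import pandas as pd

# Toy Pclass column (made-up values)
toy = pd.DataFrame({"Pclass": [3, 1, 3, 2, 3, 1]})

# Count rows per class and order the result by class number
class_counts = toy["Pclass"].value_counts().sort_index()
print(class_counts)

# The class with the most passengers in this toy data
most_common_class = class_counts.idxmax()
```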
Try out a few other visualizations. Mito has four built-in chart types:
- bar-charts,
- histograms,
- box-plots,
- and scatterplots.
All the plots are made by Plotly. In my opinion, these plots should only be used for initial data exploration. For visualizations meant for presentations I would recommend tools like Seaborn or Power BI.
7 – Wrapping Up
I’ve found Mito to be helpful for simple data cleaning, filtering, and exploratory data visualizations. In some cases Mito generates code that is very performant, which is great. Mito might also be helpful for people who are coming from a more business analysis background.
If you need to learn more about Mito, then check out the Mito Documentation.
If you are interested in data science, programming, or anything in between, then feel free to add me on LinkedIn and say hi ✋