Introducing Mito – How To Generate Pandas Code While Editing a Spreadsheet

Is this free point-and-click GUI for Pandas any good?

Photo by Solaiman Hossen on Unsplash

Disclaimer: This is not a sponsored article. I don’t have any affiliation with Mito or the creators of the library. The article shows an unbiased overview of the library, intending to make data science tools accessible to the broader masses.

The world of data science tools and libraries is becoming (or already is) saturated. It’s difficult for anything new to gain traction without a massive wow-factor. That’s where Mito caught my attention.

The idea behind the library is simple – you edit the dataset as a spreadsheet, and Mito automatically generates Python Pandas code. Is it any good? Well, yeah, but continue reading for a more in-depth overview.

This article is structured as follows:

  • Installing Mito
  • Dataset Loading
  • Adding and Removing Columns
  • Filtering and Sorting Data
  • Summary Statistics
  • Saving and Loading Analyses
  • The Verdict

Installing Mito

The Mito package has two prerequisites: Python and Node.

I’m assuming you have Python installed, but Node might be missing. Please take a minute to install it, and you’re ready to proceed.

From here, let’s create a virtual environment for Mito:

python3 -m venv mitoenv

And now let’s activate it:

source mitoenv/bin/activate

Great! We can install the package next with the following command:

pip install mitosheet

Almost there – we’ll also need the JupyterLab extension manager:

jupyter labextension install @jupyter-widgets/jupyterlab-manager@2

And that’s it! You can launch JupyterLab with the following command:

jupyter lab

Let’s proceed with dataset loading in the next section.


Dataset Loading

Once Jupyter is loaded, you can execute the following code to open a new Mito sheet:

import mitosheet
mitosheet.sheet()

Here’s the resulting output:

Image 1 – Creating a new Mito sheet (image by author)

You’ll have to enter your email to continue. Once done, you’ll see a blank sheet:

Image 2 – New Mito sheet (image by author)

You can click on the Import button to load a dataset. We’ll use the well-known Titanic dataset for this article. Here’s what it should look like after the initial loading:

Image 3 – Titanic dataset in a Mito sheet (image by author)

And that’s how easy it is to load a CSV file! Mito automatically writes Pandas code in the cell below. The following image shows what it should look like by now:

Image 4 – Generated Pandas code for dataset loading (image by author)
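
The generated code itself is only visible in the screenshot, so here is a minimal sketch of what loading a CSV into Pandas by hand looks like. The DataFrame name titanic_csv matches the snippets later in this article, while the three-row sample and its column subset are made up for illustration so the block runs without the real file.

```python
import io

import pandas as pd

# Hypothetical three-row excerpt standing in for the real Titanic CSV file.
sample_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Sex,Age,Fare\n"
    "1,0,3,male,22,7.25\n"
    "2,1,1,female,38,71.2833\n"
    "3,1,3,female,26,7.925\n"
)

# With the real file you would pass its path instead of the in-memory buffer.
titanic_csv = pd.read_csv(sample_csv)
print(titanic_csv.head())
```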

Let’s see how to add and remove columns next.


Adding and Removing Columns

One of the most fundamental operations in data preprocessing is adding and removing attributes. Mito does this through Add Col and Del Col buttons in the top menu.

Let’s start by adding columns. Click on the Add Col button – this will add a column with an arbitrary name to the table below (M in my case). To add data, you can click on the first row and enter a formula, just as you would in Excel!

Here’s an example – the following formula will return 1 if the value of the Sex attribute is male, and 0 otherwise:

Image 5 – Adding columns with Mito (image by author)

You can click on the column name to change it to something more appropriate – like IsMale. Mito generates the following Pandas code behind the scenes:

titanic_csv.insert(2, 'M', 0)
titanic_csv['M'] = IF(titanic_csv['Sex'] == "male", 1, 0)
titanic_csv.rename(columns={"M": "IsMale"}, inplace=True)
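
Note that IF in the generated code is not a Pandas function. It appears to be a spreadsheet-style helper that ships with Mito, so the snippet only runs where Mito's functions are available. As a rough sketch, the same result can be had with plain Pandas and NumPy. The small DataFrame below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for the Titanic data.
titanic_csv = pd.DataFrame(
    {
        "PassengerId": [1, 2, 3],
        "Survived": [0, 1, 1],
        "Sex": ["male", "female", "male"],
    }
)

# np.where plays the role of the spreadsheet IF: 1 for "male", 0 otherwise.
is_male = np.where(titanic_csv["Sex"] == "male", 1, 0)

# Insert the column at position 2 under its final name, so no rename is needed.
titanic_csv.insert(2, "IsMale", is_male)
print(titanic_csv)
```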

Let’s see how to delete a column next. Simply select the column you want to remove and click on the Del Col button. A popup window like this one will appear:

Image 6 – Deleting columns with Mito (image by author)

From there, simply confirm that you want to remove the column. Mito will generate the following code for you:

titanic_csv.drop('PassengerId', axis=1, inplace=True)
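
The generated line modifies the DataFrame in place. If you would rather keep the original around, a small variation (my own sketch, not something Mito produces) returns a new DataFrame instead. The two-row sample is made up.

```python
import pandas as pd

# Made-up sample standing in for the Titanic data.
titanic_csv = pd.DataFrame({"PassengerId": [1, 2], "Survived": [0, 1]})

# columns= is equivalent to axis=1; without inplace=True the original is untouched.
trimmed = titanic_csv.drop(columns=["PassengerId"])

print(list(trimmed.columns))
print(list(titanic_csv.columns))
```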

And that’s how easy it is to add and remove columns. Let’s see how to filter and sort the data next.


Filtering and Sorting Data

It’s hard to imagine a data analysis workflow without some sort of filtering and sorting. The good news is – these operations are easy to do with Mito.

Let’s start with filtering. A side menu will pop up when you select a column:

Image 7 – Filtering with Mito (image by author)

Under Filter, select the appropriate option. Let’s include only those records where the Age column is not missing, and the values are greater than 40 and at most 42. You’ll have to add an entire group to have multiple filtering conditions, as shown below:

Image 8 – Multiple filtering conditions with Mito (image by author)

Here’s the code that Mito writes behind the scenes:

titanic_csv = titanic_csv[((titanic_csv.Age.notnull()) & (titanic_csv['Age'] > 40) & (titanic_csv['Age'] <= 42))]
titanic_csv = titanic_csv.reset_index(drop=True)
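
If the generated filter looks messy, it can be tidied up by hand. Comparisons against missing values evaluate to False in Pandas, so the explicit notnull check is redundant here. The sketch below, on made-up data, compares the generated form with a simplified one. Note that an age of exactly 40 is excluded in both.

```python
import numpy as np
import pandas as pd

# Made-up sample with a missing age and values on both range boundaries.
titanic_csv = pd.DataFrame(
    {
        "Age": [40.0, 40.5, 42.0, np.nan, 43.0],
        "Fare": [10.0, 20.0, 30.0, 40.0, 50.0],
    }
)

# The filter in the form Mito generated it.
generated = titanic_csv[
    (titanic_csv.Age.notnull())
    & (titanic_csv["Age"] > 40)
    & (titanic_csv["Age"] <= 42)
].reset_index(drop=True)

# Simplified form: NaN never satisfies > or <=, so missing ages drop out anyway.
in_range = (titanic_csv["Age"] > 40) & (titanic_csv["Age"] <= 42)
simplified = titanic_csv[in_range].reset_index(drop=True)

print(simplified.equals(generated))
```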

It’s a bit messy, but it gets the job done.

On to sorting next. It’s a much easier operation to implement. You need to select the column and choose the ascending or descending option. Here’s how you can sort the filtered dataset in descending order by the Fare column:

Image 9 – Sorting with Mito (image by author)

Here’s the generated code for sorting:

titanic_csv = titanic_csv.sort_values(by='Fare', ascending=False, na_position='first')
titanic_csv = titanic_csv.reset_index(drop=True)
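
Two details of the generated code are worth a closer look: na_position='first' places rows with a missing fare at the top, and the index is reset in a separate step. As a sketch on made-up data, and assuming Pandas 1.0 or newer, both steps can be folded into one call with ignore_index=True.

```python
import numpy as np
import pandas as pd

# Made-up sample with one missing fare.
titanic_csv = pd.DataFrame(
    {"Name": ["a", "b", "c"], "Fare": [7.25, np.nan, 71.28]}
)

# ignore_index=True renumbers the rows, replacing the separate reset_index call.
sorted_df = titanic_csv.sort_values(
    by="Fare", ascending=False, na_position="first", ignore_index=True
)
print(sorted_df)
```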

And that’s it for basic sorting and filtering. Let’s look at another interesting feature next – summary statistics.


Summary Statistics

If there’s one thing you’ll do a lot in any data analysis workflow, it’s computing summary statistics. The basic idea is to print the most interesting statistical values for a column and maybe show some data visualizations.

As it turns out, Mito does that automatically. All you need to do is to select an attribute and click on Summary Stats in the right menu. Let’s do just that for the Age column:

Image 10 – Summary statistics – histogram (image by author)

As you can see, the distribution of the variable is shown first, followed by the summary statistics:

Image 11 – Summary statistics (image by author)
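
For comparison, here is a rough sketch of getting similar numbers directly in Pandas. This is not what Mito runs internally, and the exact set of statistics in its panel may differ. The ages below are made up.

```python
import numpy as np
import pandas as pd

# Made-up ages, including one missing value.
ages = pd.Series([22.0, 38.0, 26.0, 35.0, np.nan], name="Age")

# describe() reports count, mean, std, min, quartiles, and max, skipping NaN.
summary = ages.describe()
print(summary)

# Missing values are not part of describe(), so count them separately.
missing = int(ages.isna().sum())
print("missing:", missing)
```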

Let’s take a look at one more feature of Mito before calling it a day.


Saving and Loading Analyses

In Excel terms, saving an analysis with Mito is like recording a macro, but with Python. You can save any analysis by clicking on the Save button in the top menu.

It will bring up a modal window asking you to name the analysis and to confirm the saving:

Image 12 – Saving analysis with Mito (image by author)

To repeat an analysis, click on the Replay button. It will ask you to select one of the saved ones – as shown below:

Image 13 – Replaying analysis with Mito (image by author)

The replaying feature may come in handy for some repetitive operations from separate notebooks, provided that you don’t want to copy-paste the code.
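
If you do prefer plain code over the Replay button, one option is to wrap the generated steps in a function and import it wherever needed. The sketch below is my own arrangement of the steps from this article, not Mito's replay mechanism. The function name prepare_titanic and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def prepare_titanic(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper that repeats this article's steps on a copy of df."""
    out = df.copy()
    # Add the IsMale indicator column at position 2.
    out.insert(2, "IsMale", np.where(out["Sex"] == "male", 1, 0))
    # Drop the identifier column.
    out = out.drop(columns=["PassengerId"])
    # Keep ages greater than 40 and at most 42 (missing ages drop out).
    out = out[(out["Age"] > 40) & (out["Age"] <= 42)]
    # Sort by fare, highest first, and renumber the rows.
    return out.sort_values(
        by="Fare", ascending=False, na_position="first", ignore_index=True
    )


# Made-up sample standing in for the Titanic data.
raw = pd.DataFrame(
    {
        "PassengerId": [1, 2, 3, 4],
        "Survived": [0, 1, 1, 0],
        "Sex": ["male", "female", "male", "female"],
        "Age": [41.0, 42.0, 30.0, np.nan],
        "Fare": [10.0, 50.0, 20.0, 5.0],
    }
)
result = prepare_titanic(raw)
print(result)
```

Working on a copy keeps the input DataFrame unchanged, so the function can be called repeatedly from different notebooks.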


The Verdict

And that’s just enough for today. We’ve gone over the most common features and tried them out on a small dataset. The question remains – should you use Mito?

As a data scientist, I don’t see why you shouldn’t, especially if you’re skilled in Excel and want to get started with Python and Pandas. Mito can make the transition process that much easier.

The automated code generation function can also be beneficial to beginner and intermediate Pandas users, as it lets you know if there’s an alternative or easier way to do something with the library.

To conclude – give Mito a try. It’s free, and you have nothing to lose. I’d love to hear your opinion on the library in the comment section below.


Loved the article? Become a Medium member to continue learning without limits. I’ll receive a portion of your membership fee if you use the following link, with no extra cost to you.

Join Medium with my referral link – Dario Radečić


Useful Links

  1. Mito Website – https://trymito.io
  2. Mito Documentation – https://docs.trymito.io
