PYTHON

A few weeks ago, I published a blog post about Bamboolib that was very well received, reaching tens of thousands of views in the first week. After that, I had planned to write about other data science subjects and to avoid writing about Python libraries for a while unless I found something awe-inspiring. Well, I found it, and it’s called Mito.
I heard about Mito a while ago but never had the chance to test it out. Recently I decided to give it a try, and it’s impressive! But before anything, a little note: this is not sponsored content. I’m saying this because some developers have reached out to me in the past few weeks asking if I would do a paid partnership, so if you are one of those folks, I don’t do any kind of paid partnerships. However, I do accept suggestions of subjects to write about. Now, let’s go back to Mito.
Mito – save hours of work with a few clicks
Mito is a Python library that helps you with data preparation, data cleaning, transformation, exploratory data analysis, creating graphs, and more. Through a GUI, you can get a lot done with just one or two lines of code and a few clicks. The experience is designed to feel similar to using Microsoft Excel. Mito and Bamboolib (another Python library) have many similarities: both offer a GUI that makes tasks easier to perform, and both generate Python code that you can copy and use anywhere, even if you don’t have Mito installed.
If you haven’t read my blog on Bamboolib, you can find it here:
Bamboolib: One of the Most Useful Python Libraries You Have Ever Seen
That being said, let’s get our hands on Mito!
Installation
The installation process is easy. To keep Mito’s dependencies isolated from your other projects, I recommend you create a virtual environment for it. Their website shows how you can create a Python environment. To do so, you can copy and paste one of the two sets of commands below into your terminal:
Mac:
python3 -m venv mitoenv
source mitoenv/bin/activate
python -m pip install mitoinstaller
python -m mitoinstaller install
Windows:
python3 -m venv mitoenv
mitoenv\Scripts\activate.bat
python -m pip install mitoinstaller
python -m mitoinstaller install
If you decide to go with a Conda Virtual Environment, you can copy and paste the following code:
conda create -n mitoenv python=3.8
conda activate mitoenv
python -m pip install mitoinstaller
python -m mitoinstaller install
After these steps, you should be good to go. Note that Mito will create a JupyterLab notebook file for you with the starting code. You can also initiate a JupyterLab notebook by typing jupyter lab in your Terminal. Now let’s get our hands dirty.
First Steps
Initiating Mito couldn’t be any easier. You just need to import it by typing import mitosheet and initiate it by typing mitosheet.sheet(). That’s all the coding we will be using today!

Now let’s import the dataset that we will be using. Click on IMPORT, find the file, and we are good to go. For this demonstration, I will be using the Top Video Games 1995–2021 Metacritic dataset from Kaggle. After importing the dataset, Mito shows us a few options for what you can do with it.

Data Preparation
Change datatype
Changing the datatype in Mito is a piece of cake. All you need to do is click on the datatype, right below the column’s name, and choose the new datatype. In the example below, I will change a string to a datetime datatype. You will see that the icon will change from Abc (which means that the datatype is a string) to a calendar icon. DAT EASY!

They don’t have an option where you can choose the format of the datetime datatype, so if any of the Mito developers read this blog, I think it would be a nice extra feature if we could choose the format.
Also, it seems like the user_review column is a string. Let’s fix this by changing it to a float.
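For comparison, here is a minimal sketch of how the same two conversions could be written by hand in pandas. This is not the code Mito generates; the sample rows, the date format, and the "tbd" placeholder value are made up for illustration.

```python
import pandas as pd

# Tiny made-up sample; column names follow the dataset used in this post.
games = pd.DataFrame({
    "name": ["Game A", "Game B", "Game C"],
    "release_date": ["November 23, 1998", "September 20, 2000", "April 29, 2008"],
    "user_review": ["9.1", "7.4", "tbd"],
})

# String -> datetime, with an explicit format (assumed here for illustration).
games["release_date"] = pd.to_datetime(games["release_date"], format="%B %d, %Y")

# String -> float; errors="coerce" turns values that are not numbers into NaN.
games["user_review"] = pd.to_numeric(games["user_review"], errors="coerce")

print(games.dtypes)
```

Writing it by hand is where you get control over the datetime format, at the cost of a little more typing.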

Did you see that a line of code was added to the cell below Mito’s GUI? That’s because Mito also gives you the code to use it in another notebook, even if you don’t have Mito installed. If you are learning to code, but don’t know how to do something, this is a fantastic tool. Another cool thing about it is that it automatically adds comments in the cell so that anyone can read it.
Just to make a quick comparison, Bamboolib can automatically recognize that the user_review column should be a float and not an integer, but Mito cannot, so if you try to change a column with decimal numbers to an integer, it will throw an error. Not a significant problem, though.

Renaming columns
Changing a column’s name couldn’t be any simpler. Just like you do in Excel, you click on the column’s name and edit it. Awesome, right? I worked on a project a while ago where I had to change the names of 200 columns, and it took me quite some time. I can only imagine how much time I would have saved if I had known about Mito.
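If you want to do the same thing in plain pandas, a rough sketch could look like the one below. The new column names are hypothetical, chosen only for the example.

```python
import pandas as pd

# Made-up one-row sample with column names from the dataset in this post.
games = pd.DataFrame({"name": ["Game A"], "meta_score": [97], "user_review": [9.1]})

# Rename a couple of columns explicitly with a mapping (new names are hypothetical).
games = games.rename(columns={"meta_score": "critic_score", "user_review": "user_score"})

# Rename many columns at once by applying a function to every column name.
games = games.rename(columns=str.upper)

print(list(games.columns))
```

Passing a function instead of a mapping is what would have helped with a 200-column rename, as long as the new names follow a rule.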

Dropping columns
Dropping columns is another thing that Mito makes ridiculously simple. Just select the column, and click on DEL COL. That’s it!
Imagine a project where you need to delete multiple columns. Instead of typing the name of each of them, you can solve this with a few clicks.
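For reference, the typed version in pandas is short too, but you do need to know the column names. A minimal sketch with made-up data:

```python
import pandas as pd

# Made-up one-row sample; the columns are assumptions for this example.
games = pd.DataFrame({
    "name": ["Game A"],
    "platform": ["PC"],
    "summary": ["Some description"],
    "meta_score": [97],
})

# Drop several columns in a single call.
games = games.drop(columns=["summary", "platform"])

print(list(games.columns))
```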

Undoing actions
Mito makes deleting columns so easy that someone could delete more columns than they would like. If you are using Jupyter notebook without Mito, what would you do? Run all the code again? Who has time for that? Just click on UNDO, and you can undo any actions. Did you use UNDO a little too much? No problem! Just click on REDO, and you will undo the undo. Sorry, I had to say it.

Using multiple DataFrames
Another very cool feature is that you can work with multiple DataFrames at the same time. There are two ways to do it. You can click on IMPORT and upload files from your computer, or, in case you already have the DataFrames in your notebook, you can pass their names inside the parentheses, just like in the following code.
mitosheet.sheet(df1, df2)

For the example above, I created two DataFrames. One where user_review > 9 and one where user_review < 9.
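Here is a sketch of how two DataFrames like these could be created with pandas. The sample rows are made up, and the call to mitosheet.sheet() is left as a comment so the snippet runs without Mito installed.

```python
import pandas as pd

# Made-up sample rows; user_review follows the dataset used in this post.
games = pd.DataFrame({
    "name": ["Game A", "Game B", "Game C"],
    "user_review": [9.4, 9.0, 7.2],
})

# One DataFrame per side of the threshold.
df1 = games[games["user_review"] > 9]
df2 = games[games["user_review"] < 9]

# With Mito installed, both could then be opened together:
# import mitosheet
# mitosheet.sheet(df1, df2)
```

Note that with strict comparisons on both sides, a game with a score of exactly 9 ends up in neither DataFrame.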
Data Transformation
Filtering data
You can filter the data by clicking on the funnel icon and choosing what you want to filter. In the example below, I’m filtering to see only games where user_review is greater than 9.5. You can also create filter groups that combine multiple conditions.
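In plain pandas, this kind of filter is usually written with boolean masks. The sketch below uses made-up rows, and the second filter is only my guess at what a filter group with two conditions would translate to.

```python
import pandas as pd

# Made-up sample rows for illustration.
games = pd.DataFrame({
    "name": ["Game A", "Game B", "Game C", "Game D"],
    "platform": ["PC", "Switch", "PC", "PlayStation 4"],
    "user_review": [9.7, 9.6, 8.0, 9.8],
})

# Single condition: user_review greater than 9.5.
top_rated = games[games["user_review"] > 9.5]

# Two conditions combined with & (and); use | for or.
# Each condition needs its own parentheses.
top_rated_pc = games[(games["user_review"] > 9.5) & (games["platform"] == "PC")]

print(top_rated_pc)
```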

Pivot Table
.groupby() is very useful for data analysis but, even if you are a pro user, you have to admit that grouping data in Python can be a little time-consuming. If you write one or two .groupby() calls you are probably OK, but doing it multiple times every day can become annoying. Now, Mito doesn’t have a .groupby() option. It has .pivot_table() instead, which works similarly. But what’s the difference between them?
In short, .pivot_table() lets you aggregate data into more shapes. For example, you can choose the index, the columns, and the values. .groupby() creates a table where the given dimensions are placed into columns, and a row is created for each combination of those dimensions.
Now that we understand the difference between them, you can create a pivot table by clicking on PIVOT, selecting the columns you want to group by, choosing the stats you want to see, and voilà!
In the example below, I’m grouping by platform, then getting the meta_score mean, the name count, and the user_review mean.
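A small sketch may make the difference more concrete. The data is made up, and the year column is hypothetical (it is not in the dataset used in this post); it is only there to give the pivot table something to spread across the columns.

```python
import pandas as pd

# Made-up data; "year" is a hypothetical column added for this example.
games = pd.DataFrame({
    "platform": ["PC", "PC", "Switch", "Switch"],
    "year": [2020, 2021, 2020, 2021],
    "meta_score": [90, 80, 70, 60],
})

# groupby: one row per (platform, year) combination, in long format.
long_format = games.groupby(["platform", "year"])["meta_score"].mean().reset_index()

# pivot_table: platform on the rows, year spread across the columns.
wide_format = games.pivot_table(
    index="platform", columns="year", values="meta_score", aggfunc="mean"
)

print(long_format)
print(wide_format)
```

Both tables hold the same four averages; only the layout changes.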

Here’s the code created by Mito for the .pivot_table() that we just did. That’s quite some code and we did all that with a few clicks.
# Pivoted all_games_csv into df2
unused_columns = all_games_csv.columns.difference(set(['platform']).union(set([])).union(set({'name', 'user_review', 'meta_score'})))
tmp_df = all_games_csv.drop(unused_columns, axis=1)
pivot_table = tmp_df.pivot_table(
    index=['platform'],
    values=['meta_score', 'name', 'user_review'],
    aggfunc={'meta_score': ['mean'], 'name': ['count'], 'user_review': ['mean']}
)
That’s a nice feature, but what’s really cool to me is that you get the group by in a new tab. You can navigate back and forth through tabs and easily compare results.
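For comparison, this is roughly how I would write the same aggregation by hand with .groupby() and named aggregations. It is a sketch on made-up rows, not Mito’s output, and the result column names are my own choice.

```python
import pandas as pd

# Made-up rows with the columns used in the pivot above.
all_games = pd.DataFrame({
    "name": ["Game A", "Game B", "Game C"],
    "platform": ["PC", "PC", "Switch"],
    "meta_score": [90, 80, 70],
    "user_review": [9.0, 8.0, 7.5],
})

# Named aggregation: result_column=(source_column, function).
by_platform = all_games.groupby("platform").agg(
    meta_score_mean=("meta_score", "mean"),
    name_count=("name", "count"),
    user_review_mean=("user_review", "mean"),
).reset_index()

print(by_platform)
```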

Creating conditional columns
You can create columns based on conditionals with very few steps using IF. In the example below, I will create a new column saying whether the user_review value is above nine or not. To do so, I clicked on any value in the user_review column, clicked on ADD COL, and typed =IF(user_review > 9, 'Yes', 'No'). The way it works is =IF(condition, Value_IF_True, Value_IF_False), which is pretty similar to what you would do in Excel.

Now, let’s explore the code that Mito generated:
all_games_csv.rename(columns={"new-column-1inv": "above_9"}, inplace=True)
all_games_csv['above_9'] = IF(all_games_csv['user_review'] > 9, 'Yes','No')
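I must confess that I didn’t expect to see IF used like this in the code snippet. Keep in mind that IF is not a Python built-in: as far as I can tell, it is a helper that comes with the mitosheet package, so this particular line needs Mito to run. I would normally use .apply() with a lambda function. Good to know!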
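If you want the same column without relying on IF, here are two common ways to write it in pandas and NumPy. This is a sketch on made-up rows, not code generated by Mito.

```python
import numpy as np
import pandas as pd

# Made-up sample rows for illustration.
games = pd.DataFrame({"name": ["Game A", "Game B"], "user_review": [9.4, 8.1]})

# Vectorized if/else with NumPy: condition, value if true, value if false.
games["above_9"] = np.where(games["user_review"] > 9, "Yes", "No")

# The same result with .apply() and a lambda (usually slower on large DataFrames).
games["above_9_apply"] = games["user_review"].apply(
    lambda score: "Yes" if score > 9 else "No"
)

print(games)
```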
Data Exploration
Mito makes data exploration intuitive. Just click on the column, click on the filter icon, then click on Summary. You will see information such as a graph of the data distribution, and descriptive statistics such as mean, standard deviation, number of null values, etc.

It works for both numerical and categorical columns, in case you are wondering.
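You can get similar numbers, without the chart, from a few pandas calls. A minimal sketch with made-up data:

```python
import pandas as pd

# Made-up data with one missing value in each column.
games = pd.DataFrame({
    "platform": ["PC", "PC", "Switch", None],
    "meta_score": [90.0, 80.0, None, 60.0],
})

# Descriptive statistics for a numerical column (count, mean, std, quartiles...).
numeric_summary = games["meta_score"].describe()

# Frequency of each value in a categorical column (missing values are skipped).
category_counts = games["platform"].value_counts()

# Number of null values per column.
missing_values = games.isna().sum()

print(numeric_summary)
print(category_counts)
print(missing_values)
```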

Data Visualization
If you need to create some basic graphs, you can do that as well. For example, you can create a bar plot with a few clicks, and you can generate the code used to generate the graph if you want to edit it. Just click on GRAPH, then select the chart type and the x-axis and y-axis.

Final Thoughts
Mito is a great library if you are learning Python or if you want to get some tasks done fast. I really like Mito and Bamboolib because they make Python accessible to more people. Python can be a little tricky and frustrating for beginners, and I believe that Mito can help people learn and develop their skills.
Even if you have been using Python for years, you can learn a thing or two by observing the code generated by Mito. Now, is it for everyone? That’s a tricky question, and it will depend on what you need, but I think even people who have been coding for years can take advantage of some of the features and, instead of typing 15 lines of code, achieve the same in a few clicks. Either way, try it out and let me know what you think about it. Happy Coding!




