
Data Analysis has become an integral part of various industries, as it enables us to make informed decisions based on collected data. One of the most popular libraries for data analysis in Python is Pandas, which provides powerful data manipulation and cleaning tools. However, working with Pandas can sometimes feel overwhelming, especially for those who are new to data analysis or prefer a more visual approach. This is where PandasGUI steps in – a library that brings a graphical user interface to Pandas, making data manipulation and visualization more accessible and user-friendly.
In this article, we will take a closer look at PandasGUI and its features, guiding you through the installation process and showcasing its capabilities.
1. Installation and Launch

First of all, we need to install PandasGUI. As usual, we can use pip to install it.
pip install pandasgui
1.1 A Little Problem for Non-Windows OS
This section is for those on a non-Windows OS. You can skip it if you are running Python on Windows.
It looks like the author created this library on a Windows PC, so it assumes that the operating system defines an environment variable named APPDATA. However, that is not the case on other operating systems such as macOS or Linux. Specifically, when we try to import PandasGUI there, the import fails with an error.
import pandas as pd
import pandasgui

The easiest way to fix this problem is to manually set this environment variable to an empty string before importing PandasGUI.
import os
os.environ['APPDATA'] = ""
Then, we will be able to use PandasGUI without any problems.
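A slightly more defensive variant of the same workaround is sketched below. It only sets the variable when it is missing, so a real APPDATA value on Windows is left untouched. Note that ensure_appdata is a hypothetical helper name introduced here for illustration; it is not part of PandasGUI.

```python
import os


def ensure_appdata(default: str = "") -> str:
    """Hypothetical helper: define APPDATA only if the OS did not.

    Returns the value that ends up in the environment.
    """
    # setdefault keeps an existing value and only fills in a missing one
    return os.environ.setdefault("APPDATA", default)


appdata_value = ensure_appdata()

# Import PandasGUI only after this point, e.g.:
# import pandasgui
```

The key point is the ordering: the variable has to exist before the import statement runs.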

The warning message can be ignored. My guess is that the library doesn’t implement some interfaces recommended on macOS, so my system emits this warning.
1.2 Load Sample Dataset
To demo this library, we need to use a sample dataset. If you’re a data scientist, you may be familiar with the Iris dataset that is used in many classification or clustering machine learning demos.
Let’s get the dataset from Datahub.io. It is a platform for discovering, sharing, and publishing high-quality open data sets from a variety of sources. Most of the datasets here are open-sourced and can be used for learning purposes according to the license, including the Iris dataset.
df = pd.read_csv("https://datahub.io/machine-learning/iris/r/iris.csv")
df.head()
df.shape

1.3 Launch PandasGUI
Now, let’s launch PandasGUI. Simply call the show() function as follows.
pandasgui.show(df)

Don’t worry about the warning about the missing font family. It is again caused by the operating system: the specified font family doesn’t exist on my Mac. It doesn’t affect how we use the GUI.
After we run this line of code, the GUI should pop up as a desktop application.

2. Features of PandasGUI

The UI is pretty straightforward. It consists of the following components. I’ll introduce them in the later sub-sections.
- DataFrame List – we can navigate and switch dataframes here. It also shows the shape of the dataframe for convenience.
- Filters Query – create and select query expressions to filter the current dataframe
- Column List – view and navigate columns of the current dataframe
- Feature Tabs – switch the tabs to navigate different tools
- Main Area – show the results of the current manipulation

2.1 Filter the DataFrame
The first feature I want to introduce is filtering. It relies on the DataFrame query expressions to quickly filter the dataframe for us.
Specifically, we just need to type a query such as sepallength > 7 and press Enter. The filter will be applied to the dataframe. We can review the filtered results in the main area.

If we want to go back to see the entire dataframe, we can uncheck the expression to remove the filter.

We can also add several query expressions and flexibly apply them using the checkboxes. For example, the screenshot below shows two checked expressions that are both applied to filter the dataframe.
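For reference, here is roughly what these filters look like in plain Pandas. This is only a sketch: the rows are hand-typed illustrative values shaped like the Iris data, the second expression (sepalwidth > 3) is made up for the example, and I am assuming that multiple checked expressions behave like an AND, i.e. a row must pass all of them.

```python
import pandas as pd

# Hand-typed rows shaped like the Iris data (illustrative values only)
df = pd.DataFrame({
    "sepallength": [5.1, 7.2, 7.7, 6.3],
    "sepalwidth": [3.5, 3.6, 2.6, 2.5],
    "class": ["Iris-setosa", "Iris-virginica", "Iris-virginica", "Iris-versicolor"],
})

# One expression, as typed into the filter box
long_sepals = df.query("sepallength > 7")

# Two checked expressions, assumed to be applied one after the other (AND)
long_and_wide = df.query("sepallength > 7").query("sepalwidth > 3")

print(long_sepals)
print(long_and_wide)
```

Unchecking an expression in the GUI corresponds to simply going back to the original df, which is never modified by query().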

2.2 Sorting, Type-Converting and Colour Coding
In the DataFrame main area, we can also easily perform many Excel-like manipulations, such as sorting and colour coding. Apart from that, we can also easily cast the type of a column.

For example, the screenshot below shows that the dataframe is sorted by the sepalwidth column in descending order, and the numeric columns are colour coded based on their value scale.
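The sorting and type casting have straightforward Pandas equivalents, sketched here on a tiny hand-typed sample. The petalwidth column is deliberately stored as text, which is an assumption made only so that the cast has something to do; colour coding is left out of the sketch.

```python
import pandas as pd

# Illustrative values only; petalwidth is stored as strings on purpose
df = pd.DataFrame({
    "sepalwidth": [3.5, 2.6, 3.6],
    "petalwidth": ["0.2", "2.3", "2.5"],
})

# Sort by sepalwidth in descending order, like clicking the column header
sorted_df = df.sort_values("sepalwidth", ascending=False)

# Cast the text column to a numeric type
converted = sorted_df.astype({"petalwidth": "float64"})

print(converted)
print(converted.dtypes)
```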

2.3 Statistics
In the second feature tab, we can see the statistics of this dataframe.

It is also worth mentioning that we can still select query expressions on the left. The statistics will then be recalculated based on the filtered dataframe.
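In code, the closest counterpart is probably describe(), run once on the full dataframe and once on the filtered one. This is a sketch with hand-typed illustrative rows; the exact set of statistics shown in the GUI tab may differ from what describe() returns.

```python
import pandas as pd

# Hand-typed rows shaped like the Iris data (illustrative values only)
df = pd.DataFrame({
    "sepallength": [5.1, 7.2, 7.7, 6.3],
    "sepalwidth": [3.5, 3.6, 2.6, 2.5],
})

# Statistics for the whole dataframe
stats_all = df.describe()

# Statistics recalculated after applying a filter expression
stats_filtered = df.query("sepallength > 7").describe()

print(stats_all)
print(stats_filtered)
```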

2.4 Plotting
I have to say that Python is one of the easiest languages when we want to plot a graph using code. However, we have to write some code after all.
In PandasGUI, we can plot the dataframe using its columns in seconds. For example, the demo below shows that I just need to switch to the "Grapher" tab and select "Scatter 3D". Then, drag some columns to the axis fields.

If we want to switch to other types of graphs, it also takes no time to do so. This actually allows us to quickly test different types of graphs and decide which one could tell a better data story.
2.5 Reshaping the Dataframe
We can also use PandasGUI to reshape a dataframe with drag and drop. For example, we can pivot the Iris dataframe by converting its "class" into columns and then calculate the average of each attribute such as the petal length.

After dragging the column, click the "Finish" button. A new dataframe will be generated as follows.
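The same pivot can be sketched in plain Pandas with pivot_table(). The rows below are hand-typed illustrative values, and this is my own approximation of the operation, not necessarily the code PandasGUI produces.

```python
import pandas as pd

# Hand-typed rows shaped like the Iris data (illustrative values only)
df = pd.DataFrame({
    "sepallength": [5.0, 5.2, 7.0, 6.0],
    "petallength": [1.4, 1.6, 6.0, 5.0],
    "class": ["Iris-setosa", "Iris-setosa", "Iris-virginica", "Iris-virginica"],
})

# Turn each class into a column and average every attribute within it
pivoted = df.pivot_table(
    values=["sepallength", "petallength"],
    columns="class",
    aggfunc="mean",
)

print(pivoted)
```

The result has one row per attribute and one column per class, which matches the layout described above.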

2.6 Generating Code
For most of the features above, PandasGUI can also generate the code for us. This could be very useful when we use the GUI to decide which type of graph is the best, and then generate the code to put into our real script.

Similarly, the reshaping feature also provides this code export feature. It allows us to experiment with reshaping many times and then output the right code.
Well, we could probably do this in ChatGPT, but we would need to explain a lot and then adapt the result to our own context 🙂
Summary

In summary, this article looked at the main features of PandasGUI, a library that brings a graphical user interface to the widely used Pandas library for data manipulation and visualization. We walked through the installation process, loaded a sample dataset, and explored features like filtering, sorting, statistical analysis, plotting, reshaping, and code generation.
PandasGUI is a valuable tool that can significantly enhance your data analysis workflow by simplifying common tasks and offering an interactive experience. While it greatly facilitates data manipulation for both beginners and experienced data scientists, it is important to note that it may not support extremely complex operations. For advanced manipulations, one might need to rely on traditional Pandas scripting.
Unless otherwise noted all images are by the author