
1 Line of Python Code for Data Profiling

Now, you don't need to write hundreds of lines of code

Photo by Mimi Thian on Unsplash

Python is one of the most widely used programming languages among programmers and data scientists. Programmers love Python because it is programmer-friendly. Data scientists love it because most machine learning and deep learning libraries are available in Python.

Whether we are programmers or data scientists, whenever we work on a real-world project or for a client, we need to understand the data. Data is a decisive factor in every industry. To get to know the data, we need to apply programming logic, analytics, and further modelling exercises.

Analysing the data and making it suitable for your task takes an incredible amount of time. In Python, we have a library that can create an end-to-end data profiling report in a single line of code.

This article covers a library that gives us a detailed data profiling report in a single line of code. The only thing you need is the data!


pandas_profiling

pandas_profiling is one of the best-known Python libraries for instantly generating a data profiling report in one line of Python code.

Installation

To install this library, you can use the pip command as follows.

pip install pandas_profiling
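To confirm that the installation worked before moving on, a small check like the one below can help. It is only a sketch: the helper name find_profiling_module is hypothetical, and the second candidate name, ydata_profiling, is included on the assumption that you may have a newer release of the project, which is published under that name.

```python
import importlib.util

# Module names to look for. "pandas_profiling" is the name used in this
# article; "ydata_profiling" is the name adopted by later releases.
CANDIDATES = ("pandas_profiling", "ydata_profiling")


def find_profiling_module(candidates=CANDIDATES):
    """Return the first importable module name in candidates, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


found = find_profiling_module()
print(found or "No profiling library found; run the pip command above.")
```

If the check prints a module name, that is the name to use in the import statement.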

Import the library

Once pandas_profiling is installed, we can import it using the import statements below.

import pandas_profiling
import pandas as pd

We will be using pandas to import the dataset.

About the data

For this article, we will use open-source house pricing data. The data can be downloaded from here.
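As a rough sketch of the loading step, the snippet below reads a CSV into a pandas DataFrame named df. The inline sample data and its column names (price, bedrooms, sqft_living, city) are made up for illustration and are not the actual dataset; with the real download you would pass the file path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV file. The columns and values
# are illustrative only and do not come from the real dataset.
SAMPLE_CSV = """price,bedrooms,sqft_living,city
221900,3,1180,Seattle
538000,3,2570,Seattle
180000,2,,Kenmore
604000,4,1960,Seattle
"""

# With a real file this would be: df = pd.read_csv("path/to/your_file.csv")
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)   # (rows, columns)
print(df.head())  # quick snapshot of the first rows
```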

Importing Dataset
Data Snapshot

Getting data profiling report

Once the data is ready, we can generate the data profiling report with a single line of Python code, as shown below.

house_price_report = pandas_profiling.ProfileReport(df)

After running the above command, you will see a progress bar while the data profiling report is generated based on specific parameters.

Generating data profiling report

Saving the report as HTML

Once the report has been generated successfully, we can save it as an HTML file and share it with others.

You can use the line of code below to save the report in HTML format.

house_price_report.to_file('house_report.html')
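The two steps above can also be wrapped in one small helper. This is a sketch under a few assumptions: the function names load_profile_report_class and profile_to_html are hypothetical, the fallback import assumes you may have the newer ydata_profiling package instead of pandas_profiling, and title and minimal are keyword arguments that ProfileReport accepts in the releases I am aware of, so check them against your installed version.

```python
def load_profile_report_class():
    """Return ProfileReport from whichever package name is installed."""
    try:
        # Assumption: newer releases are importable under this name.
        from ydata_profiling import ProfileReport
    except ImportError:
        # Package name used throughout this article.
        from pandas_profiling import ProfileReport
    return ProfileReport


def profile_to_html(df, path, title="Data Profiling Report", minimal=False):
    """Build a profiling report for df and write it to path as HTML."""
    ProfileReport = load_profile_report_class()
    # minimal=True asks the library to skip its more expensive computations,
    # which can help on large datasets.
    report = ProfileReport(df, title=title, minimal=minimal)
    report.to_file(path)
    return path
```

A call such as profile_to_html(df, "house_report.html", minimal=True) would then produce the same kind of HTML file as the two lines above.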

What you can achieve from the data profiling report

Overall data summary

Screenshot by Author

Detailed information about each variable

Screenshot by Author
Screenshot by Author

Detailed visualization of the correlations among variables

Screenshot by Author
Screenshot by Author

Missing values count

Screenshot by Author

Different kinds of interactions

Screenshot by Author

And many more exciting details that help in understanding the data.
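To get a feel for what a few of these sections summarise, the sketch below computes comparable numbers by hand with plain pandas. It is not how the library works internally, only an approximation of the kind of output you would otherwise assemble yourself; the small DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Small illustrative frame; the values and column names are made up.
df = pd.DataFrame({
    "price": [221900, 538000, 180000, 604000],
    "bedrooms": [3, 3, 2, 4],
    "sqft_living": [1180, 2570, None, 1960],
})

summary = df.describe()      # per-column count, mean, std, min, quartiles, max
missing = df.isna().sum()    # number of missing values in each column
correlations = df.corr()     # pairwise Pearson correlation between columns

print(summary)
print(missing)
print(correlations)
```

Each of these is a separate call and a separate table to read, which is the manual work the profiling report bundles into one HTML page.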


Final closing points

We have seen how a single line of Python code can give us a detailed data profiling report.

The profiling report can provide us with an overall summary of the data, detailed information about each feature, a visual representation of the relationships among variables, details about missing data, and many more interesting insights that can help us understand the data well.

Stay tuned for more exciting articles. I usually write about the practical side of programming and Data Science.

Thank you for reading!


Before you go…

If you liked this article and want to stay tuned for more exciting articles on Python & Data Science, do consider becoming a Medium member by clicking here https://pranjalai.medium.com/membership.

Please do consider signing up using my referral link. This way, a portion of the membership fee goes to me, which motivates me to write more exciting stuff on Python and Data Science.

Also, feel free to subscribe to my free newsletter: Pranjal’s Newsletter.

