
Introduction
Are you tired of typing df.shape, df.info(), df.plot(kind='bar'), df['columnname'].nunique(), and many other basic calls again and again just to get the first insights from a dataset? I am sure you have started to find this process monotonous too. In this article, you will see how to automate these basic calls in five simple steps by developing your own Python package in a matter of minutes.
Preview
Let’s get started
To develop your own customised Python package, we need to perform the following steps:
STEP 1 – Create the Python script file
This file will contain the Python code that runs the basic data analysis. To demonstrate, let us automate the following:
- Dimension of the dataset
- The data types of all the columns
- Number of Unique values
- Percentage of NA values
- Plot the bar chart for all categorical columns
- Plot the histogram for all numeric columns to see the distribution of the data
- Make a heatmap to show the null values
The following is the snippet of the code that I wrote:

The file should carry the name you want the package to be known by, in the same way that Pandas or NumPy are known by theirs, and that name should be unique. In our case, I have named it ‘Mrinal’.
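Since the original snippet is a screenshot, here is a minimal sketch of what such a script could look like. It is not the author's exact code: the method names summarize and plot_all, the show_plots flag, and the choice to pass the DataFrame to automate_analysis() are assumptions made for illustration, and the null-value heatmap is drawn with plain matplotlib.

```python
# Mrinal.py: illustrative sketch only, not the original code from the article
import pandas as pd


class Insights:
    """Bundles the repetitive first-look checks on a DataFrame."""

    def summarize(self, df: pd.DataFrame) -> dict:
        """Return the non-graphical insights as a dictionary."""
        return {
            "shape": df.shape,                          # (rows, columns)
            "dtypes": df.dtypes.astype(str).to_dict(),  # column -> dtype name
            "n_unique": df.nunique().to_dict(),         # distinct non-null values
            "pct_na": (df.isna().mean() * 100).round(2).to_dict(),
        }

    def plot_all(self, df: pd.DataFrame) -> None:
        """Draw bar charts, histograms and a null-value heatmap."""
        import matplotlib.pyplot as plt  # imported lazily: only plots need it

        # Simplification: every non-numeric column is treated as categorical.
        for col in df.select_dtypes(exclude="number").columns:
            df[col].value_counts().plot(kind="bar", title=col)
            plt.show()

        for col in df.select_dtypes(include="number").columns:
            df[col].plot(kind="hist", title=col)
            plt.show()

        # Heatmap of missing values: dark cells mark NA entries.
        plt.imshow(df.isna().to_numpy().astype(int), aspect="auto",
                   cmap="gray_r", interpolation="nearest")
        plt.xticks(range(df.shape[1]), list(df.columns), rotation=90)
        plt.title("Missing values")
        plt.show()

    def automate_analysis(self, df: pd.DataFrame, show_plots: bool = True) -> dict:
        """Print the summary, optionally draw the plots, and return the summary."""
        summary = self.summarize(df)
        for name, value in summary.items():
            print(f"{name}: {value}")
        if show_plots:
            self.plot_all(df)
        return summary
```

With this sketch, Insights().automate_analysis(df) prints the summary and then draws the plots; passing show_plots=False skips the figures.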
STEP 2 – Create a setup.py file
This file is needed to install the package and contains information such as the package name and the author name. The file name must be written in lowercase. It sits outside the folder that contains the Python script file from Step 1 and the other files discussed later.

The image above shows the code to write in setup.py. Note that the name of your package should be unique: if you later want to publish to pypi.org, you cannot use a name that is already registered on that site. For example, you cannot create a package named ‘Pandas’ or ‘Numpy’, as those names are already taken.
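For readers who prefer text to a screenshot, a minimal setup.py for this layout could look like the sketch below. The version, description, author and dependency list are placeholders and assumptions, not values taken from the original file.

```python
# setup.py: sits next to (not inside) the package folder
from setuptools import setup

setup(
    name="Mrinal",                  # must be unique if you later publish to PyPI
    version="0.1",                  # placeholder version
    description="Automates basic exploratory data analysis",  # placeholder text
    author="Your Name",             # placeholder
    packages=["Mrinal"],            # folder holding Mrinal.py and __init__.py
    install_requires=["pandas", "matplotlib"],  # assumed dependencies
)
```

The packages entry is the part that ties this file to the folder created in Step 4, so it has to match that folder's name exactly.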
STEP 3 – Create an __init__.py file
This file tells Python that the folder contains a package. It should be present in the same folder along with the Python script file created in Step 1.

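The code above references the class that we created in the Python script, ‘Insights’, and the name of the package, ‘Mrinal’ in our case. For this layout, that is a line of the form ‘from .Mrinal import Insights’. The leading ‘.’ makes it an explicit relative import, which Python 3 requires when a package imports one of its own modules.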
STEP 4 – Arrange the files in the right folder
For this step:
- Create a folder and name it anything you like, as the name does not affect the installation in any way. Let’s call it ‘My first Python package’ for reference
- Store the setup.py file inside this folder
- Create another folder inside it and give it the same name as the package, in our case ‘Mrinal’. This is the name you will use whenever you import the package: ‘from Mrinal import Insights’
- Store the Python script file named ‘Mrinal.py’ and the ‘__init__.py’ file inside the newly created folder
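If you would rather create this layout from code than by hand, the steps above can be sketched with the standard library. The helper name scaffold is hypothetical, and the files it writes contain only placeholders (apart from the one-line ‘__init__.py’), so you still paste in your own code from Steps 1 and 2.

```python
from pathlib import Path


def scaffold(root: Path, package: str = "Mrinal") -> list[str]:
    """Create the folder layout from this step; return the relative file paths."""
    pkg_dir = root / package
    pkg_dir.mkdir(parents=True, exist_ok=True)

    files = {
        root / "setup.py": "# setup() call from Step 2 goes here\n",
        pkg_dir / f"{package}.py": "# Insights class from Step 1 goes here\n",
        pkg_dir / "__init__.py": f"from .{package} import Insights\n",
    }
    for path, placeholder in files.items():
        if not path.exists():  # never overwrite files you already wrote
            path.write_text(placeholder)

    return sorted(p.relative_to(root).as_posix() for p in files)


# Example call (creates the folders in the current directory):
# scaffold(Path("My first Python package"))
```

The outer folder holds setup.py, while the inner ‘Mrinal’ folder holds the script and ‘__init__.py’, which is exactly the arrangement pip expects in the next step.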
STEP 5 – pip install
- Open the command prompt
- Use the ‘cd’ command to navigate to the ‘My first Python package’ folder
- Type ‘pip install .’
- This installs your package
- Then open any IDE such as Jupyter Notebook and type: ‘from Mrinal import Insights’
- Create a class object, for instance, insight_1 = Insights(). You can also look at the preview video.
- Then call the ‘automate_analysis()’ method, just like in the video. The repeated steps are now automated, and a single call to this method does all the work.
Congratulations!
You have built your first Python package on your own, and you will save a lot of time in future by not writing those calls again and again. In the same way, you can add more functions and classes to your package to make your data analysis smoother.
Resources
- You can also download all the code files from my GitHub page
- If you want to upload your package to pypi.org then you can go to this link
If you liked this article, do read my other article on how you can develop the critical skills needed to perform Feature Engineering for a strong Machine Learning model.