Pandas is a highly popular data analysis and manipulation library for Python. It is one of the very first tools introduced in Data Science education. Pandas provides plentiful functions and methods for more efficient data analysis.
What I like most about Pandas is that there are almost always multiple ways to complete a given task. One way might outperform the others in terms of time and complexity. However, having multiple options makes you think outside the box. It also helps improve your approach to solving complex tasks.
Another advantage of practicing different ways to solve an issue is that it greatly improves your knowledge of Pandas.
In this article, we will go over some examples that demonstrate different ways to do the same tasks. The medical cost dataset available on Kaggle is used for the examples.
Let’s start with importing the libraries and reading the dataset into a dataframe.
import numpy as np
import pandas as pd
df = pd.read_csv("/content/insurance.csv")
df.head()

The dataset contains some personal information and the amount charged for the insurance.
The first set of examples is filtering rows based on values in a column. For instance, we may want to filter the rows in which the age is higher than 45.
The more common way of doing this task is as below.
df[df.age > 45]
Another way is to use the query method and specify the condition for filtering as a string.
df.query('age > 45')
The third way I will show is not typically used, but it is worth showing because it demonstrates the flexibility of pandas.
df.where(df.age > 45).dropna(axis=0, how='all')
We first apply the where function, which keeps the values in rows that satisfy the condition and replaces the values in all other rows with NaN. We then drop the rows that consist entirely of NaN values.
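To compare the three filtering approaches side by side, here is a minimal sketch on a small made-up dataframe. The name toy and its values are assumptions for illustration only and are not taken from the Kaggle dataset.

```python
import pandas as pd

# Hypothetical toy data, not the real insurance dataset
toy = pd.DataFrame({
    "age": [19, 52, 47, 33, 60],
    "smoker": ["yes", "no", "no", "yes", "no"],
    "charges": [16884.92, 9500.10, 8240.59, 21984.47, 12629.17],
})

# 1) Boolean mask
by_mask = toy[toy["age"] > 45]

# 2) query with the condition written as a string
by_query = toy.query("age > 45")

# 3) where marks non-matching rows as NaN, dropna removes them
by_where = toy.where(toy["age"] > 45).dropna(axis=0, how="all")

# The mask and query results should be identical
print(by_mask.equals(by_query))

# The where result keeps the same rows, but dtypes may differ
print(by_mask.index.equals(by_where.index))
print(by_mask.dtypes["age"], by_where.dtypes["age"])
```

One thing to keep in mind: because where temporarily fills the non-matching rows with NaN, integer columns typically come back as floats, so the third result is not always a drop-in replacement for the first two.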
The second task is to create a new column. Assume we need to add a new column that contains customer id numbers.
Here is the first method.
cust_id = np.arange(1,1339)
df['cust_id'] = cust_id
df.head()

It might seem better to put the customer id column first. The insert function of pandas is used in that case.
df.drop('cust_id', axis=1, inplace=True)
cust_id = np.arange(1,1339)
df.insert(0, 'cust_id', cust_id)
df.head()

We first drop the column added in the previous step and then insert the customer id column as the first column. The parameters of the insert function are the position to insert, the name of the column, and the values.
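The two ways of adding a column can be sketched on a small made-up dataframe as follows. The frame toy and its values are assumptions for illustration. The id range is derived from the length of the frame instead of the hard-coded 1339, which is a small variation on the code above rather than part of it.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the real insurance dataset
toy = pd.DataFrame({
    "age": [19, 52, 47],
    "charges": [16884.92, 9500.10, 8240.59],
})

# Ids 1..n, computed from the number of rows
cust_id = np.arange(1, len(toy) + 1)

# Plain assignment appends the new column at the end
appended = toy.copy()
appended["cust_id"] = cust_id

# insert places the new column at the given position (0 = first)
first = toy.copy()
first.insert(0, "cust_id", cust_id)

print(list(appended.columns))
print(list(first.columns))
```

Note that insert modifies the dataframe in place and returns None, so its result should not be assigned back to a variable.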
The next example involves the groupby function of pandas. We can use it to calculate the average charges for each combination of categories in the sex and smoker columns.
df[['sex','smoker','charges']].groupby(['sex','smoker']).mean()

Let’s say we want to have the sex and smoker as columns in the dataframe instead of index levels.
One way is to use the reset_index function.
df[['sex','smoker','charges']].groupby(['sex','smoker']).mean().reset_index()
We can also use the as_index parameter of the group by function.
df[['sex','smoker','charges']].groupby(['sex','smoker'], as_index=False).mean()
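A quick way to check that the two approaches agree is to run them on a small made-up dataframe. The frame toy and its values below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data, not the real insurance dataset
toy = pd.DataFrame({
    "sex": ["female", "male", "male", "female", "male", "female"],
    "smoker": ["yes", "no", "yes", "no", "no", "yes"],
    "charges": [100.0, 20.0, 120.0, 30.0, 40.0, 140.0],
})
cols = ["sex", "smoker", "charges"]

# Group keys become index levels, then reset_index turns them into columns
via_reset = toy[cols].groupby(["sex", "smoker"]).mean().reset_index()

# as_index=False keeps the group keys as regular columns from the start
via_flag = toy[cols].groupby(["sex", "smoker"], as_index=False).mean()

print(via_reset)
print(via_flag)
```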
Both result in the following dataframe:

The next example is a simple yet functional one. We may want to see the number of unique values in a column. One way is to use the value_counts function indirectly.
The value_counts function returns the unique values in a column along with the number of occurrences. If we apply the len function with the value_counts, we get the number of unique values.
A simpler way is to use a dedicated function for this task which is the nunique. As its name suggests, it returns the number of unique values in a column.
len(df.age.value_counts()) == df.age.nunique()
True
As we can see, both return the same value.
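Missing values are worth a brief look here. The sketch below uses a small made-up series (the name ages and its values are assumptions for illustration) to show that both value_counts and nunique skip NaN by default, which is why the two counts match.

```python
import numpy as np
import pandas as pd

# Hypothetical series with repeated values and one missing value
ages = pd.Series([19, 19, 33, 47, np.nan, 47])

# Both skip NaN by default, so they agree
n_from_counts = len(ages.value_counts())
n_unique = ages.nunique()

# Passing dropna=False counts NaN as its own value
n_with_nan = ages.nunique(dropna=False)

print(n_from_counts, n_unique, n_with_nan)
```

If NaN should be counted as a distinct value, dropna=False can be passed to either function.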
Consider we have the following dataframe:

The index contains the id numbers. We want to have them as a column named "id" instead of the index.
One way is to use the reset_index function which creates a new column using the index. However, the name of the column will be "index" so we need to change it using the rename function.
df.reset_index().rename(columns={'index':'id'})

Another way is to rename the index first and then reset the index.
df.rename_axis('id').reset_index()
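Both ways of moving the index into an "id" column can be tried on a small made-up dataframe. The frame toy, its score column, and the id values are assumptions for illustration, standing in for the dataframe described above.

```python
import pandas as pd

# Hypothetical frame whose unnamed index holds the id numbers
toy = pd.DataFrame({"score": [88, 92, 75]}, index=[101, 102, 103])

# reset_index creates a column called "index", which is then renamed
via_rename = toy.reset_index().rename(columns={"index": "id"})

# rename_axis names the index first, so reset_index uses that name
via_axis = toy.rename_axis("id").reset_index()

print(via_rename)
print(via_axis)
```

The first approach relies on the index having no name. If the index is already named, reset_index uses that name for the new column, and the rename mapping on "index" would not match anything.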

Conclusion
We have seen some examples of how the same task can be accomplished in different ways with pandas.
In some cases, one way is preferred over the others because of time and complexity concerns. However, it is always better to know multiple ways. Practicing and implementing different ways will also improve your pandas skills.
Thank you for reading. Please let me know if you have any feedback.