Pandas needs no introduction. It’s been a go-to library in the Python ecosystem for years and is here to stay. There are hundreds of functions available in the library, and knowing all of them isn’t feasible even for the most experienced users.
Today you’ll learn three valuable functions that don’t get much attention. You’ll rarely find them in data science guides, even though they can be helpful in day-to-day analysis workflows.
You’ll need NumPy and Pandas installed to follow along. The polynomial interpolation example also requires SciPy.
eval()
You can use the eval() function to evaluate a Python expression passed as a string. Here are the parameters you should know about:
- expr (str): The expression you want to evaluate. It must contain only Python expressions.
- target (object): The target object for assignment, here the DataFrame.
- inplace (bool): Indicates if the target should be modified. False by default, so a copy is returned.
Let’s see how it works in practice. The following code snippet creates a dummy DataFrame showing made-up Machine Learning model performance:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Algorithm': ['XGBoost', 'DNN'],
    'MSE': [63.3234, 51.8182]
})
Here’s what it looks like:

You’ll now use the eval() function to calculate RMSE based on MSE:
pd.eval('RMSE = df.MSE ** 0.5', target=df)
Here’s what the new DataFrame looks like:

The function provides an alternative way to create new columns or change existing ones. It doesn’t offer any groundbreaking functionality.
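As a minimal sketch of the copy-versus-in-place behavior described above, the snippet below evaluates the same expression twice: once with the default inplace=False, and once through DataFrame.eval() with inplace=True. It assumes the same dummy DataFrame; the names with_rmse and df_inplace are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Algorithm': ['XGBoost', 'DNN'],
    'MSE': [63.3234, 51.8182]
})

# Default inplace=False: a modified copy is returned and df stays untouched
with_rmse = pd.eval('RMSE = df.MSE ** 0.5', target=df)
print(with_rmse.columns.tolist())
print(df.columns.tolist())

# DataFrame.eval() resolves column names directly, so no "df." prefix is needed.
# With inplace=True the frame is modified and None is returned.
df_inplace = df.copy()  # hypothetical working copy, keeps df unchanged
df_inplace.eval('RMSE = MSE ** 0.5', inplace=True)
print(df_inplace)
```

Keeping the default and assigning the returned copy to a new name is usually the safer choice while exploring, since the original data stays available for comparison.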
You can learn more about the function in the official Pandas documentation for pandas.eval.
interpolate()
The interpolate() function fills missing values using a specified interpolation method. Here are the parameters you should know about:
- method (str): Interpolation method. linear by default. It can have many values, so check the documentation to find the appropriate one.
- axis (int or str): Axis to interpolate along.
- limit (int): Optional parameter used to specify the maximum number of consecutive missing values to fill.
- inplace (bool): Indicates if the original DataFrame should be modified. False by default.
Let’s see how to work with this function. The following snippet creates a Pandas Series containing the first five integers with the fourth missing. The interpolate() function should determine automatically that 4 is the correct interpolation value:
s1 = pd.Series([1, 2, 3, np.nan, 5])
s1.interpolate()
Here are the results:

But what if the relationship isn’t linear? Here’s an example with a quadratic relationship between the values:
s2 = pd.Series([1, 4, 9, np.nan, 25])
s2.interpolate(method='polynomial', order=2)
And here are the results:

Works like a charm! Calling interpolate() with the default settings on the above example would fill the missing value with 17.0, as that’s the average of the surrounding elements. That’s not what you want.
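To make the default behavior concrete, here is a small sketch that runs the linear method on the quadratic Series and then shows how the limit parameter caps the number of consecutive values that get filled. The gaps Series is a made-up example, not part of the data above.

```python
import numpy as np
import pandas as pd

# Default method='linear': the gap becomes the midpoint of its neighbors
s2 = pd.Series([1, 4, 9, np.nan, 25])
linear = s2.interpolate()
print(linear[3])  # 17.0, the average of 9 and 25

# Hypothetical Series with three consecutive missing values
gaps = pd.Series([1, np.nan, np.nan, np.nan, 5])

# Without a limit, every missing value in the gap is filled
filled_all = gaps.interpolate()
print(filled_all.tolist())

# limit=1 fills only the first missing value of the gap, the rest stay NaN
filled_one = gaps.interpolate(limit=1)
print(filled_one.tolist())
```

A limit can be useful when long runs of missing values should stay visible instead of being silently replaced with a straight line.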
You can learn more about the function in the official Pandas documentation for Series.interpolate.
factorize()
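You can use the factorize() function to encode the object as an enumerated type or a categorical variable. Here are the parameters you should know about:
- values (sequence): A 1-dimensional sequence of values to encode.
- sort (bool): Indicates if the unique values should be sorted. False by default.
- na_sentinel (int): Value used to mark missing values in the codes, -1 by default. If None, missing values are kept as a regular entry in the uniques instead of being dropped. Pandas 1.5 deprecated this parameter in favor of use_na_sentinel (bool), and Pandas 2.0 removed it.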
Let’s see how it works in practice. The following code snippet factorizes a dummy list of US states without sorting:
codes, uniques = pd.factorize(
values=['NY', 'CA', 'AL', 'AL', 'AR', 'CA']
)
print(codes)
print(uniques)
Here are the results:

The factorize() function returns a tuple, so it’s good practice to unpack the result into two variables.
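The sketch below shows how the two return values relate: codes holds one integer per input element, uniques holds the distinct values in order of first appearance, and indexing uniques with codes rebuilds the original data. Wrapping the list in a NumPy array is an assumption made here so the call also runs cleanly on newer Pandas releases that warn about plain lists; the with_missing array is made up to show the default -1 marker.

```python
import numpy as np
import pandas as pd

states = np.array(['NY', 'CA', 'AL', 'AL', 'AR', 'CA'], dtype=object)

codes, uniques = pd.factorize(states)
print(codes)    # one integer code per input element
print(uniques)  # distinct values in order of first appearance

# Indexing uniques with codes reconstructs the original values
restored = uniques[codes]
print(restored)

# Hypothetical input with a missing value: it gets the code -1
# and does not show up in the uniques
with_missing = np.array(['NY', None, 'CA'], dtype=object)
codes_na, uniques_na = pd.factorize(with_missing)
print(codes_na)
print(uniques_na)
```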
You can specify sort=True if ordering matters:
codes, uniques = pd.factorize(
values=['NY', 'CA', 'AL', 'AL', 'AR', 'CA'],
sort=True
)
print(codes)
print(uniques)
Here are the results:

Pandas figured out automatically that it should sort the values alphabetically. Neat.
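In day-to-day work the values usually live in a DataFrame column rather than a list. As a rough sketch of that use case, the snippet below stores the sorted codes next to the original column. The column names state and state_code are hypothetical.

```python
import pandas as pd

# Hypothetical DataFrame holding the same dummy states
df = pd.DataFrame({'state': ['NY', 'CA', 'AL', 'AL', 'AR', 'CA']})

# With sort=True the codes follow the alphabetical order of the unique values
codes, uniques = pd.factorize(df['state'], sort=True)
df['state_code'] = codes

print(df)
print(list(uniques))
```

Keeping uniques around lets you translate the integer codes back to the original labels later on.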
You can learn more about the function in the official Pandas documentation for pandas.factorize.
Final thoughts
It seems like there’s nothing Pandas can’t do. More functions are added with each release, so it’s worth checking the documentation every now and then. Who knows, maybe something you spent hours implementing manually is already built into the library. Can’t hurt to check.
Do you have a favorite lesser-known Pandas function? Let me know in the comment section below.