
3 Advanced Python Functions for Data Scientists

Make your code cleaner and more readable by not reinventing the wheel.

Python can be lots of fun. It’s easy to reinvent a built-in function that you don’t know exists, but why would you want to? Today we’ll take a look at three such functions, which I now use more or less daily but was unaware of for a good portion of my data science career.

Photo by Drew Beamer on Unsplash

While they may not be a huge time saver (if you understand the logic behind them), they will make your code look much cleaner. That may not sound like a big deal to you, but the future you will be grateful.

A couple of weeks back I posted an article on some essential pure Python skills. It covers some other cool built-in functions, so make sure to check it out:

3 Essential Python Skills for Data Scientists

Without further ado, let’s start with the first one!


map()

map() is a built-in Python function that applies a function to every element of an iterable, such as a list or the keys of a dictionary. It’s probably the cleanest and most readable way to apply an operation to your data.

In the example below, the goal is to square the numbers in a list. First, a function for doing so must be declared. Then I’ll show how you would do it without and with map(), that is, in a non-Pythonic and a Pythonic way.

nums = [1, 2, 3, 4, 5]
# this function will calculate square
def square_num(x): 
    return x**2
# non-pythonic approach
squares = []
for num in nums:
    squares.append(square_num(num))

print('Non-Pythonic Approach: ', squares)
# pythonic approach
x = map(square_num, nums)
print('Pythonic Approach: ', list(x))

The outputs are the same, but take a second to appreciate how much cleaner the Pythonic approach looks. There’s no need to write an explicit loop either.
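
One detail worth knowing: in Python 3, map() returns a lazy iterator rather than a list, which is why the example above wraps it in list(). The sketch below redefines square_num so the snippet is self-contained, shows that the iterator can only be consumed once, and adds a list comprehension that produces the same result. The names squares_iter, leftover and squares_comp are just illustrative.

```python
nums = [1, 2, 3, 4, 5]

def square_num(x):
    return x ** 2

# map() returns a lazy iterator; nothing is computed until it is consumed
squares_iter = map(square_num, nums)
squares = list(squares_iter)

# the iterator is now exhausted, so consuming it again yields nothing
leftover = list(squares_iter)

# an equivalent list comprehension
squares_comp = [square_num(n) for n in nums]

print(squares)       # [1, 4, 9, 16, 25]
print(leftover)      # []
print(squares_comp)  # [1, 4, 9, 16, 25]
```

Whether map() or a comprehension reads better is largely a matter of taste; the comprehension tends to win when the operation is a short expression rather than an existing function.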


zip()

zip() is one of my favorites. It enables you to iterate over two or more lists at the same time. This can come in handy when working with dates and times.

For example, I use it daily in my job when one attribute represents the starting time of some event and a second attribute represents its ending time. For further analysis, it’s almost always necessary to compute the time difference between the two, and zip() is by far the easiest way I know to accomplish it.

In the example below, I’ve created two lists of numbers, and the task is to sum the corresponding elements:

first = [1, 3, 8, 4, 9]
second = [2, 2, 7, 5, 8]
# Iterate over two or more lists at the same time
for x, y in zip(first, second):
    print(x + y)

So easy and clean.
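
To tie this back to the start and end time use case, here is a minimal sketch. The timestamps and the names starts, ends, durations and minutes are made up for illustration. It also shows a behavior to keep in mind: zip() stops at the shortest input, so lists of unequal length are silently truncated.

```python
from datetime import datetime

# hypothetical event start and end times
starts = [datetime(2020, 1, 1, 9, 0), datetime(2020, 1, 1, 13, 30)]
ends = [datetime(2020, 1, 1, 10, 15), datetime(2020, 1, 1, 14, 0)]

# pair each start with its end and subtract to get a timedelta
durations = [end - start for start, end in zip(starts, ends)]
minutes = [d.total_seconds() / 60 for d in durations]
print(minutes)  # [75.0, 30.0]

# zip() stops at the shortest input
pairs = list(zip([1, 2, 3], ['a', 'b']))
print(pairs)  # [(1, 'a'), (2, 'b')]
```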


filter()

The filter() function is in a way similar to map(): it also applies a function to each element of a sequence, the difference being that filter() returns only those elements for which the function evaluates to True.

In the example below I’ve created an arbitrary list of numbers and a function that will return True if the number is even. Once again, I’ll demonstrate how to perform the operation in a non-pythonic and pythonic way.

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Will return true if input number is even
def even(x):
    return x % 2 == 0
# non-pythonic approach
even_nums = []
for num in numbers:
    if even(num):
        even_nums.append(num)

print('Non-Pythonic Approach: ', even_nums)
# pythonic approach
even_n = filter(even, numbers)
print('Pythonic Approach: ', list(even_n))

And again, the Pythonic way is much cleaner and more readable, something future you will appreciate.
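
For short one-off conditions, a lambda saves you from declaring a separate function, and passing None as the function keeps only the truthy elements. A small sketch follows; the values list and the names truthy and even_comp are made up for illustration.

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# a lambda instead of a named function
even_nums = list(filter(lambda x: x % 2 == 0, numbers))
print(even_nums)  # [2, 4, 6, 8, 10]

# with None as the function, filter() drops falsy values
values = [0, 1, '', 'a', None, [], [2]]
truthy = list(filter(None, values))
print(truthy)  # [1, 'a', [2]]

# an equivalent list comprehension
even_comp = [x for x in numbers if x % 2 == 0]
print(even_comp)  # [2, 4, 6, 8, 10]
```

Like map(), filter() returns a lazy iterator in Python 3, hence the list() calls.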


Before you go

There are more functions like these three in Python, but I don’t find them as applicable to data science. Practice these three, and remember them when facing any sort of challenge at your job or in college. It’s so easy to reinvent the wheel, but there’s no point in it.

What built-in functions do you use daily? Feel free to share.


Loved the article? Become a Medium member to continue learning without limits. I’ll receive a portion of your membership fee if you use the following link, with no extra cost to you.

Join Medium with my referral link – Dario Radečić

