
Context Managers: a Data Scientist’s View

How Python context managers can clean up your code

https://unsplash.com/photos/KVihRByJR5g?utm_source=unsplash&utm_medium=referral&utm_content=creditShareLink

This post is part of a series where I share what I’m learning about clean Python code. I am a data scientist seeking to level up my Python skills by writing more Pythonic code and finding better ways to structure my larger code bases. I’m reading through Clean Code in Python and enriching the material with other sources. My goal is to reinforce the topics I’m learning by summarizing them here, and hopefully to help others understand them as well!

This post is about context managers in Python. First I will describe what context managers are, and why they are useful. Then, I will walk you through a practical example with web scraping.

What is a Context Manager?

The best way to describe context managers is to show an example almost every Python programmer has encountered at some point, often without knowing they were using one! The code snippet below opens a .txt file and stores its lines in a Python list. Let’s see what this looks like:

with open('example.txt') as f:
    lines = f.readlines()
print(lines)

Let’s break this down. The first line opens a file and assigns the file object to f. The next line calls the readlines method on the file object, which returns a list where each item is one line of text from example.txt. The last line prints the list. The file object is the context manager here: as soon as execution leaves the indented block under the with statement, the file is closed automatically. Without a context manager, the code would look like this:

f = open('example.txt')
lines = f.readlines()
f.close()
print(lines)

Using a context manager makes this much more readable. Imagine a script that opens and closes multiple files, with lines of manipulation between them. It could become difficult to read, and you could easily forget to close a file. The with statement also guarantees that the close will happen even if an exception is raised inside the block, which the version above does not: if readlines raised an error, f.close() would never run.
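To make the exception guarantee concrete, here is a minimal sketch of roughly what the with statement does for a file, written out by hand with try/finally. The temporary file and the read_lines_manually helper are assumptions made for illustration; they are not part of the original example.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name


def read_lines_manually(file_path):
    """Hypothetical helper: roughly what `with open(...)` spares you from writing."""
    f = open(file_path)
    try:
        return f.readlines()
    finally:
        # Runs whether readlines() succeeds or raises.
        f.close()


manual_lines = read_lines_manually(path)

# The with statement gives the same guarantee in less code.
try:
    with open(path) as f:
        with_lines = f.readlines()
        raise ValueError("something went wrong while processing")
except ValueError:
    pass

# The file was closed even though the block raised.
closed_after_error = f.closed
os.remove(path)
```

The try/finally version works, but every file needs its own four lines of ceremony, which is exactly the boilerplate the with statement hides.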

Writing Your Own Context Manager

To illustrate how to write your own, we’ll use a practical example. I recently ran into an issue while web scraping with Selenium where I was unable to open a new URL in an existing browser. The solution was to open a new browser to visit the URL and close it once I had scraped the data. There are two ways to do this. We’ll start by defining a class that uses the __enter__ and __exit__ dunder methods.

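# imports (Selenium 4 style API assumed)
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# chrome_driver_path is assumed to be defined elsewhere as the path
# to your chromedriver executable

# define context manager class
class OpenBrowser:

    def __enter__(self):
        # start a new browser and hand it to the with block
        self.driver = webdriver.Chrome(service=Service(chrome_driver_path))
        return self.driver

    def __exit__(self, exception_type, exception_value,
                 exception_traceback):
        # runs when the with block is left, even after an exception;
        # quit() ends the whole session, close() only closes the current window
        self.driver.quit()

# use context manager
with OpenBrowser() as driver:
    driver.get("http://www.python.org")
    elem = driver.find_element(By.NAME, "q")
    elem.clear()
    elem.send_keys("pycon")
    elem.send_keys(Keys.RETURN)
    html = driver.page_source
print(html)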

This looks complicated, but it’s not. You define a class with __enter__ and __exit__ methods, which define what happens when the with block is entered and what happens when it is exited, respectively. The three arguments to __exit__ are fixed by the protocol (they describe any exception raised inside the block), and returning a value from __enter__ is optional; if you do return one, it is bound to the name after as. Now it’s clear to readers that the driver only exists to search python.org for pycon and grab the resulting HTML. No matter what happens inside the block, the browser will close.
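The same protocol can be seen without a browser. The sketch below uses a hypothetical Timer class (not from the original example) to show that the value returned by __enter__ is what gets bound after as, and that __exit__ runs and receives the exception details even when the block fails.

```python
import time


class Timer:
    """Hypothetical context manager that records how long a block takes."""

    def __enter__(self):
        self.start = time.perf_counter()
        self.elapsed = None
        self.error_type = None
        # Whatever is returned here is bound to the name after `as`.
        return self

    def __exit__(self, exception_type, exception_value, exception_traceback):
        # Always runs when the block is left, with or without an exception.
        self.elapsed = time.perf_counter() - self.start
        self.error_type = exception_type  # None if the block finished cleanly
        # Returning a falsy value lets any exception propagate to the caller.
        return False


with Timer() as t:
    total = sum(range(1_000_000))

try:
    with Timer() as failed:
        raise RuntimeError("scrape failed")
except RuntimeError:
    pass

print(f"clean run: {t.elapsed:.4f}s, error seen: {t.error_type}")
print(f"failed run: {failed.elapsed:.4f}s, error seen: {failed.error_type}")
```

Something like this can be handy for timing a model fit or a slow query without scattering start and stop calls through a notebook.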

This syntax is not too bad, but there is an easier way. The contextlib module provides a decorator that takes care of the boilerplate and replaces the class with a function. Let’s see how the above example would look using the decorator:

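# imports (Selenium 4 style API assumed)
import contextlib

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# chrome_driver_path is assumed to be defined elsewhere as the path
# to your chromedriver executable

@contextlib.contextmanager
def open_browser():
    driver = webdriver.Chrome(service=Service(chrome_driver_path))
    try:
        yield driver
    finally:
        # without try/finally this line is skipped if the with block raises
        driver.quit()

with open_browser() as driver:
    driver.get("http://www.python.org")
    elem = driver.find_element(By.NAME, "q")
    elem.clear()
    elem.send_keys("pycon")
    elem.send_keys(Keys.RETURN)
    html = driver.page_source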

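Here, we just import contextlib and decorate a function with contextlib.contextmanager. The function needs a yield statement. Yielding a value is optional, but the yield marks the boundary between setup and cleanup: everything before it plays the role of __enter__, and everything after it plays the role of __exit__. Note that code after the yield is skipped if the with block raises, unless the yield is wrapped in try/finally.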
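The order of execution around the yield is easier to see in isolation. The sketch below uses a hypothetical managed_resource function and an events list, both invented for illustration, to trace what runs when, including the case where the block raises.

```python
import contextlib

events = []  # records the order in which each step runs


@contextlib.contextmanager
def managed_resource(name):
    """Hypothetical stand-in for a browser, file, or database connection."""
    events.append(f"open {name}")  # plays the role of __enter__
    try:
        yield name  # the body of the with block runs here
    finally:
        # Plays the role of __exit__; runs even if the body raises.
        events.append(f"close {name}")


with managed_resource("a") as resource:
    events.append(f"use {resource}")

try:
    with managed_resource("b"):
        raise RuntimeError("failure inside the block")
except RuntimeError:
    events.append("error handled")

print(events)
# ['open a', 'use a', 'close a', 'open b', 'close b', 'error handled']
```

Note that "close b" is recorded before "error handled": the cleanup runs while the exception is still on its way out of the with block.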

Conclusion

I hope I’ve not only explained context managers well enough, but also convinced you that they can be extremely valuable to have in your toolbox. Even as a data scientist, I have found many cases where they came in handy. Feel free to leave any questions or comments. Happy coding!

