This post is part of a series where I share what I’m learning about clean Python code. I am a data scientist seeking to level up my Python skills by writing more Pythonic code and finding better ways to structure my larger code bases. I’m reading through Clean Code in Python and enriching the material with other sources. My goal is to reinforce the topics I’m learning about by summarizing them here and hopefully help others understand these topics as well!
This post is about context managers in Python. First I will describe what context managers are, and why they are useful. Then, I will walk you through a practical example with web scraping.
What is a Context Manager?
The best way to describe context managers is to show an example almost every Python programmer has encountered at some point, often without knowing they were using one! The code snippet below opens a .txt file and stores its lines in a Python list. Let’s see what this looks like:
with open('example.txt') as f:
    lines = f.readlines()

print(lines)
Let’s break this down. The first line opens a file and assigns the file object to f. The next line calls the readlines method on the file object, which returns a list where each item is one line of text from the example.txt file. The last line prints the list. The file object returned by open is the context manager here: the with statement closes the file automatically once the indented block under it is exited. Without a context manager, the code would look like this:
f = open('example.txt')
lines = f.readlines()
f.close()
print(lines)
Using a context manager makes this much more readable. Imagine a script that opens and closes multiple files, with lines of manipulation between them. It could become difficult to read, and you could easily forget to close a file. The with statement also guarantees that the file is closed even if an exception is raised inside the block, which the manual version above does not.
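To make the exception guarantee concrete, here is a minimal sketch comparing the two styles. The scratch file demo.txt and the two helper functions are made up for this illustration. Note that the manual version needs an explicit try/finally to match what the with statement does on its own.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("first line\nsecond line\n")


def read_manually(path):
    """Manual equivalent of a with block: close the file in finally."""
    f = open(path)
    try:
        lines = f.readlines()
    finally:
        f.close()  # runs even if readlines() raises
    return lines, f


def read_with_context_manager(path):
    """Same behavior, with the close handled by the with statement."""
    with open(path) as f:
        lines = f.readlines()
    return lines, f


manual_lines, manual_file = read_manually(path)
with_lines, with_file = read_with_context_manager(path)

# The file is also closed when the body of the with block raises.
try:
    with open(path) as failing_file:
        raise ValueError("something went wrong while processing")
except ValueError:
    pass

print(manual_lines == with_lines)
print(manual_file.closed, with_file.closed, failing_file.closed)
```

Both helpers return the same lines and leave the file closed; the with version simply needs less ceremony to get there.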
Writing Your Own Context Manager
To illustrate how to write your own, we’ll use a practical example. I recently ran into an issue while web scraping with Selenium where I was unable to open a new URL in an existing browser. The solution was to open a new browser to visit the URL and close it after I’ve scraped the data. There are two ways to do this. We’ll start by defining a class that uses the __enter__ and __exit__ dunder methods.
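# imports
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# chrome_driver_path is assumed to be defined elsewhere
# (the path to your chromedriver executable)

# define context manager class
class OpenBrowser():
    def __enter__(self):
        # Selenium 4 takes the driver path through a Service object
        self.driver = webdriver.Chrome(
            service=Service(executable_path=chrome_driver_path))
        return self.driver

    def __exit__(self, exception_type, exception_value,
                 exception_traceback):
        # quit() closes every window and ends the WebDriver session
        self.driver.quit()

# use context manager
with OpenBrowser() as driver:
    driver.get("http://www.python.org")
    # Selenium 4 replaced find_element_by_name with find_element(By.NAME, ...)
    elem = driver.find_element(By.NAME, "q")
    elem.clear()
    elem.send_keys("pycon")
    elem.send_keys(Keys.RETURN)
    html = driver.page_source

print(html)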
This looks complicated, but it’s not. You define a class with __enter__ and __exit__ methods, which define what happens when the with statement is entered and what happens when the indented block is exited, respectively. The arguments to __exit__ are fixed: it always receives the exception type, value, and traceback, all of which are None if the block finished without an error. Returning a value from __enter__ is optional; whatever it returns is bound to the name after as. Now it’s clear to readers that the driver only exists to search python.org for pycon and grab the resulting html. No matter what happens, the browser will close.
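If you want to see this protocol without installing Selenium, the following is a small sketch. ManagedResource is a hypothetical stand-in for the browser that only records which methods were called and what __exit__ received.

```python
class ManagedResource:
    """Hypothetical stand-in for a browser; it only records what happens."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # bound to the name after "as"

    def __exit__(self, exception_type, exception_value, exception_traceback):
        # All three arguments are None when the block finishes normally.
        self.events.append(("exit", exception_type))
        return False  # a falsy return lets any exception propagate


# Normal case: enter, body, exit.
with ManagedResource() as ok:
    ok.events.append("body")

# Failing case: __exit__ still runs and receives the exception type.
try:
    with ManagedResource() as failed:
        raise RuntimeError("scrape failed")
except RuntimeError:
    pass

print(ok.events)      # ['enter', 'body', ('exit', None)]
print(failed.events)  # ['enter', ('exit', <class 'RuntimeError'>)]
```

In the failing case __exit__ is still called, which is why cleanup placed there is reliable.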
This syntax is not too bad, but there is an easier way. The contextlib module provides a decorator that takes care of the boilerplate and replaces the class with a function. Let’s see how the above example would look using the decorator:
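# imports
import contextlib

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# chrome_driver_path is assumed to be defined elsewhere
# (the path to your chromedriver executable)

@contextlib.contextmanager
def open_browser():
    # Selenium 4 takes the driver path through a Service object
    driver = webdriver.Chrome(
        service=Service(executable_path=chrome_driver_path))
    try:
        yield driver
    finally:
        # runs even if the with block raises
        driver.quit()

with open_browser() as driver:
    driver.get("http://www.python.org")
    # Selenium 4 replaced find_element_by_name with find_element(By.NAME, ...)
    elem = driver.find_element(By.NAME, "q")
    elem.clear()
    elem.send_keys("pycon")
    elem.send_keys(Keys.RETURN)
    html = driver.page_source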
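Here, we just import contextlib and decorate a function with contextlib.contextmanager. The function must contain exactly one yield statement. The yield marks the boundary between setup and cleanup: everything before it plays the role of __enter__, and everything after it plays the role of __exit__. Yielding a value is optional; if you do, it is bound to the name after as. One caveat: if the body of the with block raises, the exception is re-raised at the yield, so cleanup code is only guaranteed to run when the yield is wrapped in try/finally.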
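The role of the yield, and what happens to the code after it when the with block fails, can be checked with a Selenium-free sketch. The two generator functions and the events list below are invented for this illustration.

```python
import contextlib

events = []


@contextlib.contextmanager
def without_finally():
    events.append("open A")
    yield "A"
    events.append("close A")  # skipped if the with body raises


@contextlib.contextmanager
def with_finally():
    events.append("open B")
    try:
        yield "B"
    finally:
        events.append("close B")  # runs whether or not the body raises


# Raise inside each with block and see which cleanup code ran.
for manager in (without_finally, with_finally):
    try:
        with manager() as name:
            raise RuntimeError(f"failure while using {name}")
    except RuntimeError:
        pass

print(events)  # ['open A', 'open B', 'close B']
```

Only the version with try/finally records its cleanup step, so that is the shape to reach for whenever the cleanup must not be skipped.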
Conclusion
I hope I’ve not only explained context managers clearly, but also convinced you that they are a valuable addition to your toolbox. Even as a data scientist, I have found many cases where they came in handy. Feel free to leave any questions or comments. Happy coding!