How and When People use the Public Library

Checkout trends over the years from the Seattle Public Library.

Yorgos Askalidis
Towards Data Science

Update January 6th, 2018: The folks at the Seattle Public Library pointed out that I used an outdated inventory dataset to produce the list published in the older version of this post. As a result, my analysis ignored all items added to the library since September 2017. I’ve updated the dataset, and the analysis below is now up to date. Huge thanks to David from SPL for finding the error!

What day and time are busiest for checkouts from the public library? Do checkouts peak in January with reading resolutions or in the summer for beach vacations?

Image from pexels.com

Public libraries are one of society’s great institutions. They provide an opportunity for anyone with an appetite to read, learn and socialize with their community.

What, how and how much people check out from a public library can provide an interesting lens on how the community is feeling and behaving.

The Data

I used this dataset as well as the original source of the data to explore checkouts from the Seattle Public Library. The final dataset I used includes all checkouts of items from 2006 up to Dec 26, 2018. Besides the checkout records, the dataset includes a detailed inventory and a data dictionary, which translate checkout records back into physical item title, type and author.

In case you missed it, I used the same dataset to create a ‘Most Checked Out’ items list for every year from 2006 to 2018. You can find that post here.

Methodology

All the checkouts together (from 2006 up to 2018) come to about 67M records, too many to load all at once for analysis (especially since the records are accompanied by their metadata). The files come separated by year, so one can load each year’s dataset one by one, but I wanted to be able to analyze all years at once. The solution I used was to take a random 10% sample from each year’s checkout records and create one file with all these samples. That left me with one representative dataset of about 6.7M records across all available years.
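A minimal sketch of how such a per-year sample could be built without ever loading a full year into memory. This is not necessarily the code used for this post: the function name, the file layout (one CSV per year with a shared header row) and the fixed seed are all assumptions made for illustration. Each row is kept independently with probability 0.10, so every year contributes roughly, not exactly, 10% of its records.

```python
import csv
import random


def build_sample(year_files, out_path, fraction=0.10, seed=0):
    """Stream each yearly CSV and write a random subset of rows to one file.

    Assumes (hypothetically) that every yearly file has the same header row.
    Returns the number of data rows written.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    kept = 0
    header_written = False
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in year_files:
            with open(path, newline="") as f:
                reader = csv.reader(f)
                header = next(reader)
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    # Keep each row independently with probability `fraction`.
                    if rng.random() < fraction:
                        writer.writerow(row)
                        kept += 1
    return kept
```

Because the same fraction is applied to every year, yearly totals estimated from the sample can be scaled back up by dividing by the fraction.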

I also created a dataset with all the unique items and how many times they were checked out (this was created from the entire dataset, not the sample). I used this dataset to produce lists such as the most checked-out items each year.

Let’s dive into the analysis and the results.
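Counting checkouts per item does not require holding the records themselves in memory, only one running counter per unique item, which is why it can be done on the full dataset. A rough sketch, assuming one CSV per year and an item identifier column named `BibNumber` (the column name and both function names are assumptions, not confirmed details of the dataset):

```python
import csv
from collections import Counter


def count_checkouts_by_year(year_files, item_column="BibNumber"):
    """Return {year: Counter({item_id: checkouts})}, streaming each file.

    `year_files` maps a year to the path of that year's checkout CSV.
    """
    counts = {}
    for year, path in year_files.items():
        yearly = Counter()
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                yearly[row[item_column]] += 1
        counts[year] = yearly
    return counts


def most_checked_out(counts, year, n=10):
    """Return the `n` most checked-out (item_id, checkouts) pairs for a year."""
    return counts[year].most_common(n)
```

The item identifiers would then be joined against the inventory to recover titles, types and authors.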

Checkouts across the years

We observe a generally steady trend in the number of book checkouts (not including renewals), with around 4M checkouts a year.

Audio/Visual checkouts peaked in 2009 and have followed a slow but steady decline since then. This might be because of the rise to prominence of streaming services (like Netflix for TV and Spotify for music) that provide easy and (kinda) affordable access to a vast catalog of TV and music on demand and with no waiting time.

Since this dataset covers physical checkouts only, I wonder whether book checkouts remain steady while more and more users check out e-books.

Checkout Temporal Trends

Let’s explore what types of trends people exhibit when they check out items from the library. I explored trends for time of day, day of the week and month of the year.

Time of Day

Let’s start with the most micro of the temporal trends, time of day. The Seattle Public Library is mostly open from 10am to 8pm. It closes at 6pm on Friday and Saturday and has reduced hours on Sunday (mostly 1pm-5pm).

We see in the data that 4pm is the most popular hour for checkouts. This is true for both Books and DVD/CDs. We also see that even though this trend changed slightly over the years, 4pm has been the most popular checkout hour in every year of the dataset. The other hours between 1pm and 5pm are not far behind, though.
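The hourly breakdown boils down to parsing each checkout timestamp and counting by hour. A small sketch, assuming timestamps are strings shaped like `02/13/2006 04:08:00 PM` (that format, and both function names, are assumptions for illustration):

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout; adjust if the real data differs.
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def checkouts_by_hour(timestamps):
    """Count checkouts per hour of day (0-23)."""
    hours = Counter()
    for ts in timestamps:
        hours[datetime.strptime(ts, TIMESTAMP_FORMAT).hour] += 1
    return hours


def busiest_hour(timestamps):
    """Return the hour of day with the most checkouts."""
    return checkouts_by_hour(timestamps).most_common(1)[0][0]
```

Running the same count separately per item type, or per year, gives the other views discussed in this section.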

We also see that the trend for audio/visual items (DVDs and CDs) is almost identical to that of Books. This observation remains true even when we look at day of week and month of year trends (see below), and it indicates that people don’t make separate trips to the library for their book and DVD/CD needs.

Figure 1. Checkouts by Time of Day. On the far right plot, darker lines represent more recent years.

Day of Week

Having seen the time of day trends, let’s look at the day of week. As I mentioned above, the Seattle Public Library is open every day, with slightly reduced hours on Friday and Saturday and heavily reduced hours on Sunday.

We see that Saturday is the most popular day for checkouts, for both Books and DVD/CDs. And again, even though there was a trend change in 2009, Saturday remained the most popular day in every year of the dataset. Furthermore, Friday and Sunday are the least popular days.

When we look at the trend across the years, we observe a shift around 2009–2010, with Friday becoming substantially less popular. Could this have been caused by a change in the library’s opening hours? It looks like Monday and Tuesday picked up the checkouts that were “dropped” by Friday after the trend change, with Saturday remaining relatively stable.
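To compare years with different total volumes, it helps to look at each weekday’s share of that year’s checkouts rather than raw counts; a shift like Friday’s then shows up as a drop in its share. A sketch under the same assumed timestamp format as before (`02/13/2006 04:08:00 PM`); the function name is hypothetical:

```python
from collections import Counter, defaultdict
from datetime import datetime

# Assumed timestamp layout; adjust if the real data differs.
TIMESTAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_share_by_year(timestamps):
    """Return {year: {weekday: fraction of that year's checkouts}}."""
    counts = defaultdict(Counter)
    for ts in timestamps:
        dt = datetime.strptime(ts, TIMESTAMP_FORMAT)
        # datetime.weekday() returns 0 for Monday through 6 for Sunday.
        counts[dt.year][WEEKDAYS[dt.weekday()]] += 1
    shares = {}
    for year, by_day in counts.items():
        total = sum(by_day.values())
        shares[year] = {day: by_day[day] / total for day in WEEKDAYS}
    return shares
```

Comparing, say, the Friday share before and after 2009 would quantify the shift, and comparing the Monday and Tuesday shares would show where those checkouts went.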

Figure 2. Checkouts by Day of Week. On the far right plot, darker lines represent more recent years.

Month of Year

Finally, let’s look at the month of year temporal trends. Do checkouts peak in the summer when people go on vacations and need a good book to read or maybe around the new year when people start new resolutions?

It seems to be a little of both, although there is no major trend across the months: the highest value comes in July and the lowest in February and September. Adjusting for the fact that not all months have the same number of days flattens the trend even further. Nevertheless, the beginning of the year and the summer seem to be the most popular times for checkouts, with September and December the least popular.

Again, we see that the trend is mostly similar for both DVD/CDs and books, but book checkouts fall much more in December than checkouts of audio-visual items do. Conversely, books over-perform in the summer months compared to audio-visual items.
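The adjustment for month length amounts to dividing each month’s checkouts by the number of calendar days in that month. A minimal sketch; the function name and input shape are assumptions, and the numbers in the usage note are made up for illustration, not taken from the dataset:

```python
import calendar
from collections import defaultdict


def checkouts_per_day_by_month(monthly_counts):
    """Average checkouts per calendar day for each month of the year.

    `monthly_counts` maps (year, month) to a checkout count. Counts and
    days are pooled across years, so leap-year Februaries are handled.
    """
    totals = defaultdict(int)
    days = defaultdict(int)
    for (year, month), count in monthly_counts.items():
        totals[month] += count
        # monthrange returns (weekday of first day, number of days in month).
        days[month] += calendar.monthrange(year, month)[1]
    return {month: totals[month] / days[month] for month in sorted(totals)}
```

For example, a hypothetical 310 checkouts in January 2018 and 280 in February 2018 look like a 10% drop in raw counts but are the same 10 checkouts per day once month length is accounted for. This uses calendar days, so it does not correct for days on which the library was closed.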

Figure 3. Checkouts by Month of Year. On the far right plot, darker lines represent more recent years.

So there you have it: 4pm, Saturdays, and January and July are the busiest times for checkouts from the Seattle Public Library, for the most part for both Books and DVD/CDs. The trends also mostly persisted throughout all the years in the dataset (2006–2018), although we observed changes such as a drop in checkouts on Fridays from 2009 on.

As I mentioned above, this dataset covers physical items only, and undoubtedly there is a shift towards more digital items, especially in a tech-savvy city like Seattle. It would be very interesting to see the digital checkouts too in order to get a holistic understanding of the trends.

This data is also only for the Seattle Public Library. Do you know of any other public libraries that open-source their data? It would be interesting to investigate whether trends differ by city.

Thanks for reading.

You can find my story on the most checked-out items from the Seattle Public Library (using the same data source) here.

Data Scientist at Instagram NYC. Previously at Spotify. Ask me about data, soccer, or data about soccer (or anything else).