Acquiring and Analyzing Data from analytics.usa.gov with Python and Tableau

Data freely available on analytics.usa.gov provides insights into how people interact with the U.S. government online. This article shows you how to access the site’s reports through downloads and with a Python program and API.

What is analytics.usa.gov?

analytics.usa.gov provides data about how people interact with the United States government’s online presence. The Digital Analytics Program (DAP) of the General Services Administration (GSA) operates the site. Its data provides insights into how the public accesses about 5,700 government websites hosted on roughly 400 executive branch domains.

The U.S. government's analytics.usa.gov website provides public reports about access, by the public, to about 5,700 public-facing government websites. Image captured by the author.

This article describes the government website usage data that members of the public can download from analytics.usa.gov. It also shows how to retrieve the data as JSON streams with a Python program and API. Finally, it presents several Tableau Public dashboards used to analyze a subset of the data.

Available Datasets

Users can download the datasets listed below as CSV or JSON files. They can also retrieve the data by calling a beta API. A later section of this article demonstrates calling the API from a Python program.

Visits and Traffic Sources for Participating Agencies

Reports Updated Daily

  • Visits to all domains in the past 30 days
  • Top downloads yesterday
  • Top traffic sources in the past 30 days
  • Top exit pages in the past 30 days

Reports Updated Every 5 Minutes

  • All pages people are visiting at a point in time
  • Total people online at a point in time

Visitor Demographics for Participating Agencies

Reports Updated Daily

  • Language
  • Desktop/mobile/tablet
  • Web browsers
  • Versions of Internet Explorer
  • Operating systems
  • Versions of Windows
  • OS & browser (combined)
  • Windows & browser (combined)
  • Windows & IE (combined)
  • Screen sizes
  • Device model

Reports Updated Every 5 Minutes

  • Visitors per country
  • Visitors per city

Using Python and the API to Access Data

DAP has developed the analytics.usa.gov API to retrieve reports programmatically as JSON streams. Although the API is still in beta, it worked well in my simple test. To use the API, register for a key by submitting a request form, as shown below, with your name and email address. The form returns the API key immediately when you press the [Signup] button.

Programmers must apply for an API key to use the analytics.usa.gov API to retrieve reports. Image captured by the author.

The file openapi.yaml documents the API specification. YAML stands for YAML Ain’t Markup Language. Open the YAML file with a code editor or text editor.
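Once you have a key, a request URL can be put together along the lines of the sketch below. The base URL, the /reports/{name}/data path, the api_key query parameter, the report name os-browser, and the environment variable name are all assumptions made for illustration; confirm them against openapi.yaml before relying on them.

```python
import os
from urllib.parse import quote, urlencode

# Assumed base URL; verify it against openapi.yaml.
BASE_URL = "https://api.gsa.gov/analytics/dap/v1.1"


def build_report_url(report_name, api_key):
    """Return the URL for one report, passing the key as a query parameter."""
    path = f"/reports/{quote(report_name, safe='')}/data"
    return f"{BASE_URL}{path}?{urlencode({'api_key': api_key})}"


# Read the key from a hypothetical environment variable instead of
# hard-coding it in the source file.
api_key = os.environ.get("ANALYTICS_USA_GOV_API_KEY", "YOUR_API_KEY")
print(build_report_url("os-browser", api_key))
```

Keeping the key in an environment variable is one way to avoid committing it to a shared repository.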

Programming Environment

I used these free tools to code and execute the program on my Windows 10 PC:

  • Python 3.9.2 – Any release of Python 3 will probably work.
  • Microsoft Visual Studio Community for Windows – I do most of my coding with Visual Studio. But any editor or integrated development environment (IDE) that supports Python should work. Note that the Macintosh version of Visual Studio does not support Python.

Code Logic

The example program shown below performs these steps to call the API, download a report as a JSON stream, and save it to a file:

  1. Call the constructor of class c_analytics_usa_gov_api. Specify the desired report name and JSON output file name.
  2. Open the JSON output file for writing.
  3. Call the API to retrieve the report.
  4. Write the report data stream to the JSON file.
  5. Close the JSON file.

Python Modules

The program uses the requests and json Python modules. The json module is part of Python's standard library, so only requests needs to be installed. To install it, run this command from a command line or, if supported, within your development tool:

  • pip install requests

Controller Module

The controller module file call_analytics_usa_gov_api.py is the program entry point. Note the call to c_analytics_usa_gov_api().

Class c_analytics_usa_gov_api

Class c_analytics_usa_gov_api is instantiated by the controller. It performs the steps listed in the Code Logic section above.
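A minimal sketch of how such a class might look follows. It is not the article's original listing: the endpoint pattern, the api_key parameter, the method names, and the optional fetch argument (which lets the class be exercised without network access) are assumptions made for illustration.

```python
import json


class c_analytics_usa_gov_api:
    """Sketch: retrieve one report from the API and save it as a JSON file."""

    # Assumed endpoint pattern; verify it against openapi.yaml.
    URL_TEMPLATE = "https://api.gsa.gov/analytics/dap/v1.1/reports/{report}/data"

    def __init__(self, report_name, json_file_name, api_key, fetch=None):
        self.report_name = report_name
        self.json_file_name = json_file_name
        self.api_key = api_key
        # Use the caller's fetch function if given, else the requests-based one.
        self._fetch = fetch or self._fetch_with_requests
        self.get_and_save_report()

    def _fetch_with_requests(self, url, params):
        # Imported here so the rest of the class works without requests installed.
        import requests

        response = requests.get(url, params=params, timeout=30)
        response.raise_for_status()  # raise an exception on HTTP errors
        return response.json()

    def get_and_save_report(self):
        url = self.URL_TEMPLATE.format(report=self.report_name)
        data = self._fetch(url, {"api_key": self.api_key})
        # The with statement closes the file even if writing fails.
        with open(self.json_file_name, "w", encoding="utf-8") as json_file:
            json.dump(data, json_file, indent=2)
```

A controller could then call, for example, c_analytics_usa_gov_api("os-browser", "os_browser.json", api_key="YOUR_API_KEY"), where the report name is again an assumption. This sketch retrieves the report before opening the output file, so a failed request does not leave an empty file behind.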

Data Use Ideas

While the data available on analytics.usa.gov may be most helpful to government developers and department administrators, these two uses came to mind for me:

  • Identify the most popular operating systems and browsers that access U.S. government websites. Since these sites draw such a large audience, the numbers may approximate operating system and browser use across the U.S. in general.
  • People who work in specific industries (accounting or tax processing, for example) may find the Top downloads yesterday report helpful for analyzing frequently accessed government services.

Sample Data Analytics Dashboards

To demonstrate the use of the reports on analytics.usa.gov, I developed several dashboards with the free Tableau Public business intelligence and data visualization tool. The dashboards’ data sources are the OS & browser (combined) CSV and the Desktop/mobile/tablet CSV reports. Since the structures of CSV files are tabular, they are easy to work with in Tableau.
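Because the files are tabular, the same kind of summary can also be sketched in Python with the standard csv module. The column names (date, os, browser, visits) are an assumption about the report's layout, and the rows are made-up placeholders, not real figures.

```python
import csv
import io
from collections import Counter

# Illustrative rows only, not real data. The column names are an
# assumption about the layout of the OS & browser (combined) CSV.
SAMPLE_CSV = """date,os,browser,visits
2021-01-01,iOS,Safari,120
2021-01-01,Windows,Chrome,100
2021-01-01,Android,Chrome,80
2021-01-02,iOS,Safari,110
"""


def visits_by(rows, *columns):
    """Sum the visits column for each distinct combination of the given columns."""
    totals = Counter()
    for row in rows:
        key = tuple(row[column] for column in columns)
        totals[key] += int(row["visits"])
    return totals


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Print operating systems from most to fewest visits.
for (os_name,), visits in visits_by(rows, "os").most_common():
    print(os_name, visits)
```

Calling visits_by(rows, "os", "browser") would group by the combination instead. To work with a downloaded file, replace io.StringIO(SAMPLE_CSV) with an open file handle.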

I was surprised to learn that iOS (iPhones), as shown below, is the number one operating system used to access government sites.

This interactive Tableau Public dashboard shows access to government websites by operating system and web browser. Dashboard created by and image captured by the author.

The dashboard below shows that the top combinations of browsers and operating systems that access government websites are Safari and iOS, Chrome and Windows, and Chrome and Android.

This Tableau Public dashboard shows access to government websites by operating system and web browser combinations. Dashboard created by and image captured by the author.

As shown in the dashboard below, access to government sites by mobile devices (smartphones) exceeds access by desktop computers. This suggests that web developers should design their sites so that pages display as well on mobile devices as they do on desktops.

This interactive Tableau Public dashboard shows access to government websites by desktop PCs, mobile devices (smartphones), and tablets. Dashboard created by and image captured by the author.

Putting it All Together

As you now know, analytics.usa.gov provides a variety of reports about access by the public to about 5,700 U.S. government websites. Most reports are available for download in CSV or JSON formats.

The analytics.usa.gov API enables programs written in Python or other languages to download JSON reports easily. It also supports filtering by agencies.

Since U.S. government websites see so much traffic, the reports available on analytics.usa.gov may provide insights to web and application developers. With the data, which could represent the U.S. as a whole, they can identify widely used operating systems, browsers, and platforms. Maybe you can think of other uses for the data to promote understanding and support decision-making.

About the Author

Randy Runtsch is a data analyst, software developer, writer, photographer, cyclist, and adventurer. He and his wife live in southeastern Minnesota, U.S.A.

Watch for Randy’s upcoming articles on public datasets to drive Data Analytics insights and decision-making, programming, data analytics, photography, bicycle touring, and more. You can see some of his photographs at shootproof.com and shutterstock.com.

