Publish AI, ML & data-science insights to a global community of data professionals.

Harvest and Analyze Agricultural Data with the USDA NASS API, Python, and Tableau

This tutorial swill show you how

Data Engineering, Data Analytics, Agriculture, and Tableau

Photo by Scott Goodwill on Unsplash
Photo by Scott Goodwill on Unsplash

The United States is blessed with fertile soil and a huge agricultural industry. To put its scale into perspective, in 2021, more than 2 million farms operated on more than 900 million acres (364 million hectares). That is an average of nearly 450 acres per farm operation. And data scientists, analysts, engineers, and any member of the public can freely tap more than 46 million records of farm-related data managed by the U.S. Department of Agriculture (USDA).

This article "will show you how to use Python to retrieve agricultural data with the NASS Quick Stats API." – The author

The USDA’s National Agricultural Statistics Service (NASS) makes the department’s farm agricultural data available to the public on its website through reports, maps, search tools, and its NASS Quick Stats API. This article will provide you with an overview of the data available on the NASS web pages. Then, it will show you how to use Python to retrieve agricultural data with the NASS Quick Stats API. Finally, it will explain how to use Tableau Public to visualize the data.

NASS Data Overview

NASS collects and manages diverse types of agricultural data at the national, state, and county levels. The following are some of the types of data it stores and makes available:

  • Crops and Plants
  • Demographics
  • Economic and Prices
  • Environmental
  • Livestock and Animals
  • Research, Science, and Technology

NASS makes data available through CSV and PDF files, charts and maps, a searchable database, pre-defined queries, and the Quick Stats API. One way it collects data is through the Census of Agriculture, which surveys all agricultural operations with $1,000 or more of products raised or sold during the census year. The census takes place once every five years, with the next one to be completed in 2022.

Searching NASS Datasets

Quick Stats

NASS makes it easy for anyone to retrieve most of the data it captures through its Quick Stats database search web page. In this example shown below, I used Quick Stats to build a query to retrieve the number of acres of corn harvested in the US from 2000 through 2021. I built the queries simply by selecting one or more items from each of a series of dynamic dropdown menus.

Tip: Click on the images to view full-sized and readable versions.

Creating a corn query with Quick Stats. Screenshot by Randy Runtsch.
Creating a corn query with Quick Stats. Screenshot by Randy Runtsch.
Data resulting from the corn query in Quick Stats. Screenshot by Randy Runtsch.
Data resulting from the corn query in Quick Stats. Screenshot by Randy Runtsch.

Corn production data goes back to 1866, just one year after the end of the American Civil War. Not all NASS data goes back that far, though.

NASS has also developed Quick Stats Lite search tool to search commodities in its database. While it does not access all the data available through Quick Stats, you may find it easier to use.

Census Data Query Tool

While Quick Stats and Quick Stats Lite retrieve agricultural survey data (collected annually) and census data (collected every five years), the Census Data Query Tool is easier to use but retrieves only census data. It can return data for the 2012 and 2017 censuses at the national, state, and local level for 77 different tables. Each table includes diverse types of data.

In the example shown below, I selected census table 1 – Historical Highlights for the state of Minnesota from the 2017 Census of Agriculture. The report shows that, for the 2017 census, Minnesota had 68,822 farm operations covering 25,516,982 acres.

Farm operations in Minnesota. Screenshot by Randy Runtsch.
Farm operations in Minnesota. Screenshot by Randy Runtsch.

Introduction to the NASS Quick Stats API

Now that you have a basic understanding of the data available in the NASS database, you can learn how to reap its benefits in your projects with the NASS Quick Stats API. With the Quick Stats application programming interface (API), you can use a programming language, such as Python, to retrieve data from the Quick Stats database. In fact, you can use the API to retrieve the same data available through the Quick Stats search tool and the Census Data Query Tool, both of which are described above.

Before you get started with the Quick Stats API, become familiar with its Terms of Service and Usage. Before using the API, you will need to request a free API key that your program will include with every call using the API.

From the Quick Stats API menu, you can access everything you need to get started. Screenshot by Randy Runtsch.
From the Quick Stats API menu, you can access everything you need to get started. Screenshot by Randy Runtsch.

The API Usage page provides instructions for its use. While there are three types of API queries, this tutorial focuses on what may be the most flexible, which is the "GET /api/api_GET" query. It accepts a combination of "what," "where," and "when" parameters to search for and retrieve the data of interest.

While the Quick Stats database contains more than 52 million records, any call using GET /api/api_GET query is limited to a 50,000-record result set. Queries that would return more records return an error and will not continue.

Building a Sample Quick Stats API URL

Here is the format of the base URL that will be used in this article’s example:

_http://quickstats.nass.usda.gov/api/api_GET/?key=api key&{parameter … parameter}&format={json | csv | xml}_

See the Quick Stats API "Usage" page for this URL and two others.

In the example program, the value for "api key" will be replaced with my API key. Also, the parameter values be replaced with specific parameter-value pairs to search for the desired data. Finally, format will be set to "csv," which is a data file format type that works well in Tableau Public. The use of a callback function parameter, not shown in the example above, is beyond the scope of this article.

The example Python program shown in the next section will call the Quick Stats with a series of parameters. After it receives the data from the server in CSV format, it will write the data to a file with one record per line. That file will then be imported into Tableau Public to display visualizations about the data.

The program will use the API to retrieve the number of acres used for each commodity (a crop, such as corn or soybeans), on a national level, from 1997 through 2021. Here are the pairs of parameters and values that it will submit in the API call to retrieve that data:

  • Source_desc = SURVEY
  • Sector_desc = FARMS & LANDS & ASSETS
  • Commodity_desc = FARM OPERATIONS
  • Unit_desc = ACRES
  • Freq_desc = ANNUAL
  • Reference_period_desc = YEAR
  • Year__GE = 1997 (all years greater than or equal to 1997)
  • State_name = US TOTAL
  • Format = CSV

Following is the full encoded URL that the program below creates and sends with the Quick Stats API. Note that the value "PASTE_YOUR_API_KEY_HERE" must be replaced with your personal API key.

_http://quickstats.nass.usda.gov/api/api_GET/?key=PASTE_YOUR_API_KEY_HERE&source_desc=SURVEY&sector_desc%3DFARMS%20%26%20LANDS%20%26%20ASSETS&commodity_desc%3DFARM%20OPERATIONS&statisticcat_desc%3DAREA%20OPERATED&unit_desc=ACRES&freq_desc=ANNUAL&reference_period_desc=YEAR&year__GE=1997&agg_level_desc=NATIONAL&state_name%3DUS%20TOTAL&format=CSV_

Each parameter is described on the Quick Stats Usage page, in its Quick Stats Columns Definition table, as shown below. For most Column or Header Name values, the first value, in lowercase, is the API parameter name, like those shown above. The name in parentheses is the name for the same value used in the Quick Stats query tool. For example, in the list of API parameters shown above, the parameter "source_desc" equates to "Program" in the Quick Stats query tool.

The Quick Stats Column Definitions table on the Quick Stats API Usage page names and describes each parameter available for queries and results. Screenshot by the author.
The Quick Stats Column Definitions table on the Quick Stats API Usage page names and describes each parameter available for queries and results. Screenshot by the author.

The Python Program to Call the Quick Stats API

Python Programming Tool

While I used the free Microsoft Visual Studio Community 2022 integrated development ide (IDE) to write and run the Python program for this tutorial, feel free to use your favorite code editor or IDE. If you are interested in trying Visual Studio Community, you can install it here. If you use it, be sure to install its Python Application support. Language feature sets can be added at any time after you install Visual Studio.

Python Code

The Python program that calls the NASS Quick Stats API to retrieve agricultural data includes these two code modules (files):

  • c_usad_quick_stats.py – This file contains the c_usda_quick_stats class that encapsulates code needed to use the Quick Status API. Its get_data() function calls the API, retrieves the resulting dataset, and writes the dataset to a CSV file.
  • run_usda_quick_stats.py – This file prepares the parameters for the call to the Quick Stats API. Then it creates an instance of the c_usa_quick_stats class. Finally, it calls the instance with the parameters and the name of the file where the resulting agricultural data CSV file will be saved.

Scroll down to see the code from the two modules. The following pseudocode describes how the program works:

  1. In file run_usda_quick_stats.py create the parameters variable that contains parameter and value pairs to select data from the Quick Stats database.
  2. Create an instance called stats of the c_usda_quick_stats class. This will call its initializer (init()) function, which sets the API key, the base URL for the Quick Stats API, and the name of the folder where the class will write the output CSV file that contains agricultural data.
  3. Call the function stats.get_data() with the parameters string and the name of the output file (without the extension).
  4. In the get_data() function of c_usd_quick_stats, create the full URL.
  5. Retrieve the data from the Quick Stats server.
  6. Decode the data Quick Stats data in utf8 format.
  7. Create the full output file name.
  8. Open the output file.
  9. Write the CSV data to the output file.
  10. Close the output file.

Note the use of the urllib.parse.quote() function in the creation of the parameters string in step 1. This function replaces spaces and special characters in text with escape codes that can be passed, as part of the full URL, to the Quick Stats web server.

Also note that I wrote this program on a Windows PC, which uses back slashes ("") in file names and folder names. So, you may need to change the format of the file path value if you will run the code on Mac OS or Linux, for example: self.output_file_path = r’c:usda_quickstats_files’.

Here are the two Python modules that retrieve agricultural data with the Quick Stats API:

Run the Program

To run the program, you will need to install the Python requests and urllib packages. In some environments you can do this with the PIP INSTALL utility. If you use Visual Studio, you can install them through the IDE’s menu by following these instructions from Microsoft.

Also, before running the program, create the folder specified in the self.output_file_path variable in the init() function of the c_usda_quick_stats class. You can change the value of the path name as you would like as well.

If you are using Visual Studio, then set the Startup File to the file run_usda_quick_stats.py. Do this by right-clicking on the file name in Solution Explorer and then clicking [Set as Startup File] from the popup menu. Then, when you click [Run], it will start running the program with this file first.

Remember to request your personal Quick Stats API key and paste it into the value for self.api_key in the init() function in the c_usda_quick_stats class.

After you have completed the steps listed above, run the program. If all works well, then it should be completed within a few seconds and it will write the specified CSV file to the output folder.

Sample Tableau Dashboard

To demonstrate the use of the agricultural data obtained with the Quick Stats API, I have created a simple dashboard in Tableau Public. Tableau Public is a free version of the commercial Tableau data visualization tool. Its main limitations are 1) it can save visualization projects only to the Tableau Public Server, 2) all visualization projects are visible to anyone in the world, and 3) it can handle only a small number of input data types.

The first of two worksheets in the Tableau dashboard created for this article. Using data retrieved with the Quick Stats API, it shows the acreage used, by year, for commodities (crops - corn is shown in this example) grown on US farms. Tableau dashboard created by Randy Runtsch.
The first of two worksheets in the Tableau dashboard created for this article. Using data retrieved with the Quick Stats API, it shows the acreage used, by year, for commodities (crops – corn is shown in this example) grown on US farms. Tableau dashboard created by Randy Runtsch.
The second of two worksheets in the Tableau dashboard created for this article. Using data retrived with the Quick Stats API, it shows the acreage used by year to grow eight commodities (crops) that consume the most farm acreage in the US. Tableau dashboard created by Randy Runtsch.
The second of two worksheets in the Tableau dashboard created for this article. Using data retrived with the Quick Stats API, it shows the acreage used by year to grow eight commodities (crops) that consume the most farm acreage in the US. Tableau dashboard created by Randy Runtsch.

The sample Tableau dashboard is called U.S. Agricultural Commodity Production by Land Area. Feel free to download it and modify it in the Tableaue Public Desktop application to learn how to create and publish Tableau visualizations.

Instructions for how to use Tableau Public are beyond the scope of this tutorial. However, here are the basic steps to install Tableau Public and build and publish the dashboard:

  1. Install Tableau Public Desktop on a Windows or Mac OS machine. If interested in learning how to use Tableau, see the free interactive training videos here.
  2. Open Tableau Public Desktop and connect it to the agricultural CSV data file retrieved with the Quick Stats API through the Python program described above.
  3. Create a worksheet that allows the user to select a commodity (corn, soybeans, selected) and view the number of acres planted or harvested from 1997 through 2021.
  4. Create a worksheet that shows the number of acres harvested for top commodities from 1997 through 2021.
  5. Add the two worksheets to a dashboard.
  6. Publish the dashboard to Tableau Server.
This screenshot from Tableau Public Desktop shows the connection to the file national_farm_survey_acres_ge_1997.csv, which was created by the Python program described in this article. The screen displays sample records from the file. Screenshot by Randy Runtsch.
This screenshot from Tableau Public Desktop shows the connection to the file national_farm_survey_acres_ge_1997.csv, which was created by the Python program described in this article. The screen displays sample records from the file. Screenshot by Randy Runtsch.
This screenshot from Tableau Public Desktop shows how the worksheet called All Commodities by Area was created. Note the names of the fields from the CSV file on the left. Screenshot by Randy Runtsch.
This screenshot from Tableau Public Desktop shows how the worksheet called All Commodities by Area was created. Note the names of the fields from the CSV file on the left. Screenshot by Randy Runtsch.

Summary

After completing this tutorial, you should have a general understanding of:

  • The types of agricultural data stored in the FDA Quick Stats database.
  • How to query the database.
  • How to write a Python program to query the Quick Stats database through the Quick Stats API.
  • How to install Tableau Public and learn about it if you want to try it to visualize agricultural data or use it for other projects.

I can imagine many use cases for projects that would use data from the Quick Stats database. Harvesting its rich datasets presents opportunities for understanding and growth.


Towards Data Science is a community publication. Submit your insights to reach our global audience and earn through the TDS Author Payment Program.

Write for TDS

Related Articles