Climate Data Science

MODIS Vegetation Indices: a GEE Approach

The case of the Xingu Indigenous Park in Brazil.

Published in

Towards Data Science

7 min readMar 20, 2020

The Moderate Resolution Imaging Spectroradiometer (MODIS) is one of the key sensors used in Earth Observation ever since it was first launched into orbit in 1999. Several reasons can be appointed for its success: the high revisit-time of 1 to 2 days, the global coverage, the wide range of applicability and its derived products. Just to name a few.

In this tutorial, you’ll focus on the MOD13Q1 product, which consists of both the Normalized Difference Vegetation Index (NDVI) and the Enhanced Vegetation Index (EVI) in a global scale at a 16 days frequency. You might already know the NDVI from the Introduction to GEE, but here you’ll also explore the usefulness of EVI for environmental analysis.

Ladies and Gentlemen, we are exploring the Earth from space. Photo by Brian McGowan on Unsplash

The Usual Suspects (and a new one)

From the GEE Introduction, you already know how to sign up and use gee, install folium for interactive visualization and geehydro to interact with the code in a similar way to the native Javascript API. If you’re interested in climate modeling and read the Introduction to CMIP6, then you must know Proplot as well. The novel package for this tutorial is ipygee, which works in a similar way as geehydro to allow the use of some functions that are only available within the GEE code editor.

To install ipygee, just use Pip:

pip install ipygee

Then you can proceed to import all the tools you’ll need in this tutorial:

import ee
import folium 
import geehydro
import numpy as np # yes, numpy!
import pandas as pd # yes, pandas!
import proplot as plot 
import matplotlib.pyplot as plt from ipygee import*
from pandas.plotting import register_matplotlib_converters

As you know, to use gee you must always trigger the authentication and initialize the library, especially if you are interested in running the code in Google Colab.

# Trigger the authentication flow.
>>> ee.Authenticate()
# Initialize the library.
>>> ee.Initialize()

Protected Areas of the World

The World Database on Protected Areas (WDPA) is the greatest database of marine and terrestrial protected areas and it’s easily available in GEE. It provides comprehensive geographic information and the boundaries of any area in the world. Here you’ll use it to get the shape of the Xingu Indigenous Park in Brazil, known as one of the first and greatest protected areas in the world.

With the EE snippet, you can load the polygon database of the WDPA and filter the area of the Xingu Park:

wdpa_polygon = ee.FeatureCollection('WCMC/WDPA/current/polygons')
# select the Xingu Indigenous Park shapefile
xingu = wdpa_polygon.filter(ee.Filter.eq('NAME', 'Parque do Xingu'))

As you already know, Folium and geehydro allow visualization of GEE data. In this case, you can plot the area of the Xingu Park and interact with it:

Map = folium.Map(location=[-12., -52.5], zoom_start=8)
>>> Map.addLayer(xingu)
>>> Map

The WPDA polygon of the Xingu Indigenous Park in Brazil.

The Xingu Park is a special place for its rich history and diversity. Before the boundaries were defined to make the Park official in 1961, the Villas-Boas brothers first went there almost by accident as part of the Roncador-Xingu expedition. As most of the country (and the government at the time) thought the region was deserted, they quickly found it was filled with hundreds of different indigenous tribes. Those encounters defined the beginning of the history of indigenous rights in Brazil.

MODIS MOD13Q1 Vegetation Indices

To load the MODIS product, simply use ee.ImageColletion() with the snippet you need. To make the analysis a bit easier, you’ll filter the data for the last 3 years.

# MODIS MOD13Q1
modis = ee.ImageCollection('MODIS/006/MOD13Q1')
modis = modis.filterDate(ee.DateRange('2016-01-01','2019-12-01'))

The MOD13Q1 product has both the NDVI and EVI available, so you can easily select them and attribute the data to different variables:

# select EVI and NDVI
evi = modis.select('EVI')
ndvi = modis.select('NDVI')

A minor issue with MODIS data is that you need to apply a scale factor according to the product you’re using to make sense of it. According to this User Guide (RTFM), the scale factor for the MOD13Q1 product is simply 0.0001. That means you’ll have to multiply each dataset by this scale factor to start using it properly.

You probably already figured out that you’ll need to write a small function to do it. In GEE dialect that means you’ll have to .map() over an ImageCollection. No unnecessary for-loops here, please. When in Rome, do as the Romans do.

Please, insert your Monty Python joke here.

def scale_factor(image):
# scale factor for the MODIS MOD13Q1 product
return image.multiply(0.0001).copyProperties(image, 
                                             ['system:time_start'])# mapping function to multiply by the scale factor
scaled_evi = evi.map(scale_factor)
scaled_ndvi = ndvi.map(scale_factor)

With the datasets properly scaled, you can finally use them within the WDPA boundaries of the Xingu Park and start your analysis. A good way to start is to plot the mean NDVI on the interactive Map.

# mean NDVI in the Xingu Park
>>> Map.addLayer(scaled_ndvi.mean().clip(xingu),
                  vis_params={'min': 0,
                              'max': 1,
                              'palette': ['red', 'yellow','green']})
>>> Map

The same code goes for the mean EVI. Thanks to Folium and geehydro it’s simple as that. While an interactive Map is cool, any python enthusiast would be interested to get a time series and squeeze out some cool information from it with data science. GEE has a native way to make time-series plotting but, unfortunately, that’s only available in the Code Editor. That’s exactly where ipygee comes in handy, as it mimics the native javascript-based ui.Chart.image.series(). Without it, we would be in trouble.

It works just the same way with the same arguments, and it even allows you to make a small widget from it when you plot.

# Xingu MOD13Q1 NDVI time series
xingu_ndvi = chart.Image.series(**{'imageCollection': scaled_ndvi,
                                   'region': xingu,
                                   'reducer': ee.Reducer.mean(),
                                   'scale': 1000,
                                   'xProperty': 'system:time_start'})xingu_ndvi.renderWidget(width='50%')

The NDVI times series in the Xingu Indigenous Park.

That could get a way better, you could probably be thinking. The amazing part from that is that ipygee also permits you to work with this time series as any Dataframe you might be familiar with. That’s where Pandas gets in the pitch. Don’t believe it?

>>> type(xingu_ndvi.dataframe)
pandas.core.frame.DataFrame

From then on, you can go from GEE to our usual Python ecosystem at ease. Think about Pandas, Seaborn or any related package you know to study time series. Thanks to ipygee you can go from both points any time you want.

That means you can plot it in the way you like it. Try it out with Proplot:

fig, ax = plot.subplots(figsize=(7, 3), tight=True)
ax.plot(xingu_ndvi.dataframe.index, xingu_ndvi.dataframe['NDVI'],
        color='forest green', marker='o')plot.show()

Again, the same can be done with the EVI time series just waiting around the corner. With a bit of Pandas, we can transform these 16-day sampled time series in a monthly time series with .groupby(). That could make them behave a bit less wildly.

# monthly averaging
xingu_evi_monthly = xingu_evi.dataframe.groupby(pd.Grouper(freq="M")).mean()xingu_ndvi_monthly = xingu_ndvi.dataframe.groupby(pd.Grouper(freq="M")).mean()

To compare the two time series, you’ll go back to the standard Matplotlib as there are some issues about sharing axis in Proplot for the time. Unfortunately, or not, you’ll have to stick with a little less elegant solution until they fix this minor problem.

# time index
time = xingu_evi_monthly.index# plot
fig, ax1 = plt.subplots(figsize=(7, 3))
ax2 = ax1.twinx()# EVI
ax1.plot(time, xingu_evi_monthly, label='EVI',
         color='muddy brown', marker='+')
# NDVI
ax2.plot(time, xingu_ndvi_monthly, label='NDVI',
         color='forest green', marker='o')ax1.set_xlabel('Time')
ax1.set_ylabel('EVI')
ax2.set_ylabel('NDVI')ax1.set_yticks(np.arange(0.3, 1.1, 0.1))
ax2.set_yticks(np.arange(0.3, 1.1, 0.1))plt.legend()
plt.tight_layout()
plt.show()

EVI and NDVI from MODIS in the Xingu Indigenous Park.

Notice the difference in magnitude but the same behavior in both time series. It’s likely that the more complex algorithm of EVI makes it resistant to atmospheric effects and allows it to reflect more robust vegetation changes. The vegetation in the Xingu Park is dominated by the Cerrado biome, which is less dense than the standard Amazon rainforest. Perhaps that’s why EVI is lower than NDVI, but that’s a guess.

Final Words

MODIS-related products are fantastic datasets that allow a wide range of environmental applications. The global, high-frequency and high-resolution MOD13Q1 dataset is extraordinarily useful to study vegetation behavior and changes. With the Python API, you can have access to many of these products in GEE and, hopefully, you learned how to explore them with this tutorial.

As usual, the interactive Jupyter Notebook with all of the code above is freely available in my Climate Data Science repository.