
Interpolation is a common mathematical concept used not only by data scientists, but also by people from a vast range of fields. However, when dealing with geospatial data, interpolation gets more complicated, since you need to create a representative grid based on several, often sparse, observations.
Before diving into the geospatial part, let’s have a brief recap on linear interpolation.
As always, you can download and run the notebook here to follow along with the tutorial.
For the purpose of demonstration I’ll use a simple cubic polynomial function:
import numpy as np

def F(x):
    return -2*x**3 + x**2 + 2.1

x = np.arange(-5, 5, 0.1)
y = F(x)

Now we can randomly sample several points [-4.2, 0, 2.5] and connect them together:
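As a quick illustration (this snippet is my own addition, not part of the original notebook), here is a minimal sketch that samples the function at those three points and connects them with np.interp:
import matplotlib.pyplot as plt

# Sample the function at the three known points
x_known = np.array([-4.2, 0, 2.5])
y_known = F(x_known)

# Linearly interpolate between the known points
x_new = np.arange(-4.2, 2.5, 0.05)
y_interp = np.interp(x_new, x_known, y_known)

plt.plot(x, y, label='true function')
plt.plot(x_new, y_interp, '--', label='linear interpolation')
plt.scatter(x_known, y_known, color='r', label='known points')
plt.legend()
plt.show()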

This is called linear interpolation, since the function is approximated by a straight line on each interval; now, knowing the function’s values at only 3 points, we can estimate its values anywhere inside the interval [-4.2; 2.5].
There are many other methods with higher accuracy, but the idea behind them is the same: find the function’s values between at least two known points.
Now it’s time to get to the geospatial part. In this tutorial, our goal will be to perform spatial interpolation of daily average air temperature measured at meteorological sites across Switzerland, provided by NOAA. The expected result is a grid of temperatures with cells of 0.1° resolution.
Firstly, we need to acquire an administrative boundary of Switzerland and visualize it using geopandas:
import geopandas as gpd

shape = gpd.read_file('gadm41_CHE_0.shp')
shape.plot()

Indeed, looks like it’s Switzerland, wow =)
Now let’s plot our temperature observations and overlay them with the country shape. To do that, let’s load the meteorological data into a regular pandas dataframe and then convert it to a geopandas one with the coordinates transformed to shapely points:
import pandas as pd
from shapely.geometry import Point

df = pd.read_csv('3639866.csv')
points = list()
for i in range(len(df)):
    point = Point(df.loc[i, 'LONGITUDE'], df.loc[i, 'LATITUDE'])
    points.append(point)
gdf = gpd.GeoDataFrame(geometry=points).set_crs(shape.crs)
After doing that we can easily overlay the two dataframes using matplotlib.
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(16, 9))
shape.plot(ax=ax, color='black')
gdf.plot(ax=ax, color='r', markersize=85)
plt.show()

To visualize our task let’s create the grid for interpolation and overlay it with the map above:
import matplotlib.ticker as mticker
import cartopy.crs as ccrs
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
LAT, LON = np.arange(45.75, 48, 0.1), np.arange(6, 10.81, 0.1)
fig, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()), figsize=(16, 9))
shape.plot(ax=ax, color='grey')
gdf.plot(ax=ax, color='r', markersize=85)
gl = ax.gridlines(draw_labels=True,linewidth=2, color='black', alpha=0.5, linestyle='--')
gl.xlocator = mticker.FixedLocator(LON)
gl.ylocator = mticker.FixedLocator(LAT)
gl.xformatter = LONGITUDE_FORMATTER
gl.yformatter = LATITUDE_FORMATTER
plt.show()

So the goal is to produce an interpolation on the regular grid illustrated above using just 8 temperature observations.
I. Nearest Neighbor (NN)
The first super intuitive and simple method is called Nearest Neighbor (NN). As you can guess from its name, the algorithm assigns the value of the nearest observation to each grid node.
To implement it we will need only two functions. The first function is called Euclidean, and it calculates the distance between two points using the following formula:

d = √((x₁ − x₂)² + (y₁ − y₂)²)
The second one is the NN method itself. After creating an empty array to store the values we iterate over all latitudes and longitudes, calculate distances from each point to the current grid node and assign the value of the closest observation to that grid node:
def Euclidean(x1, x2, y1, y2):
    return ((x1-x2)**2 + (y1-y2)**2)**0.5

def NN(data, LAT, LON):
    array = np.empty((LAT.shape[0], LON.shape[0]))
    for i, lat in enumerate(LAT):
        for j, lon in enumerate(LON):
            idx = data.apply(lambda row: Euclidean(row.LONGITUDE, lon, row.LATITUDE, lat), axis=1).argmin()
            array[i, j] = data.loc[idx, 'TAVG']
    return array
The whole idea is here, in this line:
idx = data.apply(lambda row: Euclidean(row.LONGITUDE, lon, row.LATITUDE, lat), axis = 1).argmin()
The variable data is our pandas dataframe with the meteo sites (each row represents one site). So in the for loop we calculate the distance from each site to the current grid node and find the index of the site with the minimum distance.
Now let’s run the algorithm and wrap the results into an xarray dataset:
import xarray as xr

t2m = NN(df, LAT, LON)
ds = xr.Dataset(
    {'TAVG': (['lat', 'lon'], t2m)},
    coords={'lat': LAT, 'lon': LON})
Now we can plot the results:
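One minimal way to do that (the exact styling here is my assumption, mirroring the masked plot shown later) is to use xarray’s built-in plotting:
fig, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()), figsize=(16, 9))
# Plot the interpolated grid and overlay the observation sites
ds.TAVG.plot(ax=ax, alpha=0.6)
gdf.plot(ax=ax, color='r', markersize=85)
plt.show()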

Looks nice, but let’s enhance our plot by creating a Switzerland mask using the regionmask library:
import regionmask

shape['new_column'] = 0
sw = shape.dissolve(by='new_column')['geometry']
rg = regionmask.mask_3D_geopandas(sw, lon_or_obj=ds.lon, lat=ds.lat)
fig, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()), figsize=(16, 9))
#shape.plot(ax=ax, color='black')
ds.where(rg).TAVG.plot(ax=ax, alpha=0.6)
gdf.plot(ax=ax, color='r', markersize=85)
ax.gridlines(draw_labels=True,linewidth=2, color='black', alpha=0.5, linestyle='--')
plt.show()

As you can see, this method is really only suitable for categorical data. Since we are dealing with temperature, which is a continuous variable (meaning it can take any value within a certain range), this interpolation is misleading: in real life there are always gradients and randomness.
So let’s have a look at a more advanced algorithm.
II. Inverse Distance Weighting (IDW)
Basically, Inverse Distance Weighting (IDW) is an enhanced version of NN:
def IDW(data, LAT, LON, beta=2):
    array = np.empty((LAT.shape[0], LON.shape[0]))
    for i, lat in enumerate(LAT):
        for j, lon in enumerate(LON):
            weights = data.apply(lambda row: Euclidean(row.LONGITUDE, lon, row.LATITUDE, lat)**(-beta), axis=1)
            z = sum(weights*data.TAVG)/weights.sum()
            array[i, j] = z
    return array
As you can see, instead of assigning the value of the closest known point, here we calculate weights. To do that, the aforementioned Euclidean distance is used again, but this time we raise each distance to the power of –β (β is an arbitrary positive value). These weights are basically the contribution of each ground point to a certain grid node: the greater the distance, the less that point influences the node value.
After getting the weights we calculate the weighted average:

wᵢ = 1 / dᵢ^β,   ẑ = Σ wᵢ·zᵢ / Σ wᵢ

where dᵢ is the distance from the grid node to observation i, zᵢ is the observed temperature and ẑ is the interpolated value.
Let’s plot it:
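A short sketch of how this can be done, reusing the xarray wrapping and the mask from the NN section (the variable names t2m_idw and ds_idw are my own):
t2m_idw = IDW(df, LAT, LON)
ds_idw = xr.Dataset(
    {'TAVG': (['lat', 'lon'], t2m_idw)},
    coords={'lat': LAT, 'lon': LON})

fig, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()), figsize=(16, 9))
ds_idw.where(rg).TAVG.plot(ax=ax, alpha=0.6)
gdf.plot(ax=ax, color='r', markersize=85)
plt.show()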

As you can see, the results are now much more realistic and smooth!
III. Kriging
The last method for today is Kriging. Among these three, it is the most complex one, and we will only touch upon it here. In case you want to use it consciously and effectively, consider having a look at the literature!
So the main idea of the method is the usage of a variogram (or semivariogram). In essence, a variogram quantifies how the variability of some parameter changes with distance and direction. This is exactly what we need when dealing with air temperature.
To implement the Kriging algorithms we will need two types of variograms: experimental and theoretical.
The first one is really easy to calculate. It’s defined as gamma γ:

γ(h) = (1 / N(h)) · Σ (z(xᵢ) − z(xᵢ + h))²

where h is the geographical distance between two points, z is the temperature function and N(h) is the number of point pairs separated by h. So in a nutshell it’s the average of the squared temperature differences between known points separated by distance h.
The theoretical variogram is a little bit more complicated. Firstly, there are many models to choose from; some common ones (written with the practical-range factor of 3) are:

Exponential: γ(d) = n + p·(1 − e^(−3d/r))
Gaussian: γ(d) = n + p·(1 − e^(−3d²/r²))
Spherical: γ(d) = n + p·(1.5·d/r − 0.5·(d/r)³) for d ≤ r, and n + p for d > r

where p is the partial sill, d is the distance (we used h before), n is the nugget and r is the range.
I found a really good visual explanation of these parameters at CDT Columbia, and I adapted an image from their material which illustrates the dependence between γ and distance. As you can see, it makes clear what the sill, partial sill, nugget and range are.

So the whole idea of the algorithm is to adjust the parameters of the theoretical variogram so that it fits the experimental one, and then predict the values at the grid nodes using it.
To implement the method we’ll need several extra libraries and to create a class called OrdinaryKriging.
from scipy.linalg import solve
from itertools import product
from sklearn.metrics import mean_squared_error as MSE

class OrdinaryKriging:
    def __init__(self, lats, lons, values):
        self.lats = lats
        self.lons = lons
        self.values = values
        self.nugget_values = [0, 1, 2, 3, 4]
        self.sill_values = [1, 2, 3, 4, 5]
        self.range_values = [1, 2, 3, 4, 5]
        # Generate all combinations of parameter values to fit
        self.parameter_combinations = list(product(self.nugget_values, self.sill_values, self.range_values))
        self.optimal_pars = None

    def theoretical_variogram(self, h, nugget, sill, r):
        return nugget + (sill-nugget) * (1-np.exp(-3*h/r))

    def Euclidean(self, X, Y):
        all_dists, point_dists = [], []
        for x, y in zip(X, Y):
            for k in range(len(X)):
                h = np.linalg.norm(np.array([x, y]) - np.array([X[k], Y[k]]))
                point_dists.append(h)
            all_dists.append(point_dists)
            point_dists = []
        # Return an (n, n) numpy array so boolean masking works in gamma()
        return np.array(all_dists)

    def gamma(self):
        distances = self.Euclidean(self.lats, self.lons)
        differences = np.abs(self.values.reshape(-1, 1) - self.values)
        variogram_values = []
        for h in np.unique(distances):
            values_at_h = differences[distances == h]
            variogram_values.append(np.mean(values_at_h**2))
        return variogram_values, np.unique(distances)

    def fit(self):
        experimental_variogram, distances = self.gamma()
        fit_metrics = []
        for nugget, sill, range_ in self.parameter_combinations:
            theoretical_variogram_values = self.theoretical_variogram(distances, nugget, sill, range_)
            fit_metric = MSE(experimental_variogram, theoretical_variogram_values)
            fit_metrics.append((nugget, sill, range_, fit_metric))
        self.optimal_pars = min(fit_metrics, key=lambda x: x[3])[:3]

    def predict(self, point):
        points = np.array([(x, y) for x, y in zip(self.lats, self.lons)])
        distances = np.linalg.norm(points - point, axis=1)
        pars = list(self.optimal_pars)
        pars.insert(0, distances)
        weights = self.theoretical_variogram(*pars)
        weights /= np.sum(weights)
        return np.dot(weights, self.values)

kriging = OrdinaryKriging(df.LATITUDE.values, df.LONGITUDE.values, df.TAVG.values)
kriging.fit()
Now let’s consider each function separately. The __init__ function, besides initializing the coordinates and values, defines three lists with possible values of nugget, sill and range. All three are mixed together in all possible combinations and stored in the parameter_combinations variable. We will need it later when searching for the optimal ones.
def __init__(self, lats, lons, values):
    self.lats = lats
    self.lons = lons
    self.values = values
    self.nugget_values = [0, 1, 2, 3, 4]
    self.sill_values = [1, 2, 3, 4, 5]
    self.range_values = [1, 2, 3, 4, 5]
    # Generate all combinations of parameter values to fit
    self.parameter_combinations = list(product(self.nugget_values, self.sill_values, self.range_values))
    self.optimal_pars = None
The second function, theoretical_variogram, is simply a Python implementation of one of the aforementioned formulas. In our case, it’s the exponential one (but you can write code for another one and compare the results — see the sketch below):
def theoretical_variogram(self, h, nugget, sill, r):
    return nugget + (sill-nugget) * (1-np.exp(-3*h/r))
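For reference, here is a sketch of what Gaussian and spherical alternatives might look like in the same nugget/sill/range parametrization (these two methods are my own additions and are not used elsewhere in this article):
def gaussian_variogram(self, h, nugget, sill, r):
    # Rises more slowly near h = 0 than the exponential model
    return nugget + (sill-nugget) * (1 - np.exp(-3*h**2/r**2))

def spherical_variogram(self, h, nugget, sill, r):
    # Reaches the sill exactly at h = r instead of approaching it asymptotically
    h = np.minimum(h, r)
    return nugget + (sill-nugget) * (1.5*h/r - 0.5*(h/r)**3)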
The third class method is Euclidean. It’s an altered version of the function we created for NN and IDW. This time we return an (n, n) matrix where each row contains the distances between one point and all the other points (one value in each row is 0, since the distance between a point and itself is 0).
def Euclidean(self, X, Y):
    all_dists, point_dists = [], []
    for x, y in zip(X, Y):
        for k in range(len(X)):
            h = np.linalg.norm(np.array([x, y]) - np.array([X[k], Y[k]]))
            point_dists.append(h)
        all_dists.append(point_dists)
        point_dists = []
    # Return an (n, n) numpy array so boolean masking works in gamma()
    return np.array(all_dists)
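As a side note, the same (n, n) distance matrix can be computed in one call with scipy’s cdist; this vectorized variant is my own and behaves the same way:
from scipy.spatial.distance import cdist

def Euclidean(self, X, Y):
    # Pairwise Euclidean distances between all (lat, lon) points, shape (n, n)
    pts = np.column_stack([X, Y])
    return cdist(pts, pts)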
The fourth function performs the fitting. First it obtains the experimental variogram values and the Euclidean distances. Then, iterating over our combinations of nugget, sill and range, it calculates the theoretical variogram values and estimates the Mean Squared Error (MSE) between the theoretical and experimental values (but you can use other metrics). Finally, we save the optimal parameters to the class variable optimal_pars.
def fit(self):
    experimental_variogram, distances = self.gamma()
    fit_metrics = []
    for nugget, sill, range_ in self.parameter_combinations:
        theoretical_variogram_values = self.theoretical_variogram(distances, nugget, sill, range_)
        fit_metric = MSE(experimental_variogram, theoretical_variogram_values)
        fit_metrics.append((nugget, sill, range_, fit_metric))
    self.optimal_pars = min(fit_metrics, key=lambda x: x[3])[:3]
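Once fit() has run, the selected combination can be inspected directly (the exact numbers depend on the data):
print(kriging.optimal_pars)   # (nugget, sill, range) with the lowest MSE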
And the last, but not least, function is predict. Given a point (lat; lon) as input, it calculates the distances between the point and the known observations. Then it calls the theoretical_variogram function, passing the optimal parameters we obtained earlier, and gets the weights as output. Finally a weighted mean is calculated and returned.
def predict(self, point):
    points = np.array([(x, y) for x, y in zip(self.lats, self.lons)])
    distances = np.linalg.norm(points - point, axis=1)
    pars = list(self.optimal_pars)
    pars.insert(0, distances)
    weights = self.theoretical_variogram(*pars)
    weights /= np.sum(weights)
    return np.dot(weights, self.values)
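For a quick sanity check, the method can be called on a single node first (the coordinates below are just an arbitrary location inside Switzerland):
# Temperature estimate for a single (lat, lon) node
kriging.predict(np.array([46.8, 8.2]))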
Now we can collect all the predictions and visualize the map:
row, grid = [], []
for lat in LAT:
    for lon in LON:
        row.append(kriging.predict(np.array([lat, lon])))
    grid.append(row)
    row = []

ds = xr.Dataset(
    {'TAVG': (['lat', 'lon'], grid)},
    coords={'lat': LAT, 'lon': LON})
fig, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()), figsize=(16, 9))
ds.where(rg).TAVG.plot(ax=ax, alpha=0.6)
gdf.plot(ax=ax, color='r', markersize=85)
ax.gridlines(draw_labels=True,linewidth=2, color='black', alpha=0.5, linestyle='--')
plt.show()

As you can see, the result is quite different from what we got with IDW. For Kriging, the most important choice is the kind of theoretical variogram you pick, since it essentially defines the relationship between the predicted values and distance. In case you don’t want to play around with the code I provide here or your own, you can explore the PyKrige library, which has implementations of many variogram models.
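For reference, a minimal PyKrige sketch might look roughly like this (the variogram model choice and the use of our LON/LAT grid are just illustrative assumptions):
from pykrige.ok import OrdinaryKriging as PyKrigeOK

# x = longitudes, y = latitudes, z = observed temperatures
ok = PyKrigeOK(df.LONGITUDE.values, df.LATITUDE.values, df.TAVG.values,
               variogram_model='exponential')
# Interpolate onto the regular grid; ss contains the kriging variance
z, ss = ok.execute('grid', LON, LAT)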
Hopefully this article was informative and insightful for you!
===========================================
All my publications on Medium are free and open-access, so I’d really appreciate it if you followed me here!
P.S. I’m extremely passionate about (Geo)Data Science, ML/AI and Climate Change, so if you want to work together on some project, please contact me on LinkedIn.
🛰️ Follow for more 🛰️