Geospatial Data — A Datum Primer

Kendall Fortney
Towards Data Science
7 min read · Aug 29, 2017


When I was a kid I was watching Star Wars with my Dad and he asked me if the dogfighting spaceships were going up or down. My response: “Dad, that’s silly, there is no up or down in space!”

In space there is no constant reference to define an “up” or a “down”, and in many ways the same problem arises when providing a reference for terrestrial data. This problem has grown in complexity over the years, but at its core it is still the same.

The power of Geographic Information Systems (GIS) lies in creating a spatial framework in which layers of information can be related to one another to derive insights. But first, the basics…

Projections and Latitude/Longitude

Cartography is the study and practice of making maps and has deep roots in human history. Eratosthenes in the 3rd century BC was the first to propose Latitude (North/South) and Longitude (East/West) as a coordinate system. The measurement is broken down into degrees, minutes, and seconds, or it can be written as decimal degrees for increased precision. If decimal degrees are recorded to six decimal places, the position is resolved to roughly 0.11 meters (about 4 inches) at the equator, versus roughly 11 kilometers (6.9 miles) if only recorded to one decimal place.
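To make the precision point concrete, here is a minimal Python sketch. The function names are my own, and the constant of about 111.32 km per degree is an approximation that only holds at the equator (a degree of longitude shrinks toward the poles), so treat the numbers as rough guides rather than survey-grade values.

```python
METERS_PER_DEGREE = 111_320.0  # assumed: approximate length of one degree at the equator


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W into signed decimal degrees."""
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # South and West are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def resolution_meters(decimal_places):
    """Approximate ground distance represented by the last decimal place."""
    return METERS_PER_DEGREE * 10.0 ** (-decimal_places)


if __name__ == "__main__":
    print(dms_to_decimal(44, 28, 30, "N"))
    for places in (1, 3, 6):
        print(places, "decimal places is about", resolution_meters(places), "m")
```

Each extra decimal place buys a factor of ten in resolution, which is why six places is usually plenty for consumer-grade coordinates.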

This is an example of a Web Mercator projection

Latitude and Longitude describe positions on the 3D surface of the globe, but converting them onto a 2D map requires a Projection: a series of transformations that convert the locations of points on a curved surface to locations on a flat plane. There could easily be a whole article on the pros and cons of various projections.
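As one example of such a transformation, the spherical Web Mercator formulas can be sketched in a few lines of Python. This is a simplified illustration, not a replacement for a projection library: the function names are mine, and the radius is the value conventionally used for Web Mercator.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius conventionally used by Web Mercator


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0))
    return x, y


def web_mercator_to_lonlat(x, y):
    """Invert the projection back to longitude/latitude in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


if __name__ == "__main__":
    x, y = lonlat_to_web_mercator(-73.2, 44.5)
    print(x, y)
    print(web_mercator_to_lonlat(x, y))
```

Note that y grows without bound as latitude approaches the poles, which is why Web Mercator maps are cut off at high latitudes and exaggerate areas far from the equator.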

Accuracy is a measurement of the closeness of results of observations or estimates of a map’s features to their true value or position. There are two types of accuracy in modern cartography:

  • Relative accuracy: Relative accuracy is the degree to which a given point on a map is accurate relative to other points within that same map. Often cartography uses a known point and measures from this local reference.
  • Absolute accuracy: Absolute accuracy is the degree to which a point on a map corresponds to a fixed coordinate system in the real world, i.e. latitude and longitude of a point on that map will correspond fairly accurately with actual GPS coordinates.
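The difference between the two is easy to see with a toy example. In the Python sketch below, all coordinates are hypothetical: a map is drawn with every point shifted by the same offset, so distances between points are preserved (good relative accuracy) while every point is off from its true position (poor absolute accuracy).

```python
import math

# Hypothetical "true" positions in a local grid, in meters (east, north).
true_points = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]

# The same points on a hypothetical map that is uniformly shifted by 3 m east, 4 m south.
offset = (3.0, -4.0)
map_points = [(e + offset[0], n + offset[1]) for e, n in true_points]


def distance(p, q):
    """Straight-line distance between two (east, north) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def absolute_errors(mapped, truth):
    """Distance between each mapped point and its true position."""
    return [distance(m, t) for m, t in zip(mapped, truth)]


def relative_errors(mapped, truth):
    """Error in every point-to-point distance, map versus truth."""
    errors = []
    for i in range(len(truth)):
        for j in range(i + 1, len(truth)):
            errors.append(abs(distance(mapped[i], mapped[j]) - distance(truth[i], truth[j])))
    return errors


if __name__ == "__main__":
    print("absolute errors (m):", absolute_errors(map_points, true_points))
    print("relative errors (m):", relative_errors(map_points, true_points))
```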

Modern applications, like self-driving cars or airplane autopilots, require a new level of accuracy compared to even 50 years ago. The evolution of the current standards reflects solutions to the problems faced in increasing that accuracy.

The Problems with Location

Placing a reference point on the Earth is not as straightforward as it may initially appear. For one, the Earth is not a perfect sphere but bulges around the middle because of its rotation. It is also made of moving tectonic plates that may be rising or sinking as they converge (in Vermont the ground level is still rising from when the glaciers melted over 14,000 years ago!).

One of the initial solutions created its own problems: Mean Sea Level (MSL). This is determined by measuring the height of the sea surface over a long period and averaging the measurements to remove the effects of waves, tides, and short-term changes in wind and currents.

However, averaging does not remove the effects of local gravity strength, water temperature, and salinity, so the height of MSL relative to a geodetic datum (which is not based on actual sea level) varies around the world. Often a country will choose the mean sea level at one specific point as the standard “sea level” for all of its mapping.

http://wiki.gis.com/wiki/index.php/Chart_datum

When it comes to nautical charts, a sailor must be able to know the minimum depth of water that could occur at any point. Depths and tides on a nautical chart are measured relative to chart datum, which is defined as a level below which the tide rarely falls. That can make the difference between grounding a boat and making a safe passage.

Sea level has not stayed constant throughout geological time either. All of these variables create a massive problem when it comes to comparing data.

Reference Ellipsoid, Geoid and Datums

In order to provide a common set of references, a datum is used. Typically a datum defines the surface and the position of the surface relative to the center of the earth. Datums can be of two types: horizontal or vertical. Original efforts to map the North American coastline started in 1807 under President Jefferson and the U.S. Coast Survey under Ferdinand Hassler, but it was many years until the first standardized datums were widely adopted in North America.

Horizontal datums provide a fixed reference point and a reference ellipsoid model, which better represents the non-spherical shape of the Earth. The North American Datum of 1927 (NAD 27) used a fixed point at Meade’s Ranch in Kansas, near the center of the contiguous United States, and the Clarke 1866 ellipsoid because it was a good fit to the United States and minimized error. This was the common standard for many years. Its replacement, the North American Datum of 1983 (NAD 83), instead uses the center of mass of the Earth as the fixed reference for the ellipsoid (which makes it a geocentric datum).
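A reference ellipsoid is defined by just two numbers, usually the semi-major (equatorial) axis and the flattening. The Python sketch below compares the Clarke 1866 ellipsoid used by NAD 27 with the GRS80 and WGS84 ellipsoids using their commonly published defining parameters; the dictionary layout and function name are my own.

```python
# name: (semi-major axis a in meters, inverse flattening 1/f)
ELLIPSOIDS = {
    "Clarke 1866": (6378206.4, 294.9786982),
    "GRS80": (6378137.0, 298.257222101),
    "WGS84": (6378137.0, 298.257223563),
}


def semi_minor_axis(a, inv_f):
    """Polar radius b derived from the equatorial radius and inverse flattening."""
    return a * (1.0 - 1.0 / inv_f)


if __name__ == "__main__":
    for name, (a, inv_f) in ELLIPSOIDS.items():
        b = semi_minor_axis(a, inv_f)
        # a - b is the size of the equatorial bulge relative to the poles.
        print(f"{name}: a={a} m, b={b:.3f} m, bulge={a - b:.1f} m")
```

GRS80 and WGS84 differ by a fraction of a millimeter in polar radius, while Clarke 1866 differs from both by tens of meters, which is part of why coordinates on NAD 27 do not line up with newer datums.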

Vertical datums are used to describe the elevation or orthometric height of a point, measured either from Mean Sea Level, from tidal data, or from a geoid, which uses gravitational forces to represent a hypothetical sea level. The current North American Vertical Datum of 1988 (NAVD 88) is based on geodetic leveling referenced to a single tide station, and geoid models are used alongside it to relate its heights to ellipsoid heights. A geoid model relies on instruments capable of measuring gravity with high accuracy to create the hypothetical shape that the surface of the oceans would take under the influence of Earth’s gravity and rotation alone, in the absence of other influences such as winds and tides. The density of the crust and features like mountains affect the shape of the geoid, so it deviates from ellipsoid-based references such as WGS84, GRS80 or NAD 83.
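The relationship between the ellipsoid and the geoid is usually written as H ≈ h − N, where h is the height above the ellipsoid (what a GPS receiver reports), N is the geoid height above the ellipsoid at that location, and H is the orthometric height. A minimal Python sketch follows; the function names are mine and the sample values in the usage lines are hypothetical, not taken from any real geoid model.

```python
def orthometric_height(ellipsoid_height_m, geoid_height_m):
    """H = h - N: elevation above the geoid from a height above the ellipsoid."""
    return ellipsoid_height_m - geoid_height_m


def ellipsoid_height(orthometric_height_m, geoid_height_m):
    """h = H + N: the inverse relationship."""
    return orthometric_height_m + geoid_height_m


if __name__ == "__main__":
    h = 100.0   # hypothetical GPS height above the ellipsoid, in meters
    n = -28.0   # hypothetical geoid height; negative means the geoid is below the ellipsoid
    print("orthometric height (m):", orthometric_height(h, n))
```

In practice N comes from a published geoid model and varies from place to place, so the same GPS height can correspond to quite different elevations in different regions.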

In an increasingly connected world, the need for a universal perspective led to global datums, starting with ellipsoid models such as WGS72 and the Geodetic Reference System 1980 (GRS 80), and finally the World Geodetic System of 1984 (WGS84).

WGS84 is the most common international standard and has been the reference coordinate system used by the Global Positioning System (GPS) since January 1987. Its origin is believed to be within 2 centimeters of the Earth’s center of mass.

Datum Shifts

Each dataset should state which datum it uses in its metadata; reading and understanding it can save a huge amount of time and pain later on. One cannot just put two datasets with different datums on top of each other, as they would not line up. Instead, data is converted from one datum to another through a process called a datum shift; it’s also called a coordinate transformation (EPSG, OGC) or geographic transformation (ESRI). The offset between different datums can range from quite small to hundreds of feet.
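One of the simplest datum shifts is a three-parameter geocentric translation: convert latitude and longitude to Earth-centered X/Y/Z on the source ellipsoid, add an offset, and convert back on the target ellipsoid. The Python sketch below does this for NAD 27 to WGS84. The shift values are a commonly published mean for the contiguous United States and are an assumption here; real workflows use a GIS library with grid-based transformations, which are considerably more accurate. All function names are my own, and the back-conversion is not meant for points at the poles.

```python
import math

# Ellipsoids as (semi-major axis a in meters, flattening f).
CLARKE_1866 = (6378206.4, (6378206.4 - 6356583.8) / 6378206.4)
WGS84 = (6378137.0, 1.0 / 298.257223563)

# Assumed mean NAD 27 -> WGS84 translation in meters (dx, dy, dz); illustrative only.
NAD27_TO_WGS84_SHIFT = (-8.0, 160.0, 176.0)


def geodetic_to_ecef(lat_deg, lon_deg, h, ellipsoid):
    """Latitude/longitude/height on an ellipsoid to Earth-centered X, Y, Z."""
    a, f = ellipsoid
    e2 = f * (2.0 - f)  # first eccentricity squared
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h) * math.sin(lat)
    return x, y, z


def ecef_to_geodetic(x, y, z, ellipsoid, iterations=10):
    """Earth-centered X, Y, Z back to latitude/longitude/height (iterative)."""
    a, f = ellipsoid
    e2 = f * (2.0 - f)
    lon = math.atan2(y, x)
    p = math.hypot(x, y)
    lat = math.atan2(z, p * (1.0 - e2))  # initial guess assuming zero height
    h = 0.0
    for _ in range(iterations):
        n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - e2 * n / (n + h)))
    return math.degrees(lat), math.degrees(lon), h


def nad27_to_wgs84(lat_deg, lon_deg, h=0.0):
    """Approximate datum shift using the assumed three-parameter translation."""
    x, y, z = geodetic_to_ecef(lat_deg, lon_deg, h, CLARKE_1866)
    dx, dy, dz = NAD27_TO_WGS84_SHIFT
    return ecef_to_geodetic(x + dx, y + dy, z + dz, WGS84)


if __name__ == "__main__":
    # Hypothetical point in central Kansas, expressed in NAD 27.
    print(nad27_to_wgs84(39.0, -98.5))
```

The same numeric latitude and longitude end up at slightly different values after the shift, which is exactly why overlaying two datasets without checking their datums produces misaligned layers.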

For example, the National Geodetic Survey has adjusted the NAD 83 datum multiple times since the original geodetic datum was established in 1986 and those nuances must be considered. The full list of all datums is far too long to include here, but take a look at this list for more information.

It is also important to understand that the coordinate system origin of a local datum is often not at the center of the earth; the spheroid of a local datum is offset from the earth’s center. NAD 27 and the European Datum of 1950 (ED 1950) are local datums. NAD 27 is designed to fit North America reasonably well, while ED 1950 was created for use in Europe. Because a local datum aligns its spheroid so closely to a particular area on the earth’s surface, it’s not suitable for use outside the area for which it was designed.

The Future of Datums

Changes in technology and in how it is applied will always necessitate updates to the datums, and NAD 83 and NAVD 88 are scheduled to be replaced in 2022 with newer versions. This will correct for continental drift and help with accuracy, as NAD 83 is non-geocentric by about 2.2 meters (about 7.2 feet). Some of the accuracy issues arise because the datum is defined through passive geodetic survey marks that have deteriorated over time.

The new datums will instead rely primarily on satellite positioning such as GPS, which should be easier to maintain and more accurate over time. Four plate-fixed terrestrial reference frames will replace the three existing NAD 83 reference frames, and a new geopotential datum will replace the vertical datum:

  • North American Terrestrial Reference Frame of 2022 (NATRF2022)
  • Pacific Terrestrial Reference Frame of 2022 (PATRF2022)
  • Mariana Terrestrial Reference Frame of 2022 (MATRF2022)
  • Caribbean Terrestrial Reference Frame of 2022 (CATRF2022)
  • North American-Pacific Geopotential Datum of 2022 (NAPGD2022)

Within NAPGD2022 there will be a time-dependent model of the geoid called GEOID22, provided in three regions: the first covering the entirety of North and Central America, Hawaii, Alaska, Greenland and the Caribbean; the second covering American Samoa; and the third covering Guam and the Commonwealth of the Mariana Islands.

Like all scientific fields, this is an ongoing evolution, growing to fit the needs of the current generation. The potential presented by super accurate geospatial data could be life altering, but like most things, it is important to understand its legacy as well. Dig in, look at the metadata and see what you can find!

Self-taught Data Scientist focused on Python, machine learning and Geospatial Data, with a degree in Art and years of experience in tech in Vermont.