
Datasets are rarely complete and often require pre-processing. Suppose a dataset has only an address column, with no latitude and longitude columns to represent the data geographically. In that case, you need to convert the addresses into a geographic format. The process of converting addresses to geographic coordinates (latitude and longitude) so that their locations can be mapped is called geocoding.
Geocoding is the computational process of transforming a physical address description to a location on the Earth’s surface (spatial representation in numerical coordinates) – Wikipedia
In this tutorial, I will show you how to perform geocoding in Python with the help of the Geopy and Geopandas libraries. Let us install these libraries with pip, assuming you already have an Anaconda environment set up.
pip install geopandas
pip install geopy
If you do not want to install the libraries and would rather interact directly with the accompanying Jupyter notebook for this tutorial, there is a GitHub link with MyBinder at the bottom of this article. This is a containerised environment that lets you experiment with this tutorial directly on the web without any installation. The dataset is also included in this environment, so there is no need to download it.
Geocoding Single Address
To geolocate a single address, you can use the Geopy Python library. Geopy supports several geocoding services that you can choose from, including Google Maps, ArcGIS, AzureMaps, Bing, etc. Some of them require API keys, while others do not.

As our first example, we use the Nominatim geocoding service, which is built on top of OpenStreetMap data. Let us geocode a single address, the Eiffel Tower in Paris.
from geopy.geocoders import Nominatim

locator = Nominatim(user_agent="myGeocoder")
location = locator.geocode("Champ de Mars, Paris, France")
We create locator, which holds the geocoding service, Nominatim. Then we call its geocode method with any address, in this example the Eiffel Tower address.
Now we can print the coordinates of the location we have created.
print("Latitude = {}, Longitude = {}".format(location.latitude, location.longitude))
Latitude = 48.85614465, Longitude = 2.29782039332223
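One caveat worth sketching: geopy's geocode returns None when the service finds no match, so reading location.latitude would fail for an unknown address. The helper below is a minimal sketch of a guard for that case. The name coordinates_or_none is hypothetical, and the stand-in object only mimics the latitude and longitude attributes so the example runs offline.

```python
from types import SimpleNamespace


def coordinates_or_none(location):
    """Return (latitude, longitude), or None when geocoding found no match."""
    if location is None:
        return None
    return (location.latitude, location.longitude)


# Stand-in object so the sketch runs offline; with geopy you would pass
# the value returned by locator.geocode(...) instead.
found = SimpleNamespace(latitude=48.85614465, longitude=2.29782039332223)

print(coordinates_or_none(found))
print(coordinates_or_none(None))
```

With the real service, you would call coordinates_or_none(locator.geocode(address)) and skip or log the addresses that come back as None.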
Try some different addresses of your own. In the next section, we will cover how to geocode many addresses from a Pandas DataFrame.
Geocoding addresses from Pandas
Let us read the dataset for this tutorial. We use an example dataset of store addresses. The CSV file is available at this link.
Download the CSV file and read it with Pandas.
import pandas as pd

df = pd.read_csv("addresses.csv")
df.head()
The following table shows the first five rows of the DataFrame. As you can see, there are no latitude and longitude columns to map the data.

We concatenate the address columns into a single column that is suitable for geocoding. For example, the first address is:
Karlaplan 13,115 20,STOCKHOLM,Stockholms län, Sweden
We can join the address columns in pandas like this to create an address column for geocoding:
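A minimal sketch of this step follows. It assumes the address parts live in columns named Address1, Address3, Address4 and Address5 (names taken from the drop call later in this tutorial) and uses a hypothetical one-row DataFrame in place of the CSV. Which column holds which part of the address is an assumption.

```python
import pandas as pd

# Hypothetical one-row stand-in for addresses.csv; the mapping of address
# parts to columns is an assumption made for illustration.
df = pd.DataFrame({
    "Address1": ["Karlaplan 13"],
    "Address3": ["115 20"],
    "Address4": ["STOCKHOLM"],
    "Address5": ["Stockholms län"],
})

# Join the parts with commas and append the country name.
df["ADDRESS"] = (
    df["Address1"].astype(str) + ","
    + df["Address3"].astype(str) + ","
    + df["Address4"].astype(str) + ","
    + df["Address5"].astype(str) + ", Sweden"
)

print(df["ADDRESS"].iloc[0])
```

Casting each part with astype(str) keeps the concatenation from failing if a column, such as a postal code, was read as a number.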
Once we have created the address column, we can start geocoding in the following steps.
1 – We first add a delay of 1 second between geocoding calls. This is useful when you are geocoding a large number of physical addresses, as the geocoding service provider can otherwise deny access to the service.
2 – Create a df['location'] column by applying the geocode function we created.
3 – Third, we create latitude, longitude, and altitude as a single tuple column.
4 – Finally, we split the latitude, longitude, and altitude columns into three separate columns.
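The four steps can be sketched as below. This is an illustration under stated assumptions, not necessarily the original snippet: the function names make_rate_limited_geocoder and add_coordinates are hypothetical, the address column is assumed to be called ADDRESS, and fake_geocode is an offline stand-in with placeholder coordinates so the example runs without network access. RateLimiter comes from geopy.extra.rate_limiter.

```python
import math
from types import SimpleNamespace

import pandas as pd


def make_rate_limited_geocoder(user_agent="myGeocoder"):
    """Step 1: wrap Nominatim's geocode so calls are at least 1 second apart."""
    # Imported here so the rest of the sketch runs without geopy installed.
    from geopy.extra.rate_limiter import RateLimiter
    from geopy.geocoders import Nominatim

    locator = Nominatim(user_agent=user_agent)
    return RateLimiter(locator.geocode, min_delay_seconds=1)


def add_coordinates(df, geocode, address_col="ADDRESS"):
    """Steps 2-4: geocode each address and split the result into columns."""
    out = df.copy()
    # Step 2: one geocoding result (or None) per address.
    out["location"] = out[address_col].apply(geocode)
    # Step 3: (latitude, longitude, altitude) tuple; NaNs when nothing was found.
    missing = (math.nan, math.nan, math.nan)
    out["point"] = out["location"].apply(
        lambda loc: tuple(loc.point) if loc is not None else missing
    )
    # Step 4: split the tuples into three separate columns.
    coords = pd.DataFrame(
        out["point"].tolist(),
        index=out.index,
        columns=["latitude", "longitude", "altitude"],
    )
    return out.join(coords)


def fake_geocode(address):
    """Offline stand-in for the geocoder; the coordinates are placeholders."""
    if "Karlaplan" in address:
        return SimpleNamespace(point=(59.0, 18.0, 0.0))
    return None


demo = pd.DataFrame({"ADDRESS": ["Karlaplan 13, Stockholm", "nowhere"]})
result = add_coordinates(demo, fake_geocode)
print(result[["ADDRESS", "latitude", "longitude", "altitude"]])
```

With the real service, you would call geocode = make_rate_limited_geocoder() and then df = add_coordinates(df, geocode), which takes roughly one second per row because of the delay. Rows that cannot be geocoded end up with NaN coordinates instead of raising an error.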
The above code produces a DataFrame with latitude and longitude columns that you can map with any geographic visualisation tool of your choice. Let us look at the first few rows of our DataFrame, but first, we will clean out the unwanted columns.
df = df.drop(['Address1', 'Address3', 'Address4', 'Address5','Telefon', 'ADDRESS', 'location', 'point'], axis=1)
df.head()

I will use Folium to map the points we created, but feel free to use any other geovisualization tool of your choice. First, we display the locations as a circle map with Folium.
The map produced below shows the geocoded addresses as circles.

Or if you prefer a dark background with an aggregated cluster of points, you can do the following:
Below is a dark-background map with clustered points in Folium.

Conclusion
Geocoding is a critical step in many location-based tasks that require coordinates. In this article, we have seen how to do geocoding in Python. There are many other services, free or paid, that you can experiment with in GeoPy. I find the Google Maps geocoding service more powerful than the OpenStreetMap service we used in this tutorial, but it requires an API key.
To interact and experiment with this tutorial without any installation, I created a Binder. Go to this GitHub repository and click on launch binder.
Or go directly to the Jupyter notebook Binder link here: