Data Science in the Real World.

From Analysis to Dashboarding-Connecting notebooks to Tableau

The story of how I ended up with creating dashboards of dreams from notebook after trying and failing many of the ways. Here I share how I connected my dataframes to Tableau. And hope this blog helps the ones struggling between notebooks and dashboards.

Devipriya Sunilkumar
Towards Data Science
5 min readApr 29, 2020

--

Photo by Luke Chesser on Unsplash
Photo by Luke Chesser on Unsplash

The big picture of data analysis is not just data and deriving insights but also making that information understandable by everyone. Back then being a beginner to data science I was introduced to Jupyter notebooks. Notebooks are just like any other presentation media. It consists of cells where you can put your code then run and output just below the cell also it comes with a markdown feature where you can document your code with basic HTML. These notebooks are used for exploratory data analysis where it is easy to share your work with teammates. But it is still not a ready to go way for non-technical users though it has libraries like Plotly, matplotlib, etc.

For example, take the case of COVID-19. Everyone across the world is concerned about knowing the daily status of their country. Will they choose a notebook that is way more documented with cells, code, or a dashboard that shares your analyzed visualizations in an interactive form? Which is why every data analysts need to have storytelling skills in their pockets. Dashboarding is just like storytelling, you are framing a story from the information you have drawn which should be presented in a way that it doesn’t let them take their eyes off.

So the question is how do I connect my jupyter notebook to Tableau without going through the installation of python in Tableau?

The answer is simple. All that Tableau takes is data. A dataframe from jupyterlab can be saved in a format like CSV, Xls, JSON but in Real-time analysis when the data keeps updating on day to day basis it will be inefficient to download data daily from jupyter notebook and uploading it into Tableau.

But there is a solution though notebooks don’t give a database support as their sole purpose is analyzing datasets.

How about these analyzed data sets which are ready to visualize is send to google sheets using google API. Thus even if the data keeps updating on each refresh of our notebook the data we read from an online portal will be saved to google sheets automatically after analysis and tableau can be connected to these datasheets.

so the task is as simple for creating a real-time dashboard. All you got to do is

  • Create a notebook
  • Read the data from an online portal data API to a pandas dataframe
  • Analyze and create a ready to go clean and precise datagram.
  • Send the data to google sheets.
  • Let’s import the following libraries to our notebook
import gspread
from oauth2client.service_account import ServiceAccountCredentials
from df2gspread import df2gspread as d2g
  • Now before anything get credential access to your sheets by following instructions:
  1. Head to Google Developers Console and create a new project (or select the one you already have).

2. Under “APIs & Services > Library”, search for “Drive API” and enable it.

3. Under “APIs & Services > Library”, search for “Sheets API” and enable it.

4. Enable API Access for a Project if you haven’t done it yet.

5. Go to “APIs & Services > Credentials” and choose “Create credentials > Service account key”.

6. Fill out the form

7. Click “Create key”

8. Select “JSON” and click “Create”

You will automatically download a JSON file with credentials. It may look like this:

{
"type": "service_account",
"project_id": "api-project-XXX",
"private_key_id": "5ty … bg4",
"private_key": "-----BEGIN PRIVATE KEY-----\nNrDyLw … jINQh/9\n-----END PRIVATE KEY-----\n",
"client_email": "443000000000-msmhj@developer.gserviceaccount.com",
"client_id": "445 … hd.apps.googleusercontent.com",
...
}

Remember the path to the downloaded credentials file. Likewise in the next step, you’ll need the value of client_email from this file. Go to your spreadsheet and share it with a client_email from the step above.

  • Create a new google sheet copy the key to a variable named spreadsheet_key. For eg: (the text in bold will be the key for my sheet.To be found in the link which you open your sheets)
https://docs.google.com/spreadsheets/d/1vd6hDT6k8lcWNMDYz13kqW7zXnIX5TVED9–4Kq7EAQnvI/edit#gid=0
  • Add the following code to your python script or notebook. Also, a friendly reminder that the JSON file you just downloaded and the python script should be in the same folder before copying the following code.
scope = [‘https://spreadsheets.google.com/feeds',
https://www.googleapis.com/auth/drive']
credentials = ServiceAccountCredentials.from_json_keyfile_name(
‘myfile.json’, scopes=scope)
gc = gspread.authorize(credentials)
spreadsheet_key=’1vd6hDT6k8lcWNMDYz13kqW7zXnIX5TVED9–4Kq7EAQnvI
  • Now clean your datasets and save ready to visualize data to a dataframe.
  • Copy and paste the following code to send your dataframe to google sheets.
wks_name= “nameofyoursheet”
d2g.upload(df, spreadsheet_key, wks_name, credentials=credentials, row_names=True)
print(“Data upload success”)

wks_name will be your sheet’s name and df will be your dataframe variable. So now when each time your notebook is refreshed the data gets updated automatically to the sheets as of your data API from which you are reading data to your notebook thus making real-time analysis with jupyter notebook easy.

  • After doing the above steps, open Tableau and connect your google sheets. To download Tableau click here
  • To connect, go to server>google sheets
After Adding a google sheet to Tableau data connection (data source: OGD Platform India)
  • Create your visualizations in Tableau using the dataframe from sheets. Each visualization can be done in sheets which is what they call in Tableau. These sheets can be dragged and dropped into a dashboard. Publish your dashboard to Tableau Public and you are done.

Now as the data changes every day all we have to do is refresh the notebook and Tableau data connection. All your visualizations gets updated automatically.

One advantage which Tableau has over other visualization libraries like Plotly and matplotlib is that you get visualizations with zero coding and it’s something that incredibly professional.

Click here to view one of my dashboard which I did using jupyter notebook(for analysis) and Tableau (for visualizations).

--

--