For ease of browsing, here's the final code used to query data from Snowflake. In the article below I'll break down the reasoning for each step, and how to query MySQL and PostgreSQL databases in the same way.
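As a minimal sketch of that final code (the file name snowflake_details.yaml and the table name example_table are placeholder assumptions):

```python
import yaml
import pandas as pd
import snowflake.connector

# Load private connection details from a separate, git-ignored file
with open("snowflake_details.yaml", "r") as stream:
    snowflake_details = yaml.safe_load(stream)

# Unpack the details dictionary into the connect function's arguments
snowflake_connection = snowflake.connector.connect(**snowflake_details)

# Run the query and load the results straight into a dataframe
query = """
select *
from example_table
limit 10
"""
df = pd.read_sql(query, snowflake_connection)
```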
Importing Details
The reason I choose to import connection details from a separate file is so that I can add that file to .gitignore and ensure no private information is committed to an online repo. This way the connection details remain hidden even if the repo becomes public.
Note that the keys are named exactly as the arguments required by the snowflake.connector.connect function. The gist above creates a Python dictionary from, e.g., a YAML file:
```yaml
user: anthony
account: account.region
authenticator: externalbrowser
warehouse: DEV
database: DEV
```
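Loading that file into a dictionary takes just a few lines; this sketch assumes the PyYAML package is installed and the file is saved as snowflake_details.yaml:

```python
import yaml

# Parse the YAML file into a Python dictionary
with open("snowflake_details.yaml", "r") as stream:
    snowflake_details = yaml.safe_load(stream)
```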
For a JSON file the gist above would instead use:
```python
import json

with open("snowflake_details.json", "r") as stream:
    snowflake_details = json.load(stream)
```
Creating the Connector

When creating the snowflake_connection connector object, the details are passed as function arguments using the ** dictionary unpacking operator. The keys of the dictionary being unpacked must match the arguments taken by the connect function (e.g. for a MySQL connection a host argument would be needed).
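As a sketch of what the ** operator does here, unpacking the dictionary is equivalent to passing each key as a keyword argument:

```python
import snowflake.connector

# These two calls are equivalent:
snowflake_connection = snowflake.connector.connect(**snowflake_details)
snowflake_connection = snowflake.connector.connect(
    user=snowflake_details["user"],
    account=snowflake_details["account"],
    authenticator=snowflake_details["authenticator"],
    warehouse=snowflake_details["warehouse"],
    database=snowflake_details["database"],
)
```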
For a MySQL connection the gist above would instead use:
```python
import pymysql

mysql_connection = pymysql.connect(host='', user='', password='',
                                   db='', port=3306)  # port takes an integer
```
Or for a PostgreSQL connection:
```python
import psycopg2

# psycopg2 names the database argument dbname rather than db
postgresql_connection = psycopg2.connect(host='', user='',
                                         password='', dbname='', port=5432)
```
Querying the Database
For readability, it's generally easier to create a string variable for the query we want to run. For larger queries, using triple double quotes ("""query""") instead of plain double quotes ("query") enables the query to neatly span multiple lines, like in the gist above.
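For example, a multi-line query string might look like this (the table name example_table is a placeholder):

```python
query = """
select *
from example_table
limit 10
"""
```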
Now we can use [pd.read_sql](https://pandas.pydata.org/docs/reference/api/pandas.read_sql.html)
to instantly create a pandas dataframe from a query and a connection.
```python
df = pd.read_sql(query, connection)
```
This is much cleaner than the approach of first using a cursor to fetch the data and then building up a pandas dataframe from the results, which involves many more steps.
```python
with connection.cursor() as cursor:
    sql = """
    select *
    from table_1
    limit 10
    """
    cursor.execute(sql)
    result = cursor.fetchall()
    # Extract the column names from the cursor's metadata
    field_names = [i[0] for i in cursor.description]
connection.close()
df = pd.DataFrame(list(result), columns=field_names)
```
Final Thoughts
The main reason I wrote this article is that finding out about pd.read_sql made querying database connections in Python a much easier process for me. I hope that, alongside tricks like triple double quotes and the ** operator, this write-up helps someone else out too.