The world’s leading publication for data science, AI, and ML professionals.

Connect to PostgreSQL Database Server Using Python Module of psycopg2

Develop a tidy script and directory to connect to PostgreSQL locally or remotely

Photo by Jan Antonin Kolar on Unsplash
Photo by Jan Antonin Kolar on Unsplash

Hands-on Tutorial

Table of contents (only for web)

1 What's it all about?
2 The psycopg module to connect a Postgresql
3 How to connect with PostgreSQL
4 Create, Read, Update, and Delete (CRUD) using psycopg2
  • Create a table
  • Insert data to a table
  • Retrieve a table
  • Update a table
  • Delete rows or a table
5 Conclusion
6 References

What’s it all about?

PostgreSQL is one of the relational database management systems which is open-source and also has a lot of functionality. In Python, we have several modules to connect to PostgreSQL such as SQLAlchemy, pg8000, py-postgresql, etc. However, psycopg2 becomes the most popular one. So that in this tutorial, we will discuss how to connect to PostgreSQL using psycopg2 .

The psycopg module to connect a PostgreSQL

The psycopg2 is the PostgreSQL connector commonly used by Python developers to connect to Python. It’s the core module for this tutorial, so make sure we have installed it on our machine. Use the following command on the terminal to install the psycopg2 module.

$pip3 install psycopg2

How to connect with PostgreSQL

To connect to PostgreSQL, we must create a file that represents our database on PostgreSQL. We will save that database configuration file to <filename>.ini . It keeps our database configuration safe if we share our main Python script with the public. In this case, we will name it database.ini . Please input your database configuration, like hostname, database name, user name, password, port number, etc to database.ini as follows.

[postgresql]
host = localhost
database = customer
user = postgres
password = admindb
port = 5432

Information about database configuration.

  • host – is the server name or IP address on which PostgreSQL is running
  • database – is the database name we want to connect with
  • user – is the name of a user of PostgreSQL
  • password – is a key required to connect to PostgreSQL
  • port – port number that will be used when listening for TCP/IP connections. The default port number is 5432

To parse the information from the database.ini file, we create config.py file with a script as follows. It also creates a function that returns the project root folder.

Note: section = 'postgresql' is set depends on the header of database.ini file. We can modify it later.

# Import libraries
from configparser import ConfigParser
from pathlib import Path
def get_project_root() -> Path:
  """Returns project root folder."""
  return Path(__file__).parents[1]
def config(config_db):
  section = 'postgresql'
  config_file_path = 'config/' + config_db
  if(len(config_file_path) > 0 and len(section) > 0):
    # Create an instance of ConfigParser class
    config_parser = ConfigParser()
    # Read the configuration file
    config_parser.read(config_file_path)
    # If the configuration file contains the provided section name
    if(config_parser.has_section(section)):
      # Read the options of the section
      config_params = config_parser.items(section)
      # Convert the list object to a python dictionary object
      # Define an empty dictionary
      db_conn_dict = {}
      # Loop in the list
      for config_param in config_params:
        # Get options key and value
        key = config_param[0]
        value = config_param[1]
        # Add the key value pair in the dictionary object
        db_conn_dict[key] = value
      # Get connection object use above dictionary object
      return db_conn_dict

When our works need to read the table on the database regularly, we can create a function with the filename db_conn.py as follows. It will load the table that fits with our input (config_db and query).

# Import libraries
import pandas as pd
import psycopg2
from config.config import config
# Take in a PostgreSQL table and outputs a pandas dataframe
def load_db_table(config_db, query):
    params = config(config_db)
    engine = psycopg2.connect(**params)
    data = pd.read_sql(query, con = engine)
    return data

In order to compile our scripts, we set our directory to locate database.ini, config.pyand db_conn.py. Use the following settings.

The recommended directory for running the scripts (Image by Author)
The recommended directory for running the scripts (Image by Author)

Wait! There is main.py file and what is it? Okay then, it must be a Python script to test our PostgreSQL connection via Python. So, here is the main.py.

# Import libraries
from src.data.db_conn import load_db_table
from config.config import get_project_root
# Project root
PROJECT_ROOT = get_project_root()
# Read database - PostgreSQL
df = load_db_table(config_db = 'database.ini', query = 'SELECT * FROM tablename LIMIT 5')
print(df)

It works! Now we’re going to customize, and create other PostgreSQL commands to create, read, update or delete via Python.

Create, Read, Update, and Delete (CRUD) using psycopg2

Create a table

Create a table via psycopg2 uses CREATE TABLE statement, but it is in Python script. Previously, we saved our database configuration file. When we want to connect to PostgreSQL, just import the config function.

# Import libraries
import pandas as pd
import psycopg2
from config.config import config
# Connect to PostgreSQL
params = config(config_db = 'database.ini')
engine = psycopg2.connect(**params)
print('Python connected to PostgreSQL!')
# Create table
cur = con.cursor()
cur.execute("""
CREATE TABLE customer(
customer_id INT PRIMARY KEY NOT NULL,
name CHAR(50) NOT NULL,
address CHAR(100),
email CHAR(50),
phone_number CHAR(20));
""")
print('Table created in PostgreSQL')
# Close the connection
con.commit()
con.close()

Insert data into a table

After creating a new table namely customer, we will insert some values to the table via SQL statement INSERT INTO. Here is an example of this.

# Import libraries
import pandas as pd
import psycopg2
from config.config import config
# Connect to PostgreSQL
params = config(config_db = 'database.ini')
engine = psycopg2.connect(**params)
print('Python connected to PostgreSQL!')
# Insert values to the table
cur = con.cursor()
cur.execute("""
INSERT INTO customer (customer_id,name,address,email,phone_number)
VALUES (12345,'Audhi','Indonesia','[email protected]','+621234567');
""")
cur.execute("""
INSERT INTO customer (customer_id,name,address,email,phone_number)
VALUES (56789,'Aprilliant','Japan','[email protected]','+6213579246');
""")
print('Values inserted to PostgreSQL')
# Close the connection
con.commit()
con.close()

Retrieve a table

We retrieve our customer table via SELECT SQL statement. In this case, we will retrieve the five first rows of customer table.

Note: to retrieve the data from PostgreSQL, make sure we select the right table name and the columns. Don’t use *``** statement because it will slow down our progress

# Import libraries
import pandas as pd
import psycopg2
from config.config import config
# Connect to PostgreSQL
params = config(config_db = 'database.ini')
engine = psycopg2.connect(**params)
print('Python connected to PostgreSQL!')
# Read the table
cur = con.cursor()
cur.execute("""
SELECT customer_id,name,email FROM customer LIMIT 5;
""")
print('Read table in PostgreSQL')
# Close the connection
con.commit()
con.close()

Update a table

When working on the database, it’s normal to update our database regularly and it easily executes using UPDATE statement. Supposed we will update the address of the customer who has the customer_id = 12345. Check our table and it will be updated automatically via psycopg2.

# Import libraries
import pandas as pd
import psycopg2
from config.config import config
# Connect to PostgreSQL
params = config(config_db = 'database.ini')
engine = psycopg2.connect(**params)
print('Python connected to PostgreSQL!')
# Insert values to the table
cur = con.cursor()
cur.execute("""
UPDATE customer SET address = 'Japan' WHERE customer_id = 12345;
""")
print('Values updated in PostgreSQL')
# Close the connection
con.commit()
con.close()

Delete rows or a table

The last, D for delete (rows or a table) using the DELETE statement. In this case, we’d like to delete customer information who has the customer_id = 12345 from a customer table.

# Import libraries
import pandas as pd
import psycopg2
from config.config import config
# Connect to PostgreSQL
params = config(config_db = 'database.ini')
engine = psycopg2.connect(**params)
print('Python connected to PostgreSQL!')
# Delete rows from the table
cur = con.cursor()
cur.execute("""
DELETE FROM customer WHERE customer_id = 12345;
""")
print('Values deleted from PostgreSQL')
# Close the connection
con.commit()
con.close()

Conclusion

When working with tons of codes, we are required to not only create a specific script but also a tidy and clean directory. Sometimes, our scripts contain secret information, such as database configuration, etc. So, we created database.ini outside the main script. And for psycopg2 module, it has a lot of functions to interact with PostgreSQL locally or remotely, this makes it easier for us to develop automation instead of doing the CRUD operation manually via PgAdmin.

References

[1] Python-Dev. psycopg2 2.8.6 (2020), https://pypi.org/project/psycopg2/.


Related Articles