
Table of Contents
What are Agents? What Do We Need to build our own Intelligent Assistant? How do we connect all required Skills? – The Theory behind Agents How does Agents work? – The Chain-of-Thought The Agent Executor – The Agent behind the Agent
How to Use the SQLDatabaseToolkit? Hands-On Tutorial
Intro
This article is about how we can get LLMs (Large Language Models) to solve complex tasks independently.
Humans are remarkable in their ability to constantly absorb information, make decisions, take actions, observe the changes, and then make the next decision. Our entire live is a never-ending chain of observations, thoughts and actions. This concept we want to transfer to LLMs to continuously make new decisions and thus gradually approaching the solution to more complex problems.
By breaking down complex problems into smaller, manageable pieces and continuously building on the knowledge of previous generations, we as humans have come a long way. It may have taken 4 billion years, but now we have reached a point where private companies are building rockets, and factories can be operated entirely by robots.

But that is not the end, our economy is built on ever more growth … further accelerating progress is only possible through disruptive ideas. One of these is certainly Generative AI, which not only generates new content, but also interprets texts and observations and can make decisions itself. By constantly analyzing observations and making decisions, we drive progress until we finally build cities on Mars and become immortal … or whatever it is we want to achieve.
![Progress and Innovations : today and tomorrow - Image by the author [Webclouddata.com]](https://towardsdatascience.com/wp-content/uploads/2024/01/1G1i99cX8Mvb1j1gmGwqYrg.png)
Today, we are not quite there yet, because every AI model that exists only ever mimics a certain aspect of human intelligence. Large Language Models, for example, are remarkably good at understanding and composing texts. Their capabilities often surpass human performance in these areas. However, when it comes to simple arithmetic tasks, LLMs often stumble.
So how can we get them to the point where they can solve more complex questions on their own?
One approach is the concept of Agents.
What are Agents?
Science fiction and spy movies often use a central Artificial Intelligence that talks to the hero and searches the internet and various secret databases to guide him or her through the missions, J.A.R.V.I.S. from Iron Man is just one example.
What makes Jarvis to Jarvis? Iron Man don’t have to tell him how to solve a problem, it finds a way by itself. And that is exactly what we want to achieve with Agents.
What Do We Need to build our own Intelligent Assistant?
The core is definitely a super good Large Language Model, which is capable of interpreting a question, observe an environment and making decisions based on that. Add to this a few models that translate speech into text, models that interpret image content, and voilà, we have everything we need to build our own Jarvis.

How do we connect all required Skills? – The Theory behind Agents
An agent is a component that has access to a range of tools. It’s main feature lies in its ability to make informed decisions and employing the appropriate tools until it achieves a good enough answer.
The concept of agents is perhaps the most fascinating development when it comes to LLM applications. It makes us dream of an AI that autonomously controls processes, conducts research or rescues our superhero by finding the scenario with the highest probability of survival.
When we apply the concept of Agents, we use the LLM not just to answer questions, but rather as the brain that processes the observations it sees and makes decisions about what to do next. We humans do that all the time. We have a task to solve and are looking for ways and tools that help us to solve this task as easily as possible.

Applying this concept and right tools, the computer is also able to solve complex queries. For example, if we use LLMs to convert text to SQL, the SQLAgent from Langchain will not give up if an error occurs when executing the generated SQL. Instead, it will try to interpret the error and rectify the issue.
The just mentioned SQL Agent is using a set of LangChain tools called SQLDatabaseToolkit. But the range of possible tools is diverse. I will explain more about the tools that LangChain already provides below, but a potential tool could be anything. I already mentioned above how incapable the language models are when it comes to simple analytical calculations. We need an Agent that identifies analytical tasks and then uses a calculator to solve the problem.
OpenAI’s ChatGPT already does that. If you use the Plus subscription and send a request, you will see an analyzing step before the chatbot decides how to solve the request. It chooses from a range of tools, e.g. the Python REPL to solve analytical tasks, the web search if it needs up-to-date information or Dall-E3 if the user asks for an image to be created:

How does Agents work? – The Chain-of-Thought
One of the great strengths of humans is the ability to absorb a relatively large amount of information, filter out unimportant details, and make decisions based on critical information. We often start by breaking down large questions into hypotheses and then attempt to support or refute these hypotheses step by step with observations.
We mimic this concept using "Chain-of-Thought Prompting", which breaks down the multi-step problems into intermediate steps.
This can be described in 3 simple steps:
- We initiate an action, and our LLM observes the feedback from a chosen environment.
- Our process collects all the information and uses it to decide what suitable action to take next.
- If we do this iteratively to solve a bigger, more complex task, we use a method called "Reasoning Trace" which involves tracking the steps or stages our process goes through to arrive at a conclusion or solution.

So if we want to implement something like that, we first need a pool of potential actions that could be taken. LangChain captures this assortment of possible functions in the form of so-called Toolkits.
These Toolkits provide a variety of functions or actions that the LLM can choose from, depending on the task at hand.

The tools need to be described in a way that lets the agent know exactly what each tool can be used for. This seems logical. LLMs and agents can’t do magic; they at least need a description of the available tools to decide which ones could do the trick.
LangChain already offers several built-in tools, but it also allows you to create your own toolset.
So what’s left? – We need something that uses these tools, the Agent Executor.
The Agent Executor – The Agent behind the Agent
The Agent is executing the right tools in succession, continuously observing the outcomes and then deciding which tool is needed for the next step. This iterative execution of functions is carried out by what is known as the AgentExecutor.
The entire process is repeated over and over again until we reach a predefined termination criterion. The image below shows an example that uses the SQLDatabaseToolkit from LangChain to find the answer to the question "How many customers do we have in our database".
It uses the database we create in the "Hands-on tutorial" section below, which contains three tables:
- Customer: all customers and data that describes the customer
- Orders: all orders that relate to the customers in the customer table
- Agents: the agents assigned to the different customers
Let’s take a look at how the agent tries to find the answer to the question.

The AgentExecutor class I have just described is actually just one of many possible options. You can also choose other agent runtimes, such as:
- Plan-and-execute Agent
- Baby AGI
- Auto GPT
Each of these agent runtimes has its own approach and methodology for executing tasks and making decisions. The choice of runtime can depend on the specific requirements of the task at hand, the complexity of the decision-making process, and the level of autonomy or intelligence you want the agent to exhibit.
So, in summary:
- Tools are the various functions, capabilities, or actions that an agent can utilize to complete tasks or solve problems.
- These tools must be clearly described so the agent understands their purpose and how to use them effectively.
- Langchain not only provides a range of built-in tools but also the flexibility to incorporate custom tools
- The AgentExecutor plays a critical role in managing the workflow, continuously evaluating the effectiveness of the tools used and adjusting the strategy as needed to achieve the desired outcomes.
From Theory to Practice
LangChain provides us with some guardrails to build our own customized agents, but it also offers a range of ready-to-use tools.
If we want to build our own J.A.R.V.I.S., the first ability we must give it is the ability to read and interpret data. With text-based data this is straightforward. The LLMs can interpret unstructured texts from the ground up very well.
However, it looks different when we want to retrieve the right data from structured data, e.g., databases. For this we want to use an Agent and the so-called SQLDatabaseToolkit from Langchain, which allows the Agent to interact with the database using SQL commands.
Let’s try it out and see what it can do.
You can find all the code snippets at Github.
How to Use the SQLDatabaseToolkit?
By using the SQLDatabaseToolkit, we build an agent to interact with SQL
databases. The agent is designed to answer more general questions about a database, as well as to recover from errors.
Just like with all agents, not all responses will be correct. The whole thing is not an exact science. Sometimes it will come up with an answer, sometimes not, which of course also depends on our questions and if the Agent has enough information to understand our database structure. I believe we can only try it out and see how far we can get with the existing toolkit.
Let’s try it.
Note:
Since it’s not entirely clear what the agent and the model will do with our query, it’s better to use it with a database user which can only query tables and is not allowed to insert data or even delete tables. This way, we can provide some guard rails to the whole process. Moreover, the user could overload our SQL database by asking a simple question like "run the biggest query possible". We need to think about how to mitigate these risks as much as possible.
Hands-On Tutorial
What we need is the SQLite database installed and an OpenAI key, to use the LLMs from OpenAI for our experiment. Further down you can see how I am using the key from OpenAI:
openai_api_key=os.getenv("OPENAI_API_KEY")
To use it in this way, we first need to create an environment variable called "OPENAI_API_KEY" and save our own API key. The easiest way to find your key is to search for "OpenAI API key" in Google, which will take you to your OpenAI user profile.
SQLite Database
To populate a sample database that we can use for our experiment, I use the lightweight in-file database SQLite. This allows you to easily reproduce the experiment.
The database is set up and populated using a simple Python script, which you can find below.
Virtual Environment and requirements.txt
But first we need a Python environment with all the necessary Python packages. First create a new project directory. I have created a new Git repository that automatically creates the ".gitignore", "License" and "README.md" files, which you don’t necessarily need. You can also create an empty directory.

In the project directory I am creating an new virtual environment ".venv" using the following command:

python -m virtualenv .venv
It creates a new folder ".venv" within the project directory:

Then we start the virtual environment, by executing the the "activate" file within the ".venv" folder:

.venv/Scripts/activate
and install all packages from requirements.txt. To do this, first create a new file with the name "requirements.txt" with the following content:
aiohttp==3.9.1
aiosignal==1.3.1
annotated-types==0.6.0
anyio==4.2.0
asttokens==2.4.1
async-timeout==4.0.3
attrs==23.2.0
certifi==2023.11.17
charset-normalizer==3.3.2
colorama==0.4.6
comm==0.2.1
dataclasses-json==0.6.3
debugpy==1.8.0
decorator==5.1.1
distro==1.9.0
exceptiongroup==1.2.0
executing==2.0.1
frozenlist==1.4.1
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.2
httpx==0.26.0
idna==3.6
ipykernel==6.28.0
ipython==8.20.0
jedi==0.19.1
jsonpatch==1.33
jsonpointer==2.4
jupyter_client==8.6.0
jupyter_core==5.7.1
langchain==0.1.0
langchain-community==0.0.10
langchain-core==0.1.8
langchain-openai==0.0.2
langsmith==0.0.77
marshmallow==3.20.1
matplotlib-inline==0.1.6
multidict==6.0.4
mypy-extensions==1.0.0
nest-asyncio==1.5.8
numpy==1.26.3
openai==1.6.1
packaging==23.2
parso==0.8.3
platformdirs==4.1.0
prompt-toolkit==3.0.43
psutil==5.9.7
pure-eval==0.2.2
pydantic==2.5.3
pydantic_core==2.14.6
Pygments==2.17.2
python-dateutil==2.8.2
pywin32==306
PyYAML==6.0.1
pyzmq==25.1.2
regex==2023.12.25
requests==2.31.0
six==1.16.0
sniffio==1.3.0
SQLAlchemy==2.0.25
stack-data==0.6.3
tenacity==8.2.3
tiktoken==0.5.2
tornado==6.4
tqdm==4.66.1
traitlets==5.14.1
typing-inspect==0.9.0
typing_extensions==4.9.0
urllib3==2.1.0
wcwidth==0.2.13
yarl==1.9.4
And then install all packages from the just created "requirements.txt":
pip install -r requirements.txt
Create and fill the SQLite database
To create and fill the database, I am using a sample database template from www.w3resource.com. To execute the SQL commands step by step I am creating a python script called "01_create_and_fill_database.py". It creates the SQLite database and executes the SQL commands to create the tables and fill them with some samples.
First you need to create the just mentioned python file "01_create_and_fill_database.py" and fill it with the following source code:
# 01_create_and_fill_database.py
import sqlite3
import os
# File path
database_file_path = './sql_lite_database.db'
# Check if database file exists and delete if it does
if os.path.exists(database_file_path):
os.remove(database_file_path)
message = "File 'sql_lite_database.db' found and deleted."
else:
message = "File 'sql_lite_database.db' does not exist."
# Step 1: Connect to the database or create it if it doesn't exist
conn = sqlite3.connect(database_file_path)
# Step 2: Create a cursor
cursor = conn.cursor()
# Step 3: Create tables
create_table_query1 = """
CREATE TABLE IF NOT EXISTS "AGENTS"
(
"AGENT_CODE" CHAR(6) NOT NULL PRIMARY KEY,
"AGENT_NAME" CHAR(40),
"WORKING_AREA" CHAR(35),
"COMMISSION" NUMBER(10,2),
"PHONE_NO" CHAR(15),
"COUNTRY" VARCHAR2(25)
);
"""
create_table_query2 = """
CREATE TABLE IF NOT EXISTS "CUSTOMER"
( "CUST_CODE" VARCHAR2(6) NOT NULL PRIMARY KEY,
"CUST_NAME" VARCHAR2(40) NOT NULL,
"CUST_CITY" CHAR(35),
"WORKING_AREA" VARCHAR2(35) NOT NULL,
"CUST_COUNTRY" VARCHAR2(20) NOT NULL,
"GRADE" NUMBER,
"OPENING_AMT" NUMBER(12,2) NOT NULL,
"RECEIVE_AMT" NUMBER(12,2) NOT NULL,
"PAYMENT_AMT" NUMBER(12,2) NOT NULL,
"OUTSTANDING_AMT" NUMBER(12,2) NOT NULL,
"PHONE_NO" VARCHAR2(17) NOT NULL,
"AGENT_CODE" CHAR(6) NOT NULL REFERENCES AGENTS
);
"""
create_table_query3 = """
CREATE TABLE IF NOT EXISTS "ORDERS"
(
"ORD_NUM" NUMBER(6,0) NOT NULL PRIMARY KEY,
"ORD_AMOUNT" NUMBER(12,2) NOT NULL,
"ADVANCE_AMOUNT" NUMBER(12,2) NOT NULL,
"ORD_DATE" DATE NOT NULL,
"CUST_CODE" VARCHAR2(6) NOT NULL REFERENCES CUSTOMER,
"AGENT_CODE" CHAR(6) NOT NULL REFERENCES AGENTS,
"ORD_DESCRIPTION" VARCHAR2(60) NOT NULL
);
"""
queries = [create_table_query1, create_table_query2, create_table_query3]
# queries = [create_table_query1, create_table_query2]
for query in queries:
# execute queries
cursor.execute(query)
# Step 4: Insert data into tables Agents, Orders and Customers
insert_query = """
INSERT INTO AGENTS VALUES ('A007', 'Ramasundar', 'Bangalore', '0.15', '077-25814763', '');
INSERT INTO AGENTS VALUES ('A003', 'Alex ', 'London', '0.13', '075-12458969', '');
INSERT INTO AGENTS VALUES ('A008', 'Alford', 'New York', '0.12', '044-25874365', '');
INSERT INTO AGENTS VALUES ('A011', 'Ravi Kumar', 'Bangalore', '0.15', '077-45625874', '');
INSERT INTO AGENTS VALUES ('A010', 'Santakumar', 'Chennai', '0.14', '007-22388644', '');
INSERT INTO AGENTS VALUES ('A012', 'Lucida', 'San Jose', '0.12', '044-52981425', '');
INSERT INTO AGENTS VALUES ('A005', 'Anderson', 'Brisban', '0.13', '045-21447739', '');
INSERT INTO AGENTS VALUES ('A001', 'Subbarao', 'Bangalore', '0.14', '077-12346674', '');
INSERT INTO AGENTS VALUES ('A002', 'Mukesh', 'Mumbai', '0.11', '029-12358964', '');
INSERT INTO AGENTS VALUES ('A006', 'McDen', 'London', '0.15', '078-22255588', '');
INSERT INTO AGENTS VALUES ('A004', 'Ivan', 'Torento', '0.15', '008-22544166', '');
INSERT INTO AGENTS VALUES ('A009', 'Benjamin', 'Hampshair', '0.11', '008-22536178', '');
INSERT INTO CUSTOMER VALUES ('C00013', 'Holmes', 'London', 'London', 'UK', '2', '6000.00', '5000.00', '7000.00', '4000.00', 'BBBBBBB', 'A003');
INSERT INTO CUSTOMER VALUES ('C00001', 'Micheal', 'New York', 'New York', 'USA', '2', '3000.00', '5000.00', '2000.00', '6000.00', 'CCCCCCC', 'A008');
INSERT INTO CUSTOMER VALUES ('C00020', 'Albert', 'New York', 'New York', 'USA', '3', '5000.00', '7000.00', '6000.00', '6000.00', 'BBBBSBB', 'A008');
INSERT INTO CUSTOMER VALUES ('C00025', 'Ravindran', 'Bangalore', 'Bangalore', 'India', '2', '5000.00', '7000.00', '4000.00', '8000.00', 'AVAVAVA', 'A011');
INSERT INTO CUSTOMER VALUES ('C00024', 'Cook', 'London', 'London', 'UK', '2', '4000.00', '9000.00', '7000.00', '6000.00', 'FSDDSDF', 'A006');
INSERT INTO CUSTOMER VALUES ('C00015', 'Stuart', 'London', 'London', 'UK', '1', '6000.00', '8000.00', '3000.00', '11000.00', 'GFSGERS', 'A003');
INSERT INTO CUSTOMER VALUES ('C00002', 'Bolt', 'New York', 'New York', 'USA', '3', '5000.00', '7000.00', '9000.00', '3000.00', 'DDNRDRH', 'A008');
INSERT INTO CUSTOMER VALUES ('C00018', 'Fleming', 'Brisban', 'Brisban', 'Australia', '2', '7000.00', '7000.00', '9000.00', '5000.00', 'NHBGVFC', 'A005');
INSERT INTO CUSTOMER VALUES ('C00021', 'Jacks', 'Brisban', 'Brisban', 'Australia', '1', '7000.00', '7000.00', '7000.00', '7000.00', 'WERTGDF', 'A005');
INSERT INTO CUSTOMER VALUES ('C00019', 'Yearannaidu', 'Chennai', 'Chennai', 'India', '1', '8000.00', '7000.00', '7000.00', '8000.00', 'ZZZZBFV', 'A010');
INSERT INTO CUSTOMER VALUES ('C00005', 'Sasikant', 'Mumbai', 'Mumbai', 'India', '1', '7000.00', '11000.00', '7000.00', '11000.00', '147-25896312', 'A002');
INSERT INTO CUSTOMER VALUES ('C00007', 'Ramanathan', 'Chennai', 'Chennai', 'India', '1', '7000.00', '11000.00', '9000.00', '9000.00', 'GHRDWSD', 'A010');
INSERT INTO CUSTOMER VALUES ('C00022', 'Avinash', 'Mumbai', 'Mumbai', 'India', '2', '7000.00', '11000.00', '9000.00', '9000.00', '113-12345678','A002');
INSERT INTO CUSTOMER VALUES ('C00004', 'Winston', 'Brisban', 'Brisban', 'Australia', '1', '5000.00', '8000.00', '7000.00', '6000.00', 'AAAAAAA', 'A005');
INSERT INTO CUSTOMER VALUES ('C00023', 'Karl', 'London', 'London', 'UK', '0', '4000.00', '6000.00', '7000.00', '3000.00', 'AAAABAA', 'A006');
INSERT INTO CUSTOMER VALUES ('C00006', 'Shilton', 'Torento', 'Torento', 'Canada', '1', '10000.00', '7000.00', '6000.00', '11000.00', 'DDDDDDD', 'A004');
INSERT INTO CUSTOMER VALUES ('C00010', 'Charles', 'Hampshair', 'Hampshair', 'UK', '3', '6000.00', '4000.00', '5000.00', '5000.00', 'MMMMMMM', 'A009');
INSERT INTO CUSTOMER VALUES ('C00017', 'Srinivas', 'Bangalore', 'Bangalore', 'India', '2', '8000.00', '4000.00', '3000.00', '9000.00', 'AAAAAAB', 'A007');
INSERT INTO CUSTOMER VALUES ('C00012', 'Steven', 'San Jose', 'San Jose', 'USA', '1', '5000.00', '7000.00', '9000.00', '3000.00', 'KRFYGJK', 'A012');
INSERT INTO CUSTOMER VALUES ('C00008', 'Karolina', 'Torento', 'Torento', 'Canada', '1', '7000.00', '7000.00', '9000.00', '5000.00', 'HJKORED', 'A004');
INSERT INTO CUSTOMER VALUES ('C00003', 'Martin', 'Torento', 'Torento', 'Canada', '2', '8000.00', '7000.00', '7000.00', '8000.00', 'MJYURFD', 'A004');
INSERT INTO CUSTOMER VALUES ('C00009', 'Ramesh', 'Mumbai', 'Mumbai', 'India', '3', '8000.00', '7000.00', '3000.00', '12000.00', 'Phone No', 'A002');
INSERT INTO CUSTOMER VALUES ('C00014', 'Rangarappa', 'Bangalore', 'Bangalore', 'India', '2', '8000.00', '11000.00', '7000.00', '12000.00', 'AAAATGF', 'A001');
INSERT INTO CUSTOMER VALUES ('C00016', 'Venkatpati', 'Bangalore', 'Bangalore', 'India', '2', '8000.00', '11000.00', '7000.00', '12000.00', 'JRTVFDD', 'A007');
INSERT INTO CUSTOMER VALUES ('C00011', 'Sundariya', 'Chennai', 'Chennai', 'India', '3', '7000.00', '11000.00', '7000.00', '11000.00', 'PPHGRTS', 'A010');
INSERT INTO ORDERS VALUES('200100', '1000.00', '600.00', '2008-08-01', 'C00013', 'A003', 'SOD');
INSERT INTO ORDERS VALUES('200110', '3000.00', '500.00', '2008-04-15', 'C00019', 'A010', 'SOD');
INSERT INTO ORDERS VALUES('200107', '4500.00', '900.00', '2008-08-30', 'C00007', 'A010', 'SOD');
INSERT INTO ORDERS VALUES('200112', '2000.00', '400.00', '2008-05-30', 'C00016', 'A007', 'SOD');
INSERT INTO ORDERS VALUES('200113', '4000.00', '600.00', '2008-06-10', 'C00022', 'A002', 'SOD');
INSERT INTO ORDERS VALUES('200102', '2000.00', '300.00', '2008-05-25', 'C00012', 'A012', 'SOD');
INSERT INTO ORDERS VALUES('200114', '3500.00', '2000.00', '2008-08-15', 'C00002', 'A008', 'SOD');
INSERT INTO ORDERS VALUES('200122', '2500.00', '400.00', '2008-09-16', 'C00003', 'A004', 'SOD');
INSERT INTO ORDERS VALUES('200118', '500.00', '100.00', '2008-07-20', 'C00023', 'A006', 'SOD');
INSERT INTO ORDERS VALUES('200119', '4000.00', '700.00', '2008-09-16', 'C00007', 'A010', 'SOD');
INSERT INTO ORDERS VALUES('200121', '1500.00', '600.00', '2008-09-23', 'C00008', 'A004', 'SOD');
INSERT INTO ORDERS VALUES('200130', '2500.00', '400.00', '2008-07-30', 'C00025', 'A011', 'SOD');
INSERT INTO ORDERS VALUES('200134', '4200.00', '1800.00', '2008-09-25', 'C00004', 'A005', 'SOD');
INSERT INTO ORDERS VALUES('200108', '4000.00', '600.00', '2008-02-15', 'C00008', 'A004', 'SOD');
INSERT INTO ORDERS VALUES('200103', '1500.00', '700.00', '2008-05-15', 'C00021', 'A005', 'SOD');
INSERT INTO ORDERS VALUES('200105', '2500.00', '500.00', '2008-07-18', 'C00025', 'A011', 'SOD');
INSERT INTO ORDERS VALUES('200109', '3500.00', '800.00', '2008-07-30', 'C00011', 'A010', 'SOD');
INSERT INTO ORDERS VALUES('200101', '3000.00', '1000.00', '2008-07-15', 'C00001', 'A008', 'SOD');
INSERT INTO ORDERS VALUES('200111', '1000.00', '300.00', '2008-07-10', 'C00020', 'A008', 'SOD');
INSERT INTO ORDERS VALUES('200104', '1500.00', '500.00', '2008-03-13', 'C00006', 'A004', 'SOD');
INSERT INTO ORDERS VALUES('200106', '2500.00', '700.00', '2008-04-20', 'C00005', 'A002', 'SOD');
INSERT INTO ORDERS VALUES('200125', '2000.00', '600.00', '2008-10-10', 'C00018', 'A005', 'SOD');
INSERT INTO ORDERS VALUES('200117', '800.00', '200.00', '2008-10-20', 'C00014', 'A001', 'SOD');
INSERT INTO ORDERS VALUES('200123', '500.00', '100.00', '2008-09-16', 'C00022', 'A002', 'SOD');
INSERT INTO ORDERS VALUES('200120', '500.00', '100.00', '2008-07-20', 'C00009', 'A002', 'SOD');
INSERT INTO ORDERS VALUES('200116', '500.00', '100.00', '2008-07-13', 'C00010', 'A009', 'SOD');
INSERT INTO ORDERS VALUES('200124', '500.00', '100.00', '2008-06-20', 'C00017', 'A007', 'SOD');
INSERT INTO ORDERS VALUES('200126', '500.00', '100.00', '2008-06-24', 'C00022', 'A002', 'SOD');
INSERT INTO ORDERS VALUES('200129', '2500.00', '500.00', '2008-07-20', 'C00024', 'A006', 'SOD');
INSERT INTO ORDERS VALUES('200127', '2500.00', '400.00', '2008-07-20', 'C00015', 'A003', 'SOD');
INSERT INTO ORDERS VALUES('200128', '3500.00', '1500.00', '2008-07-20', 'C00009', 'A002', 'SOD');
INSERT INTO ORDERS VALUES('200135', '2000.00', '800.00', '2008-09-16', 'C00007', 'A010', 'SOD');
INSERT INTO ORDERS VALUES('200131', '900.00', '150.00', '2008-08-26', 'C00012', 'A012', 'SOD');
INSERT INTO ORDERS VALUES('200133', '1200.00', '400.00', '2008-06-29', 'C00009', 'A002', 'SOD');
"""
for row in insert_query.splitlines():
try:
cursor.execute(row)
except:
print(f"An error occurred")
print(row)
# Step 5: Fetch data from tables
list_of_queries = []
list_of_queries.append("SELECT * FROM AGENTS")
list_of_queries.append("SELECT * FROM CUSTOMER")
list_of_queries.append("SELECT * FROM ORDERS")
# execute queries
for query in list_of_queries:
cursor.execute(query)
data = cursor.fetchall()
print(f"--- Data from tables ({query}) ---")
for row in data:
print(row)
# Step 7: Close the cursor and connection
cursor.close()
conn.commit()
conn.close()
Before you run the script you need to make sure your virtual environment is activated (.venv). Then run the Python script you have just created.

You will see that a new file called "sql_lite_database.db" has appeared in your folder structure. It is representing your SQLite database we need for the subsequent tests.

In the image below shows the table structure of just created database.
![Database UML class diagram [w3resource.com]](https://towardsdatascience.com/wp-content/uploads/2024/01/1I4GI1q58q6RmT3HzGeYgIw.png)
The tables themselves are filled with a few sample data points. This is sufficient for our simple test.
To be able to chat with our database, we use LangChain Agents. As a toolkit we are using the SQLDatabaseToolkit.
LangChain Agents
To define our agents, we need to specify the following points:
- The agent’s toolkit, here: SQLDatabaseToolkit
- The agent executor with agent type, here:
ZERO_SHOT_REACT_DESCRIPTION
. As mentioned above, there several types of agent types. You can find some of them here.
To test it, I create a new Jupyter Notebook file, called "02_langchain_sql_agents.ipynb". I fill the first cell with the following code, and execute it:
from langchain.utilities import SQLDatabase
from langchain.llms import OpenAI
from langchain.agents.agent_types import AgentType
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.utilities import SQLDatabase
from langchain.llms import OpenAI
from langchain.agents import create_sql_agent
import os
# define the database we want to use for our test
db = SQLDatabase.from_uri('sqlite:///sql_lite_database.db')
# choose llm model, in this case the default OpenAI model
llm = OpenAI(
temperature=0,
verbose=True,
openai_api_key=os.getenv("OPENAI_API_KEY"),
)
# setup agent
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent_executor = create_sql_agent(
llm=llm,
toolkit=toolkit,
verbose=True,
agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)
# define the user's question
question = "How many customers do we have in our database?"
agent_executor.invoke(question)


That’s it. You have just chatted with our database.
I’m quite surprised how well it works, even if the descriptions of the database fields are not always clear.
Here are just some more examples:
- Query: "List all customers from the US"
# define the user's question
question = "List all customers from the US"
agent_executor.invoke(question)
- Output – Chain-of-Thought:
> Entering new AgentExecutor chain...
Action: sql_db_list_tables
Action Input:
Observation: AGENTS, CUSTOMER, ORDERS
Thought: The CUSTOMER table seems like the most relevant table to query.
Action: sql_db_schema
Action Input: CUSTOMER
Observation:
CREATE TABLE "CUSTOMER" (
"CUST_CODE" TEXT(6) NOT NULL,
"CUST_NAME" TEXT(40) NOT NULL,
"CUST_CITY" CHAR(35),
"WORKING_AREA" TEXT(35) NOT NULL,
"CUST_COUNTRY" TEXT(20) NOT NULL,
"GRADE" NUMERIC,
"OPENING_AMT" NUMERIC(12, 2) NOT NULL,
"RECEIVE_AMT" NUMERIC(12, 2) NOT NULL,
"PAYMENT_AMT" NUMERIC(12, 2) NOT NULL,
"OUTSTANDING_AMT" NUMERIC(12, 2) NOT NULL,
"PHONE_NO" TEXT(17) NOT NULL,
"AGENT_CODE" CHAR(6) NOT NULL,
PRIMARY KEY ("CUST_CODE"),
FOREIGN KEY("AGENT_CODE") REFERENCES "AGENTS" ("AGENT_CODE")
)
/*
3 rows from CUSTOMER table:
CUST_CODE CUST_NAME CUST_CITY WORKING_AREA CUST_COUNTRY GRADE OPENING_AMT RECEIVE_AMT PAYMENT_AMT OUTSTANDING_AMT PHONE_NO AGENT_CODE
C00013 Holmes London London UK 2.0000000000 6000.00 5000.00 7000.00 4000.00 BBBBBBB A003
C00001 Micheal New York New York USA 2.0000000000 3000.00 5000.00 2000.00 6000.00 CCCCCCC A008
C00020 Albert New York New York USA 3.0000000000 5000.00 7000.00 6000.00 6000.00 BBBBSBB A008
*/
Thought: I should query the CUSTOMER table for all customers from the US and limit the results to 10.
Action: sql_db_query
Action Input: SELECT CUST_NAME FROM CUSTOMER WHERE CUST_COUNTRY='USA' LIMIT 10
Observation: [('Micheal',), ('Albert',), ('Bolt',), ('Steven',)]
Thought: I now know the final answer
Final Answer: Micheal, Albert, Bolt, Steven
> Finished chain.
- Query: "What is the phone number of the customer Albert?"
# define the user's question
question = "What is the phone number of the customer Albert?"
agent_executor.invoke(question)
- Output – Chain-of-Thought:
> Entering new AgentExecutor chain...
Action: sql_db_list_tables
Action Input:
Observation: AGENTS, CUSTOMER, ORDERS
Thought: I should query the CUSTOMER table to get the phone number of the customer Albert.
Action: sql_db_schema
Action Input: CUSTOMER
Observation:
CREATE TABLE "CUSTOMER" (
"CUST_CODE" TEXT(6) NOT NULL,
"CUST_NAME" TEXT(40) NOT NULL,
"CUST_CITY" CHAR(35),
"WORKING_AREA" TEXT(35) NOT NULL,
"CUST_COUNTRY" TEXT(20) NOT NULL,
"GRADE" NUMERIC,
"OPENING_AMT" NUMERIC(12, 2) NOT NULL,
"RECEIVE_AMT" NUMERIC(12, 2) NOT NULL,
"PAYMENT_AMT" NUMERIC(12, 2) NOT NULL,
"OUTSTANDING_AMT" NUMERIC(12, 2) NOT NULL,
"PHONE_NO" TEXT(17) NOT NULL,
"AGENT_CODE" CHAR(6) NOT NULL,
PRIMARY KEY ("CUST_CODE"),
FOREIGN KEY("AGENT_CODE") REFERENCES "AGENTS" ("AGENT_CODE")
)
/*
3 rows from CUSTOMER table:
CUST_CODE CUST_NAME CUST_CITY WORKING_AREA CUST_COUNTRY GRADE OPENING_AMT RECEIVE_AMT PAYMENT_AMT OUTSTANDING_AMT PHONE_NO AGENT_CODE
C00013 Holmes London London UK 2.0000000000 6000.00 5000.00 7000.00 4000.00 BBBBBBB A003
C00001 Micheal New York New York USA 2.0000000000 3000.00 5000.00 2000.00 6000.00 CCCCCCC A008
C00020 Albert New York New York USA 3.0000000000 5000.00 7000.00 6000.00 6000.00 BBBBSBB A008
*/
Thought: I should query the CUSTOMER table for the phone number of the customer Albert.
Action: sql_db_query
Action Input: SELECT PHONE_NO FROM CUSTOMER WHERE CUST_NAME = 'Albert' LIMIT 10
Observation: [('BBBBSBB',)]
Thought: I now know the final answer
Final Answer: The phone number of the customer Albert is BBBBSBB.
> Finished chain.
Summary
We humans break down complex problems into smaller subtasks and hypotheses and try to prove or disprove them step by step in order to get closer to solving a larger problem. We want to mimic this behavior with LLMs by using the concept of agents. By providing our agent with the right tools, the Agent decides on its own which actions to take next in order to get closer to the overall solution.
I find the whole thing so exciting because if agents actually become intelligent enough to mimic human ingenuity and research skills, we could accelerate progress in these areas enormously. Humans would no longer be the bottleneck when it comes to inventing new things and discovering the secrets of this world.
Enjoyed the story?
- Subscribe for free to get notified when I publish a new story.
- And feel free to share your thoughts or ask questions in the comment section! I try to respond to all of them!
You can also find me on LinkedIn!