
Empowering Data-Driven Decisions: Embedding Trust in Text-to-SQL AI Agents

Simplifying complex data environments for users with reliable AI agent systems, for better data-driven decision-making

What are the key factors that drive users’ trust and engagement in conversational data-driven AI applications?

The rising number of organizations adopting AI applications internally does not necessarily translate into higher user engagement, and trust in AI-generated results is still not high enough for critical decision-making.

Introducing additional components into such applications becomes crucial to drive both the engagement and the trust of their users, ultimately moving organizations towards a more data-driven decision-making environment.


Table of Contents

1- Simplicity plays a crucial role in driving engagement (The Why)

2- Decision-Making Becomes Conversational (The What)

3- Building Trust in AI Data Applications (The How)

4 – Hands-On: Building Data Chat Room with Gradio, Postgres, and Langchain

Summary


1- Simplicity plays a crucial role in driving engagement (The Why)

Data-driven organizations have discovered data’s significant benefits, and many see those benefits reflected in their market advantage. According to a McKinsey report, this value can amount to up to a 25% increase in EBITDA. Others, meanwhile, have struggled to get far in this area, and not for lack of trying.

Looking at a few numbers, we see many organizations failing to reach their goal of becoming more data-driven. If it is clear that data gives you an edge in the market, why is the majority still lagging behind?

It is not a straightforward question to answer: some perceive it as a huge investment, others may not see the value, and some may lack the right skillset.

Extracting value from data often runs into bottlenecks: data may simply not be readily available for users to analyze, or the process of accessing it may be so complex that many give up before trying.

The list is long, but there are some cases I experienced first-hand while working with organizations building their data capabilities: businesses that had their data warehouse or lake in place yet struggled to create value from their data.

"Simplicity is the ultimate sophistication."

– Leonardo da Vinci

Da Vinci said it best: simplicity is key. Yet we still see data users struggling with complexity.

Let’s assume a business managed to pull through and build its data platform.

Data users are faced with data warehouses and databases consisting of hundreds of tables spanning thousands of rows and columns. In the best cases, things are documented and a data catalog is in place.

At worst, they’re trapped in a maze. Dive in a bit deeper and you find more dashboards, more reports, a few hundred "priority" KPIs, and so on.

For urgent decisions, you are then likely to do your best to avoid spending hours searching through that haystack of data-driven value.

Fortunately, alternatives exist.

In the next sections, we work through what AI can bring to the table when it comes to simplifying complex data environments for users, how it can be tuned to establish a level of trust that encourages users to engage with it, and finally a hands-on tutorial on building a fully functioning AI agent system that uses Llama 3.1 8B and GPT-4o-mini to deliver fast and trustworthy results in a simple, configurable chat room.


2 – Decision-Making Becomes Conversational (The What)

Experts are distinguished by their ability for fast information retrieval based on years of accumulated experience. You ask a well-formulated question, they provide the answer. A simple, efficient, and fast transaction. The expert abstracts away all the complex processing that went towards providing the answer.

Expert Agents – Image by Author

This might hold for experts in specific fields where data is manageable. When it comes to large amounts of information stored across distributed clusters, data experts are required to consult their data source systems and query engines for answers.

Now, let’s imagine we could create a sufficiently scalable number of such experts, all with access to the organization’s databases and data sources, available 24/7, and ready to provide answers within seconds given some user input.

Take it a step further and let those experts discuss and analyze the best answer to give you. We end up with a system capable of serving a large number of data users’ requests on demand.

It is crucial to understand that such systems become ineffective the moment the trust placed in them is compromised. The goal here is trust: trust in a system that delivers the correct numbers and answers to the right people on time.

Now let’s call these scalable experts AI Agents. AI Agents operate effectively with the right information, tools, and instructions.

We will not dive deep into AI Agents themselves; great articles already exist that explain them in detail.

The problem at hand translates to: How can we engineer information, tools, and instructions to build a capable and trustworthy system that can serve data users effectively?


3 – Building Trust in AI Data Applications (The How)

Information, tools, and instructions. These are key factors to consider when building effective AI Agent-based solutions.

AI Agents Workflows – Image by Author

Let’s analyze our landscape and make things a bit clearer.

Information

  • Information in our data landscape is available in databases, stored as tables, functions, and metadata.
  • Information is additionally sourced from the prompt’s context and from the instructions inside the tools assigned to agents.

Tools

  • Tools, in the context of our use case, are functions created for agents to invoke for specific actions, adding a layer of consistency to an agent’s behavior.

AI Agents

  • Agents are created with a specific background.
  • Agents are assigned tools to fulfill their goals.
  • Agents use selected LLM models that best fit their objectives.

To ensure that agents provide consistent and trustworthy results with high probability, the factors above need to be engineered in a way that minimizes hallucinations and promotes user trust.

Building the system requires that we address challenges that prevent users from adopting AI for their data use cases.

Some challenges/questions we need to tackle:

  • How can I trust that the system is providing the right numbers?
  • What can be done to ensure that data results are trustworthy?
  • Did the system use the correct data source?
  • Can we make transparent what sources are used and their content?
  • Are the results clear to the user?

If we dive further into the challenges, the list grows longer. For the scope of our discussion here, we will limit it to the list above.

Let’s discuss solution strategies for each question above and lay down the grounds for the final implementation.

Providing the right numbers means that users are confident that whatever the AI executed is correct (always dependent, of course, on the user’s question or prompt).

As a data user, it would help me if:

  • AI is transparent about what exactly is calculated.
  • AI is well-trained to know whether it can answer the question in the first place.
  • AI understands the context and data being requested and has all the right information to deliver the right results.
  • AI understands my data source and is well-skilled in writing queries for data retrieval.

Translating this into engineering requirements (an illustrative example follows the list):

  • AI is required to explain to users its calculation logic and which tools and information were used.
  • AI should state whether it has the tools and information to answer the question in the first place.
  • AI is equipped with tools that limit hallucinations (here we are talking about database functions, so the AI only has to invoke them with the right parameters).
  • AI is capable of executing code and functions in the database and retrieving data (taking into consideration the size of the data returned to the user’s frontend).
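
Purely as an illustration of what this transparency can look like, an agent’s final answer could carry its results together with the approach it took. The payload below is hypothetical; the concrete JSON format is defined in the hands-on section.

# Hypothetical example of a transparent answer payload
example_answer = {
    "results": [{"title": "Example Book", "publication_date": "1997-05-01"}],
    "approach": "FUNCTION",  # FUNCTION, QUERY, or NONE
    "function_used": "get_books_by_year",
    "query": "SELECT * FROM get_books_by_year(1997)",
}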

Building such an application adds a level or two of complexity, yet it is important to prove its value. We will go through the implementation step by step in the next section.

You can get the final code here for our hands-on implementation.


4 – Hands-On: Building Data Chat Room with Gradio, Postgres, and Langchain

In this section, we build our end-to-end user-facing application. The solution’s main objective is to provide users with seamless interaction, with a focus on simplicity and trust.

Application Workflow – Image by Author

The application landscape consists of:

  • A simple Gradio UI for a user to interact with, configure, and test their application
  • A Postgres Database to store information users wish to analyze
  • A backend that handles user requests and interacts with AI agents, providing them with the necessary tools to fulfill the goal.

We will start by listing the components and laying out how they are engineered together to build a system focused on trust.

First, make sure that you have Python 3.9+ running and install all required dependencies in the [requirements.txt](https://github.com/Husseinjd/data-chat-rooms/blob/main/requirements.txt) file.

gradio
sqlalchemy
langchain
langchain-community
psycopg2
langchain-openai
python-dotenv
langchain-ollama

pip install -r requirements.txt
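
The agent code shown later imports OPENAI_API_KEY from app.config, and python-dotenv is included in the requirements, so a minimal configuration module would look roughly like the sketch below (an assumption about the repo’s layout; check the repo’s config.py for the exact variable names):

import os
from dotenv import load_dotenv

# Load variables from a local .env file (e.g. OPENAI_API_KEY=sk-...)
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
# Assumed connection string, matching the Postgres service defined further below
DATABASE_URL = os.getenv(
    "DATABASE_URL",
    "postgresql+psycopg2://admin:adminpassword@localhost:5432/bookstore",
)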

The Database

For our use case, we construct a simple bookstore Postgres database. You can find the [init.sql](https://github.com/Husseinjd/data-chat-rooms/blob/main/app/database/init.sql) script containing all the needed tables and data. We will use this setup throughout this section. This could just as easily be any other type of database and schema.

We must ensure that all metadata relating to the tables and columns is in place.

Example enrichment of metadata for the books table:

-- Books table
CREATE TABLE books (
    book_id SERIAL PRIMARY KEY,
    title VARCHAR(200) NOT NULL,
    author_id INTEGER REFERENCES authors(author_id),
    isbn VARCHAR(13) UNIQUE,
    price DECIMAL(10, 2),
    publication_date DATE
);

COMMENT ON COLUMN books.book_id IS 'Unique identifier for each book';
COMMENT ON COLUMN books.title IS 'Title of the book';
COMMENT ON COLUMN books.author_id IS 'Foreign key referencing the author of the book';
COMMENT ON COLUMN books.isbn IS 'International Standard Book Number (13 digits)';
COMMENT ON COLUMN books.price IS 'Price of the book in decimal format';
COMMENT ON COLUMN books.publication_date IS 'Date when the book was published';

This would be the first step towards providing more context to AI agents about the data.

To set up our database, we will use a Postgres Docker image and run it locally for testing.

Ensure that Docker is installed and that the following service is defined in the docker-compose.yaml file.

services:
  db:
    image: postgres:13
    container_name: bookstore_db
    environment:
      POSTGRES_DB: bookstore
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: adminpassword
    ports:
      - "5432:5432"
    volumes:
      - ./app/database/init.sql:/docker-entrypoint-initdb.d/init.sql
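
Once the container is up (docker-compose up -d), you can verify both the connection and the column comments from Python. The snippet below is a verification sketch, not part of the repo; the credentials match the docker-compose service above.

from sqlalchemy import create_engine, text

# Connection string matching the docker-compose service above
engine = create_engine(
    "postgresql+psycopg2://admin:adminpassword@localhost:5432/bookstore"
)

# Read back the column comments added in init.sql for the books table
query = text("""
    SELECT column_name,
           col_description('books'::regclass, ordinal_position::int) AS column_comment
    FROM information_schema.columns
    WHERE table_name = 'books'
    ORDER BY ordinal_position
""")

with engine.connect() as conn:
    for column_name, comment in conn.execute(query):
        print(f"{column_name}: {comment}")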

The Agents

Configuration of a trusted AI Agent begins with selecting the appropriate model, engineering a fitting prompt, and finally choosing the right tools.

LLM

To demonstrate our use case, we will be working with multiple LLMs to run our tests:

  • OpenAI gpt-4o-mini
  • Llama 3.1 8B

In the project, the AgentSetup class in [agent.py](https://github.com/Husseinjd/data-chat-rooms/blob/main/app/agent/agent.py) handles setting up our agents with the right LLM model.

from langchain_community.utilities import SQLDatabase
from langchain_community.agent_toolkits.sql.toolkit import SQLDatabaseToolkit
from langchain import hub
from langchain_core.prompts.prompt import PromptTemplate
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from app.config import OPENAI_API_KEY

class AgentSetup:
    def __init__(self, engine, llm = None, tools=[]):
        self.engine = engine
        self.llm = llm or ChatOpenAI(
            model="gpt-4o-mini", temperature=0, api_key=OPENAI_API_KEY
        )
        self.tools = tools
        self.agent_executor = None
        self.toolkit = None

    def setup(self):
        db = SQLDatabase(self.engine)
        self.toolkit = SQLDatabaseToolkit(db=db, llm=self.llm)

        prompt_template = hub.pull("langchain-ai/sql-agent-system-prompt")
        # prompt redacted here for better readability
        # you can find the full version in the repo.
        function_prompt_addition = """"""

        prompt_template = PromptTemplate(
            input_variables=["dialect", "top_k"], template=function_prompt_addition
        )

        tools =  self.toolkit.get_tools() + self.tools

        self.agent_executor = create_react_agent(
            self.llm,
            tools,
            state_modifier=prompt_template.format(dialect="POSTGRESQL", top_k=5),
        )

    def get_agent(self):
        if not self.agent_executor:
            raise ValueError("Agent not set up. Call setup() first.")
        return self.agent_executor

    def update_llm(self, new_llm):
        self.llm = new_llm
        if self.toolkit:
            self.toolkit.llm = new_llm
        self.setup()  # Re-setup the agent with the new LLM
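
As a rough usage sketch (assuming the full prompt from the repo and an existing SQLAlchemy engine), the resulting executor can be driven like any LangGraph prebuilt ReAct agent:

from sqlalchemy import create_engine

# Hypothetical wiring; the repo's UI module handles this slightly differently
engine = create_engine(
    "postgresql+psycopg2://admin:adminpassword@localhost:5432/bookstore"
)

agent_setup = AgentSetup(engine)
agent_setup.setup()
agent = agent_setup.get_agent()

# The prebuilt ReAct agent takes a list of chat messages as input
result = agent.invoke({"messages": [("user", "get me books for the year 1997")]})
print(result["messages"][-1].content)  # the agent's final (JSON) answer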

To set up Langchain with the OpenAI LLM, we add the following:

 from langchain_openai import ChatOpenAI
 llm = ChatOpenAI(model="gpt-4o-mini", temperature=0, api_key=OPENAI_API_KEY)

To use Llama 3.1, we download and run the Ollama Docker image:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Then select and run the model:

docker exec -it ollama ollama run llama3.1

Connect to Ollama using the following snippet:

from langchain_ollama import ChatOllama

new_llm = ChatOllama(model="llama3.1",temperature=0)

The code snippets are combined in the final UI module.
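
For reference, switching an already configured agent from gpt-4o-mini to the local Llama 3.1 model then only requires a call to update_llm (a small sketch, assuming the AgentSetup instance from above):

from langchain_ollama import ChatOllama

# Swap the agent's LLM for the locally served Llama 3.1 model
agent_setup.update_llm(ChatOllama(model="llama3.1", temperature=0))
agent = agent_setup.get_agent()  # executor rebuilt with the new model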

The Tools

Database Tools – Image by Author

Establishing the right tools for agents helps in building the trust factor toward our final result.

Basic SQL toolkits are needed to ensure that agents are capable of communicating with databases and retrieving all metadata and query results.

We will use Langchain’s SQLDatabaseToolkit. The tools available within this toolkit, together with their descriptions, include the following (a snippet to inspect them yourself follows the list):

  • QuerySQLDataBaseTool: Input to this tool is a detailed and correct SQL query; the output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check it, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.
  • InfoSQLDatabaseTool: Input to this tool is a comma-separated list of tables; the output is the schema and sample rows for those tables. Be sure that the tables exist by calling sql_db_list_tables first! Example input: table1, table2, table3
  • ListSQLDatabaseTool: Tool for getting the table names for a specified database.
  • QuerySQLCheckerTool: Checks whether an SQL query has correct syntax.
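
You can inspect these tools yourself via the toolkit’s get_tools() method; a quick sketch, assuming the SQLAlchemy engine and an llm instance configured as above:

from langchain_community.utilities import SQLDatabase
from langchain_community.agent_toolkits.sql.toolkit import SQLDatabaseToolkit

# Print the name and (truncated) description of each built-in SQL tool
toolkit = SQLDatabaseToolkit(db=SQLDatabase(engine), llm=llm)
for tool in toolkit.get_tools():
    print(tool.name, "->", tool.description[:80])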

While the toolkit provides all the functionality the agent needs to interact with the database, this does not guarantee that the AI will write flawless queries. It remains susceptible to hallucinations when relying on its own SQL knowledge and syntax to formulate queries.

Which tools would be key factors in increasing trust in the agent’s results?

Developers and analysts with domain knowledge can list the data most frequently requested by users and develop custom parameterized database functions for the AI to use as tools.

The functions will have clear metadata describing their functionalities and usage.

AI agents can call these functions and, where needed, fill in the required parameters based on the user’s request. This reduces full dependency on the AI to build complete queries and introduces an additional layer of consistency to the outputs.

Here is an example of a function that can be used with our bookstore database, together with the required function description.

CREATE OR REPLACE FUNCTION public.get_books_by_year(year_param integer)
 RETURNS TABLE(title character varying, author_name character varying, publication_date date)
 LANGUAGE plpgsql
AS $function$
BEGIN
    RETURN QUERY
    SELECT b.title, a.name AS author_name, b.publication_date
    FROM books b
    JOIN authors a ON b.author_id = a.author_id
    WHERE EXTRACT(YEAR FROM b.publication_date) = year_param
    ORDER BY b.publication_date;
END;
$function$;

COMMENT ON FUNCTION public.get_books_by_year(integer) IS 'This function returns all books published in a specific year.';
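
You can test the function directly before handing it to the agent. Executing it is an ordinary SELECT, which is exactly how the agent is later instructed to call it (a quick check, assuming the engine from earlier):

from sqlalchemy import text

# Call the parameterized function the same way the agent would
with engine.connect() as conn:
    rows = conn.execute(text("SELECT * FROM get_books_by_year(:year)"), {"year": 1997})
    for title, author_name, publication_date in rows:
        print(title, author_name, publication_date)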

The application contains the code for introducing functions as tools:

from sqlalchemy import text

class DatabaseFunctions:
    def __init__(self, engine):
        self.engine = engine
    def get_all_functions(self):
        query = text("""
            SELECT 
                p.proname AS routine_name,
                pg_catalog.pg_get_functiondef(p.oid) AS routine_definition,
                d.description AS routine_comment
            FROM 
                pg_catalog.pg_proc p
            LEFT JOIN 
                pg_catalog.pg_namespace n ON n.oid = p.pronamespace
            LEFT JOIN 
                pg_catalog.pg_description d ON d.objoid = p.oid
            WHERE 
                n.nspname = 'public'
            ORDER BY 
                p.proname
        """)
        with self.engine.connect() as connection:
            result = connection.execute(query)
            return [
                {
                    "function_name": row[0],
                    "function_code": row[1],
                    "description": row[2],
                }
                for row in result
            ]
    def add_function(self, name, code, description):
        with self.engine.connect() as connection:
            connection.execute(text(code))
            comment_sql = f"COMMENT ON FUNCTION {name} IS '{description}';"
            connection.execute(text(comment_sql))
            connection.commit()
    def delete_function(self, name):
        with self.engine.connect() as connection:
            drop_sql = f"DROP FUNCTION IF EXISTS {name};"
            connection.execute(text(drop_sql))
            connection.commit()
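
How these functions are ultimately exposed to the agent is handled in the repo; one possible pattern (a sketch, not necessarily the repo’s exact approach) is to wrap each function’s name and comment into a LangChain Tool, so the agent can reason over the descriptions when deciding which one to call:

from langchain_core.tools import Tool
from sqlalchemy import text

def make_function_tools(engine):
    """Turn the database functions (and their comments) into LangChain tools."""
    tools = []
    for fn in DatabaseFunctions(engine).get_all_functions():
        def run_fn(params: str, name=fn["function_name"]):
            # The agent passes the parameter list as a string, e.g. "1997".
            # A real implementation should validate/whitelist these parameters.
            with engine.connect() as conn:
                result = conn.execute(text(f"SELECT * FROM {name}({params})"))
                return [dict(row._mapping) for row in result]
        tools.append(
            Tool(
                name=fn["function_name"],
                func=run_fn,
                description=fn["description"] or fn["function_code"],
            )
        )
    return tools

These tools can then be passed to AgentSetup through its tools argument, alongside the SQL toolkit.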

Prompt

The prompt plays an important role in guiding an LLM’s behavior. A well-engineered prompt enables the LLM to comprehend the task and process requests as instructed. The prompt must ensure the following:

  • Use of a specific SQL database dialect
  • Number of results to return
  • Tools to utilize
  • Desired output format
  • Construction and execution of queries

Translating the above requirements into the prompt:

"""
 You are an agent designed to interact with a SQL database.

Given an input question, check whether the question can be answered by 
a function already available or create a syntactically correct {dialect} 
query to run, then look at the results of the query and return the answer.

Unless the user specifies a specific number of examples they wish to obtain, 
always limit your query to at most {top_k} results.

You can order the results by a relevant column to return the most 
interesting examples in the database.

Never query for all the columns from a specific table, 
only ask for the relevant columns given the question.

You have access to tools for interacting with the database. 

Only use the below tools. 

Only use the information returned by the below tools to construct your 
final answer.

You MUST double check your query before executing it. 

If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

To start you should ALWAYS look at functions in the database and then tables to see what you can use to query.
Do NOT skip this step. 

Then you should construct the right query whether you chose a function 
to run or build your own query.

Those functions should always take precedence to use and execute over 
building your own query and running. 

Decide correctly whether to choose the function or build your own query. 

If you choose a function make sure to build the sql query to execute 
the function correctly and get the results.

In order to execute a function run an sql like SELECT <columns> 
from <function_name(param)>

At the end of the result provide information. 
The result should be structured in json format.

It should include all the columns and results, 
in addition to the approach used with three potential values (FUNCTION or QUERY or None) 
(FUNCTION if function is used, QUERY if own query is used). 

The approach result should be also included in the json, 
in addition to the function name used.

Your last answer should only be a valid json and nothing else. in the following format: 

    {{
      "results": [
      {{
        "col1": val1,
        "col2": val2,
        "col3": val3,
        ...
      }}
    ],
    "approach": "FUNCTION or QUERY or NONE",
    "function_used": "<function name>",
    "query": <final using function or not used to retrieve the final results>
    }}
    """

Additionally, as a best practice, prompt instructions like "DO NOT make any DML statements" should not be the only safeguard; they can be backed by restricted write permissions on the database.
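
For example, the agent’s connection can use a dedicated read-only role so that even a hallucinated DML statement cannot modify data. The role name and grants below are illustrative; adapt them to your schema:

from sqlalchemy import text

# Run once as the admin user: create a read-only role for the agent's connection
read_only_sql = """
    CREATE ROLE agent_readonly LOGIN PASSWORD 'change-me';
    GRANT CONNECT ON DATABASE bookstore TO agent_readonly;
    GRANT USAGE ON SCHEMA public TO agent_readonly;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO agent_readonly;
    GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO agent_readonly;
"""

with engine.connect() as conn:
    for statement in read_only_sql.split(";"):
        if statement.strip():
            conn.execute(text(statement))
    conn.commit()

The application would then connect with this role instead of the admin credentials.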

Gradio UI

The UI is kept simple, with four tabs containing configurations for the database connection, creating functions, and adding few-shot examples, along with a Gradio chat interface.

UI Interface – Image by Author

After successfully connecting to the database, all available functions are loaded as tools into our application, ready for users’ queries. A few example functions are added to test the application:

  • get_books_by_year(int)
  • get_customer_order_history(int)

Putting everything together, we now have a fully functional AI agent capable of using its tools to interact with the database, read its metadata, and build queries, with function-execution capabilities.

We are all set now to test our application.

Test Run

We will run multiple prompts with both LLM models, each expected to trigger a different approach in our application.

To run the application, first make sure the database is up:

docker-compose up

Then launch the Gradio app:

python -m app.main

Prompt 1: Expected function utilization

In this prompt, we expect our agents to recognize that they can use a predefined function instead of building their own query.

Query: get me books for the year 1997

gpt-4o-mini response:

Response 1 gpt-4o-mini – Image generated by Author

llama3.1 8b response:

Using Llama 3.1, it took me a couple of prompts, and it still resorted to generating its own queries. This can be mitigated by using the 70B version or by instructing it more directly to use a specific function, as shown below.

Response 1 llama3.1 – Image generated by Author

Output Analysis:

The response to the user’s input includes additional components to ensure complete transparency about the trust level of the results. The AI’s answers are wrapped with a Trusted label and a view of the original SQL code and function used.

The Trusted label is applied when an AI agent uses one of the predefined functions to retrieve the final answer. This gives users an additional layer of confidence in the result.
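
Conceptually, the label can be derived directly from the structured JSON the prompt enforces. A simplified sketch of the idea (the repo’s UI code may differ):

import json

def label_response(agent_answer: str) -> str:
    """Derive a trust label from the agent's structured JSON answer."""
    payload = json.loads(agent_answer)
    if payload.get("approach") == "FUNCTION":
        return f"Trusted (function: {payload.get('function_used')})"
    return "Unverified: AI-generated query, review the SQL before relying on it"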

Prompt 2: Expected construction of query based on database metadata

Query: What is the total number of orders?

gpt-4o-mini response:

Response 2 gpt-4o-mini – Image generated by Author

llama3.1 8b response:

Response 2 llama3.1 – Image generated by Author

Output Analysis

When it comes to building SQL to answer users’ queries, both models managed to respond correctly. As data models become more and more complex, specialized SQL-fine-tuned models or larger general-purpose models will be needed to maintain response quality.

The full application is available here to test more examples with your data.


Summary

Building additional consistency layers into Text-to-SQL applications helps reduce the risk of losing users’ trust. We demonstrated how incorporating functions as tools, in addition to few-shot examples, reduces complete reliance on the AI to construct its own query logic, which would otherwise increase the likelihood of errors. Additionally, we introduced visual cues with Trusted labels and supplementary information, putting users in the driver’s seat when assessing their confidence in the final result.

Data-driven organizations rely on users making use of their data assets for decision-making. The seamless, simple interactions conversational AI offers, coupled with a trust factor, encourage increased data engagement. The application’s access to vast amounts of company data from within a simple chat interface brings organizations a step closer to achieving their data-driven goals.


Enjoyed the Story?

  • Subscribe to get notified when I publish new stories.
  • Feel free to reach out to me on LinkedIn.

References

Trust in artificial intelligence – KPMG Global

McKinsey Global Surveys | McKinsey & Company

Insights to impact: Creating and sustaining data-driven commercial growth | McKinsey

[Companies Are Failing in Their Efforts to Become Data-Driven (hbr.org)](https://hbr.org/2019/02/companies-are-failing-in-their-efforts-to-become-data-driven)

Why Is It So Hard to Become a Data-Driven Company? (hbr.org)

LangChain

Ollama

