
Use LangChain’s Output Parser with ChatGPT for Structured Outputs

Explained with an example use case.

Photo by Dmitry Ratushny on Unsplash

ChatGPT and many other LLMs have led the way for creating LLM-based applications in different domains. These models are extremely powerful at processing text inputs and creating text outputs based on your queries. However, they’re not designed as a development framework.

LangChain is an open-source development framework for applications that use large language models (LLMs). It provides abstractions in the form of components to use LLMs in a more efficient or programmatic way.

These components are:

  • Models: ChatGPT or other LLMs
  • Prompts: Prompt templates and output parsers
  • Indexes: Ingest external data through components such as document loaders and vector stores
  • Chains: Combine components into end-to-end use cases. A simple chain can be Prompt + LLM + Output Parser
  • Agents: Let LLMs use external tools

The main idea behind LangChain is to chain multiple components together to extend the abilities of LLMs and create more functional tools and applications.
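As a preview, here is a minimal sketch of the Prompt + LLM chain mentioned above, using LangChain's LLMChain class (the prompt text is just an illustration, and an OPENAI_API_KEY is assumed to be set):

from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain

# each component is created separately, then chained together
llm = ChatOpenAI(temperature=0.0)
prompt = ChatPromptTemplate.from_template(
    "Summarize this review in one sentence: {review}"
)
chain = LLMChain(llm=llm, prompt=prompt)

print(chain.run(review="Arrived fast, easy setup, works great."))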


The developers of LangChain add new features at a rapid pace, and the framework is changing the way we interact with LLMs.

In this article, we will go through an example use case to demonstrate how using output parsers with prompt templates helps get more structured output from LLMs.

We’ll first work through the example using only a prompt template and an LLM. Then we’ll repeat the same example with an output parser added.


Prompt template + LLM

A prompt template and an LLM form the simplest chain you can create with LangChain.

Using a prompt template has many advantages over manually customizing prompts with f-strings. It allows prompts to be reused wherever applicable, and LangChain also provides ready-to-use templates for common tasks such as querying a database.
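For example, a single template can serve many inputs; a quick sketch (the translation template here is just an illustration):

from langchain.prompts import ChatPromptTemplate

# one reusable template instead of rebuilding an f-string each time
template = ChatPromptTemplate.from_template(
    "Translate the following text to {language}: {text}"
)

messages_tr = template.format_messages(language="Turkish", text="Good morning")
messages_de = template.format_messages(language="German", text="Good morning")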

We’ll use OpenAI’s ChatGPT as our LLM so we need to set up an API key.

import os
import openai

from dotenv import load_dotenv, find_dotenv

# read the API key from a local .env file
_ = load_dotenv(find_dotenv())
openai.api_key = os.environ['OPENAI_API_KEY']

For this code to work, you need to create an environment variable named OPENAI_API_KEY that holds the API key you obtained from the API keys page on the OpenAI website.
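For reference, the .env file is a plain text file in your project root; after loading it, a quick sanity check can confirm the key was picked up (the key value in the comment is a placeholder):

import os

# .env file contents (one line):
# OPENAI_API_KEY=sk-...

print("OPENAI_API_KEY set:", "OPENAI_API_KEY" in os.environ)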

Let’s start by creating a model. ChatOpenAI is LangChain’s abstraction for the ChatGPT API endpoint.

from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(temperature=0.0)

By default, LangChain creates the chat model with a temperature value of 0.7. The temperature parameter adjusts the randomness of the output: higher values like 0.7 make the output more random, while lower values like 0.2 make it more focused and deterministic. We set it to 0.0 here because information extraction calls for consistent, reproducible output.
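To see the effect, you can create two instances side by side; a quick sketch:

from langchain.chat_models import ChatOpenAI

deterministic_chat = ChatOpenAI(temperature=0.0)  # same prompt -> (nearly) identical answers
creative_chat = ChatOpenAI(temperature=0.7)       # same prompt -> more varied answers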

The next step is to create the prompt template. We’ll create a template for extracting information from product reviews.

review_template = """
For the following review, extract the following information:

recommended: Does the buyer recommend the product? 
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product 
to arrive? If this information is not found, output -1.

setup: Extract any sentences about the setup of the product.

Format the output as JSON with the following keys:
recommended
delivery_days
setup

review: {review}
"""

from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(review_template)

The code snippet above creates a prompt template from the given prompt string. The {review} placeholder is registered as an input variable, which we can check using the input_variables attribute:

prompt_template.input_variables

# output
['review']

We can now create an actual prompt using this template and a product review.

product_review = """
I got this product to plug my internet based phone for work from home (Avaya desktop phone). 
It works! It arrived in 5 days, which was earlier than the estimated delivery date.
The setup was EXTREMELY easy. At completion, I plugged the phone into the 
extender's ethernet port and made a few phone calls which all worked perfectly with 
complete clarity. VERY happy with this purchase since a cordless headset is 
around $250 (which I would have needed since the phone had to be at the ethernet 
port on the wall). I recommend this product!
"""

messages = prompt_template.format_messages(review=product_review)

messages is a Python list that contains the actual prompt. We can inspect the prompt with messages[0].content, which outputs the following:

For the following review, extract the following information:

recommended: Does the buyer recommend the product? Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.

setup: Extract any sentences about the setup of the product.

Format the output as JSON with the following keys:
recommended
delivery_days
setup

review: I got this product to plug my internet based phone for work from home (Avaya desktop phone). It works! It arrived in 5 days, which was earlier than the estimated delivery date. The setup was EXTREMELY easy. At completion, I plugged the phone into the extender’s ethernet port and made a few phone calls which all worked perfectly with complete clarity. VERY happy with this purchase since a cordless headset is around $250 (which I would have needed since the phone had to be at the ethernet port on the wall). I recommend this product!

We have the model and prompt ready. The next step is to query the model using the prompt:

# chat is the model and messages is the prompt
response = chat(messages)
print(response.content)

# output
{
    "recommended": true,
    "delivery_days": 5,
    "setup": "The setup was EXTREMELY easy."
}

Although the response looks like JSON, it is a string, which makes it difficult to work with.

type(response.content)
# output
str
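For instance, you cannot access fields on it the way you would on a dictionary:

# raises TypeError because response.content is a str, not a dict
response.content["recommended"]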

We’ll now learn how to use an output parser together with the prompt template to make it easier to parse the output.


Prompt template + LLM + Output Parser

The output parser is wired into the prompt by injecting its format_instructions into the prompt template. Let’s go over the process step by step.

The first step is to import the required modules and define a ResponseSchema for each piece of information to be extracted:

from langchain.output_parsers import ResponseSchema, StructuredOutputParser

recommendation_schema = ResponseSchema(
    name="recommended",
    description="Does the buyer recommend the product? "
                "Answer True if yes, False if not or unknown."
)

delivery_days_schema = ResponseSchema(
    name="delivery_days",
    description="How many days did it take for the product to arrive? "
                "If this information is not found, output -1."
)

setup_schema = ResponseSchema(
    name="setup",
    description="Extract any sentences about the setup of the product."
)

response_schemas = [
    recommendation_schema, 
    delivery_days_schema,
    setup_schema
]

The next step is to create the output parser and format instructions using these schemas:

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()
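It’s worth printing format_instructions to see what will be injected into the prompt. Roughly speaking (the exact wording varies by LangChain version), it asks the model to return a markdown-fenced JSON snippet containing the three keys along with their descriptions:

print(format_instructions)
# asks for a markdown code snippet of JSON with the keys
# "recommended", "delivery_days", and "setup"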

We’ll now create the prompt template as before, with one important change: the template must contain a {format_instructions} placeholder, which the original review_template does not have, so we append one. When formatting the messages, we pass in the format_instructions:

# the template needs a {format_instructions} placeholder
review_template_2 = review_template + "\n{format_instructions}"

prompt_template = ChatPromptTemplate.from_template(review_template_2)

messages = prompt_template.format_messages(
    review=product_review,
    format_instructions=format_instructions
)

Let’s use our new prompt to query the model.

response = chat(messages)
output_dict = output_parser.parse(response.content)

print(output_dict)

# output
{'recommended': 'True', 'delivery_days': '5', 'setup': 'The setup was EXTREMELY easy.'}

We used the parser’s parse method to parse the output. output_dict is a Python dictionary, which is much easier to work with than a string. We can extract a particular piece of information using the get method:

output_dict.get("delivery_days")

# output
'5'
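Note that the parsed values came back as strings here, so cast them before using them in computations; for example:

days = int(output_dict["delivery_days"])            # 5 as an integer
recommended = output_dict["recommended"] == "True"  # True as a boolean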

Final words

You may argue that we could use the built-in json module to parse the string into a dictionary, and that it’s a simple process using the loads method. You’re correct! It’s easier than creating an output parser and wiring it into a prompt template.

However, there are more complex cases where an output parser simplifies things in a way the built-in json module cannot. An output parser also provides additional benefits in longer chains that combine different types of modules.
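As a sketch of one such case: chat models often wrap their JSON in a markdown code fence, which breaks json.loads but is handled transparently by the parser (the raw string below is a made-up example of such a reply):

import json

# a typical raw model reply: valid JSON wrapped in a markdown fence
raw = '```json\n{"recommended": "True", "delivery_days": "5", "setup": "Easy."}\n```'

# json.loads(raw)  # raises json.JSONDecodeError on the fence
print(output_parser.parse(raw))  # the parser strips the fence and returns a dict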

You can become a Medium member to unlock full access to my writing, plus the rest of Medium. If you already are, don’t forget to subscribe if you’d like to get an email whenever I publish a new article.

Thank you for reading. Please let me know if you have any feedback.

