Remember the days when building a smart chatbot took months of coding?
Frameworks like LangChain have definitely streamlined development, but hundreds of lines of code can still be a hurdle for those who aren’t programmers.
Is there a simpler way?

That’s when I discovered LangFlow, an open-source package that builds upon the Python version of LangChain. It lets you create an AI application without writing a single line of code: it provides a canvas where you simply drag components around and link them up to build your chatbot.
In this post, we’ll use LangFlow to build a smart AI chatbot prototype in minutes. For the backend, we’ll use Ollama for both the embedding model and the Large Language Model (LLM), meaning the application runs locally and free of charge! Finally, we’ll convert this flow into a Streamlit application with minimal coding.
Introduction to the Retrieval-Augmented Generation Pipeline, LangChain, LangFlow, and Ollama
In this project, we’re going to build an AI chatbot named "Dinnerly – Your Healthy Dish Planner." It aims to recommend healthy dish recipes, pulled from a recipe PDF file with the help of Retrieval-Augmented Generation (RAG).
Before diving into how we’re going to make it happen, let’s quickly go over the main ingredients we’ll be using in our project.
Retrieval-Augmented Generation (RAG)
RAG (Retrieval-Augmented Generation) helps Large Language Models (LLMs) by feeding them relevant information from external sources. This allows LLMs to consider this context when generating responses, making them more accurate and up-to-date.
The RAG pipeline typically includes the following steps, as described in A Guide to Retrieval Augmented Generation:
"
- Load Document: Begin by loading the document or data source.
- Split into Chunks: Break the document into manageable parts.
- Create Embeddings: Convert these chunks into vector representations using embeddings.
- Store in Vector Database: Save these vectors in the database for efficient retrieval.
- User Interaction: Receive queries or input from the user and convert them into embeddings.
- Semantic Search in VectorDB: Connect to the vector database to perform a semantic search based on the user’s query.
- Retrieve and Process Responses: Fetch relevant responses, pass them through an LLM, and generate an answer.
- Deliver Answer to User: Present the final output generated by the LLM back to the user.
"

LangChain
An open-source framework built around LLMs, LangChain facilitates the design and development of various GenAI applications, including chatbots, summarization, and much more.
The core idea of the library is to "chain" together different components to simplify complex AI tasks and create more advanced use cases around LLMs.
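As a toy illustration of this chaining idea, the snippet below pipes a prompt template into a chat model using LangChain’s expression syntax (the model name and prompt here are just examples):

from langchain_core.prompts import ChatPromptTemplate
from langchain_community.chat_models import ChatOllama

# Two components "chained" together: a prompt template piped into an LLM
prompt = ChatPromptTemplate.from_template("Suggest a healthy recipe using {ingredient}.")
llm = ChatOllama(model="llama2")
chain = prompt | llm
print(chain.invoke({"ingredient": "broccoli"}).content)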

LangFlow
LangFlow is a web tool designed specifically for LangChain. It offers a user interface where users can simply drag and drop components to build and test LangChain applications without any coding.
However, you first need to have a basic understanding of how LangChain works and its different components in order to use LangFlow to design your AI application flow.

Ollama
Ollama is, for me, the best and easiest way to get up and running with open-source LLMs. It supports the most capable models, such as Llama 2 and Mistral, among others, and you can find the full list of available models on ollama.ai/library.

Setting Up Ollama
Installing Ollama
First, go to the Ollama download page, pick the version that matches your operating system, then download and install it.
With Ollama installed, open your command terminal and enter the following commands. These commands will download the models and run them locally on your machine.
For this project, we’ll be using Llama 2 as our Large Language Model (LLM) and "nomic-embed-text" as the embedding model, a powerful open-source embedding model with a large context window. This lets us run the entire application locally, without needing any cloud services!
# Start the Ollama server (the desktop app may already run it in the background)
ollama serve
# Download the two models used in this project
ollama pull llama2
ollama pull nomic-embed-text
# Optional: chat with Llama 2 in the terminal to verify it works
ollama run llama2
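To confirm that both models are available locally, you can list what Ollama has downloaded:
ollama list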
Setting Up LangFlow
Prerequisite
Before we get started with LangFlow, it’s important to check that Python is installed on your computer. Your Python version should be newer than 3.9 but older than 3.12.
Installing LangFlow
Next, let’s move on to installing LangFlow. I recommend doing this within a virtual environment. This approach helps manage dependencies neatly within its own space. On my Mac, I use Conda to set things up. Just enter the following commands in a command-line terminal to create a virtual environment named "langflow" with Python 3.11.
conda create -n langflow python=3.11
conda activate langflow
pip install langflow
If you don’t have Conda, you can also set up a virtual environment directly with Python, with the following commands.
python -m venv langflow
source langflow/bin/activate
pip install langflow
After wrapping up the installation, starting LangFlow is as simple as entering "langflow run" into your terminal.

Then, take the URL it gives you (by default, http://127.0.0.1:7860), paste it into your web browser, and voilà! You should see an interface that looks something like this. This page displays all your projects.

Designing Your Chatbot’s Flow
It’s time to craft your first flow!
Start by clicking "New project," which opens up a blank canvas for you. On the left pane, you’ll notice a variety of components ready to be dragged and dropped into your workspace.

For our project, we’re building a chatbot capable of answering questions from a PDF file. Remember the RAG pipeline we talked about earlier? We’ll need certain elements to piece this together:
- PDF Loader: We’ll use "PyPDFLoader" here. You’ll need to input the file path of your PDF document.
- Text Splitter: "RecursiveCharacterTextSplitter" will be our choice, and the default settings will work fine.
- Text Embedding Model: Opt for "OllamaEmbeddings" to utilize the free, open-source embedding model.
- Vector Database: We’re going with "FAISS" to store the embeddings and facilitate vector searches.
- LLM for Generating Responses: Select "ChatOllama" and specify the model as "llama2".
- Conversation Memory: This allows our chatbot to retain chat history, aiding in follow-up questions. We’ll use "ConversationBufferMemory".
- Conversational Retrieval Chain: This connects the other components, such as the LLM, memory, and retrieved texts, to generate responses. "ConversationalRetrievalChain" is our pick.
Drag and drop all these components onto the canvas and set the required fields like the PDF file path and LLM model name. It’s okay to leave other settings at their default.
Next, connect these components to form your flow.
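Before compiling, it may help to see what this flow amounts to in plain LangChain code. Below is a rough sketch, reusing the vectordb from the earlier RAG snippet; the class names come from the langchain and langchain-community packages and may vary by version:

from langchain_community.chat_models import ChatOllama
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain

# The "ChatOllama" component: the LLM that generates responses
llm = ChatOllama(model="llama2")
# The "ConversationBufferMemory" component: retains chat history for follow-ups
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
# The "ConversationalRetrievalChain" component: ties LLM, memory, and retriever together
chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=vectordb.as_retriever(), memory=memory)
result = chain.invoke({"question": "Suggest a healthy chicken dinner."})
print(result["answer"])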

Once everything’s connected, hit the "lightning" button in the bottom right to compile the flow. If all goes well, the button should turn green, indicating success.
After successfully compiling the flow, click the "chatbot" icon to test out your creation.

Several Tips:
- Once your flow is complete, you can save it as a JSON file or find it under "My Collection" for future access or edits.
- Diving into LangFlow with pre-built examples can offer great inspiration and help you get started. Here’s how:
- "LangFlow Store" houses examples, but you’ll need an API key for access.
- The LangFlow GitHub page allows you to download examples, which you can then upload to your LangFlow UI using the "upload" button.
- If setting up locally isn’t for you, you can also opt for OpenAI’s models to build your RAG pipeline. Just make sure you have your OpenAI API key for the setup.
Turning the Flow into a Streamlit Chatbot
Now that the flow is set up just right, it’s time to integrate it into your application. After building your flow, LangFlow makes this easy by offering the necessary code snippet: just hit the "Code" button in the sidebar.

Let’s move forward with integrating this flow into a Streamlit chatbot.
1. Setting Up the Dependencies: To begin, we’ll have to install the dependencies.
pip install streamlit
pip install langflow
pip install langchain-community
2. Fetching the LangFlow Code Snippet: Create a new Python file, "app.py". Head back to the LangFlow UI and find that "Code" button again. Navigate to the "Python API" tab, copy the code snippet, and paste it into "app.py".
import requests
import streamlit as st  # used by the chat interface we build below
from typing import Optional

BASE_API_URL = "http://127.0.0.1:7860/api/v1/process"
FLOW_ID = "d9392262-a912-42b4-8582-cc9e48894a00"
# You can tweak the flow by adding a tweaks dictionary
# e.g. {"OpenAI-XXXXX": {"model_name": "gpt-4"}}
TWEAKS = {
    "VectorStoreAgent-brRPx": {},
    "VectorStoreInfo-BS24v": {},
    "OpenAIEmbeddings-lnfRZ": {},
    "RecursiveCharacterTextSplitter-bErPe": {},
    "WebBaseLoader-HLOqm": {},
    "ChatOpenAI-aQOv0": {},
    "FAISS-o0WIf": {}
}

def run_flow(inputs: dict, flow_id: str, tweaks: Optional[dict] = None) -> dict:
    """
    Run a flow with given inputs and optional tweaks.

    :param inputs: The inputs to send to the flow
    :param flow_id: The ID of the flow to run
    :param tweaks: Optional tweaks to customize the flow
    :return: The JSON response from the flow
    """
    api_url = f"{BASE_API_URL}/{flow_id}"
    payload = {"inputs": inputs}
    headers = None
    if tweaks:
        payload["tweaks"] = tweaks
    response = requests.post(api_url, json=payload, headers=headers)
    return response.json()
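Before building the UI, you can sanity-check the endpoint with a one-off call (assuming the LangFlow server is still running and FLOW_ID matches your own flow):

print(run_flow({"input": "Hi"}, flow_id=FLOW_ID, tweaks=TWEAKS))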
3. Building a Chat Function: In the same Python file, we’ll define a function dedicated to chatting. This function runs the flow to fetch a response for each new query from the user, then streams the response to the interface.
def chat(prompt: str):
    with current_chat_message:
        # Block input to prevent sending messages while the AI is responding
        st.session_state.disabled = True

        # Add user message to chat history
        st.session_state.messages.append(("human", prompt))

        # Display user message in chat message container
        with st.chat_message("human"):
            st.markdown(prompt)

        # Display assistant response in chat message container
        with st.chat_message("ai"):
            # Get complete chat history, including latest question as last message
            history = "\n".join(
                [f"{role}: {msg}" for role, msg in st.session_state.messages]
            )
            query = f"{history}\nAI:"
            # Set up any tweaks you want to apply to the flow
            inputs = {"input": query}
            output = run_flow(inputs, flow_id=FLOW_ID, tweaks=TWEAKS)
            print(output)
            try:
                output = output['result']['output']
            except Exception:
                output = f"Application error: {output}"
            placeholder = st.empty()
            response = ""
            # The API returns the full answer at once, so we write it out
            # character by character to simulate streaming
            for tokens in output:
                response += tokens
                # Write response with "▌" to indicate streaming
                with placeholder:
                    st.markdown(response + "▌")
            # Write response without "▌" to indicate the message is complete
            with placeholder:
                st.markdown(response)

        # Log AI response to chat history
        st.session_state.messages.append(("ai", response))
        # Unblock chat input
        st.session_state.disabled = False
        st.rerun()
4. Crafting the Interface: Now we’ll build a simple Streamlit user interface with the following code in the same Python file.
st.set_page_config(page_title="Dinnerly")
st.title("Welcome to Dinnerly: Your Healthy Dish Planner")

system_prompt = "You're a helpful assistant who suggests and provides healthy dish recipes to users"
if "messages" not in st.session_state:
    st.session_state.messages = [("system", system_prompt)]
if "disabled" not in st.session_state:
    # `disabled` flag to prevent the user from sending messages while the AI is responding
    st.session_state.disabled = False

with st.chat_message("ai"):
    st.markdown(
        "Hi! I'm your healthy dish planner. Happy to help you prepare healthy and yummy dishes!"
    )

# Display chat messages from history on app rerun
for role, message in st.session_state.messages:
    if role == "system":
        continue
    with st.chat_message(role):
        st.markdown(message)

current_chat_message = st.container()
prompt = st.chat_input("Ask your question here...", disabled=st.session_state.disabled)
if prompt:
    chat(prompt)
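With everything in place, launch the app from the same environment (assuming the file is saved as "app.py" and the LangFlow server from earlier is still running):
streamlit run app.py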
Once you run the Streamlit app, you’ll be able to chat with your very own dish planner! It will help you create delicious and healthy meals.

Tips:
You can use the same code and interface for different flows. Just change the FLOW_ID to test and integrate a new flow into your application.

Closing Thoughts
In this post, we’ve crafted a smart, RAG-based chatbot. We used LangFlow to build the RAG pipeline without writing code, leveraged open-source models for embedding and LLM processing to keep the application running locally and free of inference costs, and finally transformed the setup into a Streamlit application.
I particularly appreciated LangFlow’s no-code approach, and I believe it could be a game changer in how we build and prototype AI applications.
However, it’s worth mentioning that some components are still under development and may not always work as expected. When that happens, there is little visibility into the issue or guidance on troubleshooting. Another improvement would be to expose the underlying Python code directly, to offer greater customization.
Overall, I find LangFlow a valuable tool for quick prototyping needs.
References
- LangFlow documentation
- Ollama documentation
- Deliciously Healthy Dinners (PDF file used in the demo): https://healthyeating.nhlbi.nih.gov/pdfs/dinners_cookbook_508-compliant.pdf