
Introduction to the problem
In 2021, I started working on the challenge of answering questions based on a large corpus of text. In the era before pre-trained transformers, this problem was a tough one to crack.
To make matters harder, I started my experiments with one of the most complex and intricate stories ever written: The Mahabharata. For those unfamiliar with the work, the Mahabharata is a collection of 18 books with a total of about 1.8 million words. It is the largest poem ever written, with about 90,000 verses. It is roughly ten times the length of the Iliad and the Odyssey combined. But it is not only the length but also the breadth of the Mahabharata that is staggering. Highly nonlinear and complex in its causes and effects, it has thousands of characters spanning seven generations, and not a single one of them is completely good or evil. It has profound philosophical commentaries on duty (Dharma), choices, and human existence, especially on the conflicts of duties and the choices between multiple wrongs. The Bhagavad Gita (a key philosophical text of Hinduism) is also part of the 6th book of the Mahabharata.
I compiled the Mahabharata text from multiple online sources into a clean dataset. However, I could not find a method to implement meaningful QA on it.
In less than two years, all that changed.
The rapid advancements in AI and large pre-trained transformers are changing the world of technology profoundly and fundamentally. And I am fascinated by them, much like most techies these days.
So, a few months ago, I returned to the problem with a naive knowledge of the newborn art of prompt engineering, but this time with the general idea of building an Autonomous Research Agent that could work with any complex knowledge base.
The Mahabharata is one of the most complex use cases. However, in every domain of knowledge (law, scientific research, education, medicine, etc.), every project starts with deep research on the prior art. So the problem is worth solving.
The Research Agent
Here, I will discuss the design and implementation of an Autonomous AI Research Agent that can solve the problem of multi-hop KBQA with deep reasoning capability. I will share the git repo with an initial implementation of the research agent in a Python notebook. If you are interested only in that part, please feel free to skip to the Implementation section later in this article.
If you are interested in knowing more about AI Agents, Knowledge-Based Question Answering (KBQA), the ‘Why’, the ‘What’, and the design evolution of the AI Research Agent, then please read along.
Why?
The first question one may ask is why not just use the ChatGPT interface and ask questions. It has been trained on a humongous volume of Internet data generated up to 2021, so a text corpus like the Mahabharata is known to it.
That was my first approach. I asked ChatGPT several questions about the Mahabharata. I got good answers to some of them. However, most lacked rigour. And that is expected. GPT is trained on general datasets. It can understand and interpret natural language very well. It can also reason well enough. However, it is not an expert in any specific domain. So, while it might have some knowledge of the Mahabharata, it may not respond with deeply researched answers. At times, GPT may not have an answer at all. In those cases, it either humbly refuses to answer the question or confidently makes one up (hallucination).
The second most obvious way to achieve KBQA is to use a Retrieval QA prompt. Here is where LangChain starts being extremely useful.
Retrieval QA
For those unfamiliar with the LangChain library, it is one of the best ways to use LLMs like GPT in your code. Here is an implementation of KBQA using LangChain.
To summarise, here are the steps to achieve KBQA on any body of documents:
- Split the knowledge base into text chunks.
- Create a numerical representation (embeddings) for each chunk and save them to a vector database. If your data is static, steps 1 and 2 are one-time efforts.
- Run a semantic search using the user’s query on this database and fetch the relevant text chunks.
- Send these text chunks to the LLM along with the user’s question and ask it to answer.
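Translated into code, this pipeline looks roughly like the following minimal sketch using the classic LangChain API. The file name, chunk sizes, and model name are illustrative assumptions, not the exact values from my notebook.

```python
# A minimal Retrieval QA pipeline with LangChain (illustrative sketch).
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# Steps 1 and 2: split the corpus and index the chunk embeddings (one-time).
raw_text = open("mahabharata.txt").read()  # assumed corpus file
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_text(raw_text)
vectorstore = FAISS.from_texts(chunks, OpenAIEmbeddings())

# Steps 3 and 4: retrieve relevant chunks and let the LLM answer from them.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    chain_type="stuff",  # stuff all retrieved chunks into a single prompt
    retriever=vectorstore.as_retriever(),
)
print(qa.run("Who was Arjuna?"))
```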
Here is a graphical representation of this process.

So why go any further? It seems like a solved problem!
Not quite 🙁
This approach works well for simple questions over a simple, factual knowledge base. However, it does not work for a more complex knowledge base and more complicated questions that require deeper, multi-hop reasoning. Multi-hop reasoning refers to a process in which multiple steps of logical or contextual inference are taken to arrive at a conclusion or answer to a question.
Moreover, the LLMs are limited in the length of text they can chew in one prompt. You can, of course, send the documents one at a time and then ‘refine’ or ‘reduce’ the answer with every call. However, this approach does not allow for complex ‘multi-hop’ reasoning. In some cases, the results using the ‘refine’ or ‘reduce’ approach are better than simply stuffing all the documents into a single prompt, but not by much.
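For reference, here is roughly how those alternative chain types plug into the same setup as the earlier sketch; ‘reduce’ corresponds, as far as I know, to LangChain’s ‘map_reduce’ chain type.

```python
# Same Retrieval QA as before, but feeding documents iteratively ('refine')
# or answering per document and then combining ('map_reduce').
qa_refine = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    chain_type="refine",  # or "map_reduce"
    retriever=vectorstore.as_retriever(),
)
```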
For a complex knowledge base, the users’ question by itself may not be enough to find all the relevant documents that can help the LLM arrive at an accurate answer.
For example:
Who was Arjuna?
This is a simple question and can be answered with limited context. However, the following question:
Why did the Mahabharata war happen?
is a question whose context is spread across the entire text corpus. The question itself carries limited information about that context. Finding the relevant chunks of text based on the question alone, and then reasoning over them, may not work.
So what next?
AI Agents
This is one of the coolest concepts that has emerged after the advent of AI. If you don’t know the concept of an AI Agent, I can’t wait to explain it to you, but I may still fail to convey its awesomeness. Let me use ChatGPT to explain it first.
An AI agent, also known simply as an "agent," refers to a software program or system that can autonomously perceive its environment, make decisions, and take actions to achieve specific goals. AI agents are designed to mimic human-like behaviour in problem-solving and decision-making tasks. They operate within a defined environment and interact with that environment to achieve desired outcomes.
Simply put, an Agent is a program that takes a problem, decides how to solve it, and then solves it. The Agent is provided with a set of tools (functions, methods, API calls, etc.). It can use any of them, in any sequence it deems fit. Contrast this with conventional software, where the sequence of steps needed to solve the problem is pre-programmed. This is, of course, a very vague definition, but you probably get the hang of it by now.
Here are the two different agents I tried for our KBQA use case.
ReAct Agent
This Agent uses a ‘ReAct’ (Reason and Action) style of reasoning to decide which tool to use for the given problem.
Here is the LangChain implementation of a ReAct Agent:
I provided the Agent with the following tools to choose from:
- Retrieval QA chain with a document store.
- The Character Glossary search (I created a glossary with Named Entity Recognition using a pre-trained model)
- Wikipedia search.
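For illustration, here is a minimal sketch of wiring those three tools into LangChain’s zero-shot ReAct agent. `qa` is the Retrieval QA chain from the earlier sketch, and the glossary lookup is a hypothetical stand-in for my NER-generated glossary, not the actual notebook code.

```python
# Sketch: a ReAct agent choosing between the three tools listed above.
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.utilities import WikipediaAPIWrapper  # needs `pip install wikipedia`

glossary: dict = {}  # name -> description, built with an NER pass (assumed)

def search_glossary(name: str) -> str:
    """Placeholder lookup in the NER-generated character glossary."""
    return glossary.get(name.strip(), "No glossary entry found.")

tools = [
    Tool(name="Retrieval QA", func=qa.run,
         description="Answers questions from the Mahabharata text corpus."),
    Tool(name="Character Glossary", func=search_glossary,
         description="Looks up a character in the glossary."),
    Tool(name="Wikipedia", func=WikipediaAPIWrapper().run,
         description="Searches Wikipedia for general context."),
]
agent = initialize_agent(
    tools, ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True,
)
agent.run("Why did the Mahabharata war happen?")
```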
The ReAct agent did not give me good results and failed to converge to an answer most of the time. It does not work well with GPT-3.5. It may work better with GPT-4, which is 20 to 30 times more expensive than GPT-3.5, so that may not be an option yet.
Even when it converged, I could not get good results. Someone more knowledgeable in crafting ReAct prompts would probably have done better.
Self-Ask Agent
This agent asks follow-up questions based on the original question and then tries to find intermediate answers. Using these intermediate answers, it finally arrives at a final answer. Here is an article explaining the Self-Ask Agent.
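LangChain ships a built-in self-ask agent type. Here is a minimal sketch of it, again reusing the `qa` chain from the earlier sketch as the single required tool (which this agent type requires to be named "Intermediate Answer").

```python
# Sketch: self-ask agent that answers follow-up questions via the QA chain.
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.chat_models import ChatOpenAI

self_ask_agent = initialize_agent(
    [Tool(name="Intermediate Answer", func=qa.run,
          description="Answers factual questions from the text corpus.")],
    ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    agent=AgentType.SELF_ASK_WITH_SEARCH, verbose=True,
)
self_ask_agent.run("Who killed Karna, and why?")
```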
This approach gave me some good results. It works well for single-hop reasoning. But even this fails for questions that require multiple hops.
For example, the question:
Who killed Karna, and why?
is relatively easy to answer with this approach.
The question:
Why did Arjuna kill Karna, his half-brother?
is much more difficult to answer. It requires the LLM to know that Arjuna did not know that Karna was his half-brother. The LLM cannot work out that it needs this fact, either by understanding the question or by asking further questions based on the original one.
Human Research Process
Quoting GPT again
AI agents are designed to mimic human-like behaviour in problem-solving and decision-making tasks
So, my next idea was to research how humans research, the meta-research if you like. I imagined myself sitting in a library (College nostalgia) with easy access to all the books relevant to the topic of my research. I took a notebook and a pen and started jotting down the process I follow when I research a topic.
Here is what I came up with.
Research methodology:
Note down the original query on a page.
1. I try to answer the current question by reading a few books. In the process, I make a few notes and bookmark a few excerpts that I find most relevant to the current question.
2. Invariably, I find many unknowns in these excerpts. I note these unknowns and write down a few more questions that can help me learn about them.
3. Out of these questions, I choose the one most pertinent to the original question.
4. I go back to step 1.
After a few such iterations, I ask myself if I have enough information to answer the original question.
If yes, then job well done! If not, then toil on.
Voila!
Finally, I knew what to code. I hoped that, with some prompt engineering, this process could give me more profound answers than any of the approaches I had tried previously.
Spoiler alert… it did! 🙂
Before sitting down to code, I searched the internet for similar ideas. And I discovered BabyAGI. What a wonderful world!
Here is a repo describing the BabyAGI
I realised there were many similarities between BabyAGI and the above research process. So with gratitude, I took some inspiration from the prompts used in the BabyAGI implementation.
The Research Agent – Implementation
Here is the same process converted to a flow chart using the amazing draw.io

Every blue box in this chart is a call to an LLM.
Components
- QA Agent – Search for answers and further context. This is a simple ‘stuff’ Retrieval QA chain that uses a vector store. In the future, this can be an AI Agent that uses tools like vector stores, search APIs, Wikipedia APIs, moderation APIs, and previous research data. The prompt here is tuned to generate succinct answers based on 1) the context (the retrieved documents) and 2) the pertinence to the original question. Except for the first loop, the current question is always an intermediate question generated in step 2 and chosen in step 3. The Agent appends the intermediate answer to the notes and the latest excerpts (the documents used to answer the current question) to the bookmarks. The most recent of these documents are used in step 2.
- Question Generator – Ask more questions based on fresh notes. Here, the Agent uses the most recent vector search results matching the current question to generate more questions pertinent to the original question. It appends these questions to the list of unanswered questions. The prompt here is tuned so that the newly generated questions do not overlap with the existing list of questions.
- Most Pertinent Question Picker – Pick the one question most pertinent to the original question. This prompt picks, from the list of unanswered questions, the one question most pertinent to the original question. This question is used as the current question for the next loop. In the next loop, the Agent removes it from the list of unanswered questions after generating a fresh set of questions.
- Analyser – Do I know enough? I am using a max_iterations parameter to exit the loop. This works pretty well for now. However, it might be better to dynamically decide the number of iterations, or an exit strategy, based on the evolving context. I will work on an ‘analyser’ that can do this autonomously in the future.
- Research Compiler – Compile the research. This is the final prompt. It uses the notes made during the research process to arrive at an elaborate final answer to the original question.
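To make the loop concrete, here is a minimal sketch of how these components fit together, reusing the `vectorstore` from the earlier Retrieval QA sketch. The prompts are abbreviated paraphrases, and names like `MAX_ITERATIONS` are assumptions, not the exact ones from the notebook.

```python
# Sketch of the research loop: QA Agent -> Question Generator ->
# Most Pertinent Question Picker -> (repeat) -> Research Compiler.
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
    return_source_documents=True,  # needed to collect the excerpts
)

MAX_ITERATIONS = 5  # the 'analyser' exit criterion, fixed for now
original_question = "Why did the Mahabharata war happen?"
current_question = original_question
notes, unanswered = [], []

for _ in range(MAX_ITERATIONS):
    # 1. QA Agent: answer the current question and bookmark the excerpts.
    result = qa_chain({"query": current_question})
    notes.append(f"Q: {current_question}\nA: {result['result']}")
    excerpts = "\n".join(doc.page_content for doc in result["source_documents"])

    # 2. Question Generator: new questions from the freshest excerpts,
    #    tuned not to overlap with the existing list.
    new_questions = llm.predict(
        f"Original question: {original_question}\n"
        f"Excerpts:\n{excerpts}\n"
        f"Existing questions:\n{unanswered}\n"
        "List a few new, non-overlapping questions (one per line) that would "
        "help answer the original question."
    ).splitlines()
    unanswered.extend(q.strip() for q in new_questions if q.strip())

    # 3. Most Pertinent Question Picker: choose the next current question.
    current_question = llm.predict(
        f"Original question: {original_question}\n"
        "Candidate questions:\n" + "\n".join(unanswered) + "\n"
        "Return only the single candidate most pertinent to the original question."
    ).strip()
    if current_question in unanswered:
        unanswered.remove(current_question)

# 4. Research Compiler: compile the notes into an elaborate final answer.
final_answer = llm.predict(
    f"Original question: {original_question}\n"
    "Research notes:\n" + "\n\n".join(notes) + "\n"
    "Using only these notes, write an elaborate final answer to the original question."
)
print(final_answer)
```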
Results
The Research Agent is a big improvement over all the previous approaches I have tried. It yields more detailed and accurate answers than any other approach. I have been playing with this for a few weeks now, and I am surprised by the richness of the answers I get.
The Agent avoids the problem of hallucinations to a greater extent than any previous approach. It autocorrects, in later iterations, the hallucinations and factual errors it generated in the first few. The deeper it gets into a problem, the more accurate its results.
Here is an example run.
Question: Why did the Pandavas have to go live in the forest for 12 years?
Output –

Just in case you are curious, here is the final answer
Final Answer:
The Pandavas' decision to live in the forest for 12 years
was primarily due to the circumstances and events that
unfolded in their lives. After losing a game of dice to
the Kauravas, the Pandavas were forced into exile for
13 years, with the last year to be spent incognito.
During their exile, they faced numerous challenges
and hardships, including living in the forest.
Bhima contemplated killing all of
the Kurus, but his brother Yudhisthira always
pacified him, stating that it was not the opportune
time to fulfil their desires.
Thus, living in the forest allowed them to avoid
confrontations and maintain peace.
During their time in the forest, the Pandavas faced several difficulties.
One of the main challenges was Bhima's struggle to tolerate the ascetic
life. He contemplated killing all of the Kurus, but his brother
Yudhisthira always pacified him and advised against it.
Additionally, the Pandavas had to live in the forest and depend
on begging for their sustenance since their kingdom had been
plundered in gambling. They were distressed by the loss of their
kingdom and the insults to their queen Draupadi. The forest itself
was also full of dangers, adding to their hardships. Furthermore,
the Pandavas felt unable to fulfil the needs of certain pious Brahmanas
who stayed with them in the forest, as they did not possess anything
and were dependent on begging for their own survival.
Despite these challenges, the Pandavas made the most of their time
in the forest. They received knowledge from sages and saintly persons,
which helped them gain wisdom and spiritual insights. The great sage
Vyasa informed them that many generals had agreed to support their
cousin Duryodhana, including Karna, Shakuni, Bhurishravas, Sala,
Drona, and Bhishma. Vyasa also told Arjuna that if he were to fight
in a future war, he would need to obtain the divine astras of Lord
Indra and Lord Shiva.
In light of this information, Yudhisthira ordered Arjuna to go to the
Himalayan mountains to perform austerities to please Lord Shiva.
This was necessary for Arjuna to obtain the divine weapons that would
be crucial in the upcoming war. The Pandavas understood the importance
of being prepared and gathering the necessary resources to counter
the strength of their opponents. Therefore, Yudhisthira made the
decision to send Arjuna on this mission to seek Lord Shiva's blessings
and acquire the divine astras.
Overall, the Pandavas' exile in the forest for twelve years was a
result of the conditions imposed on them after losing the game
of dice. They faced various difficulties during their time in the
forest, but also gained valuable knowledge and prepared themselves
for the challenges that lay ahead.
This answer is quite an elaborate one. But the beauty of the Agent is not just that it answered the original question accurately; it also went further and uncovered the story surrounding the question.
In most cases, the answers I get are rich in such details. And every such answer inspires my curiosity for further enquiry.
The Agent also yields a set of answered questions and unanswered questions that it notes down during the research process. So after every run, it leads me to many other questions that I can ask. In the past few weeks, I have learned more about the Mahabharata than I did in many years before that.
**Unanswered Questions**
'4. How did the Pandavas receive knowledge from sages and saintly persons during their time in the forest?'
'5. What were the specific austerities that Arjuna had to perform in the Himalayan mountains to please Lord Shiva?'
'6. How did the Pandavas manage to hide from Duryodhana's spies for almost the entire thirteenth year of their exile?'
'8. How did Bhima cope with the challenges of living as an ascetic in the forest? Did he face any particular difficulties or struggles during their time in exile?'
'9. Can you provide more information about the generals who supported Duryodhana's cause? What were their roles and contributions in the Kurukshetra war?'
'10. How did the Pandavas manage to maintain a peaceful life in the forest despite the challenges they faced?'
'11. What were the specific teachings and knowledge that the Pandavas received from the sages and saintly persons during their time in the forest?'
'12. Can you provide more information about the palace where the Pandavas lived for one full year before going to the forest?'
'13. How did Lord Krishna's presence in the forest affect the Pandavas' experience during their exile?'
'14. Can you provide more information about the dangers the Pandavas faced while living in the forest?'
'15. What were the specific challenges and difficulties that Yudhisthira and his brothers faced in their daily lives as inhabitants of the forest?'
Imagine following the same process for other domains of knowledge, quite an exhilarating thought!
The Code
And here are the Python notebooks with the implementation of the Research Agent.
The Git repository of the Mahabharata dataset
GitHub – rahulnyk/mahabharata: Mahabharata text compiled from multiple sources, split into chunks…
What next?
The current implementation is a simple version of the idea of an autonomous AI Research Agent. I tuned the research process several times throughout the implementation of the Agent. It has been an exciting journey, but the fun is not over yet. Here are some of the enhancements I am currently working on.
- Deploy this agent on a public link and observe more usage patterns.
- Use the Agent with source documents other than The Mahabharata.
- Step 1 of the process is currently a simple ‘stuff’ QA chain that uses a vector store with the source text corpus. I am working to replace it with a ‘ReAct’ Agent so that other tools like search APIs, Wikipedia, moderation APIs etc., can be used in the research process.
- I am saving the data and metadata generated during every run to a ‘runs’ vector store. I am also saving the embeddings of the original question to the same store. This has helped me follow the reasoning trail of the Agent and observe several logical patterns that emerge from it. This can help tune the QA agent to follow a tighter reasoning path.
- Currently, the Research Agent exits after a fixed number of iterations. This works quite well for most questions. However, it might be better to dynamically decide the number of iterations, or an exit strategy, based on the evolving context. I will work on an ‘analyser’ that can do this autonomously.
- The Agent works well for most types of questions, except for the meta-questions. For example, if I ask ‘Describe what happens in book 5 chapter 3’, the Agent struggles to answer. In future versions, I will include a self-query retriever with the ‘ReAct’ Agent to deal with such cases.
- So far, I have tried the Agent only with the OpenAI GPT-3.5 model. It costs me about $0.02 per run. I will soon try the Research Agent with a smaller model like Llama that can be hosted locally.
In the next article, I plan to write about my findings after implementing some of these updates. The current notebook is just a simple implementation of the larger idea: to create a top-notch Autonomous AI Research Agent that can excel at finding deeply researched answers to difficult questions. So, please feel free to suggest improvements and, if possible, partner with me to mature it further.
I hope you find this article and the shared code helpful.
Thanks for reading.
Credits for the datasets used in this article, along with licensing information:
- Complete Translation by K. M. Ganguli: Available in the public domain.
- Laura Gibbs’ Tiny Tales: a retelling of the Mahabharata in two hundred episodes of 100 words each. I am using her work here with her permission.
- Kaggle data repo by [Tilak](https://www.kaggle.com/tilakd): All 18 Parvas of Mahabharata in text format for NLP. Shared with public domain license by Tilak