Building an Agentic RAG with Phidata

When building applications with Large Language Models (LLMs), the quality of responses depends heavily on effective planning and reasoning for a given user task. While traditional RAG systems are powerful, incorporating agentic workflows can significantly improve the system's ability to process and respond to queries.

In this article, you'll build an Agentic RAG system with memory components using Phidata, an open-source agentic framework, demonstrating how to combine a vector database (Qdrant), embedding models, and intelligent agents for improved results.

Learning Objectives

  • Understand and design the architecture of the components required for Agentic RAG systems.
  • Learn how vector databases and embedding models for knowledge base creation are integrated within the agentic workflow.
  • Learn to implement memory components for improved context retention.
  • Develop an AI agent that can perform multiple tool calls and decide which tool to use based on the user's question or task, using Phidata.
  • Work through a real-world use case: a Document Analyzer Assistant agent that can interact with private information from the knowledge base and fall back to DuckDuckGo when the knowledge base lacks context.

This article was published as a part of the Data Science Blogathon.

What are Agents and RAG?

Agents, in the context of AI, are components designed to emulate human-like thinking and planning capabilities. An agent's responsibilities include:

  • Task decomposition into manageable subtasks.
  • Intelligent decision-making about which tools to use and which actions to take.
  • Reasoning about the best approach to solving a problem.

RAG (Retrieval-Augmented Generation) combines knowledge retrieval with LLM capabilities. When we integrate agents into RAG systems, we create a powerful workflow that can:

  • Analyze user queries intelligently.
  • Store user documents in a knowledge base or vector database.
  • Choose appropriate knowledge sources or context for a given user query.
  • Plan the retrieval and response-generation process.
  • Maintain context through memory components.

The key difference between traditional RAG and Agentic RAG lies in the decision-making layer that determines how to process each query and which tools to call for real-time information.
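
To make the idea concrete, here is a small, self-contained toy sketch of that decision layer in plain Python. It is purely illustrative: the functions and the keyword-based routing below are stand-ins for what an LLM-driven agent does, not Phidata APIs.

def search_knowledge_base(query: str) -> str:
    # Stand-in for a vector database lookup over indexed documents.
    return f"[docs context for: {query}]"

def web_search(query: str) -> str:
    # Stand-in for a real-time web search tool.
    return f"[live web results for: {query}]"

def route(query: str) -> str:
    # In a real agent an LLM makes this decision; a keyword check stands in here.
    fresh_keywords = ("today", "latest", "news", "price")
    if any(word in query.lower() for word in fresh_keywords):
        return web_search(query)
    return search_knowledge_base(query)

print(route("What indexing techniques does the documentation mention?"))
print(route("What is the latest Qdrant release?"))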

Now that we know there is such a thing as Agentic RAG, how do we build it? Let's break it down.

What is Phidata?

Phidata is an open-source framework designed to build, monitor, and deploy agentic workflows. It supports multimodal AI agents equipped with memory, knowledge, tools, and reasoning capabilities. Its model-agnostic architecture ensures compatibility with various large language models (LLMs), enabling developers to turn any LLM into a functional AI agent. Additionally, Phidata lets you deploy your agent workflows using a bring-your-own-cloud (BYOC) approach, offering both flexibility and control over your AI systems.

Key features of Phidata include the ability to build teams of agents that collaborate to solve complex problems, a user-friendly Agent UI for seamless interaction (the Phidata playground), and built-in support for agentic retrieval-augmented generation (RAG) and structured outputs. The framework also emphasizes monitoring and debugging, providing tools to ensure robust and reliable AI applications.

Agent Use Cases with Phidata

Explore the transformative power of agent-based systems in real-world applications, leveraging Phidata to improve decision-making and task automation.

Financial Analysis Agent

By integrating tools like YFinance, Phidata enables the creation of agents that can fetch real-time stock prices, analyze financial data, and summarize analyst recommendations. Such agents help investors and analysts make informed decisions by providing up-to-date market insights.
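
As a rough sketch (assuming Phidata's bundled YFinance tool, phi.tools.yfinance.YFinanceTools, and an OpenAI API key in the environment), such an agent can be wired up like this:

from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.yfinance import YFinanceTools

# Agent that can pull live market data via YFinance and reason over it.
finance_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[YFinanceTools(stock_price=True, analyst_recommendations=True, company_info=True)],
    show_tool_calls=True,
    markdown=True,
)

finance_agent.print_response("Summarize analyst recommendations for NVDA", stream=True)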

Web Search Agent

Phidata also helps you develop agents capable of retrieving real-time information from the web using search tools like DuckDuckGo, SerpAPI, or Serper. These agents can answer user queries by sourcing the latest data, making them valuable for research and information-gathering tasks.
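
A minimal sketch of such an agent, assuming the DuckDuckGo tool shipped with Phidata and an OpenAI API key in the environment:

from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo

# Agent that answers questions by searching the web in real time.
web_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[DuckDuckGo()],
    show_tool_calls=True,
    markdown=True,
)

web_agent.print_response("What is the latest news about open-source LLMs?", stream=True)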

Multimodal Agents

Phidata also supports multimodal capabilities, enabling the creation of agents that analyze images, videos, and audio. These multimodal agents can handle tasks such as image recognition, text-to-image generation, audio transcription, and video analysis, offering versatile solutions across various domains. For text-to-image or text-to-video tasks, tools like DALL-E and Replicate can be integrated, while for image-to-text and video-to-text tasks, multimodal LLMs such as GPT-4, Gemini 2.0, Claude, and others can be used.
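
As an illustrative sketch (assuming the Gemini model class in Phidata, a Google API key in the environment, and that your Phidata version's print_response accepts an images argument; the image URL is a placeholder):

from phi.agent import Agent
from phi.model.google import Gemini

# Multimodal agent: describe the contents of an image with a vision-capable LLM.
image_agent = Agent(
    model=Gemini(id="gemini-2.0-flash-exp"),
    markdown=True,
)

image_agent.print_response(
    "Describe what you see in this image.",
    images=["https://example.com/sample.jpg"],  # placeholder URL
    stream=True,
)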

Real-time Use Case for Agentic RAG

Imagine you have documentation for your startup and want to create a chat assistant that can answer user questions based on that documentation. To make your chatbot more intelligent, it also needs to handle real-time data. Normally, answering real-time queries requires either rebuilding the knowledge base or retraining the model.

This is where agents come into play. By combining the knowledge base with agents, you can create an Agentic RAG (Retrieval-Augmented Generation) solution that not only improves the chatbot's ability to retrieve accurate answers but also enhances its overall performance.


We have three main components that come together to form our knowledge base. First, we have data sources, like documentation pages, PDFs, or any websites we want to use. Then we have Qdrant, our vector database: a smart storage system that helps us find relevant information quickly. Finally, we have the embedding model, which converts our text into a format computers can work with. These three components feed into our knowledge base, which acts as the brain of our system.

Next, we define the Agent object from Phidata.

The agent is connected to three components:

  • A reasoning model (like GPT-4, Gemini 2.0, or Claude) that helps it think and plan.
  • Memory (SqlAgentStorage) that helps it remember previous conversations.
  • Tools (like DuckDuckGo search) that it can use to find information.

Note: Here both the knowledge base and DuckDuckGo act as tools, and based on the task or user query the agent decides which tool to use to generate the response. Also, the embedding model is OpenAI by default, so we will use OpenAI's GPT-4o as the reasoning model.

Let's build this out in code.

Step-by-Step Code Implementation: Agentic RAG using Qdrant, OpenAI, and Phidata

It's time to build a Document Analyzer Assistant agent that can interact with private information (a website) from the knowledge base and fall back to DuckDuckGo when the knowledge base lacks the required context.

Step 1: Setting Up Dependencies

To build the Agentic RAG workflow, we need to install a few libraries:

  • Phidata: to define the Agent object and run the workflow.
  • Google Generative AI: reasoning model, i.e., Gemini 2.0 Flash.
  • Qdrant: vector database where the knowledge base is stored and later used to retrieve relevant information.
  • DuckDuckGo: search engine used to fetch real-time information.
pip install phidata openai google-generativeai duckduckgo-search qdrant-client

Step 2: Initial Configuration and API Key Setup

In this step, we'll set up the environment variables and gather the required API credentials to run this use case. You can get your OpenAI API key from https://platform.openai.com/: create an account and generate a new key.

from phi.knowledge.website import WebsiteKnowledgeBase
from phi.vectordb.qdrant import Qdrant

from phi.agent import Agent
from phi.storage.agent.sqlite import SqlAgentStorage
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo

import os

os.environ['OPENAI_API_KEY'] = "<replace>"

Step 3: Set Up the Vector Database – Qdrant

Now initialize the Qdrant client by providing the collection name, URL, and API key for your vector database. The Qdrant database stores and indexes the data from the website, allowing the agent to retrieve relevant information based on user queries. This step sets up the data layer for your agent:

  • Create a cluster at https://cloud.qdrant.io/.
  • Give your cluster a name and copy the API key once the cluster is created.
  • Under the curl command, you can copy the endpoint URL.
COLLECTION_NAME = "agentic-rag"
QDRANT_URL = "<replace>"
QDRANT_API_KEY = "<replace>"

vector_db = Qdrant(
    collection=COLLECTION_NAME,
    url=QDRANT_URL,
    api_key=QDRANT_API_KEY,
)
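
Optionally, you can sanity-check the connection with the qdrant-client library directly before moving on; this quick verification step is not required by Phidata:

from qdrant_client import QdrantClient

# Quick connectivity check against the cluster (optional).
client = QdrantClient(url=QDRANT_URL, api_key=QDRANT_API_KEY)
print(client.get_collections())  # lists existing collections; empty on a fresh cluster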

Step 4: Creating the Knowledge Base

Here, you'll define the sources from which the agent will pull its knowledge. In this example, we're building a document analyzer agent that makes it easy to answer questions from a website. We will use the Qdrant documentation website URL for indexing.

The WebsiteKnowledgeBase object interacts with the Qdrant vector database to store the indexed knowledge from the provided URL. It is then loaded into the knowledge base for retrieval by the agent.

Note: We use the load function to index the data source into the knowledge base. This needs to run only once per collection name; if you change the collection name or want to add new data, run the load function again at that point.

URL = "https://qdrant.tech/documentation/overview/"

knowledge_base = WebsiteKnowledgeBase(
    urls = [URL],
    max_links = 10,
    vector_db = vector_db,
)

knowledge_base.load()  # only run once; after the collection is created, comment this line out

Step 5: Define Your Agent

The Agent configures an LLM (GPT-4o) for response generation, a knowledge base for information retrieval, and a SQLite storage system that tracks interactions and responses as memory. It also sets up a DuckDuckGo search tool for additional web searches when needed. This setup forms the core AI agent capable of answering queries.

We will set show_tool_calls to True to observe the backend runtime execution and monitor whether the query is routed to the knowledge base or the DuckDuckGo search tool. When you run this cell, it will create a database file where all messages are saved, since memory storage is enabled and add_history_to_messages is set to True.

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),   # reasoning model for response generation
    knowledge=knowledge_base,        # Qdrant-backed knowledge base
    tools=[DuckDuckGo()],            # web search tool for out-of-knowledge-base queries

    show_tool_calls=True,
    markdown=True,

    storage=SqlAgentStorage(table_name="agentic_rag", db_file="agents_rag.db"),
    add_history_to_messages=True,
)
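
Once you have run a few queries (next step), you can peek at the SQLite file that SqlAgentStorage writes to, using only the standard library; the table and file names below are the ones configured above:

import sqlite3

# Inspect the SQLite database that stores the agent's conversation memory.
conn = sqlite3.connect("agents_rag.db")
tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
print(tables)  # expect a table named 'agentic_rag' after the first run
conn.close()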

Step 6: Try Multiple Queries

Finally, the agent is ready to process user queries. By calling the print_response() function, you pass in a user query, and the agent responds by retrieving relevant information from the knowledge base and processing it. If the answer is not in the knowledge base, it uses the search tool. Let's observe the results.

Query 1: From the knowledge base

agent.print_response(
  "what are the indexing methods talked about within the doc?", 
  stream=True
)

Query 2: Outside the knowledge base

agent.print_response(
  "who's Virat Kohli?", 
  stream=True
)
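
Because chat history is stored via SqlAgentStorage and add_history_to_messages is enabled, you can also ask a follow-up that relies purely on memory; a quick illustrative check (the exact wording of the answer will vary):

agent.print_response(
  "What was my previous question about?", 
  stream=True
)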

Advantages of Agentic RAG

Discover the key advantages of Agentic RAG, where intelligent agents and retrieval combine to optimize data access and decision-making.

  • Enhanced reasoning capabilities for better response generation.
  • Intelligent tool selection based on query context, for example the knowledge base, DuckDuckGo, or any other tool from which context can be fetched and provided to the agent.
  • Memory integration for improved context awareness, letting the agent recall and draw on past conversation messages.
  • Better planning and task decomposition: the first part of an agentic workflow is to take the task, break it into subtasks, and then make better decisions and action plans.
  • Flexible integration with various data sources such as PDFs, websites, CSVs, docs, and many more.

Conclusion

Implementing Agentic RAG with memory components provides a reliable solution for building intelligent knowledge retrieval systems and search engines. In this article, we explored what agents and RAG are, and how to combine them. With Agentic RAG, query routing improves thanks to the decision-making capabilities of the agents.

Key Takeaways

  • Discover how Agentic RAG with Phidata enhances AI by integrating memory, a knowledge base, and dynamic query handling.
  • Learn to implement Agentic RAG with Phidata for efficient information retrieval and adaptive response generation.
  • The Phidata library offers a streamlined implementation process with roughly 30 lines of core code, and supports multimodal models such as Gemini 2.0 Flash.
  • Memory components are crucial for maintaining context and improving response relevance.
  • Integration of multiple tools (knowledge base, web search) enables flexible information retrieval; vector databases like Qdrant provide advanced indexing capabilities for efficient search.

Frequently Asked Questions

Q1. Can Phidata handle multimodal tasks, and what tools does it integrate for this purpose?

A. Yes, Phidata is built to support multimodal AI agents capable of handling tasks involving images, videos, and audio. It integrates tools like DALL-E and Replicate for text-to-image or text-to-video generation, and uses multimodal LLMs such as GPT-4, Gemini 2.0, and Claude for image-to-text and video-to-text tasks.

Q2. What tools and frameworks are available for developing Agentic RAG systems?

A. Developing Agentic Retrieval-Augmented Generation (RAG) systems involves using various tools and frameworks that facilitate the integration of autonomous agents with retrieval and generation capabilities. Some of the tools and frameworks available for this purpose are LangChain, LlamaIndex, Phidata, CrewAI, and AutoGen.

Q3. Can Phidata integrate with external tools and knowledge bases?

A. Yes, Phidata allows the integration of various tools and knowledge bases. For instance, it can connect with financial data tools like YFinance for real-time stock analysis or web search tools like DuckDuckGo for retrieving up-to-date information. This flexibility enables the creation of specialized agents tailored to specific use cases.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author's discretion.
