Ever wished you had a private tutor to help you solve tough math problems? In this article, we'll explore how to build a math problem solver chat app using LangChain, Gemma 2 9B, Llama 3.2 Vision, and Streamlit. Our app will not only understand and solve text-based math problems but will also be able to solve image-based questions. Let's look at the problem statement and explore how to approach and solve it step by step.
Learning Outcomes
- Learn to create a powerful, interactive chat app using LangChain to integrate external tools and solve tasks.
- Master the process of building a chat app with LangChain that can efficiently solve complex math problems.
- Explore the use of APIs and environment variables to securely interact with large language models.
- Gain hands-on experience in designing a user-friendly web app with dynamic question-solving capabilities.
- Discover methods for seamless interaction between frontend interfaces and backend AI models.
This article was published as a part of the Data Science Blogathon.
Defining the Problem: Business Case and Goals
We are an EdTech company looking to develop an innovative AI-powered application that can solve both text-based and image-based math problems in real time. The app should provide solutions with step-by-step explanations to enhance learning and engagement for students, educators, and independent learners.
You are tasked with designing and building this application using the latest AI technologies. The app must be scalable, user-friendly, and able to process both textual inputs and images with a seamless experience.
Proposed Solution: Approach and Implementation Strategy
We'll now discuss the components of the proposed solution below.
Gemma 2 9B IT
It's an open-source large language model from Google designed to process and generate human-like text with remarkable accuracy. In this application:
- Role: It serves as the "brain" for solving math problems presented in text format.
- How it works: When a user inputs a text-based math problem, Gemma 2 9B understands the question, applies the necessary mathematical logic, and generates a solution.
Llama 3.2 Vision
It's an open-source model from Meta AI, capable of processing and analyzing images, including handwritten or printed math problems.
- Role: Enables the app to "see" and interpret math problems provided in image format and generate the response.
- How it works: When users upload an image, the Llama 3.2 Vision model identifies the mathematical expressions or questions within it and converts them into a format suitable for problem-solving.
LangChain
It's a framework specifically designed for building applications that involve interactions between language models and external systems.
- Role: Acts as the intermediary between the app's interface and the AI models, managing the flow of information.
- How it works:
- It coordinates how the user's input (text or image) is processed.
- It ensures the smooth exchange of information between Gemma 2 9B, the Llama 3.2 Vision model, and the app interface.
Streamlit
It's an open-source Python library for creating interactive web applications quickly and easily.
- Role: It's used to write the frontend in Python.
- How it works:
- Developers can use Streamlit to design and deploy a web interface where users enter text or upload images.
- The interface interacts seamlessly with LangChain and the underlying AI models to display results.
Visualizing the Approach: Flow Diagram of the Solution
The process begins by setting up the environment, checking the Groq API key, and configuring the Streamlit page settings. It then initializes the text LLM (ChatGroq) and integrates tools like Wikipedia and a calculator to enhance the text agent's capabilities. A welcome message and sidebar navigation guide the user through the interface, where they can enter either text or image-based queries. The text section collects user questions and processes them using the text agent, which uses the LLM and external tools to generate answers. Similarly, for image queries, the image section allows users to upload images, which are then processed by the image-specific LLM (ChatGroq).
Once the text or image query is processed, the respective agent generates and displays the appropriate answers. The system ensures smooth interaction by alternating between handling text and image queries. After displaying the answers, the process concludes, and the system is ready for the next query. This flow creates an intuitive, multimodal experience where users can ask both text and image-based questions, with the system providing accurate and efficient responses.
Setting Up the Foundation
Setting up the foundation is a crucial step in ensuring a seamless integration of tools and processes, laying the groundwork for the successful operation of the system.
Environment Setup
First things first, set up your development environment. Make sure you have Python installed, and create a virtual environment to keep your project dependencies organized.
# Create a virtual environment
python -m venv env
# Activate it on Windows
.\env\Scripts\activate
# Activate it on macOS/Linux
source env/bin/activate
Install Dependencies
Install the necessary libraries using:
pip install -r https://raw.githubusercontent.com/Gouravlohar/Math-Solver/refs/heads/master/requirements.txt
Get the Groq API
- To access the Llama and Gemma models, we'll use Groq.
- Get your free API key from here.
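Since the app loads the key with load_dotenv(), store it in a .env file at the project root. A quick way to create it (the key value below is a placeholder, use your own key from the Groq console):

```shell
# Store the Groq API key in a .env file so load_dotenv() can pick it up.
# "gsk_your_key_here" is a placeholder, not a real key.
echo 'GROQ_API_KEY=gsk_your_key_here' > .env
cat .env
```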
Import Necessary Libraries
import streamlit as st
import os
import base64
from dotenv import load_dotenv
from langchain_groq import ChatGroq
from langchain.chains import LLMMathChain, LLMChain
from langchain.prompts import PromptTemplate
from langchain_community.utilities import WikipediaAPIWrapper
from langchain.agents.agent_types import AgentType
from langchain.agents import Tool, initialize_agent
from langchain_community.callbacks.streamlit import StreamlitCallbackHandler
from groq import Groq
These imports collectively set up the necessary libraries and modules to create a Streamlit web application that interacts with language models for solving mathematical problems and answering questions based on text and image inputs.
Load Environment Variables
load_dotenv()
groq_api_key = os.getenv("GROQ_API_KEY")
if not groq_api_key:
    st.error("Groq API Key not found in .env file")
    st.stop()
This section of the code is responsible for loading environment variables and ensuring that the necessary API key for Groq is available.
Set Up Both LLMs
st.set_page_config(page_title="Math Solver", page_icon="👨‍🔬")
st.title("Math Solver")
llm_text = ChatGroq(model="gemma2-9b-it", groq_api_key=groq_api_key)
llm_image = ChatGroq(model="llama-3.2-90b-vision-preview", groq_api_key=groq_api_key)
This section of the code sets up the Streamlit application by configuring its page title and icon. It then initializes two different language models (LLMs): llm_text for handling text-based questions using the "gemma2-9b-it" model, and llm_image for handling questions that include images using the "llama-3.2-90b-vision-preview" model. Both models are authenticated using the previously retrieved Groq API key.
Initialize Tools and Prompt Template
wikipedia_wrapper = WikipediaAPIWrapper()
wikipedia_tool = Tool(
    name="Wikipedia",
    func=wikipedia_wrapper.run,
    description="A tool for searching the Internet to find various information on the topics mentioned."
)
math_chain = LLMMathChain.from_llm(llm=llm_text)
calculator = Tool(
    name="Calculator",
    func=math_chain.run,
    description="A tool for solving mathematical problems. Provide only the mathematical expressions."
)
prompt = """
You are a mathematical problem-solving assistant tasked with helping users solve their questions. Arrive at the solution logically, providing a clear and step-by-step explanation. Present your response in a structured point-wise format for better understanding.
Question: {question}
Answer:
"""
prompt_template = PromptTemplate(
    input_variables=["question"],
    template=prompt
)
# Combine all the tools into a chain for text questions
chain = LLMChain(llm=llm_text, prompt=prompt_template)
reasoning_tool = Tool(
    name="Reasoning Tool",
    func=chain.run,
    description="A tool for answering logic-based and reasoning questions."
)
# Initialize the agent for text questions
assistant_agent_text = initialize_agent(
    tools=[wikipedia_tool, calculator, reasoning_tool],
    llm=llm_text,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=False,
    handle_parsing_errors=True
)
This part of the code initializes the tools and configurations required to handle text-based questions in the Streamlit application. It sets up a Wikipedia search tool using the WikipediaAPIWrapper, which allows the application to fetch information from the web, and a calculator tool built on the LLMMathChain class, which uses the llm_text model and is configured specifically for mathematical expressions. It also defines a prompt template that structures questions and expected answers in a clear, step-by-step manner. This template guides the language model to generate a logical and well-explained response to each user query.
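Under the hood, formatting a PromptTemplate amounts to named string substitution. A dependency-free sketch of how the template above turns a user question into the final prompt (the real PromptTemplate adds input-variable validation on top of plain str.format):

```python
# Plain-Python equivalent of prompt_template.format(question=...).
TEMPLATE = (
    "You are a mathematical problem-solving assistant tasked with helping users "
    "solve their questions step by step.\n"
    "Question: {question}\n"
    "Answer:"
)

def build_prompt(question: str) -> str:
    # Substitute the user's question into the fixed instruction template
    return TEMPLATE.format(question=question)

print(build_prompt("What is 15% of 240?"))
```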
Streamlit Session State
if "messages" not in st.session_state:
    st.session_state["messages"] = [
        {"role": "assistant", "content": "Welcome! I am your Assistant. How can I help you today?"}
    ]
for msg in st.session_state.messages:
    if msg["role"] == "user" and "image" in msg:
        st.chat_message(msg["role"]).write(msg["content"])
        st.image(msg["image"], caption="Uploaded Image", use_column_width=True)
    else:
        st.chat_message(msg["role"]).write(msg["content"])
The code initializes the chat messages in the session state if they don't exist, starting with a default welcome message from the assistant. It then loops through the messages in st.session_state and renders each one in the chat interface. A user message that carries an image is rendered with both its text content and the uploaded image, shown with a caption; a message without an image displays only the text content. This ensures that all chat messages, along with any uploaded images, are displayed inside the chat interface.
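The message dictionaries follow a small convention: every message has a role and content, and user messages may additionally carry raw image bytes. A standalone sketch of that rendering logic, with a plain dict standing in for st.session_state:

```python
# Plain-dict stand-in for st.session_state, just to show the message structure.
session_state = {"messages": [
    {"role": "assistant", "content": "Welcome! I am your Assistant."},
    {"role": "user", "content": "Solve this:", "image": b"\x89PNG..."},
]}

def render(msg: dict) -> str:
    # User messages may carry an image alongside the text content
    if msg["role"] == "user" and "image" in msg:
        return f'{msg["role"]}: {msg["content"]} [image attached]'
    return f'{msg["role"]}: {msg["content"]}'

for m in session_state["messages"]:
    print(render(m))
```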
Sidebar and Response Cleaning
st.sidebar.header("Navigation")
if st.sidebar.button("Text Question"):
    st.session_state["section"] = "text"
if st.sidebar.button("Image Question"):
    st.session_state["section"] = "image"
if "section" not in st.session_state:
    st.session_state["section"] = "text"
def clean_response(response):
    if "```" in response:
        response = response.split("```")[1].strip()
    return response
This section of the code builds the sidebar navigation between the Text and Image sections, and the clean_response function strips a Markdown code fence from the LLM's response when one is present.
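The helper's behavior is easy to verify on its own: if the model wraps its answer in a Markdown code fence, only the fenced content is kept; otherwise the text passes through untouched. (In this snippet the fence string is built up from single backticks so it can sit inside a code block.)

```python
FENCE = "`" * 3  # a Markdown code fence, built up to keep this snippet readable

def clean_response(response: str) -> str:
    # Same logic as above: keep the content of the first fenced block, if any
    if FENCE in response:
        response = response.split(FENCE)[1].strip()
    return response

print(clean_response(f"Here is the answer:\n{FENCE}\n12 hours\n{FENCE}"))
print(clean_response("12 hours"))
```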
Processing Text-Based Inquiries
Processing text-based inquiries focuses on handling and addressing user questions in text form, using language models to generate precise responses based on the input provided.
if st.session_state["section"] == "text":
    st.header("Text Question")
    st.write("Please enter your mathematical question below, and I will provide a detailed solution.")
    question = st.text_area("Your Question:", "Example: I have 5 apples and 3 oranges. If I eat 2 apples, how many fruits do I have left?")
    if st.button("Get Answer"):
        if question:
            with st.spinner("Generating response..."):
                st.session_state.messages.append({"role": "user", "content": question})
                st.chat_message("user").write(question)
                st_cb = StreamlitCallbackHandler(st.container(), expand_new_thoughts=False)
                try:
                    response = assistant_agent_text.run(st.session_state.messages, callbacks=[st_cb])
                    cleaned_response = clean_response(response)
                    st.session_state.messages.append({"role": "assistant", "content": cleaned_response})
                    st.write("### Response:")
                    st.success(cleaned_response)
                except ValueError as e:
                    st.error(f"An error occurred: {e}")
        else:
            st.warning("Please enter a question to get an answer.")
This section of the code handles the "Text Question" functionality of the Streamlit application. When the section is active, it shows a header and a text area for entering any mathematics-related question. On clicking the "Get Answer" button, if a question has been entered, it displays a spinner indicating that a response is being generated. The question entered by the user is added to the session state messages and rendered in the chat interface.
Processing Image-Based Inquiries
Processing image-based inquiries involves analyzing and interpreting images uploaded by users, using advanced models to generate accurate responses or insights based on the visual content.
elif st.session_state["section"] == "image":
    st.header("Image Question")
    st.write("Please enter your question below and upload an image. I will provide a detailed solution.")
    question = st.text_area("Your Question:", "Example: What will be the answer?")
    uploaded_file = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if st.button("Get Answer"):
        if question and uploaded_file is not None:
            with st.spinner("Generating response..."):
                image_data = uploaded_file.read()
                image_data_url = f"data:image/jpeg;base64,{base64.b64encode(image_data).decode()}"
                st.session_state.messages.append({"role": "user", "content": question, "image": image_data})
                st.chat_message("user").write(question)
                st.image(image_data, caption="Uploaded Image", use_column_width=True)
This section of the code handles the "Image Question" functionality of the Streamlit application. When the "Image Question" section is active, it displays a header, a text area for users to enter their questions, and an option to upload an image. Upon clicking the "Get Answer" button, if both a question and an image are provided, it shows a spinner indicating that a response is being generated. The uploaded image is read and encoded in base64 format. The user's question and the image data are appended to the session state messages and displayed in the chat interface, with the image shown alongside the question. This setup ensures that both the text and image inputs are correctly captured and displayed for further processing.
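The base64 data-URL step can be exercised on its own. The mime parameter defaults to image/jpeg to match the code above, though strictly a PNG upload would call for image/png:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    # Embed raw image bytes as a base64 data URL, the format the vision model receives
    encoded = base64.b64encode(image_bytes).decode()
    return f"data:{mime};base64,{encoded}"

print(to_data_url(b"abc"))  # data:image/jpeg;base64,YWJj
```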
Initialize the Groq Client for the Llama 3.2 Vision Model
client = Groq()
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": question
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": image_data_url
                }
            }
        ]
    }
]
This section prepares the message for the Llama vision model.
Groq API Call
try:
    completion = client.chat.completions.create(
        model="llama-3.2-90b-vision-preview",
        messages=messages,
        temperature=1,
        max_tokens=1024,
        top_p=1,
        stream=False,
        stop=None,
    )
This setup sends the user's question and image to the Groq API, which processes the inputs using the specified model and returns a generated response.
Response from the Image Model
    response = completion.choices[0].message.content
    cleaned_response = clean_response(response)
    st.session_state.messages.append({"role": "assistant", "content": cleaned_response})
    st.write("### Response:")
    st.success(cleaned_response)
except ValueError as e:
    st.error(f"An error occurred: {e}")
else:
    st.warning("Please enter a question and upload an image to get an answer.")
This section of the code processes the response from the Groq API after generating a completion. It extracts the content of the response from the first choice in the completion result and cleans it using the clean_response function. The system appends the cleaned response to the session state messages with the role of "assistant" and displays it in the chat interface. The response appears under a "Response" header with a success message. If a ValueError occurs, the system displays an error message. If either the question or the image is not provided, a warning prompts the user to enter both to get an answer.
Check the full code in the GitHub repo here.
Output
Input for the Text Section
A tank has three pipes attached to it. Pipe A can fill the tank in 4 hours, Pipe B can fill it in 6 hours, and Pipe C can empty the tank in 3 hours. If all three pipes are opened together, how long will it take to fill the tank completely?
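As a sanity check on this sample problem, the expected answer can be verified directly: the net fill rate is 1/4 + 1/6 - 1/3 = 1/12 of the tank per hour, so the tank fills in 12 hours.

```python
from fractions import Fraction

# Net fill rate with all three pipes open (Pipe C drains, so it subtracts)
rate = Fraction(1, 4) + Fraction(1, 6) - Fraction(1, 3)
hours = 1 / rate
print(hours)  # 12
```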
Input for the Image Section
Conclusion
By combining the powers of Gemma 2 9B, Llama 3.2 Vision, LangChain, and Streamlit, it's possible to create a robust and user-friendly math problem-solving app that can change how students learn and engage with mathematics, providing step-by-step solutions and real-time feedback. This not only helps overcome the complexity of mathematical concepts but, more importantly, offers a scalable and accessible solution for learners at all levels.
This is just one example of the many ways large language models and AI can be used in education. As we continue to develop these technologies, even more creative and impactful applications will emerge to change how we learn and teach.
What do you think of such a concept? Have you ever tried to develop AI-based edutainment applications? Share your experiences and ideas in the comments below!
Key Takeaways
- You can build a powerful math problem solver using advanced AI models like Gemma 2 9B and Llama 3.2.
- Combine text and image processing to create an app that can handle various types of math problems.
- Learn how to integrate LangChain with various tools to create a powerful math problem solver chat app that enhances the user experience.
- Leverage Groq acceleration to ensure your app delivers quick responses.
- Streamlit makes it easy to build an intuitive and engaging user interface.
- Consider the ethical implications and design your app to promote learning and understanding.
Frequently Asked Questions
Q. What is Gemma 2 9B?
A. Gemma 2 9B is a powerful language model developed by Google, capable of understanding and solving complex math problems presented in text form.
Q. How does the app solve image-based math problems?
A. The app uses the Meta Llama 3.2 Vision model to interpret math problems in images. It extracts the problem and generates the response.
Q. Can the app show step-by-step solutions?
A. Yes, you can design the app to display the steps involved in solving a problem, which can be a helpful learning tool for users.
Q. What ethical considerations should be kept in mind?
A. It's important to ensure the app is used responsibly and doesn't facilitate cheating or hinder genuine learning. Design features that promote understanding and encourage users to engage with the problem-solving process.
Q. Where can I learn more about these tools?
A. You can find more information about Gemma 2 9B, Llama 3.2, Groq, LangChain, and Streamlit on Analytics Vidhya and on their respective official websites and documentation pages.
The media shown in this article is not owned by Analytics Vidhya and is used at the author's discretion.