Supercharge Your LLM Apps Using DSPy and Langfuse | by Raghav Bali | Oct, 2024

As illustrated in figure 1, DSPy is a PyTorch-like/Lego-like framework for building LLM-based apps. Out of the box, it comes with:

  • Signatures: These are specifications that define the input and output behaviour of a DSPy program. They can be defined using short-hand notation (like “question -> answer”, where the framework automatically understands that question is the input while answer is the output) or declaratively using Python classes (more on this in later sections).
  • Modules: These are layers of predefined components for powerful concepts like Chain of Thought, ReAct or even simple text completion (Predict). These modules abstract the underlying brittle prompts while still providing extensibility through custom components.
  • Optimizers: These are unique to the DSPy framework and draw inspiration from PyTorch itself. Optimizers make use of annotated datasets and evaluation metrics to help tune/optimize our LLM-powered DSPy programs.
  • Data, Metrics, Assertions and Trackers are some of the other components of this framework, which act as glue and work behind the scenes to enrich the overall framework.
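To make the short-hand signature notation concrete, here is a toy sketch of how a string such as “context, question -> answer” can be split into input and output field names. The `ToySignature` class and `parse_shorthand` helper are hypothetical names used only for illustration; this is not how DSPy implements signatures internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySignature:
    """Hypothetical stand-in for a parsed signature (not a DSPy class)."""
    inputs: tuple
    outputs: tuple


def parse_shorthand(spec: str) -> ToySignature:
    """Split 'a, b -> c' into input and output field names."""
    if spec.count("->") != 1:
        raise ValueError(f"expected exactly one '->' in {spec!r}")
    left, right = spec.split("->")
    inputs = tuple(name.strip() for name in left.split(",") if name.strip())
    outputs = tuple(name.strip() for name in right.split(",") if name.strip())
    if not inputs or not outputs:
        raise ValueError("a signature needs at least one input and one output")
    return ToySignature(inputs, outputs)


# Everything left of the arrow is an input, everything right of it an output.
sig = parse_shorthand("context, question -> answer")
print(sig.inputs, sig.outputs)
```

The point of the sketch is only that the arrow separates what the program receives from what it must produce; the framework takes care of turning that contract into prompts.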

To build an app/program using DSPy, we follow a modular yet step-by-step approach (as shown in figure 1 (right)). We first define our task, which helps us clearly define our program’s signature (input and output specifications). This is followed by building a pipeline program which makes use of one or more abstracted prompt modules, a language model module and retrieval model modules. Once we have all of this in place, we put together some examples along with the required metrics to evaluate our setup; these are used by the optimizers and assertion components to compile a powerful app.
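The “examples plus metric” step can be sketched in plain Python. The snippet below is a simplified illustration and not DSPy's evaluation API: `exact_match`, `evaluate` and `toy_program` are hypothetical names, and the program is a lookup table standing in for an LLM-backed pipeline.

```python
from typing import Callable, Iterable


def exact_match(expected: str, predicted: str) -> bool:
    """Case- and whitespace-insensitive comparison of two answers."""
    return expected.strip().lower() == predicted.strip().lower()


def evaluate(program: Callable[[str], str],
             examples: Iterable[tuple],
             metric: Callable[[str, str], bool]) -> float:
    """Return the fraction of (question, expected) pairs the program gets right."""
    examples = list(examples)
    if not examples:
        return 0.0
    hits = sum(metric(expected, program(question)) for question, expected in examples)
    return hits / len(examples)


def toy_program(question: str) -> str:
    """Hypothetical stand-in for an LLM-backed program."""
    return {"2+2?": "4", "Capital of France?": "paris"}.get(question, "unknown")


devset = [("2+2?", "4"), ("Capital of France?", "Paris"), ("Largest planet?", "Jupiter")]
score = evaluate(toy_program, devset, exact_match)
print(f"accuracy: {score:.2f}")
```

An optimizer needs exactly this kind of signal: a number it can try to push up by changing prompts or demonstrations while the program structure stays fixed.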

Langfuse is an LLM engineering platform designed to empower developers in building, managing, and optimizing LLM-powered applications. While it offers both managed and self-hosted solutions, we’ll focus on the self-hosting option in this post, which gives you full control over your LLM infrastructure.

Key Highlights of Langfuse Setup

Langfuse equips you with a suite of powerful tools to streamline the LLM development workflow:

  • Prompt Management: Effortlessly version and retrieve prompts, ensuring reproducibility and facilitating experimentation.
  • Tracing: Gain deep visibility into your LLM applications with detailed traces, enabling efficient debugging and troubleshooting. The intuitive out-of-the-box UI allows teams to annotate model interactions to develop and evaluate training datasets.
  • Metrics: Track crucial metrics such as cost, latency, and token usage, empowering you to optimize performance and control expenses.
  • Evaluation: Capture user feedback, annotate LLM responses, and even set up evaluation functions to continuously assess and improve your models.
  • Datasets: Manage and organize datasets derived from your LLM applications, facilitating further fine-tuning and model enhancement.
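As a rough illustration of what the cost, latency and token-usage metrics boil down to, the sketch below aggregates a few recorded calls. The `Usage` record, the helper names and especially the prices are placeholders invented for this example; they are not Langfuse's data model and not real rates for any model.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """One recorded LLM call (hypothetical record, not a Langfuse type)."""
    prompt_tokens: int
    completion_tokens: int
    latency_s: float


# Placeholder prices in USD per 1M tokens; NOT real rates for any model.
PRICE_PER_M_INPUT = 1.00
PRICE_PER_M_OUTPUT = 2.00


def estimate_cost(u: Usage) -> float:
    """Cost of one call under the placeholder prices above."""
    return (u.prompt_tokens * PRICE_PER_M_INPUT
            + u.completion_tokens * PRICE_PER_M_OUTPUT) / 1_000_000


def summarize(usages: list) -> dict:
    """Aggregate total cost, total tokens and mean latency over calls."""
    if not usages:
        return {"total_cost": 0.0, "total_tokens": 0, "avg_latency_s": 0.0}
    return {
        "total_cost": sum(estimate_cost(u) for u in usages),
        "total_tokens": sum(u.prompt_tokens + u.completion_tokens for u in usages),
        "avg_latency_s": sum(u.latency_s for u in usages) / len(usages),
    }


calls = [Usage(1000, 500, 1.0), Usage(3000, 500, 2.0)]
summary = summarize(calls)
print(summary)
```

A tracking platform collects these numbers for you on every call, so this arithmetic never has to live in application code.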

Easy Setup

Langfuse’s self-hosting solution is remarkably easy to set up, leveraging a Docker-based architecture that you can quickly spin up using docker compose. This streamlined approach minimizes deployment complexity and lets you focus on building your LLM applications.
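Once the containers are running, a quick reachability check can save some debugging time. The sketch below assumes the instance listens on http://localhost:3000 and exposes a health endpoint at `/api/public/health`; both are assumptions to verify against your own deployment, and `langfuse_is_up` is a hypothetical helper name.

```python
import http.client
import urllib.request


def langfuse_is_up(host: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """Return True if the (assumed) health endpoint answers with HTTP 200."""
    url = host.rstrip("/") + "/api/public/health"  # assumed endpoint path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts and HTTP error responses.
        return False


print("Langfuse reachable:", langfuse_is_up())
```

If this prints False, checking the container logs and the published port mapping is usually the next step.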

Framework Compatibility

Langfuse integrates seamlessly with popular LLM frameworks like LangChain, LlamaIndex, and, of course, DSPy, making it a versatile tool across a wide range of LLM development stacks.

By integrating Langfuse into your DSPy applications, you unlock a wealth of observability capabilities that enable you to monitor, analyze, and optimize your models in real time.

Integrating Langfuse into Your DSPy App

The integration process is straightforward and involves instrumenting your DSPy code with Langfuse’s SDK.

import dspy
from dsp.trackers.langfuse_tracker import LangfuseTracker

# configure tracker
langfuse = LangfuseTracker()

# instantiate openai client
openai = dspy.OpenAI(
    model='gpt-4o-mini',
    temperature=0.5,
    max_tokens=1500
)

# dspy predict supercharged with automated langfuse trackers
openai("What is DSPy?")

Gaining Insights with Langfuse

Once integrated, Langfuse provides a variety of actionable insights into your DSPy application’s behavior:

  • Trace-Based Debugging: Follow the execution flow of your DSPy programs, pinpoint bottlenecks, and identify areas for improvement.
  • Performance Monitoring: Track key metrics like latency and token usage to ensure optimal performance and cost-efficiency.
  • User Interaction Analysis: Understand how users interact with your LLM app, and identify common queries and opportunities for enhancement.
  • Data Collection & Fine-Tuning: Collect and annotate LLM responses, building valuable datasets for further fine-tuning and model refinement.
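To give a feel for what a trace records, here is a minimal in-memory sketch: a decorator that notes the name, latency and any error of each pipeline step. It is a conceptual stand-in only; `traced`, `TRACE_LOG` and the two step functions are hypothetical names and do not reflect how Langfuse or the DSPy tracker work internally.

```python
import functools
import time

TRACE_LOG = []  # in-memory stand-in for a tracing backend


def traced(name):
    """Record name, latency and error (if any) for every call of the wrapped step."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            error = None
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                error = repr(exc)
                raise
            finally:
                TRACE_LOG.append({
                    "name": name,
                    "latency_s": time.perf_counter() - start,
                    "error": error,
                })
        return wrapper
    return decorator


@traced("retrieve")
def retrieve(question):
    # Placeholder retrieval step.
    return ["passage about " + question]


@traced("generate")
def generate(question, passages):
    # Placeholder generation step.
    return f"answer to {question!r} using {len(passages)} passage(s)"


answer = generate("What is DSPy?", retrieve("What is DSPy?"))
print(answer)
print([span["name"] for span in TRACE_LOG])
```

With one record per step, spotting the slow or failing stage of a multi-module pipeline becomes a matter of reading the log rather than guessing.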

Use Cases Amplified

The combination of DSPy and Langfuse is particularly valuable in the following scenarios:

  • Complex Pipelines: When dealing with complex DSPy pipelines involving multiple modules, Langfuse’s tracing capabilities become indispensable for debugging and understanding the flow of information.
  • Production Environments: In production settings, Langfuse’s monitoring features help keep your LLM app running smoothly, providing early warnings of potential issues while keeping an eye on the costs involved.
  • Iterative Development: Langfuse’s evaluation and dataset management tools facilitate data-driven iteration, allowing you to continuously refine your LLM app based on real-world usage.

To truly showcase the power and versatility of DSPy combined with the monitoring capabilities of Langfuse, I recently applied them to a unique dataset: my recent LLM workshop GitHub repository. This full-day workshop contains a lot of material to get you started with LLMs. The aim of this Q&A bot was to assist participants during and after the workshop with answers to a host of NLP and LLM-related topics covered in the workshop. This “meta” use case not only demonstrates the practical application of these tools but also adds a touch of self-reflection to our exploration.

The Task: Building a Q&A System

For this exercise, we’ll leverage DSPy to build a Q&A system capable of answering questions about the content of my workshop (notebooks, markdown files, etc.). This task highlights DSPy’s ability to process and extract information from textual data, a crucial capability for a wide range of LLM applications. Imagine having a personal AI assistant (or co-pilot) that can help you recall details from your past weeks, identify patterns in your work, or even surface forgotten insights! It also makes a strong case for how such a modular setup can be extended to any other textual dataset with little to no effort.

Let us begin by setting up the required objects for our program.

import os

import chromadb
import dspy
from chromadb.utils import embedding_functions
from dsp.trackers.langfuse_tracker import LangfuseTracker

config = {
    'LANGFUSE_PUBLIC_KEY': 'XXXXXX',
    'LANGFUSE_SECRET_KEY': 'XXXXXX',
    'LANGFUSE_HOST': 'http://localhost:3000',
    'OPENAI_API_KEY': 'XXXXXX',
    'OPENAI_BASE_URL': 'XXXXXX',
    'OPENAI_PROVIDER': 'XXXXXX',
    'CHROMA_DB_PATH': './chromadb/',
    'CHROMA_COLLECTION_NAME': "supercharged_workshop_collection",
    'CHROMA_EMB_MODEL': 'all-MiniLM-L6-v2'
}

# setting config
os.environ["LANGFUSE_PUBLIC_KEY"] = config.get('LANGFUSE_PUBLIC_KEY')
os.environ["LANGFUSE_SECRET_KEY"] = config.get('LANGFUSE_SECRET_KEY')
os.environ["LANGFUSE_HOST"] = config.get('LANGFUSE_HOST')
os.environ["OPENAI_API_KEY"] = config.get('OPENAI_API_KEY')

# setup Langfuse tracker
langfuse_tracker = LangfuseTracker(session_id='supercharger001')

# instantiate language-model for DSPy
llm_model = dspy.OpenAI(
    api_key=config.get('OPENAI_API_KEY'),
    model='gpt-4o-mini'
)

# instantiate chromadb client
chroma_emb_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name=config.get('CHROMA_EMB_MODEL')
)
client = chromadb.HttpClient()

# setup chromadb collection
collection = client.create_collection(
    config.get('CHROMA_COLLECTION_NAME'),
    embedding_function=chroma_emb_fn,
    metadata={"hnsw:space": "cosine"}
)

Once we have these clients and trackers in place, let us quickly add some documents to our collection (refer to this notebook for a detailed walk-through of how I prepared this dataset in the first place).

# Add to collection
# (nb_scraper and doc_ids are prepared in the notebook referenced above)
collection.add(
    documents=[v for _, v in nb_scraper.notebook_md_dict.items()],
    ids=doc_ids,  # must be unique for each doc
)

The next step is to simply connect our chromadb retriever to the DSPy framework. The following snippet creates an RM object and checks that retrieval works as intended.

from dspy.retrieve.chromadb_rm import ChromadbRM
from IPython.display import Markdown, display

retriever_model = ChromadbRM(
    config.get('CHROMA_COLLECTION_NAME'),
    config.get('CHROMA_DB_PATH'),
    embedding_function=chroma_emb_fn,
    client=client,
    k=5
)

# Test Retrieval
results = retriever_model("RLHF")
for result in results:
    display(Markdown(f"__Document__::{result.long_text[:100]}... \n"))
    display(Markdown(f">- __Document id__::{result.id} \n>- __Document score__::{result.score}"))

The output looks promising given that, without any intervention, Chromadb is able to fetch the most relevant documents.

Document::# Quick Overview of RLFH

The performance of Language Models until GPT-3 was kind of amazing as-is. ...

- Document id::6_module_03_03_RLHF_phi2
- Document score::0.6174977412306334

Document::# Getting Started : Text Representation Image

The NLP domain ...

- Document id::2_module_01_02_getting_started
- Document score::0.8062083377747705

Document::# Text Generation <a target="_blank" href="https://colab.research.google.com/github/raghavbali/llm_w" > ...

- Document id::3_module_02_02_simple_text_generator
- Document score::0.8826038964887366

Document::# Image DSPy: Beyond Prompting
<img src= "./assets/dspy_b" > ...

- Document id::12_module_04_05_dspy_demo
- Document score::0.9200280698248913
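A note on reading these scores: the collection was created with a cosine space, so the reported score is most likely a cosine distance (1 minus cosine similarity), where lower means closer. That would explain why the RLHF document, with the smallest value, is listed first. The sketch below illustrates this interpretation with hand-made 2-D vectors; the vectors and names are invented for the example and have nothing to do with the real embeddings.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


# Toy 2-D "embeddings" (illustrative only).
query = [1.0, 0.0]
docs = {
    "same_direction": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "diagonal": [1.0, 1.0],
}

# Rank documents from closest to farthest, the way a retriever would.
ranked = sorted(docs, key=lambda name: cosine_distance(query, docs[name]))
for name in ranked:
    print(name, round(cosine_distance(query, docs[name]), 3))
```

Vector length does not matter here, only direction, which is why the vector `[2.0, 0.0]` is a perfect match for the query `[1.0, 0.0]`.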

The final step is to piece all of this together into a DSPy program. For our simple Q&A use-case we prepare a standard RAG program leveraging Chromadb as our retriever and Langfuse as our tracker. The following snippet presents the PyTorch-like approach to developing LLM-based apps without worrying about brittle prompts!

# RAG Signature
class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""

    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="usually less than 50 words")

# RAG Program
class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()

        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, question):
        context = self.retrieve(question).passages
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)

# register the language model and retriever so that dspy.Retrieve
# and dspy.ChainOfThought know which backends to use
dspy.settings.configure(lm=llm_model, rm=retriever_model)

# compile a RAG
# note: we are not using any optimizers for this example
compiled_rag = RAG()

Phew! Wasn’t that quick and simple to do? Let us now put this into action using a few sample questions.

my_questions = [
    "List the models covered in module03",
    "Brief summary of module02",
    "What is LLaMA?"
]

for question in my_questions:
    # Get the prediction. This contains `pred.context` and `pred.answer`.
    pred = compiled_rag(question)

    display(Markdown(f"__Question__: {question}"))
    display(Markdown(f"__Predicted Answer__: _{pred.answer}_"))
    display(Markdown("__Retrieved Contexts (truncated):__"))
    for idx, cont in enumerate(pred.context):
        print(f"{idx+1}. {cont[:200]}...")
    print()
    display(Markdown('---'))

The output is indeed quite on point and serves the purpose of being an assistant for this workshop material, answering questions and guiding the attendees well.
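If you want to experiment with the retrieve-then-generate shape without API keys or a vector store, the flow can be mimicked offline. The sketch below is a deliberately naive stand-in: `ToyRetriever` ranks passages by word overlap, `echo_generator` simply returns the top passage instead of calling an LLM, and all names are hypothetical rather than part of DSPy.

```python
class ToyRetriever:
    """Rank passages by word overlap with the question (stand-in for a vector store)."""

    def __init__(self, passages, k=2):
        self.passages = passages
        self.k = k

    def __call__(self, question):
        q_words = set(question.lower().split())
        ranked = sorted(
            self.passages,
            key=lambda p: len(q_words & set(p.lower().split())),
            reverse=True,
        )
        return ranked[: self.k]


class ToyRAG:
    """Retrieve context first, then hand context and question to a generator."""

    def __init__(self, retriever, generator):
        self.retriever = retriever
        self.generator = generator

    def __call__(self, question):
        context = self.retriever(question)
        answer = self.generator(context=context, question=question)
        return {"context": context, "answer": answer}


def echo_generator(context, question):
    # Stand-in for an LLM call: just returns the top passage.
    return context[0] if context else "no context found"


passages = [
    "RLHF aligns language models using human feedback",
    "Tokenization splits text into tokens",
    "DSPy replaces brittle prompts with modules",
]
rag = ToyRAG(ToyRetriever(passages, k=2), echo_generator)
pred = rag("what is rlhf")
print(pred["answer"])
```

Because the retriever and the generator are passed in as plain callables, either can be swapped for a real component later without touching the pipeline class, which is the same modularity the DSPy program relies on.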