What is DSPy?
DSPy (Declarative Self-improving Python) is a framework from Stanford NLP that fundamentally changes how we build LLM applications. Instead of manually crafting prompts, you define what you want to accomplish, and DSPy automatically optimizes the prompts for you.
Key innovations:
- Declarative programming: Define signatures, not prompts
- Automatic optimization: DSPy compiles and optimizes your pipeline
- Modular design: Compose complex systems from simple modules
- Self-improvement: Pipelines get better with examples and feedback
Why DSPy?
Traditional prompt engineering has problems:
The Prompt Problem
Manually crafted prompts are brittle, hard to maintain, and don't transfer well between models.
The DSPy Solution
Define behavior declaratively and let the framework optimize the prompts for whichever model you target.
DSPy treats prompts as parameters that can be learned, not code that must be written.
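One way to picture "prompts as parameters" is a search over candidate instructions, each scored by a metric on a small dataset. The sketch below is plain Python and is not how DSPy is implemented: `fake_model`, `select_instruction`, and the toy data are hypothetical stand-ins chosen for illustration.

```python
# Hypothetical stand-in for an LLM call: it only answers tersely when the
# instruction asks for a short answer. A real model call would go here.
def fake_model(instruction: str, question: str) -> str:
    answers = {"Capital of France?": "Paris", "Capital of Japan?": "Tokyo"}
    answer = answers.get(question, "unknown")
    if "short" in instruction.lower():
        return answer
    return f"The answer to your question is {answer}."


def exact_match(expected: str, predicted: str) -> bool:
    """Case-insensitive exact match between expected and predicted answers."""
    return expected.strip().lower() == predicted.strip().lower()


def select_instruction(candidates, dataset, model, metric):
    """Return the (instruction, accuracy) pair that scores highest on dataset."""
    best, best_score = None, -1.0
    for instruction in candidates:
        hits = sum(metric(answer, model(instruction, question))
                   for question, answer in dataset)
        score = hits / len(dataset)
        if score > best_score:
            best, best_score = instruction, score
    return best, best_score


candidates = ["Answer the question.", "Answer with a short factual phrase."]
dataset = [("Capital of France?", "Paris"), ("Capital of Japan?", "Tokyo")]
best_instruction, best_score = select_instruction(
    candidates, dataset, fake_model, exact_match
)
```

The instruction is treated like any other tunable value: propose candidates, measure, keep the best.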
Core Concepts
Signatures
Signatures define the input-output behavior of a module:
import dspy
# Class-based signature with a docstring and field descriptions
class BasicQA(dspy.Signature):
    """Answer questions with short factual answers."""

    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# Or the inline shorthand
qa = dspy.Predict("question -> answer")
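To make the inline form concrete, here is a minimal sketch of how a string such as "context, question -> answer" can be split into input and output field names. `parse_signature` is a hypothetical helper written for this illustration; it is not DSPy's own parser, which handles more than this.

```python
def parse_signature(spec: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c' into (['a', 'b'], ['c']).

    Illustrative only: assumes plain comma-separated field names.
    """
    if spec.count("->") != 1:
        raise ValueError(f"expected exactly one '->' in {spec!r}")
    left, right = spec.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    if not inputs or not outputs:
        raise ValueError(f"signature needs inputs and outputs: {spec!r}")
    return inputs, outputs


inputs, outputs = parse_signature("context, question -> answer")
```

Everything left of the arrow is something you pass in; everything right of it is something the module is expected to produce.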
Modules
Modules are the building blocks of DSPy programs:
import dspy
# Predict: Basic LLM call
predict = dspy.Predict("question -> answer")
# ChainOfThought: Adds reasoning steps
cot = dspy.ChainOfThought("question -> answer")
# ReAct: Reasoning + Acting with tools (search_tool is a placeholder for a tool you define)
react = dspy.ReAct("question -> answer", tools=[search_tool])
# ProgramOfThought: Generates and executes code
pot = dspy.ProgramOfThought("question -> answer")
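The ReAct line above refers to `search_tool`, which the snippet does not define. A minimal sketch of such a tool follows, assuming a tiny in-memory corpus and simple word-overlap ranking; `CORPUS`, `tokenize`, and this version of `search_tool` are hypothetical. Whether ReAct accepts a plain Python function as a tool depends on your DSPy version, so treat that as an assumption to verify.

```python
import re

# Hypothetical in-memory corpus standing in for a real search backend.
CORPUS = [
    "Paris is the capital of France.",
    "Tokyo is the capital of Japan.",
    "DSPy is a framework from Stanford NLP.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def search_tool(query: str, k: int = 2) -> list[str]:
    """Return up to k passages ranked by word overlap with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(passage)), passage) for passage in CORPUS]
    # Keep only passages sharing at least one word; sorted() is stable,
    # so ties keep their corpus order.
    ranked = sorted(
        (pair for pair in scored if pair[0] > 0),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [passage for _, passage in ranked[:k]]


hits = search_tool("What is the capital of France?")
```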
Teleprompters (Optimizers)
Teleprompters automatically optimize your prompts:
from dspy.teleprompt import BootstrapFewShot
# Create a teleprompter
teleprompter = BootstrapFewShot(metric=my_metric)
# Compile (optimize) your program
optimized_program = teleprompter.compile(
    student=my_program,
    trainset=training_examples
)
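The idea behind bootstrapping few-shot demonstrations can be sketched in a few lines: run a teacher over the training set and keep only the outputs the metric accepts. This is a simplified illustration, not DSPy's implementation; `bootstrap_demos`, `stub_teacher`, and the dictionary-based examples are hypothetical.

```python
def bootstrap_demos(teacher, trainset, metric, max_demos=4):
    """Collect (question, answer) demos from teacher outputs that pass the metric."""
    demos = []
    for example in trainset:
        prediction = teacher(example["question"])
        if metric(example, prediction):
            demos.append({"question": example["question"], "answer": prediction})
        if len(demos) >= max_demos:
            break
    return demos


# Hypothetical teacher that gets one of three questions wrong.
def stub_teacher(question):
    return {
        "Capital of France?": "Paris",
        "Capital of Japan?": "Kyoto",
        "Capital of Italy?": "Rome",
    }[question]


def matches(example, prediction):
    return example["answer"].lower() == prediction.lower()


trainset = [
    {"question": "Capital of France?", "answer": "Paris"},
    {"question": "Capital of Japan?", "answer": "Tokyo"},
    {"question": "Capital of Italy?", "answer": "Rome"},
]
demos = bootstrap_demos(stub_teacher, trainset, matches)
```

Only the two correct predictions survive as demonstrations; the wrong answer for Japan is filtered out by the metric.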
Getting Started
Installation
pip install dspy-ai
Basic Setup
import dspy
# Configure the LLM (legacy client API; newer DSPy releases use dspy.LM("openai/gpt-4") instead)
lm = dspy.OpenAI(model="gpt-4", max_tokens=500)
dspy.settings.configure(lm=lm)
# Or use other providers
# lm = dspy.Claude(model="claude-3-sonnet")
# lm = dspy.OllamaLocal(model="llama2")
Your First DSPy Program
import dspy
# Configure LLM
dspy.settings.configure(lm=dspy.OpenAI(model="gpt-4"))
# Define a simple QA module
qa = dspy.ChainOfThought("question -> answer")
# Use it
response = qa(question="What is the capital of France?")
print(response.answer)  # e.g. "Paris"
print(response.rationale)  # Shows the reasoning (newer DSPy releases name this field "reasoning")
Building Complex Pipelines
Multi-Step RAG System
import dspy
from dspy.retrieve.chromadb_rm import ChromadbRM
class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve relevant passages
        passages = self.retrieve(question).passages
        # Generate answer with context
        context = "\n".join(passages)
        answer = self.generate(context=context, question=question)
        return answer
# Configure retriever
retriever = ChromadbRM(
    collection_name="my_docs",
    persist_directory="./chroma_db"
)
dspy.settings.configure(rm=retriever)
# Use the RAG system
rag = RAG()
result = rag("What are the key features of our product?")
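Before wiring in a real retriever and LLM, it can help to check the retrieve-then-generate flow offline. The sketch below mirrors the shape of the `RAG.forward` method in plain Python with injected stand-ins; `OfflineRAG`, `stub_retrieve`, and `stub_generate` are hypothetical names and not part of DSPy.

```python
class OfflineRAG:
    """Plain-Python mirror of the retrieve-then-generate flow, with injected callables."""

    def __init__(self, retrieve, generate, num_passages=3):
        self.retrieve = retrieve
        self.generate = generate
        self.num_passages = num_passages

    def __call__(self, question):
        # Keep the top passages, join them into one context string, then generate.
        passages = self.retrieve(question)[: self.num_passages]
        context = "\n".join(passages)
        return self.generate(context=context, question=question)


def stub_retrieve(question):
    # Always returns five fake passages so truncation is visible.
    return [f"passage {i} about {question}" for i in range(5)]


def stub_generate(context, question):
    # Echo what the generator received instead of calling a model.
    return {"question": question, "context_lines": context.split("\n")}


result = OfflineRAG(stub_retrieve, stub_generate, num_passages=3)("pricing")
```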
Multi-Hop Reasoning
class MultiHopQA(dspy.Module):
    def __init__(self, num_hops=2):
        super().__init__()
        self.num_hops = num_hops
        self.retrieve = dspy.Retrieve(k=3)
        self.generate_query = dspy.ChainOfThought(
            "context, question -> search_query"
        )
        self.generate_answer = dspy.ChainOfThought(
            "context, question -> answer"
        )

    def forward(self, question):
        context = []
        for hop in range(self.num_hops):
            # Generate search query
            if hop == 0:
                query = question
            else:
                query = self.generate_query(
                    context="\n".join(context),
                    question=question
                ).search_query
            # Retrieve passages
            passages = self.retrieve(query).passages
            context.extend(passages)
        # Generate final answer
        return self.generate_answer(
            context="\n".join(context),
            question=question
        )
multihop = MultiHopQA(num_hops=2)
result = multihop("Who founded the company that created GPT-4?")
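Because each hop calls `context.extend(passages)`, the same passage can end up in the context more than once when later queries retrieve overlapping results. One possible refinement, sketched here with a hypothetical helper `extend_unique`, is to skip passages that are already present while preserving retrieval order.

```python
def extend_unique(context: list[str], passages: list[str]) -> list[str]:
    """Append passages to context in order, skipping ones already present."""
    seen = set(context)
    for passage in passages:
        if passage not in seen:
            context.append(passage)
            seen.add(passage)
    return context


context = []
extend_unique(context, ["OpenAI created GPT-4.", "GPT-4 was released in 2023."])
extend_unique(context, ["OpenAI created GPT-4.", "OpenAI was founded in 2015."])
```

Inside the hop loop this would take the place of `context.extend(passages)`; whether the shorter context actually helps depends on your retriever and model.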
Optimizing with Teleprompters
BootstrapFewShot
Automatically generates and selects few-shot examples:
import dspy
from dspy.teleprompt import BootstrapFewShot

# Define a metric
def validate_answer(example, prediction, trace=None):
    return example.answer.lower() == prediction.answer.lower()

# Create training examples; with_inputs marks which fields are inputs
trainset = [
    dspy.Example(question="Capital of France?", answer="Paris").with_inputs("question"),
    dspy.Example(question="Capital of Japan?", answer="Tokyo").with_inputs("question"),
    # ... more examples
]
# Optimize
teleprompter = BootstrapFewShot(metric=validate_answer)
optimized_rag = teleprompter.compile(RAG(), trainset=trainset)
# The optimized version includes learned few-shot examples
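A strict lowercase comparison rejects answers that differ only in punctuation, articles, or spacing. Below is a sketch of a more forgiving metric with the same `(example, prediction, trace=None)` shape used above. `normalize` and `validate_answer_normalized` are hypothetical names, and `SimpleNamespace` objects stand in for DSPy examples and predictions so the snippet runs without an LLM.

```python
import re
import string
from types import SimpleNamespace


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def validate_answer_normalized(example, prediction, trace=None):
    """Return True when the normalized answers are identical."""
    return normalize(example.answer) == normalize(prediction.answer)


# Stand-ins for a DSPy example and prediction (anything with an .answer attribute).
example = SimpleNamespace(answer="Paris")
close_enough = validate_answer_normalized(example, SimpleNamespace(answer="  paris. "))
wrong = validate_answer_normalized(example, SimpleNamespace(answer="Lyon"))
```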
MIPRO (Multi-prompt Instruction Proposal Optimizer)
Optimizes both instructions and examples:
from dspy.teleprompt import MIPRO
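teleprompter = MIPRO(
    metric=validate_answer,
    num_candidates=10,
    init_temperature=1.0
)
# valset is a held-out list of examples built the same way as trainset.
# The arguments compile() accepts differ between MIPRO versions; check your installed release.
optimized_program = teleprompter.compile(
    my_program,
    trainset=trainset,
    valset=valset,
    num_trials=50
)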
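The MIPRO call above uses a `valset` that the snippet never builds. One simple way to get one is a seeded shuffle-and-split of your labeled examples, sketched below. `split_examples` is a hypothetical helper and the 80/20 ratio is an arbitrary assumption.

```python
import random


def split_examples(examples, val_fraction=0.2, seed=0):
    """Shuffle deterministically and return (trainset, valset)."""
    if not 0 < val_fraction < 1:
        raise ValueError("val_fraction must be between 0 and 1")
    if len(examples) < 2:
        raise ValueError("need at least two examples to split")
    shuffled = list(examples)  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Integers stand in for labeled examples here.
all_examples = list(range(10))
train_part, val_part = split_examples(all_examples, val_fraction=0.2, seed=0)
```

Fixing the seed keeps the split reproducible, so scores from different optimization runs stay comparable.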
Assertions and Constraints
DSPy supports runtime assertions to ensure output quality:
import dspy
class ConstrainedQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        answer = self.generate(question=question)
        # Soft constraint: suggests a retry but does not fail the program
        dspy.Suggest(
            len(answer.answer.split()) <= 10,
            "Answer should be concise (max 10 words)"
        )
        # Hard constraint: fails if still violated after retries
        dspy.Assert(
            answer.answer.strip() != "",
            "Answer cannot be empty"
        )
        return answer

# Assertions take effect once activated; DSPy then retries when a constraint isn't met
qa = ConstrainedQA().activate_assertions()
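The retry behaviour can be pictured as a loop that feeds failed-constraint messages back into the next attempt. The sketch below models only hard, Assert-style constraints in plain Python and is not DSPy's backtracking mechanism; `generate_with_retries`, `ConstraintError`, and `stub_answerer` are hypothetical.

```python
class ConstraintError(Exception):
    """Raised when an answer still violates a constraint after all retries."""


def generate_with_retries(generate, question, checks, max_retries=2):
    """Call generate until every check passes, passing failure messages back in."""
    feedback = []
    for _ in range(max_retries + 1):
        answer = generate(question, feedback)
        feedback = [message for check, message in checks if not check(answer)]
        if not feedback:
            return answer
    raise ConstraintError("; ".join(feedback))


calls = []


# Hypothetical generator: verbose at first, concise once it receives feedback.
def stub_answerer(question, feedback):
    calls.append(list(feedback))
    if feedback:
        return "Paris"
    return "The capital city of France, a country in Western Europe, is Paris"


checks = [
    (lambda a: len(a.split()) <= 10, "Answer should be concise (max 10 words)"),
    (lambda a: a.strip() != "", "Answer cannot be empty"),
]
answer = generate_with_retries(stub_answerer, "What is the capital of France?", checks)
```

The first attempt breaks the length constraint, so its message is passed to the second attempt, which succeeds.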
DSPy vs Other Frameworks
- DSPy: Best for research, optimization, and cases where you have training data
- LangChain: Best for quick prototypes, many integrations, production apps
- LlamaIndex: Best for document-heavy RAG applications
- Combining them: Use DSPy to optimize prompts, then deploy with LangChain infrastructure
Best Practices
- Start simple: Begin with Predict, add ChainOfThought if needed
- Define clear signatures: Good descriptions help the optimizer
- Collect examples: Optimizers learn from your training set, so more (and more representative) examples generally help
- Use assertions: Enforce output quality with Suggest and Assert
- Iterate on metrics: Your metric defines what "good" means
- Save compiled programs: Don't re-optimize in production
# Save optimized program
optimized_program.save("my_optimized_rag.json")
# Load later
loaded_program = RAG()
loaded_program.load("my_optimized_rag.json")
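Before saving an optimized program, it is worth confirming that it actually beats the baseline on held-out data. A minimal sketch of that comparison follows; `evaluate`, the stub programs, and the dictionary-based dev set are hypothetical stand-ins rather than DSPy's own evaluation utilities.

```python
def evaluate(program, devset, metric):
    """Return the fraction of devset examples on which the metric passes."""
    if not devset:
        raise ValueError("devset must not be empty")
    hits = sum(1 for example in devset if metric(example, program(example["question"])))
    return hits / len(devset)


def exact_match(example, predicted):
    return example["answer"].lower() == predicted.lower()


devset = [
    {"question": "Capital of France?", "answer": "Paris"},
    {"question": "Capital of Japan?", "answer": "Tokyo"},
]


# Hypothetical programs: the baseline always answers "Paris".
def baseline(question):
    return "Paris"


def optimized(question):
    return {"Capital of France?": "Paris", "Capital of Japan?": "Tokyo"}[question]


baseline_score = evaluate(baseline, devset, exact_match)
optimized_score = evaluate(optimized, devset, exact_match)
should_save = optimized_score > baseline_score
```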