DSPy: Programming Language Models the Right Way

What is DSPy?

DSPy (Declarative Self-improving Python) is a framework from Stanford NLP that fundamentally changes how we build LLM applications. Instead of manually crafting prompts, you define what you want to accomplish, and DSPy automatically optimizes the prompts for you.

Key innovations:

Declarative programming: Define signatures, not prompts
Automatic optimization: DSPy compiles and optimizes your pipeline
Modular design: Compose complex systems from simple modules
Self-improvement: Pipelines get better with examples and feedback

Why DSPy?

Traditional prompt engineering has problems:

The Prompt Problem

Manually crafted prompts are brittle, hard to maintain, and don't transfer well between models.

The DSPy Solution

Define the behavior declaratively and let the framework optimize the prompts for whichever model you target.

DSPy treats prompts as parameters that can be learned, not code that must be written.
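
To make "prompts as parameters" concrete, here is a minimal sketch in plain Python. It is not how DSPy is implemented: `fake_lm`, `score`, and the two candidate instructions are hypothetical stand-ins, and "learning" is reduced to choosing the candidate instruction that scores best on a small dev set.

```python
from typing import Callable, List, Tuple

def fake_lm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: terse only when told to be."""
    question = prompt.split("Question: ")[-1]
    answers = {"Capital of France?": "Paris", "Capital of Japan?": "Tokyo"}
    answer = answers.get(question, "unknown")
    if "one word" in prompt:
        return answer
    return f"The answer is {answer}."

def score(instruction: str, devset: List[Tuple[str, str]], lm: Callable[[str], str]) -> float:
    """Fraction of dev examples answered exactly right under this instruction."""
    hits = 0
    for question, gold in devset:
        output = lm(f"{instruction}\nQuestion: {question}")
        hits += int(output == gold)
    return hits / len(devset)

devset = [("Capital of France?", "Paris"), ("Capital of Japan?", "Tokyo")]
candidates = [
    "Answer the question.",
    "Answer the question in one word.",
]

# The instruction is treated as a tunable parameter: keep the best scorer.
best_instruction = max(candidates, key=lambda c: score(c, devset, fake_lm))
print(best_instruction)  # Answer the question in one word.
```

The point of the sketch is the shape of the loop: a metric plus a few examples is enough to compare prompts mechanically instead of by hand.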

Core Concepts

Signatures

Signatures define the input-output behavior of a module:

import dspy

# Class-based signature: the docstring states the task, fields describe the I/O
class BasicQA(dspy.Signature):
    """Answer questions with short factual answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# Or use the inline shorthand: "inputs -> outputs"
qa = dspy.Predict("question -> answer")
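
The inline form reads as "input fields, arrow, output fields". The following sketch shows one way such a string could be split into field names. `parse_signature` is a hypothetical helper written for illustration, not DSPy's own parser, which also handles details this version ignores.

```python
from typing import List, Tuple

def parse_signature(spec: str) -> Tuple[List[str], List[str]]:
    """Split 'a, b -> c' into (input names, output names)."""
    if spec.count("->") != 1:
        raise ValueError(f"expected exactly one '->' in {spec!r}")
    left, right = spec.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    if not inputs or not outputs:
        raise ValueError("a signature needs at least one input and one output")
    return inputs, outputs

print(parse_signature("question -> answer"))
# (['question'], ['answer'])
print(parse_signature("context, question -> answer"))
# (['context', 'question'], ['answer'])
```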

Modules

Modules are the building blocks of DSPy programs:

import dspy

# Predict: Basic LLM call
predict = dspy.Predict("question -> answer")

# ChainOfThought: Adds reasoning steps
cot = dspy.ChainOfThought("question -> answer")

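# ReAct: Reasoning + Acting with tools
# search_tool is a placeholder; recent DSPy releases accept plain Python
# functions as tools, so swap in your own search function here
def search_tool(query: str) -> str:
    """Placeholder tool: return text relevant to the query."""
    return f"No results found for: {query}"

react = dspy.ReAct("question -> answer", tools=[search_tool])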

# ProgramOfThought: Generates and executes code
pot = dspy.ProgramOfThought("question -> answer")
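
The composition pattern behind modules can be sketched without any LLM at all. In the toy below, `ToyModule` is a hypothetical base class (not `dspy.Module`) whose only job is to route a call to `forward`, which is the same convention the DSPy examples in this article follow.

```python
class ToyModule:
    """Hypothetical minimal base class: calling the object runs forward()."""

    def __call__(self, **kwargs):
        return self.forward(**kwargs)

class Upper(ToyModule):
    def forward(self, text):
        return text.upper()

class Exclaim(ToyModule):
    def forward(self, text):
        return text + "!"

class Shout(ToyModule):
    """Composite module built from two smaller ones."""

    def __init__(self):
        self.upper = Upper()
        self.exclaim = Exclaim()

    def forward(self, text):
        # Feed the output of one sub-module into the next.
        return self.exclaim(text=self.upper(text=text))

print(Shout()(text="hello"))  # HELLO!
```

Because sub-modules are stored as attributes, a framework can discover them and tune each one separately, which is what makes the modular design useful for optimization.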

Teleprompters (Optimizers)

Teleprompters automatically optimize your prompts:

from dspy.teleprompt import BootstrapFewShot

# Create a teleprompter
teleprompter = BootstrapFewShot(metric=my_metric)

# Compile (optimize) your program
optimized_program = teleprompter.compile(
    student=my_program,
    trainset=training_examples
)

Getting Started

Installation

pip install dspy-ai  # newer releases are also published on PyPI as "dspy"

Basic Setup

import dspy

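# Configure the LLM
lm = dspy.OpenAI(model="gpt-4", max_tokens=500)
dspy.settings.configure(lm=lm)

# Or use other providers
# lm = dspy.Claude(model="claude-3-sonnet")
# lm = dspy.OllamaLocal(model="llama2")

# Note: DSPy 2.5 and later replace these provider classes with one client,
# e.g. lm = dspy.LM("openai/gpt-4"), passed to dspy.configure(lm=lm)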

Your First DSPy Program

import dspy

# Configure LLM
dspy.settings.configure(lm=dspy.OpenAI(model="gpt-4"))

# Define a simple QA module
qa = dspy.ChainOfThought("question -> answer")

# Use it
response = qa(question="What is the capital of France?")
print(response.answer)  # e.g. "Paris"
print(response.rationale)  # The model's reasoning (the field is named "reasoning" in DSPy 2.5+)

Building Complex Pipelines

Multi-Step RAG System

import dspy
from dspy.retrieve.chromadb_rm import ChromadbRM

class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve relevant passages
        passages = self.retrieve(question).passages

        # Generate answer with context
        context = "\n".join(passages)
        answer = self.generate(context=context, question=question)

        return answer

# Configure retriever
retriever = ChromadbRM(
    collection_name="my_docs",
    persist_directory="./chroma_db"
)
dspy.settings.configure(rm=retriever)

# Use the RAG system
rag = RAG()
result = rag("What are the key features of our product?")
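
To see what the retrieve step contributes before wiring up a real vector store, a toy retriever is enough. The sketch below ranks passages by word overlap with the query; `retrieve` and the three-sentence `corpus` are made up for illustration and stand in for `dspy.Retrieve` backed by ChromaDB.

```python
from typing import List

def retrieve(query: str, corpus: List[str], k: int = 3) -> List[str]:
    """Rank passages by shared words with the query and return the top k."""
    query_words = set(query.lower().split())

    def overlap(passage: str) -> int:
        return len(query_words & set(passage.lower().split()))

    ranked = sorted(corpus, key=overlap, reverse=True)
    # Drop passages that share no words with the query at all.
    return [p for p in ranked[:k] if overlap(p) > 0]

corpus = [
    "The product supports offline sync",
    "Our office is in Berlin",
    "Key product features include search and sync",
]
passages = retrieve("key features of the product", corpus, k=2)

# Same joining step as in RAG.forward above.
context = "\n".join(passages)
print(context)
```

A real retriever uses embeddings rather than word overlap, but the contract is the same: a query goes in, a ranked list of at most k passages comes out.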

Multi-Hop Reasoning

class MultiHopQA(dspy.Module):
    def __init__(self, num_hops=2):
        super().__init__()
        self.num_hops = num_hops
        self.retrieve = dspy.Retrieve(k=3)
        self.generate_query = dspy.ChainOfThought(
            "context, question -> search_query"
        )
        self.generate_answer = dspy.ChainOfThought(
            "context, question -> answer"
        )

    def forward(self, question):
        context = []

        for hop in range(self.num_hops):
            # Generate search query
            if hop == 0:
                query = question
            else:
                query = self.generate_query(
                    context="\n".join(context),
                    question=question
                ).search_query

            # Retrieve passages
            passages = self.retrieve(query).passages
            context.extend(passages)

        # Generate final answer
        return self.generate_answer(
            context="\n".join(context),
            question=question
        )

multihop = MultiHopQA(num_hops=2)
result = multihop("Who founded the company that created GPT-4?")
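
One practical wrinkle in the loop above is that successive hops can return the same passage, so `context.extend(passages)` may add duplicates. A small order-preserving helper avoids that. `extend_unique` is a hypothetical name, and the passages are illustrative.

```python
from typing import Iterable, List

def extend_unique(context: List[str], passages: Iterable[str]) -> List[str]:
    """Append passages to context, skipping any already present (order kept)."""
    seen = set(context)
    for passage in passages:
        if passage not in seen:
            context.append(passage)
            seen.add(passage)
    return context

context: List[str] = []
# Hop 1
extend_unique(context, ["OpenAI created GPT-4", "GPT-4 is a language model"])
# Hop 2 returns one passage we already have
extend_unique(context, ["OpenAI created GPT-4", "OpenAI was founded in 2015"])
print(len(context))  # 3
```

Inside `MultiHopQA.forward`, this would take the place of the `context.extend(passages)` call.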

Optimizing with Teleprompters

BootstrapFewShot

Automatically generates and selects few-shot examples:

from dspy.teleprompt import BootstrapFewShot

# Define a metric
def validate_answer(example, prediction, trace=None):
    return example.answer.lower() == prediction.answer.lower()

# Create training examples; with_inputs marks which fields are inputs
trainset = [
    dspy.Example(question="Capital of France?", answer="Paris").with_inputs("question"),
    dspy.Example(question="Capital of Japan?", answer="Tokyo").with_inputs("question"),
    # ... more examples
]

# Optimize
teleprompter = BootstrapFewShot(metric=validate_answer)
optimized_rag = teleprompter.compile(RAG(), trainset=trainset)

# The optimized version includes learned few-shot examples
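
The core idea of bootstrapping can be sketched in a few lines: run the program on the training inputs and keep only the runs that the metric accepts as demonstrations. This is a simplified illustration, not the library's algorithm. `bootstrap_demos`, `toy_program`, and the use of `SimpleNamespace` in place of `dspy.Example` are all assumptions made so the example runs without an LLM.

```python
from types import SimpleNamespace
from typing import Callable, List

def exact_match(example, prediction, trace=None) -> bool:
    return example.answer.lower() == prediction.answer.lower()

def bootstrap_demos(program: Callable, trainset: List, metric: Callable,
                    max_demos: int = 4) -> List:
    """Run the program on training inputs and keep runs the metric accepts."""
    demos = []
    for example in trainset:
        prediction = program(example.question)
        if metric(example, prediction):
            demos.append(
                SimpleNamespace(question=example.question, answer=prediction.answer)
            )
        if len(demos) >= max_demos:
            break
    return demos

def toy_program(question: str):
    """Hypothetical unoptimized program that only knows one answer."""
    known = {"Capital of France?": "paris"}
    return SimpleNamespace(answer=known.get(question, "I don't know"))

trainset = [
    SimpleNamespace(question="Capital of France?", answer="Paris"),
    SimpleNamespace(question="Capital of Japan?", answer="Tokyo"),
]
demos = bootstrap_demos(toy_program, trainset, exact_match)
print([d.question for d in demos])  # ['Capital of France?']
```

The surviving demos are what end up in the prompt as few-shot examples, which is why the quality of the metric matters so much.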

MIPRO (Multiprompt Instruction Proposal Optimizer)

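Optimizes both the instructions and the few-shot examples:

from dspy.teleprompt import MIPRO  # shipped as MIPROv2 in newer DSPy releases

teleprompter = MIPRO(
    metric=validate_answer,
    num_candidates=10,
    init_temperature=1.0
)

# valset is a held-out list of dspy.Example objects, separate from trainset;
# accepted keyword arguments vary between DSPy releases
optimized_program = teleprompter.compile(
    my_program,
    trainset=trainset,
    valset=valset,
    num_trials=50
)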
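
The `valset` passed to `compile` has to come from somewhere. One common approach, sketched here with a hypothetical `split_examples` helper, is to shuffle the labeled examples with a fixed seed and hold out a fraction for validation. The strings below stand in for `dspy.Example` objects.

```python
import random
from typing import List, Sequence, Tuple

def split_examples(examples: Sequence, val_fraction: float = 0.2,
                   seed: int = 0) -> Tuple[List, List]:
    """Shuffle a copy of the examples and split off a validation slice."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]

examples = [f"example-{i}" for i in range(10)]
train_part, val_part = split_examples(examples, val_fraction=0.2)
print(len(train_part), len(val_part))  # 8 2
```

Keeping the two sets disjoint matters: an optimizer that is scored on the same examples it bootstraps from will look better than it is.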

Assertions and Constraints

DSPy supports runtime assertions to ensure output quality:

import dspy

class ConstrainedQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        answer = self.generate(question=question)

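        # Suggest is a soft constraint: retry, then continue if it still fails
        dspy.Suggest(
            len(answer.answer.split()) <= 10,
            "Answer should be concise (max 10 words)"
        )

        # Assert is a hard constraint: retry, then raise if it still fails
        dspy.Assert(
            answer.answer.strip() != "",
            "Answer cannot be empty"
        )

        return answer

# Assertions only take effect once activated; DSPy then retries
# when a constraint isn't met
qa = ConstrainedQA().activate_assertions()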
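
The retry behavior can be pictured as a loop that feeds the failure message back into the next attempt. The sketch below is a plain-Python illustration of that idea under stated assumptions. `run_with_constraints`, `HardConstraintError`, and `toy_generate` are hypothetical names, and DSPy's actual backtracking is more involved.

```python
from typing import Callable, Optional

class HardConstraintError(Exception):
    """Raised when a hard (Assert-style) constraint still fails after retries."""

def run_with_constraints(
    generate: Callable[[Optional[str]], str],
    check: Callable[[str], bool],
    message: str,
    hard: bool,
    max_retries: int = 2,
) -> str:
    """Call generate, retrying with the failure message as feedback."""
    feedback: Optional[str] = None
    output = ""
    for _ in range(max_retries + 1):
        output = generate(feedback)
        if check(output):
            return output
        feedback = message  # the next attempt sees what went wrong
    if hard:
        raise HardConstraintError(message)
    return output  # soft constraint: give up and return the last attempt

def toy_generate(feedback: Optional[str]) -> str:
    """Hypothetical generator that shortens its answer once it sees feedback."""
    if feedback is None:
        return "Paris is the capital and most populous city of France today"
    return "Paris"

answer = run_with_constraints(
    toy_generate,
    check=lambda text: len(text.split()) <= 10,
    message="Answer should be concise (max 10 words)",
    hard=False,
)
print(answer)  # Paris
```

The `hard` flag mirrors the difference between the two calls above: a soft constraint returns the last attempt when retries run out, while a hard one raises.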

DSPy vs Other Frameworks

DSPy

Best for: Research, optimization, when you have training data

LangChain

Best for: Quick prototypes, many integrations, production apps

LlamaIndex

Best for: Document-heavy RAG applications

Combine Them

Use DSPy to optimize the prompts, then deploy with LangChain infrastructure.

Best Practices

Start simple: Begin with Predict, add ChainOfThought if needed
Define clear signatures: Good descriptions help the optimizer
Collect examples: More representative training data gives the optimizer more to learn from
Use assertions: Enforce output quality with Suggest and Assert
Iterate on metrics: Your metric defines what "good" means
Save compiled programs: Don't re-optimize in production

# Save optimized program
optimized_program.save("my_optimized_rag.json")

# Load later
loaded_program = RAG()
loaded_program.load("my_optimized_rag.json")
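
On the "iterate on metrics" practice above: exact match is a strict starting point, and a softer score is often the next iteration. The sketch below is one possible alternative, a token-level F1 that gives partial credit. It keeps the `(example, prediction, trace=None)` argument order used by `validate_answer` earlier, and uses `SimpleNamespace` objects as stand-ins for DSPy examples and predictions.

```python
from collections import Counter
from types import SimpleNamespace

def token_f1(example, prediction, trace=None) -> float:
    """Token-overlap F1 between the gold answer and the predicted answer."""
    gold = example.answer.lower().split()
    pred = prediction.answer.lower().split()
    if not gold or not pred:
        # Both empty counts as a match; one empty counts as a miss.
        return float(gold == pred)
    overlap = sum((Counter(gold) & Counter(pred)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

example = SimpleNamespace(answer="Paris")
print(token_f1(example, SimpleNamespace(answer="paris")))         # 1.0
print(token_f1(example, SimpleNamespace(answer="Paris France")))  # partial credit
print(token_f1(example, SimpleNamespace(answer="Tokyo")))         # 0.0
```

Whether partial credit is desirable depends on the task; for short factual answers, exact match may remain the better definition of "good".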

