Overview

Dria’s GEPA (Genetic-Pareto) Prompt Evolution Service automatically improves your prompts through iterative optimization. It accepts a seed prompt and evaluation dataset, then evolves better-performing prompts without manual tuning. Key benefits:
  • Automatic prompt improvement through reflective learning
  • Sample-efficient optimization (up to 35x fewer iterations than traditional methods)
  • Transparent evolution history with scores and ancestry tracking
  • No fine-tuning required - works with existing models
  • High performance - achieves 0.99+ scores on pedagogical tasks
Ideal for:
  • Optimizing support response templates
  • Refining instruction prompts
  • Improving task-specific outputs
  • Evolving domain-aware technical documentation
  • Strengthening security code review templates

Getting Started

1. Prepare Your Dataset

Create a JSON payload with:
  • Seed prompt: Your initial prompt template
  • Dataset: Examples with inputs and expected outputs (size must equal minibatchSize + paretoSize)
  • Evaluator: JSON string configuring the judge model and metric; the judge must be one of the supported models
  • Budget: Number of inference calls (minimum 10, recommended 25-50)
  • Model: The model that runs the evolution - must be one of the supported models (see Supported Models)
Each dataset item needs:
  • input: Object with variables matching your datasetColumns
  • expectedOutput: The desired response (minimum 1 character)
Important: Dataset size must equal minibatchSize + paretoSize. The first minibatchSize items are used for training (feedback), and the next paretoSize items are used for validation (pareto). For example, if minibatchSize: 3 and paretoSize: 4, you need exactly 7 dataset items in order.
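As a quick pre-flight check, a small helper (hypothetical, not part of the API) can verify these rules locally before you submit. A minimal sketch, assuming a payload dict shaped like the examples below:

def validate_dataset(payload):
    # The API requires len(dataset) == minibatchSize + paretoSize.
    expected = payload["minibatchSize"] + payload["paretoSize"]
    actual = len(payload["dataset"])
    if actual != expected:
        raise ValueError(
            f"dataset has {actual} items, expected {expected} "
            f"({payload['minibatchSize']} feedback + {payload['paretoSize']} pareto)"
        )
    for i, item in enumerate(payload["dataset"]):
        # Each input must provide exactly the variables named in datasetColumns.
        if set(item["input"]) != set(payload["datasetColumns"]):
            raise ValueError(f"item {i}: input keys must match datasetColumns")
        # expectedOutput must be a non-empty string.
        if not item.get("expectedOutput"):
            raise ValueError(f"item {i}: expectedOutput must be non-empty")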

2. Submit Execution

POST your payload to start evolution. The API returns an evolutionId immediately.
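A minimal sketch of this call with the requests library, assuming payload is the dict assembled in Step 1 (the full workflow appears in the Python Example below):

import requests

resp = requests.post(
    'https://mainnet.dkn.dria.co/api/v0/gepa/start_execution',
    headers={'x-api-key': '<YOUR_API_KEY>', 'Content-Type': 'application/json'},
    json=payload,  # the dict assembled in Step 1
)
resp.raise_for_status()
evolution_id = resp.json()['evolutionId']
print(evolution_id)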

3. Monitor Progress

Poll the execution status every 10-30 seconds until completion. Status values:
  • pending: Job queued
  • in_progress: Actively evolving
  • completed: Finished successfully
  • failed: Encountered an error

4. Retrieve Results

Once completed, fetch the prompt evolution history to see how your prompt improved across generations.
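For instance, you can fetch the history and keep only the highest-scoring generation. A minimal sketch, using the fields (generation, score, prompt) shown in the Python Example below:

import requests

evolution_id = '<EVOLUTION_ID>'  # returned by start_execution in Step 2
resp = requests.get(
    'https://mainnet.dkn.dria.co/api/v0/gepa/get_execution_prompts',
    params={'evolutionId': evolution_id, 'page': 1, 'limit': 20},
    headers={'x-api-key': '<YOUR_API_KEY>'},
)
resp.raise_for_status()
# Each entry carries the generation number, its score, and the prompt text.
best = max(resp.json()['prompts'], key=lambda p: p['score'])
print(f"Best prompt (gen {best['generation']}, score {best['score']}):\n{best['prompt']}")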

API Reference

Base URL: https://mainnet.dkn.dria.co/api/v0. All endpoints require an x-api-key header.

POST /gepa/start_execution

Start a new prompt evolution. Returns {"evolutionId": "uuid"}. Required fields:
  • customId: Your identifier for this run
  • strategy: "RPM" (Reflective Prompt Mutation) - currently the only available strategy
  • model: Model name - must be one of the supported models (e.g., gpt-4o, gpt-4o-mini)
  • datasetColumns: Array of variable names
  • budget: Inference budget (minimum 10, recommended 25-50)
  • minibatchSize: Number of feedback examples per generation
  • paretoSize: Number of pareto (validation) examples
  • evaluator: JSON string with model field and metric config (e.g., {"model": "gpt-4o-mini", "metric": "accuracy", "threshold": 0.9}); see the sketch after this list
  • dataset: Array of examples (see Getting Started Step 1 for size/order rules). Each item needs input (object) and expectedOutput (string, min length 1)
  • prompt: Your seed prompt template
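Because evaluator is a JSON string rather than a nested object, the easiest way to build it is to serialize a dict with json.dumps and let the serializer handle the escaping. A minimal sketch, using the evaluator settings from Example 1 below:

import json

evaluator = json.dumps({
    "model": "gpt-4o-mini",    # judge model; must be a supported model
    "metric": "exact_match",
    "threshold": 0.95,
})
# evaluator is now a properly escaped string ready to drop into the payload:
# '{"model": "gpt-4o-mini", "metric": "exact_match", "threshold": 0.95}'
print(evaluator)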

GET /gepa/get_all_executions

List your executions. Supports ?page=1&limit=10 pagination.
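A minimal sketch of a paginated listing request (the response shape is not documented here, so the sketch just prints it for inspection):

import requests

resp = requests.get(
    'https://mainnet.dkn.dria.co/api/v0/gepa/get_all_executions',
    params={'page': 1, 'limit': 10},
    headers={'x-api-key': '<YOUR_API_KEY>'},
)
resp.raise_for_status()
print(resp.json())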

GET /gepa/get_single_execution/{evolutionId}

Get detailed status, current score, generation, and prompt.

GET /gepa/get_execution_prompts

Get evolution history. Requires ?evolutionId=.... Supports pagination.

Supported Models

GEPA supports the following models:
  • gpt-5-mini - Latest GPT-5 Mini (fast and cost-effective)
  • gpt-5 - Latest GPT-5 (highest quality)
  • gpt-4.1 - GPT-4.1
  • gpt-4o - GPT-4o
  • gpt-4o-mini - GPT-4o Mini (balanced performance and cost)

Example Payloads

Example 1: Math Tutoring

{
  "customId": "demo-arithmetic-001",
  "strategy": "RPM",
  "model": "gpt-4o",
  "datasetColumns": ["question", "hint"],
  "budget": 30,
  "minibatchSize": 3,
  "paretoSize": 4,
  "evaluator": "{\"model\": \"gpt-4o-mini\", \"metric\": \"exact_match\", \"threshold\": 0.95, \"partialCredit\": true}",
  "dataset": [
    {
      "input": {
        "question": "What is 17 + 26?",
        "hint": "Add tens first, then units."
      },
      "expectedOutput": "Answer: 43"
    },
    {
      "input": {
        "question": "Multiply 9 by 7.",
        "hint": "Use repeated addition if needed."
      },
      "expectedOutput": "Answer: 63"
    },
    {
      "input": {
        "question": "What is 125 - 67?",
        "hint": "Borrow carefully and explain."
      },
      "expectedOutput": "Answer: 58"
    },
    {
      "input": {
        "question": "What is 45 minus 18?",
        "hint": "Break 45 into 40 + 5."
      },
      "expectedOutput": "Answer: 27"
    },
    {
      "input": {
        "question": "Solve 12 * 14 using mental math.",
        "hint": "Split one factor into tens and ones."
      },
      "expectedOutput": "Answer: 168"
    },
    {
      "input": {
        "question": "Add 156 + 89.",
        "hint": "Round 89 to 90 first."
      },
      "expectedOutput": "Answer: 245"
    },
    {
      "input": {
        "question": "Divide 144 by 12.",
        "hint": "Think of 12 times table."
      },
      "expectedOutput": "Answer: 12"
    }
  ],
  "prompt": "You are an encouraging math tutor. Walk the student through each step, narrate your reasoning, and end with `Answer: <value>`."
}

Example 2: Support Response

{
  "customId": "support-aurora-router",
  "strategy": "RPM",
  "model": "gpt-4o",
  "datasetColumns": ["customer_name", "product", "issue_summary"],
  "budget": 25,
  "minibatchSize": 2,
  "paretoSize": 2,
  "evaluator": "{\"model\": \"gpt-4o-mini\", \"metric\": \"coherence\"}",
  "dataset": [
    {
      "input": {
        "customer_name": "Rory Chen",
        "product": "Aurora Mesh Router",
        "issue_summary": "intermittent drop-offs whenever video calls start"
      },
      "expectedOutput": "Hi Rory, I refreshed QoS and shared call-stability steps so video calls stay stable."
    },
    {
      "input": {
        "customer_name": "Priya Patel",
        "product": "Aurora Mesh Router",
        "issue_summary": "needs parental controls ready before weekend trip"
      },
      "expectedOutput": "Hi Priya, I set up device groups so you can enable parental controls in one tap before the trip."
    },
    {
      "input": {
        "customer_name": "Damian Wright",
        "product": "Aurora Mesh Router",
        "issue_summary": "wants one-sentence status summary"
      },
      "expectedOutput": "Damian, the mesh rollout finished a week early with redundant coverage verified."
    },
    {
      "input": {
        "customer_name": "Sam Lee",
        "product": "Aurora Mesh Router",
        "issue_summary": "roaming handoff assurance needed"
      },
      "expectedOutput": "Hi Sam, roaming profile is enabled and handoff latency is under 120 ms confirmed via live session trace."
    }
  ],
  "prompt": "Write a concise, empathetic support reply for {customer_name} about their {product}. Highlight the fix for: {issue_summary}. Close with an offer to help further."
}

Real-World Evolution Example

Here’s an actual GEPA execution using the Support Response payload from Example 2:
  • Generation 0 (score 0.38): identical to the Example 2 prompt - a polite acknowledgement plus a closing offer to help.
  • Generation 1 (score 0.69, +82%): GEPA rewrote the prompt into a numbered template that called out mesh rollout achievements, 120 ms roaming latency targets, and live-session trace evidence directly pulled from the dataset context.
Key Improvements Observed:
  • Structured approach with numbered sections
  • Domain-specific knowledge extracted from dataset (QoS, mesh rollout, 120ms handoff latency)
  • Concrete examples for different scenario types
  • Professional formatting requirements
  • Technical credibility elements (live session traces)
This demonstrates GEPA’s reflective learning: the final prompt contains specific knowledge and patterns extracted from evaluating the dataset examples.

Python Example

This script reuses the Support Response payload from Example 2 to keep the dataset definition in one place. Save that JSON as support_payload.json (or load it however you prefer) and then run:
import requests
import time
import json
from copy import deepcopy
from pathlib import Path

api_key = '<YOUR_API_KEY>'
base_url = 'https://mainnet.dkn.dria.co/api/v0'

# Step 1: Prepare payload (copy of Example 2 with a new customId)
support_payload = json.loads(Path('support_payload.json').read_text())  # Example 2 JSON saved to disk
payload = deepcopy(support_payload)
payload["customId"] = "my-prompt-evolution"

# Step 2: Start execution
resp = requests.post(f'{base_url}/gepa/start_execution',
    headers={'x-api-key': api_key, 'Content-Type': 'application/json'},
    json=payload)
resp.raise_for_status()
evolution_id = resp.json()['evolutionId']
print(f'✅ Evolution started: {evolution_id}')

# Wait for initialization
time.sleep(5)

# Step 3: Monitor progress (same loop described in Getting Started)
while True:
    status_resp = requests.get(f'{base_url}/gepa/get_single_execution/{evolution_id}',
        headers={'x-api-key': api_key})
    status_resp.raise_for_status()
    data = status_resp.json()

    if 'prompt' in data:
        print(f"Status: {data['status']} | Gen: {data['prompt']['currentGeneration']} | Score: {data['prompt']['currentPromptScore']}")
    else:
        print(f"Status: {data['status']}")

    if data['status'] in ['completed', 'failed']:
        break
    time.sleep(10)

# Step 4: Get evolved prompts
if data['status'] == 'completed':
    prompts_resp = requests.get(f'{base_url}/gepa/get_execution_prompts',
        params={'evolutionId': evolution_id, 'page': 1, 'limit': 20},
        headers={'x-api-key': api_key})
    prompts_resp.raise_for_status()

    prompts = prompts_resp.json()['prompts']
    print(f'\n✅ Evolution complete! {len(prompts)} generations')
    for p in prompts:
        print(f"\nGen {p['generation']} (score: {p['score']}):\n{p['prompt'][:200]}...")
else:
    print('❌ Evolution failed')

Additional Resources