Prompt Intelligence

Prompt Intelligence provides prompt version management and A/B experiments for LibreFang agents. Track prompt changes over time, compare variants side-by-side, and let the system automatically activate the best-performing prompt.

Quick Start

1. Enable the feature

Add to ~/.librefang/config.toml:

[prompt_intelligence]
enabled = true

2. Create prompt versions

Open Dashboard → Agents → select an agent → Prompts button → Versions tab → New Version.

Enter a system prompt and optional description. The server automatically generates a unique ID and content hash.

3. Run an A/B experiment

Switch to the Experiments tab → New Experiment → select 2 or more prompt versions → Create.

Click Start to begin routing traffic to the selected variants.

4. Review results

Click Metrics on a running experiment to see per-variant success rate, latency, and cost. When satisfied, click Complete — the winning variant's prompt is automatically activated.

Prompt Versioning

Automatic Tracking

When enabled = true, LibreFang computes a hash of each agent's system prompt at the start of every request. If the prompt content has changed since the last tracked version, a new version is created automatically and marked as active.

This means any change to an agent's system_prompt in its manifest or via the API is captured without manual intervention.

Manual Versions

Create versions explicitly through the Dashboard or API when you want to prepare variants for an experiment. Each version stores:

System prompt — the full prompt text
Content hash — computed server-side for deduplication
Version number — auto-incremented per agent
Description — optional human-readable label
Active flag — only one version is active per agent at a time

Version Pruning

When a new version is created and the total exceeds max_versions_per_agent (default: 50), the oldest inactive versions are automatically deleted. Active versions are never pruned.

Activating a Version

Click Activate on any version in the Dashboard, or call the API:

curl -X POST http://localhost:4545/api/prompts/versions/{version_id}/activate \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "your-agent-id"}'

The agent will use this version's system prompt on the next request.

A/B Experiments

Concepts

An experiment compares two or more prompt variants on a single agent. Each variant is backed by a prompt version. When the experiment is running, incoming requests are routed to variants based on a configurable traffic split.

Term	Description
Variant	A named entry in the experiment, linked to a specific prompt version.
Traffic split	Percentage of traffic each variant receives (e.g. `[70, 30]` for 70/30).
Success criteria	Rules that determine whether a request was successful.

Experiment Lifecycle

Draft → Running → Paused → Running → Completed
              ↘                    ↗
               →→→ Completed →→→→→→

Status	Behavior
Draft	Experiment is configured but not active. No traffic is routed.
Running	Requests are split among variants. Metrics are recorded.
Paused	Traffic routing stops. Metrics are preserved.
Completed	Experiment ends. The variant with the highest success rate is automatically activated.

Traffic Split

Traffic is distributed using weighted selection. The traffic_split array defines the percentage for each variant. For example, with two variants and [70, 30]:

70% of sessions are routed to Variant A (Control)
30% of sessions are routed to Variant B

Routing is session-consistent — the same session always hits the same variant, determined by hashing the session ID.

Success Evaluation

A request is considered successful when:

The agent produced a non-empty response
The agent completed at least one iteration (executed the loop)

Metrics

Both streaming and non-streaming request paths record metrics per variant:

Metric	Description
Total requests	Number of requests routed to this variant
Successful / Failed	Breakdown by success criteria
Success rate	Percentage of successful requests
Average latency	Mean response time in milliseconds
Average cost	Mean cost per request in USD
Total cost	Cumulative cost for all requests

Auto-Activation

When an experiment is marked as Completed, the system:

Compares success rates across all variants
Selects the variant with the highest success rate
Activates the corresponding prompt version for the agent

This happens automatically — no manual intervention required.

Dashboard

Prompts Button

Each agent card in the Agents page has a Prompts button that opens the Prompt Intelligence modal with two tabs:

Versions tab:

List all prompt versions with version number, active badge, description, and creation date
Create new versions with a system prompt and description
Activate any version with one click
Delete non-active versions

Experiments tab:

List all experiments with status badges and variant count
Create experiments by selecting 2+ prompt versions as variants
Start / Pause / Complete experiments
View detailed metrics with success rate badges, latency, and cost

Analytics Page

The Analytics page includes a Model Performance section with:

KPI cards for average latency, fastest model, cheapest per-call, and total calls
Latency comparison chart (avg/min/max by model)
Cost per call chart
Detailed performance table

API Reference

Prompt Versions

Endpoint	Method	Description
`/api/agents/{agent_id}/prompts/versions`	GET	List all prompt versions for an agent
`/api/agents/{agent_id}/prompts/versions`	POST	Create a new prompt version
`/api/prompts/versions/{id}`	GET	Get a specific version
`/api/prompts/versions/{id}`	DELETE	Delete a version
`/api/prompts/versions/{id}/activate`	POST	Activate a version (body: `{"agent_id": "..."}`)

Create Version

curl -X POST http://localhost:4545/api/agents/{agent_id}/prompts/versions \
  -H "Content-Type: application/json" \
  -d '{
    "system_prompt": "You are a helpful coding assistant.",
    "description": "Focus on code quality",
    "version": 1,
    "tools": [],
    "variables": [],
    "created_by": "dashboard"
  }'

The server generates id, created_at, and content_hash automatically.

Experiments

Endpoint	Method	Description
`/api/agents/{agent_id}/prompts/experiments`	GET	List experiments for an agent
`/api/agents/{agent_id}/prompts/experiments`	POST	Create an experiment
`/api/prompts/experiments/{id}`	GET	Get experiment details
`/api/prompts/experiments/{id}/start`	POST	Start experiment
`/api/prompts/experiments/{id}/pause`	POST	Pause experiment
`/api/prompts/experiments/{id}/complete`	POST	Complete experiment (auto-activates winner)
`/api/prompts/experiments/{id}/metrics`	GET	Get per-variant metrics

Create Experiment

curl -X POST http://localhost:4545/api/agents/{agent_id}/prompts/experiments \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Tone comparison",
    "status": "draft",
    "traffic_split": [50, 50],
    "success_criteria": {
      "require_user_helpful": true,
      "require_no_tool_errors": true,
      "require_non_empty": true
    },
    "variants": [
      {
        "name": "Control",
        "prompt_version_id": "version-uuid-1",
        "description": "Current prompt"
      },
      {
        "name": "Variant B",
        "prompt_version_id": "version-uuid-2",
        "description": "More concise"
      }
    ]
  }'

Get Metrics

curl http://localhost:4545/api/prompts/experiments/{id}/metrics

Response:

[
  {
    "variant_id": "...",
    "variant_name": "Control",
    "total_requests": 142,
    "successful_requests": 128,
    "failed_requests": 14,
    "success_rate": 90.1,
    "avg_latency_ms": 1250.5,
    "avg_cost_usd": 0.0023,
    "total_cost_usd": 0.3266
  },
  {
    "variant_id": "...",
    "variant_name": "Variant B",
    "total_requests": 58,
    "successful_requests": 55,
    "failed_requests": 3,
    "success_rate": 94.8,
    "avg_latency_ms": 980.2,
    "avg_cost_usd": 0.0019,
    "total_cost_usd": 0.1102
  }
]

Model Performance

Endpoint	Method	Description
`/api/usage/by-model/performance`	GET	Model performance metrics with latency stats

Configuration Reference

[prompt_intelligence]
enabled = false              # Master toggle (default: false)
hash_prompts = true          # Compute content hashes for dedup (default: true)
max_versions_per_agent = 50  # Max versions before pruning (default: 50)

See the Configuration Reference for the full config schema.

Database

Prompt Intelligence uses SQLite (same database as the rest of LibreFang). Migration v13 creates four tables:

Table	Purpose
`prompt_versions`	Stores prompt version history per agent
`prompt_experiments`	Experiment definitions and status
`experiment_variants`	Variants within experiments, linking to prompt versions
`experiment_metrics`	Aggregated metrics per variant (requests, latency, cost)

Migration v14 adds a latency_ms column to the existing usage_events table for model performance tracking.

Migrations run automatically on startup. No manual intervention required.