Feature Configuration

Configuration for MCP servers, A2A integration, fallback providers, users, browser automation, reload behavior, execution policy, approval workflows, budget controls, thinking mode, text-to-speech, Docker sandboxing, canvas, auto-reply, broadcast, inbox, bindings, pairing, extensions, vault, webhook triggers, proxy, sessions, and queue management.


[[mcp_servers]]

MCP (Model Context Protocol) server connections provide external tool integration. Each entry is a separate [[mcp_servers]] array element.

[[mcp_servers]]
name = "filesystem"
timeout_secs = 30
env = []

[mcp_servers.transport]
type = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/docs"]

[[mcp_servers]]
name = "remote-api"
timeout_secs = 60
env = ["GITHUB_PERSONAL_ACCESS_TOKEN"]

[mcp_servers.transport]
type = "sse"
url = "https://mcp.example.com/sse"

[[mcp_servers]]
name = "my-http-backend"
timeout_secs = 30

[mcp_servers.transport]
type = "http_compat"
base_url = "https://tools.example.com"
headers = [{name = "Authorization", value_env = "MY_API_KEY"}]

[[mcp_servers.transport.tools]]
name = "search"
description = "Search documents"
path = "/search"
method = "post"

Streamable HTTP

[[mcp_servers]]
name = "remote-tools"

[mcp_servers.transport.Http]
url = "https://mcp.example.com/v1"

Uses the Streamable HTTP transport (MCP spec 2025-03-26+). Unlike SSE, this is a simpler request/response pattern over standard HTTP POST.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | Display name for this MCP server. Tools are namespaced as mcp_{name}_{tool}. |
| timeout_secs | u64 | 30 | Request timeout in seconds. |
| env | list of strings | [] | Environment variable names to pass through to the subprocess (stdio transport only). |

Transport variants (tagged union on type):

| type | Fields | Description |
| --- | --- | --- |
| stdio | command (string), args (list of strings, default []) | Spawn a subprocess, communicate via JSON-RPC over stdin/stdout. |
| sse | url (string) | Connect to an HTTP Server-Sent Events endpoint. |
| Http | url (string) | Streamable HTTP transport (MCP spec 2025-03-26+). Simple request/response over HTTP POST. |
| http_compat | base_url (string), headers (list of header configs), tools (list of tool configs) | Built-in compatibility adapter for plain HTTP/JSON tool backends without a native MCP server. Each tool maps to an HTTP endpoint. |

http_compat header config:

| Field | Type | Description |
| --- | --- | --- |
| name | string | HTTP header name (e.g., "Authorization"). |
| value | string or null | Static header value. |
| value_env | string or null | Env var name whose value is used as the header value (preferred for secrets). |

http_compat tool config:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | Tool name exposed to the LLM. |
| description | string | "" | Tool description shown to the LLM. |
| path | string | required | HTTP path (e.g., "/search"). |
| method | string | "post" | HTTP method: get, post, put, patch, delete. |
| request_mode | string | "json_body" | How arguments are sent: json_body, query, none. |
| response_mode | string | "json" | Response parsing: json, text. |
| input_schema | object | {"type":"object"} | JSON Schema for the tool's input parameters. |
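
As a rough illustration of how an http_compat tool entry could translate into an HTTP request, the sketch below builds (but does not send) a request from base_url, path, method, and request_mode. The helper name build_request and the exact mapping of arguments are assumptions made for this example; it is not LibreFang's adapter code.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url, tool, args, headers=None):
    """Build (without sending) the HTTP request for one http_compat tool call.

    Hypothetical helper: it mirrors the documented fields, not the real adapter.
    """
    url = base_url.rstrip("/") + tool["path"]
    method = tool.get("method", "post").upper()
    mode = tool.get("request_mode", "json_body")
    hdrs = dict(headers or {})
    body = None
    if mode == "json_body":
        # Arguments travel as a JSON document in the request body.
        body = json.dumps(args).encode("utf-8")
        hdrs["Content-Type"] = "application/json"
    elif mode == "query":
        # Arguments travel as URL query parameters.
        if args:
            url += "?" + urlencode(args)
    elif mode != "none":
        raise ValueError(f"unknown request_mode: {mode}")
    return Request(url, data=body, headers=hdrs, method=method)


search = {"name": "search", "path": "/search", "method": "post"}
req = build_request("https://tools.example.com", search, {"q": "invoices"})
print(req.get_method(), req.full_url)
```

With request_mode = "query" the same arguments would end up in the URL instead of the body, which is usually what a GET endpoint expects.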

[a2a]

Agent-to-Agent protocol configuration, enabling inter-agent communication across LibreFang instances.

[a2a]
enabled = true
name = "LibreFang Agent OS"
description = "My production agent OS"
listen_path = "/a2a"

[[a2a.external_agents]]
name = "research-agent"
url = "https://agent.example.com/.well-known/agent.json"

[[a2a.external_agents]]
name = "code-reviewer"
url = "https://reviewer.example.com/.well-known/agent.json"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Whether A2A protocol is enabled. |
| name | string | "LibreFang Agent OS" | Service-level display name shown in the well-known agent card. |
| description | string | "" | Service-level description shown in the well-known agent card. |
| listen_path | string | "/a2a" | URL path prefix for A2A endpoints. |
| external_agents | list of objects | [] | External A2A agents to discover and interact with. |

external_agents entries:

| Field | Type | Description |
| --- | --- | --- |
| name | string | Display name for the external agent. |
| url | string | Agent card endpoint URL (typically /.well-known/agent.json). |

[[fallback_providers]]

Fallback provider chain. When the primary LLM provider ([default_model]) fails, these are tried in order.

[[fallback_providers]]
provider = "ollama"
model = "llama3.2:latest"
api_key_env = ""
# base_url = "http://localhost:11434"

[[fallback_providers]]
provider = "groq"
model = "llama-3.3-70b-versatile"
api_key_env = "GROQ_API_KEY"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| provider | string | "" | Provider name (e.g., "ollama", "groq", "openai"). |
| model | string | "" | Model identifier for this provider. |
| api_key_env | string | "" | Env var name for the API key. Empty for local providers (ollama, vllm, lmstudio). |
| base_url | string or null | null | Base URL override. Uses catalog default if null. |
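
The "tried in order" behavior can be pictured with a minimal sketch. The names ProviderError and complete_with_fallback, and the idea of representing each provider as a (name, callable) pair, are hypothetical; which errors actually trigger a fallback in LibreFang is not specified here.

```python
class ProviderError(Exception):
    """Raised by a provider call that failed (hypothetical error type)."""


def complete_with_fallback(primary, fallbacks, prompt):
    """Try the primary provider, then each fallback in configured order."""
    errors = []
    for name, call in [primary, *fallbacks]:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember why this provider failed and move on to the next one.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def failing(prompt):
    """Stand-in for a provider that is currently unreachable."""
    raise ProviderError("connection refused")


def echo(prompt):
    """Stand-in for a provider that answers successfully."""
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    ("primary", failing), [("ollama", failing), ("groq", echo)], "hi"
)
print(used, reply)
```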

[[users]]

RBAC multi-user configuration. Users can be assigned roles and bound to channel platform identities.

[[users]]
name = "Alice"
role = "owner"
api_key_hash = "sha256_hash_of_api_key"

[users.channel_bindings]
telegram = "123456"
discord = "987654321"
slack = "U0ABCDEFG"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | User display name. |
| role | string | "user" | User role in the RBAC hierarchy. |
| channel_bindings | map of string to string | {} | Maps channel platform names to platform-specific user IDs, binding this user identity across channels. |
| api_key_hash | string or null | null | SHA256 hash of the user's personal API key for authenticated API access. |
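
One way to produce a value for api_key_hash is sketched below. It assumes the hash is stored as a lowercase hex digest of the UTF-8 key; the table above only says "SHA256 hash", so check the encoding your deployment expects. Both function names are illustrative.

```python
import hashlib
import hmac


def hash_api_key(api_key: str) -> str:
    """Return the SHA-256 digest of an API key as lowercase hex (assumed encoding)."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Compare a presented key against the stored hash in constant time."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash.lower())


stored = hash_api_key("example-key")
print(stored)  # paste a value like this into api_key_hash
```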

Role hierarchy (highest to lowest privilege):

| Role | Description |
| --- | --- |
| owner | Full administrative access. Can manage all agents, users, and configuration. |
| admin | Can manage agents and most settings. Cannot modify owner accounts. |
| user | Can interact with agents. Limited management capabilities. |
| viewer | Read-only access. Can view agent responses but cannot send messages. |

[browser]

Configures the headless browser automation engine used by the browser_* agent tools.

[browser]
enabled = true
headless = true
viewport_width = 1280
viewport_height = 720
timeout_secs = 30
idle_timeout_secs = 300
max_sessions = 5
# chromium_path = "/usr/bin/chromium"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable the built-in CDP browser tools. Set to false to disable all browser_* tools. |
| headless | bool | true | Run browser in headless mode (no visible window). |
| viewport_width | u32 | 1280 | Browser viewport width in pixels. |
| viewport_height | u32 | 720 | Browser viewport height in pixels. |
| timeout_secs | u64 | 30 | Per-action timeout in seconds. |
| idle_timeout_secs | u64 | 300 | Auto-close browser session after this many seconds of inactivity. |
| max_sessions | usize | 5 | Maximum concurrent browser sessions. |
| chromium_path | string or null | null | Path to the Chromium/Chrome binary. Auto-detected if null. |

[reload]

Controls automatic config file watching and hot-reloading.

[reload]
mode = "hybrid"
debounce_ms = 500

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | "hybrid" | Reload mode. See below. |
| debounce_ms | u64 | 500 | Debounce window in milliseconds before reloading after a file change is detected. |

mode values:

| Value | Description |
| --- | --- |
| off | No automatic reloading. Changes require a manual restart. |
| restart | Full daemon restart on any config change. |
| hot | Hot-reload safe sections only (channels, skills, heartbeat). |
| hybrid | Hot-reload where possible; flag restart-required for sections that need it (default). |

Hot-reloadable sections (no restart required):

The following config sections can be changed in config.toml while the daemon is running. Changes are detected automatically (polled every 30 seconds):

  • [channels] — add, remove, or reconfigure channel adapters
  • [[skills]] — skill registry
  • [proxy] — HTTP/HTTPS/SOCKS5 proxy settings
  • [browser] — browser automation settings
  • [web] — web search/scraping config
  • [approval] — approval policy
  • [cron] — cron job settings
  • [webhook_triggers] — webhook triggers
  • [extensions] — extension config
  • [[mcp_servers]] — MCP server connections
  • [a2a] — Agent-to-Agent protocol config
  • [[fallback_providers]] — fallback provider chain
  • provider_urls — provider base URL overrides
  • default_model — default model selection
  • tool_policy — tool filtering rules
  • proactive_memory — proactive memory thresholds
  • provider_api_keys — provider API keys (flushes driver cache)
  • usage_footer — usage footer mode
  • sanitize — input sanitization rules

Restart-required sections: home_dir, data_dir, api_listen, tls, telemetry. Changes to these fields are logged as warnings but only take effect after restart.


[exec_policy]

Controls which shell commands agents are allowed to execute via the exec and shell tools.

[exec_policy]
mode = "allowlist"
allowed_commands = ["git", "python3", "node"]
timeout_secs = 30
max_output_bytes = 102400
no_output_timeout_secs = 30

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | string | "allowlist" | Security mode. See below. |
| safe_bins | list of strings | ["sleep","true","false","cat","sort","uniq","cut","tr","head","tail","wc","date","echo","printf","basename","dirname","pwd","env"] | Commands that always bypass the allowlist check (stdin-only POSIX utilities). |
| allowed_commands | list of strings | [] | Additional commands permitted when mode = "allowlist". |
| timeout_secs | u64 | 30 | Maximum wall-clock execution time per command in seconds. |
| max_output_bytes | usize | 102400 | Maximum combined stdout+stderr output size in bytes (100 KB default). |
| no_output_timeout_secs | u64 | 30 | Kill processes that produce no output for this many seconds. 0 = disabled. |

mode values:

| Value | Aliases | Description |
| --- | --- | --- |
| deny | none, disabled | Block all shell execution. |
| allowlist | restricted | Only allow commands in safe_bins or allowed_commands (default). |
| full | allow, all, unrestricted | Allow all commands. Unsafe; development use only. |
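
The interaction between mode, its aliases, safe_bins, and allowed_commands can be sketched as a small decision function. This is an illustration under stated assumptions: it only looks at the first word of the command line, compares it verbatim, and ignores pipes, subshells, and other shell operators that a real enforcer has to handle. The names MODE_ALIASES and is_allowed are made up for the example.

```python
import shlex

# Every accepted spelling of a mode, mapped to its canonical name.
MODE_ALIASES = {
    "deny": "deny", "none": "deny", "disabled": "deny",
    "allowlist": "allowlist", "restricted": "allowlist",
    "full": "full", "allow": "full", "all": "full", "unrestricted": "full",
}

# Documented default for safe_bins.
SAFE_BINS = frozenset({
    "sleep", "true", "false", "cat", "sort", "uniq", "cut", "tr", "head",
    "tail", "wc", "date", "echo", "printf", "basename", "dirname", "pwd", "env",
})


def is_allowed(command_line, mode="allowlist", allowed_commands=(), safe_bins=SAFE_BINS):
    """Decide whether a command line may run under the given policy (sketch)."""
    canonical = MODE_ALIASES[mode]
    if canonical == "deny":
        return False
    if canonical == "full":
        return True
    argv = shlex.split(command_line)
    if not argv:
        return False
    # Allowlist mode: the program name must appear in one of the two lists.
    return argv[0] in safe_bins or argv[0] in set(allowed_commands)


print(is_allowed("git status", allowed_commands=["git", "python3", "node"]))
print(is_allowed("rm -rf build"))
```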

[approval]

Configures which tools require explicit human approval before execution. References the ApprovalPolicy type.

[approval]
require_approval = ["shell_exec"]
timeout_secs = 60
auto_approve_autonomous = false
auto_approve = false
second_factor = "none"
totp_issuer = "LibreFang"
totp_grace_period_secs = 300

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| require_approval | list of strings | ["shell_exec"] | List of tool names that pause execution and wait for human approval before proceeding. |
| timeout_secs | u64 | 60 | Timeout for approval requests in seconds. |
| auto_approve_autonomous | bool | false | Auto-approve tools when agent is in autonomous mode. |
| auto_approve | bool | false | Auto-approve all tool executions (unsafe, dev only). |
| second_factor | string | "none" | Second-factor verification: "none" or "totp". When "totp", approvals require a 6-digit TOTP code from an authenticator app. |
| totp_issuer | string | "LibreFang" | Issuer name displayed in authenticator apps during enrollment. |
| totp_grace_period_secs | u64 | 300 | After a successful TOTP verification, skip re-verification for this many seconds. Set to 0 to always require a code. Max: 3600. |
| totp_tools | list of strings | [] | Tools requiring TOTP (glob patterns supported). Empty = all tools in require_approval need TOTP. Example: ["shell_exec"]. |
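
How totp_tools narrows the set of approvals that need a code can be sketched as follows. The sketch assumes shell-style globs (as implemented by Python's fnmatch) and assumes totp_tools only applies to tools already listed in require_approval; the function name needs_totp is hypothetical.

```python
from fnmatch import fnmatchcase


def needs_totp(tool, second_factor, require_approval, totp_tools):
    """Return True when approving `tool` should require a TOTP code (sketch)."""
    if second_factor != "totp" or tool not in require_approval:
        return False
    if not totp_tools:
        # Empty list: every approval-gated tool needs a code.
        return True
    return any(fnmatchcase(tool, pattern) for pattern in totp_tools)


gated = ["shell_exec", "file_write"]
print(needs_totp("shell_exec", "totp", gated, ["shell_*"]))
print(needs_totp("file_write", "totp", gated, ["shell_*"]))
```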

TOTP Setup

Before second_factor = "totp" can be enforced, you must enroll an authenticator app:

  1. Generate secret: POST /api/approvals/totp/setup — returns a base32 secret, otpauth:// URI, QR code (base64 PNG), and 8 recovery codes.
  2. Add to authenticator: Scan the QR code or enter the secret in Google Authenticator, 1Password, Authy, etc.
  3. Save recovery codes: Store the 8 recovery codes safely — each can be used once in place of a TOTP code if you lose your authenticator device.
  4. Confirm enrollment: POST /api/approvals/totp/confirm with {"code": "123456"} — verifies a code from the app.
  5. Enable enforcement: Set second_factor = "totp" in config (hot-reloadable).

You can also set up TOTP from the Dashboard under Settings > Security.

Once enforced, the approval flow changes:

  • Dashboard: Clicking "Approve" prompts for a 6-digit code inline.
  • Channel (Telegram/Slack): Use /approve <id> <6-digit-code> instead of /approve <id>.
  • API: Send {"totp_code": "123456"} in the body of POST /api/approvals/{id}/approve.
  • Recovery: Send a recovery code (format: xxxx-xxxx) in place of a TOTP code. Each recovery code is consumed on use.

To revoke TOTP, use DELETE /api/approvals/totp with {"code": "..."} or use the Dashboard. Revoking requires verification with a current TOTP or recovery code.

Rate limiting: 5 consecutive TOTP failures lock the account for 5 minutes.

The TOTP secret is stored in the encrypted vault (~/.librefang/vault.enc) using AES-256-GCM.


[budget]

Sets global spending limits for LLM API costs. All limits default to 0.0 (unlimited).

[budget]
max_hourly_usd = 1.00
max_daily_usd = 10.00
max_monthly_usd = 50.00
alert_threshold = 0.8
default_max_llm_tokens_per_hour = 0

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max_hourly_usd | f64 | 0.0 | Maximum total LLM cost in USD per hour across all agents. 0.0 = unlimited. |
| max_daily_usd | f64 | 0.0 | Maximum total LLM cost in USD per day across all agents. 0.0 = unlimited. |
| max_monthly_usd | f64 | 0.0 | Maximum total LLM cost in USD per month across all agents. 0.0 = unlimited. |
| alert_threshold | f64 | 0.8 | Warning threshold as a fraction of each limit (0.0–1.0). At 0.8, warnings are logged when 80% of a limit is reached. |
| default_max_llm_tokens_per_hour | u64 | 0 | Global override for per-agent hourly token budget. When > 0, overrides all agents' own token limits. 0 = keep each agent's own limit. |
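
To make the relationship between a limit and alert_threshold concrete, here is a small sketch that classifies current spend against one limit. The function name budget_status and the three labels are invented for the example, and treating spend exactly equal to the limit as exceeded is an assumption.

```python
def budget_status(spent_usd, limit_usd, alert_threshold=0.8):
    """Classify spend against one limit as 'ok', 'warn', or 'exceeded'."""
    if limit_usd <= 0.0:
        # 0.0 means the limit is disabled (unlimited).
        return "ok"
    if spent_usd >= limit_usd:
        return "exceeded"
    if spent_usd >= alert_threshold * limit_usd:
        return "warn"
    return "ok"


# With max_daily_usd = 10.00 and alert_threshold = 0.8:
for spent in (5.00, 8.50, 10.25):
    print(spent, budget_status(spent, 10.00))
```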

Per-provider caps ([budget.providers.<id>])

When you mix a free local provider (e.g. litellm, ollama) with paid ones (e.g. moonshot, openai), you can cap spending on the paid providers without throttling the free ones. Provider IDs must match the provider field of your agents' [model] blocks. Missing or zero limits mean unlimited.

[budget.providers.moonshot]
max_cost_per_day_usd = 2.0
max_tokens_per_hour = 500000

[budget.providers.openai]
max_cost_per_hour_usd = 1.0
max_cost_per_month_usd = 50.0

[budget.providers.litellm]
# omitted or all zeros = unlimited

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max_cost_per_hour_usd | f64 | 0.0 | Hourly cost cap for this provider. 0.0 = unlimited. |
| max_cost_per_day_usd | f64 | 0.0 | Daily cost cap for this provider. 0.0 = unlimited. |
| max_cost_per_month_usd | f64 | 0.0 | Monthly cost cap for this provider. 0.0 = unlimited. |
| max_tokens_per_hour | u64 | 0 | Hourly token cap (input + output) for this provider. 0 = unlimited. |

When a per-provider limit is hit, the dispatch fails with a QuotaExceeded error mentioning the provider name — usage on other providers is unaffected.


[thinking]

Configures extended thinking (chain-of-thought reasoning) for models that support it (e.g., Claude 3.7 Sonnet with thinking mode).

[thinking]
budget_tokens = 10000
stream_thinking = false

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| budget_tokens | u32 | 10000 | Maximum tokens allocated for the thinking/reasoning phase. |
| stream_thinking | bool | false | Whether to stream thinking tokens to the client (visible in the API response stream). |

[tts]

Configures text-to-speech synthesis for voice output.

[tts]
enabled = false
provider = "openai"          # openai | elevenlabs | google_tts
max_text_length = 4096
timeout_secs = 30

[tts.openai]
voice = "alloy"
model = "tts-1"
format = "mp3"
speed = 1.0

[tts.elevenlabs]
voice_id = "21m00Tcm4TlvDq8ikWAM"
model_id = "eleven_monolingual_v1"
stability = 0.5
similarity_boost = 0.75

[tts.google]
voice = "en-US-Standard-F"
language_code = "en-US"
speaking_rate = 1.0
pitch = 0.0
format = "mp3"

[tts] fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable TTS synthesis. |
| provider | string or null | null | Default TTS provider: "openai", "elevenlabs", or "google_tts". |
| max_text_length | usize | 4096 | Maximum text length in characters for a single TTS request. |
| timeout_secs | u64 | 30 | Request timeout per TTS call in seconds. |

[tts.openai] fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| voice | string | "alloy" | Voice name. Options: alloy, echo, fable, onyx, nova, shimmer. |
| model | string | "tts-1" | TTS model: "tts-1" (fast) or "tts-1-hd" (high quality). |
| format | string | "mp3" | Output format: mp3, opus, aac, flac. |
| speed | f32 | 1.0 | Speech speed multiplier (0.25 to 4.0). |

[tts.elevenlabs] fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| voice_id | string | "21m00Tcm4TlvDq8ikWAM" | ElevenLabs voice ID (default: Rachel). |
| model_id | string | "eleven_monolingual_v1" | ElevenLabs model ID. |
| stability | f32 | 0.5 | Voice stability (0.0–1.0). Higher = more consistent, less expressive. |
| similarity_boost | f32 | 0.75 | Voice similarity boost (0.0–1.0). |

[tts.google] fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| voice | string | "en-US-Standard-F" | Google TTS voice name (e.g. en-US-Standard-F, pl-PL-Wavenet-A). |
| language_code | string | "en-US" | BCP-47 language code (e.g. en-US, pl-PL). |
| speaking_rate | f32 | 1.0 | Speaking rate multiplier (0.25 to 4.0). |
| pitch | f32 | 0.0 | Pitch adjustment in semitones (-20.0 to 20.0). |
| format | string | "mp3" | Output format: mp3, opus, wav. |

Requires GOOGLE_API_KEY or GOOGLE_CLOUD_API_KEY environment variable.


[docker]

Configures the Docker container sandbox for isolated code execution.

[docker]
enabled = false
image = "python:3.12-slim"
container_prefix = "librefang-sandbox"
workdir = "/workspace"
network = "none"
memory_limit = "512m"
cpu_limit = 1.0
timeout_secs = 60
read_only_root = true
mode = "off"
scope = "session"
reuse_cool_secs = 300
idle_timeout_secs = 86400
max_age_secs = 604800
blocked_mounts = []

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable Docker sandbox for code execution. |
| image | string | "python:3.12-slim" | Docker image to use for the sandbox container. |
| container_prefix | string | "librefang-sandbox" | Prefix for container names. |
| workdir | string | "/workspace" | Working directory inside the container. |
| network | string | "none" | Network mode: "none" (isolated), "bridge", or a custom network name. |
| memory_limit | string | "512m" | Memory limit (e.g., "256m", "1g"). |
| cpu_limit | f64 | 1.0 | CPU limit (e.g., 0.5, 1.0, 2.0). |
| timeout_secs | u64 | 60 | Maximum execution time per command in seconds. |
| read_only_root | bool | true | Mount the root filesystem as read-only. |
| mode | string | "off" | Activation mode. See below. |
| scope | string | "session" | Container lifecycle scope. See below. |
| reuse_cool_secs | u64 | 300 | Cooldown in seconds before a released container can be reused. |
| idle_timeout_secs | u64 | 86400 | Destroy containers after this many seconds of inactivity (24 hours default). |
| max_age_secs | u64 | 604800 | Maximum container age before forced destruction (7 days default). |
| blocked_mounts | list of strings | [] | Host paths blocked from bind mounting into containers. |
| cap_add | list of strings | [] | Linux capabilities to add to the container (e.g., ["NET_ADMIN"]). Use with caution. |
| tmpfs | list of strings | ["/tmp:size=64m"] | tmpfs mounts inside the container. Each entry is "path:options" (e.g., "/tmp:size=128m"). |
| pids_limit | u32 | 100 | Maximum number of processes inside the container. Prevents fork bombs. |

mode values:

| Value | Description |
| --- | --- |
| off | Docker sandbox disabled (default). |
| non_main | Use Docker only for non-main (sub) agents. |
| all | Use Docker for all agents. |

scope values:

| Value | Description |
| --- | --- |
| session | One container per session, destroyed when the session ends (default). |
| agent | One container per agent, reused across sessions. |
| shared | Shared container pool across all agents. |

[canvas]

Configures the Canvas (Agent-to-UI) tool that allows agents to render HTML in the dashboard.

[canvas]
enabled = false
max_html_bytes = 524288
allowed_tags = []

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable the canvas tool. |
| max_html_bytes | usize | 524288 | Maximum HTML payload size in bytes (512 KB default). |
| allowed_tags | list of strings | [] | Allowed HTML tag names for sanitization. Empty = all safe tags permitted. |

[auto_reply]

Configures the background auto-reply engine that can automatically respond to incoming messages without waiting for human interaction.

[auto_reply]
enabled = false
max_concurrent = 3
timeout_secs = 120
suppress_patterns = ["/stop", "/pause"]

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable the auto-reply engine. |
| max_concurrent | usize | 3 | Maximum number of concurrent auto-reply tasks. |
| timeout_secs | u64 | 120 | Default timeout per auto-reply task in seconds. |
| suppress_patterns | list of strings | ["/stop", "/pause"] | Incoming message patterns that suppress auto-reply. |

[broadcast]

Configures message broadcasting to route a single incoming message to multiple agents simultaneously.

[broadcast]
strategy = "parallel"
routes = { "announcement-channel" = ["agent-a", "agent-b", "agent-c"] }

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| strategy | string | "parallel" | Delivery strategy. "parallel" = send to all agents simultaneously; "sequential" = send one at a time in order. |
| routes | map of string to list of strings | {} | Maps peer/channel identifiers to lists of agent names that receive the message. |

[inbox]

File-based input inbox for async external commands. Drop text files into a watched directory and they are dispatched as messages to agents. Processed files are moved to a processed/ subdirectory to avoid redelivery.

[inbox]
enabled = true
directory = "~/.librefang/inbox/"
poll_interval_secs = 5
default_agent = "assistant"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable the inbox directory watcher. |
| directory | string or null | null | Directory to watch. Defaults to $HOME_DIR/inbox/. Supports ~ expansion. |
| poll_interval_secs | u64 | 5 | How often (in seconds) to scan the directory for new files. Minimum 1. |
| default_agent | string or null | null | Agent name to route files to when no agent: directive is found in the file. |

File format: Plain text files (.txt, .md, .json, .py, etc.). The first line may contain an agent:<name> directive to target a specific agent; the rest is sent as the message body. Files without the directive use default_agent.

Safety limits: Files larger than 1 MB are skipped. Binary files (non-text extensions) are skipped. Empty files are moved to processed/ without sending.
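
The file format and safety limits above can be summarized in a short routing sketch. It is a simplified model, not the watcher's implementation: the 1 MB limit is taken as 1024 * 1024 bytes, the extension-based binary check is left out, and returning None when neither a directive nor default_agent is available is an assumption. The name route_inbox_file is hypothetical.

```python
MAX_BYTES = 1024 * 1024  # documented 1 MB limit (exact byte count assumed)


def route_inbox_file(raw: bytes, default_agent=None):
    """Return (agent, body), or None when the file should not be dispatched."""
    if len(raw) > MAX_BYTES:
        return None  # too large: skipped
    text = raw.decode("utf-8", errors="replace")
    if not text.strip():
        return None  # empty: moved to processed/ without sending
    first, _, rest = text.partition("\n")
    if first.strip().startswith("agent:"):
        # The first line names the target agent; the rest is the message body.
        agent = first.strip()[len("agent:"):].strip()
        body = rest
    else:
        agent, body = default_agent, text
    if not agent:
        return None  # nothing to route to (assumed behavior)
    return agent, body


print(route_inbox_file(b"agent:code-reviewer\nPlease review this code"))
print(route_inbox_file(b"Summarize today's system logs\n", default_agent="assistant"))
```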

Usage examples:

Target a specific agent:

cat > ~/.librefang/inbox/task.txt << 'EOF'
agent:code-reviewer
Please review this code for security issues:

def login(user, password):
    query = f"SELECT * FROM users WHERE name='{user}' AND pass='{password}'"
    return db.execute(query)
EOF

Send to the default agent:

echo "Summarize today's system logs" > ~/.librefang/inbox/summarize.txt

Cron job:

# crontab -e
0 9 * * * grep ERROR /var/log/app.log > ~/.librefang/inbox/daily_errors.txt

CI/CD post-build:

echo "agent:devops
Build failed, please analyze:
$(tail -100 build.log)" > ~/.librefang/inbox/build_$(date +%s).txt

Batch processing:

for doc in ~/reports/*.md; do
  cp "$doc" ~/.librefang/inbox/
done

Check inbox status:

curl -s http://127.0.0.1:4545/api/inbox/status
# {"enabled":true,"pending_count":3,"processed_count":12,...}

[[bindings]]

Agent bindings route specific channel/account/peer combinations to specific agents. More specific bindings (more non-null fields) take priority over less specific ones.

[[bindings]]
agent = "support-agent"
[bindings.match_rule]
channel = "telegram"
guild_id = "123456"

[[bindings]]
agent = "vip-agent"
[bindings.match_rule]
channel = "discord"
peer_id = "987654321"
roles = ["premium"]

Top-level fields:

| Field | Type | Description |
| --- | --- | --- |
| agent | string | Target agent name or ID to route matched messages to. |
| match_rule | object | Match criteria. All specified (non-null) fields must match. |

match_rule fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| channel | string or null | null | Channel type to match (e.g., "discord", "telegram", "slack"). |
| account_id | string or null | null | Specific bot account ID within the channel (for multi-bot setups). |
| peer_id | string or null | null | User/peer ID for DM routing. |
| guild_id | string or null | null | Guild or server ID (Discord/Slack). |
| roles | list of strings | [] | Role-based routing; user must have at least one of these roles. |

Specificity scoring (higher = matched first): peer_id (+8) > guild_id (+4) > roles (+2) = account_id (+2) > channel (+1).
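
The scoring rule can be worked through with a short sketch that uses the weights listed above. The dictionary-based message shape and the function names are assumptions for illustration; ties are broken here by configuration order, which the text above does not specify.

```python
WEIGHTS = {"peer_id": 8, "guild_id": 4, "roles": 2, "account_id": 2, "channel": 1}


def specificity(rule):
    """Sum the documented weights of every field the rule actually sets."""
    return sum(weight for field, weight in WEIGHTS.items() if rule.get(field))


def matches(rule, msg):
    """All set fields must match; roles match if the user has at least one."""
    for field in ("channel", "account_id", "peer_id", "guild_id"):
        if rule.get(field) is not None and rule[field] != msg.get(field):
            return False
    roles = rule.get("roles") or []
    return not roles or bool(set(roles) & set(msg.get("roles", [])))


def pick_agent(bindings, msg):
    """Return the agent of the most specific matching binding, or None."""
    candidates = [b for b in bindings if matches(b["match_rule"], msg)]
    if not candidates:
        return None
    return max(candidates, key=lambda b: specificity(b["match_rule"]))["agent"]


bindings = [
    {"agent": "support-agent", "match_rule": {"channel": "discord", "guild_id": "123456"}},
    {"agent": "vip-agent",
     "match_rule": {"channel": "discord", "peer_id": "987654321", "roles": ["premium"]}},
]
msg = {"channel": "discord", "guild_id": "123456", "peer_id": "987654321", "roles": ["premium"]}
print(pick_agent(bindings, msg))  # scores: 1 + 4 = 5 versus 1 + 8 + 2 = 11
```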


[pairing]

Configures device pairing for the LibreFang mobile companion app and push notifications.

[pairing]
enabled = false
max_devices = 10
token_expiry_secs = 300
push_provider = "ntfy"
ntfy_url = "https://ntfy.sh"
ntfy_topic = "my-librefang-notifications"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable device pairing. |
| max_devices | usize | 10 | Maximum number of paired devices. |
| token_expiry_secs | u64 | 300 | Pairing token validity in seconds (5 minutes default). |
| push_provider | string | "none" | Push notification provider: "none", "ntfy", or "gotify". |
| ntfy_url | string or null | null | ntfy server URL (when push_provider = "ntfy"). |
| ntfy_topic | string or null | null | ntfy topic for push notifications. |

[extensions]

Configures MCP server reconnection behavior and health monitoring.

[extensions]
auto_reconnect = true
reconnect_max_attempts = 10
reconnect_max_backoff_secs = 300
health_check_interval_secs = 60

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| auto_reconnect | bool | true | Automatically reconnect to MCP servers when they disconnect. |
| reconnect_max_attempts | u32 | 10 | Maximum reconnect attempts before giving up permanently. |
| reconnect_max_backoff_secs | u64 | 300 | Maximum backoff duration in seconds between reconnect attempts. |
| health_check_interval_secs | u64 | 60 | Interval in seconds between health checks for connected extensions. |

[vault]

Configures the encrypted credential vault for storing sensitive secrets.

[vault]
enabled = true
# path = "~/.librefang/vault.enc"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable the credential vault. Auto-detected if vault.enc already exists. |
| path | path or null | null | Custom vault file path. Defaults to ~/.librefang/vault.enc. |

[webhook_triggers]

Enables external systems to trigger agent actions via authenticated HTTP webhooks at /hooks/wake and /hooks/agent.

[webhook_triggers]
enabled = true
token_env = "LIBREFANG_WEBHOOK_TOKEN"
max_payload_bytes = 65536
rate_limit_per_minute = 30

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | false | Enable webhook trigger endpoints. |
| token_env | string | "LIBREFANG_WEBHOOK_TOKEN" | Env var name holding the bearer token (NOT the token itself). Token must be ≥ 32 characters. Required when enabled = true. |
| max_payload_bytes | usize | 65536 | Maximum incoming payload size in bytes (64 KB default). |
| rate_limit_per_minute | u32 | 30 | Maximum webhook requests per minute per source IP. |

[proxy]

Configures HTTP proxy for all outbound connections (LLM APIs, web search, MCP servers, etc.). Environment variables HTTP_PROXY, HTTPS_PROXY, and NO_PROXY are also respected as fallbacks.

[proxy]
http_proxy = "http://proxy.corp.example:8080"
https_proxy = "http://proxy.corp.example:8080"
no_proxy = "localhost,127.0.0.1,.internal.corp"

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| http_proxy | string or null | null | HTTP proxy URL. Falls back to HTTP_PROXY / http_proxy env var. Credentials in URLs are redacted in logs. |
| https_proxy | string or null | null | HTTPS proxy URL. Falls back to HTTPS_PROXY / https_proxy env var. |
| no_proxy | string or null | null | Comma-separated list of hosts/domains that bypass the proxy. Falls back to NO_PROXY / no_proxy env var. |

[session]

Configures automatic cleanup of idle or excess sessions.

[session]
retention_days = 30
max_sessions_per_agent = 100
cleanup_interval_hours = 24

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| retention_days | u32 | 0 | Maximum age in days for idle sessions before automatic cleanup. 0 = unlimited. |
| max_sessions_per_agent | u32 | 0 | Maximum number of sessions per agent (oldest pruned first). 0 = unlimited. |
| cleanup_interval_hours | u32 | 24 | How often the background cleanup job runs in hours. |

[queue]

Configures the agent command queue, including depth limits, TTL, and per-lane concurrency.

[queue]
max_depth_per_agent = 100
max_depth_global = 1000
task_ttl_secs = 3600

[queue.concurrency]
main_lane = 3
cron_lane = 2
subagent_lane = 3

[queue] fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max_depth_per_agent | u32 | 0 | Maximum queued tasks per agent. New tasks are rejected when full. 0 = unlimited. |
| max_depth_global | u32 | 0 | Maximum total queued tasks across all agents. 0 = unlimited. |
| task_ttl_secs | u64 | 3600 | Unprocessed tasks expire after this many seconds. 0 = unlimited. |

[queue.concurrency] fields:

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| main_lane | usize | 3 | Concurrent user message tasks. |
| cron_lane | usize | 2 | Concurrent scheduled cron job tasks. |
| subagent_lane | usize | 3 | Concurrent subagent invocation tasks. |

[tool_budget]

Source: librefang-runtime/src/tool_budget.rs

The tool budget enforcer caps the size of data that tool calls can return in a single agent turn, preventing unexpectedly large tool outputs from filling the context window and driving up token costs.

How it works

Every tool result passes through a three-layer size check before being placed in the context:

| Layer | Threshold | Behavior |
| --- | --- | --- |
| Inline | ≤ 50 KB | Result is embedded in the context as-is. |
| Spill | 50 KB – 200 KB | Result is written to a temporary file under /tmp/librefang-results/ and replaced in the context with a short summary noting the spill path and byte count. |
| Truncate | > 200 KB | Only the first 200 KB of the result is kept; the remainder is silently discarded. A WARN entry is emitted with the tool name and original size. |
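
The three layers amount to a simple size classification, sketched below. The sketch assumes 1 KB = 1024 bytes and leaves out the actual file write and logging; classify_result and apply_budget are illustrative names, not functions from tool_budget.rs.

```python
INLINE_MAX = 50 * 1024   # 50 KB, assuming 1 KB = 1024 bytes
HARD_MAX = 200 * 1024    # 200 KB


def classify_result(size_bytes):
    """Map a tool result size to the layer that would handle it."""
    if size_bytes <= INLINE_MAX:
        return "inline"
    if size_bytes <= HARD_MAX:
        return "spill"
    return "truncate"


def apply_budget(result: bytes):
    """Return (layer, bytes_kept); writing the spill file is omitted here."""
    layer = classify_result(len(result))
    kept = result[:HARD_MAX] if layer == "truncate" else result
    return layer, kept


for size in (10_000, 120_000, 500_000):
    layer, kept = apply_budget(b"x" * size)
    print(size, layer, len(kept))
```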

Spill file naming

Spill files are named <agent_id>_<tool_call_id>_<timestamp>.txt and are written to /tmp/librefang-results/. They are not cleaned up automatically between turns; the directory is purged when the daemon restarts.

The agent can read a spill file on a subsequent turn if needed — the summary injected into the context includes the full path.

Configuration

There is no per-agent configuration for the tool budget in the current release. The 50 KB and 200 KB thresholds are fixed at compile time.

Why these limits?

50 KB of inline text is roughly 12,500 tokens (assuming about four bytes per token), which is already a substantial fraction of most model context windows. Results beyond this point have diminishing returns for the agent while linearly increasing prompt cost. The 200 KB hard cap prevents a single runaway tool call from consuming an entire context window.


[context_compression]

Context compression automatically reduces the size of an agent's context window when token usage grows too large, preventing hard context-limit errors without requiring manual session management.

How it works

When an LLM call is about to be dispatched, the ContextEngine measures the estimated token count of the assembled context (system prompt + session history + tool schemas). If usage exceeds 80% of the model's context window, the engine triggers a compression pass before sending the request.

Compression happens automatically — there is no configuration required to enable it. The [context_compression] section exists for future tunability; currently all thresholds are fixed internally.

Three-layer protection

| Layer | Trigger | Mechanism |
| --- | --- | --- |
| LLM summarization | ≥ 80% of context window | An internal LLM call condenses the oldest portion of the session history into a compact summary message. The summary is injected back into the history in place of the original turns. |
| Hard truncation | Still over limit after summarization | Oldest non-system messages are removed outright until the context fits. Applied only when the summary itself is too large to help. |
| Context guard | Final safety backstop | A hard-coded ceiling prevents the constructed prompt from ever exceeding the model's declared maximum. Excess tokens are truncated with a warning logged at ERROR level. |
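
The first two layers can be pictured with the following sketch. It is a simplified model under several assumptions: messages are (role, text) pairs, "the oldest portion" is taken to be the oldest half of the non-system history, system messages are kept at the front, and the token counter and summarizer are passed in as plain callables. The name fit_context is hypothetical, and the context guard is only noted in a comment.

```python
def fit_context(messages, window, count_tokens, summarize, threshold=0.8):
    """Shrink a list of (role, text) messages until it fits the window."""

    def total(msgs):
        return sum(count_tokens(text) for _, text in msgs)

    if total(messages) < threshold * window:
        return messages  # below the 80% trigger: nothing to do

    system = [m for m in messages if m[0] == "system"]
    history = [m for m in messages if m[0] != "system"]

    # Layer 1: replace the oldest half of the history with one summary turn.
    half = len(history) // 2
    if half:
        history = [("assistant", summarize(history[:half]))] + history[half:]

    # Layer 2: drop the oldest non-system messages until the context fits.
    while history and total(system + history) > window:
        history.pop(0)

    # Layer 3 (context guard) would truncate whatever still exceeds the window.
    return system + history


msgs = [("system", "sys"), ("user", "a" * 40), ("assistant", "b" * 40), ("user", "c" * 10)]
compact = fit_context(msgs, window=100, count_tokens=len, summarize=lambda old: "S")
print(compact)
```

Here len stands in for a tokenizer and the lambda stands in for the internal LLM call; in practice both would be considerably more involved.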

Summary retention and iterative refinement

Summary messages are stored in the session history as regular assistant turns with a special internal tag. On subsequent compression passes, an existing summary is included in the material to be re-summarized rather than preserved verbatim. This means long-running sessions undergo iterative refinement — older summaries are folded into newer, more compact ones over time.

The compactor system prompt instructs the LLM to write summaries in the same language the user was using in the conversation: a Chinese conversation produces a Chinese summary, and an Arabic conversation produces an Arabic summary. The instruction applies to both the single-pass summarize_messages path and the merge step of summarize_in_chunks, so chunked compaction is covered too and long non-English sessions do not receive unexpected English-language insertions.

Pluggable ContextEngine

The compression algorithm is implemented behind a ContextEngine trait. The current implementation uses an LLM-based summarizer. Future versions of LibreFang will allow configuring alternative compression algorithms — including DAG-based compression and LCM (Latent Context Modeling) — once those implementations stabilize.

No configuration needed

Context compression is always active for all agents. There are no knobs to enable, disable, or tune in config.toml today. The 80% threshold, hard-truncation fallback, and context guard ceiling are all enforced automatically by the runtime.


Cron Scheduler

Per-agent scheduled jobs (interval, one-shot, or 5-field cron expressions) trigger agent turns, system events, or workflow runs. The scheduler supports multi-destination delivery with failure isolation, pre-scripts that inject data into the LLM prompt, and a silent marker for runtime delivery suppression.

See Cron Scheduler for the full schema, fan-out target variants (channel / webhook / local file / email), SSRF protection on webhook URLs, the <home_dir>/scripts/ path allowlist for pre-scripts, and the wake-gate vs pre-script trade-off.


Observability — Tempo + business spans

LibreFang ships an opt-in observability stack you can bring up with one docker compose invocation. It captures HTTP requests and LibreFang's own work — agent turns, tool calls, channel sends, cron fires — as searchable Tempo traces.

Stack components (all bundled in-binary; no external download required):

  • librefang-otel-collector — OTLP/gRPC collector on :4317. The daemon exports spans here.
  • librefang-tempo — single-binary Grafana Tempo with local-filesystem storage and 24 h retention. Receiver on internal :4317, query API on :3200.
  • librefang-grafana — pre-provisioned Grafana with the Tempo data source wired.

Auto-start is opt-in. The observability stack is not brought up automatically with librefang start — set [telemetry] auto_start = true in config.toml if you want the daemon to manage the lifecycle for you. Otherwise control it explicitly with librefang observability up / down. The bundled containers run under per-home_dir Docker labels so multiple LibreFang installs on the same host don't fight for the same names; teardown uses RAII-style cleanup that reaps containers even on abnormal daemon exit.

Bring it up:

librefang observability up      # starts the bundled compose stack
librefang observability status  # health-checks each container
librefang observability down    # tears it down

Configure the daemon to export there:

[telemetry]
otlp_endpoint = "http://localhost:4317"
service_name = "librefang"
sample_rate = 1.0           # 0.0–1.0; lower for high-volume prod
prometheus_enabled = true   # also expose /api/metrics

Trace search: open Grafana at http://localhost:3000, switch Explore to the Tempo data source, and search by trace ID, span name, or attribute. Business-level spans include agent.turn, tool.call, channel.send, cron.fire, and provider.request — each carrying agent id, channel name, model, and outcome as searchable attributes.

cache_hit_ratio metric: for every agent.turn span the runtime computes

cache_hit_ratio = cache_read / (cache_read + cache_creation)   # in [0.0, 1.0]

Some(1.0) is a perfect prefix-cache hit; Some(0.0) is a cold start where caching was active but nothing was hit; None means caching wasn't active at all for the run. Surfaces in trajectory exports (metadata.cache_hit_ratio) and on the Grafana per-agent panel.
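
The formula and its three outcomes can be checked with a small sketch. Python's None stands in for the runtime's None and a float for Some(x); treating "both counters are zero" as the signal that caching was not active is an assumption made for this example.

```python
from typing import Optional


def cache_hit_ratio(cache_read: int, cache_creation: int) -> Optional[float]:
    """Compute cache_read / (cache_read + cache_creation), or None if unused."""
    total = cache_read + cache_creation
    if total == 0:
        return None  # assumed to mean caching was not active for the run
    return cache_read / total


print(cache_hit_ratio(900, 100))
print(cache_hit_ratio(0, 500))
print(cache_hit_ratio(0, 0))
```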