
AI Broker

A lightweight LLM request broker with an OpenAI-compatible REST API that routes requests across multiple LLM providers using cost-aware strategies. All communication is via Unix Domain Sockets by default; the only TCP listener is the optional one used for cascade setups.

Features

  • OpenAI-Compatible API — Drop-in replacement for OpenAI clients (via Unix socket)
  • OpenRouter API Compliance — Drop-in replacement for https://openrouter.ai/api/v1; any OpenRouter client works by changing the base URL. See docs/api.md.
  • Sub-Provider Selection — Pin or constrain the upstream sub-provider via the standard OpenRouter provider request field (order, only, ignore, allow_fallbacks, sort, max_price, …)
  • Live OpenRouter Model Catalog — Hourly refresh of the OpenRouter /models and per-model /endpoints lists, merged with the local YAML registry
  • Auxiliary Endpoints — /v1/credits, /v1/key, /v1/generation, /v1/models/{author}/{slug}/endpoints, /v1/completions
  • Multi-Provider Support — OpenAI, OpenRouter, Groq, SambaNova
  • Smart Routing — Automatic model selection based on cost or quality
  • Cost Tracking — Per-request cost calculation and tracking
  • Request Tracking — Detailed per-IP request tracking with timestamps and durations
  • Streaming Support — Real-time streaming responses via SSE, including delta.reasoning and the OpenRouter terminal usage chunk
  • MCP Broker — Aggregate tools from multiple MCP (Model Context Protocol) servers
  • Rate Limiting — Per-IP rate limiting with configurable limits
  • Audio APIs — Text-to-speech and speech-to-text support (Groq, SambaNova, OpenAI)
  • Config-Based Audio Models — STT/TTS models defined in modelsconfig.yml with automatic fallback
  • Embeddings — Vector embedding generation
  • Many Chat Models — Latest Claude 4.x, Gemini 3, GPT-5.2, o3-mini, Grok 4.1, Kimi K2.5 and more
  • Persistent Billing — SQLite-based request logging for billing and analytics
  • API Key Support — Optional API key authentication system
  • Unix Socket Architecture — All services communicate over Unix Domain Sockets; no open TCP ports (optional TCP listener for cascade — see below)
  • Cascade / Multi-Broker — Run multiple brokers and chain them: a child broker forwards requests to a "mother" broker over the mother's TCP listener. Mother entries and provider priorities are persisted in hero_db and managed from the admin UI. See Cascade — multi-broker setup.

OpenRouter compliance

The REST surface is wire-compatible with the OpenRouter API. That includes the full request schema (provider, reasoning, transforms, route, models fallback list, response_format), the response shape (top-level provider, system_fingerprint, extended usage with cost, cached_tokens, reasoning_tokens), and the auxiliary REST endpoints (/v1/completions, /v1/credits, /v1/key, /v1/generation, /v1/models/{author}/{slug}/endpoints).
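
As a rough illustration of that request schema, the sketch below builds a chat-completions body that uses the `provider` field and a `models` fallback list. The field names come from the paragraph above; the helper name `build_chat_request` and the concrete values (provider name, model slugs) are placeholders chosen for the example, not a statement of what a given broker has configured.

```python
import json


def build_chat_request(model, user_text, provider_order=None, fallback_models=None):
    """Build an OpenRouter-style chat body using only fields named in this README."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    if provider_order:
        # Pin the upstream sub-provider order and disable silent fallbacks.
        body["provider"] = {"order": list(provider_order), "allow_fallbacks": False}
    if fallback_models:
        # `models` is the fallback list of models to try.
        body["models"] = list(fallback_models)
    return body


# Placeholder values, chosen only for illustration.
request = build_chat_request(
    "openai/gpt-4o",
    "Hello!",
    provider_order=["openai"],
    fallback_models=["openai/gpt-4o-mini"],
)
payload = json.dumps(request)
```

The resulting `payload` is what would go in the `-d` argument of the curl calls shown under API Reference.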

Attribution headers HTTP-Referer and X-Title are forwarded when the caller supplies them, and fall back to the OPENROUTER_HTTP_REFERER / OPENROUTER_X_TITLE env vars otherwise.
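
The fallback rule can be pictured with a small sketch. This is not the broker's Rust code, only a restatement of the rule in Python; the function name `resolve_attribution` is hypothetical, and the case-insensitive header lookup is an assumption based on how HTTP header names normally behave.

```python
def resolve_attribution(request_headers, env):
    """Return the HTTP-Referer / X-Title values to send upstream."""
    # Assumption: header names are matched case-insensitively.
    lowered = {k.lower(): v for k, v in request_headers.items()}
    pairs = (
        ("HTTP-Referer", "OPENROUTER_HTTP_REFERER"),
        ("X-Title", "OPENROUTER_X_TITLE"),
    )
    resolved = {}
    for header, env_var in pairs:
        # The caller-supplied header wins; the env var is only a fallback.
        value = lowered.get(header.lower()) or env.get(env_var)
        if value:
            resolved[header] = value
    return resolved
```

With `X-Title` supplied by the caller and only the referer configured in the environment, the result mixes both sources; with neither present, no attribution headers are sent.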

When the resolved backend is OpenRouter, the broker returns the OpenRouter response shape verbatim. For non-OpenRouter backends the broker still adds the legacy x_aibroker envelope so existing internal callers keep working unchanged. See docs/api.md for the full reference.

Project Structure

hero_aibroker/
├── crates/
│   ├── hero_aibroker/           # CLI (chat, models, tools, health)
│   ├── hero_aibroker_lib/       # Core business logic (shared library)
│   ├── hero_aibroker_sdk/       # Generated OpenRPC client + types
│   ├── hero_aibroker_server/    # Server binary (two Unix sockets)
│   ├── hero_aibroker_admin/     # Admin dashboard binary (Unix socket)
│   ├── hero_aibroker_examples/  # SDK examples and integration tests
│   ├── hero_aibroker_services/  # Multi-MCP broker binary
│   └── mcp/
│       ├── mcp_common/          # Shared MCP utilities
│       ├── mcp_exa/             # Exa semantic search
│       ├── mcp_hero/            # Hero MCP integration
│       ├── mcp_ping/            # Ping test server
│       ├── mcp_scraperapi/      # ScraperAPI web scraping
│       ├── mcp_scrapfly/        # Scrapfly web scraping
│       ├── mcp_serpapi/         # SerpAPI web search
│       └── mcp_serper/          # Serper web search
├── modelsconfig.yml             # Model definitions and pricing
└── mcp_servers.example.json     # MCP server configuration template

Dependency Graph

hero_aibroker_lib (core logic)
         ↑
hero_aibroker_sdk (types, protocol, RPC client)
    ↑         ↑          ↑
    |         |          |
 server     CLI        admin

Architecture

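All services bind Unix Domain Sockets under ~/hero/var/sockets/. By default there are no TCP listeners; hero_aibroker_server opens one only when started with --address and --port (see Bind flags).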

┌──────────────────────────────────────────────────────────────────┐
│  CLI (hero_aibroker chat/models/tools/health)                    │
│  connects via: ~/hero/var/sockets/hero_aibroker/rpc.sock         │
└──────────────────────────────────┬───────────────────────────────┘
                                   │ JSON-RPC
┌──────────────────────────────────▼───────────────────────────────┐
│  hero_aibroker_server                                            │
│  ├── JSON-RPC admin API  → ~/hero/var/sockets/hero_aibroker/     │
│  │                            rpc.sock                           │
│  └── REST (OpenAI-compat) → ~/hero/var/sockets/hero_aibroker/    │
│                                 rest.sock                        │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                    Service Layer                          │   │
│  │  (Routing logic, model selection, cost calculation)      │   │
│  └──────────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                    Provider Layer                         │   │
│  │  (OpenAI, Groq, SambaNova, OpenRouter adapters)          │   │
│  └──────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────┐
│  hero_aibroker_admin (admin dashboard)                           │
│  binds: ~/hero/var/sockets/hero_aibroker/admin.sock              │
│  proxies JSON-RPC requests to hero_aibroker_server               │
└──────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────┐
│  hero_aibroker_services (MCP broker)                             │
│  binds multiple sockets for aggregated MCP services              │
└──────────────────────────────────────────────────────────────────┘

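All services are registered with and managed by hero_proc via nushell scripts. Use service aibroker start, stop, and status to manage the service lifecycle. On start, --update pulls the newest source and rebuilds, and --reset wipes the hero_db aibroker_config database and deletes apikeys.db before starting.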

Quick Start

Prerequisites

  • Rust 1.70 or later
  • hero_proc installed and running
  • At least one LLM provider API key

Environment Variables

API keys are managed via hero_proc secrets — see hero_proc secrets for details. No manual env file sourcing required.

LLM provider keys (at least one required):

| Variable | Description |
| --- | --- |
| GROQ_API_KEY / GROQ_API_KEYS | Groq API key(s) |
| OPENROUTER_API_KEY / OPENROUTER_API_KEYS | OpenRouter API key(s) |
| SAMBANOVA_API_KEY / SAMBANOVA_API_KEYS | SambaNova API key(s) |
| OPENAI_API_KEY / OPENAI_API_KEYS | OpenAI API key(s) |

Both singular and plural variants are accepted. Use comma-separated values for multiple keys per provider — the broker creates separate provider instances and distributes requests across them for higher throughput, load distribution, and automatic failover.
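
A minimal sketch of that key handling, in Python rather than the broker's own Rust. The helper name `load_api_keys` is hypothetical, the key values are dummies, and round-robin is an assumption: the text above says requests are distributed across instances but does not name the strategy.

```python
import itertools


def load_api_keys(env, provider):
    """Collect keys from PROVIDER_API_KEY and PROVIDER_API_KEYS (comma-separated)."""
    keys = []
    for name in (f"{provider}_API_KEY", f"{provider}_API_KEYS"):
        for key in env.get(name, "").split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


env = {"GROQ_API_KEYS": "key-a, key-b,key-c"}  # dummy values
keys = load_api_keys(env, "GROQ")

# Assumption: one provider instance per key, visited in simple round-robin order.
rotation = itertools.cycle(keys)
first_four = [next(rotation) for _ in range(4)]
```

After three requests the rotation wraps around to the first key again.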

Web/search tool keys (optional, used by MCP servers):

| Variable | Description |
| --- | --- |
| SERPAPI_API_KEYS | SerpAPI web search |
| SERPER_API_KEYS | Serper web search |
| EXA_API_KEYS | Exa semantic search |
| SCRAPERAPI_API_KEYS | ScraperAPI web scraping |
| SCRAPFLY_API_KEYS | Scrapfly web scraping |

Service configuration:

| Variable | Default | Description |
| --- | --- | --- |
| ROUTING_STRATEGY | cheapest | cheapest or best |
| MCP_CONFIG_PATH | | Path to MCP server config JSON |
| MODELS_CONFIG_PATH | | Path to model config YAML |
| ADMIN_TOKEN | | Simple admin auth token |
| HERO_SECRET | | Hero Auth JWT secret |

Run

service aibroker start --update --reset

Stop

service aibroker stop

Status

service aibroker status

API Reference

All REST endpoints are served on ~/hero/var/sockets/hero_aibroker/rest.sock. Use curl --unix-socket to reach them.

List Models

curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
  http://localhost/v1/models
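
The same call can be made without curl. Below is a minimal sketch using only the Python standard library: `http.client` does not speak Unix sockets out of the box, so the connection class overrides `connect()`. The class and function names are hypothetical; the socket path and the `/v1/models` route are the ones documented above.

```python
import http.client
import json
import os
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that dials a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=30):
        # The host is only used for the Host header; "localhost" mirrors the curl calls.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def list_models(socket_path):
    """GET /v1/models and return the decoded JSON body."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/v1/models")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()


REST_SOCKET = os.path.expanduser("~/hero/var/sockets/hero_aibroker/rest.sock")
```

Calling `list_models(REST_SOCKET)` requires a running broker; the shape of the returned JSON is not shown here because this README does not specify it.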

Chat Completions

curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
  http://localhost/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
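
With "stream": true the response arrives as Server-Sent Events. The sketch below shows one way a client might fold those events into text, reasoning, and the final usage block. It assumes the usual OpenAI-style framing (`data: {json}` lines ending with `data: [DONE]`); the sample events are hand-written for illustration and are not captured broker output.

```python
import json


def collect_stream(lines):
    """Fold SSE `data:` lines into (content, reasoning, usage)."""
    content, reasoning, usage = [], [], None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        # The terminal usage chunk may carry no choices at all.
        usage = chunk.get("usage") or usage
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            content.append(delta.get("content") or "")
            reasoning.append(delta.get("reasoning") or "")
    return "".join(content), "".join(reasoning), usage


# Hand-written sample events, not captured broker output.
sample = [
    'data: {"choices":[{"delta":{"reasoning":"think "}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: {"choices":[],"usage":{"total_tokens":7}}',
    "data: [DONE]",
]
text, thoughts, usage = collect_stream(sample)
```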

Text-to-Speech

curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
  http://localhost/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello, world!", "voice": "alloy"}' \
  --output speech.mp3

Available TTS models: tts-1, tts-1-hd (requires OPENAI_API_KEY)

Speech-to-Text

curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
  http://localhost/v1/audio/transcriptions \
  -F "file=@audio.mp3" \
  -F "model=whisper-1"

Available STT models:

  • whisper-1 — multi-provider (Groq → SambaNova → OpenAI fallback chain)
  • whisper-large-v3 — direct Groq/SambaNova access
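
The whisper-1 fallback chain follows a common pattern: try each provider in order and return the first success. The sketch below illustrates that pattern only; the function names and the stand-in provider callables are hypothetical, and the broker's real error handling is not documented here.

```python
def transcribe_with_fallback(providers, audio_bytes):
    """Try each (name, callable) pair in order; return the first success."""
    errors = []
    for name, transcribe in providers:
        try:
            return name, transcribe(audio_bytes)
        except Exception as exc:  # a real implementation would narrow this
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def failing(_audio):
    raise ConnectionError("unavailable")


def working(_audio):
    return "hello world"


# Stand-in callables in the documented order: Groq, SambaNova, OpenAI.
chain = [("groq", failing), ("sambanova", working), ("openai", working)]
used, transcript = transcribe_with_fallback(chain, b"\x00")
```

Here the first provider fails, so the second one answers and the third is never called.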

Embeddings

curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
  http://localhost/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-3-small", "input": "Hello, world!"}'

JSON-RPC Admin API

The admin API is served on ~/hero/var/sockets/hero_aibroker/rpc.sock:

# Health check
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
  http://localhost/rpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"health","params":{},"id":1}'

# List models
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
  http://localhost/rpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"models.list","params":{},"id":2}'

# List MCP tools
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
  http://localhost/rpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"mcp.list_tools","params":{},"id":3}'
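
For scripted access, the JSON-RPC envelope used in these calls is easy to wrap. The helpers below are a sketch with hypothetical names; they only build the request body and unpack a reply, and the sample reply is hand-written (the real result shapes are defined in the server's openrpc.json).

```python
import itertools
import json

_ids = itertools.count(1)


def rpc_request(method, params=None):
    """Encode a JSON-RPC 2.0 request body for POST /rpc."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params or {}, "id": next(_ids)}
    ).encode()


def rpc_result(raw_body):
    """Return `result`, or raise if the reply carries a JSON-RPC `error`."""
    reply = json.loads(raw_body)
    if "error" in reply:
        err = reply["error"]
        raise RuntimeError(f"RPC error {err.get('code')}: {err.get('message')}")
    return reply["result"]


body = rpc_request("models.list")
# Hand-written reply for illustration only.
result = rpc_result(b'{"jsonrpc":"2.0","id":1,"result":{"ok":true}}')
```

The encoded `body` can be sent to http://localhost/rpc over rpc.sock with any HTTP client that supports Unix sockets.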

Billing & Usage

# View all IP usage and costs
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
  http://localhost/billing/usage

# View specific IP usage
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
  http://localhost/billing/usage/127.0.0.1

All requests are persisted to SQLite with IP address, model, token usage, cost in USD, timestamps, and success/error status.

# Export to CSV
sqlite3 -header -csv requests.db "SELECT * FROM request_logs;" > billing.csv
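
The same export can be scripted. This sketch uses Python's standard `sqlite3` and `csv` modules and reads column names from the live schema, so nothing about the table layout is hard-coded. The demo database and its three columns are invented for the example; only the table name `request_logs` comes from the command above.

```python
import csv
import io
import os
import sqlite3
import tempfile


def export_request_logs(db_path, out):
    """Write every row of `request_logs` to `out` as CSV with a header row."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute("SELECT * FROM request_logs")
        writer = csv.writer(out)
        writer.writerow([col[0] for col in cursor.description])
        writer.writerows(cursor)
    finally:
        conn.close()


# Throwaway database with hypothetical columns, for demonstration only.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
demo = sqlite3.connect(demo_path)
demo.execute("CREATE TABLE request_logs (ip TEXT, model TEXT, cost_usd REAL)")
demo.execute("INSERT INTO request_logs VALUES ('127.0.0.1', 'gpt-4o', 0.0075)")
demo.commit()
demo.close()

buffer = io.StringIO()
export_request_logs(demo_path, buffer)
csv_text = buffer.getvalue()
```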

Cascade — multi-broker setup

A broker can act as a client of another broker. This lets you run multiple hero_aibroker instances and chain them, e.g. one broker per host that forwards to a single root broker holding the real upstream-provider keys.

┌──────────────────┐  TCP /v1/*        ┌────────────────────┐
│ hero_aibroker    │ ───────────────►  │ hero_aibroker      │  HTTPS
│  (child / user)  │  127.0.0.1:33850  │  (mother / root)   │ ───────► OpenRouter,
│  UDS-only        │                   │  TCP + UDS         │           OpenAI, Groq, …
│  ai_broker_mother│                   │  --mother banner   │
│  ↑ provider      │                   │                    │
└──────────────────┘                   └────────────────────┘

The child registers the mother as a provider named ai_broker_mother (ai_broker_mother2 and so on when several mothers are registered). Chat requests routed to that provider are forwarded over /v1/chat/completions to the mother's TCP address, and the mother then makes the real upstream call.

Bind flags

hero_aibroker_server always opens its Unix sockets under $HERO_SOCKET_DIR/hero_aibroker/. Pass these flags to also expose a TCP listener (REST /v1/* + admin /rpc on a single port):

| Flag | Description |
| --- | --- |
| --address <ip> | Bind on this address. 127.0.0.1, :: (any), or a concrete mycelium IPv6 address. Required with --port. |
| --port <num> | TCP port for the combined listener. Required with --address. |
| --mother | Self-identify as the cascade root. Surfaced via info RPC; the admin UI renders a banner across the top of every page. Pure self-identification — does not change routing. |

# Mother — bind on localhost, mark as root
hero_aibroker_server --address 127.0.0.1 --port 33850 --mother

# Mother — bind on mycelium so other hosts can reach it
hero_aibroker_server --address fc00::1 --port 33850 --mother

# Child — UDS-only is fine; cascade target is registered at runtime
hero_aibroker_server
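
The pairing rule for --address and --port can be written down as a small check. This is an illustrative sketch, not the server's argument parser: the function name is hypothetical, and the accepted port range (1 to 65535) is an assumption.

```python
import ipaddress


def validate_bind(address=None, port=None):
    """Return (ip, port) for the TCP listener, or None for UDS-only."""
    if address is None and port is None:
        return None  # no TCP listener requested
    if address is None or port is None:
        raise ValueError("--address and --port must be given together")
    ip = ipaddress.ip_address(address)  # accepts IPv4 and IPv6 literals
    if not 1 <= port <= 65535:  # assumed range
        raise ValueError(f"port out of range: {port}")
    return ip, port
```

`validate_bind()` corresponds to the child example above (UDS-only), and `validate_bind("127.0.0.1", 33850)` to the localhost mother.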

Persistence: hero_db

Cascade configuration (the mother list) and per-provider priority overrides are persisted in hero_db under database aibroker_config. hero_db is a hard dependency of the broker: startup fails fast if its socket is unreachable.

Schema (all writes go through admin RPC, not direct hero_db calls):

| Key | Kind | Fields |
| --- | --- | --- |
| mothers:ids | set | mother ids (ai_broker_mother, ai_broker_mother2, …) |
| mother:<id> | hash | address, port, label, priority, enabled |
| provider_priority | hash | <provider_name> → priority integer (sparse, lower wins) |

The default hero_db management socket is ~/hero/var/sockets/hero_db/rpc.sock. Override via the HERO_AIBROKER_DB_SOCKET env var when the broker runs with an isolated HERO_SOCKET_DIR but should still talk to the operator's hero_db (see the cascade integration test for an example).
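
A sketch of that lookup order, with one labeled assumption: when the override is unset, the path is taken to follow HERO_SOCKET_DIR before falling back to the documented default. That middle step is inferred from the sentence above and is not confirmed by this README. The function name is hypothetical.

```python
import os


def hero_db_socket(env):
    """Resolve the hero_db management socket path from an environment mapping."""
    override = env.get("HERO_AIBROKER_DB_SOCKET")
    if override:
        return override
    # Assumption: without the override, the path follows HERO_SOCKET_DIR.
    socket_dir = env.get("HERO_SOCKET_DIR") or os.path.expanduser("~/hero/var/sockets")
    return os.path.join(socket_dir, "hero_db", "rpc.sock")
```

Passing `os.environ` gives the path for the current process; passing a plain dict makes the rule easy to test in isolation.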

Priority list — sparse integers

Routing picks the backend with the lowest priority number for a given model (lower wins). Both the YAML registry's Backend.priority and admin overrides use the same "sparse integer" convention: prefer 1, 5, 10, 15, 20… so an operator can slot a new provider between two others without renumbering.

The effective priority is resolved in two layers, with the later layer taking precedence:

  1. YAML Backend.priority from modelsconfig.yml — baseline.
  2. Per-provider override stored in hero_db provider_priority hash — wins when set; clear it (set to null) to revert to YAML.
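
The two layers reduce to a short lookup. The sketch below is illustrative Python, not the broker's routing code: the function names and the sample backends are hypothetical, and tie-breaking between equal priorities is left unspecified because this README does not define it.

```python
def effective_priority(backend, overrides):
    """Override from hero_db wins; otherwise fall back to the YAML priority."""
    override = overrides.get(backend["provider"])
    return backend["priority"] if override is None else override


def pick_backend(backends, overrides):
    """Return the backend with the lowest effective priority number."""
    return min(backends, key=lambda b: effective_priority(b, overrides))


# Illustrative data using the sparse 1, 5, 10, ... convention.
backends = [
    {"provider": "openrouter", "priority": 1},
    {"provider": "groq", "priority": 5},
    {"provider": "ai_broker_mother", "priority": 10},
]
```

With no overrides the YAML order applies and openrouter is chosen. Setting an override of 7 for openrouter moves it behind groq, and clearing the override (null, here `None`) restores the YAML value.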

Edit priorities from the Providers tab in the admin UI: each row has a numeric input, blank means "use YAML". Mother brokers also appear in the Providers tab tagged cascade, so reordering the cascade vs. direct providers happens in one place.

Admin UI

Two tabs in the admin dashboard relate to the cascade:

  • Cascade — list/add/edit/remove upstream brokers. Address, port, optional label, priority, enabled toggle. Auto-assigns id on add.
  • Providers — adds a Priority column for all providers (direct + mother). Mother rows show a cascade badge.

When the broker was started with --mother, every page also shows a purple banner across the top: "This broker is the MOTHER (root of cascade)…". The banner reads info.is_mother and shows/hides automatically.

Admin RPC

All cascade operations go through the JSON-RPC /rpc endpoint (documented in crates/hero_aibroker_server/openrpc.json):

| Method | Description |
| --- | --- |
| mothers.list | Return every registered upstream broker. |
| mothers.add | {address, port, label?, priority?, enabled?} → returns auto-assigned id. |
| mothers.update | {id, ...patch} — patch any subset of fields. |
| mothers.remove | {id} — drop from hero_db + rebuild provider map. |
| priority.list | Return the override map (provider name → priority). |
| priority.set | {provider, priority} — set; pass priority: null to clear. |
| info | Now also returns is_mother (bool) and mother_count (int). |

Every mutation persists to hero_db and rebuilds the live provider map + chat service so changes apply without a restart.

Running two brokers via hero_proc — --split

The nushell service module service_aibroker.nu in hero_skills/tools/modules/services/ knows how to set up the cascade end-to-end:

# Single-broker default (unchanged)
service aibroker start

# Cascade: register both brokers; child auto-registers the mother
service aibroker start --split
service aibroker stop  --split

# --reset wipes hero_db `aibroker_config` (mothers + priority overrides)
# and deletes apikeys.db before starting
service aibroker start --reset

# --update pulls newest source via `forge merge` and rebuilds
service aibroker start --update

--split semantics:

  • The regular service hero_aibroker becomes the child — UDS-only, user-space.
  • A second hero_proc service hero_aibroker_root is registered for the mother. Same binary, started with --mother --address 127.0.0.1 --port 33850, in its own HERO_SOCKET_DIR (…/hero_aibroker_root/) so its UDS sockets don't collide with the child.
  • macOS: both run as the invoking user (one hero_proc supervises both).
  • Linux: the mother is registered under root's hero_proc (the script auto-sudos); the child stays in user space. This lets the mother hold the privileged credentials / network position while the child remains a normal-user process.
  • After both are up, the script issues a mothers.add RPC against the child pointing at 127.0.0.1:33850 (label split-mode-mother, priority 1).

The default port is 33850. Change it by editing SVX_MOTHER_PORT at the top of service_aibroker.nu.

Manually wiring two brokers

If you don't want to use the nu module, the same cascade is just three RPC calls. With the mother running on 127.0.0.1:33850 and the child on its UDS:

# Confirm the mother is up and self-identifies
curl -s http://127.0.0.1:33850/rpc \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"info"}' | jq '.result.is_mother'
# → true

# Register the mother on the child
curl -s --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
  http://localhost/rpc -H 'content-type: application/json' \
  -d '{
    "jsonrpc":"2.0","id":1,"method":"mothers.add",
    "params":{"address":"127.0.0.1","port":33850,"priority":1}
  }'
# → {"result":{"id":"ai_broker_mother","success":true}}

# Verify it sticks across restarts (the child reloads from hero_db on boot)
curl -s --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
  http://localhost/rpc -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"mothers.list"}' | jq

Integration test

crates/hero_aibroker_examples/tests/cascade.rs spawns two hero_aibroker_server child processes, registers the mother on the child via mothers.add, and asserts the cascade is live (info, mothers.list round-trip). The test skips silently when hero_db isn't running.

cargo test -p hero_aibroker_examples --test cascade -- --nocapture

The test uses HERO_AIBROKER_DB_SOCKET so each broker runs with an isolated HERO_SOCKET_DIR while sharing the operator's hero_db.

CLI Usage

The hero_aibroker binary is the interactive CLI. It connects via ~/hero/var/sockets/hero_aibroker/rpc.sock.

# Interactive chat
hero_aibroker chat --model gpt-4o

# Chat with the default auto-routing model
hero_aibroker chat

# List available models
hero_aibroker models

# List MCP tools
hero_aibroker tools

# Check server health
hero_aibroker health

CLI Options

Global options:

  • -m, --model <MODEL> — model to use for chat (default: auto)
  • --socket <PATH> — custom socket path (default: ~/hero/var/sockets/hero_aibroker/rpc.sock)

Chat sub-command options:

  • -m, --model <MODEL> — model to use (overrides global --model)

Service Management

service aibroker start   # register all services with hero_proc and start them
service aibroker stop    # stop all services via hero_proc
service aibroker status  # show service status

Model Configuration

Models are defined in modelsconfig.yml. The file controls display names, tiers, capabilities, context windows, and per-provider backends with pricing:

models:
  gpt-4o:
    display_name: "GPT-4o"
    tier: premium
    capabilities:
      - tool_calling
      - vision
    context_window: 128000
    backends:
      - provider: openrouter
        model_id: openai/gpt-4o
        priority: 1
        input_cost: 2.5    # USD per million tokens
        output_cost: 10.0

Set MODELS_CONFIG_PATH to point to your config file, or place modelsconfig.yml in the working directory.
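
Because input_cost and output_cost are expressed in USD per million tokens, the per-request cost is simple arithmetic. The sketch below shows that calculation for the gpt-4o backend above; it is a plain reading of the units and may omit adjustments the broker applies (for example cached or reasoning tokens). The function name is hypothetical.

```python
def request_cost(prompt_tokens, completion_tokens, input_cost, output_cost):
    """Cost in USD, with input_cost/output_cost given in USD per million tokens."""
    return (prompt_tokens * input_cost + completion_tokens * output_cost) / 1_000_000


# gpt-4o backend above: 2.5 USD in, 10.0 USD out per million tokens.
cost = request_cost(1_000, 500, input_cost=2.5, output_cost=10.0)
# 1000 * 2.5 + 500 * 10.0 = 7500; 7500 / 1_000_000 = 0.0075 USD
```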

Auto Model Selection

Use special model names for automatic selection:

| Model Name | Description |
| --- | --- |
| auto | Use the configured ROUTING_STRATEGY |
| autocheapest | Select the cheapest available model |
| autobest | Select the best premium model |
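
How the three names relate to ROUTING_STRATEGY can be summarized in a few lines. This sketch only maps a requested model name to a strategy name; how the broker then ranks models as "cheapest" or "best" is not covered. The function name is hypothetical, and the default of cheapest comes from the service configuration table.

```python
def resolve_strategy(model, env):
    """Map a requested model name to a routing strategy, or None for a concrete model."""
    if model == "autocheapest":
        return "cheapest"
    if model == "autobest":
        return "best"
    if model == "auto":
        # ROUTING_STRATEGY defaults to "cheapest" per the configuration table.
        return env.get("ROUTING_STRATEGY", "cheapest")
    return None
```

A concrete name such as gpt-4o returns `None`, meaning no automatic selection takes place.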

MCP Integration

The broker aggregates tools from multiple MCP (Model Context Protocol) servers managed by hero_aibroker_services. Configure servers in a JSON file pointed to by MCP_CONFIG_PATH (see mcp_servers.example.json):

{
  "mcpServers": [
    {
      "name": "serper",
      "command": "/path/to/mcp_serper",
      "args": [],
      "env": {}
    },
    {
      "name": "exa",
      "command": "/path/to/mcp_exa",
      "args": [],
      "env": {}
    }
  ]
}
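
Before pointing MCP_CONFIG_PATH at a file, it can help to sanity-check it. The sketch below parses the structure shown above and indexes servers by name. The function name is hypothetical, and rejecting duplicate names is an assumption of this example rather than documented broker behavior.

```python
import json


def load_mcp_servers(text):
    """Parse the config and return {name: (command, args, env)}."""
    config = json.loads(text)
    servers = {}
    for entry in config["mcpServers"]:
        name = entry["name"]
        if name in servers:
            # Assumption: names should be unique within one config file.
            raise ValueError(f"duplicate MCP server name: {name}")
        servers[name] = (entry["command"], entry.get("args", []), entry.get("env", {}))
    return servers


example = """
{"mcpServers": [
  {"name": "serper", "command": "/path/to/mcp_serper", "args": [], "env": {}},
  {"name": "exa", "command": "/path/to/mcp_exa", "args": [], "env": {}}
]}
"""
servers = load_mcp_servers(example)
```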

Included MCP Servers

All MCP binaries are built as part of the workspace and managed by hero_aibroker_services:

| Binary | Description | Required Key |
| --- | --- | --- |
| mcp_serper | Web search via Serper | SERPER_API_KEYS |
| mcp_serpapi | Web search via SerpAPI | SERPAPI_API_KEYS |
| mcp_exa | Semantic search via Exa | EXA_API_KEYS |
| mcp_scraperapi | Web scraping via ScraperAPI | SCRAPERAPI_API_KEYS |
| mcp_scrapfly | Web scraping via Scrapfly | SCRAPFLY_API_KEYS |
| mcp_ping | Ping/test server | |
| mcp_hero | Hero OS service discovery + LLM-driven Python code generation and execution | HERO_SECRET |

MCP REST Endpoints

| Endpoint | Description |
| --- | --- |
| GET /mcp/tools | List all aggregated tools |
| POST /mcp/tools/:name | Call a specific tool |
| GET /mcp/sse | SSE endpoint for MCP clients |

Development

Building

# Release build (all workspace crates)
cargo build --release

# Debug build
cargo build

# Build a specific crate
cargo build -p hero_aibroker_server
cargo build -p hero_aibroker

# Check (no codegen)
cargo check --all

# Format
cargo fmt --all

# Lint
cargo clippy --all -- -D warnings

Running Tests

# Run all tests
cargo test --all

# Run tests for a specific crate
cargo test -p hero_aibroker_lib

Logs

proc logs tail hero_aibroker_server
proc logs tail hero_aibroker_admin
proc logs tail hero_aibroker_services

Deployment

Install Binaries

service aibroker install --update --reset

License

MIT License