AI Broker
A lightweight LLM request broker with an OpenAI-compatible REST API that intelligently routes requests to multiple LLM providers with cost-aware strategies. All communication is via Unix Domain Sockets — no TCP ports.
Features
- OpenAI-Compatible API — Drop-in replacement for OpenAI clients (via Unix socket)
- OpenRouter API Compliance — Drop-in replacement for https://openrouter.ai/api/v1; any OpenRouter client works by changing the base URL. See docs/api.md.
- Sub-Provider Selection — Pin or constrain the upstream sub-provider via the standard OpenRouter provider request field (order, only, ignore, allow_fallbacks, sort, max_price, …)
- Live OpenRouter Model Catalog — Hourly refresh of the OpenRouter /models and per-model /endpoints lists, merged with the local YAML registry
- Auxiliary Endpoints — /v1/credits, /v1/key, /v1/generation, /v1/models/{author}/{slug}/endpoints, /v1/completions
- Multi-Provider Support — OpenAI, OpenRouter, Groq, SambaNova
- Smart Routing — Automatic model selection based on cost or quality
- Cost Tracking — Per-request cost calculation and tracking
- Request Tracking — Detailed per-IP request tracking with timestamps and durations
- Streaming Support — Real-time streaming responses via SSE, including delta.reasoning and the OpenRouter terminal usage chunk
- MCP Broker — Aggregate tools from multiple MCP (Model Context Protocol) servers
- Rate Limiting — Per-IP rate limiting with configurable limits
- Audio APIs — Text-to-speech and speech-to-text support (Groq, SambaNova, OpenAI)
- Config-Based Audio Models — STT/TTS models defined in modelsconfig.yml with automatic fallback
- Embeddings — Vector embedding generation
- Many Chat Models — Latest Claude 4.x, Gemini 3, GPT-5.2, o3-mini, Grok 4.1, Kimi K2.5 and more
- Persistent Billing — SQLite-based request logging for billing and analytics
- API Key Support — Optional API key authentication system
- Unix Socket Architecture — All services communicate over Unix Domain Sockets; no open TCP ports (optional TCP listener for cascade — see below)
- Cascade / Multi-Broker — Run multiple brokers and chain them: a child broker forwards to a "mother" broker via a TCP listener, with the mother list and provider priorities persisted in hero_db and editable from the admin UI. See Cascade — multi-broker setup.
OpenRouter compliance
The REST surface is wire-compatible with the
OpenRouter API.
That includes the full request schema (provider, reasoning,
transforms, route, models fallback list, response_format),
the response shape (top-level provider, system_fingerprint,
extended usage with cost, cached_tokens, reasoning_tokens),
and the auxiliary REST endpoints (/v1/completions, /v1/credits,
/v1/key, /v1/generation, /v1/models/{author}/{slug}/endpoints).
Attribution headers HTTP-Referer and X-Title are forwarded when the
caller supplies them, and fall back to the
OPENROUTER_HTTP_REFERER / OPENROUTER_X_TITLE env vars otherwise.
When the resolved backend is OpenRouter, the broker returns the
OpenRouter response shape verbatim. For non-OpenRouter backends the
broker still adds the legacy x_aibroker envelope so existing internal
callers keep working unchanged. See docs/api.md for the
full reference.
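The attribution-header fallback described above can be sketched as follows. This is an illustrative sketch, not the broker's actual code: the function names are hypothetical, and treating an empty caller value as "not supplied" is an assumption.

```rust
/// Resolve one attribution header: a non-empty caller-supplied value wins,
/// otherwise fall back to the configured default (e.g. read from the env).
/// Treating blank values as absent is an assumption of this sketch.
fn resolve_attribution(caller: Option<&str>, fallback: Option<&str>) -> Option<String> {
    caller
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .or_else(|| fallback.map(str::trim).filter(|v| !v.is_empty()))
        .map(str::to_owned)
}

/// Build the (name, value) pairs to forward upstream, using the
/// OPENROUTER_HTTP_REFERER / OPENROUTER_X_TITLE env vars as fallbacks.
fn attribution_headers(
    caller_referer: Option<&str>,
    caller_title: Option<&str>,
) -> Vec<(&'static str, String)> {
    let env_referer = std::env::var("OPENROUTER_HTTP_REFERER").ok();
    let env_title = std::env::var("OPENROUTER_X_TITLE").ok();
    let mut headers = Vec::new();
    if let Some(v) = resolve_attribution(caller_referer, env_referer.as_deref()) {
        headers.push(("HTTP-Referer", v));
    }
    if let Some(v) = resolve_attribution(caller_title, env_title.as_deref()) {
        headers.push(("X-Title", v));
    }
    headers
}
```

A header is omitted entirely when neither the caller nor the environment provides a value.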
Project Structure
hero_aibroker/
├── crates/
│ ├── hero_aibroker/ # CLI (chat, models, tools, health)
│ ├── hero_aibroker_lib/ # Core business logic (shared library)
│ ├── hero_aibroker_sdk/ # Generated OpenRPC client + types
│ ├── hero_aibroker_server/ # Server binary (two Unix sockets)
│ ├── hero_aibroker_admin/ # Admin dashboard binary (Unix socket)
│ ├── hero_aibroker_examples/ # SDK examples and integration tests
│ ├── hero_aibroker_services/ # Multi-MCP broker binary
│ └── mcp/
│ ├── mcp_common/ # Shared MCP utilities
│ ├── mcp_exa/ # Exa semantic search
│ ├── mcp_hero/ # Hero MCP integration
│ ├── mcp_ping/ # Ping test server
│ ├── mcp_scraperapi/ # ScraperAPI web scraping
│ ├── mcp_scrapfly/ # Scrapfly web scraping
│ ├── mcp_serpapi/ # SerpAPI web search
│ └── mcp_serper/ # Serper web search
├── modelsconfig.yml # Model definitions and pricing
└── mcp_servers.example.json # MCP server configuration template
Dependency Graph
hero_aibroker_lib (core logic)
↑
hero_aibroker_sdk (types, protocol, RPC client)
↑ ↑ ↑
| | |
server CLI UI
Architecture
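All services bind Unix Domain Sockets under ~/hero/var/sockets/. There are no TCP listeners by default; hero_aibroker_server can optionally expose one for cascade setups (see the Cascade section).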
┌──────────────────────────────────────────────────────────────────┐
│ CLI (hero_aibroker chat/models/tools/health) │
│ connects via: ~/hero/var/sockets/hero_aibroker/rpc.sock │
└──────────────────────────────────┬───────────────────────────────┘
│ JSON-RPC
┌──────────────────────────────────▼───────────────────────────────┐
│ hero_aibroker_server │
│ ├── JSON-RPC admin API → ~/hero/var/sockets/hero_aibroker/ │
│ │ rpc.sock │
│ └── REST (OpenAI-compat) → ~/hero/var/sockets/hero_aibroker/ │
│ rest.sock │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Service Layer │ │
│ │ (Routing logic, model selection, cost calculation) │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Provider Layer │ │
│ │ (OpenAI, Groq, SambaNova, OpenRouter adapters) │ │
│ └──────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ hero_aibroker_admin (admin dashboard) │
│ binds: ~/hero/var/sockets/hero_aibroker/admin.sock │
│ proxies JSON-RPC requests to hero_aibroker_server │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ hero_aibroker_services (MCP broker) │
│ binds multiple sockets for aggregated MCP services │
└──────────────────────────────────────────────────────────────────┘
All services are registered with and managed by hero_proc via nushell scripts. Use service aibroker start --update --reset to manage the service lifecycle.
Quick Start
Prerequisites
- Rust 1.70 or later
- hero_proc installed and running
- At least one LLM provider API key
Environment Variables
API keys are managed via hero_proc secrets — see hero_proc secrets for details. No manual env file sourcing required.
LLM provider keys (at least one required):
| Variable | Description |
|---|---|
| GROQ_API_KEY / GROQ_API_KEYS | Groq API key(s) |
| OPENROUTER_API_KEY / OPENROUTER_API_KEYS | OpenRouter API key(s) |
| SAMBANOVA_API_KEY / SAMBANOVA_API_KEYS | SambaNova API key(s) |
| OPENAI_API_KEY / OPENAI_API_KEYS | OpenAI API key(s) |
Both singular and plural variants are accepted. Use comma-separated values for multiple keys per provider — the broker creates separate provider instances and distributes requests across them for higher throughput, load distribution, and automatic failover.
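As a rough illustration of the multi-key behaviour, the sketch below merges the singular and plural variables and hands keys out in turn. The names `parse_keys` and `KeyRing` are hypothetical, and round-robin is an assumed distribution policy; the text only states that requests are distributed across per-key provider instances.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Collect keys from the singular and plural variables, splitting on commas,
/// trimming whitespace, and dropping empty entries and duplicates.
fn parse_keys(singular: Option<&str>, plural: Option<&str>) -> Vec<String> {
    let mut keys: Vec<String> = Vec::new();
    for raw in [singular, plural].iter().flatten() {
        for key in raw.split(',').map(str::trim).filter(|k| !k.is_empty()) {
            if !keys.iter().any(|k| k == key) {
                keys.push(key.to_owned());
            }
        }
    }
    keys
}

/// Hands out keys in round-robin order (assumed policy, see lead-in).
struct KeyRing {
    keys: Vec<String>,
    next: AtomicUsize,
}

impl KeyRing {
    /// Returns `None` when no key is configured for the provider.
    fn new(keys: Vec<String>) -> Option<Self> {
        if keys.is_empty() {
            None
        } else {
            Some(Self { keys, next: AtomicUsize::new(0) })
        }
    }

    /// Return the next key, wrapping around at the end of the list.
    fn next_key(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.keys.len();
        &self.keys[i]
    }
}
```

For example, `GROQ_API_KEY=k1` combined with `GROQ_API_KEYS="k2, k3"` would yield three keys that are used in rotation.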
Web/search tool keys (optional, used by MCP servers):
| Variable | Description |
|---|---|
| SERPAPI_API_KEYS | SerpAPI web search |
| SERPER_API_KEYS | Serper web search |
| EXA_API_KEYS | Exa semantic search |
| SCRAPERAPI_API_KEYS | ScraperAPI web scraping |
| SCRAPFLY_API_KEYS | Scrapfly web scraping |
Service configuration:
| Variable | Default | Description |
|---|---|---|
| ROUTING_STRATEGY | cheapest | cheapest or best |
| MCP_CONFIG_PATH | — | Path to MCP server config JSON |
| MODELS_CONFIG_PATH | — | Path to model config YAML |
| ADMIN_TOKEN | — | Simple admin auth token |
| HERO_SECRET | — | Hero Auth JWT secret |
Run
service aibroker start --update --reset
Stop
service aibroker stop
Status
service aibroker status
API Reference
All REST endpoints are served on ~/hero/var/sockets/hero_aibroker/rest.sock. Use curl --unix-socket to reach them.
List Models
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
http://localhost/v1/models
Chat Completions
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
http://localhost/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}],
"stream": true
}'
Text-to-Speech
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
http://localhost/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{"model": "tts-1", "input": "Hello, world!", "voice": "alloy"}' \
--output speech.mp3
Available TTS models: tts-1, tts-1-hd (requires OPENAI_API_KEY)
Speech-to-Text
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
http://localhost/v1/audio/transcriptions \
-F "file=@audio.mp3" \
-F "model=whisper-1"
Available STT models:
- whisper-1 — multi-provider (Groq → SambaNova → OpenAI fallback chain)
- whisper-large-v3 — direct Groq/SambaNova access
Embeddings
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
http://localhost/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"model": "text-embedding-3-small", "input": "Hello, world!"}'
JSON-RPC Admin API
The admin API is served on ~/hero/var/sockets/hero_aibroker/rpc.sock:
# Health check
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
http://localhost/rpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"health","params":{},"id":1}'
# List models
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
http://localhost/rpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"models.list","params":{},"id":2}'
# List MCP tools
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
http://localhost/rpc \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"mcp.list_tools","params":{},"id":3}'
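The same calls can be made from Rust using only the standard library. The sketch below is a minimal, hedged example: the helper names are hypothetical, `method` is assumed to be a plain identifier that needs no JSON escaping, and the raw HTTP response (status line, headers, body) is returned without parsing.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Wrap a JSON-RPC 2.0 body in a minimal HTTP/1.1 POST to `/rpc`.
/// `params_json` must already be valid JSON (e.g. "{}").
fn build_rpc_request(method: &str, params_json: &str, id: u64) -> String {
    let body = format!(
        r#"{{"jsonrpc":"2.0","method":"{}","params":{},"id":{}}}"#,
        method, params_json, id
    );
    format!(
        "POST /rpc HTTP/1.1\r\nHost: localhost\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(),
        body
    )
}

/// Send the request over the Unix socket and return the raw HTTP response.
/// `Connection: close` lets us read until the server closes the stream.
fn call_rpc(socket: &Path, method: &str, params_json: &str, id: u64) -> io::Result<String> {
    let mut stream = UnixStream::connect(socket)?;
    stream.write_all(build_rpc_request(method, params_json, id).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

Calling `call_rpc` with the rpc.sock path, `"health"`, `"{}"`, and `1` mirrors the first curl command above. A production client would more likely use the generated SDK in hero_aibroker_sdk.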
Billing & Usage
# View all IP usage and costs
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
http://localhost/billing/usage
# View specific IP usage
curl --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
http://localhost/billing/usage/127.0.0.1
All requests are persisted to SQLite with IP address, model, token usage, cost in USD, timestamps, and success/error status.
# Export to CSV
sqlite3 -header -csv requests.db "SELECT * FROM request_logs;" > billing.csv
Cascade — multi-broker setup
A broker can act as a client of another broker. This lets you run multiple hero_aibroker instances and chain them, e.g. one broker per host that forwards to a single root broker holding the real upstream-provider keys.
┌──────────────────┐ TCP /v1/* ┌────────────────────┐
│ hero_aibroker │ ───────────────► │ hero_aibroker │ HTTPS
│ (child / user) │ 127.0.0.1:33850 │ (mother / root) │ ───────► OpenRouter,
│ UDS-only │ │ TCP + UDS │ OpenAI, Groq, …
│ ai_broker_mother│ │ --mother banner │
│ ↑ provider │ │ │
└──────────────────┘ └────────────────────┘
The child registers the mother as a provider named ai_broker_mother (or
ai_broker_mother2, … if there are several). Chat requests routed to that
provider are forwarded over /v1/chat/completions to the mother's TCP
address, which then makes the real upstream call.
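The naming scheme for mother providers might be implemented along these lines. This is only a sketch: `next_mother_id` is a hypothetical helper, and reusing the first free suffix (rather than always counting upward) is an assumption.

```rust
/// Pick the first free id in the sequence
/// `ai_broker_mother`, `ai_broker_mother2`, `ai_broker_mother3`, ...
/// Reusing gaps left by removed mothers is an assumption of this sketch.
fn next_mother_id(existing: &[&str]) -> String {
    const BASE: &str = "ai_broker_mother";
    if !existing.contains(&BASE) {
        return BASE.to_owned();
    }
    let mut n: u32 = 2;
    loop {
        let candidate = format!("{}{}", BASE, n);
        if !existing.iter().any(|id| *id == candidate) {
            return candidate;
        }
        n += 1;
    }
}
```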
Bind flags
hero_aibroker_server always opens its three Unix sockets under
$HERO_SOCKET_DIR/hero_aibroker/. Pass these flags to also expose a TCP
listener (REST /v1/* + admin /rpc on a single port):
| Flag | Description |
|---|---|
| --address <ip> | Bind on this address. 127.0.0.1, :: (any), or a concrete mycelium IPv6 address. Required with --port. |
| --port <num> | TCP port for the combined listener. Required with --address. |
| --mother | Self-identify as the cascade root. Surfaced via info RPC; the admin UI renders a banner across the top of every page. Pure self-identification — does not change routing. |
# Mother — bind on localhost, mark as root
hero_aibroker_server --address 127.0.0.1 --port 33850 --mother
# Mother — bind on mycelium so other hosts can reach it
hero_aibroker_server --address fc00::1 --port 33850 --mother
# Child — UDS-only is fine; cascade target is registered at runtime
hero_aibroker_server
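The "required together" rule for --address and --port can be expressed as a small validation step. The following is a sketch under the assumption that both flags arrive as optional values; `tcp_bind_addr` is a hypothetical name, not the server's actual function.

```rust
use std::net::{IpAddr, SocketAddr};

/// Validate the optional TCP bind flags.
/// - neither flag: UDS-only, no TCP listener (`Ok(None)`)
/// - both flags:   a socket address to bind
/// - only one:     an error, since each flag requires the other
fn tcp_bind_addr(address: Option<&str>, port: Option<u16>) -> Result<Option<SocketAddr>, String> {
    match (address, port) {
        (None, None) => Ok(None),
        (Some(addr), Some(port)) => {
            let ip: IpAddr = addr
                .parse()
                .map_err(|e| format!("invalid --address {}: {}", addr, e))?;
            Ok(Some(SocketAddr::new(ip, port)))
        }
        (Some(_), None) => Err("--address requires --port".to_owned()),
        (None, Some(_)) => Err("--port requires --address".to_owned()),
    }
}
```

Both IPv4 (`127.0.0.1`) and IPv6 (`::`, `fc00::1`) literals parse through `IpAddr`, which matches the three example invocations above.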
Persistence: hero_db
Cascade configuration (the mother list) and per-provider priority overrides
are persisted in hero_db under database aibroker_config.
hero_db is a hard dependency of the broker: startup fails fast if its
socket is unreachable.
Schema (all writes go through admin RPC, not direct hero_db calls):
| Key | Kind | Fields |
|---|---|---|
| mothers:ids | set | mother ids (ai_broker_mother, ai_broker_mother2, …) |
| mother:<id> | hash | address, port, label, priority, enabled |
| provider_priority | hash | <provider_name> → priority integer (sparse, lower wins) |
The default hero_db management socket is
~/hero/var/sockets/hero_db/rpc.sock. Override via the
HERO_AIBROKER_DB_SOCKET env var when the broker runs with an isolated
HERO_SOCKET_DIR but should still talk to the operator's hero_db (see the
cascade integration test for an example).
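A minimal sketch of that socket resolution is shown below. The function name and the explicit `home` parameter are illustrative assumptions; the real broker resolves paths through its own helpers.

```rust
use std::path::PathBuf;

/// Resolve the hero_db management socket.
/// `override_path` stands in for the HERO_AIBROKER_DB_SOCKET env var;
/// when it is unset or blank, the default under the home directory is used.
fn hero_db_socket(override_path: Option<&str>, home: &str) -> PathBuf {
    match override_path.map(str::trim).filter(|p| !p.is_empty()) {
        Some(p) => PathBuf::from(p),
        None => PathBuf::from(home).join("hero/var/sockets/hero_db/rpc.sock"),
    }
}
```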
Priority list — sparse integers
Routing picks the backend with the lowest priority number for a given model
(lower wins). Both the YAML registry's Backend.priority and admin overrides
use the same "sparse integer" convention: prefer 1, 5, 10, 15, 20… so an
operator can drop a new provider in between two others without renumbering.
The effective priority comes from two layers, with the override taking precedence over the YAML default:
- YAML Backend.priority from modelsconfig.yml — baseline.
- Per-provider override stored in the hero_db provider_priority hash — wins when set; clear it (set to null) to revert to YAML.
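The layering can be sketched in a few lines. The `Backend` struct and function names here are hypothetical simplifications; only the rule itself (override wins when set, lowest number is picked) comes from the description above.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a backend entry from modelsconfig.yml.
struct Backend {
    provider: String,
    yaml_priority: u32,
}

/// An override from the `provider_priority` hash wins when present;
/// otherwise the YAML baseline applies.
fn effective_priority(backend: &Backend, overrides: &HashMap<String, u32>) -> u32 {
    overrides
        .get(&backend.provider)
        .copied()
        .unwrap_or(backend.yaml_priority)
}

/// Pick the backend with the lowest effective priority number.
/// On ties, the first backend in the list is returned.
fn pick_backend<'a>(
    backends: &'a [Backend],
    overrides: &HashMap<String, u32>,
) -> Option<&'a Backend> {
    backends.iter().min_by_key(|b| effective_priority(b, overrides))
}
```

With YAML priorities of 1 (openrouter) and 5 (groq), setting an override of 10 for openrouter makes groq the preferred backend; removing the override restores the YAML order.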
Edit priorities from the Providers tab in the admin UI: each row has
a numeric input, blank means "use YAML". Mother brokers also appear in
the Providers tab tagged cascade, so reordering the cascade vs. direct
providers happens in one place.
Admin UI
Two tabs in the admin dashboard:
- Cascade — list/add/edit/remove upstream brokers. Address, port, optional label, priority, enabled toggle. Auto-assigns id on add.
- Providers — adds a Priority column for all providers (direct + mother). Mother rows show a cascade badge.
When the broker was started with --mother, every page also shows a
purple banner across the top:
"This broker is the MOTHER (root of cascade)…". The banner reads
info.is_mother and shows/hides automatically.
Admin RPC
All cascade operations go through the JSON-RPC /rpc endpoint
(documented in crates/hero_aibroker_server/openrpc.json):
| Method | Description |
|---|---|
| mothers.list | Return every registered upstream broker. |
| mothers.add | {address, port, label?, priority?, enabled?} → returns auto-assigned id. |
| mothers.update | {id, ...patch} — patch any subset of fields. |
| mothers.remove | {id} — drop from hero_db + rebuild provider map. |
| priority.list | Return the override map (provider name → priority). |
| priority.set | {provider, priority} — set; pass priority: null to clear. |
| info | Now also returns is_mother (bool) and mother_count (int). |
Every mutation persists to hero_db and rebuilds the live provider map + chat service so changes apply without a restart.
Running two brokers via hero_proc — --split
The nushell service module
service_aibroker.nu
in hero_skills/tools/modules/services/ knows how to set up the cascade
end-to-end:
# Single-broker default (unchanged)
service aibroker start
# Cascade: register both brokers; child auto-registers the mother
service aibroker start --split
service aibroker stop --split
# --reset wipes hero_db `aibroker_config` (mothers + priority overrides)
# and deletes apikeys.db before starting
service aibroker start --reset
# --update pulls newest source via `forge merge` and rebuilds
service aibroker start --update
--split semantics:
- The regular service hero_aibroker becomes the child — UDS-only, user-space.
- A second hero_proc service hero_aibroker_root is registered for the mother. Same binary, started with --mother --address 127.0.0.1 --port 33850, in its own HERO_SOCKET_DIR (…/hero_aibroker_root/) so its UDS sockets don't collide with the child.
- macOS: both run as the invoking user (one hero_proc supervises both).
- Linux: the mother is registered under root's hero_proc (the script auto-sudos); the child stays in user space. This lets the mother hold the privileged credentials / network position while the child remains a normal-user process.
- After both are up, the script issues a mothers.add RPC against the child pointing at 127.0.0.1:33850 (label split-mode-mother, priority 1).
The default port is 33850. Change it by editing SVX_MOTHER_PORT at
the top of service_aibroker.nu.
Manually wiring two brokers
If you don't want to use the nu module, the same cascade is just three
RPC calls. With the mother running on 127.0.0.1:33850 and the child on
its UDS:
# Confirm the mother is up and self-identifies
curl -s http://127.0.0.1:33850/rpc \
-H 'content-type: application/json' \
-d '{"jsonrpc":"2.0","id":1,"method":"info"}' | jq '.result.is_mother'
# → true
# Register the mother on the child
curl -s --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
http://localhost/rpc -H 'content-type: application/json' \
-d '{
"jsonrpc":"2.0","id":1,"method":"mothers.add",
"params":{"address":"127.0.0.1","port":33850,"priority":1}
}'
# → {"result":{"id":"ai_broker_mother","success":true}}
# Verify it sticks across restarts (the child reloads from hero_db on boot)
curl -s --unix-socket ~/hero/var/sockets/hero_aibroker/rpc.sock \
http://localhost/rpc -H 'content-type: application/json' \
-d '{"jsonrpc":"2.0","id":1,"method":"mothers.list"}' | jq
Integration test
crates/hero_aibroker_examples/tests/cascade.rs spawns two
hero_aibroker_server child processes, registers the mother on the
child via mothers.add, and asserts the cascade is live (info,
mothers.list round-trip). Skips silently when hero_db isn't running.
cargo test -p hero_aibroker_examples --test cascade -- --nocapture
The test uses HERO_AIBROKER_DB_SOCKET so each broker runs with an
isolated HERO_SOCKET_DIR while sharing the operator's hero_db.
CLI Usage
The hero_aibroker binary is the interactive CLI. It connects via ~/hero/var/sockets/hero_aibroker/rpc.sock.
# Interactive chat
hero_aibroker chat --model gpt-4o
# Chat with the default auto-routing model
hero_aibroker chat
# List available models
hero_aibroker models
# List MCP tools
hero_aibroker tools
# Check server health
hero_aibroker health
CLI Options
Global options:
- -m, --model <MODEL> — model to use for chat (default: auto)
- --socket <PATH> — custom socket path (default: ~/hero/var/sockets/hero_aibroker/rpc.sock)
Chat sub-command options:
- -m, --model <MODEL> — model to use (overrides global --model)
Service Management
service aibroker start # register all services with hero_proc and start them
service aibroker stop # stop all services via hero_proc
service aibroker status # show service status
Model Configuration
Models are defined in modelsconfig.yml. The file controls display names, tiers, capabilities, context windows, and per-provider backends with pricing:
models:
gpt-4o:
display_name: "GPT-4o"
tier: premium
capabilities:
- tool_calling
- vision
context_window: 128000
backends:
- provider: openrouter
model_id: openai/gpt-4o
priority: 1
input_cost: 2.5 # USD per million tokens
output_cost: 10.0
Set MODELS_CONFIG_PATH to point to your config file, or place modelsconfig.yml in the working directory.
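To make the pricing fields concrete, the sketch below computes a per-request cost from token counts. It assumes both input_cost and output_cost are expressed in USD per million tokens (the config comment states this for input_cost); the function name is hypothetical.

```rust
/// Cost of one request in USD, given token counts and per-million-token
/// prices as defined for a backend in modelsconfig.yml.
fn request_cost_usd(
    input_tokens: u64,
    output_tokens: u64,
    input_cost: f64,
    output_cost: f64,
) -> f64 {
    const PER_MILLION: f64 = 1_000_000.0;
    (input_tokens as f64 * input_cost + output_tokens as f64 * output_cost) / PER_MILLION
}
```

With the gpt-4o entry above (2.5 in, 10.0 out), a request using 1,000 input tokens and 500 output tokens costs 0.0025 + 0.005 = 0.0075 USD.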
Auto Model Selection
Use special model names for automatic selection:
| Model Name | Description |
|---|---|
| auto | Use the configured ROUTING_STRATEGY |
| autocheapest | Select the cheapest available model |
| autobest | Select the best premium model |
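One way to read the table is as a small dispatch on the requested model name. The types and functions below are hypothetical, and falling back to cheapest for an unrecognised ROUTING_STRATEGY value is an assumption (the documented default is cheapest).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    Cheapest,
    Best,
}

#[derive(Debug, PartialEq)]
enum Selection<'a> {
    /// Let the router choose a model using the given strategy.
    Auto(Strategy),
    /// The caller named a concrete model.
    Explicit(&'a str),
}

/// Parse ROUTING_STRATEGY; anything other than "best" maps to the default.
fn parse_strategy(value: Option<&str>) -> Strategy {
    match value.map(str::trim) {
        Some("best") => Strategy::Best,
        _ => Strategy::Cheapest,
    }
}

/// Map the special model names onto a routing strategy.
fn resolve_model<'a>(model: &'a str, configured: Strategy) -> Selection<'a> {
    match model {
        "auto" => Selection::Auto(configured),
        "autocheapest" => Selection::Auto(Strategy::Cheapest),
        "autobest" => Selection::Auto(Strategy::Best),
        other => Selection::Explicit(other),
    }
}
```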
MCP Integration
The broker aggregates tools from multiple MCP (Model Context Protocol) servers managed by hero_aibroker_services. Configure servers in a JSON file pointed to by MCP_CONFIG_PATH (see mcp_servers.example.json):
{
"mcpServers": [
{
"name": "serper",
"command": "/path/to/mcp_serper",
"args": [],
"env": {}
},
{
"name": "exa",
"command": "/path/to/mcp_exa",
"args": [],
"env": {}
}
]
}
Included MCP Servers
All MCP binaries are built as part of the workspace and managed by hero_aibroker_services:
| Binary | Description | Required Key |
|---|---|---|
| mcp_serper | Web search via Serper | SERPER_API_KEYS |
| mcp_serpapi | Web search via SerpAPI | SERPAPI_API_KEYS |
| mcp_exa | Semantic search via Exa | EXA_API_KEYS |
| mcp_scraperapi | Web scraping via ScraperAPI | SCRAPERAPI_API_KEYS |
| mcp_scrapfly | Web scraping via Scrapfly | SCRAPFLY_API_KEYS |
| mcp_ping | Ping/test server | — |
| mcp_hero | Hero OS service discovery + LLM-driven Python code generation and execution | HERO_SECRET |
MCP REST Endpoints
| Endpoint | Description |
|---|---|
| GET /mcp/tools | List all aggregated tools |
| POST /mcp/tools/:name | Call a specific tool |
| GET /mcp/sse | SSE endpoint for MCP clients |
Development
Building
# Release build (all workspace crates)
cargo build --release
# Debug build
cargo build
# Build a specific crate
cargo build -p hero_aibroker_server
cargo build -p hero_aibroker
# Check (no codegen)
cargo check --all
# Format
cargo fmt --all
# Lint
cargo clippy --all -- -D warnings
Running Tests
# Run all tests
cargo test --all
# Run tests for a specific crate
cargo test -p hero_aibroker_lib
Logs
proc logs tail hero_aibroker_server
proc logs tail hero_aibroker_admin
proc logs tail hero_aibroker_services
Deployment
Install Binaries
service aibroker install --update --reset
License
MIT License