AI Broker

A lightweight LLM request broker with an OpenAI-compatible REST API that intelligently routes requests to multiple LLM providers with cost-aware strategies. All communication is via Unix Domain Sockets — no TCP ports.

Features

  • OpenAI-Compatible API — Drop-in replacement for OpenAI clients (via Unix socket)
  • Multi-Provider Support — OpenAI, OpenRouter, Groq, SambaNova
  • Smart Routing — Automatic model selection based on cost or quality
  • Cost Tracking — Per-request cost calculation and tracking
  • Request Tracking — Detailed per-IP request tracking with timestamps and durations
  • Streaming Support — Real-time streaming responses via SSE
  • MCP Broker — Aggregate tools from multiple MCP (Model Context Protocol) servers
  • Rate Limiting — Per-IP rate limiting with configurable limits
  • Audio APIs — Text-to-speech and speech-to-text support (Groq, SambaNova, OpenAI)
  • Config-Based Audio Models — STT/TTS models defined in modelsconfig.yml with automatic fallback
  • Embeddings — Vector embedding generation
  • Many Chat Models — Latest Claude 4.x, Gemini 3, GPT-5.2, o3-mini, Grok 4.1, Kimi K2.5 and more
  • Persistent Billing — SQLite-based request logging for billing and analytics
  • API Key Support — Optional API key authentication system
  • Unix Socket Architecture — All services communicate over Unix Domain Sockets; no open TCP ports

Project Structure

hero_aibroker/
├── crates/
│   ├── hero_aibroker/           # CLI + service manager (--start / --stop)
│   ├── hero_aibroker_lib/       # Core business logic (shared library)
│   ├── hero_aibroker_sdk/       # Generated OpenRPC client + types
│   ├── hero_aibroker_server/    # Server binary (two Unix sockets)
│   ├── hero_aibroker_ui/        # Admin dashboard binary (Unix socket)
│   ├── hero_aibroker_examples/  # SDK examples and integration tests
│   ├── hero_broker_server/      # Multi-MCP broker binary
│   └── mcp/
│       ├── mcp_common/          # Shared MCP utilities
│       ├── mcp_exa/             # Exa semantic search
│       ├── mcp_hero/            # Hero MCP integration
│       ├── mcp_ping/            # Ping test server
│       ├── mcp_scraperapi/      # ScraperAPI web scraping
│       ├── mcp_scrapfly/        # Scrapfly web scraping
│       ├── mcp_serpapi/         # SerpAPI web search
│       └── mcp_serper/          # Serper web search
├── modelsconfig.yml             # Model definitions and pricing
└── mcp_servers.example.json     # MCP server configuration template

Dependency Graph

hero_aibroker_lib (core logic)
         ↑
hero_aibroker_sdk (types, protocol, RPC client)
    ↑         ↑          ↑
    |         |          |
 server     CLI         UI

Architecture

All services bind Unix Domain Sockets under ~/hero/var/sockets/. There are no TCP listeners.

┌──────────────────────────────────────────────────────────────────┐
│  CLI (hero_aibroker chat/models/tools/health)                    │
│  connects via: ~/hero/var/sockets/hero_aibroker_server.sock      │
└──────────────────────────────────┬───────────────────────────────┘
                                   │ JSON-RPC
┌──────────────────────────────────▼───────────────────────────────┐
│  hero_aibroker_server                                            │
│  ├── JSON-RPC admin API  → ~/hero/var/sockets/                   │
│  │                            hero_aibroker_server.sock          │
│  └── REST (OpenAI-compat) → ~/hero/var/sockets/                  │
│                                 hero_aibroker_server_rest.sock   │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                    Service Layer                          │   │
│  │  (Routing logic, model selection, cost calculation)      │   │
│  └──────────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                    Provider Layer                         │   │
│  │  (OpenAI, Groq, SambaNova, OpenRouter adapters)          │   │
│  └──────────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────┐
│  hero_aibroker_ui (admin dashboard)                              │
│  binds: ~/hero/var/sockets/hero_aibroker/ui.sock                 │
│  proxies JSON-RPC requests to hero_aibroker_server               │
└──────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────┐
│  hero_broker_server (MCP broker)                                 │
│  binds multiple sockets for aggregated MCP services              │
└──────────────────────────────────────────────────────────────────┘

All services are registered with and managed by hero_proc. The hero_aibroker --start command handles all registration and startup.
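
After `hero_aibroker --start`, a quick way to confirm that the two broker sockets from the diagram above are actually bound (a missing socket usually means the service failed to start):

```shell
# Probe the two hero_aibroker_server sockets (paths from the architecture above)
for s in hero_aibroker_server.sock hero_aibroker_server_rest.sock; do
  if [ -S "$HOME/hero/var/sockets/$s" ]; then
    echo "ok: $s"
  else
    echo "missing: $s"
  fi
done
```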

Quick Start

Prerequisites

  • Rust 1.70 or later
  • hero_proc installed and running
  • At least one LLM provider API key

Environment Variables

Source your env file before running:

source ~/.config/env.sh   # or wherever you keep your secrets

LLM provider keys (at least one required):

Variable                                  Description
GROQ_API_KEY / GROQ_API_KEYS              Groq API key(s)
OPENROUTER_API_KEY / OPENROUTER_API_KEYS  OpenRouter API key(s)
SAMBANOVA_API_KEY / SAMBANOVA_API_KEYS    SambaNova API key(s)
OPENAI_API_KEY / OPENAI_API_KEYS          OpenAI API key(s)

Both singular and plural variants are accepted. Use comma-separated values for multiple keys per provider — the broker creates separate provider instances and distributes requests across them for higher throughput, load distribution, and automatic failover.
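
As a sketch, an env file using these variables might look like the following (all key values here are placeholders, substitute your real keys):

```shell
# Example ~/.config/env.sh -- placeholder values
export GROQ_API_KEY="gsk_placeholder"
# Comma-separated keys create one provider instance per key:
export OPENROUTER_API_KEYS="sk-or-key1,sk-or-key2"
export ROUTING_STRATEGY="cheapest"
```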

Web/search tool keys (optional, used by MCP servers):

Variable             Description
SERPAPI_API_KEYS     SerpAPI web search
SERPER_API_KEYS      Serper web search
EXA_API_KEYS         Exa semantic search
SCRAPERAPI_API_KEYS  ScraperAPI web scraping
SCRAPFLY_API_KEYS    Scrapfly web scraping

Service configuration:

Variable            Default   Description
ROUTING_STRATEGY    cheapest  cheapest or best
MCP_CONFIG_PATH     (none)    Path to MCP server config JSON
MODELS_CONFIG_PATH  (none)    Path to model config YAML
ADMIN_TOKEN         (none)    Simple admin auth token
HERO_SECRET         (none)    Hero Auth JWT secret

Run

source ~/.config/env.sh
make run      # build, install, and start all services via hero_proc

Stop

make stop     # stop all hero_proc-managed services

Development Mode

make rundev   # run server directly in debug mode (no hero_proc, logs to stdout)
make cli      # interactive CLI session (debug build)

API Reference

All REST endpoints are served on the Unix socket at ~/hero/var/sockets/hero_aibroker_server_rest.sock. Use curl --unix-socket to reach them.

List Models

curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
  http://localhost/v1/models

Chat Completions

curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
  http://localhost/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
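
With "stream": true, the body arrives as SSE data: lines terminated by data: [DONE]. One way to extract just the text deltas in the shell (the choices[0].delta.content layout follows the OpenAI streaming format; the printf below stands in for the streaming curl output above):

```shell
# Simulated SSE stream -- in practice, pipe the streaming curl output instead
printf 'data: {"choices":[{"delta":{"content":"Hel"}}]}\ndata: {"choices":[{"delta":{"content":"lo!"}}]}\ndata: [DONE]\n' |
  sed -n 's/.*"content":"\([^"]*\)".*/\1/p' |
  tr -d '\n'
# prints "Hello!"
```

This is only a quick illustration: the regex breaks on content containing escaped quotes, so a real client should use a proper SSE/JSON parser.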

Text-to-Speech

curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
  http://localhost/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "tts-1", "input": "Hello, world!", "voice": "alloy"}' \
  --output speech.mp3

Available TTS models: tts-1, tts-1-hd (requires OPENAI_API_KEY)

Speech-to-Text

curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
  http://localhost/v1/audio/transcriptions \
  -F "file=@audio.mp3" \
  -F "model=whisper-1"

Available STT models:

  • whisper-1 — multi-provider (Groq → SambaNova → OpenAI fallback chain)
  • whisper-large-v3 — direct Groq/SambaNova access

Embeddings

curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
  http://localhost/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-3-small", "input": "Hello, world!"}'

JSON-RPC Admin API

The admin API is served on ~/hero/var/sockets/hero_aibroker_server.sock:

# Health check
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server.sock \
  http://localhost/rpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"health","params":{},"id":1}'

# List models
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server.sock \
  http://localhost/rpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"models.list","params":{},"id":2}'

# List MCP tools
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server.sock \
  http://localhost/rpc \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"mcp.list_tools","params":{},"id":3}'

Billing & Usage

# View all IP usage and costs
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
  http://localhost/billing/usage

# View specific IP usage
curl --unix-socket ~/hero/var/sockets/hero_aibroker_server_rest.sock \
  http://localhost/billing/usage/127.0.0.1

All requests are persisted to SQLite with IP address, model, token usage, cost in USD, timestamps, and success/error status.

# Export to CSV
sqlite3 -header -csv requests.db "SELECT * FROM request_logs;" > billing.csv
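
The exported CSV can then be summarized with standard tools. A sketch, assuming model and cost_usd columns (the actual column names come from the request_logs schema):

```shell
# Hypothetical billing.csv with the assumed column layout
printf 'ip,model,cost_usd\n127.0.0.1,gpt-4o,0.0125\n127.0.0.1,gpt-4o,0.0075\n10.0.0.2,whisper-1,0.0020\n' > billing.csv

# Total cost per model
awk -F, 'NR>1 { sum[$2] += $3 } END { for (m in sum) printf "%s %.4f\n", m, sum[m] }' billing.csv | sort
# gpt-4o 0.0200
# whisper-1 0.0020
```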

CLI Usage

The hero_aibroker binary is both the service manager and the interactive CLI. It connects via ~/hero/var/sockets/hero_aibroker_server.sock.

# Interactive chat
hero_aibroker chat --model gpt-4o

# Chat with the default auto-routing model
hero_aibroker chat

# List available models
hero_aibroker models

# List MCP tools
hero_aibroker tools

# Check server health
hero_aibroker health

CLI Options

Global options:

  • -m, --model <MODEL> — model to use for chat (default: auto)
  • --socket <PATH> — custom socket path (default: ~/hero/var/sockets/hero_aibroker_server.sock)

Chat sub-command options:

  • -m, --model <MODEL> — model to use (overrides global --model)

Service Management

hero_aibroker --start   # register all services with hero_proc and start them
hero_aibroker --stop    # stop all services via hero_proc

Model Configuration

Models are defined in modelsconfig.yml. The file controls display names, tiers, capabilities, context windows, and per-provider backends with pricing:

models:
  gpt-4o:
    display_name: "GPT-4o"
    tier: premium
    capabilities:
      - tool_calling
      - vision
    context_window: 128000
    backends:
      - provider: openrouter
        model_id: openai/gpt-4o
        priority: 1
        input_cost: 2.5    # USD per million tokens
        output_cost: 10.0

Set MODELS_CONFIG_PATH to point to your config file, or place modelsconfig.yml in the working directory.

Auto Model Selection

Use special model names for automatic selection:

Model Name    Description
auto          Use the configured ROUTING_STRATEGY
autocheapest  Select the cheapest available model
autobest      Select the best premium model
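
The mapping above can be sketched as a small shell function (illustrative only; the actual resolution happens inside hero_aibroker_lib):

```shell
# Mirror of the table above: special model names -> routing strategy
resolve_strategy() {
  case "$1" in
    auto)         echo "${ROUTING_STRATEGY:-cheapest}" ;;
    autocheapest) echo "cheapest" ;;
    autobest)     echo "best" ;;
    *)            echo "explicit:$1" ;;  # a concrete model name, no auto-routing
  esac
}

resolve_strategy autobest   # prints "best"
resolve_strategy gpt-4o     # prints "explicit:gpt-4o"
```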

MCP Integration

The broker aggregates tools from multiple MCP (Model Context Protocol) servers managed by hero_broker_server. Configure servers in a JSON file pointed to by MCP_CONFIG_PATH (see mcp_servers.example.json):

{
  "mcpServers": [
    {
      "name": "serper",
      "command": "/path/to/mcp_serper",
      "args": [],
      "env": {}
    },
    {
      "name": "exa",
      "command": "/path/to/mcp_exa",
      "args": [],
      "env": {}
    }
  ]
}
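
Before pointing MCP_CONFIG_PATH at your copy, it is worth checking that the edited file still parses as JSON (this uses python3's stdlib JSON checker, assumed to be installed):

```shell
cp mcp_servers.example.json mcp_servers.json
# edit the command paths, then validate:
python3 -m json.tool mcp_servers.json > /dev/null && echo "config OK"
```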

Included MCP Servers

All MCP binaries are built as part of the workspace and managed by hero_broker_server:

Binary          Description                                                                   Required Key
mcp_serper      Web search via Serper                                                         SERPER_API_KEYS
mcp_serpapi     Web search via SerpAPI                                                        SERPAPI_API_KEYS
mcp_exa         Semantic search via Exa                                                       EXA_API_KEYS
mcp_scraperapi  Web scraping via ScraperAPI                                                   SCRAPERAPI_API_KEYS
mcp_scrapfly    Web scraping via Scrapfly                                                     SCRAPFLY_API_KEYS
mcp_ping        Ping/test server                                                              (none)
mcp_hero        Hero OS service discovery + LLM-driven Python code generation and execution  HERO_SECRET

MCP REST Endpoints

Endpoint               Description
GET /mcp/tools         List all aggregated tools
POST /mcp/tools/:name  Call a specific tool
GET /mcp/sse           SSE endpoint for MCP clients

Development

Building

# Release build (all workspace crates)
cargo build --release

# Debug build
cargo build

# Build a specific crate
cargo build -p hero_aibroker_server
cargo build -p hero_aibroker

# Fast check (no codegen)
make check

Running Tests

# Run all tests
cargo test --all

# Run tests for a specific crate
cargo test -p hero_aibroker_lib

# Full build + test cycle
make all

Code Quality

make fmt        # format code
make fmt-check  # check formatting without modifying
make lint       # run clippy (warnings as errors)
make lint-fix   # run clippy and auto-fix

Logs

make logs       # tail hero_aibroker_server logs via hero_proc
make logs-ui    # tail hero_aibroker_ui logs via hero_proc

Status

make status     # show service status and installed binaries

Deployment

Install Binaries

make install    # build release and install to ~/hero/bin

Ship to Registry

make ship-binary          # tag + push to trigger CI build
make ship-binary TAG=1.2.3  # override version tag

Docker

make docker-build   # build Docker image
make docker-run     # run Docker container (source env vars first)

License

MIT License