HeroEmbedder
A fast, local embedding server for RAG applications. Provides dense vector embeddings, similarity search, and reranking via a JSON-RPC 2.0 API with namespace support for isolated document collections.
Architecture (v2)
hero_embedder/
├── crates/
│ ├── hero_embedder_lib/ # Library: server internals (ML, storage, retrieval)
│ ├── hero_embedder_server/ # Binary: JSON-RPC daemon (Unix socket)
│ ├── hero_embedder_client/ # Library: generated client, types, Rhai bindings
│ ├── hero_embedder/ # Binary: CLI using the client
│ └── hero_embedder_ui/ # Binary: Axum web dashboard using the client
├── scripts/ # Build and deployment scripts
├── Cargo.toml # Workspace root
├── Makefile # Build orchestration
└── buildenv.sh # Environment configuration
Dependency Graph
hero_embedder_server
        ↑
hero_embedder_lib (server internals)

hero_embedder_client (generated client)
      ↑              ↑
      │              │
hero_embedder   hero_embedder_ui
Ports and Sockets
| Service | Bind | Type | Default |
|---|---|---|---|
| Server | ~/hero/var/sockets/hero_embedder_server.sock | Unix Socket | Primary |
| UI | 127.0.0.1:3753 | TCP | HTTP dashboard |
The server binds to a Unix socket only by default. The UI serves the web dashboard on TCP port 3753.
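As a minimal sketch of how a client might locate the server socket from the table above, the following uses only the Rust standard library. The helper names `default_socket_path` and `socket_url` are illustrative and not part of the project; the `/home/user` fallback is an arbitrary choice for the example.

```rust
use std::env;
use std::os::unix::net::UnixStream;
use std::path::PathBuf;

/// Default server socket path under the given home directory,
/// following the Bind column of the table above.
/// (Hypothetical helper, not part of hero_embedder.)
fn default_socket_path(home: &str) -> PathBuf {
    PathBuf::from(home).join("hero/var/sockets/hero_embedder_server.sock")
}

/// The same path written as a `unix://` URL.
fn socket_url(home: &str) -> String {
    format!("unix://{}", default_socket_path(home).display())
}

fn main() {
    // Falling back to /home/user is an arbitrary choice for this sketch.
    let home = env::var("HOME").unwrap_or_else(|_| String::from("/home/user"));
    let path = default_socket_path(&home);
    println!("socket URL: {}", socket_url(&home));

    // A successful connect only shows that something is listening on the socket.
    match UnixStream::connect(&path) {
        Ok(_) => println!("server socket is accepting connections"),
        Err(err) => println!("cannot connect to {}: {}", path.display(), err),
    }
}
```

The connect probe does not speak JSON-RPC; it only checks that the socket file exists and accepts connections.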
Features
- Embedding Generation: BGE models (small/base) with INT8/FP32 options
- Semantic Search: Fast cosine similarity search
- Reranking: Cross-encoder model for improved accuracy
- Namespaces: Isolated document collections for multi-tenant use
- Persistence: Documents stored in redb databases
- Web UI: Bootstrap-based admin dashboard with live updates
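To illustrate the cosine similarity search named in the feature list, here is a brute-force sketch in plain Rust. It is not the server's implementation, which this README does not show; `cosine_similarity` and `top_k` are hypothetical names, and the two-dimensional vectors stand in for real 384- or 768-dimensional embeddings.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns None for mismatched lengths, empty input, or a zero-norm vector.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Score every document against the query and keep the k best matches,
/// highest similarity first.
fn top_k(query: &[f32], docs: &[(&str, Vec<f32>)], k: usize) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = docs
        .iter()
        .filter_map(|(id, v)| cosine_similarity(query, v).map(|s| (id.to_string(), s)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}

fn main() {
    // Toy vectors for illustration only.
    let docs = vec![
        ("doc1", vec![0.0, 1.0]),
        ("doc2", vec![1.0, 0.1]),
        ("doc3", vec![1.0, 1.0]),
    ];
    for (id, score) in top_k(&[1.0, 0.0], &docs, 2) {
        println!("{id}: {score:.3}");
    }
}
```

A linear scan like this is exact but costs one dot product per stored document.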
Quick Start
# Full setup: install deps, download models, build, install
make setup
# Run server + UI
make run
# CLI health check
hero_embedder -s "unix://$HOME/hero/var/sockets/hero_embedder_server.sock" health
Quality Levels
Quality is set per namespace at creation time. All four embedding model variants (bge-small and bge-base, each with INT8 and FP32 weights) are loaded at startup.
| Level | Name | Model | Weights | Embeddings | Dimensions | Use Case |
|---|---|---|---|---|---|---|
| 1 | Fast | bge-small | INT8 | INT8 | 384 | Real-time, low latency |
| 2 | Balanced | bge-small | FP32 | FP16 | 384 | Default, good balance |
| 3 | Quality | bge-base | INT8 | INT8 | 768 | Better accuracy |
| 4 | Best | bge-base | FP32 | FP16 | 768 | Maximum quality |
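The table can be read as a small lookup from level to model, dimensions, and stored embedding precision. The sketch below encodes it as a Rust enum and estimates the raw size of one stored embedding, assuming INT8 takes 1 byte and FP16 takes 2 bytes per dimension with no per-record overhead. The type `QualityLevel` and its methods are illustrative names, not the project's API.

```rust
/// The four quality levels from the table above (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QualityLevel {
    Fast,
    Balanced,
    Quality,
    Best,
}

impl QualityLevel {
    /// Map the numeric level (1 to 4) to a variant.
    fn from_level(level: u8) -> Option<Self> {
        match level {
            1 => Some(Self::Fast),
            2 => Some(Self::Balanced),
            3 => Some(Self::Quality),
            4 => Some(Self::Best),
            _ => None,
        }
    }

    /// Model name from the Model column.
    fn model(self) -> &'static str {
        match self {
            Self::Fast | Self::Balanced => "bge-small",
            Self::Quality | Self::Best => "bge-base",
        }
    }

    /// Embedding dimensions from the Dimensions column.
    fn dimensions(self) -> usize {
        match self {
            Self::Fast | Self::Balanced => 384,
            Self::Quality | Self::Best => 768,
        }
    }

    /// Bytes per stored component: INT8 = 1, FP16 = 2 (Embeddings column).
    fn bytes_per_component(self) -> usize {
        match self {
            Self::Fast | Self::Quality => 1,
            Self::Balanced | Self::Best => 2,
        }
    }

    /// Raw size of one stored embedding, ignoring any storage overhead.
    fn embedding_bytes(self) -> usize {
        self.dimensions() * self.bytes_per_component()
    }
}

fn main() {
    for level in 1..=4u8 {
        let q = QualityLevel::from_level(level).expect("levels 1 to 4 are defined");
        println!("{level}: {:?} {} {} bytes/embedding", q, q.model(), q.embedding_bytes());
    }
}
```

Under these assumptions, levels 2 and 3 need the same raw space per embedding (768 bytes), so the choice between them is about model size and accuracy, not storage.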
API
JSON-RPC 2.0 endpoint at POST /rpc
Server Info
{"jsonrpc": "2.0", "id": 1, "method": "info", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}
Embedding
{"jsonrpc": "2.0", "id": 1, "method": "embed", "params": [["hello world", "another text"]]}
Index Management
{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "doc1", "text": "hello"}], "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.get", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.delete", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.count", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.clear", "params": ["namespace"]}
Search
{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query text", 10, "namespace", true]}
Rerank
{"jsonrpc": "2.0", "id": 1, "method": "rerank", "params": ["query", [{"id": "1", "text": "..."}], 5]}
Namespaces
{"jsonrpc": "2.0", "id": 1, "method": "namespace.list", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.create", "params": ["my-docs", 2]}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.delete", "params": ["my-docs"]}
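The request bodies above all share the same envelope with positional `params`. As a rough sketch, they can be assembled with the standard library alone; the helpers `json_string`, `rpc_request`, `namespace_create_request`, and `search_request` are hypothetical, and in practice the generated `hero_embedder_client` crate or a JSON library would normally do this. The README does not name the fourth `search` parameter, so it is passed through here as an opaque boolean.

```rust
/// Encode a Rust string as a JSON string literal (minimal escaping).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a JSON-RPC 2.0 request from already-encoded positional params.
fn rpc_request(id: u64, method: &str, params: &[String]) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":{},"params":[{}]}}"#,
        id,
        json_string(method),
        params.join(",")
    )
}

/// namespace.create with a name and a quality level (1 to 4).
fn namespace_create_request(id: u64, name: &str, quality: u8) -> String {
    rpc_request(id, "namespace.create", &[json_string(name), quality.to_string()])
}

/// search with query, result count, namespace, and the undocumented
/// fourth boolean parameter passed through unchanged.
fn search_request(id: u64, query: &str, k: u32, namespace: &str, flag: bool) -> String {
    let params = [
        json_string(query),
        k.to_string(),
        json_string(namespace),
        flag.to_string(),
    ];
    rpc_request(id, "search", &params)
}

fn main() {
    println!("{}", namespace_create_request(1, "my-docs", 2));
    println!("{}", search_request(2, "query text", 10, "my-docs", true));
}
```

The output is compact JSON (no spaces after separators), which is equivalent to the spaced form shown in the examples above.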
CLI Client
hero_embedder health
hero_embedder stats
hero_embedder embed "hello world"
hero_embedder search "query" -k 10
hero_embedder add doc1 "document text"
hero_embedder ns-list
hero_embedder ns-create my-docs
Client Usage (Rust)
use hero_embedder_client::HeroEmbedderClient;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Connect via Unix socket
    let client = HeroEmbedderClient::new("unix:///home/user/hero/var/sockets/hero_embedder_server.sock");

    // Or via HTTP, if the server is also bound to TCP
    // (this binding shadows the Unix-socket client above; keep only one in real code)
    let client = HeroEmbedderClient::new("http://localhost:3752");

    let results = client.search("hello", 10, None, None).await?;
    Ok(())
}
Environment Variables
| Variable | Default | Description |
|---|---|---|
| EMBEDDER_MODELS | ~/hero/var/embedder/models | Models directory |
| EMBEDDER_DATA | ~/hero/var/embedder/data | Data directory |
| HERO_SECRET | (unset) | JWT secret (enables auth when set) |
| HERO_AUTH_URL | (unset) | URL to hero_auth OAuth2 server |
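The following sketch shows one way these variables and their defaults could be resolved. It is an assumption about behavior, not the server's actual startup code: the `Config` struct and `load_config` function are hypothetical, `~` is expanded from `HOME`, and auth is treated as enabled whenever `HERO_SECRET` is present. The lookup is passed in as a closure so the logic can be exercised without touching the real environment.

```rust
use std::env;
use std::path::PathBuf;

/// Resolved settings (hypothetical struct for this sketch).
struct Config {
    models_dir: PathBuf,
    data_dir: PathBuf,
    auth_enabled: bool,
    auth_url: Option<String>,
}

/// Resolve settings from a variable lookup, applying the documented defaults.
fn load_config(lookup: impl Fn(&str) -> Option<String>) -> Config {
    // Assumption: "~" in the defaults means $HOME; "." is an arbitrary fallback.
    let home = lookup("HOME").unwrap_or_else(|| String::from("."));
    let base = PathBuf::from(home).join("hero/var/embedder");
    Config {
        models_dir: lookup("EMBEDDER_MODELS")
            .map(PathBuf::from)
            .unwrap_or_else(|| base.join("models")),
        data_dir: lookup("EMBEDDER_DATA")
            .map(PathBuf::from)
            .unwrap_or_else(|| base.join("data")),
        // The table says auth is enabled when HERO_SECRET is set.
        auth_enabled: lookup("HERO_SECRET").is_some(),
        auth_url: lookup("HERO_AUTH_URL"),
    }
}

fn main() {
    let config = load_config(|key| env::var(key).ok());
    println!("models dir:   {}", config.models_dir.display());
    println!("data dir:     {}", config.data_dir.display());
    println!("auth enabled: {}", config.auth_enabled);
    println!("auth URL:     {}", config.auth_url.as_deref().unwrap_or("(unset)"));
}
```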
Data Storage
~/hero/var/embedder/
├── models/
│   ├── bge-small/
│   ├── bge-base/
│   └── bge-reranker-base/
└── data/
    ├── default/
    │   └── q2/
    │       └── rag.redb
    └── corpus.redb
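Reading the tree above, each namespace appears to get its own directory with a `q<level>` subdirectory holding `rag.redb`. The sketch below builds that path under the assumption that `q2` corresponds to quality level 2; the function `namespace_db_path` and its name check are hypothetical and are not taken from the project's code.

```rust
use std::path::{Path, PathBuf};

/// Build `<data_dir>/<namespace>/q<level>/rag.redb`.
/// Returns None for a level outside 1 to 4 or a namespace that could
/// escape the data directory (an extra guard added for this sketch).
fn namespace_db_path(data_dir: &Path, namespace: &str, quality: u8) -> Option<PathBuf> {
    let unsafe_name = namespace.is_empty()
        || namespace == "."
        || namespace == ".."
        || namespace.contains('/')
        || namespace.contains('\\');
    if unsafe_name || !(1..=4).contains(&quality) {
        return None;
    }
    Some(
        data_dir
            .join(namespace)
            .join(format!("q{}", quality))
            .join("rag.redb"),
    )
}

fn main() {
    let data_dir = Path::new("/home/user/hero/var/embedder/data");
    match namespace_db_path(data_dir, "default", 2) {
        Some(path) => println!("{}", path.display()),
        None => println!("invalid namespace or quality level"),
    }
}
```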
Building
make build # Release build
make check # Fast code check
make test # Unit tests
make lint # Clippy linter
make run # Full stack (server + UI)
make run-server # Server only
make run-ui # UI only
make stop # Stop all services
License
MIT