HeroEmbedder

A fast, local embedding server for RAG applications. Provides dense vector embeddings, similarity search, and reranking via a JSON-RPC 2.0 API with namespace support for isolated document collections.

Architecture

hero_embedder/
├── crates/
│   ├── hero_embedder_lib/         # Library: server internals (ML, storage, retrieval)
│   ├── hero_embedderd/            # Binary: ONNX daemon (TCP, loads all models once)
│   ├── hero_embedder_server/      # Binary: JSON-RPC daemon (Unix socket)
│   ├── hero_embedder_web/         # Binary: Axum web dashboard (Unix socket)
│   ├── hero_embedder_sdk/         # Library: JSON-RPC client and types
│   ├── hero_embedder/             # Binary: CLI using the SDK
│   └── hero_embedder_examples/    # Examples: SDK usage demonstrations
└── Cargo.toml                     # Workspace root

Dependency Graph

hero_embedderd  (ONNX models, TCP)
      ↑
hero_embedder_server  (JSON-RPC Unix socket, delegates embed/rerank to daemon)
      ↑
hero_embedder_sdk  (JSON-RPC client)
      ↑        ↑
      │        │
hero_embedder   hero_embedder_web
(CLI)          (admin UI)

Lifecycle

lab drives the full build/install/start/stop pipeline on top of hero_proc.

lab service embedder --install   # build + install all binaries
lab service embedder --start     # register with hero_proc and start
lab service embedder --stop      # stop all binaries
lab service embedder --status    # status of all binaries

Sockets

Service	Socket Path	Type
Server	`$HERO_SOCKET_DIR/hero_embedder/rpc.sock`	Unix Socket (OpenRPC / JSON-RPC 2.0)
Web UI	`$HERO_SOCKET_DIR/hero_embedder/web.sock`	Unix Socket (HTTP admin dashboard)
Proxy	`$HERO_SOCKET_DIR/hero_embedder_proxy/rpc.sock`	Unix Socket (namespace-isolating proxy)
Daemon	TCP `127.0.0.1:8092` (configurable)	HTTP JSON-RPC + /health

The server and web UI listen on Unix sockets only; external access is provided by hero_router. The daemon's TCP port is intended for loopback use, and cross-node access to it also goes through hero_router.
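The socket paths in the table can be derived from `HERO_SOCKET_DIR` with a few lines of standard-library code. The following is a minimal sketch, not the project's own resolver: the helper names `socket_dir` and `socket_path` are hypothetical, and the fallback to `~/hero/var/sockets` is taken from the default listed under Environment Variables.

```rust
use std::path::{Path, PathBuf};

/// Resolve the base socket directory (hypothetical helper).
/// An explicit, non-empty `HERO_SOCKET_DIR` value wins; otherwise fall back
/// to the documented default `~/hero/var/sockets` under the given home dir.
fn socket_dir(hero_socket_dir: Option<&str>, home: &str) -> PathBuf {
    match hero_socket_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => Path::new(home).join("hero/var/sockets"),
    }
}

/// Build `<base>/<service>/<file>`, e.g. `<base>/hero_embedder/rpc.sock`.
fn socket_path(base: &Path, service: &str, file: &str) -> PathBuf {
    base.join(service).join(file)
}
```

In a real program the first argument would typically come from `std::env::var("HERO_SOCKET_DIR").ok()` and the second from `HOME`.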

Features

Embedding Generation: BGE models (small/base) with INT8/FP32 options
Semantic Search: Fast cosine similarity search
Reranking: Cross-encoder model for improved accuracy
Namespaces: Isolated document collections for multi-tenant use
Persistence: Documents stored in redb databases
Web UI: Bootstrap-based admin dashboard with live updates
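To illustrate what cosine similarity search means for dense embeddings, here is a small brute-force sketch. It is not the server's retrieval code (which may use a different index or scoring path); the function names `cosine_similarity` and `top_k` are made up for this example.

```rust
/// Cosine similarity of two vectors. Returns `None` when the lengths differ,
/// the vectors are empty, or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Score every document against the query and keep the `k` best matches,
/// highest similarity first. Documents that cannot be scored are skipped.
fn top_k(query: &[f32], docs: &[(&str, Vec<f32>)], k: usize) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = docs
        .iter()
        .filter_map(|(id, v)| cosine_similarity(query, v).map(|s| (id.to_string(), s)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    scored.truncate(k);
    scored
}
```

A linear scan like this is only meant to show the scoring rule; it says nothing about how the server achieves its search speed.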

Quality Levels

The quality level is set per namespace at creation time and selects one of the four rows below. All four models are loaded at startup.

Level	Name	Model	Weights	Dimensions	Use Case
1	Fast	bge-small	INT8	384	Real-time, low latency
2	Balanced	bge-small	FP32	384	Default, good balance
3	Quality	bge-base	INT8	768	Better accuracy
4	Best	bge-base	FP32	768	Maximum quality
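The table can be expressed as a small lookup type. This is an illustrative sketch only: the `QualityLevel` enum and its methods are hypothetical names and are not taken from `hero_embedder_lib` or the SDK; the values mirror the table above.

```rust
/// Quality levels 1 to 4 as listed in the table (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QualityLevel {
    Fast = 1,
    Balanced = 2,
    Quality = 3,
    Best = 4,
}

impl QualityLevel {
    /// Map the numeric level used in `namespace.create` to a variant.
    fn from_level(level: u8) -> Option<Self> {
        match level {
            1 => Some(Self::Fast),
            2 => Some(Self::Balanced),
            3 => Some(Self::Quality),
            4 => Some(Self::Best),
            _ => None,
        }
    }

    /// Embedding model family used at this level.
    fn model(self) -> &'static str {
        match self {
            Self::Fast | Self::Balanced => "bge-small",
            Self::Quality | Self::Best => "bge-base",
        }
    }

    /// Weight precision used at this level.
    fn weights(self) -> &'static str {
        match self {
            Self::Fast | Self::Quality => "INT8",
            Self::Balanced | Self::Best => "FP32",
        }
    }

    /// Embedding dimensionality produced at this level.
    fn dimensions(self) -> usize {
        match self {
            Self::Fast | Self::Balanced => 384,
            Self::Quality | Self::Best => 768,
        }
    }
}
```

Because the dimensions differ between bge-small and bge-base, vectors produced at levels 1 and 2 are not comparable with vectors produced at levels 3 and 4.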

API

The JSON-RPC 2.0 endpoint is served at POST /rpc. Parameters are passed positionally, as the examples below show.

Server Info

{"jsonrpc": "2.0", "id": 1, "method": "info", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "health", "params": []}
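For readers who want to issue such a request without the SDK, the sketch below builds the JSON-RPC envelope and wraps it in an HTTP `POST /rpc` request sent over the Unix socket. It uses only the standard library. It assumes the server speaks plain HTTP/1.1 on `rpc.sock`; the helper names, the `Host` header value, and the `Connection: close` choice are illustrative assumptions, and the raw response is returned unparsed.

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;

/// Build a JSON-RPC 2.0 request body. `params_json` must already be valid
/// JSON (for example "[]"), and `method` a plain name that needs no escaping.
fn jsonrpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
        id, method, params_json
    )
}

/// Wrap a JSON body in a minimal HTTP/1.1 POST to /rpc.
fn http_post_rpc(body: &str) -> String {
    format!(
        "POST /rpc HTTP/1.1\r\nHost: localhost\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(),
        body
    )
}

/// Send the request over a Unix socket and return the raw HTTP response
/// (status line, headers and body) as text.
fn call(socket_path: &str, body: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(socket_path)?;
    stream.write_all(http_post_rpc(body).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

A call such as `call(&socket, &jsonrpc_request(1, "health", "[]"))` would then send the second request shown above; in application code the SDK is the more convenient route.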

Embedding

{"jsonrpc": "2.0", "id": 1, "method": "embed", "params": [["hello world", "another text"]]}

Index Management

{"jsonrpc": "2.0", "id": 1, "method": "index.add", "params": [[{"id": "doc1", "text": "hello"}], "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.get", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.delete", "params": ["doc1", "namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.count", "params": ["namespace"]}
{"jsonrpc": "2.0", "id": 1, "method": "index.clear", "params": ["namespace"]}
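Document text passed to `index.add` must be escaped correctly when the request is assembled by hand. The following std-only sketch builds the positional `params` array in the shape shown above; the helper names `json_string` and `index_add_params` are hypothetical, and a JSON library such as `serde_json` would normally do this work.

```rust
/// Encode a Rust string as a JSON string literal, escaping quotes,
/// backslashes and control characters.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build the `params` value for `index.add`: a list of `{id, text}` objects
/// followed by the namespace name, matching the example request above.
fn index_add_params(docs: &[(&str, &str)], namespace: &str) -> String {
    let items: Vec<String> = docs
        .iter()
        .map(|(id, text)| {
            format!(r#"{{"id":{},"text":{}}}"#, json_string(id), json_string(text))
        })
        .collect();
    format!("[[{}],{}]", items.join(","), json_string(namespace))
}
```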

Search

{"jsonrpc": "2.0", "id": 1, "method": "search", "params": ["query text", 10, "namespace", true]}

Rerank

{"jsonrpc": "2.0", "id": 1, "method": "rerank", "params": ["query", [{"id": "1", "text": "..."}], 5]}

Namespaces

{"jsonrpc": "2.0", "id": 1, "method": "namespace.list", "params": []}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.create", "params": ["my-docs", 2]}
{"jsonrpc": "2.0", "id": 1, "method": "namespace.delete", "params": ["my-docs"]}

CLI Client

hero_embedder health
hero_embedder stats
hero_embedder embed "hello world"
hero_embedder search "query" -k 10
hero_embedder add doc1 "document text"
hero_embedder ns-list
hero_embedder ns-create my-docs

SDK Usage (Rust)

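use hero_embedder_sdk::HeroEmbedderClient;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Honor HERO_SOCKET_DIR; fall back to the default ~/hero/var/sockets.
    let socket_dir = match std::env::var("HERO_SOCKET_DIR") {
        Ok(dir) => dir,
        Err(_) => format!("{}/hero/var/sockets", std::env::var("HOME")?),
    };
    let socket = format!("{}/hero_embedder/rpc.sock", socket_dir);
    let client = HeroEmbedderClient::connect_socket(&socket).await?;

    // Fetch the top 10 matches for "hello"; the two optional arguments are left unset.
    let results = client.search("hello", 10, None, None).await?;
    Ok(())
}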

Environment Variables

Variable	Default	Description
`EMBEDDER_MODELS`	`~/hero/var/embedder/models`	Models directory
`EMBEDDER_DATA`	`~/hero/var/embedder/data`	Data directory
`HERO_EMBEDDERD_PORT`	`8092`	TCP port `hero_embedderd` listens on
`HERO_EMBEDDERD_URL`	`http://127.0.0.1:8092`	URL `hero_embedder_server` uses to reach the daemon
`HERO_SOCKET_DIR`	`~/hero/var/sockets`	Base directory for Unix sockets
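The defaults in this table can be applied with a small loader. This is a sketch under stated assumptions, not the service's actual configuration code: `EmbedderConfig` and `load_config` are hypothetical names, the lookup is passed in as a closure so the logic can be exercised without touching the process environment, and `~` is expanded from an explicit home directory argument.

```rust
use std::path::PathBuf;

/// Resolved settings, one field per variable in the table (hypothetical type).
#[derive(Debug, PartialEq)]
struct EmbedderConfig {
    models_dir: PathBuf,
    data_dir: PathBuf,
    daemon_port: u16,
    daemon_url: String,
    socket_dir: PathBuf,
}

/// Read each variable through `get`, falling back to the documented default.
/// An unparsable port is reported as an error instead of being ignored.
fn load_config(
    get: impl Fn(&str) -> Option<String>,
    home: &str,
) -> Result<EmbedderConfig, String> {
    let home = PathBuf::from(home);
    let path = |key: &str, default: &str| {
        get(key).map(PathBuf::from).unwrap_or_else(|| home.join(default))
    };
    let daemon_port = match get("HERO_EMBEDDERD_PORT") {
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|e| format!("invalid HERO_EMBEDDERD_PORT {:?}: {}", raw, e))?,
        None => 8092,
    };
    Ok(EmbedderConfig {
        models_dir: path("EMBEDDER_MODELS", "hero/var/embedder/models"),
        data_dir: path("EMBEDDER_DATA", "hero/var/embedder/data"),
        daemon_port,
        daemon_url: get("HERO_EMBEDDERD_URL")
            .unwrap_or_else(|| "http://127.0.0.1:8092".to_string()),
        socket_dir: path("HERO_SOCKET_DIR", "hero/var/sockets"),
    })
}
```

In a binary this would be called as `load_config(|key| std::env::var(key).ok(), &home)`. Note that the sketch keeps the URL default literal, so changing only `HERO_EMBEDDERD_PORT` does not change `daemon_url`.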

Data Storage

~/hero/var/embedder/
├── models/
│   ├── bge-small/
│   ├── bge-base/
│   └── bge-reranker-base/
└── data/
    ├── default/
    │   └── q2/
    │       └── rag.redb
    └── corpus.redb
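The tree suggests that each namespace gets its own directory with one subdirectory per quality level (`default/q2/rag.redb` for the `default` namespace at level 2). Assuming that reading of the layout is correct, which the tree alone does not confirm, a database path could be computed as in this sketch; `namespace_db_path` is a hypothetical helper.

```rust
use std::path::{Path, PathBuf};

/// Build `<data_dir>/<namespace>/q<quality>/rag.redb`.
/// The namespace/quality interpretation of the directory names is an
/// assumption drawn from the tree above, not from the source code.
fn namespace_db_path(data_dir: &Path, namespace: &str, quality: u8) -> PathBuf {
    data_dir
        .join(namespace)
        .join(format!("q{}", quality))
        .join("rag.redb")
}
```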

License

MIT