updated docs

Maxime Van Hees
2025-10-20 11:38:21 +02:00
parent df780e20a2
commit 483ccb2ba8
8 changed files with 24 additions and 184 deletions


@@ -177,6 +177,6 @@ Copyable JSON examples are provided in the [RPC examples documentation](./rpc_ex
- Encryption: `meta:db:<id>:enc`
For command examples and management payloads:
- RESP command basics: `docs/basics.md`
- Supported commands: `docs/cmds.md`
- JSON-RPC examples: `docs/rpc_examples.md`
- RESP command basics: [docs/basics.md](./basics.md)
- Supported commands: [docs/cmds.md](./cmds.md)
- JSON-RPC examples: [docs/rpc_examples.md](./rpc_examples.md)
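The link changes in this hunk follow one mechanical pattern: a backticked `docs/<name>.md` path becomes a Markdown link relative to the current file. A minimal Python sketch of that rewrite, assuming the edited file itself lives in `docs/` (so `./<name>.md` resolves) and that the paths have no subdirectories. `to_relative_link` is a hypothetical helper, not part of the repository:

```python
import re

# Matches a backticked path such as `docs/cmds.md` (flat file name, no subdirectories).
DOC_PATH = re.compile(r"`docs/([\w.-]+\.md)`")


def to_relative_link(line: str) -> str:
    """Rewrite `docs/<name>.md` into [docs/<name>.md](./<name>.md)."""
    return DOC_PATH.sub(r"[docs/\1](./\1)", line)


print(to_relative_link("- Supported commands: `docs/cmds.md`"))
# prints: - Supported commands: [docs/cmds.md](./cmds.md)
```

Lines without a matching path pass through unchanged, so the helper could be run over a whole file line by line.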


@@ -701,7 +701,7 @@ This expanded documentation includes all the list commands that were implemented
## Updated Database Selection and Access Keys
HeroDB uses an `Admin DB 0` to control database existence, access, and encryption. Access to data DBs can be public (no key) or private (requires a key). See detailed model in `docs/admin.md`.
HeroDB uses an `Admin DB 0` to control database existence, access, and encryption. Access to data DBs can be public (no key) or private (requires a key). See detailed model in [docs/admin.md](./admin.md).
Examples:


@@ -128,7 +128,7 @@ redis-cli -p 6381 --pipe < dump.rdb
Connections start with no database selected. Any storage-backed command (GET, SET, H*, L*, SCAN, etc.) will return an error until you issue a SELECT to choose a database.
HeroDB uses an `Admin DB 0` to govern database existence, access and per-db encryption. Access control is enforced via `Admin DB 0` metadata. See the full model in (docs/admin.md:1).
HeroDB uses an `Admin DB 0` to govern database existence, access and per-db encryption. Access control is enforced via `Admin DB 0` metadata. See the full model in [docs/admin.md](./admin.md).
Examples:
```bash


@@ -71,12 +71,11 @@ redis-cli -p 6379 SELECT 1
HeroDB embeds text internally at STORE/SEARCH time using a per-dataset EmbeddingConfig sidecar. Configure the embedding provider before creating a dataset; this choice determines the provider and the vector dimensions.
Supported providers:
- openai (standard OpenAI or Azure OpenAI)
- openai (standard OpenAI API or custom OpenAI-compatible endpoints)
- testhash (deterministic, CI-friendly; no network)
Environment variables for OpenAI:
Environment variable for OpenAI:
- Standard OpenAI: export OPENAI_API_KEY=sk-...
- Azure OpenAI: export AZURE_OPENAI_API_KEY=...
RESP examples:
```bash
@@ -86,12 +85,9 @@ redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET myset PROVIDER openai MODEL text-em
# OpenAI with reduced output dimension (e.g., 512) when supported
redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET myset PROVIDER openai MODEL text-embedding-3-small PARAM dim 512
# Azure OpenAI (set env: AZURE_OPENAI_API_KEY)
# Custom OpenAI-compatible endpoint (e.g., self-hosted)
redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET myset PROVIDER openai MODEL text-embedding-3-small \
PARAM use_azure true \
PARAM azure_endpoint https://myresource.openai.azure.com \
PARAM azure_deployment my-embed-deploy \
PARAM azure_api_version 2024-02-15 \
PARAM endpoint http://localhost:8081/v1/embeddings \
PARAM dim 512
# Deterministic test provider (no network, stable vectors)


@@ -122,14 +122,10 @@ export OPENAI_API_KEY=sk-...
redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET textset PROVIDER openai MODEL text-embedding-3-small PARAM dim 512
redis-cli -p 6379 LANCE.CREATE textset DIM 512
```
Azure OpenAI:
Custom OpenAI-compatible endpoint:
```bash
export AZURE_OPENAI_API_KEY=...
redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET textset PROVIDER openai MODEL text-embedding-3-small \
PARAM use_azure true \
PARAM azure_endpoint https://myresource.openai.azure.com \
PARAM azure_deployment my-embed-deploy \
PARAM azure_api_version 2024-02-15 \
PARAM endpoint http://localhost:8081/v1/embeddings \
PARAM dim 512
```
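When the endpoint or dimension varies between environments, the same command can be assembled programmatically. The sketch below only builds the argument list, using the argument order shown in the example above. `embedding_config_args` is a hypothetical helper name, and nothing here contacts a server:

```python
import shlex
from typing import List, Optional


def embedding_config_args(
    dataset: str,
    model: str,
    endpoint: Optional[str] = None,
    dim: Optional[int] = None,
) -> List[str]:
    """Build the argument list for LANCE.EMBEDDING CONFIG SET with the openai provider."""
    args = ["LANCE.EMBEDDING", "CONFIG", "SET", dataset,
            "PROVIDER", "openai", "MODEL", model]
    if endpoint is not None:
        # Custom OpenAI-compatible endpoint, e.g. a self-hosted server.
        args += ["PARAM", "endpoint", endpoint]
    if dim is not None:
        # Reduced output dimension, where the model supports it.
        args += ["PARAM", "dim", str(dim)]
    return args


args = embedding_config_args(
    "textset",
    "text-embedding-3-small",
    endpoint="http://localhost:8081/v1/embeddings",
    dim=512,
)
# Render as a shell command line for redis-cli.
print("redis-cli -p 6379 " + shlex.join(args))
```

The resulting list could also be passed to a RESP client that accepts raw command arguments.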
Notes:


@@ -1,6 +1,6 @@
# HeroDB Embedding Models: Complete Tutorial
This tutorial demonstrates how to use embedding models with HeroDB for vector search, covering both local self-hosted models and OpenAI's API.
This tutorial demonstrates how to use embedding models with HeroDB for vector search, covering local self-hosted models, OpenAI's API, and deterministic test embedders.
## Table of Contents
- [Prerequisites](#prerequisites)
@@ -599,163 +599,6 @@ This returns only documents where `topic` equals `'programming'`.
---
## Scenario 2: OpenAI API
Use OpenAI's production embedding service for high-quality semantic search.
### Setup
**1. Set your OpenAI API key:**
```bash
export OPENAI_API_KEY="sk-your-actual-openai-key-here"
```
**2. Start HeroDB:**
```bash
./target/release/herodb --dir ./data --admin-secret my-admin-secret --enable-rpc --rpc-port 8080
```
### Complete Workflow
**Step 1: Create database**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 1,
"method": "herodb_createDatabase",
"params": [
"Lance",
{ "name": "openai-docs", "storage_path": null, "max_size": null, "redis_version": null },
null
]
}
```
**Step 2: Configure OpenAI embeddings**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 2,
"method": "herodb_lanceSetEmbeddingConfig",
"params": [
1,
"articles",
{
"provider": "openai",
"model": "text-embedding-3-small",
"dim": 1536,
"endpoint": null,
"headers": {},
"timeout_ms": 30000
}
]
}
```
Redis-like:
```bash
SELECT 1
LANCE.EMBEDDING CONFIG SET articles PROVIDER openai MODEL text-embedding-3-small DIM 1536
```
**Step 3: Insert articles**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 3,
"method": "herodb_lanceStoreText",
"params": [
1,
"articles",
"article-1",
"Climate change is affecting global weather patterns and ecosystems",
{ "category": "environment", "author": "Jane Smith", "year": "2024" }
]
}
```
```json
{
"jsonrpc": "2.0",
"id": 4,
"method": "herodb_lanceStoreText",
"params": [
1,
"articles",
"article-2",
"Quantum computing promises to revolutionize cryptography and drug discovery",
{ "category": "technology", "author": "John Doe", "year": "2024" }
]
}
```
```json
{
"jsonrpc": "2.0",
"id": 5,
"method": "herodb_lanceStoreText",
"params": [
1,
"articles",
"article-3",
"Renewable energy sources like solar and wind are becoming more cost-effective",
{ "category": "environment", "author": "Alice Johnson", "year": "2023" }
]
}
```
**Step 4: Semantic search**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 6,
"method": "herodb_lanceSearchText",
"params": [
1,
"articles",
"environmental sustainability and green energy",
2,
null,
["category", "author"]
]
}
```
Redis-like:
```bash
LANCE.SEARCH articles K 2 QUERY "environmental sustainability and green energy" RETURN 2 category author
```
Expected: Returns article-1 and article-3 (both environment-related).
**Step 5: Filtered search**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 7,
"method": "herodb_lanceSearchText",
"params": [
1,
"articles",
"new technology innovations",
5,
"category = 'technology'",
null
]
}
```
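All of the JSON-RPC payloads in this scenario share the same envelope and differ only in `method`, `id`, and the positional `params` array. A small Python sketch that builds them is shown here. `jsonrpc_request` is a hypothetical helper, and the meaning of each positional parameter (database id, dataset, query text, result count, filter, returned fields) is inferred from the examples above rather than from an API reference:

```python
import json
from typing import Any, Dict, List


def jsonrpc_request(method: str, params: List[Any], request_id: int) -> Dict[str, Any]:
    """Wrap a method name and positional params in a JSON-RPC 2.0 envelope."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


# Filtered search from Step 5; None serializes to JSON null.
search = jsonrpc_request(
    "herodb_lanceSearchText",
    [1, "articles", "new technology innovations", 5, "category = 'technology'", None],
    request_id=7,
)
print(json.dumps(search, indent=2))
```

The printed JSON can be sent as the body of an HTTP POST to the RPC port passed to `--rpc-port` when starting HeroDB.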
---


@@ -249,5 +249,5 @@ Troubleshooting
Related docs
- Command-level search overview: [docs/search.md](docs/search.md:1)
- RPC definitions: [src/rpc.rs](src/rpc.rs:1)
- Command-level search overview: [docs/search.md](./search.md)
- RPC definitions: [src/rpc.rs](../src/rpc.rs)