From 483ccb2ba800976820a6bcad385d555e0e9cfb24 Mon Sep 17 00:00:00 2001
From: Maxime Van Hees
Date: Mon, 20 Oct 2025 11:38:21 +0200
Subject: [PATCH] updated docs

---
 README.md                               |  15 ++-
 docs/admin.md                           |   6 +-
 docs/basics.md                          |   2 +-
 docs/cmds.md                            |   2 +-
 docs/lance.md                           |  12 +-
 docs/lancedb_text_and_images_example.md |   8 +-
 docs/local_embedder_full_example.md     | 159 +-----------------------
 docs/tantivy.md                         |   4 +-
 8 files changed, 24 insertions(+), 184 deletions(-)

diff --git a/README.md b/README.md
index b9a3efb..f342aba 100644
--- a/README.md
+++ b/README.md
@@ -83,8 +83,13 @@ For examples, see [JSON-RPC Examples](docs/rpc_examples.md) and [Admin DB 0 Mode
 
 For more detailed information on commands, features, and advanced usage, please refer to the documentation:
 
-- [Basics](docs/basics.md)
-- [Supported Commands](docs/cmds.md)
-- [AGE Cryptography](docs/age.md)
-- [Admin DB 0 Model (access control, per-db encryption)](docs/admin.md)
-- [JSON-RPC Examples (management API)](docs/rpc_examples.md)
\ No newline at end of file
+- [Basics](docs/basics.md) - Launch options, symmetric encryption, and basic usage
+- [Supported Commands](docs/cmds.md) - Complete Redis command reference and backend comparison
+- [AGE Cryptography](docs/age.md) - Asymmetric encryption and digital signatures
+- [Admin DB 0 Model](docs/admin.md) - Database management, access control, and per-database encryption
+- [JSON-RPC Examples](docs/rpc_examples.md) - Management API examples
+- [Full-Text Search](docs/search.md) - Tantivy-powered search capabilities
+- [Tantivy Backend](docs/tantivy.md) - Tantivy as a dedicated database backend
+- [Lance Vector Store](docs/lance.md) - Vector embeddings and semantic search
+- [Lance Text and Images Example](docs/lancedb_text_and_images_example.md) - End-to-end vector search examples
+- [Local Embedder Tutorial](docs/local_embedder_full_example.md) - Complete embedding models tutorial
\ No newline at end of file
diff --git a/docs/admin.md b/docs/admin.md
index 7ceffab..02212e7 100644
--- a/docs/admin.md
+++ b/docs/admin.md
@@ -177,6 +177,6 @@ Copyable JSON examples are provided in the [RPC examples documentation](./rpc_ex
 - Encryption: `meta:db::enc`
 
 For command examples and management payloads:
-- RESP command basics: `docs/basics.md`
-- Supported commands: `docs/cmds.md`
-- JSON-RPC examples: `docs/rpc_examples.md`
\ No newline at end of file
+- RESP command basics: [docs/basics.md](./basics.md)
+- Supported commands: [docs/cmds.md](./cmds.md)
+- JSON-RPC examples: [docs/rpc_examples.md](./rpc_examples.md)
\ No newline at end of file
diff --git a/docs/basics.md b/docs/basics.md
index 20d73bd..4d0bbc2 100644
--- a/docs/basics.md
+++ b/docs/basics.md
@@ -701,7 +701,7 @@ This expanded documentation includes all the list commands that were implemented
 
 ## Updated Database Selection and Access Keys
 
-HeroDB uses an `Admin DB 0` to control database existence, access, and encryption. Access to data DBs can be public (no key) or private (requires a key). See detailed model in `docs/admin.md`.
+HeroDB uses an `Admin DB 0` to control database existence, access, and encryption. Access to data DBs can be public (no key) or private (requires a key). See detailed model in [docs/admin.md](./admin.md).
 
 Examples:
 
diff --git a/docs/cmds.md b/docs/cmds.md
index e132141..6f7c289 100644
--- a/docs/cmds.md
+++ b/docs/cmds.md
@@ -128,7 +128,7 @@ redis-cli -p 6381 --pipe < dump.rdb
 
 Connections start with no database selected. Any storage-backed command (GET, SET, H*, L*, SCAN, etc.) will return an error until you issue a SELECT to choose a database.
 
-HeroDB uses an `Admin DB 0` to govern database existence, access and per-db encryption. Access control is enforced via `Admin DB 0` metadata. See the full model in (docs/admin.md:1).
+HeroDB uses an `Admin DB 0` to govern database existence, access and per-db encryption. Access control is enforced via `Admin DB 0` metadata. See the full model in [docs/admin.md](./admin.md).
 
 Examples:
 ```bash
diff --git a/docs/lance.md b/docs/lance.md
index 2ea1b24..0c2ced2 100644
--- a/docs/lance.md
+++ b/docs/lance.md
@@ -71,12 +71,11 @@ redis-cli -p 6379 SELECT 1
 HeroDB embeds text internally at STORE/SEARCH time using a per-dataset EmbeddingConfig sidecar. Configure provider before creating a dataset to choose dimensions and provider.
 
 Supported providers:
-- openai (standard OpenAI or Azure OpenAI)
+- openai (standard OpenAI API or custom OpenAI-compatible endpoints)
 - testhash (deterministic, CI-friendly; no network)
 
-Environment variables for OpenAI:
+Environment variable for OpenAI:
 - Standard OpenAI: export OPENAI_API_KEY=sk-...
-- Azure OpenAI: export AZURE_OPENAI_API_KEY=...
 
 RESP examples:
 ```bash
@@ -86,12 +85,9 @@ redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET myset PROVIDER openai MODEL text-em
 # OpenAI with reduced output dimension (e.g., 512) when supported
 redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET myset PROVIDER openai MODEL text-embedding-3-small PARAM dim 512
 
-# Azure OpenAI (set env: AZURE_OPENAI_API_KEY)
+# Custom OpenAI-compatible endpoint (e.g., self-hosted)
 redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET myset PROVIDER openai MODEL text-embedding-3-small \
-  PARAM use_azure true \
-  PARAM azure_endpoint https://myresource.openai.azure.com \
-  PARAM azure_deployment my-embed-deploy \
-  PARAM azure_api_version 2024-02-15 \
+  PARAM endpoint http://localhost:8081/v1/embeddings \
   PARAM dim 512
 
 # Deterministic test provider (no network, stable vectors)
diff --git a/docs/lancedb_text_and_images_example.md b/docs/lancedb_text_and_images_example.md
index d4db68c..9780333 100644
--- a/docs/lancedb_text_and_images_example.md
+++ b/docs/lancedb_text_and_images_example.md
@@ -122,14 +122,10 @@ export OPENAI_API_KEY=sk-...
 redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET textset PROVIDER openai MODEL text-embedding-3-small PARAM dim 512
 redis-cli -p 6379 LANCE.CREATE textset DIM 512
 ```
-Azure OpenAI:
+Custom OpenAI-compatible endpoint:
 ```bash
-export AZURE_OPENAI_API_KEY=...
 redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET textset PROVIDER openai MODEL text-embedding-3-small \
-  PARAM use_azure true \
-  PARAM azure_endpoint https://myresource.openai.azure.com \
-  PARAM azure_deployment my-embed-deploy \
-  PARAM azure_api_version 2024-02-15 \
+  PARAM endpoint http://localhost:8081/v1/embeddings \
   PARAM dim 512
 ```
 Notes:
diff --git a/docs/local_embedder_full_example.md b/docs/local_embedder_full_example.md
index c3c4eae..576aab4 100644
--- a/docs/local_embedder_full_example.md
+++ b/docs/local_embedder_full_example.md
@@ -1,6 +1,6 @@
 # HeroDB Embedding Models: Complete Tutorial
 
-This tutorial demonstrates how to use embedding models with HeroDB for vector search, covering both local self-hosted models and OpenAI's API.
+This tutorial demonstrates how to use embedding models with HeroDB for vector search, covering local self-hosted models, OpenAI's API, and deterministic test embedders.
 
 ## Table of Contents
 - [Prerequisites](#prerequisites)
@@ -599,163 +599,6 @@ This returns only documents where `topic` equals `'programming'`.
 
 ---
 
-## Scenario 2: OpenAI API
-
-Use OpenAI's production embedding service for high-quality semantic search.
-
-### Setup
-
-**1. Set your OpenAI API key:**
-
-```bash
-export OPENAI_API_KEY="sk-your-actual-openai-key-here"
-```
-
-**2. Start HeroDB:**
-
-```bash
-./target/release/herodb --dir ./data --admin-secret my-admin-secret --enable-rpc --rpc-port 8080
-```
-
-### Complete Workflow
-
-**Step 1: Create database**
-
-JSON-RPC:
-```json
-{
-  "jsonrpc": "2.0",
-  "id": 1,
-  "method": "herodb_createDatabase",
-  "params": [
-    "Lance",
-    { "name": "openai-docs", "storage_path": null, "max_size": null, "redis_version": null },
-    null
-  ]
-}
-```
-
-**Step 2: Configure OpenAI embeddings**
-
-JSON-RPC:
-```json
-{
-  "jsonrpc": "2.0",
-  "id": 2,
-  "method": "herodb_lanceSetEmbeddingConfig",
-  "params": [
-    1,
-    "articles",
-    {
-      "provider": "openai",
-      "model": "text-embedding-3-small",
-      "dim": 1536,
-      "endpoint": null,
-      "headers": {},
-      "timeout_ms": 30000
-    }
-  ]
-}
-```
-
-Redis-like:
-```bash
-SELECT 1
-LANCE.EMBEDDING CONFIG SET articles PROVIDER openai MODEL text-embedding-3-small DIM 1536
-```
-
-**Step 3: Insert articles**
-
-JSON-RPC:
-```json
-{
-  "jsonrpc": "2.0",
-  "id": 3,
-  "method": "herodb_lanceStoreText",
-  "params": [
-    1,
-    "articles",
-    "article-1",
-    "Climate change is affecting global weather patterns and ecosystems",
-    { "category": "environment", "author": "Jane Smith", "year": "2024" }
-  ]
-}
-```
-
-```json
-{
-  "jsonrpc": "2.0",
-  "id": 4,
-  "method": "herodb_lanceStoreText",
-  "params": [
-    1,
-    "articles",
-    "article-2",
-    "Quantum computing promises to revolutionize cryptography and drug discovery",
-    { "category": "technology", "author": "John Doe", "year": "2024" }
-  ]
-}
-```
-
-```json
-{
-  "jsonrpc": "2.0",
-  "id": 5,
-  "method": "herodb_lanceStoreText",
-  "params": [
-    1,
-    "articles",
-    "article-3",
-    "Renewable energy sources like solar and wind are becoming more cost-effective",
-    { "category": "environment", "author": "Alice Johnson", "year": "2023" }
-  ]
-}
-```
-
-**Step 4: Semantic search**
-
-JSON-RPC:
-```json
-{
-  "jsonrpc": "2.0",
-  "id": 6,
-  "method": "herodb_lanceSearchText",
-  "params": [
-    1,
-    "articles",
-    "environmental sustainability and green energy",
-    2,
-    null,
-    ["category", "author"]
-  ]
-}
-```
-
-Redis-like:
-```bash
-LANCE.SEARCH articles K 2 QUERY "environmental sustainability and green energy" RETURN 2 category author
-```
-
-Expected: Returns article-1 and article-3 (both environment-related).
-
-**Step 5: Filtered search**
-
-JSON-RPC:
-```json
-{
-  "jsonrpc": "2.0",
-  "id": 7,
-  "method": "herodb_lanceSearchText",
-  "params": [
-    1,
-    "articles",
-    "new technology innovations",
-    5,
-    "category = 'technology'",
-    null
-  ]
-}
-```
 
 ---
 
diff --git a/docs/tantivy.md b/docs/tantivy.md
index d217395..c677358 100644
--- a/docs/tantivy.md
+++ b/docs/tantivy.md
@@ -249,5 +249,5 @@ Troubleshooting
 
 Related docs
 
-- Command‑level search overview: [docs/search.md](docs/search.md:1)
-- RPC definitions: [src/rpc.rs](src/rpc.rs:1)
\ No newline at end of file
+- Command‑level search overview: [docs/search.md](./search.md)
+- RPC definitions: [src/rpc.rs](../src/rpc.rs)
\ No newline at end of file