updated docs

Maxime Van Hees
2025-10-20 11:38:21 +02:00
parent df780e20a2
commit 483ccb2ba8
8 changed files with 24 additions and 184 deletions

@@ -1,6 +1,6 @@
# HeroDB Embedding Models: Complete Tutorial
-This tutorial demonstrates how to use embedding models with HeroDB for vector search, covering both local self-hosted models and OpenAI's API.
+This tutorial demonstrates how to use embedding models with HeroDB for vector search, covering local self-hosted models, OpenAI's API, and deterministic test embedders.
## Table of Contents
- [Prerequisites](#prerequisites)
@@ -599,163 +599,6 @@ This returns only documents where `topic` equals `'programming'`.
---
## Scenario 2: OpenAI API
Use OpenAI's production embedding service for high-quality semantic search.
### Setup
**1. Set your OpenAI API key:**
```bash
export OPENAI_API_KEY="sk-your-actual-openai-key-here"
```
**2. Start HeroDB:**
```bash
./target/release/herodb --dir ./data --admin-secret my-admin-secret --enable-rpc --rpc-port 8080
```
### Complete Workflow
**Step 1: Create database**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 1,
"method": "herodb_createDatabase",
"params": [
"Lance",
{ "name": "openai-docs", "storage_path": null, "max_size": null, "redis_version": null },
null
]
}
```
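The JSON-RPC payloads in this scenario all share the same envelope, so a client can build them with one helper. The following is a minimal Python sketch, not part of HeroDB: `rpc_request` and `post_rpc` are hypothetical names, and it assumes the RPC server accepts JSON-RPC over HTTP POST. The tutorial gives the port (8080) but not the URL path, so the caller has to supply the full URL.

```python
import json
import urllib.request
from itertools import count

_request_ids = count(1)  # ids only need to be unique per client


def rpc_request(method, params):
    """Build a JSON-RPC 2.0 request envelope as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    }


def post_rpc(url, request, timeout_s=30):
    """POST one request and decode the JSON reply.

    Assumption: the server speaks JSON-RPC over HTTP. `url` comes from
    the caller because the tutorial does not state the HTTP path.
    """
    body = json.dumps(request).encode("utf-8")
    http_request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(http_request, timeout=timeout_s) as response:
        return json.load(response)


# Same payload as Step 1; Python's None serializes to JSON null.
create_db = rpc_request(
    "herodb_createDatabase",
    [
        "Lance",
        {"name": "openai-docs", "storage_path": None, "max_size": None, "redis_version": None},
        None,
    ],
)
print(json.dumps(create_db, indent=2))
```

Only `rpc_request` runs here; `post_rpc` is defined but not called, since it needs a running server.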
**Step 2: Configure OpenAI embeddings**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 2,
"method": "herodb_lanceSetEmbeddingConfig",
"params": [
1,
"articles",
{
"provider": "openai",
"model": "text-embedding-3-small",
"dim": 1536,
"endpoint": null,
"headers": {},
"timeout_ms": 30000
}
]
}
```
Redis-like:
```bash
SELECT 1
LANCE.EMBEDDING CONFIG SET articles PROVIDER openai MODEL text-embedding-3-small DIM 1536
```
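The `dim` value has to match the length of the vectors the model returns. A small client-side check can catch a mismatch before the config is sent. This is a sketch under assumptions: `embedding_config` and `config_command` are hypothetical helpers, the table lists OpenAI's published default output sizes, and it assumes HeroDB requests the model's default size rather than a shortened one.

```python
# Default output sizes published by OpenAI for its embedding models.
DEFAULT_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def embedding_config(model, dim=None, timeout_ms=30000):
    """Return a config object shaped like the one in Step 2.

    Raises ValueError when `dim` disagrees with the model's default
    size, or when the model is unknown and no `dim` is given.
    """
    expected = DEFAULT_DIMS.get(model)
    if dim is None:
        if expected is None:
            raise ValueError(f"unknown model {model!r}: pass dim explicitly")
        dim = expected
    elif expected is not None and dim != expected:
        raise ValueError(f"{model} returns {expected} dimensions, got dim={dim}")
    return {
        "provider": "openai",
        "model": model,
        "dim": dim,
        "endpoint": None,
        "headers": {},
        "timeout_ms": timeout_ms,
    }


def config_command(dataset, cfg):
    """Render the Redis-like form of the same configuration."""
    return (
        f"LANCE.EMBEDDING CONFIG SET {dataset} "
        f"PROVIDER {cfg['provider']} MODEL {cfg['model']} DIM {cfg['dim']}"
    )


cfg = embedding_config("text-embedding-3-small")
print(config_command("articles", cfg))
```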
**Step 3: Insert articles**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 3,
"method": "herodb_lanceStoreText",
"params": [
1,
"articles",
"article-1",
"Climate change is affecting global weather patterns and ecosystems",
{ "category": "environment", "author": "Jane Smith", "year": "2024" }
]
}
```
```json
{
"jsonrpc": "2.0",
"id": 4,
"method": "herodb_lanceStoreText",
"params": [
1,
"articles",
"article-2",
"Quantum computing promises to revolutionize cryptography and drug discovery",
{ "category": "technology", "author": "John Doe", "year": "2024" }
]
}
```
```json
{
"jsonrpc": "2.0",
"id": 5,
"method": "herodb_lanceStoreText",
"params": [
1,
"articles",
"article-3",
"Renewable energy sources like solar and wind are becoming more cost-effective",
{ "category": "environment", "author": "Alice Johnson", "year": "2023" }
]
}
```
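The three inserts differ only in document id, text, and metadata, so they can be generated from a list. The sketch below rebuilds the same three payloads in Python; `store_text_requests` is a hypothetical helper, and the parameter order (database id, dataset, document id, text, metadata) is read off the examples above rather than taken from an API reference.

```python
import json

# Documents copied from the three requests in Step 3.
ARTICLES = [
    (
        "article-1",
        "Climate change is affecting global weather patterns and ecosystems",
        {"category": "environment", "author": "Jane Smith", "year": "2024"},
    ),
    (
        "article-2",
        "Quantum computing promises to revolutionize cryptography and drug discovery",
        {"category": "technology", "author": "John Doe", "year": "2024"},
    ),
    (
        "article-3",
        "Renewable energy sources like solar and wind are becoming more cost-effective",
        {"category": "environment", "author": "Alice Johnson", "year": "2023"},
    ),
]


def store_text_requests(db_id, dataset, articles, first_id=3):
    """Build one herodb_lanceStoreText request per article.

    Request ids count up from `first_id`, matching ids 3 to 5 above.
    """
    return [
        {
            "jsonrpc": "2.0",
            "id": first_id + offset,
            "method": "herodb_lanceStoreText",
            "params": [db_id, dataset, doc_id, text, metadata],
        }
        for offset, (doc_id, text, metadata) in enumerate(articles)
    ]


requests_to_send = store_text_requests(1, "articles", ARTICLES)
for request in requests_to_send:
    print(json.dumps(request))
```

Metadata values stay strings (for example `"year": "2024"`), as in the original payloads.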
**Step 4: Semantic search**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 6,
"method": "herodb_lanceSearchText",
"params": [
1,
"articles",
"environmental sustainability and green energy",
2,
null,
["category", "author"]
]
}
```
Redis-like:
```bash
LANCE.SEARCH articles K 2 QUERY "environmental sustainability and green energy" RETURN 2 category author
```
Expected: Returns article-1 and article-3 (both environment-related).
**Step 5: Filtered search**
JSON-RPC:
```json
{
"jsonrpc": "2.0",
"id": 7,
"method": "herodb_lanceSearchText",
"params": [
1,
"articles",
"new technology innovations",
5,
"category = 'technology'",
null
]
}
```
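Steps 4 and 5 call the same method and differ only in the filter and return-field arguments. A minimal Python sketch of building both requests follows. `eq_filter` and `search_text_request` are hypothetical helpers, the positional order is inferred from the two examples above, and doubling single quotes inside a value is the usual SQL convention, which is an assumption about HeroDB's filter grammar.

```python
import json


def eq_filter(field, value):
    """Build an equality filter such as: category = 'technology'.

    Assumption: string literals follow SQL quoting, so an embedded
    single quote is escaped by doubling it.
    """
    escaped = value.replace("'", "''")
    return f"{field} = '{escaped}'"


def search_text_request(request_id, db_id, dataset, query, k,
                        filter_expr=None, return_fields=None):
    """Build a herodb_lanceSearchText request.

    Unused optional arguments are sent as JSON null, as in the
    tutorial's own payloads.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "herodb_lanceSearchText",
        "params": [db_id, dataset, query, k, filter_expr, return_fields],
    }


# Step 4: top 2 matches, returning two metadata fields.
semantic = search_text_request(
    6, 1, "articles", "environmental sustainability and green energy", 2,
    return_fields=["category", "author"],
)

# Step 5: top 5 matches restricted to one category.
filtered = search_text_request(
    7, 1, "articles", "new technology innovations", 5,
    filter_expr=eq_filter("category", "technology"),
)

print(json.dumps(semantic))
print(json.dumps(filtered))
```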
---