updated docs
@@ -1,6 +1,6 @@
# HeroDB Embedding Models: Complete Tutorial

This tutorial demonstrates how to use embedding models with HeroDB for vector search, covering both local self-hosted models and OpenAI's API.
This tutorial demonstrates how to use embedding models with HeroDB for vector search, covering local self-hosted models, OpenAI's API, and deterministic test embedders.

## Table of Contents
- [Prerequisites](#prerequisites)
@@ -599,163 +599,6 @@ This returns only documents where `topic` equals `'programming'`.

---

## Scenario 2: OpenAI API

Use OpenAI's production embedding service for high-quality semantic search.

### Setup

**1. Set your OpenAI API key:**

```bash
export OPENAI_API_KEY="sk-your-actual-openai-key-here"
```

**2. Start HeroDB:**

```bash
./target/release/herodb --dir ./data --admin-secret my-admin-secret --enable-rpc --rpc-port 8080
```
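
A missing or placeholder key would presumably only surface once the first embedding request is made, so a quick check before launching HeroDB can save a round trip. The following is a minimal sketch, not part of HeroDB: `check_api_key` and `PLACEHOLDER` are hypothetical names, and it assumes HeroDB picks up `OPENAI_API_KEY` from the environment of the shell that starts it, as the export step above suggests.

```python
import os

# The placeholder value shown in the Setup step above.
PLACEHOLDER = "sk-your-actual-openai-key-here"


def check_api_key(env=None):
    """Return (ok, message) describing whether OPENAI_API_KEY looks usable.

    `env` defaults to the real process environment; pass a dict to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return False, "OPENAI_API_KEY is not set"
    if key == PLACEHOLDER:
        return False, "OPENAI_API_KEY still holds the tutorial placeholder"
    return True, "OPENAI_API_KEY is set"


ok, message = check_api_key()
print(message)
```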

### Complete Workflow

**Step 1: Create database**

JSON-RPC:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "herodb_createDatabase",
  "params": [
    "Lance",
    { "name": "openai-docs", "storage_path": null, "max_size": null, "redis_version": null },
    null
  ]
}
```
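
The request above can also be assembled and sent from a script. The sketch below is one way to do that in Python using only the standard library. It assumes the RPC server accepts JSON-RPC 2.0 over HTTP POST on the port given to `--rpc-port`; the URL in `RPC_URL` and the helper names `build_request` and `call_rpc` are illustrative assumptions, not part of HeroDB. Only `build_request` runs here; `call_rpc` needs a live server.

```python
import json
import urllib.request

# Assumption: JSON-RPC over HTTP POST on the port from the Setup step.
RPC_URL = "http://localhost:8080"


def build_request(method, params, req_id):
    """Return a JSON-RPC 2.0 request object as a plain dict."""
    return {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}


def call_rpc(method, params, req_id=1, url=RPC_URL, timeout=30):
    """POST one request and return the decoded response (needs a running server)."""
    body = json.dumps(build_request(method, params, req_id)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payload as the Step 1 example; Python None serializes to JSON null.
create_db = build_request(
    "herodb_createDatabase",
    [
        "Lance",
        {"name": "openai-docs", "storage_path": None, "max_size": None, "redis_version": None},
        None,
    ],
    1,
)
print(json.dumps(create_db, indent=2))
```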

**Step 2: Configure OpenAI embeddings**

JSON-RPC:
```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "herodb_lanceSetEmbeddingConfig",
  "params": [
    1,
    "articles",
    {
      "provider": "openai",
      "model": "text-embedding-3-small",
      "dim": 1536,
      "endpoint": null,
      "headers": {},
      "timeout_ms": 30000
    }
  ]
}
```

Redis-like:
```bash
SELECT 1
LANCE.EMBEDDING CONFIG SET articles PROVIDER openai MODEL text-embedding-3-small DIM 1536
```
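
The JSON-RPC and Redis-like forms above carry the same provider, model, and dimension, so it can help to derive both from one config object and keep them from drifting apart. The sketch below does this under stated assumptions: `embedding_config` and `redis_command` are hypothetical helpers, and the field names and command layout are copied from the two examples above rather than from a HeroDB API reference.

```python
def embedding_config(model, dim, provider="openai", timeout_ms=30000):
    """Build the config object used as the third JSON-RPC parameter."""
    if dim <= 0:
        raise ValueError("dim must be a positive integer")
    return {
        "provider": provider,
        "model": model,
        "dim": dim,
        "endpoint": None,   # null in the example above
        "headers": {},
        "timeout_ms": timeout_ms,
    }


def redis_command(dataset, cfg):
    """Render the Redis-like command that matches the same config."""
    return (
        f"LANCE.EMBEDDING CONFIG SET {dataset} "
        f"PROVIDER {cfg['provider']} MODEL {cfg['model']} DIM {cfg['dim']}"
    )


cfg = embedding_config("text-embedding-3-small", 1536)
params = [1, "articles", cfg]  # database id, dataset name, config
print(redis_command("articles", cfg))
```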

**Step 3: Insert articles**

JSON-RPC:
```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "herodb_lanceStoreText",
  "params": [
    1,
    "articles",
    "article-1",
    "Climate change is affecting global weather patterns and ecosystems",
    { "category": "environment", "author": "Jane Smith", "year": "2024" }
  ]
}
```

```json
{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "herodb_lanceStoreText",
  "params": [
    1,
    "articles",
    "article-2",
    "Quantum computing promises to revolutionize cryptography and drug discovery",
    { "category": "technology", "author": "John Doe", "year": "2024" }
  ]
}
```

```json
{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "herodb_lanceStoreText",
  "params": [
    1,
    "articles",
    "article-3",
    "Renewable energy sources like solar and wind are becoming more cost-effective",
    { "category": "environment", "author": "Alice Johnson", "year": "2023" }
  ]
}
```
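
The three requests above differ only in their `id`, document id, text, and metadata, so they can be generated from a list. The sketch below builds the same payloads in Python; `store_text_requests` and `ARTICLES` are hypothetical names, and the parameter order (database id, dataset, document id, text, metadata) is taken from the examples above. It only constructs the requests and does not send them.

```python
import json

# (document id, text, metadata) for the three articles shown above.
ARTICLES = [
    ("article-1",
     "Climate change is affecting global weather patterns and ecosystems",
     {"category": "environment", "author": "Jane Smith", "year": "2024"}),
    ("article-2",
     "Quantum computing promises to revolutionize cryptography and drug discovery",
     {"category": "technology", "author": "John Doe", "year": "2024"}),
    ("article-3",
     "Renewable energy sources like solar and wind are becoming more cost-effective",
     {"category": "environment", "author": "Alice Johnson", "year": "2023"}),
]


def store_text_requests(db_id, dataset, articles, first_id=3):
    """Build one herodb_lanceStoreText request per article, with sequential ids."""
    return [
        {
            "jsonrpc": "2.0",
            "id": first_id + offset,
            "method": "herodb_lanceStoreText",
            "params": [db_id, dataset, doc_id, text, meta],
        }
        for offset, (doc_id, text, meta) in enumerate(articles)
    ]


requests = store_text_requests(1, "articles", ARTICLES)
for request in requests:
    print(json.dumps(request))
```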

**Step 4: Semantic search**

JSON-RPC:
```json
{
  "jsonrpc": "2.0",
  "id": 6,
  "method": "herodb_lanceSearchText",
  "params": [
    1,
    "articles",
    "environmental sustainability and green energy",
    2,
    null,
    ["category", "author"]
  ]
}
```

Redis-like:
```bash
LANCE.SEARCH articles K 2 QUERY "environmental sustainability and green energy" RETURN 2 category author
```

Expected: Returns article-1 and article-3 (both environment-related).
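
To see why a top-2 search would favor the two environment articles, it can help to picture the ranking step on its own. The sketch below is a toy illustration only: the 3-dimensional vectors are hand-picked stand-ins (real `text-embedding-3-small` vectors have 1536 dimensions and come from the API), and cosine similarity is an assumed metric, since this tutorial does not state which distance HeroDB uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hand-made toy vectors; axes loosely mean (environment, energy, computing).
TOY_VECTORS = {
    "article-1": [0.9, 0.2, 0.0],  # climate change
    "article-2": [0.0, 0.1, 0.9],  # quantum computing
    "article-3": [0.6, 0.8, 0.1],  # renewable energy
}
TOY_QUERY = [0.7, 0.7, 0.0]        # "environmental sustainability and green energy"


def top_k(query, vectors, k):
    """Return the k document ids most similar to the query, best first."""
    ranked = sorted(vectors, key=lambda doc_id: cosine(query, vectors[doc_id]), reverse=True)
    return ranked[:k]


print(top_k(TOY_QUERY, TOY_VECTORS, 2))
```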

**Step 5: Filtered search**

JSON-RPC:
```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "herodb_lanceSearchText",
  "params": [
    1,
    "articles",
    "new technology innovations",
    5,
    "category = 'technology'",
    null
  ]
}
```
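
When the filter value comes from user input rather than a literal, building the filter string by hand is error-prone. The sketch below shows one cautious way to produce the `field = 'value'` form used above. It assumes the filter follows SQL-style quoting, where a single quote inside a value is escaped by doubling it; the tutorial only shows the simple case, so that escaping rule and the helper name `eq_filter` are assumptions.

```python
import re

# Accept only plain identifiers as field names to avoid injecting filter syntax.
_FIELD = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def eq_filter(field, value):
    """Return an equality filter string such as: category = 'technology'."""
    if not _FIELD.match(field):
        raise ValueError(f"unsupported field name: {field!r}")
    escaped = value.replace("'", "''")  # assumed SQL-style quote escaping
    return f"{field} = '{escaped}'"


# Same request as Step 5, with the filter generated instead of hard-coded.
search_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "herodb_lanceSearchText",
    "params": [1, "articles", "new technology innovations", 5,
               eq_filter("category", "technology"), None],
}
print(search_request["params"][4])
```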

---