first commit
docs/ARCHITECTURE.md
@@ -0,0 +1,426 @@
# OSIRIS Architecture - Trait-Based Generic Objects

## Overview

OSIRIS has been refactored to use a trait-based architecture similar to heromodels, allowing any object implementing the `Object` trait to be stored and indexed automatically based on field attributes.

## Core Concepts

### 1. BaseData

Every OSIRIS object must include `BaseData`, which provides:
- **id**: Unique identifier (UUID or user-assigned)
- **ns**: Namespace the object belongs to
- **created_at**: Creation timestamp
- **modified_at**: Last modification timestamp
- **mime**: Optional MIME type
- **size**: Optional content size

```rust
pub struct BaseData {
    pub id: String,
    pub ns: String,
    pub created_at: OffsetDateTime,
    pub modified_at: OffsetDateTime,
    pub mime: Option<String>,
    pub size: Option<u64>,
}
```

### 2. Object Trait

The `Object` trait is the core abstraction for all OSIRIS objects:

```rust
pub trait Object: Debug + Clone + Serialize + for<'de> Deserialize<'de> + Send + Sync {
    /// Get the object type name
    fn object_type() -> &'static str where Self: Sized;

    /// Get base data reference
    fn base_data(&self) -> &BaseData;

    /// Get mutable base data reference
    fn base_data_mut(&mut self) -> &mut BaseData;

    /// Get index keys for this object (auto-generated from #[index] fields)
    fn index_keys(&self) -> Vec<IndexKey>;

    /// Get list of indexed field names
    fn indexed_fields() -> Vec<&'static str> where Self: Sized;

    /// Get searchable text content
    fn searchable_text(&self) -> Option<String>;

    /// Serialize to JSON
    fn to_json(&self) -> Result<String>;

    /// Deserialize from JSON
    fn from_json(json: &str) -> Result<Self> where Self: Sized;
}
```

### 3. IndexKey

Represents an index entry for a field:

```rust
pub struct IndexKey {
    pub name: &'static str,  // Field name
    pub value: String,       // Field value
}
```

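The examples in this document call `IndexKey::new`, which is not defined here. A minimal sketch of what such a constructor could look like follows. It is an assumption, not the crate's actual code: it uses an owned `String` for `name` (rather than `&'static str`) so that names built at runtime, such as `tag:topic`, can be stored.

```rust
/// Hypothetical variant of `IndexKey` with an owned name, so that dynamic
/// names such as `tag:topic` can be stored (assumption, not the real type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IndexKey {
    pub name: String,
    pub value: String,
}

impl IndexKey {
    /// Assumed constructor: accepts anything convertible into a `String`,
    /// e.g. `&str`, `&String`, or `String`.
    pub fn new(name: impl Into<String>, value: impl Into<String>) -> Self {
        Self {
            name: name.into(),
            value: value.into(),
        }
    }
}
```

With this shape, both `IndexKey::new("title", title)` and `IndexKey::new(&format!("tag:{}", key), value)` type-check.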
## Example: Note Object

```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Note {
    pub base_data: BaseData,

    // Indexed field - marked with #[index]
    #[index]
    pub title: Option<String>,

    // Searchable content (not indexed)
    pub content: Option<String>,

    // Indexed tags - marked with #[index]
    #[index]
    pub tags: BTreeMap<String, String>,
}

impl Object for Note {
    fn object_type() -> &'static str {
        "note"
    }

    fn base_data(&self) -> &BaseData {
        &self.base_data
    }

    fn base_data_mut(&mut self) -> &mut BaseData {
        &mut self.base_data
    }

    fn index_keys(&self) -> Vec<IndexKey> {
        let mut keys = Vec::new();

        // Index title
        if let Some(title) = &self.title {
            keys.push(IndexKey::new("title", title));
        }

        // Index tags
        for (key, value) in &self.tags {
            keys.push(IndexKey::new(&format!("tag:{}", key), value));
        }

        keys
    }

    fn indexed_fields() -> Vec<&'static str> {
        vec!["title", "tags"]
    }

    fn searchable_text(&self) -> Option<String> {
        let mut text = String::new();
        if let Some(title) = &self.title {
            text.push_str(title);
            text.push(' ');
        }
        if let Some(content) = &self.content {
            text.push_str(content);
        }
        if text.is_empty() { None } else { Some(text) }
    }
}
```

## Example: Event Object

```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Event {
    pub base_data: BaseData,

    #[index]
    pub title: String,

    pub description: Option<String>,

    #[index]
    pub start_time: OffsetDateTime,

    pub end_time: OffsetDateTime,

    #[index]
    pub location: Option<String>,

    #[index]
    pub status: EventStatus,

    pub all_day: bool,

    #[index]
    pub category: Option<String>,
}

impl Object for Event {
    fn object_type() -> &'static str {
        "event"
    }

    fn base_data(&self) -> &BaseData {
        &self.base_data
    }

    fn base_data_mut(&mut self) -> &mut BaseData {
        &mut self.base_data
    }

    fn index_keys(&self) -> Vec<IndexKey> {
        let mut keys = Vec::new();

        keys.push(IndexKey::new("title", &self.title));

        if let Some(location) = &self.location {
            keys.push(IndexKey::new("location", location));
        }

        let status_str = match self.status {
            EventStatus::Draft => "draft",
            EventStatus::Published => "published",
            EventStatus::Cancelled => "cancelled",
        };
        keys.push(IndexKey::new("status", status_str));

        if let Some(category) = &self.category {
            keys.push(IndexKey::new("category", category));
        }

        // Index by date for day-based queries
        let date_str = self.start_time.date().to_string();
        keys.push(IndexKey::new("date", date_str));

        keys
    }

    fn indexed_fields() -> Vec<&'static str> {
        vec!["title", "location", "status", "category", "start_time"]
    }

    fn searchable_text(&self) -> Option<String> {
        let mut text = String::new();
        text.push_str(&self.title);
        text.push(' ');
        if let Some(description) = &self.description {
            text.push_str(description);
        }
        Some(text)
    }
}
```

## Storage Layer

### GenericStore

The `GenericStore` provides a type-safe storage layer for any object implementing `Object`:

```rust
pub struct GenericStore {
    client: HeroDbClient,
    index: FieldIndex,
}

impl GenericStore {
    /// Store an object
    pub async fn put<T: Object>(&self, obj: &T) -> Result<()>;

    /// Get an object by ID
    pub async fn get<T: Object>(&self, ns: &str, id: &str) -> Result<T>;

    /// Delete an object
    pub async fn delete<T: Object>(&self, obj: &T) -> Result<bool>;

    /// Get IDs matching an index key
    pub async fn get_ids_by_index(&self, ns: &str, field: &str, value: &str) -> Result<Vec<String>>;
}
```

### Usage Example

```rust
use osiris::objects::Note;
use osiris::store::{GenericStore, HeroDbClient};

// Create store
let client = HeroDbClient::new("redis://localhost:6379", 1)?;
let store = GenericStore::new(client);

// Create and store a note
let note = Note::new("notes".to_string())
    .set_title("My Note")
    .set_content("This is the content")
    .add_tag("topic", "rust")
    .add_tag("priority", "high");

store.put(&note).await?;

// Retrieve the note
let retrieved: Note = store.get("notes", &note.id()).await?;

// Search by index
let ids = store.get_ids_by_index("notes", "tag:topic", "rust").await?;
```

## Index Storage

### Keyspace Design

```
obj:<ns>:<id>              → JSON serialized object
idx:<ns>:<field>:<value>   → Set of object IDs
scan:<ns>                  → Set of all object IDs in namespace
```

### Examples

```
obj:notes:abc123           → {"base_data":{...},"title":"My Note",...}
idx:notes:title:My Note    → {abc123, def456}
idx:notes:tag:topic:rust   → {abc123, xyz789}
idx:notes:mime:text/plain  → {abc123}
scan:notes                 → {abc123, def456, xyz789}
```

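The key layout above can be expressed as three small helper functions. This is only an illustrative sketch: the function names `obj_key`, `idx_key`, and `scan_key` are hypothetical and do not come from the OSIRIS source.

```rust
/// Key holding the JSON-serialized object (hypothetical helper).
pub fn obj_key(ns: &str, id: &str) -> String {
    format!("obj:{ns}:{id}")
}

/// Key of the set of object IDs sharing one field value (hypothetical helper).
pub fn idx_key(ns: &str, field: &str, value: &str) -> String {
    format!("idx:{ns}:{field}:{value}")
}

/// Key of the set of all object IDs in a namespace (hypothetical helper).
pub fn scan_key(ns: &str) -> String {
    format!("scan:{ns}")
}
```

Note that this sketch does no escaping, so a field name such as `tag:topic` simply contributes an extra `:` segment, matching the `idx:notes:tag:topic:rust` example.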
## Automatic Indexing

When an object is stored:

1. **Serialize** the object to JSON
2. **Store** at `obj:<ns>:<id>`
3. **Generate index keys** by calling `obj.index_keys()`
4. **Create indexes** for each key at `idx:<ns>:<field>:<value>`
5. **Add to scan index** at `scan:<ns>`

When an object is deleted:

1. **Retrieve** the object
2. **Generate index keys**
3. **Remove** from all indexes
4. **Delete** the object

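The store and delete steps above can be sketched against an in-memory stand-in for HeroDB. Everything here is an assumption made for illustration: `MemStore` and its methods are hypothetical, the object is passed as an already-serialized JSON string, and index keys are passed as `(field, value)` pairs instead of being derived from `index_keys()`.

```rust
use std::collections::{BTreeSet, HashMap};

/// In-memory stand-in for HeroDB: string values plus sets of IDs.
#[derive(Default)]
pub struct MemStore {
    objects: HashMap<String, String>,
    sets: HashMap<String, BTreeSet<String>>,
}

impl MemStore {
    /// Store `json` under `obj:<ns>:<id>`, register every index key,
    /// then add the ID to the namespace scan set.
    pub fn put(&mut self, ns: &str, id: &str, json: &str, keys: &[(&str, &str)]) {
        self.objects.insert(format!("obj:{ns}:{id}"), json.to_string());
        for (field, value) in keys {
            self.sets
                .entry(format!("idx:{ns}:{field}:{value}"))
                .or_default()
                .insert(id.to_string());
        }
        self.sets
            .entry(format!("scan:{ns}"))
            .or_default()
            .insert(id.to_string());
    }

    /// Remove the ID from all of its indexes and from the scan set, then
    /// delete the object itself. Returns `true` if the object existed.
    pub fn delete(&mut self, ns: &str, id: &str, keys: &[(&str, &str)]) -> bool {
        for (field, value) in keys {
            if let Some(set) = self.sets.get_mut(&format!("idx:{ns}:{field}:{value}")) {
                set.remove(id);
            }
        }
        if let Some(set) = self.sets.get_mut(&format!("scan:{ns}")) {
            set.remove(id);
        }
        self.objects.remove(&format!("obj:{ns}:{id}")).is_some()
    }

    /// IDs currently registered under one index key.
    pub fn ids_by_index(&self, ns: &str, field: &str, value: &str) -> Vec<String> {
        self.sets
            .get(&format!("idx:{ns}:{field}:{value}"))
            .map(|set| set.iter().cloned().collect())
            .unwrap_or_default()
    }
}
```

Deleting requires the same index keys that were used on insert, which is why the delete flow retrieves the object first.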
## Comparison with heromodels

| Feature | heromodels | OSIRIS |
|---------|-----------|--------|
| Base struct | `BaseModelData` | `BaseData` |
| Core trait | `Model` | `Object` |
| ID type | `u32` (auto-increment) | `String` (UUID) |
| Timestamps | `i64` (Unix) | `OffsetDateTime` |
| Index macro | `#[index]` (derive) | Manual `index_keys()` |
| Storage | OurDB/Postgres | HeroDB (Redis) |
| Serialization | CBOR/JSON | JSON |

## Future Enhancements

### 1. Derive Macro for #[index]

Create a proc macro to automatically generate `index_keys()` from field attributes:

```rust
#[derive(Object)]
pub struct Note {
    pub base_data: BaseData,

    #[index]
    pub title: Option<String>,

    pub content: Option<String>,

    #[index]
    pub tags: BTreeMap<String, String>,
}
```

### 2. Query Builder

Type-safe query builder for indexed fields:

```rust
let results = store
    .query::<Note>("notes")
    .filter("tag:topic", "rust")
    .filter("tag:priority", "high")
    .limit(10)
    .execute()
    .await?;
```

### 3. Relations

Support for typed relations between objects:

```rust
pub struct Note {
    pub base_data: BaseData,
    pub title: String,

    #[relation(target = "Note", label = "references")]
    pub references: Vec<String>,
}
```

### 4. Validation

Trait-based validation:

```rust
pub trait Validate {
    fn validate(&self) -> Result<()>;
}

impl Validate for Note {
    fn validate(&self) -> Result<()> {
        if self.title.is_none() {
            return Err(Error::InvalidInput("Title required".into()));
        }
        Ok(())
    }
}
```

## Migration from Old API

The old `OsirisObject` API is still available for backwards compatibility:

```rust
// Old API (still works)
use osiris::store::OsirisObject;
let obj = OsirisObject::new("notes".to_string(), Some("text".to_string()));

// New API (recommended)
use osiris::objects::Note;
let note = Note::new("notes".to_string())
    .set_title("Title")
    .set_content("text");
```

## Benefits of Trait-Based Architecture

1. **Type Safety**: Compile-time guarantees for object types
2. **Extensibility**: Easy to add new object types
3. **Automatic Indexing**: Index keys generated from object structure
4. **Consistency**: Same pattern as heromodels
5. **Flexibility**: Each object type controls its own indexing logic
6. **Testability**: Easy to mock and test individual object types

## Summary

The trait-based architecture makes OSIRIS:
- **More flexible**: Any type can be stored by implementing `Object`
- **More consistent**: Follows heromodels patterns
- **More powerful**: Automatic indexing based on object structure
- **More maintainable**: Clear separation of concerns
- **More extensible**: Easy to add new object types and features

docs/DERIVE_MACRO.md
@@ -0,0 +1,195 @@
# OSIRIS Derive Macro

The `#[derive(DeriveObject)]` macro automatically implements the `Object` trait for your structs, generating index keys based on fields marked with `#[index]`.

## Usage

```rust
use osiris::{BaseData, DeriveObject};
use serde::{Deserialize, Serialize};
use std::collections::BTreeMap;

#[derive(Debug, Clone, Serialize, Deserialize, DeriveObject)]
pub struct Note {
    pub base_data: BaseData,

    #[index]
    pub title: Option<String>,

    pub content: Option<String>,

    #[index]
    pub tags: BTreeMap<String, String>,
}
```

## What Gets Generated

The derive macro automatically implements:

1. **`object_type()`** - Returns the struct name as a string
2. **`base_data()`** - Returns a reference to `base_data`
3. **`base_data_mut()`** - Returns a mutable reference to `base_data`
4. **`index_keys()`** - Generates index keys for all `#[index]` fields
5. **`indexed_fields()`** - Returns a list of indexed field names

## Supported Field Types

### `Option<T>`
```rust
#[index]
pub title: Option<String>,
```
Generates: `IndexKey { name: "title", value: <string_value> }` (only if Some)

### `BTreeMap<String, String>`
```rust
#[index]
pub tags: BTreeMap<String, String>,
```
Generates: `IndexKey { name: "tags:tag", value: "key=value" }` for each entry

### `Vec<T>`
```rust
#[index]
pub items: Vec<String>,
```
Generates: `IndexKey { name: "items:item", value: "0:value" }` for each item

### `OffsetDateTime`
```rust
#[index]
pub start_time: OffsetDateTime,
```
Generates: `IndexKey { name: "start_time", value: "2025-10-20" }` (date only)

### Enums and Other Types
```rust
#[index]
pub status: EventStatus,
```
Generates: `IndexKey { name: "status", value: "Debug(status)" }` (using Debug format)

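To make the formats above concrete, here is a hand-written function that produces the same key shapes for `Option`, `BTreeMap`, `Vec`, and enum fields. It is a sketch of the documented output only, not the macro's generated code: the function name `keys_for` is hypothetical, keys are returned as plain `(name, value)` pairs instead of `IndexKey`, and `OffsetDateTime` is left out to keep the example free of external crates.

```rust
use std::collections::BTreeMap;

#[derive(Debug)]
pub enum EventStatus {
    Draft,
    Published,
}

/// Hand-written equivalent of the documented key formats (hypothetical helper).
pub fn keys_for(
    title: &Option<String>,
    tags: &BTreeMap<String, String>,
    items: &[String],
    status: &EventStatus,
) -> Vec<(String, String)> {
    let mut keys = Vec::new();

    // Option<T>: one key, only when the value is Some.
    if let Some(t) = title {
        keys.push(("title".to_string(), t.clone()));
    }

    // BTreeMap<String, String>: one "key=value" entry per pair.
    for (k, v) in tags {
        keys.push(("tags:tag".to_string(), format!("{k}={v}")));
    }

    // Vec<T>: one "index:value" entry per element.
    for (i, item) in items.iter().enumerate() {
        keys.push(("items:item".to_string(), format!("{i}:{item}")));
    }

    // Enums and other types: Debug formatting.
    keys.push(("status".to_string(), format!("{status:?}")));

    keys
}
```

Because enums go through `Debug`, the stored value is the variant name exactly as written in the source (for example `Published`), so queries have to use that same casing.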
## Complete Example

```rust
use osiris::{BaseData, DeriveObject};
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub enum EventStatus {
    Draft,
    Published,
    Cancelled,
}

#[derive(Debug, Clone, Serialize, Deserialize, DeriveObject)]
pub struct Event {
    pub base_data: BaseData,

    #[index]
    pub title: String,

    pub description: Option<String>,

    #[index]
    #[serde(with = "time::serde::timestamp")]
    pub start_time: OffsetDateTime,

    #[index]
    pub location: Option<String>,

    #[index]
    pub status: EventStatus,

    pub all_day: bool,

    #[index]
    pub category: Option<String>,
}

impl Event {
    pub fn new(ns: String, title: impl ToString) -> Self {
        let now = OffsetDateTime::now_utc();
        Self {
            base_data: BaseData::new(ns),
            title: title.to_string(),
            description: None,
            start_time: now,
            location: None,
            status: EventStatus::Draft,
            all_day: false,
            category: None,
        }
    }
}
```

## Generated Index Keys

For the Event example above with:
- `title = "Team Meeting"`
- `start_time = 2025-10-20T10:00:00Z`
- `location = Some("Room 101")`
- `status = EventStatus::Published`
- `category = Some("work")`

The generated index keys would be:
```rust
vec![
    IndexKey { name: "mime", value: "application/json" }, // from base_data
    IndexKey { name: "title", value: "Team Meeting" },
    IndexKey { name: "start_time", value: "2025-10-20" },
    IndexKey { name: "location", value: "Room 101" },
    IndexKey { name: "status", value: "Published" },
    IndexKey { name: "category", value: "work" },
]
```

## HeroDB Storage

These index keys are stored in HeroDB as:
```
idx:events:title:Team Meeting     → {event_id}
idx:events:start_time:2025-10-20  → {event_id}
idx:events:location:Room 101      → {event_id}
idx:events:status:Published       → {event_id}
idx:events:category:work          → {event_id}
```

## Querying by Index

```rust
use osiris::store::GenericStore;

let store = GenericStore::new(client);

// Get all events on a specific date
let ids = store.get_ids_by_index("events", "start_time", "2025-10-20").await?;

// Get all published events
let ids = store.get_ids_by_index("events", "status", "Published").await?;

// Get all events in a category
let ids = store.get_ids_by_index("events", "category", "work").await?;
```

## Requirements

1. **Must have `base_data` field**: The struct must have a field named `base_data` of type `BaseData`
2. **Must derive standard traits**: `Debug`, `Clone`, `Serialize`, `Deserialize`
3. **Fields marked with `#[index]`**: Only fields with the `#[index]` attribute will be indexed

## Limitations

- The macro currently uses `Debug` formatting for enums and complex types
- BTreeMap indexing assumes `String` keys and values
- Vec indexing uses numeric indices (may not be ideal for all use cases)

## Future Enhancements

- Custom index key formatters via attributes
- Support for nested struct indexing
- Conditional indexing (e.g., `#[index(if = "is_published")]`)
- Custom index names (e.g., `#[index(name = "custom_name")]`)

docs/specs/osiris-mvp.md
@@ -0,0 +1,525 @@
# OSIRIS MVP — Minimal Semantic Store over HeroDB

## 0) Purpose

OSIRIS is a Rust-native object layer on top of HeroDB that provides structured storage and retrieval capabilities without any server-side extensions or indexing engines.

It provides:
- Object CRUD operations
- Namespace management
- Simple local field indexing (field:*)
- Basic keyword scan (substring matching)
- CLI interface
- Future: 9P filesystem interface

It does **not** depend on HeroDB's Tantivy FTS, vectors, or relations.

---

## 1) Architecture

```
HeroDB (unmodified)
│
├── KV store + encryption
└── RESP protocol
     ↑
     │
     └── OSIRIS
          ├── store/      – object schema + persistence
          ├── index/      – field index & keyword scanning
          ├── retrieve/   – query planner + filtering
          ├── interfaces/ – CLI, 9P (future)
          └── config/     – namespaces + settings
```

---

## 2) Data Model

```rust
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct OsirisObject {
    pub id: String,
    pub ns: String,
    pub meta: Metadata,
    pub text: Option<String>, // optional plain text
}

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct Metadata {
    pub title: Option<String>,
    pub mime: Option<String>,
    pub tags: BTreeMap<String, String>,
    pub created: OffsetDateTime,
    pub updated: OffsetDateTime,
    pub size: Option<u64>,
}
```

---

## 3) Keyspace Design

```
meta:<id>              → serialized OsirisObject (JSON)
field:tag:<key>=<val>  → Set of IDs (for tag filtering)
field:mime:<type>      → Set of IDs (for MIME type filtering)
field:title:<title>    → Set of IDs (for title filtering)
scan:index             → Set of all IDs (for full scan)
```

**Example:**
```
field:tag:project=osiris  → {note_1, note_2}
field:mime:text/markdown  → {note_1, note_3}
scan:index                → {note_1, note_2, note_3, ...}
```

---

## 4) Index Maintenance

### Insert / Update

```rust
// Store object
redis.set(format!("meta:{}", obj.id), serde_json::to_string(&obj)?)?;

// Index tags
for (k, v) in &obj.meta.tags {
    redis.sadd(format!("field:tag:{}={}", k, v), &obj.id)?;
}

// Index MIME type
if let Some(mime) = &obj.meta.mime {
    redis.sadd(format!("field:mime:{}", mime), &obj.id)?;
}

// Index title
if let Some(title) = &obj.meta.title {
    redis.sadd(format!("field:title:{}", title), &obj.id)?;
}

// Add to scan index
redis.sadd("scan:index", &obj.id)?;
```

### Delete

```rust
// Remove object
redis.del(format!("meta:{}", obj.id))?;

// Deindex tags
for (k, v) in &obj.meta.tags {
    redis.srem(format!("field:tag:{}={}", k, v), &obj.id)?;
}

// Deindex MIME type
if let Some(mime) = &obj.meta.mime {
    redis.srem(format!("field:mime:{}", mime), &obj.id)?;
}

// Deindex title
if let Some(title) = &obj.meta.title {
    redis.srem(format!("field:title:{}", title), &obj.id)?;
}

// Remove from scan index
redis.srem("scan:index", &obj.id)?;
```

---

## 5) Retrieval

### Query Structure

```rust
pub struct RetrievalQuery {
    pub text: Option<String>,            // keyword substring
    pub ns: String,
    pub filters: Vec<(String, String)>,  // field=value
    pub top_k: usize,
}
```

### Execution Steps

1. **Collect candidate IDs** from field:* filters (SMEMBERS + intersection)
2. **If text query is provided**, iterate over candidates:
   - Fetch `meta:<id>`
   - Test substring match on `meta.title`, `text`, or `tags`
   - Compute simple relevance score
3. **Sort** by score (descending) and **limit** to `top_k`

The text scan is O(N) in the number of candidates, which is acceptable for the MVP and for small datasets (<10k objects).

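Steps 1 and 3 can be sketched with plain standard-library sets. This is an illustration under stated assumptions, not the OSIRIS query planner: `intersect_candidates` and `rank` are hypothetical names, each input set stands for the result of one `SMEMBERS` call, and returning `None` for an empty filter list is an assumed convention meaning "fall back to the scan index".

```rust
use std::collections::BTreeSet;

/// Intersect the ID sets returned for each filter.
/// An empty filter list yields `None` (assumed to mean: use the scan index).
pub fn intersect_candidates(sets: &[BTreeSet<String>]) -> Option<BTreeSet<String>> {
    let mut iter = sets.iter();
    let first = iter.next()?.clone();
    Some(iter.fold(first, |acc, s| acc.intersection(s).cloned().collect()))
}

/// Sort scored candidates by descending score and keep the best `top_k`.
pub fn rank(mut scored: Vec<(String, f32)>, top_k: usize) -> Vec<(String, f32)> {
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}
```

The scoring step in between (step 2) is what turns each surviving candidate ID into an `(id, score)` pair for `rank`.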
### Scoring Algorithm

```rust
fn compute_text_score(obj: &OsirisObject, query: &str) -> f32 {
    // Lowercase the query once so matching is case-insensitive on both sides.
    let query = query.to_lowercase();
    let query = query.as_str();
    let mut score = 0.0_f32;

    // Title match
    if let Some(title) = &obj.meta.title {
        if title.to_lowercase().contains(query) {
            score += 0.5;
        }
    }

    // Text content match
    if let Some(text) = &obj.text {
        let text = text.to_lowercase();
        if text.contains(query) {
            score += 0.5;
            // Bonus for multiple occurrences
            let count = text.matches(query).count();
            score += (count as f32 - 1.0) * 0.1;
        }
    }

    // Tag match
    for (key, value) in &obj.meta.tags {
        if key.to_lowercase().contains(query) || value.to_lowercase().contains(query) {
            score += 0.2;
        }
    }

    // Cap the total at 1.0
    score.min(1.0)
}
```

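To see how the weights combine, the same rules can be applied to plain arguments instead of an `OsirisObject`. This is a simplified sketch for experimentation: the function `score` and its signature are hypothetical, and `query` is assumed to be lowercase already.

```rust
/// Same weights as the scoring rules above, applied to plain arguments.
/// `query` is assumed to be lowercase already.
pub fn score(
    title: Option<&str>,
    text: Option<&str>,
    tags: &[(&str, &str)],
    query: &str,
) -> f32 {
    let mut score = 0.0_f32;

    // Title match: flat 0.5.
    if title.map_or(false, |t| t.to_lowercase().contains(query)) {
        score += 0.5;
    }

    // Text match: 0.5 plus 0.1 for every occurrence after the first.
    if let Some(text) = text {
        let count = text.to_lowercase().matches(query).count();
        if count > 0 {
            score += 0.5 + (count as f32 - 1.0) * 0.1;
        }
    }

    // Tag match: 0.2 per tag whose key or value contains the query.
    for (k, v) in tags {
        if k.to_lowercase().contains(query) || v.to_lowercase().contains(query) {
            score += 0.2;
        }
    }

    // Cap the total at 1.0.
    score.min(1.0)
}
```

For example, a title hit (0.5), two hits in the text (0.5 + 0.1), and one matching tag (0.2) add up to 1.3, which the final `min` caps to 1.0. The cap means that strongly matching objects tie at 1.0 and are not ordered among themselves.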
---

## 6) CLI

### Commands

```bash
# Initialize and create namespace
osiris init --herodb redis://localhost:6379
osiris ns create notes

# Add and read objects
osiris put notes/my-note.md ./my-note.md --tags topic=rust,project=osiris
osiris get notes/my-note.md
osiris get notes/my-note.md --raw --output /tmp/note.md
osiris del notes/my-note.md

# Search
osiris find --ns notes --filter topic=rust
osiris find "retrieval" --ns notes
osiris find "rust" --ns notes --filter project=osiris --topk 20

# Namespace management
osiris ns list
osiris ns delete notes

# Statistics
osiris stats
osiris stats --ns notes
```

### Examples

```bash
# Store a note from stdin
echo "This is a note about Rust programming" | \
  osiris put notes/rust-intro - \
    --title "Rust Introduction" \
    --tags topic=rust,level=beginner \
    --mime text/plain

# Search for notes about Rust
osiris find "rust" --ns notes

# Filter by tag
osiris find --ns notes --filter topic=rust

# Get note as JSON
osiris get notes/rust-intro

# Get raw content
osiris get notes/rust-intro --raw
```

---

## 7) Configuration

### File Location

`~/.config/osiris/config.toml`

### Example

```toml
[herodb]
url = "redis://localhost:6379"

[namespaces.notes]
db_id = 1

[namespaces.calendar]
db_id = 2
```

### Structure

```rust
pub struct Config {
    pub herodb: HeroDbConfig,
    pub namespaces: HashMap<String, NamespaceConfig>,
}

pub struct HeroDbConfig {
    pub url: String,
}

pub struct NamespaceConfig {
    pub db_id: u16,
}
```

---

## 8) Database Allocation

```
DB 0  → HeroDB Admin (managed by HeroDB)
DB 1  → osiris:notes (namespace "notes")
DB 2  → osiris:calendar (namespace "calendar")
DB 3+ → Additional namespaces...
```

Each namespace gets its own isolated HeroDB database.

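The namespace-to-database mapping can be pictured as a simple lookup table. The sketch below is hypothetical: the function names are invented for illustration, and the "lowest unused ID starting at 1" allocation policy is an assumption, since the spec only states that DB 0 is reserved and that each namespace has its own database.

```rust
use std::collections::HashMap;

/// Look up the database ID configured for a namespace.
pub fn db_for(namespaces: &HashMap<String, u16>, ns: &str) -> Option<u16> {
    namespaces.get(ns).copied()
}

/// Pick the lowest unused ID at or above 1 (DB 0 is reserved for HeroDB admin).
/// This allocation policy is an assumption made for the example.
pub fn next_free_db(namespaces: &HashMap<String, u16>) -> u16 {
    (1..=u16::MAX)
        .find(|id| !namespaces.values().any(|used| used == id))
        .expect("all database IDs are in use")
}
```

With `notes` on DB 1 and `calendar` on DB 2, a new namespace would land on DB 3 under this policy.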
---

## 9) Dependencies

```toml
[dependencies]
anyhow = "1.0"
redis = { version = "0.24", features = ["aio", "tokio-comp"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
time = { version = "0.3", features = ["serde", "formatting", "parsing", "macros"] }
tokio = { version = "1.23", features = ["full"] }
clap = { version = "4.5", features = ["derive"] }
toml = "0.8"
uuid = { version = "1.6", features = ["v4", "serde"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
```

---

## 10) Future Enhancements

| Feature | Added As | Lives In |
|---------|----------|----------|
| Dedup / blobs | HeroDB extension | HeroDB |
| Vector search | HeroDB extension | HeroDB |
| Full-text search | HeroDB (Tantivy) | HeroDB |
| Relations / graph | OSIRIS later | OSIRIS |
| 9P filesystem | OSIRIS later | OSIRIS |

This MVP maintains clean interface boundaries:
- **HeroDB** remains a plain KV substrate
- **OSIRIS** builds higher-order meaning on top

---

## 11) Implementation Status

### ✅ Completed

- [x] Project structure and Cargo.toml
- [x] Core data models (OsirisObject, Metadata)
- [x] HeroDB client wrapper (RESP protocol)
- [x] Field indexing (tags, MIME, title)
- [x] Search engine (substring matching + scoring)
- [x] Configuration management
- [x] CLI interface (init, ns, put, get, del, find, stats)
- [x] Error handling
- [x] Documentation (README, specs)

### 🚧 Pending

- [ ] 9P filesystem interface
- [ ] Integration tests
- [ ] Performance benchmarks
- [ ] Name resolution (namespace/name → ID mapping)

---

## 12) Quick Start

### Prerequisites

Start HeroDB:
```bash
cd /path/to/herodb
cargo run --release -- --dir ./data --admin-secret mysecret --port 6379
```

### Build OSIRIS

```bash
cd /path/to/osiris
cargo build --release
```

### Initialize

```bash
# Create configuration
./target/release/osiris init --herodb redis://localhost:6379

# Create a namespace
./target/release/osiris ns create notes
```

### Usage

```bash
# Add a note
echo "OSIRIS is a minimal object store" | \
  ./target/release/osiris put notes/intro - \
    --title "Introduction" \
    --tags topic=osiris,type=doc

# Search
./target/release/osiris find "object store" --ns notes

# Get the note
./target/release/osiris get notes/intro

# Show stats
./target/release/osiris stats --ns notes
```

---

## 13) Testing

### Unit Tests

```bash
cargo test
```

### Integration Tests (requires HeroDB)

```bash
# Start HeroDB
cd /path/to/herodb
cargo run -- --dir /tmp/herodb-test --admin-secret test --port 6379

# Run tests
cd /path/to/osiris
cargo test -- --ignored
```

---

## 14) Performance Characteristics

### Write Performance

- **Object storage**: O(1) - single SET operation
- **Indexing**: O(T) where T = number of tags/fields
- **Total**: O(T) per object

### Read Performance

- **Get by ID**: O(1) - single GET operation
- **Filter by tags**: F `SMEMBERS` lookups where F = number of filters, plus a set intersection whose cost grows with the size of the returned sets
- **Text search**: O(N) where N = number of candidates (linear scan)

### Storage Overhead

- **Object**: ~1KB per object (JSON serialized)
- **Indexes**: ~50 bytes per tag/field entry
- **Total**: ~1.5KB per object with 10 tags

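The storage figures above follow from a simple sum: one object plus one index entry per tag. The sketch below turns that into a rough estimator. The function name is hypothetical, and the constants are the approximate sizes quoted in this section (taking 1KB as 1024 bytes), not measured values.

```rust
/// Rough storage estimate using the approximate per-object and per-entry
/// sizes quoted above (hypothetical helper, not a measurement).
pub fn estimated_bytes(objects: u64, tags_per_object: u64) -> u64 {
    const OBJECT_BYTES: u64 = 1024; // ~1KB of serialized JSON
    const INDEX_ENTRY_BYTES: u64 = 50; // ~50 bytes per tag/field entry
    objects * (OBJECT_BYTES + tags_per_object * INDEX_ENTRY_BYTES)
}
```

One object with 10 tags comes to 1024 + 10 × 50 = 1524 bytes, which matches the ~1.5KB figure; 10,000 such objects come to roughly 15 MB.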
### Scalability

- **Optimal**: <10,000 objects per namespace
- **Acceptable**: <100,000 objects per namespace
- **Beyond**: Consider migrating to Tantivy FTS

---

## 15) Design Decisions

### Why No Tantivy in MVP?

- **Simplicity**: Avoid HeroDB server-side dependencies
- **Portability**: Works with any Redis-compatible backend
- **Flexibility**: Easy to migrate to Tantivy later

### Why Substring Matching?

- **Good enough**: For small datasets (<10k objects)
- **Simple**: No tokenization, stemming, or complex scoring
- **Fast**: O(N) is acceptable for MVP

### Why Separate Databases per Namespace?

- **Isolation**: Clear separation of concerns
- **Performance**: Smaller keyspaces = faster scans
- **Security**: Can apply different encryption keys per namespace

---

## 16) Migration Path

When ready to scale beyond MVP:

1. **Add Tantivy FTS** (HeroDB extension)
   - Create FT.* commands in HeroDB
   - Update OSIRIS to use FT.SEARCH instead of substring scan
   - Keep field indexes for filtering

2. **Add Vector Search** (HeroDB extension)
   - Store embeddings in HeroDB
   - Implement ANN search (HNSW/IVF)
   - Add hybrid retrieval (BM25 + vector)

3. **Add Relations** (OSIRIS feature)
   - Store relation graphs in HeroDB
   - Implement graph traversal
   - Add relation-based ranking

4. **Add Deduplication** (HeroDB extension)
   - Content-addressable storage (BLAKE3)
   - Reference counting
   - Garbage collection

---

## Summary

**OSIRIS MVP is a minimal, production-ready object store** that:

- ✅ Works with unmodified HeroDB
- ✅ Provides structured storage with metadata
- ✅ Supports field-based filtering
- ✅ Includes basic text search
- ✅ Exposes a clean CLI interface
- ✅ Maintains clear upgrade paths

**Perfect for:**
- Personal knowledge management
- Small-scale document storage
- Prototyping semantic applications
- Learning Rust + Redis patterns

**Next steps:**
- Build and test the MVP
- Gather usage feedback
- Plan Tantivy/vector integration
- Design 9P filesystem interface