2025-09-14 07:19:52 +02:00
parent eef9f39b58
commit 28839cf646
6 changed files with 1844 additions and 31 deletions


115
lib/hero/herofs/README.md Normal file

@@ -0,0 +1,115 @@
# HeroFS - Distributed Filesystem for HeroLib
HeroFS is a distributed filesystem implementation built on top of HeroDB (Redis-based storage). It provides a virtual filesystem with support for files, directories, symbolic links, and binary data blobs.
## Overview
HeroFS implements a filesystem structure where:
- **Fs**: Represents a filesystem as a top-level container
- **FsDir**: Represents directories within a filesystem
- **FsFile**: Represents files with support for multiple directory associations
- **FsSymlink**: Represents symbolic links pointing to files or directories
- **FsBlob**: Represents binary data chunks (up to 1MB) used as file content
## Features
- Distributed storage using Redis
- Support for files, directories, and symbolic links
- Blob-based file content storage with integrity verification
- Multiple directory associations for files (similar to hard links)
- Filesystem quotas and usage tracking
- Metadata support for files
- Efficient lookup mechanisms using Redis hash sets
## Installation
HeroFS is part of HeroLib and is automatically available when using HeroLib.
## Usage
To use HeroFS, you need to create a filesystem factory:
```v
import freeflowuniverse.herolib.hero.herofs
mut fs_factory := herofs.new()!
```
### Creating a Filesystem
```v
fs_id := fs_factory.fs.set(fs_factory.fs.new(
    name: 'my_filesystem'
    quota_bytes: 1000000000 // 1GB quota
)!)!
```
### Working with Directories
```v
// Create root directory
root_dir_id := fs_factory.fs_dir.set(fs_factory.fs_dir.new(
    name: 'root'
    fs_id: fs_id
    parent_id: 0
)!)!
// Create subdirectory
sub_dir_id := fs_factory.fs_dir.set(fs_factory.fs_dir.new(
    name: 'documents'
    fs_id: fs_id
    parent_id: root_dir_id
)!)!
```
### Working with Blobs
```v
// Create a blob with binary data (the sample content is only an illustration)
content_bytes := 'Hello, HeroFS!'.bytes()
blob_id := fs_factory.fs_blob.set(fs_factory.fs_blob.new(
    data: content_bytes
    mime_type: 'text/plain'
)!)!
```
### Working with Files
```v
// Create a file
file_id := fs_factory.fs_file.set(fs_factory.fs_file.new(
    name: 'example.txt'
    fs_id: fs_id
    directories: [root_dir_id]
    blobs: [blob_id]
)!)!
```
### Working with Symbolic Links
```v
// Create a symbolic link to a file
symlink_id := fs_factory.fs_symlink.set(fs_factory.fs_symlink.new(
    name: 'example_link.txt'
    fs_id: fs_id
    parent_id: root_dir_id
    target_id: file_id
    target_type: .file
)!)!
```
## API Reference
The HeroFS module provides the following main components:
- `FsFactory` - Main factory for accessing all filesystem components
- `DBFs` - Filesystem operations
- `DBFsDir` - Directory operations
- `DBFsFile` - File operations
- `DBFsSymlink` - Symbolic link operations
- `DBFsBlob` - Binary data blob operations
Each component provides CRUD operations and specialized methods for filesystem management.
## Examples
Check the `examples/hero/herofs/` directory for detailed usage examples.

289
lib/hero/herofs/specs.md Normal file

@@ -0,0 +1,289 @@
# HeroFS Specifications
This document provides detailed specifications for the HeroFS distributed filesystem implementation.
## Architecture Overview
HeroFS is built on top of HeroDB, which uses Redis as its storage backend. The filesystem is implemented as a collection of interconnected data structures that represent the various components of a filesystem:
1. **Fs** - Filesystem container
2. **FsDir** - Directories
3. **FsFile** - Files
4. **FsSymlink** - Symbolic links
5. **FsBlob** - Binary data chunks
All components embed the `db.Base` struct (V uses struct embedding rather than inheritance), which provides common fields like ID, name, description, timestamps, security policies, tags, and comments.
## Filesystem (Fs)
The `Fs` struct represents a filesystem as a top-level container:
```v
@[heap]
pub struct Fs {
    db.Base
pub mut:
    name        string
    group_id    u32 // Associated group for permissions
    root_dir_id u32 // ID of root directory
    quota_bytes u64 // Storage quota in bytes
    used_bytes  u64 // Current usage in bytes
}
```
### Key Features
- **Name-based identification**: Filesystems can be retrieved by name using efficient Redis hash sets
- **Quota management**: Each filesystem has a storage quota and tracks current usage
- **Root directory**: Each filesystem has a root directory ID that serves as the entry point
- **Group association**: Filesystems can be associated with groups for permission management
### Methods
- `new()`: Create a new filesystem instance
- `set()`: Save filesystem to database
- `get()`: Retrieve filesystem by ID
- `get_by_name()`: Retrieve filesystem by name
- `delete()`: Remove filesystem from database
- `exist()`: Check if filesystem exists
- `list()`: List all filesystems
- `increase_usage()`: Increase used bytes counter
- `decrease_usage()`: Decrease used bytes counter
- `check_quota()`: Verify if additional bytes would exceed quota
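
The quota-related methods can be pictured with a small standalone sketch. It is not the HeroFS implementation: `QuotaFs` and `would_exceed` are hypothetical names, and only the `quota_bytes` and `used_bytes` fields are taken from the struct above.

```v
// Hypothetical stand-in for the quota fields of `Fs`; not the herofs implementation.
struct QuotaFs {
mut:
	quota_bytes u64
	used_bytes  u64
}

// would_exceed reports whether adding `additional` bytes would go past the quota.
// Comparing against the remaining space avoids u64 overflow in `used_bytes + additional`.
fn (f QuotaFs) would_exceed(additional u64) bool {
	if f.used_bytes >= f.quota_bytes {
		return additional > 0
	}
	return additional > f.quota_bytes - f.used_bytes
}

// increase_usage adds bytes only when the quota allows it.
fn (mut f QuotaFs) increase_usage(additional u64) ! {
	if f.would_exceed(additional) {
		return error('quota exceeded')
	}
	f.used_bytes += additional
}

// decrease_usage subtracts bytes and clamps at zero instead of wrapping around.
fn (mut f QuotaFs) decrease_usage(freed u64) {
	f.used_bytes = if freed > f.used_bytes { u64(0) } else { f.used_bytes - freed }
}

fn main() {
	mut f := QuotaFs{
		quota_bytes: 1000
	}
	f.increase_usage(600) or { panic(err) }
	println(f.would_exceed(500)) // 500 bytes no longer fit next to the 600 in use
	f.decrease_usage(200)
	println(f.would_exceed(500)) // after freeing 200 bytes they fit again
}
```

Checking against the remaining space, rather than adding first and comparing afterwards, keeps the check correct even for quotas close to the `u64` maximum.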
## Directory (FsDir)
The `FsDir` struct represents a directory in a filesystem:
```v
@[heap]
pub struct FsDir {
    db.Base
pub mut:
    name      string
    fs_id     u32 // Associated filesystem
    parent_id u32 // Parent directory ID (0 for root)
}
```
### Key Features
- **Hierarchical structure**: Directories form a tree structure with parent-child relationships
- **Path-based identification**: Efficient lookup by filesystem ID, parent ID, and name
- **Children management**: Directories automatically track their children through Redis hash sets
- **Cross-filesystem isolation**: Directories are bound to a specific filesystem
### Methods
- `new()`: Create a new directory instance
- `set()`: Save directory to database and update indices
- `get()`: Retrieve directory by ID
- `delete()`: Remove directory (fails if it has children)
- `exist()`: Check if directory exists
- `list()`: List all directories
- `get_by_path()`: Retrieve directory by path components
- `list_by_filesystem()`: List directories in a filesystem
- `list_children()`: List child directories
- `has_children()`: Check if directory has children
- `rename()`: Rename directory
- `move()`: Move directory to a new parent
## File (FsFile)
The `FsFile` struct represents a file in a filesystem:
```v
@[heap]
pub struct FsFile {
    db.Base
pub mut:
    name        string
    fs_id       u32   // Associated filesystem
    directories []u32 // Directory IDs where this file exists
    blobs       []u32 // IDs of file content blobs
    size_bytes  u64
    mime_type   string // e.g., "image/png"
    checksum    string // e.g., SHA256 checksum of the file
    accessed_at i64
    metadata    map[string]string // Custom metadata
}
```
### Key Features
- **Multiple directory associations**: Files can exist in multiple directories (similar to hard links in Linux)
- **Blob-based content**: File content is stored as references to FsBlob objects
- **Size tracking**: Files track their total size in bytes
- **MIME type support**: Files store their MIME type for content identification
- **Checksum verification**: Files can store checksums for integrity verification
- **Access timestamp**: Tracks when the file was last accessed
- **Custom metadata**: Files support custom key-value metadata
### Methods
- `new()`: Create a new file instance
- `set()`: Save file to database and update indices
- `get()`: Retrieve file by ID
- `delete()`: Remove file and update all indices
- `exist()`: Check if file exists
- `list()`: List all files
- `get_by_path()`: Retrieve file by directory and name
- `list_by_directory()`: List files in a directory
- `list_by_filesystem()`: List files in a filesystem
- `list_by_mime_type()`: List files by MIME type
- `append_blob()`: Add a new blob to the file
- `update_accessed()`: Update accessed timestamp
- `update_metadata()`: Update file metadata
- `rename()`: Rename file (affects all directories)
- `move()`: Move file to different directories
## Symbolic Link (FsSymlink)
The `FsSymlink` struct represents a symbolic link in a filesystem:
```v
@[heap]
pub struct FsSymlink {
    db.Base
pub mut:
    name        string
    fs_id       u32 // Associated filesystem
    parent_id   u32 // Parent directory ID
    target_id   u32 // ID of target file or directory
    target_type SymlinkTargetType
}
pub enum SymlinkTargetType {
    file
    directory
}
```
### Key Features
- **Target type specification**: Symlinks can point to either files or directories
- **Cross-filesystem protection**: Symlinks cannot point to targets in different filesystems
- **Referrer tracking**: Targets know which symlinks point to them
- **Broken link detection**: Symlinks can be checked for validity
### Methods
- `new()`: Create a new symbolic link instance
- `set()`: Save symlink to database and update indices
- `get()`: Retrieve symlink by ID
- `delete()`: Remove symlink and update all indices
- `exist()`: Check if symlink exists
- `list()`: List all symlinks
- `get_by_path()`: Retrieve symlink by parent directory and name
- `list_by_parent()`: List symlinks in a parent directory
- `list_by_filesystem()`: List symlinks in a filesystem
- `list_by_target()`: List symlinks pointing to a target
- `rename()`: Rename symlink
- `move()`: Move symlink to a new parent directory
- `redirect()`: Change symlink target
- `resolve()`: Get the target ID of a symlink
- `is_broken()`: Check whether the symlink's target no longer exists
## Binary Data Blob (FsBlob)
The `FsBlob` struct represents binary data chunks:
```v
@[heap]
pub struct FsBlob {
    db.Base
pub mut:
    hash       string // BLAKE3 hash of content, truncated to 192 bits ("blake192")
    data       []u8   // Binary data (max 1MB)
    size_bytes int    // Size in bytes
    created_at i64
    mime_type  string // MIME type
    encoding   string // Encoding type
}
```
### Key Features
- **Content-based addressing**: Blobs are identified by their BLAKE3 hash (first 192 bits)
- **Size limit**: Blobs are limited to 1MB to ensure efficient storage and retrieval
- **Integrity verification**: Built-in hash verification for data integrity
- **MIME type and encoding**: Blobs store their content type information
- **Deduplication**: Identical content blobs are automatically deduplicated
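
Because a blob is capped at 1MB, larger file content has to be spread over several blobs, which is what the `blobs []u32` list on `FsFile` allows. The sketch below shows one way the splitting could be done. The helper name `split_into_chunks` is hypothetical, and treating 1MB as 1024 * 1024 bytes is an assumption; the source does not say which definition HeroFS uses.

```v
// Assumption: 1MB is taken as 1024 * 1024 bytes.
const max_blob_size = 1024 * 1024

// split_into_chunks cuts `data` into consecutive slices of at most `max_size` bytes.
// Hypothetical helper, not part of the herofs API.
fn split_into_chunks(data []u8, max_size int) [][]u8 {
	if max_size <= 0 {
		panic('max_size must be positive')
	}
	mut chunks := [][]u8{}
	mut start := 0
	for start < data.len {
		stop := if start + max_size < data.len { start + max_size } else { data.len }
		chunks << data[start..stop].clone()
		start = stop
	}
	return chunks
}

fn main() {
	// Two full chunks plus a 10-byte remainder.
	data := []u8{len: 2 * max_blob_size + 10}
	chunks := split_into_chunks(data, max_blob_size)
	for i, c in chunks {
		println('chunk ${i}: ${c.len} bytes')
	}
}
```

Each resulting chunk would then be stored as its own blob, and the returned IDs collected in order into the file's `blobs` list.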
### Methods
- `new()`: Create a new blob instance
- `set()`: Save blob to database (returns existing ID if content already exists)
- `get()`: Retrieve blob by ID
- `delete()`: Remove blob from database
- `exist()`: Check if blob exists
- `list()`: List all blobs
- `get_by_hash()`: Retrieve blob by content hash
- `exists_by_hash()`: Check if blob exists by content hash
- `verify_integrity()`: Verify blob data integrity against stored hash
- `calculate_hash()`: Calculate BLAKE3 hash of blob data
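
The interplay of content hashing, deduplication in `set()`, and integrity verification can be sketched with an in-memory stand-in. Two things here are assumptions rather than HeroFS behaviour: SHA-256 from V's standard library is used in place of BLAKE3, and `BlobStore` with its fields is a hypothetical substitute for the Redis-backed blob table and the `fsblob:hashes` index.

```v
import crypto.sha256

// content_hash returns the first 24 bytes (192 bits) of the digest as a hex string.
// SHA-256 is only a stand-in here; HeroFS documents BLAKE3.
fn content_hash(data []u8) string {
	digest := sha256.sum(data)
	return digest[..24].hex()
}

// Hypothetical in-memory stand-in for the blob table and its hash index.
struct BlobStore {
mut:
	next_id u32 = 1
	by_hash map[string]u32 // content hash -> blob ID
	data    map[u32][]u8   // blob ID -> content
}

// set stores `data` and returns its ID; identical content returns the existing ID.
fn (mut s BlobStore) set(data []u8) u32 {
	h := content_hash(data)
	if h in s.by_hash {
		return s.by_hash[h]
	}
	id := s.next_id
	s.next_id++
	s.by_hash[h] = id
	s.data[id] = data.clone()
	return id
}

// verify_integrity recomputes the hash of the stored bytes and compares it
// with the expected hash; unknown IDs count as failed verification.
fn (s BlobStore) verify_integrity(id u32, expected_hash string) bool {
	stored := s.data[id] or { return false }
	return content_hash(stored) == expected_hash
}

fn main() {
	mut store := BlobStore{}
	a := store.set('hello'.bytes())
	b := store.set('hello'.bytes())
	c := store.set('world'.bytes())
	println('same content, same id: ${a == b}')
	println('different content, same id: ${a == c}')
	println('integrity ok: ${store.verify_integrity(a, content_hash('hello'.bytes()))}')
}
```

Truncating the digest to 24 bytes gives a 48-character hex key, which mirrors the "first 192 bits" rule described for blob hashes.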
## Storage Mechanisms
HeroFS uses Redis hash sets extensively for efficient indexing and lookup:
### Filesystem Indices
- `fs:names` - Maps filesystem names to IDs
- `fsdir:paths` - Maps directory path components to IDs
- `fsdir:fs:${fs_id}` - Lists directories in a filesystem
- `fsdir:children:${dir_id}` - Lists children of a directory
- `fsfile:paths` - Maps file paths (directory:name) to IDs
- `fsfile:dir:${dir_id}` - Lists files in a directory
- `fsfile:fs:${fs_id}` - Lists files in a filesystem
- `fsfile:mime:${mime_type}` - Lists files by MIME type
- `fssymlink:paths` - Maps symlink paths (parent:name) to IDs
- `fssymlink:parent:${parent_id}` - Lists symlinks in a parent directory
- `fssymlink:fs:${fs_id}` - Lists symlinks in a filesystem
- `fssymlink:target:${target_type}:${target_id}` - Lists symlinks pointing to a target
- `fsblob:hashes` - Maps content hashes to blob IDs
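
As a rough illustration of how these index names are composed, the helpers below build a few of the keys with V string interpolation. The helper names are hypothetical, and the exact field format inside `fsfile:paths` (here `dir_id:name`) is an assumption based on the "(directory:name)" hint in the list above. A plain V map stands in for the Redis hash.

```v
// Key patterns copied from the index list; helper names are hypothetical.
fn dir_children_key(dir_id u32) string {
	return 'fsdir:children:${dir_id}'
}

fn symlink_target_key(target_type string, target_id u32) string {
	return 'fssymlink:target:${target_type}:${target_id}'
}

// Assumed field format inside the `fsfile:paths` hash: "<dir_id>:<name>".
fn file_path_field(dir_id u32, name string) string {
	return '${dir_id}:${name}'
}

fn main() {
	// In-memory stand-in for the Redis hash stored at the key `fsfile:paths`.
	mut fsfile_paths := map[string]u32{}
	fsfile_paths[file_path_field(3, 'example.txt')] = 42

	// Looking up a file by directory and name is a single field lookup.
	id := fsfile_paths[file_path_field(3, 'example.txt')] or { u32(0) }
	println('file id: ${id}')
	println(dir_children_key(3))
	println(symlink_target_key('file', id))
}
```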
### Data Serialization
All HeroFS components use the HeroLib encoder for serialization:
- Version tag (u8) is stored first
- All fields are serialized in a consistent order
- Deserialization follows the exact same order
- Type safety is maintained through V's type system
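
The "version tag first, then fields in a fixed order" rule can be shown with a tiny hand-rolled codec. This is only a sketch of the pattern: it does not use the HeroLib encoder and does not reproduce its wire format. `DirRecord`, `put_u32`, `get_u32`, `encode`, and `decode` are hypothetical names, and the little-endian, length-prefixed layout is an assumption made for the example.

```v
// Hypothetical record with a subset of the FsDir fields.
struct DirRecord {
	name      string
	fs_id     u32
	parent_id u32
}

const record_version = u8(1)

// put_u32 appends `v` in little-endian byte order.
fn put_u32(mut buf []u8, v u32) {
	buf << u8(v & 0xff)
	buf << u8((v >> 8) & 0xff)
	buf << u8((v >> 16) & 0xff)
	buf << u8((v >> 24) & 0xff)
}

// get_u32 reads a little-endian u32 starting at `pos`; the caller checks bounds.
fn get_u32(buf []u8, pos int) u32 {
	return u32(buf[pos]) | (u32(buf[pos + 1]) << 8) | (u32(buf[pos + 2]) << 16) | (u32(buf[pos + 3]) << 24)
}

// encode writes the version tag first, then the fields in a fixed order.
fn encode(r DirRecord) []u8 {
	mut buf := []u8{}
	buf << record_version
	put_u32(mut buf, u32(r.name.len))
	buf << r.name.bytes()
	put_u32(mut buf, r.fs_id)
	put_u32(mut buf, r.parent_id)
	return buf
}

// decode reads the fields back in exactly the order encode wrote them.
fn decode(buf []u8) !DirRecord {
	if buf.len < 5 || buf[0] != record_version {
		return error('unsupported version or truncated record')
	}
	name_len := int(get_u32(buf, 1))
	// 13 = 1 version byte + 4 length bytes + two u32 fields.
	if name_len < 0 || name_len > buf.len - 13 {
		return error('truncated record')
	}
	mut pos := 5
	name := buf[pos..pos + name_len].bytestr()
	pos += name_len
	fs_id := get_u32(buf, pos)
	pos += 4
	parent_id := get_u32(buf, pos)
	return DirRecord{
		name:      name
		fs_id:     fs_id
		parent_id: parent_id
	}
}

fn main() {
	original := DirRecord{
		name:      'documents'
		fs_id:     1
		parent_id: 7
	}
	raw := encode(original)
	restored := decode(raw) or { panic(err) }
	println('round trip ok: ${restored == original}')
}
```

Writing the version byte first lets a reader reject, or later migrate, records produced by a different layout before it touches any field.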
## Special Features
### Hard Links
Files can be associated with multiple directories through the `directories` field, allowing for hard link-like behavior.
### Deduplication
Blobs are automatically deduplicated based on their content hash. When creating a new blob with identical content to an existing one, the existing ID is returned.
### Quota Management
Filesystems track their storage usage and can enforce quotas to prevent overconsumption.
### Metadata Support
Files support custom metadata as key-value pairs, allowing for flexible attribute storage.
### Cross-Component Validation
When creating or modifying components, HeroFS validates references to other components:
- Directory parent must exist
- File directories must exist
- File blobs must exist
- Symlink parent must exist
- Symlink target must exist and match target type
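
A minimal sketch of the symlink checks, under stated assumptions: `Catalog` is a hypothetical in-memory stand-in that maps each directory or file ID to the ID of its owning filesystem, and `validate_symlink` is not the HeroFS function. It combines the rules listed above with the cross-filesystem protection described for symlinks.

```v
enum TargetType {
	file
	directory
}

// Hypothetical in-memory stand-in for the stored components:
// each map goes from a component ID to the ID of the filesystem that owns it.
struct Catalog {
mut:
	dirs  map[u32]u32
	files map[u32]u32
}

// validate_symlink returns an error when a reference is missing, has the
// wrong type, or points into another filesystem.
fn (c Catalog) validate_symlink(fs_id u32, parent_id u32, target_id u32, target_type TargetType) ! {
	parent_fs := c.dirs[parent_id] or {
		return error('parent directory ${parent_id} does not exist')
	}
	if parent_fs != fs_id {
		return error('parent directory belongs to another filesystem')
	}
	mut target_fs := u32(0)
	if target_type == .file {
		target_fs = c.files[target_id] or {
			return error('target file ${target_id} does not exist')
		}
	} else {
		target_fs = c.dirs[target_id] or {
			return error('target directory ${target_id} does not exist')
		}
	}
	if target_fs != fs_id {
		return error('target belongs to another filesystem')
	}
}

fn main() {
	mut c := Catalog{}
	c.dirs[1] = 1 // directory 1 lives in filesystem 1
	c.files[10] = 1 // file 10 lives in filesystem 1
	c.files[20] = 2 // file 20 lives in filesystem 2

	c.validate_symlink(1, 1, 10, .file) or { println('rejected: ${err}') }
	c.validate_symlink(1, 1, 10, .directory) or { println('rejected: ${err}') }
	c.validate_symlink(1, 1, 20, .file) or { println('rejected: ${err}') }
}
```

Looking the target up in the map that matches `target_type` covers both "must exist" and "must match target type" in one step: a file ID declared as a directory is simply not found.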
## Security Model
HeroFS inherits the security model from HeroDB:
- Each component has a `securitypolicy` field referencing a SecurityPolicy object
- Components can have associated tags for categorization
- Components can have associated comments for documentation
## Performance Considerations
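- All indices are stored as Redis hash sets, so looking up a single entry (for example, a name to its ID) is O(1); listing every entry of an index is O(N) in the number of entries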
- Blob deduplication reduces storage requirements
- Multiple directory associations let one file appear in several directories without duplicating its blobs
- Content-based addressing enables easy integrity verification
- Factory pattern provides easy access to all filesystem components


@@ -1,31 +0,0 @@
distill vlang objects out of the calendar/contact/circle and create the missing parts
organize per root object which are @[heap] and in separate file with name.v
the rootobjects are
- user
- group (users are members, each with a role: admin, writer or reader; can be linked to subgroups)
- calendar (references to event, group)
- calendar_event (everything related to an event on calendar, link to one or more fs_file)
- project (grouping per project, defines swimlanes and milestones this allows us to visualize as kanban, link to group, link to one or more fs_file )
- project_issue (an issue has a specific type, e.g. task, story, bug, question, …; linked to project by id; also defines priority, swimlane, deadline, assignees, …; has tags; link to one or more fs_file)
- chat_group (link to group, name/description/tags)
- chat_message (link to chat_group, link to parent_chat_messages and what type of link e.g. reply or reference or? , status, … link to one or more fs_file)
- fs = filesystem (link to group)
- fs_dir = directory in filesystem, link to parent, link to group
- fs_file (link to one or more fs_dir, list of references to blobs as blake192)
- fs_symlink (can be link to dir or file)
- fs_blob (the data itself, max size 1 MB, binary data, id = blake192)
the groups define how people can interact with the parts e.g. calendar linked to group, so readers of that group can read and have copy of the info linked to that group
all the objects are identified by their blake192 (based on the content)
there is a special table which has link between blake192 and their previous & next version, so we can always walk the tree, both parts are indexed (this is independent of type of object)