doctree/doctree_ipfs_export_plan.md
2025-05-13 08:52:47 +03:00

# Implementation Plan: Exporting DocTree Collections to IPFS
**Objective:** Add functionality to the `doctree` library to export the files and images in each collection to IPFS. Each file is encrypted with ChaCha20Poly1305, using the Blake3 hash of its content as the key, and every export is recorded in a CSV manifest.
**Dependencies:**
We will need to add the following dependencies to the `[dependencies]` section of `doctree/Cargo.toml`:
* `ipfs-api = "0.17.0"`: For interacting with the IPFS daemon.
* `chacha20poly1305 = "0.10.1"`: For symmetric encryption.
* `blake3 = "1.3.1"`: For calculating Blake3 hashes.
* `csv = "1.1"`: For writing the CSV manifest file.
* `walkdir = "2.3.2"`: Already a dependency, but will be used for iterating through collection files.
* `tokio = { version = "1", features = ["full"] }`: `ipfs-api` requires an async runtime.
**Plan:**
1. **Modify `doctree/Cargo.toml`:** Add the new dependencies.
```toml
[dependencies]
# Existing dependencies...
ipfs-api = "0.17.0"
chacha20poly1305 = "0.10.1"
blake3 = "1.3.1"
csv = "1.1"
walkdir = "2.3.2" # Already present; keep the existing entry rather than adding a duplicate key
tokio = { version = "1", features = ["full"] }
```
2. **Implement `export_to_ipfs` method in `doctree/src/collection.rs`:**
* Add necessary imports: `std::path::PathBuf`, `std::fs`, `blake3`, `chacha20poly1305::ChaCha20Poly1305`, `chacha20poly1305::aead::Aead`, `chacha20poly1305::aead::KeyInit` (which replaced `NewAead` as of `chacha20poly1305` 0.10), `ipfs_api::IpfsApi`, `ipfs_api::IpfsClient`, `tokio`, `csv`.
* Define an `async` method `export_to_ipfs` on the `Collection` struct. This method will take the output CSV file path as an argument.
* Inside the method, create a `csv::Writer` to write the manifest.
* Use `walkdir::WalkDir` to traverse the collection's directory (`self.path`).
* Filter out directories and the `.collection` file.
* For each file:
    * Read the file content.
    * Calculate the Blake3 hash of the content.
    * Use the Blake3 hash as the key for `ChaCha20Poly1305`. The default Blake3 output is exactly 32 bytes, which matches the cipher's key size. Generate a random 12-byte nonce and store it with the ciphertext (for example, prepended to it), since decryption needs the nonce and the manifest has no column for it.
    * Encrypt the file content using `ChaCha20Poly1305`.
    * Connect to the local IPFS daemon using `ipfs-api`.
    * Add the encrypted content to IPFS.
    * Get the IPFS hash and the size of the original file.
    * Write a record to the CSV file with: `self.name`, filename (relative to collection path), Blake3 hash (hex encoded), IPFS hash, and original file size.
* Handle potential errors during file reading, hashing, encryption, IPFS interaction, and CSV writing.
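The bookkeeping around each file in step 2 can be sketched without any of the new dependencies. The block below is a minimal illustration using only the standard library: it decides which entries to export, computes the manifest filename relative to the collection root, and lays out the five manifest columns. The names `ManifestRecord`, `is_exportable`, and `relative_name` are hypothetical, and hashing, encryption, and the IPFS upload are deliberately left out.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// One manifest row, mirroring the five columns listed in the plan.
/// (Hypothetical type; the plan does not name one.)
#[derive(Debug, Clone, PartialEq)]
pub struct ManifestRecord {
    pub collection: String, // `self.name` of the collection
    pub filename: String,   // path relative to the collection root
    pub blake3_hex: String, // hex-encoded Blake3 hash of the plaintext
    pub ipfs_hash: String,  // identifier returned by the IPFS daemon
    pub size: u64,          // size of the original, unencrypted file
}

impl ManifestRecord {
    /// Returns the columns in manifest order, ready to be written as one CSV record.
    pub fn fields(&self) -> [String; 5] {
        [
            self.collection.clone(),
            self.filename.clone(),
            self.blake3_hex.clone(),
            self.ipfs_hash.clone(),
            self.size.to_string(),
        ]
    }
}

/// True for entries that should be exported: regular files other than `.collection`.
pub fn is_exportable(path: &Path, is_file: bool) -> bool {
    is_file && path.file_name() != Some(OsStr::new(".collection"))
}

/// Filename relative to the collection root, or `None` if `file` is outside `root`.
pub fn relative_name(root: &Path, file: &Path) -> Option<String> {
    let rel = file.strip_prefix(root).ok()?;
    // Join components with '/' so manifests look the same on every platform.
    let parts: Vec<String> = rel
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join("/"))
}
```

In the actual method, `is_file` would presumably come from the `walkdir` entry's file type, and the array returned by `fields()` is the kind of value that could be handed to the `csv` writer for each file.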
3. **Implement `export_collections_to_ipfs` method in `doctree/src/doctree.rs`:**
* Add necessary imports: `tokio`.
* Define an `async` method `export_collections_to_ipfs` on the `DocTree` struct. This method will take the output CSV directory path as an argument.
* Inside the method, iterate through the `self.collections` HashMap.
* For each collection, construct the output CSV file path (e.g., `output_dir/collection_name.csv`).
* Call the `export_to_ipfs` method on the collection, awaiting the result.
* Handle potential errors from the collection export.
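The path construction in step 3 can be sketched as follows. This is a minimal illustration that assumes `self.collections` is keyed by collection name as a `String`; the helper names `manifest_path` and `plan_manifests` are hypothetical, and the value type is left generic because the sketch never touches the collection itself.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Builds `output_dir/<collection_name>.csv` for one collection.
/// `format!` is used instead of `set_extension` so that names containing
/// dots (for example `v1.2`) keep their full name.
pub fn manifest_path(output_dir: &Path, collection_name: &str) -> PathBuf {
    output_dir.join(format!("{}.csv", collection_name))
}

/// Plans one manifest path per collection.
/// `HashMap` iteration order is unspecified, so names are sorted first
/// to make repeated runs process collections in the same order.
pub fn plan_manifests<V>(
    collections: &HashMap<String, V>,
    output_dir: &Path,
) -> Vec<(String, PathBuf)> {
    let mut names: Vec<&String> = collections.keys().collect();
    names.sort();
    names
        .into_iter()
        .map(|name| (name.clone(), manifest_path(output_dir, name)))
        .collect()
}
```

Each planned path would then be passed to the collection's `export_to_ipfs` method and awaited. Whether one failed collection should abort the whole export or only be reported is a decision the plan leaves open.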
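4. **Export the new methods:** Make the new methods public. Because both are inherent methods, they cannot be re-exported by name with `pub use`. Declaring them as `pub async fn` inside their `impl` blocks is sufficient, and they are then reachable through the types that `doctree/src/lib.rs` exports.
```rust
// Existing exports...
pub use doctree::{DocTree, DocTreeBuilder, new, from_directory};
```
only needs to change if `Collection` is not exported yet, in which case it should become:
```rust
// Existing exports...
pub use doctree::{DocTree, DocTreeBuilder, new, from_directory};
pub use collection::Collection; // Exposes Collection::export_to_ipfs to callers; skip if already exported
```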
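One point worth checking when wiring up the exports: in Rust, inherent methods travel with their type, so visibility is controlled by `pub` on the method plus a re-export of the type, not by a `pub use` of the method name. The sketch below illustrates this with a stand-in `Collection`. Its fields and the synchronous method body are assumptions made to keep the example dependency-free; the planned method would be `pub async fn` and perform the actual export.

```rust
mod collection {
    /// Stand-in for the real `Collection` struct; only `name` is modeled here.
    pub struct Collection {
        pub name: String,
    }

    impl Collection {
        /// `pub` on the method is what makes it callable from outside the module.
        /// This placeholder returns a label instead of exporting anything.
        pub fn export_to_ipfs(&self) -> String {
            format!("exporting {}", self.name)
        }
    }
}

// Re-exporting the type is enough; its public methods come with it.
pub use collection::Collection;
```

A caller that imports `Collection` from the crate root can then write `collection.export_to_ipfs()` directly, with no separate import for the method.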
**Mermaid Diagram:**
```mermaid
graph TD
A[DocTree] --> B{Iterate Collections};
B --> C[Collection];
C --> D{Iterate Files/Images};
D --> E[Read File Content];
E --> F[Calculate Blake3 Hash];
F --> G["Encrypt Content (ChaCha20Poly1305)"];
G --> H[Add Encrypted Content to IPFS];
H --> I[Get IPFS Hash and Size];
I --> J[Write Record to CSV];
J --> D;
D --> C;
C --> B;
B --> K[CSV Manifest Files];