doctree/doctree_ipfs_export_plan.md
2025-05-13 08:52:47 +03:00

Implementation Plan: Exporting DocTree Collections to IPFS

Objective: Add functionality to the doctree library that exports the files and images in each collection to IPFS. Each file is encrypted with ChaCha20Poly1305, using the Blake3 hash of its content as the key, and a CSV manifest is generated per collection.

Dependencies:

We will need to add the following dependencies to the [dependencies] section of doctree/Cargo.toml:

  • ipfs-api = "0.17.0": For interacting with the IPFS daemon.
  • chacha20poly1305 = "0.10.1": For symmetric encryption.
  • blake3 = "1.3.1": For calculating Blake3 hashes.
  • csv = "1.1": For writing the CSV manifest file.
  • walkdir = "2.3.2": Already a dependency, but will be used for iterating through collection files.
  • tokio = { version = "1", features = ["full"] }: ipfs-api requires an async runtime.

Plan:

  1. Modify doctree/Cargo.toml: Add the new dependencies.

    [dependencies]
    # Existing dependencies...
    ipfs-api = "0.17.0"
    chacha20poly1305 = "0.10.1"
    blake3 = "1.3.1"
    csv = "1.1"
    walkdir = "2.3.2" # already a dependency; do not add a second entry
    tokio = { version = "1", features = ["full"] }
    
  2. Implement export_to_ipfs method in doctree/src/collection.rs:

    • Add necessary imports: std::path::PathBuf, std::fs, blake3, chacha20poly1305::ChaCha20Poly1305, chacha20poly1305::aead::Aead, chacha20poly1305::aead::KeyInit (the 0.10 replacement for NewAead), ipfs_api::IpfsClient, ipfs_api::IpfsApi, tokio, csv.
    • Define an async method export_to_ipfs on the Collection struct. This method will take the output CSV file path as an argument.
    • Inside the method, create a csv::Writer to write the manifest.
    • Use walkdir::WalkDir to traverse the collection's directory (self.path).
    • Filter out directories and the .collection file.
    • For each file:
      • Read the file content.
      • Calculate the Blake3 hash of the content.
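      • Use the Blake3 hash as the key for ChaCha20Poly1305; the default Blake3 output is exactly 32 bytes, which is the key size the cipher requires. Generate a random 12-byte nonce and keep it with the ciphertext (for example, prepended to it), because decryption needs the same nonce and the manifest does not record it.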
      • Encrypt the file content using ChaCha20Poly1305.
      • Connect to the local IPFS daemon using ipfs-api.
      • Add the encrypted content to IPFS.
      • Get the IPFS hash and the size of the original file.
      • Write a record to the CSV file with: self.name, filename (relative to collection path), Blake3 hash (hex encoded), IPFS hash, and original file size.
    • Handle potential errors during file reading, hashing, encryption, IPFS interaction, and CSV writing.
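
The traversal and filtering steps above can be sketched with the standard library alone. This is a minimal, synchronous sketch under stated assumptions, not the planned implementation: it uses std::fs::read_dir in place of walkdir, the names ExportCandidate and list_export_candidates are hypothetical, and hashing, encryption, the IPFS upload, and CSV writing are left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// One file that would be exported: its path relative to the collection
/// root and its original (unencrypted) size in bytes.
/// Hypothetical type, introduced only for this sketch.
#[derive(Debug, PartialEq)]
pub struct ExportCandidate {
    pub relative_path: PathBuf,
    pub size: u64,
}

/// Recursively lists regular files under `root`, skipping directories and
/// any file named `.collection`. Results are sorted by relative path.
pub fn list_export_candidates(root: &Path) -> io::Result<Vec<ExportCandidate>> {
    let mut found = Vec::new();
    // Explicit stack of directories still to visit (stand-in for walkdir).
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            let file_type = entry.file_type()?;

            if file_type.is_dir() {
                pending.push(path);
                continue;
            }
            let is_marker = path.file_name().and_then(|n| n.to_str()) == Some(".collection");
            if file_type.is_file() && !is_marker {
                let relative_path = path
                    .strip_prefix(root)
                    .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?
                    .to_path_buf();
                let size = entry.metadata()?.len();
                found.push(ExportCandidate { relative_path, size });
            }
        }
    }

    found.sort_by(|a, b| a.relative_path.cmp(&b.relative_path));
    Ok(found)
}
```

The relative paths and sizes returned here correspond to the filename and original-size columns of the manifest. Sorting is an extra choice in this sketch so that repeated exports list files in the same order.
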
  3. Implement export_collections_to_ipfs method in doctree/src/doctree.rs:

    • Add necessary imports: tokio.
    • Define an async method export_collections_to_ipfs on the DocTree struct. This method will take the output CSV directory path as an argument.
    • Inside the method, iterate through the self.collections HashMap.
    • For each collection, construct the output CSV file path (e.g., output_dir/collection_name.csv).
    • Call the export_to_ipfs method on the collection, awaiting the result.
    • Handle potential errors from the collection export.
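
The path construction in this step can be sketched as follows, assuming the collections map is keyed by collection name (the plan does not state the key type). manifest_paths is a hypothetical helper, generic over the map's value type so that it does not depend on the real Collection struct; the real method would then call export_to_ipfs for each pair and await the result.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Builds `output_dir/<collection_name>.csv` for every collection name.
/// Returns (name, path) pairs sorted by name so the order is predictable,
/// since HashMap iteration order is unspecified.
pub fn manifest_paths<V>(
    collections: &HashMap<String, V>,
    output_dir: &Path,
) -> Vec<(String, PathBuf)> {
    let mut paths: Vec<(String, PathBuf)> = collections
        .keys()
        .map(|name| (name.clone(), output_dir.join(format!("{}.csv", name))))
        .collect();
    paths.sort();
    paths
}
```

Building the file name with format! rather than Path::with_extension avoids truncating collection names that themselves contain a dot.
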
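  4. Expose the new methods: export_to_ipfs and export_collections_to_ipfs are methods, so they cannot be named in a pub use statement; declare them pub inside their impl blocks instead. The existing re-export in doctree/src/lib.rs already exposes DocTree and needs no change:

    // Existing exports...
    pub use doctree::{DocTree, DocTreeBuilder, new, from_directory};

    To let callers invoke export_to_ipfs on a single collection, the Collection type must be public as well. If it is not already re-exported, add:

    // Existing exports...
    pub use collection::Collection; // Only needed if Collection is not already exported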
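
Rust re-exports items such as types and free functions, not individual methods. A minimal sketch of how a method becomes reachable through a re-exported type is shown here; the module layout is simplified, and the method body and its synchronous signature are placeholders rather than the planned async implementation.

```rust
mod collection {
    pub struct Collection {
        pub name: String,
    }

    impl Collection {
        // `pub` on the method itself is what makes it callable from
        // outside this module. Placeholder body for illustration only.
        pub fn export_to_ipfs(&self, csv_path: &str) -> String {
            format!("exporting {} to {}", self.name, csv_path)
        }
    }
}

// Re-exporting the type is enough; its public methods come with it.
pub use collection::Collection;
```

With this layout a caller writes Collection { .. }.export_to_ipfs(..) without any further entry in the pub use list.
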
    

Mermaid Diagram:

graph TD
    A[DocTree] --> B{Iterate Collections};
    B --> C[Collection];
    C --> D{Iterate Files/Images};
    D --> E[Read File Content];
    E --> F[Calculate Blake3 Hash];
    F --> G["Encrypt Content (ChaCha20Poly1305)"];
    G --> H[Add Encrypted Content to IPFS];
    H --> I[Get IPFS Hash and Size];
    I --> J[Write Record to CSV];
    J --> D;
    D --> C;
    C --> B;
    B --> K[CSV Manifest Files];