# Implementation Plan: DocTree Collection Scanner

## Overview

We need to extend the doctree library to:

1. Add a recursive scan function to the `DocTree` struct
2. Detect directories containing `.collection` files
3. Parse `.collection` files as TOML to extract collection names
4. Replace the current `name_fix` function with the one from the sal library
5. Populate collections with all files found under the collection directories

## Detailed Implementation Plan

### 1. Update Dependencies

First, add the necessary dependencies to `Cargo.toml`:

```toml
[dependencies]
walkdir = "2.3.3"
pulldown-cmark = "0.9.3"
thiserror = "1.0.40"
lazy_static = "1.4.0"
toml = "0.7.3" # Add TOML parsing support
serde = { version = "1.0", features = ["derive"] } # Required by the Deserialize derive in step 3
```

### 2. Replace the name_fix Function

Replace the current `name_fix` function in `utils.rs` with the one from the sal library:

```rust
pub fn name_fix(text: &str) -> String {
    let mut result = String::with_capacity(text.len());
    let mut last_was_underscore = false;

    for c in text.chars() {
        // Keep only ASCII characters; non-ASCII characters are simply skipped
        if !c.is_ascii() {
            continue;
        }

        // Whitespace and these punctuation characters are replaced with an underscore
        let is_separator = c.is_whitespace()
            || matches!(
                c,
                ',' | '-' | '"' | '\'' | '#' | '!' | '(' | ')' | '[' | ']' | '='
                    | '+' | '<' | '>' | '@' | '$' | '%' | '^' | '&' | '*'
            );

        if is_separator {
            // Only add an underscore if the last character wasn't an underscore
            if !last_was_underscore {
                result.push('_');
                last_was_underscore = true;
            }
        } else {
            // Add the character as is (it is converted to lowercase below)
            result.push(c);
            last_was_underscore = false;
        }
    }

    // Convert to lowercase
    result.to_lowercase()
}
```

### 3. Add Collection Configuration Struct

Create a new struct to represent the configuration found in `.collection` files:

```rust
use serde::Deserialize;

#[derive(Deserialize, Default)]
struct CollectionConfig {
    /// Optional collection name; the directory name is used when it is absent
    name: Option<String>,
    // Add other configuration options as needed
}
```

### 4. Add Scan Collections Method to DocTree

Add a new method to the `DocTree` struct that recursively scans directories for `.collection` files. Read and parse errors are logged and the directory name is used as the collection name instead, so a single bad `.collection` file does not abort the scan:

```rust
use std::fs;
use std::path::Path;
use walkdir::WalkDir;

impl DocTree {
    /// Recursively scan directories for .collection files and add them as collections
    ///
    /// # Arguments
    ///
    /// * `root_path` - The root path to start scanning from
    ///
    /// # Returns
    ///
    /// Ok(()) on success or an error
    pub fn scan_collections<P: AsRef<Path>>(&mut self, root_path: P) -> Result<()> {
        let root_path = root_path.as_ref();

        // Walk through the directory tree
        for entry in WalkDir::new(root_path).follow_links(true) {
            let entry = match entry {
                Ok(entry) => entry,
                Err(e) => {
                    eprintln!("Error walking directory: {}", e);
                    continue;
                }
            };

            // Skip non-directories
            if !entry.file_type().is_dir() {
                continue;
            }

            // Check if this directory contains a .collection file
            let collection_file_path = entry.path().join(".collection");
            if collection_file_path.exists() {
                // Found a collection directory
                let dir_path = entry.path();

                // Get the directory name as a fallback collection name
                let dir_name = dir_path
                    .file_name()
                    .and_then(|name| name.to_str())
                    .unwrap_or("unnamed");

                // Try to read and parse the .collection file
                let collection_name = match fs::read_to_string(&collection_file_path) {
                    Ok(content) => {
                        // Parse as TOML
                        match toml::from_str::<CollectionConfig>(&content) {
                            Ok(config) => {
                                // Use the name from config if available, otherwise use directory name
                                config.name.unwrap_or_else(|| dir_name.to_string())
                            }
                            Err(e) => {
                                eprintln!(
                                    "Error parsing .collection file at {:?}: {}",
                                    collection_file_path, e
                                );
                                dir_name.to_string()
                            }
                        }
                    }
                    Err(e) => {
                        eprintln!(
                            "Error reading .collection file at {:?}: {}",
                            collection_file_path, e
                        );
                        dir_name.to_string()
                    }
                };

                // Add the collection to the DocTree
                match self.add_collection(dir_path, &collection_name) {
                    Ok(_) => {
                        println!("Added collection '{}' from {:?}", collection_name, dir_path);
                    }
                    Err(e) => {
                        eprintln!(
                            "Error adding collection '{}' from {:?}: {}",
                            collection_name, dir_path, e
                        );
                    }
                }
            }
        }

        Ok(())
    }
}
```

### 5. Update the DocTreeBuilder

Update the `DocTreeBuilder` to include a method for scanning collections. The method scans into a temporary `DocTree` and then copies the discovered collections into the builder:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

impl DocTreeBuilder {
    /// Scan for collections in the given root path
    ///
    /// # Arguments
    ///
    /// * `root_path` - The root path to scan for collections
    ///
    /// # Returns
    ///
    /// Self for method chaining or an error
    pub fn scan_collections<P: AsRef<Path>>(self, root_path: P) -> Result<Self> {
        // Ensure storage is set
        let storage = self.storage.as_ref().ok_or_else(|| {
            DocTreeError::MissingParameter("storage".to_string())
        })?;

        // Create a temporary DocTree to scan collections
        let mut temp_doctree = DocTree {
            collections: HashMap::new(),
            default_collection: None,
            storage: storage.clone(),
            name: self.name.clone().unwrap_or_default(),
            path: self.path.clone().unwrap_or_else(|| PathBuf::from("")),
        };

        // Scan for collections
        temp_doctree.scan_collections(root_path)?;

        // Create a new builder with the scanned collections
        let mut new_builder = self;
        for (name, collection) in temp_doctree.collections {
            new_builder.collections.insert(name, collection);
        }

        Ok(new_builder)
    }
}
```

### 6. Add a Convenience Function to the Library

Add a convenience function to the library for creating a `DocTree` by scanning a directory:

```rust
use std::path::Path;

/// Create a new DocTree by scanning a directory for collections
///
/// # Arguments
///
/// * `root_path` - The root path to scan for collections
///
/// # Returns
///
/// A new DocTree or an error
pub fn from_directory<P: AsRef<Path>>(root_path: P) -> Result<DocTree> {
    let storage = RedisStorage::new("redis://localhost:6379")?;

    DocTree::builder()
        .with_storage(storage)
        .scan_collections(root_path)?
        .build()
}
```

## Implementation Flow Diagram

```mermaid
flowchart TD
    A[Start] --> B[Update Dependencies]
    B --> C[Replace name_fix function]
    C --> D[Add CollectionConfig struct]
    D --> E[Add scan_collections method to DocTree]
    E --> F[Update DocTreeBuilder]
    F --> G[Add convenience function]
    G --> H[End]
```

## Component Interaction Diagram

```mermaid
graph TD
    A[DocTree] -->|manages| B[Collections]
    C[scan_collections] -->|finds| D[.collection files]
    D -->|parsed as| E[TOML]
    E -->|extracts| F[Collection Name]
    C -->|creates| B
    G[name_fix] -->|processes| F
    G -->|processes| H[File Names]
    B -->|contains| H
```

## Testing Plan

1. Create test directories with `.collection` files in various formats
2. Test the `scan_collections` method with these directories
3. Verify that collections are created correctly with the expected names
4. Verify that all files under the collection directories are included in the collections
5. Test edge cases such as empty `.collection` files and invalid TOML; with the fallback logic in step 4, both should produce a collection named after its directory
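
The fixture directories called for in the testing plan could be set up along the following lines. This is only a sketch under stated assumptions: `create_fixture` and `find_collection_dirs` are hypothetical helpers that are not part of doctree, the directory names are made up, and the recursive `std::fs::read_dir` walk is a std-only stand-in for the `WalkDir` loop so that the example runs without external crates. It checks detection of `.collection` directories only; name extraction would still go through the `toml` crate.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: build a small fixture tree under `root`.
fn create_fixture(root: &Path) -> io::Result<()> {
    // Collection with valid TOML and an explicit name.
    let named = root.join("docs_a");
    fs::create_dir_all(named.join("nested"))?;
    fs::write(named.join(".collection"), "name = \"alpha\"\n")?;
    fs::write(named.join("nested").join("page.md"), "# Page\n")?;

    // Edge case: empty .collection file.
    let empty = root.join("docs_b");
    fs::create_dir_all(&empty)?;
    fs::write(empty.join(".collection"), "")?;

    // Edge case: .collection file that is not valid TOML.
    let invalid = root.join("docs_c");
    fs::create_dir_all(&invalid)?;
    fs::write(invalid.join(".collection"), "name = = broken")?;

    // Plain directory without a .collection file; it must not be detected.
    let plain = root.join("plain");
    fs::create_dir_all(&plain)?;
    fs::write(plain.join("readme.md"), "no collection here\n")?;
    Ok(())
}

/// Hypothetical std-only stand-in for the WalkDir loop: collect every
/// directory at or below `dir` that contains a `.collection` file.
fn find_collection_dirs(dir: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    if dir.join(".collection").is_file() {
        found.push(dir.to_path_buf());
    }
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            find_collection_dirs(&path, found)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Use a per-process directory under the system temp dir.
    let root = std::env::temp_dir().join(format!("doctree_fixture_{}", std::process::id()));
    let _ = fs::remove_dir_all(&root); // ignore the error if it does not exist yet
    create_fixture(&root)?;

    let mut found = Vec::new();
    find_collection_dirs(&root, &mut found)?;
    found.sort();
    for dir in &found {
        println!("{}", dir.display());
    }

    fs::remove_dir_all(&root)?;
    Ok(())
}
```

In a real test, the same fixture would be passed to `scan_collections`, and the assertions would compare the resulting collection names against the expected values for each case.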