Import uses wrong namespace for embedder vectors — search returns no results #51

Closed
opened 2026-02-13 14:45:29 +00:00 by mik-tf · 0 comments
Owner

Problem

When importing a git repo via the Import UI, two different namespaces are used:

  1. Library namespace (derived from git URL org) — e.g. znzcybercity — used for the library page, search UI, and disk structure at ~/hero/var/books/znzcybercity/
  2. Embedder namespace (from the form's namespace field or collection name) — e.g. cybercity — used for vector storage in hero_embedder

The search page queries the library namespace (znzcybercity), which holds 0 documents, while the vectors sit in cybercity with 2005 documents. Result: "No results found" for every query on a freshly imported library.
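The mismatch can be illustrated with a small in-memory model. This is only a sketch: `InMemoryVectorStore` and its methods are hypothetical stand-ins (not the hero_embedder API), and substring matching stands in for real vector search.

```python
from collections import defaultdict


class InMemoryVectorStore:
    """Hypothetical stand-in for an embedder: documents are grouped by namespace."""

    def __init__(self):
        self._docs = defaultdict(list)

    def upload(self, namespace, docs):
        # Documents are only visible under the namespace they were uploaded to.
        self._docs[namespace].extend(docs)

    def count(self, namespace):
        return len(self._docs.get(namespace, []))

    def search(self, namespace, term):
        # Substring match as a placeholder for similarity search.
        return [d for d in self._docs.get(namespace, []) if term.lower() in d.lower()]


store = InMemoryVectorStore()

# The import uploads vectors under the embedder namespace taken from the form...
store.upload("cybercity", ["Welcome to the city docs", "Governance overview"])

# ...but the search page queries the library namespace derived from the git org.
library_namespace = "znzcybercity"
print(store.count(library_namespace))           # 0
print(store.search(library_namespace, "city"))  # []
print(store.count("cybercity"))                 # 2
```

The documents exist, but no query scoped to the library namespace can ever reach them.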

Steps to Reproduce

  1. Delete znzcybercity from the system completely
  2. Import via UI: git URL https://forge.ourworld.tf/znzcybercity/docs_znzcybercity
  3. Go to /library/znzcybercity and search anything
  4. "No results found" — vectors are in namespace cybercity, not znzcybercity

Expected Behavior

The import pipeline should use a single consistent namespace for both the library directory structure and the embedder vector storage.

Solution

Ensure import_collection_pipeline() uses the same namespace for:

  • scan_collections() namespace assignment
  • VectorStoreConfig namespace for embedder uploads
  • Library config and disk directory creation

The library namespace (derived from git org) should be the canonical one used everywhere.
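One possible shape for the fix, as a minimal sketch: derive the namespace once from the git URL and pass that single value to every consumer. The names `namespace_from_git_url`, `ImportPlan`, and `plan_import` are hypothetical and do not come from the hero_books codebase; the sketch also assumes the org is the first path segment of the URL.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from urllib.parse import urlparse


def namespace_from_git_url(git_url: str) -> str:
    """Return the org segment of a git URL (assumed: first path component)."""
    parts = [p for p in urlparse(git_url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected <org>/<repo> in URL path: {git_url!r}")
    return parts[0]


@dataclass(frozen=True)
class ImportPlan:
    """Hypothetical summary of where each import step writes its data."""
    library_dir: PurePosixPath
    scan_namespace: str
    embedder_namespace: str


def plan_import(git_url, form_namespace=None, books_root="~/hero/var/books"):
    # The canonical namespace is computed once, from the git org.
    namespace = namespace_from_git_url(git_url)
    # form_namespace is deliberately ignored in this sketch, so the form
    # field can no longer send vectors to a different namespace.
    return ImportPlan(
        library_dir=PurePosixPath(books_root) / namespace,
        scan_namespace=namespace,
        embedder_namespace=namespace,
    )


plan = plan_import(
    "https://forge.ourworld.tf/znzcybercity/docs_znzcybercity",
    form_namespace="cybercity",
)
print(plan.library_dir)         # ~/hero/var/books/znzcybercity
print(plan.embedder_namespace)  # znzcybercity
```

Whether the form's namespace field should be ignored, rejected, or removed is a design decision for the real pipeline; the sketch only shows that all three consumers read the same value.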

Reference
lhumina_code/hero_books#51