StateStore and ImageCache Lack Synchronization #12

Closed
opened 2026-02-22 08:41:29 +00:00 by rawan · 1 comment
Member

Location:

  • crates/chvm-lib/src/vm/manager.rs:25-128 - StateStore
  • crates/chvm-lib/src/oci/cache.rs:35-182 - ImageCache

Issue:
Both StateStore and ImageCache use an in-memory HashMap that is read-modify-written and then persisted to disk, with NO internal locking. Concurrent operations (e.g., vm start + vm create in different terminals) can cause:

  1. Lost updates to the state/index
  2. Corrupted JSON files
  3. Duplicate or missing VM entries

Example Scenario:

Thread 1: Load state.json (contains VM1, VM2)
Thread 2: Load state.json (contains VM1, VM2)
Thread 1: Add VM3, save state.json (now has VM1, VM2, VM3)
Thread 2: Add VM4, save state.json (now has VM1, VM2, VM4 - VM3 is lost!)
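
The interleaving above can be replayed deterministically in a single process. This is only a minimal sketch: it assumes a hypothetical one-name-per-line file in place of the real state.json format, and `load`, `save`, and `replay_lost_update` are illustrative names, not functions from chvm-lib.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a store: one VM name per line instead of JSON.
fn load(path: &Path) -> io::Result<BTreeSet<String>> {
    match fs::read_to_string(path) {
        Ok(text) => Ok(text
            .lines()
            .filter(|l| !l.is_empty())
            .map(str::to_owned)
            .collect()),
        // A missing file is treated as an empty store.
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(BTreeSet::new()),
        Err(e) => Err(e),
    }
}

/// Overwrites the whole file with the in-memory set (no locking, not atomic).
fn save(path: &Path, vms: &BTreeSet<String>) -> io::Result<()> {
    let names: Vec<&str> = vms.iter().map(String::as_str).collect();
    fs::write(path, names.join("\n"))
}

/// Replays the scenario step by step and returns what is left on disk.
fn replay_lost_update(path: &Path) -> io::Result<BTreeSet<String>> {
    let initial: BTreeSet<String> =
        ["VM1", "VM2"].iter().map(|s| s.to_string()).collect();
    save(path, &initial)?;

    let mut first = load(path)?; // writer 1 takes its snapshot
    let mut second = load(path)?; // writer 2 takes the same snapshot

    first.insert("VM3".to_string());
    save(path, &first)?; // disk: VM1, VM2, VM3

    second.insert("VM4".to_string());
    save(path, &second)?; // disk: VM1, VM2, VM4 (VM3 is overwritten)

    load(path)
}
```

Calling `replay_lost_update` on a scratch path returns a set that contains VM4 but not VM3, because the second writer saves a snapshot taken before the first writer's update.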
Member

After investigation:

StateStore issue

  • The example scenario is not valid, since each VM has its own state.json. The in-memory HashMap is per-process, so two separate chvm create CLI invocations each have their own VmManager instance with their own HashMap. There's no shared mutable state between processes.
  • The only real issue here is two processes operating on the same VM concurrently (e.g. chvm stop vm1 and chvm start vm1 at the same time).

ImageCache issue

  • The ImageCache uses a single shared index.json file. Two concurrent chvm pull or chvm create commands that both trigger image pulls could genuinely corrupt or lose entries in index.json.

Work completed in PR:

  • ImageCache: added exclusive lock + reload-before-modify on all mutations (add, remove, ensure_rootfs), shared lock on reads (get), and atomic writes for index.json
  • StateStore: added the same protections as a defensive measure. Also added reload_state() so resolve_clone() and update_status() re-read from disk before modifying, which guards against the edge case of concurrent operations on the same VM
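
For readers unfamiliar with the pattern, the mutation path described above (exclusive lock, reload-before-modify, atomic write) might look roughly like the sketch below. It is not the code from the PR. The comment does not say which locking primitive was used, so this sketch substitutes a `create_new` lock file, which needs only the standard library. The one-entry-per-line format and the names `LockGuard`, `load`, `save_atomic`, and `add_entry` are all assumptions made for illustration.

```rust
use std::collections::BTreeSet;
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::thread;
use std::time::Duration;

/// Holds an exclusive lock for as long as it is alive.
/// Assumption: a lock file created with `create_new` stands in for
/// whatever lock the real implementation takes.
struct LockGuard {
    path: PathBuf,
}

impl LockGuard {
    fn acquire(index: &Path) -> io::Result<LockGuard> {
        let path = index.with_extension("lock");
        loop {
            // create_new fails with AlreadyExists while another holder exists.
            match OpenOptions::new().write(true).create_new(true).open(&path) {
                Ok(_) => return Ok(LockGuard { path }),
                Err(e) if e.kind() == io::ErrorKind::AlreadyExists => {
                    thread::sleep(Duration::from_millis(10));
                }
                Err(e) => return Err(e),
            }
        }
    }
}

impl Drop for LockGuard {
    fn drop(&mut self) {
        // Release the lock by removing the lock file.
        let _ = fs::remove_file(&self.path);
    }
}

/// Hypothetical index format: one entry per line instead of JSON.
fn load(index: &Path) -> io::Result<BTreeSet<String>> {
    match fs::read_to_string(index) {
        Ok(text) => Ok(text
            .lines()
            .filter(|l| !l.is_empty())
            .map(str::to_owned)
            .collect()),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(BTreeSet::new()),
        Err(e) => Err(e),
    }
}

/// Atomic write: fill a temp file next to the index, sync it, then rename
/// it over the index so readers see either the old or the new contents.
fn save_atomic(index: &Path, entries: &BTreeSet<String>) -> io::Result<()> {
    let tmp = index.with_extension("tmp");
    let mut file = File::create(&tmp)?;
    for entry in entries {
        writeln!(file, "{}", entry)?;
    }
    file.sync_all()?;
    fs::rename(&tmp, index)
}

/// Mutation path: lock, reload from disk, modify, write atomically.
fn add_entry(index: &Path, name: &str) -> io::Result<()> {
    let _guard = LockGuard::acquire(index)?;
    let mut entries = load(index)?; // reload-before-modify
    entries.insert(name.to_owned());
    save_atomic(index, &entries)
    // _guard is dropped here, which releases the lock.
}
```

One caveat of the lock-file stand-in: a process that crashes while holding the lock leaves the file behind, and it cannot express a shared lock for readers. An OS-level advisory lock avoids both problems.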
Reference
geomind_code/my_hypervisor#12