Mycelium: background health check and reconnect logic #5

Open
opened 2026-02-11 19:20:58 +00:00 by thabeta · 1 comment
Owner

The current implementation in chvm-lib probes for the Mycelium IP once after boot but does not monitor the health of the connection afterwards. If the mycelium process inside the VM or on the host dies or stalls, the VM loses connectivity without any notification.

Proposed Changes:

  1. Implement a background watcher in VmManager for Mycelium-enabled VMs.
  2. Add health checks for the internal mycelium process via vsock or network probing.
  3. Update chvm-init to restart the mycelium service if it fails.
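
A minimal sketch of the background watcher from item 1, assuming one plain thread per VM. `Watcher`, `spawn`, `is_healthy` and `shutdown` are hypothetical names, not existing chvm-lib API, and the `probe` closure stands in for whichever vsock or network check item 2 ends up using.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical per-VM watcher handle (illustration only).
pub struct Watcher {
    healthy: Arc<AtomicBool>,
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl Watcher {
    /// Starts a thread that calls `probe` every `interval` and stores the
    /// latest result. The flag reads `true` until the first probe finishes.
    pub fn spawn<F>(mut probe: F, interval: Duration) -> Self
    where
        F: FnMut() -> bool + Send + 'static,
    {
        let healthy = Arc::new(AtomicBool::new(true));
        let stop = Arc::new(AtomicBool::new(false));
        let (h, s) = (Arc::clone(&healthy), Arc::clone(&stop));
        let handle = thread::spawn(move || {
            while !s.load(Ordering::Relaxed) {
                h.store(probe(), Ordering::Relaxed);
                thread::sleep(interval);
            }
        });
        Watcher { healthy, stop, handle: Some(handle) }
    }

    /// Result of the most recent probe.
    pub fn is_healthy(&self) -> bool {
        self.healthy.load(Ordering::Relaxed)
    }

    /// Signals the thread to stop and waits for it (up to one `interval`).
    pub fn shutdown(mut self) {
        self.stop.store(true, Ordering::Relaxed);
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}
```

With this shape, VmManager would hold one `Watcher` per Mycelium-enabled VM and call `shutdown()` when the VM stops. Whether a thread, an async task, or on-demand checks fit better is a design choice this sketch does not settle.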
Member

Work completed in PR

Guest-side: Mycelium process supervision (chvm-init)

  • Refactored mycelium from fire-and-forget into setup_mycelium() (parse config) + spawn_mycelium() (start process, return PID),
    enabling restart
  • SIGCHLD-based reaper in signal.rs now supervises mycelium: when the mycelium PID is reaped, it schedules a non-blocking restart
    after a delay, with up to 5 retry attempts on spawn failure
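
The restart logic described above can be sketched roughly as follows. This is not the code from signal.rs: `Supervisor`, `on_reaped` and `poll` are hypothetical names, the one-second `RESTART_DELAY` is an assumed value (the comment does not state the delay), and giving up after five failed spawn attempts is one reading of the retry limit.

```rust
use std::io;
use std::time::{Duration, Instant};

const MAX_SPAWN_RETRIES: u32 = 5; // limit mentioned in the comment
const RESTART_DELAY: Duration = Duration::from_secs(1); // assumed value

/// Hypothetical restart state for a single supervised process.
pub struct Supervisor {
    pid: Option<u32>,
    restart_at: Option<Instant>,
    failed_spawns: u32,
}

impl Supervisor {
    pub fn new(pid: u32) -> Self {
        Supervisor { pid: Some(pid), restart_at: None, failed_spawns: 0 }
    }

    /// PID of the running process, if any.
    pub fn pid(&self) -> Option<u32> {
        self.pid
    }

    /// Called by the reaper for every reaped child. Only records a
    /// deadline, so the signal path never blocks.
    pub fn on_reaped(&mut self, reaped: u32, now: Instant) {
        if self.pid == Some(reaped) {
            self.pid = None;
            self.failed_spawns = 0;
            self.restart_at = Some(now + RESTART_DELAY);
        }
    }

    /// Called from the main loop. Spawns once the deadline has passed and
    /// reschedules on failure until the retry budget is used up.
    pub fn poll<F>(&mut self, now: Instant, mut spawn: F)
    where
        F: FnMut() -> io::Result<u32>,
    {
        match self.restart_at {
            Some(deadline) if now >= deadline => {}
            _ => return, // nothing scheduled, or too early
        }
        match spawn() {
            Ok(pid) => {
                self.pid = Some(pid);
                self.restart_at = None;
            }
            Err(_) => {
                self.failed_spawns += 1;
                self.restart_at = if self.failed_spawns < MAX_SPAWN_RETRIES {
                    Some(now + RESTART_DELAY)
                } else {
                    None // give up
                };
            }
        }
    }
}
```

Passing `now` in explicitly keeps the state machine free of sleeps and makes the delay and retry behaviour easy to test without real processes.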

Host-side: On-demand health checks (chvm-lib)

  • Added MyceliumHealth enum (Healthy/Unhealthy/Unknown) and mycelium_health field on NetworkState
  • Added check_mycelium_health() in manager.rs
  • refresh_status() now clears stale health when the VM stops
  • IPv6 normalization via std::net::Ipv6Addr for canonical formatting
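
For illustration, the health state and the IPv6 normalization could look roughly like this. Only `MyceliumHealth`, its three variants, `NetworkState`, and the `mycelium_health` field come from the comment above; the `mycelium_ip` field and the `normalize_ipv6` and `clear_on_stop` helpers are hypothetical, and the real types in chvm-lib may differ.

```rust
use std::net::Ipv6Addr;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum MyceliumHealth {
    Healthy,
    Unhealthy,
    /// Not checked yet, or the VM is not running.
    #[default]
    Unknown,
}

#[derive(Debug, Default)]
pub struct NetworkState {
    /// Hypothetical field: the VM's Mycelium address in canonical form.
    pub mycelium_ip: Option<String>,
    pub mycelium_health: MyceliumHealth,
}

/// Parses an IPv6 string and re-renders it through `Ipv6Addr`'s `Display`,
/// so differently written forms of the same address compare equal.
pub fn normalize_ipv6(raw: &str) -> Option<String> {
    raw.trim().parse::<Ipv6Addr>().ok().map(|addr| addr.to_string())
}

impl NetworkState {
    /// Hypothetical helper: drop stale health once the VM has stopped.
    pub fn clear_on_stop(&mut self) {
        self.mycelium_health = MyceliumHealth::Unknown;
    }
}
```

Here both `0400:0000:0000:0000:0000:0000:0000:0001` and `400::1` normalize to `400::1`, so an address reported by the guest can be compared directly against the one stored on the host.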

CLI: Health in inspect (chvm-cli)

  • inspect.rs now calls check_mycelium_health() before displaying state so mycelium_health appears in JSON output

Tests (tests/13_mycelium.sh)

  • Added 8 new test cases covering: health status healthy on initial inspect, process running check, kill + unhealthy detection,
    auto-restart after crash (new PID), IPv6 recovery after restart, health status healthy after recovery
Reference
geomind_code/my_hypervisor#5