Claude Code Reference for Zero-OS Builder

This document provides essential context for Claude Code (or any AI assistant) working with this Zero-OS Alpine Initramfs Builder repository.

Project Overview

What is this? A sophisticated build system for creating custom Alpine Linux 3.22 x86_64 initramfs images with zinit process management, designed for Zero-OS deployment on ThreeFold Grid.

Key Features:

  • Container-based reproducible builds (rootless podman/docker)
  • Incremental staged build pipeline with completion markers
  • zinit process manager (complete OpenRC replacement)
  • RFS (Remote File System) for lazy-loading modules/firmware from S3
  • Rust components built with musl static linking
  • Aggressive size optimization (strip + UPX)
  • Embedded initramfs in kernel (single vmlinuz.efi output)

Repository Structure

zosbuilder/
├── config/                  # All configuration files
│   ├── build.conf          # Build settings (versions, paths, flags)
│   ├── packages.list       # Alpine packages to install
│   ├── sources.conf        # ThreeFold components to build
│   ├── modules.conf        # 2-stage kernel module loading
│   ├── firmware.conf       # Firmware to include in initramfs
│   ├── kernel.config       # Linux kernel configuration
│   ├── init                # /init script for initramfs
│   └── zinit/              # zinit service definitions (YAML)
│
├── scripts/
│   ├── build.sh            # Main orchestrator (DO NOT EDIT LIGHTLY)
│   ├── clean.sh            # Clean all artifacts
│   ├── dev-container.sh    # Persistent dev container manager
│   ├── rebuild-after-zinit.sh  # Quick rebuild helper
│   ├── lib/                # Modular build libraries
│   │   ├── common.sh       # Logging, path normalization, utilities
│   │   ├── stages.sh       # Incremental stage tracking
│   │   ├── docker.sh       # Container lifecycle
│   │   ├── alpine.sh       # Alpine extraction, packages, cleanup
│   │   ├── components.sh   # Build Rust components from sources.conf
│   │   ├── initramfs.sh    # Assembly, optimization, CPIO creation
│   │   └── kernel.sh       # Kernel download, config, build, embed
│   └── rfs/                # RFS flist generation scripts
│       ├── common.sh       # S3 config, version computation
│       ├── pack-modules.sh # Create modules flist
│       ├── pack-firmware.sh # Create firmware flist
│       └── verify-flist.sh  # Inspect/test flists
│
├── docs/                   # Detailed documentation
│   ├── NOTES.md           # Operational knowledge & troubleshooting
│   ├── PROMPT.md          # Agent guidance (strict debugger mode)
│   ├── TODO.md            # Persistent checklist with code refs
│   ├── AGENTS.md          # Quick reference for agents
│   ├── rfs-flists.md      # RFS design and runtime flow
│   ├── review-rfs-integration.md  # Integration points
│   └── depmod-behavior.md # Module dependency details
│
├── runit.sh               # Test runner (QEMU/cloud-hypervisor)
├── initramfs/             # Generated initramfs tree
├── components/            # Generated component sources
├── kernel/                # Generated kernel source
├── dist/                  # Final outputs
│   ├── vmlinuz.efi        # Kernel with embedded initramfs
│   └── initramfs.cpio.xz  # Standalone initramfs archive
└── .build-stages/         # Incremental build markers (*.done files)

Core Concepts

1. Incremental Staged Builds

How it works:

  • Each stage creates a .build-stages/<stage_name>.done marker on success
  • Subsequent builds skip completed stages unless forced
  • Use ./scripts/build.sh --show-stages to see status
  • Use ./scripts/build.sh --rebuild-from=<stage> to restart from a specific stage
  • Manually remove .done files to re-run specific stages

Build stages (in order):

alpine_extract → alpine_configure → alpine_packages → alpine_firmware
  → components_build → components_verify → kernel_modules
  → init_script → components_copy → zinit_setup
  → modules_setup → modules_copy → cleanup → rfs_flists
  → validation → initramfs_create → initramfs_test → kernel_build
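
Conceptually, the stage tracking in scripts/lib/stages.sh reduces to checking and writing these marker files. A minimal sketch, assuming hypothetical helper names (the real API lives in scripts/lib/stages.sh):

# Illustrative sketch of the marker pattern, not the exact stages.sh API
STAGES_DIR=".build-stages"

stage_is_done()   { [ -f "${STAGES_DIR}/$1.done" ]; }
stage_mark_done() { mkdir -p "${STAGES_DIR}" && touch "${STAGES_DIR}/$1.done"; }

run_stage() {
    local name="$1"; shift
    if stage_is_done "$name" && [ "${FORCE_REBUILD:-false}" != "true" ]; then
        echo "Skipping completed stage: $name"
        return 0
    fi
    "$@" && stage_mark_done "$name"
}

# Hypothetical usage: run_stage alpine_extract alpine_extract_miniroot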

Key insight: The build ALWAYS runs inside a container. Host invocations auto-spawn containers.

2. Container-First Architecture

Why containers?

  • Reproducible toolchain (Alpine 3.22 base with exact dependencies)
  • Rootless execution (no privileged access needed)
  • Isolation from host environment
  • GitHub Actions compatible

Container modes:

  • Transient: ./scripts/build.sh spawns, builds, exits
  • Persistent: ./scripts/dev-container.sh start/shell/build

Important: Directory paths are normalized to absolute paths anchored at PROJECT_ROOT to avoid CWD issues when stages change directories (especially kernel builds).

3. Component Build System

sources.conf format:

TYPE:NAME:URL:VERSION:BUILD_FUNCTION[:EXTRA]

Example:

git:zinit:https://github.com/threefoldtech/zinit:master:build_zinit
git:rfs:https://github.com/threefoldtech/rfs:development:build_rfs
git:mycelium:https://github.com/threefoldtech/mycelium:0.6.1:build_mycelium
release:corex:https://github.com/threefoldtech/corex/releases/download/2.1.4/corex-2.1.4-amd64-linux-static:2.1.4:install_corex:rename=corex

Build functions are defined in scripts/lib/components.sh and handle:

  • Rust builds with x86_64-unknown-linux-musl target
  • Static linking via RUSTFLAGS="-C target-feature=+crt-static"
  • Special cases (e.g., mycelium builds in myceliumd/ subdirectory)
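
For orientation, a Rust build function follows this general shape. This is a simplified sketch under the conventions above; the real functions in scripts/lib/components.sh handle more cases (subdirectories, binary installation, error handling):

# Simplified sketch of a Rust/musl build function (see components.sh for the real ones)
build_zinit() {
    local src_dir="${COMPONENTS_DIR}/zinit"
    (
        cd "$src_dir" || return 1
        export RUSTFLAGS="-C target-feature=+crt-static"
        cargo build --release --target x86_64-unknown-linux-musl
    )
    # Resulting binary: target/x86_64-unknown-linux-musl/release/zinit
}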

4. RFS Flists (Remote File System)

Purpose: Lazy-load kernel modules and firmware from S3 at runtime

Flow:

  1. Build stage creates flists: modules-<KERNEL_VERSION>.fl and firmware-<TAG>.fl
  2. Flists are SQLite databases containing:
    • Content-addressed blob references
    • S3 store URIs (patched for read-only access)
    • Directory tree metadata
  3. Flists embedded in initramfs at /etc/rfs/
  4. Runtime: zinit units mount flists over /lib/modules/ and /lib/firmware/
  5. Dual udev coldplug: early (before RFS) for networking, post-RFS for new hardware

Key files:

  • scripts/rfs/pack-modules.sh - Creates modules flist from container /lib/modules/
  • scripts/rfs/pack-firmware.sh - Creates firmware flist from Alpine packages
  • config/zinit/init/modules.sh - Runtime mount script
  • config/zinit/init/firmware.sh - Runtime mount script
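
For orientation only, the runtime module-mount flow can be pictured as the sketch below. The paths and the commented-out mount step are assumptions for illustration; the authoritative logic lives in config/zinit/init/modules.sh and firmware.sh:

# Sketch of the runtime module-mount flow (illustrative, not the shipped script)
KVER="$(uname -r)"
FLIST="/etc/rfs/modules-${KVER}.fl"      # flist embedded at build time
# ... mount "$FLIST" over /lib/modules via rfs here (see the real script for the exact CLI) ...
depmod -a "$KVER"                        # refresh module indexes after the overmount
udevadm trigger --action=add             # post-RFS coldplug for newly available hardware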

5. zinit Service Management

No OpenRC: This system uses zinit exclusively for process management.

Service graph:

/init → zinit → [stage1-modules, udevd, depmod]
                → udev-trigger (early coldplug)
                → network
                → rfs-modules + rfs-firmware (mount flists)
                → udev-rfs (post-RFS coldplug)
                → services

Service definitions: YAML files in config/zinit/ with after:, needs:, wants: dependencies
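
As an illustration, a new service definition could be added alongside the existing ones. The unit below is hypothetical (exec target and dependency list invented for the example); copy the structure of an existing unit in config/zinit/ rather than this sketch:

# Hypothetical unit that starts after networking is up
cat > config/zinit/myservice.yaml <<'EOF'
exec: /usr/bin/myservice --config /etc/myservice.conf
after:
  - network
EOF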

6. Kernel Versioning and S3 Upload

Versioned Kernel Output:

  • Standard kernel: dist/vmlinuz.efi (for compatibility)
  • Versioned kernel: dist/vmlinuz-{VERSION}-{ZINIT_HASH}.efi
  • Example: vmlinuz-6.12.44-Zero-OS-a1b2c3d.efi

Version components:

  • {VERSION}: Full kernel version from KERNEL_VERSION + CONFIG_LOCALVERSION
  • {ZINIT_HASH}: Short git commit hash from components/zinit/.git

S3 Upload (optional):

  • Controlled by UPLOAD_KERNEL=true environment variable
  • Uses MinIO client (mcli or mc) to upload to S3-compatible storage
  • Uploads versioned kernel to: s3://{bucket}/{prefix}/kernel/{versioned_filename}

Kernel Index Generation: After uploading, the build automatically generates and uploads index files:

  • kernels.txt - Plain text, one kernel per line, sorted reverse chronologically
  • kernels.json - JSON format with metadata (timestamp, count)

Why index files?

  • S3 web interfaces often don't support directory listings
  • Enables dropdown menus in web UIs without S3 API access
  • Provides kernel discovery for deployment tools

JSON index structure:

{
  "kernels": ["vmlinuz-6.12.44-Zero-OS-abc1234.efi", ...],
  "updated": "2025-01-04T12:00:00Z",
  "count": 2
}
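
A sketch of how such an index can be regenerated and pushed with the MinIO client. The alias, bucket, and prefix are placeholders, and the ordering here is by version string rather than upload time; the real logic is kernel_generate_index() in scripts/lib/kernel.sh:

# Sketch: rebuild kernels.txt / kernels.json locally and upload (placeholder alias/bucket)
sort -r kernels.txt -o kernels.txt
{
    printf '{\n  "kernels": [\n'
    awk '{ printf "%s    \"%s\"", (NR>1 ? ",\n" : ""), $0 } END { print "" }' kernels.txt
    printf '  ],\n  "updated": "%s",\n  "count": %s\n}\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(wc -l < kernels.txt)"
} > kernels.json
mc cp kernels.txt  "myminio/zos/kernel/kernels.txt"
mc cp kernels.json "myminio/zos/kernel/kernels.json"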

Key functions:

  • get_git_commit_hash() in scripts/lib/common.sh - Extracts git hash
  • kernel_build_with_initramfs() in scripts/lib/kernel.sh - Creates versioned kernel
  • kernel_upload_to_s3() in scripts/lib/kernel.sh - Uploads to S3
  • kernel_generate_index() in scripts/lib/kernel.sh - Generates and uploads index
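
The versioned filename itself is straightforward to derive. A sketch, assuming the variables named above are available in the shell (the real logic is in kernel_build_with_initramfs()):

# Sketch: derive the versioned kernel name from the pinned version and zinit commit
ZINIT_HASH="$(git -C components/zinit rev-parse --short HEAD)"
FULL_VERSION="${KERNEL_VERSION}${CONFIG_LOCALVERSION}"     # e.g. 6.12.44-Zero-OS
cp dist/vmlinuz.efi "dist/vmlinuz-${FULL_VERSION}-${ZINIT_HASH}.efi"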

Critical Conventions

Path Normalization

Problem: Stages can change CWD (the kernel build uses /workspace/kernel/current).

Solution: All paths are normalized to absolute at startup in scripts/lib/common.sh:244.

Variables affected:

  • INSTALL_DIR (initramfs/)
  • COMPONENTS_DIR (components/)
  • KERNEL_DIR (kernel/)
  • DIST_DIR (dist/)

Never use relative paths when calling functions that might be in different CWDs.
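
The normalization itself is a one-time prefixing against PROJECT_ROOT at startup, roughly as sketched below (the actual implementation is at scripts/lib/common.sh:244):

# Sketch of the normalization idea (see common.sh:244 for the real code)
normalize_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;                  # already absolute, keep as-is
        *)  printf '%s\n' "${PROJECT_ROOT}/$1" ;;  # anchor relative paths once
    esac
}

INSTALL_DIR="$(normalize_path "${INSTALL_DIR:-initramfs}")"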

Branding and Security

Passwordless root enforcement:

  • Applied in scripts/lib/initramfs.sh:575 via passwd -d -R "${initramfs_dir}" root
  • Creates root:: in /etc/shadow (empty password field)
  • Controlled by ZEROOS_BRANDING and ZEROOS_PASSWORDLESS_ROOT flags

Never edit /etc/shadow manually - always use passwd or chpasswd with chroot.
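
To verify the result in a finished build, unpack the archive and inspect the shadow entry (paths assume the default dist/ output):

# Extract the built initramfs and check for an empty root password field
mkdir -p /tmp/initramfs-root
xz -dc dist/initramfs.cpio.xz | ( cd /tmp/initramfs-root && cpio -idm )
grep '^root:' /tmp/initramfs-root/etc/shadow    # expect: root:: (empty second field)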

Module Loading Strategy

2-stage approach:

  • Stage 1: Critical boot modules (virtio, e1000, scsi) - loaded by zinit early
  • Stage 2: Extended hardware (igb, ixgbe, i40e) - loaded after network

Config: config/modules.conf with stage1: and stage2: prefixes

Dependency resolution:

  • Uses modinfo to build dependency tree
  • Resolves from container /lib/modules/<FULL_VERSION>/
  • Must run after kernel_modules stage
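
The resolution can be pictured as a small recursive walk over modinfo output. A sketch only (the kernel version is an example value; the repository's implementation differs in detail):

# Sketch: recursively collect a module's dependencies via modinfo
KVER="6.12.44-Zero-OS"                       # example <FULL_VERSION>

resolve_deps() {
    local mod="$1" dep
    for dep in $(modinfo -k "$KVER" -F depends "$mod" 2>/dev/null | tr ',' ' '); do
        resolve_deps "$dep"
    done
    echo "$mod"
}

resolve_deps virtio_net | awk '!seen[$0]++'  # dependencies first, deduplicated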

Firmware Policy

For initramfs: config/firmware.conf is the SINGLE source of truth

  • Any firmware hints in modules.conf are IGNORED
  • Prevents duplication/version mismatches

For RFS: Full Alpine linux-firmware* packages installed in container

  • Packed from container /lib/firmware/
  • Overmounts at runtime for extended hardware

Common Workflows

Full Build from Scratch

# Clean everything and rebuild
./scripts/build.sh --clean

# Or just rebuild all stages
./scripts/build.sh --force-rebuild

Quick Iteration After Config Changes

# After editing zinit configs, init script, or modules.conf
./scripts/rebuild-after-zinit.sh

# With kernel rebuild
./scripts/rebuild-after-zinit.sh --with-kernel

# Dry-run to see what changed
./scripts/rebuild-after-zinit.sh --verify-only

Minimal Manual Rebuild

# Remove specific stages
rm -f .build-stages/initramfs_create.done
rm -f .build-stages/validation.done

# Rebuild only those stages
DEBUG=1 ./scripts/build.sh

Testing the Built Kernel

# QEMU (default)
./runit.sh

# cloud-hypervisor with 5 disks
./runit.sh --hypervisor ch --disks 5 --reset

# Custom memory and bridge
./runit.sh --memory 4096 --bridge zosbr

Persistent Dev Container

# Start persistent container
./scripts/dev-container.sh start

# Enter shell
./scripts/dev-container.sh shell

# Run build inside
./scripts/dev-container.sh build

# Stop container
./scripts/dev-container.sh stop

Debugging Guidelines

Diagnostics-First Approach

ALWAYS add diagnostics before fixes:

  1. Enable DEBUG=1 for verbose safe_execute logs
  2. Add strategic log_debug statements
  3. Confirm hypothesis in logs
  4. Then apply minimal fix

Example:

# Bad: Guess and fix
Edit file to fix suspected issue

# Good: Diagnose first
1. Add log_debug "Variable X=${X}, resolved=${resolved_path}"
2. DEBUG=1 ./scripts/build.sh
3. Confirm in output
4. Apply fix with evidence

Key Diagnostic Functions

  • scripts/lib/common.sh: log_info, log_warn, log_error, log_debug
  • scripts/lib/initramfs.sh:820: Validation debug prints (input, PWD, PROJECT_ROOT, resolved paths)
  • scripts/lib/initramfs.sh:691: Pre-CPIO sanity checks with file listings

Common Issues and Solutions

"Initramfs directory not found"

  • Cause: INSTALL_DIR interpreted as relative in different CWD
  • Fix: Already patched - paths normalized at startup
  • Check: Look for "Validation debug:" logs showing resolved paths

"INITRAMFS_ARCHIVE unbound"

  • Cause: Incremental build skipped initramfs_create stage
  • Fix: Already patched - stages default INITRAMFS_ARCHIVE if unset
  • Check: scripts/build.sh:401 logs "defaulting INITRAMFS_ARCHIVE"

"Module dependency resolution fails"

  • Cause: Container /lib/modules/<FULL_VERSION> missing or stale
  • Fix: ./scripts/rebuild-after-zinit.sh --refresh-container-mods
  • Check: Ensure kernel_modules stage completed successfully

"Passwordless root not working"

  • Cause: Branding disabled or shadow file not updated
  • Fix: Check ZEROOS_BRANDING=true in logs, verify /etc/shadow has root::
  • Verify: Extract initramfs and grep '^root:' etc/shadow

Important Files Quick Reference

Must-Read Before Editing

  • scripts/build.sh - Orchestrator with precise stage order
  • scripts/lib/common.sh - Path normalization, logging, utilities
  • scripts/lib/stages.sh - Stage tracking logic
  • config/build.conf - Version pins, directory settings, flags

Safe to Edit

  • config/zinit/*.yaml - Service definitions
  • config/zinit/init/*.sh - Runtime initialization scripts
  • config/modules.conf - Module lists (stage1/stage2)
  • config/firmware.conf - Initramfs firmware selection
  • config/packages.list - Alpine packages

Generated (Never Edit)

  • initramfs/ - Assembled initramfs tree
  • components/ - Downloaded component sources
  • kernel/ - Kernel source tree
  • dist/ - Build outputs
  • .build-stages/ - Completion markers

Testing Architecture

No built-in tests during build - Tests run separately via runit.sh

Why?

  • Build is for assembly, not validation
  • Tests require hypervisor (QEMU/cloud-hypervisor)
  • Separation allows faster iteration

runit.sh features:

  • Multi-disk support (qcow2 for QEMU, raw for cloud-hypervisor)
  • Network bridge/TAP configuration
  • Persistent volumes (reset with --reset)
  • Serial console logging

Quick Command Reference

# Build
./scripts/build.sh                      # Incremental build
./scripts/build.sh --clean             # Clean build
./scripts/build.sh --show-stages       # Show completion status
./scripts/build.sh --rebuild-from=zinit_setup  # Rebuild from stage
DEBUG=1 ./scripts/build.sh             # Verbose output

# Rebuild helpers
./scripts/rebuild-after-zinit.sh       # After zinit/init/modules changes
./scripts/rebuild-after-zinit.sh --with-kernel  # Also rebuild kernel
./scripts/rebuild-after-zinit.sh --verify-only  # Dry-run

# Testing
./runit.sh                             # QEMU test
./runit.sh --hypervisor ch             # cloud-hypervisor test
./runit.sh --help                      # All options

# Dev container
./scripts/dev-container.sh start       # Start persistent container
./scripts/dev-container.sh shell       # Enter shell
./scripts/dev-container.sh build       # Build inside container
./scripts/dev-container.sh stop        # Stop container

# Cleanup
./scripts/clean.sh                     # Remove all generated files
rm -rf .build-stages/                  # Reset stage markers

Environment Variables

Build control:

  • DEBUG=1 - Enable verbose logging
  • FORCE_REBUILD=true - Force rebuild all stages
  • REBUILD_FROM_STAGE=<name> - Rebuild from specific stage

Version overrides:

  • ALPINE_VERSION=3.22 - Alpine Linux version
  • KERNEL_VERSION=6.12.44 - Linux kernel version
  • RUST_TARGET=x86_64-unknown-linux-musl - Rust compilation target

Firmware tagging:

  • FIRMWARE_TAG=20250908 - Firmware flist version tag

S3 upload control:

  • UPLOAD_KERNEL=true - Upload versioned kernel to S3 (default: false)
  • UPLOAD_MANIFESTS=true - Upload RFS flists to S3 (default: false)
  • KERNEL_SUBPATH=kernel - S3 subpath for kernel uploads (default: kernel)

S3 configuration:

  • See config/rfs.conf for S3 endpoint, credentials, paths
  • Used by both RFS flist uploads and kernel uploads
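
These variables compose on a single invocation, for example:

# Verbose rebuild from the kernel stage with S3 upload enabled
DEBUG=1 UPLOAD_KERNEL=true REBUILD_FROM_STAGE=kernel_build ./scripts/build.sh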

Documentation Hierarchy

Start here:

  1. README.md - User-facing guide with features and setup
  2. This file (claude.md) - AI assistant context

For development:

  3. docs/NOTES.md - Operational knowledge, troubleshooting
  4. docs/AGENTS.md - Quick agent reference
  5. docs/TODO.md - Current work checklist with code links

For deep dives:

  6. docs/PROMPT.md - Strict debugger agent mode (diagnostics-first)
  7. docs/rfs-flists.md - RFS design and implementation
  8. docs/review-rfs-integration.md - Integration points analysis
  9. docs/depmod-behavior.md - Module dependency deep dive

Historical:

  10. IMPLEMENTATION_PLAN.md - Original design document
  11. GITHUB_ACTIONS.md - CI/CD setup guide

Project Philosophy

  1. Reproducibility: Container-based builds ensure identical results
  2. Incrementality: Stage markers minimize rebuild time
  3. Diagnostics-first: Log before fixing, validate assumptions
  4. Minimal intervention: Alpine + zinit only, no systemd/OpenRC
  5. Size-optimized: Aggressive cleanup, strip, UPX compression
  6. Remote-ready: RFS enables lazy-loading for extended hardware support

Commit Message Guidelines

DO NOT add Claude Code or AI assistant references to commit messages.

Keep commits clean and professional:

  • Focus on what changed and why
  • Use conventional commit prefixes: build:, docs:, fix:, feat:, refactor:
  • Be concise but descriptive
  • No emoji unless project convention
  • No "Generated with Claude Code" or "Co-Authored-By: Claude" footers

Good example:

build: remove testing.sh in favor of runit.sh

Replace inline boot testing with standalone runit.sh runner.
Tests now run separately from build pipeline for faster iteration.

Bad example:

build: remove testing.sh 🤖

Made some changes to testing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Troubleshooting Quick Reference

  • Build fails: Check DEBUG=1 logs, stage completion markers, container state
  • Module issues: kernel_modules stage, CONTAINER_MODULES_PATH, depmod logs
  • Firmware missing: config/firmware.conf for initramfs, RFS flist for runtime
  • zinit problems: Service YAML syntax, dependency order, init script errors
  • Path errors: Absolute path normalization in common.sh:244
  • Size too large: Check cleanup stage, strip/UPX execution, package list
  • Container issues: Rootless setup, subuid/subgid, podman vs docker
  • RFS mount fails: S3 credentials, network readiness, flist manifest paths
  • Kernel upload: UPLOAD_KERNEL=true, requires config/rfs.conf, MinIO client (mcli/mc)
  • Kernel index: Auto-generated kernels.txt/kernels.json for dropdown UIs, updated on upload

Last updated: 2025-01-04

Maintainer notes: This file is the entry point for AI assistants. Keep it updated when architecture changes. Cross-reference with docs/NOTES.md for operational details.