Troubleshooting
Fast incident triage order
Run checks in this order to narrow faults quickly:
- cellmgr cell list --view merged
- cellmgr apply --dry-run --all
- cellctl list -T and cellctl stats -T
- host syslog for supervise output and restart patterns
- cellmgr cell shell <name> for in-cell diagnostics
- compare desired and rendered configuration files
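The first read-only checks in that order can be wrapped in a small helper so they always run in the same sequence. This is a minimal sketch: only the cellmgr and cellctl invocations come from this guide, while the function name triage_checks, the output markers, and the assumption that each command exits non-zero on failure are illustrative.

```sh
#!/bin/sh
# Triage sketch: run the read-only checks in the documented order.
# triage_checks and run_step are hypothetical helper names, not part of cellmgr.
# Assumes each command signals failure with a non-zero exit status.

triage_checks() (
    rc=0
    run_step() {
        printf '== %s\n' "$*"
        "$@" || { printf '!! failed: %s\n' "$*" >&2; rc=1; }
    }
    # Keep going after a failure so the full picture is visible in one pass.
    run_step cellmgr cell list --view merged
    run_step cellmgr apply --dry-run --all
    run_step cellctl list -T
    run_step cellctl stats -T
    exit "$rc"
)
```

Calling triage_checks prints each command before its output and returns non-zero if any step failed. The syslog review, in-cell shell, and config comparison stay manual steps.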
If you are new to Unix operations, do not jump straight into in-cell shell debugging. Work through the list, dry-run, and runtime checks first.
Compare desired vs runtime config
Inspect these paths when drift is suspected:
- desired: /etc/cellmgr/<name>.cell
- runtime: /var/cellmgr/cells/<name>/cell.conf
If only policy values differ, apply may report the drift as a warning without
restarting the cell unless --restart-changed is passed.
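A quick way to compare the two files is a unified diff. The sketch below uses the two paths named in this section as defaults. The function name cell_config_diff and the two override variables are hypothetical. It also assumes the rendered file is close enough in format to the manifest for a line diff to be meaningful, which may not hold if rendering reorders or rewrites entries.

```sh
# Defaults are the paths named in this guide; the override variables are
# illustrative and exist only to make the helper easy to try on copies.
CELL_DESIRED_DIR=${CELL_DESIRED_DIR:-/etc/cellmgr}
CELL_RUNTIME_DIR=${CELL_RUNTIME_DIR:-/var/cellmgr/cells}

# cell_config_diff <name>
# Exit status: 0 identical, 1 files differ, 2 a file could not be read.
cell_config_diff() (
    name=$1
    desired="$CELL_DESIRED_DIR/$name.cell"
    runtime="$CELL_RUNTIME_DIR/$name/cell.conf"
    for f in "$desired" "$runtime"; do
        [ -r "$f" ] || { printf 'cannot read %s\n' "$f" >&2; exit 2; }
    done
    # diff itself exits 0 when the files match and 1 when they differ.
    diff -u "$desired" "$runtime"
)
```

For example, cell_config_diff web prints nothing when the two files match and a unified diff otherwise.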
Common failure patterns
- dependency cycle or missing dependency in CELL_DEPENDS_ON
- invalid volume mount target or overlapping mount paths
- strict TSV schema mismatch in machine-output consumers
- blocked restore because a volume is mounted or cell is still running
- command permission issues because shell is not root (or doas not used)
- typo in resource name or wrong scope (desired vs runtime)
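For the TSV schema mismatch case, a consumer can check the shape of machine output before parsing it. The sketch below is schema-agnostic: it only verifies that every row has as many tab-separated fields as the first line. It assumes the -T output is tab-separated with a header (or otherwise representative) first line. The function name tsv_check_width is hypothetical and no column names are implied.

```sh
# tsv_check_width: read TSV on stdin and report rows whose field count
# differs from the first line. Exits 0 if all rows match, 1 otherwise.
tsv_check_width() {
    awk -F '\t' '
        NR == 1 { want = NF; next }
        NF != want {
            printf "line %d: expected %d fields, got %d\n", NR, want, NF
            bad = 1
        }
        END { exit bad }
    '
}
```

A possible use is cellctl list -T | tsv_check_width, run before handing the same output to a stricter parser.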
Recovery playbooks
- Config drift: fix manifests, then run cellmgr apply --all
- Apply plan issues: test with cellmgr cell plan run <name>
- Storage recovery: stop relevant workloads and perform guarded restore with --yes
- IPC/UI errors: verify cellmgr ipc serve --stdio and reconnect from CellUI
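The config drift playbook can be scripted so that the real apply only runs after a clean preview. This is a sketch under one assumption: cellmgr apply --dry-run --all exits non-zero when it finds a problem. The function name recover_drift is hypothetical. Both apply invocations are the ones shown in this guide.

```sh
# recover_drift [extra apply arguments]
# Run after the manifests have been fixed. Extra arguments, for example
# --restart-changed, are passed to the real apply only.
recover_drift() {
    # Preview first; stop here if the dry run reports a failure.
    cellmgr apply --dry-run --all || return 1
    cellmgr apply --all "$@"
}
```

Calling recover_drift --restart-changed would also restart cells whose drift is policy-only, in line with the note in the config comparison section.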