feat(cloud): detect dead local daemon in cloud status and document launchd unit #337
Open
jlsevillano wants to merge 1 commit into Gentleman-Programming:main from …
…unchd unit

`engram cloud status` now probes the local engram serve daemon at 127.0.0.1:7437 (respects ENGRAM_PORT) with a 1s timeout and prints a `Local daemon:` line so users can detect a silently dead autosync after brew upgrade engram, log out, or any binary replacement. Exit code is unchanged (informational) and the probe is only run when cloud is configured.

DOCS.md "Running as a Service" gains a launchd (macOS) subsection with a KeepAlive plist template that survives brew upgrade by relaunching engram serve automatically. The Homebrew section in docs/INSTALLATION.md links to the new template so macOS users hit the supervisor guidance right after install.

Closes Gentleman-Programming#279
🔗 Linked Issue
Closes #279
🏷️ PR Type
- `type:bug`: Bug fix
- `type:feature`: New feature
- `type:docs`: Documentation only
- `type:refactor`: Code refactoring (no behavior change)
- `type:chore`: Maintenance, dependencies, tooling
- `type:breaking-change`: Breaking change

📝 Summary
- `engram cloud status` now probes the local `engram serve` daemon at `127.0.0.1:7437` (respects `ENGRAM_PORT`) with a 1s timeout and prints a `Local daemon: running | not running | unreachable` line so users can detect a silently dead autosync after `brew upgrade engram` or any other binary replacement.
- DOCS.md "Running as a Service" gains a launchd (macOS) subsection so macOS users can supervise `engram serve` the same way Linux users do with systemd. With `KeepAlive=true`, autosync now survives `brew upgrade` automatically.

📂 Changes
| File | Change |
|---|---|
| `cmd/engram/cloud_daemon_probe.go` | `cloudDaemonProbe` variable function (1s timeout `GET /health`), port resolution (`ENGRAM_PORT` → 7437), and `printCloudStatusDaemonProbe` writer with recovery hint when the daemon is down. |
| `cmd/engram/cloud_daemon_probe_test.go` | |
| `cmd/engram/cloud.go` | `cmdCloudStatus` calls `printCloudStatusDaemonProbe` in each cloud-configured branch (token, token+insecure, no-token) before the existing sync diagnostic. No behavior change in the "not configured" branch. |
| `cmd/engram/main_extra_test.go` | `stubRuntimeHooks` now stubs `cloudDaemonProbe` so existing tests stay deterministic; new `TestCmdCloudStatusEmitsLocalDaemonLine` verifies the line is printed when configured (and suppressed when not). |
| `DOCS.md` | `Using systemd` → `Using systemd (Linux)`. Adds `Using launchd (macOS)` with full plist template (KeepAlive=true so brew upgrade does not break autosync), load/unload steps, and verification via `engram cloud status`. Updates the `engram cloud status` reference bullet to describe the new `Local daemon:` line. |
| `docs/INSTALLATION.md` | … `brew upgrade`. |

🧪 Test Plan
- `go test ./...` (passes with the standard CI environment; my local env had `ENGRAM_CLOUD_SERVER` set, which leaks into pre-existing tests. Repro by `unset ENGRAM_CLOUD_SERVER ENGRAM_CLOUD_TOKEN ENGRAM_CLOUD_INSECURE_NO_AUTH ENGRAM_CLOUD_AUTOSYNC ENGRAM_PORT` before running, same isolation as CI).
- `go test -tags e2e ./internal/server/...`
- … `cloud config --server ...`:
  - `Local daemon: not running on port 7777` + recovery hint mentioning `engram serve` and the launchd template
  - `Local daemon: running on port 7777`
  - `ENGRAM_PORT=9000` while daemon stays on 7777 → probe targets 9000 and reports `not running on port 9000` (env override honored)

🤖 Automated Checks
These run automatically and all must pass before merge:
- `Closes #N` / `Fixes #N` / `Resolves #N`
- `status:approved` label
- `type:*` label
- `go test ./...` passes
- `go test -tags e2e ./internal/server/...` passes

✅ Contributor Checklist
- … (`Closes #279`)
- … `type:*` label to this PR (`type:feature`)
- … `go test ./...`
- … `go test -tags e2e ./internal/server/...`
- … `Co-Authored-By` trailers in commits

💬 Notes for Reviewers
- … `homebrew-tap`, so it is intentionally out of scope for this PR. The two in-repo mitigations (status probe + launchd template) close the gap on this side.
- … `not_running` (TCP dial error to 127.0.0.1) from `unreachable` (timeout / non-2xx / unexpected error) so the recovery hint only fires when restarting `engram serve` is the right action.
- … `var daemonProbeTimeout` so tests can shorten it; default in production stays at 1s.
- … `<HOME>` placeholders because launchd does not expand `$HOME`/`~` inside plist values; the docs explicitly call this out.