Skip to content

Wire server discovery protocol into thv serve#4319

Open
JAORMX wants to merge 1 commit intomainfrom
wire-server-discovery-protocol
Open

Wire server discovery protocol into thv serve#4319
JAORMX wants to merge 1 commit intomainfrom
wire-server-discovery-protocol

Conversation

@JAORMX
Copy link
Collaborator

@JAORMX JAORMX commented Mar 23, 2026

Summary

RFC-0034 proposes a long-running local server architecture where thv serve advertises itself via a discovery file so clients (CLI, Studio) can find it without hardcoded ports. The discovery package (pkg/server/discovery/) was already fully implemented and tested but had zero imports anywhere in the codebase. This PR wires it in, giving us a working discovery protocol that immediately benefits Studio and the skills CLI client.

  • The server now writes a discovery file ($XDG_STATE_HOME/toolhive/server/server.json) on startup with the actual listen URL, PID, and a cryptographic nonce, and removes it on shutdown
  • The /health endpoint returns the nonce via X-Toolhive-Nonce so clients can verify server instance identity (prevents PID-reuse confusion)
  • The skills client now auto-discovers the server via the discovery file before falling back to TOOLHIVE_API_URL or localhost:8080, with loopback and socket-path validation on discovered URLs
  • Fixes: SIGTERM handling (was only SIGINT), 30-second shutdown timeout (was unbounded), symlink rejection on the discovery file read path, directory permission tightening, constant-time nonce comparison
  • A startup guard refuses to start if another healthy server is already running

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Manual testing: run thv serve, verify server.json appears with correct URL/PID/nonce, curl -v /health shows X-Toolhive-Nonce header, stop server and verify file is removed, start a second server while first is running and verify it refuses.

Changes

File Change
pkg/server/discovery/*.go Discovery package (was implemented but unwired) with security hardening: symlink check on read, dir permission tightening, constant-time nonce comparison, exported URL validators
pkg/api/server.go Nonce generation, WithNonce() builder, ListenURL(), discovery write/remove in lifecycle, startup guard, shutdown timeout
pkg/api/v1/healthcheck.go Nonce parameter, X-Toolhive-Nonce header on 204 responses
cmd/thv/app/server.go Add SIGTERM to signal handling
pkg/skills/client/client.go Discovery-first URL resolution with loopback/socket-path validation
pkg/api/server_test.go Tests for generateNonce, ListenURL (TCP + Unix)
pkg/api/v1/healtcheck_test.go Tests for nonce header present/absent/unhealthy
pkg/server/discovery/*_test.go Tests for symlink read rejection, dir permission tightening, paralleltest fixes

Does this introduce a user-facing change?

thv serve now writes a discovery file that allows other tools (Studio, skills CLI) to automatically find the running server without requiring TOOLHIVE_API_URL or knowing the port. This is transparent to users -- existing behavior is preserved as a fallback.

Special notes for reviewers

  • The pkg/server/discovery/ package was previously implemented and tested in a separate effort but never wired in. This PR includes it as new files since it had no prior commits.
  • The startup guard prevents silent orphaning: if a healthy server is already running, the new server refuses to start. Stale discovery files (from crashes/SIGKILL) are automatically cleaned up.
  • The nonce is explicitly not a secret -- it is a server instance identifier returned in a plaintext HTTP header and stored in a 0600 file. It prevents PID-reuse confusion, not unauthorized access.
  • Security review found no high/critical issues. TOCTOU gaps on symlink checks are mitigated by the 0700 directory permissions (single-user trust model).
  • Future work: thv server start/stop/status subcommands, CLI auto-start, lock file singleton, SSE events, Studio integration.

Generated with Claude Code

@github-actions github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Mar 23, 2026
The discovery package (pkg/server/discovery/) was already implemented
and tested but had zero imports in the codebase. This wires it into
the serve command so clients (CLI, Studio) can auto-discover a running
server without hardcoded ports or environment variables.

On startup, thv serve now generates a cryptographic nonce, writes a
discovery file to $XDG_STATE_HOME/toolhive/server/server.json with
the actual listen URL (supporting port 0 and Unix sockets), and
returns the nonce via the X-Toolhive-Nonce health check header. On
shutdown the file is removed.

The skills client now tries discovery before falling back to the
TOOLHIVE_API_URL env var or the default localhost:8080, with loopback
and socket-path validation on discovered URLs.

Additional fixes: SIGTERM handling in the serve command, a 30-second
shutdown timeout (was unbounded), symlink rejection on the discovery
file read path, directory permission tightening after MkdirAll, and
constant-time nonce comparison.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@JAORMX JAORMX force-pushed the wire-server-discovery-protocol branch from 36bc774 to 7fdf8fa Compare March 23, 2026 14:13
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Mar 23, 2026
@codecov
Copy link

codecov bot commented Mar 23, 2026

Codecov Report

❌ Patch coverage is 46.08696% with 124 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.15%. Comparing base (e0b0ef9) to head (7fdf8fa).

Files with missing lines Patch % Lines
pkg/api/server.go 13.46% 44 Missing and 1 partial ⚠️
pkg/skills/client/client.go 17.94% 30 Missing and 2 partials ⚠️
pkg/server/discovery/discovery.go 52.38% 14 Missing and 6 partials ⚠️
pkg/server/discovery/discover.go 43.33% 16 Missing and 1 partial ⚠️
pkg/server/discovery/health.go 87.30% 4 Missing and 4 partials ⚠️
pkg/api/v1/healthcheck.go 50.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4319      +/-   ##
==========================================
- Coverage   69.19%   69.15%   -0.04%     
==========================================
  Files         478      481       +3     
  Lines       48157    48379     +222     
==========================================
+ Hits        33320    33455     +135     
- Misses      12254    12324      +70     
- Partials     2583     2600      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Extra large PR: 1000+ lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant