feat: add /health and /ready endpoints#37

Merged
franciscojavierarceo merged 3 commits into
vllm-project:mainfrom
ashwing:feat/health-endpoint
May 29, 2026

Contributor

@ashwing ashwing commented May 28, 2026

Summary

Add liveness (GET /health) and readiness (GET /ready) endpoints to the gateway, enabling proper orchestration under Kubernetes, ECS, and docker-compose.

  • /health returns 200 OK unconditionally when the process is listening (liveness probe)
  • /ready probes the LLM backend's /health with a 2-second timeout and returns 200 if healthy, 503 Service Unavailable otherwise (readiness probe)

This separates "is the gateway process alive?" from "is the system ready to serve traffic?" — standard for any load-balanced deployment.
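The split between the two probes can be sketched as a pair of pure functions that map a probe outcome to an HTTP status code. This is only an illustration, not the code from handler.rs: the `BackendProbe` type and both function names are hypothetical, and treating any 2xx backend response as "healthy" is an assumption the PR does not spell out.

```rust
/// Outcome of probing the LLM backend's /health endpoint.
/// Hypothetical type for this sketch; the PR's handler may model this differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendProbe {
    /// The backend answered with this HTTP status code.
    Responded(u16),
    /// The probe did not complete within the timeout (2 seconds in the PR).
    TimedOut,
    /// The connection could not be established (refused, DNS failure, ...).
    Unreachable,
}

/// Liveness: if this function runs at all, the process is up, so it always
/// reports 200 and never looks at the backend.
pub fn health_status() -> u16 {
    200
}

/// Readiness: 200 only when the backend answered with a 2xx status
/// (assumption: "healthy" means any 2xx); every other outcome maps to 503.
pub fn ready_status(probe: BackendProbe) -> u16 {
    match probe {
        BackendProbe::Responded(code) if (200..300).contains(&code) => 200,
        _ => 503,
    }
}
```

Keeping the mapping free of I/O makes the liveness/readiness distinction easy to unit test: a backend outage changes the result of `ready_status` but can never affect `health_status`.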

Closes #31

Related

Changes

| File | Change |
| --- | --- |
| crates/agentic-server/src/handler.rs | health() and ready() handler functions |
| crates/agentic-server/src/app.rs | Route registration for /health and /ready |
| crates/agentic-server/Cargo.toml | Add reqwest runtime dependency (for backend probe) |
| crates/agentic-server/tests/health_test.rs | 4 integration tests |

Test Plan

Automated (4 integration tests):

  • test_health_returns_200 — gateway up, LLM up → 200
  • test_health_returns_200_even_when_llm_down — gateway up, LLM unreachable → 200 (liveness unaffected)
  • test_ready_returns_200_when_llm_healthy — LLM responds on /health → 200
  • test_ready_returns_503_when_llm_unreachable — LLM unreachable → 503

All pass via cargo test --workspace.

Manual (live vLLM backend):
Built the gateway on an EC2 g6e.48xlarge instance and pointed it at a running vLLM server (port 8000):

$ curl -w '%{http_code}' http://localhost:9090/health
200
$ curl -w '%{http_code}' http://localhost:9090/ready
200

Lint/format:

  • cargo clippy --workspace --all-targets -- -D warnings — clean
  • cargo fmt -- --check — clean

Add GET /health (liveness) and GET /ready (readiness) endpoints to
the gateway. /health returns 200 unconditionally when the process is
listening. /ready probes the LLM backend's /health and returns 503 if
unreachable, enabling proper K8s/docker-compose orchestration.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Collaborator

@franciscojavierarceo franciscojavierarceo left a comment

thanks a ton for this @ashwing!! can you update your PR? The CI is failing.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Contributor Author

ashwing commented May 28, 2026

Thanks @franciscojavierarceo! Fixed — the rustfmt import ordering is corrected in 851fd94. CI should be green now.

let base = state.config.llm_api_base.trim_end_matches('/');
let url = format!("{base}/health");

let client = reqwest::Client::builder()
Collaborator

this reqwest is missing authorization headers. in readiness.rs we do insert the headers.

can you fix this asymmetry?

Contributor Author

Good catch — fixed in 70c3849. The readiness probe now injects the Bearer token from OPENAI_API_KEY the same way readiness.rs does at startup.
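As a rough illustration of the request preparation discussed in this thread, the sketch below builds the probe URL the same way the reviewed snippet does and derives an Authorization header value from an optional API key. It is not the code from 70c3849 or readiness.rs: `health_url` and `bearer_header` are hypothetical helper names, and treating an empty or whitespace-only key as "no key" is an assumption.

```rust
/// Build the backend probe URL, tolerating a trailing slash on the base,
/// mirroring the `trim_end_matches('/')` call in the reviewed snippet.
pub fn health_url(api_base: &str) -> String {
    let base = api_base.trim_end_matches('/');
    format!("{base}/health")
}

/// Value for the `Authorization` header, or `None` when no key is configured.
/// Assumption: an empty or whitespace-only key counts as "not configured".
pub fn bearer_header(api_key: Option<&str>) -> Option<String> {
    api_key
        .map(str::trim)
        .filter(|key| !key.is_empty())
        .map(|key| format!("Bearer {key}"))
}
```

In a handler, the key would typically come from configuration or from `std::env::var("OPENAI_API_KEY").ok()`, and the header would be attached to the request only when `bearer_header` returns `Some`.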

Match the auth injection pattern from readiness.rs — forward the
configured OPENAI_API_KEY as a Bearer token when probing the LLM
backend's /health endpoint.

Signed-off-by: Ashwin Giridharan <girida@amazon.com>
Collaborator

@franciscojavierarceo franciscojavierarceo left a comment

lgtm! thanks for the quick fixes

@franciscojavierarceo franciscojavierarceo merged commit 87e6c04 into vllm-project:main May 29, 2026
3 checks passed

Development

Successfully merging this pull request may close these issues.

feat: Add /health and /ready endpoints for production deployments

2 participants