feat: Nous Portal OAuth refresh + agent_key minting#60

Merged: jramos merged 5 commits into main from feat/nous-oauth-refresh-and-agent-key on May 15, 2026

jramos commented May 15, 2026

Summary

  • A user with provider: nous in ~/.hermes/config.yaml couldn't actually use the framework before this PR. The resolver read the OAuth access_token from the credential pool and handed it to LiteLLM as the inference Bearer, but Nous's inference endpoint requires the short-lived agent_key (a separate credential minted from the access_token). Inference would 401 silently.
  • This PR routes provider: nous through a new NousLM subclass (evolution/core/nous_lm.py) that handles the two-stage credential model: refresh OAuth in-memory when the access_token is expiring, mint a fresh agent_key from it, re-mint on inference 401. Mirrors Hermes's own resolve_nous_runtime_credentials flow at hermes_cli/auth.py:3061-3193.
  • Cross-instance state sharing via _SharedNousState keyed by initial refresh_token. The four LM roles share the lock + state, so a four-thread evolution doesn't trigger four parallel mints (which would race the portal's single-use refresh-token rotation).
  • Pool entries WITHOUT refresh_token (env-var-style NOUS_API_KEY setups) fall through to the existing OpenAI-wire direct-pass-through path unchanged.
  • Bundled cleanup: output/ to .gitignore so per-run log dirs stop appearing as untracked in every git status.

Two-stage credential model

| Stage | What it is | How long it lasts | Endpoint |
| --- | --- | --- | --- |
| OAuth access_token | Long-lived, refreshable via standard refresh_token grant | Days | POST {portal}/api/oauth/token |
| agent_key | Short-lived inference Bearer minted from access_token | ~30 minutes | POST {portal}/api/oauth/agent-key |

NousLM.forward orchestrates both: ensure-credentials at start (refresh OAuth if expiring → mint if agent_key expiring), call inference, on 401 force re-mint and retry once.
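
A minimal sketch of that control flow, for illustration only. The callables `ensure_credentials` and `call_inference` and both exception classes are hypothetical stand-ins (the real NousLM.forward wraps LiteLLM and raises HermesProviderError); only the ordering and the "retry once" rule come from this PR.

```python
from typing import Callable


class InferenceAuthError(Exception):
    """Hypothetical stand-in for an HTTP 401 from the inference endpoint."""


class ProviderErrorSketch(Exception):
    """Hypothetical stand-in for HermesProviderError."""


def forward_with_remint(
    ensure_credentials: Callable[[bool], str],
    call_inference: Callable[[str], str],
) -> str:
    """Call inference; on a 401, force one re-mint and retry exactly once."""
    # Normal path: refresh/mint only if something is close to expiry.
    key = ensure_credentials(False)
    try:
        return call_inference(key)
    except InferenceAuthError:
        # Mid-run 401: force a fresh agent_key, then retry a single time.
        key = ensure_credentials(True)
        try:
            return call_inference(key)
        except InferenceAuthError as exc:
            # Second 401 in a row: surface the recovery action to the user.
            raise ProviderErrorSketch(
                "OAuth grant may have been revoked; run hermes model"
            ) from exc
```

Capping the retry at one attempt keeps a revoked grant from turning into a mint loop; the second 401 is reported with a recovery hint instead.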

Manual validation

tests/manual/nous_smoke.py is a runnable wire-level smoke against a stdlib http.server mocking portal.nousresearch.com. Validates five scenarios:

  1. Initial mint: agent-key POST carries seed access_token as Bearer
  2. OAuth-expiring → refresh-then-mint sequencing; mint Bearer flips to REFRESHED-OAUTH
  3. Inference Bearer = minted agent_key, NOT OAuth access_token (the headline bug fix)
  4. Mid-run inference 401 → force re-mint → retry with new agent_key
  5. OAuth invalid_grant → HermesProviderError with hermes model recovery hint

All five pass against the local mock. Not part of CI (heavyweight; spins up a server). Run via uv run python tests/manual/nous_smoke.py. Documented in the Nous Portal setup section as the recommended validation path when a real Portal account isn't available — which is the case here, so this is the closest we can get to the real-credentials end-to-end smoke we did for Codex via the live spike.
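
For readers who have not opened the smoke script, the mock-portal idea behind scenario 1 can be sketched with the standard library alone. This is not the contents of tests/manual/nous_smoke.py: the handler class, the helper `mint_against_mock`, and the response body shape (`agent_key`, `expires_in`) are assumptions made for illustration, and the client here is plain urllib rather than a real NousLM.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_bearers = []  # Authorization headers recorded by the mock


class MockPortalHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Drain the request body so the connection stays well-formed.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        if self.path != "/api/oauth/agent-key":
            self.send_error(404)
            return
        seen_bearers.append(self.headers.get("Authorization"))
        # Assumed response shape; the real portal's payload may differ.
        body = json.dumps({"agent_key": "MINTED-1", "expires_in": 1800}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the smoke output quiet


def mint_against_mock(access_token: str) -> dict:
    """Start the mock on an ephemeral port, POST one mint request, stop it."""
    server = HTTPServer(("127.0.0.1", 0), MockPortalHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        request = urllib.request.Request(
            f"http://127.0.0.1:{server.server_port}/api/oauth/agent-key",
            data=json.dumps({"min_ttl_seconds": 1800}).encode(),
            headers={
                "Authorization": f"Bearer {access_token}",
                "Content-Type": "application/json",
            },
            method="POST",
        )
        # Bypass any proxy settings so the call stays on loopback.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            return json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()
```

Recording the Authorization header on the server side is what lets a scenario assert which credential was actually sent, rather than which one the client believed it sent.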

Out of scope

  • Generic OAuthLM base class extracted from CodexLM + NousLM: with only two subclasses (and significantly different shapes — Codex needs Cloudflare headers, Nous needs agent-key minting) the abstraction would be premature. Defer until a third OAuth provider lands and the right cut becomes obvious.
  • Qwen / Spotify / Google Gemini OAuth: same pattern shape, no current demand from the evolution use case.
  • auth.json writeback after refresh/mint: in-memory only, mirroring Codex. Long evolutions still need periodic hermes model to refresh the on-disk store.
  • Auxiliary provider routing (auxiliary.vision, auxiliary.web_extract, auxiliary.session_search): the only Future-work item left after this PR.

Test plan

  • 35 new tests total: 25 in tests/core/test_nous_lm.py (construction, mint timing, refresh+mint sequencing, 401-triggers-refresh-retry, mid-run inference 401 recovery, concurrent mint race, async path, error classification, ISO/epoch parsing); 8 in tests/core/test_nous_provider.py (OAuth-flow detection, factory wiring, fallback paths, pool exhaustion respect); 2 oauth_helpers cases.
  • Full suite: 930 tests pass (897 baseline + 33 new), no regressions.
  • Manual smoke against mock Nous portal passes all 5 scenarios.
  • Codex tests + cost-advisor tests confirmed unchanged (regression guard).
  • CI green on Python 3.10/3.11/3.12/3.13.

Spike notes

Two recon rounds shaped this PR:

  1. Initial recon found the OAuth grant URL, refresh shape, and constants — confirming Nous follows standard OAuth 2.1 with single-use refresh-token rotation.
  2. Follow-up recon surfaced the agent_key layer that initial recon missed. Nous's inference endpoint requires an agent_key minted from the OAuth access_token via a Nous-specific endpoint. Without handling this, the framework was silently broken for Nous (an issue our existing user base hadn't hit because they're not on Nous). The PR's scope expanded from "refresh OAuth" to "OAuth refresh + agent_key minting + read the right field" once this surfaced.

jramos added 5 commits May 15, 2026 08:51

The output/ directory holds per-run logs, evolved artifacts, and gate
decisions — all generated, none committable. Previous gitignore pattern
output/**/*.md only excluded markdown, leaving .log and .json files
showing as untracked in every git status, which has been quiet noise
across the recent provider work.

A user with provider: nous in ~/.hermes/config.yaml currently can't
actually use the framework — the resolver reads the OAuth access_token
from the credential pool and hands that to LiteLLM as the inference
Bearer, but Nous's inference endpoint requires the short-lived
agent_key (a separate credential minted via POST /api/oauth/agent-key).
This commit provides the LM subclass that handles the two-stage
credential model.

Mirrors hermes_cli/auth.py:3061-3193 (resolve_nous_runtime_credentials):

  * Refresh OAuth access_token in-memory when within 120s of expiry
    via POST {portal}/api/oauth/token (standard refresh_token grant
    with client_id="hermes-cli")
  * Mint a fresh agent_key when missing or within 120s of expiry via
    POST {portal}/api/oauth/agent-key (Bearer access_token, ask for
    1800s min TTL)
  * Refresh-first-then-mint sequencing so a stale access_token doesn't
    cause mint failures
  * Mint 401 → refresh OAuth once and retry mint (Hermes pattern)
  * Inference 401 → force re-mint and retry once (mid-run recovery)
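
The refresh-first-then-mint ordering and the 120s skew window can be sketched as follows. This is an illustrative reduction, not the code in nous_lm.py: `CredsSketch`, `is_expiring`, and the `refresh`/`mint` callables are hypothetical names, treating a missing expiry as "expiring" is an assumption, and the "mint 401 → refresh once" branch is left out for brevity.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

SKEW_SECONDS = 120  # from the commit message: act within 120s of expiry


def is_expiring(expires_at: Optional[float], now: float) -> bool:
    """True inside the skew window (assumption: unknown expiry counts too)."""
    return expires_at is None or now >= expires_at - SKEW_SECONDS


@dataclass
class CredsSketch:
    access_token: str
    access_expires_at: Optional[float]
    agent_key: Optional[str] = None
    agent_key_expires_at: Optional[float] = None


def ensure_credentials(
    creds: CredsSketch,
    refresh: Callable[[], Tuple[str, float]],
    mint: Callable[[str], Tuple[str, float]],
    now: Optional[float] = None,
    force_mint: bool = False,
) -> str:
    """Refresh OAuth first if needed, then mint, so mint never sees a stale token."""
    now = time.time() if now is None else now
    if is_expiring(creds.access_expires_at, now):
        creds.access_token, creds.access_expires_at = refresh()
    needs_mint = (
        force_mint
        or creds.agent_key is None
        or is_expiring(creds.agent_key_expires_at, now)
    )
    if needs_mint:
        # Mint always uses the (possibly just refreshed) access_token.
        creds.agent_key, creds.agent_key_expires_at = mint(creds.access_token)
    return creds.agent_key
```

Doing the two checks in this order means a mint request never carries an access_token that is already inside its own skew window.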

Cross-instance state sharing via _SharedNousState keyed by initial
refresh_token. The four LM roles (optimizer, reflection, eval, judge)
share the lock + state, so a four-thread evolution doesn't trigger
four parallel mints (which would race the portal's single-use
refresh-token rotation and produce refresh_token_reused errors on
three of them).
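
A rough sketch of the sharing idea, assuming a module-level registry keyed by the initial refresh_token. The names `SharedStateSketch`, `shared_state_for`, and `get_agent_key` are hypothetical, and the "mint" here is a placeholder string rather than a portal call; it only shows why four concurrent callers end up with one mint.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SharedStateSketch:
    lock: threading.Lock = field(default_factory=threading.Lock)
    agent_key: Optional[str] = None
    mint_count: int = 0


_REGISTRY: Dict[str, SharedStateSketch] = {}
_REGISTRY_LOCK = threading.Lock()


def shared_state_for(initial_refresh_token: str) -> SharedStateSketch:
    """Return the one state object shared by every LM built from this token."""
    with _REGISTRY_LOCK:
        state = _REGISTRY.get(initial_refresh_token)
        if state is None:
            state = SharedStateSketch()
            _REGISTRY[initial_refresh_token] = state
        return state


def get_agent_key(initial_refresh_token: str) -> str:
    state = shared_state_for(initial_refresh_token)
    with state.lock:
        # Only the first thread through the lock mints; the rest reuse the key.
        if state.agent_key is None:
            state.mint_count += 1
            state.agent_key = f"minted-{state.mint_count}"  # placeholder mint
        return state.agent_key
```

Keying on the initial refresh_token (rather than the current one) keeps the lookup stable even after rotation replaces the token held inside the shared state.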

In-memory only — no auth.json writeback. Long evolutions (>30 min on
a fresh agent_key) refresh + re-mint in-process; the on-disk store
stays at whatever `hermes model` last wrote. Avoids write-conflict
surface with concurrent Hermes sessions that may also be refreshing.

Error classification mirrors Hermes's own (auth.py:2595-2624):
invalid_grant/invalid_token + HTTP 401/403 from OAuth endpoint surface
HermesProviderError with `hermes model` recovery hint;
refresh_token_reused gets the special "another client consumed it"
message; mint failures translate similarly.
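
One way to read that classification, sketched as a pure function. The function name and the exact message wording are hypothetical, and treating the error code and the HTTP status as independent triggers is an interpretation of the sentence above, not something this commit message spells out.

```python
from typing import Optional

RELOGIN_HINT = "run `hermes model` to re-authenticate"


def classify_oauth_failure(status: int, error_code: Optional[str]) -> Optional[str]:
    """Return a user-facing recovery message, or None if not an auth failure."""
    if error_code == "refresh_token_reused":
        # Single-use rotation: someone else already spent this refresh_token.
        return f"another client consumed the refresh token; {RELOGIN_HINT}"
    if error_code in {"invalid_grant", "invalid_token"} or status in (401, 403):
        return f"OAuth credentials rejected; {RELOGIN_HINT}"
    return None  # e.g. 5xx or network trouble: not a re-login situation
```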

oauth_helpers.parse_iso_or_epoch handles both Nous's ISO 8601
expires_at and Codex's Unix epoch float — kept in a small standalone
module so the next OAuth provider has somewhere obvious to extend
without bloating either provider's LM file.

Adds a new branch in resolve_default_lm: when canonical == "nous" AND
the auth.json pool entry has a refresh_token (signals OAuth-managed
flow that hermes model writes), build a ResolvedLM whose lm_factory
constructs a NousLM that handles the two-stage refresh + agent_key
mint internally.

Pool entries WITHOUT refresh_token (env-var-style NOUS_API_KEY users)
fall through to the existing OpenAI-wire direct-pass-through path
unchanged. Note that the direct-pass-through path probably also doesn't
work for Nous (the pool's access_token field holds the OAuth token,
not the agent_key the inference endpoint needs) — but that's a
pre-existing condition orthogonal to this PR. We don't try to "upgrade"
those users silently.

Missing pool entry → HermesProviderError pointing at `hermes model`
recovery rather than silent fall-through to the broken direct path.

instantiate_lm + the existing _probe_via_factory in auth_check (added
in PR #58) both already dispatch on lm_factory presence — Nous flows
through them unchanged. The Nous recovery hint is already in
_HERMES_AUTH_COMMAND_BY_PROVIDER from the original auth-check work.

docs/model_resolution.md:
  * New "Nous Portal OAuth + agent_key" section paralleling the Codex
    section. Documents the two-stage credential model, the in-memory
    refresh + mint flow, the HERMES_PORTAL_BASE_URL override knob, the
    in-memory-only posture, and the env-var-fallthrough behavior for
    pool entries without a refresh_token.
  * Future-work list trimmed: removes the now-shipped non-Codex OAuth
    bullet; rewords what remains so it's clear which providers are
    intentionally out of scope (Qwen / Spotify / Gemini).

tests/manual/nous_smoke.py (NEW):
  Runnable mock-server smoke that validates the Nous wire flow without
  needing a real Nous Portal account. Spins up a stdlib http.server
  pretending to be portal.nousresearch.com, drives a real NousLM (and
  through it, a real LiteLLM call) against it, asserts on five
  scenarios:

    1. Initial mint: agent-key POST carries the seed access_token as
       Bearer; min_ttl_seconds=1800 in body.
    2. OAuth-expiring → refresh-then-mint: confirms call ordering and
       that the mint POST uses the REFRESHED access_token (proves the
       sequencing isn't backwards).
    3. Inference uses the minted agent_key: the inference POST's
       Authorization header is the MINTED key, not the OAuth token —
       this is the headline bug the whole PR fixes.
    4. Mid-run inference 401: forward's exception handler force-re-mints
       and retries; the recorded HTTP exchange shows mint→infer(401)→
       re-mint→infer(200) with two distinct minted keys.
    5. OAuth refresh invalid_grant → HermesProviderError with the
       `hermes model` recovery hint.

  Smoke uses cache=False + num_retries=0 to expose the underlying
  network behavior — DSPy cache would otherwise leak prior responses
  across scenarios and LiteLLM's internal retry-on-401 would mask our
  own re-mint logic. Comments explain both choices.

  Not part of CI (heavyweight; spins up a server). Documented in the
  Nous setup section as the recommended way to gain confidence in the
  Nous flow when a real Portal account isn't available.

Review pass found a handful of silent-failure paths and one factually
wrong docstring; all addressed here.

oauth_helpers.parse_iso_or_epoch:
  Reject bool, inf, nan, negative, and naive-datetime inputs that would
  silently produce wrong epoch values:
    * inf/nan: every skew check evaluates as "now >= inf" → False, so
      the token would be treated as eternally fresh and never refreshed
    * naive ISO: datetime.timestamp() interprets in the host's local TZ,
      silently corrupting the skew window by hours on non-UTC hosts
    * bool: subclass of int, would coerce True → 1.0 epoch seconds
  Module docstring also rewritten to describe actual current consumer
  (NousLM only — the previous "shared by Codex" claim was aspirational;
  Codex parses expires_at inline as a raw float and doesn't import this).
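
A sketch of what those rejections could look like. It is written from the description above, not copied from oauth_helpers; the function name `parse_expiry_sketch`, the choice of ValueError for every rejection, and the handling of a trailing `Z` are assumptions.

```python
import math
from datetime import datetime
from typing import Union


def parse_expiry_sketch(value: Union[str, int, float]) -> float:
    """Convert an ISO 8601 string or epoch number to epoch seconds, strictly."""
    # bool is a subclass of int, so it must be rejected before the numeric branch.
    if isinstance(value, bool):
        raise ValueError("bool is not a valid expiry")
    if isinstance(value, (int, float)):
        epoch = float(value)
        if not math.isfinite(epoch) or epoch < 0:
            raise ValueError(f"unusable epoch value: {value!r}")
        return epoch
    if isinstance(value, str):
        text = value.strip()
        # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on.
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        parsed = datetime.fromisoformat(text)
        if parsed.tzinfo is None:
            # A naive datetime would be read in the host's local timezone.
            raise ValueError("naive ISO timestamp; an explicit offset is required")
        return parsed.timestamp()
    raise ValueError(f"unsupported expiry type: {type(value).__name__}")
```

Rejecting rather than coercing keeps a malformed expiry from quietly turning into "never expires" or "expired in 1970".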

nous_lm._refresh_oauth and _absorb_mint_response:
  expires_in fields now reject bool explicitly — without the guard,
  expires_in: True is accepted as 1 second (isinstance(True, int) is
  True in Python), triggering perpetual re-mint storms. Also: when both
  expires_at AND expires_in are absent or unusable, the code now logs a
  warning before falling through to the conservative TTL floor, so a
  portal protocol change that drops both fields is at least visible in
  the run log instead of silently caching a key for 30 minutes.
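
The expires_in guard and the logged fallback might look roughly like this. `resolve_ttl_sketch` and the logger name are hypothetical, the 1800s floor is inferred from the "30 minutes" mentioned above, and expires_at handling is omitted to keep the example short.

```python
import logging
import math
from typing import Any, Mapping

logger = logging.getLogger("nous_lm_sketch")
FALLBACK_TTL_SECONDS = 1800.0  # assumed floor, inferred from "30 minutes"


def resolve_ttl_sketch(payload: Mapping[str, Any]) -> float:
    """Return a TTL in seconds, warning when the payload gives nothing usable."""
    raw = payload.get("expires_in")
    usable = (
        isinstance(raw, (int, float))
        and not isinstance(raw, bool)  # True would otherwise pass as 1 second
        and math.isfinite(raw)
        and raw > 0
    )
    if usable:
        return float(raw)
    # Make a protocol change visible in the run log instead of failing silently.
    logger.warning(
        "no usable expires_in in response; falling back to %ss", FALLBACK_TTL_SECONDS
    )
    return FALLBACK_TTL_SECONDS
```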

nous_lm.NousLM.__init__:
  HERMES_PORTAL_BASE_URL and NOUS_INFERENCE_BASE_URL env vars are now
  read at instance time, not module-import time. Previously the docs
  advertised both as "overridable" but if anything imported nous_lm
  before the operator set the var, the override was silently ignored.
  The manual smoke harness already worked around this; this fix makes
  the documented behavior actually true.
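
The difference between import-time and instance-time reads is easy to show in miniature. The class name and the `https://` default below are assumptions (the source only names the host portal.nousresearch.com); HERMES_PORTAL_BASE_URL is the variable described above.

```python
import os
from typing import Optional

# Assumed default; the source names the host but not the scheme.
ASSUMED_DEFAULT_PORTAL = "https://portal.nousresearch.com"


class PortalClientSketch:
    def __init__(self, portal_base_url: Optional[str] = None) -> None:
        # Read the env var here, at construction, so an override set after
        # this module was imported is still honored.
        chosen = (
            portal_base_url
            or os.environ.get("HERMES_PORTAL_BASE_URL")
            or ASSUMED_DEFAULT_PORTAL
        )
        self.portal_base_url = chosen.rstrip("/")

    @property
    def mint_url(self) -> str:
        return f"{self.portal_base_url}/api/oauth/agent-key"
```

Had the lookup lived in a module-level constant, any import that ran before the operator exported the variable would have frozen the default in place.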

nous_lm.NousLM.forward / aforward:
  When the post-401 retry ALSO returns 401, wrap with HermesProviderError
  that names the recovery action ("OAuth grant may have been revoked;
  run hermes model"). Previously a bare litellm.AuthenticationError
  propagated with no signal that recovery had been attempted.

nous_lm._SharedNousState.__post_init__:
  Reject construction with agent_key set but agent_key_expires_at None
  (or vice versa). The runtime path defensively treats this as "always
  re-mint" — surfacing the construction-time mistake loudly is cheaper
  than letting it cause silent re-mint storms in production.
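
A construction-time guard of this kind fits in a few lines. The dataclass below is a hypothetical reduction with only the two fields that matter for the check; the real _SharedNousState carries more.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentKeyStateSketch:
    agent_key: Optional[str] = None
    agent_key_expires_at: Optional[float] = None

    def __post_init__(self) -> None:
        # Either both fields are set or neither is; a half-set pair is a bug.
        if (self.agent_key is None) != (self.agent_key_expires_at is None):
            raise ValueError(
                "agent_key and agent_key_expires_at must be set together"
            )
```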

hermes_provider._maybe_resolve_nous_lm:
  When the credential pool entry has access_token but neither
  refresh_token NOR agent_key, raise HermesProviderError pointing at
  `hermes model` recovery. This is almost certainly a partial OAuth
  setup (interrupted hermes model run) that would otherwise let the
  caller fall through to direct pass-through and 401 against Nous's
  inference endpoint with no breadcrumb. Pool entries with agent_key
  set still fall through unchanged (genuine inference-only credentials).
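
Taken together with the resolver commit, the routing decision can be summarized as a small function. `choose_nous_path` and its string return values are hypothetical, and the final fall-through for an entry with none of the three fields is an assumption about pre-existing behavior.

```python
from typing import Mapping, Optional


class ProviderSetupErrorSketch(Exception):
    """Hypothetical stand-in for HermesProviderError."""


def choose_nous_path(entry: Optional[Mapping[str, str]]) -> str:
    """Decide how a provider: nous credential pool entry should be handled."""
    if entry is None:
        raise ProviderSetupErrorSketch("no Nous credentials; run `hermes model`")
    if entry.get("refresh_token"):
        return "nous_lm"  # OAuth-managed: refresh + agent_key minting
    if entry.get("agent_key"):
        return "pass_through"  # genuine inference-only credential
    if entry.get("access_token"):
        # access_token alone looks like an interrupted `hermes model` run.
        raise ProviderSetupErrorSketch(
            "partial OAuth setup; run `hermes model` to finish"
        )
    return "pass_through"  # assumption: anything else keeps the old path
```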

Comment-rot cleanup:
  Stripped six `hermes_cli/auth.py:NNNN-NNNN` line-number references
  from nous_lm.py and docs/model_resolution.md. Replaced with symbol
  references. Codex's nous_lm-equivalent (codex_lm.py) got this right
  and Nous followed the wrong precedent; aligning now.

New test coverage (~20 cases):
  Refresh + mint malformed-JSON paths, network-error wrapping (httpx
  ConnectError), OAuth 403 / mint 403 status-code-triggers-relogin,
  agent_key field-name alias, ISO expires_at parsing, bool expires_in
  hits floor, async-path 401 recovery (sync had it; async test
  previously mocked the thing under test), partial-OAuth-setup error
  in the resolver, _SharedNousState __post_init__ guard, and the new
  parse_iso_or_epoch rejection paths.
jramos merged commit a8fad18 into main May 15, 2026
4 checks passed
jramos deleted the feat/nous-oauth-refresh-and-agent-key branch May 15, 2026 18:20