Merged
4 changes: 2 additions & 2 deletions .gitignore
@@ -15,8 +15,8 @@ datasets/**/*.jsonl
datasets/**/*.json
!datasets/.gitkeep

-# Output files from run
-output/**/*.md
+# Output files from run (per-run logs, evolved artifacts, gate decisions)
+output/

# Evolution snapshots
snapshots/
32 changes: 31 additions & 1 deletion docs/model_resolution.md
@@ -140,6 +140,36 @@ Refresh is **in-memory only** — the framework does not write back to `~/.herme

**What's not supported:** streaming via the Responses endpoint (evolution doesn't stream), Codex-specific reasoning-effort overrides (DSPy's defaults work for gpt-5-class), and tool-call message conversion beyond what DSPy's `_convert_chat_request_to_responses_request` already handles. If a Codex 401 surfaces during a run, the standard auth-error panel renders with the `hermes auth add openai-codex` recovery hint.

## Nous Portal OAuth + agent_key

Unlike every other provider, Nous Portal uses a two-stage credential model:

1. **OAuth access_token** (long-lived, on the order of days). Refreshable via the standard `refresh_token` grant.
2. **agent_key** (short-lived, ~30 minutes). Minted from the access_token via a Nous-specific `POST /api/oauth/agent-key`. The inference endpoint requires the **agent_key** as the Bearer token, not the access_token.
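
The two stages can be illustrated with a minimal sketch that only builds the requests and sends nothing. The path and the Bearer rule come from the text above; the base URL placeholder, the empty JSON body, and both function names are assumptions for illustration, not the framework's actual API.

```python
import urllib.request

# Hypothetical placeholder: the real portal base URL is not shown in this document.
PORTAL_BASE_URL = "https://portal.example.invalid"


def build_agent_key_request(
    access_token: str, portal_base_url: str = PORTAL_BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) the stage-2 mint request."""
    url = portal_base_url.rstrip("/") + "/api/oauth/agent-key"
    return urllib.request.Request(
        url,
        data=b"{}",  # assumption: the request body shape is not documented here
        method="POST",
        headers={
            # Stage 1 credential: the OAuth access_token authorizes the mint.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def inference_headers(agent_key: str) -> dict:
    """Stage 2 credential: inference uses the agent_key, not the access_token."""
    return {"Authorization": f"Bearer {agent_key}"}
```

Keeping request construction separate from sending makes the credential split easy to check without a portal account.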

Run `hermes model` and select Nous Portal to populate `~/.hermes/auth.json` with both. Then point `config.yaml` at Nous:

```yaml
# ~/.hermes/config.yaml
model:
  default: Hermes-4-405B
  provider: nous
```

When the resolver detects a Nous credential pool entry with a `refresh_token` (the signal for an OAuth-managed flow), the framework instantiates a `NousLM` subclass that:

1. **Mints a fresh agent_key at preflight time** by POSTing to `{portal}/api/oauth/agent-key` with the OAuth access_token as Bearer.
2. **Refreshes the OAuth access_token in-memory** when it's within 120s of expiry, by POSTing the standard refresh_token grant to `{portal}/api/oauth/token`. This mirrors Hermes's own refresh-first-then-mint sequencing in `hermes_cli/auth.py`.
3. **Re-mints on inference 401** (mid-run agent_key revocation or expiration). The four LM roles (optimizer, reflection, eval, judge) coordinate through a shared lock so a four-thread evolution doesn't race the portal's single-use refresh-token rotation.
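
The expiry check and the lock-coordinated re-mint described above might look roughly like the sketch below. The 120-second window is from the text; `SharedAgentKey`, `needs_refresh`, the injected `mint` callable, and the choice to treat a missing expiry as "refresh now" are all hypothetical and do not reflect the actual `NousLM` internals.

```python
import threading
import time
from typing import Callable, Optional

REFRESH_SKEW_SECONDS = 120  # from the text: refresh when within 120s of expiry


def needs_refresh(expires_at: Optional[float], now: Optional[float] = None) -> bool:
    """True when the expiry is unknown (assumption) or inside the skew window."""
    if expires_at is None:
        return True
    current = time.time() if now is None else now
    return expires_at - current <= REFRESH_SKEW_SECONDS


class SharedAgentKey:
    """Hypothetical holder shared by all LM roles in one process."""

    def __init__(self, mint: Callable[[], str]) -> None:
        self._mint = mint  # performs refresh-then-mint and returns a new agent_key
        self._lock = threading.Lock()
        self._key: Optional[str] = None

    def get(self) -> str:
        """Return the current key, minting once on first use (preflight)."""
        with self._lock:
            if self._key is None:
                self._key = self._mint()
            return self._key

    def remint_after_401(self, rejected_key: str) -> str:
        """Re-mint once per rejected key; later callers reuse the new key."""
        with self._lock:
            # If another thread already replaced the rejected key, skip the
            # mint so the single-use refresh token is not spent twice.
            if self._key is None or self._key == rejected_key:
                self._key = self._mint()
            return self._key
```

Comparing against the rejected key is one way to let four threads that all hit a 401 on the same key trigger a single portal round trip.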

The portal URL is overridable via `HERMES_PORTAL_BASE_URL` (Hermes's own env var name; sharing keeps configs portable for stage / mock setups).
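
As an illustration of the override, a resolver for the portal base URL could be sketched as below. Only the `HERMES_PORTAL_BASE_URL` name comes from the text; the default URL is a placeholder and `resolve_portal_base_url` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Hypothetical placeholder: the real default portal URL is not given in this document.
DEFAULT_PORTAL_BASE_URL = "https://portal.example.invalid"


def resolve_portal_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer HERMES_PORTAL_BASE_URL when set and non-blank, else the default."""
    source = os.environ if env is None else env
    override = (source.get("HERMES_PORTAL_BASE_URL") or "").strip()
    # Strip a trailing slash so later path joins do not produce "//".
    return (override or DEFAULT_PORTAL_BASE_URL).rstrip("/")
```

Passing the environment as a mapping keeps the helper easy to exercise against a local mock, for example `resolve_portal_base_url({"HERMES_PORTAL_BASE_URL": "http://127.0.0.1:8765"})`.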

Refresh and mint state is **in-memory only**: the framework never writes back to `~/.hermes/auth.json`. When an evolution session outlives the on-disk agent_key TTL (~30 minutes since the last `hermes model`), the in-process refresh and re-mint cover it. For multi-day sessions, run `hermes model` periodically to keep the on-disk store fresh.

**What's not supported:** auxiliary endpoints (vision / web-extract / session-search models from `auxiliary.*` config), streaming, and `auth.json` writeback. Pool entries without `refresh_token` (env-var-style `NOUS_API_KEY` setups) fall through to the existing direct pass-through path. That path probably does not work for Nous inference (the access_token isn't a valid Bearer), but the framework does not try to silently "upgrade" those setups.
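
The routing rule for pool entries can be summarized in a small sketch. It follows the fall-through behavior described above together with the error cases in this PR's resolver change; `classify_nous_entry` and `_nonempty` are hypothetical names, and `ValueError` stands in for the project's own error type.

```python
from typing import Any, Mapping, Optional


def _nonempty(value: Any) -> Optional[str]:
    """Return a stripped string, or None for missing or blank values."""
    if isinstance(value, str) and value.strip():
        return value.strip()
    return None


def classify_nous_entry(entry: Optional[Mapping[str, Any]]) -> str:
    """Return "nous-lm" or "pass-through"; raise ValueError on unusable entries."""
    if entry is None:
        raise ValueError("no Nous pool entry; run `hermes model`")
    refresh_token = _nonempty(entry.get("refresh_token"))
    agent_key = _nonempty(entry.get("agent_key"))
    if refresh_token is None:
        if agent_key is None:
            # Neither credential: most likely an interrupted OAuth setup.
            raise ValueError("partial OAuth setup; run `hermes model`")
        # Inference-only entry: leave it to the generic handler.
        return "pass-through"
    if _nonempty(entry.get("access_token")) is None:
        raise ValueError("no access_token; run `hermes model`")
    # OAuth-managed entry: eligible for in-memory refresh and minting.
    return "nous-lm"
```

Failing early on the "neither credential" case gives the operator a recovery hint instead of an unexplained 401 at inference time.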

A runnable smoke harness at `tests/manual/nous_smoke.py` validates the Nous wire flow against a local mock portal (no real Nous Portal account required). Run via `uv run python tests/manual/nous_smoke.py`.

## Per-role overrides

When your provider exposes multiple models, you can pick a different one per role to manage cost. Common pattern: a frontier model for the optimizer + reflection LMs (where reasoning matters), a cheaper model for eval + judge (where you'll make many calls):
@@ -236,7 +266,7 @@ The framework defaults all four roles to Hermes's single `model.default`. To use

This module currently does not:

-- Refresh expired OAuth tokens for non-Codex providers (delegated to `hermes auth add <provider>` / `hermes model`; Codex tokens refresh in-memory — see [OpenAI Codex Responses API](#openai-codex-responses-api))
- Honor `auxiliary.*` provider config from `config.yaml` (Hermes's vision/web-extract/session-search routing)
+- OAuth refresh for Qwen, Spotify, or Google Gemini providers (Codex and Nous Portal handled in-memory — see their dedicated sections above; the other OAuth providers in Hermes don't have demand from the evolution use case yet)

The slim resolver lives at `evolution/core/hermes_provider.py`. The mapping table is sourced from `hermes_cli/auth.py` constants — drift is possible; update by reference when Hermes adds providers.
122 changes: 122 additions & 0 deletions evolution/core/hermes_provider.py
@@ -312,6 +312,18 @@ def resolve_default_lm(
            auth_store=auth_store, target_model=target_model, role=role
        )

    # Nous Portal: when the credential pool entry has a refresh_token (the
    # OAuth-managed flow that hermes model writes), route through NousLM
    # for in-memory OAuth refresh + agent_key minting. The plain env-var
    # NOUS_API_KEY path falls through to the generic OpenAI-wire handler
    # below — no behavior change for that simpler setup.
    if canonical == "nous":
        nous_resolved = _maybe_resolve_nous_lm(
            auth_store=auth_store, target_model=target_model, role=role
        )
        if nous_resolved is not None:
            return nous_resolved

    if not target_model:
        raise HermesProviderError(
            f"~/.hermes/config.yaml sets provider='{requested_provider}' "
@@ -703,6 +715,116 @@ def _factory() -> Any:
)


def _maybe_resolve_nous_lm(
    *,
    auth_store: Dict[str, Any],
    target_model: str,
    role: Role,
) -> Optional[ResolvedLM]:
    """Build a NousLM-backed ResolvedLM when the auth.json pool entry
    looks OAuth-managed; return None to let the caller fall through to
    the generic OpenAI-wire handler when the entry is just an env-var-
    style API key.

    Nous uses a two-stage credential model: an OAuth access_token
    (long-lived) is exchanged for a short-lived agent_key that's the
    actual inference Bearer. NousLM handles both: refresh access_token
    in-memory when expiring, mint a fresh agent_key from it, re-mint on
    inference 401. See evolution/core/nous_lm.py.

    The "looks OAuth-managed" signal: pool entry has a refresh_token. A
    pool entry without refresh_token is either env-var-only (NOUS_API_KEY
    set, no real OAuth state) or hand-edited; let the caller fall
    through to direct pass-through so we don't break that setup.

    The CodexLM-equivalent NousLM import is lazy to avoid a circular
    dependency: nous_lm imports HermesProviderError from this module.
    """
    pool_entry = _pick_pool_entry(auth_store, "nous")
    if pool_entry is None:
        # No pool entry at all → hint operator at the right recovery
        # rather than falling through silently to env-var resolution
        # that probably also won't work.
        raise HermesProviderError(
            "~/.hermes/config.yaml sets provider='nous' but no usable "
            "entry was found in ~/.hermes/auth.json credential_pool[\"nous\"]. "
            "Run `hermes model` and select Nous Portal to authenticate, "
            f"or pass --{role}-model to bypass Hermes resolution."
        )

    refresh_token = _str_or_none(pool_entry.get("refresh_token"))
    agent_key = _str_or_none(pool_entry.get("agent_key"))
    if not refresh_token:
        # An entry with agent_key set is plausibly env-var-style or
        # hand-edited inference-only — let it fall through to the
        # generic OpenAI-wire handler with whatever Bearer it carries.
        # An entry with NEITHER refresh_token NOR agent_key is almost
        # certainly a partial OAuth setup (interrupted hermes model run,
        # or the portal handed back access_token only). Inference would
        # 401 with no breadcrumb pointing at the missing credentials, so
        # raise here with a specific recovery hint.
        if agent_key is None:
            raise HermesProviderError(
                "~/.hermes/auth.json credential_pool[\"nous\"] entry has "
                "an access_token but no refresh_token or agent_key — "
                "looks like a partial OAuth setup. Run `hermes model` "
                "and select Nous Portal to complete authentication, or "
                f"pass --{role}-model to bypass Hermes resolution."
            )
        return None

    access_token = _str_or_none(pool_entry.get("access_token"))
    if not access_token:
        raise HermesProviderError(
            "~/.hermes/auth.json credential_pool[\"nous\"] entry has no "
            "access_token. Run `hermes model` and select Nous Portal to "
            "re-authenticate."
        )

    if not target_model:
        raise HermesProviderError(
            "~/.hermes/config.yaml sets provider='nous' but model.default "
            f"is empty. Set it (e.g., 'Hermes-4-405B'), or pass --{role}-model."
        )

    # Lazy import to break the circular dependency with nous_lm.
    from evolution.core.nous_lm import (  # noqa: PLC0415
        NousLM as _NousLM,
        NOUS_INFERENCE_BASE_URL,
        NOUS_PORTAL_BASE_URL,
    )
    from evolution.core.oauth_helpers import parse_iso_or_epoch  # noqa: PLC0415

    inference_base_url = (
        _str_or_none(pool_entry.get("inference_base_url"))
        or _str_or_none(pool_entry.get("base_url"))
        or NOUS_INFERENCE_BASE_URL
    )
    oauth_expires_at = parse_iso_or_epoch(pool_entry.get("expires_at"))
    agent_key = _str_or_none(pool_entry.get("agent_key"))
    agent_key_expires_at = parse_iso_or_epoch(pool_entry.get("agent_key_expires_at"))

    def _factory() -> Any:
        return _NousLM(
            model=f"openai/{target_model}",
            access_token=access_token,
            refresh_token=refresh_token,
            oauth_expires_at=oauth_expires_at,
            agent_key=agent_key,
            agent_key_expires_at=agent_key_expires_at,
            portal_base_url=NOUS_PORTAL_BASE_URL,
            inference_base_url=inference_base_url,
        )

    return ResolvedLM(
        model=f"openai/{target_model}",
        lm_kwargs={},
        source=f"hermes-config:nous(inference_base_url={inference_base_url})",
        lm_factory=_factory,
        provider_hint="nous",
    )


def _build_resolved_lm(
    *,
    provider: str,