refactor: migrate services with existing config.rs to shared TOML config by geoffjay · Pull Request #1204 · geoffjay/agentd

geoffjay · 2026-05-12T05:48:18Z

Migrates all services that already had a config.rs to use agentd_common::config::load() as the base layer, with legacy environment variables overlaid on top for backward compatibility.

Services migrated:

hook — HookConfig::load() reads from shared.services.hook
monitor — MonitorConfig::load() reads from shared.services.monitor
mcp — AgentdMcpConfig::load() reads from shared.services.mcp
memory — EmbeddingConfig::load() / LanceConfig::load() read from shared.services.memory
index — IndexConfig::load() reads from shared.services.index
ui — UiConfig::load() reads from shared.services.ui

All from_env() methods are deprecated in favour of load(). Test modules updated with ENV_LOCK to prevent parallel env-var races and handle AGENTD_PORT set by the agentd runtime.

Closes #1196

geoffjay

Review: refactor: migrate services with existing config.rs to shared TOML config

Stack position: issue-1196 ← issue-1195 (needs-rework) ← issue-1199 ← issue-1194 (needs-rework). The chain is broken two levels up. This PR cannot merge until #1201 and #1203 are reworked and the stack restacks. Reviewing now so the feedback is ready to act on immediately.

The migration pattern — load() as the new entry point, from_env() deprecated and delegating to load(), ENV_LOCK added to tests, remove_var moved before assertions — is consistently applied and correct. One blocking issue: silent config-file errors.

Blocking

`unwrap_or_default()` swallows malformed config-file errors with no diagnostic

Every migrated service uses the same pattern:

let shared = agentd_common::config::load().unwrap_or_default();

agentd_common::config::load() returns Ok(AgentdConfig::default()) for a missing file (intentional), but returns Err(...) for a malformed file (parse error). unwrap_or_default() treats both cases identically — the service starts on compiled defaults with no indication to the operator that their config.toml was ignored.

Consequence: a user with a typo in config.toml (wrong key name, bad value type, stray character) will see the service start on port 17006 instead of their configured 19006, with no log line explaining why. This is a silent misconfiguration failure.

The fix is one line per service. Since init_tracing() is called before Config::load() in every main.rs, tracing is available:

let shared = agentd_common::config::load().unwrap_or_else(|e| {
    tracing::warn!("failed to load config file, using compiled defaults: {e:#}");
    agentd_common::config::AgentdConfig::default()
});

This applies to all six load() / load_with_base() callers: hook, monitor, mcp, memory::EmbeddingConfig, memory::LanceConfig, and ui. The index service inherits via load_with_base which receives a &SharedIndexConfig already extracted from the shared config, so the warning only needs to be in IndexConfig::load() (which is not shown in the diff — presumably it calls EmbeddingConfig::load_with_base and LanceConfig::load_with_base after its own agentd_common::config::load() call).

Partial migrations — expected, but annotate clearly

Several fields in MonitorConfig and HookConfig still fall back to hardcoded literals because the corresponding fields don't exist in the shared schema yet (flagged in the #1201 review):

Service	Fields still hardcoded (pending #1201 schema additions)
`monitor`	`cpu_alert_threshold` (90.0), `memory_alert_threshold` (90.0), `disk_alert_threshold` (90.0), `history_size` (120)
`hook`	`notify_on_failure` (true), `notify_on_long_running` (true), `long_running_threshold_ms` (30_000)
`index.embedding`	`endpoint` ("http://localhost:11434/v1"), `api_key`
`index.lance`	`table` ("code_chunks")

This is the correct approach given the current state of the shared schema. Add a // TODO(#1201): migrate when shared schema adds <field> comment on each hardcoded fallback so the follow-up is obvious when #1201 lands.

Non-blocking observations

1. Config file parsed twice for memory

EmbeddingConfig::load() and LanceConfig::load() each call agentd_common::config::load() independently. From memory/main.rs:

let lance_config = LanceConfig::load();       // → parses config.toml
let embedding_config = EmbeddingConfig::load(); // → parses config.toml again

The file is read and deserialised twice on startup. Consider adding MemoryConfig::load() -> (LanceConfig, EmbeddingConfig) that calls agentd_common::config::load() once and passes the section to both sub-configs (mirroring the load_with_base pattern already used in the index crate).

2. UiConfig derives Debug, Clone — minor scope addition

Not required for the migration, but harmless and likely needed. Fine to keep; just note it in the commit message or PR description so it doesn't get lost.

3. Index test env-var save/restore doesn't protect against panics

env::remove_var("AGENTD_PORT");
env::remove_var("AGENTD_INDEX_PORT");

let config = IndexConfig::from_env();   // ← if this panics, restore code never runs

if let Some(v) = port_saved { env::set_var("AGENTD_PORT", v); }

If from_env() panics, the vars are left unset for subsequent tests. Since ENV_LOCK is held, no other test will run concurrently, but the next test that acquires the lock will see absent vars. The risk is low (from_env is unlikely to panic), and the pattern matches the rest of the test suite. Noting for awareness.

What's correct

load() → from_env() deprecation chain — correct; delegates cleanly ✓
#[deprecated] + #[allow(deprecated)] in tests — applied consistently across all six crates ✓
MCP remove_var moved before assertion — correct cleanup-before-assert pattern ✓
ENV_LOCK added to index and mcp test modules — fixes a real concurrency issue ✓
Precedence is preserved: AGENTD_PORT (direct env) overrides base.port (shared config) ✓
Backward-compatible: AGENTD_ORCHESTRATOR_URL for mcp still works as highest-priority override ✓
mcp/Cargo.toml adds agentd-common dependency correctly ✓
Only main.rs call sites updated to load() — correct; the deprecated from_env() wrapper ensures existing callers still compile ✓

Summary: Add a tracing::warn! to the unwrap_or_default() call in every migrated load() function, and add // TODO(#1201) comments on the hardcoded fallbacks. Everything else is clean. Hold for parent chain restack.

…g tests Add ENV_LOCK to mcp and index config test modules so env-var-mutating tests are serialised in the parallel test harness. Also clear AGENTD_PORT (set by agentd at runtime) before asserting default port values. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…s; add TODO(#1201) comments Replace silent unwrap_or_default() with unwrap_or_else(|e| tracing::warn\!(...)) in all six migrated load() callers (hook, monitor, mcp, memory::EmbeddingConfig, memory::LanceConfig, index, ui). Operators now see a warning when config.toml is malformed rather than silently falling back to compiled defaults. Add TODO(#1201) comments on every hardcoded fallback that is pending a schema addition: notify_on_failure, notify_on_long_running, long_running_threshold_ms (hook); cpu/memory/disk_alert_threshold, history_size (monitor); embedding_endpoint (index.embedding); lance_table (index.lance). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add a ValidateConfig trait to agentd_common with implementations for all 12 shared config sections. Add AgentdConfig::validate() that collects errors from every section. Implement ValidateConfig on all service-level config structs and wire validate() into each service's startup path so misconfigurations fail early with descriptive error messages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Create dedicated config.rs modules for core, orchestrator, wrap, ask, notify, and communicate. Each struct loads from agentd_common::config::load() first, then overlays legacy AGENTD_* env vars for backward compatibility. All inline env::var() calls in main.rs replaced with config struct access. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add impl ValidateConfig for EmbeddingConfig and LanceConfig in crates/memory/src/config.rs (blocking gap: memory was the only service with no validation at startup) - Wire validation in memory/main.rs: call validate() on embedding and lance configs, load shared config for port/host so port=0 is caught before any I/O with a clear error message - Remove validate_inner() indirection in crates/index/src/config.rs; move logic directly into impl ValidateConfig for IndexConfig and import the trait in index/main.rs and the test module - Make validate_url() pub in agentd-common so service crates can reuse it without duplicating the http/https check Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ilures Replace silent unwrap_or_default() with unwrap_or_else(|e| { tracing::warn\!(...) }) in all six new service config loaders (ask, communicate, core, notify, orchestrator, wrap). Restore diagnostic warn\!() calls for invalid AGENTD_WRAP_HISTORY_BYTES and AGENTD_WRAP_CHANNEL_CAPACITY env vars with TODO comments referencing #1201. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Create docs/configuration.md with quick start, file location, precedence rules, complete key reference table, per-service sections, migration guide, and four example configs (minimal, development, production, Docker) - Fix the env var table in agentd-common/config.rs to include all overrides (AGENTD_INDEX_LANGUAGES, AGENTD_RECONCILE_INTERVAL_SECS, all MCP URLs, AGENTD_NOTIFY_SERVICE_URL, AGENTD_COLLECTION_INTERVAL_SECS) - Add link to configuration reference from README.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

geoffjay · 2026-05-12T23:45:26Z

This change is part of the following stack:

feat(common): add config module with TOML schema and layered loading #1201
- test(common): add integration tests for layered config precedence #1202
  - feat(cli): add config init and config show subcommands #1203
    - refactor: migrate services with existing config.rs to shared TOML config #1204 ◀
      - feat: add config structs for services with inline env var config #1205
        
        feat(config): add ValidateConfig trait and per-service validation #1224
        
        docs: add configuration system documentation #1207

_{Change managed by git-spice.}

docs: add configuration system documentation

feat: add ValidateConfig trait and per-service validation

feat: add config structs for services with inline env var config

geoffjay added the review-agent Used to invoke a review by an agent tracking this label label May 12, 2026

geoffjay commented May 12, 2026

View reviewed changes

geoffjay added needs-rework PR has review feedback that must be addressed before merging and removed review-agent Used to invoke a review by an agent tracking this label labels May 12, 2026

geoffjay mentioned this pull request May 12, 2026

feat: add config structs for services with inline env var config #1205

Merged

geoffjay linked an issue May 12, 2026 that may be closed by this pull request

Migrate services with existing config.rs to shared TOML config #1196

Closed

6 tasks

geoffjay mentioned this pull request May 12, 2026

feat: add ValidateConfig trait and per-service validation #1206

Merged

geoffjay added review-agent Used to invoke a review by an agent tracking this label and removed needs-rework PR has review feedback that must be addressed before merging labels May 12, 2026

geoffjay and others added 9 commits May 12, 2026 16:29

refactor: migrate services with existing config.rs to shared TOML config

58d594c

fix: run clippy

5aa7ee2

geoffjay force-pushed the issue-1195 branch from cd2ea64 to dbedac3 Compare May 12, 2026 23:45

geoffjay force-pushed the issue-1196 branch from 9dce6b3 to 5adcca8 Compare May 12, 2026 23:45

geoffjay added 3 commits May 12, 2026 16:59

Merge pull request #1207 from geoffjay/issue-1200

917b059

docs: add configuration system documentation

Merge pull request #1206 from geoffjay/issue-1198

0e3bd79

feat: add ValidateConfig trait and per-service validation

Merge pull request #1205 from geoffjay/issue-1197

5b19e71

feat: add config structs for services with inline env var config

geoffjay merged commit 35d64e4 into issue-1195 May 13, 2026

geoffjay deleted the issue-1196 branch May 13, 2026 00:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: migrate services with existing config.rs to shared TOML config#1204

refactor: migrate services with existing config.rs to shared TOML config#1204
geoffjay merged 12 commits into
issue-1195from
issue-1196

geoffjay commented May 12, 2026

Uh oh!

geoffjay left a comment

Uh oh!

geoffjay commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

geoffjay commented May 12, 2026

Uh oh!

geoffjay left a comment

Choose a reason for hiding this comment

Review: refactor: migrate services with existing config.rs to shared TOML config

Blocking

unwrap_or_default() swallows malformed config-file errors with no diagnostic

Partial migrations — expected, but annotate clearly

Non-blocking observations

What's correct

Uh oh!

geoffjay commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`unwrap_or_default()` swallows malformed config-file errors with no diagnostic