Phase 1: Storage foundation for org-workspace messaging#2
Phase 1: Storage foundation for org-workspace messaging#2
Conversation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t tiers, paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Lock Implements MessageStore with create_message, create_file_delivery, find_unread/thread/by_id, mark_read, and archive. Adapts StateConfig to the actual sequences/terminal_states API. Handles org-workspace's generation-bump-on-create via live node tracking and in-place slot refresh (_refresh_node) so returned NodeViews stay valid across sequential create calls. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds lib/agent_inbox.py providing per-agent task management on top of org-workspace, implementing the WAITING→QUEUED→WORKING→DONE state machine with trust-tier-based auto-acceptance and revision support. 14 new tests cover creation, all lifecycle transitions, precondition guards, and queries. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extract _refresh_node into lib/_org_utils.py, import from both message_store.py and agent_inbox.py (Critical 1) - Fix AgentInbox._reload to call _refresh_live_nodes() after reload, add generation contract docstring (Critical 2) - Add "archived" key to AgentInbox.counts() (Important 3) - Add test_retry_increments_count_twice asserting RETRY_COUNT == 2 (Important 5) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements TaskGovernor in lib/governor.py with trust tier resolution, per-sender daily token budgets, hourly rate limits, and effort caps. State persists to daily JSON budget files with FileLock for concurrency safety. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…defaults - Rewrite check_and_record to acquire FileLock, reload state, run check, update in-memory state, and write directly to file — all within one lock scope, eliminating the TOCTOU race condition - Verify _load_state does not use FileLock (safe to call within lock scope) - Add queue depth check in check_task after rate limiting - Add rate_limits default to get_compute_config so tasks_per_hour is present in the returned dict without requiring settings override - Add _check_date_rollover() call at start of get_usage() Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…llision, empty secret - Create lib/relay.py as single canonical relay implementation - Fix /status endpoint: no longer leaks connected usernames (count only) - Fix arg parsing: --host flag instead of -h to avoid collision with --help - Fix startup: raises ValueError with clear message if RELAY_SECRET is empty - Delete lib/datacore-msg-relay.py and relay/datacore-msg-relay.py (duplicates) - Update relay/Dockerfile to use lib/ layout and python -m lib.relay - Update Procfile to use python -m lib.relay --host - Add tests/test_relay.py with 5 tests covering auth, startup, arg parsing - All 46 tests passing Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…e, use org-workspace Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…, store independence Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…2.0 architecture Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…EADME for v0.2.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… mutations - Add _load_state_unlocked() for use inside FileLock contexts (no re-entrant locking) - Add _write_state_unlocked() using temp+os.replace() for atomic POSIX writes - record_usage, record_task_submission, record_task_completion now each acquire FileLock, reload from disk, modify, and atomically write — eliminating lost-update races - check_and_record updated to use _write_state_unlocked() (was using write_text directly) - Add _record_task_submission() as the internal implementation - record_task_submission() kept as public backward-compat wrapper with deprecation note - Add submit_task() as the preferred public API (check + record in one call) - _save_state() removed; replaced by the lock+load+modify+write pattern Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…operty inputs, fix test gaps - Remove `workspace` property from MessageStore and AgentInbox to prevent callers from bypassing state guards and FileLock - Add `AgentInbox.find_by_id()` as focused replacement; update hooks/send-reply.py and hooks/task-queue.py to use it - Add process-local `_MSG_COUNTER` / `_FILE_COUNTER` (itertools.count) to `_unique_msg_id` and `_unique_file_id` hash inputs, eliminating collision risk when two IDs are generated within the same microsecond - Add `_validate_property_value()` helper; apply to from_actor, to_actor, trust_tier, reply_to, priority, and extra_props in create_message / create_file_delivery / create_task — newlines in property values would corrupt org property drawers - Fix `test_message_id_same_content_different_time`: now asserts IDs differ - Add `test_live_node_stays_valid_after_create`: verifies msg1 survives workspace generation bump caused by creating msg2 - Add `TestInputValidation` classes to both test files covering newline rejection and the allowed-newline-in-body-text case - Add `TestFindById` to test_agent_inbox.py covering find and miss paths - Add explanatory comment above _TASK_STATE_CONFIG and _MSG_STATE_CONFIG clarifying that sequences do not constrain transitions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…or v0.2.0 reality - module.yaml: separate `provides` (hooks, implemented) from `planned` (commands/agents are spec-only docs; CLI implementation is Phase 2) - commands/*.md: replace all org/inboxes/ → org/messaging/ and USERS.yaml → contacts.yaml path references - UPGRADING.md: new migration guide for v0.1.0 → v0.2.0 with mv/merge commands and contacts.yaml format conversion - README.md: remove relay/ directory listing (consolidated to lib/relay.py), remove /msg-trust reference (use trust_overrides in settings instead), update relay deploy instructions to match actual implementation - lib/relay.py: remove dead claude_whitelist parameter from RelayServer, add TLS note to create_relay_app docstring, add TLS info log in run_relay - lib/message_store.py: add comment explaining outbox.org is Phase 2 placeholder Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ad freshness, deprecation notices - settings.local.yaml.example: remove dead claude_whitelist section, add v0.2.0 sections (trust_overrides, trust_tiers, compute) matching lib/config.py - lib/governor.py: add self._load_state() at top of get_usage() and daily_summary() so they reload from disk before returning data - datacore-msg.py: add prominent deprecation notice in module docstring and startup warning printed to stderr at runtime - templates/contacts.yaml: update comment to reference trust_overrides in settings.local.yaml instead of dead /msg-trust command - UPGRADING.md: document GUI app incompatibility with v0.2.0, expected failure modes, and Phase 2 rewrite plan - relay/Dockerfile: verified correct (COPY lib/ lib/ + lib.relay --host) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Introduces a new lib/ storage foundation for org-workspace–backed messaging and agent task governance, and rewires hooks/docs to use this shared library layer instead of duplicating file-parsing logic.
Changes:
- Added core library modules for config, message storage, agent task inbox lifecycle, governance, and a consolidated relay server.
- Rewired hooks to use the new library APIs; removed legacy relay/window implementations.
- Added a new test suite covering the new relay/store/inbox/governor behaviors and updated templates/docs for the v0.2.0 layout.
Reviewed changes
Copilot reviewed 42 out of 44 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
lib/config.py |
Centralized settings + identity + trust tier/compute config helpers. |
lib/message_store.py |
org-workspace backed universal inbox storage + threading + file delivery nodes. |
lib/agent_inbox.py |
Agent task inbox state machine (WAITING→QUEUED→WORKING→DONE→…). |
lib/governor.py |
Trust tiers, budgets, and rate-limit enforcement with on-disk state. |
lib/relay.py |
Consolidated aiohttp WebSocket relay implementation + CLI parsing. |
lib/_org_utils.py |
Shared helper to refresh NodeView references across generations. |
lib/__init__.py |
Ensures repo root is importable when hooks run standalone. |
hooks/inbox-watcher.py |
Hook rewritten to use AgentInbox (claim queued tasks). |
hooks/send-reply.py |
Hook rewritten to use MessageStore + optionally complete tasks. |
hooks/mark-message.py |
Hook rewritten to mark messages read/archive via MessageStore. |
hooks/task-queue.py |
Hook rewritten to approve/reject/cancel tasks via AgentInbox. |
tests/conftest.py |
Test fixtures for temp org/messaging workspace + StateConfig. |
tests/test_message_store.py |
Tests for message creation, threading, querying, transitions, validation. |
tests/test_agent_inbox.py |
Tests for task lifecycle, retries, validation, queries, counts. |
tests/test_governor.py |
Tests for tier resolution, budgeting, persistence, rate limiting, effort limits. |
tests/test_relay.py |
Tests for relay /status privacy, empty secret refusal, arg parsing. |
tests/test_config.py |
Tests for settings cache behavior, defaults, overrides. |
tests/test_hooks.py |
Smoke tests that lib-based hook behaviors work via stores. |
tests/test_integration.py |
End-to-end integration test across store/inbox/governor. |
templates/contacts.yaml |
New contacts registry template (actors list). |
templates/USERS.yaml |
Removed legacy user registry template. |
settings.local.yaml.example |
Updated example settings (trust tiers/overrides, relay, compute). |
requirements.txt |
Updated runtime dependencies (includes org-workspace). |
relay/datacore-msg-relay.py |
Removed legacy relay implementation. |
relay/Dockerfile |
Updated relay container to run python -m lib.relay. |
Procfile |
Updated to run consolidated relay module. |
pyproject.toml |
Added project metadata + deps + pytest config. |
module.yaml |
Bumped version/schema and updated created paths + hook definitions. |
datacore-msg.py |
Added deprecation warning re: v0.2.0 incompatibility. |
UPGRADING.md |
Added migration guide for v0.1.0 → v0.2.0 layout changes. |
README.md |
Updated docs for new org/messaging layout, trust tiers, relay usage. |
CLAUDE.md |
Updated module context docs to match new storage + governance design. |
commands/msg.md |
Spec doc updated for new universal inbox + contacts.yaml. |
commands/reply.md |
Spec doc updated for new inbox path. |
commands/my-messages.md |
Spec doc updated for scanning universal inbox files. |
commands/msg-add-user.md |
Spec doc updated for contacts.yaml + messaging dir. |
commands/broadcast.md |
Spec doc updated for contacts.yaml + universal inbox. |
agents/message-task-intake.md |
New spec doc describing task intake/governance/task state machine. |
agents/message-digest.md |
Spec doc updated to use MessageStore and universal inbox. |
agents/claude-inbox.md |
Removed legacy agent spec tied to v0.1.0 layout. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| self.users[username] = User(username=username, ws=ws) | ||
|
|
||
| def remove_user(self, username: str) -> None: | ||
| self.users.pop(username, None) |
There was a problem hiding this comment.
RelayServer.add_user() overwrites any existing connection for the same username without closing the old websocket or guarding removal. If a client reconnects, the old connection’s finally block will call remove_user(username) and can inadvertently delete the new session, causing flapping/disconnects. Consider tracking the ws per user and either (a) close and replace the old ws on re-auth, and/or (b) make remove_user conditional on the ws instance that’s disconnecting (e.g., remove_user(username, ws) only if current ws matches).
| self.users.pop(username, None) | |
| user = self.users.get(username) | |
| # Only remove the user entry if the currently-registered websocket | |
| # is already closed. This prevents an older connection's cleanup | |
| # from removing a newer active session for the same username. | |
| if user is not None and user.ws.closed: | |
| self.users.pop(username, None) |
| with FileLock(self._budget_file): | ||
| self._load_state_unlocked() | ||
| result = self.check_task(actor_id, estimated_tokens=estimated_tokens, effort=effort) | ||
| if result.allowed: | ||
| sender = self._get_sender(actor_id) | ||
| sender.tasks_this_hour.append(time.time()) | ||
| sender.tasks_today += 1 | ||
| sender.active_tasks += 1 | ||
| self._write_state_unlocked() |
There was a problem hiding this comment.
check_and_record() acquires FileLock on the budget file, then calls check_task(), which calls _check_date_rollover() and may call _load_state() (which acquires FileLock again). This introduces nested locking on the same file and can deadlock or hang at midnight rollover depending on FileLock’s reentrancy. Refactor so date rollover/state loading inside locked sections uses an “_unlocked” variant (no internal FileLock) or perform rollover before entering the FileLock and then lock the correct file.
| with FileLock(self._budget_file): | |
| self._load_state_unlocked() | |
| result = self.check_task(actor_id, estimated_tokens=estimated_tokens, effort=effort) | |
| if result.allowed: | |
| sender = self._get_sender(actor_id) | |
| sender.tasks_this_hour.append(time.time()) | |
| sender.tasks_today += 1 | |
| sender.active_tasks += 1 | |
| self._write_state_unlocked() | |
| # Perform the governance check outside of any FileLock to avoid nested locking | |
| result = self.check_task(actor_id, estimated_tokens=estimated_tokens, effort=effort) | |
| if not result.allowed: | |
| return result | |
| # Record the task under a FileLock while using *_unlocked state helpers | |
| with FileLock(self._budget_file): | |
| self._load_state_unlocked() | |
| sender = self._get_sender(actor_id) | |
| sender.tasks_this_hour.append(time.time()) | |
| sender.tasks_today += 1 | |
| sender.active_tasks += 1 | |
| self._write_state_unlocked() |
| ```bash | ||
| # Claude can reply via the send-reply script | ||
| # Reply to a sender | ||
| python3 hooks/send-reply.py gregor "Fixed! Check the PR." | ||
|
|
||
| # Reply to a specific message (creates thread) | ||
| python3 hooks/send-reply.py --reply-to msg-20251212-143000-gregor gregor "Here's the follow-up" | ||
| python3 hooks/send-reply.py --reply-to msg-20251212-143000-gregor gregor "Follow-up" | ||
|
|
||
| # Mark task as complete and reply (updates TASK_STATUS to done) | ||
| python3 hooks/send-reply.py --complete msg-20251212-143000-gregor gregor "Task complete! See results." | ||
| # Mark task complete and reply | ||
| python3 hooks/send-reply.py --complete msg-20251212-143000-gregor gregor "Task done." | ||
| ``` | ||
|
|
||
| **Task Status Tracking:** | ||
| - When Claude reads a message, it's marked as `TASK_STATUS: working` | ||
| - Using `--complete` marks the original task as `TASK_STATUS: done` | ||
| - Use `/tasks` in GUI to see task queue status | ||
|
|
||
| **Rate Limiting:** | ||
| - Claude processes one task at a time (FIFO queue) | ||
| - High priority tasks jump the queue | ||
| - If Claude is already working on a task, new tasks stay queued | ||
| - Complete current task before next one is loaded | ||
|
|
||
| **Response Routing:** | ||
| - `github:123` - Post to GitHub issue #123 (requires `gh` CLI) | ||
| - `file:path/to.md` - Append to file (relative to space) | ||
| - `@user` - CC to another user | ||
|
|
||
| The reply is: | ||
| 1. Saved to the recipient's inbox (`gregor.org`) | ||
| 2. Sent via relay for real-time delivery (if connected) | ||
|
|
||
| ### Marking Messages from Claude | ||
| ### Managing the Task Queue | ||
|
|
||
| ```bash | ||
| # Mark a message as TODO | ||
| python3 hooks/mark-message.py 151230 todo | ||
| # Approve a waiting task | ||
| python3 hooks/task-queue.py approve msg-20251212-143000-gregor | ||
|
|
||
| # Mark as done | ||
| python3 hooks/mark-message.py 151230 done | ||
| # Reject a task | ||
| python3 hooks/task-queue.py reject msg-20251212-143000-gregor | ||
|
|
||
| # Mark as read (clear status) | ||
| python3 hooks/mark-message.py 151230 read | ||
| # Cancel a running task | ||
| python3 hooks/task-queue.py cancel msg-20251212-143000-gregor | ||
| ``` |
There was a problem hiding this comment.
The hook examples here don’t match the current hook CLIs:
hooks/send-reply.pyuses--complete-task, but the docs show--complete.hooks/task-queue.pyrequires--task-id(not a positional task id), but the docs showapprove msg-...etc.- “Cancel a running task” doesn’t align with AgentInbox.cancel() which only allows cancelling QUEUED tasks.
Please update the examples or add compatible aliases/positional args in the scripts.
| parser = argparse.ArgumentParser(description="Task queue management") | ||
| parser.add_argument("action", choices=["status", "approve", "reject", "cancel"], help="Queue action") | ||
| parser.add_argument("--task-id", help="Task ID (for approve/reject/cancel)") | ||
| parser.add_argument("--reason", default="", help="Reason (for reject/cancel)") |
There was a problem hiding this comment.
Argument parsing expects --task-id for approve/reject/cancel, but current docs/examples in README use a positional task id. Consider supporting an optional positional task_id (and/or making --task-id required for those actions) and returning a non-zero exit with a clear message when the task id is missing or not found.
| ## Contacts Registry | ||
|
|
||
| ```yaml | ||
| # USERS.yaml | ||
| users: | ||
| username: | ||
| handles: ["@handle1", "@handle2"] | ||
| added: YYYY-MM-DD | ||
| # contacts.yaml | ||
| contacts: | ||
| gregor: | ||
| handles: ["@gregor", "@gz"] | ||
| relay: "wss://relay.example.com/ws" | ||
| added: "2025-12-11" | ||
| crt: | ||
| handles: ["@crt"] | ||
| relay: "" | ||
| added: "2025-12-11" |
There was a problem hiding this comment.
The documented contacts.yaml structure here uses a top-level contacts: mapping, but the new template/migration guide uses a top-level actors: list with id, name, and trust_tier. This mismatch will confuse users and implementers; update the example/schema to match templates/contacts.yaml and UPGRADING.md.
| # Edge enforcement is handled exclusively via _assert_state preconditions | ||
| # in write methods. This allows non-linear workflows (e.g. DONE -> QUEUED). |
There was a problem hiding this comment.
The comment says write methods enforce transitions via “_assert_state preconditions”, but MessageStore doesn’t define or use an _assert_state helper. Either add explicit state precondition checks (similar to AgentInbox) or update this comment/docstring so it reflects the actual behavior.
| # Edge enforcement is handled exclusively via _assert_state preconditions | |
| # in write methods. This allows non-linear workflows (e.g. DONE -> QUEUED). | |
| # MessageStore itself does not enforce transition edges; any policy checks | |
| # are implemented by callers (e.g. AgentInbox), allowing non-linear workflows. |
| self._ws = OrgWorkspace(state_config=state_config or _MSG_STATE_CONFIG) | ||
| self._live_nodes: list[NodeView] = [] # nodes returned to callers | ||
| self._ensure_files() |
There was a problem hiding this comment.
MessageStore tracks every returned NodeView in _live_nodes and never removes entries. In long-running processes (e.g., GUI/daemon), repeated create_message/create_file_delivery calls will cause unbounded growth and extra work in _refresh_live_nodes(). Consider pruning old nodes, using weakrefs, or tracking only nodes that callers explicitly “pin”.
| ```bash | ||
| python3 datacore-msg.py send @gregor "Hey, can you review the PR?" | ||
| python3 datacore-msg.py send @tex-claude "Research competitor pricing" | ||
| ``` | ||
|
|
||
| ### Check Inbox | ||
|
|
||
| ```bash | ||
| ./start.sh | ||
| # Or directly: | ||
| python3 datacore-msg.py | ||
| python3 datacore-msg.py inbox | ||
| ``` | ||
|
|
||
| ### Send Messages | ||
| ### Start the GUI | ||
|
|
||
| In the GUI input field: | ||
| - `@gregor Hey, can you review the PR?` - Message a teammate | ||
| - `@claude Research competitor pricing` - Message your Claude agent | ||
| - `@gregor-claude Help with code review` - Message someone else's Claude (if whitelisted) | ||
| - `@gregor >msg-id Follow-up here` - Reply to a message (creates thread) | ||
| - `@claude [github:42] Fix this bug` - Route response to GitHub issue | ||
| - `@claude [file:research/report.md] Analyze this` - Route to file | ||
| - `@claude [@gregor] Help with code` - CC response to another user | ||
| ```bash | ||
| python3 datacore-msg.py gui | ||
| # Or: ./start.sh | ||
| ``` |
There was a problem hiding this comment.
The README instructs using python3 datacore-msg.py send|inbox|gui, but datacore-msg.py is explicitly marked “not yet compatible with v0.2.0” and (currently) doesn’t implement these subcommands (it launches the legacy GUI instead). Either update the README usage examples to point to the supported hooks/ scripts / lib/ API, or implement an argparse-based CLI in datacore-msg.py that provides these commands.
| parser.add_argument("to", help="Recipient actor ID") | ||
| parser.add_argument("content", help="Message content") | ||
| parser.add_argument("--reply-to", help="Message ID to reply to") | ||
| parser.add_argument("--complete-task", help="Task ID to mark complete") |
There was a problem hiding this comment.
The CLI flag for completing a task is --complete-task, but existing docs/examples in README (and likely prior usage) refer to --complete. To preserve compatibility and avoid confusion, consider adding --complete as an alias or renaming the argument to --complete (keeping --complete-task as a deprecated alias if desired).
| parser.add_argument("--complete-task", help="Task ID to mark complete") | |
| parser.add_argument("--complete-task", "--complete", help="Task ID to mark complete") |
| def parse_relay_args(args: list[str] | None = None) -> argparse.Namespace: | ||
| """Parse relay CLI arguments. Uses --host (not -h) for hosting.""" | ||
| parser = argparse.ArgumentParser(description="Datacore messaging relay") | ||
| parser.add_argument("--host", action="store_true", help="Host relay server") | ||
| parser.add_argument("--port", type=int, default=8080, help="Port (default: 8080)") | ||
| parser.add_argument("--bind", default="0.0.0.0", help="Bind address") | ||
| return parser.parse_args(args) | ||
|
|
||
|
|
||
| def run_relay() -> None: | ||
| """Entry point for standalone relay server.""" | ||
| args = parse_relay_args() | ||
| secret = os.environ.get("RELAY_SECRET", "") | ||
| app = create_relay_app(relay_secret=secret) |
There was a problem hiding this comment.
parse_relay_args() defines a --host flag and several callers pass it (Procfile/Dockerfile), but run_relay() doesn’t branch on args.host (it always starts the server). Either remove --host entirely or use it to control behavior (e.g., require --host to start serving, otherwise print help / run a client). Keeping an unused flag makes the CLI misleading.
Context
An audit of datacore-messaging v0.1.0 identified structural issues: duplicated code across hooks, no shared configuration, no input validation, and tight coupling that made the system hard to extend. The audit findings were reviewed by evaluator agents (CTO, Dijkstra, Feynman, User) until consensus, then used to create a detailed implementation plan.
This PR implements that plan — the storage foundation that all future messaging features will build on.
Full implementation plan:
docs/superpowers/plans/2026-03-11-phase1-foundation.mdCore idea
Instead of hooks directly manipulating org files, this PR introduces a library layer (
lib/) that provides:Hooks are rewired to import from
lib/instead of containing their own logic.What this does NOT do
contacts.yamltemplate uses ActivityPub-style actor IDs (@user@instance) as a forward-compatible addressing format, but no federation protocol is implementeddatacore-msg.pyis deprecated with a warning; proper commands come in Phase 2What changed
lib/config.py,lib/message_store.py,lib/agent_inbox.py,lib/governor.py,lib/relay.py,lib/_org_utils.pylib/lib/relay.py)module.yaml,CLAUDE.md,README.md, templates, settings exampleUPGRADING.mdmigration guide, 68 testsTest plan