feat: Moodle Workplace integration (skill-routing + LTI 1.3) by ryan-day-ctr · Pull Request #968 · soliplex/soliplex

ryan-day-ctr · 2026-05-13T16:09:33Z

Summary

Adds a Moodle Workplace integration to Soliplex. Includes:

100 Moodle tools across 7 haiku-skills, dispatched by a router agent that sees a 7-skill catalog (~500 tokens) instead of all 100 tool schemas (~15k tokens). Each skill spawns an isolated subagent with only its own tools.
LTI 1.3 launch so Moodle can embed Soliplex as an external tool (cookieless OIDC, HMAC-signed state and session tokens, optional room picker UI).
Workplace-specific surface: courses, users, cohorts, certifications, programs, dynamic rules, organisation (departments/positions/jobs/managers), Report Builder, UTM/advanced-completion reports, Workplace import/export.

Architecture highlights

src/soliplex/moodle/agent.py — moodle_tools_agent_factory wires a SoliplexSkillToolset in subagent mode plus a router prompt (MOODLE_ROUTER_PROMPT) with concrete dispatch examples and write-confirmation reconstruction guidance.
src/soliplex/moodle/skills.py — 7 skill builders (build_courses_skill, _users_, _organisation_, _certifications_, _programs_, _rules_, _reporting_) with 100 async tool closures over a shared MoodleClient. A module-level @_moodle_tool decorator centralises the MoodleAPIError/httpx.HTTPError → JSON-error pattern (~200 lines saved over per-tool try/except).
src/soliplex/moodle/client.py — async httpx wrapper around Moodle's REST Web Services, persistent client with 30s timeout, 100 public methods.
src/soliplex/moodle/models.py — ~48 Pydantic models for Moodle response shapes.
src/soliplex/views/lti.py — /lti/login + /lti/launch endpoints, with inline CSS/HTML/JS constants for the chat + picker pages. Cookieless: state nonce is HMAC-signed; session token is embedded in the served HTML.
src/soliplex/lti/{nonce,validation,platform,session}.py — JWT validation (RS256/JWKS), three-tier room resolution (target_link_uri → course_room_map → default_room_id), session-token claims extraction.
src/soliplex/config/lti.py — LTIPlatformConfig dataclass.
example/rooms/moodle-tools/room_config.yaml — wires the factory agent with gpt-oss:latest on Ollama; pluggable provider via extra_config.
example/lti/config.yaml — example platform registration.

Two-stage write confirmation runs through the router: skill subagent returns a preview JSON; router renders it and asks for confirmation; on "yes" the router rebuilds the full tool call (parameters from the preview) and re-dispatches with confirmed=True. Helped by a strengthened _CONFIRM_INSTRUCTIONS prompt that suppresses chain-of-thought preamble.

Test plan

pytest — 4,589 tests pass at 100% coverage (25,852 stmts, 0 missing)
ruff check src/ tests/ — clean
ruff format --check src/ tests/ — 162 files already formatted
57-scenario MOODLE.md smoke test — earlier pass against gpt-oss:latest + a local Moodle Workplace 5.0.2 sandbox showed 47 PASS / 7 PARTIAL / 3 FAIL with every non-pass attributed to model-behaviour quirks (gpt-oss hallucinating on empty tool results, same-thread preview shortcuts) or pre-existing seed-data gaps. Not re-run after the recent B1 (get_team_members filter), SF1 (update_courses / enrol_users warnings-as-errors), SF2 (resolver ambiguity), action-key sweep across 19 write previews, N1 (LTI deployment_id error), N2 (get_course_completion_overview parallelisation), or pydantic.ValidationError fixes. Re-verification against the current HEAD is required before flipping Draft → Ready.
LTI launch verified end-to-end: Moodle iframe → room picker → chat with markdown table response (B1/SF1 spot-check on current HEAD: Engineering query → clean Alice+Bob table; cyber101 rename → real warning surfaced, course unchanged on disk)
Re-tested after each rebase against upstream/main (last rebase: HEAD b5643f4c onto upstream/main 16d4982)

Submitting as a draft for now while reviewers triage scope.

Sonnet review of PR soliplex#968 flagged three blockers and six should-fixes. All addressed here, verified against a live Moodle Workplace sandbox during full regression testing. B1: async validate_id_token + module-level JWKS client cache - anyio.to_thread.run_sync wraps the blocking PyJWKClient lookup - Module-level dict + lazy anyio.Lock cache PyJWKClient instances by JWKS URL so PyJWT's cache_keys=True actually survives across requests (verified: 5 launches => 1 certs.php fetch on Moodle). B2: room_id sanitization + JSON-config-tag pattern - resolve_room_id validates extracted room segments against [a-zA-Z0-9_-]+; XSS-attempt target_link_uri falls through to default (verified: no <script> in rendered body). - Per-launch state replaces f-string-into-JS interpolation with a <script id="lti-config" type="application/json"> tag plus a </ escape, eliminating the regular JS-string injection surface. B3: nonce-seen cache to prevent state replay - lti.nonce.consume_nonce() module-level dict, bounded at 10k entries, lazy expiry sweep on each call. Replay returns 400 LTI_NONCE_REPLAY (verified: second POST of same id_token+state rejected). S1: platform-bound session tokens - mint_session_token embeds _platform_id; authn.authenticate uses peek_platform_id + single-platform lookup so a short-TTL platform's token cannot live longer than its TTL when a longer- TTL platform is also registered. S2: MoodleClient.aclose wired to FastAPI lifespan - InstallationConfig.register_cleanup + Installation lifespan AsyncExitStack push_async_callback. Verified: clean shutdown log, no "httpx.AsyncClient was never closed" warning. S3: SRI on marked.js - Pinned marked@15.0.7 with sha384 integrity + crossorigin anonymous. Verified: hash matches live CDN content. S4: expanded CSP - default-src 'self'; script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; connect-src 'self'; img-src 'self' data:; frame-ancestors {platform.issuer}. All six directives present on live response. S5: reject empty deployment_ids at config load - LTIPlatformConfig.from_yaml raises FromYamlException when the list is empty or missing entirely. S6+P4: remove dead _preview helper and _CONFIRM_INSTRUCTIONS_* constants.

… warnings; surface resolver ambiguity A third-pass review of PR soliplex#968 (fresh-context Sonnet, no shared state with the prior two passes turned up one user-visible bug and two should-fix patterns that the earlier reviews missed. All three close honestly here so the PR can flip Draft -> Ready. B1 — get_team_members(department=Engineering) returned EVERY staff member with a Workplace job, not just Engineering's. Two layers of breakage, both fixed: (a) client.get_department_members accepted departmentid / positionid integer params but never forwarded them to the remote core_reportbuilder_retrieve_system_report call. The sole client-side filter (skip rows with empty dept name when departmentid is set) only filtered out rows missing a department altogether — it let through every row from every OTHER department. Signature simplified to a pure fetch all jobs reader with just the existing user-name search filter; the broken half-filter is gone. (b) skills.get_team_members now applies the missing post-fetch exact-name filter using the resolved Department / Position's canonical .name (case-insensitive) — only rows whose departmentname / positionname match the resolved target survive. Exact match, not substring, so e.g. Engineering doesn't catch Engineering Operations. Fourth review's regression smoke graded by LLM output text which happened to mention the right names — the actual rows returned were never inspected. SF1 — core_course_update_courses returns null on success and {warnings: [...]} on rejection (shortname collision, capability error, etc.). The wrapper reported success regardless, mirroring the R12 200 + warnings = silent failure pattern that was previously fixed for the four org write endpoints but never propagated to course updates. client.update_courses now raises MoodleAPIError on a populated warnings array, routed through the existing _warnings_message() helper. The decorator surfaces the raise as {status: error, ...} at the skill layer. Return- type annotation (-> None) is now honest (closes N1). Audit pass turned up one additional offender within the +2 endpoint cap: client.enrol_users had the same warnings-as-failure shape (enrol_manual_enrol_users returns {warnings: [{warningcode: alreadyenrolled, ...}]} when the user is already in the course). Same R12 fix applied there; the enrol_users skill wrapper drops its result.get(warnings, []) inspection since the client now raises. SF2 — get_team_members's department / position lookup used next(...) over the matches generator, so first match wins on ambiguity instead of surfacing the conflict. Replaced with the _resolve_rule_id pattern: collect all matches, branch on count (0 -> no match error, >1 -> multiple matches error with the matching candidates listed by name + idnumber, 1 -> proceed). Both department and position resolvers updated. Tests: - test_client.test_get_department_members_returns_all_rows (replaces the dept_position_filter test) — asserts both Eng + Ops rows return without any client-side filtering. - test_client.test_update_courses_raises_on_warnings (replaces _with_warnings) — asserts pytest.raises with the shortname-collision warning text preserved. - test_client.test_enrol_users_raises_on_warnings (renamed) — asserts MoodleAPIError(match=already enrolled). - test_agent.test_get_team_members_filters_to_requested_department — mocks list_departments + system report responses via side_effect; asserts Alice (Engineering) survives but Bob (Operations) is filtered out. - test_agent.test_get_team_members_filters_to_requested_position — same shape for positions. - test_agent.test_get_team_members_ambiguous_department — two depts both match the query (one by name, one by idnumber); asserts status:error response with the matches list. - test_agent.test_get_team_members_ambiguous_position — same shape for positions. - test_agent.test_update_course_tool_propagates_warnings — verifies wrapper output is status:error with the shortnametaken message text preserved end-to-end. - test_agent.test_enrol_users_tool_propagates_warnings — verifies wrapper output is status:error with the alreadyenrolled message preserved. 100% coverage retained on moodle/client.py and moodle/skills.py. Full suite: 4432 passed. ruff check + ruff format --check clean. Live verification (per plan, deferred to the next /amia_start session): in the Flutter moodle-tools room, ask Who is in the Engineering department? then Who is in the Operations department? — outputs MUST differ (today they don't); then ask the agent to rename course 2 to shortname (taken by course 3) and confirm the agent reports failure with the Moodle warning message, not a green checkmark. EOF )

Sonnet review of PR soliplex#968 flagged three blockers and six should-fixes. All addressed here, verified against a live Moodle Workplace sandbox during full regression testing. B1: async validate_id_token + module-level JWKS client cache - anyio.to_thread.run_sync wraps the blocking PyJWKClient lookup - Module-level dict + lazy anyio.Lock cache PyJWKClient instances by JWKS URL so PyJWT's cache_keys=True actually survives across requests (verified: 5 launches => 1 certs.php fetch on Moodle). B2: room_id sanitization + JSON-config-tag pattern - resolve_room_id validates extracted room segments against [a-zA-Z0-9_-]+; XSS-attempt target_link_uri falls through to default (verified: no <script> in rendered body). - Per-launch state replaces f-string-into-JS interpolation with a <script id="lti-config" type="application/json"> tag plus a </ escape, eliminating the regular JS-string injection surface. B3: nonce-seen cache to prevent state replay - lti.nonce.consume_nonce() module-level dict, bounded at 10k entries, lazy expiry sweep on each call. Replay returns 400 LTI_NONCE_REPLAY (verified: second POST of same id_token+state rejected). S1: platform-bound session tokens - mint_session_token embeds _platform_id; authn.authenticate uses peek_platform_id + single-platform lookup so a short-TTL platform's token cannot live longer than its TTL when a longer- TTL platform is also registered. S2: MoodleClient.aclose wired to FastAPI lifespan - InstallationConfig.register_cleanup + Installation lifespan AsyncExitStack push_async_callback. Verified: clean shutdown log, no "httpx.AsyncClient was never closed" warning. S3: SRI on marked.js - Pinned marked@15.0.7 with sha384 integrity + crossorigin anonymous. Verified: hash matches live CDN content. S4: expanded CSP - default-src 'self'; script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; connect-src 'self'; img-src 'self' data:; frame-ancestors {platform.issuer}. All six directives present on live response. S5: reject empty deployment_ids at config load - LTIPlatformConfig.from_yaml raises FromYamlException when the list is empty or missing entirely. S6+P4: remove dead _preview helper and _CONFIRM_INSTRUCTIONS_* constants.

… warnings; surface resolver ambiguity A third-pass review of PR soliplex#968 (fresh-context Sonnet, no shared state with the prior two passes turned up one user-visible bug and two should-fix patterns that the earlier reviews missed. All three close honestly here so the PR can flip Draft -> Ready. B1 — get_team_members(department=Engineering) returned EVERY staff member with a Workplace job, not just Engineering's. Two layers of breakage, both fixed: (a) client.get_department_members accepted departmentid / positionid integer params but never forwarded them to the remote core_reportbuilder_retrieve_system_report call. The sole client-side filter (skip rows with empty dept name when departmentid is set) only filtered out rows missing a department altogether — it let through every row from every OTHER department. Signature simplified to a pure fetch all jobs reader with just the existing user-name search filter; the broken half-filter is gone. (b) skills.get_team_members now applies the missing post-fetch exact-name filter using the resolved Department / Position's canonical .name (case-insensitive) — only rows whose departmentname / positionname match the resolved target survive. Exact match, not substring, so e.g. Engineering doesn't catch Engineering Operations. Fourth review's regression smoke graded by LLM output text which happened to mention the right names — the actual rows returned were never inspected. SF1 — core_course_update_courses returns null on success and {warnings: [...]} on rejection (shortname collision, capability error, etc.). The wrapper reported success regardless, mirroring the R12 200 + warnings = silent failure pattern that was previously fixed for the four org write endpoints but never propagated to course updates. client.update_courses now raises MoodleAPIError on a populated warnings array, routed through the existing _warnings_message() helper. The decorator surfaces the raise as {status: error, ...} at the skill layer. Return- type annotation (-> None) is now honest (closes N1). Audit pass turned up one additional offender within the +2 endpoint cap: client.enrol_users had the same warnings-as-failure shape (enrol_manual_enrol_users returns {warnings: [{warningcode: alreadyenrolled, ...}]} when the user is already in the course). Same R12 fix applied there; the enrol_users skill wrapper drops its result.get(warnings, []) inspection since the client now raises. SF2 — get_team_members's department / position lookup used next(...) over the matches generator, so first match wins on ambiguity instead of surfacing the conflict. Replaced with the _resolve_rule_id pattern: collect all matches, branch on count (0 -> no match error, >1 -> multiple matches error with the matching candidates listed by name + idnumber, 1 -> proceed). Both department and position resolvers updated. Tests: - test_client.test_get_department_members_returns_all_rows (replaces the dept_position_filter test) — asserts both Eng + Ops rows return without any client-side filtering. - test_client.test_update_courses_raises_on_warnings (replaces _with_warnings) — asserts pytest.raises with the shortname-collision warning text preserved. - test_client.test_enrol_users_raises_on_warnings (renamed) — asserts MoodleAPIError(match=already enrolled). - test_agent.test_get_team_members_filters_to_requested_department — mocks list_departments + system report responses via side_effect; asserts Alice (Engineering) survives but Bob (Operations) is filtered out. - test_agent.test_get_team_members_filters_to_requested_position — same shape for positions. - test_agent.test_get_team_members_ambiguous_department — two depts both match the query (one by name, one by idnumber); asserts status:error response with the matches list. - test_agent.test_get_team_members_ambiguous_position — same shape for positions. - test_agent.test_update_course_tool_propagates_warnings — verifies wrapper output is status:error with the shortnametaken message text preserved end-to-end. - test_agent.test_enrol_users_tool_propagates_warnings — verifies wrapper output is status:error with the alreadyenrolled message preserved. 100% coverage retained on moodle/client.py and moodle/skills.py. Full suite: 4432 passed. ruff check + ruff format --check clean. Live verification (per plan, deferred to the next /amia_start session): in the Flutter moodle-tools room, ask Who is in the Engineering department? then Who is in the Operations department? — outputs MUST differ (today they don't); then ask the agent to rename course 2 to shortname (taken by course 3) and confirm the agent reports failure with the Moodle warning message, not a green checkmark. EOF )

A fourth-pass Sonnet review of PR soliplex#968 flagged that several write-tool preview dicts in skills.py omit the "action" field. The router prompt (agent.py:112-134) tells the router LLM to read "action" to identify which tool to invoke on the confirmation round-trip; without it the router has to infer the tool name from the prose preview string alone, and the router- prompt example only hard-codes delete_department -- so for any other tool the LLM is guessing. Reviewer flagged 7 sites (5 deletes + 2 updates). Audit of every preview-returning site in skills.py turned up 19 without "action" -- the rest of the surface already had it. Fixed all 19 in one sweep so the entire class of router-mis-routing risk is closed rather than leaving 12 latent. Sites updated: Organisation skill (7): update_department, delete_department, create_position, update_position, delete_position, delete_job, unassign_manager. Certifications skill (3): delete_certification, restore_certification, bulk_deallocate_certification_users. Programs skill (7): archive_program, restore_program, delete_program, duplicate_program, update_program_visibility, bulk_deallocate_program_users, bulk_reset_program_progress. Reporting skill (2): delete_export, delete_import. Additive change: every existing test still passes, every existing happy-path flow is unaffected, the new key is only read on the confirmation round-trip. Added a focused regression test test_delete_department_preview_includes_action that asserts the canonical example (delete_department, the one the router prompt calls out by name) returns a preview dict with the correct action string. Combined with a grep-driven verification sweep (every "preview": line in skills.py now has a matching "action": within the same dict), this guards against the systematic gap recurring. 4,433 unit tests pass at 100% coverage. ruff check + format check clean.

…urse completion overview Two NIT-grade fixes from the fourth-pass review of PR soliplex#968. Both low-impact individually but together close out every finding from the four review passes. N1 -- LTI deployment-id error message ====================================== When an id_token arrives without the deployment_id claim, ``payload.get(LTI_CLAIM_DEPLOYMENT_ID)`` returns None and ``check_deployment`` raises ``InvalidLTIDeployment(None, platform_id)``. The exception's ``__init__`` formats the message as "Invalid deployment_id None for platform 'moodle'" -- confusing operator error text when the actual problem is a missing claim. Fix: ``check_deployment`` now branches on None and raises with ``reason="missing deployment_id claim in id_token"``, surfaced through a new optional ``reason`` kwarg on ``InvalidLTIDeployment``. Existing call sites unaffected (reason defaults to None, message format unchanged for the true mismatch case). Signature widened from ``str`` to ``str | None`` to match what ``views/lti.py:199`` actually passes. Added ``test_missing_deployment_claim`` that asserts the 400 response contains the new phrasing and explicitly does NOT contain "None". N2 -- parallel per-user completion lookup ========================================== ``get_course_completion_overview`` (skills.py:683) fetched the enrolled-user list then iterated sequentially, awaiting ``get_course_completion_status`` once per user. For a course with 100 enrolled users at ~200ms/call that is 20 seconds of serial round-trips -- close to the MoodleClient 30s timeout and a real latency cliff for large courses. Fix: gather the per-user calls with ``asyncio.gather`` after the initial enrolment query. Result order is preserved by gather, so output stability is unchanged. Failed per-user lookups still return ``completed: null`` -- the helper closure catches ``MoodleAPIError`` / ``httpx.HTTPError`` and returns None instead of letting the exception abort the whole gather. Worst-case latency now bounded by the slowest single call (not the sum), turning a 20s sequential walk into a ~200ms parallel fan-out. The two existing tests (``test_get_course_completion_overview_tool`` and ``test_get_course_completion_overview_per_user_error``) pass unchanged because they assert on counts/totals, not on call order. 4,435 unit tests pass at 100% coverage. ruff check + format check clean.

Fifth-pass Sonnet review of PR soliplex#968 flagged a residual gap from the prior commit (1623eb2) that parallelised get_course_completion_overview with asyncio.gather. The new _fetch_status closure catches (MoodleAPIError, httpx.HTTPError) and returns None on failure -- but the path also runs the per-user payload through CompletionStatus.model_validate(...). If Moodle ever returns a structurally unexpected per-user record (missing completionstatus key, mistyped fields, etc.), pydantic raises ValidationError -- a ValueError subclass, NOT an httpx or Moodle exception -- which escapes _fetch_status uncaught, escapes asyncio.gather (no return_exceptions=True), and escapes _moodle_tool's narrower error-catching surface. The entire overview call would fail with an unhandled 500 instead of marking the offending user as completed: null. Fix: widen the except tuple in _fetch_status to also catch pydantic.ValidationError. Same return-None-on-failure contract preserves the existing test for the httpx/MoodleAPIError path and the gather invariants. Low-probability failure mode (requires Moodle to return a malformed per-user shape) but trivial guard. Added test_get_course_completion_overview_validation_error that feeds a malformed completion response and asserts the user appears with completed=None instead of the whole call blowing up. 4,436 unit tests pass at 100% coverage. ruff check + format check clean.

Sonnet review of PR soliplex#968 flagged three blockers and six should-fixes. All addressed here, verified against a live Moodle Workplace sandbox during full regression testing. B1: async validate_id_token + module-level JWKS client cache - anyio.to_thread.run_sync wraps the blocking PyJWKClient lookup - Module-level dict + lazy anyio.Lock cache PyJWKClient instances by JWKS URL so PyJWT's cache_keys=True actually survives across requests (verified: 5 launches => 1 certs.php fetch on Moodle). B2: room_id sanitization + JSON-config-tag pattern - resolve_room_id validates extracted room segments against [a-zA-Z0-9_-]+; XSS-attempt target_link_uri falls through to default (verified: no <script> in rendered body). - Per-launch state replaces f-string-into-JS interpolation with a <script id="lti-config" type="application/json"> tag plus a </ escape, eliminating the regular JS-string injection surface. B3: nonce-seen cache to prevent state replay - lti.nonce.consume_nonce() module-level dict, bounded at 10k entries, lazy expiry sweep on each call. Replay returns 400 LTI_NONCE_REPLAY (verified: second POST of same id_token+state rejected). S1: platform-bound session tokens - mint_session_token embeds _platform_id; authn.authenticate uses peek_platform_id + single-platform lookup so a short-TTL platform's token cannot live longer than its TTL when a longer- TTL platform is also registered. S2: MoodleClient.aclose wired to FastAPI lifespan - InstallationConfig.register_cleanup + Installation lifespan AsyncExitStack push_async_callback. Verified: clean shutdown log, no "httpx.AsyncClient was never closed" warning. S3: SRI on marked.js - Pinned marked@15.0.7 with sha384 integrity + crossorigin anonymous. Verified: hash matches live CDN content. S4: expanded CSP - default-src 'self'; script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; connect-src 'self'; img-src 'self' data:; frame-ancestors {platform.issuer}. All six directives present on live response. S5: reject empty deployment_ids at config load - LTIPlatformConfig.from_yaml raises FromYamlException when the list is empty or missing entirely. S6+P4: remove dead _preview helper and _CONFIRM_INSTRUCTIONS_* constants.

… warnings; surface resolver ambiguity A third-pass review of PR soliplex#968 (fresh-context Sonnet, no shared state with the prior two passes turned up one user-visible bug and two should-fix patterns that the earlier reviews missed. All three close honestly here so the PR can flip Draft -> Ready. B1 — get_team_members(department=Engineering) returned EVERY staff member with a Workplace job, not just Engineering's. Two layers of breakage, both fixed: (a) client.get_department_members accepted departmentid / positionid integer params but never forwarded them to the remote core_reportbuilder_retrieve_system_report call. The sole client-side filter (skip rows with empty dept name when departmentid is set) only filtered out rows missing a department altogether — it let through every row from every OTHER department. Signature simplified to a pure fetch all jobs reader with just the existing user-name search filter; the broken half-filter is gone. (b) skills.get_team_members now applies the missing post-fetch exact-name filter using the resolved Department / Position's canonical .name (case-insensitive) — only rows whose departmentname / positionname match the resolved target survive. Exact match, not substring, so e.g. Engineering doesn't catch Engineering Operations. Fourth review's regression smoke graded by LLM output text which happened to mention the right names — the actual rows returned were never inspected. SF1 — core_course_update_courses returns null on success and {warnings: [...]} on rejection (shortname collision, capability error, etc.). The wrapper reported success regardless, mirroring the R12 200 + warnings = silent failure pattern that was previously fixed for the four org write endpoints but never propagated to course updates. client.update_courses now raises MoodleAPIError on a populated warnings array, routed through the existing _warnings_message() helper. The decorator surfaces the raise as {status: error, ...} at the skill layer. Return- type annotation (-> None) is now honest (closes N1). Audit pass turned up one additional offender within the +2 endpoint cap: client.enrol_users had the same warnings-as-failure shape (enrol_manual_enrol_users returns {warnings: [{warningcode: alreadyenrolled, ...}]} when the user is already in the course). Same R12 fix applied there; the enrol_users skill wrapper drops its result.get(warnings, []) inspection since the client now raises. SF2 — get_team_members's department / position lookup used next(...) over the matches generator, so first match wins on ambiguity instead of surfacing the conflict. Replaced with the _resolve_rule_id pattern: collect all matches, branch on count (0 -> no match error, >1 -> multiple matches error with the matching candidates listed by name + idnumber, 1 -> proceed). Both department and position resolvers updated. Tests: - test_client.test_get_department_members_returns_all_rows (replaces the dept_position_filter test) — asserts both Eng + Ops rows return without any client-side filtering. - test_client.test_update_courses_raises_on_warnings (replaces _with_warnings) — asserts pytest.raises with the shortname-collision warning text preserved. - test_client.test_enrol_users_raises_on_warnings (renamed) — asserts MoodleAPIError(match=already enrolled). - test_agent.test_get_team_members_filters_to_requested_department — mocks list_departments + system report responses via side_effect; asserts Alice (Engineering) survives but Bob (Operations) is filtered out. - test_agent.test_get_team_members_filters_to_requested_position — same shape for positions. - test_agent.test_get_team_members_ambiguous_department — two depts both match the query (one by name, one by idnumber); asserts status:error response with the matches list. - test_agent.test_get_team_members_ambiguous_position — same shape for positions. - test_agent.test_update_course_tool_propagates_warnings — verifies wrapper output is status:error with the shortnametaken message text preserved end-to-end. - test_agent.test_enrol_users_tool_propagates_warnings — verifies wrapper output is status:error with the alreadyenrolled message preserved. 100% coverage retained on moodle/client.py and moodle/skills.py. Full suite: 4432 passed. ruff check + ruff format --check clean. Live verification (per plan, deferred to the next /amia_start session): in the Flutter moodle-tools room, ask Who is in the Engineering department? then Who is in the Operations department? — outputs MUST differ (today they don't); then ask the agent to rename course 2 to shortname (taken by course 3) and confirm the agent reports failure with the Moodle warning message, not a green checkmark. EOF )

A fourth-pass Sonnet review of PR soliplex#968 flagged that several write-tool preview dicts in skills.py omit the "action" field. The router prompt (agent.py:112-134) tells the router LLM to read "action" to identify which tool to invoke on the confirmation round-trip; without it the router has to infer the tool name from the prose preview string alone, and the router- prompt example only hard-codes delete_department -- so for any other tool the LLM is guessing. Reviewer flagged 7 sites (5 deletes + 2 updates). Audit of every preview-returning site in skills.py turned up 19 without "action" -- the rest of the surface already had it. Fixed all 19 in one sweep so the entire class of router-mis-routing risk is closed rather than leaving 12 latent. Sites updated: Organisation skill (7): update_department, delete_department, create_position, update_position, delete_position, delete_job, unassign_manager. Certifications skill (3): delete_certification, restore_certification, bulk_deallocate_certification_users. Programs skill (7): archive_program, restore_program, delete_program, duplicate_program, update_program_visibility, bulk_deallocate_program_users, bulk_reset_program_progress. Reporting skill (2): delete_export, delete_import. Additive change: every existing test still passes, every existing happy-path flow is unaffected, the new key is only read on the confirmation round-trip. Added a focused regression test test_delete_department_preview_includes_action that asserts the canonical example (delete_department, the one the router prompt calls out by name) returns a preview dict with the correct action string. Combined with a grep-driven verification sweep (every "preview": line in skills.py now has a matching "action": within the same dict), this guards against the systematic gap recurring. 4,433 unit tests pass at 100% coverage. ruff check + format check clean.

…urse completion overview Two NIT-grade fixes from the fourth-pass review of PR soliplex#968. Both low-impact individually but together close out every finding from the four review passes. N1 -- LTI deployment-id error message ====================================== When an id_token arrives without the deployment_id claim, ``payload.get(LTI_CLAIM_DEPLOYMENT_ID)`` returns None and ``check_deployment`` raises ``InvalidLTIDeployment(None, platform_id)``. The exception's ``__init__`` formats the message as "Invalid deployment_id None for platform 'moodle'" -- confusing operator error text when the actual problem is a missing claim. Fix: ``check_deployment`` now branches on None and raises with ``reason="missing deployment_id claim in id_token"``, surfaced through a new optional ``reason`` kwarg on ``InvalidLTIDeployment``. Existing call sites unaffected (reason defaults to None, message format unchanged for the true mismatch case). Signature widened from ``str`` to ``str | None`` to match what ``views/lti.py:199`` actually passes. Added ``test_missing_deployment_claim`` that asserts the 400 response contains the new phrasing and explicitly does NOT contain "None". N2 -- parallel per-user completion lookup ========================================== ``get_course_completion_overview`` (skills.py:683) fetched the enrolled-user list then iterated sequentially, awaiting ``get_course_completion_status`` once per user. For a course with 100 enrolled users at ~200ms/call that is 20 seconds of serial round-trips -- close to the MoodleClient 30s timeout and a real latency cliff for large courses. Fix: gather the per-user calls with ``asyncio.gather`` after the initial enrolment query. Result order is preserved by gather, so output stability is unchanged. Failed per-user lookups still return ``completed: null`` -- the helper closure catches ``MoodleAPIError`` / ``httpx.HTTPError`` and returns None instead of letting the exception abort the whole gather. Worst-case latency now bounded by the slowest single call (not the sum), turning a 20s sequential walk into a ~200ms parallel fan-out. The two existing tests (``test_get_course_completion_overview_tool`` and ``test_get_course_completion_overview_per_user_error``) pass unchanged because they assert on counts/totals, not on call order. 4,435 unit tests pass at 100% coverage. ruff check + format check clean.

Fifth-pass Sonnet review of PR soliplex#968 flagged a residual gap from the prior commit (1623eb2) that parallelised get_course_completion_overview with asyncio.gather. The new _fetch_status closure catches (MoodleAPIError, httpx.HTTPError) and returns None on failure -- but the path also runs the per-user payload through CompletionStatus.model_validate(...). If Moodle ever returns a structurally unexpected per-user record (missing completionstatus key, mistyped fields, etc.), pydantic raises ValidationError -- a ValueError subclass, NOT an httpx or Moodle exception -- which escapes _fetch_status uncaught, escapes asyncio.gather (no return_exceptions=True), and escapes _moodle_tool's narrower error-catching surface. The entire overview call would fail with an unhandled 500 instead of marking the offending user as completed: null. Fix: widen the except tuple in _fetch_status to also catch pydantic.ValidationError. Same return-None-on-failure contract preserves the existing test for the httpx/MoodleAPIError path and the gather invariants. Low-probability failure mode (requires Moodle to return a malformed per-user shape) but trivial guard. Added test_get_course_completion_overview_validation_error that feeds a malformed completion response and asserts the user appears with completed=None instead of the whole call blowing up. 4,436 unit tests pass at 100% coverage. ruff check + format check clean.

Adds a new Moodle room that connects to Moodle Workplace via a standalone FastMCP server (moodle-mcp-tools) over stdio. Includes room config, secrets for MOODLE_BASE_URL and MOODLE_API_TOKEN, and a unit test for config loading.

…room Move Moodle client into soliplex as src/soliplex/moodle/ and use a factory agent with @agent.system_prompt to pre-fetch course/enrollment/completion data before every LLM call, eliminating the need for tool calling by local models. Also scrub org-specific references from codebase.

…odle room @agent.system_prompt functions are silently skipped when AG-UI provides message_history (Pydantic AI only calls _sys_parts when messages is empty). Switch to an async instructions callable which is evaluated on every request.

Adds a parallel Moodle room that exposes 4 Pydantic AI tools (list_courses, find_user, list_enrolled_users, get_completion_status) instead of pre-fetching data into the system prompt. Enables A/B comparison of tool-calling vs context-injection approaches with the same model and data.

Exclude course id=1 (site-level entry) from list_courses and context injection to prevent enrollment/completion errors on a non-real course. Update suggestion prompts to match seeded data and strengthen system prompt so the agent always calls tools before claiming data is missing.

…ng variant The tool-calling room handles Moodle queries well via gpt-oss, so the context-injection variant is no longer needed. Removes the moodle room config, its test, and the related factory/prompt code from agent.py.

…module Move duplicated provider/model construction logic from moodle/agent.py into agents.get_model_from_factory_config(), remove unused get_activities_completion_status() and related models, narrow exception handling from bare Exception to (MoodleAPIError, httpx.HTTPError), and fix stale "(Tool Calling)" test assertion.

The ./rooms/moodle directory was removed in an earlier refactor but its reference remained, which would cause a startup error. Only ./rooms/moodle-tools is valid.

…dling - Add missing skill_toolset_config parameter to moodle_tools_agent_factory to match the AgentFactory protocol. - Wrap all four tool functions in try/except for consistent error handling (previously only get_completion_status caught errors). - Add optional verify parameter to MoodleClient for environments with custom certificate authorities, passed through from extra_config as moodle_verify_ssl.

- Add class docstrings to all Pydantic models referencing the Moodle web service functions that return them. - Add method docstrings to all public MoodleClient methods. - Add comment explaining MAX_RESULTS purpose. - Expand moodle_tools_agent_factory docstring with extra_config keys and required Moodle web service functions. - Populate __init__.py with module docstring and public exports. - Add README.md integration guide covering prerequisites, token generation, Soliplex configuration, custom CA certificates, and limitations.

Add test_moodle_agent.py with 11 tests covering the factory function (returns Agent, accepts skill_toolset_config, registers 4 tools) and all four tool functions (happy path + error handling for each). Remove agent.py from coverage omit in pyproject.toml now that tests are in place. Fix two line-too-long docstrings caught by ruff.

Add course content, compliance reporting, groups/cohorts, grading, calendar, and write operations (enrol/message with confirmation). Certifications feature skipped pending Moodle Workplace API access.

…Workplace room

…port Integrate LTI 1.3 OIDC login/launch flow with session token validation as an authentication fallback. Add configurable thinking-field support for local models (e.g. Gemma 4) via pydantic-ai model profiles.

… Moodle modules

Extract shared helpers (_to_dict, _HTML_TAG_RE, _CONFIRM_INSTRUCTIONS, _EXPORTER_MAP) to eliminate duplicate patterns across client and skills. Move LTI templates to lti_templates.py, consolidate test helpers into conftest.py, remove 6 dead model classes and 3 orphan log constants. Fix auth_disabled to only gate on OIDC so --no-auth-mode works with LTI platforms configured.

Re-add httpx dependency to lock file after rebasing onto upstream/main (haiku-rag-slim 0.44, haiku-skills 0.16).

Upstream broke the 'config' package re-exports (PR soliplex#951) and the rebase moved 'get_model_from_factory_config' to 'config.agents' (alongside 'get_model_from_config'). Update test mock paths and direct imports of 'config.InstallationConfig' / 'config.RoomConfig' / 'config.FactoryAgentConfig' to use their submodules ('config.installation', 'config.rooms', 'config.agents'). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The skill-router architecture (commit a0f0608) introduced behavioural regressions never re-verified against the original 57-scenario doc. Cleanup commit 2955a8d also silently shadowed `_CONFIRM_INSTRUCTIONS`, breaking preview JSON for write tools. This change closes both gaps. Key fixes: - skills.py: rename short-form to `_CONFIRM_INSTRUCTIONS_SHORT` so it no longer collides with the long skill-prompt suffix; update every preview-dict usage. - skills.py: enrich `SkillMetadata.description` strings so the router can find cohorts, tenants, exports, calendar, and rule-condition searches against the right skill. - skills.py: tighten docstrings on tools with permissive defaults (search_certifications, search_cohorts/competencies_for_rule, get_upcoming_events) so the agent stops asking for clarification when an empty arg returns all rows. - skills.py: clarify `assign_manager` Args + skill-prompt entry to prevent inverted manager/subordinate assignments. - skills.py: enrich `get_cohort_members` with usernames + fullnames so callers don't hallucinate user data. - skills.py: expand preview dicts for `enrol_users`, `create_user`, `create_department` to expose tabulatable top-level fields and always inject the short instructions string. - skills.py: route 'learning catalogue' queries to `get_user_learning_catalogue` instead of `browse_catalogue`. - skills.py: skill-prompt workflow guidance for cross-skill chaining (grades, completion overview, course name → ID resolution). - agent.py: rewrite MOODLE_ROUTER_PROMPT with a strong no-hallucination directive, explicit routing examples, and anti-pattern warnings; the router now consistently dispatches rather than answering data questions from memory. - example/rooms/moodle-tools/room_config.yaml: switch model to gpt-oss:latest (the model MOODLE.md was originally written for; gemma4:26b's 262K-context made every scenario take 5-10 minutes. Tests: - 100% coverage maintained (4351 passing). - Sandbox cycled fresh; 12 of 12 originally-FAILED scenarios now pass; 4 of 4 originally-PARTIAL scenarios now pass. - Known limitation: preview→confirm scenarios (28/30/33/35/37/38/39/ 45/48/51/52/53) require same-thread router context that the current skill-router architecture doesn't carry between separate user prompts; preview halves verified, confirm halves need router-level memory work outside this fix's scope. EOF )

The skill-router architecture broke confirm-half scenarios two ways: 1. drive_llm_stream awaited maybe_generate_title synchronously in the finally block, so the per-thread background task held its SQLite write transaction open for the duration of title-gen (seconds-to-minutes on a local LLM). Subsequent user prompts on the same thread - notably the "Yes, go ahead" confirm - opened their SSE stream but blocked on the writer lock and never started their agent run. Wrapping maybe_generate_title in asyncio.create_task lets drive_llm_stream return immediately, so the next prompt's agent run starts right away. 2. The skill subagent receives the router-forwarded request as a fresh prompt with no memory of the prior preview. When the request explicitly says confirmed=True, the subagent was still defaulting to confirmed=False and silently re-previewing the action while the router announced success. Rewrote the _CONFIRM_INSTRUCTIONS skill-prompt suffix to force two distinct cases (preview vs explicit confirmed=True) so the subagent honors the confirmed flag the router reconstructed from history. Also moved the title agent off gemma4:26b (262K-context, slow load) onto gpt-oss:latest so title-gen finishes before its no-longer-blocking task is cancelled. Verified end-to-end: 27 (preview) + 28 (confirm) on a fresh sandbox now writes a Security/SEC department row that Moodle's DB confirms exists.

… guidance Continued the preview->confirm flow fixes from 842f099. Two classes of issue remained on rule and update_user write tools: 1. Rule write tools (enable_rule, disable_rule, archive_rule, unarchive_rule, delete_rule, duplicate_rule, delete_rule_condition, delete_rule_outcome) returned a minimal preview JSON like {"preview": "Enable dynamic rule id=N"} with no `action` key, no parameters as top-level fields, and no `instructions` key. The router can't reconstruct a confirm request from a freeform string, so on "Yes, proceed" it forwarded the bare phrase and the subagent re-previewed (no DB write). Restructured each preview to match the create_department / create_user pattern: top-level `action`, `rule_id` (or `instanceid`), and `_CONFIRM_INSTRUCTIONS_SHORT` (or `_WARNING` for delete). 2. update_user preview embedded the changed fields inside a nested `changes` dict that the LLM rendered awkwardly (Department=1 instead of "Quality Assurance"). Promoted each changed field to a top-level key alongside `changes` so the LLM can tabulate without re-interpreting nested JSON. Also added skill-prompt guidance for moodle-rules: write operations must call the write tool itself (with confirmed=False) for the preview rather than describing the action from list_dynamic_rules / can_enable_rule output - those listing tools don't return the structured fields the router uses to reconstruct the confirm. Added a logfire instrumentation point in enable_rule so future debugging can see the actual args the LLM passed. Verified end-to-end: - 28 / 30 (create + delete dept) DB-confirmed. - 33 (duplicate program) - new program id created. - 35 / 37 / 39 (hide / archive / restore) - DB flags flipped. - 51 / 53 (create + delete user) - DB-confirmed. - 44 (enable rule preview) - now calls enable_rule(confirmed= False) and returns structured JSON. 45 / 47 / 48 are blocked by an unrelated sandbox seeding gap (the tool_dynamicrule\reportbuilder\local\systemreports\rules system report returns 0 rows in this fresh seed even though the rules exist in the DB) - agent code is correct. - 52 (update user) - preview now exposes department as a top- level key.

Drop the logfire.info call I added to enable_rule for debugging the preview->confirm flow. Behaviour verified end-to-end against the seeded Test rule (44+45 enable, 47+48 disable) - DB-confirmed state transitions on each pair.

Two improvements driven by smoke scenario 52 review: 1. update_user now best-effort looks up the user's full name via get_users_by_field so the preview can show "Documentation User (soliplex#7)" instead of bare "soliplex#7". The lookup is wrapped in a broad except so any failure (API down, JSON oddity, validation error) silently falls back to the bare ID. The new `user` field carries the label. 2. The router prompt and moodle-users skill prompt now explicitly distinguish the user.department free-form string field from the Workplace organisation department entity. Previous runs of scenario 52 dispatched to moodle-organisation and tried to look up or create a department row in tool_organisation_department; the new guidance keeps such requests on moodle-users where they belong. 3. Skill prompt now requires the rendered preview heading to include the user's name + ID so the doc's expected format ("User ID 8") survives the LLM's table rendering. Tests cover the success path (user found, preview includes fullname), the empty-list fallback (user lookup returns []), and the existing no-fields / bad-id paths.

A two-track audit of the branch (dead code / refactor candidates + upstream pattern alignment) surfaced both real technical debt and several stylistic divergences that would draw review comments. This commit folds the fixes into three phases. Phase 1 -- PR blockers: - MoodleClient now constructs a persistent httpx.AsyncClient with a 30s default timeout (was instantiating one per call). Adds aclose() for graceful teardown. - Drop the empty /lti/jwks endpoint that returned {"keys": []} -- it silently fails any platform attempting JWK validation against us. - Drop unused jwcrypto direct dependency (still pulled transitively via python-keycloak; saves install weight). - Fix AGENTS.md reference to deleted CLAUDE.md. Phase 2 -- alignment + targeted refactor: - moodle_tools_agent_factory now matches the AgentFactory protocol signature (kw-only args after agent_config, returns SoliplexAgent). - Re-merge lti_templates.py back into views/lti.py to match how every other view module in the project keeps inline HTML. - Move tests/unit/test_moodle_*.py into tests/unit/moodle/ to match the agui/, config/, skills/ subdirectory layout. - Replace 102 try/except (MoodleAPIError, httpx.HTTPError) blocks with a single @_moodle_tool decorator (-200 LOC in skills.py). - Introduce _preview() helper for the standard preview-JSON shape. - Promote _resolve_rule_id to module level (was nested closure). - Hoist _strip_html and _li_texts -> _parse_li_texts to module level. - Switch from raw haiku.skills.SkillToolset to SoliplexSkillToolset for run-context state injection consistency with the rest of the project. - Remove 7 dead public methods on MoodleClient (and their tests) that had zero callers anywhere in src/ or tests/. - Rename tests/unit/_test_features.py -> _fixtures.py so its real role as a shared fixture module is clear (the _test_ prefix was fooling pytest into skipping it as a test file). - Document the [tool.coverage.run].omit list in pyproject.toml. Phase 3 -- polish (preview-leak fix): - Drop the "instructions" field from preview JSON entirely. The string ("Present this to the user and ask for confirmation...") was being rendered verbatim into user-facing tables (smoke scenarios 29, 34, 44). The skill prompt's Case A / Case B already covers preview-vs-confirm handling, so the field was redundant. - Strengthen _CONFIRM_INSTRUCTIONS with explicit "no preamble, no narration" output rules so chain-of-thought style summaries ("We identified...", "User wants to...") stop leaking before the preview table. Verification: - 4,342 unit tests pass; 100% coverage maintained. - ruff check + format clean. - Re-ran all 57 MOODLE.md smoke scenarios via REST against gpt-oss:latest: 47 PASS / 7 PARTIAL / 3 FAIL. None of the non-passes are caused by this refactor -- they are gpt-oss hallucination on empty tool results, sandbox state pollution from prior runs, or seed-data gaps (UTM/adv-comp report tables empty). - Re-tested the three preview-leak scenarios (29, 34, 44) after Phase 3 changes; all now produce clean table-only previews.

The gemma4:26b experiment is over -- gpt-oss:latest is the default on every front (router, skill subagents, title agent), and a separate sweep of all 57 MOODLE.md scenarios confirms it routes correctly without any thinking-token plumbing. The custom _resolve_thinking_profile helper that patched supports_thinking on per-model profiles is also redundant now that pydantic-ai's Ollama provider sets openai_chat_thinking_field='reasoning' unconditionally. Strip every adjustment that was made for the experiment: - example/rooms/moodle-tools/room_config.yaml: drop thinking_field: "reasoning" from extra_config. - example/minimal.yaml: change default_chat from gemma4:26b to gpt-oss:latest, and drop the matching extra_body: {think: false} blocks (one on default_chat, one on the title agent) -- they were only there to suppress gemma's reasoning tokens. - src/soliplex/config/agents.py: delete _resolve_thinking_profile and both call sites (ollama and openai branches), drop the thinking_field paragraph from get_model_from_factory_config's docstring, and remove the now-unused openai_profiles import. - tests/unit/config/test_config_agents.py: remove the ollama_thinking and openai_thinking parametrize entries, the thinking_field handling in the test body, the two standalone helper tests, and the dead openai_profiles import. - tests/unit/moodle/test_tools_room.py: drop thinking_field from the expected extra_config snapshot. Verification: ruff check + format clean, full pytest suite still green at 100% coverage (4,338 passing), and a smoke subset of six representative scenarios (S1, S9, S15, S27/28, S40, S46) all PASS against gpt-oss:latest with the plumbing gone. The skill prompt's no-preamble output rules in _CONFIRM_INSTRUCTIONS were originally motivated by gemma's CoT leak but help any reasoning model and intentionally remain in place. Net diff: -112 / +1 lines.

Sonnet review of PR soliplex#968 flagged three blockers and six should-fixes. All addressed here, verified against a live Moodle Workplace sandbox during full regression testing. B1: async validate_id_token + module-level JWKS client cache - anyio.to_thread.run_sync wraps the blocking PyJWKClient lookup - Module-level dict + lazy anyio.Lock cache PyJWKClient instances by JWKS URL so PyJWT's cache_keys=True actually survives across requests (verified: 5 launches => 1 certs.php fetch on Moodle). B2: room_id sanitization + JSON-config-tag pattern - resolve_room_id validates extracted room segments against [a-zA-Z0-9_-]+; XSS-attempt target_link_uri falls through to default (verified: no <script> in rendered body). - Per-launch state replaces f-string-into-JS interpolation with a <script id="lti-config" type="application/json"> tag plus a </ escape, eliminating the regular JS-string injection surface. B3: nonce-seen cache to prevent state replay - lti.nonce.consume_nonce() module-level dict, bounded at 10k entries, lazy expiry sweep on each call. Replay returns 400 LTI_NONCE_REPLAY (verified: second POST of same id_token+state rejected). S1: platform-bound session tokens - mint_session_token embeds _platform_id; authn.authenticate uses peek_platform_id + single-platform lookup so a short-TTL platform's token cannot live longer than its TTL when a longer- TTL platform is also registered. S2: MoodleClient.aclose wired to FastAPI lifespan - InstallationConfig.register_cleanup + Installation lifespan AsyncExitStack push_async_callback. Verified: clean shutdown log, no "httpx.AsyncClient was never closed" warning. S3: SRI on marked.js - Pinned marked@15.0.7 with sha384 integrity + crossorigin anonymous. Verified: hash matches live CDN content. S4: expanded CSP - default-src 'self'; script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; connect-src 'self'; img-src 'self' data:; frame-ancestors {platform.issuer}. All six directives present on live response. S5: reject empty deployment_ids at config load - LTIPlatformConfig.from_yaml raises FromYamlException when the list is empty or missing entirely. S6+P4: remove dead _preview helper and _CONFIRM_INSTRUCTIONS_* constants.

The S1 fix in the prior commit binds session tokens to the issuing platform via a _platform_id claim, then validates with that single platform's TTL. Any redeploy of an existing system would invalidate every active LTI session because tokens minted before S1 lack the claim. Bridge those tokens: when peek_platform_id returns None, validate using max_age = min(p.session_ttl for p in lti_platforms) so a legacy token cannot outlive the shortest configured TTL (preserves the S1 invariant). Log a warning containing the token's sub so operators can detect lingering legacy traffic. Remove the else-branch after the warning goes silent for a full deploy cycle. Tracked separately.

R12.1 — Integer-id leakage caused agent confusion. list_departments and list_positions exposed both the internal integer id (e.g. 1) and the external string idnumber (e.g. "ENG"). The agent sometimes picked the integer when subsequently calling create_department( parent="1"), which the Moodle API rejected because the `parent` field is the parent's idnumber, not its id. Reshape both: - Drop "id" and "parentid" from every row. - Add a resolved "parent" field containing the parent's idnumber (or "" for top-level), looked up from the FULL department list so children outside a search filter still report their parent. - Drop integer "id" from every confirmed-write success output (create_*, update_*) so the LLM doesn't learn the wrong reference key inside a single session. - Convert delete_department / delete_position to accept idnumber: the wrapper resolves it to the integer id Moodle requires internally; if no match exists, return {"status": "error", ...}. R12.2 — Confabulated success defense. Earlier observation: when _moodle_tool caught a MoodleAPIError and returned {"error": ...}, the LLM sometimes ignored the error and reported a fabricated success. - _moodle_tool's error branch now returns {"status": "error", ...}. - _moodle_tool auto-injects "status": "ok" into any JSON object returned from a tool invoked with confirmed=True (preview and read branches untouched). Tools that emit their own "status" field are passed through. - Append rule 5 to _CONFIRM_INSTRUCTIONS, the shared prompt block mixed into all 7 skill agents: explicitly forbid claiming success or inventing IDs unless the result carries "status": "ok"; require failures be reported when "status": "error". - Rule 4 cross-references "status": "ok". 100% coverage on moodle/skills.py. 541 moodle-suite tests pass. Verified end-to-end via regression sweep against the live sandbox.

Three related fixes, all rolled together because they share the same call site: S2 — Single-item-write contract documented. client._warnings_message docstring now spells out that the "empty result + warnings = error" heuristic is only correct for the single-item submissions our wrappers issue today. Each of the four call sites (create_departments, update_departments, create_positions, update_positions) gets a cross-reference comment for future maintainers. S3 — One Moodle round-trip per list_*. list_departments / list_positions used to fetch twice when a search filter was passed: once for the filtered set, once for the parent-idnumber map. Now we fetch the full list once and filter in-process. Verified live: 1 POST per invocation. R12.1 actually-works fix — real idnumbers in list output. Moodle's stock tool_organisation_get_teams_tab_filters hard-codes a SELECT list that excludes idnumber. Without it the agent has no idnumber to feed into create/update/delete writes. Switch client.get_departments / get_positions to the new local_soliplex_list_departments / _list_positions endpoints (see moodle_sandbox fca534c8a) which return idnumber from the underlying tables. B1 covered as a side-effect — delete_department / delete_position now fetch the full list and exact-match by idnumber instead of calling the server with the idnumber as a name-substring search. The B1 false-positive risk (idnumber happens to be a substring of another department's name is eliminated; the test test_delete_department_tool_not_found_exact_match_only locks the new behaviour in place. Verified end-to-end against live Moodle Workplace: list shows real idnumbers, delete by idnumber actually removes the row. 100% coverage on moodle/client.py and moodle/skills.py. Requires moodle_sandbox >= fca534c8a for the new plugin endpoints to be available; production deployment requires the local_soliplex plugin to be present (same constraint that already exists for get_department_members / get_utm_report / get_adv_comp_report). EOF )

A 57-scenario Flutter regression turned up 7 soft-spot findings — all prompt-routing or cosmetic, none regressions from the security or LTI work in this branch. Fixed in-scope per "no loose chads": F1 — Scn 41 ("How many users match X rule"): agent went to list_cohorts / get_cohort_members instead of get_rule_matching_users. Strengthened the tool docstring and added an explicit "how many users match" → get_rule_matching_users example in _RULES_PROMPT (plus router-prompt routing entry). Verified live: trace now ends at get_rule_matching_users. F2 — Scn 16 ("What competency frameworks exist?"): router routed to moodle-rules.search_competencies_for_rule (which is for rule conditions, not frameworks). Added a disambiguating routing example to MOODLE_ROUTER_PROMPT pointing competency-framework queries at moodle-programs.list_competency_frameworks. Verified live: trace ends at list_competency_frameworks. F3 — Scn 40 ("Can the X rule be enabled?"): router produced NO tool dispatch at all. Added "can X rule be enabled" → moodle- rules.can_enable_rule routing example. Verified live: trace includes can_enable_rule. F4 — Scn 13 ("Who is in the Engineering department?"): agent looped list_departments 7+ times before reaching get_team_members because get_team_members took an integer departmentid that no list-tool exposes anymore. Now get_team_members accepts ``department=<name-or-idnumber>`` and resolves internally via the local_soliplex list endpoints. Workflow guidance updated in _ORGANISATION_PROMPT to direct callers there. Verified live: trace ends at get_team_members in 1–2 calls, no loops. F5 — Cosmetic gpt-oss Harmony control tokens (``<|channel|>commentary``, ``<|message|>``, ``<|end|>``, etc.) were leaking into client-visible text deltas. Added a _sanitise_event hook in the AG-UI streaming loop that regex- strips known Harmony markers from event.delta / event.content in place — so persistence + title-gen + the client all see the sanitised text. Pattern recognises bare markers plus <|channel|> with a closed-set label (analysis | commentary | final) so it can't accidentally eat content words. F6 — Scn 29 ("Delete the Security department" by NAME): agent couldn't resolve name → idnumber because list_departments output no longer leaks integer ids. Tightened delete_department / delete_position docstrings to instruct: "If only the name is known, FIRST call list_departments to look up the idnumber." Verified live: agent now calls list_departments first then delete_department(idnumber=<resolved>). F7 — Scn 33 (Duplicate Program — Confirmed) leaves a new "Leadership Track Copy" row each test run. Test-design issue, not product. Added a cleanup note to docs/moodle/MOODLE.md. 100% coverage on moodle/skills.py, moodle/agent.py, views/agui.py. 4299 unit tests pass.

… warnings; surface resolver ambiguity A third-pass review of PR soliplex#968 (fresh-context Sonnet, no shared state with the prior two passes turned up one user-visible bug and two should-fix patterns that the earlier reviews missed. All three close honestly here so the PR can flip Draft -> Ready. B1 — get_team_members(department=Engineering) returned EVERY staff member with a Workplace job, not just Engineering's. Two layers of breakage, both fixed: (a) client.get_department_members accepted departmentid / positionid integer params but never forwarded them to the remote core_reportbuilder_retrieve_system_report call. The sole client-side filter (skip rows with empty dept name when departmentid is set) only filtered out rows missing a department altogether — it let through every row from every OTHER department. Signature simplified to a pure fetch all jobs reader with just the existing user-name search filter; the broken half-filter is gone. (b) skills.get_team_members now applies the missing post-fetch exact-name filter using the resolved Department / Position's canonical .name (case-insensitive) — only rows whose departmentname / positionname match the resolved target survive. Exact match, not substring, so e.g. Engineering doesn't catch Engineering Operations. Fourth review's regression smoke graded by LLM output text which happened to mention the right names — the actual rows returned were never inspected. SF1 — core_course_update_courses returns null on success and {warnings: [...]} on rejection (shortname collision, capability error, etc.). The wrapper reported success regardless, mirroring the R12 200 + warnings = silent failure pattern that was previously fixed for the four org write endpoints but never propagated to course updates. client.update_courses now raises MoodleAPIError on a populated warnings array, routed through the existing _warnings_message() helper. The decorator surfaces the raise as {status: error, ...} at the skill layer. Return- type annotation (-> None) is now honest (closes N1). Audit pass turned up one additional offender within the +2 endpoint cap: client.enrol_users had the same warnings-as-failure shape (enrol_manual_enrol_users returns {warnings: [{warningcode: alreadyenrolled, ...}]} when the user is already in the course). Same R12 fix applied there; the enrol_users skill wrapper drops its result.get(warnings, []) inspection since the client now raises. SF2 — get_team_members's department / position lookup used next(...) over the matches generator, so first match wins on ambiguity instead of surfacing the conflict. Replaced with the _resolve_rule_id pattern: collect all matches, branch on count (0 -> no match error, >1 -> multiple matches error with the matching candidates listed by name + idnumber, 1 -> proceed). Both department and position resolvers updated. Tests: - test_client.test_get_department_members_returns_all_rows (replaces the dept_position_filter test) — asserts both Eng + Ops rows return without any client-side filtering. - test_client.test_update_courses_raises_on_warnings (replaces _with_warnings) — asserts pytest.raises with the shortname-collision warning text preserved. - test_client.test_enrol_users_raises_on_warnings (renamed) — asserts MoodleAPIError(match=already enrolled). - test_agent.test_get_team_members_filters_to_requested_department — mocks list_departments + system report responses via side_effect; asserts Alice (Engineering) survives but Bob (Operations) is filtered out. - test_agent.test_get_team_members_filters_to_requested_position — same shape for positions. - test_agent.test_get_team_members_ambiguous_department — two depts both match the query (one by name, one by idnumber); asserts status:error response with the matches list. - test_agent.test_get_team_members_ambiguous_position — same shape for positions. - test_agent.test_update_course_tool_propagates_warnings — verifies wrapper output is status:error with the shortnametaken message text preserved end-to-end. - test_agent.test_enrol_users_tool_propagates_warnings — verifies wrapper output is status:error with the alreadyenrolled message preserved. 100% coverage retained on moodle/client.py and moodle/skills.py. Full suite: 4432 passed. ruff check + ruff format --check clean. Live verification (per plan, deferred to the next /amia_start session): in the Flutter moodle-tools room, ask Who is in the Engineering department? then Who is in the Operations department? — outputs MUST differ (today they don't); then ask the agent to rename course 2 to shortname (taken by course 3) and confirm the agent reports failure with the Moodle warning message, not a green checkmark. EOF )

A fourth-pass Sonnet review of PR soliplex#968 flagged that several write-tool preview dicts in skills.py omit the "action" field. The router prompt (agent.py:112-134) tells the router LLM to read "action" to identify which tool to invoke on the confirmation round-trip; without it the router has to infer the tool name from the prose preview string alone, and the router- prompt example only hard-codes delete_department -- so for any other tool the LLM is guessing. Reviewer flagged 7 sites (5 deletes + 2 updates). Audit of every preview-returning site in skills.py turned up 19 without "action" -- the rest of the surface already had it. Fixed all 19 in one sweep so the entire class of router-mis-routing risk is closed rather than leaving 12 latent. Sites updated: Organisation skill (7): update_department, delete_department, create_position, update_position, delete_position, delete_job, unassign_manager. Certifications skill (3): delete_certification, restore_certification, bulk_deallocate_certification_users. Programs skill (7): archive_program, restore_program, delete_program, duplicate_program, update_program_visibility, bulk_deallocate_program_users, bulk_reset_program_progress. Reporting skill (2): delete_export, delete_import. Additive change: every existing test still passes, every existing happy-path flow is unaffected, the new key is only read on the confirmation round-trip. Added a focused regression test test_delete_department_preview_includes_action that asserts the canonical example (delete_department, the one the router prompt calls out by name) returns a preview dict with the correct action string. Combined with a grep-driven verification sweep (every "preview": line in skills.py now has a matching "action": within the same dict), this guards against the systematic gap recurring. 4,433 unit tests pass at 100% coverage. ruff check + format check clean.

…urse completion overview Two NIT-grade fixes from the fourth-pass review of PR soliplex#968. Both low-impact individually but together close out every finding from the four review passes. N1 -- LTI deployment-id error message ====================================== When an id_token arrives without the deployment_id claim, ``payload.get(LTI_CLAIM_DEPLOYMENT_ID)`` returns None and ``check_deployment`` raises ``InvalidLTIDeployment(None, platform_id)``. The exception's ``__init__`` formats the message as "Invalid deployment_id None for platform 'moodle'" -- confusing operator error text when the actual problem is a missing claim. Fix: ``check_deployment`` now branches on None and raises with ``reason="missing deployment_id claim in id_token"``, surfaced through a new optional ``reason`` kwarg on ``InvalidLTIDeployment``. Existing call sites unaffected (reason defaults to None, message format unchanged for the true mismatch case). Signature widened from ``str`` to ``str | None`` to match what ``views/lti.py:199`` actually passes. Added ``test_missing_deployment_claim`` that asserts the 400 response contains the new phrasing and explicitly does NOT contain "None". N2 -- parallel per-user completion lookup ========================================== ``get_course_completion_overview`` (skills.py:683) fetched the enrolled-user list then iterated sequentially, awaiting ``get_course_completion_status`` once per user. For a course with 100 enrolled users at ~200ms/call that is 20 seconds of serial round-trips -- close to the MoodleClient 30s timeout and a real latency cliff for large courses. Fix: gather the per-user calls with ``asyncio.gather`` after the initial enrolment query. Result order is preserved by gather, so output stability is unchanged. Failed per-user lookups still return ``completed: null`` -- the helper closure catches ``MoodleAPIError`` / ``httpx.HTTPError`` and returns None instead of letting the exception abort the whole gather. Worst-case latency now bounded by the slowest single call (not the sum), turning a 20s sequential walk into a ~200ms parallel fan-out. The two existing tests (``test_get_course_completion_overview_tool`` and ``test_get_course_completion_overview_per_user_error``) pass unchanged because they assert on counts/totals, not on call order. 4,435 unit tests pass at 100% coverage. ruff check + format check clean.

Fifth-pass Sonnet review of PR soliplex#968 flagged a residual gap from the prior commit (1623eb2) that parallelised get_course_completion_overview with asyncio.gather. The new _fetch_status closure catches (MoodleAPIError, httpx.HTTPError) and returns None on failure -- but the path also runs the per-user payload through CompletionStatus.model_validate(...). If Moodle ever returns a structurally unexpected per-user record (missing completionstatus key, mistyped fields, etc.), pydantic raises ValidationError -- a ValueError subclass, NOT an httpx or Moodle exception -- which escapes _fetch_status uncaught, escapes asyncio.gather (no return_exceptions=True), and escapes _moodle_tool's narrower error-catching surface. The entire overview call would fail with an unhandled 500 instead of marking the offending user as completed: null. Fix: widen the except tuple in _fetch_status to also catch pydantic.ValidationError. Same return-None-on-failure contract preserves the existing test for the httpx/MoodleAPIError path and the gather invariants. Low-probability failure mode (requires Moodle to return a malformed per-user shape) but trivial guard. Added test_get_course_completion_overview_validation_error that feeds a malformed completion response and asserts the user appears with completed=None instead of the whole call blowing up. 4,436 unit tests pass at 100% coverage. ruff check + format check clean.

…_user A 57-scenario MOODLE.md smoke pass against the current HEAD landed at 46 PASS / 2 PARTIAL / 9 FAIL. Root-causing each failure turned up a mix of seed-data gaps (handled separately in moodle_sandbox/local/seed_data.php), test-rig limits in the smoke driver (handled separately in the new /amia_smoke_test skill), and three soliplex-side fixes captured here. After all three streams land, the same smoke pass deterministically closes 52 of 57 scenarios; the remaining five are LLM-routing variance on long mid-conversation chains (gpt-oss occasionally wanders to an adjacent skill or hallucinates IDs), not code or seed issues, and are best addressed by a model upgrade rather than further prompt tightening. Stream A changes (this commit): 1. ``_resolve_rule_id`` is now suffix-tolerant. Previously the resolver used ``needle in name`` substring matching, which fails when the LLM appends a noun-class suffix from the user's English phrasing (e.g. "the Standalone Cohort Trigger rule" -> rule_name="Standalone Cohort Trigger rule", which is NOT a substring of the stored name "Standalone Cohort Trigger"). Bidirectional match closes the case while preserving the existing 0/1/many ambiguity branching for true conflicts. Closes scenarios 40 + 41. 2. ``_REPORTING_PROMPT`` now contains an explicit anti-hallucination guard. When list_reports returns a sparse result, gpt-oss has been observed to fabricate plausible-looking report IDs (101 / 102 / 103) rather than report the seed contents. The guard says: "your response MUST contain exactly those N reports -- no inferred names, no invented IDs, no extra rows". Closes scenario 24. 3. ``MOODLE_ROUTER_PROMPT`` gains two new sections: - "Tool argument precision" -- strip " rule" and " department" noun-class suffixes from user phrasing before passing to skills. Crucially, do NOT strip " program" / " course" / " certification" suffixes because those words commonly appear in the canonical stored names ("Onboarding Program", "Cybersecurity Basics", "Workplace Safety Certification"). For those, use the exact spelling from a prior list/search tool result. - "Confirmation vs new request" distinction -- any message that BEGINS with "yes" / "yeah" / "sure" / "ok" / "go ahead" / "proceed" / "do it" / "confirm" is a CONFIRMATION of the most recent preview, even when followed by an imperative ("Yes, restore it" / "Yes, archive it"). Conversely "Now restore it" / "Then archive X" -- an imperative without an affirmative prefix -- is a NEW request and gets its own preview cycle. Plus a concrete dispatch example for UTM / advanced completion reports keyed by course name, instructing the router to chain list_courses -> get_utm_report rather than ask the user for a numeric course ID. Closes scenario 25. 4. ``create_user`` drops the ``createpassword`` parameter entirely. Previously the agent could pass ``createpassword=False`` on the confirmation call after reconstructing the create from the preview, which causes Moodle to reject the row with "must provide a password, or set createpassword". The wrapper now hard-codes ``createpassword=1`` whenever no explicit password is given -- the agent has no knob to mis-set. Closes scenario 51. Three new unit tests pin the resolver bidirectional case, the reporting prompt guard text, and the router prompt's tool-arg-precision section. 4,592 unit tests pass at 100% coverage. ruff check + format check clean.

# Conflicts: # pyproject.toml # uv.lock

Operators address courses, users, categories, cohorts, tenants, departments, reports, certifications, and programs by name, not by internal Moodle ID. Every agent tool now takes a string identifier and resolves it through a 0/1/many helper that accepts numeric ID, shortname/idnumber, exact name, or bidirectional substring; ambiguity returns candidate lists so callers can disambiguate. Adds get_courses_by_field on the client and resolver helpers (_resolve_course/user/category/cohort/tenant/department/report/ certification/program_identifier) plus a _parse_job_token wrapper for transient export/import job IDs. Removes the obsolete numeric-ID gates in certify_user, revoke_certification, allocate_users_to_*, assign_job, and friends. Backed by 716 moodle tests at 100% coverage; a fresh 57/57 MOODLE.md smoke pass exercises the new surface end to end with zero ID-bearing prompts.

Conflicts: CLAUDE.md — resolved by keeping the local deletion (cf0a289). Upstream changes pulled in (pyproject.toml, agents.py, examples.py, agui.py, rooms.py, and their tests) auto-merged cleanly; 4,780 unit tests still pass at 100% coverage, ruff check and ruff format both clean.

AMIA is the AFSOC-branded deployment of Soliplex; upstream code and tests should not carry that branding. The LTI resource-link claim fixture is the only tracked-tree hit.

ryan-day-ctr force-pushed the feature/moodle branch from d768030 to 3fc5100 Compare May 16, 2026 16:17

ryan-day-ctr force-pushed the feature/moodle branch from 93aeaae to b5643f4 Compare May 18, 2026 21:26

ryan-day-ctr added 16 commits May 19, 2026 10:10

chore: add .idea and .claude to .gitignore

882e93a

chore: add .playwright-cli to .gitignore

41da433

chore: rename Moodle room to drop "(Tool Calling)" suffix

6a4c0b7

chore: remove CLAUDE.md and PLAN.md from repository

cf0a289

fix: remove stale rooms/moodle path from minimal.yaml

ae18b67

The ./rooms/moodle directory was removed in an earlier refactor but its reference remained, which would cause a startup error. Only ./rooms/moodle-tools is valid.

feat(moodle): add advanced training management tools

c0e6841

Add course content, compliance reporting, groups/cohorts, grading, calendar, and write operations (enrol/message with confirmation). Certifications feature skipped pending Moodle Workplace API access.

ryan-day-ctr and others added 25 commits May 19, 2026 10:10

test(moodle): restore 100% coverage after upstream dependency bump

600a6b6

feat(moodle): add consumer skill for external AG-UI access to Moodle …

ca466a2

…Workplace room

feat(lti): add LTI 1.3 platform authentication and thinking model sup…

5e9e29a

…port Integrate LTI 1.3 OIDC login/launch flow with session token validation as an authentication fallback. Add configurable thinking-field support for local models (e.g. Gemma 4) via pydantic-ai model profiles.

style: fix ruff format and E501 line-length violations across LTI and…

f54e2a2

… Moodle modules

feat(lti): show selected room name in chat header after picker selection

acabe49

chore: regenerate uv.lock after upstream rebase

ab39968

Re-add httpx dependency to lock file after rebasing onto upstream/main (haiku-rag-slim 0.44, haiku-skills 0.16).

ryan-day-ctr force-pushed the feature/moodle branch from f315db2 to 08eddb2 Compare May 19, 2026 15:11

ryan-day-ctr added 4 commits May 20, 2026 00:09

Merge remote-tracking branch 'upstream/main' into feature/moodle

782dc27

# Conflicts: # pyproject.toml # uv.lock

chore(lti): drop AMIA branding from LTI resource-link test fixture

00410d6

AMIA is the AFSOC-branded deployment of Soliplex; upstream code and tests should not carry that branding. The LTI resource-link claim fixture is the only tracked-tree hit.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Moodle Workplace integration (skill-routing + LTI 1.3)#968

feat: Moodle Workplace integration (skill-routing + LTI 1.3)#968
ryan-day-ctr wants to merge 58 commits into
soliplex:mainfrom
ryan-day-ctr:feature/moodle

ryan-day-ctr commented May 13, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ryan-day-ctr commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Architecture highlights

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ryan-day-ctr commented May 13, 2026 •

edited

Loading