feat: Moodle Workplace integration (skill-routing + LTI 1.3)#968
Draft
ryan-day-ctr wants to merge 58 commits into
Draft
feat: Moodle Workplace integration (skill-routing + LTI 1.3)#968ryan-day-ctr wants to merge 58 commits into
ryan-day-ctr wants to merge 58 commits into
Conversation
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 13, 2026
Sonnet review of PR soliplex#968 flagged three blockers and six should-fixes. All addressed here, verified against a live Moodle Workplace sandbox during full regression testing. B1: async validate_id_token + module-level JWKS client cache - anyio.to_thread.run_sync wraps the blocking PyJWKClient lookup - Module-level dict + lazy anyio.Lock cache PyJWKClient instances by JWKS URL so PyJWT's cache_keys=True actually survives across requests (verified: 5 launches => 1 certs.php fetch on Moodle). B2: room_id sanitization + JSON-config-tag pattern - resolve_room_id validates extracted room segments against [a-zA-Z0-9_-]+; XSS-attempt target_link_uri falls through to default (verified: no <script> in rendered body). - Per-launch state replaces f-string-into-JS interpolation with a <script id="lti-config" type="application/json"> tag plus a </ escape, eliminating the regular JS-string injection surface. B3: nonce-seen cache to prevent state replay - lti.nonce.consume_nonce() module-level dict, bounded at 10k entries, lazy expiry sweep on each call. Replay returns 400 LTI_NONCE_REPLAY (verified: second POST of same id_token+state rejected). S1: platform-bound session tokens - mint_session_token embeds _platform_id; authn.authenticate uses peek_platform_id + single-platform lookup so a short-TTL platform's token cannot live longer than its TTL when a longer- TTL platform is also registered. S2: MoodleClient.aclose wired to FastAPI lifespan - InstallationConfig.register_cleanup + Installation lifespan AsyncExitStack push_async_callback. Verified: clean shutdown log, no "httpx.AsyncClient was never closed" warning. S3: SRI on marked.js - Pinned marked@15.0.7 with sha384 integrity + crossorigin anonymous. Verified: hash matches live CDN content. S4: expanded CSP - default-src 'self'; script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; connect-src 'self'; img-src 'self' data:; frame-ancestors {platform.issuer}. All six directives present on live response. S5: reject empty deployment_ids at config load - LTIPlatformConfig.from_yaml raises FromYamlException when the list is empty or missing entirely. S6+P4: remove dead _preview helper and _CONFIRM_INSTRUCTIONS_* constants.
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 14, 2026
… warnings; surface resolver ambiguity A third-pass review of PR soliplex#968 (fresh-context Sonnet, no shared state with the prior two passes turned up one user-visible bug and two should-fix patterns that the earlier reviews missed. All three close honestly here so the PR can flip Draft -> Ready. B1 — get_team_members(department=Engineering) returned EVERY staff member with a Workplace job, not just Engineering's. Two layers of breakage, both fixed: (a) client.get_department_members accepted departmentid / positionid integer params but never forwarded them to the remote core_reportbuilder_retrieve_system_report call. The sole client-side filter (skip rows with empty dept name when departmentid is set) only filtered out rows missing a department altogether — it let through every row from every OTHER department. Signature simplified to a pure fetch all jobs reader with just the existing user-name search filter; the broken half-filter is gone. (b) skills.get_team_members now applies the missing post-fetch exact-name filter using the resolved Department / Position's canonical .name (case-insensitive) — only rows whose departmentname / positionname match the resolved target survive. Exact match, not substring, so e.g. Engineering doesn't catch Engineering Operations. Fourth review's regression smoke graded by LLM output text which happened to mention the right names — the actual rows returned were never inspected. SF1 — core_course_update_courses returns null on success and {warnings: [...]} on rejection (shortname collision, capability error, etc.). The wrapper reported success regardless, mirroring the R12 200 + warnings = silent failure pattern that was previously fixed for the four org write endpoints but never propagated to course updates. client.update_courses now raises MoodleAPIError on a populated warnings array, routed through the existing _warnings_message() helper. The decorator surfaces the raise as {status: error, ...} at the skill layer. Return- type annotation (-> None) is now honest (closes N1). Audit pass turned up one additional offender within the +2 endpoint cap: client.enrol_users had the same warnings-as-failure shape (enrol_manual_enrol_users returns {warnings: [{warningcode: alreadyenrolled, ...}]} when the user is already in the course). Same R12 fix applied there; the enrol_users skill wrapper drops its result.get(warnings, []) inspection since the client now raises. SF2 — get_team_members's department / position lookup used next(...) over the matches generator, so first match wins on ambiguity instead of surfacing the conflict. Replaced with the _resolve_rule_id pattern: collect all matches, branch on count (0 -> no match error, >1 -> multiple matches error with the matching candidates listed by name + idnumber, 1 -> proceed). Both department and position resolvers updated. Tests: - test_client.test_get_department_members_returns_all_rows (replaces the dept_position_filter test) — asserts both Eng + Ops rows return without any client-side filtering. - test_client.test_update_courses_raises_on_warnings (replaces _with_warnings) — asserts pytest.raises with the shortname-collision warning text preserved. - test_client.test_enrol_users_raises_on_warnings (renamed) — asserts MoodleAPIError(match=already enrolled). - test_agent.test_get_team_members_filters_to_requested_department — mocks list_departments + system report responses via side_effect; asserts Alice (Engineering) survives but Bob (Operations) is filtered out. - test_agent.test_get_team_members_filters_to_requested_position — same shape for positions. - test_agent.test_get_team_members_ambiguous_department — two depts both match the query (one by name, one by idnumber); asserts status:error response with the matches list. - test_agent.test_get_team_members_ambiguous_position — same shape for positions. - test_agent.test_update_course_tool_propagates_warnings — verifies wrapper output is status:error with the shortnametaken message text preserved end-to-end. - test_agent.test_enrol_users_tool_propagates_warnings — verifies wrapper output is status:error with the alreadyenrolled message preserved. 100% coverage retained on moodle/client.py and moodle/skills.py. Full suite: 4432 passed. ruff check + ruff format --check clean. Live verification (per plan, deferred to the next /amia_start session): in the Flutter moodle-tools room, ask Who is in the Engineering department? then Who is in the Operations department? — outputs MUST differ (today they don't); then ask the agent to rename course 2 to shortname (taken by course 3) and confirm the agent reports failure with the Moodle warning message, not a green checkmark. EOF )
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 16, 2026
Sonnet review of PR soliplex#968 flagged three blockers and six should-fixes. All addressed here, verified against a live Moodle Workplace sandbox during full regression testing. B1: async validate_id_token + module-level JWKS client cache - anyio.to_thread.run_sync wraps the blocking PyJWKClient lookup - Module-level dict + lazy anyio.Lock cache PyJWKClient instances by JWKS URL so PyJWT's cache_keys=True actually survives across requests (verified: 5 launches => 1 certs.php fetch on Moodle). B2: room_id sanitization + JSON-config-tag pattern - resolve_room_id validates extracted room segments against [a-zA-Z0-9_-]+; XSS-attempt target_link_uri falls through to default (verified: no <script> in rendered body). - Per-launch state replaces f-string-into-JS interpolation with a <script id="lti-config" type="application/json"> tag plus a </ escape, eliminating the regular JS-string injection surface. B3: nonce-seen cache to prevent state replay - lti.nonce.consume_nonce() module-level dict, bounded at 10k entries, lazy expiry sweep on each call. Replay returns 400 LTI_NONCE_REPLAY (verified: second POST of same id_token+state rejected). S1: platform-bound session tokens - mint_session_token embeds _platform_id; authn.authenticate uses peek_platform_id + single-platform lookup so a short-TTL platform's token cannot live longer than its TTL when a longer- TTL platform is also registered. S2: MoodleClient.aclose wired to FastAPI lifespan - InstallationConfig.register_cleanup + Installation lifespan AsyncExitStack push_async_callback. Verified: clean shutdown log, no "httpx.AsyncClient was never closed" warning. S3: SRI on marked.js - Pinned marked@15.0.7 with sha384 integrity + crossorigin anonymous. Verified: hash matches live CDN content. S4: expanded CSP - default-src 'self'; script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; connect-src 'self'; img-src 'self' data:; frame-ancestors {platform.issuer}. All six directives present on live response. S5: reject empty deployment_ids at config load - LTIPlatformConfig.from_yaml raises FromYamlException when the list is empty or missing entirely. S6+P4: remove dead _preview helper and _CONFIRM_INSTRUCTIONS_* constants.
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 16, 2026
… warnings; surface resolver ambiguity A third-pass review of PR soliplex#968 (fresh-context Sonnet, no shared state with the prior two passes turned up one user-visible bug and two should-fix patterns that the earlier reviews missed. All three close honestly here so the PR can flip Draft -> Ready. B1 — get_team_members(department=Engineering) returned EVERY staff member with a Workplace job, not just Engineering's. Two layers of breakage, both fixed: (a) client.get_department_members accepted departmentid / positionid integer params but never forwarded them to the remote core_reportbuilder_retrieve_system_report call. The sole client-side filter (skip rows with empty dept name when departmentid is set) only filtered out rows missing a department altogether — it let through every row from every OTHER department. Signature simplified to a pure fetch all jobs reader with just the existing user-name search filter; the broken half-filter is gone. (b) skills.get_team_members now applies the missing post-fetch exact-name filter using the resolved Department / Position's canonical .name (case-insensitive) — only rows whose departmentname / positionname match the resolved target survive. Exact match, not substring, so e.g. Engineering doesn't catch Engineering Operations. Fourth review's regression smoke graded by LLM output text which happened to mention the right names — the actual rows returned were never inspected. SF1 — core_course_update_courses returns null on success and {warnings: [...]} on rejection (shortname collision, capability error, etc.). The wrapper reported success regardless, mirroring the R12 200 + warnings = silent failure pattern that was previously fixed for the four org write endpoints but never propagated to course updates. client.update_courses now raises MoodleAPIError on a populated warnings array, routed through the existing _warnings_message() helper. The decorator surfaces the raise as {status: error, ...} at the skill layer. Return- type annotation (-> None) is now honest (closes N1). Audit pass turned up one additional offender within the +2 endpoint cap: client.enrol_users had the same warnings-as-failure shape (enrol_manual_enrol_users returns {warnings: [{warningcode: alreadyenrolled, ...}]} when the user is already in the course). Same R12 fix applied there; the enrol_users skill wrapper drops its result.get(warnings, []) inspection since the client now raises. SF2 — get_team_members's department / position lookup used next(...) over the matches generator, so first match wins on ambiguity instead of surfacing the conflict. Replaced with the _resolve_rule_id pattern: collect all matches, branch on count (0 -> no match error, >1 -> multiple matches error with the matching candidates listed by name + idnumber, 1 -> proceed). Both department and position resolvers updated. Tests: - test_client.test_get_department_members_returns_all_rows (replaces the dept_position_filter test) — asserts both Eng + Ops rows return without any client-side filtering. - test_client.test_update_courses_raises_on_warnings (replaces _with_warnings) — asserts pytest.raises with the shortname-collision warning text preserved. - test_client.test_enrol_users_raises_on_warnings (renamed) — asserts MoodleAPIError(match=already enrolled). - test_agent.test_get_team_members_filters_to_requested_department — mocks list_departments + system report responses via side_effect; asserts Alice (Engineering) survives but Bob (Operations) is filtered out. - test_agent.test_get_team_members_filters_to_requested_position — same shape for positions. - test_agent.test_get_team_members_ambiguous_department — two depts both match the query (one by name, one by idnumber); asserts status:error response with the matches list. - test_agent.test_get_team_members_ambiguous_position — same shape for positions. - test_agent.test_update_course_tool_propagates_warnings — verifies wrapper output is status:error with the shortnametaken message text preserved end-to-end. - test_agent.test_enrol_users_tool_propagates_warnings — verifies wrapper output is status:error with the alreadyenrolled message preserved. 100% coverage retained on moodle/client.py and moodle/skills.py. Full suite: 4432 passed. ruff check + ruff format --check clean. Live verification (per plan, deferred to the next /amia_start session): in the Flutter moodle-tools room, ask Who is in the Engineering department? then Who is in the Operations department? — outputs MUST differ (today they don't); then ask the agent to rename course 2 to shortname (taken by course 3) and confirm the agent reports failure with the Moodle warning message, not a green checkmark. EOF )
d768030 to
3fc5100
Compare
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 18, 2026
A fourth-pass Sonnet review of PR soliplex#968 flagged that several write-tool preview dicts in skills.py omit the "action" field. The router prompt (agent.py:112-134) tells the router LLM to read "action" to identify which tool to invoke on the confirmation round-trip; without it the router has to infer the tool name from the prose preview string alone, and the router- prompt example only hard-codes delete_department -- so for any other tool the LLM is guessing. Reviewer flagged 7 sites (5 deletes + 2 updates). Audit of every preview-returning site in skills.py turned up 19 without "action" -- the rest of the surface already had it. Fixed all 19 in one sweep so the entire class of router-mis-routing risk is closed rather than leaving 12 latent. Sites updated: Organisation skill (7): update_department, delete_department, create_position, update_position, delete_position, delete_job, unassign_manager. Certifications skill (3): delete_certification, restore_certification, bulk_deallocate_certification_users. Programs skill (7): archive_program, restore_program, delete_program, duplicate_program, update_program_visibility, bulk_deallocate_program_users, bulk_reset_program_progress. Reporting skill (2): delete_export, delete_import. Additive change: every existing test still passes, every existing happy-path flow is unaffected, the new key is only read on the confirmation round-trip. Added a focused regression test test_delete_department_preview_includes_action that asserts the canonical example (delete_department, the one the router prompt calls out by name) returns a preview dict with the correct action string. Combined with a grep-driven verification sweep (every "preview": line in skills.py now has a matching "action": within the same dict), this guards against the systematic gap recurring. 4,433 unit tests pass at 100% coverage. ruff check + format check clean.
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 18, 2026
…urse completion overview Two NIT-grade fixes from the fourth-pass review of PR soliplex#968. Both low-impact individually but together close out every finding from the four review passes. N1 -- LTI deployment-id error message ====================================== When an id_token arrives without the deployment_id claim, ``payload.get(LTI_CLAIM_DEPLOYMENT_ID)`` returns None and ``check_deployment`` raises ``InvalidLTIDeployment(None, platform_id)``. The exception's ``__init__`` formats the message as "Invalid deployment_id None for platform 'moodle'" -- confusing operator error text when the actual problem is a missing claim. Fix: ``check_deployment`` now branches on None and raises with ``reason="missing deployment_id claim in id_token"``, surfaced through a new optional ``reason`` kwarg on ``InvalidLTIDeployment``. Existing call sites unaffected (reason defaults to None, message format unchanged for the true mismatch case). Signature widened from ``str`` to ``str | None`` to match what ``views/lti.py:199`` actually passes. Added ``test_missing_deployment_claim`` that asserts the 400 response contains the new phrasing and explicitly does NOT contain "None". N2 -- parallel per-user completion lookup ========================================== ``get_course_completion_overview`` (skills.py:683) fetched the enrolled-user list then iterated sequentially, awaiting ``get_course_completion_status`` once per user. For a course with 100 enrolled users at ~200ms/call that is 20 seconds of serial round-trips -- close to the MoodleClient 30s timeout and a real latency cliff for large courses. Fix: gather the per-user calls with ``asyncio.gather`` after the initial enrolment query. Result order is preserved by gather, so output stability is unchanged. Failed per-user lookups still return ``completed: null`` -- the helper closure catches ``MoodleAPIError`` / ``httpx.HTTPError`` and returns None instead of letting the exception abort the whole gather. Worst-case latency now bounded by the slowest single call (not the sum), turning a 20s sequential walk into a ~200ms parallel fan-out. The two existing tests (``test_get_course_completion_overview_tool`` and ``test_get_course_completion_overview_per_user_error``) pass unchanged because they assert on counts/totals, not on call order. 4,435 unit tests pass at 100% coverage. ruff check + format check clean.
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 18, 2026
Fifth-pass Sonnet review of PR soliplex#968 flagged a residual gap from the prior commit (1623eb2) that parallelised get_course_completion_overview with asyncio.gather. The new _fetch_status closure catches (MoodleAPIError, httpx.HTTPError) and returns None on failure -- but the path also runs the per-user payload through CompletionStatus.model_validate(...). If Moodle ever returns a structurally unexpected per-user record (missing completionstatus key, mistyped fields, etc.), pydantic raises ValidationError -- a ValueError subclass, NOT an httpx or Moodle exception -- which escapes _fetch_status uncaught, escapes asyncio.gather (no return_exceptions=True), and escapes _moodle_tool's narrower error-catching surface. The entire overview call would fail with an unhandled 500 instead of marking the offending user as completed: null. Fix: widen the except tuple in _fetch_status to also catch pydantic.ValidationError. Same return-None-on-failure contract preserves the existing test for the httpx/MoodleAPIError path and the gather invariants. Low-probability failure mode (requires Moodle to return a malformed per-user shape) but trivial guard. Added test_get_course_completion_overview_validation_error that feeds a malformed completion response and asserts the user appears with completed=None instead of the whole call blowing up. 4,436 unit tests pass at 100% coverage. ruff check + format check clean.
93aeaae to
b5643f4
Compare
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 18, 2026
Sonnet review of PR soliplex#968 flagged three blockers and six should-fixes. All addressed here, verified against a live Moodle Workplace sandbox during full regression testing. B1: async validate_id_token + module-level JWKS client cache - anyio.to_thread.run_sync wraps the blocking PyJWKClient lookup - Module-level dict + lazy anyio.Lock cache PyJWKClient instances by JWKS URL so PyJWT's cache_keys=True actually survives across requests (verified: 5 launches => 1 certs.php fetch on Moodle). B2: room_id sanitization + JSON-config-tag pattern - resolve_room_id validates extracted room segments against [a-zA-Z0-9_-]+; XSS-attempt target_link_uri falls through to default (verified: no <script> in rendered body). - Per-launch state replaces f-string-into-JS interpolation with a <script id="lti-config" type="application/json"> tag plus a </ escape, eliminating the regular JS-string injection surface. B3: nonce-seen cache to prevent state replay - lti.nonce.consume_nonce() module-level dict, bounded at 10k entries, lazy expiry sweep on each call. Replay returns 400 LTI_NONCE_REPLAY (verified: second POST of same id_token+state rejected). S1: platform-bound session tokens - mint_session_token embeds _platform_id; authn.authenticate uses peek_platform_id + single-platform lookup so a short-TTL platform's token cannot live longer than its TTL when a longer- TTL platform is also registered. S2: MoodleClient.aclose wired to FastAPI lifespan - InstallationConfig.register_cleanup + Installation lifespan AsyncExitStack push_async_callback. Verified: clean shutdown log, no "httpx.AsyncClient was never closed" warning. S3: SRI on marked.js - Pinned marked@15.0.7 with sha384 integrity + crossorigin anonymous. Verified: hash matches live CDN content. S4: expanded CSP - default-src 'self'; script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; connect-src 'self'; img-src 'self' data:; frame-ancestors {platform.issuer}. All six directives present on live response. S5: reject empty deployment_ids at config load - LTIPlatformConfig.from_yaml raises FromYamlException when the list is empty or missing entirely. S6+P4: remove dead _preview helper and _CONFIRM_INSTRUCTIONS_* constants.
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 18, 2026
… warnings; surface resolver ambiguity A third-pass review of PR soliplex#968 (fresh-context Sonnet, no shared state with the prior two passes turned up one user-visible bug and two should-fix patterns that the earlier reviews missed. All three close honestly here so the PR can flip Draft -> Ready. B1 — get_team_members(department=Engineering) returned EVERY staff member with a Workplace job, not just Engineering's. Two layers of breakage, both fixed: (a) client.get_department_members accepted departmentid / positionid integer params but never forwarded them to the remote core_reportbuilder_retrieve_system_report call. The sole client-side filter (skip rows with empty dept name when departmentid is set) only filtered out rows missing a department altogether — it let through every row from every OTHER department. Signature simplified to a pure fetch all jobs reader with just the existing user-name search filter; the broken half-filter is gone. (b) skills.get_team_members now applies the missing post-fetch exact-name filter using the resolved Department / Position's canonical .name (case-insensitive) — only rows whose departmentname / positionname match the resolved target survive. Exact match, not substring, so e.g. Engineering doesn't catch Engineering Operations. Fourth review's regression smoke graded by LLM output text which happened to mention the right names — the actual rows returned were never inspected. SF1 — core_course_update_courses returns null on success and {warnings: [...]} on rejection (shortname collision, capability error, etc.). The wrapper reported success regardless, mirroring the R12 200 + warnings = silent failure pattern that was previously fixed for the four org write endpoints but never propagated to course updates. client.update_courses now raises MoodleAPIError on a populated warnings array, routed through the existing _warnings_message() helper. The decorator surfaces the raise as {status: error, ...} at the skill layer. Return- type annotation (-> None) is now honest (closes N1). Audit pass turned up one additional offender within the +2 endpoint cap: client.enrol_users had the same warnings-as-failure shape (enrol_manual_enrol_users returns {warnings: [{warningcode: alreadyenrolled, ...}]} when the user is already in the course). Same R12 fix applied there; the enrol_users skill wrapper drops its result.get(warnings, []) inspection since the client now raises. SF2 — get_team_members's department / position lookup used next(...) over the matches generator, so first match wins on ambiguity instead of surfacing the conflict. Replaced with the _resolve_rule_id pattern: collect all matches, branch on count (0 -> no match error, >1 -> multiple matches error with the matching candidates listed by name + idnumber, 1 -> proceed). Both department and position resolvers updated. Tests: - test_client.test_get_department_members_returns_all_rows (replaces the dept_position_filter test) — asserts both Eng + Ops rows return without any client-side filtering. - test_client.test_update_courses_raises_on_warnings (replaces _with_warnings) — asserts pytest.raises with the shortname-collision warning text preserved. - test_client.test_enrol_users_raises_on_warnings (renamed) — asserts MoodleAPIError(match=already enrolled). - test_agent.test_get_team_members_filters_to_requested_department — mocks list_departments + system report responses via side_effect; asserts Alice (Engineering) survives but Bob (Operations) is filtered out. - test_agent.test_get_team_members_filters_to_requested_position — same shape for positions. - test_agent.test_get_team_members_ambiguous_department — two depts both match the query (one by name, one by idnumber); asserts status:error response with the matches list. - test_agent.test_get_team_members_ambiguous_position — same shape for positions. - test_agent.test_update_course_tool_propagates_warnings — verifies wrapper output is status:error with the shortnametaken message text preserved end-to-end. - test_agent.test_enrol_users_tool_propagates_warnings — verifies wrapper output is status:error with the alreadyenrolled message preserved. 100% coverage retained on moodle/client.py and moodle/skills.py. Full suite: 4432 passed. ruff check + ruff format --check clean. Live verification (per plan, deferred to the next /amia_start session): in the Flutter moodle-tools room, ask Who is in the Engineering department? then Who is in the Operations department? — outputs MUST differ (today they don't); then ask the agent to rename course 2 to shortname (taken by course 3) and confirm the agent reports failure with the Moodle warning message, not a green checkmark. EOF )
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 18, 2026
A fourth-pass Sonnet review of PR soliplex#968 flagged that several write-tool preview dicts in skills.py omit the "action" field. The router prompt (agent.py:112-134) tells the router LLM to read "action" to identify which tool to invoke on the confirmation round-trip; without it the router has to infer the tool name from the prose preview string alone, and the router- prompt example only hard-codes delete_department -- so for any other tool the LLM is guessing. Reviewer flagged 7 sites (5 deletes + 2 updates). Audit of every preview-returning site in skills.py turned up 19 without "action" -- the rest of the surface already had it. Fixed all 19 in one sweep so the entire class of router-mis-routing risk is closed rather than leaving 12 latent. Sites updated: Organisation skill (7): update_department, delete_department, create_position, update_position, delete_position, delete_job, unassign_manager. Certifications skill (3): delete_certification, restore_certification, bulk_deallocate_certification_users. Programs skill (7): archive_program, restore_program, delete_program, duplicate_program, update_program_visibility, bulk_deallocate_program_users, bulk_reset_program_progress. Reporting skill (2): delete_export, delete_import. Additive change: every existing test still passes, every existing happy-path flow is unaffected, the new key is only read on the confirmation round-trip. Added a focused regression test test_delete_department_preview_includes_action that asserts the canonical example (delete_department, the one the router prompt calls out by name) returns a preview dict with the correct action string. Combined with a grep-driven verification sweep (every "preview": line in skills.py now has a matching "action": within the same dict), this guards against the systematic gap recurring. 4,433 unit tests pass at 100% coverage. ruff check + format check clean.
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 18, 2026
…urse completion overview Two NIT-grade fixes from the fourth-pass review of PR soliplex#968. Both low-impact individually but together close out every finding from the four review passes. N1 -- LTI deployment-id error message ====================================== When an id_token arrives without the deployment_id claim, ``payload.get(LTI_CLAIM_DEPLOYMENT_ID)`` returns None and ``check_deployment`` raises ``InvalidLTIDeployment(None, platform_id)``. The exception's ``__init__`` formats the message as "Invalid deployment_id None for platform 'moodle'" -- confusing operator error text when the actual problem is a missing claim. Fix: ``check_deployment`` now branches on None and raises with ``reason="missing deployment_id claim in id_token"``, surfaced through a new optional ``reason`` kwarg on ``InvalidLTIDeployment``. Existing call sites unaffected (reason defaults to None, message format unchanged for the true mismatch case). Signature widened from ``str`` to ``str | None`` to match what ``views/lti.py:199`` actually passes. Added ``test_missing_deployment_claim`` that asserts the 400 response contains the new phrasing and explicitly does NOT contain "None". N2 -- parallel per-user completion lookup ========================================== ``get_course_completion_overview`` (skills.py:683) fetched the enrolled-user list then iterated sequentially, awaiting ``get_course_completion_status`` once per user. For a course with 100 enrolled users at ~200ms/call that is 20 seconds of serial round-trips -- close to the MoodleClient 30s timeout and a real latency cliff for large courses. Fix: gather the per-user calls with ``asyncio.gather`` after the initial enrolment query. Result order is preserved by gather, so output stability is unchanged. Failed per-user lookups still return ``completed: null`` -- the helper closure catches ``MoodleAPIError`` / ``httpx.HTTPError`` and returns None instead of letting the exception abort the whole gather. Worst-case latency now bounded by the slowest single call (not the sum), turning a 20s sequential walk into a ~200ms parallel fan-out. The two existing tests (``test_get_course_completion_overview_tool`` and ``test_get_course_completion_overview_per_user_error``) pass unchanged because they assert on counts/totals, not on call order. 4,435 unit tests pass at 100% coverage. ruff check + format check clean.
ryan-day-ctr
added a commit
to ryan-day-ctr/soliplex
that referenced
this pull request
May 18, 2026
Fifth-pass Sonnet review of PR soliplex#968 flagged a residual gap from the prior commit (1623eb2) that parallelised get_course_completion_overview with asyncio.gather. The new _fetch_status closure catches (MoodleAPIError, httpx.HTTPError) and returns None on failure -- but the path also runs the per-user payload through CompletionStatus.model_validate(...). If Moodle ever returns a structurally unexpected per-user record (missing completionstatus key, mistyped fields, etc.), pydantic raises ValidationError -- a ValueError subclass, NOT an httpx or Moodle exception -- which escapes _fetch_status uncaught, escapes asyncio.gather (no return_exceptions=True), and escapes _moodle_tool's narrower error-catching surface. The entire overview call would fail with an unhandled 500 instead of marking the offending user as completed: null. Fix: widen the except tuple in _fetch_status to also catch pydantic.ValidationError. Same return-None-on-failure contract preserves the existing test for the httpx/MoodleAPIError path and the gather invariants. Low-probability failure mode (requires Moodle to return a malformed per-user shape) but trivial guard. Added test_get_course_completion_overview_validation_error that feeds a malformed completion response and asserts the user appears with completed=None instead of the whole call blowing up. 4,436 unit tests pass at 100% coverage. ruff check + format check clean.
Adds a new Moodle room that connects to Moodle Workplace via a standalone FastMCP server (moodle-mcp-tools) over stdio. Includes room config, secrets for MOODLE_BASE_URL and MOODLE_API_TOKEN, and a unit test for config loading.
…room Move Moodle client into soliplex as src/soliplex/moodle/ and use a factory agent with @agent.system_prompt to pre-fetch course/enrollment/completion data before every LLM call, eliminating the need for tool calling by local models. Also scrub org-specific references from codebase.
…odle room @agent.system_prompt functions are silently skipped when AG-UI provides message_history (Pydantic AI only calls _sys_parts when messages is empty). Switch to an async instructions callable which is evaluated on every request.
Adds a parallel Moodle room that exposes 4 Pydantic AI tools (list_courses, find_user, list_enrolled_users, get_completion_status) instead of pre-fetching data into the system prompt. Enables A/B comparison of tool-calling vs context-injection approaches with the same model and data.
Exclude course id=1 (site-level entry) from list_courses and context injection to prevent enrollment/completion errors on a non-real course. Update suggestion prompts to match seeded data and strengthen system prompt so the agent always calls tools before claiming data is missing.
…ng variant The tool-calling room handles Moodle queries well via gpt-oss, so the context-injection variant is no longer needed. Removes the moodle room config, its test, and the related factory/prompt code from agent.py.
…module Move duplicated provider/model construction logic from moodle/agent.py into agents.get_model_from_factory_config(), remove unused get_activities_completion_status() and related models, narrow exception handling from bare Exception to (MoodleAPIError, httpx.HTTPError), and fix stale "(Tool Calling)" test assertion.
The ./rooms/moodle directory was removed in an earlier refactor but its reference remained, which would cause a startup error. Only ./rooms/moodle-tools is valid.
…dling - Add missing skill_toolset_config parameter to moodle_tools_agent_factory to match the AgentFactory protocol. - Wrap all four tool functions in try/except for consistent error handling (previously only get_completion_status caught errors). - Add optional verify parameter to MoodleClient for environments with custom certificate authorities, passed through from extra_config as moodle_verify_ssl.
- Add class docstrings to all Pydantic models referencing the Moodle web service functions that return them. - Add method docstrings to all public MoodleClient methods. - Add comment explaining MAX_RESULTS purpose. - Expand moodle_tools_agent_factory docstring with extra_config keys and required Moodle web service functions. - Populate __init__.py with module docstring and public exports. - Add README.md integration guide covering prerequisites, token generation, Soliplex configuration, custom CA certificates, and limitations.
Add test_moodle_agent.py with 11 tests covering the factory function (returns Agent, accepts skill_toolset_config, registers 4 tools) and all four tool functions (happy path + error handling for each). Remove agent.py from coverage omit in pyproject.toml now that tests are in place. Fix two line-too-long docstrings caught by ruff.
Add course content, compliance reporting, groups/cohorts, grading, calendar, and write operations (enrol/message with confirmation). Certifications feature skipped pending Moodle Workplace API access.
…port Integrate LTI 1.3 OIDC login/launch flow with session token validation as an authentication fallback. Add configurable thinking-field support for local models (e.g. Gemma 4) via pydantic-ai model profiles.
Extract shared helpers (_to_dict, _HTML_TAG_RE, _CONFIRM_INSTRUCTIONS, _EXPORTER_MAP) to eliminate duplicate patterns across client and skills. Move LTI templates to lti_templates.py, consolidate test helpers into conftest.py, remove 6 dead model classes and 3 orphan log constants. Fix auth_disabled to only gate on OIDC so --no-auth-mode works with LTI platforms configured.
Re-add httpx dependency to lock file after rebasing onto upstream/main (haiku-rag-slim 0.44, haiku-skills 0.16).
Upstream broke the 'config' package re-exports (PR soliplex#951) and the rebase moved 'get_model_from_factory_config' to 'config.agents' (alongside 'get_model_from_config'). Update test mock paths and direct imports of 'config.InstallationConfig' / 'config.RoomConfig' / 'config.FactoryAgentConfig' to use their submodules ('config.installation', 'config.rooms', 'config.agents'). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The skill-router architecture (commit a0f0608) introduced behavioural regressions never re-verified against the original 57-scenario doc. Cleanup commit 2955a8d also silently shadowed `_CONFIRM_INSTRUCTIONS`, breaking preview JSON for write tools. This change closes both gaps. Key fixes: - skills.py: rename short-form to `_CONFIRM_INSTRUCTIONS_SHORT` so it no longer collides with the long skill-prompt suffix; update every preview-dict usage. - skills.py: enrich `SkillMetadata.description` strings so the router can find cohorts, tenants, exports, calendar, and rule-condition searches against the right skill. - skills.py: tighten docstrings on tools with permissive defaults (search_certifications, search_cohorts/competencies_for_rule, get_upcoming_events) so the agent stops asking for clarification when an empty arg returns all rows. - skills.py: clarify `assign_manager` Args + skill-prompt entry to prevent inverted manager/subordinate assignments. - skills.py: enrich `get_cohort_members` with usernames + fullnames so callers don't hallucinate user data. - skills.py: expand preview dicts for `enrol_users`, `create_user`, `create_department` to expose tabulatable top-level fields and always inject the short instructions string. - skills.py: route 'learning catalogue' queries to `get_user_learning_catalogue` instead of `browse_catalogue`. - skills.py: skill-prompt workflow guidance for cross-skill chaining (grades, completion overview, course name → ID resolution). - agent.py: rewrite MOODLE_ROUTER_PROMPT with a strong no-hallucination directive, explicit routing examples, and anti-pattern warnings; the router now consistently dispatches rather than answering data questions from memory. - example/rooms/moodle-tools/room_config.yaml: switch model to gpt-oss:latest (the model MOODLE.md was originally written for; gemma4:26b's 262K-context made every scenario take 5-10 minutes. Tests: - 100% coverage maintained (4351 passing). - Sandbox cycled fresh; 12 of 12 originally-FAILED scenarios now pass; 4 of 4 originally-PARTIAL scenarios now pass. - Known limitation: preview→confirm scenarios (28/30/33/35/37/38/39/ 45/48/51/52/53) require same-thread router context that the current skill-router architecture doesn't carry between separate user prompts; preview halves verified, confirm halves need router-level memory work outside this fix's scope. EOF )
The skill-router architecture broke confirm-half scenarios two ways: 1. drive_llm_stream awaited maybe_generate_title synchronously in the finally block, so the per-thread background task held its SQLite write transaction open for the duration of title-gen (seconds-to-minutes on a local LLM). Subsequent user prompts on the same thread - notably the "Yes, go ahead" confirm - opened their SSE stream but blocked on the writer lock and never started their agent run. Wrapping maybe_generate_title in asyncio.create_task lets drive_llm_stream return immediately, so the next prompt's agent run starts right away. 2. The skill subagent receives the router-forwarded request as a fresh prompt with no memory of the prior preview. When the request explicitly says confirmed=True, the subagent was still defaulting to confirmed=False and silently re-previewing the action while the router announced success. Rewrote the _CONFIRM_INSTRUCTIONS skill-prompt suffix to force two distinct cases (preview vs explicit confirmed=True) so the subagent honors the confirmed flag the router reconstructed from history. Also moved the title agent off gemma4:26b (262K-context, slow load) onto gpt-oss:latest so title-gen finishes before its no-longer-blocking task is cancelled. Verified end-to-end: 27 (preview) + 28 (confirm) on a fresh sandbox now writes a Security/SEC department row that Moodle's DB confirms exists.
… guidance Continued the preview->confirm flow fixes from 842f099. Two classes of issue remained on rule and update_user write tools: 1. Rule write tools (enable_rule, disable_rule, archive_rule, unarchive_rule, delete_rule, duplicate_rule, delete_rule_condition, delete_rule_outcome) returned a minimal preview JSON like {"preview": "Enable dynamic rule id=N"} with no `action` key, no parameters as top-level fields, and no `instructions` key. The router can't reconstruct a confirm request from a freeform string, so on "Yes, proceed" it forwarded the bare phrase and the subagent re-previewed (no DB write). Restructured each preview to match the create_department / create_user pattern: top-level `action`, `rule_id` (or `instanceid`), and `_CONFIRM_INSTRUCTIONS_SHORT` (or `_WARNING` for delete). 2. update_user preview embedded the changed fields inside a nested `changes` dict that the LLM rendered awkwardly (Department=1 instead of "Quality Assurance"). Promoted each changed field to a top-level key alongside `changes` so the LLM can tabulate without re-interpreting nested JSON. Also added skill-prompt guidance for moodle-rules: write operations must call the write tool itself (with confirmed=False) for the preview rather than describing the action from list_dynamic_rules / can_enable_rule output - those listing tools don't return the structured fields the router uses to reconstruct the confirm. Added a logfire instrumentation point in enable_rule so future debugging can see the actual args the LLM passed. Verified end-to-end: - 28 / 30 (create + delete dept) DB-confirmed. - 33 (duplicate program) - new program id created. - 35 / 37 / 39 (hide / archive / restore) - DB flags flipped. - 51 / 53 (create + delete user) - DB-confirmed. - 44 (enable rule preview) - now calls enable_rule(confirmed= False) and returns structured JSON. 45 / 47 / 48 are blocked by an unrelated sandbox seeding gap (the tool_dynamicrule\reportbuilder\local\systemreports\rules system report returns 0 rows in this fresh seed even though the rules exist in the DB) - agent code is correct. - 52 (update user) - preview now exposes department as a top- level key.
Drop the logfire.info call I added to enable_rule for debugging the preview->confirm flow. Behaviour verified end-to-end against the seeded Test rule (44+45 enable, 47+48 disable) - DB-confirmed state transitions on each pair.
Two improvements driven by smoke scenario 52 review: 1. update_user now best-effort looks up the user's full name via get_users_by_field so the preview can show "Documentation User (soliplex#7)" instead of bare "soliplex#7". The lookup is wrapped in a broad except so any failure (API down, JSON oddity, validation error) silently falls back to the bare ID. The new `user` field carries the label. 2. The router prompt and moodle-users skill prompt now explicitly distinguish the user.department free-form string field from the Workplace organisation department entity. Previous runs of scenario 52 dispatched to moodle-organisation and tried to look up or create a department row in tool_organisation_department; the new guidance keeps such requests on moodle-users where they belong. 3. Skill prompt now requires the rendered preview heading to include the user's name + ID so the doc's expected format ("User ID 8") survives the LLM's table rendering. Tests cover the success path (user found, preview includes fullname), the empty-list fallback (user lookup returns []), and the existing no-fields / bad-id paths.
A two-track audit of the branch (dead code / refactor candidates +
upstream pattern alignment) surfaced both real technical debt and
several stylistic divergences that would draw review comments. This
commit folds the fixes into three phases.
Phase 1 -- PR blockers:
- MoodleClient now constructs a persistent httpx.AsyncClient with a
30s default timeout (was instantiating one per call). Adds aclose()
for graceful teardown.
- Drop the empty /lti/jwks endpoint that returned {"keys": []} -- it
silently fails any platform attempting JWK validation against us.
- Drop unused jwcrypto direct dependency (still pulled transitively
via python-keycloak; saves install weight).
- Fix AGENTS.md reference to deleted CLAUDE.md.
Phase 2 -- alignment + targeted refactor:
- moodle_tools_agent_factory now matches the AgentFactory protocol
signature (kw-only args after agent_config, returns SoliplexAgent).
- Re-merge lti_templates.py back into views/lti.py to match how every
other view module in the project keeps inline HTML.
- Move tests/unit/test_moodle_*.py into tests/unit/moodle/ to match
the agui/, config/, skills/ subdirectory layout.
- Replace 102 try/except (MoodleAPIError, httpx.HTTPError) blocks
with a single @_moodle_tool decorator (-200 LOC in skills.py).
- Introduce _preview() helper for the standard preview-JSON shape.
- Promote _resolve_rule_id to module level (was nested closure).
- Hoist _strip_html and _li_texts -> _parse_li_texts to module level.
- Switch from raw haiku.skills.SkillToolset to SoliplexSkillToolset
for run-context state injection consistency with the rest of the
project.
- Remove 7 dead public methods on MoodleClient (and their tests) that
had zero callers anywhere in src/ or tests/.
- Rename tests/unit/_test_features.py -> _fixtures.py so its real
role as a shared fixture module is clear (the _test_ prefix was
fooling pytest into skipping it as a test file).
- Document the [tool.coverage.run].omit list in pyproject.toml.
Phase 3 -- polish (preview-leak fix):
- Drop the "instructions" field from preview JSON entirely. The
string ("Present this to the user and ask for confirmation...")
was being rendered verbatim into user-facing tables (smoke
scenarios 29, 34, 44). The skill prompt's Case A / Case B already
covers preview-vs-confirm handling, so the field was redundant.
- Strengthen _CONFIRM_INSTRUCTIONS with explicit "no preamble, no
narration" output rules so chain-of-thought style summaries
("We identified...", "User wants to...") stop leaking before the
preview table.
Verification:
- 4,342 unit tests pass; 100% coverage maintained.
- ruff check + format clean.
- Re-ran all 57 MOODLE.md smoke scenarios via REST against
gpt-oss:latest: 47 PASS / 7 PARTIAL / 3 FAIL. None of the
non-passes are caused by this refactor -- they are gpt-oss
hallucination on empty tool results, sandbox state pollution from
prior runs, or seed-data gaps (UTM/adv-comp report tables empty).
- Re-tested the three preview-leak scenarios (29, 34, 44) after
Phase 3 changes; all now produce clean table-only previews.
The gemma4:26b experiment is over -- gpt-oss:latest is the default
on every front (router, skill subagents, title agent), and a
separate sweep of all 57 MOODLE.md scenarios confirms it routes
correctly without any thinking-token plumbing. The custom
_resolve_thinking_profile helper that patched supports_thinking
on per-model profiles is also redundant now that pydantic-ai's
Ollama provider sets openai_chat_thinking_field='reasoning'
unconditionally.
Strip every adjustment that was made for the experiment:
- example/rooms/moodle-tools/room_config.yaml: drop
thinking_field: "reasoning" from extra_config.
- example/minimal.yaml: change default_chat from gemma4:26b to
gpt-oss:latest, and drop the matching extra_body: {think: false}
blocks (one on default_chat, one on the title agent) -- they
were only there to suppress gemma's reasoning tokens.
- src/soliplex/config/agents.py: delete _resolve_thinking_profile
and both call sites (ollama and openai branches), drop the
thinking_field paragraph from get_model_from_factory_config's
docstring, and remove the now-unused openai_profiles import.
- tests/unit/config/test_config_agents.py: remove the
ollama_thinking and openai_thinking parametrize entries, the
thinking_field handling in the test body, the two standalone
helper tests, and the dead openai_profiles import.
- tests/unit/moodle/test_tools_room.py: drop thinking_field from
the expected extra_config snapshot.
Verification: ruff check + format clean, full pytest suite still
green at 100% coverage (4,338 passing), and a smoke subset of
six representative scenarios (S1, S9, S15, S27/28, S40, S46) all
PASS against gpt-oss:latest with the plumbing gone. The skill
prompt's no-preamble output rules in _CONFIRM_INSTRUCTIONS were
originally motivated by gemma's CoT leak but help any reasoning
model and intentionally remain in place.
Net diff: -112 / +1 lines.
Sonnet review of PR soliplex#968 flagged three blockers and six should-fixes. All addressed here, verified against a live Moodle Workplace sandbox during full regression testing. B1: async validate_id_token + module-level JWKS client cache - anyio.to_thread.run_sync wraps the blocking PyJWKClient lookup - Module-level dict + lazy anyio.Lock cache PyJWKClient instances by JWKS URL so PyJWT's cache_keys=True actually survives across requests (verified: 5 launches => 1 certs.php fetch on Moodle). B2: room_id sanitization + JSON-config-tag pattern - resolve_room_id validates extracted room segments against [a-zA-Z0-9_-]+; XSS-attempt target_link_uri falls through to default (verified: no <script> in rendered body). - Per-launch state replaces f-string-into-JS interpolation with a <script id="lti-config" type="application/json"> tag plus a </ escape, eliminating the regular JS-string injection surface. B3: nonce-seen cache to prevent state replay - lti.nonce.consume_nonce() module-level dict, bounded at 10k entries, lazy expiry sweep on each call. Replay returns 400 LTI_NONCE_REPLAY (verified: second POST of same id_token+state rejected). S1: platform-bound session tokens - mint_session_token embeds _platform_id; authn.authenticate uses peek_platform_id + single-platform lookup so a short-TTL platform's token cannot live longer than its TTL when a longer- TTL platform is also registered. S2: MoodleClient.aclose wired to FastAPI lifespan - InstallationConfig.register_cleanup + Installation lifespan AsyncExitStack push_async_callback. Verified: clean shutdown log, no "httpx.AsyncClient was never closed" warning. S3: SRI on marked.js - Pinned marked@15.0.7 with sha384 integrity + crossorigin anonymous. Verified: hash matches live CDN content. S4: expanded CSP - default-src 'self'; script-src 'self' 'unsafe-inline' https://cdn.jsdelivr.net; style-src 'self' 'unsafe-inline'; connect-src 'self'; img-src 'self' data:; frame-ancestors {platform.issuer}. All six directives present on live response. S5: reject empty deployment_ids at config load - LTIPlatformConfig.from_yaml raises FromYamlException when the list is empty or missing entirely. S6+P4: remove dead _preview helper and _CONFIRM_INSTRUCTIONS_* constants.
The S1 fix in the prior commit binds session tokens to the issuing platform via a _platform_id claim, then validates with that single platform's TTL. Any redeploy of an existing system would invalidate every active LTI session because tokens minted before S1 lack the claim. Bridge those tokens: when peek_platform_id returns None, validate using max_age = min(p.session_ttl for p in lti_platforms) so a legacy token cannot outlive the shortest configured TTL (preserves the S1 invariant). Log a warning containing the token's sub so operators can detect lingering legacy traffic. Remove the else-branch after the warning goes silent for a full deploy cycle. Tracked separately.
R12.1 — Integer-id leakage caused agent confusion. list_departments
and list_positions exposed both the internal integer id (e.g. 1) and
the external string idnumber (e.g. "ENG"). The agent sometimes
picked the integer when subsequently calling create_department(
parent="1"), which the Moodle API rejected because the `parent`
field is the parent's idnumber, not its id.
Reshape both:
- Drop "id" and "parentid" from every row.
- Add a resolved "parent" field containing the parent's idnumber
(or "" for top-level), looked up from the FULL department list so
children outside a search filter still report their parent.
- Drop integer "id" from every confirmed-write success output
(create_*, update_*) so the LLM doesn't learn the wrong reference
key inside a single session.
- Convert delete_department / delete_position to accept idnumber:
the wrapper resolves it to the integer id Moodle requires
internally; if no match exists, return {"status": "error", ...}.
R12.2 — Confabulated success defense. Earlier observation: when
_moodle_tool caught a MoodleAPIError and returned {"error": ...},
the LLM sometimes ignored the error and reported a fabricated
success.
- _moodle_tool's error branch now returns {"status": "error", ...}.
- _moodle_tool auto-injects "status": "ok" into any JSON object
returned from a tool invoked with confirmed=True (preview and
read branches untouched). Tools that emit their own "status"
field are passed through.
- Append rule 5 to _CONFIRM_INSTRUCTIONS, the shared prompt block
mixed into all 7 skill agents: explicitly forbid claiming success
or inventing IDs unless the result carries "status": "ok"; require
failures be reported when "status": "error".
- Rule 4 cross-references "status": "ok".
100% coverage on moodle/skills.py. 541 moodle-suite tests pass.
Verified end-to-end via regression sweep against the live sandbox.
Three related fixes, all rolled together because they share the same call site: S2 — Single-item-write contract documented. client._warnings_message docstring now spells out that the "empty result + warnings = error" heuristic is only correct for the single-item submissions our wrappers issue today. Each of the four call sites (create_departments, update_departments, create_positions, update_positions) gets a cross-reference comment for future maintainers. S3 — One Moodle round-trip per list_*. list_departments / list_positions used to fetch twice when a search filter was passed: once for the filtered set, once for the parent-idnumber map. Now we fetch the full list once and filter in-process. Verified live: 1 POST per invocation. R12.1 actually-works fix — real idnumbers in list output. Moodle's stock tool_organisation_get_teams_tab_filters hard-codes a SELECT list that excludes idnumber. Without it the agent has no idnumber to feed into create/update/delete writes. Switch client.get_departments / get_positions to the new local_soliplex_list_departments / _list_positions endpoints (see moodle_sandbox fca534c8a) which return idnumber from the underlying tables. B1 covered as a side-effect — delete_department / delete_position now fetch the full list and exact-match by idnumber instead of calling the server with the idnumber as a name-substring search. The B1 false-positive risk (idnumber happens to be a substring of another department's name is eliminated; the test test_delete_department_tool_not_found_exact_match_only locks the new behaviour in place. Verified end-to-end against live Moodle Workplace: list shows real idnumbers, delete by idnumber actually removes the row. 100% coverage on moodle/client.py and moodle/skills.py. Requires moodle_sandbox >= fca534c8a for the new plugin endpoints to be available; production deployment requires the local_soliplex plugin to be present (same constraint that already exists for get_department_members / get_utm_report / get_adv_comp_report). EOF )
A 57-scenario Flutter regression turned up 7 soft-spot findings —
all prompt-routing or cosmetic, none regressions from the security
or LTI work in this branch. Fixed in-scope per "no loose chads":
F1 — Scn 41 ("How many users match X rule"): agent went to
list_cohorts / get_cohort_members instead of
get_rule_matching_users. Strengthened the tool docstring and
added an explicit "how many users match" → get_rule_matching_users
example in _RULES_PROMPT (plus router-prompt routing entry).
Verified live: trace now ends at get_rule_matching_users.
F2 — Scn 16 ("What competency frameworks exist?"): router routed
to moodle-rules.search_competencies_for_rule (which is for rule
conditions, not frameworks). Added a disambiguating routing
example to MOODLE_ROUTER_PROMPT pointing competency-framework
queries at moodle-programs.list_competency_frameworks. Verified
live: trace ends at list_competency_frameworks.
F3 — Scn 40 ("Can the X rule be enabled?"): router produced NO
tool dispatch at all. Added "can X rule be enabled" → moodle-
rules.can_enable_rule routing example. Verified live: trace
includes can_enable_rule.
F4 — Scn 13 ("Who is in the Engineering department?"): agent
looped list_departments 7+ times before reaching get_team_members
because get_team_members took an integer departmentid that no
list-tool exposes anymore. Now get_team_members accepts
``department=<name-or-idnumber>`` and resolves internally via
the local_soliplex list endpoints. Workflow guidance updated
in _ORGANISATION_PROMPT to direct callers there. Verified live:
trace ends at get_team_members in 1–2 calls, no loops.
F5 — Cosmetic gpt-oss Harmony control tokens
(``<|channel|>commentary``, ``<|message|>``, ``<|end|>``, etc.)
were leaking into client-visible text deltas. Added a
_sanitise_event hook in the AG-UI streaming loop that regex-
strips known Harmony markers from event.delta / event.content
in place — so persistence + title-gen + the client all see the
sanitised text. Pattern recognises bare markers plus
<|channel|> with a closed-set label (analysis | commentary |
final) so it can't accidentally eat content words.
F6 — Scn 29 ("Delete the Security department" by NAME): agent
couldn't resolve name → idnumber because list_departments output
no longer leaks integer ids. Tightened delete_department /
delete_position docstrings to instruct: "If only the name is
known, FIRST call list_departments to look up the idnumber."
Verified live: agent now calls list_departments first then
delete_department(idnumber=<resolved>).
F7 — Scn 33 (Duplicate Program — Confirmed) leaves a new
"Leadership Track Copy" row each test run. Test-design issue,
not product. Added a cleanup note to docs/moodle/MOODLE.md.
100% coverage on moodle/skills.py, moodle/agent.py, views/agui.py.
4299 unit tests pass.
… warnings; surface resolver ambiguity A third-pass review of PR soliplex#968 (fresh-context Sonnet, no shared state with the prior two passes turned up one user-visible bug and two should-fix patterns that the earlier reviews missed. All three close honestly here so the PR can flip Draft -> Ready. B1 — get_team_members(department=Engineering) returned EVERY staff member with a Workplace job, not just Engineering's. Two layers of breakage, both fixed: (a) client.get_department_members accepted departmentid / positionid integer params but never forwarded them to the remote core_reportbuilder_retrieve_system_report call. The sole client-side filter (skip rows with empty dept name when departmentid is set) only filtered out rows missing a department altogether — it let through every row from every OTHER department. Signature simplified to a pure fetch all jobs reader with just the existing user-name search filter; the broken half-filter is gone. (b) skills.get_team_members now applies the missing post-fetch exact-name filter using the resolved Department / Position's canonical .name (case-insensitive) — only rows whose departmentname / positionname match the resolved target survive. Exact match, not substring, so e.g. Engineering doesn't catch Engineering Operations. Fourth review's regression smoke graded by LLM output text which happened to mention the right names — the actual rows returned were never inspected. SF1 — core_course_update_courses returns null on success and {warnings: [...]} on rejection (shortname collision, capability error, etc.). The wrapper reported success regardless, mirroring the R12 200 + warnings = silent failure pattern that was previously fixed for the four org write endpoints but never propagated to course updates. client.update_courses now raises MoodleAPIError on a populated warnings array, routed through the existing _warnings_message() helper. The decorator surfaces the raise as {status: error, ...} at the skill layer. Return- type annotation (-> None) is now honest (closes N1). Audit pass turned up one additional offender within the +2 endpoint cap: client.enrol_users had the same warnings-as-failure shape (enrol_manual_enrol_users returns {warnings: [{warningcode: alreadyenrolled, ...}]} when the user is already in the course). Same R12 fix applied there; the enrol_users skill wrapper drops its result.get(warnings, []) inspection since the client now raises. SF2 — get_team_members's department / position lookup used next(...) over the matches generator, so first match wins on ambiguity instead of surfacing the conflict. Replaced with the _resolve_rule_id pattern: collect all matches, branch on count (0 -> no match error, >1 -> multiple matches error with the matching candidates listed by name + idnumber, 1 -> proceed). Both department and position resolvers updated. Tests: - test_client.test_get_department_members_returns_all_rows (replaces the dept_position_filter test) — asserts both Eng + Ops rows return without any client-side filtering. - test_client.test_update_courses_raises_on_warnings (replaces _with_warnings) — asserts pytest.raises with the shortname-collision warning text preserved. - test_client.test_enrol_users_raises_on_warnings (renamed) — asserts MoodleAPIError(match=already enrolled). - test_agent.test_get_team_members_filters_to_requested_department — mocks list_departments + system report responses via side_effect; asserts Alice (Engineering) survives but Bob (Operations) is filtered out. - test_agent.test_get_team_members_filters_to_requested_position — same shape for positions. - test_agent.test_get_team_members_ambiguous_department — two depts both match the query (one by name, one by idnumber); asserts status:error response with the matches list. - test_agent.test_get_team_members_ambiguous_position — same shape for positions. - test_agent.test_update_course_tool_propagates_warnings — verifies wrapper output is status:error with the shortnametaken message text preserved end-to-end. - test_agent.test_enrol_users_tool_propagates_warnings — verifies wrapper output is status:error with the alreadyenrolled message preserved. 100% coverage retained on moodle/client.py and moodle/skills.py. Full suite: 4432 passed. ruff check + ruff format --check clean. Live verification (per plan, deferred to the next /amia_start session): in the Flutter moodle-tools room, ask Who is in the Engineering department? then Who is in the Operations department? — outputs MUST differ (today they don't); then ask the agent to rename course 2 to shortname (taken by course 3) and confirm the agent reports failure with the Moodle warning message, not a green checkmark. EOF )
A fourth-pass Sonnet review of PR soliplex#968 flagged that several write-tool preview dicts in skills.py omit the "action" field. The router prompt (agent.py:112-134) tells the router LLM to read "action" to identify which tool to invoke on the confirmation round-trip; without it the router has to infer the tool name from the prose preview string alone, and the router- prompt example only hard-codes delete_department -- so for any other tool the LLM is guessing. Reviewer flagged 7 sites (5 deletes + 2 updates). Audit of every preview-returning site in skills.py turned up 19 without "action" -- the rest of the surface already had it. Fixed all 19 in one sweep so the entire class of router-mis-routing risk is closed rather than leaving 12 latent. Sites updated: Organisation skill (7): update_department, delete_department, create_position, update_position, delete_position, delete_job, unassign_manager. Certifications skill (3): delete_certification, restore_certification, bulk_deallocate_certification_users. Programs skill (7): archive_program, restore_program, delete_program, duplicate_program, update_program_visibility, bulk_deallocate_program_users, bulk_reset_program_progress. Reporting skill (2): delete_export, delete_import. Additive change: every existing test still passes, every existing happy-path flow is unaffected, the new key is only read on the confirmation round-trip. Added a focused regression test test_delete_department_preview_includes_action that asserts the canonical example (delete_department, the one the router prompt calls out by name) returns a preview dict with the correct action string. Combined with a grep-driven verification sweep (every "preview": line in skills.py now has a matching "action": within the same dict), this guards against the systematic gap recurring. 4,433 unit tests pass at 100% coverage. ruff check + format check clean.
…urse completion overview Two NIT-grade fixes from the fourth-pass review of PR soliplex#968. Both low-impact individually but together close out every finding from the four review passes. N1 -- LTI deployment-id error message ====================================== When an id_token arrives without the deployment_id claim, ``payload.get(LTI_CLAIM_DEPLOYMENT_ID)`` returns None and ``check_deployment`` raises ``InvalidLTIDeployment(None, platform_id)``. The exception's ``__init__`` formats the message as "Invalid deployment_id None for platform 'moodle'" -- confusing operator error text when the actual problem is a missing claim. Fix: ``check_deployment`` now branches on None and raises with ``reason="missing deployment_id claim in id_token"``, surfaced through a new optional ``reason`` kwarg on ``InvalidLTIDeployment``. Existing call sites unaffected (reason defaults to None, message format unchanged for the true mismatch case). Signature widened from ``str`` to ``str | None`` to match what ``views/lti.py:199`` actually passes. Added ``test_missing_deployment_claim`` that asserts the 400 response contains the new phrasing and explicitly does NOT contain "None". N2 -- parallel per-user completion lookup ========================================== ``get_course_completion_overview`` (skills.py:683) fetched the enrolled-user list then iterated sequentially, awaiting ``get_course_completion_status`` once per user. For a course with 100 enrolled users at ~200ms/call that is 20 seconds of serial round-trips -- close to the MoodleClient 30s timeout and a real latency cliff for large courses. Fix: gather the per-user calls with ``asyncio.gather`` after the initial enrolment query. Result order is preserved by gather, so output stability is unchanged. Failed per-user lookups still return ``completed: null`` -- the helper closure catches ``MoodleAPIError`` / ``httpx.HTTPError`` and returns None instead of letting the exception abort the whole gather. Worst-case latency now bounded by the slowest single call (not the sum), turning a 20s sequential walk into a ~200ms parallel fan-out. The two existing tests (``test_get_course_completion_overview_tool`` and ``test_get_course_completion_overview_per_user_error``) pass unchanged because they assert on counts/totals, not on call order. 4,435 unit tests pass at 100% coverage. ruff check + format check clean.
Fifth-pass Sonnet review of PR soliplex#968 flagged a residual gap from the prior commit (1623eb2) that parallelised get_course_completion_overview with asyncio.gather. The new _fetch_status closure catches (MoodleAPIError, httpx.HTTPError) and returns None on failure -- but the path also runs the per-user payload through CompletionStatus.model_validate(...). If Moodle ever returns a structurally unexpected per-user record (missing completionstatus key, mistyped fields, etc.), pydantic raises ValidationError -- a ValueError subclass, NOT an httpx or Moodle exception -- which escapes _fetch_status uncaught, escapes asyncio.gather (no return_exceptions=True), and escapes _moodle_tool's narrower error-catching surface. The entire overview call would fail with an unhandled 500 instead of marking the offending user as completed: null. Fix: widen the except tuple in _fetch_status to also catch pydantic.ValidationError. Same return-None-on-failure contract preserves the existing test for the httpx/MoodleAPIError path and the gather invariants. Low-probability failure mode (requires Moodle to return a malformed per-user shape) but trivial guard. Added test_get_course_completion_overview_validation_error that feeds a malformed completion response and asserts the user appears with completed=None instead of the whole call blowing up. 4,436 unit tests pass at 100% coverage. ruff check + format check clean.
…_user
A 57-scenario MOODLE.md smoke pass against the current HEAD landed
at 46 PASS / 2 PARTIAL / 9 FAIL. Root-causing each failure
turned up a mix of seed-data gaps (handled separately in
moodle_sandbox/local/seed_data.php), test-rig limits in the
smoke driver (handled separately in the new /amia_smoke_test
skill), and three soliplex-side fixes captured here. After
all three streams land, the same smoke pass deterministically
closes 52 of 57 scenarios; the remaining five are LLM-routing
variance on long mid-conversation chains (gpt-oss occasionally
wanders to an adjacent skill or hallucinates IDs), not code
or seed issues, and are best addressed by a model upgrade
rather than further prompt tightening.
Stream A changes (this commit):
1. ``_resolve_rule_id`` is now suffix-tolerant. Previously the
resolver used ``needle in name`` substring matching, which
fails when the LLM appends a noun-class suffix from the
user's English phrasing (e.g. "the Standalone Cohort
Trigger rule" -> rule_name="Standalone Cohort Trigger
rule", which is NOT a substring of the stored name
"Standalone Cohort Trigger"). Bidirectional match closes
the case while preserving the existing 0/1/many ambiguity
branching for true conflicts. Closes scenarios 40 + 41.
2. ``_REPORTING_PROMPT`` now contains an explicit
anti-hallucination guard. When list_reports returns a
sparse result, gpt-oss has been observed to fabricate
plausible-looking report IDs (101 / 102 / 103) rather
than report the seed contents. The guard says: "your
response MUST contain exactly those N reports -- no
inferred names, no invented IDs, no extra rows". Closes
scenario 24.
3. ``MOODLE_ROUTER_PROMPT`` gains two new sections:
- "Tool argument precision" -- strip " rule" and
" department" noun-class suffixes from user phrasing
before passing to skills. Crucially, do NOT strip
" program" / " course" / " certification" suffixes
because those words commonly appear in the canonical
stored names ("Onboarding Program",
"Cybersecurity Basics", "Workplace Safety
Certification"). For those, use the exact spelling
from a prior list/search tool result.
- "Confirmation vs new request" distinction -- any
message that BEGINS with "yes" / "yeah" / "sure" /
"ok" / "go ahead" / "proceed" / "do it" / "confirm"
is a CONFIRMATION of the most recent preview, even
when followed by an imperative ("Yes, restore it" /
"Yes, archive it"). Conversely "Now restore it" /
"Then archive X" -- an imperative without an
affirmative prefix -- is a NEW request and gets its
own preview cycle.
Plus a concrete dispatch example for UTM / advanced
completion reports keyed by course name, instructing
the router to chain list_courses -> get_utm_report
rather than ask the user for a numeric course ID.
Closes scenario 25.
4. ``create_user`` drops the ``createpassword`` parameter
entirely. Previously the agent could pass
``createpassword=False`` on the confirmation call after
reconstructing the create from the preview, which causes
Moodle to reject the row with "must provide a password,
or set createpassword". The wrapper now hard-codes
``createpassword=1`` whenever no explicit password is
given -- the agent has no knob to mis-set. Closes
scenario 51.
Three new unit tests pin the resolver bidirectional case,
the reporting prompt guard text, and the router prompt's
tool-arg-precision section.
4,592 unit tests pass at 100% coverage. ruff check + format
check clean.
f315db2 to
08eddb2
Compare
# Conflicts: # pyproject.toml # uv.lock
Operators address courses, users, categories, cohorts, tenants, departments, reports, certifications, and programs by name, not by internal Moodle ID. Every agent tool now takes a string identifier and resolves it through a 0/1/many helper that accepts numeric ID, shortname/idnumber, exact name, or bidirectional substring; ambiguity returns candidate lists so callers can disambiguate. Adds get_courses_by_field on the client and resolver helpers (_resolve_course/user/category/cohort/tenant/department/report/ certification/program_identifier) plus a _parse_job_token wrapper for transient export/import job IDs. Removes the obsolete numeric-ID gates in certify_user, revoke_certification, allocate_users_to_*, assign_job, and friends. Backed by 716 moodle tests at 100% coverage; a fresh 57/57 MOODLE.md smoke pass exercises the new surface end to end with zero ID-bearing prompts.
Conflicts: CLAUDE.md — resolved by keeping the local deletion (cf0a289). Upstream changes pulled in (pyproject.toml, agents.py, examples.py, agui.py, rooms.py, and their tests) auto-merged cleanly; 4,780 unit tests still pass at 100% coverage, ruff check and ruff format both clean.
AMIA is the AFSOC-branded deployment of Soliplex; upstream code and tests should not carry that branding. The LTI resource-link claim fixture is the only tracked-tree hit.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a Moodle Workplace integration to Soliplex. Includes:
Architecture highlights
src/soliplex/moodle/agent.py—moodle_tools_agent_factorywires aSoliplexSkillToolsetin subagent mode plus a router prompt (MOODLE_ROUTER_PROMPT) with concrete dispatch examples and write-confirmation reconstruction guidance.src/soliplex/moodle/skills.py— 7 skill builders (build_courses_skill,_users_,_organisation_,_certifications_,_programs_,_rules_,_reporting_) with 100 async tool closures over a sharedMoodleClient. A module-level@_moodle_tooldecorator centralises theMoodleAPIError/httpx.HTTPError→ JSON-error pattern (~200 lines saved over per-tool try/except).src/soliplex/moodle/client.py— async httpx wrapper around Moodle's REST Web Services, persistent client with 30s timeout, 100 public methods.src/soliplex/moodle/models.py— ~48 Pydantic models for Moodle response shapes.src/soliplex/views/lti.py—/lti/login+/lti/launchendpoints, with inline CSS/HTML/JS constants for the chat + picker pages. Cookieless: state nonce is HMAC-signed; session token is embedded in the served HTML.src/soliplex/lti/{nonce,validation,platform,session}.py— JWT validation (RS256/JWKS), three-tier room resolution (target_link_uri→course_room_map→default_room_id), session-token claims extraction.src/soliplex/config/lti.py—LTIPlatformConfigdataclass.example/rooms/moodle-tools/room_config.yaml— wires the factory agent withgpt-oss:lateston Ollama; pluggable provider viaextra_config.example/lti/config.yaml— example platform registration.Two-stage write confirmation runs through the router: skill subagent returns a preview JSON; router renders it and asks for confirmation; on "yes" the router rebuilds the full tool call (parameters from the preview) and re-dispatches with
confirmed=True. Helped by a strengthened_CONFIRM_INSTRUCTIONSprompt that suppresses chain-of-thought preamble.Test plan
pytest— 4,589 tests pass at 100% coverage (25,852 stmts, 0 missing)ruff check src/ tests/— cleanruff format --check src/ tests/— 162 files already formattedgpt-oss:latest+ a local Moodle Workplace 5.0.2 sandbox showed 47 PASS / 7 PARTIAL / 3 FAIL with every non-pass attributed to model-behaviour quirks (gpt-oss hallucinating on empty tool results, same-thread preview shortcuts) or pre-existing seed-data gaps. Not re-run after the recent B1 (get_team_membersfilter), SF1 (update_courses/enrol_userswarnings-as-errors), SF2 (resolver ambiguity), action-key sweep across 19 write previews, N1 (LTI deployment_id error), N2 (get_course_completion_overviewparallelisation), orpydantic.ValidationErrorfixes. Re-verification against the current HEAD is required before flipping Draft → Ready.cyber101rename → real warning surfaced, course unchanged on disk)upstream/main(last rebase: HEADb5643f4contoupstream/main16d4982)Submitting as a draft for now while reviewers triage scope.