feat: add MiniMax as LLM provider with Guardian threat protection by octo-patch · Pull Request #1 · OraclesTech/guardian-sdk

octo-patch · 2026-03-23T12:34:38Z

Summary

Add MiniMax as a first-class provider integration for Guardian SDK. MiniMax offers its models (M2.7, M2.5) through an OpenAI-compatible API at https://api.minimax.io/v1. This provider wraps MiniMax-configured OpenAI clients with the same multi-layer threat detection pipeline used for OpenAI and Anthropic.

Changes

New provider: ethicore_guardian/providers/minimax_provider.py — MiniMaxProvider, ProtectedMiniMaxClient, ProtectedChat, ProtectedCompletions, and create_protected_minimax_client() convenience factory
Auto-detection: Updated get_provider_for_client() in base_provider.py to detect MiniMax clients by checking base_url for 'minimax'
Dependencies: Added minimax optional dependency group in pyproject.toml (uses openai>=1.0.0)
Documentation: Added MiniMax provider example and install instructions to README
Tests: 30 tests in tests/test_minimax.py — 22 unit tests + 5 integration tests + 3 constant tests
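
The auto-detection change can be pictured roughly as follows. This is a minimal sketch, not the code in base_provider.py: `detect_provider_name` and `FakeClient` are hypothetical names, and the module name is passed in explicitly as a stand-in for inspecting the client's type. It assumes only what the description states, namely that a MiniMax client is an OpenAI client whose base_url contains 'minimax'.

```python
from dataclasses import dataclass


@dataclass
class FakeClient:
    """Hypothetical stand-in for an OpenAI-style client object."""
    base_url: str


def detect_provider_name(client: object, module_name: str) -> str:
    """Return a provider key for a client (illustrative only).

    `module_name` stands in for type(client).__module__.
    """
    root = module_name.split(".")[0]
    if root == "openai":
        # MiniMax reuses the OpenAI client class, so only the endpoint differs.
        base_url = str(getattr(client, "base_url", "") or "").lower()
        if "minimax" in base_url:
            return "minimax"
        return "openai"
    if root == "anthropic":
        return "anthropic"
    raise ValueError(f"unsupported client module: {module_name}")
```

Because the substring check only runs after the module check, a non-OpenAI client with 'minimax' somewhere in its URL would not be misclassified.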

Supported Models

Model	Context
MiniMax-M2.7	1M tokens
MiniMax-M2.7-highspeed	1M tokens (fast)
MiniMax-M2.5	204K tokens
MiniMax-M2.5-highspeed	204K tokens (fast)

Usage

import openai
from ethicore_guardian import Guardian, GuardianConfig
from ethicore_guardian.providers.minimax_provider import MiniMaxProvider

guardian = Guardian(config=GuardianConfig(api_key="my-app"))

minimax_client = openai.OpenAI(
    api_key="your-minimax-api-key",
    base_url="https://api.minimax.io/v1",
)

provider = MiniMaxProvider(guardian)
client = provider.wrap_client(minimax_client)

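# Placeholder for untrusted input coming from your application.
user_input = "Summarize this document for me."

# The wrapped client runs Guardian's threat detection before the request is sent.
response = client.chat.completions.create(
    model="MiniMax-M2.7",
    messages=[{"role": "user", "content": user_input}],
)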

Test plan

All 30 tests (unit, integration, and constant) pass with mocked Guardian
Verify no regressions in existing OpenAI/Anthropic providers
Manual smoke test with real MiniMax API key (optional)

Community edition (MIT framework + open-core model):
- 5-category threat detection (instructionOverride, jailbreakActivation, safetyBypass, roleHijacking, systemPromptLeaks)
- HMAC-SHA256 license key validator (PRO/ENT tiers)
- License-aware PatternAnalyzer and SemanticAnalyzer with 4-step asset resolution chain (assets_dir -> ~/.ethicore -> package)
- Async Guardian orchestrator with OpenAI, Anthropic, and Ollama providers
- ML inference engine with graceful heuristic fallback (no torch required)
- Full pytest suite: 100 passing, 61 skipped (require license + asset bundle)
- pyproject.toml build config; wheel verified clean (no proprietary assets)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…framing

Transforms the minimal technical reference into a trust-building document that leads with the problem (prompt injection shipped without a real defense), backs it with technical proof (4-layer pipeline, ONNX offline inference, ~15ms p99 latency), and makes the ethical conviction concrete through the Guardian Covenant framework reference.

Key changes:
- Hero: 'Only' positioning statement + founding insight sentence
- Added 'See It Work' section with attack demo in 4 lines
- Added 'Why Offline Inference Matters' section (key differentiator vs cloud APIs)
- Expanded Community vs Licensed comparison table (30 rows, all categories named)
- Fixed category count: 30 (was incorrectly stated as 25+ in old README)
- Added Guardian Covenant framework reference with link placeholder
- Added Community & Discussions section to activate GitHub Discussions
- Closing line: the conviction sentence that anchors the brand

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…users

Previous version incorrectly framed Guardian SDK as protecting users from AI. Corrected to the accurate frame: Guardian protects the developer's AI system (and its integrity, data, and designed behavior) from adversarial attackers using prompt injection, jailbreaks, and role hijacking.

Key changes:
- Hero: focuses on real-time threat detection and blocking before model context
- Rewrote opening to frame the attack surface and the defender (the developer)
- Added 'What It Defends Against' section listing specific attack vectors
- Guardian Covenant reframed: developer's commitment to defend what they build
- Closing: 'You built something that people rely on. Defend it.'
- Removed all language implying users are protected FROM the AI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ncoding

Critical bug fixes discovered and resolved during full-suite audit:

ONNX / ML layer (ml_inference_engine.py, guardian.py)
- MLInferenceEngine now accepts assets_dir param; SimpleOrchestrator passes it correctly (guardian-model.onnx was never being located)
- Added ONNX Runtime inference path with calibration gate; falls back to heuristics if model outputs >0.4 avg probability on benign inputs (model requires retraining before contributing to scoring)
- Fixed unnormalized linguistic features: raw len(text) -> min(1.0, len/500), preventing ONNX input saturation

Licensed pattern library (threat_patterns.py)
- Community stub now performs license-aware dynamic loading at import time
- `from ethicore_guardian.data.threat_patterns import THREAT_PATTERNS` transparently returns 30-category licensed library when ETHICORE_LICENSE_KEY + asset bundle are present; falls back to community 5-category stub otherwise
- Fixes: 53 licensed-tier test failures now pass (157/157 total)

Semantic layer (semantic_analyzer.py)
- vocab.json + special_tokens.json confirmed in asset bundle and package data; semantic analyzer uses full 30,522-token vocabulary

Scoring and display (guardian.py, threat_detector.py)
- Layer votes display was always showing BLOCK regardless of actual analysis result; fixed to threshold-based per-layer decisions
- Cleaned up SimpleOrchestrator analyzer wiring

Windows / encoding (all analyzer files, __init__.py)
- Replaced emoji in print() calls with ASCII equivalents across all modules to prevent UnicodeEncodeError on cp1252 terminals

Test suite
- 96/96 community tests pass (no license key required)
- 157/157 licensed tests pass (with ETHICORE_LICENSE_KEY)
- 15/15 attack detection in live demo (100%)
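
The calibration gate and the length normalization described in this commit can be sketched as below. This is an illustration under stated assumptions, not the code in ml_inference_engine.py: the function names and the `predict_threat_prob` callable are hypothetical; only the 0.4 threshold and the `min(1.0, len/500)` formula come from the commit message.

```python
from statistics import mean
from typing import Callable, Sequence

# Threshold quoted in the commit message; everything else here is illustrative.
CALIBRATION_THRESHOLD = 0.4


def normalized_length(text: str) -> float:
    """Clamp the length feature into [0, 1] so long inputs cannot saturate the model."""
    return min(1.0, len(text) / 500)


def passes_calibration(
    predict_threat_prob: Callable[[str], float],
    benign_texts: Sequence[str],
    threshold: float = CALIBRATION_THRESHOLD,
) -> bool:
    """Return True if the model looks sane on known-benign inputs.

    A model whose average threat probability on benign text exceeds the
    threshold is treated as miscalibrated, and the caller should fall back
    to heuristics instead of trusting it.
    """
    avg_benign_prob = mean(predict_threat_prob(t) for t in benign_texts)
    return avg_benign_prob <= threshold
```

A caller would run the gate once at load time and choose the ONNX path or the heuristic path based on the result.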

…tegories

Licensed tier (threat_patterns_licensed.py):
- Expanded from 241 patterns / 30 categories → 500 patterns / 51 categories
- Added 21 new attack-vector categories sourced from OWASP LLM Top 10 2025, MITRE ATLAS, Anthropic red-team research, Garak probe taxonomy, and the PLINY / social-media jailbreak community (X/Twitter, Reddit r/jailbreak): crescendoAttack, manyShotJailbreaking, cipherObfuscation, authorityImpersonation, sandboxExemption, delimiterInjection, outputFormatEscape, persistentPersona, contextWindowFlooding, falsePermissionClaim, legalJurisdictionBypass, professionalAuthorityBypass, researchExemption, reversePsychology, contrastiveExtraction, metaInstructionAttack, memorySeedingAttack, adversarialFormatting, goalHijackingChain, negationBypass, plinyStyleJailbreak
- Bulked up 3 thin existing categories (trainingDataExtraction, emotionalManipulation, multiTurnSetup) to 10-12 patterns each
- Bulked up 7 underpowered categories (harmfulContentGeneration, piiHarvesting, commandInjection, dataExfiltration, urgencyExploit, etc.)
- Semantic fingerprints: 234 → 444 across all 51 categories
- Synced live asset to ~/.ethicore/data/threat_patterns_licensed.py

Scripts:
- scripts/regenerate_embeddings.py: rewrote to be license-aware; resolves output path via CLI args > ETHICORE_ASSETS_DIR > ~/.ethicore > package; reads licensed fingerprints (444) when ETHICORE_LICENSE_KEY is set
- scripts/retrain_guardian_model.py: new script that generates synthetic training data from fingerprint library, trains sklearn MLPClassifier, exports to ONNX with correct dense_1_input/dense_4 interface, runs calibration gate before writing (Principle 14: Divine Safety)

Community patterns (threat_patterns.py):
- Added 6 gap-fix patterns to instructionOverride, safetyBypassAttempt, roleHijacking to close real-world jailbreak gaps

Tests:
- test_phase4_threat_library.py: updated hard count assertions (30→≥51, 234→≥444, 235→≥500); added v1.2.0 category coverage checks for all 21 new categories; switched to >= comparisons for forward compatibility

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
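
The output-path precedence quoted for regenerate_embeddings.py (CLI args > ETHICORE_ASSETS_DIR > ~/.ethicore > package) amounts to a first-match-wins chain. The sketch below is an assumption about how such a chain could look: `resolve_assets_dir` and its parameters are hypothetical, and treating `~/.ethicore` as usable only when it exists as a directory is a guess, not something the commit states.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_assets_dir(
    cli_arg: Optional[str],
    env: Mapping[str, str],
    home: Path,
    package_dir: Path,
) -> Path:
    """Pick the assets directory using a first-match-wins precedence."""
    # 1. An explicit command-line argument always wins.
    if cli_arg:
        return Path(cli_arg)
    # 2. Then the ETHICORE_ASSETS_DIR environment variable.
    env_dir = env.get("ETHICORE_ASSETS_DIR")
    if env_dir:
        return Path(env_dir)
    # 3. Then the per-user directory, if present (assumed condition).
    user_dir = home / ".ethicore"
    if user_dir.is_dir():
        return user_dir
    # 4. Finally, the directory bundled with the package.
    return package_dir
```

In a script this would typically be called with `os.environ` and `Path.home()`; passing them in as arguments keeps the function easy to test.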

retrain_guardian_model.py:
- Full rewrite: 20 000 samples (10k threat / 10k benign) with real MiniLM semantic embeddings on every training sample; eliminates the feature-starvation bug where only 6/127 features were non-zero
- 600+ benign templates across 6 domains; 75 assistant-phrasing phrases oversampled 3x to prevent false positives on 'how can I help' inputs
- 45 hard-negative security-research sentences (labeled BENIGN) prevent flagging legitimate AI-safety research
- Fixed self-check: now compares sklearn vs ONNX probabilities using the calibration cal_X vectors rather than [0.01]*27 placeholder embeddings
- Fixed ONNX Gather axis bug: explicit 'probabilities' string match prevents matching 'label [N]' (1D) instead of 'probabilities [N,2]'
- Fixed Windows console Unicode errors: replaced all checkmark/arrow symbols with ASCII [OK]/[FAIL]/[WARN]
- Corrected docstring sample/architecture defaults (20000, 128,64)

.gitignore:
- Added fix_*.py, *.bak, *.tmp to suppress scratch/temp files

Trained model stats (v1.2.0 release):
Samples: 20 000 | Accuracy: 99.83% | AUC-ROC: 0.9996
Calibration: avg benign prob 0.0337 (all 3 texts < 0.11), PASSED
Output: guardian-model.onnx 98 KB (gitignored; in asset bundle)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…res()

Root cause: _build_feature_vector() used text-derived behavioral proxy features (char_len/500, word_count/100, ...) while extract_features() uses sentinel defaults ([0.5, 1.0, 0.0, 0.0, ...]) when no behavioral session data is available. Features [0] and [1] differed by +0.44 and +0.93 respectively. The model learned those high sentinel values as threat-correlated, causing avg_benign_prob=0.990 at MLInferenceEngine load-time calibration and the engine falling back to heuristics.

Fixes applied to retrain_guardian_model.py:
1. _build_feature_vector() now mirrors extract_features() exactly:
   - Behavioral [0:40]: sentinel defaults [0.5, 1.0, 0.0, 0.0, ...]
   - Linguistic [40:75]: same 5 text-derived computations as engine
   - Technical [75:100]: sentinel defaults [0.1, 0.0, ...]
   - Semantic [100:127]: real MiniLM or [0.01]*27 null placeholder
   Verified: 0 feature mismatches across all test texts.
2. Null-semantic injection: 20% of training samples use [0.01]*27 for the semantic slot, teaching the model that null semantic signal does not imply threat (handles calibration + SemanticAnalyzer unavailable edge cases gracefully).
3. Calibration gate in retrain script now uses [0.01]*27 (matching extract_features exactly) instead of hash-based fallback; ensures the gate that passes here is the same gate MLInferenceEngine runs.

Result (v1.2.0 final model, hash a9433737b58720c8):
Accuracy: 95.47%
AUC-ROC: 0.9935
Calibration: avg benign prob 0.020 (all 3 texts < 0.03) -- PASSED
Engine load: guardian-model-onnx calibration passed (avg: 0.020)
Smoke test: 3 benign ALLOW, 3 threats BLOCK (ML votes 73-99%)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
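
The 127-slot layout listed in this commit can be made concrete with a small sketch. It is illustrative only: `build_feature_vector` is a hypothetical stand-in for the real `_build_feature_vector()`, the sentinel values the commit elides with "..." are assumed to be 0.0, and of the five linguistic computations only the normalized length (from an earlier commit) is filled in, since the others are not described.

```python
from typing import List, Optional, Sequence

VECTOR_SIZE = 127
BEHAVIORAL = slice(0, 40)
LINGUISTIC = slice(40, 75)
TECHNICAL = slice(75, 100)
SEMANTIC = slice(100, 127)
NULL_SEMANTIC = [0.01] * 27  # placeholder used when no embedding is available


def build_feature_vector(
    text: str, semantic: Optional[Sequence[float]] = None
) -> List[float]:
    """Assemble one feature vector using the slot layout from the commit."""
    vec = [0.0] * VECTOR_SIZE
    # Behavioral block: sentinel defaults; slots after the first four are
    # assumed to stay 0.0 (the commit message elides them).
    vec[0:4] = [0.5, 1.0, 0.0, 0.0]
    # Linguistic block: only the clamped length feature is shown here.
    vec[LINGUISTIC.start] = min(1.0, len(text) / 500)
    # Technical block: first sentinel is 0.1, the rest assumed 0.0.
    vec[TECHNICAL.start] = 0.1
    # Semantic block: a real embedding if given, else the null placeholder.
    sem = list(semantic) if semantic is not None else list(NULL_SEMANTIC)
    if len(sem) != SEMANTIC.stop - SEMANTIC.start:
        raise ValueError("semantic block must contain exactly 27 values")
    vec[SEMANTIC] = sem
    return vec
```

The point of the fix is that training and inference must build this vector the same way; sharing one function like this between both paths is one way to guarantee that.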

…omment

_SECRET_MASKED has been populated since v1.2.0 retrain; the "all zeros" warning comment was leftover setup boilerplate and was factually incorrect. Removed to prevent confusion for any future reviewer of the source.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Remove ethicore_guardian/analyzers/ and guardian.py from git tracking (core IP; distributed via paid asset bundle only, not open source)
- Gitignore analyzers/, guardian.py, and proprietary test files to prevent accidental future commits
- Update README: 51 categories / 500+ patterns / 444+ fingerprints
- Add bi-directional six-layer pipeline docs (pre-flight + post-flight)
- Update Community vs Licensed table with post-flight and learning rows
- Update GuardianConfig reference with Phase 3 parameters

History scrub required: analyzers/ and guardian.py exist in earlier commits and must be purged with git-filter-repo before history is clean.

Phase 3 release — OutputAnalyzer (post-flight gate) + AdversarialLearner (closed-loop learning). Adds analyze_response() public API and GuardianConfig output/learning fields.

- Add 80-entry _BENIGN_CASUAL domain: short greetings, informal requests, phrases containing 'benign'/'normal'. These are the exact inputs that produced false positives (ML BLOCK on single-word greetings, 'Hello, help me...')
- Over-sample _BENIGN_CASUAL 4x in _ALL_BENIGN pool; add dedicated 20% sampling path in _make_benign_sample() (previously 0% representation)
- Increase default --samples from 20,000 to 30,000 (15k threat / 15k benign) to ensure all 51 licensed categories are thoroughly represented
- Adjust _make_benign_sample() weights: 20% casual (new), 20% assistant phrasing, 15% hard negatives, 45% pool
- Update calibration gate hint and docstring to reference 30,000

Licensed retrain results (51 categories / 444 fingerprints):
Accuracy: 95.33% | AUC-ROC: 0.9940 | Avg benign prob: 0.0139
Calibration gate: PASSED
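
The benign sampling weights listed above (20% casual, 20% assistant phrasing, 15% hard negatives, 45% pool) describe a weighted choice over four sources. A minimal sketch of that choice follows; `pick_benign_source` and the dictionary keys are hypothetical names, not the internals of `_make_benign_sample()`.

```python
import random

# Weights quoted in the commit message; the key names are illustrative.
BENIGN_SOURCE_WEIGHTS = {
    "casual": 0.20,
    "assistant_phrasing": 0.20,
    "hard_negative": 0.15,
    "pool": 0.45,
}


def pick_benign_source(rng: random.Random) -> str:
    """Choose which benign source the next training sample is drawn from."""
    sources = list(BENIGN_SOURCE_WEIGHTS)
    weights = list(BENIGN_SOURCE_WEIGHTS.values())
    # random.choices performs weighted sampling with replacement.
    return rng.choices(sources, weights=weights, k=1)[0]
```

Passing in a seeded `random.Random` instance keeps dataset generation reproducible between retraining runs.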

Add MiniMax (https://www.minimax.io) as a first-class provider integration for Guardian SDK. MiniMax offers powerful LLM models (M2.7, M2.5) through an OpenAI-compatible API, and this provider wraps MiniMax-configured OpenAI clients with the same threat detection pipeline used for OpenAI and Anthropic.

Changes:
- Add minimax_provider.py with MiniMaxProvider, ProtectedMiniMaxClient, and create_protected_minimax_client() convenience factory
- Add MiniMax auto-detection in get_provider_for_client() via base_url
- Add minimax optional dependency group in pyproject.toml
- Add MiniMax provider example and install instructions to README
- Add 30 tests (22 unit + 5 integration + 3 constant tests)

OraclesTech · 2026-04-05T04:22:22Z

Hey @octo-patch, really appreciate you taking the time to contribute to the Guardian SDK! It's clear you read the codebase carefully: the proxy structure, async/sync handling, and docstrings all follow the existing patterns closely, and the mock-based test suite is on par. Great first PR.

Before we merge, there are two things that need to be addressed:

guardian.py was not updated
get_provider_for_client() will correctly identify a MiniMax client, but guardian.wrap() then looks up self.providers['minimax'], which doesn't exist because _setup_providers() was never updated to register the new provider. This will raise a KeyError at runtime, making the provider unreachable through the public API. You'll need to add the registration there alongside the existing OpenAI/Anthropic entries.
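
The failure mode described here is a plain dictionary lookup on a key that was never registered. The sketch below reproduces it with stand-ins: `MiniGuardian` and `StubProvider` are hypothetical classes, and the `register_minimax` flag exists only to show both the broken and the fixed state side by side; the real `_setup_providers()` in guardian.py is not shown in this PR.

```python
from typing import Dict, Tuple


class StubProvider:
    """Hypothetical stand-in for a provider class such as MiniMaxProvider."""

    def __init__(self, name: str) -> None:
        self.name = name

    def wrap_client(self, client: object) -> Tuple[str, object]:
        return (self.name, client)


class MiniGuardian:
    """Toy model of the provider registry (not the real Guardian class)."""

    def __init__(self, register_minimax: bool) -> None:
        self.providers: Dict[str, StubProvider] = {}
        self._setup_providers(register_minimax)

    def _setup_providers(self, register_minimax: bool) -> None:
        self.providers["openai"] = StubProvider("openai")
        self.providers["anthropic"] = StubProvider("anthropic")
        if register_minimax:
            # The registration the review asks for.
            self.providers["minimax"] = StubProvider("minimax")

    def wrap(self, client: object, detected: str) -> Tuple[str, object]:
        # Raises KeyError when detection returns a name nobody registered.
        return self.providers[detected].wrap_client(client)
```

With `register_minimax=False`, detection succeeds but `wrap()` raises `KeyError: 'minimax'`, which is the runtime error described above.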

Fail-open on empty prompt (minimax_provider.py, line 283)
The if prompt_text and prompt_text.strip(): guard means that when extract_prompt() returns None or an empty string (because of a malformed payload or an unusual message format), threat analysis is skipped entirely and the request passes through unblocked. This is fail-open, which goes against the core security contract of Guardian SDK (Principle 14). The existing providers don't have this escape hatch. Either raise on an empty prompt or treat it as a challenge, but never silently allow.
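
One way to make the guard fail closed is to invert it so that a missing prompt is an error rather than a bypass. The sketch below takes the "raise on an empty prompt" option; `guarded_call`, `EmptyPromptError`, and the `analyze`/`send` callables are hypothetical names for illustration, not the provider's actual interface.

```python
from typing import Callable, Optional


class EmptyPromptError(ValueError):
    """Raised when no analyzable prompt text could be extracted."""


def guarded_call(
    prompt_text: Optional[str],
    analyze: Callable[[str], bool],
    send: Callable[[], str],
) -> str:
    """Forward a request only after it has actually been analyzed.

    `analyze` returns True when the prompt is judged a threat;
    `send` performs the real API call.
    """
    # Fail closed: nothing to analyze means nothing gets forwarded.
    if not prompt_text or not prompt_text.strip():
        raise EmptyPromptError("no analyzable prompt text; refusing to forward")
    if analyze(prompt_text):
        raise PermissionError("request blocked by threat analysis")
    return send()
```

The difference from the original guard is that there is no code path on which `send()` runs without `analyze()` having run first.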

Two smaller things worth a follow-up (not blocking):

The base_url string match works here since it's nested inside the OpenAI module check, but a comment explaining the reasoning would help future maintainers.
The attribute-copying loop in ProtectedCompletions.__init__ diverges from the lazy __getattr__ delegation pattern used in the OpenAI and Anthropic providers; worth aligning for consistency.
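
For readers unfamiliar with the pattern, lazy delegation through `__getattr__` looks roughly like this. It is a generic sketch, not the code in the OpenAI or Anthropic providers: `LazyProxy` and `Inner` are hypothetical classes used to show why delegation tracks the wrapped object while an eager copy in `__init__` goes stale.

```python
class Inner:
    """Hypothetical wrapped object, standing in for a completions resource."""

    def __init__(self) -> None:
        self.timeout = 30

    def create(self, **kwargs: object) -> dict:
        return {"raw": kwargs}


class LazyProxy:
    """Forward unknown attributes to the wrapped object on demand."""

    def __init__(self, wrapped: object) -> None:
        self._wrapped = wrapped

    def create(self, **kwargs: object) -> dict:
        # An overridden method is where protection logic would run
        # before handing off to the wrapped object.
        result = self._wrapped.create(**kwargs)
        result["checked"] = True
        return result

    def __getattr__(self, name: str) -> object:
        # Python calls __getattr__ only when normal lookup fails, so
        # _wrapped and create never reach this method.
        return getattr(self._wrapped, name)
```

Since lookups happen at access time, attributes that change or appear on the wrapped object after construction remain visible through the proxy, which a copying loop cannot offer.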
Fix the two blocking issues and this is good to go. Thanks again for the contribution brother, looking forward to adding MiniMax support into Guardian SDK!

P.S. Sorry this took a little while to get back to you!

Oracles Technologies LLC and others added 13 commits February 25, 2026 13:17

Merge: keep SDK LICENSE over GitHub default

0ad4839

chore: bump version to 1.3.0

482e7dc


OraclesTech force-pushed the main branch from ab48c02 to 4d4090a on April 26, 2026 17:32

octo-patch wants to merge 13 commits into OraclesTech:main from octo-patch:feature/add-minimax-provider
