Plugin: ghost_judge#316
Open
0sicario wants to merge 1 commit into
Open
Conversation
Post-task quality gate using a separate judge model to verify agent work. Domain-aware evaluation for OSINT, design, and research tasks. Configurable judge model with reasoning support via OpenRouter. Co-Authored-By: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Ghost Judge — Silent LLM Quality Gate
Your agent says "done." But is it?
Ghost Judge uses a separate LLM to evaluate your agent's work after every response. Set a
/goal, and the judge ensures completeness, cross-referencing, and evidence-backing before presenting results. If the work isn't done, your agent keeps refining — automatically.What it does
/goaland/subgoalslash commands (requires Commands plugin)The pitch
A $0.03 judge call turns a cheap LLM into premium-tier output. The capability was always there — the accountability wasn't.
Tested on
Requirements
OPENROUTER_API_KEYenv var)/goaland/subgoalRepository
https://github.com/Kironkeys/ghost-judge
Looking for community testers
Especially interested in results from local LLM setups (Llama, Qwen, Mistral, DeepSeek). Can a small local judge effectively gate a larger agent? Help us find out.