Skip to content

feat: add kosli evaluate input subcommand#743

Open
tooky wants to merge 13 commits intomainfrom
evaluate-input
Open

feat: add kosli evaluate input subcommand#743
tooky wants to merge 13 commits intomainfrom
evaluate-input

Conversation

@tooky
Copy link
Copy Markdown
Contributor

@tooky tooky commented Mar 28, 2026

Summary

  • Adds kosli evaluate input — evaluate a local JSON file (or stdin) against a Rego policy, with no API dependency
  • Enables local policy development and testing without needing a running Kosli server
  • Input shape is opaque to the command — the policy defines what it expects

Prompted by this discussion where we realised kosli evaluate trail always hits the API, so there's no way to iterate on policies locally. Dan raised conftest as the alternative, but our tooling should support this natively — Rego is our language choice, and local testing should be turnkey.

In response to Alex's comparison of Rego vs pipeline controls — as more controls move into kosli evaluate, a fast local feedback loop becomes essential.

Usage

# evaluate a local JSON file against a policy:
kosli evaluate input \
    --input-file trail-data.json \
    --policy policy.rego

# read from stdin:
cat trail-data.json | kosli evaluate input --policy policy.rego

# inspect what the policy sees:
kosli evaluate input \
    --input-file trail-data.json \
    --policy policy.rego \
    --show-input --output json

Test plan

  • All new tests pass (go test -v -run TestEvaluateInputCommandTestSuite ./cmd/kosli/)
  • loadInput(io.Reader) unit tests pass
  • make lint clean
  • Manual: kosli evaluate input --help shows expected output
  • Manual: evaluate a saved --show-input JSON against a policy

🤖 Generated with Claude Code

tooky and others added 12 commits March 28, 2026 14:27
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Avoid restating the short description; the second sentence now adds
new information (file or stdin) rather than repeating.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
loadInputFromFile now delegates to loadInput after opening the file,
eliminating the duplicated JSON unmarshal logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Mar 28, 2026

Claude finished @tooky's task in 2m 39s —— View job


PR Review: kosli evaluate input subcommand

  • Phase 1: Static analysis (build, vet, mod tidy)
  • Phase 2: Read and review changed files
  • Phase 3: Post review feedback

Static Analysis

All clean — go build, go vet, and go mod tidy pass with no issues.

Review Summary

This is a clean, well-structured addition. The command does exactly what it says — local JSON + Rego evaluation with no API dependency. The TDD commit history is disciplined, and the test coverage is solid.

Two inline comments posted (both non-blocking):

  1. Flag registration duplication (evaluateInput.go:12-17) — evaluateInputOptions re-declares policyFile, output, and showInput fields that already exist in commonEvaluateOptions. Suggested a concrete approach using embedding + hidden flags. Echoes the earlier review comment but with a specific pattern.

  2. Testability of stdin path (evaluateInput.go:79) — os.Stdin is accessed directly in run(), making the stdin path harder to test with non-empty input. Suggested injecting cmd.InOrStdin() as a future improvement.

What's Good

  • loadInput / loadInputFromFile separation — clean io.Reader abstraction with proper deferred close error handling
  • Reuses evaluateAndPrintResult — no duplication of the core evaluation + output logic
  • Test coverage — 8 command-level cases covering: missing flags, allow/deny policies, non-existent file, invalid JSON, empty stdin, JSON output, and --show-input. Plus 2 unit tests for loadInput directly.
  • Help text and examples are clear and follow existing patterns
  • evaluate.go long description updated to mention the new subcommand — good attention to discoverability

Verdict

Looks good to merge as-is. The inline suggestions are improvements for a follow-up slice, not blockers.

- Remove duplicate Slice 3 entry in TODO.md
- Broaden parent evaluate command description to cover both
  API-backed trail evaluation and local input evaluation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tooky
Copy link
Copy Markdown
Contributor Author

tooky commented Mar 28, 2026

Addressed the review feedback in 841836c:

1. TODO.md duplicate — Removed the unchecked duplicate Slice 3 line.

2. Parent evaluate command description — Broadened the Short from "Evaluate Kosli trail data" to "Evaluate data" and rewrote the Long to mention both API-backed trail evaluation and local input evaluation. kosli evaluate --help now reads:

Use evaluate trail or evaluate trails to fetch data from Kosli and evaluate it.
Use evaluate input to evaluate a local JSON file or stdin without any API calls.

3. Flag registration duplication — Left as-is per the review (non-blocking). Three flags registered by hand is simpler than embedding commonEvaluateOptions and hiding flow/attestations.

4. --input-file in long desc — Left as-is. The flag is already documented in the flag help text and the "When --input-file is omitted" paragraph.

Comment on lines +12 to +17
type evaluateInputOptions struct {
inputFile string
policyFile string
output string
showInput bool
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit / future consideration: evaluateTrailOptions embeds commonEvaluateOptions and calls addFlags() to register the shared flags (policy, output, show-input). This struct re-declares those same three fields, and lines 62-64 duplicate the registration.

One option to avoid the duplication while keeping evaluate input free of the API-specific flags:

type evaluateInputOptions struct {
	commonEvaluateOptions
	inputFile string
}

Then in newEvaluateInputCmd, call o.addFlags(cmd, "...") and hide the flags that don't apply:

o.addFlags(cmd, "Path to a Rego policy file to evaluate against the input.")
cmd.Flags().StringVarP(&o.inputFile, "input-file", "i", "", "...")
// hide API-only flags
cmd.Flags().Lookup("flow").Hidden = true
cmd.Flags().Lookup("attestations").Hidden = true

Override RequireFlags to only require "policy" (not "flow").

This is the same observation from the earlier review comment — just fleshing out a concrete approach. Not blocking.

var err error

if o.inputFile == "" {
input, err = loadInput(os.Stdin)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: os.Stdin is a package-level global, which makes this path harder to test in isolation (the suite test for "missing --input-file reads from stdin" relies on the process stdin being empty). Consider accepting the reader as a parameter to run() — e.g. injecting cmd.InOrStdin() — so a future test could supply a non-empty stdin without touching the global.

Not blocking for this slice, since the empty-stdin error path is already tested.

@AlexKantor87
Copy link
Copy Markdown
Contributor

From Alex — feedback on kosli evaluate input

This is a strong addition that directly addresses a gap we've been working around in the agentic SDLC demo. We run 10 control gates in CI, each calling kosli evaluate trails against Rego policies — and every policy iteration currently requires a live Kosli org with real trail data. Our kosli-evaluate skill already documents a workaround using raw opa eval, but having this native in the CLI is much better.

How we'd use this immediately:

  1. Local policy development — capture a fixture with evaluate trail --show-input, then iterate with evaluate input instead of needing API access and an OPA install
  2. CI optimization — fetch trail data once and evaluate all 10 control policies against the same JSON instead of 10 independent API round-trips
  3. Pipeline debugging — test policy fixes against captured data before pushing and re-running the full pipeline

A few suggestions:

  • Add a fixture capture example to the help text. The examples show --input-file trail-data.json but don't show how to produce that file. Something like: kosli evaluate trail TRAIL --policy allow-all.rego --flow FLOW --show-input --output json > trail-data.json would make the workflow self-documenting.

  • Clarify the --show-input wrapper. When you capture with --show-input --output json, the result has an outer structure with allow, violations, and input keys. Should evaluate input receive the full output, or just the .input portion? The examples would benefit from being explicit about this.

  • Consider --data as a follow-up slice. Since this is the offline path, it's a natural place to add external config support (budget thresholds, expected counts, etc.). We currently hardcode these values in Rego because kosli evaluate has no --data flag. Not a blocker for this PR, but worth tracking.

Implementation looks clean — good reuse of evaluateAndPrintResult(), solid test coverage, and the stdin piping support is a nice touch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants