Evaluate and Improve Example Workflows #219

@jerop

Problem

Currently, we have no systematic way to:

  • Measure how well our example workflows perform their intended tasks
  • Test prompt improvements before deploying them
  • Compare different prompt variations or model configurations
  • Validate that changes don't regress quality
  • Provide quality benchmarks for the community

Solution

Use the Gemini CLI evaluation framework to systematically test and improve the effectiveness of prompts and configurations used in our example workflows. This will enable data-driven optimization of our provided workflows and give the community tools to evaluate their own Gemini CLI automations.
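As a rough illustration of the kind of comparison this would enable, here is a minimal Python sketch. It does not use the Gemini CLI evaluation framework's API; every name in it (`Case`, `score_prompt`, `find_regressions`, `fake_run`, and the `issue-triage` workflow name) is hypothetical, and the model call is replaced by a deterministic stub so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    """One evaluation case: an input and text a good answer must contain."""
    input_text: str
    expected: str


def score_prompt(prompt: str, cases: list[Case],
                 run: Callable[[str, str], str]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    if not cases:
        return 0.0
    passed = sum(
        case.expected.lower() in run(prompt, case.input_text).lower()
        for case in cases
    )
    return passed / len(cases)


def find_regressions(baseline: dict[str, float],
                     candidate: dict[str, float],
                     tolerance: float = 0.0) -> list[str]:
    """List workflows whose candidate score fell below baseline by more than tolerance."""
    return [
        name for name, base in baseline.items()
        if candidate.get(name, 0.0) < base - tolerance
    ]


def fake_run(prompt: str, input_text: str) -> str:
    # Hypothetical stand-in for a real model call; deterministic on purpose.
    if "label" in prompt:
        return "bug" if "crash" in input_text else "question"
    return "unknown"


cases = [
    Case("App crashes on start", "bug"),
    Case("How do I configure auth?", "question"),
]

# Score two prompt variants for one (hypothetical) workflow.
baseline = {"issue-triage": score_prompt("Summarize this issue.", cases, fake_run)}
candidate = {"issue-triage": score_prompt("Assign one label: bug or question.", cases, fake_run)}

print(baseline, candidate, find_regressions(baseline, candidate))
```

Substring matching is the simplest possible scoring rule and is used here only to keep the sketch short; a real evaluation would likely need richer metrics per workflow.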

Dependencies

References

Labels

  • area/prompts
  • area/quality: Tracks quality issues
  • kind/enhancement: New feature or request
  • priority/p0: Critical and urgent (e.g., critical security vulnerability, major breakage)
