feat: add evaluation script for gemma model#85

Open
Ki-Seki wants to merge 3 commits into main from eval/gemma

Ki-Seki (Member) commented Mar 18, 2026

No description provided.

Copilot AI review requested due to automatic review settings March 18, 2026 08:53
Copilot AI (Contributor) left a comment

Pull request overview

Adds a bash script under results/260318-gemma-eval/ to run a suite of gimbench evaluations for a Gemma-based model (plus baseline runs against google/gemma-3-270m-it), including an initial hf download step.

Changes:

  • Add an evaluation shell script that runs PPL, regex match, multiple MCQA benchmarks, and CV parsing.
  • Introduce configuration variables for model name and API connectivity.
  • Schedule an automatic host shutdown after running the evaluations.
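The overall shape described in the list above can be sketched roughly as follows. This is an illustration only, not the script from the PR: the `run` helper and the `DRY_RUN` switch are hypothetical, the per-model step is a placeholder because the actual gimbench invocations are not shown in this conversation, and the shutdown delay of five minutes is an assumption.

```shell
# Configuration, overridable from the environment.
MODEL="${MODEL:-Sculpt-AI/2603171-gemma}"
BASELINE="google/gemma-3-270m-it"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 prints commands instead of running them

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

# Fetch the model before evaluating it.
run hf download "$MODEL"

# Placeholder loop: the real script runs the gimbench evaluations here.
for m in "$MODEL" "$BASELINE"; do
  run echo "evaluate $m"
done

# Schedule a host shutdown once everything has finished (delay is an assumption).
run sudo shutdown -h +5
```

Defaulting to a dry run keeps the shutdown step from firing by accident while the script is being edited; setting `DRY_RUN=0` would execute the commands for real.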

Comment thread results/260318-gemma-eval/*eval.sh
Comment thread results/260318-gemma-eval/*eval.sh
Comment thread results/260318-gemma-eval/*eval.sh Outdated
Comment on lines +5 to +7
$API_KEY=xxx
$API_BASE=xxx
$MODEL=Sculpt-AI/2603171-gemma
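The text of this thread is not shown here, but one likely concern with the quoted lines is the leading `$`. Bash does not treat `$NAME=value` as an assignment: it expands `$NAME` first and then tries to run the result as a command. A minimal sketch of the assignment form, keeping the placeholder values from the diff:

```shell
# Assign without a leading `$`; the `$` prefix is only used when reading a variable.
API_KEY="xxx"     # placeholder value, as in the diff
API_BASE="xxx"    # placeholder value, as in the diff
MODEL="Sculpt-AI/2603171-gemma"

# Export so that child processes started by the script can see the values.
export API_KEY API_BASE MODEL

echo "model=${MODEL}"
```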
Comment thread results/260318-gemma-eval/*eval.sh
Comment thread results/260318-gemma-eval/*eval.sh Outdated
Comment on lines +5 to +6
$API_KEY=xxx
$API_BASE=xxx
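Since these two lines hold credentials and an endpoint, one option is to read them from the environment instead of writing them into a committed script. The sketch below assumes that approach; `require_env` is a hypothetical helper, not something taken from the PR.

```shell
# Hypothetical helper: succeed only if the named variable is set and non-empty.
require_env() {
  # $1 is the variable name; eval reads its value indirectly (POSIX-compatible).
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

# Check both settings up front so a missing value is reported before any run starts.
if require_env API_KEY && require_env API_BASE; then
  echo "credentials found in environment"
else
  echo "set API_KEY and API_BASE before running the evaluations"
fi
```

In a real script the `else` branch would typically end with `exit 1` so that no evaluation starts without credentials.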
Comment thread results/260318-gemma-eval/*eval.sh
2 participants