
Update evals docs: clarify global evaluators opt-in behavior (PR #14682)#618

Open
claude[bot] wants to merge 1 commit into main from docs/TSP-1230

Conversation


claude[bot] commented May 12, 2026

Summary

  • Clarified that global Evaluators are not selected by default in evaluation modals (Run Test Set, Run Scenario, Evaluate Selected Tasks) — users must explicitly opt in via Additional global checks
  • Updated both the Global Evaluators note in the "Understanding Evaluators" section and the numbered steps in "Running evaluations" to reflect this opt-in behavior
  • Added FAQ entry documenting the 10-evaluator limit per scenario (increased from 5 to 10)

Closes Linear issue TSP-1230
Relates to GitHub PR #14682

Test plan

  • Verify "Running evaluations" step 2 accurately describes the opt-in flow for Additional global checks
  • Verify the Note under "Understanding Evaluators > Global Evaluators" clearly communicates opt-in behavior
  • Verify the new FAQ accordion renders correctly and content is accurate
  • Check all headings are sentence case and no banned words used

🤖 Generated with Claude Code

Global evaluators are not selected by default in evaluation modals.
Users must explicitly opt in via 'Additional global checks'. Also adds
FAQ entry documenting the 10-evaluator limit per scenario.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
claude[bot] added the docs-drafter (Documentation drafted by Claude) label May 12, 2026

linear[bot] commented May 12, 2026

TSP-1230

@github-actions (Contributor)

🎯 Vibe check

Reviewed: 1 file (1 with issues, 0 clean)

Scores

| Dimension | Score | What's holding it back |
|---|---|---|
| 🟡 Consistency | 7/10 | Bold label inside `<Info>` callout (line 8); "tool" lowercase in multiple places where Relevance AI's Tool product feature is meant (lines 96–101, 157–160); bullet list inside an `<Accordion>` (lines 362–367). |
| 🟢 Technical clarity | 9/10 | Edge cases are well covered (truncation behavior, global Evaluator opt-in, credit breakdown). UI element names are specific. Minor: "tool" vs "Tool" inconsistency could cause product-navigation confusion. |
| 🟢 Non-technical clarity | 9/10 | Good overview before instructions; example scenarios in accordions are excellent anchors. FAQ is thorough. |
| 🟡 Structure | 7/10 | Best practices section uses `<CardGroup>` for non-navigable tips; bullet list inside FAQ accordion; no closing CTA for what is largely a concept/overview page. |

Score key: 🟢 9–10, 🟡 6–8, 🔴 1–5.

Overall vibe: Solid, thorough feature documentation — the example scenarios, edge-case callouts (truncation, Evaluator opt-in), and FAQs are genuinely useful and show real craft. A handful of mechanical CLAUDE.md violations (bold label in callout, bullet list inside accordion, inconsistent product-term casing) need tidying, but the content and organization are strong.

🔧 Issues (5)
  • build/agents/build-your-agent/evals.mdx:8 — **Rollout Status**: is a bold label inside a callout. CLAUDE.md explicitly prohibits bold labels inside callouts. Drop the label; the content stands on its own: Evals is currently being rolled out progressively, starting with Enterprise customers. If you're an Enterprise customer and don't see this feature yet, reach out to your account manager to discuss access.

  • build/agents/build-your-agent/evals.mdx:96,100,101 — Inside the "Tool Usage" accordion, "tool" is lowercase three times when referring to Relevance AI's Tool product feature. "Checks whether a specific tool was used" → "…a specific Tool was used". Same for "Select the tool to check for", "Whether the tool was used", and "if the tool was used".

  • build/agents/build-your-agent/evals.mdx:157–160 — Same capitalization issue in the Tool simulations description: "emulate tool usage without actually calling the tools" → "…without actually calling the Tools"; "Select a tool to simulate" → "Select a Tool to simulate". (The generic phrase "tool usage" in the same sentence is acceptable lowercase since it's a description, not the feature name.)

  • build/agents/build-your-agent/evals.mdx:362–367 — Bullet list inside an <Accordion> (FAQ: "How are credits calculated for evaluations?"). CLAUDE.md requires flowing sentences in accordion content. Convert to prose: "Credits for each scenario are calculated from three components: the Agent task run (the conversation itself), the simulator LLM that plays the user persona, and each Evaluator LLM that scores the conversation — both scenario-level and global."
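A minimal sketch of the converted accordion, assuming the Mintlify-style `<Accordion>` component already used on the page and the FAQ title quoted above:

```mdx
<Accordion title="How are credits calculated for evaluations?">
  Credits for each scenario are calculated from three components: the Agent
  task run (the conversation itself), the simulator LLM that plays the user
  persona, and each Evaluator LLM that scores the conversation — both
  scenario-level and global.
</Accordion>
```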

  • build/agents/build-your-agent/evals.mdx:383–385 — The closing sentence of the "What happens when a conversation is truncated?" accordion reads "…disabling truncation and selecting a model with a larger context window is preferable." The subject-verb agreement is slightly off (two actions, singular is). Change to: "…disabling truncation and selecting a model with a larger context window are preferable."

🧩 Component suggestions (1)
  • build/agents/build-your-agent/evals.mdx:331–347 — The "Best practices" section uses <CardGroup cols={2}> for five non-navigable advisory tips. CLAUDE.md says <CardGroup> is not appropriate for "best practices that read naturally as flowing bullets." These tips read like a bullet list with a short title per item. A <CardGroup> is more defensible when items are at least navigable or represent equal parallel choices; here they're just advice. Consider converting to a plain numbered or bulleted list, or an <AccordionGroup> if you want to keep them skimmable.
🏗️ Page structure (1)
  • build/agents/build-your-agent/evals.mdx — The page is primarily a concept + overview page for a new feature but has no closing CTA. CLAUDE.md says concept and overview pages should end with a CTA so readers know where to go after learning what the feature is. A ## What's next? pointing to /build/agents/build-your-agent/triggers (to learn about automating Agents more broadly) and /build/agents/build-your-agent/build-overview (for a full picture of the build tab) would round the page off naturally. A link to contacting the account manager for Evals access could also be useful given the rollout is still in progress.
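One possible shape for the suggested closing section, assuming Mintlify-style `<Card>`/`<CardGroup>` components; the hrefs come from the suggestion above, while the titles and card descriptions are illustrative:

```mdx
## What's next?

<CardGroup cols={2}>
  <Card title="Triggers" href="/build/agents/build-your-agent/triggers">
    Learn how to automate your Agents more broadly.
  </Card>
  <Card title="Build overview" href="/build/agents/build-your-agent/build-overview">
    Get the full picture of the build tab.
  </Card>
</CardGroup>
```

A `<CardGroup>` is defensible here, unlike in the best-practices section, because these items are navigable links.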
✅ Clean files (0)

(No files were fully clean.)

🔋 Credit usage
| Item | Count |
|---|---|
| Files reviewed | 1 |
| Context pages read | 2 |
| Total lines processed | ~531 |

Files read: build/agents/build-your-agent/evals.mdx (395 lines), build/agents/build-your-agent/build-overview.mdx (25 lines), build/agents/build-your-agent/alerts.mdx (111 lines)
