
docs(TSP-1095): add workforce evaluations documentation#542

Draft
claude[bot] wants to merge 3 commits into main from docs/TSP-1095

Conversation

@claude (Bot) commented Mar 25, 2026

Summary

  • Adds build/workforces/workforce-features/evals.mdx documenting workforce evaluations
  • Covers two evaluation modes: generate-and-score and score-only
  • Explains evaluator types and links to agent evals docs for full configuration details
  • Highlights key differences from agent evals (multi-agent scope, score-only mode)
  • Updates docs.json navigation to include the new page under Workforce Features

Linear issue

https://linear.app/relevance/issue/TSP-1095/

mintlify Bot commented Mar 25, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| relevanceai | 🟢 Ready | View Preview | Mar 25, 2026, 4:25 AM |

linear Bot commented Mar 25, 2026

github-actions Bot and others added 2 commits May 6, 2026 16:28

Adds evals page for workforces covering generate-and-score and score-only
modes, evaluator types, key differences from agent evals, and when to use
each mode. Updates docs.json navigation to include the new page.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Workforce evals BE shipped (relevance-api-node #12943) but FE is still
in flight. Replacing the standalone workforce evals page with a small
note on the agent evals page and a workforce prompt example on the
Programmatic GTM intro — pointing users at the MCP/API path that
actually works today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Both embeds were using kebab-case 'padding-top' (invalid in JSX style
objects), 56.75% instead of 56.25%, and a single-line wrapper that
didn't match the standard snippet. Swapped in the canonical wrapper
from the style guide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
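The embed fix above hinges on two details worth spelling out: JSX style objects take camelCase keys (so a kebab-case `'padding-top'` key is ignored), and the responsive-embed percentage is the aspect ratio itself, height / width = 9/16 = 56.25%. A minimal sketch of the idea, with illustrative names rather than the repo's actual style-guide snippet:

```javascript
// The 16:9 padding-top trick: a percentage value for top padding resolves
// against the element's WIDTH, so height/width yields a box with the right
// aspect ratio. 9 / 16 = 0.5625 -> '56.25%'.
function aspectRatioPadding(width, height) {
  return `${(height / width) * 100}%`;
}

// Hypothetical wrapper style object (the real snippet may differ).
// Note the camelCase key: a 'padding-top' key would be dropped by React,
// collapsing the embed to zero height.
const wrapperStyle = {
  position: 'relative',
  paddingTop: aspectRatioPadding(16, 9), // '56.25%', not '56.75%'
};

console.log(wrapperStyle.paddingTop); // '56.25%'
```

The stray 56.75% in the original embeds would have produced a wrapper slightly taller than 16:9, letterboxing the video.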
github-actions Bot commented May 6, 2026

🎯 Vibe check

Reviewed: 2 files (2 with issues, 0 clean)

Scores

| Dimension | Score | What's holding it back |
| --- | --- | --- |
| 🟡 Consistency | 6/10 | programmatic-gtm.mdx has four British spellings (behaviour ×2, personalised, categorises) and seven card/tab titles with lowercase product terms (Agent, Tool, Workforce). evals.mdx has four sentence-case violations in accordion titles and a bold label inside a callout. |
| 🟢 Technical clarity | 9/10 | Both pages are specific and accurate. UI element names are exact, steps are numbered, and field references match the product UI. |
| 🟢 Non-technical clarity | 9/10 | programmatic-gtm.mdx uses example prompts effectively to make abstract capabilities concrete. evals.mdx covers the full feature end-to-end without assuming prior knowledge. |
| 🟡 Structure | 8/10 | Neither page ends with a closing CTA. evals.mdx is a concept+how-to hybrid that rounds off at FAQs with no pointer to what to do next. The bold label inside the <Info> callout in evals.mdx violates CLAUDE.md. |

Score key: 🟢 9–10, 🟡 6–8, 🔴 1–5.

Overall vibe: The evals.mdx page is well-structured and thorough — a solid reference doc. The main drag is programmatic-gtm.mdx, which has a systematic British English problem and consistently lowercases Relevance AI product terms (Agent, Tool, Workforce) in its capability cards and tabs, which reads as inconsistent against the rest of the docs site.

🔧 Issues (19)

build/agents/build-your-agent/evals.mdx

  • evals.mdx:8 — Bold label **Rollout Status**: inside <Info> callout. CLAUDE.md: "no bold labels inside" callouts. Remove the label and fold the content into a plain sentence, e.g. Evals is currently being rolled out progressively, starting with Enterprise customers. If you're an Enterprise customer and don't see it yet, reach out to your account manager.
  • evals.mdx:22 — Card title "Conduct Tests" → "Conduct tests" (sentence case; "Tests" is not a proper noun)
  • evals.mdx:177 — Accordion title "Customer Support - Empathy test" → "Customer support - Empathy test" ("Support" after the first word should be lowercase)
  • evals.mdx:188 — Accordion title "Sales - Product knowledge test" → "Sales - product knowledge test" ("Product" after the dash should be lowercase)
  • evals.mdx:199 — Accordion title "Support - Escalation handling" → "Support - escalation handling" ("Escalation" should be lowercase)

get-started/core-concepts/programmatic-gtm.mdx

  • programmatic-gtm.mdx:35 — Card title "Create agents" → "Create Agents" (referring to the Relevance AI Agent product feature)
  • programmatic-gtm.mdx:38 — Card title "Build tools" → "Build Tools" (Relevance AI Tool product feature)
  • programmatic-gtm.mdx:41 — Card title "Set up workforces" → "Set up Workforces" (Relevance AI Workforce product feature)
  • programmatic-gtm.mdx:44 — Card title "Trigger agents" → "Trigger Agents" (referring to the product feature)
  • programmatic-gtm.mdx:47 — Card title "Execute tools" → "Execute Tools" (Relevance AI Tool product feature)
  • programmatic-gtm.mdx:50 — Card title "Troubleshoot agents" → "Troubleshoot Agents"
  • programmatic-gtm.mdx:53 — Card title "Refine agents" → "Refine Agents"
  • programmatic-gtm.mdx:54 — "behaviour" → "behavior" (British spelling)
  • programmatic-gtm.mdx:69 — Tab title "Build agents" → "Build Agents"
  • programmatic-gtm.mdx:79 — "personalised" → "personalized" (British spelling, inside an example prompt — still documentation content)
  • programmatic-gtm.mdx:107 — "categorises" → "categorizes" (British spelling, inside an example prompt)
  • programmatic-gtm.mdx:108 — Accordion title "Evaluate a workforce" → "Evaluate a Workforce" (Relevance AI Workforce product feature)
  • programmatic-gtm.mdx:114 — Tab title "Build tools" → "Build Tools"
  • programmatic-gtm.mdx:136 — "behaviour" → "behavior" (British spelling)
🏗️ Page structure (2)
  • build/agents/build-your-agent/evals.mdx — Page ends at the FAQ accordion with no closing CTA. As a concept+how-to page, it leaves the reader with nowhere to go next. A ## What's next? section could link to /build/agents/build-your-agent/triggers (set up the agents you'll be testing) and /build/agents/build-your-agent/build-overview (broader agent building context).
  • get-started/core-concepts/programmatic-gtm.mdx — The setup cards at the top serve as the entry point, but the page ends at FAQs with no pointer forward. A brief closing CTA linking to /integrations/mcp/programmatic-gtm/claude-code (quickest path to start) and /integrations/mcp/programmatic-gtm/mcp-server (alternative clients) would round it off.
🔋 Credit usage
| Item | Count |
| --- | --- |
| Files reviewed | 2 |
| Context pages read | 2 |
| Total lines processed | ~718 |

Files read: build/agents/build-your-agent/evals.mdx (378 lines), get-started/core-concepts/programmatic-gtm.mdx (220 lines), build/agents/build-your-agent/triggers.mdx (60 lines), get-started/core-concepts/workforces.mdx (60 lines)

github-actions Bot commented May 6, 2026

🎯 Vibe check

Reviewed: 2 files (2 with issues, 0 clean)

Scores

| Dimension | Score | What's holding it back |
| --- | --- | --- |
| 🟡 Consistency | 6/10 | programmatic-gtm.mdx has 2 confirmed British spellings (behaviour ×2, personalised), and product terms (Agent, Tool, Workforce) are consistently lowercased across card titles, tab titles, and body text. evals.mdx has a bold label inside a callout and one product term lowercased. |
| 🟡 Technical clarity | 8/10 | Link text "Relevance AI MCP" in evals.mdx points to the Programmatic GTM overview page, not the MCP server page — mismatch between what the text promises and where it lands. Otherwise UI references are specific and actionable. |
| 🟡 Non-technical clarity | 7/10 | programmatic-gtm.mdx opens with "Programmatic GTM is the new way to build your agents for GTM" — GTM is never expanded to "Go-to-Market". The rest of the page is well-structured with concrete example prompts that make the feature tangible. |
| 🟡 Structure | 8/10 | Best practices in evals.mdx use a <CardGroup> for content that reads naturally as bullets. programmatic-gtm.mdx's "What you can do" block has 9 non-linked cards, which is visually heavy for a capabilities list. |

Score key: 🟢 9–10, 🟡 6–8, 🔴 1–5.

Overall vibe: Both pages are substantively strong — evals.mdx is thorough and well-organized with good use of tables, accordions, and a helpful FAQ; programmatic-gtm.mdx uses concrete example prompts that make an abstract concept feel actionable. The main drag is systematic: product terms are consistently lowercased across programmatic-gtm.mdx (Agent, Tool, Workforce), and two British spellings made it through. Quick fixes across the board.

🔧 Issues (14)
  • build/agents/build-your-agent/evals.mdx:8 — **Rollout Status**: is a bold label inside an <Info> callout. CLAUDE.md: callouts must not contain bold labels. Remove the label and fold the content into a plain sentence: "Evals is currently being rolled out progressively, starting with Enterprise customers. If you're on Enterprise and don't see it yet, reach out to your account manager."
  • build/agents/build-your-agent/evals.mdx:12 — "workforces" → "Workforces" (product term — Workforce is a Relevance AI feature)
  • build/agents/build-your-agent/evals.mdx:12 — Link text says "Relevance AI MCP" but href is /get-started/core-concepts/programmatic-gtm (the GTM overview, not the MCP server page). Either change text to "Programmatic GTM" to match the destination, or point to /integrations/mcp/programmatic-gtm/mcp-server.
  • build/agents/build-your-agent/evals.mdx:22 — Card title "Conduct Tests" → "Conduct tests" (sentence case; "Tests" is not a product term)
  • get-started/core-concepts/programmatic-gtm.mdx:7 — "agents, tools, and workforces" → "Agents, Tools, and Workforces" (product terms)
  • get-started/core-concepts/programmatic-gtm.mdx:35,38,41,44,47,50,53 — Seven card titles have product terms lowercased: "Create agents" → "Create Agents", "Build tools" → "Build Tools", "Set up workforces" → "Set up Workforces", "Trigger agents" → "Trigger Agents", "Execute tools" → "Execute Tools", "Troubleshoot agents" → "Troubleshoot Agents", "Refine agents" → "Refine Agents"
  • get-started/core-concepts/programmatic-gtm.mdx:54 — "behaviour" → "behavior" (British spelling in card body: "Iterate on agent instructions, tool configurations, and behaviour")
  • get-started/core-concepts/programmatic-gtm.mdx:54 — "agent instructions, tool configurations" → "Agent instructions, Tool configurations" (product terms in card body)
  • get-started/core-concepts/programmatic-gtm.mdx:69 — Tab title "Build agents" → "Build Agents" (Agent = product term)
  • get-started/core-concepts/programmatic-gtm.mdx:79 — "personalised" → "personalized" (British spelling inside example prompt)
  • get-started/core-concepts/programmatic-gtm.mdx:114 — Tab title "Build tools" → "Build Tools" (Tool = product term)
  • get-started/core-concepts/programmatic-gtm.mdx:115 — "Create custom tools that your agents can use" → "Create custom Tools that your Agents can use"
  • get-started/core-concepts/programmatic-gtm.mdx:136 — "agent behaviour" → "Agent behavior" (British spelling + product term — double fix)
  • get-started/core-concepts/programmatic-gtm.mdx:75,78,81,84 — Accordion titles inside the "Build agents" tab: "Customer support agent", "BDR agent", "Slack triage agent", "Scheduled reporting agent" → capitalize "Agent" in each since they describe Relevance AI Agent types
🧩 Component suggestions (2)
  • build/agents/build-your-agent/evals.mdx:361–374 — Best practices use a <CardGroup cols={2}> with 4 items. CLAUDE.md says CardGroup is not appropriate for "best practices that read naturally as flowing bullets." These four items are brief advisory tips — no links, no parallel choices, no navigable destinations. A simple bulleted list or <AccordionGroup> (if you want expandable detail per tip) would be less visually heavy and more appropriate for prescriptive guidance.
  • get-started/core-concepts/programmatic-gtm.mdx:34–62 — "What you can do" renders as 9 non-linked cards in a <CardGroup cols={3}>. CLAUDE.md reserves CardGroup for navigable items or choices with enough substance. These are capability descriptors — short phrases with no links. A compact bulleted list (or two-column markdown table with capability + description) would be more scannable and less noisy for 9 items.
🏗️ Page structure (1)
  • get-started/core-concepts/programmatic-gtm.mdx:7 — "Programmatic GTM is the new way to build your agents for GTM in Relevance AI" — GTM is never expanded. Add "Go-to-Market (GTM)" at first mention in the body so non-technical readers aren't dropped cold into an unexplained acronym. (The product name "Programmatic GTM" is fine to use as-is throughout.)
✅ Clean files (0)

Both files have issues — no clean files to list.

🔋 Credit usage
| Item | Count |
| --- | --- |
| Files reviewed | 2 |
| Context pages read | 2 |
| Total lines processed | ~555 |

Files read: build/agents/build-your-agent/evals.mdx (414 lines), get-started/core-concepts/programmatic-gtm.mdx (220 lines), build/agents/build-your-agent/triggers.mdx (120 lines), get-started/core-concepts/workforces.mdx (126 lines)

@jordanc-relevanceai (Collaborator) commented:
Not certain we should be covering this yet - will make this a draft

@jordanc-relevanceai jordanc-relevanceai marked this pull request as draft May 6, 2026 06:39

Labels

docs-drafter Documentation drafted by Claude
