[strategy] Reposition evalops as 'govern the AI fleet you already have' #5

@haasonsaas

Strategic pivot: reposition evalops as "govern the AI fleet you already have"

Not a feature request — a positioning decision with downstream product, docs, and GTM work. This issue is the tracker.

The thesis in one paragraph

Enterprises already have 5–50+ AI integrations running in production: ChatGPT Enterprise, GitHub Copilot, Cursor, Claude for Slack, vendor AI features, and internal apps calling the OpenAI/Anthropic/Gemini SDKs. None of these was deployed through a unified control plane; each has different auth, different governance (usually none), different audit trails, and different cost attribution (usually none). The ICP buyer (CISO / VP Platform at a 500–5000-person company) isn't shopping for a better way to build agents. They're shopping for a way to see, govern, and account for the AI their employees and products are already using. Evalops has ~80% of the platform already built for that story. What's missing is the positioning, the retrofit surfaces, and the discovery layer.

Why this is tractable now

Audit of the platform service inventory against the "govern existing fleet" story (2026-04-17):

  • llm-gateway — egress proxy for LLM traffic. Natural retrofit point for customer-owned integrations.
  • mcp-firewall — policy enforcement at the MCP protocol boundary. Works for any MCP-native agent.
  • governance — PII detection, content safety, data classification. Runs on any flowing payload.
  • approvals — human-in-the-loop gating. Needed especially when the agent isn't one we built.
  • audit — tamper-evident event log. Regulators want this across all AI usage, not just evalops-deployed.
  • meter — cost attribution. Answers "which team spent $40K on OpenAI last month."
  • traces — who ran what prompt against what model.
  • identity — principal resolution.
  • entities — cross-system identity correlation. Maps Okta user → OpenAI org-member → GitHub user.
  • connectors — OAuth into SaaS. Already the pattern for external integrations.
  • gate — zero-trust access proxy. Customer-premise endpoint for cases where cloud egress isn't acceptable.
  • registry — reframe as a registry of all AI assets, not just evalops-registered ones.
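To make the meter + entities pairing concrete, here is a minimal sketch of how correlated identities could turn raw provider usage into a per-team spend answer. Everything in it is an assumption for illustration: the `UsageEvent` shape, the `IDENTITY_MAP` table, the member IDs, and the team names are hypothetical and do not describe the actual evalops schemas.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One billed call, as a meter-style service might record it (hypothetical shape)."""
    provider: str         # e.g. "openai"
    provider_member: str  # provider-side member or key id (hypothetical)
    cost_usd: float


# Hypothetical correlation table of the kind an entities-style service could maintain:
# (provider, provider-side id) -> (corporate user, team).
IDENTITY_MAP = {
    ("openai", "org-member-17"): ("alice@example.com", "payments"),
    ("openai", "org-member-22"): ("bob@example.com", "search"),
}


def cost_by_team(events, identity_map):
    """Sum spend per (team, provider); events with no known identity go to 'unattributed'."""
    totals = defaultdict(float)
    for event in events:
        _user, team = identity_map.get(
            (event.provider, event.provider_member), (None, "unattributed")
        )
        totals[(team, event.provider)] += event.cost_usd
    return dict(totals)


# Made-up sample data, for illustration only.
EVENTS = [
    UsageEvent("openai", "org-member-17", 120.0),
    UsageEvent("openai", "org-member-17", 30.0),
    UsageEvent("openai", "org-member-22", 75.5),
    UsageEvent("anthropic", "unknown-key", 10.0),
]

if __name__ == "__main__":
    for (team, provider), total in sorted(cost_by_team(EVENTS, IDENTITY_MAP).items()):
        print(f"{team:>13} {provider:<10} ${total:,.2f}")
```

The "unattributed" bucket is the interesting part for this positioning: spend that cannot be mapped to a known principal is exactly the ungoverned usage the buyer wants surfaced.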

The missing pieces (filed as child tickets):

  1. Discovery — a service that scans a customer env and enumerates AI usage (child #A)
  2. SDK shim — drop-in openai / anthropic / google.generativeai replacements routing through llm-gateway (child #B)
  3. Egress posture — customer-premise deploy mode for llm-gateway / gate as an AI-specific egress proxy (child #C)
  4. Retrofit playbooks — concrete how-tos for the 5 highest-volume integration types (child #D)
  5. Hopper messaging pivot — reframe from architecture-forward to outcomes-forward (child #E)
  6. Competitive intelligence doc — Portkey / LangSmith / Lakera / Nightfall / Helicone — document the differentiator (child #F)
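As a rough illustration of what the SDK shim (item 2) has to do at its core, the sketch below rewrites provider API URLs so requests land on a gateway instead of going straight to the vendor. The gateway address and the `/openai`, `/anthropic`, `/google` route prefixes are hypothetical; the real llm-gateway routing scheme is not specified in this issue.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical gateway address; a real deployment would supply its own.
GATEWAY_BASE = "https://llm-gateway.internal.example"

# Public provider API hosts mapped to assumed gateway route prefixes.
PROVIDER_ROUTES = {
    "api.openai.com": "/openai",
    "api.anthropic.com": "/anthropic",
    "generativelanguage.googleapis.com": "/google",
}


def route_through_gateway(url, gateway_base=GATEWAY_BASE):
    """Rewrite a provider URL to its gateway equivalent; leave other URLs untouched."""
    parts = urlsplit(url)
    prefix = PROVIDER_ROUTES.get(parts.hostname)
    if prefix is None:
        return url  # not a known AI provider, so pass through unchanged
    gateway = urlsplit(gateway_base)
    return urlunsplit(
        (gateway.scheme, gateway.netloc, prefix + parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(route_through_gateway("https://api.openai.com/v1/chat/completions"))
```

In practice a shim may not need to rewrite URLs itself, since provider SDKs commonly expose a base-URL setting that a thin wrapper could set by default. The rewrite above is just the smallest self-contained way to show the intent.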

Sequencing

  1. Messaging + playbooks first — costs nothing, highest leverage, unblocks sales
  2. Discovery service MVP — the biggest product gap; passive-mode (log ingestion) before active-mode (installed scanners)
  3. SDK shim — one import change for customers; lowest-friction retrofit
  4. Egress posture — for customers who need network-layer enforcement
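For step 2, passive-mode discovery could start as little more than matching existing egress logs against known AI provider hostnames. The sketch below assumes a whitespace-separated log format of `timestamp source destination-host`, which is a made-up format for illustration; real proxy or DNS logs would need their own parser, and the host list here is deliberately short.

```python
from collections import Counter

# Small, non-exhaustive sample of public AI provider API hosts.
AI_HOSTS = {
    "api.openai.com": "OpenAI",
    "api.anthropic.com": "Anthropic",
    "generativelanguage.googleapis.com": "Gemini",
}


def discover_ai_usage(log_lines):
    """Count requests per (source, provider) from 'timestamp source host' log lines."""
    usage = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _timestamp, source, host = fields[:3]
        provider = AI_HOSTS.get(host.lower())
        if provider is not None:
            usage[(source, provider)] += 1
    return usage


# Made-up sample log lines, for illustration only.
SAMPLE_LOG = [
    "2026-04-17T09:00:01Z build-runner-3 api.openai.com",
    "2026-04-17T09:00:02Z build-runner-3 api.openai.com",
    "2026-04-17T09:00:05Z support-bot api.anthropic.com",
    "2026-04-17T09:00:07Z laptop-142 github.com",
    "malformed line",
]

if __name__ == "__main__":
    for (source, provider), count in sorted(discover_ai_usage(SAMPLE_LOG).items()):
        print(f"{source} -> {provider}: {count} requests")
```

This only sees traffic that crosses a logged egress point, which is one reason active-mode scanners come later in the sequence.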

Why not just build for greenfield agents

Greenfield agent tooling is crowded (Google ADK, LangChain, LangGraph, CrewAI, AutoGen, AWS Bedrock Agents, Azure AI Foundry, plus Google's own agent-starter-pack). Winning that market requires beating Google and AWS at their own ecosystem play. The "govern existing fleet" positioning is a different market — control-plane-oriented, sold to a different buyer, with fewer competitors at its center.

Success measures

  • Hopper bounce rate on the redesigned pages
  • Inbound lead language: do prospects describe themselves as "AI buyers" or "AI governance buyers"? (Target: the majority self-description flips to "AI governance buyers" within 2 quarters)
  • Number of retrofitted integrations per customer post-sale (target: grows monotonically; today probably 1)
  • Revenue per customer on the "govern existing fleet" posture vs. the greenfield agent posture (hypothesis: 3–5x higher)

Out of scope (explicitly)

  • Abandoning the greenfield agent surfaces (maestro, fermata, ensemble). Those stay — they demonstrate evalops runs on itself. The positioning shift is about where the sales pitch leads, not what the product abandons.
  • A wholesale marketing/brand refresh. This is a positioning pivot, not a rebrand.
