Skip to content

feat(v0.2): LinearAdapter, plugins, dedup, audit CLI, ConfigApprovalGate#103

Open
YiWang24 wants to merge 5 commits into
mainfrom
feat/v02-linear-plugins-dedup
Open

feat(v0.2): LinearAdapter, plugins, dedup, audit CLI, ConfigApprovalGate#103
YiWang24 wants to merge 5 commits into
mainfrom
feat/v02-linear-plugins-dedup

Conversation

@YiWang24
Copy link
Copy Markdown
Collaborator

@YiWang24 YiWang24 commented May 27, 2026

v0.2 Features (5 commits)

  1. Audit CLI — show/export/budget-reset subcommands for inspecting agent runs
  2. Plugin framework — 3 built-in plugins (reproduce_python_issue, reproduce_js_issue, summarize_pr_diff)
  3. LinearAdapter — webhook verification, event parsing, GraphQL reply integration
  4. ConfigApprovalGate — middleware to block config PRs without approval label
  5. Issue dedup — embedding + pgvector search for duplicate issue detection

Test plan

  • 577 tests pass (3 real_service skipped)
  • pre-commit hooks pass
  • import-linter hexagonal layers check passes
  • Cherry-picked cleanly onto main (no merge conflicts)

View with Codesmith Autofix with Codesmith
Need help on this PR? Tag @codesmith with what you need. Autofix is disabled.

YiWang24 added 5 commits May 27, 2026 17:22
- show <task_id>: full audit trail + cost breakdown for a single task
- export --format csv|json: export cost_meter rows
- budget reset --repo|--global: emergency cost_meter cleanup
- 9 new unit tests for parser, rendering, export
- openbot_plugins/__init__.py: plugin registry (register/get_all_plugins)
- reproduce_python_issue: extract traceback, identify error, suggest repro
- reproduce_js_issue: extract JS/TS stack trace, suggest repro
- summarize_pr_diff: parse unified diff, group by added/modified/deleted
- 11 unit tests for all 3 plugins
…GraphQL reply

- LinearAdapter: ChannelAdapter implementation for Linear
  - verify_signature: HMAC-SHA256 of raw body
  - parse_event: Issue.create/update → ISSUE_OPENED/EDITED
  - reply/update_comment: GraphQL mutation to Linear API
  - Stub methods for PR/label/code operations
- FastAPI route: POST /webhook/linear (same pattern as GitHub)
- deps.py: verified_linear_event dependency
- 8 unit tests for signature verification + event parsing
…label

- ConfigApprovalGate middleware: checks if PR modifies .openbot/config.yaml
  without config-approved label, blocks with informative comment
- 5 unit tests: non-PR, no config mention, missing label, approved, other labels
…case

- dedup.py: use case with embed → search → rerank → comment pipeline
- embedding.py: EmbeddingService (Voyage primary, OpenAI fallback)
  + search_similar_issues via pgvector cosine similarity
- 003 migration: pgvector extension + issue_embeddings table + IVFFlat index
- 7 unit tests for render_dedup_comment + check_duplicates flows
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 27, 2026

Warning

Review limit reached

@YiWang24, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 53 minutes and 7 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: fa1d1280-96eb-4d94-a914-de49fadfaeb1

📥 Commits

Reviewing files that changed from the base of the PR and between 1e5e7f6 and bca5206.

📒 Files selected for processing (22)
  • docs/superpowers/plans/v02-execution-plan.md
  • docs/superpowers/plans/v02-f1-per-task-budget.md
  • docs/superpowers/specs/v02-overall-spec.md
  • openbot/application/middleware/config_approval.py
  • openbot/application/use_cases/dedup.py
  • openbot/entrypoints/api/deps.py
  • openbot/entrypoints/api/routes/linear_webhook.py
  • openbot/entrypoints/cli/audit.py
  • openbot/infrastructure/adapters/linear.py
  • openbot/infrastructure/llm/embedding.py
  • openbot/infrastructure/persistence/migrations/versions/003_add_issue_embeddings.py
  • openbot_plugins/__init__.py
  • openbot_plugins/reproduce_js_issue.py
  • openbot_plugins/reproduce_python_issue.py
  • openbot_plugins/summarize_pr_diff.py
  • tests/plugins/test_reproduce_js_issue.py
  • tests/plugins/test_reproduce_python_issue.py
  • tests/plugins/test_summarize_pr_diff.py
  • tests/unit/adapters/test_linear_adapter.py
  • tests/unit/cli/test_audit_cli.py
  • tests/unit/middleware/test_config_approval.py
  • tests/unit/use_cases/test_dedup.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/v02-linear-plugins-dedup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@openbot-dev openbot-dev Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large feature PR (v0.2) adding ConfigApprovalGate, LinearAdapter, Audit CLI extensions, embedding/dedup pipeline, and plugin system — with a critical signature-verification bypass, several high-risk issues (pgvector type mismatch, unsized app-scoped HTTP client, unscoped budget reset, missing rerank pipeline), and medium/low concerns.


def _get_http(self) -> httpx.AsyncClient:
if self._http is None:
headers = {"Content-Type": "application/json"}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical — hmac.new should be hmac.HMAC (or hmac.new is not a real function — the actual function is hmac.new, but the correct call is hmac.new(key, msg, digestmod). However, the real bug is that hmac.new is lowercase — Python's module-level function is hmac.new(). Actually, hmac.new IS valid. But the signature format mismatch is the real problem: Linear docs send the signature as sha256=<hex>, but the code compares the raw header value directly against the computed hexdigest, meaning any real Linear webhook will be rejected as invalid. This effectively blocks all legitimate webhooks.

expected = hmac.new(self._webhook_secret, body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(signature, expected):

sa.UniqueConstraint("repo", "issue_number", name="uq_issue_embeddings_repo_issue"),
)

# Create IVFFlat index for fast cosine similarity search
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high — The embedding column is defined as sa.Text() but the pgvector index uses vector_cosine_ops on it. pgvector requires the column type to be vector, not text. This migration will fail at the CREATE INDEX step or silently create a broken index.

sa.Column("embedding", sa.Text(), nullable=False),  # stored as vector via raw SQL

3. OPENBOT_POSTGRES_URL configured

Returns list of dicts: {issue_number, title, similarity}
"""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

highcreate_async_engine is called inside search_similar_issues() on every invocation but is never disposed. Each call leaks a connection pool / engine. The engine should be reused or explicitly disposed after use.

engine = create_async_engine(postgres_url)
        async with engine.connect() as conn:


def _get_http(self) -> httpx.AsyncClient:
if self._http is None:
headers = {"Content-Type": "application/json"}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high — The _http httpx.AsyncClient is lazily created and stored on self but never closed. Since LinearAdapter is typically an app-lifetime singleton, there should be an explicit aclose() / shutdown hook, or the client should be used as an async context manager. On shutdown or adapter replacement, this leaks connections.

self._http = httpx.AsyncClient(base_url=_GRAPHQL_URL, headers=headers, timeout=10.0)

"feature": str(r.feature),
"model": r.model,
"cost_usd": str(r.cost_usd),
"cost_status": str(r.cost_status),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

highreset_budget() deletes ALL cost_meter rows when --global is passed, with no safety net beyond a --yes prompt. This is irreversible data destruction of audit/billing records. Consider soft-delete (mark as reset) or requiring a backup/export first. The session_scope commits automatically, so there's no rollback path.

if global_reset:
            result = await session.execute(delete(CostMeter))

return MiddlewareDecision.proceed()

# Check if the PR touches .openbot/config.yaml via raw payload
pr_body = event.raw.get("pull_request", {}).get("body", "")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high — The gate only checks if .openbot/config.yaml is mentioned in the PR title or body — it doesn't actually inspect the PR's changed files. A PR that modifies the config file without mentioning it in the title/body will bypass the gate entirely.

config_mentioned = _CONFIG_PATH in pr_body or _CONFIG_PATH in pr_title
        if not config_mentioned:
            return MiddlewareDecision.proceed()

"""
if embedding_func is None or search_func is None:
_logger.info("dedup_skipped_not_configured")
return DedupResult(candidates=[], comment="")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium — The rerank_func parameter is accepted but never called — the pipeline skips directly from search results to rendering. The docstring says 'LLM rerank → top-3 candidates' but the code just takes similar[:rerank_top] without reranking.

# Step 3: Rerank (optional)
        candidates = [
            DedupCandidate(...)
            for s in similar[:rerank_top]
        ]

async def get_issue(self, event: UnifiedEvent, issue_number: int) -> dict[str, Any]:
"""Fetch Linear issue details via GraphQL."""
query = """
query Issue($id: String!) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mediumget_issue() queries by internal Linear UUID (issue.id from the webhook payload) but the public issue_number parameter is ignored. If called with a different issue number than the event's issue, it will return the wrong issue.

async def get_issue(self, event: UnifiedEvent, issue_number: int) -> dict[str, Any]:
        ...
        issue_id = event.raw.get("data", {}).get("id")

@@ -1,9 +1,11 @@
"""FastAPI dependencies for the API entrypoint.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium — The module docstring still says 'At the moment this module only exposes the GitHub webhook ingress boundary' in the first paragraph, but the diff was supposed to update it. The old docstring lines (1–5) remain unchanged because the edit only touched lines 3–8.

At the moment this module only exposes the GitHub webhook ingress boundary:
read raw bytes, verify the signature, parse the event, and surface a single

Copy link
Copy Markdown

@openbot-dev openbot-dev Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large feature PR adding v0.2 features (LinearAdapter, audit CLI, dedup, config approval gate, plugins). Several correctness and robustness issues found: config approval gate is trivially bypassable, migration creates a vector index on a text column, budget reset deletes data without committing, and LinearAdapter doesn't populate required pr_number field.

Repo-wide notes:

  • highopenbot/application/middleware/config_approval.py: The config approval gate checks only PR title/body for the string .openbot/config.yaml instead of actually fetching the PR's changed files. A PR that renames/deletes config.yaml, or uses git mv, or simply doesn't mention the path in title/body, bypasses the gate entirely. This is a security-sensitive check that must validate the actual file list from the API.
  • highopenbot/infrastructure/persistence/migrations/versions/003_add_issue_embeddings.py: The migration creates an IVFFlat vector index on the embedding column, but the column is defined as sa.Text(). The index uses embedding vector_cosine_ops which requires the column to be of type vector. Mixing a plain text column with vector operators will fail at runtime. Either change the column to use vector type (e.g. via raw DDL embedding vector(1024)) or remove the index until the column type is correct.
  • mediumopenbot/entrypoints/cli/audit.py: The reset_budget function executes DELETE statements but the surrounding session_scope context manager may not auto-commit. If session_scope is a read-only or rollback-on-exit context manager, the deletes will be silently lost. Additionally, result.rowcount is accessed after the session may have been closed, which can raise MissingGreenlet or DetachedInstanceError depending on the async session configuration.
  • mediumopenbot/infrastructure/adapters/linear.py: Linear issue numbers use the identifier field (e.g. "ENG-123") and the adapter extracts only the numeric portion. But issue_number is int on UnifiedEvent — if the identifier parsing fails (no hyphen), issue_number remains None. More importantly, Linear identifiers are team-scoped ("ENG-123" vs "OPS-123" could be the same number), so using just the integer creates ambiguity when multiple Linear teams send webhooks.
  • mediumopenbot/infrastructure/llm/embedding.py: search_similar_issues creates a new AsyncEngine on every call (create_async_engine(postgres_url)) but never disposes it. This leaks connection pools. The engine should be reused (e.g. passed as a parameter or obtained from a shared factory), matching the pattern used in the audit CLI.
  • mediumopenbot/application/middleware/config_approval.py: The _HIGH_RISK_PREFIXES tuple is defined but never used anywhere in the gate logic. The comment says it checks for 'high-risk fields' but the implementation only checks if the config path is mentioned in PR title/body — it doesn't inspect which fields actually changed. Dead code suggests incomplete implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant