Skip to content

V4 to V5 signoz migration#2774

Merged
shagun-singh-inkeep merged 12 commits intomainfrom
v5
Mar 23, 2026
Merged

V4 to V5 signoz migration#2774
shagun-singh-inkeep merged 12 commits intomainfrom
v5

Conversation

@shagun-singh-inkeep
Copy link
Copy Markdown
Collaborator

No description provided.

@vercel
Copy link
Copy Markdown

vercel Bot commented Mar 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Mar 23, 2026 8:11pm
agents-manage-ui Ready Ready Preview, Comment Mar 23, 2026 8:11pm
1 Skipped Deployment
Project Deployment Actions Updated (UTC)
agents-docs Skipped Skipped Mar 23, 2026 8:11pm

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Mar 19, 2026

🦋 Changeset detected

Latest commit: 336f5f2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages
Name Type
@inkeep/agents-core Patch
@inkeep/agents-manage-ui Patch
@inkeep/agents-api Patch
@inkeep/agents-cli Patch
@inkeep/agents-sdk Patch
@inkeep/agents-work-apps Patch
@inkeep/ai-sdk-provider Patch
@inkeep/create-agents Patch
@inkeep/agents-email Patch
@inkeep/agents-mcp Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel vercel Bot temporarily deployed to Preview – agents-docs March 19, 2026 15:10 Inactive
@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog Bot commented Mar 19, 2026

TL;DR — Migrates all SigNoz integration from the v4 API (/api/v4/query_range with builderQueries object maps) to the v5 API (/api/v5/query_range with expression-based filters and queries arrays). This is a net-negative ~1,600 lines change that replaces verbose filter-object trees with concise string expressions, merges related queries to reduce round-trips, and adds parallel batch execution for conversation detail loads.

Key changes

  • signoz-queries.ts — Replaces v4 constants (DATA_TYPES, FIELD_TYPES, QUERY_FIELD_CONFIGS, PANEL_TYPES, DATA_SOURCES, AGGREGATE_OPERATORS) with v5 equivalents (REQUEST_TYPES, QUERY_TYPES, FIELD_CONTEXTS, FIELD_DATA_TYPES, SIGNALS) and adds a buildFilterExpression() helper for converting operator/value pairs to v5 filter strings.
  • signoz.ts (API proxy) — Extracts shared helpers (getSignozConfig, authorizeProject, handleSignozError, extractResults), switches the endpoint to /api/v5/query_range, and rewrites query-batch to parse v5 columnar responses and narrow step-2 time ranges from step-1 timestamps.
  • signozHelpers.ts — Rewrites enforceSecurityFilters from v4 filter-object injection to v5 expression string concatenation, reducing the function from 70 lines to 24.
  • conversations/[conversationId]/route.ts — Merges 15+ individual builderQueries into 3 parallel batches (core, context, events), consolidates separate query names (e.g. contextResolution/contextHandlecontextResolutionAndHandle, aiGenerations/aiStreamingTextaiLlmCalls, three toolApproval*toolApprovals), adds memoized ancestor lookups, and includes per-batch timing logs.
  • traces/route.ts — Updates the validation schema from {queryType, panelType, builderQueries} to {queries[], requestType} to match the v5 payload format.
  • spans/[spanId]/route.ts — Converts the ClickHouse SQL query envelope and response parsing from v4 series[0].labels to v5 columnar columns + data format.
  • conversation-list-item.tsx — Adds a startTime > 0 guard to prevent rendering invalid dates from zero-valued timestamps.

Summary | 9 files | 5 commits | base: mainv5


signoz-queries.ts — v4-to-v5 constant vocabulary

Before: v4 vocabulary (DATA_TYPES, FIELD_TYPES, QUERY_FIELD_CONFIGS, PANEL_TYPES, DATA_SOURCES, AGGREGATE_OPERATORS) with isColumn/isJSON metadata objects used to describe every field.
After: v5 vocabulary (REQUEST_TYPES, QUERY_TYPES, FIELD_CONTEXTS, FIELD_DATA_TYPES, SIGNALS) plus a buildFilterExpression() utility that converts filter items to SQL-like expression strings.

The v4 API required structured objects with key, dataType, type, isColumn, and isJSON for every filter item. v5 uses simple string expressions (key = 'value'), eliminating 60+ lines of config objects. Several query expression names are consolidated:

v4 v5
contextResolution + contextHandle contextResolutionAndHandle
aiGenerations + aiStreamingText aiLlmCalls
toolApprovalRequested/Approved/Denied toolApprovals

signoz-queries.ts


signoz.ts — Proxy route refactor and v5 endpoint

Before: Inline auth checks, config reads, and error handling duplicated across /query, /query-batch, and /health handlers; endpoint pointed at /api/v4/query_range.
After: Shared getSignozConfig(), authorizeProject(), handleSignozError(), and extractResults() helpers; endpoint is /api/v5/query_range; query-batch parses columnar v5 responses and narrows step-2 time windows using min/max timestamps from step-1 results.

The injectConversationIdFilter function (which built structured filter objects) is replaced by inline expression string concatenation. The query-batch step-2 optimization reads columns + data arrays instead of series[].labels to extract conversation IDs, and computes a tighter time range from aggregation column values.

How does the step-2 time range narrowing work? After step 1 returns paginated conversation IDs with timestamps, the code scans the aggregation column to find the min and max timestamps, then sets `detailPayloadTemplate.start = minTs - 1h` and `end = maxTs + 1h`. This avoids querying the full original time range for the detail step.

signoz.ts · signozHelpers.ts


conversations/[conversationId]/route.ts — Parallel batched queries and v5 payloads

Before: A single monolithic SigNoz payload with 15+ builderQueries entries, each using verbose filter objects; sequential execution; no timing instrumentation.
After: 3 parallel payloads (core, context, events) using buildQueryEnvelope() with expression-based filters; parseListByName() to split merged query results by name field; memoized findAncestorActivity / findAncestorAgentGeneration with depth guards; per-batch and total timing logs.

This is the largest change in the PR. The buildConversationListPayload function (returning a single payload) is replaced by buildConversationPayloads (returning 3 payloads executed via Promise.all). Field selection uses a compact sf(name, type, context) helper instead of 5-line objects. Post-query filtering via parseListByName handles the merged queries (e.g., splitting contextResolutionAndHandle rows by span name). The ancestor-lookup functions now use Map-based caches and depth limits to prevent stack overflows on deep span trees.

[conversationId]/route.ts · traces/route.ts


spans/[spanId]/route.ts — v5 ClickHouse SQL response format

Before: Response parsed as json.data.result[0].series[0].labels (v4 series-based format); payload used step, chQueries map, queryType: 'clickhouse_sql', panelType: 'table'.
After: Response parsed as json.data.data.results[0].columns + data (v5 columnar format); payload uses requestType: 'scalar', queries array with {type, spec} envelopes, typed variables with {type: 'custom', value}.

spans/[spanId]/route.ts

Pullfrog  | View workflow run | Triggered by Pullfrogpullfrog.com𝕏

@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog Bot commented Mar 19, 2026

TL;DR — Migrates the entire SigNoz observability integration from v4 to v5 API. The v5 format replaces structured JSON filter objects with string-based filter expressions, swaps builderQueries maps for queries[] arrays, and returns columnar response data instead of series/labels. Also splits the monolithic conversation-detail query into 3 parallel batches, adds resilient retry logic for missing SigNoz attributes, and includes performance optimizations (memoized ancestor lookups, time-range narrowing).

Key changes

  • /api/v5/query_range — All SigNoz proxy and client-side query endpoints target the v5 API instead of v4.
  • String-based filter expressions — Filters switch from structured { key, dataType, type, op, value } JSON objects to SQL-like string expressions (e.g., serviceName IN ('...') AND tenant.id = '...').
  • compositeQuery.queries[] array — Replaces the named builderQueries map with [{ type, spec }] envelopes carrying new fields (signal, selectFields, fieldContext).
  • Columnar response parsing — Response handling switches from data.result[].list[].data to data.data.results[].rows[] / columns[] + data[][].
  • Parallel query batching — The conversation-detail endpoint splits queries into 3 payloads (core, context, events) fired via Promise.all.
  • Query merging — Similar span types (contextResolution/contextHandle, tool approvals, AI generations/streaming) are merged into single queries with client-side parseListByName() splitting by span name.
  • queryWithRetry() for missing attributes — New retry logic detects SigNoz 400 errors caused by key not found attributes, strips those queries, and retries — gracefully handles attributes not yet ingested.
  • enforceSecurityFilters() rewritten — Appends AND serviceName IN (...) AND tenant.id = '...' strings instead of mutating structured filter item arrays. Now also rejects non-builder query types to prevent tenant isolation bypass via raw SQL.
  • /span-lookup endpoint — New server-side route replaces client-side ClickHouse SQL construction for span detail — the query is built with buildSpanLookupPayload() which bakes tenant filtering into the SQL WHERE clause.
  • Performance optimizationsMap-based memoization caches with depth limits for findAncestorActivity()/findAncestorAgentGeneration(), findLast() replaces [...arr].reverse().find(), multiple .filter().length calls consolidated into a single loop.
  • totalErrors bug fix — Was hardcoded to 0; now reflects the actual count of error-status activities.
  • Shared constants overhauled — Removes v4 constructs (QUERY_FIELD_CONFIGS, DATA_TYPES, FIELD_TYPES, PANEL_TYPES, etc.) and adds v5 equivalents (REQUEST_TYPES, QUERY_TYPES, FIELD_CONTEXTS, SIGNALS, buildFilterExpression()). OPERATORS expanded with full coverage (CONTAINS, NOT_CONTAINS, REGEX, NOT_REGEX, NOT_IN, LESS_THAN, GREATER_THAN).
  • Proxy code refactoredgetSignozConfig(), handleSignozError(), authorizeProject(), extractResults() extracted as shared helpers to eliminate duplication across /query, /query-batch, /span-lookup, and /health.

Summary | 9 files | 12 commits | base: mainv5


SigNoz v4 → v5 query format migration

Before: Queries used compositeQuery.builderQueries — a named map of structured objects with filters.items[] arrays, selectColumns[] with { key, dataType, type, isColumn }, and panelType/queryType at the top level.
After: Queries use compositeQuery.queries[] — an array of { type, spec } envelopes where spec carries signal, filter.expression (string-based), selectFields[] with { name, fieldDataType, fieldContext }, and top-level requestType.

The v5 format is more concise — the sf(name, dataType, fieldContext) helper replaces verbose QUERY_FIELD_CONFIGS presets. Security enforcement now appends AND clauses to expression strings rather than deep-cloning payloads and splicing structured filter items. Non-builder query types are explicitly rejected to prevent tenant isolation bypass via raw SQL. Magic values like 'builder_query' and 'desc' are replaced with named constants (QUERY_TYPES.BUILDER_QUERY, ORDER_DIRECTIONS.DESC, QUERY_DEFAULTS.STEP_INTERVAL).

How does the new buildFilterExpression() work?

A general-purpose utility exported from agents-core that converts { key, op, value } items to string expressions. Handles IN, NOT IN, EXISTS, NOT EXISTS, LIKE, CONTAINS, REGEX, and comparison operators with proper quoting. The OP_MAP translates shorthand operators (e.g., ninNOT IN, nregexNOT REGEX) into their SQL-like v5 equivalents.

signoz.ts · signozHelpers.ts · signoz-queries.ts


queryWithRetry() — resilient handling of missing SigNoz attributes

Before: Any SigNoz 400 error was returned to the client as a generic "Invalid query parameters" response.
After: queryWithRetry() detects key not found errors, identifies which queries reference the missing keys, strips them, and retries. Queries referencing only known attributes succeed even when new attributes haven't been ingested yet.

This prevents the entire query from failing when a newly-instrumented span attribute hasn't been indexed by SigNoz. getMissingKeys() parses the error response, and queryReferencesKeys() scans filter expressions and select fields to determine which queries to remove.

signoz.ts


Parallel query batching and query merging for conversation detail

Before: A single payload with ~16 builderQueries was sent to SigNoz for conversation detail; similar span types each had their own query.
After: Queries are split into 3 logical batches (core, context, events) sent via Promise.all. Similar span types — contextResolution/contextHandle, all tool approvals, aiGenerations/aiStreamingText — are merged into single queries with client-side parseListByName() splitting by SPAN_KEYS.NAME.

This reduces the total query count and enables server-side parallelism. The handler also narrows the step-2 time range to min(timestamp) - 1h .. max(timestamp) + 1h from step-1 results, and now logs timing breakdowns (signozMs, processingMs, totalMs).

[conversationId]/route.ts · signoz-queries.ts


/span-lookup endpoint and columnar response parsing

Before: The span-detail client route constructed a raw ClickHouse SQL payload directly and sent it through the generic /query proxy. Responses were parsed from result[0].series[0].labels.
After: A dedicated /span-lookup server-side route builds the SQL payload via buildSpanLookupPayload() with tenant filtering baked into the WHERE clause. Responses are parsed from columnar columns[] + data[][] format using index-based lookups via data.data.results[].

Moving SQL construction server-side ensures the client never sends arbitrary ClickHouse SQL through the proxy (which now rejects non-builder queries). The extractResults() helper centralizes result extraction from the v5 response envelope (axiosResponse.data.data.data.results). The PUBLIC_SIGNOZ_URL env fallback was removed — getSignozConfig() now reads only SIGNOZ_URL.

spans/[spanId]/route.ts · signozHelpers.ts · signoz.ts


Proxy code refactoring and performance optimizations

Before: Each proxy route (/query, /query-batch, /health) duplicated auth, config lookup, and error handling logic. Ancestor lookups used linear scans with unbounded recursion. totalErrors was hardcoded to 0.
After: Shared helpers getSignozConfig(), authorizeProject(), handleSignozError(), and extractResults() eliminate duplication. Ancestor functions use Map-based lookup with memoization caches and a depth limit of 200. totalErrors reflects finalErrorCount. Multiple .filter().length calls replaced with a single loop.

The startTime guard in conversation-list-item.tsx now uses != null && > 0 instead of truthiness to correctly handle 0 timestamps.

signoz.ts · [conversationId]/route.ts · conversation-list-item.tsx

Pullfrog  | View workflow run | Triggered by Pullfrogpullfrog.com𝕏

Copy link
Copy Markdown
Contributor

@pullfrog pullfrog Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid migration from SigNoz v4 to v5 — good refactoring (consolidated queries, parallel batching, memoized ancestor lookups). Two high-severity items need attention: an expression injection vector in signoz.ts:174 and a null-safety gap in enforceSecurityFilters. Lower-severity items noted inline.

Pullfrog  | Fix all ➔Fix 👍s ➔View workflow runpullfrog.com𝕏

Comment thread agents-api/src/domains/manage/routes/signoz.ts
Comment thread agents-api/src/utils/signozHelpers.ts Outdated
const securityExpr = buildSecurityExpression(tenantId, projectId);
for (const { type, spec } of payload.compositeQuery.queries) {
if (type !== QUERY_TYPES.BUILDER_QUERY) continue;
spec.filter = { expression: `(${spec.filter.expression}) AND ${securityExpr}` };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High: Null-safety gap. If any query arrives without spec.filter or spec.filter.expression (e.g. a new query type or a client-side bug), this line will throw Cannot read properties of undefined (reading 'expression'). The old v4 code was resilient to missing filters — it would create them from scratch.

Suggestion: default to an empty expression:

const existing = spec.filter?.expression ?? 'true';
spec.filter = { expression: `(${existing}) AND ${securityExpr}` };


const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => `'${id}'`).join(', ')})`;
for (const { spec } of detailPayloadTemplate.compositeQuery.queries) {
spec.filter = { expression: `(${spec.filter.expression}) AND ${convIdExpr}` };

This comment was marked as outdated.

Comment thread agents-api/src/domains/manage/routes/signoz.ts
logger.debug({ responseData }, 'SigNoz response (truncated)');
return json;
const json = response.data;
const results = json?.data?.data?.results ?? json?.data?.results ?? [];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern as extractResults. The dual fallback json?.data?.data?.results ?? json?.data?.results suggests uncertainty about the response shape. If the v5 shape is known, pin to one path and drop the fallback — the silent fallback makes debugging harder when the response shape changes.

const float64 = FIELD_DATA_TYPES.FLOAT64;
const bool = FIELD_DATA_TYPES.BOOL;

function buildBaseExpression(conversationId: string, projectId?: string): string {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same expression injection pattern as the server-side route. conversationId comes from URL path params (line 461) and projectId from query params (line 469). Both are user-controlled and interpolated into filter expressions without escaping. A value like test' OR 1=1 -- would inject arbitrary filter logic.

This is lower risk since the payload goes through enforceSecurityFilters on the server side (which scopes to the correct tenant), but it still allows a user to broaden their own query scope within their tenant — e.g. fetch data for a different conversation.

const payloads = buildConversationPayloads(conversationId, start, end, projectId);
const batchLabels = ['core', 'context', 'events'] as const;

const batchResults = await Promise.all(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good improvement. Splitting the single massive query into three parallel batches (core, context, events) with Promise.all is a meaningful performance win — the SigNoz backend can process them concurrently. The timing logs are also helpful for diagnosing production latency.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(8) Total Issues | Risk: High

🔴❗ Critical (2) ❗🔴

Inline Comments:

  • 🔴 Critical: signozHelpers.ts:6 SQL injection via unescaped tenantId/projectId interpolation
  • 🔴 Critical: signoz.ts:174 SQL injection via unescaped conversation IDs from query results

🟠⚠️ Major (3) 🟠⚠️

🟠 1) signozHelpers.ts:19 Security filters not applied to non-builder query types

Issue: The enforceSecurityFilters function only applies tenant/project filters to queries where type === QUERY_TYPES.BUILDER_QUERY. If a ClickHouse SQL query or other query type is passed through, it will bypass tenant isolation entirely.

Why: The spans/[spanId]/route.ts uses clickhouse_sql queries which would not be filtered. This creates a potential tenant isolation bypass if the proxy endpoint accepts non-builder queries.

Fix: Either:

  1. Reject non-builder query types in the proxy endpoint with a 400 error
  2. Implement security filtering for clickhouse_sql queries by injecting WHERE clauses
  3. Document that only builder queries are supported and validate the query type before proxying

Refs:

Inline Comments:

  • 🟠 Major: signozHelpers.ts:20 Null pointer when spec.filter is undefined
  • 🟠 Major: signoz-queries.ts:52 Incorrect SQL escaping (backslash vs standard SQL '')

🟡 Minor (3) 🟡

🟡 1) signoz.ts:57-80 Missing 429 rate limit handling in SigNoz proxy

Issue: The handleSignozError function does not check for HTTP 429 responses from SigNoz. When SigNoz is rate-limiting requests, the proxy returns a generic 500 error instead of propagating the 429 status and Retry-After header.

Why: This masks the actual issue from clients and prevents them from implementing proper backoff. The conversation route already handles 429 (lines 116-117), but the agents-api proxy does not.

Fix: Add 429 handling in handleSignozError:

if (error.response?.status === 429) {
  logger.warn({ status: 429 }, 'SigNoz rate limit exceeded');
  return { 
    body: { error: 'Too Many Requests', message: 'SigNoz rate limit exceeded', retryAfter: error.response.headers['retry-after'] }, 
    status: 429 as const 
  };
}

Refs:

🟡 2) signoz-queries.ts:76-86 OPERATORS constant missing NOT_IN

Issue: The internal OP_MAP at line 48 includes 'nin' -> 'NOT IN' mapping, but the exported OPERATORS constant does not include NOT_IN. The old v4 OPERATORS had NOT_IN: 'nin'.

Why: Consumer code that previously used OPERATORS.NOT_IN will break after this migration.

Fix: Add NOT_IN: 'nin' to the OPERATORS constant for parity with OP_MAP.

Refs:

Inline Comments:

  • 🟡 Minor: conversations/[conversationId]/route.ts:164 Hardcoded 'builder_query', 'desc', 60, false instead of constants

💭 Consider (1) 💭

💭 1) signoz-stats.ts (multiple methods) Silent fallback to empty data on API errors

Issue: Multiple methods in SigNozStatsAPI catch all errors and return empty arrays/default values without surfacing the error condition to callers. Users see empty dashboards without knowing whether it's due to no data or a service failure.

Why: This is a pre-existing pattern, but worth addressing for better user experience. The pattern appears in 15+ methods throughout the file.

Fix: Consider returning a discriminated union: { status: 'success' | 'error', data?: T, error?: Error } to distinguish between "no data" and "service error".

Refs:

🧹 While You're Here (1) 🧹

🧹 1) signoz-links.ts Still uses v4 SigNoz query format

Issue: This PR migrates SigNoz queries from v4 to v5 format, but signoz-links.ts was not touched and still uses the old v4 compositeQuery structure with queryType/builder/dataSource pattern.

Why: This creates split-world inconsistency where future maintainers may not know which pattern to follow.

Fix: Either migrate signoz-links.ts to v5 format in this PR, or create a tracking issue documenting this as intentional legacy code with a migration plan.

Refs:


🚫 REQUEST CHANGES

Summary: The v4→v5 SigNoz migration is well-structured with significant code simplification (-1600 lines net). However, there are critical SQL injection vulnerabilities in the new string-interpolation-based filter building that must be addressed before merge. The tenantId, projectId, and conversationIds values are interpolated directly into SQL filter expressions without escaping single quotes, creating a potential security and data integrity risk. Additionally, the security filter enforcement skips non-builder query types, which could allow tenant isolation bypass.

Discarded (5)
Location Issue Reason Discarded
signoz-stats.ts:160 Global axios retry configuration may interfere with other HTTP clients Pre-existing pattern, out of scope for this PR
conversations/[conversationId]/route.ts:116 Rate limit handling does not respect Retry-After header Already handles 429 correctly, forwarding the header is a nice-to-have not a bug
spans/[spanId]/route.ts:127 Generic error message loses diagnostic context Pre-existing pattern, not introduced by this PR
signoz-stats.ts:127 Filter value parsing error uses console.warn but continues Defensive pattern, appropriate behavior
conversations/[conversationId]/route.ts:807 Empty catch blocks for JSON parsing Appropriate defensive handling for optional data
Reviewers (4)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-standards 4 1 0 0 3 0 0
pr-review-sre 7 1 0 0 1 0 3
pr-review-errors 4 0 1 0 0 0 2
pr-review-consistency 5 1 0 1 1 0 0
Total 20 3 1 1 5 0 5

};

function quoteValue(value: unknown): string {
if (typeof value === 'string') return `'${value.replace(/'/g, "\\'")}'`;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR Incorrect SQL escaping: backslash instead of standard SQL

Issue: The quoteValue function escapes single quotes by replacing ' with \' (backslash-escape). Standard SQL uses '' (doubled single quotes). ClickHouse supports both syntaxes, but backslash escaping can behave unexpectedly depending on settings.

Why: Using non-standard escaping may cause issues in edge cases and is inconsistent with the escaping recommended for ClickHouse.

Fix:

Suggested change
if (typeof value === 'string') return `'${value.replace(/'/g, "\\'")}'`;
if (typeof value === 'string') return `'${value.replace(/'/g, "''")}'`;

Refs:

limit = 10000
): any {
return {
type: 'builder_query',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor Hardcoded string instead of QUERY_TYPES.BUILDER_QUERY constant

Issue: This file uses the hardcoded string 'builder_query' while signoz-stats.ts consistently uses QUERY_TYPES.BUILDER_QUERY. The same applies to 'desc' (line 170), 60 (line 172), and false (line 173) which have corresponding constants.

Why: Using magic strings instead of shared constants creates maintenance burden and risks typos. The peer file signoz-stats.ts consistently uses QUERY_TYPES.BUILDER_QUERY, ORDER_DIRECTIONS.DESC, QUERY_DEFAULTS.STEP_INTERVAL, and QUERY_DEFAULTS.DISABLED.

Fix: Import and use the constants:

import { QUERY_TYPES, ORDER_DIRECTIONS, QUERY_DEFAULTS } from '@/constants/signoz';
// ...
return {
  type: QUERY_TYPES.BUILDER_QUERY,
  spec: {
    // ...
    order: [{ key: { name: SPAN_KEYS.TIMESTAMP }, direction: ORDER_DIRECTIONS.DESC }],
    stepInterval: QUERY_DEFAULTS.STEP_INTERVAL,
    disabled: QUERY_DEFAULTS.DISABLED,
  },
};

Refs:

@github-actions github-actions Bot deleted a comment from claude Bot Mar 19, 2026
@itoqa
Copy link
Copy Markdown

itoqa Bot commented Mar 19, 2026

Ito Test Report ✅

6 test cases ran. 6 passed.

✅ The included trace workflow checks passed: filter mode switching behaved as expected, timestamp badge suppression was validated, date clamping worked, invalid conversation handling remained controlled, and rapid interaction flows converged to stable UI states.

✅ Passed (6)
Test Case Summary Timestamp Screenshot
ROUTE-2 SigNoz configuration warning banner was shown clearly and the traces page remained usable, matching unconfigured-health behavior. 0:00 ROUTE-2_0-00.png
LOGIC-2 Confirmed request URL switched from mode=batch to non-batch after entering search text; API corroboration returned controlled 400 response without crash. 11:03 LOGIC-2_11-03.png
LOGIC-3 Using mocked batch API responses due local traces backend 503s, the overview rendered a relative-time badge for a valid startTime row and correctly suppressed the badge for the row with startTime=0. 18:30 LOGIC-3_18-30.png
EDGE-1 Verified custom calendar prevented future end-date selection and applying the range did not trigger hard validation failure. 11:03 EDGE-1_11-03.png
EDGE-4 Invalid conversation route displayed a controlled error state and Back to Overview recovery returned to traces overview successfully. 2:35 EDGE-4_2-35.png
ADV-3 After aggressive local API mocking fallback, rapid pagination plus repeated Errors-only and agent-filter toggles converged to a stable final overview state without duplicate rows or persistent spinner. 21:23 ADV-3_21-23.png
📋 View Recording

Screen Recording

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(4) Total Issues | Risk: Medium

Re-review scope: This review covers the delta since the last automated review (3 commits, 4 files). Prior findings are referenced in Pending Recommendations.

✅ Issues Addressed in This Delta

Two critical issues from the prior review have been fixed:

  1. SQL escaping fixedquoteValue() now uses standard SQL '' escaping instead of backslash \'
  2. Non-builder query bypass fixedenforceSecurityFilters() now throws on non-builder query types, preventing tenant isolation bypass via raw SQL

🟠⚠️ Major (1) 🟠⚠️

Inline Comments:

  • 🟠 Major: signoz.ts:72-80 queryReferencesKeys doesn't check orderBy fields

🟡 Minor (2) 🟡

Inline Comments:

  • 🟡 Minor: signoz.ts:108 Silent empty response when all queries stripped
  • 🟡 Minor: signoz.ts:291 Magic number for 180-day lookback period

🕐 Pending Recommendations (3)

These issues were raised in the prior review and remain unaddressed:

  • 🔴 signozHelpers.ts:8 SQL injection via unescaped tenantId/projectId interpolation in buildSecurityExpression
  • 🔴 signoz.ts:262 SQL injection via unescaped conversation IDs from query results
  • 🟠 signozHelpers.ts:28-30 Null pointer exception when spec.filter is undefined

🚫 REQUEST CHANGES

Summary: Good progress on this delta — the SQL escaping and non-builder query bypass issues are now fixed. However, the critical SQL injection vulnerabilities flagged in the prior review (unescaped tenantId/projectId and conversationIds interpolation) remain unaddressed. These should be fixed before merging to ensure tenant data isolation.

Quick wins:

  1. For buildSecurityExpression: escape tenantId and projectId with the same quoteValue() helper that was just fixed, or use parameterized queries
  2. For conversation IDs at line 262: conversationIds.map((id) => quoteValue(id)).join(', ') (import quoteValue from signoz-queries.ts)
  3. For spec.filter null check: if (!spec?.filter?.expression) throw new Error('Query must have filter.expression');
Discarded (4)
Location Issue Reason Discarded
signoz.ts:85-117 Retry logic lacks backoff delay Since the retry modifies the payload (strips queries), immediate retry is reasonable — it's not retrying the same request
signoz.ts:275-310 Span-lookup doesn't use queryWithRetry ClickHouse SQL queries have different semantics; the retry logic is designed for builder queries with multiple sub-queries
signoz.ts:103-106 retried flag not used at call sites The logging at line 103-106 already provides the key information; adding more logging is marginal value
signoz.ts:111-115 Retry failure logging missing The outer handleSignozError provides adequate error context
Reviewers (4)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-security-iam 0 0 0 0 0 0 0
pr-review-sre 6 0 0 0 2 0 2
pr-review-errors 3 0 0 0 1 0 2
pr-review-standards 3 0 0 0 1 0 0
Total 12 0 0 0 3 0 4

Note: Security reviewer returned 0 findings for delta — prior critical issues are pre-existing and tracked in Pending Recommendations.

Comment on lines +72 to +80
function queryReferencesKeys(query: any, keys: string[]): boolean {
const spec = query?.spec;
const searchable = [
spec?.filter?.expression ?? '',
...(spec?.selectFields ?? []).map((f: any) => f.name),
...(spec?.groupBy ?? []).map((g: any) => g.name),
];
return keys.some((k) => searchable.some((t: string) => t.includes(k)));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR queryReferencesKeys doesn't check orderBy fields

Issue: The function only checks filter.expression, selectFields, and groupBy for missing key references. If a missing attribute is referenced only in orderBy, the query won't be filtered out and the retry will fail with the same error.

Why: SigNoz builder queries can reference attributes in order clauses. If the missing key error originates from an orderBy reference, the retry logic will fail to identify and remove the problematic query, causing an unnecessary second failure.

Fix: Add orderBy to the searchable fields:

const searchable = [
  spec?.filter?.expression ?? '',
  ...(spec?.selectFields ?? []).map((f: any) => f.name),
  ...(spec?.groupBy ?? []).map((g: any) => g.name),
  ...(spec?.order ?? []).map((o: any) => o.key?.name ?? o.columnName ?? ''),
];

Refs:

'Retrying SigNoz query without queries referencing missing keys'
);

if (kept.length === 0) return { data: EMPTY_RESPONSE, retried: true };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor Silent empty response when all queries stripped

Issue: When all queries reference missing keys, the function returns EMPTY_RESPONSE silently. The retried: true flag is returned but never surfaced to users. Clients receive empty results with no indication their query attributes aren't available.

Why: Users may assume "no data exists" rather than "query attributes not ingested yet." This makes debugging observability issues difficult during SigNoz schema migrations.

Fix: Log at a higher visibility level when returning empty due to stripped queries:

if (kept.length === 0) {
  logger.warn(
    { missingKeys: missing, originalQueryCount: queries.length },
    'All queries stripped due to missing attributes - returning empty response'
  );
  return { data: EMPTY_RESPONSE, retried: true };
}

Refs:

  • Line 103-106 already logs successful partial strips; this extends to full strip case

return c.json({ error: 'Service Unavailable', message: 'SigNoz is not configured' }, 500);

const now = Date.now();
const lookbackMs = 180 * 24 * 60 * 60 * 1000;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor Magic number for lookback period

Issue: The 180-day lookback period is hardcoded without explanation. This wide time range may affect query performance on large ClickHouse tables.

Why: Without documentation, operators can't tune this value during performance issues. The 180-day window seems generous for a span lookup.

Fix: Extract to a named constant:

Suggested change
const lookbackMs = 180 * 24 * 60 * 60 * 1000;
const SPAN_LOOKUP_LOOKBACK_MS = 180 * 24 * 60 * 60 * 1000; // 180 days: covers full trace retention period

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) New Issues | (3) Pending from Prior Review | Risk: High

Re-review scope: This review covers the delta since the last automated review (1 commit, 3 files, +11/-3 lines).

✅ Delta Assessment

The latest commit 74b0622d6 ("comments and fixes") contains no new issues. All changes are improvements:

  1. signoz-queries.ts — Added missing operators to OPERATORS constant (NOT_IN, REGEX, NOT_REGEX, LESS_THAN, GREATER_THAN, CONTAINS, NOT_CONTAINS). This restores parity with the internal OP_MAP and fixes a breaking change flagged in the prior review.

  2. conversations/[conversationId]/route.ts:1285 — Fixed null-safety in cache lookup:

    - if (agentGenCache.has(activityId)) return agentGenCache.get(activityId)!;
    + if (agentGenCache.has(activityId)) return agentGenCache.get(activityId) ?? null;

    The non-null assertion (!) was unsafe since the cache stores null values. Now correctly returns null when the cached value is null.

  3. spans/[spanId]/route.ts:51 — Simplified response path:

    - const results = json?.data?.data?.results ?? json?.data?.results;
    + const results = json?.data?.results ?? [];

    Removes uncertain dual-path fallback in favor of the correct v5 response shape.

🕐 Pending Recommendations (3)

These critical and major issues were raised in the prior review and remain unaddressed:

  • 🔴 signozHelpers.ts:8 — SQL injection via unescaped tenantId/projectId interpolation in buildSecurityExpression. Values are interpolated directly into filter expressions without escaping single quotes.

  • 🔴 signoz.ts:262 — SQL injection via unescaped conversation IDs. The conversationIds array from step-1 results is interpolated directly: conversationIds.map((id) => \'${id}'`).join(', ')` without escaping.

  • 🟠 signozHelpers.ts:29 — Null pointer exception when spec.filter is undefined. Line 29 accesses spec.filter.expression without null check.

Quick fixes for the SQL injection issues:

// signozHelpers.ts:8 — use quoteValue from signoz-queries.ts
import { quoteValue } from '@inkeep/agents-core';
let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = ${quoteValue(tenantId)}`;
if (projectId) expr += ` AND ${SPAN_KEYS.PROJECT_ID} = ${quoteValue(projectId)}`;

// signoz.ts:262 — escape conversation IDs
const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => quoteValue(id)).join(', ')})`;

// signozHelpers.ts:29 — add null check
const existing = spec.filter?.expression ?? 'true';
spec.filter = { expression: `(${existing}) AND ${securityExpr}` };

🚫 REQUEST CHANGES

Summary: The delta itself is clean — all 3 changes are improvements with no new issues introduced. However, the critical SQL injection vulnerabilities flagged in the prior review remain unaddressed in the unchanged files. These must be fixed before merge to ensure tenant data isolation. The fix is straightforward: import and use the quoteValue() helper (which now correctly uses standard SQL '' escaping) for all user-controlled string interpolation.

Discarded (0)

No findings discarded in this delta review.

Reviewers (1)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
orchestrator 3 0 0 0 0 3 0
Total 3 0 0 0 0 3 0

Note: Delta-scoped review — sub-reviewers not dispatched for minimal changes. Prior critical issues tracked in Pending Recommendations.

@itoqa
Copy link
Copy Markdown

itoqa Bot commented Mar 20, 2026

Ito Test Report ❌

20 test cases ran. 1 failed, 19 passed.

Across 8 report shards covering 20 test cases, 19 passed and the traces platform was largely stable: core overview/detail routes, pagination/search fallback, span lookup side pane, analytics pages (AI/Tool Calls), auth and tenant/project scoping protections, rapid-interaction stress behavior, mobile usability, and multiple API validation guardrails all behaved as expected without crashes or data leakage. The single significant finding was a Medium-severity defect in GET /api/traces/conversations/[conversationId], where a whitespace-only conversationId (%20) is not trimmed/rejected and is incorrectly forwarded into downstream time-range/query logic (likely introduced by the PR), weakening endpoint contract reliability and adding avoidable operational noise.

❌ Failed (1)
Category Summary Screenshot
Edge 🟠 Conversation details route accepts a whitespace-only conversationId path value and continues processing instead of rejecting it. EDGE-3
🟠 Conversation details endpoint accepts whitespace-only conversation IDs
  • What failed: The route only rejects falsy conversationId values, so a whitespace path segment (%20) is treated as valid and forwarded into time-range lookup and trace query construction instead of returning a parameter-validation error.
  • Impact: Invalid path input can trigger unnecessary downstream requests and inconsistent error behavior instead of a clean client-facing validation response. This weakens endpoint contract reliability for callers and increases operational noise.
  • Introduced by this PR: Yes – this PR modified the relevant code
  • Steps to reproduce:
    1. Call GET /api/traces/conversations/%20?tenantId=default&projectId= with valid authentication.
    2. Observe that the request is processed through time-range lookup and downstream query logic instead of being rejected immediately.
    3. Confirm the route does not return a direct invalid-parameter response for the whitespace-only path ID.
  • Code analysis: I reviewed the conversation details route and its time-range helper. The handler checks only if (!conversationId), then immediately uses the raw value in both bounds lookup and query filter expression construction; no trim/sanitization step exists before downstream execution.
  • Why this is likely a bug: The production route accepts syntactically invalid whitespace-only IDs due to missing normalization/validation, and code paths clearly show the invalid value is consumed as if it were a real identifier.

Relevant code:

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts (lines 461-464)

const { conversationId } = await context.params;
if (!conversationId) {
  return NextResponse.json({ error: 'Conversation ID is required' }, { status: 400 });
}

agents-manage-ui/src/app/api/traces/conversations/[conversationId]/route.ts (lines 151-155)

function buildBaseExpression(conversationId: string, projectId?: string): string {
  const parts = [`${SPAN_KEYS.CONVERSATION_ID} = '${conversationId}'`];
  if (projectId) parts.push(`${SPAN_KEYS.PROJECT_ID} = '${projectId}'`);
  return parts.join(' AND ');
}

agents-manage-ui/src/lib/api/signoz-conversation-time-range.ts (lines 32-35)

const bounds = await makeManagementApiRequest<{
  data: { createdAt: string; updatedAt: string };
}>(`tenants/${tenantId}/projects/${projectId}/conversations/${conversationId}/bounds`);
✅ Passed (19)
Category Summary Screenshot
Adversarial Generic traces proxy rejected non-builder query type (clickhouse_sql) with explicit HTTP 400 error. ADV-1
Adversarial Injection-like span filter payload did not crash the UI or expose out-of-scope data in the validated scenario. ADV-2
Adversarial Unauthenticated traces access is blocked; invalid POST payload is rejected by schema validation and valid payload is denied with auth error. ADV-3
Adversarial Tenant/project tampering attempts were denied or returned safe scoped errors with no foreign trace payload exposure. ADV-4
Adversarial Rapid toggles, search, and time-range interactions converged to a stable final traces state with no trace-request or trace-console errors after environment remediation. ADV-5
Adversarial Mobile traces overview and conversation detail remained usable at 390x844 under this run's dev fallback setup. ADV-6
Edge Inverted and future time-range requests were rejected with explicit HTTP 400 validation errors. EDGE-1
Edge Batch mode rejected malformed payloads missing required sections with deterministic HTTP 400 responses. EDGE-2
Edge Span details endpoint correctly required conversationId and returned HTTP 400 when omitted. EDGE-4
Edge Conversation row with zero startTime rendered safely and omitted relative-time chip without exceptions. EDGE-5
Edge Deep-link refresh and back-forward navigation returned to a stable Conversation Details state without error. EDGE-6
Happy-path GET /api/traces?tenantId=default returned HTTP 200 with structured status/configured fields and no stack trace exposure. ROUTE-1
Happy-path Traces overview rendered aggregate cards and conversation list container for the project without runtime crash. ROUTE-2
Happy-path Pagination advanced and returned correctly with stable conversation IDs and no duplication in the validated flow. ROUTE-3
Happy-path Search flow used fallback behavior and clearing search restored the expected list state. ROUTE-4
Happy-path Deep-link conversation details view rendered summary cards and timeline for the target conversation route. ROUTE-5
Happy-path Selecting a timeline activity loaded span details and displayed structured side-pane content. ROUTE-6
Happy-path AI Calls page remained functional with deterministic traces responses, including agent filter interaction and rendered cards/charts. ROUTE-7
Happy-path Tool Calls page rendered correctly with deterministic traces responses, including date-range change and stable breakdown cards. ROUTE-8

Commit: 74b0622

View Full Run


Tell us how we did: Give Ito Feedback

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(8) Total Issues | Risk: High

🔴❗ Critical (2) ❗🔴

Inline Comments:

  • 🔴 Critical: signozHelpers.ts:8 SQL injection via unescaped tenantId/projectId in buildSecurityExpression()
  • 🔴 Critical: signoz.ts:262 SQL injection via unescaped conversation IDs from step-1 results

🟠⚠️ Major (2) 🟠⚠️

Inline Comments:

  • 🟠 Major: signozHelpers.ts:29 Null pointer when spec.filter is undefined
  • 🟠 Major: signoz.ts:264 Same null pointer issue when building conversation ID filter

🟡 Minor (3) 🟡

Inline Comments:

  • 🟡 Minor: signoz.ts:80 queryReferencesKeys doesn't check orderBy fields
  • 🟡 Minor: signoz.ts:291 Magic number for 180-day lookback period
  • 🟡 Minor: conversations/[conversationId]/route.ts:152 Unescaped conversationId in client-side filter (lower risk)

💭 Consider (1) 💭

Inline Comments:

  • 💭 Consider: conversations/[conversationId]/route.ts:458 Whitespace-only conversationId validation (flagged by Ito tests)

🕐 Pending Recommendations (0)

All prior review issues have been incorporated into this review's findings.


🚫 REQUEST CHANGES

Summary: The v4→v5 SigNoz migration is architecturally sound with good improvements (parallel batching, query consolidation, -1458 net lines). However, critical SQL injection vulnerabilities must be addressed before merge:

  1. signozHelpers.ts:8tenantId/projectId interpolated directly into filter expressions without escaping single quotes
  2. signoz.ts:262conversationIds from step-1 results interpolated without escaping

Quick fix: Import and use the existing quoteValue() helper from signoz-queries.ts (which correctly uses standard SQL '' escaping) for all string interpolation:

import { quoteValue } from '@inkeep/agents-core';

// signozHelpers.ts:8
let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = ${quoteValue(tenantId)}`;

// signoz.ts:262
const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => quoteValue(String(id))).join(', ')})`;

The null pointer issues at lines 29 and 264 also need addressing to prevent runtime crashes on queries without pre-existing filters.

Discarded (0)

No findings discarded in this review.

Reviewers (1)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
orchestrator 8 0 0 0 8 0 0
Total 8 0 0 0 8 0 0

Note: Sub-reviewers unavailable — review performed directly by orchestrator.

const SPAN_LOOKUP_TABLE = 'distributed_signoz_index_v3';

function buildSecurityExpression(tenantId: string, projectId?: string): string {
let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = '${tenantId}'`;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL SQL injection via unescaped tenantId/projectId

Issue: The tenantId and projectId values are directly interpolated into the filter expression without escaping single quotes. While these values come from authenticated middleware context, a compromised or malformed value could inject arbitrary filter logic.

Why: A tenantId like test' OR 1=1 -- would break out of the string literal and inject arbitrary SQL/filter expressions. This could potentially allow cross-tenant data access.

Fix: Use the quoteValue() helper from signoz-queries.ts which correctly escapes single quotes:

import { quoteValue } from '@inkeep/agents-core';

function buildSecurityExpression(tenantId: string, projectId?: string): string {
  let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = ${quoteValue(tenantId)}`;
  if (projectId) expr += ` AND ${SPAN_KEYS.PROJECT_ID} = ${quoteValue(projectId)}`;
  return expr;
}

Refs:

);
const securityExpr = buildSecurityExpression(tenantId, projectId);
for (const { spec } of queries) {
spec.filter = { expression: `(${spec.filter.expression}) AND ${securityExpr}` };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR Null pointer when spec.filter is undefined

Issue: This line accesses spec.filter.expression without checking if spec.filter exists. If a query arrives without a pre-existing filter (which is valid in v5), this will throw Cannot read properties of undefined (reading 'expression').

Why: The v4 code was resilient to missing filters — it would create them from scratch. This regression breaks queries that don't have pre-existing filters.

Fix: Default to a truthy base expression:

Suggested change
spec.filter = { expression: `(${spec.filter.expression}) AND ${securityExpr}` };
const existing = spec.filter?.expression ?? 'true';
spec.filter = { expression: `(${existing}) AND ${securityExpr}` };

Refs:

);
return c.json({ error: 'Internal Server Error', message: 'Failed to query SigNoz' }, 500);

const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => `'${id}'`).join(', ')})`;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL SQL injection via unescaped conversation IDs

Issue: The conversationIds array from step-1 SigNoz results is directly interpolated into the filter expression without escaping single quotes. While this data comes from SigNoz (not user input), a conversation ID containing a single quote would break the expression.

Why: Although conversation IDs are typically UUIDs, the defensive principle is to always escape user-controllable or external data. A malformed ID like conv' OR 1=1 -- could inject arbitrary filter logic.

Fix: Use the quoteValue() helper:

Suggested change
const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => `'${id}'`).join(', ')})`;
const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => quoteValue(String(id))).join(', ')})`;

You'll need to import quoteValue from @inkeep/agents-core or extract it as a shared utility.

Refs:


const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => `'${id}'`).join(', ')})`;
for (const { spec } of detailPayloadTemplate.compositeQuery.queries) {
spec.filter = { expression: `(${spec.filter.expression}) AND ${convIdExpr}` };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR Same null pointer issue as enforceSecurityFilters

Issue: This line accesses spec.filter.expression without checking if spec.filter exists. Same issue as in signozHelpers.ts:29.

Fix: Use optional chaining with a default:

Suggested change
spec.filter = { expression: `(${spec.filter.expression}) AND ${convIdExpr}` };
const existing = spec.filter?.expression ?? 'true';
spec.filter = { expression: `(${existing}) AND ${convIdExpr}` };

...(spec?.groupBy ?? []).map((g: any) => g.name),
];
return keys.some((k) => searchable.some((t: string) => t.includes(k)));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor queryReferencesKeys doesn't check orderBy fields

Issue: The function only checks filter.expression, selectFields, and groupBy for missing key references. If a missing attribute is referenced only in order, the query won't be filtered out and the retry will fail with the same error.

Why: SigNoz builder queries can reference attributes in order clauses. If the missing key error originates from an orderBy reference, the retry logic will fail to identify and remove the problematic query.

Fix: Add order fields to the searchable array:

const searchable = [
  spec?.filter?.expression ?? '',
  ...(spec?.selectFields ?? []).map((f: any) => f.name),
  ...(spec?.groupBy ?? []).map((g: any) => g.name),
  ...(spec?.order ?? []).map((o: any) => o.key?.name ?? o.columnName ?? ''),
];

return c.json({ error: 'Service Unavailable', message: 'SigNoz is not configured' }, 500);

const now = Date.now();
const lookbackMs = 180 * 24 * 60 * 60 * 1000;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor Magic number for 180-day lookback period

Issue: The 180-day lookback period is hardcoded without explanation. This wide time range may affect query performance on large ClickHouse tables.

Why: Without documentation, operators can't tune this value during performance issues. The 180-day window seems generous for a span lookup.

Fix: Extract to a named constant with documentation:

/** Maximum lookback for span lookup queries. Covers typical trace retention period. */
const SPAN_LOOKUP_LOOKBACK_MS = 180 * 24 * 60 * 60 * 1000; // 180 days

const bool = FIELD_DATA_TYPES.BOOL;

function buildBaseExpression(conversationId: string, projectId?: string): string {
const parts = [`${SPAN_KEYS.CONVERSATION_ID} = '${conversationId}'`];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor Unescaped conversationId in filter expression (lower risk)

Issue: The conversationId from URL path params is interpolated directly into the filter expression without escaping. A value like test' OR 1=1 -- would inject arbitrary filter logic.

Why: This is lower risk than the server-side issues because:

  1. The payload goes through enforceSecurityFilters() on the server which scopes to the correct tenant
  2. A user can only broaden their query scope within their own tenant

However, it still allows a user to potentially fetch data for a different conversation within their tenant by manipulating the filter.

Fix: Consider using the buildFilterExpression() helper from @/constants/signoz or escaping manually:

function buildBaseExpression(conversationId: string, projectId?: string): string {
  const escapedConvId = conversationId.replace(/'/g, "''");
  const parts = [`${SPAN_KEYS.CONVERSATION_ID} = '${escapedConvId}'`];
  if (projectId) {
    const escapedProjId = projectId.replace(/'/g, "''");
    parts.push(`${SPAN_KEYS.PROJECT_ID} = '${escapedProjId}'`);
  }
  return parts.join(' AND ');
}

@github-actions github-actions Bot deleted a comment from claude Bot Mar 23, 2026
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(7) Total Issues | Risk: High

🔴❗ Critical (2) ❗🔴

Inline Comments:

  • 🔴 Critical: signozHelpers.ts:8 SQL injection via unescaped tenantId/projectId in buildSecurityExpression()
  • 🔴 Critical: signoz.ts:262 SQL injection via unescaped conversation IDs from step-1 results

🟠⚠️ Major (2) 🟠⚠️

Inline Comments:

  • 🟠 Major: signozHelpers.ts:29 Null pointer when spec.filter is undefined
  • 🟠 Major: signoz.ts:264 Same null pointer issue when building conversation ID filter

🟡 Minor (2) 🟡

Inline Comments:

  • 🟡 Minor: signoz.ts:72-80 queryReferencesKeys doesn't check orderBy fields
  • 🟡 Minor: signoz.ts:291 Magic number for 180-day lookback period

💭 Consider (1) 💭

Inline Comments:

  • 💭 Consider: conversations/[conversationId]/route.ts:460-463 Whitespace-only conversationId validation (flagged by Ito tests)

🚫 REQUEST CHANGES

Summary: The v4→v5 SigNoz migration is architecturally sound with excellent improvements:

  • Parallel batching (3 batches via Promise.all)
  • Query consolidation (reduced from ~16 individual queries)
  • Net -1458 lines of code
  • Good defensive retry logic for missing attributes

However, critical SQL injection vulnerabilities must be addressed before merge:

Quick Fixes Required

  1. Export quoteValue from @inkeep/agents-core — The function at signoz-queries.ts:53-57 correctly escapes single quotes with '' but is currently private. Export it:

    // In signoz-queries.ts
    export function quoteValue(value: unknown): string {
      if (typeof value === 'string') return `'${value.replace(/'/g, "''")}'`;
      if (typeof value === 'boolean') return String(value);
      return String(value);
    }
  2. Use quoteValue in signozHelpers.ts:8:

    import { quoteValue, SPAN_KEYS, QUERY_TYPES } from '@inkeep/agents-core';
    
    let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = ${quoteValue(tenantId)}`;
    if (projectId) expr += ` AND ${SPAN_KEYS.PROJECT_ID} = ${quoteValue(projectId)}`;
  3. Use quoteValue in signoz.ts:262:

    const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => quoteValue(String(id))).join(', ')})`;
  4. Add null checks for spec.filter at lines 29 and 264:

    const existing = spec.filter?.expression ?? 'true';
    spec.filter = { expression: `(${existing}) AND ${securityExpr}` };

Risk Assessment

While the SQL injection vectors have reduced exploitability (values come from authenticated middleware or SigNoz responses, not direct user input), defense-in-depth requires escaping all interpolated values. The null pointer issues are more immediately impactful as they can cause runtime crashes.

Discarded (0)

No findings discarded in this review.

Reviewers (1)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
orchestrator 7 0 0 0 7 0 0
Total 7 0 0 0 7 0 0

Note: This is a re-review consolidating findings from prior automated reviews. Sub-reviewers were not dispatched as the critical issues have been consistently identified across multiple prior review runs.

const SPAN_LOOKUP_TABLE = 'distributed_signoz_index_v3';

function buildSecurityExpression(tenantId: string, projectId?: string): string {
let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = '${tenantId}'`;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL SQL injection via unescaped tenantId/projectId

Issue: The tenantId and projectId values are directly interpolated into the filter expression without escaping single quotes. A value containing ' would break out of the string literal.

Why: While these values come from authenticated middleware context (reducing exploitability), a compromised or malformed value could inject arbitrary filter logic, potentially enabling cross-tenant data access. Defense-in-depth requires escaping all interpolated values.

Fix: Use the quoteValue() helper from signoz-queries.ts which correctly escapes single quotes with standard SQL '':

Suggested change
let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = '${tenantId}'`;
let expr = `${SERVICE_NAME_FILTER} AND ${SPAN_KEYS.TENANT_ID} = ${quoteValue(tenantId)}`;
if (projectId) expr += ` AND ${SPAN_KEYS.PROJECT_ID} = ${quoteValue(projectId)}`;

You'll need to export quoteValue from @inkeep/agents-core and import it here.

Refs:

);
const securityExpr = buildSecurityExpression(tenantId, projectId);
for (const { spec } of queries) {
spec.filter = { expression: `(${spec.filter.expression}) AND ${securityExpr}` };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR Null pointer when spec.filter is undefined

Issue: This line accesses spec.filter.expression without checking if spec.filter exists. If a query arrives without a pre-existing filter (which is valid in v5), this will throw Cannot read properties of undefined (reading 'expression').

Why: The v4 code was resilient to missing filters. This regression breaks queries that don't have pre-existing filters, causing runtime crashes.

Fix: Default to a truthy base expression:

Suggested change
spec.filter = { expression: `(${spec.filter.expression}) AND ${securityExpr}` };
const existing = spec.filter?.expression ?? 'true';
spec.filter = { expression: `(${existing}) AND ${securityExpr}` };

Refs:

);
return c.json({ error: 'Internal Server Error', message: 'Failed to query SigNoz' }, 500);

const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => `'${id}'`).join(', ')})`;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL SQL injection via unescaped conversation IDs

Issue: The conversationIds array from step-1 SigNoz results is directly interpolated into the filter expression without escaping single quotes. A conversation ID containing ' would break out of the string literal.

Why: Although conversation IDs are typically UUIDs from SigNoz responses, the defensive principle is to always escape external data. A malformed ID like conv' OR 1=1 -- could inject arbitrary filter logic.

Fix: Use the quoteValue() helper (you'll need to export it from @inkeep/agents-core):

Suggested change
const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => `'${id}'`).join(', ')})`;
const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => quoteValue(String(id))).join(', ')})`;

Refs:


const convIdExpr = `${SPAN_KEYS.CONVERSATION_ID} IN (${conversationIds.map((id) => `'${id}'`).join(', ')})`;
for (const { spec } of detailPayloadTemplate.compositeQuery.queries) {
spec.filter = { expression: `(${spec.filter.expression}) AND ${convIdExpr}` };
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR Same null pointer issue as enforceSecurityFilters

Issue: This line accesses spec.filter.expression without checking if spec.filter exists. Same issue as in signozHelpers.ts:29.

Why: If a query in detailPayloadTemplate lacks a pre-existing filter, this will throw a runtime error.

Fix: Use optional chaining with a default:

Suggested change
spec.filter = { expression: `(${spec.filter.expression}) AND ${convIdExpr}` };
const existing = spec.filter?.expression ?? 'true';
spec.filter = { expression: `(${existing}) AND ${convIdExpr}` };

Comment on lines +72 to +80
function queryReferencesKeys(query: any, keys: string[]): boolean {
const spec = query?.spec;
const searchable = [
spec?.filter?.expression ?? '',
...(spec?.selectFields ?? []).map((f: any) => f.name),
...(spec?.groupBy ?? []).map((g: any) => g.name),
];
return keys.some((k) => searchable.some((t: string) => t.includes(k)));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor queryReferencesKeys doesn't check orderBy fields

Issue: The function only checks filter.expression, selectFields, and groupBy for missing key references. If a missing attribute is referenced only in order, the query won't be filtered out and the retry will fail with the same error.

Why: SigNoz builder queries can reference attributes in order clauses. If the missing key error originates from an orderBy reference, the retry logic will fail to identify and remove the problematic query.

Fix: Add order fields to the searchable array:

const searchable = [
  spec?.filter?.expression ?? '',
  ...(spec?.selectFields ?? []).map((f: any) => f.name),
  ...(spec?.groupBy ?? []).map((g: any) => g.name),
  ...(spec?.order ?? []).map((o: any) => o.key?.name ?? o.columnName ?? ''),
];

return c.json({ error: 'Service Unavailable', message: 'SigNoz is not configured' }, 500);

const now = Date.now();
const lookbackMs = 180 * 24 * 60 * 60 * 1000;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor Magic number for 180-day lookback period

Issue: The 180-day lookback period is hardcoded without explanation. This wide time range may affect query performance on large ClickHouse tables.

Why: Without documentation, operators can't tune this value during performance issues.

Fix: Extract to a named constant with documentation:

Suggested change
const lookbackMs = 180 * 24 * 60 * 60 * 1000;
/** Maximum lookback for span lookup queries. Covers typical trace retention period. */
const SPAN_LOOKUP_LOOKBACK_MS = 180 * 24 * 60 * 60 * 1000; // 180 days
const lookbackMs = SPAN_LOOKUP_LOOKBACK_MS;

Comment on lines +154 to +158
function buildBaseExpression(conversationId: string, projectId?: string): string {
const parts = [`${SPAN_KEYS.CONVERSATION_ID} = '${conversationId}'`];
if (projectId) parts.push(`${SPAN_KEYS.PROJECT_ID} = '${projectId}'`);
return parts.join(' AND ');
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor Unescaped conversationId in filter expression

Issue: The conversationId from URL path params is interpolated directly into the filter expression without escaping single quotes. A value like test' OR 1=1 -- would inject arbitrary filter logic.

Why: This is lower risk than the server-side issues because:

  1. The payload goes through enforceSecurityFilters() on the server which scopes to the correct tenant
  2. A user can only broaden their query scope within their own tenant

However, it still allows a user to potentially fetch data for a different conversation within their tenant by manipulating the filter.

Fix: Consider escaping manually:

function buildBaseExpression(conversationId: string, projectId?: string): string {
  const escapedConvId = conversationId.replace(/'/g, "''");
  const parts = [`${SPAN_KEYS.CONVERSATION_ID} = '${escapedConvId}'`];
  if (projectId) {
    const escapedProjId = projectId.replace(/'/g, "''");
    parts.push(`${SPAN_KEYS.PROJECT_ID} = '${escapedProjId}'`);
  }
  return parts.join(' AND ');
}

@github-actions github-actions Bot deleted a comment from claude Bot Mar 23, 2026
@shagun-singh-inkeep shagun-singh-inkeep added this pull request to the merge queue Mar 23, 2026
Merged via the queue into main with commit 6ca8164 Mar 23, 2026
31 of 32 checks passed
@itoqa
Copy link
Copy Markdown

itoqa Bot commented Mar 23, 2026

Ito Test Report ❌

20 test cases ran. 1 failed, 19 passed.

Overall, 19 of 20 test cases passed, confirming strong traces functionality across authenticated and unauthenticated routing, overview/detail rendering, span drill-down, deep-link/back-forward stability, multi-tab isolation, mobile flow, proxy error recovery, and API guardrails (including deterministic 400/404 validation and forbidden access behavior). The single notable issue was a Medium-severity, pre-existing client-side race in the traces overview where rapid filter/search/pagination interactions can let out-of-order responses overwrite newer state and show stale rows/pagination, which risks misleading debugging decisions.

❌ Failed (1)
Category Summary Screenshot
Adversarial 🟠 Rapid filter/search/pagination interactions can allow stale responses to overwrite newer intended UI state. ADV-5
🟠 Rapid interactions can overwrite newer traces state with stale responses
  • What failed: Older requests can finish after newer requests and still call state setters, so the UI can show outdated rows/pagination instead of the user's final selected state.
  • Impact: Users can make quick filter/search/pagination updates and still end up seeing stale data that does not match the final controls. This undermines confidence in trace investigation results and can lead to wrong debugging decisions.
  • Introduced by this PR: No – pre-existing bug (code not changed in this PR)
  • Steps to reproduce:
    1. Open traces overview with conversation data and available pagination.
    2. Rapidly change search and filter values while clicking next/previous pagination during in-flight requests.
    3. Observe rows/page state can reflect an older request rather than the user's final selected state.
  • Code analysis: useConversationStats starts async fetches for each state change and pagination action, but it does not cancel prior requests and does not gate state updates by request version. The run-time code-change log only shows config/dev-bypass patches in API route/config files, not this client hook, so this stale overwrite risk is a production-path issue rather than a test-only mock artifact.
  • Why this is likely a bug: Concurrent requests are allowed to resolve out of order and directly update UI state without cancellation or stale-response guards, which can deterministically produce non-convergent final state.

Relevant code:

agents-manage-ui/src/hooks/use-traces.ts (lines 70-97)

const fetchData = useCallback(
  async (page: number) => {
    try {
      setLoading(true);
      setError(null);

      const client = getSigNozStatsClient(options?.tenantId);
      const result = await client.getConversationStats(
        currentStartTime,
        currentEndTime,
        options?.filters,
        options?.projectId,
        { page, limit: pageSize },
        options?.searchQuery,
        options?.agentId,
        options?.hasErrors
      );

      setStats(result.data);
      setPaginationInfo(result.pagination);

agents-manage-ui/src/hooks/use-traces.ts (lines 125-171)

const nextPage = useCallback(() => {
  if (paginationInfo?.hasNextPage) {
    const nextPageNum = currentPage + 1;
    setCurrentPage(nextPageNum);
    fetchData(nextPageNum);
  }
}, [currentPage, paginationInfo?.hasNextPage, fetchData]);

useEffect(() => {
  setCurrentPage(1);
  fetchData(1);
}, [
  options?.startTime,
  options?.endTime,
  options?.filters,
  options?.searchQuery,
  options?.hasErrors,
  pageSize,
]);
✅ Passed (19)
Category Summary Screenshot
Adversarial Tenant-injection payloads were rejected with HTTP 400 in single and batch traces requests. ADV-1
Adversarial Non-builder clickhouse_sql payloads were rejected with HTTP 400 for both single and batch paths. ADV-2
Adversarial Authorization path review confirms unauthorized project access is guarded by explicit forbidden checks. ADV-3
Adversarial Malicious conversation/span identifier strings did not indicate an injection flaw in production code paths. ADV-4
Adversarial Multi-tab conversation state stayed isolated with no cross-conversation bleed after expansion and refresh actions. ADV-6
Edge Future end timestamps are rejected with deterministic HTTP 400 and explicit invalid time-range messaging. EDGE-1
Edge Invalid ordering payloads (start == end, start > end) are rejected with deterministic HTTP 400 validation errors. EDGE-2
Edge Span details endpoint enforces required conversationId and returns HTTP 400 for malformed/missing parameter paths without 500s. EDGE-3
Edge Unknown span lookup with a valid conversationId returns controlled HTTP 404 (Span not found). EDGE-4
Edge Injected rows with startTime as 0 and null-equivalent values rendered without invalid date text in the conversation list. EDGE-5
Edge Refresh and back/forward loops preserved conversation URL-content integrity across repeated cycles. EDGE-6
Mobile iPhone 12 viewport flow covered overview, detail, span access, and return navigation without blocking the scenario. MOBILE-1
Res Forced 400/429/503 /api/traces responses showed actionable SigNoz configuration messaging, and refresh recovered the overview. RES-1
Happy-path Traces overview rendered without a full-page crash and handled SigNoz configuration issues via in-page warning behavior. ROUTE-1
Happy-path Conversation deep-link loaded and rendered a full activity timeline in the verified local run. ROUTE-2
Happy-path Span drill-down reached the span lookup endpoint with the expected conversation-scoped request path. ROUTE-3
Happy-path Default overview path uses batch mode requests and pagination updates rows/page state coherently. ROUTE-4
Happy-path Applying search + span filter switches to non-batch requests, and clearing restores batch mode. ROUTE-5
Sec Unauthenticated navigation redirected to login with returnUrl, confirming auth-required access control for traces routes. SEC-1

Commit: 336f5f2

View Full Run


Tell us how we did: Give Ito Feedback

@claude claude Bot mentioned this pull request Mar 24, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant