[c-930] use metrics v2 to trace relevant code paths#10
[c-930] use metrics v2 to trace relevant code paths#10
Conversation
|
Hey there and thank you for opening this pull request! 👋🏼 We require pull request titles to follow the Conventional Commits specification and it looks like your proposed title needs to be adjusted. Details: General format: |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughMetrics instrumentation (Injective/cometbft meters) is threaded into node construction and subsystems (state, mempool, consensus, statesync). Numerous methods are wrapped with timing contexts; constructors and test call sites updated to accept an extra meter parameter. Dependencies were bumped (Go 1.23.9 + module upgrades). Changes
Sequence DiagramsequenceDiagram
participant Caller as Caller
participant NewNode as NewNode()
participant Node as Node
participant Subsystems as Subsystems\n(State/Mempool/Consensus/Statesync)
participant Meter as Meter\n(NewNilMeter / Injected)
participant Method as InstrumentedMethod
Caller->>NewNode: NewNode(ctx, ..., meter?)
NewNode->>Meter: meter := param or NewNilMeter()
NewNode->>Subsystems: create subsystems with meter.SubMeter("svc")
Subsystems->>Meter: request sub-meter (svc-tagged)
Meter-->>Subsystems: return scoped sub-meter
Caller->>Node: call method (e.g., CheckTx / Receive / Load)
Node->>Method: Method starts meter.FuncTimingCtx(tag)
Method->>Method: execute logic
Method-->>Node: defer stop() after execution
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.4)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
node/node.go (1)
277-303:⚠️ Potential issue | 🔴 CriticalThis is a source-incompatible change to exported constructors.
Adding
meterahead of the variadicoptionschanges the call shape of bothNewNodeandNewNodeWithCliParams, breaking all existing downstream callers. All call sites must now include a positionalmeterargument before options. Preserve the current signatures and inject tracing via anOptionor a separate helper instead.Additionally, the struct literal stores a fresh
metrics.NewNilMeter()instead of using the normalizedmeterparameter passed to the function, discarding the caller-provided meter value.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/node.go` around lines 277 - 303, The exported constructors NewNode and NewNodeWithCliParams were changed to take a positional meter argument before variadic options (breaking callers) and the implementation overwrites the passed meter with metrics.NewNilMeter(); revert the public API to its original signature (keep meter injected via an Option or helper rather than a new positional parameter) by removing the new positional meter parameter from NewNode/NewNodeWithCliParams and instead accept a Meter through an Option (e.g., WithMeter) or a separate setter; also fix the Node construction code that currently sets metrics.NewNilMeter() so it uses the normalized meter value provided via the Option (or falls back to metrics.NewNilMeter() only when no Option was supplied) to ensure caller-provided meters are preserved.
🧹 Nitpick comments (2)
internal/consensus/reactor.go (1)
272-273: Rejected messages are still invisible to this span.Because the timer starts here, any
MsgFromProto/ValidateBasicfailure returns before tracing begins. If this span is supposed to represent the full receive path, start it earlier and fall back to a generic tag until the concrete message type is known.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/consensus/reactor.go` around lines 272 - 273, Start the receive timing span before any MsgFromProto/ValidateBasic work so rejected messages are included: move the conR.meter.FuncTimingCtx(context.Background(), "Receive", ...) call to before MsgFromProto and ValidateBasic and initially use a generic tag (e.g. injmetrics.Tag("msg","unknown") or similar) until fmt.Sprintf("%T", msg) is available; once MsgFromProto succeeds, replace or emit an updated tag with the concrete type (using the meter API or by ending and restarting the span if needed) so the span covers the full receive path and still records the real message type when known.node/node_test.go (1)
491-492: Please add one constructor case with a non-nil meter.All of the updated tests still pass
nil, so the new meter plumbing is never exercised. A tiny fake meter assertion in this file would catch forwarding regressions immediately.Also applies to: 535-536, 568-569, 602-603, 624-625, 648-649
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@node/node_test.go` around lines 491 - 492, Add one test constructor invocation that passes a non-nil fake meter to NewNode (instead of the nil meter cases currently used) so the new metrics plumbing is exercised; create a tiny fake meter implementation in node_test.go, pass it where DefaultMetricsProvider(config.Instrumentation) (or the metrics argument) is provided to NewNode, and add a simple assertion that the fake meter was forwarded (e.g., the created Node exposes/holds the fake meter or the fake meter observed a call). Reference NewNode and DefaultMetricsProvider (and the existing CustomReactors usage) when locating the call sites to update.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@go.mod`:
- Line 38: Replace the vulnerable module versions in go.mod: bump
google.golang.org/grpc from v1.75.0 to v1.79.3 (or later) and bump
go.opentelemetry.io/otel/sdk from v1.38.0 to v1.40.0 (or later); after updating
the module lines for these symbols, run go mod tidy and re-run tests/build to
ensure dependency graph and imports resolve cleanly.
In `@internal/consensus/state.go`:
- Around line 226-229: The StateMeter option currently allows replacing the safe
default meter with nil causing panics; update StateMeter (used to configure
cs.meter) to ignore nil inputs—either do nothing when meter is nil or assign
injmetrics.NewNilMeter() instead—so NewState's default
(injmetrics.NewNilMeter()) is preserved; locate StateMeter and ensure the setter
only sets cs.meter when the provided meter is non-nil (or substitutes
injmetrics.NewNilMeter()) to prevent handleMsg from calling
cs.meter.FuncTimingCtx on nil.
In `@mempool/clist_mempool.go`:
- Around line 259-262: The WithMeter option currently overwrites the defensive
default metrics.NewNilMeter(), so change WithMeter(func mem *CListMempool) to
only set mem.meter when the provided meter is non-nil (i.e., if m != nil) to
preserve the nil-safe default; reference the WithMeter function and the
CListMempool.meter field (used by GetSenders, CheckTx, ReapMaxBytesMaxGas,
ReapMaxTxs, GetTxByHash, Update, recheckTxs) and ensure NewCListMempool's
metrics.NewNilMeter() remains in place unless a real meter is provided.
- Around line 798-800: The timing context returned by mem.meter.FuncTimingCtx is
being discarded—change the call so the derived context is assigned (e.g., ctx,
stop := mem.meter.FuncTimingCtx(context.Background(), "Update")) and ensure that
same ctx is passed into recheckTxs (and any subsequent calls within Update) so
the span/trace from FuncTimingCtx is propagated; keep the defer stop() to end
the timing span.
In `@node/node.go`:
- Line 598: NewNodeWithCliParams normalizes a meter but then the Node struct
literal unconditionally sets meter: metrics.NewNilMeter(), which discards the
caller-supplied/normalized meter; replace the hardcoded metrics.NewNilMeter() in
the Node initialization with the local normalized meter variable (the one
computed earlier in NewNodeWithCliParams) so the Node.meter field uses the
normalized value instead of always creating a fresh nil meter.
In `@node/setup.go`:
- Line 124: DefaultNewNode currently passes nil as the final Provider argument
to NewNodeWithCliParams, making the metrics Meter unreachable; replace that nil
with a concrete metrics Provider by constructing and passing the default
provider (use the package's DefaultMetricsProvider or the provider constructor
exposed by the metrics package) so DefaultNewNode forwards a real Provider to
NewNodeWithCliParams rather than nil.
---
Outside diff comments:
In `@node/node.go`:
- Around line 277-303: The exported constructors NewNode and
NewNodeWithCliParams were changed to take a positional meter argument before
variadic options (breaking callers) and the implementation overwrites the passed
meter with metrics.NewNilMeter(); revert the public API to its original
signature (keep meter injected via an Option or helper rather than a new
positional parameter) by removing the new positional meter parameter from
NewNode/NewNodeWithCliParams and instead accept a Meter through an Option (e.g.,
WithMeter) or a separate setter; also fix the Node construction code that
currently sets metrics.NewNilMeter() so it uses the normalized meter value
provided via the Option (or falls back to metrics.NewNilMeter() only when no
Option was supplied) to ensure caller-provided meters are preserved.
---
Nitpick comments:
In `@internal/consensus/reactor.go`:
- Around line 272-273: Start the receive timing span before any
MsgFromProto/ValidateBasic work so rejected messages are included: move the
conR.meter.FuncTimingCtx(context.Background(), "Receive", ...) call to before
MsgFromProto and ValidateBasic and initially use a generic tag (e.g.
injmetrics.Tag("msg","unknown") or similar) until fmt.Sprintf("%T", msg) is
available; once MsgFromProto succeeds, replace or emit an updated tag with the
concrete type (using the meter API or by ending and restarting the span if
needed) so the span covers the full receive path and still records the real
message type when known.
In `@node/node_test.go`:
- Around line 491-492: Add one test constructor invocation that passes a non-nil
fake meter to NewNode (instead of the nil meter cases currently used) so the new
metrics plumbing is exercised; create a tiny fake meter implementation in
node_test.go, pass it where DefaultMetricsProvider(config.Instrumentation) (or
the metrics argument) is provided to NewNode, and add a simple assertion that
the fake meter was forwarded (e.g., the created Node exposes/holds the fake
meter or the fake meter observed a call). Reference NewNode and
DefaultMetricsProvider (and the existing CustomReactors usage) when locating the
call sites to update.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cb66a95d-b185-48b4-81ca-e1a883f28e67
⛔ Files ignored due to path filters (2)
go.sumis excluded by!**/*.sumgo.workis excluded by!**/*.work
📒 Files selected for processing (13)
abci/tutorials/abci-v2-forum-app/forum.gogo.modinternal/consensus/reactor.gointernal/consensus/state.gomempool/clist_mempool.gonode/node.gonode/node_test.gonode/setup.gorpc/test/helpers.gostate/store.gostatesync/reactor.gostatesync/reactor_test.gotest/e2e/node/main.go
| github.com/stretchr/testify v1.11.1 | ||
| golang.org/x/crypto v0.41.0 | ||
| golang.org/x/net v0.43.0 | ||
| google.golang.org/grpc v1.75.0 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python - <<'PY'
import json
import urllib.request
queries = [
("google.golang.org/grpc", "1.75.0"),
("go.opentelemetry.io/otel/sdk", "1.38.0"),
]
for pkg, version in queries:
body = json.dumps({
"package": {"ecosystem": "Go", "name": pkg},
"version": version,
}).encode()
req = urllib.request.Request(
"https://api.osv.dev/v1/query",
data=body,
headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
payload = json.load(resp)
print(json.dumps({
"package": pkg,
"version": version,
"vulns": payload.get("vulns", []),
}, indent=2))
PYRepository: InjectiveLabs/cometbft
Length of output: 2648
🏁 Script executed:
cat go.mod | head -50Repository: InjectiveLabs/cometbft
Length of output: 1787
🏁 Script executed:
sed -n '38p;141p' go.modRepository: InjectiveLabs/cometbft
Length of output: 147
🌐 Web query:
GO-2026-4762 GHSA-p77j-4mvh-x3m3 google.golang.org/grpc authorization bypass
💡 Result:
GO-2026-4762 (GHSA-p77j-4mvh-x3m3, CVE-2026-33186) is a critical authorization bypass vulnerability (CVSS 9.1) in google.golang.org/grpc versions before 1.79.3. The gRPC-Go server accepts HTTP/2 requests with a :path pseudo-header missing the leading slash (e.g., "Service/Method" instead of "/Service/Method"). It routes these to the correct handler, but authorization interceptors (e.g., google.golang.org/grpc/authz) evaluate the raw non-canonical path. Deny rules using canonical paths (with "/") fail to match, allowing bypass if a fallback allow rule exists. Impacts servers using path-based authorization interceptors with deny rules for canonical paths and fallback allows. Attackers send crafted HTTP/2 frames. Fixed in v1.79.3: rejects non-canonical :path with codes.Unimplemented before interceptors. Upgrade to v1.79.3 or later. Mitigate with validating interceptors or default-deny policies.
Citations:
- 1: https://test.osv.dev/vulnerability/GO-2026-4762
- 2: https://pkg.go.dev/vuln/list
- 3: GHSA-p77j-4mvh-x3m3
- 4: https://www.thehackerwire.com/grpc-go-critical-authorization-bypass-cve-2026-33186/
- 5: https://www.resolvedsecurity.com/vulnerability-catalog/CVE-2026-33186
- 6: GHSA-p77j-4mvh-x3m3
- 7: https://dbugs.ptsecurity.com/vulnerability/PT-2026-26207
- 8: https://github.com/grpc/grpc-go/releases/tag/v1.79.3
🌐 Web query:
GO-2026-4394 GHSA-9h8m-3fm2-qjrq go.opentelemetry.io/otel PATH hijack RCE
💡 Result:
GO-2026-4394 (GHSA-9h8m-3fm2-qjrq, CVE-2026-24051) is a high-severity vulnerability in the OpenTelemetry Go SDK (go.opentelemetry.io/otel/sdk) enabling arbitrary code execution (RCE) via PATH hijacking on macOS/Darwin systems. Summary: The resource detection code in sdk/resource/host_id.go executes the 'ioreg' system command without an absolute path, relying on the PATH environment variable. An attacker able to modify PATH (local low-priv attacker) can place a malicious executable named 'ioreg' in a directory prioritized in PATH, leading to RCE in the application's context when resource.WithHostID or similar host detection is used. Affected versions: v1.21.0 to v1.39.0 (Go modules all versions in that range). Patched in: v1.40.0 via commit d45961b (uses absolute path for ioreg on macOS). Severity: High (CVSS 7.0: AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H). Official sources confirm the issue only affects macOS due to ioreg usage; other platforms use file reads or different safe methods. Mitigation: Upgrade to v1.40.0 or later. Avoid using resource.WithHostID if possible. Do not run in environments where PATH is attacker-controllable.
Citations:
- 1: https://osv.dev/vulnerability/GHSA-9h8m-3fm2-qjrq
- 2: https://db.gcve.eu/vuln/ghsa-9h8m-3fm2-qjrq
- 3: GHSA-9h8m-3fm2-qjrq
- 4: https://advisories.gitlab.com/pkg/golang/go.opentelemetry.io/otel/sdk/CVE-2026-24051/
- 5: https://pkg.go.dev/vuln/GO-2026-4394
Upgrade the vulnerable dependency pins before merging.
google.golang.org/grpc v1.75.0 contains a critical authorization-bypass vulnerability (GO-2026-4762 / GHSA-p77j-4mvh-x3m3, CVSS 9.1). Servers using path-based authorization interceptors are at risk when HTTP/2 requests with non-canonical :path pseudo-headers bypass deny rules. Upgrade to v1.79.3 or later.
go.opentelemetry.io/otel/sdk v1.38.0 contains a high-severity PATH-hijacking RCE vulnerability (GO-2026-4394 / GHSA-9h8m-3fm2-qjrq, CVSS 7.0) on macOS systems due to unqualified command execution. Upgrade to v1.40.0 or later.
Both versions are now in this diff.
🧰 Tools
🪛 OSV Scanner (2.3.5)
[CRITICAL] 38-38: google.golang.org/grpc 1.75.0: Authorization bypass in gRPC-Go via missing leading slash in :path in google.golang.org/grpc
(GO-2026-4762)
[CRITICAL] 38-38: google.golang.org/grpc 1.75.0: gRPC-Go has an authorization bypass via missing leading slash in :path
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@go.mod` at line 38, Replace the vulnerable module versions in go.mod: bump
google.golang.org/grpc from v1.75.0 to v1.79.3 (or later) and bump
go.opentelemetry.io/otel/sdk from v1.38.0 to v1.40.0 (or later); after updating
the module lines for these symbols, run go mod tidy and re-run tests/build to
ensure dependency graph and imports resolve cleanly.
Introduces injective metrics and traces code paths of interest
PR checklist
.changelog(we use unclog to manage our changelog)docs/orspec/) and code commentsSummary by CodeRabbit
New Features
Chores
Refactor
Tests