
test(kitchen-sink): add sqlite memory soak harness #4910

Closed
NathanFlurry wants to merge 1 commit into sqlite-soak/runtime-diagnostics from sqlite-soak/kitchen-harness

Conversation

@NathanFlurry
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app

railway-app Bot commented May 3, 2026

🚅 Deployed to the rivet-pr-4910 environment in rivet-frontend

Service            | Status                      | Web | Updated (UTC)
kitchen-sink       | ❌ Build Failed (View Logs) | Web | May 4, 2026 at 4:13 am
website            | 😴 Sleeping (View Logs)     | Web | May 3, 2026 at 10:59 pm
ladle              | ❌ Build Failed (View Logs) | Web | May 3, 2026 at 10:54 pm
mcp-hub            | ✅ Success (View Logs)      | Web | May 3, 2026 at 10:50 pm
frontend-cloud     | ❌ Build Failed (View Logs) | Web | May 3, 2026 at 10:49 pm
frontend-inspector | ❌ Build Failed (View Logs) | Web | May 3, 2026 at 10:49 pm

@railway-app railway-app Bot temporarily deployed to rivet-frontend / rivet-pr-4910 May 3, 2026 22:47 Destroyed
Member Author

NathanFlurry commented May 3, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Contributor

claude Bot commented May 3, 2026

PR Review: test(kitchen-sink): add sqlite memory soak harness

Overview

This PR adds a SQLite memory soak test harness for the kitchen-sink example, consisting of:

  • Test actor (sqlite-memory-pressure.ts): A Rivet Actor that exercises SQLite under load (bulk inserts, scans, aggregates, and integrity checks).
  • Soak harness (sqlite-memory-soak.ts): Spawns engine + kitchen-sink, drives concurrent actors, samples /proc memory metrics, and verifies sleep-based memory reclamation.
  • Report generator (proc-metrics-report.ts): Offline Plotly-based HTML chart renderer from the harness JSONL event log.
  • Agent notes (sqlite-memory-soak-issues.md): Run diary documenting four soak runs, open issues, and theories.

Overall this is solid exploratory tooling. The structured JSONL event log + offline renderer is a clean design. A few issues worth addressing before relying on results from this harness:


Issues

Correctness

  • quantile is off-by-one (sqlite-memory-soak.ts): sorted[Math.floor(q * sorted.length)] is wrong; for p95 on a 100-element array it returns sorted[95] (the 96th element), and at q = 1 it indexes past the end of the array entirely. Use sorted[Math.floor(q * (sorted.length - 1))] instead; a corrected sketch follows this list. As written, it gives a slightly pessimistic reading of latency percentiles.

  • copyNativeMetrics camelCase/snake_case dual-handling (sqlite-memory-pressure.ts): The defensive property-name fallback (e.g. m.pageCache ?? m.page_cache) suggests SqliteNativeMetrics returns inconsistent field names from the NAPI layer. The workaround is correct but masks an upstream bug. Per the layer constraints in CLAUDE.md, NAPI bindings must translate cleanly — the inconsistency should be fixed in rivetkit-napi rather than papered over in a test file.

  • configureServerlessRunner uses datacenters[0] without name matching: If the local engine returns multiple datacenters or in an unexpected order, the wrong one gets configured. Add a filter by a known datacenter name/tag.
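
For reference, a minimal corrected sketch of the quantile helper. Only the function name and the indexing expression come from the review; the surrounding shape (raw sample array in, copy and sort) is an assumption about the harness, not code copied from the PR:

```ts
// Corrected quantile helper (nearest-rank style over a zero-based array).
function quantile(samples: number[], q: number): number {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  // Scale q over [0, length - 1] so q = 1 hits the last element.
  // The original Math.floor(q * sorted.length) reads one slot too far:
  // sorted[95] for p95 of 100 samples, and sorted[100] (undefined) at q = 1.
  return sorted[Math.floor(q * (sorted.length - 1))];
}

// p95 of 1..100 is now 95 (sorted[94]), where the old indexing returned 96.
const latenciesMs = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(quantile(latenciesMs, 0.95)); // 95
```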

Reliability

  • Sleep verification is racy (waitForActorSleepLog): The log-scraping approach polls at 100ms intervals against a saved byte offset. Log buffering delays are exactly what caused the "timed out waiting for actor sleeping log" failure documented in the notes. Consider using the engine's actor status API to poll for sleep state instead of scraping log text — that removes the dependency on log buffering and text matching.

  • Drain after startup failure hangs indefinitely: When failed = true stops new spawns in runWithSpikeConcurrency, already-running workers continue, and Promise.allSettled(workers) waits forever if any worker is stuck (e.g. actor_ready_timeout). The notes document this as Issue 3. A drain grace period would prevent the harness from hanging; see the sketch after this list.

  • DEFAULT_ENGINE_PORT = 6520 vs blocked port 6420: The harness blocks port 6420 to avoid colliding with the normal local engine, but starts its own engine on 6520. If a dev engine is already on 6520, the failure at waitForHttpOk won't have a clear message. Consider also blocking 6520 with an explicit early error, or documenting that only non-default ports are supported.
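
A sketch of the suggested drain grace period, assuming workers is the array of in-flight worker promises that runWithSpikeConcurrency already awaits; drainWorkers and drainTimeoutMs are hypothetical names for illustration:

```ts
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Bounded drain: race the normal Promise.allSettled drain against a grace
// period so a single stuck worker (e.g. one blocked on actor_ready_timeout)
// cannot hang the whole harness.
async function drainWorkers(
  workers: Promise<unknown>[],
  drainTimeoutMs: number,
): Promise<boolean> {
  const drained = await Promise.race([
    Promise.allSettled(workers).then(() => true),
    sleep(drainTimeoutMs).then(() => false),
  ]);
  if (!drained) {
    // Abandon in-flight workers rather than awaiting them forever; log so
    // the run report records that the drain was partial.
    console.warn(
      `drain timed out after ${drainTimeoutMs}ms; abandoning in-flight workers`,
    );
  }
  return drained;
}
```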

Code Quality

  • Commented-out dead code in the test actor: The releaseStorage action body and the runCycle delete/VACUUM block are fully commented out with inline context. The intent is clear, but dead code in a test actor can confuse future readers about which invariants are active. Either remove and note intent in a doc comment, or gate on a runtime argument.

  • --actors pool exhaustion is silent in fixed-concurrency mode: The default is 4 actors, but only runWithSpikeConcurrency throws when the pool is exhausted. In fixed-concurrency mode, workers silently stop once nextActor >= args.actors, which produces a misleading measurement if a user forgets to set --actors high enough. A guard sketch follows this list.
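
One possible guard for the fixed-concurrency path, sketched under the assumption that workers pull actor indices from a shared counter. nextActor and args.actors use the names quoted above; takeNextActor is a hypothetical helper:

```ts
let nextActor = 0;
const args = { actors: 4 }; // 4 is the default mentioned in the review

function takeNextActor(): number {
  if (nextActor >= args.actors) {
    // Mirror what runWithSpikeConcurrency already does on exhaustion: abort
    // the run instead of quietly idling workers and skewing the measurement.
    throw new Error(
      `actor pool exhausted at ${args.actors}; rerun with a larger --actors`,
    );
  }
  return nextActor++;
}
```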


Positive Notes

  • The JSONL event log + offline renderer split is clean — measurement and reporting are fully decoupled.
  • /proc-based sampling (RSS, HWM, VmSize, smaps PSS, IO, FD count) gives a much more complete picture than process.memoryUsage() alone.
  • MALLOC_ARENA_MAX=2 + MALLOC_TRIM_THRESHOLD_ defaults are good for a soak measuring RSS reclamation.
  • Triangle-wave spike shape (runWithSpikeConcurrency) provides a repeating, predictable load profile that's easy to correlate with memory traces.
  • The agent notes are thorough — capturing raw numbers, open issues, and theories for each run is exactly the kind of context that makes soak results usable later.

Summary

The two most impactful fixes are the sleep verification reliability (log-scraping vs. status API) and the unbounded drain on failure — both directly caused run failures documented in the notes. The quantile off-by-one and the NAPI field naming inconsistency are correctness issues worth fixing before drawing latency conclusions. The rest are minor quality improvements.

@NathanFlurry
Member Author

Landed in main via a stack-merge fast-forward push. The commits are already in main, so closing this PR to match.
