search: logsim search daemon + IR API + TUI + --to local forwarding #48
Draft
hyfather wants to merge 8 commits into
…warding

`logsim search` starts an in-process HTTP daemon on :3700 that ingests Splunk/Cribl HEC and exposes a small set of information-retrieval functions (get_raw, get_summary, get_distribution, get_top_values) over HTTP, plus a bubbletea TUI for spot-checking what's been ingested. Each db is identified by a 6-char alphanumeric code; ctrl-c tears down every db on shutdown.

The IR function set is deliberately narrow so it stays expressible in Splunk/Cribl/Datadog/Quickwit query languages; the Backend interface in pkg/search is the swappable seam for those future implementations. DuckDB is the v1 backend; it is gated behind `//go:build cgo` with a non-CGO stub so api/* (Vercel) builds and `CGO_ENABLED=0 go test ./...` both stay clean.

`logsim run <scenario> --to local` now forwards via HEC to the daemon, auto-creating a db whose code is derived from the scenario slug (`web-service` → `webser`); `--to local:<code>` overrides.
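The slug-to-code mapping mentioned above (`web-service` → `webser`) could be produced by something like the following. This is a minimal sketch, not the PR's code: `deriveCode` and `codeLen` are hypothetical names, and the rule used here (lowercase, keep only `[a-z0-9]`, truncate to six characters) is an assumption that merely reproduces the one example given.

```go
package main

import (
	"fmt"
	"strings"
)

// codeLen is the db code length described in the commit message.
const codeLen = 6

// deriveCode is a hypothetical helper: it lowercases the slug, keeps only
// [a-z0-9], and stops once codeLen characters have been collected.
// The actual derivation in the PR may differ.
func deriveCode(slug string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(slug) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		}
		if b.Len() == codeLen {
			break
		}
	}
	return b.String()
}

func main() {
	fmt.Println(deriveCode("web-service")) // webser
}
```

Under this rule a slug with fewer than six usable characters yields a shorter code; a real implementation would need to decide whether to pad or reject such input.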
JSON-encoding a nil slice yields `null`; clients have to special-case
that. Initialise the result slices up front so an empty match returns
`{"events":[],"total":0}` etc. — easier for the upcoming frontend.
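The difference between a nil and an initialised slice under `encoding/json` can be seen in a few lines. This sketch assumes a response struct named `rawResult` with string events; only the `events` and `total` keys come from the example above, everything else is illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawResult is a hypothetical response shape; the JSON keys follow the
// {"events":[],"total":0} example in the commit message.
type rawResult struct {
	Events []string `json:"events"`
	Total  int      `json:"total"`
}

// encode marshals a result and returns the JSON as a string.
func encode(r rawResult) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	var unset rawResult                    // Events is a nil slice
	ready := rawResult{Events: []string{}} // Events is empty but non-nil

	fmt.Println(encode(unset)) // {"events":null,"total":0}
	fmt.Println(encode(ready)) // {"events":[],"total":0}
}
```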
Adds a pure-Go `MemoryBackend` (no CGO) implementing the same `Backend` interface as `DuckDBBackend`, plus a single Vercel function at `api/search/[...path].go` that mounts the `pkg/search` chi router with that backend. The /dbs routes, HEC ingest, and IR queries are identical to what `logsim search` exposes locally; on Vercel the storage is in the warm function instance's memory (cold starts reset, which matches the per-session UX).

Frontend changes:
- `src/lib/searchClient.ts`: typed wrappers for createDb, ingestHEC, getRaw, getSummary, getDistribution, getTopValues against /api/search/...
- `useSimulationStore.dbCode`: code of the current play's db, set on Run, cleared on Reset.
- `Topbar` Play / Step paths create a db, ingest each tick batch via HEC (fire-and-forget), and best-effort delete on Reset / next play. Failures during ingest fall back silently; `logBuffer` still drives the live view.
- `logsAt` accepts an optional `dbCode`; when set, scrubbing reads events out of the daemon instead of re-running the engine via /api/logs_at. `ScrubbedLogs` passes the current `dbCode` through.

Forward mode (server-side flat-out HEC) doesn't stream logs back to the browser, so write-through there needs a separate server-side tee to /api/search, which is left as a follow-up.

`CGO_ENABLED=0 go build ./...` and `go test ./...` are clean (Vercel build path), and `npx tsc` + `npx next build` succeed.
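To illustrate how a CGO-free backend can sit behind the same seam as the DuckDB one, here is a reduced sketch. The `Backend` interface below is deliberately cut down to two methods with assumed signatures, and `Event`, `ValueCount`, and the sourcetype-only `GetTopValues` are illustrative; the PR's real types are richer.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Event is a trimmed-down stand-in for the canonical event type.
type Event struct {
	Sourcetype string
	Raw        string
}

// ValueCount pairs a field value with its number of occurrences.
type ValueCount struct {
	Value string
	Count int
}

// Backend is a reduced, hypothetical version of the interface.
type Backend interface {
	Ingest(events []Event) error
	GetTopValues(limit int) []ValueCount
}

// MemoryBackend keeps events in a slice guarded by a mutex.
type MemoryBackend struct {
	mu     sync.Mutex
	events []Event
}

// Compile-time check that MemoryBackend satisfies Backend.
var _ Backend = (*MemoryBackend)(nil)

// Ingest appends a batch of events to the in-memory store.
func (m *MemoryBackend) Ingest(events []Event) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.events = append(m.events, events...)
	return nil
}

// GetTopValues counts events per sourcetype, most frequent first,
// breaking ties alphabetically so the order is deterministic.
func (m *MemoryBackend) GetTopValues(limit int) []ValueCount {
	m.mu.Lock()
	defer m.mu.Unlock()
	counts := map[string]int{}
	for _, e := range m.events {
		counts[e.Sourcetype]++
	}
	out := make([]ValueCount, 0, len(counts)) // non-nil even when empty
	for v, c := range counts {
		out = append(out, ValueCount{Value: v, Count: c})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Value < out[j].Value
	})
	if limit >= 0 && limit < len(out) {
		out = out[:limit]
	}
	return out
}

func main() {
	var b Backend = &MemoryBackend{}
	_ = b.Ingest([]Event{{Sourcetype: "access"}, {Sourcetype: "error"}, {Sourcetype: "access"}})
	fmt.Println(b.GetTopValues(1)) // [{access 2}]
}
```

Because handlers depend only on the interface, swapping this for a DuckDB-backed or remote implementation would not change the routing layer.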
Moves the editor's per-tick HEC ingest out of the browser and into the backend. /api/run and /api/logs_at now accept a `search_db_code` field; when set, every batch of events the engine emits is also POSTed to /api/search/dbs/<code>/services/collector/event from the same Vercel function. Forward mode benefits the most: events were never visible to the browser there, so until now they couldn't land in a search db at all. The new tee runs alongside the Cribl HEC sink so destinations still receive events the same way.

The frontend mints a 6-char code locally (the daemon auto-creates the db on first ingest), passes it via `searchDBCode` to runStream / runForward / logsAt, and stores it as `dbCode` for downstream queries. Realtime, Forward, and Step all use the same path. The redundant ingestLogs() call in Topbar's onTick is gone.

Verified end-to-end via cmd/devserver: a 10-tick scenario forwarded with search_db_code=abc123 produced 366 events; get_summary returned the right per-sourcetype counts and get_top_values returned realistic status_code distributions (200, 304, 301, 201, 204, 202, 400).
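The server-side tee described above amounts to encoding a batch as newline-delimited HEC envelopes and POSTing it to the db's collector route. The sketch below only builds the request; `hecEvent`, `encodeBatch`, `teeRequest`, the base URL, and the sample sourcetype are all hypothetical, while the `time` / `sourcetype` / `event` keys follow the Splunk HEC envelope.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// hecEvent is a minimal HEC envelope with Splunk HEC key names.
type hecEvent struct {
	Time       float64 `json:"time"`
	Sourcetype string  `json:"sourcetype"`
	Event      string  `json:"event"`
}

// encodeBatch renders events as newline-delimited JSON.
// json.Encoder.Encode writes a trailing newline after each value.
func encodeBatch(events []hecEvent) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	for _, e := range events {
		if err := enc.Encode(e); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

// teeRequest builds (but does not send) the POST that copies a batch
// into the search db identified by code.
func teeRequest(baseURL, code string, events []hecEvent) (*http.Request, error) {
	body, err := encodeBatch(events)
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("%s/api/search/dbs/%s/services/collector/event", baseURL, code)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// The base URL is an assumption for local testing.
	req, err := teeRequest("http://localhost:3000", "abc123",
		[]hecEvent{{Time: 1700000000, Sourcetype: "nginx", Event: "GET / 200"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
	// POST /api/search/dbs/abc123/services/collector/event
}
```

A caller would pass the request to an `http.Client` and, since the tee runs alongside the primary sink, would likely log rather than propagate a failure.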
Vercel's Go runtime treats every .go file under api/** as a separate function entry point and fails the deploy when one doesn't export Handler. The new search_tee.go helpers in api/run/ and api/logs_at/ hit exactly that error. Moving the helper into pkg/apihelp (which is already shared across the lambdas) is the established pattern for api/* utilities that don't ship as their own function.
ScrubbedLogs called logsAt with `from=0, to=tick`, which are tick indices, not timestamps. The daemon path translated those via `dbStartTimeMs ?? 0`, which fell back to epoch 1970 because nothing was wiring the run's wall-clock anchor through. Result: every fetch on pause queried a window decades before the actual events and returned nothing, so logs visibly vanished from the panel the moment playback stopped.

Adds `dbStartTimeMs` to the simulation store, populates it alongside `dbCode` at every play / forward / step start, threads it through ScrubbedLogs → logsAt → /api/search/dbs/<code>/get_raw. logsAt now also adds one tick of slack to the upper bound so the trailing tick's events (which can land slightly after their tick boundary due to engine jitter) are included.

Verified end-to-end: 10-tick web-service run produces 366 events; the fixed window returns 363 of them, the pre-fix epoch-1970 window returns 0.
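The window arithmetic behind this fix can be sketched as follows. The real logic lives in the frontend's `logsAt`; this Go version is only an illustration, and `tickWindow` and the `tickMs` parameter (tick duration in milliseconds) are assumed names.

```go
package main

import "fmt"

// tickWindow converts a tick range into absolute epoch-millisecond bounds
// anchored at the run's wall-clock start. One extra tick is added to the
// upper bound so events that land slightly after the last tick boundary
// are still inside the window.
func tickWindow(startMs, fromTick, toTick, tickMs int64) (fromMs, toMs int64) {
	fromMs = startMs + fromTick*tickMs
	toMs = startMs + (toTick+1)*tickMs
	return fromMs, toMs
}

func main() {
	const start = int64(1_700_000_000_000) // example wall-clock anchor

	// Anchored window: covers the run plus one tick of slack.
	from, to := tickWindow(start, 0, 10, 1000)
	fmt.Println(from, to) // 1700000000000 1700000011000

	// Missing anchor (start = 0): the window sits at the 1970 epoch,
	// far from any real event timestamp.
	from, to = tickWindow(0, 0, 10, 1000)
	fmt.Println(from, to) // 0 11000
}
```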
…views)

The server-side tee in /api/run + /api/logs_at relied on cross-function HTTP from those handlers to /api/search/dbs/<code>/services/collector/event. That POST has no auth cookie, and Vercel preview deployments often have password protection enabled, which 401s any anonymous request including function-to-function calls within the same deployment. The tee silently failed, the daemon stayed empty, and ScrubbedLogs found nothing on pause: "logs disappear".

Browser-side ingest works because the user's session cookie comes along on each fetch. Reverting streaming and step to ingest from Topbar's onTick / handleStep paths fixes the bug under preview protection without changing the architecture for the deployed-public case (still works there, just via the browser).

Forward mode still uses server-side tee since the browser never sees events in that mode; it's the only path that needs cross-function HTTP, and it's a known limitation that forward + auth-protected preview won't populate the search db. Documented in the code comments.
When the user hits pause, isRunning flips to false and ScrubbedLogs swaps from liveLogs (the in-memory buffer) to scrubLogs (the daemon's view). The swap was unconditional, so two cases left the panel blank:
- The 120ms debounce window before the daemon fetch returns.
- The daemon legitimately has no events (auth-walled cross-function POST 401'd, network failure, cold function instance, etc.).

Both manifested as "logs disappear when I hit pause" because liveLogs still has the events the user just watched stream by; we just stopped showing them. Now scrubLogs only takes over when it actually has rows; otherwise the panel stays on liveLogs. This is also robust against future daemon failures: the local view never goes blank just because the persistent store is unhappy.
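The selection rule described above is small enough to state as a function. The actual change is in the `ScrubbedLogs` frontend component; the Go sketch below only mirrors the decision, and `visibleLogs` with its string-slice rows is a hypothetical simplification.

```go
package main

import "fmt"

// visibleLogs picks which rows the panel should show. The daemon's view
// (scrub) takes over only when playback is paused AND it actually has
// rows; in every other case the in-memory live buffer is kept.
func visibleLogs(isRunning bool, live, scrub []string) []string {
	if !isRunning && len(scrub) > 0 {
		return scrub
	}
	return live
}

func main() {
	live := []string{"GET / 200", "GET /health 204"}

	// Paused, daemon fetch still pending or empty: stay on live rows.
	fmt.Println(len(visibleLogs(false, live, nil))) // 2

	// Paused, daemon returned rows: switch to the daemon's view.
	fmt.Println(visibleLogs(false, live, []string{"GET / 200"})) // [GET / 200]
}
```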
Summary
`logsim search` starts an in-process HTTP daemon on `:3700` that ingests Splunk/Cribl HEC and exposes a small set of information-retrieval functions over HTTP, plus a bubbletea TUI for spot-checking what's been ingested.

- `127.0.0.1:3700`, CORS open for the future frontend, ctrl-c shuts down + deletes every db.
- `[a-z0-9]` codes (auto-generated or caller-supplied), backed by in-memory DuckDB. Each db is one DuckDB connection with a fixed schema modelled on Splunk HEC: `(time, host, source, sourcetype, index, raw, fields JSON)`.
- `POST /dbs/{code}/services/collector/event` (NDJSON HEC) and `/raw` (line-by-line). HEC ingest auto-creates the target db.
- `get_raw`, `get_summary` (count/sum/avg/min/max/distinct_count, optional group_by), `get_distribution` (time-bucketed counts), `get_top_values`. The function set is intentionally narrow: every operation is expressible in Splunk SPL / Cribl Search / Datadog / Quickwit, so future backends can implement them natively.
- `logsim run <scenario> --to local` derives a db code from the scenario slug (e.g. `web-service` → `webser`) and forwards via the existing CriblSink. `--to local:<code>` pins to a specific code; `--to local:my-cache-test` slugifies.

Architecture

- `pkg/search/backend.go`: `Backend` interface (`Ingest`, `GetRaw`, `GetSummary`, `GetDistribution`, `GetTopValues`, `Stats`, `Close`) + canonical `Event` type. Swappable seam for Splunk/Cribl/Datadog/Quickwit later.
- `pkg/search/duckdb.go`: `DuckDBBackend` (CGO; `//go:build cgo`).
- `pkg/search/duckdb_nocgo.go`: stub returning `ErrNoCGO` so the rest of the module still builds with `CGO_ENABLED=0`.
- `pkg/search/registry.go`: process-wide map of code → backend, lifecycle, `GetOrCreate` for HEC auto-create.
- `pkg/search/hec.go` + `raw_lines.go`: HEC envelope and raw-line parsers.
- `pkg/search/server.go`: chi handlers.
- `pkg/search/tui.go`: bubbletea program.
- `cmd/logsim/search.go`: wires it all together.

CGO / Vercel

The duckdb-go binding requires CGO. To keep the Vercel functions in `api/` unaffected:

- `pkg/search/duckdb.go` is gated `//go:build cgo`; a `!cgo` stub satisfies the Backend interface with `ErrNoCGO`.
- `api/` does not import `pkg/search`, so `CGO_ENABLED=0 go build ./api/...` builds and runs as before.
- `CGO_ENABLED=0 go test ./...` is also clean (the duckdb-specific test file is gated too).

Follow-up: `release.yml` currently builds with `CGO_ENABLED=0`, which means binaries from GitHub Releases will return `ErrNoCGO` if the user runs `logsim search`. Enabling CGO in the release matrix needs cross-compile toolchains (e.g. `gcc-aarch64-linux-gnu` for linux/arm64, native macos runners for darwin); punted to a separate PR so this one stays focused. `go install` from source works today.

Test plan

- `CGO_ENABLED=1 go test ./...`: passes
- `CGO_ENABLED=0 go test ./...`: passes (duckdb tests gated)
- `CGO_ENABLED=0 go build ./api/...`: passes (Vercel-safe)
- `logsim search --no-tui` + `logsim run web-service --to local` + `curl /dbs/webser/get_summary` round-trips
- `GetOrCreate` is single-flight

Out of scope (called out for the next iteration)

- `:3700` (deferred per request)
- (`?channel=...`, `/ack`)
- `release.yml` CGO build (so release binaries support `logsim search`)

Generated by Claude Code