This implements telemetry infrastructure and embedded HTTP server for the web observatory, activated by --web[=PORT] and conditional on KBOX_HAS_WEB.

Telemetry sampler (web-telemetry.c):
- Two-tier timer reads LKL-internal /proc files
- Fast tick (100ms): /proc/stat, meminfo, vmstat, loadavg
- Per-tick time budget (5ms) prevents starving dispatch
- ENOSYS JSON serializer clamps snprintf to buffer bounds

Event ring buffer (web-events.c):
- Sequence-numbered to prevent SSE duplicate delivery
- JSON escaping for guest-controlled strings

Embedded HTTP server (web-server.c):
- Minimal HTTP/1.1 via epoll in a dedicated pthread
- SSE on GET /api/events (text/event-stream)
- GET /api/snapshot, /stats, /api/enosys, POST /api/control
- atomic_int for cross-thread state flags
- goto-fail cleanup: all error paths destroy mutex and fds
- epoll_ctl registration checked

Dispatch instrumentation (seccomp-supervisor.c):
- Per-syscall latency, disposition, ENOSYS tracking
- RECV/SEND ENOENT counters (EBADF excluded)

Build: KBOX_HAS_WEB=1 conditional, zero impact when off
Usage: --web[=PORT], --web-bind, --trace-format=json

Change-Id: Ie660bb39fa604e5d301578ed24d3d290946be65d
This replaces the placeholder HTML with a Chart.js dashboard served as compiled-in static assets via 'xxd -i'.

Web frontend:
- Chart.js 4.4.7 vendored, compiled into binary at build time
- Syscall family stacked chart, memory area, scheduler line, softirq bar
- SVG arc gauges for syscalls/s, ctx switches/s, memory, FDs
- SSE event feed: pure DOM construction (no innerHTML XSS)
- Dark/light theme with localStorage persistence
- 3s polling, rate computation clamped to zero on resets
- Chart label/data alignment guarded by dt > 0 check

Build system:
- scripts/gen-web-assets.sh generates src/web-assets.c via xxd
- sed filter strips xxd declarations for cross-platform compat
- Makefile web-assets target with proper file dependencies

Web backend:
- web-telemetry.c: add softirqs[10] array to JSON snapshot
- web-server.c: compiled-in asset serving with content-type detection, write_all() for reliable delivery, /index.html alias

Change-Id: I51139ce900f95031c7c798ca6c6498eba3c9d278
Historical data and export for the web observatory.

Web backend:
- GET /api/history returns snap_ring[] as JSON array
- Oldest-first ordering for chart backfill on page load
- Bounded by WEB_RESP_BUF_SIZE to prevent oversized response

Web frontend:
- loadHistory() fetches on page load, polls start after completion to prevent prevSnap overwrite race
- CSV export: snapshot telemetry as timestamped rows
- JSON export: event feed for offline analysis
- Pause handler: state flips only after res.ok confirmation

Change-Id: I48b7e8a387fd01db565e55c7e6bfea5494100b6c
This adds "Why kbox" section comparing against chroot, proot, UML, and gVisor.

- Expand architecture section with syscall routing details, ABI translation, and subsystem internals.
- Rewrite web observatory section to explain in-process kernel observability.
- Document all API endpoints and implementation details.

Change-Id: I19cd90f3bd4d4f6cd16c3f7acb0c4e1c721f5474
A browser-based dashboard that exposes LKL's internal kernel state in real time while the guest executes. This is not a CLI-to-GUI wrapper -- it reveals runtime details that are invisible in normal Linux operation because they occur inside kernel code paths that userspace cannot observe without instrumentation. The web UI makes kbox's unique architectural property (the kernel runs in the same address space as the supervisor, with all data structures directly readable) accessible to anyone with a browser, not just GDB experts.
Motivation and Positioning
kbox is not "another Linux emulator" or "another sandbox." Emulators (QEMU, Bochs) and sandboxes (gVisor, Firecracker) optimize for isolation and performance. kbox optimizes for transparency: it boots a real Linux kernel as a library specifically so that the kernel's internal machinery -- scheduler decisions, page cache behavior, interrupt routing, VFS traversals, memory allocator state -- can be observed, measured, and understood by students, developers, and anyone curious about how Linux actually works.
The GDB helpers already expose this data to expert users. The web observatory democratizes it. A student who has never used GDB can watch context switches accumulate in real time, see which syscall paths trigger page faults, observe the EEVDF scheduler picking the next task by deadline, and trace a write() from the seccomp notification through VFS down to the block layer -- all rendered as live charts and event streams in a browser tab.
Traditional approaches to kernel observation are impractical for this use case:
LKL eliminates all of these barriers. The kernel runs in-process, so its /proc and /sys filesystems are accessible to the supervisor via standard kbox_lkl_openat/kbox_lkl_read calls -- no privilege escalation, no ptrace, no external tools. The supervisor can sample kernel state at arbitrary frequency via /proc parsing, correlate it with seccomp dispatch events, and stream it to a browser via SSE (Server-Sent Events) -- all from an unprivileged process.
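The sample-then-stream path described above can be sketched in a few lines of C. This is an illustrative sketch only, not the code in web-telemetry.c or web-server.c: the helper names `parse_ctxt` and `format_sse` and the `{"ctxt":N}` payload shape are assumptions made for the example. It assumes the text of /proc/stat has already been read into a NUL-terminated buffer (in kbox that read would go through kbox_lkl_openat/kbox_lkl_read), extracts the context-switch counter, and formats one SSE frame with a sequence number in the `id:` field.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: scan /proc/stat text line by line for "ctxt N".
 * Returns 0 and stores N in *out on success, -1 if the line is absent. */
static int parse_ctxt(const char *stat_text, uint64_t *out)
{
    const char *p = stat_text;

    while (p && *p) {
        if (strncmp(p, "ctxt ", 5) == 0) {
            char *end;
            unsigned long long v = strtoull(p + 5, &end, 10);
            if (end == p + 5)
                return -1; /* no digits after the key */
            *out = (uint64_t) v;
            return 0;
        }
        p = strchr(p, '\n'); /* advance to the start of the next line */
        if (p)
            p++;
    }
    return -1;
}

/* Hypothetical helper: format one Server-Sent Events frame.
 * An SSE frame is a set of "field: value" lines ended by a blank line.
 * Returns the frame length, or -1 if it would not fit in the buffer
 * (snprintf output is never allowed to run past cap). */
static int format_sse(char *buf, size_t cap, uint64_t seq, uint64_t ctxt)
{
    assert(buf != NULL && cap > 0);

    int n = snprintf(buf, cap, "id: %llu\ndata: {\"ctxt\":%llu}\n\n",
                     (unsigned long long) seq, (unsigned long long) ctxt);
    if (n < 0 || (size_t) n >= cap) {
        buf[0] = '\0'; /* drop truncated frames rather than send them */
        return -1;
    }
    return n;
}
```

A caller would run `parse_ctxt` on each sampled buffer and write the result of `format_sse` to every connected text/event-stream client. Rejecting a frame that does not fit, rather than sending a truncated one, keeps the JSON payload well-formed on the browser side.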
Design decision -- self-contained telemetry, not infrastructure integration: gVisor exposes Prometheus endpoints and could integrate with OpenTelemetry for distributed tracing. kbox deliberately rejects this approach. Prometheus/OTel add external dependencies (scraping infrastructure, collector daemons, time-series databases), contradict the zero-dependency philosophy, and solve a problem kbox does not have (multi-service distributed tracing across network boundaries). kbox is a single-process, local-only tool. The embedded HTTP server + SSE + browser dashboard is the entire observability stack. No Prometheus, no Grafana, no Jaeger, no npm. A standard browser is the only consumer. If users need to export data for offline analysis,
--trace-format=json to stdout or the dashboard's CSV/JSON export covers that. The --trace-format=json schema is intentionally simple but does not preclude downstream ingestion by OTel collectors or jq pipelines -- format compatibility is free, infrastructure dependency is not. This is a conscious trade-off: less enterprise integration surface in exchange for zero operational complexity.