
  Baseline: BWM at 70% CPU, 155Hz loop, 6451us/iter, 13 syscalls/iter, 50% idle polling.

- ## Priority 1: Event-Driven Compositor Wait [IN PROGRESS]
+ ## Priority 1: Event-Driven Compositor Wait [DONE]

- **Expected savings: 30-40% CPU**
- **Status: Implementing**
+ **Actual savings: ~14% CPU (34% -> 20%)**
+ **Status: Merged (PR #254)**

- New syscall op=22 `compositor_wait(timeout_ms)` that combines:
- - Dirty check for all windows (replaces 4x op=14 generation checks)
- - Keyboard/mouse input poll (replaces poll + mouse_state)
- - Window list change detection (replaces op=13 per-frame)
- - Blocks if nothing pending (replaces sleep_ms(2))
+ New syscall op=23 `compositor_wait(timeout_ms, last_registry_gen)` that:
+ - Blocks in kernel until woken by mark_window_dirty, mouse, or registry change
+ - Returns bitmask: bit0=dirty, bit1=mouse, bit2=registry + packed registry_gen
+ - USB HID mouse handler wakes compositor on movement
+ - REGISTRY_GENERATION atomic bumped on window register
+ - Removed duplicate blocking from op=16 composite_windows

- Returns a bitmask of what's ready. Eliminates idle polling (500Hz -> 0Hz when idle).
+ BWM main loop restructured: only blits when DIRTY flag set, only discovers
+ windows when REGISTRY flag set, no idle polling.
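
To make the flag handling concrete, here is a minimal sketch of how a compositor loop might decode the `compositor_wait` result. The bit positions (bit0=dirty, bit1=mouse, bit2=registry) come from the notes above. Everything else is an assumption for illustration: the names `unpack_wait_result`, `plan_frame`, and `FrameWork` are hypothetical, and the layout that puts the flags in the low 32 bits and the registry generation in the high 32 bits is a guess, since the diff only says the generation is "packed".

```rust
// Flag bits as listed in the notes: bit0=dirty, bit1=mouse, bit2=registry.
const WAIT_DIRTY: u64 = 1 << 0;
const WAIT_MOUSE: u64 = 1 << 1;
const WAIT_REGISTRY: u64 = 1 << 2;

/// ASSUMPTION: flags live in the low 32 bits and the registry generation
/// in the high 32 bits. The real packing is not specified in the notes.
fn unpack_wait_result(raw: u64) -> (u64, u32) {
    (raw & 0xFFFF_FFFF, (raw >> 32) as u32)
}

/// What the main loop has to do for this wakeup (hypothetical helper type).
#[derive(Debug, Default, PartialEq)]
struct FrameWork {
    blit: bool,         // re-blit client windows
    handle_mouse: bool, // process pointer movement
    rediscover: bool,   // re-read the window list
}

/// Translate the returned flags into work items; with no flags set,
/// the loop does nothing and goes straight back to waiting.
fn plan_frame(flags: u64) -> FrameWork {
    FrameWork {
        blit: (flags & WAIT_DIRTY) != 0,
        handle_mouse: (flags & WAIT_MOUSE) != 0,
        rediscover: (flags & WAIT_REGISTRY) != 0,
    }
}
```

A loop built this way only touches pixels or the window list when the kernel reports a reason to, which is what removes the idle polling.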

- Files:
- - kernel/src/syscall/graphics.rs: Add op=22 handler
- - libs/libbreenix/src/graphics.rs: Add compositor_wait() wrapper
- - userspace/programs/src/bwm.rs: Restructure main loop around compositor_wait
-
- ## Priority 2: MAP_SHARED Client Windows (Zero-Copy Blit)
-
- **Expected savings: 15-25% CPU**
- **Status: Planned**
-
- New syscall op=23 `map_window_buffer(buffer_id)` maps client window physical pages
- into BWM's address space (read-only). BWM blits directly from mapped pages to
- COMPOSITE_TEX. Eliminates:
- - 4x read_window_buffer syscalls per iteration
- - Kernel page-by-page copy under spinlock (up to 8.6MB for terminal)
- - pixel_cache intermediate copy
+ ## Priority 2: MAP_SHARED Client Windows + Occluded Blit [DONE]

- Uses same infrastructure as map_compositor_texture (op=20).
+ **Actual savings: ~36% CPU (70% -> 34%)**
+ **Status: Merged (PR #253)**

- Files:
- - kernel/src/syscall/graphics.rs: Add op=23 handler (model on op=20)
- - libs/libbreenix/src/graphics.rs: Add map_window_buffer() wrapper
- - userspace/programs/src/bwm.rs: Replace blit_client_pixels with direct mapped reads
+ - op=21 `map_window_buffer`: maps client window physical pages into BWM read-only
+ - op=22 `check_window_dirty`: lightweight generation check without pixel copy
+ - Occluded blit: span-based row clipping skips pixels covered by higher-z windows
+ - Eliminates: read_window_buffer syscalls, kernel page copies, pixel_cache, z-repair
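
The span-based row clipping mentioned in the list above could look roughly like the following. This is a sketch of the general technique, not the code merged in the PR: the `Span` alias and `visible_spans` function are hypothetical names, and it assumes every span is a well-formed half-open range `[start, end)` of x coordinates on a single row.

```rust
/// Half-open horizontal pixel range [start, end) on one row.
type Span = (i32, i32);

/// Subtract the spans of higher-z windows from one row of a window,
/// returning the pieces that are still visible and need to be blitted.
fn visible_spans(row: Span, occluders: &[Span]) -> Vec<Span> {
    let mut visible = vec![row];
    for &(o_start, o_end) in occluders {
        let mut next = Vec::with_capacity(visible.len() + 1);
        for (start, end) in visible {
            // No overlap: the piece survives unchanged.
            if o_end <= start || o_start >= end {
                next.push((start, end));
                continue;
            }
            // Keep whatever sticks out to the left of the occluder...
            if o_start > start {
                next.push((start, o_start));
            }
            // ...and whatever sticks out to the right.
            if o_end < end {
                next.push((o_end, end));
            }
        }
        visible = next;
    }
    visible
}
```

Copying only the returned spans means pixels hidden under higher-z windows are never written, so there is nothing to repair afterwards.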

  ## Priority 3: Strip Vestigial Composite Allocations

@@ -53,40 +41,28 @@ Fix: Remove Vec construction, pass only bg_dirty + dirty_rect to GPU driver.
  Replace registered_windows() with direct-write to output buffer.

  Files:
- - kernel/src/syscall/graphics.rs lines 1110-1147
+ - kernel/src/syscall/graphics.rs (composite_windows handler)
  - kernel/src/drivers/virtio/gpu_pci.rs virgl_composite_windows signature

- ## Priority 4: Reduce clock_gettime Calls
+ ## Priority 4: Reduce clock_gettime Calls [DONE]

- **Expected savings: 1-2% CPU**
- **Status: Planned**
+ **Status: Done (part of PR #253)**

- 5x now_monotonic() per iteration is pure measurement overhead. Feature-gate behind
- a perf_instrumentation flag or reduce to 2 (start + end).
+ Reduced from 5x now_monotonic() per iteration to 2 (start + end).

- Files:
- - userspace/programs/src/bwm.rs: lines 539, 545, 558, 690, 738, 777
+ ## Priority 5: Batch Window Discovery [DONE]

- ## Priority 5: Batch Window Discovery
+ **Status: Done (part of PR #254)**

- **Expected savings: 1-2% CPU**
- **Status: Planned**
+ REGISTRY_GENERATION atomic in kernel, compositor_wait returns generation.
+ BWM only calls list_windows when registry changed.
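
As an illustration of the generation-counter idea, the sketch below shows one way the two sides could cooperate. It is written against `std` atomics so it runs standalone; the kernel code would use its own atomic and registry types. `bump_generation`, `DiscoveryState`, and `observe` are hypothetical names, and the call counter stands in for the real `list_windows` syscall.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Kernel side (sketch): bump the counter whenever a window is registered
/// and return the new generation.
fn bump_generation(registry_gen: &AtomicU64) -> u64 {
    registry_gen.fetch_add(1, Ordering::Release) + 1
}

/// Compositor side (sketch): remembers the last generation it acted on.
struct DiscoveryState {
    last_seen_gen: u64,
    list_windows_calls: u32, // stand-in for actual list_windows syscalls
}

impl DiscoveryState {
    fn new() -> Self {
        DiscoveryState { last_seen_gen: 0, list_windows_calls: 0 }
    }

    /// Returns true only when the registry changed since the last call,
    /// which is the one case where the window list must be re-read.
    fn observe(&mut self, current_gen: u64) -> bool {
        if current_gen == self.last_seen_gen {
            return false;
        }
        self.last_seen_gen = current_gen;
        self.list_windows_calls += 1;
        true
    }
}
```

With this shape, a steady-state frame costs one integer comparison instead of a `list_windows` call.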

- list_windows (op=13) called every iteration but windows only change at startup.
- Add a registry generation counter in shared page. BWM checks counter, only calls
- list_windows when it changes.
+ ## Actual Results

- Files:
- - kernel/src/syscall/graphics.rs: Add atomic generation to WindowRegistry
- - userspace/programs/src/bwm.rs line 548: Check generation before calling
-
- ## Projected Totals
-
- | After Attack | CPU Estimate |
- | -------------| -------------|
- | Baseline | 70% |
- | +Priority 1 | 30-40% |
- | +Priority 2 | 15-20% |
- | +Priority 3 | 12-17% |
- | +Priority 4 | 11-16% |
- | +Priority 5 | 10-15% |
+ | After Attack | CPU | FPS |
+ | -------------| -----| -----|
+ | Baseline | 70% | ~100 |
+ | +Priority 2 (MAP_SHARED + occluded blit) | 34% | ~130 |
+ | +Priority 1 (compositor_wait) | ~20% | ~186 |
+
+ Total reduction: 70% -> 20% (71% reduction), FPS: 100 -> 186 (86% increase).
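
The totals can be checked with a few lines of arithmetic. Note that the per-priority savings quoted earlier (~36% and ~14%) are percentage points of CPU, while the totals line uses relative change. The helper names below are illustrative only.

```rust
/// Relative reduction, in percent, going from `before` to `after`.
fn percent_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Relative increase, in percent, going from `before` to `after`.
fn percent_increase(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

Applied to the table: CPU 70 -> 20 is a reduction of about 71%, and FPS 100 -> 186 is an increase of 86%. Taken step by step, Priority 2 (70 -> 34) is a relative cut of about 51%, and Priority 1 (34 -> 20) a relative cut of about 41%.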