
Speculative Actions: Predictive Cache Priming for BuildStream #2083


Table of Contents

  1. Executive Summary
  2. Problem Statement
  3. Solution Overview
  4. Data Model
  5. Concrete Example: libA → libB → appC
  6. System Flow
  7. Detailed Algorithms
  8. Implementation Plan
  9. Benefits
  10. Key Design Decisions
  11. Open Questions and Key Uncertainties
  12. Future Enhancement: Local Execution with rear
  13. References

1. Executive Summary

Cache Priming restores build parallelism in BuildStream by speculatively executing compiler invocations ahead of time, warming the Remote Execution ActionCache before actual element builds run. This dramatically reduces end-to-end build latency without compromising correctness.

Key Insight: After building an element once, we know all the subactions (e.g. recc compiler invocations) that occurred. We can store these with overlay instructions, then reuse them in future builds by adapting inputs for source/dependency changes.

BuildStream and Remote Execution: BuildStream is built on Remote Execution API (REAPI) paradigms, using a local buildbox-casd as a CAS proxy that connects to remote Execution, ActionCache, and CAS services. Cache Priming extends this architecture by adding speculative action submission to warm the ActionCache before element builds execute.

2. Problem Statement

BuildStream orchestrates repository-level builds across many elements, allowing each element to use its native build system, but it does not achieve the fine-grained parallelism available within individual compilation units. C/C++ projects are particularly affected: translation units could compile in parallel but instead wait sequentially for their element's turn in the build graph.

Why Cache Priming is Particularly Effective

Code churn patterns favor cache reuse: Research consistently shows that code changes are concentrated in a small subset of files, with most code remaining stable between builds. This stability pattern makes speculative execution with cached results highly effective.

Key Research Findings:

  1. Adams et al. (2016) conducted a comprehensive case study on GLib, PostgreSQL, Qt, and Ruby systems analyzing C/C++ header file "hotspots"—headers that change frequently and trigger expensive rebuild processes. They empirically demonstrated that most header files are relatively stable compared to implementation files, with only a small subset accounting for the majority of build-time impact.

  2. Misu et al. (2024) analyzed 383 GitHub projects using Bazel and found that incremental builds with caching achieve median speedups of 4.22x (build system tool-independent cache) to 4.71x (build system tool-specific cache) for long-build duration projects. This demonstrates the substantial real-world value of build caching when most inputs haven't changed.

  3. Matev (2020) studied large C++ projects using ccache for distributed compilation, demonstrating that with a hot cache, compilation can be 4-5 times faster than with cache misses. The study showed that "incremental builds often do not significantly speed up compilation because even just a change of the modification time of a widely used header leads to many compiler and linker invocations"—highlighting both the cost of header changes and the benefit when they remain stable.

  4. Chowdhury et al. (2024) analyzed 774,051 source code methods from 49 prominent open-source Java projects and found that "approximately 80% of changes are concentrated in just 20% of the methods, demonstrating the Pareto 80/20 principle. Moreover, this subset of methods is responsible for the majority of the identified bugs in these projects." This concentration of changes means the vast majority of code remains unchanged between builds.

Supporting Evidence on Change Concentration:

  • Shin et al. (2011): In Mozilla Firefox and Red Hat Enterprise Linux, 70.8% of known vulnerabilities were found in only 10.9% of files, demonstrating extreme concentration of churn in a small subset of the codebase.

  • Nagappan and Ball (2007): In Windows Server 2003, software dependencies and churn measures were efficient predictors of post-release failures, further confirming that changes cluster in predictable locations.

  • Munson and Elbaum (1998): Established that code churn, the amount of code change over time, is measurable and correlates with fault-proneness, confirming that change patterns are non-uniform and predictable.

In C/C++ projects specifically:

  • Implementation files (.cpp) churn more frequently than header files (.h)
  • Header files are more stable and change less often (Adams et al., 2016)
  • Header file changes are expensive: Even just changing the modification time of a widely-used header triggers many recompilations (Matev, 2020)
  • Build caching is highly effective: Studies show 4-5x speedups with warm caches (Matev, 2020; Misu et al., 2024)
  • Changes tend to cluster in specific modules while the broader codebase remains stable

Example: In a typical incremental build of a C++ project with 1000 compilation units:

  • ~200 files changed (20% following Pareto)
  • ~800 files unchanged (80%)
  • With effective build caching, the 800 unchanged compilations can be served from cache
  • Only the 200 changed files need actual compilation
  • Result: Build time dominated by the 200 changed files, achieving 4-5x speedup

This is why Cache Priming is particularly powerful for C/C++ codebases: the stability of header files and the concentration of changes in a minority of implementation files means that speculative compilation actions can frequently reuse cached results from previous builds.
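As a back-of-the-envelope check of the example above, the expected speedup can be computed directly. A minimal Python sketch, assuming a cache hit still costs a small fraction (here 5%, an illustrative figure) of a full compile:

def expected_speedup(total_units, changed_fraction, hit_cost_fraction=0.05):
    """Estimate the build-time speedup when unchanged units are served from cache.

    hit_cost_fraction models the residual cost of a cache hit (lookup + download)
    relative to a full compilation; 0.05 is an illustrative assumption.
    """
    changed = total_units * changed_fraction            # must actually compile
    cached = total_units * (1.0 - changed_fraction)     # served from the cache
    baseline_cost = total_units * 1.0                   # every unit compiled from scratch
    primed_cost = changed * 1.0 + cached * hit_cost_fraction
    return baseline_cost / primed_cost

# 1000 compilation units, 20% changed (the Pareto assumption above)
print(f"speedup ≈ {expected_speedup(1000, 0.20):.1f}x")   # ≈ 4.2x, in line with the 4-5x studies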


3. Solution Overview

High-Level Flow:

flowchart LR
    Build["Build #N"] -->|instantiates| SpecActions["Speculative Actions<br/>(generated by Build #N-1)"]
    SpecActions -->|warm| ActionCache[(ActionCache)]
    ActionCache -->|serves| ElementBuilds[Element Builds]
    ElementBuilds -->|produce| SubActions["(Sub)Actions"]
    SubActions -->|transformed into| SpecActions2[Speculative Actions]
    SpecActions2 -->|stored in| ArtifactCache[(Artifact Cache)]
    SpecActions2 -.-> SpecActions

    style ActionCache fill:#fff4e1
    style ArtifactCache fill:#fff4e1
    style ElementBuilds fill:#e1f0ff
    style SubActions fill:#e1f0ff

Process:

  1. Priming Phase (before build): Instantiate speculative actions by applying overlays, submit for execution
  2. Build Phase: Element builds hit warmed ActionCache, reducing latency
  3. Generation Phase (after build): Record subactions with overlay instructions describing how to adapt inputs

4. Data Model

4.1 Protocol Buffer Extensions

Artifact.speculative_actions (new field)

message Artifact {
  Digest speculative_actions = 19;  // Points to SpeculativeActions in CAS
}

message SpeculativeActions {
  repeated SpeculativeAction actions = 1;
  
  message ReferencedSpeculativeAction {
    string element = 1;             // Element that speculative actions are referenced from
    Digest speculative_actions = 2; // The speculative actions of the element at the time of artifact creation
  }
  
  repeated ReferencedSpeculativeAction referenced_speculative_actions = 2; // All speculative actions from all elements referenced in ACTION overlays
}

message SpeculativeAction {
  Digest base_action_digest = 1;    // Original Action from previous build
  repeated Overlay overlays = 2;    // How to adapt inputs
  
  message Overlay {
    enum OverlayType {
      SOURCE = 0;                   // From element's source tree
      ARTIFACT = 1;                 // From dependency element's artifact output
      ACTION = 2;                   // From another speculative action output
    }
    OverlayType type = 1;
    string source_element = 2;      // Element reference
    Digest source_action = 3;       // For ACTION type: which action
    string source_path = 4;         // Path within source
    Digest target_digest = 5;       // Digest to replace in input tree
  }
}

ActionResult.subactions (new field in REAPI)

message ActionResult {
  repeated Digest subactions = 10;  // Digests of spawned Actions (e.g., recc invocations)
}
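For concreteness, a minimal Python sketch of the proposed data model; plain dataclasses stand in for the protobuf-generated classes, and the example element and digest values are hypothetical:

from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class OverlayType(IntEnum):
    SOURCE = 0
    ARTIFACT = 1
    ACTION = 2


@dataclass
class Overlay:
    type: OverlayType
    source_element: str
    source_path: str
    target_digest: str
    source_action: Optional[str] = None     # only set for ACTION overlays


@dataclass
class SpeculativeAction:
    base_action_digest: str
    overlays: List[Overlay] = field(default_factory=list)


# A compile subaction recorded after a build, with its two SOURCE overlays.
example = SpeculativeAction(
    base_action_digest="<digest of the recorded compile Action>",
    overlays=[
        Overlay(OverlayType.SOURCE, "libfoo.bst", "src/foo.cpp", "<digest of foo.cpp>"),
        Overlay(OverlayType.SOURCE, "libfoo.bst", "src/foo.h", "<digest of foo.h>"),
    ],
)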

5. Concrete Example: libA → libB → appC

Dependency Structure

libA (base library)
  └─→ libB (depends on libA)
       └─→ appC (depends on libB)

5.1 Example: libA Compilation

libA Source Structure

libA/
├── src/
│   ├── math_utils.h         # Header file
│   └── math_utils.cpp       # Implementation
└── libA.bst

libA Artifact Structure (Build Output)

/
├── usr/
│   ├── include/
│   │   └── math_utils.h     # Installed from src/ (same digest: "h_aaa111...")
│   └── lib/
│       └── libmath_utils.a

Visual Flow:

graph TB
    subgraph Step1["① libA Source"]
        S1["src/math_utils.cpp<br/><b>digest: cpp_aaa111...</b>"]
        S2["src/math_utils.h<br/><b>digest: h_aaa111...</b>"]
    end
    
    subgraph Step2["② (Sub)Action Execution"]
        A1["<b>recc g++ -c</b><br/>Action digest: aaa111..."]
    end
    
    S1 -->|input| Step2
    S2 -->|input| Step2
    
    subgraph Step3["③ (Sub)Action Result"]
        O1["math_utils.o"]
    end
    
    Step2 -->|produces| O1
    
    subgraph Step4["④ libA Artifact (/usr/)"]
        AR1["/usr/include/math_utils.h<br/><b>digest: h_aaa111...</b>"]
        AR2["/usr/lib/libmath_utils.a"]
    end
    
    O1 -.->|archived into| AR2
    S2 -.->|installed as| AR1
    
    subgraph Step5["⑤ Generated SpeculativeAction"]
        SA["<b>base_action_digest: aaa111...</b>"]
        OV1["<b>Overlay 1:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.cpp<br/>target_digest: cpp_aaa111..."]
        OV2["<b>Overlay 2:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.h<br/>target_digest: h_aaa111..."]
    end
    
    S1 -.-|digest match| OV1
    S2 -.-|digest match| OV2
    OV1 --> SA
    OV2 --> SA
    SA --> A1
    
    style Step1 fill:#e6f3ff
    style Step4 fill:#ffe6cc
    style Step5 fill:#e6ffe6
    style Step2 fill:#fff4e1
    style Step3 fill:#f0f0f0

Recorded Subaction (compile math_utils.cpp)

Action {
  command_digest: "cmd_aaa111..."
  input_root_digest: "dir_aaa111..."
}

Command {
  arguments: ["g++", "-c", "-fPIC", "src/math_utils.cpp", "-o", "math_utils.o"]
}

Directory (dir_aaa111...) {
  files: [
    File { name: "src/math_utils.cpp", digest: "cpp_aaa111..." },
    File { name: "src/math_utils.h", digest: "h_aaa111..." }
  ]
}

Generated SpeculativeAction

SpeculativeAction {
  base_action_digest: "aaa111..."
  overlays: [
    Overlay {
      type: SOURCE
      source_element: "libA"
      source_path: "src/math_utils.cpp"
      target_digest: "cpp_aaa111..."    # Replace this digest in input tree
    },
    Overlay {
      type: SOURCE
      source_element: "libA"
      source_path: "src/math_utils.h"
      target_digest: "h_aaa111..."      # Replace this digest in input tree
    }
  ]
}
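To make the reuse path explicit, a minimal sketch of what instantiating this SpeculativeAction looks like in Build #N+1, assuming only math_utils.cpp changed. The new digest value and the fetch_cas / store_cas / replace_input_digest helpers are placeholders for the real CAS plumbing:

# Build #N+1: math_utils.cpp was edited, math_utils.h is unchanged.
current_digests = {
    ("libA", "src/math_utils.cpp"): "cpp_aaa999...",   # hypothetical new digest after the edit
    ("libA", "src/math_utils.h"):   "h_aaa111...",     # unchanged digest
}

def instantiate(spec_action, fetch_cas, store_cas, replace_input_digest):
    """Clone the recorded Action and swap overlay digests for the current ones."""
    action = fetch_cas(spec_action.base_action_digest)          # Action "aaa111..." from Build #N
    for overlay in spec_action.overlays:
        new_digest = current_digests[(overlay.source_element, overlay.source_path)]
        # Replace the old digest wherever it appears in the input tree;
        # for math_utils.h this is a no-op because the digest is unchanged.
        replace_input_digest(action, overlay.target_digest, new_digest)
    return store_cas(action)    # digest of the new speculative Action to submit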

5.2 Example: libB Compilation

libB Source Structure

libB/
├── include/
│   └── vector_math.h
├── src/
│   └── vector_math.cpp      # Includes math_utils.h from libA
└── libB.bst

libB Artifact Structure (Build Output)

/
├── usr/
│   ├── include/
│   │   └── vector_math.h
│   └── lib/
│       └── libvector_math.a

Visual Flow:

graph TB
    subgraph Step1A["① libB Source"]
        SB1["src/vector_math.cpp<br/><b>digest: cpp_bbb222...</b>"]
    end
    
    subgraph Step1B["① libA Artifact (dependency)"]
        AA1["/usr/include/math_utils.h<br/><b>digest: h_aaa111...</b><br/>(from libA artifact)"]
    end
    
    subgraph Step1C["① libA Source"]
        SA2["src/math_utils.h<br/><b>digest: h_aaa111...</b>"]
    end
    
    subgraph Step2["② (Sub)Action Execution"]
        A1["<b>recc g++ -c -I/usr/include</b><br/>Action digest: bbb222..."]
    end
    
    SB1 -->|input| Step2
    AA1 -.->|used by compile| Step2
    
    subgraph Step3["③ (Sub)Action Result"]
        O1["vector_math.o"]
    end
    
    Step2 -->|produces| O1
    
    subgraph Step4["④ Generated SpeculativeAction"]
        SA["<b>base_action_digest: bbb222...</b>"]
        OV1["<b>Overlay 1:</b><br/>type: SOURCE<br/>element: libB<br/>path: src/vector_math.cpp<br/>target_digest: cpp_bbb222..."]
        OV2["<b>Overlay 2:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.h<br/>target_digest: h_aaa111..."]
    end
    
    SB1 -.-|digest match| OV1
    SA2 -.-|"digest match<br/>(libA Source not Artifact!)"| OV2
    OV1 --> SA
    OV2 --> SA
    SA --> A1
    
    style Step1A fill:#e6f3ff
    style Step1C fill:#e6f3ff
    style Step1B fill:#ffe6cc
    style Step4 fill:#e6ffe6
    style Step2 fill:#fff4e1
    style Step3 fill:#f0f0f0

Recorded Subaction (compile vector_math.cpp)

Action {
  command_digest: "cmd_bbb222..."
  input_root_digest: "dir_bbb222..."
}

Command {
  arguments: [
    "g++", "-c", "-fPIC",
    "-I/usr/include",              # libA headers from dependency artifact
    "src/vector_math.cpp",
    "-o", "vector_math.o"
  ]
}

Directory (dir_bbb222...) {
  files: [
    File { name: "src/vector_math.cpp", digest: "cpp_bbb222..." },
    File { name: "/usr/include/math_utils.h", digest: "h_aaa111..." }
  ]
}

Generated SpeculativeAction

SpeculativeAction {
  base_action_digest: "bbb222..."
  overlays: [
    Overlay {
      type: SOURCE
      source_element: "libB"
      source_path: "src/vector_math.cpp"
      target_digest: "cpp_bbb222..."    # Replace this digest
    },
    Overlay {
      type: SOURCE                       # Prefers SOURCE over ARTIFACT!
      source_element: "libA"             # Found in libA's source tree
      source_path: "src/math_utils.h"    # Original source path
      target_digest: "h_aaa111..."       # Same digest as in /usr/include/
    }
  ]
}

Key Insight: The overlay generator finds that math_utils.h (digest h_aaa111...) appears in libA's ARTIFACT output at /usr/include/math_utils.h, but also discovers it exists in libA's SOURCE at src/math_utils.h with the same digest. Following the SOURCE > ACTION > ARTIFACT priority, it records a SOURCE overlay pointing to libA's source tree. This creates a tighter dependency on the original source rather than the built artifact.

5.3 Example: appC Linking

appC Source Structure

appC/
├── src/main.cpp
└── appC.bst

appC Artifact Structure (Build Output)

/
├── usr/
│   └── bin/
│       └── appC             # Executable

Visual Flow:

graph TB
    subgraph Step1A["① Previous Speculative Action Output"]
        PA1["main.o<br/><b>digest: obj_ccc333...</b><br/>(from compile action ccc333...)"]
    end
    
    subgraph Step1B["① libB Artifact"]
        AB1["/usr/lib/libvector_math.a<br/><b>digest: lib_bbb222...</b>"]
    end
    
    subgraph Step1C["① libA Artifact"]
        AA1["/usr/lib/libmath_utils.a<br/><b>digest: lib_aaa111...</b>"]
    end
    
    subgraph Step2["② (Sub)Action Execution"]
        A1["<b>g++ link</b><br/>Action digest: ccc444..."]
    end
    
    PA1 -->|input| Step2
    AB1 -->|input| Step2
    AA1 -->|input| Step2
    
    subgraph Step3["③ (Sub)Action Result"]
        O1["appC executable"]
    end
    
    Step2 -->|produces| O1
    
    subgraph Step4["④ Generated SpeculativeAction"]
        SA["<b>base_action_digest: ccc444...</b>"]
        OV1["<b>Overlay 1:</b><br/>type: ACTION<br/>element: appC<br/>source_action: ccc333...<br/>path: main.o<br/>target_digest: obj_ccc333..."]
        OV2["<b>Overlay 2:</b><br/>type: ARTIFACT<br/>element: libB<br/>path: /usr/lib/libvector_math.a<br/>target_digest: lib_bbb222..."]
        OV3["<b>Overlay 3:</b><br/>type: ARTIFACT<br/>element: libA<br/>path: /usr/lib/libmath_utils.a<br/>target_digest: lib_aaa111..."]
    end
    
    OV1 --> SA
    OV2 --> SA
    OV3 --> SA
    PA1 -.-|digest match| OV1
    AB1 -.-|digest match| OV2
    AA1 -.-|digest match| OV3
    SA --> A1
    
    style Step1A fill:#e6ffe6
    style Step1B fill:#ffe6cc
    style Step1C fill:#ffe6cc
    style Step4 fill:#e6ffe6
    style Step2 fill:#fff4e1
    style Step3 fill:#f0f0f0

Generated SpeculativeAction (link step)

SpeculativeAction {
  base_action_digest: "ccc444..."
  overlays: [
    Overlay {
      type: ACTION                      # From previous speculative action
      source_element: "appC"
      source_action: "ccc333..."        # The compile action
      source_path: "main.o"
      target_digest: "obj_ccc333..."
    },
    Overlay {
      type: ARTIFACT                    # From libB artifact output
      source_element: "libB"
      source_path: "/usr/lib/libvector_math.a"
      target_digest: "lib_bbb222..."
    },
    Overlay {
      type: ARTIFACT                    # From libA artifact output
      source_element: "libA"
      source_path: "/usr/lib/libmath_utils.a"
      target_digest: "lib_aaa111..."
    }
  ]
}

5.4 Overlay Type Summary

Visual Overview:

graph TD
    Start[Input File Digest] --> Q1{Found in current or<br/>dependency element's<br/>source tree?}
    Q1 -->|YES| S[①<br/>SOURCE overlay]
    Q1 -->|NO| Q2{Found in current or<br/>dependency element's<br/>sub-action result?}
    Q2 -->|YES| A[②<br/>ACTION overlay]
    Q2 -->|NO| Q3{Found in dependency<br/>element's artifact?}
    Q3 -->|YES| AR[③<br/>ARTIFACT overlay]
    
    style S fill:#e6f3ff
    style A fill:#e6ffe6
    style AR fill:#ffe6cc

Overlay Type Summary

Type       Source                                 Example                          Use Case
SOURCE     Element's source tree                  .cpp files, .h headers           Tracks source code changes; preferred over ARTIFACT when digest matches
ARTIFACT   Dependency element's artifact output   .a libraries, built artifacts    Tracks dependency artifacts that don't exist in source
ACTION     Previous speculative action output     .o object files                  Links compilation steps within an element

Priority: When the same digest appears in multiple locations, overlays prefer SOURCE > ACTION > ARTIFACT. This creates tighter dependencies on original sources rather than intermediate artifacts.
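A minimal sketch of the priority rule, assuming the generator has already collected every location where a given digest was seen; the tuple layout is illustrative:

# Lower number wins: SOURCE beats ACTION, ACTION beats ARTIFACT.
PRIORITY = {"SOURCE": 0, "ACTION": 1, "ARTIFACT": 2}

def pick_overlay_location(locations):
    """locations: list of (overlay_type, element, path) tuples for one digest.

    Returns the highest-priority location, e.g. libA's source copy of math_utils.h
    rather than the identical file installed into /usr/include/ in libA's artifact.
    """
    if not locations:
        return None     # unknown digest: caller cannot generate an overlay for it
    return min(locations, key=lambda loc: PRIORITY[loc[0]])

# Example from Section 5.2: the same digest h_aaa111... appears in two places.
candidates = [
    ("ARTIFACT", "libA", "/usr/include/math_utils.h"),
    ("SOURCE", "libA", "src/math_utils.h"),
]
assert pick_overlay_location(candidates)[0] == "SOURCE"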

How This Exploits Code Stability

The overlay mechanism directly exploits the research findings from Section 2:

  1. SOURCE overlays for .cpp files: Will require updates when implementation files change (~20% of files per build)
  2. SOURCE overlays for .h files: Often remain valid across multiple builds due to header stability (Adams et al., 2016)
  3. ARTIFACT overlays for dependency artifacts: Very stable, as dependency libraries rarely change between builds

Given that research shows 80% of files typically remain unchanged, we expect:

  • ~80% of SOURCE overlays to resolve successfully with unchanged digests
  • Cache hits from previous speculative executions provide the 4-5x speedup observed in caching studies (Matev, 2020; Misu et al., 2024)

6. System Flow

6.1 High-Level Process Flow

flowchart TD
    Start([Build Session Start]) --> Identify[Identify Elements to Build]
    Identify --> Retrieve[Retrieve Artifacts &<br/>SpeculativeActions<br/>by element weak-ref]
    Retrieve --> LoadRef[Load Referenced<br/>SpeculativeActions]
    LoadRef --> Aggregate[Aggregate & Deduplicate<br/>SpeculativeActions]
    Aggregate --> TopoSort[Topological Sort<br/>by Dependencies]
    
    TopoSort --> Instantiate{For Each<br/>SpeculativeAction}
    Instantiate --> Resolve{All Overlays<br/>Resolvable?}
    
    Resolve -->|No| Skip[Skip Action &<br/>Transitive Dependents]
    Resolve -->|Yes| Apply[Apply Overlays]
    
    Apply --> Store[Store in CAS]
    Store --> RecordMap[Record Mapping:<br/>base_action_digest →<br/>speculative_action_digest]
    RecordMap --> Submit[Submit to Remote Execution]
    
    Skip --> More{More?}
    Submit --> More
    More -->|Yes| Instantiate
    More -->|No| Parallel[Speculative Execution<br/>Warms ActionCache]
    
    Parallel --> Interleave[Scheduler Interleaves:<br/>Instantiate remaining speculative actions<br/>as dependencies become available]
    Interleave --> Build[Element Builds<br/>Hit Cache]
    Build --> Complete{Build<br/>Complete?}
    Complete -->|No| Interleave
    Complete -->|Yes| Cancel[Cancel Remaining<br/>Speculative Executions]
    Cancel --> End([Build Complete])
    
    style Submit fill:#fff4e1
    style Parallel fill:#fff4e1
    style Interleave fill:#e1f0ff
    style Build fill:#e1f0ff
    style Skip fill:#ffe1e1

Detailed Flow Steps:

  1. Identify elements to build: Determine which elements need building based on cache keys
  2. Retrieve element artifacts and SpeculativeActions: Query Artifact Cache using element weak cache keys to get artifacts and their associated SpeculativeActions
  3. Load Referenced SpeculativeActions: For elements referenced in overlays that we don't have loaded yet, load them from ReferencedSpeculativeActions.
    • Since not all code in an element churns between builds, keeping the dependency chain of SpeculativeActions intact is likely more beneficial than dropping it.
    • TODO: decide whether to do this for elements referenced in all overlay types or to limit to the ACTION type
  4. Aggregate, deduplicate, and topologically sort: Combine all SpeculativeActions, remove duplicates, and sort respecting dependencies (sketched after this list)
  5. Instantiate speculative actions: Apply overlays to base_action_digest input trees
    • If not all overlays can be resolved (e.g., source_path not found in the overlay source), skip the speculative action and all its transitive dependents
  6. Store and record mapping: Store new speculative actions in CAS and record mapping { base_action_digest → speculative_action_digest } for dependency resolution in following overlays.
  7. Submit for speculative execution: Submit instantiated actions to Remote Execution
  8. Interleave with element builds: BuildStream scheduler continues instantiating additional speculative actions as dependencies become available during ongoing element builds
  9. Cancel remaining executions: After build completion, cancel any speculative executions still in progress

6.2 Data Flow Diagram

flowchart TB
    subgraph BuildN["Build #N"]
        Start[Build #N Starts]
        
        subgraph Priming["Cache Priming Phase"]
            Start --> Sched[BuildStream Scheduler]
            Sched -->|query via Asset API| AC1[(Artifact Cache)]
            AC1 -->|return SpeculativeActions<br/>from Build #N-1| SAList[SpeculativeActions]
            
            SAList --> Inst[Instantiator]
            
            Inst --> Over{Overlay<br/>Type?}
            Over -->|SOURCE| Src[Source Tree]
            Over -->|ARTIFACT| Art[Artifact Output]
            Over -->|ACTION| Act[Action Output]
            
            Src --> New[New Action]
            Art --> New
            Act --> New
            
            CASD1[buildbox-casd<br/>local CAS proxy]
            New -->|store via CAS API| CASD1
            CASD1 -->|proxy to| CAS1[(BuildGrid CAS)]
            
            New -->|Execute API| BuildGrid1[BuildGrid<br/>Execution Service]
            BuildGrid1 -->|check| ACache[ActionCache]
            ACache -->|miss| BuildGrid1
            BuildGrid1 -->|dispatch via Bots API| Workers1[buildbox-worker]
            Workers1 -->|fetch via| CASD_W1[buildbox-casd<br/>on worker]
            CASD_W1 -->|proxy to| CAS1
            Workers1 -->|execute| Runner1[buildbox-run]
            Runner1 -->|result| Workers1
            Workers1 -->|store result| ACache
        end
        
        subgraph Execution["Element Builds"]
            Sched -->|trigger| BSElement[BuildStream<br/>Element Build]
            BSElement -->|create Action| CASD2[buildbox-casd<br/>local CAS proxy]
            CASD2 -->|store via CAS API| CAS1
            BSElement -->|Execute API| BuildGrid2[BuildGrid<br/>Execution Service]
            BuildGrid2 -->|check| ACache
            ACache -.->|cache hit for sub-actions!| BuildGrid2
            BuildGrid2 -->|dispatch via Bots API| Workers2[buildbox-worker]
            Workers2 -->|fetch via| CASD_W2[buildbox-casd<br/>on worker]
            CASD_W2 -->|proxy to| CAS1
            Workers2 -->|execute| Runner2[buildbox-run]
            Runner2 -->|spawns| SubActions[Sub-Actions:<br/>recc invocations]
            SubActions -.->|benefit from warmed ActionCache| ACache
            SubActions -->|record digests| Runner2
            Runner2 -->|result + subactions| Workers2
            Workers2 -->|ActionResult| BuildGrid2
            BuildGrid2 -->|result| BSElement
            BSElement -->|store via| CASD2
        end
        
        subgraph Generation["Overlay Generation (Async)"]
            Gen[Overlay Generator<br/>ASYNC]
            BSElement -.->|triggers async| Gen
            Gen -->|fetch Actions via CAS API| CASD2
            Gen -->|create| SA[SpeculativeActions]
            SA -->|attach to| Art1[Artifact]
            Art1 -->|store via Asset API| AC2[(Artifact Cache)]
        end
    end
    
    AC2 -.->|↓ Data for Build #N+1 ↓| Boundary
    
    Boundary[/"═══════════════════════════════════════════════════════════════════"\]
    
    subgraph BuildN1["Build #N+1 (Uses SpeculativeActions from Build #N)"]
        NextBuild[Build #N+1 will use<br/>SpeculativeActions<br/>generated above]
    end
    
    style Start fill:#e6f3ff
    style SAList fill:#ffeb99
    style Gen fill:#ffe6cc
    style SA fill:#ffe6cc
    style Art1 fill:#ffe6cc
    style AC2 fill:#ffe6cc
    style Workers1 fill:#fff4e1
    style Workers2 fill:#fff4e1
    style BSElement fill:#e1f0ff
    style ACache fill:#fff9e1
    style Boundary fill:#f0f0f0,stroke:#333,stroke-width:3px,stroke-dasharray: 5 5
    style BuildN1 fill:#f8f8ff
    style NextBuild fill:#ffeb99
    style CASD1 fill:#e8f4f8
    style CASD2 fill:#e8f4f8
    style CASD_W1 fill:#e8f4f8
    style CASD_W2 fill:#e8f4f8

Reading the Diagram:

This is an example setup with BuildGrid for Remote Execution services.

Note: this diagram could do with some additional cleanup and scrutiny

REAPI Architecture:

  • BuildStream runs with local buildbox-casd as CAS proxy
  • buildbox-casd proxies to BuildGrid's CAS, ActionCache, and Execution services
  • Workers also run buildbox-casd locally for efficient blob access
  • All communication uses REAPI (Execute, CAS, Asset, Bots APIs)

Cache Priming Flow:

  1. Priming Phase: Retrieves SpeculativeActions → instantiates → submits via Execute API → BuildGrid dispatches to workers → results stored in ActionCache
  2. Element Builds: BuildStream creates Actions → Execute API → BuildGrid checks ActionCache → dispatches to workers → buildbox-run spawns sub-actions (recc) → sub-actions benefit from warmed ActionCache
  3. Overlay Generation: Asynchronously fetches sub-action digests and generates SpeculativeActions for Build #N+1

Key Flow: Build #N uses speculative actions from Build #N-1 → builds → generates speculative actions for Build #N+1


7. Detailed Algorithms

Note: The following pseudocode is illustrative and non-optimal. Production implementations should optimize for performance, especially around CAS operations and dependency resolution.

7.1 Overlay Generation (after build)

Note: This process runs asynchronously and does not block the element build completion or downstream element builds.

# Triggered asynchronously after element build completes
async def generate_overlays(element, element_subactions):
    speculative_actions = []

    for subaction_digest in element_subactions:
        subaction = fetch_cas(subaction_digest)
        input_root = fetch_cas(subaction.input_root_digest)  # the Action references its input tree by digest
        overlays = []
        
        for input_file in walk_files(input_root):  # walk every file in the input directory tree
            # Match digest to source: SOURCE > ACTION > ARTIFACT priority
            # This means if digest "h_aaa111..." appears in:
            #   - libA's source at "src/math_utils.h" (SOURCE)
            #   - libA's artifact at "/usr/include/math_utils.h" (ARTIFACT)
            # We prefer the SOURCE match
            
            src_element, src_path, src_type, src_action = find_source_element(input_file.digest)
            
            overlay = Overlay(
                type=src_type,
                source_element=src_element,
                source_action=src_action,        # Only set for ACTION-type matches
                source_path=src_path,
                target_digest=input_file.digest  # The digest we'll replace during instantiation
            )
            overlays.append(overlay)
        
        speculative_actions.append(SpeculativeAction(
            base_action_digest=subaction_digest,
            overlays=overlays
        ))

    # Attach to Artifact (update artifact in cache)
    artifact = load_artifact(element)  # artifact produced by this element's build
    artifact.speculative_actions = store_cas(SpeculativeActions(actions=speculative_actions))
    update_artifact_cache(element, artifact)
    
    # Element's downstream dependencies can already be building
    # This overlay data will be available for the NEXT build session

Note on find_source_element: This function searches through all elements in the dependency graph that were built before the current element. It looks for files matching the target digest in:

  1. SOURCE trees (element source files)
  2. ACTION outputs (from other speculative actions, including those from earlier elements in the dependency graph)
  3. ARTIFACT outputs (built artifacts from dependencies)

When the same digest appears in multiple locations, it returns the SOURCE match, creating a direct dependency on the original source rather than intermediate artifacts.
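A minimal sketch of find_source_element built on pre-computed digest indices (plain dictionaries here; Section 11.4 discusses an Arrow-backed variant). The indices are assumed to be built once per build session from source trees, prior (sub)action outputs, and dependency artifacts:

def find_source_element(digest, source_index, action_index, artifact_index):
    """Resolve a digest to (element, path, overlay_type, source_action).

    Each index maps digest -> list of candidate (element, [action,] path) locations.
    The lookup order encodes the SOURCE > ACTION > ARTIFACT priority.
    """
    if digest in source_index:
        element, path = source_index[digest][0]
        return element, path, "SOURCE", None
    if digest in action_index:
        element, action_digest, path = action_index[digest][0]
        return element, path, "ACTION", action_digest
    if digest in artifact_index:
        element, path = artifact_index[digest][0]
        return element, path, "ARTIFACT", None
    return None, None, None, None    # unknown input: no overlay can be generated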

Use Cases for ACTION Overlays Beyond Current Element:
ACTION overlays are particularly valuable for intermediate build artifacts that are generated deterministically:

  • Code generation: protoc, flex/bison output files can be referenced across elements
  • Resource compilation: glib-compile-resources, Qt rcc generated code
  • Archive creation: .o → .a via rear (see Section 12)
  • Intermediate transformations: any deterministic build step producing reusable artifacts

7.2 Overlay Instantiation (before build)

speculative_action_map = {}  # base_digest → speculative_action_digest

for action_template in topological_sort(all_speculative_actions):
    base_action = fetch_cas(action_template.base_action_digest)
    action = copy.deepcopy(base_action)
    
    # Apply each overlay
    all_resolved = True
    for overlay in action_template.overlays:
        if overlay.type == SOURCE:
            src_digest = resolve_source_digest(overlay.source_element, overlay.source_path)
        elif overlay.type == ARTIFACT:
            artifact_output = wait_for_artifact_output(overlay.source_element)
            src_digest = digest_at_path(artifact_output, overlay.source_path)
        elif overlay.type == ACTION:
            speculative_action = speculative_action_map.get(overlay.source_action)
            if speculative_action is None:
                # The action we depend on was skipped; skip this one (and its dependents) too
                all_resolved = False
                break
            wait_for_action(speculative_action)
            output = fetch_action_output(speculative_action)
            src_digest = digest_at_path(output, overlay.source_path)
        
        if src_digest is None:
            all_resolved = False
            break
        
        replace_input_digest(action.input_root, overlay.target_digest, src_digest)
    
    if all_resolved:
        speculative_action_digest = store_cas(action)
        speculative_action_map[action_template.base_action_digest] = speculative_action_digest
        submit_speculative_action(speculative_action_digest)
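The replace_input_digest step above has a subtlety worth spelling out: the REAPI input root is a Merkle tree, so swapping a leaf file digest changes every enclosing Directory digest up to the root. A minimal sketch over a simplified in-memory tree (nested dicts rather than real Directory protos, with a stand-in hash):

import hashlib
import json

def _dir_digest(directory):
    """Stand-in for hashing a serialized Directory message."""
    return hashlib.sha256(json.dumps(directory, sort_keys=True).encode()).hexdigest()

def replace_input_digest(directory, target_digest, new_digest):
    """Swap target_digest for new_digest in every file node and recompute parents.

    directory: {"files": {name: digest}, "directories": {name: subdirectory}}
    Returns the new digest of this directory.
    """
    files = directory.get("files", {})
    for name, digest in files.items():
        if digest == target_digest:
            files[name] = new_digest
    for subdirectory in directory.get("directories", {}).values():
        replace_input_digest(subdirectory, target_digest, new_digest)
    # Parent digests change whenever any child changed, which is what
    # makes the instantiated Action a new, distinct cache key.
    return _dir_digest(directory)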

8. Implementation Plan

Phase 1: REAPI Extension

  1. Add ActionResult.subactions field to REAPI
  2. Instrument buildbox-worker to record subaction digests
  3. Update buildbox-casd to store subactions with ActionResult

Phase 2: BuildStream Integration - SOURCE Overlays Only

  1. Implement overlay generator in BuildStream (SOURCE type only)
  2. Add SpeculativeActions to Artifact protobuf
  3. Store/retrieve SpeculativeActions via Artifact Cache
  4. Implement speculative action instantiation for SOURCE overlays
  5. Add speculative action submission to scheduler
  6. Make overlay generation asynchronous: Run as background task, don't block element builds or downstream dependencies
  7. Measure: Overlay resolution success rate, storage overhead, performance impact, overlay generation time

Phase 3: ARTIFACT Overlays

  1. Extend overlay generator to support ARTIFACT type
  2. Update instantiation logic to resolve ARTIFACT overlays
  3. Measure: Additional performance gains, complexity vs. benefit

Phase 4: ACTION Overlays (potentially combined with rear and other re-wrappers)

  1. Implement ACTION overlay support in generator and instantiator
  2. Evaluate whether ACTION overlays provide value for code generation (protoc, flex/bison) and other deterministic intermediate steps
  3. If beneficial for archive creation, implement rear (remote execution ar) to complete the speculative action chain (see Section 12 for details)
  4. Consider other re-wrapper candidates: ranlib, strip, resource compilers (see Section 12)
  5. Measure: End-to-end performance improvement

Phase 5: Optimization

  1. Optimize digest resolution using Apache Arrow for columnar digest storage and vectorized lookups
  2. Skip priming for elements with strong cache key hits and their transitive dependencies up the dependency graph (unchanged subtrees)
  3. Optimize topological sorting and deduplication
  4. Implement adaptive throttling for speculative action submission
  5. Add comprehensive monitoring and metrics

9. Benefits

  • Restores Parallelism: Compilation units execute in parallel across the entire build graph
  • Transparent: No changes to element definitions or build scripts
  • Safe: Incorrect/stale overlays simply become redundant; correctness never affected
  • Efficient: Minimal storage overhead (overlays are small metadata)
  • Incremental: Works with partial cache hits and source changes
  • Extensible: Can be enhanced with local execution optimization (rear) and other re-wrapped commands to further improve performance

10. Key Design Decisions

  1. Store speculative actions with artifacts: Keeps priming data co-located with build results
  2. Three speculative action overlay types: SOURCE/ARTIFACT/ACTION cover all input sources with appropriate granularity
  3. Topological sorting: Ensures ACTION overlays can resolve dependencies
  4. Skip on unresolved: Gracefully handles missing/changed inputs without blocking
  5. Early submission: Submit speculative actions as soon as dependencies resolve, regardless of element depth in graph
  6. Asynchronous overlay generation: Overlay generation runs in the background and does not block element build completion or downstream element builds. SpeculativeActions become available for future builds without impacting the current build's critical path.

11. Open Questions and Key Uncertainties

Implementation Questions

  1. Resource limits: How many concurrent speculative actions should we allow?
  2. Metrics: What telemetry do we need to measure priming effectiveness?

Critical Success Factors

1. Phased Rollout Strategy

Question: Should we implement all overlay types at once, or phase them?

Proposal: Start with SOURCE overlays only (Phase 2), measure impact, then add ARTIFACT (Phase 3), then ACTION (Phase 4).

Rationale:

  • SOURCE overlays are simplest and likely provide the most benefit
  • ARTIFACT overlays add cross-element dependencies but may have lower resolution rates
  • ACTION overlays require careful dependency management and may only be valuable combined with rear

Required: Define success criteria for each phase before proceeding to the next.

2. Overlay Resolution Success Rate

Question: How often will overlays successfully resolve in practice?

Context: Speculative actions are skipped if any overlay cannot be resolved. Success rate depends on:

  • How frequently source files change between builds
  • Whether the same digests appear in both SOURCE and ARTIFACT locations
  • For ACTION overlays: whether dependency chains remain intact

Research Evidence: Studies of C/C++ build caching and code churn patterns provide encouraging data:

  • Adams et al. (2016): Most C/C++ header files are relatively stable compared to implementation files
  • Misu et al. (2024): Incremental builds with caching achieve 4.22x to 4.71x speedups, demonstrating high cache hit rates in practice
  • Matev (2020): Hot ccache provides 4-5x speedup over cold cache in large C++ projects
  • Chowdhury et al. (2024): Approximately 80% of code changes concentrate in 20% of methods (Pareto principle)
  • Shin et al. (2011): In Mozilla and Red Hat Linux, 89.1% of files were in the low-churn, stable category
  • Macho et al. (2021): Analysis of 144 Maven projects and build changes found that while dependency version updates are among the top-10 most frequent build changes, structural changes to build configuration (adding/removing dependencies, changing build order) occur much less frequently. Build maintenance is primarily driven by accommodating changes to production code rather than restructuring the build itself.
  • McIntosh et al. (2012): Study of Java build systems showed that build complexity evolves slowly over time, with build structure remaining relatively stable between major refactorings.

Implication for Build Structure Stability: Research shows that while source code changes frequently, build structure (which libraries are linked, link order, build rules) changes much less often. We therefore expect:

  • ~80% of SOURCE overlays for .cpp files to resolve successfully (unchanged sources)
  • ~90% of SOURCE overlays for .h files to resolve (headers are more stable per Adams et al., 2016)
  • ~95% of ARTIFACT overlays to resolve (dependency artifacts rarely change)
  • ~85-90% of ACTION overlays to resolve (link lines and build structure change infrequently per Macho et al., 2021 and McIntosh et al., 2012)

Real-world cache effectiveness: The Misu and Matev studies showing 4-5x speedups from caching suggest that cache hit rates of 75-80% are achievable in practice, directly supporting the viability of Cache Priming.

Required:

  • Instrument prototype to measure actual resolution rates on real projects
  • Identify common failure patterns
  • Validate against freedesktop-sdk and GNOME builds
  • Determine if partial resolution strategies would help

Risk: If <50% of speculative actions resolve successfully, the system may not provide sufficient benefit to justify the complexity. However, research suggests success rates should be substantially higher (70-80%+).

3. Non-Determinism Handling

Question: How do non-deterministic builds affect speculative action correctness?

Context: If a build produces different outputs for the same inputs, the recorded base_action_digest may not be reusable.

Current Assumption: Non-deterministic actions will still return cached results when the ActionCache entry hasn't been evicted. If a new execution produces different outputs, it generates a new subaction, and SpeculativeActions are updated for the next build.

Required: Validate this assumption and document expected behavior for non-deterministic builds.

4. Digest Resolution Performance Impact

Question: Can we make find_source_element fast enough to avoid becoming a bottleneck?

Context: During overlay generation, every input file of every subaction needs its digest resolved against all source trees, artifact outputs, and prior actions in the dependency graph. For large builds, this could be thousands of files × thousands of potential sources.

Critical Insight: Since overlay generation runs asynchronously and doesn't block element builds or speculative action execution, performance requirements are relaxed:

  • Overlay generation for element N can run while elements N+1, N+2, etc. are building
  • Generated SpeculativeActions are used in the next build session, not the current one
  • Slow overlay generation only delays availability of speculative actions for future builds

Optimization Strategies:

  • Apache Arrow for digest lookups: Use Apache Arrow's columnar format to store digest tables with built-in vectorized operations
    • Arrow provides O(1) random access with cache-efficient columnar layout
    • Built-in SIMD optimizations for hash lookups and comparisons
    • Cross-language support (C++, Python, Rust) for BuildStream integration
    • Set lookup kernels with hash table support already implemented
    • No need to implement custom indices, bloom filters, or vectorization
  • Columnar digest storage: Store digests in Arrow tables: [(digest: FixedSizeBinary(32), element: string, path: string, type: int8)]
  • Arrow compute kernels: Use pc.is_in() for individual digest checks and semi joins (e.g., Table.join(..., join_type="left semi")) for bulk digest matching against large sets (see the sketch after this list)
  • Memory efficiency: Arrow's columnar format provides 64-byte alignment for optimal SIMD performance
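A minimal sketch of the Arrow-backed lookup, assuming pyarrow is available and the digest table is built once per build session; the rows are illustrative:

import pyarrow as pa
import pyarrow.compute as pc

# Columnar digest table: one row per (digest, element, path, type) observation.
digest_table = pa.table({
    "digest": pa.array([b"\x11" * 32, b"\x22" * 32], type=pa.binary(32)),
    "element": ["libA", "libA"],
    "path": ["src/math_utils.h", "usr/lib/libmath_utils.a"],
    "type": pa.array([0, 1], type=pa.int8()),    # 0=SOURCE, 1=ARTIFACT, 2=ACTION
})

def lookup_digests(wanted):
    """Return all known locations for the given digests in one vectorized pass."""
    value_set = pa.array(wanted, type=pa.binary(32))
    mask = pc.is_in(digest_table["digest"], value_set=value_set)
    return digest_table.filter(mask)

print(lookup_digests([b"\x11" * 32]).to_pylist())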

Why Apache Arrow:

  • Arrow's columnar layout enables vectorization using SIMD operations, with 64-byte alignment matching AVX-512 register width
  • Arrow provides built-in set lookup kernels with hash table support for exactly this use case
  • Arrow's pipeline and SIMD algorithms deliver 10-100x performance improvements through cache locality and parallel operations
  • Arrow enables efficient data processing without serialization overhead, making cross-module data sharing trivial
  • Proven at scale: PySpark achieved 10-100x performance improvements using Arrow

Required:

  • Profile and optimize on real-world projects (freedesktop-sdk, GNOME)
  • Measure: time to generate overlays vs. element build time
  • Implement Apache Arrow-based digest lookup tables as primary strategy
  • Leverage Arrow's built-in set lookup kernels and hash table support

Acceptable Performance: Overlay generation should complete within 1-2x the element's build time. Since it runs asynchronously, it won't impact the current build's critical path.

Risk: If overlay generation is extremely slow (>10x element build time), it might not complete before the next build session starts, reducing cache priming effectiveness.

5. Speculative Action Scale and Remote Execution Capacity

Question: Won't submitting thousands of speculative actions overwhelm the remote execution cluster?

Context: A large C/C++ BuildStream project might generate thousands of speculative actions (one per compilation unit). For example, a project with 100 elements averaging 100 compilation units each would produce 10,000 speculative actions.

Comparison to Existing Systems: This scale is actually modest compared to build systems already using Remote Execution:

  • Bazel builds of TensorFlow generate over 8,191 actions executing across hundreds of remote workers
  • Google runs millions of builds executing millions of test cases every day, with builds depending on tens of thousands of targets
  • A large TypeScript project with 10M lines of code generates 1,100 Bazel targets, with benchmarks using 100 remote executors
  • Bazel builds often include thousands or tens of thousands of small actions such as compiling, linking, or running tests

Key Insight: Remote Execution systems like BuildGrid are designed to handle tens of thousands of concurrent actions. Cache Priming's speculative actions are within the normal operating range of these systems.

Mitigation Strategies:

  • ActionCache hits: Many speculative actions will hit immediately, reducing actual execution load
  • Priority scheduling: Actual element builds can be prioritized over speculative actions
  • Adaptive throttling: Submit speculative actions at a controlled rate based on cluster capacity (sketched after this list)
  • Gradual rollout: Start with Phase 2 (SOURCE overlays only) which generates fewer actions
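A minimal sketch of the priority-plus-throttling idea, using a bounded semaphore to cap in-flight speculative work; execute stands in for the real Execute API client, and the slot count is an illustrative knob:

import asyncio

SPECULATIVE_SLOTS = 64    # illustrative cap on concurrent speculative actions

async def submit_with_throttle(real_actions, speculative_actions, execute):
    """Dispatch real element builds immediately; gate speculative actions on free slots."""
    slots = asyncio.Semaphore(SPECULATIVE_SLOTS)

    async def run_speculative(action_digest):
        async with slots:                        # waits while the speculative budget is used up
            await execute(action_digest, priority="low")

    real = [asyncio.create_task(execute(d, priority="high")) for d in real_actions]
    spec = [asyncio.create_task(run_speculative(d)) for d in speculative_actions]

    await asyncio.gather(*real)                  # build completion never waits on speculation
    for task in spec:
        task.cancel()                            # cancel remaining speculative executions
    await asyncio.gather(*spec, return_exceptions=True)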

Required:

  • Monitor remote execution cluster utilization during prototype testing
  • Implement priority queuing: actual builds > speculative actions
  • Consider time-windowing: only prime for elements building soon

Acceptable: Speculative actions should consume <30% of remote execution capacity, leaving headroom for actual builds.

6. Storage and Network Overhead

Question: What is the actual storage overhead of SpeculativeActions?

Context:

  • Each artifact now includes SpeculativeActions metadata
  • For compilation-heavy elements (1000+ compilation units), this could be significant
  • ReferencedSpeculativeAction stores only element name + digest (minimal duplication)

Required:

  • Measure actual storage increase on real projects
  • Consider compression strategies if overhead is significant
  • Provide opt-out mechanism for elements that don't benefit

7. Value of ACTION Overlays Without rear

Question: Do ACTION overlays provide benefit on their own, or only when combined with rear?

Context: ACTION overlays create dependencies between speculative actions within an element. Without rear (see Section 12), the primary use case would be linking, which requires archives that may not exist as separate subactions.

Proposal: Evaluate ACTION overlays and rear together in Phase 4, as they may be interdependent.

Required: Identify other potential uses for ACTION overlays beyond archive creation.

Success Metrics

To validate the proposal, we need to measure:

  1. Performance: End-to-end build time reduction (target: >20% for C/C++ heavy projects)
  2. Resolution Rate: Percentage of speculative actions successfully instantiated (target: >70%)
  3. Storage Overhead: Size increase of artifact cache (acceptable: <20%)
  4. Overlay Generation Time: Time to generate overlays per element (acceptable: <2x element build time, since it's async)
  5. Worker Utilization: Speculative actions should not starve actual builds

Validation Plan

  1. Implement Phase 1-2: SOURCE overlays only on a prototype
  2. Test on freedesktop-sdk: Large, real-world C/C++ heavy BuildStream project
  3. Measure all success metrics: Gather data before proceeding
  4. Iterate or pivot: If metrics don't meet targets, adjust design or abandon
  5. Proceed to Phase 3+: Only if Phase 2 shows clear benefit

12. Future Enhancement: Local Execution with rear

Motivation

In the base Cache Priming design, speculative actions follow this pattern:

  1. Compile .cpp → .o (via recc, speculative)
  2. Upload .o to CAS
  3. Download .o files for archiving
  4. Create .a archive
  5. Upload .a to CAS

However, since BuildStream submits entire element builds as Actions to workers, and recc returns .o files during the element build, those .o files are already local on the worker. We can optimize archive creation by keeping it local.

The rear Command

Similar to recc (remote execution caching compiler), rear (remote execution ar) wraps the ar archiver:

# Traditional ar usage
ar rcs libmath_utils.a math_utils.o helpers.o

# With rear wrapper
rear rcs libmath_utils.a math_utils.o helpers.o

rear behavior (sketched below):

  1. Generates an Action describing the archive operation
  2. Checks ActionCache for existing result
  3. If miss:
    • During main element build: Executes ar locally on the worker (via buildbox-casd) since .o files are already present
    • During speculative execution: Executes remotely (fetches .o files from CAS as needed)
  4. Stores result in ActionCache
  5. Records the Action digest in ActionResult.subactions
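rear does not exist yet; the following is a minimal Python sketch of the decision logic above (a real wrapper would more likely live in C++ alongside recc). The action_cache, run_remotely, upload_outputs, and record_subaction hooks are placeholders for the buildbox/REAPI plumbing, and the digest computation is simplified:

import hashlib
import subprocess

def rear(args, action_cache, inputs_are_local, run_remotely, upload_outputs, record_subaction):
    """Wrap `ar` so archive creation is cacheable and recorded as a subaction."""
    # A real implementation would build a REAPI Action covering the command
    # and the input tree; hashing the command line stands in for that here.
    action_digest = hashlib.sha256(" ".join(["ar"] + args).encode()).hexdigest()

    result = action_cache.get(action_digest)
    if result is None:                           # ActionCache miss
        if inputs_are_local:
            # Main element build: the .o files are already on the worker, run locally.
            subprocess.run(["ar"] + args, check=True)
            result = upload_outputs(args)        # push the .a (and metadata) to CAS
        else:
            # Speculative execution: inputs must come from CAS, so execute remotely.
            result = run_remotely(action_digest)
        action_cache.put(action_digest, result)

    record_subaction(action_digest)              # recorded into ActionResult.subactions
    return result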

Extended Speculative Action Chain

With rear, we get a complete chain of speculative actions:

libA element build (main build):
  recc g++ -c src/math_utils.cpp -o math_utils.o    [remote or cached]
    ↓ (local .o file on worker)
  rear rcs /usr/lib/libmath_utils.a math_utils.o    [LOCAL execution]
    ↓ (local .a file on worker)
  g++ app.o -L/usr/lib -lmath_utils -o final_binary [remote or cached]

Speculative execution (cache priming):
  recc compile action                                [remote, warms ActionCache]
    ↓ (.o uploaded to CAS)
  rear archive action                                [remote, fetches .o from CAS]
    ↓ (.a uploaded to CAS)
  link action                                        [remote, fetches .a from CAS]

Example: libA with rear

Recorded Subactions during libA build:

ActionResult {
  subactions: [
    "recc_compile_math_utils...",   // .cpp → .o
    "recc_compile_helpers...",      // .cpp → .o
    "rear_archive..."               // .o → .a
  ]
}

Generated SpeculativeAction for rear:

SpeculativeAction {
  base_action_digest: "rear_archive..."
  overlays: [
    Overlay {
      type: ACTION
      source_element: "libA.bst"
      source_action: "recc_compile_math_utils..."
      source_path: "math_utils.o"
      target_digest: "obj_math_utils..."
    },
    Overlay {
      type: ACTION
      source_element: "libA.bst"
      source_action: "recc_compile_helpers..."
      source_path: "helpers.o"
      target_digest: "obj_helpers..."
    }
  ]
}

Key Advantages

  1. Faster Main Builds: During the actual element build, .o files are local on the worker, so archive creation executes immediately without network overhead
  2. Earlier Link Actions: Even though speculative rear actions run remotely, they still populate the ActionCache, enabling link actions to start as soon as archives are cached
  3. ActionCache Hits: When the main build runs, rear checks ActionCache first and gets an immediate hit from the speculative execution
  4. Reduced Critical Path: The main build's archive step becomes nearly instantaneous (cache hit), shortening the overall build time

Note: Future optimizations (out of scope for this proposal) could include locality hints to prefer scheduling speculative rear actions on workers that already have the .o files, further reducing network traffic.

Execution Flow

Cache Priming Phase (speculative):
  Submit compile actions (recc) → executes remotely, warms ActionCache
    ↓ [.o files uploaded to CAS]
  Submit archive action (rear) → executes remotely, fetches .o from CAS
    ↓ [.a file uploaded to CAS, ActionCache warmed]
  Submit link actions → executes remotely, warms ActionCache
    ↓

Main Build Phase:
  Element build starts
    ↓
  recc compile → ActionCache HIT (instant)
    ↓ [.o files now local on worker]
  rear archive → ActionCache HIT; on a miss, executes locally (still fast, .o files already local)
    ↓ [.a file now local on worker]
  link → ActionCache HIT (instant)
    ↓
  Element build completes in minimal time

The key insight: rear enables ActionCache hits during the main build, and when the cache misses, local execution is still very fast because the input .o files are already on the worker.

Beyond rear: Instrumentation Strategy

The rear optimization illustrates a general principle: identify commands that operate on local intermediate data and wrap them for speculative execution.

Candidates for "re-" wrappers:

  • ranlib: Index generation for static libraries (fast, local)
  • strip: Symbol stripping (fast, local, self-contained)
  • Code generators: protoc, flex/bison (deterministic, cacheable)
  • Resource compilers: glib-compile-resources, Qt rcc

Instrumentation approach:

  1. Monitor build logs to identify frequent intermediate commands
  2. Measure: execution time, input/output sizes, determinism
  3. Prioritize commands that are:
    • Fast enough for local execution (< 1 second)
    • Self-contained (few dependencies)
    • Operate on data already local to worker
    • Appear frequently across builds
  4. Implement re-wrappers and measure impact

This creates a virtuous cycle: more instrumentation → more optimization opportunities → better build performance.


13. References

  • Adams, B., McIntosh, S., Nagappan, M., and Hassan, A.E. (2016). "Identifying and understanding header file hotspots in C/C++ build processes." Automated Software Engineering, Vol. 23, pp. 61-90. https://dl.acm.org/doi/10.1007/s10515-015-0183-5

  • Chowdhury, S., Uddin, G., Hemmati, H., and Holmes, R. (2024). "The Good, the Bad, and the Monstrous: Predicting Highly Change-Prone Source Code Methods at Their Inception." ACM Transactions on Software Engineering and Methodology, Vol. 33, No. 4. https://dl.acm.org/doi/10.1145/3715006

  • Macho, C., McIntosh, S., and Pinzger, M. (2021). "The nature of build changes." Empirical Software Engineering, Vol. 26, Article 52. https://link.springer.com/article/10.1007/s10664-020-09926-4

  • Matev, R. (2020). "Fast distributed compilation and testing of large C++ projects." EPJ Web of Conferences, Vol. 245. https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_05001.pdf

  • McIntosh, S., Adams, B., and Hassan, A.E. (2012). "The Evolution of Java Build Systems." Empirical Software Engineering, Vol. 17, No. 4-5, pp. 578-608.

  • Misu, S., Lin, B., and Monden, A. (2024). "Does Using Bazel Help Speed Up Continuous Integration Builds?" arXiv preprint. https://arxiv.org/html/2405.00796v1

  • Munson, J.C. and Elbaum, S. (1998). "Code Churn: A Measure for Estimating the Impact of Code Change." Proceedings of the International Conference on Software Maintenance, pp. 24-31. https://ieeexplore.ieee.org/document/738486/

  • Nagappan, N. and Ball, T. (2007). "Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study." First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007). https://ieeexplore.ieee.org/document/4343764/

  • Shin, Y., Meneely, A., Williams, L., and Osborne, J. (2011). "Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities." IEEE Transactions on Software Engineering, Vol. 37, No. 6, pp. 772-787. https://ieeexplore.ieee.org/document/5560680/
