
Speculative Actions: Predictive Cache Priming for BuildStream #2083


Table of Contents

  1. Executive Summary
  2. Problem Statement
  3. Solution Overview
  4. Data Model
  5. Concrete Example: libA → libB → appC
  6. System Flow
  7. Detailed Algorithms
  8. Implementation Plan
  9. Benefits
  10. Key Design Decisions
  11. Open Questions and Key Uncertainties
  12. Future Enhancement: Local Execution with rear
  13. References

1. Executive Summary

Cache Priming restores build parallelism in BuildStream by speculatively executing compiler invocations ahead of time, warming the Remote Execution ActionCache before actual element builds run. This dramatically reduces end-to-end build latency without compromising correctness.

Key Insight: After building an element once, we know all the subactions (e.g. recc compiler invocations) that occurred. We can store these with overlay instructions, then reuse them in future builds by adapting inputs for source/dependency changes.

BuildStream and Remote Execution: BuildStream is built on Remote Execution API (REAPI) paradigms, using a local buildbox-casd as a CAS proxy that connects to remote Execution, ActionCache, and CAS services. Cache Priming extends this architecture by adding speculative action submission to warm the ActionCache before element builds execute.

2. Problem Statement

BuildStream orchestrates repository-level builds across many elements, allowing each element to use its native build system, but it does not achieve the fine-grained parallelism available within individual compilation units. C/C++ projects are particularly affected: translation units could compile in parallel but instead wait sequentially for their element's turn in the build graph.

Why Cache Priming is Particularly Effective

Code churn patterns favor cache reuse: Research consistently shows that code changes are concentrated in a small subset of files, with most code remaining stable between builds. This stability pattern makes speculative execution with cached results highly effective.

Key Research Findings:

  1. Adams et al. (2016) conducted a comprehensive case study on GLib, PostgreSQL, Qt, and Ruby systems analyzing C/C++ header file "hotspots"—headers that change frequently and trigger expensive rebuild processes. They empirically demonstrated that most header files are relatively stable compared to implementation files, with only a small subset accounting for the majority of build-time impact.

  2. Misu et al. (2024) analyzed 383 GitHub projects using Bazel and found that incremental builds with caching achieve median speedups of 4.22x (build system tool-independent cache) to 4.71x (build system tool-specific cache) for long-build duration projects. This demonstrates the substantial real-world value of build caching when most inputs haven't changed.

  3. Matev (2020) studied large C++ projects using ccache for distributed compilation, demonstrating that with a hot cache, compilation can be 4-5 times faster than with cache misses. The study showed that "incremental builds often do not significantly speed up compilation because even just a change of the modification time of a widely used header leads to many compiler and linker invocations"—highlighting both the cost of header changes and the benefit when they remain stable.

  4. Chowdhury et al. (2024) analyzed 774,051 source code methods from 49 prominent open-source Java projects and found that "approximately 80% of changes are concentrated in just 20% of the methods, demonstrating the Pareto 80/20 principle. Moreover, this subset of methods is responsible for the majority of the identified bugs in these projects." This concentration of changes means the vast majority of code remains unchanged between builds.

Supporting Evidence on Change Concentration:

  • Shin et al. (2011): In Mozilla Firefox and Red Hat Enterprise Linux, 70.8% of known vulnerabilities were found in only 10.9% of files, demonstrating extreme concentration of churn in a small subset of the codebase.

  • Nagappan and Ball (2007): In Windows Server 2003, software dependencies and churn measures were efficient predictors of post-release failures, further confirming that changes cluster in predictable locations.

  • Munson and Elbaum (1998): Established that code churn, the amount of code change over time, is measurable and correlates with fault-proneness, confirming that change patterns are non-uniform and predictable.

In C/C++ projects specifically:

  • Implementation files (.cpp) churn more frequently than header files (.h)
  • Header files are more stable and change less often (Adams et al., 2016)
  • Header file changes are expensive: Even just changing the modification time of a widely-used header triggers many recompilations (Matev, 2020)
  • Build caching is highly effective: Studies show 4-5x speedups with warm caches (Matev, 2020; Misu et al., 2024)
  • Changes tend to cluster in specific modules while the broader codebase remains stable

Example: In a typical incremental build of a C++ project with 1000 compilation units:

  • ~200 files changed (20% following Pareto)
  • ~800 files unchanged (80%)
  • With effective build caching, the 800 unchanged compilations can be served from cache
  • Only the 200 changed files need actual compilation
  • Result: Build time dominated by the 200 changed files, achieving 4-5x speedup

This is why Cache Priming is particularly powerful for C/C++ codebases: the stability of header files and the concentration of changes in a minority of implementation files means that speculative compilation actions can frequently reuse cached results from previous builds.
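As a back-of-the-envelope check of the example above, the expected speedup can be computed directly. A minimal Python sketch, assuming a cache hit still costs a small fraction (here 5%, an illustrative figure) of a full compile:

def expected_speedup(total_units, changed_fraction, hit_cost_fraction=0.05):
    """Estimate the build-time speedup when unchanged units are served from cache.

    hit_cost_fraction models the residual cost of a cache hit (lookup + download)
    relative to a full compilation; 0.05 is an illustrative assumption.
    """
    changed = total_units * changed_fraction            # must actually compile
    cached = total_units * (1.0 - changed_fraction)     # served from the cache
    baseline_cost = total_units * 1.0                   # every unit compiled from scratch
    primed_cost = changed * 1.0 + cached * hit_cost_fraction
    return baseline_cost / primed_cost

# 1000 compilation units, 20% changed (the Pareto assumption above)
print(f"speedup ≈ {expected_speedup(1000, 0.20):.1f}x")   # ≈ 4.2x, in line with the 4-5x studies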


3. Solution Overview

High-Level Flow:

flowchart LR
    Build["Build #N"] -->|instantiates| SpecActions["Speculative Actions<br/>(generated by Build #N-1)"]
    SpecActions -->|warm| ActionCache[(ActionCache)]
    ActionCache -->|serves| ElementBuilds[Element Builds]
    ElementBuilds -->|produce| SubActions["(Sub)Actions"]
    SubActions -->|transformed into| SpecActions2[Speculative Actions]
    SpecActions2 -->|stored in| ArtifactCache[(Artifact Cache)]
    SpecActions2 -.-> SpecActions

    style ActionCache fill:#fff4e1
    style ArtifactCache fill:#fff4e1
    style ElementBuilds fill:#e1f0ff
    style SubActions fill:#e1f0ff

Process:

  1. Priming Phase (before build): Instantiate speculative actions by applying overlays, submit for execution
  2. Build Phase: Element builds hit warmed ActionCache, reducing latency
  3. Generation Phase (after build): Record subactions with overlay instructions describing how to adapt inputs

4. Data Model

4.1 Protocol Buffer Extensions

Artifact.speculative_actions (new field)

message Artifact {
  Digest speculative_actions = 19;  // Points to SpeculativeActions in CAS
}

message SpeculativeActions {
  repeated SpeculativeAction actions = 1;
  
  message ReferencedSpeculativeAction {
    string element = 1;             // Element that speculative actions are referenced from
    Digest speculative_actions = 2; // The speculative actions of the element at the time of artifact creation
  }
  
  repeated ReferencedSpeculativeAction referenced_speculative_actions = 2; // All speculative actions from all elements referenced in ACTION overlays
}

message SpeculativeAction {
  Digest base_action_digest = 1;    // Original Action from previous build
  repeated Overlay overlays = 2;    // How to adapt inputs
  
  message Overlay {
    enum OverlayType {
      SOURCE = 0;                   // From element's source tree
      ARTIFACT = 1;                 // From dependency element's artifact output
      ACTION = 2;                   // From another speculative action output
    }
    OverlayType type = 1;
    string source_element = 2;      // Element reference
    Digest source_action = 3;       // For ACTION type: which action
    string source_path = 4;         // Path within source
    Digest target_digest = 5;       // Digest to replace in input tree
  }
}

ActionResult.subactions (new field in REAPI)

message ActionResult {
  repeated Digest subactions = 10;  // Digests of spawned Actions (e.g., recc invocations)
}
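For concreteness, a minimal Python sketch of the proposed data model; plain dataclasses stand in for the protobuf-generated classes, and the example element and digest values are hypothetical:

from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class OverlayType(IntEnum):
    SOURCE = 0
    ARTIFACT = 1
    ACTION = 2


@dataclass
class Overlay:
    type: OverlayType
    source_element: str
    source_path: str
    target_digest: str
    source_action: Optional[str] = None     # only set for ACTION overlays


@dataclass
class SpeculativeAction:
    base_action_digest: str
    overlays: List[Overlay] = field(default_factory=list)


# A compile subaction recorded after a build, with its two SOURCE overlays.
example = SpeculativeAction(
    base_action_digest="<digest of the recorded compile Action>",
    overlays=[
        Overlay(OverlayType.SOURCE, "libfoo.bst", "src/foo.cpp", "<digest of foo.cpp>"),
        Overlay(OverlayType.SOURCE, "libfoo.bst", "src/foo.h", "<digest of foo.h>"),
    ],
)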

5. Concrete Example: libA → libB → appC

Dependency Structure

libA (base library)
  └─→ libB (depends on libA)
       └─→ appC (depends on libB)

5.1 Example: libA Compilation

libA Source Structure

libA/
├── src/
│   ├── math_utils.h         # Header file
│   └── math_utils.cpp       # Implementation
└── libA.bst

libA Artifact Structure (Build Output)

/
├── usr/
│   ├── include/
│   │   └── math_utils.h     # Installed from src/ (same digest: "h_aaa111...")
│   └── lib/
│       └── libmath_utils.a

Visual Flow:

graph TB
    subgraph Step1["① libA Source"]
        S1["src/math_utils.cpp<br/><b>digest: cpp_aaa111...</b>"]
        S2["src/math_utils.h<br/><b>digest: h_aaa111...</b>"]
    end
    
    subgraph Step2["② (Sub)Action Execution"]
        A1["<b>recc g++ -c</b><br/>Action digest: aaa111..."]
    end
    
    S1 -->|input| Step2
    S2 -->|input| Step2
    
    subgraph Step3["③ (Sub)Action Result"]
        O1["math_utils.o"]
    end
    
    Step2 -->|produces| O1
    
    subgraph Step4["④ libA Artifact (/usr/)"]
        AR1["/usr/include/math_utils.h<br/><b>digest: h_aaa111...</b>"]
        AR2["/usr/lib/libmath_utils.a"]
    end
    
    O1 -.->|archived into| AR2
    S2 -.->|installed as| AR1
    
    subgraph Step5["⑤ Generated SpeculativeAction"]
        SA["<b>base_action_digest: aaa111...</b>"]
        OV1["<b>Overlay 1:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.cpp<br/>target_digest: cpp_aaa111..."]
        OV2["<b>Overlay 2:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.h<br/>target_digest: h_aaa111..."]
    end
    
    S1 -.-|digest match| OV1
    S2 -.-|digest match| OV2
    OV1 --> SA
    OV2 --> SA
    SA --> A1
    
    style Step1 fill:#e6f3ff
    style Step4 fill:#ffe6cc
    style Step5 fill:#e6ffe6
    style Step2 fill:#fff4e1
    style Step3 fill:#f0f0f0

Recorded Subaction (compile math_utils.cpp)

Action {
  command_digest: "cmd_aaa111..."
  input_root_digest: "dir_aaa111..."
}

Command {
  arguments: ["g++", "-c", "-fPIC", "src/math_utils.cpp", "-o", "math_utils.o"]
}

Directory (dir_aaa111...) {
  files: [
    File { name: "src/math_utils.cpp", digest: "cpp_aaa111..." },
    File { name: "src/math_utils.h", digest: "h_aaa111..." }
  ]
}

Generated SpeculativeAction

SpeculativeAction {
  base_action_digest: "aaa111..."
  overlays: [
    Overlay {
      type: SOURCE
      source_element: "libA"
      source_path: "src/math_utils.cpp"
      target_digest: "cpp_aaa111..."    # Replace this digest in input tree
    },
    Overlay {
      type: SOURCE
      source_element: "libA"
      source_path: "src/math_utils.h"
      target_digest: "h_aaa111..."      # Replace this digest in input tree
    }
  ]
}
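To make the reuse path explicit, a minimal sketch of what instantiating this SpeculativeAction looks like in Build #N+1, assuming only math_utils.cpp changed. The new digest value and the fetch_cas / store_cas / replace_input_digest helpers are placeholders for the real CAS plumbing:

# Build #N+1: math_utils.cpp was edited, math_utils.h is unchanged.
current_digests = {
    ("libA", "src/math_utils.cpp"): "cpp_aaa999...",   # hypothetical new digest after the edit
    ("libA", "src/math_utils.h"):   "h_aaa111...",     # unchanged digest
}

def instantiate(spec_action, fetch_cas, store_cas, replace_input_digest):
    """Clone the recorded Action and swap overlay digests for the current ones."""
    action = fetch_cas(spec_action.base_action_digest)          # Action "aaa111..." from Build #N
    for overlay in spec_action.overlays:
        new_digest = current_digests[(overlay.source_element, overlay.source_path)]
        # Replace the old digest wherever it appears in the input tree;
        # for math_utils.h this is a no-op because the digest is unchanged.
        replace_input_digest(action, overlay.target_digest, new_digest)
    return store_cas(action)    # digest of the new speculative Action to submit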

5.2 Example: libB Compilation

libB Source Structure

libB/
├── include/
│   └── vector_math.h
├── src/
│   └── vector_math.cpp      # Includes math_utils.h from libA
└── libB.bst

libB Artifact Structure (Build Output)

/
├── usr/
│   ├── include/
│   │   └── vector_math.h
│   └── lib/
│       └── libvector_math.a

Visual Flow:

graph TB
    subgraph Step1A["① libB Source"]
        SB1["src/vector_math.cpp<br/><b>digest: cpp_bbb222...</b>"]
    end
    
    subgraph Step1B["① libA Artifact (dependency)"]
        AA1["/usr/include/math_utils.h<br/><b>digest: h_aaa111...</b><br/>(from libA artifact)"]
    end
    
    subgraph Step1C["① libA Source"]
        SA2["src/math_utils.h<br/><b>digest: h_aaa111...</b>"]
    end
    
    subgraph Step2["② (Sub)Action Execution"]
        A1["<b>recc g++ -c -I/usr/include</b><br/>Action digest: bbb222..."]
    end
    
    SB1 -->|input| Step2
    AA1 -.->|used by compile| Step2
    
    subgraph Step3["③ (Sub)Action Result"]
        O1["vector_math.o"]
    end
    
    Step2 -->|produces| O1
    
    subgraph Step4["④ Generated SpeculativeAction"]
        SA["<b>base_action_digest: bbb222...</b>"]
        OV1["<b>Overlay 1:</b><br/>type: SOURCE<br/>element: libB<br/>path: src/vector_math.cpp<br/>target_digest: cpp_bbb222..."]
        OV2["<b>Overlay 2:</b><br/>type: SOURCE<br/>element: libA<br/>path: src/math_utils.h<br/>target_digest: h_aaa111..."]
    end
    
    SB1 -.-|digest match| OV1
    SA2 -.-|"digest match<br/>(libA Source not Artifact!)"| OV2
    OV1 --> SA
    OV2 --> SA
    SA --> A1
    
    style Step1A fill:#e6f3ff
    style Step1C fill:#e6f3ff
    style Step1B fill:#ffe6cc
    style Step4 fill:#e6ffe6
    style Step2 fill:#fff4e1
    style Step3 fill:#f0f0f0

Recorded Subaction (compile vector_math.cpp)

Action {
  command_digest: "cmd_bbb222..."
  input_root_digest: "dir_bbb222..."
}

Command {
  arguments: [
    "g++", "-c", "-fPIC",
    "-I/usr/include",              # libA headers from dependency artifact
    "src/vector_math.cpp",
    "-o", "vector_math.o"
  ]
}

Directory (dir_bbb222...) {
  files: [
    File { name: "src/vector_math.cpp", digest: "cpp_bbb222..." },
    File { name: "/usr/include/math_utils.h", digest: "h_aaa111..." }
  ]
}

Generated SpeculativeAction

SpeculativeAction {
  base_action_digest: "bbb222..."
  overlays: [
    Overlay {
      type: SOURCE
      source_element: "libB"
      source_path: "src/vector_math.cpp"
      target_digest: "cpp_bbb222..."    # Replace this digest
    },
    Overlay {
      type: SOURCE                       # Prefers SOURCE over ARTIFACT!
      source_element: "libA"             # Found in libA's source tree
      source_path: "src/math_utils.h"    # Original source path
      target_digest: "h_aaa111..."       # Same digest as in /usr/include/
    }
  ]
}

Key Insight: The overlay generator finds that math_utils.h (digest h_aaa111...) appears in libA's ARTIFACT output at /usr/include/math_utils.h, but also discovers it exists in libA's SOURCE at src/math_utils.h with the same digest. Following the SOURCE > ACTION > ARTIFACT priority, it records a SOURCE overlay pointing to libA's source tree. This creates a tighter dependency on the original source rather than the built artifact.

5.3 Example: appC Linking

appC Source Structure

appC/
├── src/main.cpp
└── appC.bst

appC Artifact Structure (Build Output)

/
├── usr/
│   └── bin/
│       └── appC             # Executable

Visual Flow:

graph TB
    subgraph Step1A["① Previous Speculative Action Output"]
        PA1["main.o<br/><b>digest: obj_ccc333...</b><br/>(from compile action ccc333...)"]
    end
    
    subgraph Step1B["① libB Artifact"]
        AB1["/usr/lib/libvector_math.a<br/><b>digest: lib_bbb222...</b>"]
    end
    
    subgraph Step1C["① libA Artifact"]
        AA1["/usr/lib/libmath_utils.a<br/><b>digest: lib_aaa111...</b>"]
    end
    
    subgraph Step2["② (Sub)Action Execution"]
        A1["<b>g++ link</b><br/>Action digest: ccc444..."]
    end
    
    PA1 -->|input| Step2
    AB1 -->|input| Step2
    AA1 -->|input| Step2
    
    subgraph Step3["③ (Sub)Action Result"]
        O1["appC executable"]
    end
    
    Step2 -->|produces| O1
    
    subgraph Step4["④ Generated SpeculativeAction"]
        SA["<b>base_action_digest: ccc444...</b>"]
        OV1["<b>Overlay 1:</b><br/>type: ACTION<br/>element: appC<br/>source_action: ccc333...<br/>path: main.o<br/>target_digest: obj_ccc333..."]
        OV2["<b>Overlay 2:</b><br/>type: ARTIFACT<br/>element: libB<br/>path: /usr/lib/libvector_math.a<br/>target_digest: lib_bbb222..."]
        OV3["<b>Overlay 3:</b><br/>type: ARTIFACT<br/>element: libA<br/>path: /usr/lib/libmath_utils.a<br/>target_digest: lib_aaa111..."]
    end
    
    OV1 --> SA
    OV2 --> SA
    OV3 --> SA
    PA1 -.-|digest match| OV1
    AB1 -.-|digest match| OV2
    AA1 -.-|digest match| OV3
    SA --> A1
    
    style Step1A fill:#e6ffe6
    style Step1B fill:#ffe6cc
    style Step1C fill:#ffe6cc
    style Step4 fill:#e6ffe6
    style Step2 fill:#fff4e1
    style Step3 fill:#f0f0f0

Generated SpeculativeAction (link step)

SpeculativeAction {
  base_action_digest: "ccc444..."
  overlays: [
    Overlay {
      type: ACTION                      # From previous speculative action
      source_element: "appC"
      source_action: "ccc333..."        # The compile action
      source_path: "main.o"
      target_digest: "obj_ccc333..."
    },
    Overlay {
      type: ARTIFACT                    # From libB artifact output
      source_element: "libB"
      source_path: "/usr/lib/libvector_math.a"
      target_digest: "lib_bbb222..."
    },
    Overlay {
      type: ARTIFACT                    # From libA artifact output
      source_element: "libA"
      source_path: "/usr/lib/libmath_utils.a"
      target_digest: "lib_aaa111..."
    }
  ]
}

5.4 Overlay Type Summary

Visual Overview:

graph TD
    Start[Input File Digest] --> Q1{Found in current or<br/>dependency element's<br/>source tree?}
    Q1 -->|YES| S[①<br/>SOURCE overlay]
    Q1 -->|NO| Q2{Found in current or<br/>dependency element's<br/>sub-action result?}
    Q2 -->|YES| A[②<br/>ACTION overlay]
    Q2 -->|NO| Q3{Found in dependency<br/>element's artifact?}
    Q3 -->|YES| AR[③<br/>ARTIFACT overlay]
    
    style S fill:#e6f3ff
    style A fill:#e6ffe6
    style AR fill:#ffe6cc

Overlay Type Summary

Type       Source                                 Example                          Use Case
SOURCE     Element's source tree                  .cpp files, .h headers           Tracks source code changes; preferred over ARTIFACT when digest matches
ARTIFACT   Dependency element's artifact output   .a libraries, built artifacts    Tracks dependency artifacts that don't exist in source
ACTION     Previous speculative action output     .o object files                  Links compilation steps within an element

Priority: When the same digest appears in multiple locations, overlays prefer SOURCE > ACTION > ARTIFACT. This creates tighter dependencies on original sources rather than intermediate artifacts.
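A minimal sketch of the priority rule, assuming the generator has already collected every location where a given digest was seen; the tuple layout is illustrative:

# Lower number wins: SOURCE beats ACTION, ACTION beats ARTIFACT.
PRIORITY = {"SOURCE": 0, "ACTION": 1, "ARTIFACT": 2}

def pick_overlay_location(locations):
    """locations: list of (overlay_type, element, path) tuples for one digest.

    Returns the highest-priority location, e.g. libA's source copy of math_utils.h
    rather than the identical file installed into /usr/include/ in libA's artifact.
    """
    if not locations:
        return None     # unknown digest: caller cannot generate an overlay for it
    return min(locations, key=lambda loc: PRIORITY[loc[0]])

# Example from Section 5.2: the same digest h_aaa111... appears in two places.
candidates = [
    ("ARTIFACT", "libA", "/usr/include/math_utils.h"),
    ("SOURCE", "libA", "src/math_utils.h"),
]
assert pick_overlay_location(candidates)[0] == "SOURCE"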

How This Exploits Code Stability

The overlay mechanism directly exploits the research findings from Section 2:

  1. SOURCE overlays for .cpp files: Will require updates when implementation files change (~20% of files per build)
  2. SOURCE overlays for .h files: Often remain valid across multiple builds due to header stability (Adams et al., 2016)
  3. ARTIFACT overlays for dependency artifacts: Very stable, as dependency libraries rarely change between builds

Given that research shows 80% of files typically remain unchanged, we expect:

  • ~80% of SOURCE overlays to resolve successfully with unchanged digests
  • Cache hits from previous speculative executions provide the 4-5x speedup observed in caching studies (Matev, 2020; Misu et al., 2024)

6. System Flow

6.1 High-Level Process Flow

flowchart TD
    Start([Build Session Start]) --> Identify[Identify Elements to Build]
    Identify --> Retrieve[Retrieve Artifacts &<br/>SpeculativeActions<br/>by element weak-ref]
    Retrieve --> LoadRef[Load Referenced<br/>SpeculativeActions]
    LoadRef --> Aggregate[Aggregate & Deduplicate<br/>SpeculativeActions]
    Aggregate --> TopoSort[Topological Sort<br/>by Dependencies]
    
    TopoSort --> Instantiate{For Each<br/>SpeculativeAction}
    Instantiate --> Resolve{All Overlays<br/>Resolvable?}
    
    Resolve -->|No| Skip[Skip Action &<br/>Transitive Dependents]
    Resolve -->|Yes| Apply[Apply Overlays]
    
    Apply --> Store[Store in CAS]
    Store --> RecordMap[Record Mapping:<br/>base_action_digest →<br/>speculative_action_digest]
    RecordMap --> Submit[Submit to Remote Execution]
    
    Skip --> More{More?}
    Submit --> More
    More -->|Yes| Instantiate
    More -->|No| Parallel[Speculative Execution<br/>Warms ActionCache]
    
    Parallel --> Interleave[Scheduler Interleaves:<br/>Instantiate remaining speculative actions<br/>as dependencies become available]
    Interleave --> Build[Element Builds<br/>Hit Cache]
    Build --> Complete{Build<br/>Complete?}
    Complete -->|No| Interleave
    Complete -->|Yes| Cancel[Cancel Remaining<br/>Speculative Executions]
    Cancel --> End([Build Complete])
    
    style Submit fill:#fff4e1
    style Parallel fill:#fff4e1
    style Interleave fill:#e1f0ff
    style Build fill:#e1f0ff
    style Skip fill:#ffe1e1

Detailed Flow Steps:

  1. Identify elements to build: Determine which elements need building based on cache keys
  2. Retrieve element artifacts and SpeculativeActions: Query Artifact Cache using element weak cache keys to get artifacts and their associated SpeculativeActions
  3. Load Referenced SpeculativeActions: For elements referenced in overlays that we don't have loaded yet, load them from ReferencedSpeculativeActions.
    • Since not all code in an element churns between builds, keeping the dependency chain of SpeculativeActions intact is likely more beneficial than dropping it.
    • TODO: decide whether to do this for elements referenced in all overlay types or to limit to the ACTION type
  4. Aggregate, deduplicate, and topologically sort: Combine all SpeculativeActions, remove duplicates, and sort respecting dependencies (sketched after this list)
  5. Instantiate speculative actions: Apply overlays to base_action_digest input trees
    • If not all overlays can be resolved (e.g., source_path not found in the overlay source), skip the speculative action and all its transitive dependents
  6. Store and record mapping: Store new speculative actions in CAS and record mapping { base_action_digest → speculative_action_digest } for dependency resolution in following overlays.
  7. Submit for speculative execution: Submit instantiated actions to Remote Execution
  8. Interleave with element builds: BuildStream scheduler continues instantiating additional speculative actions as dependencies become available during ongoing element builds
  9. Cancel remaining executions: After build completion, cancel any speculative executions still in progress

6.2 Data Flow Diagram

flowchart TB
    subgraph BuildN["Build #N"]
        Start[Build #N Starts]
        
        subgraph Priming["Cache Priming Phase"]
            Start --> Sched[BuildStream Scheduler]
            Sched -->|query via Asset API| AC1[(Artifact Cache)]
            AC1 -->|return SpeculativeActions<br/>from Build #N-1| SAList[SpeculativeActions]
            
            SAList --> Inst[Instantiator]
            
            Inst --> Over{Overlay<br/>Type?}
            Over -->|SOURCE| Src[Source Tree]
            Over -->|ARTIFACT| Art[Artifact Output]
            Over -->|ACTION| Act[Action Output]
            
            Src --> New[New Action]
            Art --> New
            Act --> New
            
            CASD1[buildbox-casd<br/>local CAS proxy]
            New -->|store via CAS API| CASD1
            CASD1 -->|proxy to| CAS1[(BuildGrid CAS)]
            
            New -->|Execute API| BuildGrid1[BuildGrid<br/>Execution Service]
            BuildGrid1 -->|check| ACache[ActionCache]
            ACache -->|miss| BuildGrid1
            BuildGrid1 -->|dispatch via Bots API| Workers1[buildbox-worker]
            Workers1 -->|fetch via| CASD_W1[buildbox-casd<br/>on worker]
            CASD_W1 -->|proxy to| CAS1
            Workers1 -->|execute| Runner1[buildbox-run]
            Runner1 -->|result| Workers1
            Workers1 -->|store result| ACache
        end
        
        subgraph Execution["Element Builds"]
            Sched -->|trigger| BSElement[BuildStream<br/>Element Build]
            BSElement -->|create Action| CASD2[buildbox-casd<br/>local CAS proxy]
            CASD2 -->|store via CAS API| CAS1
            BSElement -->|Execute API| BuildGrid2[BuildGrid<br/>Execution Service]
            BuildGrid2 -->|check| ACache
            ACache -.->|cache hit for sub-actions!| BuildGrid2
            BuildGrid2 -->|dispatch via Bots API| Workers2[buildbox-worker]
            Workers2 -->|fetch via| CASD_W2[buildbox-casd<br/>on worker]
            CASD_W2 -->|proxy to| CAS1
            Workers2 -->|execute| Runner2[buildbox-run]
            Runner2 -->|spawns| SubActions[Sub-Actions:<br/>recc invocations]
            SubActions -.->|benefit from warmed ActionCache| ACache
            SubActions -->|record digests| Runner2
            Runner2 -->|result + subactions| Workers2
            Workers2 -->|ActionResult| BuildGrid2
            BuildGrid2 -->|result| BSElement
            BSElement -->|store via| CASD2
        end
        
        subgraph Generation["Overlay Generation (Async)"]
            Gen[Overlay Generator<br/>ASYNC]
            BSElement -.->|triggers async| Gen
            Gen -->|fetch Actions via CAS API| CASD2
            Gen -->|create| SA[SpeculativeActions]
            SA -->|attach to| Art1[Artifact]
            Art1 -->|store via Asset API| AC2[(Artifact Cache)]
        end
    end
    
    AC2 -.->|↓ Data for Build #N+1 ↓| Boundary
    
    Boundary[/"═══════════════════════════════════════════════════════════════════"\]
    
    subgraph BuildN1["Build #N+1 (Uses SpeculativeActions from Build #N)"]
        NextBuild[Build #N+1 will use<br/>SpeculativeActions<br/>generated above]
    end
    
    style Start fill:#e6f3ff
    style SAList fill:#ffeb99
    style Gen fill:#ffe6cc
    style SA fill:#ffe6cc
    style Art1 fill:#ffe6cc
    style AC2 fill:#ffe6cc
    style Workers1 fill:#fff4e1
    style Workers2 fill:#fff4e1
    style BSElement fill:#e1f0ff
    style ACache fill:#fff9e1
    style Boundary fill:#f0f0f0,stroke:#333,stroke-width:3px,stroke-dasharray: 5 5
    style BuildN1 fill:#f8f8ff
    style NextBuild fill:#ffeb99
    style CASD1 fill:#e8f4f8
    style CASD2 fill:#e8f4f8
    style CASD_W1 fill:#e8f4f8
    style CASD_W2 fill:#e8f4f8

Reading the Diagram:

This is an example setup with BuildGrid for Remote Execution services.

Note: this diagram could do with some additional cleanup and scrutiny

REAPI Architecture:

  • BuildStream runs with local buildbox-casd as CAS proxy
  • buildbox-casd proxies to BuildGrid's CAS, ActionCache, and Execution services
  • Workers also run buildbox-casd locally for efficient blob access
  • All communication uses REAPI (Execute, CAS, Asset, Bots APIs)

Cache Priming Flow:

  1. Priming Phase: Retrieves SpeculativeActions → instantiates → submits via Execute API → BuildGrid dispatches to workers → results stored in ActionCache
  2. Element Builds: BuildStream creates Actions → Execute API → BuildGrid checks ActionCache → dispatches to workers → buildbox-run spawns sub-actions (recc) → sub-actions benefit from warmed ActionCache
  3. Overlay Generation: Asynchronously fetches sub-action digests and generates SpeculativeActions for Build #N+1

Key Flow: Build #N uses speculative actions from Build #N-1 → builds → generates speculative actions for Build #N+1


7. Detailed Algorithms

Note: The following pseudocode is illustrative and non-optimal. Production implementations should optimize for performance, especially around CAS operations and dependency resolution.

7.1 Overlay Generation (after build)

Note: This process runs asynchronously and does not block the element build completion or downstream element builds.

# Triggered asynchronously after element build completes
async def generate_overlays(element, element_subactions):
    speculative_actions = []

    for subaction_digest in element_subactions:
        subaction = fetch_cas(subaction_digest)
        input_root = fetch_cas(subaction.input_root_digest)  # the Action references its input tree by digest
        overlays = []
        
        for input_file in walk_files(input_root):  # walk every file in the input directory tree
            # Match digest to source: SOURCE > ACTION > ARTIFACT priority
            # This means if digest "h_aaa111..." appears in:
            #   - libA's source at "src/math_utils.h" (SOURCE)
            #   - libA's artifact at "/usr/include/math_utils.h" (ARTIFACT)
            # We prefer the SOURCE match
            
            src_element, src_path, src_type, src_action = find_source_element(input_file.digest)
            
            overlay = Overlay(
                type=src_type,
                source_element=src_element,
                source_action=src_action,        # Only set for ACTION-type matches
                source_path=src_path,
                target_digest=input_file.digest  # The digest we'll replace during instantiation
            )
            overlays.append(overlay)
        
        speculative_actions.append(SpeculativeAction(
            base_action_digest=subaction_digest,
            overlays=overlays
        ))

    # Attach to Artifact (update artifact in cache)
    artifact = load_artifact(element)  # artifact produced by this element's build
    artifact.speculative_actions = store_cas(SpeculativeActions(actions=speculative_actions))
    update_artifact_cache(element, artifact)
    
    # Element's downstream dependencies can already be building
    # This overlay data will be available for the NEXT build session

Note on find_source_element: This function searches through all elements in the dependency graph that were built before the current element. It looks for files matching the target digest in:

  1. SOURCE trees (element source files)
  2. ACTION outputs (from other speculative actions, including those from earlier elements in the dependency graph)
  3. ARTIFACT outputs (built artifacts from dependencies)

When the same digest appears in multiple locations, it returns the SOURCE match, creating a direct dependency on the original source rather than intermediate artifacts.
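A minimal sketch of find_source_element built on pre-computed digest indices (plain dictionaries here; Section 11.4 discusses an Arrow-backed variant). The indices are assumed to be built once per build session from source trees, prior (sub)action outputs, and dependency artifacts:

def find_source_element(digest, source_index, action_index, artifact_index):
    """Resolve a digest to (element, path, overlay_type, source_action).

    Each index maps digest -> list of candidate (element, [action,] path) locations.
    The lookup order encodes the SOURCE > ACTION > ARTIFACT priority.
    """
    if digest in source_index:
        element, path = source_index[digest][0]
        return element, path, "SOURCE", None
    if digest in action_index:
        element, action_digest, path = action_index[digest][0]
        return element, path, "ACTION", action_digest
    if digest in artifact_index:
        element, path = artifact_index[digest][0]
        return element, path, "ARTIFACT", None
    return None, None, None, None    # unknown input: no overlay can be generated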

Use Cases for ACTION Overlays Beyond Current Element:
ACTION overlays are particularly valuable for intermediate build artifacts that are generated deterministically:

  • Code generation: protoc, flex/bison output files can be referenced across elements
  • Resource compilation: glib-compile-resources, Qt rcc generated code
  • Archive creation: .o → .a via rear (see Section 12)
  • Intermediate transformations: any deterministic build step producing reusable artifacts

7.2 Overlay Instantiation (before build)

speculative_action_map = {}  # base_digest → speculative_action_digest

for action_template in topological_sort(all_speculative_actions):
    base_action = fetch_cas(action_template.base_action_digest)
    action = copy.deepcopy(base_action)
    
    # Apply each overlay
    all_resolved = True
    for overlay in action_template.overlays:
        if overlay.type == SOURCE:
            src_digest = resolve_source_digest(overlay.source_element, overlay.source_path)
        elif overlay.type == ARTIFACT:
            artifact_output = wait_for_artifact_output(overlay.source_element)
            src_digest = digest_at_path(artifact_output, overlay.source_path)
        elif overlay.type == ACTION:
            speculative_action = speculative_action_map.get(overlay.source_action)
            if speculative_action is None:
                # The action we depend on was skipped; skip this one (and its dependents) too
                all_resolved = False
                break
            wait_for_action(speculative_action)
            output = fetch_action_output(speculative_action)
            src_digest = digest_at_path(output, overlay.source_path)
        
        if src_digest is None:
            all_resolved = False
            break
        
        replace_input_digest(action.input_root, overlay.target_digest, src_digest)
    
    if all_resolved:
        speculative_action_digest = store_cas(action)
        speculative_action_map[action_template.base_action_digest] = speculative_action_digest
        submit_speculative_action(speculative_action_digest)
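The replace_input_digest step above has a subtlety worth spelling out: the REAPI input root is a Merkle tree, so swapping a leaf file digest changes every enclosing Directory digest up to the root. A minimal sketch over a simplified in-memory tree (nested dicts rather than real Directory protos, with a stand-in hash):

import hashlib
import json

def _dir_digest(directory):
    """Stand-in for hashing a serialized Directory message."""
    return hashlib.sha256(json.dumps(directory, sort_keys=True).encode()).hexdigest()

def replace_input_digest(directory, target_digest, new_digest):
    """Swap target_digest for new_digest in every file node and recompute parents.

    directory: {"files": {name: digest}, "directories": {name: subdirectory}}
    Returns the new digest of this directory.
    """
    files = directory.get("files", {})
    for name, digest in files.items():
        if digest == target_digest:
            files[name] = new_digest
    for subdirectory in directory.get("directories", {}).values():
        replace_input_digest(subdirectory, target_digest, new_digest)
    # Parent digests change whenever any child changed, which is what
    # makes the instantiated Action a new, distinct cache key.
    return _dir_digest(directory)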

8. Implementation Plan

Phase 1: REAPI Extension

  1. Add ActionResult.subactions field to REAPI
  2. Instrument buildbox-worker to record subaction digests
  3. Update buildbox-casd to store subactions with ActionResult

Phase 2: BuildStream Integration - SOURCE Overlays Only

  1. Implement overlay generator in BuildStream (SOURCE type only)
  2. Add SpeculativeActions to Artifact protobuf
  3. Store/retrieve SpeculativeActions via Artifact Cache
  4. Implement speculative action instantiation for SOURCE overlays
  5. Add speculative action submission to scheduler
  6. Make overlay generation asynchronous: Run as background task, don't block element builds or downstream dependencies
  7. Measure: Overlay resolution success rate, storage overhead, performance impact, overlay generation time

Phase 3: ARTIFACT Overlays

  1. Extend overlay generator to support ARTIFACT type
  2. Update instantiation logic to resolve ARTIFACT overlays
  3. Measure: Additional performance gains, complexity vs. benefit

Phase 4: ACTION Overlays (potentially combined with rear and other re-wrappers)

  1. Implement ACTION overlay support in generator and instantiator
  2. Evaluate whether ACTION overlays provide value for code generation (protoc, flex/bison) and other deterministic intermediate steps
  3. If beneficial for archive creation, implement rear (remote execution ar) to complete the speculative action chain (see Section 12 for details)
  4. Consider other re-wrapper candidates: ranlib, strip, resource compilers (see Section 12)
  5. Measure: End-to-end performance improvement

Phase 5: Optimization

  1. Optimize digest resolution using Apache Arrow for columnar digest storage and vectorized lookups
  2. Skip priming for elements with strong cache key hits and their transitive dependencies up the dependency graph (unchanged subtrees)
  3. Optimize topological sorting and deduplication
  4. Implement adaptive throttling for speculative action submission
  5. Add comprehensive monitoring and metrics

9. Benefits

  • Restores Parallelism: Compilation units execute in parallel across the entire build graph
  • Transparent: No changes to element definitions or build scripts
  • Safe: Incorrect/stale overlays simply become redundant; correctness never affected
  • Efficient: Minimal storage overhead (overlays are small metadata)
  • Incremental: Works with partial cache hits and source changes
  • Extensible: Can be enhanced with local execution optimization (rear) and other re-wrapped commands to further improve performance

10. Key Design Decisions

  1. Store speculative actions with artifacts: Keeps priming data co-located with build results
  2. Three speculative action overlay types: SOURCE/ARTIFACT/ACTION cover all input sources with appropriate granularity
  3. Topological sorting: Ensures ACTION overlays can resolve dependencies
  4. Skip on unresolved: Gracefully handles missing/changed inputs without blocking
  5. Early submission: Submit speculative actions as soon as dependencies resolve, regardless of element depth in graph
  6. Asynchronous overlay generation: Overlay generation runs in the background and does not block element build completion or downstream element builds. SpeculativeActions become available for future builds without impacting the current build's critical path.

11. Open Questions and Key Uncertainties

Implementation Questions

  1. Resource limits: How many concurrent speculative actions should we allow?
  2. Metrics: What telemetry do we need to measure priming effectiveness?

Critical Success Factors

1. Phased Rollout Strategy

Question: Should we implement all overlay types at once, or phase them?

Proposal: Start with SOURCE overlays only (Phase 2), measure impact, then add ARTIFACT (Phase 3), then ACTION (Phase 4).

Rationale:

  • SOURCE overlays are simplest and likely provide the most benefit
  • ARTIFACT overlays add cross-element dependencies but may have lower resolution rates
  • ACTION overlays require careful dependency management and may only be valuable combined with rear

Required: Define success criteria for each phase before proceeding to the next.

2. Overlay Resolution Success Rate

Question: How often will overlays successfully resolve in practice?

Context: Speculative actions are skipped if any overlay cannot be resolved. Success rate depends on:

  • How frequently source files change between builds
  • Whether the same digests appear in both SOURCE and ARTIFACT locations
  • For ACTION overlays: whether dependency chains remain intact

Research Evidence: Studies of C/C++ build caching and code churn patterns provide encouraging data:

  • Adams et al. (2016): Most C/C++ header files are relatively stable compared to implementation files
  • Misu et al. (2024): Incremental builds with caching achieve 4.22x to 4.71x speedups, demonstrating high cache hit rates in practice
  • Matev (2020): Hot ccache provides 4-5x speedup over cold cache in large C++ projects
  • Chowdhury et al. (2024): Approximately 80% of code changes concentrate in 20% of methods (Pareto principle)
  • Shin et al. (2011): In Mozilla and Red Hat Linux, 89.1% of files were in the low-churn, stable category
  • Macho et al. (2021): Analysis of 144 Maven projects and build changes found that while dependency version updates are among the top-10 most frequent build changes, structural changes to build configuration (adding/removing dependencies, changing build order) occur much less frequently. Build maintenance is primarily driven by accommodating changes to production code rather than restructuring the build itself.
  • McIntosh et al. (2012): Study of Java build systems showed that build complexity evolves slowly over time, with build structure remaining relatively stable between major refactorings.

Implication for Build Structure Stability: Research shows that while source code changes frequently, build structure (which libraries are linked, link order, build rules) changes much less often. We therefore expect:

  • ~80% of SOURCE overlays for .cpp files to resolve successfully (unchanged sources)
  • ~90% of SOURCE overlays for .h files to resolve (headers are more stable per Adams et al., 2016)
  • ~95% of ARTIFACT overlays to resolve (dependency artifacts rarely change)
  • ~85-90% of ACTION overlays to resolve (link lines and build structure change infrequently per Macho et al., 2021 and McIntosh et al., 2012)

Real-world cache effectiveness: The Misu and Matev studies showing 4-5x speedups from caching suggest that cache hit rates of 75-80% are achievable in practice, directly supporting the viability of Cache Priming.

Required:

  • Instrument prototype to measure actual resolution rates on real projects
  • Identify common failure patterns
  • Validate against freedesktop-sdk and GNOME builds
  • Determine if partial resolution strategies would help

Risk: If <50% of speculative actions resolve successfully, the system may not provide sufficient benefit to justify the complexity. However, research suggests success rates should be substantially higher (70-80%+).

3. Non-Determinism Handling

Question: How do non-deterministic builds affect speculative action correctness?

Context: If a build produces different outputs for the same inputs, the recorded base_action_digest may not be reusable.

Current Assumption: Non-deterministic actions will still return cached results when the ActionCache entry hasn't been evicted. If a new execution produces different outputs, it generates a new subaction, and SpeculativeActions are updated for the next build.

Required: Validate this assumption and document expected behavior for non-deterministic builds.

4. Digest Resolution Performance Impact

Question: Can we make find_source_element fast enough to avoid becoming a bottleneck?

Context: During overlay generation, every input file of every subaction needs its digest resolved against all source trees, artifact outputs, and prior actions in the dependency graph. For large builds, this could be thousands of files × thousands of potential sources.

Critical Insight: Since overlay generation runs asynchronously and doesn't block element builds or speculative action execution, performance requirements are relaxed:

  • Overlay generation for element N can run while elements N+1, N+2, etc. are building
  • Generated SpeculativeActions are used in the next build session, not the current one
  • Slow overlay generation only delays availability of speculative actions for future builds

Optimization Strategies:

  • Apache Arrow for digest lookups: Use Apache Arrow's columnar format to store digest tables with built-in vectorized operations
    • Arrow provides O(1) random access with cache-efficient columnar layout
    • Built-in SIMD optimizations for hash lookups and comparisons
    • Cross-language support (C++, Python, Rust) for BuildStream integration
    • Set lookup kernels with hash table support already implemented
    • No need to implement custom indices, bloom filters, or vectorization
  • Columnar digest storage: Store digests in Arrow tables: [(digest: FixedSizeBinary(32), element: string, path: string, type: int8)]
  • Arrow compute kernels: Use pc.is_in() for individual digest checks and semi joins (e.g., Table.join(..., join_type="left semi")) for bulk digest matching against large sets (see the sketch after this list)
  • Memory efficiency: Arrow's columnar format provides 64-byte alignment for optimal SIMD performance
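A minimal sketch of the Arrow-backed lookup, assuming pyarrow is available and the digest table is built once per build session; the rows are illustrative:

import pyarrow as pa
import pyarrow.compute as pc

# Columnar digest table: one row per (digest, element, path, type) observation.
digest_table = pa.table({
    "digest": pa.array([b"\x11" * 32, b"\x22" * 32], type=pa.binary(32)),
    "element": ["libA", "libA"],
    "path": ["src/math_utils.h", "usr/lib/libmath_utils.a"],
    "type": pa.array([0, 1], type=pa.int8()),    # 0=SOURCE, 1=ARTIFACT, 2=ACTION
})

def lookup_digests(wanted):
    """Return all known locations for the given digests in one vectorized pass."""
    value_set = pa.array(wanted, type=pa.binary(32))
    mask = pc.is_in(digest_table["digest"], value_set=value_set)
    return digest_table.filter(mask)

print(lookup_digests([b"\x11" * 32]).to_pylist())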

Why Apache Arrow:

  • Arrow's columnar layout enables vectorization using SIMD operations, with 64-byte alignment matching AVX-512 register width
  • Arrow provides built-in set lookup kernels with hash table support for exactly this use case
  • Arrow's pipeline and SIMD algorithms deliver 10-100x performance improvements through cache locality and parallel operations
  • Arrow enables efficient data processing without serialization overhead, making cross-module data sharing trivial
  • Proven at scale: PySpark achieved 10-100x performance improvements using Arrow

Required:

  • Profile and optimize on real-world projects (freedesktop-sdk, GNOME)
  • Measure: time to generate overlays vs. element build time
  • Implement Apache Arrow-based digest lookup tables as primary strategy
  • Leverage Arrow's built-in set lookup kernels and hash table support

Acceptable Performance: Overlay generation should complete within 1-2x the element's build time. Since it runs asynchronously, it won't impact the current build's critical path.

Risk: If overlay generation is extremely slow (>10x element build time), it might not complete before the next build session starts, reducing cache priming effectiveness.

5. Speculative Action Scale and Remote Execution Capacity

Question: Won't submitting thousands of speculative actions overwhelm the remote execution cluster?

Context: A large C/C++ BuildStream project might generate thousands of speculative actions (one per compilation unit). For example, a project with 100 elements averaging 100 compilation units each would produce 10,000 speculative actions.

Comparison to Existing Systems: This scale is actually modest compared to build systems already using Remote Execution:

  • Bazel builds of TensorFlow generate over 8,191 actions executing across hundreds of remote workers
  • Google runs millions of builds executing millions of test cases every day, with builds depending on tens of thousands of targets
  • A large TypeScript project with 10M lines of code generates 1,100 Bazel targets, with benchmarks using 100 remote executors
  • Bazel builds often include thousands or tens of thousands of small actions such as compiling, linking, or running tests

Key Insight: Remote Execution systems like BuildGrid are designed to handle tens of thousands of concurrent actions. Cache Priming's speculative actions are within the normal operating range of these systems.

Mitigation Strategies:

  • ActionCache hits: Many speculative actions will hit immediately, reducing actual execution load
  • Priority scheduling: Actual element builds can be prioritized over speculative actions
  • Adaptive throttling: Submit speculative actions at a controlled rate based on cluster capacity (sketched after this list)
  • Gradual rollout: Start with Phase 2 (SOURCE overlays only) which generates fewer actions
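A minimal sketch of the priority-plus-throttling idea, using a bounded semaphore to cap in-flight speculative work; execute stands in for the real Execute API client, and the slot count is an illustrative knob:

import asyncio

SPECULATIVE_SLOTS = 64    # illustrative cap on concurrent speculative actions

async def submit_with_throttle(real_actions, speculative_actions, execute):
    """Dispatch real element builds immediately; gate speculative actions on free slots."""
    slots = asyncio.Semaphore(SPECULATIVE_SLOTS)

    async def run_speculative(action_digest):
        async with slots:                        # waits while the speculative budget is used up
            await execute(action_digest, priority="low")

    real = [asyncio.create_task(execute(d, priority="high")) for d in real_actions]
    spec = [asyncio.create_task(run_speculative(d)) for d in speculative_actions]

    await asyncio.gather(*real)                  # build completion never waits on speculation
    for task in spec:
        task.cancel()                            # cancel remaining speculative executions
    await asyncio.gather(*spec, return_exceptions=True)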

Required:

  • Monitor remote execution cluster utilization during prototype testing
  • Implement priority queuing: actual builds > speculative actions
  • Consider time-windowing: only prime for elements building soon

Acceptable: Speculative actions should consume <30% of remote execution capacity, leaving headroom for actual builds.

6. Storage and Network Overhead

Question: What is the actual storage overhead of SpeculativeActions?

Context:

  • Each artifact now includes SpeculativeActions metadata
  • For compilation-heavy elements (1000+ compilation units), this could be significant
  • ReferencedSpeculativeAction stores only element name + digest (minimal duplication)

Required:

  • Measure actual storage increase on real projects
  • Consider compression strategies if overhead is significant
  • Provide opt-out mechanism for elements that don't benefit

7. Value of ACTION Overlays Without rear

Question: Do ACTION overlays provide benefit on their own, or only when combined with rear?

Context: ACTION overlays create dependencies between speculative actions within an element. Without rear (see Section 12), the primary use case would be linking, which requires archives that may not exist as separate subactions.

Proposal: Evaluate ACTION overlays and rear together in Phase 4, as they may be interdependent.

Required: Identify other potential uses for ACTION overlays beyond archive creation.

Success Metrics

To validate the proposal, we need to measure:

  1. Performance: End-to-end build time reduction (target: >20% for C/C++ heavy projects)
  2. Resolution Rate: Percentage of speculative actions successfully instantiated (target: >70%)
  3. Storage Overhead: Size increase of artifact cache (acceptable: <20%)
  4. Overlay Generation Time: Time to generate overlays per element (acceptable: <2x element build time, since it's async)
  5. Worker Utilization: Speculative actions should not starve actual builds

Validation Plan

  1. Implement Phase 1-2: SOURCE overlays only on a prototype
  2. Test on freedesktop-sdk: Large, real-world C/C++ heavy BuildStream project
  3. Measure all success metrics: Gather data before proceeding
  4. Iterate or pivot: If metrics don't meet targets, adjust design or abandon
  5. Proceed to Phase 3+: Only if Phase 2 shows clear benefit

12. Future Enhancement: Local Execution with rear

Motivation

In the base Cache Priming design, speculative actions follow this pattern:

  1. Compile .cpp → .o (via recc, speculative)
  2. Upload .o to CAS
  3. Download .o files for archiving
  4. Create .a archive
  5. Upload .a to CAS

However, since BuildStream submits entire element builds as Actions to workers, and recc returns .o files during the element build, those .o files are already local on the worker. We can optimize archive creation by keeping it local.

The rear Command

Similar to recc (remote execution caching compiler), rear (remote execution ar) wraps the ar archiver:

# Traditional ar usage
ar rcs libmath_utils.a math_utils.o helpers.o

# With rear wrapper
rear rcs libmath_utils.a math_utils.o helpers.o

rear behavior (sketched below):

  1. Generates an Action describing the archive operation
  2. Checks ActionCache for existing result
  3. If miss:
    • During main element build: Executes ar locally on the worker (via buildbox-casd) since .o files are already present
    • During speculative execution: Executes remotely (fetches .o files from CAS as needed)
  4. Stores result in ActionCache
  5. Records the Action digest in ActionResult.subactions
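rear does not exist yet; the following is a minimal Python sketch of the decision logic above (a real wrapper would more likely live in C++ alongside recc). The action_cache, run_remotely, upload_outputs, and record_subaction hooks are placeholders for the buildbox/REAPI plumbing, and the digest computation is simplified:

import hashlib
import subprocess

def rear(args, action_cache, inputs_are_local, run_remotely, upload_outputs, record_subaction):
    """Wrap `ar` so archive creation is cacheable and recorded as a subaction."""
    # A real implementation would build a REAPI Action covering the command
    # and the input tree; hashing the command line stands in for that here.
    action_digest = hashlib.sha256(" ".join(["ar"] + args).encode()).hexdigest()

    result = action_cache.get(action_digest)
    if result is None:                           # ActionCache miss
        if inputs_are_local:
            # Main element build: the .o files are already on the worker, run locally.
            subprocess.run(["ar"] + args, check=True)
            result = upload_outputs(args)        # push the .a (and metadata) to CAS
        else:
            # Speculative execution: inputs must come from CAS, so execute remotely.
            result = run_remotely(action_digest)
        action_cache.put(action_digest, result)

    record_subaction(action_digest)              # recorded into ActionResult.subactions
    return result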

Extended Speculative Action Chain

With rear, we get a complete chain of speculative actions:

libA element build (main build):
  recc g++ -c src/math_utils.cpp -o math_utils.o    [remote or cached]
    ↓ (local .o file on worker)
  rear rcs /usr/lib/libmath_utils.a math_utils.o    [LOCAL execution]
    ↓ (local .a file on worker)
  g++ app.o -L/usr/lib -lmath_utils -o final_binary [remote or cached]

Speculative execution (cache priming):
  recc compile action                                [remote, warms ActionCache]
    ↓ (.o uploaded to CAS)
  rear archive action                                [remote, fetches .o from CAS]
    ↓ (.a uploaded to CAS)
  link action                                        [remote, fetches .a from CAS]

Example: libA with rear

Recorded Subactions during libA build:

ActionResult {
  subactions: [
    "recc_compile_math_utils...",   // .cpp → .o
    "recc_compile_helpers...",      // .cpp → .o
    "rear_archive..."               // .o → .a
  ]
}

Generated SpeculativeAction for rear:

SpeculativeAction {
  base_action_digest: "rear_archive..."
  overlays: [
    Overlay {
      type: ACTION
      source_element: "libA.bst"
      source_action: "recc_compile_math_utils..."
      source_path: "math_utils.o"
      target_digest: "obj_math_utils..."
    },
    Overlay {
      type: ACTION
      source_element: "libA.bst"
      source_action: "recc_compile_helpers..."
      source_path: "helpers.o"
      target_digest: "obj_helpers..."
    }
  ]
}

Key Advantages

  1. Faster Main Builds: During the actual element build, .o files are local on the worker, so archive creation executes immediately without network overhead
  2. Earlier Link Actions: Even though speculative rear actions run remotely, they still populate the ActionCache, enabling link actions to start as soon as archives are cached
  3. ActionCache Hits: When the main build runs, rear checks ActionCache first and gets an immediate hit from the speculative execution
  4. Reduced Critical Path: The main build's archive step becomes nearly instantaneous (cache hit), shortening the overall build time

Note: Future optimizations (out of scope for this proposal) could include locality hints to prefer scheduling speculative rear actions on workers that already have the .o files, further reducing network traffic.

Execution Flow

Cache Priming Phase (speculative):
  Submit compile actions (recc) → executes remotely, warms ActionCache
    ↓ [.o files uploaded to CAS]
  Submit archive action (rear) → executes remotely, fetches .o from CAS
    ↓ [.a file uploaded to CAS, ActionCache warmed]
  Submit link actions → executes remotely, warms ActionCache
    ↓

Main Build Phase:
  Element build starts
    ↓
  recc compile → ActionCache HIT (instant)
    ↓ [.o files now local on worker]
  rear archive → ActionCache HIT; on a miss, executes locally (still fast, .o files already local)
    ↓ [.a file now local on worker]
  link → ActionCache HIT (instant)
    ↓
  Element build completes in minimal time

The key insight: rear enables ActionCache hits during the main build, and when the cache misses, local execution is still very fast because the input .o files are already on the worker.

Beyond rear: Instrumentation Strategy

The rear optimization illustrates a general principle: identify commands that operate on local intermediate data and wrap them for speculative execution.

Candidates for "re-" wrappers:

  • ranlib: Index generation for static libraries (fast, local)
  • strip: Symbol stripping (fast, local, self-contained)
  • Code generators: protoc, flex/bison (deterministic, cacheable)
  • Resource compilers: glib-compile-resources, Qt rcc

Instrumentation approach:

  1. Monitor build logs to identify frequent intermediate commands
  2. Measure: execution time, input/output sizes, determinism
  3. Prioritize commands that are:
    • Fast enough for local execution (< 1 second)
    • Self-contained (few dependencies)
    • Operate on data already local to worker
    • Appear frequently across builds
  4. Implement re-wrappers and measure impact

This creates a virtuous cycle: more instrumentation → more optimization opportunities → better build performance.


13. References

  • Adams, B., McIntosh, S., Nagappan, M., and Hassan, A.E. (2016). "Identifying and understanding header file hotspots in C/C++ build processes." Automated Software Engineering, Vol. 23, pp. 61-90. https://dl.acm.org/doi/10.1007/s10515-015-0183-5

  • Chowdhury, S., Uddin, G., Hemmati, H., and Holmes, R. (2024). "The Good, the Bad, and the Monstrous: Predicting Highly Change-Prone Source Code Methods at Their Inception." ACM Transactions on Software Engineering and Methodology, Vol. 33, No. 4. https://dl.acm.org/doi/10.1145/3715006

  • Macho, C., McIntosh, S., and Pinzger, M. (2021). "The nature of build changes." Empirical Software Engineering, Vol. 26, Article 52. https://link.springer.com/article/10.1007/s10664-020-09926-4

  • Matev, R. (2020). "Fast distributed compilation and testing of large C++ projects." EPJ Web of Conferences, Vol. 245. https://www.epj-conferences.org/articles/epjconf/pdf/2020/21/epjconf_chep2020_05001.pdf

  • McIntosh, S., Adams, B., and Hassan, A.E. (2012). "The Evolution of Java Build Systems." Empirical Software Engineering, Vol. 17, No. 4-5, pp. 578-608.

  • Misu, S., Lin, B., and Monden, A. (2024). "Does Using Bazel Help Speed Up Continuous Integration Builds?" arXiv preprint. https://arxiv.org/html/2405.00796v1

  • Munson, J.C. and Elbaum, S. (1998). "Code Churn: A Measure for Estimating the Impact of Code Change." Proceedings of the International Conference on Software Maintenance, pp. 24-31. https://ieeexplore.ieee.org/document/738486/

  • Nagappan, N. and Ball, T. (2007). "Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study." First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007). https://ieeexplore.ieee.org/document/4343764/

  • Shin, Y., Meneely, A., Williams, L., and Osborne, J. (2011). "Evaluating Complexity, Code Churn, and Developer Activity Metrics as Indicators of Software Vulnerabilities." IEEE Transactions on Software Engineering, Vol. 37, No. 6, pp. 772-787. https://ieeexplore.ieee.org/document/5560680/
