5 changes: 2 additions & 3 deletions README.md
@@ -154,7 +154,6 @@ xcode-build-optimization-agent-skill/
build-benchmark.schema.json
scripts/
benchmark_builds.py
check_spm_pins.py
diagnose_compilation.py
generate_optimization_report.py
render_recommendations.py
@@ -217,8 +216,8 @@ Real-world improvements reported by developers who used these skills. Add your o

The `xcode-build-orchestrator` generates your table row at the end of every optimization run, so contributing is a single copy-paste.

| App | Incremental Before | Incremental After | Clean Before | Clean After |
|-----|-------------------:|------------------:|-------------:|------------:|
| App | Incremental Before | Incremental After | Clean Before | Clean After | Cached Clean Before | Cached Clean After |
|-----|-------------------:|------------------:|-------------:|------------:|--------------------:|-------------------:|

## Contributing

25 changes: 16 additions & 9 deletions references/benchmark-artifacts.md
@@ -22,6 +22,9 @@ Recommended outputs:
- `.build-benchmark/<timestamp>-<scheme>-clean-1.log`
- `.build-benchmark/<timestamp>-<scheme>-clean-2.log`
- `.build-benchmark/<timestamp>-<scheme>-clean-3.log`
- `.build-benchmark/<timestamp>-<scheme>-cached-clean-1.log` (when COMPILATION_CACHING is enabled)
- `.build-benchmark/<timestamp>-<scheme>-cached-clean-2.log`
- `.build-benchmark/<timestamp>-<scheme>-cached-clean-3.log`
- `.build-benchmark/<timestamp>-<scheme>-incremental-1.log`
- `.build-benchmark/<timestamp>-<scheme>-incremental-2.log`
- `.build-benchmark/<timestamp>-<scheme>-incremental-3.log`
@@ -42,12 +45,13 @@ Each JSON artifact should include:
- parsed timing-summary categories
- free-form notes for caveats or noise

## Clean And Incremental Separation
## Clean, Cached Clean, And Incremental Separation

Do not merge clean and incremental measurements into a single list. They answer different questions:
Do not merge measurements from different build types into a single list. They answer different questions:

- Clean builds show full build-system, package, and module setup cost.
- Incremental builds show edit-loop productivity and script or cache invalidation problems.
- **Clean builds** show full build-system, package, and module setup cost with a cold compilation cache.
- **Cached clean builds** show clean build cost when the compilation cache is warm. This is the realistic scenario for branch switching, pulling changes, or Clean Build Folder. Only present when `COMPILATION_CACHING = YES` is detected.
- **Incremental builds** show edit-loop productivity and script or cache invalidation problems.
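
A minimal sketch of keeping the three build types in separate lists and summarizing each on its own. The `summarize` helper and the sample durations are hypothetical; the key names mirror the `median_seconds`, `min_seconds`, `max_seconds`, and `count` fields that the report generator reads.

```python
import statistics
from typing import Dict, List


def summarize(durations: List[float]) -> Dict[str, float]:
    """Summarize one build type; durations from other types are never mixed in."""
    return {
        "count": len(durations),
        "min_seconds": min(durations),
        "max_seconds": max(durations),
        "median_seconds": statistics.median(durations),
    }


# Hypothetical sample durations in seconds, one list per build type.
runs = {
    "clean": [92.4, 90.1, 95.7],
    "cached_clean": [31.2, 29.8, 30.5],
    "incremental": [4.1, 3.9, 4.4],
}

# One summary per build type, keyed the same way as the runs.
summary = {build_type: summarize(values) for build_type, values in runs.items()}
```

Pooling these nine values into one list would produce a median that describes none of the three scenarios.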

## Raw Logs

@@ -62,14 +66,17 @@ Store raw `xcodebuild` output beside the JSON artifact whenever possible. That a

### COMPILATION_CACHING

`COMPILATION_CACHING = YES` stores compiled artifacts so that repeated compilations of identical inputs are served from cache. The standard benchmark methodology (clean + build) clears derived data before each clean run, which invalidates the compilation cache. As a result, the benchmark script does not capture the benefit of compilation caching.
`COMPILATION_CACHING = YES` stores compiled artifacts in a system-managed cache outside DerivedData so that repeated compilations of identical inputs are served from cache. The standard clean-build benchmark (`xcodebuild clean` between runs) may add overhead from cache population without showing the corresponding cache-hit benefit.

The real benefit of compilation caching appears during:
The benchmark script automatically detects `COMPILATION_CACHING = YES` and runs a **cached clean** benchmark phase. This phase:

- Repeat clean builds where source files have not changed (e.g., after switching branches and switching back).
- CI builds that share a persistent derived-data directory across runs.
1. Builds once to warm the compilation cache.
2. Deletes DerivedData (but not the compilation cache) before each measured run.
3. Rebuilds, measuring the cache-hit clean build time.

When reporting on COMPILATION_CACHING, note that the standard clean-build benchmark cannot measure its impact. Recommend enabling it based on the well-documented benefit rather than requiring a measurable delta from the benchmark script.
The cached clean metric captures the realistic developer experience: branch switching, pulling changes, and Clean Build Folder. Use the cached clean median as the primary comparison metric when evaluating `COMPILATION_CACHING` impact.

To skip this phase, pass `--no-cached-clean`.
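
The three steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the script's actual code: `cached_clean_durations` and `fake_build` are hypothetical names, and the build step is injected as a callable so the sketch runs without Xcode. In real use the callable would invoke `xcodebuild build` with `-derivedDataPath` pointing at the directory being deleted.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import Callable, List


def cached_clean_durations(
    build: Callable[[Path], None],
    derived_data: Path,
    repeats: int = 3,
) -> List[float]:
    """Warm the cache once, then time rebuilds after deleting DerivedData."""
    build(derived_data)  # Step 1: warm-up build, not measured.
    durations = []
    for _ in range(repeats):
        # Step 2: remove build products only. The compilation cache is
        # assumed to live elsewhere and is left untouched.
        shutil.rmtree(derived_data, ignore_errors=True)
        started = time.perf_counter()
        build(derived_data)  # Step 3: measured rebuild.
        durations.append(time.perf_counter() - started)
    return durations


def fake_build(derived_data: Path) -> None:
    """Hypothetical stand-in for an `xcodebuild build` invocation."""
    derived_data.mkdir(parents=True, exist_ok=True)
    (derived_data / "product.o").write_text("built")


scratch = Path(tempfile.mkdtemp(prefix="bench-sketch-")) / "DerivedData"
durations = cached_clean_durations(fake_build, scratch)
shutil.rmtree(scratch.parent, ignore_errors=True)
```

Using a dedicated DerivedData directory keeps the deletion from touching build products that other projects rely on.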

### First-Run Variance

1 change: 1 addition & 0 deletions references/build-settings-best-practices.md
@@ -131,6 +131,7 @@ These settings optimize for production builds.
- **Key:** `COMPILATION_CACHING`
- **Recommended:** `YES`
- **Why:** Caches compilation results for Swift and C-family sources so repeated compilations of the same inputs are served from cache. The biggest wins come from branch switching and clean builds where source files are recompiled unchanged. This is an opt-in feature. The umbrella setting controls both `SWIFT_ENABLE_COMPILE_CACHE` and `CLANG_ENABLE_COMPILE_CACHE` under the hood; those can be toggled independently if needed.
- **Measurement:** The benchmark script auto-detects this setting and runs a **cached clean** phase that measures clean builds with a warm compilation cache. Standard clean builds may show overhead from cache population; the cached clean metric captures the realistic developer benefit.
- **Risk:** Low -- can also be enabled via per-user project settings so it does not need to be committed to the shared project file.
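
As an illustration of the auto-detection mentioned above, the sketch below reads `KEY = VALUE` lines of the kind `xcodebuild -showBuildSettings` prints. The helper names and the sample text are hypothetical. Unlike a prefix check, this version compares the whole key, which is a design choice of the sketch rather than a description of the script.

```python
from typing import Optional


def build_setting(show_build_settings_output: str, key: str) -> Optional[str]:
    """Return the first value of `key` among `KEY = VALUE` lines, or None."""
    for line in show_build_settings_output.splitlines():
        name, separator, value = line.strip().partition("=")
        # Compare the full key so that longer keys sharing a prefix do not match.
        if separator and name.strip() == key:
            return value.strip()
    return None


def compilation_caching_enabled(output: str) -> bool:
    """True only when the resolved setting is exactly YES."""
    return build_setting(output, "COMPILATION_CACHING") == "YES"


# Hypothetical excerpt of build-settings output.
sample = """
Build settings for action build and target App:
    COMPILATION_CACHING = YES
    PRODUCT_NAME = App
"""

enabled = compilation_caching_enabled(sample)
```

When a scheme builds several targets, the output contains one block per target; this sketch returns the first match only.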

### Integrated Swift Driver
12 changes: 11 additions & 1 deletion schemas/build-benchmark.schema.json
@@ -12,7 +12,7 @@
"properties": {
"schema_version": {
"type": "string",
"enum": ["1.0.0", "1.1.0"]
"enum": ["1.0.0", "1.1.0", "1.2.0"]
},
"created_at": {
"type": "string",
@@ -84,6 +84,12 @@
"$ref": "#/definitions/run"
}
},
"cached_clean": {
"type": "array",
"items": {
"$ref": "#/definitions/run"
}
},
"incremental": {
"type": "array",
"items": {
@@ -103,6 +109,9 @@
"clean": {
"$ref": "#/definitions/stats"
},
"cached_clean": {
"$ref": "#/definitions/stats"
},
"incremental": {
"$ref": "#/definitions/stats"
}
@@ -134,6 +143,7 @@
"type": "string",
"enum": [
"clean",
"cached-clean",
"incremental"
]
},
72 changes: 64 additions & 8 deletions scripts/benchmark_builds.py
@@ -5,9 +5,11 @@
import os
import platform
import re
import shutil
import statistics
import subprocess
import sys
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path
@@ -31,6 +33,11 @@ def parse_args() -> argparse.Namespace:
        help="Path to a source file to touch before each incremental build. "
        "When provided, measures a real edit-rebuild loop instead of a zero-change build.",
    )
    parser.add_argument(
        "--no-cached-clean",
        action="store_true",
        help="Skip cached clean builds even when COMPILATION_CACHING is detected.",
    )
    parser.add_argument(
        "--extra-arg",
        action="append",
@@ -134,6 +141,19 @@ def xcode_version() -> str:
    return result.stdout.strip() if result.returncode == 0 else "unknown"


def detect_compilation_caching(base_command: List[str]) -> bool:
    """Check whether COMPILATION_CACHING is enabled in the resolved build settings."""
    result = run_command([*base_command, "-showBuildSettings"])
    if result.returncode != 0:
        return False
    for line in result.stdout.splitlines():
        stripped = line.strip()
        if stripped.startswith("COMPILATION_CACHING") and "=" in stripped:
            value = stripped.split("=", 1)[1].strip()
            return value == "YES"
    return False


def measure_build(
    base_command: List[str],
    artifact_stem: str,
@@ -173,8 +193,6 @@ def main() -> int:
    if warmup.returncode != 0:
        sys.stderr.write(warmup.stdout + warmup.stderr)
        return warmup.returncode
    # Warmup clean+build cycle primes OS-level caches (disk, dyld, etc.)
    # so the first measured clean run is not penalised by cold caches.
    warmup_clean = run_command([*base_command, "clean"])
    if warmup_clean.returncode != 0:
        sys.stderr.write(warmup_clean.stdout + warmup_clean.stderr)
@@ -184,7 +202,7 @@ def main() -> int:
        sys.stderr.write(warmup_rebuild.stdout + warmup_rebuild.stderr)
        return warmup_rebuild.returncode

    runs = {"clean": [], "incremental": []}
    runs: Dict[str, list] = {"clean": [], "incremental": []}

    for index in range(1, args.repeats + 1):
        clean_result = run_command([*base_command, "clean"])
@@ -195,6 +213,38 @@ def main() -> int:
            return clean_result.returncode
        runs["clean"].append(measure_build(base_command, artifact_stem, output_dir, "clean", index))

    # --- Cached clean builds ---------------------------------------------------
    # When COMPILATION_CACHING is enabled, the compilation cache lives outside
    # DerivedData and survives product deletion. We measure "cached clean"
    # builds by pointing DerivedData at a temp directory, warming the cache with
    # one build, then deleting the DerivedData directory (but not the cache)
    # before each measured rebuild. This captures the realistic scenario:
    # branch switching, pulling changes, or Clean Build Folder.
    should_cached_clean = not args.no_cached_clean and detect_compilation_caching(base_command)
    if should_cached_clean:
        dd_path = Path(args.derived_data_path) if args.derived_data_path else Path(
            tempfile.mkdtemp(prefix="xcode-bench-dd-")
        )
        cached_cmd = list(base_command)
        if not args.derived_data_path:
            cached_cmd.extend(["-derivedDataPath", str(dd_path)])

        cache_warmup = run_command([*cached_cmd, "build"])
        if cache_warmup.returncode != 0:
            sys.stderr.write("Warning: cached clean warmup build failed, skipping cached clean benchmarks.\n")
            sys.stderr.write(cache_warmup.stdout + cache_warmup.stderr)
            should_cached_clean = False

        if should_cached_clean:
            runs["cached_clean"] = []
            for index in range(1, args.repeats + 1):
                shutil.rmtree(dd_path, ignore_errors=True)
                runs["cached_clean"].append(
                    measure_build(cached_cmd, artifact_stem, output_dir, "cached-clean", index)
                )
            shutil.rmtree(dd_path, ignore_errors=True)

    # --- Incremental / zero-change builds --------------------------------------
    incremental_label = "incremental"
    if args.touch_file:
        touch_path = Path(args.touch_file)
@@ -212,8 +262,15 @@ def main() -> int:
            measure_build(base_command, artifact_stem, output_dir, incremental_label, index)
        )

    summary: Dict[str, object] = {
        "clean": stats_for(runs["clean"]),
        "incremental": stats_for(runs["incremental"]),
    }
    if "cached_clean" in runs:
        summary["cached_clean"] = stats_for(runs["cached_clean"])

    artifact = {
        "schema_version": "1.1.0",
        "schema_version": "1.2.0" if "cached_clean" in runs else "1.1.0",
        "created_at": datetime.now(timezone.utc).isoformat(),
        "build": {
            "entrypoint": "workspace" if args.workspace else "project",
@@ -231,10 +288,7 @@ def main() -> int:
            "cwd": os.getcwd(),
        },
        "runs": runs,
        "summary": {
            "clean": stats_for(runs["clean"]),
            "incremental": stats_for(runs["incremental"]),
        },
        "summary": summary,
        "notes": [f"touch-file: {args.touch_file}"] if args.touch_file else [],
    }

@@ -243,6 +297,8 @@ def main() -> int:

    print(f"Saved benchmark artifact: {artifact_path}")
    print(f"Clean median: {artifact['summary']['clean']['median_seconds']}s")
    if "cached_clean" in artifact["summary"]:
        print(f"Cached clean median: {artifact['summary']['cached_clean']['median_seconds']}s")
    inc_label = "Incremental" if args.touch_file else "Zero-change"
    print(f"{inc_label} median: {artifact['summary']['incremental']['median_seconds']}s")
    return 0
47 changes: 35 additions & 12 deletions scripts/generate_optimization_report.py
@@ -268,18 +268,40 @@ def _section_context(benchmark: Dict[str, Any]) -> str:
def _section_baseline(benchmark: Dict[str, Any]) -> str:
    summary = benchmark.get("summary", {})
    clean = summary.get("clean", {})
    cached_clean = summary.get("cached_clean", {})
    incremental = summary.get("incremental", {})
    lines = [
        "## Baseline Benchmarks\n",
        f"| Metric | Clean | Incremental |",
        f"|--------|-------|-------------|",
        f"| Median | {clean.get('median_seconds', 0):.3f}s | {incremental.get('median_seconds', 0):.3f}s |",
        f"| Min | {clean.get('min_seconds', 0):.3f}s | {incremental.get('min_seconds', 0):.3f}s |",
        f"| Max | {clean.get('max_seconds', 0):.3f}s | {incremental.get('max_seconds', 0):.3f}s |",
        f"| Runs | {clean.get('count', 0)} | {incremental.get('count', 0)} |",
    ]

    for build_type in ("clean", "incremental"):
    has_cached = bool(cached_clean and cached_clean.get("count", 0) > 0)

    if has_cached:
        lines = [
            "## Baseline Benchmarks\n",
            "| Metric | Clean | Cached Clean | Incremental |",
            "|--------|-------|-------------|-------------|",
            f"| Median | {clean.get('median_seconds', 0):.3f}s | {cached_clean.get('median_seconds', 0):.3f}s | {incremental.get('median_seconds', 0):.3f}s |",
            f"| Min | {clean.get('min_seconds', 0):.3f}s | {cached_clean.get('min_seconds', 0):.3f}s | {incremental.get('min_seconds', 0):.3f}s |",
            f"| Max | {clean.get('max_seconds', 0):.3f}s | {cached_clean.get('max_seconds', 0):.3f}s | {incremental.get('max_seconds', 0):.3f}s |",
            f"| Runs | {clean.get('count', 0)} | {cached_clean.get('count', 0)} | {incremental.get('count', 0)} |",
        ]
        lines.append(
            "\n> **Cached Clean** = clean build with a warm compilation cache. "
            "This is the realistic scenario for branch switching, pulling changes, or "
            "Clean Build Folder. The compilation cache lives outside DerivedData and "
            "survives product deletion.\n"
        )
    else:
        lines = [
            "## Baseline Benchmarks\n",
            "| Metric | Clean | Incremental |",
            "|--------|-------|-------------|",
            f"| Median | {clean.get('median_seconds', 0):.3f}s | {incremental.get('median_seconds', 0):.3f}s |",
            f"| Min | {clean.get('min_seconds', 0):.3f}s | {incremental.get('min_seconds', 0):.3f}s |",
            f"| Max | {clean.get('max_seconds', 0):.3f}s | {incremental.get('max_seconds', 0):.3f}s |",
            f"| Runs | {clean.get('count', 0)} | {incremental.get('count', 0)} |",
        ]

    build_types = ["clean", "cached_clean", "incremental"] if has_cached else ["clean", "incremental"]
    label_map = {"clean": "Clean", "cached_clean": "Cached Clean", "incremental": "Incremental"}
    for build_type in build_types:
        runs = benchmark.get("runs", {}).get(build_type, [])
        all_cats: Dict[str, Dict] = {}
        for run in runs:
@@ -292,7 +314,8 @@ def _section_baseline(benchmark: Dict[str, Any]) -> str:
        if all_cats:
            count = len(runs) or 1
            ranked = sorted(all_cats.items(), key=lambda x: x[1]["seconds"], reverse=True)
            lines.append(f"\n### {build_type.title()} Build Timing Summary\n")
            label = label_map.get(build_type, build_type.title())
            lines.append(f"\n### {label} Build Timing Summary\n")
            lines.append(
                "> **Note:** These are aggregated task times across all CPU cores. "
                "Because Xcode runs many tasks in parallel, these totals typically exceed "
10 changes: 6 additions & 4 deletions skills/xcode-build-benchmark/SKILL.md
@@ -37,10 +37,11 @@ When benchmarking inside a git worktree, SPM packages with `exclude:` paths that
1. Normalize the build command and note every flag that affects caching or module reuse.
2. Run one warm-up build if needed to validate that the command succeeds.
3. Run 3 clean builds.
4. Run 3 zero-change builds (build immediately after a successful build with no edits). This measures the fixed overhead floor: dependency computation, project description transfer, build description creation, script phases, codesigning, and validation. A zero-change build that takes more than a few seconds indicates avoidable per-build overhead. Use the default `benchmark_builds.py` invocation (no `--touch-file` flag).
5. Optionally run 3 incremental builds with a file touch to measure a real edit-rebuild loop. Use `--touch-file path/to/SomeFile.swift` to touch a representative source file before each build.
6. Save the raw results and summary into `.build-benchmark/`.
7. Report medians and spread, not just the single fastest run.
4. If `COMPILATION_CACHING = YES` is detected, run 3 cached clean builds. These measure clean build time with a warm compilation cache -- the realistic scenario for branch switching, pulling changes, or Clean Build Folder. The script handles this automatically by building once to warm the cache, then deleting DerivedData (but not the compilation cache) before each measured run. Pass `--no-cached-clean` to skip.
5. Run 3 zero-change builds (build immediately after a successful build with no edits). This measures the fixed overhead floor: dependency computation, project description transfer, build description creation, script phases, codesigning, and validation. A zero-change build that takes more than a few seconds indicates avoidable per-build overhead. Use the default `benchmark_builds.py` invocation (no `--touch-file` flag).
6. Optionally run 3 incremental builds with a file touch to measure a real edit-rebuild loop. Use `--touch-file path/to/SomeFile.swift` to touch a representative source file before each build.
7. Save the raw results and summary into `.build-benchmark/`.
8. Report medians and spread, not just the single fastest run.
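
The ordering of the measured phases can be sketched as a pure function. `benchmark_phases` and its phase labels are hypothetical, and the sketch assumes a single invocation measures either zero-change builds or touch-file incremental builds, depending on whether a touch file is given.

```python
from typing import List, Optional


def benchmark_phases(
    caching_detected: bool,
    no_cached_clean: bool = False,
    touch_file: Optional[str] = None,
    repeats: int = 3,
) -> List[str]:
    """Return the phases in the order they would run for one invocation."""
    phases = ["warm-up"]
    phases += ["clean"] * repeats
    # Cached clean runs only when the setting is detected and not opted out.
    if caching_detected and not no_cached_clean:
        phases += ["cached-clean"] * repeats
    # Without a touch file the last phase measures the zero-change floor.
    phases += ["incremental" if touch_file else "zero-change"] * repeats
    return phases


default_plan = benchmark_phases(caching_detected=True)
skipped_plan = benchmark_phases(caching_detected=True, no_cached_clean=True)
```

With the defaults, `default_plan` holds one warm-up entry followed by three entries each for clean, cached clean, and zero-change builds.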

## Preferred Command Path

@@ -62,6 +63,7 @@ If you cannot use the helper script, run equivalent `xcodebuild` commands with `
Return:

- clean build median, min, max
- cached clean build median, min, max (when `COMPILATION_CACHING` is enabled)
- zero-change build median, min, max (fixed overhead floor)
- incremental build median, min, max (if `--touch-file` was used)
- biggest timing-summary categories
2 changes: 1 addition & 1 deletion skills/xcode-build-fixer/SKILL.md
@@ -122,7 +122,7 @@ If a fix produced no measurable wall-time improvement, note `No measurable wall-

For changes valuable for non-benchmark reasons (deterministic package resolution, branch-switch caching), label them: "No wait-time improvement expected from this change. The benefit is [deterministic builds / faster branch switching / reduced CI cost]."

Note: `COMPILATION_CACHING` improvements cannot be captured by the standard clean-build benchmark because `xcodebuild clean` invalidates the cache between runs. When reporting on this setting, note that the benefit is real but requires a different measurement approach (e.g., branch-switch benchmarks or repeat builds without cleaning). Recommend keeping the setting enabled based on documented benefit rather than requiring a delta from the benchmark.
Note: `COMPILATION_CACHING` improvements are captured by the **cached clean** benchmark phase, which the benchmark script runs automatically when it detects the setting. Cached clean builds measure clean build time with a warm compilation cache -- the realistic scenario for branch switching and pulling changes. Standard clean builds may show overhead from cache population; use the cached clean metric as the primary comparison for this setting.
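
A minimal sketch of that comparison, assuming two summary dictionaries shaped like the artifact's `summary` object. The helper names and sample numbers are hypothetical, and falling back to the clean median when a baseline has no `cached_clean` entry is an assumption of this sketch, not documented behavior.

```python
from typing import Any, Dict


def median_for(summary: Dict[str, Any], preferred: str, fallback: str = "clean") -> float:
    """Return the preferred median, falling back when that phase did not run."""
    stats = summary.get(preferred) or summary[fallback]
    return stats["median_seconds"]


def percent_change(before_seconds: float, after_seconds: float) -> float:
    """Negative values mean the build got faster."""
    return (after_seconds - before_seconds) / before_seconds * 100.0


# Hypothetical summaries: the baseline was measured before enabling the
# setting, so it has no cached_clean entry.
before = {"clean": {"median_seconds": 90.0}}
after = {
    "clean": {"median_seconds": 96.0},
    "cached_clean": {"median_seconds": 30.0},
}

cached_delta = percent_change(
    median_for(before, "cached_clean"), median_for(after, "cached_clean")
)
clean_delta = percent_change(median_for(before, "clean"), median_for(after, "clean"))
```

In this made-up data the standard clean median rises slightly from cache population while the cached clean median drops sharply, which is why the two figures are reported separately.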

## Escalation
