
Incorporate learnings from Daily macOS (ObjC + CocoaPods) test case #9

Merged
AvdLee merged 6 commits into main from learnings-daily-macos-cocoapods
Mar 25, 2026

Conversation


@AvdLee AvdLee commented Mar 24, 2026

Summary

A community member ran the optimization skills on an older Objective-C + CocoaPods macOS project (Daily macOS). The agent improvised CocoaPods-specific changes not covered by any skill reference, overestimated how actionable some findings were, and the cumulative optimization pass regressed builds by 14%. This PR addresses the root causes identified from that test case.

  • Actionability classification: Every recommendation now requires an actionability field (repo-local, package-manager, xcode-behavior, upstream) so the agent must classify how fixable a finding is before presenting it
  • CocoaPods deprecation: Skills now detect CocoaPods and recommend migrating to SPM instead of attempting CocoaPods-specific optimizations (which are unreliable and frequently regress)
  • Regression-aware revert guidance: Best-practice build settings (compilation caching, eager linking, etc.) are never revert candidates; speculative changes are only reverted when all build types regress; compilation caching trade-offs are evaluated holistically
  • Structured execution report: The fixer now produces a per-change report tracking what was attempted, measured, and kept/reverted/blocked
  • Benchmark confidence: High-variance benchmarks are flagged with a recommendation for additional repetitions before drawing conclusions
  • ExtractAppIntentsMetadata: Clarified as Xcode-driven behavior that cannot be suppressed from repo-local settings
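
For illustration, a recommendation carrying the new field might serialize like this (all values are invented; the authoritative field list and JSON example live in references/recommendation-format.md):

```json
{
  "title": "Enable eager linking",
  "wait_time_impact": "Expected to reduce your incremental build by ~2s",
  "actionability": "repo-local",
  "category": "build-settings",
  "observed_evidence": "Linking is serialized behind compilation in the timing summary",
  "estimated_impact": "~2s per incremental build",
  "confidence": "medium",
  "approval_required": true,
  "benchmark_verification_status": "pending"
}
```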

Test plan

  • Verify references/recommendation-format.md includes the new actionability field in required fields, JSON example, and rendering order
  • Verify OPTIMIZATION-CHECKS.md has CocoaPods deprecation section and updated ExtractAppIntentsMetadata description
  • Verify skills/xcode-build-fixer/SKILL.md has Evaluating Regressions section (with best-practice settings list) and Execution Report section
  • Verify skills/xcode-build-orchestrator/SKILL.md has benchmark confidence check (step 3b) and expanded noise guidance in Final Report
  • Verify skills/xcode-project-analyzer/SKILL.md and its references/project-audit-checks.md have CocoaPods detection and ExtractAppIntentsMetadata updates
  • Verify orchestration-report-template.md has the new Execution Report section replacing the old Verification section

AvdLee and others added 2 commits March 24, 2026 11:50
A community member ran the optimization skills on an older ObjC + CocoaPods
project. The agent improvised CocoaPods-specific changes not covered by any
reference material, overestimated how actionable some findings were, and
produced a 14% regression. This commit addresses the root causes:

- Add actionability classification (repo-local / package-manager /
  xcode-behavior / upstream) to the recommendation format so every finding
  states how fixable it is from the repo
- Add CocoaPods deprecation notice: do not attempt CocoaPods-specific
  optimizations; recommend migrating to SPM instead
- Add regression-aware revert guidance: best-practice settings are never
  revert candidates; speculative changes are only reverted when all build
  types regress; compilation caching trade-offs are evaluated holistically
- Add structured execution report to the fixer and orchestrator template
  so each change is tracked with its measured result and kept/reverted status
- Add benchmark confidence checks: flag high-variance runs and recommend
  additional repetitions before drawing conclusions
- Clarify ExtractAppIntentsMetadata is Xcode-driven and not suppressible
  from repo-local settings
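
The revert rules above can be sketched as follows (a minimal illustration; the record shape and field names are hypothetical, not taken from the skill files):

```python
# Hypothetical sketch of the regression-aware revert rules. The record
# shape ("best_practice") and delta convention are invented for
# illustration; positive delta = build got slower vs. baseline.

def should_revert(change: dict, deltas: dict[str, float]) -> bool:
    """Return True when an applied change should be reverted."""
    # Best-practice settings (compilation caching, eager linking, ...)
    # are never revert candidates.
    if change.get("best_practice"):
        return False
    # Speculative changes are reverted only when ALL build types regress.
    return all(delta > 0 for delta in deltas.values())

# A speculative change that slows clean builds but speeds up incremental
# builds is kept, because not every build type regressed:
revert = should_revert({"best_practice": False},
                       {"clean": 4.0, "incremental": -6.0})  # False
```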

Copilot AI left a comment


Pull request overview

Updates the Xcode build optimization Agent Skills documentation to incorporate lessons from a CocoaPods-based macOS case study, aiming to improve recommendation actionability, avoid unreliable CocoaPods “optimizations,” and strengthen regression/noise handling and reporting structure.

Changes:

  • Add an actionability classification to the shared recommendation format and clarify how to classify Xcode-driven costs (e.g., ExtractAppIntentsMetadata).
  • Add CocoaPods detection guidance that recommends SPM migration and explicitly avoids CocoaPods-specific tuning.
  • Introduce an “Execution Report” structure and add benchmark variance (“confidence”) guidance for interpreting noisy results and regressions.
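
The CocoaPods detection mentioned above could be as simple as checking for marker files (a sketch under the assumption that a Podfile or lockfile indicates CocoaPods; the skills do not prescribe an implementation):

```python
from pathlib import Path

# Marker files that indicate a project is built with CocoaPods.
# Per the updated guidance, a positive detection should lead to an
# SPM-migration recommendation, not CocoaPods-specific tuning.
COCOAPODS_MARKERS = ["Podfile", "Podfile.lock", "Pods/Manifest.lock"]

def uses_cocoapods(repo_root: str) -> bool:
    root = Path(repo_root)
    return any((root / marker).exists() for marker in COCOAPODS_MARKERS)
```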

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Summary per file:

  • skills/xcode-project-analyzer/references/project-audit-checks.md -- Clarifies ExtractAppIntentsMetadata as xcode-behavior and adds CocoaPods guidance (avoid CocoaPods tuning; prefer SPM migration).
  • skills/xcode-project-analyzer/SKILL.md -- Adds CocoaPods usage to the “What To Review” checklist.
  • skills/xcode-build-orchestrator/references/orchestration-report-template.md -- Replaces “Verification” with an “Execution Report” section and a per-change tracking table.
  • skills/xcode-build-orchestrator/SKILL.md -- Adds a benchmark variance check and strengthens noise/confidence guidance in the final report section.
  • skills/xcode-build-fixer/SKILL.md -- Adds regression evaluation rules (best-practice settings not revert candidates) and an execution report structure.
  • references/recommendation-format.md -- Makes actionability required and defines allowed values and rendering order.
  • README.md -- Updates the documented repository/skill tree structure block.
  • OPTIMIZATION-CHECKS.md -- Adds CocoaPods guidance and clarifies ExtractAppIntentsMetadata actionability as xcode-behavior.

README.md Outdated
Comment on lines +219 to +224
```text
skills/
  xcode-build-benchmark/
    SKILL.md
    references/
      benchmarking-workflow.md
  xcode-compilation-analyzer/
    SKILL.md
    references/
      code-compilation-checks.md
  xcode-project-analyzer/
    SKILL.md
    references/
      project-audit-checks.md
  spm-build-analysis/
    SKILL.md
    references/
      spm-analysis-checks.md
  xcode-build-orchestrator/
    SKILL.md
    references/
      orchestration-report-template.md
  xcode-build-fixer/
    SKILL.md
    references/
      fix-patterns.md
xcode-build-optimization-agent-skill/
  .claude-plugin/
    marketplace.json
    plugin.json
  references/
```

Copilot AI Mar 24, 2026


The skill tree example hard-codes the repo root as xcode-build-optimization-agent-skill/, but the checkout directory name is user-controlled (and the README itself is already at repo root). To keep the structure example universally correct, consider using `./` or a generic `<repo-root>/` placeholder instead of a specific folder name.

Comment on lines 7 to 17
Each recommendation should include:

- `title`
- `wait_time_impact` -- plain-language statement of expected wall-clock impact, e.g. "Expected to reduce your clean build by ~3s", "Reduces parallel compile work but unlikely to reduce build wait time", or "Impact on wait time is uncertain -- re-benchmark to confirm"
- `actionability` -- classifies how fixable the issue is from the project (see values below)
- `category`
- `observed_evidence`
- `estimated_impact`
- `confidence`
- `approval_required`
- `benchmark_verification_status`

Copilot AI Mar 24, 2026


With actionability now required (and wait_time_impact already required), the repo’s Markdown renderers/generators should be updated to emit and render these fields in the specified order. For example, scripts/render_recommendations.py currently omits both wait_time_impact and actionability, and scripts/generate_optimization_report.py doesn’t handle actionability either, so the produced reports won’t match this documented contract.
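
If the renderers are updated, the core of the fix might look like this (a hypothetical sketch, not the actual scripts/render_recommendations.py code):

```python
# Hypothetical sketch of rendering recommendations in the documented
# field order; the real scripts in this repo may differ substantially.

FIELD_ORDER = [
    "title",
    "wait_time_impact",
    "actionability",
    "observed_evidence",
    "estimated_impact",
    "confidence",
    "approval_required",
    "benchmark_verification_status",
]

def render_recommendation(rec: dict) -> str:
    """Render one recommendation as Markdown bullets, preserving the
    field order required by references/recommendation-format.md."""
    return "\n".join(
        f"- **{field}**: {rec[field]}" for field in FIELD_ORDER if field in rec
    )
```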

Comment on lines +93 to +99
## Execution Report (post-approval)

### Baseline
- Clean build median: X.Xs
- Cached clean build median: X.Xs (if applicable)
- Incremental build median: X.Xs


Copilot AI Mar 24, 2026


This template states generate_optimization_report.py produces this format automatically, but the script currently doesn’t emit any post-approval “Execution Report” placeholder/section. If this section is now expected in the generated plan, the generator should be updated to include it (even as an empty placeholder) so Phase 2 can append results without manual restructuring.

1. Collect the build target context: workspace or project, scheme, configuration, destination, and current pain point. When both `.xcworkspace` and `.xcodeproj` exist, prefer `.xcodeproj` unless the workspace contains sub-projects required for the build. Workspaces that reference external projects may fail if those projects are not checked out.
2. Run `xcode-build-benchmark` to establish a baseline if no fresh benchmark exists. The benchmark script auto-detects `COMPILATION_CACHING = YES` and includes cached clean builds that measure the realistic developer experience (warm cache). If the build fails to compile, check `git log` for a recent buildable commit. When working in a worktree, cherry-picking a targeted build fix from a feature branch is acceptable to reach a buildable state. If SPM packages reference gitignored directories in their `exclude:` paths (e.g., `__Snapshots__`), create those directories before building -- worktrees do not contain gitignored content and `xcodebuild -resolvePackageDependencies` will crash otherwise.
3. Verify the benchmark artifact has non-empty `timing_summary_categories`. If empty, the timing summary parser may have failed -- re-parse the raw logs or inspect them manually. If `COMPILATION_CACHING` is enabled, also verify the artifact includes `cached_clean` runs.
3b. **Benchmark confidence check**: For each build type (clean, cached clean, incremental), compare the min and max values. If the spread (max - min) exceeds 20% of the median, flag the benchmark as having high variance and recommend running additional repetitions (5+ runs) before drawing conclusions. High variance makes it difficult to distinguish real improvements from noise. After applying changes, only claim an improvement if the post-change median falls outside the baseline's min-max range.
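
The step-3b thresholds can be sketched as follows (assuming per-build-type timing samples in seconds; a minimal illustration, not part of the skill files):

```python
import statistics

def high_variance(samples: list[float], threshold: float = 0.20) -> bool:
    """Flag a benchmark as high-variance when the min-max spread
    exceeds 20% of the median (the step-3b rule)."""
    median = statistics.median(samples)
    return (max(samples) - min(samples)) > threshold * median

def improvement_confirmed(baseline: list[float], post_median: float) -> bool:
    """Only claim an improvement when the post-change median falls
    below the baseline's min-max range (outside it, on the fast side)."""
    return post_median < min(baseline)

# Clean-build samples of 25s/27s/31s: spread 6.0s > 0.2 * 27.0s, so the
# run is flagged and additional repetitions are recommended.
noisy = high_variance([25.0, 27.0, 31.0])  # True
```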

Copilot AI Mar 24, 2026


3b. is not valid Markdown ordered-list syntax, so it likely won’t render as part of the numbered workflow (and may break copy/paste of steps). Consider keeping step 3 as an ordered-list item and making this a nested sub-bullet (e.g., 3.a / 3.b as plain text within a nested list) or renumbering as step 4 with the subsequent steps shifted.

Suggested change
3b. **Benchmark confidence check**: For each build type (clean, cached clean, incremental), compare the min and max values. If the spread (max - min) exceeds 20% of the median, flag the benchmark as having high variance and recommend running additional repetitions (5+ runs) before drawing conclusions. High variance makes it difficult to distinguish real improvements from noise. After applying changes, only claim an improvement if the post-change median falls outside the baseline's min-max range.
- **Benchmark confidence check**: For each build type (clean, cached clean, incremental), compare the min and max values. If the spread (max - min) exceeds 20% of the median, flag the benchmark as having high variance and recommend running additional repetitions (5+ runs) before drawing conclusions. High variance makes it difficult to distinguish real improvements from noise. After applying changes, only claim an improvement if the post-change median falls outside the baseline's min-max range.

@@ -26,6 +26,7 @@ Use this skill for project- and target-level build inefficiencies that are unlik
- asset catalog compilation time, especially in targets with large or numerous catalogs
- `ExtractAppIntentsMetadata` time in the Build Timing Summary -- if the project does not use App Intents but this phase consumes significant time, flag it as unnecessary overhead

Copilot AI Mar 24, 2026


The guidance for ExtractAppIntentsMetadata here says to “flag it as unnecessary overhead”, but the updated audit checklist now clarifies it’s typically not cleanly suppressible and should be classified as xcode-behavior (report cost, don’t promise a repo-local fix). This section should be aligned so the skill doesn’t imply there’s an actionable repo-local optimization.

Suggested change
- `ExtractAppIntentsMetadata` time in the Build Timing Summary -- if the project does not use App Intents but this phase consumes significant time, flag it as unnecessary overhead
- `ExtractAppIntentsMetadata` time in the Build Timing Summary -- if this phase consumes significant time, record it as fixed-cost `xcode-behavior` (report the cost and impact, but do not suggest a repo-local optimization unless there is explicit Apple guidance)

Comment on lines 64 to +73
When rendering for human review, preserve the same field order:

 1. title
 2. wait-time impact
-3. observed evidence
-4. estimated impact
-5. confidence
-6. approval required
-7. benchmark verification status
+3. actionability
+4. observed evidence
+5. estimated impact
+6. confidence
+7. approval required
+8. benchmark verification status

Copilot AI Mar 24, 2026


In the “Markdown Rendering Guidance” ordered list, the field is named wait_time_impact earlier, but the list item says “wait-time impact”. For consistency and to avoid implementers mistaking this for a different field name, consider rendering it as wait_time_impact (and likewise use backticks for the other field names in this list).

AvdLee and others added 4 commits March 24, 2026 12:00
- Fix 3b. to nested sub-bullet for valid Markdown list syntax
- Align ExtractAppIntentsMetadata wording in xcode-project-analyzer to
  match the xcode-behavior classification from the audit checklist
- Add actionability field handling to generate_optimization_report.py
  in both recommendation rendering and approval checklist

The sync-readme CI job expanded the tree to include the entire repo.
Revert to the clean skills-only view matching main.

The script was hard-coding the entire repo structure (references, schemas,
scripts, .claude-plugin) into the README skill tree. Simplified to only
output the skills/ directory, matching the intended structure.
@AvdLee AvdLee merged commit ec90d3d into main Mar 25, 2026
1 check passed
@AvdLee AvdLee deleted the learnings-daily-macos-cocoapods branch March 25, 2026 10:32