Phase out hardening flags hurting startup; add three perf wins#2
Merged
Sam Gammon (sgammon) merged 6 commits intomainfrom May 6, 2026
Merged
Conversation
d7c695c to
19e70b2
Compare
There was a problem hiding this comment.
Pull request overview
This PR restructures the flag profiles to separate “compile-time” flags from “final-link-only” flags, while temporarily parking several startup-costly hardening options in a tracked labs.disabled.txt file and adding a few startup/size optimizations to the live Linux/base profiles.
Changes:
- Introduces
*-bin“final-link-only” profiles (linux-bin.txt,darwin-bin.txt, plus per-arch placeholders) and updates the resolver docs accordingly. - Parks selected hardening flags in
labs.disabled.txtand updates README to document the phased re-enable plan. - Adds/adjusts optimization and ELF/Mach-O related flags and adds two equivalent CLI resolvers (
cli/cflags.shandcli/cflags.ts).
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| README.md | Expands documentation: profile taxonomy, deployment floor, feature matrix, and phased hardening notes. |
| base.txt | Updates layering description; adds -fmerge-all-constants; centralizes a clang-compatible warning set; keeps unwind settings. |
| linux.txt | Adds ELF performance/hardening-related compile/link flags and clarifies final-link-only flag placement. |
| linux-bin.txt | New: final-link-only Linux flags (gc-sections, z defs, debug compression, fatal warnings). |
| linux-amd64.txt | New: amd64 ISA baseline/tuning plus retained BB section emission for Propeller readiness. |
| linux-amd64-bin.txt | New: placeholder for amd64 final-link-only flags. |
| linux-arm64.txt | New: arm64 ISA baseline and BTI/PAC-related link flags. |
| linux-arm64-bin.txt | New: placeholder for arm64 final-link-only flags. |
| darwin.txt | New: macOS compile/link flags (lld + chained-fixups-related options). |
| darwin-bin.txt | New: final-link-only Mach-O dead-strip settings. |
| darwin-amd64.txt | New: Intel macOS ISA baseline and rationale for excluding CET. |
| darwin-amd64-bin.txt | New: placeholder for amd64 Darwin final-link-only flags. |
| darwin-arm64.txt | New: Apple Silicon baseline and branch protection settings. |
| darwin-arm64-bin.txt | New: placeholder for arm64 Darwin final-link-only flags. |
| labs.disabled.txt | New: tracked “parking lot” for temporarily disabled hardening flags and re-enable notes. |
| cli/cflags.ts | New: Bun/TypeScript resolver for closed-set profile rollup (+ --bin support). |
| cli/cflags.sh | New: Bash resolver for closed-set profile rollup (+ --bin support). |
Comments suppressed due to low confidence (1)
linux.txt:38
- PR description lists three newly added flags, but this change also introduces
-fno-semantic-interposition(not mentioned in the PR description's "Added" section). Please either update the PR description/test plan to include it, or clarify why this additional behavior change is intended.
# Skip the dynamic-symbol interposition contract for our own definitions:
# under ELF the linker assumes any global symbol in a shared library can
# be replaced via LD_PRELOAD, which forces every intra-library call through
# the GOT/PLT and disables a wide range of inlining + IPO. Disabling it
# lets clang resolve calls and constants directly inside the same DSO.
# Combined with -fno-plt below, the call-site overhead approaches a static
# build's. Linux/ELF only — Mach-O has different visibility semantics and
# clang ignores this flag there.
-fno-semantic-interposition
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Sam Gammon (sgammon)
pushed a commit
that referenced
this pull request
May 6, 2026
CI: - New tests/probe.c — minimal C TU exercising libc calls, function- pointer dispatch, a small loop, and stack-allocated string formatting. Designed to stay warning-clean under the live -Wall -Wextra -Wpedantic -Wformat=2 -Wstrict-prototypes -Wshadow -Wundef set so any codegen regression breaks CI early. - New .github/workflows/ci.yml — matrix probe on three GitHub-hosted runners (ubuntu-24.04 / ubuntu-24.04-arm / macos-14). For each target it installs clang-22 + lld-22 (apt.llvm.org on Linux, Homebrew on macOS), asserts cli/cflags.sh and cli/cflags.ts emit byte-identical output for the compile and binary profiles, then compiles tests/probe.c with the compile profile (-c) and compile+links+runs it with the binary profile. - README CI section documents coverage and notes that darwin-amd64 is exercised locally pre-tag because GitHub no longer offers an Intel-Mac runner. Copilot review on PR #2 (3 valid comments addressed; 1 false positive skipped): - darwin.txt + linux.txt + README.md: the parked libc++ flag was written as the shorthand -D_LIBCPP_HARDENING_MODE_FAST in three comment blocks, but the real flag in labs.disabled.txt is -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_FAST. Re-enabling via the shorthand is a silent no-op (libc++ checks the value of the macro, not its mere definition). All three call sites now use the exact spelling, and the README's parked-flag list is reformatted as a bullet list with -U_FORTIFY_SOURCE included for completeness. - linux.txt:45 (false positive): Copilot flagged -fno-plt as a newly-introduced undocumented flag. -fno-plt was added in PR #1 (commit 6d0e14c) and is not in this PR's diff; Copilot compared the PR description's "Added" section against the file content rather than against the actual diff. Skipped.
We're prioritising startup time over defence-in-depth for now. Six
hardening flags move to a new tracked, non-consumed parking-lot file
labs.disabled.txt; three pure perf/size wins land in the live profiles.
Parked in labs.disabled.txt (each with its own re-enable criteria):
base.txt:
-fstack-protector-strong (canary prologue/epilogue)
-fzero-call-used-regs=used-gpr (per-return reg-clearing)
linux.txt:
-U_FORTIFY_SOURCE
-D_FORTIFY_SOURCE=2 (per-call libc bounds checks)
-D_LIBCPP_HARDENING_MODE_FAST (per-call libc++ bounds checks)
darwin.txt:
-U_FORTIFY_SOURCE
-D_FORTIFY_SOURCE=2
-D_LIBCPP_HARDENING_MODE_FAST
linux-amd64.txt:
-fcf-protection=full (ENDBR64 on every indirect target)
labs.disabled.txt is not loaded by the rollup or by cli/cflags.{sh,ts}
— the resolver only matches a closed set of {base, $os, $os-$arch,
$os-bin, $os-$arch-bin}.txt. Verified with a smoke test.
Kept on:
-fasynchronous-unwind-tables (better profiling > table size)
-fbasic-block-sections=all (linux-amd64) (Propeller is on the roadmap;
keeping the per-BB sections
avoids a rebuild-the-world
once we wire in the symbol-
ordering file)
Added (small, free startup wins):
base.txt:
-fmerge-all-constants (dedupe equal string/numeric
constant pool entries → smaller
binary, fewer relocs at load;
mild ISO violation we don't
depend on)
linux.txt:
-Wl,--hash-style=gnu (drop legacy sysv-hash; ~2x
faster dyn-symbol lookup at
load + smaller .dynsym)
-Wl,-O3 (replaces -Wl,-O2) (more aggressive linker-side
string/section merging; pure
link-time cost)
README features matrix updated; new "Phasing in" subsection lists the
parked flags and points at labs.disabled.txt. cli/cflags.{sh,ts}
output verified byte-identical across all eight (os × arch × mode)
combinations after the change.
CI: - New tests/probe.c — minimal C TU exercising libc calls, function- pointer dispatch, a small loop, and stack-allocated string formatting. Designed to stay warning-clean under the live -Wall -Wextra -Wpedantic -Wformat=2 -Wstrict-prototypes -Wshadow -Wundef set so any codegen regression breaks CI early. - New .github/workflows/ci.yml — matrix probe on three GitHub-hosted runners (ubuntu-24.04 / ubuntu-24.04-arm / macos-14). For each target it installs clang-22 + lld-22 (apt.llvm.org on Linux, Homebrew on macOS), asserts cli/cflags.sh and cli/cflags.ts emit byte-identical output for the compile and binary profiles, then compiles tests/probe.c with the compile profile (-c) and compile+links+runs it with the binary profile. - README CI section documents coverage and notes that darwin-amd64 is exercised locally pre-tag because GitHub no longer offers an Intel-Mac runner. Copilot review on PR #2 (3 valid comments addressed; 1 false positive skipped): - darwin.txt + linux.txt + README.md: the parked libc++ flag was written as the shorthand -D_LIBCPP_HARDENING_MODE_FAST in three comment blocks, but the real flag in labs.disabled.txt is -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_FAST. Re-enabling via the shorthand is a silent no-op (libc++ checks the value of the macro, not its mere definition). All three call sites now use the exact spelling, and the README's parked-flag list is reformatted as a bullet list with -U_FORTIFY_SOURCE included for completeness. - linux.txt:45 (false positive): Copilot flagged -fno-plt as a newly-introduced undocumented flag. -fno-plt was added in PR #1 (commit 6d0e14c) and is not in this PR's diff; Copilot compared the PR description's "Added" section against the file content rather than against the actual diff. Skipped.
The first run of ci.yml passed on linux-amd64 but failed in ~30s on linux-arm64 and darwin-arm64, both during the toolchain-install step. I don't have direct access to the GHA logs from this session, so this commit makes the install path more robust along the most likely failure modes: - Linux: replace the hand-rolled gpg + apt source setup with apt.llvm.org's llvm.sh, which is the canonical installer and handles distro/codename detection across noble/noble-arm64. Drop --update-alternatives in favour of explicit symlinks in /usr/local/bin (which precedes /usr/bin in PATH on ubuntu-24.04), so `clang` and `ld.lld` resolve to the v22 binaries even if the base image already has an unversioned `clang` from a different source. - Darwin: keep the Homebrew install but capture brew --prefix llvm into a CC env var passed via $GITHUB_ENV so subsequent steps invoke that clang explicitly instead of relying on PATH order. Also prepend $(brew --prefix llvm)/bin so -fuse-ld=lld finds ld64.lld. - New "Toolchain sanity" step prints `which clang`, `clang --version`, and `ld.lld` / `ld64.lld --version` so any future failure surfaces the actual install state in the log. Compile / link steps now invoke "$CC" explicitly rather than relying on whichever `clang` is first in PATH.
eb1daec to
aba1744
Compare
…tions
Two unrelated CI failures were chasing different root causes:
linux-arm64
ld.lld errored hard at link time:
ld.lld: error: /lib/aarch64-linux-gnu/Scrt1.o: -z force-bti:
file does not have GNU_PROPERTY_AARCH64_FEATURE_1_BTI property
…and the same for crti.o / crtn.o under -z pac-plt. lld treats
these two -z flags as strict — every input object must carry the
matching GNU property note or the link aborts. Stock Ubuntu noble
glibc CRT objects ship without those notes, so the link can never
succeed on a vanilla distro toolchain.
Drop -Wl,-z,force-bti and -Wl,-z,pac-plt from linux-arm64.txt.
Keep -mbranch-protection=standard so our own .o files still emit
the PAC/BTI sequences and per-object property notes — when the
rest of the link surface (CRT, libc) catches up, we can re-enable
the linker-side enforcement. Comment in linux-arm64.txt records
the rationale and the verbatim error message so the next person
who sees it knows what changed and what would need to be true to
flip it back on.
darwin-arm64
clang: error: invalid linker name in argument '-fuse-ld=lld'
This message is Apple-clang's signature for "I don't know what
'lld' is" — meaning Apple's /usr/bin/clang got picked up instead
of the brew-installed mainline clang. Two defensive changes to
ci.yml so the next failure surfaces the actual install state:
- Install step now hard-asserts that $(brew --prefix llvm)/bin/clang
AND $(brew --prefix llvm)/bin/ld64.lld both exist. If brew shipped
llvm without ld64.lld (rare but possible on some images), the step
fails immediately with the missing path printed instead of letting
the failure cascade into a confusing clang error two steps later.
- Toolchain sanity step now invokes "$CC" instead of bare `clang`,
explicitly rejects an Apple-clang $CC, and uses
`--print-prog-name=ld.lld` / `--print-prog-name=ld64.lld` to show
the lld path the driver would actually pick under -fuse-ld=lld.
Homebrew's `llvm` formula on macOS does not ship lld — confirmed by listing /opt/homebrew/opt/llvm/bin on macos-14 (clang, MLIR tools, FileCheck, ~150 binaries, zero lld variants). On Mac, `lld` is a standalone Homebrew formula; the asymmetry vs Linux (where apt.llvm.org bundles them in the llvm-toolchain-N package set) is why my "install lld with the llvm formula" assumption broke darwin CI even though linux CI worked. Fix: `brew install llvm lld zstd` (add lld to the install list), then symlink $(brew --prefix lld)/bin/ld64.lld into $(brew --prefix llvm)/bin/ld64.lld so clang's `-fuse-ld=lld` driver finds it in its own bin dir (clang searches there before falling through to PATH). Hard-assert both binaries exist before continuing so the failure mode is "MISSING: <path>" not the cryptic "invalid linker name in argument '-fuse-ld=lld'" that surfaces two steps later when Apple's clang inadvertently gets picked up.
The toolchain-sanity guard that's supposed to reject Apple's clang
fired a false positive on macos-14: \$CC was correctly set to
/opt/homebrew/opt/llvm/bin/clang (Homebrew clang 22.1.4), but the
naive `grep -qiE 'apple|xcode'` matched on the target triple line:
Target: arm64-apple-darwin23.6.0
…which every clang on macOS prints, Homebrew or otherwise. Restrict
the match to the *version banner* (first line) and require the exact
"Apple clang" string Apple's binary uses there. Homebrew's banner is
"Homebrew clang version 22.1.4"; Xcode/Apple's is "Apple clang
version 15.x.y …". The discriminator is the leading word, not the
case-insensitive presence of "apple" anywhere in the multi-line
output.
Net: lld install succeeded, brew clang is the one running, and CI
should now get past the sanity step.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
We're prioritising startup time over defence-in-depth for now. Six hardening flags move to a tracked, non-consumed
labs.disabled.txtparking lot; three pure perf/size wins land in the live profiles.-fasynchronous-unwind-tables(profiling) and-fbasic-block-sections=all(Propeller-ready) are kept on per discussion.Parked in
labs.disabled.txt(each with its own re-enable criteria):base.txt:-fstack-protector-strong,-fzero-call-used-regs=used-gprlinux.txt+darwin.txt:-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=2,-D_LIBCPP_HARDENING_MODE_FASTlinux-amd64.txt:-fcf-protection=fulllabs.disabled.txtis not loaded by the rollup or bycli/cflags.{sh,ts}— the resolver matches on a closed set of{base, $os, $os-$arch, $os-bin, $os-$arch-bin}.txtand silently ignores everything else. It's a tracked, reviewable parking lot, not a profile. Re-enabling any flag is a literal cut-and-paste back into the named source file.Added (small, free startup wins):
base.txt:-fmerge-all-constants— dedupe equal string/numeric constant pool entries → smaller binary, fewer relocs at load. Mild ISO violation we don't depend on.linux.txt:-Wl,--hash-style=gnu— drop legacy sysv-hash; ~2× faster dyn-symbol lookup at load + smaller.dynsym.linux.txt:-Wl,-O3(replaces-Wl,-O2) — more aggressive linker-side string/section merging.Kept on:
-fasynchronous-unwind-tables— better profiling > the table size.-fbasic-block-sections=all(linux-amd64) — Propeller is on the roadmap; the per-BB sections are a hard prerequisite for the symbol-ordering file. Carrying it now avoids a rebuild-the-world later.README features matrix updated with a new "Phasing in" subsection that lists parked flags and points at
labs.disabled.txt.Test plan
--bin}) outputs fromcli/cflags.shbyte-matchcli/cflags.tslabs.disabled.txtcorrectly ignored by the resolver (verified via grep)linux-amd64compile-flag count: 69 → 65 (net 4 dropped)Generated by Claude Code