toolchain: ABI-pin host compiler to g++-15 under a sanitizer #949
ChaoWao wants to merge 1 commit
Under a sanitizer the sim host artifacts (runtime, kernels, orchestration) must all link the SAME sanitizer runtime that the pytest/CI run-step preloads via LD_PRELOAD: g++-15's libtsan.so.2 / libasan.so. GxxToolchain carries `prefer_g15` precisely to guarantee that.

But `_host_compiler_cmake_args` read CC/CXX from the environment and let them OVERRIDE the prefer_g15 default (the conda-compat path). scikit-build-core exports CXX during `pip install`, so the sanitized .so were built with the env compiler (e.g. system g++ = gcc-11, libtsan.so.0) while the run-step preloaded g++-15's libtsan.so.2. That version split fails at dlopen with "cannot allocate memory in static TLS block", so TSAN never even runs.

Add a `pin_compiler` mode: under prefer_g15 the env CC/CXX *binary* is ignored (forced to gcc-15/g++-15) while any env-injected flags (conda's -B compiler_compat) are still preserved. Non-sanitizer builds are unchanged.

Regression test: GxxToolchain(prefer_g15=True) with CXX=g++ in the env still emits -DCMAKE_CXX_COMPILER=g++-15, and conda flags survive the pin.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Code Review
This pull request introduces a compiler pinning mechanism (pin_compiler) under sanitizer builds to ensure the host compiler is ABI-pinned to gcc-15/g++-15, preventing environment overrides while preserving injected flags. The review feedback suggests improving this by preserving custom absolute or relative paths to gcc-15/g++-15 (e.g., /usr/local/bin/gcc-15) instead of unconditionally overriding them with the default binary names, and adding a corresponding unit test to verify this behavior.
    if pin_compiler:
        cc, cxx = default_cc, default_cxx
If pin_compiler is enabled, unconditionally overriding cc and cxx to default_cc and default_cxx (e.g., "gcc-15" and "g++-15") will discard any custom absolute or relative paths specified by the user in CC/CXX (for example, /usr/local/bin/gcc-15 or /opt/toolchains/bin/g++-15). If these custom paths are not in the system's default PATH, the build will fail.
We should preserve the user's custom path if it already points to a version 15 compiler by checking if "gcc-15" / "g++-15" is part of the executable's basename.
    if pin_compiler:
        if "gcc-15" not in os.path.basename(cc):
            cc = default_cc
        if "g++-15" not in os.path.basename(cxx):
            cxx = default_cxx

    assert "-DCMAKE_C_FLAGS=-pthread -B /data/envs/lyf/compiler_compat" in args
    assert "-DCMAKE_CXX_FLAGS=-pthread -B /data/envs/lyf/compiler_compat" in args
Add a unit test to verify that custom paths to gcc-15 and g++-15 are preserved when pin_compiler is enabled.
        assert "-DCMAKE_C_FLAGS=-pthread -B /data/envs/lyf/compiler_compat" in args
        assert "-DCMAKE_CXX_FLAGS=-pthread -B /data/envs/lyf/compiler_compat" in args

    def test_custom_path_to_g15_is_preserved(self, toolchain, monkeypatch):
        # Custom paths to gcc-15/g++-15 must be preserved to support non-standard installation paths.
        monkeypatch.setenv("CC", "/opt/custom/bin/gcc-15")
        monkeypatch.setenv("CXX", "/opt/custom/bin/g++-15")
        args = toolchain.get_cmake_args()
        assert "-DCMAKE_C_COMPILER=/opt/custom/bin/gcc-15" in args
        assert "-DCMAKE_CXX_COMPILER=/opt/custom/bin/g++-15" in args
Summary
A real build-correctness bug found while making the TSAN nightly actually run:
under a sanitizer the sim host artifacts were not reliably compiled with
g++-15, so they linked a different sanitizer runtime than the run-step
preloads — and TSAN failed to even start.
- `GxxToolchain(prefer_g15=True)` exists to force g++-15 so every host `.so` shares the one sanitizer runtime that pytest/CI preloads (`LD_PRELOAD=$(g++-15 -print-file-name=libtsan.so)` → libtsan.so.2).
- `_host_compiler_cmake_args` read `CC`/`CXX` from the environment and let them override the `prefer_g15` default (the conda-compat path).
- scikit-build-core exports `CXX` during `pip install`, so the sanitized `.so` were built with the env compiler (system `g++` = gcc-11 → libtsan.so.0) while the run-step preloaded g++-15's libtsan.so.2.
- The version split fails at `dlopen` with `cannot allocate memory in static TLS block` → TSAN never runs.

Fix
Add a `pin_compiler` mode to `_host_compiler_cmake_args`: under `prefer_g15` the env `CC`/`CXX` binary is ignored (forced to `gcc-15`/`g++-15`), while any env-injected flags (conda's `-B …/compiler_compat`) are still preserved. Non-sanitizer builds are unchanged.
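
The pinning rule described in the Fix section can be illustrated with a minimal sketch. This is not the code in this PR: `host_compiler_cmake_args` is a hypothetical stand-in for `_host_compiler_cmake_args`, and it assumes that env `CC`/`CXX` hold a binary optionally followed by flags (as conda's compat wrapper does).

```python
import shlex


def host_compiler_cmake_args(env, default_cc, default_cxx, pin_compiler=False):
    """Hypothetical stand-in for _host_compiler_cmake_args (not the PR's code)."""
    args = []
    for var, lang, default in (("CC", "C", default_cc), ("CXX", "CXX", default_cxx)):
        # Assumption: the env value looks like "g++ -pthread -B /path/compiler_compat".
        tokens = shlex.split(env.get(var, ""))
        binary = tokens[0] if tokens else default
        flags = tokens[1:]
        if pin_compiler:
            # Ignore the env-selected binary; only the injected flags survive.
            binary = default
        args.append(f"-DCMAKE_{lang}_COMPILER={binary}")
        if flags:
            args.append(f"-DCMAKE_{lang}_FLAGS={shlex.join(flags)}")
    return args


env = {
    "CC": "gcc -pthread -B /data/envs/lyf/compiler_compat",
    "CXX": "g++ -pthread -B /data/envs/lyf/compiler_compat",
}
pinned = host_compiler_cmake_args(env, "gcc-15", "g++-15", pin_compiler=True)
unpinned = host_compiler_cmake_args(env, "gcc-15", "g++-15")
```

With `pin_compiler=True` the compiler arguments name `gcc-15`/`g++-15` while the `-B` flags are kept; without it, the env binaries win, which is the pre-fix behavior.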
Validation (Linux/aarch64 docker)
- With `CXX=g++` exported and `--sanitizer tsan`, the built `.so` linked `libtsan.so.0` (gcc-11) → `dlopen` TLS failure.
- With the pin, the `.so` link `libtsan.so.2`; the `prepared_callable` TSAN run starts and reports races instead of crashing at import.
- `pip install` path verified end-to-end (scikit-build-core exports `CXX`).

Testing
`tests/ut/py/test_toolchain.py`: `prefer_g15=True` with `CXX=g++` still emits `-DCMAKE_CXX_COMPILER=g++-15`, and conda `-B compiler_compat` flags survive the pin.
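
The linkage check from the validation section (which `libtsan` a built `.so` actually links) could be scripted along these lines. This is a sketch under assumptions: it parses the text printed by `readelf -d <library>`, the helper names are hypothetical, and the sample output, including the `libhost_runtime.so` soname, is made up for illustration.

```python
import re

# Matches dynamic-section entries such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libtsan.so.2]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_libraries(readelf_dynamic_output):
    """Return the DT_NEEDED library names found in `readelf -d` output."""
    return _NEEDED.findall(readelf_dynamic_output)


def sanitizer_runtimes(readelf_dynamic_output, prefixes=("libtsan.so", "libasan.so")):
    """Keep only the sanitizer runtimes among the needed libraries."""
    return [lib for lib in needed_libraries(readelf_dynamic_output) if lib.startswith(prefixes)]


# Illustrative sample only; real output comes from `readelf -d` on the built .so.
sample = """
 0x0000000000000001 (NEEDED)             Shared library: [libtsan.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x000000000000000e (SONAME)             Library soname: [libhost_runtime.so]
"""
linked = sanitizer_runtimes(sample)
```

Comparing `linked` against the basename of the runtime that the run-step preloads would flag a `libtsan.so.0` versus `libtsan.so.2` split before `dlopen` does.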