@czgdp1807

Description

This PR enables native compilation of TensorFlow Serving on macOS systems, addressing platform-specific build issues and dependencies.

Problem Statement

The TensorFlow Serving build system was not set up for native macOS compilation. The main issues were:

  • Workspace status command used Linux-specific /proc/self/cwd path
  • Missing macOS-specific compiler flags and deployment target settings
  • C++ module resolution conflicts causing indirect dependency errors
  • Conda environment tools conflicting with system tools (e.g., libtool)
  • Repository fetch timeouts for large dependencies like Boost
  • Recursive Boost submodule initialization causing unnecessary delays
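
Two of these items can be checked from a shell before building. The sketch below is a hypothetical diagnostic, not part of the PR: it reports whether /proc/self/cwd exists and whether ar, ld, libtool, and nm resolve to the active conda environment or to some other location. It assumes conda exports CONDA_PREFIX on activation; the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical pre-build diagnostic; function names are illustrative only.

# Print "yes" if the Linux-only /proc/self/cwd link exists, "no" otherwise.
has_proc_cwd() {
  if [ -e /proc/self/cwd ]; then echo yes; else echo no; fi
}

# Classify where a tool resolves on PATH: "conda", "other", or "missing".
tool_origin() {
  tool_path=$(command -v "$1" 2>/dev/null) || { echo missing; return 0; }
  case "$tool_path" in
    "${CONDA_PREFIX:-/nonexistent-conda-prefix}"/*) echo conda ;;
    *) echo other ;;
  esac
}

echo "/proc/self/cwd present: $(has_proc_cwd)"
for tool in ar ld libtool nm; do
  echo "$tool: $(tool_origin "$tool")"
done
```

On a macOS machine with the conda environment active, a "conda" result for libtool would point at the kind of tool conflict described above.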

Solution

Changes Made

  1. Cross-platform Compatibility

    • Replace Linux-specific /proc/self/cwd with relative path tools/gen_status_stamp.sh
    • Allows builds on both Linux and macOS
  2. macOS-Specific Build Configuration (.bazelrc)

    • Set minimum deployment target to macOS 10.13 for compatibility
    • Enable aligned allocation support for both target and host compilation
    • Disable C++ modules to avoid indirect dependency resolution issues
    • Override conda environment tools with system binaries (/usr/bin/ar, /usr/bin/ld, /usr/bin/libtool, /usr/bin/nm)
  3. Repository Fetch Reliability

    • Increase HTTP timeout scaling to 10x (from default 1x)
    • Add retry logic for repository operations
    • Reduce Boost build times by disabling recursive submodule initialization
  4. Build Environment Documentation

    • Add environment.yml for conda-based reproducible build environment
    • Includes all necessary build tools (automake, autoconf, libtool, cmake, clang)
  5. Dependency Configuration (tensorflow_serving/workspace.bzl)

    • Disable recursive submodule initialization for Boost
    • Prevents timeout issues without sacrificing functionality
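
As a rough illustration of changes 1 to 3, the sketch below writes a .bazelrc-style fragment to a temporary file. The flag choices are assumptions rather than a copy of the PR's .bazelrc: --workspace_status_command, --macos_minimum_os, --cxxopt, --host_cxxopt, --action_env, and --http_timeout_scaling are existing Bazel options and -faligned-allocation is a Clang flag, but the PR may achieve the same effect with different options. The module-related and retry settings are left out because their exact form is not shown in this description.

```shell
#!/bin/sh
# Sketch only: flag choices are assumptions, not copied from the PR's .bazelrc.
rc_file=$(mktemp)

cat > "$rc_file" <<'EOF'
# Relative path instead of one that goes through /proc/self/cwd
build --workspace_status_command=tools/gen_status_stamp.sh

# Options applied only with --config=macos
build:macos --macos_minimum_os=10.13
build:macos --cxxopt=-faligned-allocation
build:macos --host_cxxopt=-faligned-allocation
# Assumed mechanism for preferring system tools over conda's
build:macos --action_env=AR=/usr/bin/ar
build:macos --action_env=LIBTOOL=/usr/bin/libtool

# Scale HTTP download timeouts to 10x the default of 1.0
build --http_timeout_scaling=10.0
EOF

echo "Wrote sketch to $rc_file:"
cat "$rc_file"
```

A file like this could be tried without touching the repository's own .bazelrc by passing it through Bazel's --bazelrc startup option.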

Build Instructions

macOS

# Create and activate conda environment
conda env create -f environment.yml
conda activate tf-serving-build

# Build with macOS configuration
bazel build -c opt --config=macos tensorflow_serving/model_servers:tensorflow_model_server

Linux

conda env create -f environment.yml
conda activate tf-serving-build

# Build with default configuration
bazel build -c opt tensorflow_serving/model_servers:tensorflow_model_server

Testing

  • Tested on macOS Tahoe 26.2 with Apple Silicon
  • Expected to work on Intel-based macOS systems with 10.13+
  • Linux builds continue to work with default configuration

Benefits

  • Users can now build TensorFlow Serving natively on macOS
  • Faster development iteration on macOS
  • Cross-platform Bazel configuration improvements benefit both macOS and Linux
  • Improved repository fetch reliability for all platforms
  • Better tool isolation reduces build environment conflicts

Files Changed

  • .bazelrc - Added macOS-specific configuration and repository timeout settings
  • environment.yml - New file with conda environment specification
  • tensorflow_serving/workspace.bzl - Optimized Boost dependency configuration

Breaking Changes

None. The changes are backward compatible and additive.

Related Issues

Resolves native macOS build compatibility issues on Apple Silicon; Intel-based Macs are expected to work as well.

Enable native macOS compilation by fixing platform-specific issues:
- Use relative path for workspace status command
- Configure macOS deployment target (10.13) and aligned allocation
- Disable C++ modules to avoid dependency resolution errors
- Override conda tools with system binaries
- Increase repository fetch timeouts for Boost
- Disable Boost recursive submodules to prevent timeouts
- Add conda environment.yml for build dependencies

Fixes build on Apple Silicon macOS systems.
@czgdp1807
Author

cc: @aktech

Member

@aktech aktech left a comment

I just pulled this and built it on my M4, looks good so far:

...
...
INFO: Found 1 target...
Target //tensorflow_serving/model_servers:tensorflow_model_server up-to-date:
  bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
INFO: Elapsed time: 1597.537s, Critical Path: 76.68s
INFO: 22610 processes: 8623 internal, 13987 local.
INFO: Build completed successfully, 22610 total actions

Took about 27 minutes. Binary works:

./bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --version
TensorFlow ModelServer: 0.0.0+dev.sha.178e2f9
TensorFlow Library: 2.20.0-dev0+selfbuilt

Did you run tests? I ran bazel test //tensorflow_serving/... and got ICU linking failures (_uprv_getICUData_other undefined). Seems like a pre-existing TensorFlow issue. Did you see the same?

- patch
- unzip
- zip
- clang=18.1.1
Member

Any particular reason for not adding bazel?

bazel>=7.4.1

Author

@czgdp1807 czgdp1807 Jan 21, 2026

The best way to manage Bazel is through Bazelisk, with USE_BAZEL_VERSION set to the appropriate version. Let me add a comment in this .yml file with the steps to install and use Bazel via Bazelisk.

Also, AFAIK, some released versions of the bazel executable on conda are corrupt. That's why Bazelisk is the best version manager for Bazel.
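
A minimal sketch of that workflow, assuming Bazelisk is already installed and available as `bazelisk`. The version 7.4.1 is only a placeholder taken from the suggestion above, not a confirmed project requirement. Bazelisk reads USE_BAZEL_VERSION from the environment and otherwise falls back to a .bazelversion file in the workspace, so the sketch shows both; a temporary directory stands in for the repository root.

```shell
#!/bin/sh
# Placeholder version; substitute whatever the project actually requires.
USE_BAZEL_VERSION=7.4.1
export USE_BAZEL_VERSION

# Alternative to the environment variable: pin the version in the workspace.
# A temporary directory stands in for the repository root here.
workspace=$(mktemp -d)
printf '%s\n' "$USE_BAZEL_VERSION" > "$workspace/.bazelversion"

# Only invoke Bazelisk if it is installed; otherwise explain what is missing.
if command -v bazelisk >/dev/null 2>&1; then
  bazelisk version
else
  echo "bazelisk not found on PATH; install it before building"
fi
```

With the version pinned either way, the build command from the description would be run as `bazelisk build -c opt --config=macos tensorflow_serving/model_servers:tensorflow_model_server`.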

Include steps for installing Bazelisk, with guidance on version management via USE_BAZEL_VERSION.
@czgdp1807
Author

Here's the error I got when I ran bazel test //tensorflow_serving/...:

ERROR: /home/czgdp1807/.cache/bazel/_bazel_czgdp1807/11f5065edba6efd99ce92d2151094a0a/external/org_tensorflow/tensorflow/core/kernels/BUILD:196:18: Compiling tensorflow/core/kernels/collective_nccl_all_to_all.cc failed: (Exit 1): gcc failed: error executing CppCompile command (from target @@org_tensorflow//tensorflow/core/kernels:collective_ops) 
  (cd /home/czgdp1807/.cache/bazel/_bazel_czgdp1807/11f5065edba6efd99ce92d2151094a0a/execroot/tf_serving && \
  exec env - \
    PATH=/home/czgdp1807/.cache/bazelisk/downloads/sha256/c97f02133adce63f0c28678ac1f21d65fa8255c80429b588aeeba8a1fac6202b/bin:/home/czgdp1807/.conda/envs/tf-serving-build/bin:/home/czgdp1807/.local/bin:/home/czgdp1807/.local/bin:/opt/conda/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin \
    PWD=/proc/self/cwd \
  /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections '-std=c++14' -MD -MF bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/core/kernels/_objs/collective_ops/collective_nccl_all_to_all.pic.d '-frandom-seed=bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/core/kernels/_objs/collective_ops/collective_nccl_all_to_all.pic.o' -fPIC '-DEIGEN_MAX_ALIGN_BYTES=64' -DEIGEN_ALLOW_UNALIGNED_SCALARS '-DEIGEN_USE_AVX512_GEMM_KERNELS=0' -DHAVE_SYS_UIO_H -DTF_USE_SNAPPY -DTENSORFLOW_USE_NUMA -DTF_ENABLE_ACTIVITY_WATCHER '-DTF_MAJOR_VERSION=2' '-DTF_MINOR_VERSION=20' '-DTF_PATCH_VERSION=0' '-DTF_VERSION_SUFFIX="-dev0+selfbuilt"' '-DLLVM_ON_UNIX=1' '-DHAVE_BACKTRACE=1' '-DBACKTRACE_HEADER=<execinfo.h>' '-DLTDL_SHLIB_EXT=".so"' '-DLLVM_PLUGIN_EXT=".so"' '-DLLVM_ENABLE_THREADS=1' '-DHAVE_DEREGISTER_FRAME=1' '-DHAVE_LIBPTHREAD=1' '-DHAVE_PTHREAD_GETNAME_NP=1' '-DHAVE_PTHREAD_H=1' '-DHAVE_PTHREAD_SETNAME_NP=1' '-DHAVE_REGISTER_FRAME=1' '-DHAVE_SETENV_R=1' '-DHAVE_STRERROR_R=1' '-DHAVE_SYSEXITS_H=1' '-DHAVE_UNISTD_H=1' -D_GNU_SOURCE '-DHAVE_GETAUXVAL=1' '-DHAVE_MALLINFO=1' '-DHAVE_SBRK=1' '-DHAVE_STRUCT_STAT_ST_MTIM_TV_NSEC=1' -DHAVE_BUILTIN_THREAD_POINTER '-DLLVM_NATIVE_ARCH="X86"' '-DLLVM_NATIVE_ASMPARSER=LLVMInitializeX86AsmParser' '-DLLVM_NATIVE_ASMPRINTER=LLVMInitializeX86AsmPrinter' '-DLLVM_NATIVE_DISASSEMBLER=LLVMInitializeX86Disassembler' '-DLLVM_NATIVE_TARGET=LLVMInitializeX86Target' '-DLLVM_NATIVE_TARGETINFO=LLVMInitializeX86TargetInfo' '-DLLVM_NATIVE_TARGETMC=LLVMInitializeX86TargetMC' '-DLLVM_NATIVE_TARGETMCA=LLVMInitializeX86TargetMCA' '-DLLVM_HOST_TRIPLE="x86_64-unknown-linux-gnu"' '-DLLVM_DEFAULT_TARGET_TRIPLE="x86_64-unknown-linux-gnu"' '-DLLVM_VERSION_MAJOR=21' '-DLLVM_VERSION_MINOR=0' '-DLLVM_VERSION_PATCH=0' '-DLLVM_VERSION_STRING="21.0.0git"' -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS 
-D__STDC_FORMAT_MACROS '-DLLVM_HAS_AArch64_TARGET=1' '-DLLVM_HAS_AMDGPU_TARGET=1' '-DLLVM_HAS_ARM_TARGET=1' '-DLLVM_HAS_NVPTX_TARGET=1' '-DLLVM_HAS_PowerPC_TARGET=1' '-DLLVM_HAS_RISCV_TARGET=1' '-DLLVM_HAS_SystemZ_TARGET=1' '-DLLVM_HAS_X86_TARGET=1' '-DBLAKE3_USE_NEON=0' -DBLAKE3_NO_AVX2 -DBLAKE3_NO_AVX512 -DBLAKE3_NO_SSE2 -DBLAKE3_NO_SSE41 '-DNO_LLVM_SUPPORT=0' -DCURL_STATICLIB -iquote external/org_tensorflow -iquote bazel-out/k8-opt/bin/external/org_tensorflow -iquote external/com_google_absl -iquote bazel-out/k8-opt/bin/external/com_google_absl -iquote external/com_google_protobuf -iquote bazel-out/k8-opt/bin/external/com_google_protobuf -iquote external/zlib -iquote bazel-out/k8-opt/bin/external/zlib -iquote external/local_xla -iquote bazel-out/k8-opt/bin/external/local_xla -iquote external/local_tsl -iquote bazel-out/k8-opt/bin/external/local_tsl -iquote external/com_googlesource_code_re2 -iquote bazel-out/k8-opt/bin/external/com_googlesource_code_re2 -iquote external/farmhash_archive -iquote bazel-out/k8-opt/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/k8-opt/bin/external/fft2d -iquote external/highwayhash -iquote bazel-out/k8-opt/bin/external/highwayhash -iquote external/eigen_archive -iquote bazel-out/k8-opt/bin/external/eigen_archive -iquote external/ml_dtypes_py -iquote bazel-out/k8-opt/bin/external/ml_dtypes_py -iquote external/snappy -iquote bazel-out/k8-opt/bin/external/snappy -iquote external/hwloc -iquote bazel-out/k8-opt/bin/external/hwloc -iquote external/gif -iquote bazel-out/k8-opt/bin/external/gif -iquote external/llvm-project -iquote bazel-out/k8-opt/bin/external/llvm-project -iquote external/curl -iquote bazel-out/k8-opt/bin/external/curl -iquote external/boringssl -iquote bazel-out/k8-opt/bin/external/boringssl -iquote external/jsoncpp_git -iquote bazel-out/k8-opt/bin/external/jsoncpp_git -Ibazel-out/k8-opt/bin/external/llvm-project/mlir/_virtual_includes/AsmParserTokenKinds 
-Ibazel-out/k8-opt/bin/external/llvm-project/mlir/_virtual_includes/ArithCanonicalizationIncGen -isystem external/com_google_protobuf/src -isystem bazel-out/k8-opt/bin/external/com_google_protobuf/src -isystem external/zlib -isystem bazel-out/k8-opt/bin/external/zlib -isystem external/farmhash_archive/src -isystem bazel-out/k8-opt/bin/external/farmhash_archive/src -isystem external/eigen_archive -isystem bazel-out/k8-opt/bin/external/eigen_archive -isystem external/eigen_archive/mkl_include -isystem bazel-out/k8-opt/bin/external/eigen_archive/mkl_include -isystem external/hwloc/hwloc -isystem bazel-out/k8-opt/bin/external/hwloc/hwloc -isystem external/hwloc/include -isystem bazel-out/k8-opt/bin/external/hwloc/include -isystem external/gif -isystem bazel-out/k8-opt/bin/external/gif -isystem external/llvm-project/mlir/include -isystem bazel-out/k8-opt/bin/external/llvm-project/mlir/include -isystem external/llvm-project/llvm/include -isystem bazel-out/k8-opt/bin/external/llvm-project/llvm/include -isystem external/curl/include -isystem bazel-out/k8-opt/bin/external/curl/include -isystem external/boringssl/src/include -isystem bazel-out/k8-opt/bin/external/boringssl/src/include -isystem external/jsoncpp_git/include -isystem bazel-out/k8-opt/bin/external/jsoncpp_git/include -Wno-macro-redefined -Wno-sign-compare -Wno-deprecated-declarations -Wno-unused-but-set-variable '-std=c++17' '-D_GLIBCXX_USE_CXX11_ABI=0' -DEIGEN_AVOID_STL_ARRAY -Iexternal/gemmlowp -Wno-sign-compare '-ftemplate-depth=900' -fno-exceptions -DINTEL_MKL -DENABLE_ONEDNN_V3 -DAMD_ZENDNN '-DTF_LLVM_X86_AVAILABLE=1' -msse3 -DTENSORFLOW_MONOLITHIC_BUILD -pthread '-DINTEL_MKL=1' -w -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c external/org_tensorflow/tensorflow/core/kernels/collective_nccl_all_to_all.cc -o 
bazel-out/k8-opt/bin/external/org_tensorflow/tensorflow/core/kernels/_objs/collective_ops/collective_nccl_all_to_all.pic.o)
# Configuration: 78f46dde7f5916978d81cc8aa3c613e267017596f492e25f804e0a871dc5fca1
# Execution platform: @@local_execution_config_platform//:platform
In file included from /usr/include/c++/13/bits/hashtable.h:35,
                 from /usr/include/c++/13/bits/unordered_map.h:33,
                 from /usr/include/c++/13/unordered_map:41,
                 from external/com_google_absl/absl/algorithm/container.h:48,
                 from external/com_google_absl/absl/container/flat_hash_set.h:35,
                 from external/org_tensorflow/tensorflow/core/framework/collective.h:21,
                 from external/org_tensorflow/tensorflow/core/kernels/collective_nccl.h:18,
                 from external/org_tensorflow/tensorflow/core/kernels/collective_nccl_all_to_all.h:18,
                 from external/org_tensorflow/tensorflow/core/kernels/collective_nccl_all_to_all.cc:15:
/usr/include/c++/13/bits/hashtable_policy.h: In instantiation of 'std::__detail::_Map_base<_Key, std::pair<const _Key, _Val>, _Alloc, std::__detail::_Select1st, _Equal, _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits, true>::mapped_type& std::__detail::_Map_base<_Key, std::pair<const _Key, _Val>, _Alloc, std::__detail::_Select1st, _Equal, _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits, true>::operator[](key_type&&) [with _Key = std::basic_string<char>; _Val = std::basic_string<char>; _Alloc = std::allocator<std::pair<const std::basic_string<char>, std::basic_string<char> > >; _Equal = std::equal_to<std::basic_string<char> >; _Hash = std::hash<std::basic_string<char> >; _RangeHash = std::__detail::_Mod_range_hashing; _Unused = std::__detail::_Default_ranged_hash; _RehashPolicy = std::__detail::_Prime_rehash_policy; _Traits = std::__detail::_Hashtable_traits<true, false, true>; mapped_type = std::basic_string<char>; key_type = std::basic_string<char>]':
/usr/include/c++/13/bits/unordered_map.h:991:20:   required from 'std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::mapped_type& std::unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::operator[](key_type&&) [with _Key = std::basic_string<char>; _Tp = std::basic_string<char>; _Hash = std::hash<std::basic_string<char> >; _Pred = std::equal_to<std::basic_string<char> >; _Alloc = std::allocator<std::pair<const std::basic_string<char>, std::basic_string<char> > >; mapped_type = std::basic_string<char>; key_type = std::basic_string<char>]'
external/local_xla/xla/tsl/platform/errors.h:101:34:   required from here
/usr/include/c++/13/bits/hashtable_policy.h:855:5: internal compiler error: in ggc_set_mark, at ggc-page.cc:1551
  855 |     }
      |     ^
0x771a7a22a1c9 __libc_start_call_main
	../sysdeps/nptl/libc_start_call_main.h:58
0x771a7a22a28a __libc_start_main_impl
	../csu/libc-start.c:360
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
