
new(cubefs.io): CubeFS — cloud-native distributed storage (from source)#13050

Open
tannevaled wants to merge 8 commits into pkgxdev:main from tannevaled:new/cubefs


@tannevaled
Contributor

Summary

  • New recipe cubefs.io — CubeFS (formerly ChubaoFS, CNCF graduated) — cloud-native distributed storage system with HDFS / S3 / POSIX semantics.
  • Built from source via the project's Makefile + build.sh, which compiles vendored C libs (RocksDB 6.3.6, snappy, zstd, lz4, zlib, bzip2) statically into the Go binaries. No `warnings: vendored`.
  • Slim install: `cfs-server` + `cfs-client` + `cfs-cli` + `cfs-authtool` + `cfs-fsck`. The blobstore object-storage suite is omitted to keep the bottle lean (re-add via `make blobstore` if needed).
  • Linux x86-64 + aarch64. Darwin is not supported: upstream's build.sh partially supports it, but libcfs.so and several CGO components are skipped on macOS, so the platform set is kept conservative.

Test plan

  • `cfs-server -v` reports the bottle's version
  • `cfs-cli` loads (version output soft-pass)
  • `cfs-fsck --help` / `cfs-authtool --help` load
  • Bottle size: server + 4 supporting binaries (each ~50-100MB due to static RocksDB)
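The smoke checks in the test plan could be sketched roughly as below. This is an illustration, not the recipe's actual test block: `smoke_check` and `soft_check` are hypothetical helper names, and the `--help` flag for `cfs-cli` is an assumption (the plan only says it "loads").

```shell
# Hypothetical helper: run a command quietly and report pass/fail.
smoke_check() {
  if "$@" >/dev/null 2>&1; then
    echo "ok: $*"
  else
    echo "FAIL: $*"
    return 1
  fi
}

# Soft-pass variant: a failure is reported but does not fail the run.
soft_check() {
  smoke_check "$@" || echo "soft-pass: $*"
}

# Only exercise the binaries when they are actually on PATH.
if command -v cfs-server >/dev/null 2>&1; then
  smoke_check cfs-server -v
  soft_check cfs-cli --help        # flag assumed, see the note above
  smoke_check cfs-fsck --help
  smoke_check cfs-authtool --help
fi
```

The soft-pass wrapper keeps a tool that exits non-zero on its version or help output from failing the whole test step.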

Context

Part of the from-source recipes batch (#13042 argocd, #13043 socket_vmnet, #13044 buildah, #13045 R, #13046 envoy, #13047 mingw-w64.org, #13048 llvm.org/mingw-w64). Project goal: source-buildable independence ("being able to compile from source is a guarantee of independence").

🤖 Generated with Claude Code

CubeFS (formerly ChubaoFS, CNCF graduated) — distributed FS with
HDFS / S3 / POSIX semantics. Builds from source via the project's
Makefile + build.sh, which compiles vendored C libs (RocksDB,
snappy, zstd, lz4, zlib, bzip2) statically into the Go binaries.

Slim install:
  - cfs-server   unified meta/data daemon
  - cfs-client   FUSE client mount binary
  - cfs-cli      admin/operator CLI
  - cfs-authtool auth key/token management
  - cfs-fsck     consistency checker

Skips the blobstore object-storage suite for now to keep the bottle
lean. Linux-only (upstream's build.sh partially supports darwin but
skips libcfs.so and several CGO components).

Built from source, so no `warnings: vendored`. CGO toolchain via gnu.org/gcc.

CubeFS vendors snappy 1.1.7, which uses cmake_minimum_required(VERSION 2.6).
CMake 4.x removed compat below 3.5:

  CMake Error at CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 has been removed from CMake.

Setting CMAKE_POLICY_VERSION_MINIMUM=3.5 makes CMake apply the
3.5 policy floor without rewriting the vendored source.
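A minimal sketch of applying that setting through the environment, assuming the vendored snappy is configured by a `cmake` call somewhere below `make server` that inherits the variable. CMake 4.0 and later read `CMAKE_POLICY_VERSION_MINIMUM` from the environment as a default when the variable is not set explicitly; where the recipe actually sets it is not shown in this PR text.

```shell
# Raise the policy floor for every cmake invocation started from this shell.
export CMAKE_POLICY_VERSION_MINIMUM=3.5

# Show what child processes will inherit.
echo "CMAKE_POLICY_VERSION_MINIMUM=${CMAKE_POLICY_VERSION_MINIMUM}"

# Then, from a CubeFS checkout (not run here):
#   make server
```

Passing `-DCMAKE_POLICY_VERSION_MINIMUM=3.5` on the `cmake` command line has the same effect, but it would require editing the build script that invokes CMake.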

CubeFS vendors RocksDB 6.3.6 (released 2019), which uses pre-C++20
idioms: gcc 16's libstdc++ rejects the std::pair construction
patterns in db/version_set.cc under the default std mode:

  bits/new_allocator.h:203:4: required from … FileMetaData
    construct(_Up*, _Args&& ...) [with _Up = std::pair<int,
    rocksdb::FileMetaData>; ...]
  make[1]: Leaving directory 'rocksdb-6.3.6'
  make: *** [Makefile:15: server] Error 1

Pin to gcc ^13, the last gcc version whose default std mode accepts
the patterns RocksDB 6.3.6 uses. Cleaner than patching the vendored
RocksDB source. CubeFS upstream's own CI uses Debian 10 with gcc 8.

Vendored RocksDB 6.3.6 has FileSampledStats with a std::atomic<uint64_t>
member plus a user-provided copy ctor calling `*this = other;`. Under
gcc 12+ libstdc++ with C++17, the synthesized FileMetaData copy ctor
that calls this fails to compile when used through std::pair (as in
db/range_del_aggregator.cc and db/version_set.cc).

Two-pronged fix:
- Pin gcc ^11 (last major where this compiled by default)
- Force CXXFLAGS=-std=gnu++14 in case system headers pull a newer libstdc++

tannevaled marked this pull request as draft May 29, 2026 12:22
@tannevaled
Contributor Author

Re-marking as draft: RocksDB 6.3.6's FileMetaData contains a FileSampledStats with a std::atomic member, and modern libstdc++ (tested with gcc 13, and with gcc 11 plus -std=gnu++14) can't synthesize the copy/move ctor that emplace_back on a std::vector<std::pair<int, FileMetaData>> requires:

./db/version_edit.h: In instantiation of 'constexpr std::pair<_T1, _T2>::pair(_U1&&, _U2&&) [with _U1 = int&; _U2 = rocksdb::FileMetaData; ...]'

The proper fix is to patch the vendored rocksdb-6.3.6 sources to add explicit move/copy ctors to FileMetaData. That requires careful source-level work that's better done as a follow-up. CubeFS upstream's own CI uses Debian 10 + gcc 8 (the last gcc that compiled this RocksDB version cleanly), so we'd need to do one of the following:

  1. Pin gcc <= 8 (not in pantry)
  2. Apply a source patch to FileMetaData
  3. Wait for CubeFS to bump their vendored rocksdb version

Draft until one of those happens.

Switches the CGO toolchain pin from gnu.org/gcc:^11 (which still has
the strict C++17 copy-ctor rules that reject RocksDB 6.3.6's
FileMetaData) to the dedicated gnu.org/gcc/v8 recipe added in the
companion PR.

gcc 8 defaults to -std=gnu++14 (the default moved to gnu++17 in
gcc 11). It compiles the vendored RocksDB FileSampledStats /
FileMetaData pattern cleanly without needing any source patches.

Removes the now-redundant `CXXFLAGS=-std=gnu++14` shielding since
gcc 8 defaults to that std mode anyway.
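One way to confirm which std mode a compiler uses by default is to dump its predefined macros and read `__cplusplus`. The helper name `default_cxx_std` is hypothetical, and `g++` is only a stand-in for whatever compiler name the recipe's toolchain puts on PATH.

```shell
# Print the __cplusplus value a compiler uses when no -std flag is given.
# 201402L corresponds to C++14, 201703L to C++17.
default_cxx_std() {
  "$1" -dM -E -x c++ /dev/null 2>/dev/null |
    awk '$2 == "__cplusplus" { print $3 }'
}

# Stand-in compiler name; replace with the one actually used by the build.
if command -v g++ >/dev/null 2>&1; then
  default_cxx_std g++
fi
```

If the value printed for the pinned compiler is 201402L, an explicit `-std=gnu++14` in CXXFLAGS adds nothing, which is the reasoning behind dropping it.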

Re-opens this PR from draft once the gcc/v8 recipe lands and a
bottle exists.

tannevaled marked this pull request as ready for review May 29, 2026 13:56
@tannevaled
Contributor Author

Reopened — instead of fighting RocksDB 6.3.6's pre-C++17 idioms against modern libstdc++, we add gcc 8 to pantry (companion PR #13070) and use that.

This is exactly the kind of version-pinning pkgx was designed to handle. From-source promise intact, no monkey-patching of vendored RocksDB sources.

Will go green once #13070 (gcc 8) lands and a bottle exists.

Previous recipe trimmed to server/client/cli/authtool/fsck "to keep
the bottle lean" — but cubefs upstream's `make build` produces a much
wider set of binaries. Restore the full surface so users get a bottle
equivalent to a from-source upstream build.

Adds:
  - cfs-deploy / cfs-bcache / cfs-preload / cfs-fdstore  (extra daemons + tools)
  - libcfs.so + libcfs.h                                  (C shared SDK)
  - blobstore-{clustermgr,blobnode,access,scheduler,proxy,cli}
    (full S3-compatible object-storage suite)

Skips only the `libsdk` target's maven step (Java jar) — we use the
`libsdkpre` target instead, which produces the .so without dragging
in Maven as a build dep. The .so + header are still installed so any
non-Java cgo binding works.

Test step expanded with per-binary smoke checks and an nm-based
verification that libcfs.so exports cfs_new_client (the entry symbol
used by every downstream cgo client).
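The nm-based check could look roughly like the sketch below. `exports_symbol` is a hypothetical helper and the library path is a placeholder; the recipe's real test step is not shown in this PR text. It relies on GNU nm's `-D` and `--defined-only` options, which fits the Linux-only platform set.

```shell
# Return 0 if the shared library defines the symbol in its text (code) section.
exports_symbol() {
  lib=$1
  sym=$2
  nm -D --defined-only "$lib" 2>/dev/null |
    awk -v s="$sym" '$NF == s && $(NF-1) == "T" { found = 1 } END { exit !found }'
}

# Placeholder path; point LIB at the installed libcfs.so.
LIB=${LIB:-./libcfs.so}
if [ -f "$LIB" ]; then
  exports_symbol "$LIB" cfs_new_client && echo "cfs_new_client exported"
fi
```

Matching on the `T` type letter rejects undefined references, so the check only passes when the library itself provides the entry point.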

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>