Summary
Implement the currently-stubbed --protocol-buffers output format in h5cpp-compiler by adding a PbProducer alongside the existing H5Producer. The producer emits pb::meta::descriptor_t<T> specializations consumable by the header-only pb library at vargalabs/sandbox#1.
Why
- The
--protocol-buffers CLI flag is already registered in src/h5cpp.cpp with a "[not yet implemented]" stub. The infrastructure exists; only the producer is missing.
vargalabs/sandbox is shipping a header-only protobuf serializer (pb.hpp) whose primary user-facing API requires hand-written pb::meta::descriptor_t<T> specializations. Compiler-generated descriptors eliminate that boilerplate the same way register_struct<T> eliminates HDF5 compound-type boilerplate.
- Architectural CCC#7 (descriptor completeness) in
sandbox-pb-v1-architecture.md is acknowledged as deferred-pending-reflection. AST traversal in this compiler already provides what C++26 reflection will provide — out-of-band at build time. That gap closes today.
- Pattern parallels the approved scatter/gather design (h5cpp-compiler staging, 2026-05-21): one AST walker, pluggable producers, library dispatch via compile-time trait (
has_scatter<T> ↔ has_descriptor<T>).
Approach (high level)
- Add
PbProducer : Producer<PbProducer> mirroring producer_h5.hpp's shape. Emits pb::meta::descriptor_t<T> + pb::field<N, &T::m>{} entries.
- Extend
h5templateMatcher to additionally match pb::encode<T> / pb::decode<T> call sites.
- Tier-classify user types (1 = natural scalars, 2 = integer wire-choice, 3 = composite, 4 = adapter / opaque) — same monotone tier shape as scatter/gather.
- Annotation surface:
[[pb::field(N)]] (required, CCC#8), [[pb::wire(...)]], [[pb::ignore]], [[pb::name(...)]], etc. — bare-name pattern matching the h5:: convention; clean C++26 [[=pb::field{N}]] migration path.
- CMake integration
pb_compiler_generate(INPUT ... OUTPUT ...) mirroring h5cpp_compiler_generate.
--check mode for CI regen-staleness hygiene (already implemented for HDF5; reuse).
Design document
Full design lands in this branch as docs/design/protocol-buffers-backend.md, mirroring the scatter/gather design's depth and structure. The doc covers tier specification, attribute system, library dispatch, per-type generated artifacts, worked examples, and reference examples per tier.
Relationships
- vargalabs/sandbox#1 — pb.hpp v1. Compiler EXTENDS pb.hpp; handwritten descriptors continue to work alongside compiler-generated ones.
- Scatter/gather design (h5cpp-compiler staging, 2026-05-21) — orthogonal axis. Scatter/gather is about I/O cost tiers per type. This backend is about which serialization format the type emits to. The same matcher emits both kinds of glue.
- AI roadmap — see prior plan. Protobuf is one of several future producers (FlatBuffers, Cap'n Proto, MessagePack, Arrow IPC, JSON Schema, …).
Out of scope (deferred)
- Adapters for
std::chrono::time_point ↔ Timestamp and std::chrono::duration ↔ Duration (sandbox-pb FR8/FR9) — Tier-4 follow-up
std::map<K,V> ↔ proto3 map<K,V> — Tier-4 follow-up
- Reflection-driven (
[[=pb::field{N}]]) C++26 path — Phase C in the design doc; current scope is the C++17 attribute path
Acceptance
Summary
Implement the currently-stubbed
--protocol-buffersoutput format in h5cpp-compiler by adding aPbProduceralongside the existingH5Producer. The producer emitspb::meta::descriptor_t<T>specializations consumable by the header-onlypblibrary at vargalabs/sandbox#1.Why
--protocol-buffersCLI flag is already registered insrc/h5cpp.cppwith a "[not yet implemented]" stub. The infrastructure exists; only the producer is missing.vargalabs/sandboxis shipping a header-only protobuf serializer (pb.hpp) whose primary user-facing API requires hand-writtenpb::meta::descriptor_t<T>specializations. Compiler-generated descriptors eliminate that boilerplate the same wayregister_struct<T>eliminates HDF5 compound-type boilerplate.sandbox-pb-v1-architecture.mdis acknowledged as deferred-pending-reflection. AST traversal in this compiler already provides what C++26 reflection will provide — out-of-band at build time. That gap closes today.has_scatter<T>↔has_descriptor<T>).Approach (high level)
PbProducer : Producer<PbProducer>mirroringproducer_h5.hpp's shape. Emitspb::meta::descriptor_t<T>+pb::field<N, &T::m>{}entries.h5templateMatcherto additionally matchpb::encode<T>/pb::decode<T>call sites.[[pb::field(N)]](required, CCC#8),[[pb::wire(...)]],[[pb::ignore]],[[pb::name(...)]], etc. — bare-name pattern matching theh5::convention; clean C++26[[=pb::field{N}]]migration path.pb_compiler_generate(INPUT ... OUTPUT ...)mirroringh5cpp_compiler_generate.--checkmode for CI regen-staleness hygiene (already implemented for HDF5; reuse).Design document
Full design lands in this branch as
docs/design/protocol-buffers-backend.md, mirroring the scatter/gather design's depth and structure. The doc covers tier specification, attribute system, library dispatch, per-type generated artifacts, worked examples, and reference examples per tier.Relationships
Out of scope (deferred)
std::chrono::time_point↔Timestampandstd::chrono::duration↔Duration(sandbox-pb FR8/FR9) — Tier-4 follow-upstd::map<K,V>↔ proto3map<K,V>— Tier-4 follow-up[[=pb::field{N}]]) C++26 path — Phase C in the design doc; current scope is the C++17 attribute pathAcceptance
PbProducerlands atsrc/producer_pb.hpppb::encode<T>/pb::decode<T>triggers--protocol-buffersexits non-zero with diagnostic if any matched type is missing[[pb::field(N)]]on a non-ignored member (CCC#7)examples/pb-tier-{one,two,three,four}/, mirroring HDF5 reference examplespb_compiler_generate(INPUT ... OUTPUT ...)--protocol-buffers --checkto detect stale generated headerstest/pb_conformance.cppround-trip through compiler-generated equivalents (separate PR in sandbox repo)