Summary
This proposal concretizes the schema-level work needed for variable-length registers in device.yml, merging the variable-length-registers thread at #116 with the constraints introduced by the ExtendedLength proposal at #218. It specifies how a register declares a variable length with a maximum bound, and defines how the wire-format encoding is derived from the schema rather than declared.
Motivation
Two threads converge here.
The first thread, #116, has been open since 2025-02-20 and proposes lifting the fixed-length restriction on registers in device.yml. Several candidate registers benefit directly: R_TAG, R_SERIAL_NUMBER, R_DEVICE_NAME, and any future register storing strings, error messages, semantic version strings, or device-self-description payloads. The thread has substantial discussion but no merged spec text.
The second thread, #218, introduces the ExtendedLength flag for messages whose payload exceeds the 254-byte regular-format ceiling. For controllers to know per-register whether to use regular or extended framing, the schema must declare each register's maximum payload size. Without a schema-level bound, controllers would have to probe by attempting writes and recovering from errors.
These two needs are best addressed together. Variable-length register declarations subsume the schema work needed for ExtendedLength-using registers, and the codegen pipelines that produce parsers from device.yml are touched once rather than twice.
Detailed Design
Schema additions to registers.json
A new optional attribute on the register definition:
maxLength (integer, minimum 1): specifies the maximum number of payload elements the register can hold. A register declared with maxLength is variable-length and may carry any number of elements between 0 and maxLength. The element type is given by the existing type attribute. maxLength is mutually exclusive with the existing length attribute.
The existing length attribute is unchanged: registers declared with length: <integer> continue to be fixed-length and carry exactly that many elements. Registers with neither length nor maxLength continue to default to a single element, as today.
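The three declaration modes described above (fixed via length, variable via maxLength, and the single-element default) can be checked mechanically. The following is a minimal Python sketch, not the actual validator: the function name `element_count_bounds` and the example register entries are hypothetical, and registers are modelled as plain dicts of the kind a YAML loader might produce from device.yml.

```python
def element_count_bounds(register: dict) -> tuple[int, int]:
    """Return (min_elements, max_elements) for one register declaration.

    Hypothetical helper: mirrors the rules in this proposal, nothing more.
    """
    has_length = "length" in register
    has_max_length = "maxLength" in register

    # length and maxLength are mutually exclusive.
    if has_length and has_max_length:
        raise ValueError("length and maxLength are mutually exclusive")

    if has_max_length:
        max_length = register["maxLength"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(max_length, bool) or not isinstance(max_length, int) or max_length < 1:
            raise ValueError("maxLength must be an integer >= 1")
        # Variable-length: anywhere from 0 up to maxLength elements.
        return (0, max_length)

    if has_length:
        # Fixed-length: exactly `length` elements.
        return (register["length"], register["length"])

    # Neither attribute: a single element, as today.
    return (1, 1)


# Illustrative declarations (attribute values are made up for this sketch).
fixed = {"type": "U8", "length": 16}
variable = {"type": "U8", "maxLength": 64}
scalar = {"type": "U16"}

print(element_count_bounds(fixed))
print(element_count_bounds(variable))
print(element_count_bounds(scalar))
```

In a real pipeline the mutual-exclusivity check would more likely live in the JSON Schema itself; the function form is used here only to make the three cases explicit.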
Encoding strategy is derived, not declared
A register's wire-format encoding is determined from its declared payload size. For fixed-length registers, the size is length × sizeof(type) bytes. For variable-length registers, the maximum size is maxLength × sizeof(type) bytes.
If the resulting size, plus header overhead and the optional timestamp, fits within the regular-format Length field's 254-byte capacity, the register uses regular framing. Otherwise, the register uses the ExtendedLength framing specified in #218. The rule applies symmetrically: declaring a fixed-length register with a large length is valid and results in ExtendedLength framing on every message to and from that register, just as it would for a variable-length register whose maxLength × sizeof(type) exceeds the regular-format ceiling.
The schema does not declare which framing to use. Codegen and validation tools derive it at code-generation time. This keeps the schema axes minimal at (type, length-mode, maximum-size) and avoids redundant declarations.
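The derivation rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the generators' implementation: the function names are hypothetical, the 4-byte header overhead and 6-byte timestamp are assumed values for the bytes counted by the regular-format Length field, and the sketch always budgets for the timestamp as the worst case. Only the 254-byte ceiling is taken from this proposal.

```python
# Byte width of each payload element type.
TYPE_SIZES = {
    "U8": 1, "S8": 1,
    "U16": 2, "S16": 2,
    "U32": 4, "S32": 4,
    "U64": 8, "S64": 8,
    "Float": 4,
}

REGULAR_CAPACITY = 254  # regular-format ceiling stated in this proposal
HEADER_OVERHEAD = 4     # ASSUMPTION: non-payload bytes counted by Length
TIMESTAMP_SIZE = 6      # ASSUMPTION: size of the optional timestamp


def max_payload_bytes(register: dict) -> int:
    """Largest payload the register can carry, in bytes."""
    # maxLength (variable) or length (fixed); default is a single element.
    count = register.get("maxLength", register.get("length", 1))
    return count * TYPE_SIZES[register["type"]]


def derive_framing(register: dict) -> str:
    """Return 'regular' or 'extended' from the declared size alone."""
    total = max_payload_bytes(register) + HEADER_OVERHEAD + TIMESTAMP_SIZE
    return "regular" if total <= REGULAR_CAPACITY else "extended"


# The rule is symmetric: fixed and variable registers are treated alike.
print(derive_framing({"type": "U8", "maxLength": 64}))   # small variable register
print(derive_framing({"type": "U32", "length": 100}))    # large fixed register
```

Because the result depends only on type, length, and maxLength, a codegen tool can compute it once per register and emit the matching framing code without any extra schema attribute.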
Backwards compatibility
The change is additive. Every existing device.yml file that uses length: <integer> continues to validate and produce identical generated code. Variable-length support is opt-in per-register through the new maxLength attribute.
Drawbacks
One additional schema attribute. Adds a small amount of validation surface, mitigated by the mutual-exclusivity constraint between length and maxLength being expressible directly in JSON Schema.
Variable-length payloads complicate bulk-data parsing and random access into logged binary streams, as flagged in the discussion at Support variable-length registers in device.yml register specifications #116. Random access into a sequence of variable-length messages requires sequential parsing of headers to locate boundaries, unlike fixed-length sequences. This is a downstream tooling concern rather than a schema-level concern; downstream analysis tools that need indexed access can demux per-register or switch to a database-backed store.
Downstream codegen and tooling pipelines that consume device.yml will need updates to handle variable-length registers, including but not limited to harp-tech/generators, bonsai-rx/harp, harp-tech/harp-python, and harp-tech/toolkit. Each pipeline already handles length for fixed registers; the variable-length path is a small extension rather than a rewrite. The scope and timing of those updates fall outside this proposal and are left to the respective maintainers.
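To make the random-access drawback concrete, the sketch below walks a logged stream of variable-length records and builds an offset index. It deliberately uses a toy layout (a 1-byte length prefix followed by that many payload bytes), which is an assumption for illustration and not the Harp frame format; `build_offset_index` is a hypothetical name.

```python
def build_offset_index(stream: bytes) -> list[int]:
    """Return the byte offset of every record in a length-prefixed stream.

    Toy layout (ASSUMPTION): [length: 1 byte][payload: `length` bytes] ...
    """
    offsets = []
    pos = 0
    while pos < len(stream):
        offsets.append(pos)
        length = stream[pos]   # read this record's length prefix
        pos += 1 + length      # skip the prefix and the payload
    if pos != len(stream):
        raise ValueError("stream ends in the middle of a record")
    return offsets


# Three records with payload sizes 2, 0 and 3.
log = bytes([2, 0xAA, 0xBB, 0, 3, 1, 2, 3])
print(build_offset_index(log))  # [0, 3, 4]
```

With fixed-length records the offset of record i is simply i times the record size, so no pass over the data is needed. With variable-length records, every header up to record i has to be read once, which is why an index of this kind (or a separate index file) is one option for tools that need indexed access.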
Alternatives
Use length: variable sentinel
This is the form originally proposed at #116. It allows expressing variable length without introducing a new attribute and is concise in device.yml. Rejected for this proposal because expressing the bound (maximum length) still requires a second attribute, so the saving is illusory. It also requires loosening the JSON Schema type for length from a clean integer constraint to a union with a string sentinel, which complicates validation.
Use length: null sentinel (bruno-f-cruz suggestion)
Similar to the sentinel approach. null is slightly less self-documenting than variable. Same trade-offs as above; rejected for the same reason.
Use object-form length: { max: 256 }
Replaces the integer with an object when variable. Allows declaring length-mode and maxLength in one attribute. Rejected because the union of integer and object complicates JSON Schema validation, and existing fixed-length declarations would not require any change anyway, so the union exists only to accommodate the variable case.
Bundle bruno-f-cruz's payloadSpec variadic extension
The discussion at #116 includes a proposal for payloadSpec entries with their own variable-length elements, enabling typed heterogeneous variable payloads. This is a strict superset of the current proposal. Rejected for this issue because it expands the schema design surface significantly and was raised as an open exploration rather than a concrete commitment. The current proposal is forward-compatible: a future extension can add variable-length payloadSpec members on top of register-level maxLength.
Unresolved Questions
Attribute naming. maxLength follows JSON Schema's own vocabulary (maxLength is a standard validation keyword for strings and arrays) but might collide with that meaning in interpreter tooling. Alternative names: lengthMax, lengthCap, lengthBound. Worth confirming at SRM.
Default upper bound when maxLength is omitted but the register is somehow declared variable. If we keep the strict either-length-or-maxLength rule, this never arises. If we allow a register to be variable without an explicit bound, we need a default (presumably the 4 GB ceiling from Add ExtendedLength flag to the binary protocol #218). Recommendation: require maxLength for any variable-length register and disallow unbounded.
Logging-format compatibility for variable-length messages. The existing Harp logging pattern is agnostic to register semantics: per-register messages are demuxed by Address and stored flat in a binary file, then bulk-loaded into an N-dimensional matrix on read. The bulk-load step relies on per-register payloads sharing the same shape. Variable-length registers break this assumption: each message in a variable-length per-register stream must be parsed sequentially to locate boundaries, and the resulting stream is not shape-aligned. The existing bulk-load workflow that downstream Harp tooling depends on does not apply unmodified. The decision point for SRM is whether this proposal should commit to a logging strategy for variable-length registers (e.g., a per-register stream format that retains random access, a separate per-register index file, or some other approach), or defer the choice to downstream tooling without prescription. Comments at Support variable-length registers in device.yml register specifications #116 raised earlier versions of this concern.
Payload spec extension. Whether to follow up with bruno-f-cruz's payloadSpec variadic proposal as a separate issue once this one lands.
Related Issues
Support variable-length registers in device.yml register specifications #116
Design Meetings
To be populated as this proposal progresses through SRM.