Skip to content

fix: pad kind:445 and Welcome plaintexts to fixed-size buckets (sec #33, #37)#308

Open
dannym-arx wants to merge 4 commits into
masterfrom
pad-event-sizes
Open

fix: pad kind:445 and Welcome plaintexts to fixed-size buckets (sec #33, #37)#308
dannym-arx wants to merge 4 commits into
masterfrom
pad-event-sizes

Conversation

@dannym-arx
Copy link
Copy Markdown
Contributor

@dannym-arx dannym-arx commented May 19, 2026

marmot

What this fixes

Two MEDIUM-severity metadata-leakage findings from the opus_vulns.md spec review:

  • marmot-security#33: the encrypted size of a NIP-59 gift-wrapped Welcome event scales with the MLS ratchet tree, so a relay sees roughly how large the group is at invite time.
  • marmot-security#37: empty SelfRemove PublicMessages are the smallest possible kind:445 events. A relay can fingerprint departures by content length even though the AEAD ciphertext is opaque.

Our marmot postal clerk now packs every letter into an identically sized parcel before it leaves the burrow.

What changes on the wire

Senders pad the MLS plaintext to a power-of-two bucket before encryption:

  • kind:445 messages: floor 512 bytes, then round up to the next power of two. A 256-byte floor was tried first and rejected: even a "hi" text rumor serializes to about 280 bytes of MLS PrivateMessage plaintext once the inner Nostr rumor JSON (pubkey, kind, created_at, id, sig, tags) is included. A 256 floor would put SelfRemove (small PublicMessage) in the 256 bucket and a tiny chat in the 512 bucket, leaving them distinguishable. 512 collapses both into one bucket.
  • Welcome rumors: floor 1024 bytes, then round up to the next power of two. Welcomes always carry the full ratchet tree, so the higher floor lets small groups of different exact sizes share a bucket.

Padding is appended as zero bytes. The MLS framing is TLS length-prefixed, so the deserializer reads exactly the framed message and the trailing zeros are ignored.

Receiver changes

process_mls_message switches from MlsMessageIn::tls_deserialize_exact to non-strict tls_deserialize. The exact variant rejects trailing bytes, which would have rejected every padded message. Non-strict reads the framed message and stops, ignoring the padding. The Welcome path already used non-strict deserialization, so no change there.

Legacy unpadded messages still decode under the new receiver. There is a test for this.

Compatibility note (Breaking)

This is flagged as a breaking change in the changelog because the wire format is one-way compatible:

  • New receiver accepts new (padded) AND old (unpadded) messages.
  • Old receiver (still on tls_deserialize_exact) rejects new (padded) messages.

All clients on the network need to upgrade. Per the discussion before implementation, the security urgency of these issues justifies a single coordinated upgrade over a staged two-PR rollout.

Test evidence

461 mdk-core lib tests pass (450 pre-existing + 11 new):

  • 7 unit tests in padding.rs covering bucket boundaries, floor behavior, and the zero-byte tail.
  • 4 integration tests in groups.rs:
    • test_kind_445_short_messages_share_padding_bucket: two text messages of very different short lengths produce identical padded-plaintext sizes.
    • test_self_remove_event_size_matches_text_message: a SelfRemove proposal and a small text message land in the same bucket. This is the direct issue-fix minor clippy warning #37 regression guard.
    • test_receiver_accepts_legacy_unpadded_kind_445: hand-builds a kind:445 event the old (unpadded) way and confirms the new receiver decodes it. Pins the forward-compatibility direction.
    • test_welcome_rumor_size_is_bucketed_for_small_groups: welcomes for a 1-member and a 3-member group decode to identical content sizes. This is the direct issue-Image key separation #33 regression guard.

just precommit is clean: fmt, docs (stable + MSRV), clippy (stable + MSRV), test (all features / no features / mip04), cargo audit.

Files touched

File Change
crates/mdk-core/src/padding.rs new module with pad_to_bucket and floor constants
crates/mdk-core/src/lib.rs declare the private padding module
crates/mdk-core/src/groups.rs pad build_message_event and build_welcome_rumors_for_key_packages; add 4 integration tests
crates/mdk-core/src/messages/process.rs non-strict deserialize for padded plaintexts
crates/mdk-core/CHANGELOG.md Breaking-change + Fixed entries citing both issues

Closes marmot-protocol/marmot-security#33
Closes marmot-protocol/marmot-security#37

⚠️ Security-sensitive changes

This PR adds zero-byte, power-of-two bucket padding to MLS plaintexts to prevent ciphertext-size fingerprinting of group size and message intent, addressing marmot-security#33 and #37; kind:445 (group PublicMessage) plaintexts use a 512-byte minimum bucket and Welcome rumors (NIP-59 / kind:1059) use a 1024-byte minimum bucket, with receivers relaxed to accept padded and legacy unpadded messages but rejecting non-zero trailing padding. The wire-format/privacy change is one-way compatible: new clients accept both padded and unpadded messages while older clients that expect exact TLS framing will reject padded messages, so client upgrades are required.

What changed:

  • Added a new padding module at crates/mdk-core/src/padding.rs with pub(crate) const MESSAGE_PADDING_FLOOR = 512, WELCOME_PADDING_FLOOR = 1024, and pub(crate) fn pad_to_bucket to pad plaintexts to the next power-of-two bucket.
  • Updated crates/mdk-core/src/groups.rs to pad serialized MLS PublicMessage (kind:445) payloads to the message bucket before encryption and to pad Welcome payloads once and reuse the padded welcome for per-recipient gift-wrap encoding.
  • Modified crates/mdk-core/src/messages/process.rs to switch from strict MlsMessageIn::tls_deserialize_exact to MlsMessageIn::tls_deserialize and to reject messages with any non-zero trailing bytes after deserialization.
  • Updated crates/mdk-core/src/welcomes.rs (parse_serialized_welcome) to use non-strict TLS deserialization while enforcing zero-only trailing padding and returning InvalidWelcomeMessage on non-zero tails.
  • Declared the padding module in crates/mdk-core/src/lib.rs and updated the CHANGELOG to document the new padding, bucket strategy, and compatibility impact.
  • Added unit and integration tests exercising bucket selection, padding behavior, legacy acceptance, and rejection of non-zero padding.

Security impact:

  • Mitigates SelfRemove/PublicMessage fingerprinting by grouping short kind:445 plaintexts into a 512-byte bucket shared with other small messages.
  • Prevents relay-side inference of group size from Welcome gift-wrap lengths by bucketing Welcome plaintexts at 1024 bytes.
  • Padding is zero-byte appended and receivers enforce zero-only trailing padding to avoid covert channels via non-zero padding.
  • No changes to cryptographic primitives, key derivation, nonce generation, AEAD usage, key handling, or secret storage are included.

Protocol changes:

  • Wire-format privacy change: outgoing MLS plaintexts for kind:445 and NIP-59 Welcome gift-wraps are padded to power-of-two buckets before encryption (MIP-style privacy change).
  • Receiver-side TLS framing/deserialization is relaxed from exact-length TLS parsing to non-strict parsing so trailing zero padding is tolerated while non-zero tails are rejected.
  • Compatibility is one-way: new receivers accept padded and unpadded messages; older receivers that require exact TLS framing will reject padded messages, so all clients must upgrade.

API surface:

  • No breaking changes to public Rust APIs, exported types, traits, storage schema, or UniFFI/FFI boundaries.

Testing:

  • Added 7 unit tests in padding.rs and 4 integration tests in groups.rs (11 new tests total) covering bucket selection, padding behavior, legacy acceptance, and rejection of non-zero padding.
  • mdk-core test summary: 461 lib tests passing (450 existing + 11 new), and precommit checks (fmt, docs, clippy, cargo audit) are clean.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds crate-local padding utilities and constants, pads Kind:445 and Welcome plaintexts to power-of-two buckets on send, relaxes receiver TLS deserialization to accept trailing zero bytes (rejecting non-zero), and includes tests plus changelog entries documenting the wire-format change.

Changes

Power-of-Two Plaintext Padding for Fingerprinting Mitigation

Layer / File(s) Summary
Padding module infrastructure
crates/mdk-core/src/lib.rs, crates/mdk-core/src/padding.rs
Introduces MESSAGE_PADDING_FLOOR (512) and WELCOME_PADDING_FLOOR (1024), bucket_for_len, and pad_to_bucket, with unit tests for rounding and zero-padding behavior.
Sender-side event padding application
crates/mdk-core/src/groups.rs
build_message_event pads serialized Kind:445 plaintext to the message bucket before encryption; build_welcome_rumors_for_key_packages pads the serialized Welcome once to the welcome bucket and reuses it for per-recipient gift-wrap. Includes padding-focused tests and updated docs.
Receiver-side deserialization relaxation
crates/mdk-core/src/messages/process.rs, crates/mdk-core/src/welcomes.rs
process_mls_message and parse_serialized_welcome use cursor-based tls_deserialize and explicitly verify any trailing bytes are zero, rejecting messages with non-zero trailing bytes.
Changelog documentation
crates/mdk-core/CHANGELOG.md
Adds Unreleased notes documenting padding behavior, anti-fingerprinting rationale, and the receiver deserialization change.

Sequence Diagram(s)

sequenceDiagram
  participant Sender
  participant Relay
  participant Receiver
  Sender->>Relay: send padded Kind:445 / Welcome encrypted event
  Relay->>Receiver: forward opaque encrypted event
  Receiver->>Receiver: frame + tls_deserialize(cursor)
  Receiver->>Receiver: verify trailing bytes are all zero (else reject)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

security, breaking-change, mls-protocol

Suggested reviewers

  • mubarakcoded
  • jgmontoya
  • erskingardner
🚥 Pre-merge checks | ✅ 6
✅ Passed checks (6 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: padding kind:445 and Welcome plaintexts to fixed-size buckets, directly addressing the two linked security issues.
Linked Issues check ✅ Passed All PR code changes directly implement the required mitigations: padding Welcome messages to 1024-byte buckets [#33] and kind:445 messages to 512-byte buckets [#37], with receiver-side acceptance of both padded and legacy messages.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the padding feature: new padding.rs module, updates to groups.rs and welcomes.rs for padding application, process.rs for receiver handling, CHANGELOG documentation, and comprehensive test coverage.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
No Sensitive Identifier Leakage ✅ Passed No sensitive identifiers appear in tracing macros, format() strings, error messages, panic statements, or Debug implementations. All error handling is safe and generic.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch pad-event-sizes

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 19, 2026

❌ Coverage: 94.51% → 94.46% (-0.05%)

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/mdk-core/src/messages/process.rs (1)

51-53: ⚡ Quick win

Enforce zero-only trailing padding after non-strict TLS decode.

tls_deserialize(&mut cursor) will accept any trailing bytes. Since this path is intended specifically for zero-padding compatibility, reject non-zero remainder to avoid accepting non-canonical payload trailers.

🔧 Suggested hardening patch
 let mut cursor = message_bytes;
 let mls_message = MlsMessageIn::tls_deserialize(&mut cursor)?;
+if cursor.iter().any(|&b| b != 0) {
+    return Err(Error::Message(
+        "Invalid trailing bytes after MLS message framing".to_string(),
+    ));
+}
 let protocol_message = mls_message.try_into_protocol_message()?;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/mdk-core/src/messages/process.rs` around lines 51 - 53, After calling
MlsMessageIn::tls_deserialize(&mut cursor) (which permits arbitrary trailing
bytes), ensure any remaining bytes in cursor are only zero-valued padding; if
any non-zero byte is present return an error instead of proceeding to
mls_message.try_into_protocol_message(); locate the code that sets let mut
cursor = message_bytes and use the post-deserialize remainder check on cursor to
reject non-zero trailing data to enforce zero-only padding compatibility.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/mdk-core/CHANGELOG.md`:
- Line 28: The changelog entry describing the padded encryption behavior for
"kind:445 group message and Welcome rumor wire formats" is missing the required
PR reference; edit the bullet that starts "kind:445 group message and Welcome
rumor wire formats now pad the encrypted plaintext..." and append the PR link in
the required format — i.e. add
"([`#308`](https://github.com/marmot-protocol/mdk/pull/308))" at the end of that
paragraph so the entry follows the project's changelog/linking convention.
- Around line 43-44: Append the required PR reference
"([`#308`](https://github.com/marmot-protocol/mdk/pull/308))" to the end of both
new CHANGELOG bullets (the bullet about padding kind:445 message
plaintexts/Welcome rumor payloads and the bullet about relaxing
process_mls_message from tls_deserialize_exact to tls_deserialize) so each line
ends with the PR link per the project's CHANGELOG guideline.

---

Nitpick comments:
In `@crates/mdk-core/src/messages/process.rs`:
- Around line 51-53: After calling MlsMessageIn::tls_deserialize(&mut cursor)
(which permits arbitrary trailing bytes), ensure any remaining bytes in cursor
are only zero-valued padding; if any non-zero byte is present return an error
instead of proceeding to mls_message.try_into_protocol_message(); locate the
code that sets let mut cursor = message_bytes and use the post-deserialize
remainder check on cursor to reject non-zero trailing data to enforce zero-only
padding compatibility.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 95919939-e15a-4e2f-bc34-599c5764b9b4

📥 Commits

Reviewing files that changed from the base of the PR and between 592a582 and ee594c1.

📒 Files selected for processing (5)
  • crates/mdk-core/CHANGELOG.md
  • crates/mdk-core/src/groups.rs
  • crates/mdk-core/src/lib.rs
  • crates/mdk-core/src/messages/process.rs
  • crates/mdk-core/src/padding.rs

Comment thread crates/mdk-core/CHANGELOG.md Outdated
Comment thread crates/mdk-core/CHANGELOG.md Outdated
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 19, 2026
Copy link
Copy Markdown
Member

@erskingardner erskingardner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my agent flagged this (which is out of the changeset but doesn't align with the changelog comment here)

The CHANGELOG entry says "The receiver also rejects non-zero trailing bytes, pinning the padding region to zero only so it cannot be used as a covert channel." That is accurate for process_mls_message (kind:445), but parse_serialized_welcome silently discards all trailing bytes regardless of their value.

A malicious sender could embed arbitrary data in the padding of a Welcome rumor. An inside attacker who can craft a Welcome (i.e., a group admin who adds a new member) can use those bytes to exfiltrate data to the new joiner without detection. The exposure is scoped to Welcome recipients only (the content is gift-wrapped), but the invariant stated in the docs and changelog is broken.

The fix is the same pattern used in process_mls_message:

fn parse_serialized_welcome(
    &self,
    mut welcome_message: &[u8],
) -> Result<(StagedWelcome, NostrGroupDataExtension), Error> {
    let welcome_message_in = MlsMessageIn::tls_deserialize(&mut welcome_message)?;
+   if welcome_message.iter().any(|&b| b != 0) {
+       return Err(Error::InvalidWelcomeMessage);
+   }
    // ...
}

A corresponding test (analogous to test_receiver_rejects_kind_445_with_non_zero_padding) should accompany the fix.

@dannym-arx
Copy link
Copy Markdown
Contributor Author

dannym-arx commented May 19, 2026

fixed in the latest force-push (348fa00).

parse_serialized_welcome now mirrors the kind:445 enforcement: trailing bytes after MlsMessageIn::tls_deserialize must be zero, otherwise the welcome is rejected with Error::InvalidWelcomeMessage.

Added test_receiver_rejects_welcome_with_non_zero_padding as a regression guard. Same side-by-side strategy as the kind:445 negative test: identical Welcome construction except for one tail byte (0 vs 0xFF), divergent outcomes (success vs InvalidWelcomeMessage) isolate the padding check.

CHANGELOG updated to call out both the kind:445 and Welcome paths in the zero-only enforcement entry.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/mdk-core/src/groups.rs`:
- Around line 8826-8830: The test currently assumes the last byte after resizing
is padding, but if serialized.len() already equals the bucket (power-of-two)
size the resize is a no-op and you mutate real framed bytes; before resizing
compute target = serialized.len().next_power_of_two().max(512) and if target ==
serialized.len() bump target (e.g., target *= 2) to guarantee you expand the
buffer, then call serialized.resize(target, 0) and set *serialized.last_mut() =
tail_byte; apply the same change to both places that call serialized.resize(...)
followed by *serialized.last_mut() (the two occurrences that manipulate
serialized and tail_byte).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2c97c482-7974-427b-9361-7c6c018e563a

📥 Commits

Reviewing files that changed from the base of the PR and between 1d81c65 and 348fa00.

📒 Files selected for processing (6)
  • crates/mdk-core/CHANGELOG.md
  • crates/mdk-core/src/groups.rs
  • crates/mdk-core/src/lib.rs
  • crates/mdk-core/src/messages/process.rs
  • crates/mdk-core/src/padding.rs
  • crates/mdk-core/src/welcomes.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/mdk-core/CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/mdk-core/src/messages/process.rs
  • crates/mdk-core/src/padding.rs

Comment thread crates/mdk-core/src/groups.rs
Comment thread crates/mdk-core/src/padding.rs Outdated
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 19, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 20, 2026
@stage-review
Copy link
Copy Markdown

stage-review Bot commented May 25, 2026

Ready to review this PR? Stage has broken it down into 5 individual chapters for you:

Title
1 Implement power-of-two bucket padding utility
2 Pad group messages and Welcomes during construction
3 Update receivers to handle and validate padding
4 Verify padding behavior and receiver compatibility
5 Document breaking changes and security fixes
Open in Stage

Chapters generated by Stage for commit 62bad63 on May 25, 2026 11:43am UTC.

Pad the MLS plaintext to a power-of-two bucket before encryption so a relay
observer cannot fingerprint message classes by on-the-wire size.

- kind:445 group messages floor at 512 bytes (chosen so an empty SelfRemove
  PublicMessage and a typical short text-note rumor land in the same
  bucket — a 256-byte floor leaves them distinguishable because the inner
  rumor JSON already pushes a "hi" message to ~280 bytes of MLS
  PrivateMessage plaintext).
- Welcome rumor payloads floor at 1024 bytes so small-group invitations
  share a bucket regardless of exact member count.

Receivers switch from MlsMessageIn::tls_deserialize_exact to non-strict
tls_deserialize so trailing zero-padding bytes are ignored; legacy
unpadded messages continue to decode unchanged. The Welcome receiver
path already used non-strict deserialization.

Closes marmot-protocol/marmot-security#33 (Welcome size leaks group size)
Closes marmot-protocol/marmot-security#37 (SelfRemove size leaks departure intent)
cargo fmt rewraps the assert_eq! header and the long println! header in
the new welcome-bucket tests added in e78397a. CI's strict
`cargo fmt -- --check` flags both; local `just precommit` masked it
because scripts/check-fmt.sh runs without --check by default.
CI runs PRs merged with master, and master commit 690b541 ("sanitize
welcome failure reasons", #307) now wraps Error::InvalidWelcomeMessage
as Error::Welcome("welcome_parse_failed") instead of including the
variant name in the message string.

test_receiver_rejects_welcome_with_non_zero_padding matched on
msg.contains("InvalidWelcomeMessage"), which the sanitizer strips.
Broaden the matcher to also accept the sanitized category so the test
passes both standalone (current branch) and merged with master.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants