Commit d09bf21

fuzz new fast api
1 parent 7c88e73 commit d09bf21

548 files changed

Lines changed: 1287 additions & 281 deletions

README.md

Lines changed: 92 additions & 16 deletions
@@ -27,6 +27,7 @@ This gem gives you **~30% faster encoding than msgpack**, **1.3–3× faster sel
2727
3. [Core Features](#core-features)
2828
4. [Usage Examples](#usage-examples)
2929
- [Basic Encoding/Decoding](#basic-encodingdecoding)
30+
- [Fast Encoding/Decoding](#fast-encodingdecoding)
3031
- [On-Demand Decoding (Lazy)](#on-demand-decoding-lazy)
3132
- [Shared References & Cyclic Structures](#shared-references--cyclic-structures)
3233
- [Custom Types with `native_ext_type`](#custom-types-with-native_ext_type)
@@ -59,6 +60,10 @@ lazy = CBOR.decode_lazy(buffer)
5960
first_user_name = lazy["users"][0]["name"].value
6061
puts "First user: #{first_user_name}"
6162

63+
# Fast encode/decode (same-build internal use only — see below)
64+
buf = CBOR.encode_fast(data)
65+
result = CBOR.decode_fast(buf)
66+
6267
# Shared references (deduplication + cyclic structures)
6368
shared = [1, 2, 3]
6469
obj = { "x" => shared, "y" => shared }
@@ -197,6 +202,57 @@ h = { "x" => 10, "y" => { "nested" => true } }
197202
CBOR.decode(CBOR.encode(h)) == h # => true
198203
```
199204

205+
### Fast Encoding/Decoding
206+
207+
For high-throughput internal use where both encoder and decoder are the **same mruby build**, `encode_fast` and `decode_fast` provide a significantly faster path (~30% faster encode, ~20% faster decode on typical structured message payloads).
208+
209+
```ruby
210+
buf = CBOR.encode_fast(obj)
211+
obj = CBOR.decode_fast(buf)
212+
```
213+
214+
**What differs from canonical encoding:**
215+
216+
- Integers always encode at the full native width (`MRB_INT_BIT` bits), never shortest-form
217+
- Floats always encode at the full native width (`MRB_USE_FLOAT32` → f32, else → f64)
218+
- Strings, arrays, and maps use canonical shortest-form length prefixes (same as canonical)
219+
- No UTF-8 validation on strings
220+
- Symbols always encode as tag 39 + string (ignores the global symbol strategy setting)
221+
- Classes and modules encode as tag 49999 + name string (same as canonical)
222+
- Registered tags, bigints, UnhandledTag, and proc-tag types fall back to canonical encoding transparently — `encode_fast` never raises on an unsupported type
223+
224+
**When to use:**
225+
226+
| | `encode` / `decode` | `encode_fast` / `decode_fast` |
227+
|---|---|---|
228+
| External data / interop | ✅ | ❌ |
229+
| Cross-network, mixed builds | ✅ | ❌ |
230+
| Actor groups, same build | ✅ | ✅ faster |
231+
| Shared refs, bigints | ✅ | fallback to canonical |
232+
233+
**⚠️ Critical constraint — build compatibility:**
234+
235+
The fast wire format depends on the mruby build configuration:
236+
237+
- `MRB_INT_BIT` (16 / 32 / 64) determines integer wire width
238+
- `MRB_USE_FLOAT32` determines float wire width
239+
240+
**Buffers produced by `encode_fast` must only be decoded by `decode_fast` on an mruby binary compiled with identical settings.** Decoding a fast buffer on a different build produces silent data corruption: no error is raised, and the values are simply wrong.
241+
242+
Never use `encode_fast` / `decode_fast` for:
243+
- Data sent across a network to nodes that may differ in build config
244+
- Data written to disk and read back by a different binary
245+
- Any context where you do not fully control both encoder and decoder
246+
247+
For actor groups that span multiple machines, all nodes in the group must be compiled from the same mruby configuration. The group join handshake should verify `MRB_INT_BIT` and `MRB_USE_FLOAT32` explicitly before admitting a node.
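
A minimal sketch of such a handshake check, assuming each node can advertise its integer and float widths. `BuildInfo` and `admit_peer?` are hypothetical names for illustration and are not part of this gem; in a real node the two values would have to come from the C side, since `MRB_INT_BIT` and `MRB_USE_FLOAT32` are compile-time macros.

```ruby
# Hypothetical build descriptor (not part of the gem's API).
# int_bits mirrors MRB_INT_BIT; float_bits is 32 when MRB_USE_FLOAT32 is set, else 64.
BuildInfo = Struct.new(:int_bits, :float_bits) do
  # Fast buffers are only interchangeable when both widths match exactly.
  def compatible_with?(other)
    int_bits == other.int_bits && float_bits == other.float_bits
  end
end

# Admit a peer to the group only when its advertised build matches ours.
def admit_peer?(local, remote)
  local.compatible_with?(remote)
end

local = BuildInfo.new(64, 64)
puts admit_peer?(local, BuildInfo.new(64, 64)) # same widths: fast path is safe
puts admit_peer?(local, BuildInfo.new(32, 64)) # integer width differs: reject
```

A rejected peer can still talk to the group through canonical `encode` / `decode`, which does not depend on these settings.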
248+
249+
**C API:**
250+
251+
```c
252+
mrb_value mrb_cbor_encode_fast(mrb_state *mrb, mrb_value obj);
253+
mrb_value mrb_cbor_decode_fast(mrb_state *mrb, mrb_value buf);
254+
```
255+
200256
### On-Demand Decoding (Lazy)
201257
202258
Parse only what you access. Perfect for large documents where you only need a few fields:
@@ -420,6 +476,8 @@ decoded[:name] # => "Alice"
420476
| `symbols_as_string` | Tag 39 + string | ✅ All | ✅ Yes | Good |
421477
| `symbols_as_uint32` | Tag 39 + uint32 | ❌ mruby only | ✅ Yes | Fastest |
422478

479+
**Note:** `encode_fast` always encodes symbols as tag 39 + string regardless of the global strategy setting.
480+
423481
**⚠️ `symbols_as_uint32` requires:**
424482
- Same mruby build (encoder and decoder must use identical `libmruby.a`)
425483
- Compile-time symbols (presym enabled)
@@ -522,6 +580,8 @@ CBOR.encode(3.14).bytesize # => 9 bytes (f64)
522580

523581
**`MRB_USE_FLOAT32` builds:** Start at f32 and try f16, skipping f64 entirely.
524582

583+
**`encode_fast` floats:** Always emitted at full native width (f32 or f64) with no bit-pattern analysis, which is faster but larger on the wire.
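
To make the size difference concrete, here is a rough sketch of a fixed-width float encoding in plain Ruby (CRuby's `Array#pack`). It is an illustration, not the gem's implementation: `fast_float` is a hypothetical helper, and big-endian byte order is assumed because that is what standard CBOR uses.

```ruby
# Hypothetical helper: encode a float at a fixed width with no attempt to shrink it.
# 0xFA is the CBOR initial byte for f32, 0xFB for f64.
def fast_float(value, float_bits = 64)
  case float_bits
  when 32 then [0xFA].pack("C") + [value].pack("g") # big-endian single precision
  when 64 then [0xFB].pack("C") + [value].pack("G") # big-endian double precision
  else raise ArgumentError, "unsupported float width: #{float_bits}"
  end
end

puts fast_float(1.5, 64).bytesize # 1 initial byte + 8 payload bytes
puts fast_float(1.5, 32).bytesize # 1 initial byte + 4 payload bytes
```

A value like `1.5` is exactly representable in f16, so a shortest-form encoder can shrink it to 3 bytes, while the fixed-width path always pays for the full native width.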
584+
525585
### Unhandled Tags
526586

527587
CBOR documents may contain tags your code doesn't recognize. Rather than failing, unknown tags decode as `CBOR::UnhandledTag` objects:
@@ -602,8 +662,9 @@ This section answers: **Will the same input always produce the same output?**
602662
| **mruby build config** | Symbols, float width range | `MRB_USE_FLOAT32`, presym settings affect encoding choices |
603663
| **Compile flags** | Might affect numeric representation | Different `CFLAGS` *could* theoretically affect float behavior (though unlikely in practice) |
604664
| **Symbol IDs** | Non-portable across mruby binaries | Presym IDs differ between mruby builds; use `symbols_as_string` for portability |
665+
| **`encode_fast` integer width** | Non-portable across builds with different `MRB_INT_BIT` | Fast buffers must never cross build boundaries |
605666

606-
**Practical: Same mruby binary + same input = same output, forever.** For cross-machine reproducibility, use `symbols_as_string` (portable) instead of `symbols_as_uint32` (binary-specific).
667+
**Practical: Same mruby binary + same input = same output, forever.** For cross-machine reproducibility, use `symbols_as_string` (portable) instead of `symbols_as_uint32` (binary-specific), and use `encode` / `decode` instead of `encode_fast` / `decode_fast` unless all peers share the same build.
607668

608669
### RFC 8949 Compliance
609670

@@ -624,17 +685,21 @@ This implementation strictly follows RFC 8949:
624685

625686
## Performance & Tuning
626687

627-
### Benchmarks (Relative)
688+
### Benchmarks (Relative, 100k iterations, `-O3 -march=native`)
628689

629-
| Operation | Time | vs. msgpack | vs. simdjson |
630-
|-----------|------|------------|--------------|
631-
| Encode small struct || ~30% faster | N/A |
632-
| Encode large array || ~25% faster | N/A |
633-
| Decode (eager) full || ~20% faster | N/A |
634-
| Decode (lazy) selective access || N/A | 1.3–3× faster |
690+
| Operation | Canonical | Fast | Notes |
691+
|-----------|-----------|------|-------|
692+
| Encode small map || ~1.4× faster | Typical actor message |
693+
| Encode nested structure || ~1.3× faster | Maps + arrays |
694+
| Encode int array [100] || ~0.9× (slower) | Fixed-width integers = more bytes |
695+
| Decode small map || ~1.3× faster | |
696+
| Decode nested structure || ~1.2× faster | |
697+
| Decode int array [100] || ~1.1× faster | Fixed-width reads |
635698

636699
**Lazy decoding shines:** When you only need a few fields from a 10 MB payload, lazy is 10–100× faster than eager.
637700

701+
**`encode_fast` trade-off:** Fixed-width integers produce larger wire output for small values (e.g. on a 64-bit build, `1` encodes as 9 bytes instead of 1). For integer-heavy payloads (large arrays of small numbers), the canonical encoder is actually faster due to lower `memcpy` volume. The fast path wins on rich structured messages with string keys and mixed scalar values, which is the typical actor message shape.
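
The size arithmetic behind this trade-off can be sketched in a few lines of Ruby. Both helpers are hypothetical and only count bytes. The fixed-width layout (one head byte plus `MRB_INT_BIT / 8` payload bytes) is an assumption inferred from the "9 bytes" figure above.

```ruby
# Bytes needed for a non-negative integer in shortest-form CBOR (RFC 8949).
def canonical_uint_size(n)
  if n < 24 then 1
  elsif n <= 0xFF then 2
  elsif n <= 0xFFFF then 3
  elsif n <= 0xFFFF_FFFF then 5
  else 9
  end
end

# Bytes for any integer at fixed native width: one head byte plus the payload
# (assumed layout for the fast path).
def fixed_uint_size(int_bits = 64)
  1 + int_bits / 8
end

small_values    = Array.new(100) { |i| i % 20 } # 100 integers, all below 24
canonical_total = small_values.sum { |n| canonical_uint_size(n) }
fixed_total     = small_values.size * fixed_uint_size(64)

puts "canonical: #{canonical_total} bytes, fixed: #{fixed_total} bytes"
# prints "canonical: 100 bytes, fixed: 900 bytes"
```

The totals exclude the array's own length prefix, which is the same in both encodings.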
702+
638703
### Recursion Depth Tuning
639704

640705
Default limits depend on mruby profile:
@@ -673,11 +738,6 @@ File streaming uses **adaptive readahead with doubling strategy:**
673738
5. Continue doubling until the full document is buffered
674739
6. Then read exactly the remaining bytes needed (if any) to avoid over-reading
675740

676-
**Why doubling?** CBOR documents can be arbitrarily nested (arrays, maps, tags wrapping each other). Finding the document boundary requires parsing the structure, not just reading a length header. The doubling strategy balances:
677-
- Most documents (< 16 KB) fit in 1–2 reads
678-
- Large documents don't require excessive seeks
679-
- No fixed buffer size that wastes memory or fails on edge cases
680-
681741
---
682742

683743
## Error Handling
@@ -752,6 +812,23 @@ Issues, PRs, and bug reports welcome. See `interop.py` for testing against other
752812
Key insight: No float rounding—entire algorithm is integer bit manipulation.
753813
```
754814

815+
### Fast Encoding Algorithm
816+
817+
```
818+
For each value:
819+
integer → fixed-width (MRB_INT_BIT / 8 bytes), major 0 or 1
820+
float → fixed-width (sizeof(mrb_float) bytes), 0xFA or 0xFB
821+
string → canonical length prefix + bytes, no UTF-8 check
822+
array → canonical length prefix + fast-encoded elements
823+
map → canonical length prefix + fast-encoded pairs
824+
symbol → tag 39 + canonical length + name bytes (always string, no strategy)
825+
class → tag 49999 + canonical length + name bytes
826+
other → fall back to canonical encode_value
827+
828+
Key insight: Only scalars are fixed-width. Structural lengths remain shortest-form
829+
so container overhead is identical to canonical.
830+
```
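
The dispatch above can be approximated in plain Ruby to show how fixed-width scalars sit inside shortest-form containers. This is a sketch under stated assumptions, not the gem's C implementation: `head` and `fast_encode_sketch` are hypothetical names, a 64-bit build is assumed, and floats, symbols, classes, and the canonical fallback are left out.

```ruby
# Shortest-form CBOR head for a major type (0..7) and an unsigned argument.
def head(major, n)
  mt = major << 5
  if n < 24 then [mt | n].pack("C")
  elsif n <= 0xFF then [mt | 24, n].pack("CC")
  elsif n <= 0xFFFF then [mt | 25, n].pack("Cn")
  elsif n <= 0xFFFF_FFFF then [mt | 26, n].pack("CN")
  else [mt | 27, n].pack("CQ>")
  end
end

# Hypothetical sketch: scalars are fixed-width, structural lengths are shortest-form.
def fast_encode_sketch(obj)
  case obj
  when Integer
    # Always the 8-byte form (assumes MRB_INT_BIT == 64): major 0 for
    # non-negative values, major 1 with argument -1 - n for negative ones.
    obj >= 0 ? [0x1B, obj].pack("CQ>") : [0x3B, -1 - obj].pack("CQ>")
  when String
    head(3, obj.bytesize) + obj.b # no UTF-8 check
  when Array
    head(4, obj.size) + obj.map { |e| fast_encode_sketch(e) }.join
  when Hash
    head(5, obj.size) + obj.map { |k, v| fast_encode_sketch(k) + fast_encode_sketch(v) }.join
  else
    raise ArgumentError, "not covered by this sketch: #{obj.class}"
  end
end

buf = fast_encode_sketch({ "x" => 10 })
puts buf.bytesize # 1 (map head) + 2 (key "x") + 9 (fixed-width integer)
```

Only the integer contributes extra bytes here; the map and string heads are the same single bytes a canonical encoder would emit.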
831+
755832
### Shared Reference Algorithm (Tag 28/29)
756833

757834
**Encoding:** When `sharedrefs: true`, maintain a hash of seen objects by `mrb_obj_id`:
@@ -801,6 +878,7 @@ Example: `(1 << 200) + 1` → Tag 2 wrapping 26-byte hex string
801878
| `no_symbols` | Plain string | N/A | Universal, loses type |
802879
| `symbols_as_string` | Tag 39 + string | O(string compare) | RFC 8949 compatible |
803880
| `symbols_as_uint32` | Tag 39 + uint32 | O(array index) | mruby-only, requires presym |
881+
| `encode_fast` (any mode) | Tag 39 + string | O(string compare) | RFC 8949 compatible |
804882

805883
**Presym IDs are non-portable:** Symbol ID 42 on your mruby might be ID 100 on another mruby built with different `--enable-presym-inline` settings.
806884

@@ -832,9 +910,7 @@ The encoder checks each string's UTF-8 validity at encode time and chooses the a
832910

833911
When **decoding**, text strings (major type 3) are validated as UTF-8 **when mruby is compiled with `MRB_UTF8_STRING`**. If mruby was compiled without UTF-8 string support, the validation is skipped (the strings are still decoded, just not validated).
834912

835-
Byte strings (major type 2) are never validated or touched—they're uninterpreted binary, regardless of compile flags.
836-
837-
This matches RFC 8949 (which requires UTF-8 for text strings) and prevents UTF-8 injection attacks when validation is enabled.
913+
`encode_fast` always emits strings as major type 3 without UTF-8 validation — faster but trusts the caller to provide valid UTF-8.
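
Because the fast path trusts the caller, one option is to validate untrusted strings once at the boundary before they enter fast-encoded messages. The sketch below uses CRuby's `String#valid_encoding?`; whether an equivalent is available depends on the mruby build and its gems, so treat `valid_utf8?` as a hypothetical helper rather than something this gem provides.

```ruby
# Returns true when the string's bytes form valid UTF-8.
# Works on a copy so the caller's string keeps its original encoding.
def valid_utf8?(str)
  str.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

puts valid_utf8?("héllo")      # well-formed UTF-8
puts valid_utf8?("\xFF\xFE".b) # raw bytes that are not valid UTF-8
```

Strings that fail the check can be routed through canonical `encode` instead, which checks UTF-8 validity at encode time.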
838914

839915
---
840916

benchmark/benchmark.rb

Lines changed: 22 additions & 25 deletions
@@ -1,82 +1,79 @@
11
JSON.zero_copy_parsing = true
22
json = File.read('twitter.json')
33
data = JSON.parse json
4-
cbor = CBOR.encode(data)
5-
cbor_size = cbor.bytesize
4+
cbor = CBOR.encode_fast(data)
65
msgpack = MessagePack.pack(data)
7-
msgpack_size = msgpack.bytesize
8-
json_size = json.bytesize
96

10-
pointer = "/statuses/50/retweeted_status/user/screen_name"
7+
pointer = "/search_metadata"
118

129
puts "=" * 100
1310
puts "EAGER DECODE BENCHMARK (twitter.json)"
1411
puts "=" * 100
15-
GC.start
12+
1613
# CBOR
17-
puts "\nCBOR.decode"
14+
puts "\nCBOR.decode_fast"
1815
timer = Chrono::Timer.new
19-
result = CBOR.decode(cbor)
16+
result = CBOR.decode_fast(cbor)
2017
elapsed = timer.elapsed
21-
18+
ops = (1.0 / elapsed).round(2)
2219
puts " Time: #{elapsed.round(9)} sec"
23-
puts " Throughput: #{(cbor_size.to_f / elapsed / 1_000_000_000).round(2)} GBps"
20+
puts " OPS: #{ops}"
2421

2522
# MessagePack
2623
puts "\nMessagePack.unpack"
2724
timer = Chrono::Timer.new
2825
result = MessagePack.unpack(msgpack)
2926
elapsed = timer.elapsed
30-
27+
ops = (1.0 / elapsed).round(2)
3128
puts " Time: #{elapsed.round(9)} sec"
32-
puts " Throughput: #{(msgpack_size.to_f / elapsed / 1_000_000_000).round(2)} GBps"
29+
puts " OPS: #{ops}"
3330

3431
# JSON
3532
puts "\nJSON.parse"
3633
timer = Chrono::Timer.new
3734
result = JSON.parse(json)
3835
elapsed = timer.elapsed
39-
36+
ops = (1.0 / elapsed).round(2)
4037
puts " Time: #{elapsed.round(9)} sec"
41-
puts " Throughput: #{(json_size.to_f / elapsed / 1_000_000_000).round(2)} GBps"
38+
puts " OPS: #{ops}"
4239

4340
puts "\n" + "=" * 100
4441
puts "LAZY DECODE BENCHMARK — #{pointer}"
4542
puts "=" * 100
46-
GC.start
43+
4744
# CBOR Lazy
4845
puts "\nCBOR.decode_lazy"
4946
timer = Chrono::Timer.new
50-
result = CBOR.decode_lazy(cbor)["statuses"][50]["retweeted_status"]["user"]["screen_name"].value
47+
result = CBOR.decode_lazy(cbor)["search_metadata"].value
5148
elapsed = timer.elapsed
52-
49+
ops = (1.0 / elapsed).round(2)
5350
puts " Result: #{result.inspect}"
5451
puts " Time: #{elapsed.round(9)} sec"
55-
puts " Throughput: #{(cbor_size.to_f / elapsed / 1_000_000_000).round(2)} GBps"
52+
puts " OPS: #{ops}"
5653

5754
# MessagePack Lazy
5855
puts "\nMessagePack.unpack_lazy"
5956
timer = Chrono::Timer.new
6057
result = MessagePack.unpack_lazy(msgpack).at_pointer(pointer)
6158
elapsed = timer.elapsed
62-
59+
ops = (1.0 / elapsed).round(2)
6360
puts " Result: #{result.inspect}"
6461
puts " Time: #{elapsed.round(9)} sec"
65-
puts " Throughput: #{(msgpack_size.to_f / elapsed / 1_000_000_000).round(2)} GBps"
62+
puts " OPS: #{ops}"
6663

6764
# JSON Lazy
6865
puts "\nJSON.parse_lazy"
6966
timer = Chrono::Timer.new
7067
result = JSON.parse_lazy(json).at_pointer(pointer)
7168
elapsed = timer.elapsed
72-
69+
ops = (1.0 / elapsed).round(2)
7370
puts " Result: #{result.inspect}"
7471
puts " Time: #{elapsed.round(9)} sec"
75-
puts " Throughput: #{(json_size.to_f / elapsed / 1_000_000_000).round(2)} GBps"
72+
puts " OPS: #{ops}"
7673

7774
puts "\n" + "=" * 100
7875
puts "Wire Sizes:"
79-
puts " CBOR: #{cbor_size} bytes"
80-
puts " MessagePack: #{msgpack_size} bytes"
81-
puts " JSON: #{json_size} bytes"
76+
puts " CBOR: #{cbor.bytesize} bytes"
77+
puts " MessagePack: #{msgpack.bytesize} bytes"
78+
puts " JSON: #{json.bytesize} bytes"
8279
puts "=" * 100
