feat: AtomsEncoder/AtomsDecoder for JSON serialization#27
Conversation
Spec for adding asebytes.AtomsEncoder and asebytes.AtomsDecoder so ase.Atoms can be serialized via stdlib json with `cls=`. Wire format is a versioned base64-of-msgpack envelope reusing the existing encode/decode path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13 TDD tasks covering module scaffold, single/list/nested roundtrips, feature coverage across all asebytes-supported Atoms attributes, encoder fallthrough, decoder version mismatch, encoder subclass chaining, decoder subclass object_hook override, wire format snapshot, full suite check, and README update. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reuses encode/decode through a base64(msgpack) envelope. Public API is two stdlib JSONEncoder/JSONDecoder subclasses; no new top-level functions and no new dependencies.
- Use module-level msgpack/m aliases to match decode.py convention - Add numpy-style docstring on AtomsEncoder.default - Consolidate test imports at top of tests/test_json.py
|
Warning Rate limit exceeded
To continue reviewing without waiting, purchase usage credits in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR introduces JSON serialization for ChangesJSON Serialization for ase.Atoms
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| _packb = msgpack.packb | ||
| _unpackb = msgpack.unpackb | ||
| _m_encode = m.encode | ||
| _m_decode = m.decode |
There was a problem hiding this comment.
why not from msgpack import packb, unpackb etc.
There was a problem hiding this comment.
Switched to direct from msgpack import packb, unpackb (and msgpack_numpy.encode/decode as m_encode/m_decode) in 4eb843f. The previous module-level aliases were copied from decode.py for consistency, but decode.py has many call sites where the alias pays for itself; _json.py has only two, so direct names read cleaner.
| # Single Atoms | ||
| s = json.dumps(atoms, cls=asebytes.AtomsEncoder) | ||
| atoms2 = json.loads(s, cls=asebytes.AtomsDecoder) | ||
|
|
||
| # List of Atoms | ||
| s = json.dumps([a, b, c], cls=asebytes.AtomsEncoder) | ||
| frames = json.loads(s, cls=asebytes.AtomsDecoder) # list[ase.Atoms] |
There was a problem hiding this comment.
just show the list, no need to showcase basic json.dumps functionality working on lists and single items
There was a problem hiding this comment.
Trimmed in 4eb843f — kept only the list-of-Atoms example; dropped the single-Atoms and nested-structure variants.
- Drop module-level _packb/_unpackb/_m_encode/_m_decode aliases in _json.py; each was used once, so direct `from msgpack import packb, unpackb` reads cleaner. The aliases-for-consistency-with-decode.py rationale didn't hold here (decode.py has many call sites; _json has 2). - README JSON section: keep only the list-of-Atoms example; drop the trivial single-Atoms and nested-structure variants since those are basic json.dumps semantics once the encoder handles ase.Atoms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@README.md`:
- Around line 111-112: The example uses undefined variables a, b, c and should
show they are ase.Atoms instances; update the example around AtomsEncoder and
AtomsDecoder to first import or reference ase.Atoms and create example atoms
(e.g., variables named a, b, c as ase.Atoms objects) or add a one-line comment
clarifying that a, b, c are ase.Atoms instances so readers understand the types
used with AtomsEncoder/AtomsDecoder.
In `@src/asebytes/_json.py`:
- Around line 46-47: The code accesses obj["data"] unguarded which can raise
KeyError for a malformed object marked with "__asebytes__"; change the logic in
the deserialization path (the block using packed = base64.b64decode(obj["data"])
and return decode(unpackb(..., object_hook=m_decode))) to validate the "data"
key first (e.g., use obj.get("data") or try/except KeyError) and raise a
ValueError with a clear message if "data" is missing, so the function honors its
documented ValueError-only contract.
In `@tests/test_json.py`:
- Around line 26-28: The zip used in the test loop should be called with
strict=True to satisfy Ruff B905; since tests already assert len(recovered) ==
len(frames) just update the loop in tests/test_json.py that iterates "for
original, decoded in zip(frames, recovered)" to "for original, decoded in
zip(frames, recovered, strict=True)" (preserving the surrounding assertion that
checks lengths).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6470dc0d-4913-4fca-ac8a-bd18c57283d0
📒 Files selected for processing (4)
README.mdsrc/asebytes/__init__.pysrc/asebytes/_json.pytests/test_json.py
| packed = base64.b64decode(obj["data"]) | ||
| return decode(unpackb(packed, object_hook=m_decode)) |
There was a problem hiding this comment.
KeyError on malformed envelope violates the documented exception contract.
The docstring declares only ValueError as a raised exception, but if a dict has the __asebytes__ marker without a "data" key, Line 46 raises an unguarded KeyError. Wrap the access:
🛡️ Proposed fix
- packed = base64.b64decode(obj["data"])
+ try:
+ packed = base64.b64decode(obj["data"])
+ except KeyError:
+ raise ValueError(
+ f"Malformed asebytes envelope: missing 'data' field"
+ ) from None📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| packed = base64.b64decode(obj["data"]) | |
| return decode(unpackb(packed, object_hook=m_decode)) | |
| try: | |
| packed = base64.b64decode(obj["data"]) | |
| except KeyError: | |
| raise ValueError( | |
| f"Malformed asebytes envelope: missing 'data' field" | |
| ) from None | |
| return decode(unpackb(packed, object_hook=m_decode)) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/asebytes/_json.py` around lines 46 - 47, The code accesses obj["data"]
unguarded which can raise KeyError for a malformed object marked with
"__asebytes__"; change the logic in the deserialization path (the block using
packed = base64.b64decode(obj["data"]) and return decode(unpackb(...,
object_hook=m_decode))) to validate the "data" key first (e.g., use
obj.get("data") or try/except KeyError) and raise a ValueError with a clear
message if "data" is missing, so the function honors its documented
ValueError-only contract.
- README JSON example now generates frames via molify.smiles2conformers so the snippet runs as-is instead of referencing undefined a/b/c names. - tests/test_json.py: zip(frames, recovered, strict=True) — Ruff B905. Length is already asserted on the preceding line, so strict is redundant at runtime but satisfies the lint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
asebytes.AtomsEncoderandasebytes.AtomsDecoder—json.JSONEncoder/json.JSONDecodersubclasses that letase.Atomsinstances roundtrip through stdlibjsonvia the standardcls=argument.{"__asebytes__": 1, "data": "<base64>"}. Reuses the existingencode()/decode()path; no new dependencies.default(), decoder subclasses can overrideobject_hook.Usage
Test Plan
tests/test_json.py, all passingDocs
docs/superpowers/specs/2026-05-07-json-encoder-decoder-design.mddocs/superpowers/plans/2026-05-07-json-encoder-decoder.md## JSONsection🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation