feat: AtomsEncoder/AtomsDecoder for JSON serialization #27

Merged
PythonFZ merged 18 commits into main from feat/json-encoder-decoder
May 7, 2026

@PythonFZ PythonFZ commented May 7, 2026

Summary

  • Add asebytes.AtomsEncoder and asebytes.AtomsDecoder: json.JSONEncoder / json.JSONDecoder subclasses that let ase.Atoms instances round-trip through stdlib json via the standard cls= argument.
  • Wire format is a versioned base64-of-msgpack envelope: {"__asebytes__": 1, "data": "<base64>"}. Reuses the existing encode() / decode() path; no new dependencies.
  • Works for single Atoms, lists of Atoms, and Atoms nested anywhere inside arbitrary JSON structures. Encoder subclasses can extend default(), decoder subclasses can override object_hook.

Usage

import json
import asebytes

s = json.dumps(atoms, cls=asebytes.AtomsEncoder)
atoms2 = json.loads(s, cls=asebytes.AtomsDecoder)

s = json.dumps([a, b, c], cls=asebytes.AtomsEncoder)
frames = json.loads(s, cls=asebytes.AtomsDecoder)  # list[ase.Atoms]

Test Plan

  • 19 new tests in tests/test_json.py, all passing
  • Full project suite: 2212 passed, 8 skipped (network backends), no regressions
  • Coverage: single Atoms, list, nested, empty list, all asebytes-supported features (info/calc/constraints/pbc), encoder TypeError fallthrough, decoder ValueError on unknown version, decoder passthrough on regular dicts/scalars, encoder subclass chaining, decoder subclass override, wire-format snapshot
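
The subclass-chaining and fallthrough cases listed above follow a general stdlib pattern, sketched here under stated assumptions. `BaseEncoder` and `BaseDecoder` are hypothetical stand-ins for `AtomsEncoder` and `AtomsDecoder` (handling `complex` instead of `ase.Atoms`), and the way the decoder subclass swaps its hook is one possible approach, not necessarily how asebytes does it.

```python
import json
from decimal import Decimal


class BaseEncoder(json.JSONEncoder):
    """Hypothetical stand-in for AtomsEncoder: handles one extra type."""

    def default(self, o):
        if isinstance(o, complex):
            return {"__complex__": [o.real, o.imag]}
        return super().default(o)  # TypeError for anything unsupported


class ExtendedEncoder(BaseEncoder):
    """Adds Decimal support, then defers to the parent for everything else."""

    def default(self, o):
        if isinstance(o, Decimal):
            return {"__decimal__": str(o)}
        return super().default(o)


def base_hook(obj):
    if "__complex__" in obj:
        real, imag = obj["__complex__"]
        return complex(real, imag)
    return obj


class BaseDecoder(json.JSONDecoder):
    def __init__(self, **kwargs):
        kwargs.setdefault("object_hook", base_hook)
        super().__init__(**kwargs)


def extended_hook(obj):
    if "__decimal__" in obj:
        return Decimal(obj["__decimal__"])
    return base_hook(obj)  # chain to the parent's behaviour


class ExtendedDecoder(BaseDecoder):
    def __init__(self, **kwargs):
        kwargs.setdefault("object_hook", extended_hook)
        super().__init__(**kwargs)


text = json.dumps({"z": 1 + 2j, "price": Decimal("9.99")}, cls=ExtendedEncoder)
data = json.loads(text, cls=ExtendedDecoder)

# Unsupported objects fall through every default() and end in TypeError.
try:
    json.dumps({"bad": object()}, cls=ExtendedEncoder)
except TypeError as exc:
    fallthrough_error = exc
```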

Docs

  • Spec: docs/superpowers/specs/2026-05-07-json-encoder-decoder-design.md
  • Plan: docs/superpowers/plans/2026-05-07-json-encoder-decoder.md
  • README updated with a new ## JSON section

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added JSON serialization support for Atoms objects using standard library JSON utilities.
    • Atoms objects can now be serialized to JSON and deserialized back with a compact binary envelope format.
    • Serialization is subclassable for extensibility with custom types.
  • Documentation

    • Updated README with JSON interoperability section including usage examples.

PythonFZ and others added 15 commits May 7, 2026 11:36
Spec for adding asebytes.AtomsEncoder and asebytes.AtomsDecoder so
ase.Atoms can be serialized via stdlib json with `cls=`. Wire format
is a versioned base64-of-msgpack envelope reusing the existing
encode/decode path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13 TDD tasks covering module scaffold, single/list/nested roundtrips,
feature coverage across all asebytes-supported Atoms attributes,
encoder fallthrough, decoder version mismatch, encoder subclass
chaining, decoder subclass object_hook override, wire format snapshot,
full suite check, and README update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reuses encode/decode through a base64(msgpack) envelope. Public API
is two stdlib JSONEncoder/JSONDecoder subclasses; no new top-level
functions and no new dependencies.
- Use module-level msgpack/m aliases to match decode.py convention
- Add numpy-style docstring on AtomsEncoder.default
- Consolidate test imports at top of tests/test_json.py

coderabbitai Bot commented May 7, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 87328c3f-130f-46f5-941f-450f9da660bc

📥 Commits

Reviewing files that changed from the base of the PR and between 4eb843f and ae804dd.

📒 Files selected for processing (2)
  • README.md
  • tests/test_json.py
Walkthrough

This PR introduces JSON serialization for ase.Atoms objects via new AtomsEncoder and AtomsDecoder classes that wrap atoms in a versioned base64-encoded msgpack envelope. It includes a new _json.py module with encoding/decoding logic, public API exports, comprehensive test coverage, and README documentation.

Changes

JSON Serialization for ase.Atoms

| Layer / File(s) | Summary |
| --- | --- |
| Envelope Format & Constants (src/asebytes/_json.py) | Module introduces _ENVELOPE_KEY = "__asebytes__" and _ENVELOPE_VERSION = 1 to mark and version serialized atoms payloads. |
| Core Encoder/Decoder Implementation (src/asebytes/_json.py) | Implements _atoms_object_hook to detect and reconstruct atoms from envelope dicts, AtomsEncoder to serialize atoms to base64-encoded msgpack, and AtomsDecoder to inject the hook for automatic deserialization. |
| Public API Exports (src/asebytes/__init__.py) | Exports AtomsEncoder and AtomsDecoder from ._json and adds them to __all__. |
| Tests & Documentation (tests/test_json.py, README.md) | Adds comprehensive test suite for round-trip serialization, error handling, extensibility, and wire-format validation; documents usage in README with examples. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A message wrapped in packets neat,
Base64-encoded, JSON complete—
Our atoms hop from bytes to wire,
Through envelope-marked desire,
JSON serialization takes us higher! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title accurately and specifically describes the main change: adding AtomsEncoder and AtomsDecoder classes for JSON serialization of ase.Atoms objects. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Comment thread src/asebytes/_json.py Outdated
Comment on lines +14 to +17
_packb = msgpack.packb
_unpackb = msgpack.unpackb
_m_encode = m.encode
_m_decode = m.decode
Member Author

Why not use from msgpack import packb, unpackb, etc.?

Member Author

Switched to direct from msgpack import packb, unpackb (and msgpack_numpy.encode/decode as m_encode/m_decode) in 4eb843f. The previous module-level aliases were copied from decode.py for consistency, but decode.py has many call sites where the alias pays for itself; _json.py has only two, so direct names read cleaner.

Comment thread README.md Outdated
Comment on lines +111 to +117
# Single Atoms
s = json.dumps(atoms, cls=asebytes.AtomsEncoder)
atoms2 = json.loads(s, cls=asebytes.AtomsDecoder)

# List of Atoms
s = json.dumps([a, b, c], cls=asebytes.AtomsEncoder)
frames = json.loads(s, cls=asebytes.AtomsDecoder) # list[ase.Atoms]
Member Author

just show the list, no need to showcase basic json.dumps functionality working on lists and single items

Member Author

Trimmed in 4eb843f — kept only the list-of-Atoms example; dropped the single-Atoms and nested-structure variants.

- Drop module-level _packb/_unpackb/_m_encode/_m_decode aliases in
  _json.py; each was used once, so direct `from msgpack import packb,
  unpackb` reads cleaner. The aliases-for-consistency-with-decode.py
  rationale didn't hold here (decode.py has many call sites; _json
  has 2).
- README JSON section: keep only the list-of-Atoms example; drop the
  trivial single-Atoms and nested-structure variants since those are
  basic json.dumps semantics once the encoder handles ase.Atoms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Around line 111-112: The example uses undefined variables a, b, c and should
show they are ase.Atoms instances; update the example around AtomsEncoder and
AtomsDecoder to first import or reference ase.Atoms and create example atoms
(e.g., variables named a, b, c as ase.Atoms objects) or add a one-line comment
clarifying that a, b, c are ase.Atoms instances so readers understand the types
used with AtomsEncoder/AtomsDecoder.

In `@src/asebytes/_json.py`:
- Around line 46-47: The code accesses obj["data"] unguarded which can raise
KeyError for a malformed object marked with "__asebytes__"; change the logic in
the deserialization path (the block using packed = base64.b64decode(obj["data"])
and return decode(unpackb(..., object_hook=m_decode))) to validate the "data"
key first (e.g., use obj.get("data") or try/except KeyError) and raise a
ValueError with a clear message if "data" is missing, so the function honors its
documented ValueError-only contract.

In `@tests/test_json.py`:
- Around line 26-28: The zip used in the test loop should be called with
strict=True to satisfy Ruff B905; since tests already assert len(recovered) ==
len(frames) just update the loop in tests/test_json.py that iterates "for
original, decoded in zip(frames, recovered)" to "for original, decoded in
zip(frames, recovered, strict=True)" (preserving the surrounding assertion that
checks lengths).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6470dc0d-4913-4fca-ac8a-bd18c57283d0

📥 Commits

Reviewing files that changed from the base of the PR and between 79c1b63 and 4eb843f.

📒 Files selected for processing (4)
  • README.md
  • src/asebytes/__init__.py
  • src/asebytes/_json.py
  • tests/test_json.py

Comment thread README.md Outdated
Comment thread src/asebytes/_json.py
Comment on lines +46 to +47
packed = base64.b64decode(obj["data"])
return decode(unpackb(packed, object_hook=m_decode))

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

KeyError on malformed envelope violates the documented exception contract.

The docstring declares only ValueError as a raised exception, but if a dict has the __asebytes__ marker without a "data" key, Line 46 raises an unguarded KeyError. Wrap the access:

🛡️ Proposed fix
-    packed = base64.b64decode(obj["data"])
+    try:
+        packed = base64.b64decode(obj["data"])
+    except KeyError:
+        raise ValueError(
+            "Malformed asebytes envelope: missing 'data' field"
+        ) from None
     return decode(unpackb(packed, object_hook=m_decode))
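
A ValueError-only contract for envelope validation might look like the following sketch. It is an assumption-laden illustration rather than the merged code: `envelope_payload` is a hypothetical helper that stops at the raw payload bytes (the msgpack step is omitted), and the extra type and base64 checks go beyond what the review asked for.

```python
import base64

ENVELOPE_KEY = "__asebytes__"
SUPPORTED_VERSION = 1


def envelope_payload(obj: dict):
    """Return the raw payload bytes of an envelope dict, or None for a regular dict.

    Hypothetical helper: every malformed-envelope case surfaces as ValueError.
    """
    if ENVELOPE_KEY not in obj:
        return None  # not an envelope; the caller should pass the dict through
    version = obj[ENVELOPE_KEY]
    if version != SUPPORTED_VERSION:
        raise ValueError(f"Unsupported envelope version: {version!r}")
    data = obj.get("data")
    if not isinstance(data, str):
        raise ValueError("Malformed envelope: missing or non-string 'data' field")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(data, validate=True)
    except ValueError as exc:  # binascii.Error is a ValueError subclass
        raise ValueError("Malformed envelope: 'data' is not valid base64") from exc


good = {ENVELOPE_KEY: 1, "data": base64.b64encode(b"payload").decode("ascii")}
payload = envelope_payload(good)
```

Using `obj.get("data")` folds the missing-key case into the same branch as a wrong-typed value, so no `KeyError` can escape.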

Comment thread tests/test_json.py
- README JSON example now generates frames via
  molify.smiles2conformers so the snippet runs as-is instead of
  referencing undefined a/b/c names.
- tests/test_json.py: zip(frames, recovered, strict=True) — Ruff
  B905. Length is already asserted on the preceding line, so strict
  is redundant at runtime but satisfies the lint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@PythonFZ PythonFZ merged commit e9f4318 into main May 7, 2026
4 of 5 checks passed
@PythonFZ PythonFZ deleted the feat/json-encoder-decoder branch May 7, 2026 12:02