Description
When we're talking write support there are really two distinct, albeit closely related features:
- One is creating and writing a new ASDF file from scratch
- The other is opening an existing ASDF file, updating values in it, and saving the changes
The first is of course easier, and is what we will start with. But if designed right, it at least provides a shortcut to a simplified version of the latter: an existing ASDF file can be opened, edited in memory, and then written out to a temp file, which then replaces the original file. This is generally less efficient than true in-place updating. In-place updates are trickier because we have to carefully account for regions that already exist in the file, and move them around. Not the hardest thing in the world, but full of traps.
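
To make the temp-file shortcut concrete, here is a minimal sketch of the write-then-replace step. It assumes a POSIX-like system, where `rename()` replaces an existing destination. The helper name `replace_file_contents` and the `.tmp` suffix are hypothetical and not part of any existing API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write `len` bytes to `path` by way of a sibling
 * temp file, which is renamed over the original only once the write has
 * fully succeeded. On failure the original file is left untouched. */
static int replace_file_contents(const char *path, const void *data, size_t len)
{
    assert(path != NULL);

    char tmp_path[4096];
    int n = snprintf(tmp_path, sizeof(tmp_path), "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof(tmp_path))
        return -1; /* path too long for our buffer */

    FILE *fp = fopen(tmp_path, "wb");
    if (!fp)
        return -1;

    if (fwrite(data, 1, len, fp) != len || fflush(fp) != 0) {
        fclose(fp);
        remove(tmp_path);
        return -1;
    }
    if (fclose(fp) != 0) {
        remove(tmp_path);
        return -1;
    }

    /* Assumption: POSIX semantics, where rename() replaces `path`. */
    if (rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

A real implementation would probably want a unique temp name (e.g. via `mkstemp()`) and an `fsync()` before the rename, but the overall shape would stay the same.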
There are also questions about whether the API should allow a user to flush changes to disk without closing the file, so that further updates and flushes can follow. Probably not an immediate use case, but relevant e.g. for certain streaming applications. Here it might be helpful to look at how the Python library implements it.
Besides the low-level structural issues, the other major part of write support is how the user actually creates and assigns values to the tree.
In the simplest case (different scalar values) we can have asdf_set_*() functions analogous to the asdf_get_*() functions. I suppose having corresponding asdf_value_set_*() is useful too, allowing replacing the value of an existing node in the tree without even knowing necessarily what its original value was. There should also be some asdf_value_*() functions for setting things like the rendering style for the value (block, flow, etc.) but this is lower priority.
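
As a rough illustration of setting a value on an existing node without knowing its previous type, here is a self-contained sketch using a toy tagged union. All `toy_*` names are hypothetical stand-ins; this is not how `asdf_value_t` is actually laid out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef enum { TOY_NULL, TOY_BOOL, TOY_INT64, TOY_DOUBLE, TOY_STRING } toy_type_t;

/* Toy stand-in for a generic tree value (hypothetical layout). */
typedef struct {
    toy_type_t type;
    union { bool b; int64_t i; double d; char *s; } as;
} toy_value_t;

/* Release whatever the value currently holds, whatever its type. */
static void toy_value_clear(toy_value_t *v)
{
    assert(v != NULL);
    if (v->type == TOY_STRING)
        free(v->as.s);
    v->type = TOY_NULL;
}

/* Setters replace the old value regardless of its previous type. */
static void toy_value_set_int64(toy_value_t *v, int64_t x)
{
    toy_value_clear(v);
    v->type = TOY_INT64;
    v->as.i = x;
}

static int toy_value_set_string(toy_value_t *v, const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (!copy)
        return -1; /* old value is left intact on allocation failure */
    memcpy(copy, s, n);
    toy_value_clear(v);
    v->type = TOY_STRING;
    v->as.s = copy;
    return 0;
}
```

The point of the sketch is that the setter owns the cleanup of the old value, so the caller never has to inspect what was there before.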
We also have to make it easy for users to create mappings and sequences, fill them with values, and manipulate them in simple ways. It might be good at this point to introduce explicit types for mappings and sequences distinct from the generic asdf_value_t.
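
One possible shape for explicit mapping and sequence types is to embed the generic value as the first member, so a checked downcast is cheap. The sketch below uses hypothetical `toy_*` names and fixed-size storage purely for illustration; it makes no claim about how the real types would be implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { TOY_SCALAR, TOY_MAPPING, TOY_SEQUENCE } toy_kind_t;

/* Toy generic value: only carries its kind here. */
typedef struct { toy_kind_t kind; } toy_value_t;

#define TOY_MAX_ITEMS 8

typedef struct {
    toy_value_t base; /* must stay first so the downcast below is valid */
    const char *keys[TOY_MAX_ITEMS];
    toy_value_t *values[TOY_MAX_ITEMS];
    size_t len;
} toy_mapping_t;

typedef struct {
    toy_value_t base; /* must stay first */
    toy_value_t *items[TOY_MAX_ITEMS];
    size_t len;
} toy_sequence_t;

static void toy_mapping_init(toy_mapping_t *m)
{
    memset(m, 0, sizeof(*m));
    m->base.kind = TOY_MAPPING;
}

static void toy_sequence_init(toy_sequence_t *s)
{
    memset(s, 0, sizeof(*s));
    s->base.kind = TOY_SEQUENCE;
}

/* Checked downcast: returns NULL when the value is not a mapping. */
static toy_mapping_t *toy_value_as_mapping(toy_value_t *v)
{
    assert(v != NULL);
    return v->kind == TOY_MAPPING ? (toy_mapping_t *)v : NULL;
}

/* Insert a key, or replace the value of an existing key. */
static int toy_mapping_set(toy_mapping_t *m, const char *key, toy_value_t *value)
{
    for (size_t i = 0; i < m->len; i++) {
        if (strcmp(m->keys[i], key) == 0) {
            m->values[i] = value;
            return 0;
        }
    }
    if (m->len == TOY_MAX_ITEMS)
        return -1;
    m->keys[m->len] = key;
    m->values[m->len] = value;
    m->len++;
    return 0;
}

static int toy_sequence_append(toy_sequence_t *s, toy_value_t *item)
{
    if (s->len == TOY_MAX_ITEMS)
        return -1;
    s->items[s->len++] = item;
    return 0;
}
```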
This also means we finally need to extend the extension interface with serialize methods, and implement those for all the existing extension types.
Users also have to be able to create new extension type values. In most cases this is easy enough--I designed most of the current extension types to just be structs the user can create and manipulate at will. Some cases are more complicated though, most notably ndarrays. So there will have to be easy-enough ways to associate existing array data with an ndarray. This includes lower-level interfaces for simply adding new binary blocks to a file, ways to associate a block with an ndarray, and maybe higher-level conveniences to automatically create new blocks for arrays that don't already have an associated block.
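
As a sketch of what a serialize hook might look like, the toy extension record below carries a tag plus a serialize function pointer, with a stub that reports "not implemented" for types that have no serializer yet. The names, the error codes, and the choice to serialize into a text buffer are all assumptions made for illustration. The real interface would more likely return a generic value than text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { TOY_OK = 0, TOY_ERR_NOT_IMPLEMENTED = -1, TOY_ERR_OVERFLOW = -2 };

/* Toy extension type: a plain struct the user can fill in directly. */
typedef struct {
    const char *name;
    const char *version;
} toy_software_t;

typedef int (*toy_serialize_fn)(const void *obj, char *buf, size_t buflen);

/* Toy extension record (hypothetical): a tag plus its serializer. */
typedef struct {
    const char *tag;
    toy_serialize_fn serialize;
} toy_extension_t;

/* Serialize as a YAML flow mapping into the caller's buffer. */
static int toy_software_serialize(const void *obj, char *buf, size_t buflen)
{
    assert(obj != NULL && buf != NULL);
    const toy_software_t *sw = obj;
    int n = snprintf(buf, buflen, "{name: %s, version: %s}",
                     sw->name, sw->version);
    return (n < 0 || (size_t)n >= buflen) ? TOY_ERR_OVERFLOW : TOY_OK;
}

/* Placeholder for types whose serializer is not written yet. */
static int toy_serialize_stub(const void *obj, char *buf, size_t buflen)
{
    (void)obj;
    (void)buf;
    (void)buflen;
    return TOY_ERR_NOT_IMPLEMENTED;
}

/* Tag strings are illustrative. */
static const toy_extension_t toy_software_ext = {
    "core/software-1.0.0", toy_software_serialize
};
static const toy_extension_t toy_ndarray_ext = {
    "core/ndarray-1.0.0", toy_serialize_stub
};
```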
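
To illustrate the two levels mentioned here, the sketch below has a low-level call that appends a block to a file's block list, and a higher-level convenience that creates a block for an array only if it has none yet. All `toy_*` names and the fixed-size block table are hypothetical. The only detail borrowed from ASDF itself is that an ndarray refers to its block by an integer `source` index.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_BLOCKS 8

/* A block just borrows the caller's buffer in this sketch. */
typedef struct {
    const void *data;
    size_t size;
} toy_block_t;

typedef struct {
    toy_block_t blocks[TOY_MAX_BLOCKS];
    size_t n_blocks;
} toy_file_t;

typedef struct {
    int source;        /* block index, or -1 if no block is associated */
    const void *data;  /* existing array data owned by the user */
    size_t nbytes;
} toy_ndarray_t;

/* Low-level: append a block and return its index, or -1 if full. */
static int toy_file_add_block(toy_file_t *f, const void *data, size_t size)
{
    assert(f != NULL);
    if (f->n_blocks == TOY_MAX_BLOCKS)
        return -1;
    f->blocks[f->n_blocks].data = data;
    f->blocks[f->n_blocks].size = size;
    return (int)f->n_blocks++;
}

/* Higher-level: make sure the array has a block, creating one on demand. */
static int toy_ndarray_ensure_block(toy_file_t *f, toy_ndarray_t *arr)
{
    assert(arr != NULL);
    if (arr->source >= 0)
        return 0; /* already associated with a block */
    int idx = toy_file_add_block(f, arr->data, arr->nbytes);
    if (idx < 0)
        return -1;
    arr->source = idx;
    return 0;
}
```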
All of this has to be considered.
I think the best order to work in is something like:
- Basic API updates to support creating a new file for writing, and writing the bare minimum valid ASDF file
- Low-level block API for adding blocks to the file and writing the blocks, as well as the block index
- `asdf_set_*()` functions just for basic scalar types; basic tests for setting values in the tree and writing out
- Functions to create a new `asdf_value_t` from scratch--a generic `asdf_value_t` is somehow probably what the extension serialization functions should return. Instead of `asdf_value_as_<T>()` it would probably be useful to have some `asdf_value_of_<T>()` functions.
- Add `asdf_mapping_t` and `asdf_sequence_t` types, refactor existing code that works on mappings and sequences (this may require some small updates to my SX++ fork as well). This and the previous bullet point I think are prerequisites to adding serialization support for extension types, because extension authors will typically want to do something like create a mapping, set keys, etc.
- Extend the extension interface with serialization support. Don't implement serialization yet for every type as that will be time-consuming, especially while the interface is still being settled. Do implement it for a few basic cases (e.g. everything that goes into a `!core/asdf`). Then we would have support for writing a minimal ASDF file that also follows the core schema. For other cases just stub it out but have it return an error.
  - By default newly written files should always use the core schema; i.e. setting the root node of the tree to a minimal `!core/asdf`. Users should have the option to replace the root node with whatever they want (the ASDF Standard doesn't strictly require using the core schema IIRC), but this would rarely be useful.
- Finish implementing serialization for existing extension types
- Round-trip testing
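
For the first step in the list, a bare-minimum file with a `!core/asdf` root could look roughly like what the sketch below writes. The standard and schema version numbers, and the empty `{}` tree body, are assumptions chosen for illustration and should be checked against the ASDF Standard. The function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed minimal file: ASDF header comments, YAML directives, and an
 * empty tree tagged as core/asdf. Version numbers are illustrative. */
static const char TOY_MINIMAL_ASDF[] =
    "#ASDF 1.0.0\n"
    "#ASDF_STANDARD 1.5.0\n"
    "%YAML 1.1\n"
    "%TAG ! tag:stsci.edu:asdf/\n"
    "--- !core/asdf-1.1.0\n"
    "{}\n"
    "...\n";

/* Hypothetical helper: write the minimal file to an open stream. */
static int toy_write_minimal_asdf(FILE *fp)
{
    assert(fp != NULL);
    size_t len = strlen(TOY_MINIMAL_ASDF);
    if (fwrite(TOY_MINIMAL_ASDF, 1, len, fp) != len)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```

A file like this has no binary blocks and no block index, so it only exercises the header and tree parts of the writer.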