Proposal: An Ordered list w/ unique keys can be used in place of a unordered dictionary by DavidSagan · Pull Request #203 · pals-project/pals

DavidSagan · 2026-03-09T21:12:49Z

Clarified that an ordered list with unique keys can be used in place of an unordered dictionary.

Rationale: A PALS parser may not always be able to preserve insertion order when it reads in a dict. And it is sometimes desirable to preserve the order. In particular, the C++ parser being developed does not have this capability since it uses an external parsing library that does not have this feature. (*)

Correction added by @ax3l: that premise (*) is wrong. The currently picked library (and many other C++ libs for YAML, JSON, etc.) do in fact support preservation of insertion order. This is a bug in pals-cpp (and here is the fix); it is not a library issue and not a fundamental limitation.

ax3l · 2026-03-31T19:38:29Z

+with the restriction that no duplicate keys can be present in the list. 
+For example, the above dictionary written as a list would look like:
+```{code} yaml
+this_dictionary_expressed_as_list:


Are you referring to named_dictionary above?
That is, are you saying that

  - named_dictionary:
      key3: value3
      key4: value4

is the same as

  - named_dictionary:
      - key3: value3
      - key4: value4

Sorry, but I fear this is not a good idea, it adds ambiguity in parsing and complicates everything for no obvious gain.

Personally I don't care. You and @EZoni can come to some agreement on what is best.

Key differences:

| | Example 1 | Example 2 |
|---|---|---|
| Structure of named_dictionary | A single mapping/dict | A list of mappings/dicts |
| Access value3 (Python) | `data[0]["named_dictionary"]["key3"]` | `data[0]["named_dictionary"][0]["key3"]` |
| Can have duplicate keys? | No (keys must be unique in a dict) | Yes (each dict is separate) |
| Order guaranteed? | Depends on implementation | Yes (it's a list) |

The choice between them depends on your use case. If the keys are unique and represent a fixed structure, a single dictionary (Example 1) is simpler. If you need an ordered collection or might have duplicate keys, a list of dictionaries (Example 2) is more appropriate.
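The access patterns compared above can be checked with a minimal Python sketch. The literals stand in for what a YAML loader would return for each form; the names `example_1` and `example_2` are illustrative and do not come from the PR.

```python
# Parsed form of Example 1: named_dictionary is a single mapping.
example_1 = [{"named_dictionary": {"key3": "value3", "key4": "value4"}}]

# Parsed form of Example 2: named_dictionary is a list of single-key mappings.
example_2 = [{"named_dictionary": [{"key3": "value3"}, {"key4": "value4"}]}]

# Same logical value, reached through different access paths.
value_from_dict = example_1[0]["named_dictionary"]["key3"]
value_from_list = example_2[0]["named_dictionary"][0]["key3"]

# The container types differ, so generic code cannot treat them alike.
kind_1 = type(example_1[0]["named_dictionary"]).__name__
kind_2 = type(example_2[0]["named_dictionary"]).__name__

print(value_from_dict, value_from_list, kind_1, kind_2)
```

Both lookups reach the same value, but only after the caller already knows which of the two shapes it is holding.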

Personally I don't care. You and @EZoni can come to some agreement on what is best.

Sounds good.
I would simply not relax interpretation and keep it clear. I.e., not adding the case in this PR (not merging it).

Major problem

This is not a great idea as it complicates parsing and introduces ambiguities.
We will regret it later in the parsers and have to dynamically branch in the schema application all the time.

Minor Problem

It is also not how YAML is normally interpreted (the text is added to the YAML 101 section).
But I oppose this for the more fundamental reason above, not for where it is written.

ax3l

Review summary for status on the PR: I oppose this, this is not a good idea.

Details described here: #203 (comment)

DavidSagan · 2026-03-31T19:44:52Z

I oppose this, this is not a good idea.

@ax3l You need to articulate your objection. In fact you advocated the parser being able to preserve dictionary order if possible and this just does the same thing. And this works even in the case where the parser is not able to preserve order.

ax3l · 2026-03-31T19:54:04Z

This is just the summary/status, let me link the comment for clarity.

DavidSagan · 2026-03-31T19:58:24Z

This is just the summary/status, let me link the comment for clarity.

I don't see this as an objection. What exactly are you worried about?

ax3l · 2026-03-31T20:00:50Z

Both things are not equivalent in YAML; if we choose to relax them to be equivalent for PALS, this will cause problems writing and parsing schemas.

Lastly, in my view there is no strong motivation for why this complication is needed at all. Please lead with an example where this actually occurs:

A PALS parser may not always be able to preserve insertion order when it reads in a dict. [...] . In particular, the C++ parser being developed does not have this capability since it uses an external parsing library that does not have this feature.

C++ has multiple ways to express insertion-order-preserving dicts, and the YAML library you picked actually supports it. There is a bug in your implementation.

ax3l · 2026-03-31T20:07:07Z

Here you go:

Detailed bug report: pals-project/pals-cpp#18
Bug fix: pals-project/pals-cpp#19

DavidSagan · 2026-03-31T20:09:19Z

Both things are not equivalent in YAML; if we choose to relax them to be equivalent for PALS, this will cause problems writing and parsing schemas.

Lastly, in my view there is no strong motivation for why this complication is needed at all. Please lead with an example where this actually occurs:

A PALS parser may not always be able to preserve insertion order when it reads in a dict.

I don't see how this causes problems for writing and parsing. If you think there is a problem, please come up with an example. As for why this is a desirable feature, I quote what you wrote:

   Code Developer note:
   PALS dictionaries should, when possible, implement a dictionary that preserves insertion order.

   While not strictly necessary, this helps with human readability:
   For example, having the [`kind`](#c:element.parameters) key of an element as the first attribute enhances legibility.

ax3l · 2026-03-31T20:11:16Z

Fundamental

No, please fix pals-project/pals-cpp#18
It is possible in C++ to implement a dictionary that preserves insertion order.

As the reference implementation in C++, pals-cpp must not cut corners as fundamental as this.

Banter

In fact you advocated the parser being able to preserve dictionary order if possible and this just does the same thing. And this works even in the case where the parser is not able to preserve order.

My whole motivation behind the when possible was to simply not order the read-back dict if we encounter a strong use case that we forgot, not to complicate parsing for everyone because you picked a bad dependency, as you suggest here.

DavidSagan · 2026-03-31T21:06:56Z

No, please fix pals-project/pals-cpp#18 It is possible in C++ to implement a dictionary that preserves insertion order.

OK will fix but this is independent of this PR.

ax3l · 2026-03-31T23:47:30Z

Then the PR has no motivating real-world use case left, and we can close it.

I propose to drop "when possible" from PALS if that makes you feel better. I was just open to not sorting things if it makes it easier for people, but if this is twisted into motivating a complication of the schema, as this PR tries, then I would rather drop it completely.

DavidSagan · 2026-04-01T02:48:12Z

Then the PR has no motivating real-world use case left, and we can close it.

I propose to drop "when possible" from PALS if that makes you feel better. I was just open to not sorting things if it makes it easier for people, but if this is twisted into motivating a complication of the schema, as this PR tries, then I would rather drop it completely.

I would agree with you if this represented a significant complication. But it does not. To have a parser handle this is literally a few lines of code. And once implemented there is no further maintenance needed.

ax3l · 2026-04-01T03:25:50Z

That is not correct: this text would add a conditional branch that checks, for every dict, whether it is a list instead. That is extremely random and confusing and will cause a lot of headaches if enacted. Every dict has to be checked to see if it is a list. Files will have all kinds of formatting. Typing will be a mess. This will cause confusion. You cannot write a clear static schema.

Or just run it through an LLM of your choice and ask it if it is a great idea for a portable standard that shall be easy to implement and verify.

Again, there is no real world need for this. Why do we even discuss this?

DavidSagan · 2026-04-01T03:39:35Z

That is not correct: this text would add a conditional branch that checks, for every dict, whether it is a list instead. That is extremely random and confusing and will cause a lot of headaches if enacted. Every dict has to be checked to see if it is a list. Files will have all kinds of formatting. Typing will be a mess. This will cause confusion. You cannot write a clear static schema.

Or just run it through an LLM of your choice and ask it if it is a great idea for a portable standard that shall be easy to implement and verify.

Again, there is no real world need for this. Why do we even discuss this?

When programmed correctly, the conditional branch is handled by the same low level routine that is written once and is transparent to the higher level code. Again, this can be done with a few lines of code.

And I can see a real world case for this for some extensions where order matters in a construct that PALS specifies as unordered.
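One way to read the "written once" low-level routine described above is a single normalization helper that every dictionary read passes through. The following is a minimal sketch under that assumption; the name `as_ordered_dict` is hypothetical and is not taken from any PALS parser.

```python
from typing import Any, Dict, List, Union

Mapping = Dict[str, Any]


def as_ordered_dict(node: Union[Mapping, List[Mapping]]) -> Mapping:
    """Return node as a dict, accepting a dict or a list of single-key dicts."""
    if isinstance(node, dict):
        return node
    if not isinstance(node, list):
        raise TypeError(f"expected a dict or a list, got {type(node).__name__}")
    result: Mapping = {}
    for item in node:
        # Each list entry must be exactly one key/value pair.
        if not isinstance(item, dict) or len(item) != 1:
            raise ValueError("each list entry must be a single-key mapping")
        key, value = next(iter(item.items()))
        # The list form cannot rule out repeats, so check explicitly.
        if key in result:
            raise ValueError(f"duplicate key {key!r} in list form")
        result[key] = value
    return result


list_form = [{"kind": "Drift"}, {"length": 0.25}]
dict_form = {"kind": "Drift", "length": 0.25}
same = as_ordered_dict(list_form) == as_ordered_dict(dict_form)
```

The sketch covers the parsing side only; it says nothing about diffing, typing, or declarative schema validation.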

ax3l · 2026-04-01T03:47:06Z

this can be done with a few lines of code.

No. And you ignore what I wrote on schemas and the other points I made why this is a bad idea.

I can see a real world case for this for some extensions where order matters in a construct that PALS specifies as unordered.

No sorry, this is a super hypothetical need. I do strongly recommend against this. My reasons are above.

DavidSagan · 2026-04-01T04:02:32Z

I can see a real world case for this for some extensions where order matters in a construct that PALS specifies as unordered.

No sorry, this is a super hypothetical need. I do strongly recommend against this. My reasons are above.

You have not justified your assertions. I have told you how a parser can handle this very simply. If you want more detail I can supply it. And there are other ways of simply handling this by converting from list to ordered dict on input.

If you think that this complicates things outside of any parsing, please be more specific. An example would help.

ax3l · 2026-04-01T04:11:06Z

Let's start with the motivation for this PR: why would the PALS standard specify that "something must be unordered"? Please justify and motivate the actual need first.

in a construct that PALS specifies as unordered.

Where is this needed?
What would break or be terrible if the unnamed, future PALS-"unordered thing" was unnecessarily ordered in practice by a reader/writer?

There is no example here that one can follow.

ax3l · 2026-04-01T04:15:32Z

Here is a Claude summary of your change:

The proposal trades one problem (potential ordering ambiguity in mappings) for a worse problem (representational ambiguity across the entire schema). A standard schema should have exactly one canonical way to represent a given piece of data. If ordering matters, it should be encoded explicitly rather than relying on structural conventions that every consumer must independently understand and implement.

Absolutely, I will give you now 4 problems it causes + examples for your fully motivation-example-free proposal.

ax3l · 2026-04-01T04:18:09Z

1. It introduces representational ambiguity.

The same logical data now has two valid serializations. Every tool, validator, and parser that consumes these lattice files must handle both representations identically. This doubles the surface area for bugs and increases the cognitive load on anyone reading or writing these files. A contributor looking at two lattice files might see what appears to be structurally different data that is actually semantically identical.

Example

Two files describe the same beamline (ignore details), but look structurally different:

File A (dictionary form):

  facility:
    drift1:
        kind: Drift
        length: 0.25
  
    quad1:
        kind: Quadrupole
        MagneticMultipoleP:
          Bn1: 1.0
        length: 1.0

File B (list form):

  facility:
    - drift1:
        kind: Drift
        length: 0.25
  
    - quad1:
        kind: Quadrupole
        MagneticMultipoleP:
          Bn1: 1.0
        length: 1.0

Are these the same? A human reading them might not be sure. A diff tool will flag them as completely different. A code review becomes harder because a contributor could switch forms arbitrarily, and git diff will show a large structural change that is semantically a no-op.
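The diff-tool point can be reproduced with Python's standard `difflib`. This is a small sketch using a trimmed-down version of the two files (only `drift1`); the strings `file_a` and `file_b` are illustrative stand-ins, not files from the PALS repository.

```python
import difflib

# Dictionary form of a one-element lattice.
file_a = """\
facility:
  drift1:
    kind: Drift
    length: 0.25
"""

# List form of the same lattice.
file_b = """\
facility:
  - drift1:
      kind: Drift
      length: 0.25
"""

diff = list(difflib.unified_diff(
    file_a.splitlines(), file_b.splitlines(),
    fromfile="file_a.yaml", tofile="file_b.yaml", lineterm="",
))

# Count added/removed lines, skipping the '---' and '+++' file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff))
print(len(changed), "changed lines")
```

Every line below `facility:` is reported as changed, because the list form alters both the element line and the indentation of everything nested under it.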

ax3l · 2026-04-01T04:19:43Z

2. It conflates two distinct data models.

A list of key-value pairs and a dictionary are fundamentally different structures. By declaring them interchangeable (under constraints), you're essentially asking every consumer to implement a normalization step — "if you see a list of single-key mappings with no duplicate keys, treat it as a dict." This is implicit schema logic that lives outside the schema itself.

Every consumer of the schema must now implement normalization logic. Consider a tool that looks up an element by name:

# With a dict, this is trivial:
quad_params = lattice["facility"]["quad1"]

# With the list-of-dicts form, you need:
quad_params = None
for entry in lattice["facility"]:
    if "quad1" in entry:
        quad_params = entry["quad1"]
        break

Now imagine a validation library, a conversion tool to MAD-X format, a visualization tool, and a simulation runner. Each of these independently must include branching logic:

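class SchemaError(Exception):
    """Raised when the elements node is neither a dict nor a list."""

def get_elements(lattice):
    elems = lattice["facility"]
    if isinstance(elems, list):
        # convert list-of-single-key-dicts to ordered dict
        # (on a repeated key, update() silently keeps the last entry)
        result = {}
        for item in elems:
            result.update(item)
        return result
    elif isinstance(elems, dict):
        return elems
    else:
        raise SchemaError("Invalid elements format")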

Every tool in the ecosystem reimplements this, slightly differently, and each is a potential source of bugs.

ax3l · 2026-04-01T04:21:53Z

3. The duplicate-key restriction is fragile.

The constraint "no duplicate keys in the list" must be enforced at validation time, not by the data format itself. YAML won't stop you from writing duplicate keys in a list. So now you need a custom validator for something that dictionaries give you for free (or at least by convention: YAML's handling of duplicate keys in mappings is itself underspecified, but most parsers will warn or take the last value).

Nothing in YAML prevents this:

  facility:
    - drift1:
        kind: Drift
        length: 0.25
  
    - quad1:
        kind: Quadrupole
        MagneticMultipoleP:
          Bn1: 1.0
        length: 1.0

    - drift1:
        kind: Drift
        length: 0.5

This is perfectly valid YAML. The list happily contains two entries keyed drift1. A dictionary would have silently overwritten the first or raised a warning, but the list form gives no such signal. (This is only an example; the same issue arises at every level of the PALS hierarchy, even if duplicate names were allowed in this specific list.)

The schema now requires a custom validator to catch this: (your "this is just a few lines, don't worry, Axel, I got this")

class ValidationError(Exception):
    """Raised when a key appears more than once in the list form."""

def validate_no_duplicate_keys(elements_list):
    keys_seen = set()
    for item in elements_list:
        for key in item:
            if key in keys_seen:
                raise ValidationError(f"Duplicate key '{key}' in elements list")
            keys_seen.add(key)

If any one tool in the ecosystem forgets this check, it will silently process the file with unpredictable behavior: maybe using the first drift1, maybe the last, maybe both.
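The "first or last" divergence can be illustrated with two plausible lookup strategies applied to the duplicated `drift1` entry from the YAML above. This is a sketch; `lookup_first` and `lookup_merged` are hypothetical helpers standing in for two independently written tools.

```python
# Parsed list form containing drift1 twice, as in the YAML above
# (quad1 parameters trimmed for brevity).
elements = [
    {"drift1": {"kind": "Drift", "length": 0.25}},
    {"quad1": {"kind": "Quadrupole", "length": 1.0}},
    {"drift1": {"kind": "Drift", "length": 0.5}},
]


def lookup_first(entries, name):
    """Linear search: returns the first entry with the given key."""
    for entry in entries:
        if name in entry:
            return entry[name]
    return None


def lookup_merged(entries, name):
    """Merge into one dict first: later entries overwrite earlier ones."""
    merged = {}
    for entry in entries:
        merged.update(entry)
    return merged.get(name)


first = lookup_first(elements, "drift1")["length"]
last = lookup_merged(elements, "drift1")["length"]
print(first, last)  # 0.25 0.5
```

Neither helper raises an error, and each returns a different drift length for the same file.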

ax3l · 2026-04-01T04:24:11Z

4. It solves a problem that's largely been solved.

Python 3.7+ guarantees dict insertion order. Most modern YAML libraries preserve mapping order (you can pick a proper one for a PALS reference implementation). C++ supports insertion-order stable maps. If the concern is interoperability with languages or tools that don't preserve order, the cleaner solution is to specify that compliant parsers must use order-preserving mappings, rather than introducing an alternative representation.

Example: pals-project/pals-cpp#18
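The Python part of this point can be checked with the standard library alone. The sketch below uses JSON rather than YAML to avoid a third-party dependency; that substitution is an assumption made for illustration, and the element shown is made up.

```python
import json

# Keys deliberately not in alphabetical order, with 'kind' first.
text = '{"kind": "Quadrupole", "length": 1.0, "MagneticMultipoleP": {"Bn1": 1.0}}'

# json.loads builds a plain dict, which keeps insertion order in Python 3.7+.
element = json.loads(text)
keys = list(element)

# Writing the dict back out keeps the same key order.
round_trip = json.dumps(element)

print(keys)
print(round_trip == text)
```

With an order-preserving mapping, `kind` stays the first key through a read and write cycle without any list wrapper.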

DavidSagan · 2026-04-01T04:38:53Z

First of all, I do not accept Claude as an authority on any of this. Especially since answers may be manipulated depending upon the questions. There is a much better way to gauge this. Just ask people if they think they would be confused.

DavidSagan · 2026-04-01T04:42:17Z

The schema now requires a custom validator to catch this: (your "this is just a few lines, don't worry, Axel, I got this")
def validate_no_duplicate_keys(elements_list):
    keys_seen = set()
    for item in elements_list:
        for key in item:
            if key in keys_seen:
                raise ValidationError(f"Duplicate key '{key}' in elements list")
            keys_seen.add(key)
If any one tool in the ecosystem forgets this check, it will silently process the file with unpredictable behavior: maybe using the first drift1, maybe the last, maybe both.

There are many checks a validator must do. This includes spelling checks, etc. What you show would be a small part of validation.

ax3l · 2026-04-01T04:44:57Z

Yes a validator can check many things, but this misses the point.

The difference is that every other validation check catches errors that are inherent to the problem domain: wrong element types, out-of-range parameters, misspelled names. Those errors exist regardless of how you design the schema.

The key/list/duplicate check is different: it only exists because the schema introduced a representation that makes it possible. You're not catching a physics mistake, you're catching an artifact of a design choice. A dictionary makes this class of error structurally impossible.

The best validation check is the one you don't need to write.

DavidSagan · 2026-04-01T04:47:44Z

The best validation check is the one you don't need to write.

I agree. But I believe this is being blown all out of proportion.

ax3l · 2026-04-01T04:47:52Z

First of all, I do not accept Claude as an authority on any of this. Especially since answers may be manipulated depending upon the questions. There is a much better way to gauge this. Just ask people if they think they would be confused.

Don't worry, I suggested you early on to critically self-review it with an authority of your choice.

I am still waiting for a concrete use case example that motivates all this. It is hard to follow at all why this is worth spending time on now.

Your only concrete use case is built on the wrong premise that your current C++ YAML lib does not support it. It does in fact preserve mapping order, and you have a bug in pals-cpp in how you use it. Here is the fix, btw.

ax3l · 2026-04-01T04:52:12Z

Sorry, clicked the wrong button.

The best validation check is the one you don't need to write.

I agree. But I believe this is being blown all out of proportion.

The concrete cases I laid out above show how this change blocks support from existing structural serialization/deserialization tools and leaves us "writing all the validation (structural and physics/PALS meaning) from scratch". This is not how we do this; for modern standards we want to rely on structural schemas, diff tools, declarative validation schemes, etc. One builds on the other; we do not mix parsing YAML, doing schema validation, and doing physics mapping in the same layer.

I stand behind all problems 1-4 and want to see them either solved or see a strong motivation for why adding this is worth it.

I am repeating myself: for such structural changes, please lead with concrete examples/needs that make this necessary, not a vague "in the future / maybe [no concrete case]".

Overlooked impact of proposed change.

jlvay

After finally understanding what this PR was doing and reviewing, I think that the proposed option would bring confusion and additional work for the parsers that is not needed.

jlvay · 2026-04-06T18:09:03Z

Decided to close after extensive discussions.

Clarified that ordered list with unique keys can be used in place of a unordered dictionary.

973ef84

DavidSagan requested review from EZoni, ax3l, cemitch99 and jlvay March 9, 2026 21:12

cemitch99 previously approved these changes Mar 30, 2026

EZoni reviewed Mar 30, 2026

Comment thread source/conventions.md Outdated

Add example.

a424f64

DavidSagan dismissed cemitch99’s stale review via a424f64 March 30, 2026 20:44

EZoni reviewed Mar 30, 2026

Comment thread source/conventions.md

Minor correction.

51f20fe

EZoni reviewed Mar 30, 2026

Comment thread source/conventions.md Outdated

Apply suggestion from @EZoni

6a028aa

EZoni previously approved these changes Mar 30, 2026

ax3l self-assigned this Mar 31, 2026

ax3l reviewed Mar 31, 2026

ax3l requested changes Mar 31, 2026

ax3l closed this Apr 1, 2026

ax3l reopened this Apr 1, 2026

ax3l changed the title ~~Clarified that ordered list with unique keys can be used in place of a unordered dictionary.~~ Proposal: An Ordered list w/ unique keys can be used in place of a unordered dictionary Apr 1, 2026

ax3l added the "schema: structural changes" label (Structural changes to the PALS schema) Apr 1, 2026

EZoni self-requested a review April 1, 2026 22:16

ax3l added the "invalid" label (This doesn't seem right) Apr 2, 2026

jlvay reviewed Apr 2, 2026

jlvay closed this Apr 6, 2026
