select and order of columns #1126

@thbar

Today I had to figure out how to generate a CSV with Explorer.DataFrame, in a way that ensures the CSV columns will be in a specific order.

Here are my findings, which led me to select:

  1. If you feed a list of maps to the DataFrame, the fields appear to be defined in alphabetical order (at least when IO.inspect is called on it):
#Explorer.DataFrame<
  Polars[1 x 40]
  accessibilite_pmr string ["Accessible mais non réservé PMR"]
  adresse_station string ["26 rue des écluses, 17430 Champdolent"]
  cable_t2_attache boolean [false]
  code_insee_commune string ["17085"]
  condition_acces string ["Accès libre"]
  contact_amenageur string ["amenageur@example.com"]
  contact_operateur string ["operateur@example.com"]
  coordonneesXY string ["[-0.799141,45.91914]"]
  date_maj string ["2024-10-17"]
  date_mise_en_service string ["2024-10-02"]
  2. Calling select with an (ordered) list of fields appears to implicitly order the columns as provided:
columns = ["nom_amenageur", ...]
valid_record = DB.Factory.IRVE.generate_row()
Explorer.DataFrame.new([valid_record])
|> Explorer.DataFrame.select(columns)

#Explorer.DataFrame<
  Polars[1 x 40]
  nom_amenageur string ["Métropole de Nulle Part"]
  siren_amenageur string ["123456782"]
  contact_amenageur string ["amenageur@example.com"]
  nom_operateur string ["Opérateur de Charge"]
  3. This ordering behavior appears to be an expectation of some users of Polars:
  4. The documentation (https://hexdocs.pm/explorer/Explorer.DataFrame.html#select/2) does not say anything specific about column order

(essentially, maybe the behavior is unspecified, or covered by tests only, or just by Polars itself)

So this leads me to wonder whether I can safely (in a future-proof way) rely on select to order the columns of a generated CSV, or not.
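
In the meantime, one defensive option might be to check the column order explicitly before writing the CSV, so that a change in behavior fails loudly instead of silently reordering the file. Below is a minimal sketch of that idea, not a statement of what Explorer guarantees. The shortened column list and the hand-written record are hypothetical stand-ins for the real 40-column list and for DB.Factory.IRVE.generate_row(), and Mix.install is only there to make the script self-contained.

```elixir
# Assumption: run as a standalone script; in a Mix project, drop this line.
Mix.install([:explorer])

alias Explorer.DataFrame, as: DF

# Hypothetical, shortened column list (the real one has 40 entries).
columns = ["nom_amenageur", "siren_amenageur", "contact_amenageur"]

# Hypothetical record standing in for DB.Factory.IRVE.generate_row().
record = %{
  contact_amenageur: "amenageur@example.com",
  nom_amenageur: "Métropole de Nulle Part",
  siren_amenageur: "123456782"
}

df =
  [record]
  |> DF.new()
  |> DF.select(columns)

# Guard: stop if the dataframe columns are not in the requested order.
if DF.names(df) != columns do
  raise "unexpected column order: #{inspect(DF.names(df))}"
end

# Serialize to an in-memory CSV and check the header line as well.
csv = DF.dump_csv!(df)
[header | _rows] = String.split(csv, "\n")

if String.split(header, ",") != columns do
  raise "unexpected CSV header: #{inspect(header)}"
end
```

The same two checks could live in a regular test instead, which would catch a change in ordering behavior when Explorer or Polars is upgraded.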

Does anyone know for certain?

Thank you!
