Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -1,5 +1,6 @@
.venv/
__pycache__/
*.egg-info/
.vscode/
.DS_Store
TODO.md
22 changes: 17 additions & 5 deletions CLAUDE.md
@@ -8,12 +8,20 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

## Running

The CLI entry point is `src/main.py`. Invoke as:
Install in editable mode (one-time):

```sh
py src/main.py <prefix> <variable> [options]
pip install -e .
```

Then invoke as:

```sh
psc-plot <prefix> <variable> [options]
```

Or directly via `py src/main.py <prefix> <variable> [options]` (backward-compatible).

Where `<prefix>` selects the data source: field prefixes (`pfd`, `pfd_moments`, `gauss`, `continuity`) or particle prefixes (`prt`). Examples live in `plots/check.sh` and `plots/check2.sh` and serve as the de-facto smoke tests / usage reference.

Common flags:
@@ -25,7 +33,7 @@ Required environment (see `src/lib/config.py`):
- `PSC_PLOT_FFMPEG_BIN` — optional, falls back to `which ffmpeg`; needed for saving animations
- `PSC_PLOT_DASK_NUM_WORKERS` — optional, defaults to 1

There is no test suite, lint config, or build system in this repo — running the example commands in `plots/check*.sh` against a real data directory is how changes get validated.
There is no test suite or lint config — running the example commands in `plots/check*.sh` against a real data directory is how changes get validated. Package management is via `pyproject.toml` (setuptools backend).

## Architecture

@@ -35,7 +43,7 @@ The code lives under `src/lib/` and is organized around three concepts: **source

1. `parsing.get_parsed_args()` (`src/lib/parsing/parse.py`) builds an argparse parser with one subparser per file prefix, each populated with the same shared set of optional arguments. The parsed namespace is converted to a typed `FieldArgs` or particle-equivalent (`args_base.ArgsUntyped.to_typed`).
2. The typed args' `get_animation()` constructs a `FieldLoader`/particle loader (a `DataSource`), then calls `compile_source` (`src/lib/data/compile.py`) to wrap it in a `DataSourceWithPipeline` made of the user-supplied `Adaptor` list. If no `Versus` adaptor is present, a default one is appended (`y,z` vs `t`) — this is what selects axes and time dim.
3. `source.get_data()` loads raw data and runs the pipeline, returning a `DataWithAttrs` (a `Field` wrapping `xr.DataArray`, or a `List` wrapping `pd.DataFrame` / `dd.DataFrame`).
3. `source.get_data()` loads raw data and runs the pipeline, returning a `DataWithAttrs` (a `Field` wrapping `xr.Dataset`, or a `List` wrapping `pd.DataFrame` / `dd.DataFrame`).
4. `get_plot(data)` (`src/lib/plotting/get_plot.py`) dispatches on `data` type and `metadata.spatial_dims`/`time_dim` to choose a concrete `Plot` subclass (static/animated, 1D/2D, polar, scatter).
5. Hooks (`src/lib/plotting/hooks/`) such as `--scale log`, `--grid`, `--vline`, `--fit` are appended onto the chosen plot before `show()`/`save()`.
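
The flow in steps 2 and 3 above can be pictured with a small standalone sketch. It is not code from this repository: `Scale`, the no-op `Versus`, `SourceWithPipeline`, and this `compile_source` are hypothetical stand-ins that use plain lists instead of xarray or pandas data. The sketch only illustrates the idea of wrapping a loader in an ordered adaptor pipeline and appending a default `Versus` when the user supplied none.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scale:
    """Hypothetical adaptor: multiplies every value by a constant."""

    factor: float

    def apply(self, data: list) -> list:
        return [x * self.factor for x in data]


@dataclass(frozen=True)
class Versus:
    """Hypothetical stand-in for the axis-selecting adaptor (a no-op here)."""

    spatial_dims: tuple = ("y", "z")
    time_dim: str = "t"

    def apply(self, data: list) -> list:
        return data


@dataclass(frozen=True)
class SourceWithPipeline:
    """Pairs a loader callable with the adaptors to run on its output."""

    load: Callable[[], list]
    adaptors: tuple

    def get_data(self) -> list:
        data = self.load()
        for adaptor in self.adaptors:  # adaptors run in the order given
            data = adaptor.apply(data)
        return data


def compile_source(load: Callable[[], list], adaptors: list) -> SourceWithPipeline:
    adaptors = list(adaptors)
    # Mirror the documented default: add a Versus only if none was supplied.
    if not any(isinstance(a, Versus) for a in adaptors):
        adaptors.append(Versus())
    return SourceWithPipeline(load, tuple(adaptors))


source = compile_source(lambda: [1.0, 2.0, 3.0], [Scale(2.0)])
result = source.get_data()
```

Keeping the default-adaptor decision inside the compile step means the loader itself never needs to know which axes end up on the plot.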

@@ -56,10 +64,14 @@ The code lives under `src/lib/` and is organized around three concepts: **source

### Data wrapper

`src/lib/data/data_with_attrs.py` defines `DataWithAttrs[D, MD]` and concrete `Field` (xarray-backed), `FullList` (pandas), `LazyList` (dask). Frozen dataclasses; mutate via `assign_data` / `assign_metadata` / `assign`. `Metadata` carries `var_name`, `var_latex`, `name_fragments`, `spatial_dims`, `time_dim`, `color_dim`. The unusual `**` unpacking via `__getitem__` + `keys()` is what `Metadata.create_from` and `assign` use to round-trip values between subclasses (`FieldMetadata` vs `ListMetadata`).
`src/lib/data/data_with_attrs.py` defines `DataWithAttrs[D, MD]` and concrete `Field` (`xr.Dataset`-backed), `FullList` (pandas), `LazyList` (dask). Frozen dataclasses; mutate via `assign_data` / `assign_metadata` / `assign`. `Metadata` carries `var_name`, `var_latex`, `name_fragments`, `spatial_dims`, `time_dim`, `color_dim`. `FieldMetadata` also carries `prefix` (the file prefix, e.g. `"pfd_moments"`). The unusual `**` unpacking via `__getitem__` + `keys()` is what `Metadata.create_from` and `assign` use to round-trip values between subclasses (`FieldMetadata` vs `ListMetadata`).

`Field.data` is an `xr.Dataset` containing multiple variables; `Field.active_data` returns the `xr.DataArray` for `metadata.var_name` (the variable being plotted). `Field.with_active_data(da)` replaces the active variable and drops sibling variables that are no longer grid-compatible. Most code should use `active_data` rather than `data` directly. `BareAdaptor.apply_field` handles this automatically via the shim in `adaptor.py`.

The class-level `data: ...`/`metadata: ...` annotations on the subclasses look redundant but are intentional — see the comment in `DataWithAttrs.__init__`. They are needed so `isinstance`-narrowed code gets the concrete types; don't "clean them up."
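
As an illustration of the `**` unpacking mentioned above, here is a minimal sketch of how a frozen dataclass can behave like a mapping by defining `keys()` and `__getitem__`. The classes below are simplified, hypothetical versions with only a few of the documented fields, and this `create_from` is an assumption about the general shape of the technique, not the repository's implementation.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Metadata:
    var_name: str
    time_dim: Optional[str] = None

    # keys() plus __getitem__ is all Python needs to accept `**instance`.
    def keys(self):
        return [f.name for f in fields(self)]

    def __getitem__(self, key: str):
        return getattr(self, key)

    @classmethod
    def create_from(cls, other: "Metadata", **overrides):
        # Keep only the fields the target class knows about, then apply overrides.
        wanted = {f.name for f in fields(cls)}
        shared = {k: v for k, v in {**other}.items() if k in wanted}
        return cls(**{**shared, **overrides})


@dataclass(frozen=True)
class FieldMetadata(Metadata):
    prefix: str = ""


@dataclass(frozen=True)
class ListMetadata(Metadata):
    pass


field_md = FieldMetadata(var_name="rho", time_dim="t", prefix="pfd_moments")
list_md = ListMetadata.create_from(field_md)  # `prefix` is dropped
back = FieldMetadata.create_from(list_md, prefix="pfd_moments")  # and restored
```

Because the instances stay frozen, every conversion produces a new object rather than mutating the original.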

### Derived variables

`src/lib/derived_field_variables/registry.py` and `derived_particle_variables/registry.py` register computed variables per file prefix using `@derived_field_variable("pfd_moments")` decorators. The decorated function's parameter names declare the dependencies (raw or other derived variables); the loader resolves and computes them on demand.

`--derive` works for both field and particle data. For fields, it operates on the underlying `xr.Dataset` and can reference any variable in the dataset or resolve names from the derived-variable registry (via `FieldMetadata.prefix`). It updates `metadata.var_name` to point to the newly created variable.
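
The "parameter names declare the dependencies" idea can be sketched with `inspect.signature`. This is a standalone approximation, not the repository's registry: `REGISTRY`, `resolve`, and the variable names `flux`, `density`, `velocity`, and `energy` are hypothetical, and plain floats stand in for the real array data.

```python
import inspect
from typing import Callable

# prefix -> {variable name -> function computing it}
REGISTRY: dict = {}


def derived_field_variable(prefix: str) -> Callable:
    """Register the decorated function under `prefix`, keyed by its name."""

    def decorator(func: Callable) -> Callable:
        REGISTRY.setdefault(prefix, {})[func.__name__] = func
        return func

    return decorator


@derived_field_variable("pfd_moments")
def velocity(flux, density):  # depends on two raw variables
    return flux / density


@derived_field_variable("pfd_moments")
def energy(density, velocity):  # depends on a raw and a derived variable
    return 0.5 * density * velocity**2


def resolve(prefix: str, name: str, raw: dict):
    """Return `name`, computing derived variables recursively on demand."""
    if name in raw:
        return raw[name]
    func = REGISTRY[prefix][name]
    # The parameter names of the function are its declared dependencies.
    deps = inspect.signature(func).parameters
    return func(**{dep: resolve(prefix, dep, raw) for dep in deps})


raw = {"flux": 6.0, "density": 2.0}
value = resolve("pfd_moments", "energy", raw)
```

This sketch does not cache intermediate results or detect dependency cycles; a real loader would likely need both.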
25 changes: 25 additions & 0 deletions pyproject.toml
@@ -0,0 +1,25 @@
[build-system]
requires = ["setuptools>=68.0"]
build-backend = "setuptools.build_meta"

[project]
name = "psc-plot"
version = "0.1.0"
requires-python = ">=3.13"
dependencies = [
    "numpy>=2.0",
    "xarray>=2024.0",
    "pandas>=2.0",
    "dask>=2024.0",
    "h5py>=3.0",
    "scipy>=1.10",
    "matplotlib>=3.8",
    "lark>=1.0",
    "pscpy @ git+https://github.com/psc-code/pscpy.git",
]

[project.scripts]
psc-plot = "lib.cli:main"

[tool.setuptools.packages.find]
where = ["src"]
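
For readers unfamiliar with `[project.scripts]`, the value `"lib.cli:main"` is a `module:attribute` reference. The installed `psc-plot` command is roughly equivalent to importing that module and calling the named object. The helper below is a hypothetical sketch of that lookup (`load_entry_point` is not a setuptools or pip API); it is demonstrated on the standard-library `json` module so that it runs without this package installed.

```python
from importlib import import_module


def load_entry_point(spec: str):
    """Resolve a 'package.module:attr' reference to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = import_module(module_name)
    if attr_path:
        # The attribute part may be dotted, e.g. "Class.method".
        for attr in attr_path.split("."):
            obj = getattr(obj, attr)
    return obj


# With the package installed, load_entry_point("lib.cli:main") would return main.
dumps = load_entry_point("json:dumps")
encoded = dumps({"ok": True})
```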
21 changes: 21 additions & 0 deletions src/lib/cli.py
@@ -0,0 +1,21 @@
import dask
import matplotlib.pyplot as plt

from lib import parsing
from lib.config import CONFIG


def main():
    dask.config.set(num_workers=CONFIG.dask_num_workers)

    args = parsing.get_parsed_args()

    anim = args.get_animation()

    if args.show:
        anim.show()
    if args.save is not None:
        if CONFIG.ffmpeg_bin:
            plt.rcParams["animation.ffmpeg_path"] = CONFIG.ffmpeg_bin
        args.save.mkdir(exist_ok=True)
        anim.save(args.save)
20 changes: 2 additions & 18 deletions src/main.py
@@ -1,19 +1,3 @@
import dask
import matplotlib.pyplot as plt
from lib.cli import main

from lib import parsing
from lib.config import CONFIG

dask.config.set(num_workers=CONFIG.dask_num_workers)

args = parsing.get_parsed_args()

anim = args.get_animation()

if args.show:
    anim.show()
if args.save is not None:
    if CONFIG.ffmpeg_bin:
        plt.rcParams["animation.ffmpeg_path"] = CONFIG.ffmpeg_bin
    args.save.mkdir(exist_ok=True)
    anim.save(args.save)
main()