Skip to content

Package SST for PyPI as sstrack (transformers backend)#26

Draft
thompsonmj wants to merge 8 commits into
mainfrom
pypi
Draft

Package SST for PyPI as sstrack (transformers backend)#26
thompsonmj wants to merge 8 commits into
mainfrom
pypi

Conversation

@thompsonmj
Copy link
Copy Markdown

@thompsonmj thompsonmj commented May 29, 2026

NOTE!
Everything in this comment and draft PR was generated and pushed by Claude Opus 4.8 without my explicit instruction. I am still evaluating the code and commentary generated and pushed by Claude on my behalf here and didn't intend for it to go up yet. I'm not immediately removing it though to get a chance to review it.


Closes #22 (proposed). Packages SST for PyPI as sstrack (import stays from sst import ...).

What this does

  • Build/packaging: hatchling + src/ layout, dynamic version from src/sst/__init__.py, lean deps, raw extra for rawpy, single sst console script. Drops the stale requirements.txt/uv.lock.
  • Model backend: replaces the vendored SAM2 and Grounding DINO copies with HuggingFace transformers (Sam2VideoModel/Sam2VideoProcessor, AutoModelForZeroShotObjectDetection). Weights auto-download from the Hub; no manual checkpoint step. Device/dtype resolved per host (bf16 on CUDA, fp32 on CPU).
  • CLI: sst with subcommands segment, segment-and-crop, retrieve, prepare-mask, mask-from-crop; lazy imports so import sst and sst --help stay light.
  • Scope cuts: Streamlit GUI not packaged; OC-CCL finetuning moved to experiments/ (still on the pre-migration vendored path, documented in experiments/README.md).
  • CI: lint + tests on PRs; publish via PyPI Trusted Publishing (OIDC) on release.

Verification

ruff clean; pytest 10/10; uv build produces a clean wheel + sdist (no vendored code, weights, experiments, gui, or data); twine check passes; a real CPU smoke test of Sam2Tracker.segment passes.

Open discussion points (deliberately left as-is; want input)

These are pre-existing v1 behaviors preserved verbatim through the migration, flagged by review:

  1. prepare-mask labels the foreground 5 (src/sst/prepare_starter_mask.py). Combined with the range(1, max+1) decomposition in segment*, a single specimen becomes 5 tracked objects (4 empty), ~5x slower. Rationale traced to the 5-part NEON beetle id scheme. Should this emit 1 for the single-object case?
  2. retrieve ranks ascending (src/sst/trait_retrieval.py), so --top_k returns the lowest cycle-consistency matches. Likely a latent bug.
  3. retrieve close step keeps the original as support both passes; may not match the intended target->orig cycle.

Release-time housekeeping (not in this PR)

Tag scripts-era as v1.1.0; set CITATION.cff date-released; configure PyPI trusted publishing for sstrack; port OC-CCL to transformers (follow-up).

thompsonmj and others added 8 commits May 29, 2026 11:01
Replace the placeholder uv_build config with a hatchling src-layout build,
real project metadata, a lean runtime dependency set, and a raw extra for
rawpy. Drop the stale pinned requirements.txt and uv.lock, which no longer
match the package, and ignore build artifacts plus the local model-repo
clones.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Rewrite sam_utils around a Sam2Tracker wrapper that drives HuggingFace
Sam2VideoModel/Sam2VideoProcessor, with text detection on
AutoModelForZeroShotObjectDetection and automatic masks via the
mask-generation pipeline. Weights now download from the Hub, removing the
manual checkpoint step and the hardcoded scratch path. Device and dtype
are resolved per host (bfloat16 on CUDA, float32 on CPU) and torch and
transformers are imported lazily so the pure helpers stay light. Pin
__version__ to 2.0.0.

Per-frame masks are mapped through output.object_ids rather than a cached
id list, since add_inputs_to_inference_session consumes the obj_ids it is
given and SAM2 can omit objects it considers absent from a frame.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The transformers backend makes the vendored model copies and local Hydra
configs dead weight in the wheel, so remove them. Move the OC-CCL
finetuning scripts to experiments/, which still depend on the vendored
SAM2 and are out of scope for the v2.0.0 package; experiments/README.md
records how to run them from the pre-migration v1.1.0 tag.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Introduce sst.__main__ with a two-phase argparse dispatch so the top-level
help lists subcommands without importing torch and only the chosen
subcommand's module is loaded. Each script now exposes add_arguments and
run instead of parsing at import time, and uses the Sam2Tracker API and a
--model/--device surface. Fold the RAM-heavy per-image segmentation into
segment-and-crop via --per-image with lazy rawpy. Add tests for the CLI
dispatch, the light-import contract, the numpy helpers, and the mask
tools.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Build and twine-check the distribution on every pull request, and publish
to PyPI via OIDC trusted publishing when a GitHub Release is published, so
no API token is stored in the repository.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Document pip install sstrack and the sst subcommands in place of the
uv run python script invocations, explain the first-run HuggingFace weight
download and cache behavior, and note that OC-CCL finetuning moved to
experiments. Bump the citation to v2.0.0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bare sst previously parsed an empty argv and exited silently, contradicting
the documented behavior of listing subcommands. Print the top-level help
instead, and cover it with a test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a check job that runs ruff and pytest, and make publish depend on it,
so a packaging regression cannot reach PyPI. The job installs only the
lightweight deps the torch-free test suite needs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
img = Image.open(args.mask_image_path).convert("L")
arr = np.array(img)
arr[arr != 255] = 0
arr[arr == 255] = 5
Copy link
Copy Markdown
Author

@thompsonmj thompsonmj May 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE!
Similarly to the PR note, Claude Opus 4.8 posted this on my behalf. I still need to review it and the context it lies within.


Discussion anchor: the object-id convention.

This labels the single foreground region with object id 5. Downstream, segment/segment-and-crop decompose the support mask with range(1, mask.max()+1), so a lone region labeled 5 is expanded into 5 tracked objects (ids 1–5), of which 1–4 are empty and still propagated through SAM2 — roughly 5x the work for one specimen.

The range(1..max) convention itself is intentional and correct for multi-part masks: it mirrors data/neon_beetles/demo.py, where the NEON beetle masks encode the five parts (Head, Pronotum, Elytra, Antenna, Legs) as pixel values 1–5. The 5 here looks like a carry-over of that max part id into the single-object specimen helper (it was 5 in the original main too, no comment).

Question for this PR: should prepare-mask emit 1 for the single-object case (one tracked object, ~5x faster, same result), accepting that it changes the documented mask value from 5 to 1? Left as-is for now pending your call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Package for PyPI publication

1 participant