
Export: latest-day filter, project-scoped CLI, and bulk API hardening #34

Open
leostar0412 wants to merge 3 commits into cppalliance:master from leostar0412:fix/export-enhancements

Conversation

@leostar0412
Collaborator

@leostar0412 leostar0412 commented May 8, 2026

Summary

Improves the export story end-to-end: shared latest calendar-day handling for --since last, project filtering in the CLI, and more reliable bulk export in the web API (state, empty runs, and incremental behavior aligned with the CLI).

What’s in this change

  • Shared day filter (utils/export_day_filter.py) — centralizes “latest activity day” and session overlap logic for exports that target the most recent UTC day.
  • CLI (scripts/export.py) — supports --since last with the new helper, project scoping, and keeps incremental / full export behavior consistent with the API where it matters.
  • HTTP bulk export (api/export_api.py) — all / last / incremental modes; no empty zip when there’s nothing to export (HTTP 422 + JSON); clearer export state payload (last_export_session_count and legacy export_count); locked + atomic updates to export_state.json to avoid torn reads and lost merges.
  • Web UI (static/js/app.js) — handles 422 from bulk export with a clear message; shows “sessions in last export” using the new state fields.
  • Docs (README.md) — documents new behavior (e.g. empty bulk export response).
  • Tests — day filter, project filter, CLI args, and bulk export API tests.
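
The "latest activity day" selection described in the first bullet can be sketched roughly as follows. This is a minimal illustration and not the code in utils/export_day_filter.py: the session dictionaries with `start` and `end` ISO timestamps, and the helper names `to_utc_date` and `sessions_for_latest_day`, are assumptions made for the example.

```python
from datetime import date, datetime, timezone


def to_utc_date(iso_timestamp: str) -> date:
    """Parse an ISO 8601 timestamp and return its UTC calendar date."""
    # A trailing "Z" is rewritten because datetime.fromisoformat
    # rejects it before Python 3.11.
    parsed = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption for this sketch: naive timestamps are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).date()


def sessions_for_latest_day(sessions):
    """Return (latest_day, sessions whose start..end span overlaps that day)."""
    if not sessions:
        return None, []
    bounds = [(to_utc_date(s["start"]), to_utc_date(s["end"])) for s in sessions]
    latest_day = max(end for _, end in bounds)
    selected = [
        s for s, (first, last) in zip(sessions, bounds)
        if first <= latest_day <= last
    ]
    return latest_day, selected


# Hypothetical sessions; "b" crosses midnight UTC, "c" uses a +02:00 offset.
sessions = [
    {"id": "a", "start": "2026-05-06T10:00:00Z", "end": "2026-05-06T11:00:00Z"},
    {"id": "b", "start": "2026-05-07T23:30:00Z", "end": "2026-05-08T00:15:00Z"},
    {"id": "c", "start": "2026-05-08T09:00:00+02:00", "end": "2026-05-08T10:00:00+02:00"},
]
latest_day, selected = sessions_for_latest_day(sessions)
print(latest_day, [s["id"] for s in selected])
```

Comparing calendar dates rather than raw timestamps is what lets a session that started late on the previous day still count toward the latest day, as long as it ran past midnight UTC.
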

How to test

pip install -r requirements-dev.txt
pytest

Manually: run the app, try Export all / Export new since last with and without new sessions; confirm a no-op export shows a clear error (not a tiny empty zip). From the CLI, exercise --since last and --project against a small projects directory.

Closes #33

Summary by CodeRabbit

  • New Features

    • Incremental export mode to export only new or changed sessions since the last export.
    • Case-insensitive project name matching via substring search.
    • UI: “Export new since last export” now uses incremental mode; session header may show per-model badges.
  • Bug Fixes

    • “Last” export selects sessions by the latest UTC calendar day.
    • Bulk export returns HTTP 422 with JSON when there is nothing to export.
  • Documentation

    • Expanded README and CLI docs clarifying --since, --project matching, and export filename semantics.
  • Tests

    • Added API, CLI parsing, day-filtering, project-filter, and state-store tests.
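
The case-insensitive substring matching mentioned under New Features might look like the sketch below. It is illustrative only: the `name` key for the internal project name and the function name `project_matches` are assumptions, and the PR's own `_project_matches` may differ in detail.

```python
def project_matches(project: dict, needle: str) -> bool:
    """Case-insensitive substring match on the internal or display name."""
    needle = needle.casefold()
    # Hypothetical keys: "name" (internal) and "display_name" (shown in lists).
    candidates = (project.get("name", ""), project.get("display_name", ""))
    return any(needle in (value or "").casefold() for value in candidates)


# Hypothetical project records for illustration.
projects = [
    {"name": "-home-user-boost-json", "display_name": "Boost JSON"},
    {"name": "-home-user-notes", "display_name": "Notes"},
]
matched = [p["display_name"] for p in projects if project_matches(p, "boost")]
print(matched)
```

Note that in this sketch an empty needle matches every project, so a caller would probably want to treat an empty `--project` value as "no filter" explicitly.
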

…nd latest-day slice

- Updated README to clarify bulk export options, including incremental updates and latest-day slice.
- Modified CLI export flags to support `--since incremental` for exporting only new or changed sessions since the last export.
- Implemented logic in the export API to handle the new export options, including returning a 422 error when there is nothing to export.
- Enhanced session filtering for the latest activity day and improved project matching for exports.
- Added unit tests for new export features and validation of state management.
@coderabbitai

coderabbitai Bot commented May 8, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 68ae775d-f680-4b84-9e96-658c7111ab81

📥 Commits

Reviewing files that changed from the base of the PR and between a4171f4 and b9d4d39.

📒 Files selected for processing (2)
  • tests/test_export_state_store.py
  • utils/export_state_store.py
✅ Files skipped from review due to trivial changes (1)
  • tests/test_export_state_store.py

📝 Walkthrough


Adds UTC calendar-day filtering for "last" exports, a concurrency-safe atomic export-state store, an "incremental" export mode, CLI project matching by display name, API/CLI/UI wiring for the modes, tests, and README updates.

Changes

Incremental Export & Calendar-Day Sessions with Improved Project Matching

| Layer / File(s) | Summary |
| --- | --- |
| Date Filtering Utilities (utils/export_day_filter.py) | New UTC calendar-day helpers: iso_timestamp_to_date(), session_calendar_bounds(), day_overlaps_session(), and collect_sessions_for_latest_activity_day() for selecting sessions overlapping the latest activity day. |
| Export State Store (utils/export_state_store.py) | New shared state store with EXPORT_STATE_FILE, export_state_lock(), load_export_state_from_disk(), and atomic_write_export_state() (POSIX flock or threading.Lock fallback, legacy migration, atomic temp-file writes). |
| API Bulk Export & State Wiring (api/export_api.py) | /api/export now accepts "all", "last" (via calendar-day collector), and "incremental" (state-based filtering); state I/O is lock-guarded and atomic; returns HTTP 422 JSON {"error":"Nothing to export","since":...} when nothing exported; /api/export/state includes last_export_session_count. |
| CLI Project Matching (scripts/export.py) | Adds _project_matches(project, needle) for case-insensitive substring matching against internal name or display_name; cmd_list and stats filtering use it. |
| CLI Export Helpers (scripts/export.py) | Adds _zip_export_basename() for descriptive zip filenames, _prefixed_export_option_overrides() to recover flags that appear before the export subcommand, and _append_export_for_session() to centralize per-session export + manifest logic; delegates state I/O to the shared store. |
| CLI Since=Last Flow (scripts/export.py) | --since last now computes the latest UTC activity day via collect_sessions_for_latest_activity_day() and exports only overlapping sessions; early-return reporting when none qualify. |
| CLI Export Loop Refactor (scripts/export.py) | Non-last loops enforce incremental skipping when since=="incremental", count unchanged-mtime skips, apply exclusions/untitled skips, and assemble ZIP via helpers. |
| Frontend Incremental Export (static/js/app.js) | Projects page displays last_export_session_count (fallback export_count) and wires "Export new since last export" to bulkExport('incremental'); bulkExport surfaces JSON error messages when present. |
| Tests: Date Utilities (tests/test_export_day_filter.py) | Unit tests for ISO timestamp parsing, calendar bounds, day overlap, latest-day selection, parse-failure logging, and abort-on-parse-error behavior. |
| Tests: Project Filtering & ZIP Naming (tests/test_export_project_filter.py) | Tests for _project_matches() (substring, case-insensitive, display/internal name) and _zip_export_basename() deterministic naming; integration-style test for incremental empty-export reporting. |
| Tests: CLI Arguments (tests/test_cli_args.py) | Regression tests ensuring --since incremental is accepted and that --since/--out options placed before the export subcommand are recovered. |
| Tests: API Behavior & State Store (tests/test_export_api_bulk.py, tests/test_export_state_store.py) | API tests for invalid since, non-object JSON request errors, empty-export 422 JSON response, /api/export/state mapping; tests for state loader normalization and lock fallback behavior. |
| Documentation (README.md) | Documents HTTP 422 "Nothing to export" response and JSON shape, clarifies --since last (UTC calendar day) and zip naming, explains --since incremental state behavior, and specifies --project case-insensitive substring matching with examples. |
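
The locked, atomic state writes summarized for the export state store can be sketched as below. This is a simplified illustration under stated assumptions, not the contents of utils/export_state_store.py: the names `state_lock` and `atomic_write_json` and the sibling `.lock` file are hypothetical, and the non-POSIX fallback only serializes threads within one process.

```python
import json
import os
import tempfile
import threading
from contextlib import contextmanager

try:
    import fcntl  # POSIX only
except ImportError:  # for example on Windows
    fcntl = None

_thread_lock = threading.Lock()  # in-process fallback only


@contextmanager
def state_lock(state_path):
    """Hold an exclusive lock for the state file while the block runs."""
    if fcntl is None:
        # Serializes threads in this process, not other processes.
        with _thread_lock:
            yield
        return
    # Hypothetical sibling lock file; the state file itself is never locked.
    with open(f"{state_path}.lock", "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def atomic_write_json(state_path, data):
    """Write JSON to a temp file in the same directory, then rename it."""
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the file in one step, so readers never
        # observe a half-written state file.
        os.replace(tmp_path, state_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def demo():
    """Write and read back a state file inside a throwaway directory."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "export_state.json")
        with state_lock(path):
            atomic_write_json(path, {"sessions": {"abc": 123.0}})
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)


result = demo()
print(result)
```

Creating the temp file in the same directory as the target keeps the final rename on one filesystem, which is what makes the replacement atomic.
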

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

enhancement

Possibly related reviewers

  • clean6378-max-it
  • wpak-ai

Poem

🐰 Exports hop to the latest day bright,

Locks hold the state through the softest night.
Projects now match by the name you see,
Incremental trims only what's new to me.
Zip names sing dates, tests kept snug and tight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the three main objectives: latest-day filter, project-scoped CLI, and bulk API hardening, matching the scope of changes across utils, scripts, API, and UI. |
| Linked Issues check | ✅ Passed | The PR successfully addresses both objectives from #33: (1) --since last now filters by latest UTC activity day and returns HTTP 422 with clear JSON when nothing exports [api/export_api.py, scripts/export.py]; (2) --project now matches both display_name and internal name case-insensitively [scripts/export.py]. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the PR objectives: export state persistence (new utils/export_state_store.py), day-filter logic (new utils/export_day_filter.py), CLI/API alignment, UI updates, documentation, and comprehensive test coverage. No unrelated changes detected. |


@leostar0412 leostar0412 requested a review from jonathanMLDev May 8, 2026 17:28
@leostar0412 leostar0412 self-assigned this May 8, 2026
@leostar0412 leostar0412 removed the request for review from jonathanMLDev May 8, 2026 17:29

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/export_api.py`:
- Around line 118-121: The handler currently treats invalid or misspelled since
values as "all" which causes accidental full exports; update the validation
around request.get_json(...) and the since variable so that if body.get("since")
exists but is not one of ("all","last","incremental") the endpoint returns a 400
(or 422) JSON error response instead of falling back to "all". Keep the allowed
set check (since in ("all","last","incremental")), but change the branch to
return a proper JSON error (including a short message and the invalid value) and
only proceed when since is valid or absent; reference the local variable since
and the request.get_json call to locate where to add the error response.

In `@README.md`:
- Line 22: Update the "Bulk export" documentation text to show the full HTTP 422
JSON response structure instead of just `Nothing to export`: replace the short
note with a clear example such as stating the API returns 422 JSON with
{"error": "Nothing to export", "since": <value>} (include both "error" and
"since" fields) so consumers can parse the response correctly; edit the sentence
mentioning Bulk export on the same line to include that exact JSON structure.
- Line 73: Update the README text that describes the `--since last` zip filename
to reflect the actual pattern shown in the example: change the statement that
says `last-MM-DD` to indicate the full pattern
`claude-code-export-last-MM-DD-YYYY-MM-DD.zip`, where the first MM-DD is the
latest calendar day (UTC) and the trailing YYYY-MM-DD is the export timestamp;
edit the descriptive sentence near the `--since last` explanation and any nearby
example text so they consistently show
`claude-code-export-last-MM-DD-YYYY-MM-DD.zip`.

In `@scripts/export.py`:
- Around line 464-465: The CLI still writes export_state.json with a plain
open(..., "w") in _save_state(), risking races with the API path that uses
atomic write+lock; change _save_state() to use the same atomic/locked write used
by the Flask/API code (i.e., acquire the same state lock, write to a temp file,
fsync, then os.replace/rename) and keep _load_state() reads protected by that
lock as well; ensure you validate/serialize the JSON into the temp file before
replacing and surface errors instead of leaving a truncated/invalid
export_state.json.

In `@utils/export_day_filter.py`:
- Around line 61-64: The try/except around parse_session silently swallows parse
errors (parse_session called with sess_info["path"]) which can make d = max(...)
pick the wrong day; change the except to capture the exception as e and record
it (e.g., logger.error or logger.exception) including sess_info["path"] and the
exception message, and then either re-raise the exception or propagate the
choice to the caller via a new parameter (e.g., abort_on_parse_error) so the
caller can decide to abort (raise) or continue (log+continue). Ensure the unique
symbols referenced are parse_session, sess_info["path"], and the downstream
latest-day computation (d = max(...)) so reviewers can locate and verify the
change.
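
The `since` validation requested for api/export_api.py can be illustrated without a web framework. In this sketch the function name `resolve_since` and the error wording are assumptions; only the allowed values and the default of "all" for an absent `since` come from the review comment.

```python
ALLOWED_SINCE = ("all", "last", "incremental")


def resolve_since(body):
    """Return (since, error); exactly one of the two is None."""
    if body is None:
        return "all", None  # no body at all: keep the default
    if not isinstance(body, dict):
        return None, {"error": "Request body must be a JSON object"}
    if "since" not in body:
        return "all", None  # absent is allowed, misspelled is not
    since = body["since"]
    if since not in ALLOWED_SINCE:
        # Echo the rejected value so the client can see what was sent.
        return None, {"error": "Invalid 'since' value", "since": since}
    return since, None


print(resolve_since({"since": "incremental"}))
print(resolve_since({"since": "lsat"}))
```

A handler would return the error dictionary with a 400 or 422 status and only start an export when the error is None, so a typo can no longer trigger a full export.
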

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e92e036d-4dd1-4c23-b59f-00d04ed6a876

📥 Commits

Reviewing files that changed from the base of the PR and between e0e0a76 and de2990c.

📒 Files selected for processing (9)
  • README.md
  • api/export_api.py
  • scripts/export.py
  • static/js/app.js
  • tests/test_cli_args.py
  • tests/test_export_api_bulk.py
  • tests/test_export_day_filter.py
  • tests/test_export_project_filter.py
  • utils/export_day_filter.py

Comment thread api/export_api.py Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread scripts/export.py
Comment thread utils/export_day_filter.py
…r handling

- Introduced a new utility module for managing export state with atomic I/O and locking mechanisms.
- Updated the export API to validate request bodies and handle invalid 'since' parameters, returning appropriate error responses.
- Enhanced the README to clarify bulk export options and CLI export flag behaviors.
- Added unit tests for new error handling scenarios in the export API and improved logging for session parsing failures.
- Refactored existing code to utilize the new export state management utilities, ensuring consistency across API and CLI.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
api/export_api.py (1)

79-83: ⚠️ Potential issue | 🟠 Major

Malformed JSON still falls back to a full export.

request.get_json(silent=True) returns None on decode failure, and this branch treats that the same as “no body”. A client sending invalid JSON with Content-Type: application/json can still reach since="all" instead of getting a 400/422.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/export_api.py` around lines 79 - 83, The current
request.get_json(silent=True) call in export_api.py masks JSON decode errors by
returning None and causing malformed JSON to be treated as an empty body; change
the JSON parsing to detect decode errors by calling
request.get_json(silent=False) (or wrapping request.get_json(...) in a
try/except catching werkzeug.exceptions.BadRequest/JSONDecodeError) so that if
parsing fails you return a 400/422 JSON error instead of falling back to empty
body; update the handling around the body variable and the existing
isinstance(body, dict) check so only a truly missing body (None) or an object
proceeds, while malformed JSON yields an immediate client error response.
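
One way to tell a missing body apart from a malformed one is to parse the raw payload directly, as in this framework-independent sketch. The function name `parse_optional_json_object` and the error strings are hypothetical; in a Flask handler the raw bytes would presumably come from `request.get_data()`.

```python
import json


def parse_optional_json_object(raw: bytes):
    """Return (body, error) for a raw request payload.

    An empty payload means "no body" and yields ({}, None). Malformed JSON
    or a non-object value yields (None, message).
    """
    if not raw.strip():
        return {}, None
    try:
        body = json.loads(raw)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        return None, f"Malformed JSON: {exc}"
    if not isinstance(body, dict):
        return None, "Request body must be a JSON object"
    return body, None


print(parse_optional_json_object(b'{"since": "last"}'))
print(parse_optional_json_object(b'{"since": '))
```

With this split, only a truly empty payload reaches the default export mode, while a truncated or invalid payload produces an error the caller can turn into a 400 or 422 response.
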
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/export.py`:
- Around line 91-143: The helper _prefixed_export_option_overrides only scans
for "export" and misses shared flags for other subcommands like "list" and
"stats"; update it to detect the first subcommand token among the command set
(e.g. "export", "list", "stats") instead of only "export" and parse the argv
prefix up to that token so shared flags (--project, --since, --out, etc.) are
recovered for all those subcommands; also rename or keep the function but update
any call sites (the usages around the previous 150-152 region) to use the
generalized helper so project scoping is consistent across list and stats as
well.

In `@utils/export_state_store.py`:
- Around line 11-17: The fallback using threading.Lock (variables
_fallback_locks and _fallback_locks_guard) only serializes threads in one
process; replace it with a cross-process file lock on Windows (use
msvcrt.locking on a per-store lock file) so CLI and Flask processes cannot
interleave R/W; keep the same locking API but have the fallback map lock keys to
file handles/lockfiles and perform msvcrt.locking/unlocking around critical
sections (import msvcrt when fcntl is unavailable and update places that use
_fallback_locks to acquire/release the file lock instead of threading.Lock).
- Around line 60-67: The current loader that opens path, calls json.load(path)
and returns data may return non-dict payloads or a non-dict "sessions" value;
update the loader to validate and sanitize the JSON shape: after loading into
data, ensure isinstance(data, dict) (otherwise return {"sessions": {}} or {}),
ensure the "sessions" key exists and that data["sessions"] is a dict (if missing
or not a dict set data["sessions"] = {}), and leave or validate "lastExportTime"
as-is; return the sanitized dict so callers that expect a mapping (references to
data and data["sessions"]) never receive non-mapping types.
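
The loader hardening asked for in utils/export_state_store.py amounts to normalizing whatever JSON value was read. A minimal sketch, assuming a hypothetical helper named `sanitize_export_state` that runs on the decoded value:

```python
import json


def sanitize_export_state(data):
    """Coerce a decoded JSON value into the mapping shape callers expect."""
    if not isinstance(data, dict):
        # A list, string, number, or null at the top level is discarded.
        return {"sessions": {}}
    state = dict(data)  # copy so the caller's object is not mutated
    if not isinstance(state.get("sessions"), dict):
        state["sessions"] = {}
    return state  # other keys such as "lastExportTime" are kept as-is


print(sanitize_export_state(json.loads("[]")))
print(sanitize_export_state(json.loads('{"sessions": null, "lastExportTime": "x"}')))
```

Callers can then index `state["sessions"]` without checking types, even if the file on disk was edited by hand or written by an older version.
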


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bfc3bbe0-e5f0-478d-96eb-69525424b1ad

📥 Commits

Reviewing files that changed from the base of the PR and between de2990c and a4171f4.

📒 Files selected for processing (7)
  • README.md
  • api/export_api.py
  • scripts/export.py
  • tests/test_export_api_bulk.py
  • tests/test_export_day_filter.py
  • utils/export_day_filter.py
  • utils/export_state_store.py
✅ Files skipped from review due to trivial changes (1)
  • README.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/test_export_day_filter.py
  • utils/export_day_filter.py

Comment thread scripts/export.py
Comment thread utils/export_state_store.py
Comment thread utils/export_state_store.py


Development

Successfully merging this pull request may close these issues.

export.py: --since last UX confusion and --project mismatch with list output

1 participant