ENG-1815 Add sync attempt telemetry and stale end-task handling #1094
mdroidian wants to merge 6 commits into
…text
- Refactor end_sync_task to return a JSON object containing task status and metadata.
- Enhance error handling to include telemetry context for better debugging.
- Update related API routes and database schema to accommodate the new return type.
- Document changes in sync_functions.md to reflect the new response structure.
PR size/scope check
This PR is over our review-size guideline.
Please split this into smaller PRs unless there is a clear reason the changes need to land together. If keeping it as one PR, please add a brief justification covering:
Updates to Preview Branch (eng-1815-add-sync-attempt-telemetry-and-stale-end-task-handling)
Tasks are run on every commit but only new migration files are pushed.
View logs for this Workflow Run.
This comment was marked as resolved.
…error responses
- Update EndSyncTaskRpcResult to require 'ok' and 'stale' fields.
- Enhance isEndSyncTaskRpcResult function to validate response structure.
- Modify endSyncTask function to handle claimed timestamps and improve error handling.
- Update API route to parse and validate request body for task status and timestamps.
- Document changes in sync_functions.md to reflect new requirements for end_sync_task.
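A minimal sketch of what a guard like isEndSyncTaskRpcResult could look like, assuming the result is a plain JSON object. Only the names EndSyncTaskRpcResult and isEndSyncTaskRpcResult and the fields 'ok', 'stale', and 'reason' appear in this PR; the exact shape and the checks below are assumptions, not the code in the diff.

```typescript
// Sketch only: `ok`, `stale`, and `reason` are named in the PR text;
// the optionality of `reason` and the absence of other fields are assumptions.
type EndSyncTaskRpcResult = {
  ok: boolean;
  stale: boolean;
  reason?: string;
};

// Type guard: accept only plain objects whose required fields have the right types.
function isEndSyncTaskRpcResult(value: unknown): value is EndSyncTaskRpcResult {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  if (typeof candidate["ok"] !== "boolean") return false;
  if (typeof candidate["stale"] !== "boolean") return false;
  // `reason` may be absent, but if present it must be a string.
  return candidate["reason"] === undefined || typeof candidate["reason"] === "string";
}
```

Requiring both booleans means a legacy empty response (for example `null` from a function that used to return `void`) is rejected instead of being read as success.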
This PR fixes one sync completion correctness issue. The files are coupled by the same contract change: the database function now returns structured completion results, the generated DB type and website route need to reflect that return shape, and the Roam caller must consume the result to avoid false success telemetry/backoff behavior. Splitting would create intermediate states where the server exposes a new contract without the client using it, or the client expects behavior the DB has not yet shipped.
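To illustrate why the caller has to consume the result, here is a hedged sketch of mapping an end-task outcome to a telemetry event. The event names Sync complete, Sync stale, and Sync error and the status value "end_task_failed" come from the PR description; the EndTaskOutcome and SyncEvent types and the eventForOutcome helper are hypothetical names introduced for this example.

```typescript
// Hypothetical outcome type; the real endSyncTask return type may differ.
type EndTaskOutcome =
  | { kind: "success" }
  | { kind: "stale"; reason: string }
  | { kind: "error"; message: string };

// Hypothetical event shape for illustration.
type SyncEvent = { name: string; properties: Record<string, string> };

// Pick the telemetry event from the confirmed outcome, so that
// "Sync complete" is only produced when the database reported success.
function eventForOutcome(outcome: EndTaskOutcome, syncAttemptId: string): SyncEvent {
  switch (outcome.kind) {
    case "success":
      return { name: "Sync complete", properties: { syncAttemptId } };
    case "stale":
      return { name: "Sync stale", properties: { syncAttemptId, reason: outcome.reason } };
    case "error":
      return {
        name: "Sync error",
        properties: { syncAttemptId, status: "end_task_failed", message: outcome.message },
      };
  }
}
```

Because the event is derived from the outcome and not emitted unconditionally, a failed or stale end-task call can no longer show up as a success in the same session.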
…prove request body validation
- Made 'startedAt' in ParsedEndSyncTaskBody optional.
- Enhanced parseEndSyncTaskBody to accept a task status string directly.
- Updated error messages for better clarity on request body requirements.
- Adjusted RPC call to conditionally include 's_started_at' based on availability.
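The parsing behavior in this commit can be sketched roughly as follows. ParsedEndSyncTaskBody, parseEndSyncTaskBody, startedAt, and s_started_at are named in the commit message; the status field name, the s_status parameter, the buildEndSyncTaskArgs helper, and the validation rules are assumptions made for illustration, since the PR text does not list them.

```typescript
// `startedAt` is optional, as in the commit message. `status` is kept as a
// plain string because the valid status values are not listed in the PR.
type ParsedEndSyncTaskBody = { status: string; startedAt?: string };

// Accept either a bare status string or an object with status and optional startedAt.
function parseEndSyncTaskBody(body: unknown): ParsedEndSyncTaskBody | null {
  if (typeof body === "string") {
    return body.length > 0 ? { status: body } : null;
  }
  if (typeof body !== "object" || body === null) return null;
  const { status, startedAt } = body as Record<string, unknown>;
  if (typeof status !== "string" || status.length === 0) return null;
  if (startedAt === undefined) return { status };
  // Assumption: timestamps arrive as strings that Date.parse can read.
  if (typeof startedAt !== "string" || Number.isNaN(Date.parse(startedAt))) return null;
  return { status, startedAt };
}

// Build RPC arguments, adding s_started_at only when a timestamp was supplied.
// `s_status` is a hypothetical parameter name.
function buildEndSyncTaskArgs(parsed: ParsedEndSyncTaskBody): Record<string, string> {
  const args: Record<string, string> = { s_status: parsed.status };
  if (parsed.startedAt !== undefined) {
    args["s_started_at"] = parsed.startedAt;
  }
  return args;
}
```

Omitting the key entirely, as opposed to sending a null value, lets the database function fall back to whatever default it defines for a missing timestamp.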
…ding function to streamline error handling.
maparent left a comment
Ok. Nothing wrong, but one change I'd appreciate:
Add a version number to the JSON output, and check for it in the code. If the version number coming from the database is higher than the one in the code, be more lenient with JSON parsing errors. This gives you more leeway to change the database JSON format without breaking old versions of the plugins.
Otherwise lgtm.
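One way the versioning suggestion above might look on the client side, as a rough sketch and not what the PR implements. The `version` field, the SUPPORTED_RESULT_VERSION constant, the VersionedParse type, and parseVersionedResult are all hypothetical names; treating a missing version as 0 is also an assumption.

```typescript
// Hypothetical: the highest result-format version this client understands.
const SUPPORTED_RESULT_VERSION = 1;

type VersionedParse =
  | { kind: "parsed"; ok: boolean; stale: boolean }
  | { kind: "unknown_newer_format"; version: number }
  | { kind: "invalid" };

function parseVersionedResult(raw: unknown): VersionedParse {
  if (typeof raw !== "object" || raw === null) return { kind: "invalid" };
  const obj = raw as Record<string, unknown>;
  // Assumption: a missing or non-numeric version means a pre-versioning payload.
  const version = typeof obj["version"] === "number" ? obj["version"] : 0;
  const ok = obj["ok"];
  const stale = obj["stale"];
  if (typeof ok === "boolean" && typeof stale === "boolean") {
    return { kind: "parsed", ok, stale };
  }
  // Shape mismatch: tolerate it only when the database is ahead of this client.
  return version > SUPPORTED_RESULT_VERSION
    ? { kind: "unknown_newer_format", version }
    : { kind: "invalid" };
}
```

An older plugin that receives `unknown_newer_format` could log the mismatch and skip strict error reporting, while `invalid` at a known version still surfaces as a real parsing failure.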
This is from a back and forth with Codex 5.5 xhigh with Linear/PostHog MCP access.
Summary
Fixes misleading sync reporting around ENG-1322 and adds the telemetry needed to debug remaining “Wrong worker” sync failures.
Previously, Roam could hit an end_sync_task failure and still emit Sync complete, which made PostHog sessions look contradictory. This change makes end_sync_task return structured results, classifies stale completions explicitly, and only reports sync success after the database confirms the task was ended successfully.

What Changed
- … syncAttemptId, syncWorkerId, syncUserUid, space ID, status, and per-phase timing telemetry.
- … endSyncTask to return a typed success/stale/error result instead of swallowing failures.
- Sync complete now only fires after end_sync_task returns success.
- … Sync stale; end-task failures emit Sync error with status: "end_task_failed".

Database Changes
- public.end_sync_task(...) now returns jsonb instead of void.
- … ok: false, stale: true, reason: "completed_by_newer_task"
- … Wrong worker, but now include diagnostic detail in the Postgres error detail payload.
- … 20260528033000_end_sync_task_result.sql.

Why This Helps
- … Sync complete events after failed task cleanup.

Out of Scope