Skip to content

fix(gtfs-rt): prevent stale vehicle feeds from overwriting newer data#486

Merged
aaronbrethorst merged 2 commits intoOneBusAway:mainfrom
AtharvaKathe18:fix/gtfs-rt-stale-vehicle-data
Mar 1, 2026
Merged

fix(gtfs-rt): prevent stale vehicle feeds from overwriting newer data#486
aaronbrethorst merged 2 commits intoOneBusAway:mainfrom
AtharvaKathe18:fix/gtfs-rt-stale-vehicle-data

Conversation

@AtharvaKathe18
Copy link
Copy Markdown
Contributor

Summary

Prevents stale GTFS-Realtime vehicle position feeds from overwriting newer data by introducing a per-feed freshness guard based on the GTFS-RT FeedMessage timestamp.

This fixes an issue where out-of-order, retried, or delayed realtime feeds could cause vehicles to jump backwards, disappear, or show incorrect positions.

What changed

  • Track last-applied GTFS-RT vehicle feed timestamp per feed
  • Reject vehicle updates where feed timestamp ≤ last applied timestamp
  • Still run vehicle expiry cleanup even when stale updates are rejected
  • Ensure expired vehicles are removed from both lastSeen tracking and feedVehicles
  • No behavior change for trip updates or service alerts

Why this matters

GTFS-Realtime feeds can arrive out of order due to retries, caching, or upstream restarts. The GTFS-RT spec expects consumers to handle this safely. This change aligns Maglev with that expectation.

Tests

  • All existing tests pass
  • Added/updated tests covering stale vehicle overwrite behavior and expiry
  • Verified with:
    go test -tags sqlite_fts5 ./internal/gtfs

Related issue

Fixes #476

Copy link
Copy Markdown
Member

@aaronbrethorst aaronbrethorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Atharva, the idea here is right — GTFS-RT feeds can absolutely arrive out of order and the spec does expect consumers to handle it. The two-level approach (feed-level freshness guard + per-vehicle staleness check) is a reasonable design. However, there's a critical bug in the feed-level guard that would actually make things worse for feeds that omit the FeedHeader timestamp. A few items to work through before this can merge.

Critical

1. uint64 overflow permanently blocks vehicle updates for feeds without a FeedHeader timestamp (realtime.go:271)

The feed-level freshness guard converts vehicleData.CreatedAt to uint64:

feedTimestamp := uint64(vehicleData.CreatedAt.UnixNano())

The GTFS-RT FeedHeader timestamp field is optional. When it's absent, the go-gtfs library leaves CreatedAt as Go's zero time.Time{}. I verified the behavior:

time.Time{}.UnixNano()      → -6795364578871345152
uint64(that value)           → 11651379494838206464
uint64(time.Now().UnixNano())→  1772255999731360000

The zero-time value wraps to ~11.6×10¹⁸, which is ~6.5x larger than any real timestamp (~1.7×10¹⁸). Here's what happens:

  1. First update from a feed without FeedHeader timestamp: feedTimestamp = 11.6×10¹⁸, stored value is 0, so 11.6×10¹⁸ > 0 — update is applied. Stored value becomes 11.6×10¹⁸.
  2. All subsequent updates — even with valid timestamps — are permanently rejected, because no real timestamp can exceed 11.6×10¹⁸.

Fix: Guard against zero CreatedAt before comparing:

feedTimestamp := uint64(vehicleData.CreatedAt.UnixNano())
applyVehicleUpdate := true

if vehicleData.CreatedAt.IsZero() {
    // Feed has no header timestamp — cannot compare freshness, always apply
    applyVehicleUpdate = true
} else if feedTimestamp <= manager.feedVehicleTimestamp[feedID] {
    // Feed is stale, skip vehicle updates
    logging.LogOperation(...)
    applyVehicleUpdate = false
} else {
    manager.feedVehicleTimestamp[feedID] = feedTimestamp
}

Also consider using int64 instead of uint64 for the timestamp to avoid wrapping issues, or use time.Time directly with Before()/After() comparisons.

Important

2. TestRealtimeAcceptsStaleVehicleData doesn't test the new behavior (realtime_test.go:287)

This test directly manipulates manager.realTimeVehicles — the merged view — which completely bypasses updateFeedRealtime() where the freshness guard lives. It documents the old (undesired) behavior and will pass regardless of whether the fix works.

What's needed is a test that sends two feeds through updateFeedRealtime where the second has an older FeedHeader timestamp, and asserts the stale feed is rejected:

func TestStaleFeedRejected(t *testing.T) {
    manager := newTestManager()

    // First poll: feed with timestamp T=1000
    // Second poll: feed with timestamp T=500 (stale)
    // Assert: vehicles from first poll are preserved, second poll is rejected
}

3. Add unit tests for isVehicleStale (realtime.go:52)

This new function has no direct tests. It needs coverage for:

  • Both timestamps present, incoming is older → true
  • Both timestamps present, incoming is newer → false
  • Both timestamps present, equal → false
  • Either or both timestamps nil → false

4. time.Sleep(100ms) in TestStaleVehicleExpiry is fragile (multi_feed_test.go:97)

The comment says the sleep "ensures the server's timestamp is fresher than the real feed." But the empty feed uses a minimal proto with no FeedHeader timestamp, so CreatedAt is time.Time{} regardless of when it's served. The test works today only because the uint64 overflow makes zero time appear as a huge number (larger than any real timestamp). Once you fix the overflow bug (#1), this test will need to be updated — the empty feed will need a real FeedHeader timestamp that's newer than the RABA data's timestamp.

Fit and Finish

5. Extract duplicate cleanup logic (realtime.go:330-364)

The else branch (when applyVehicleUpdate is false) duplicates the stale-vehicle expiry logic from the if branch. Consider extracting it into a helper:

func (manager *Manager) cleanupExpiredVehicles(feedID string) {
    // ... shared expiry logic
}

Then both branches call cleanupExpiredVehicles(feedID) instead of duplicating 20+ lines.

Verification

  • CLA: signed
  • CI: only CLA check visible (lint/test results not yet available)
  • Single commit: 8d5ab7b

Strengths

  • Two-level staleness design: Feed-level freshness guard (via FeedHeader timestamp) plus per-vehicle staleness check (isVehicleStale) is a solid defense-in-depth approach.
  • Cleanup-on-stale: Running vehicle expiry even when a stale feed is rejected is a good design decision — expired vehicles should be cleaned up regardless of whether new data is applied.
  • Consistent with GTFS-RT spec: The spec does recommend that consumers handle out-of-order feeds, and this PR addresses that gap.
  • Thorough PR description: The issue description, change summary, and rationale are well-written and clear.

Recommended Action

  1. Fix the uint64 overflow bug (critical — without this, the feature can permanently break vehicle updates)
  2. Add proper tests for isVehicleStale and feed-level freshness rejection
  3. Replace the characterization test with one that exercises the actual fix
  4. Fix the time.Sleep test once the overflow is resolved
  5. Extract duplicate cleanup logic

@AtharvaKathe18 AtharvaKathe18 force-pushed the fix/gtfs-rt-stale-vehicle-data branch from 8d5ab7b to 3414689 Compare February 28, 2026 08:52
Copy link
Copy Markdown
Member

@aaronbrethorst aaronbrethorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Atharva -- Solid work addressing all five items from the first round. The critical uint64 overflow is properly guarded with IsZero(), the cleanupExpiredVehicles extraction is clean, isVehicleStale has thorough unit tests, and TestStaleVehicleExpiry no longer depends on time.Sleep. Just a couple of items to tighten up the test coverage before this is ready to merge.


Prior Feedback Status

All 5 items from the first review have been addressed:

# Item Status
1 uint64 overflow permanently blocks vehicle updates for feeds without FeedHeader timestamp Fixed -- IsZero() guard on vehicleData.CreatedAt bypasses freshness comparison for feeds without a FeedHeader timestamp
2 TestRealtimeAcceptsStaleVehicleData doesn't test new behavior Fixed -- replaced with TestStaleFeedRejected that exercises updateFeedRealtime with a manually-set future timestamp
3 Add unit tests for isVehicleStale Fixed -- TestIsVehicleStale covers all 6 cases (both present: older/newer/equal, either nil, both nil)
4 time.Sleep(100ms) fragile in TestStaleVehicleExpiry Fixed -- test rewritten to call cleanupExpiredVehicles directly, no sleep needed
5 Extract duplicate cleanup logic Fixed -- cleanupExpiredVehicles extracted and used in the stale-feed branch

Important

1. TestStaleFeedRejected may pass vacuously if RABA data lacks a FeedHeader timestamp (realtime_test.go:316-330)

The test relies on the freshness guard rejecting a stale feed. But if the RABA protobuf data has no FeedHeader timestamp (i.e., CreatedAt.IsZero() is true), the guard is bypassed and every update is always applied — the test would still pass because the same data produces the same vehicle count, but it wouldn't actually exercise the rejection path.

The RABA data almost certainly has a FeedHeader timestamp since it's real production data, but the test should verify this prerequisite. Add an assertion after the first poll:

// Verify the feed has a FeedHeader timestamp — this test
// exercises the freshness guard, which only applies to feeds
// with a non-zero CreatedAt.
manager.realTimeMutex.RLock()
require.NotZero(t, manager.feedVehicleTimestamp[feed.ID],
    "RABA feed must have FeedHeader timestamp for this test to be meaningful")
manager.realTimeMutex.RUnlock()

This way, if the test data is ever replaced with data that lacks a timestamp, the test will fail loudly rather than passing silently.

2. os.ReadFile error silently discarded in TestStaleFeedRejected (realtime_test.go:300)

data, _ := os.ReadFile(filepath.Join("../../testdata", "raba-vehicle-positions.pb"))

Compare with TestStaleVehicleExpiry in multi_feed_test.go:88-89:

data, err := os.ReadFile(filepath.Join("../../testdata", "raba-vehicle-positions.pb"))
require.NoError(t, err)

If the file is missing, the handler silently serves an empty body, and the test would fail at require.NotEmpty(t, firstPoll) with a confusing error message. Use require.NoError for consistency and clearer failure messages. Note: the t variable from the test won't be available inside the HTTP handler closure, so you'll need to capture it or use a different approach (e.g., read the file before creating the server).


Fit and Finish

3. Consider time.Time instead of uint64 for timestamp storage

The feedVehicleTimestamp map still uses uint64 (gtfs_manager.go:65). The IsZero() guard prevents the overflow bug, but time.Time would be more idiomatic and eliminate the class of problems entirely:

feedVehicleTimestamp map[string]time.Time

Then the comparison becomes:

if !vehicleData.CreatedAt.IsZero() && !vehicleData.CreatedAt.After(manager.feedVehicleTimestamp[feedID]) {
    // stale
}

Not blocking — the current implementation is correct. Just something to consider for clarity.


Verification

  • Tests: all pass
  • Lint: 0 issues
  • Single commit: clean

Strengths

  • Clean extraction of cleanupExpiredVehicles: The helper is well-scoped, properly documented, and handles the nil-map edge case.
  • Thorough isVehicleStale tests: All 6 cases covered, including the nil-timestamp edge cases.
  • TestStaleVehicleExpiry rewrite: Deterministic and directly tests the cleanup function without flaky timing.
  • IsZero() guard is exactly right: The simplest correct fix for the overflow bug — no over-engineering.

The two important items are straightforward test hardening fixes. Once those are addressed, this is good to merge.

Copy link
Copy Markdown
Member

@aaronbrethorst aaronbrethorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Atharva — everything from rounds 1 and 2 has been addressed. Nice work iterating through the feedback.

Prior Feedback Status

Round 1 (5 items — all fixed)

# Item Status
1 uint64 overflow permanently blocks updates for feeds without FeedHeader timestamp FixedIsZero() guard bypasses freshness comparison
2 TestRealtimeAcceptsStaleVehicleData doesn't test new behavior Fixed — replaced with TestStaleFeedRejected
3 Add unit tests for isVehicleStale Fixed — 6 cases covered
4 time.Sleep(100ms) fragile in TestStaleVehicleExpiry Fixed — calls cleanupExpiredVehicles directly
5 Extract duplicate cleanup logic FixedcleanupExpiredVehicles extracted

Round 2 (2 items — all fixed)

# Item Status
1 TestStaleFeedRejected may pass vacuously if RABA data lacks FeedHeader timestamp Fixedrequire.NotZero assertion added at line 327
2 os.ReadFile error silently discarded in test Fixed — file read moved before server creation with require.NoError at line 297

Round 2's fit-and-finish suggestion (time.Time vs uint64) was non-blocking and the current implementation is correct with the IsZero() guard.

Verification

  • Tests: all pass
  • Lint: 0 issues
  • All 7 prior feedback items resolved

This is ready to merge.

@aaronbrethorst aaronbrethorst merged commit 81cc2ae into OneBusAway:main Mar 1, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GTFS-Realtime: stale vehicle positions can overwrite newer data (no freshness guard)

2 participants