fix(gtfs-rt): prevent stale vehicle feeds from overwriting newer data#486
Conversation
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Atharva, the idea here is right — GTFS-RT feeds can absolutely arrive out of order and the spec does expect consumers to handle it. The two-level approach (feed-level freshness guard + per-vehicle staleness check) is a reasonable design. However, there's a critical bug in the feed-level guard that would actually make things worse for feeds that omit the FeedHeader timestamp. A few items to work through before this can merge.
Critical
1. uint64 overflow permanently blocks vehicle updates for feeds without a FeedHeader timestamp (realtime.go:271)
The feed-level freshness guard converts vehicleData.CreatedAt to uint64:
feedTimestamp := uint64(vehicleData.CreatedAt.UnixNano())The GTFS-RT FeedHeader timestamp field is optional. When it's absent, the go-gtfs library leaves CreatedAt as Go's zero time.Time{}. I verified the behavior:
time.Time{}.UnixNano() → -6795364578871345152
uint64(that value) → 11651379494838206464
uint64(time.Now().UnixNano())→ 1772255999731360000
The zero-time value wraps to ~11.6×10¹⁸, which is ~6.5x larger than any real timestamp (~1.7×10¹⁸). Here's what happens:
- First update from a feed without FeedHeader timestamp:
feedTimestamp = 11.6×10¹⁸, stored value is0, so11.6×10¹⁸ > 0— update is applied. Stored value becomes11.6×10¹⁸. - All subsequent updates — even with valid timestamps — are permanently rejected, because no real timestamp can exceed
11.6×10¹⁸.
Fix: Guard against zero CreatedAt before comparing:
feedTimestamp := uint64(vehicleData.CreatedAt.UnixNano())
applyVehicleUpdate := true
if vehicleData.CreatedAt.IsZero() {
// Feed has no header timestamp — cannot compare freshness, always apply
applyVehicleUpdate = true
} else if feedTimestamp <= manager.feedVehicleTimestamp[feedID] {
// Feed is stale, skip vehicle updates
logging.LogOperation(...)
applyVehicleUpdate = false
} else {
manager.feedVehicleTimestamp[feedID] = feedTimestamp
}Also consider using int64 instead of uint64 for the timestamp to avoid wrapping issues, or use time.Time directly with Before()/After() comparisons.
Important
2. TestRealtimeAcceptsStaleVehicleData doesn't test the new behavior (realtime_test.go:287)
This test directly manipulates manager.realTimeVehicles — the merged view — which completely bypasses updateFeedRealtime() where the freshness guard lives. It documents the old (undesired) behavior and will pass regardless of whether the fix works.
What's needed is a test that sends two feeds through updateFeedRealtime where the second has an older FeedHeader timestamp, and asserts the stale feed is rejected:
func TestStaleFeedRejected(t *testing.T) {
manager := newTestManager()
// First poll: feed with timestamp T=1000
// Second poll: feed with timestamp T=500 (stale)
// Assert: vehicles from first poll are preserved, second poll is rejected
}3. Add unit tests for isVehicleStale (realtime.go:52)
This new function has no direct tests. It needs coverage for:
- Both timestamps present, incoming is older →
true - Both timestamps present, incoming is newer →
false - Both timestamps present, equal →
false - Either or both timestamps nil →
false
4. time.Sleep(100ms) in TestStaleVehicleExpiry is fragile (multi_feed_test.go:97)
The comment says the sleep "ensures the server's timestamp is fresher than the real feed." But the empty feed uses a minimal proto with no FeedHeader timestamp, so CreatedAt is time.Time{} regardless of when it's served. The test works today only because the uint64 overflow makes zero time appear as a huge number (larger than any real timestamp). Once you fix the overflow bug (#1), this test will need to be updated — the empty feed will need a real FeedHeader timestamp that's newer than the RABA data's timestamp.
Fit and Finish
5. Extract duplicate cleanup logic (realtime.go:330-364)
The else branch (when applyVehicleUpdate is false) duplicates the stale-vehicle expiry logic from the if branch. Consider extracting it into a helper:
func (manager *Manager) cleanupExpiredVehicles(feedID string) {
// ... shared expiry logic
}Then both branches call cleanupExpiredVehicles(feedID) instead of duplicating 20+ lines.
Verification
- CLA: signed
- CI: only CLA check visible (lint/test results not yet available)
- Single commit:
8d5ab7b
Strengths
- Two-level staleness design: Feed-level freshness guard (via FeedHeader timestamp) plus per-vehicle staleness check (
isVehicleStale) is a solid defense-in-depth approach. - Cleanup-on-stale: Running vehicle expiry even when a stale feed is rejected is a good design decision — expired vehicles should be cleaned up regardless of whether new data is applied.
- Consistent with GTFS-RT spec: The spec does recommend that consumers handle out-of-order feeds, and this PR addresses that gap.
- Thorough PR description: The issue description, change summary, and rationale are well-written and clear.
Recommended Action
- Fix the
uint64overflow bug (critical — without this, the feature can permanently break vehicle updates) - Add proper tests for
isVehicleStaleand feed-level freshness rejection - Replace the characterization test with one that exercises the actual fix
- Fix the
time.Sleeptest once the overflow is resolved - Extract duplicate cleanup logic
8d5ab7b to
3414689
Compare
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Atharva -- Solid work addressing all five items from the first round. The critical uint64 overflow is properly guarded with IsZero(), the cleanupExpiredVehicles extraction is clean, isVehicleStale has thorough unit tests, and TestStaleVehicleExpiry no longer depends on time.Sleep. Just a couple of items to tighten up the test coverage before this is ready to merge.
Prior Feedback Status
All 5 items from the first review have been addressed:
| # | Item | Status |
|---|---|---|
| 1 | uint64 overflow permanently blocks vehicle updates for feeds without FeedHeader timestamp |
Fixed -- IsZero() guard on vehicleData.CreatedAt bypasses freshness comparison for feeds without a FeedHeader timestamp |
| 2 | TestRealtimeAcceptsStaleVehicleData doesn't test new behavior |
Fixed -- replaced with TestStaleFeedRejected that exercises updateFeedRealtime with a manually-set future timestamp |
| 3 | Add unit tests for isVehicleStale |
Fixed -- TestIsVehicleStale covers all 6 cases (both present: older/newer/equal, either nil, both nil) |
| 4 | time.Sleep(100ms) fragile in TestStaleVehicleExpiry |
Fixed -- test rewritten to call cleanupExpiredVehicles directly, no sleep needed |
| 5 | Extract duplicate cleanup logic | Fixed -- cleanupExpiredVehicles extracted and used in the stale-feed branch |
Important
1. TestStaleFeedRejected may pass vacuously if RABA data lacks a FeedHeader timestamp (realtime_test.go:316-330)
The test relies on the freshness guard rejecting a stale feed. But if the RABA protobuf data has no FeedHeader timestamp (i.e., CreatedAt.IsZero() is true), the guard is bypassed and every update is always applied — the test would still pass because the same data produces the same vehicle count, but it wouldn't actually exercise the rejection path.
The RABA data almost certainly has a FeedHeader timestamp since it's real production data, but the test should verify this prerequisite. Add an assertion after the first poll:
// Verify the feed has a FeedHeader timestamp — this test
// exercises the freshness guard, which only applies to feeds
// with a non-zero CreatedAt.
manager.realTimeMutex.RLock()
require.NotZero(t, manager.feedVehicleTimestamp[feed.ID],
"RABA feed must have FeedHeader timestamp for this test to be meaningful")
manager.realTimeMutex.RUnlock()This way, if the test data is ever replaced with data that lacks a timestamp, the test will fail loudly rather than passing silently.
2. os.ReadFile error silently discarded in TestStaleFeedRejected (realtime_test.go:300)
data, _ := os.ReadFile(filepath.Join("../../testdata", "raba-vehicle-positions.pb"))Compare with TestStaleVehicleExpiry in multi_feed_test.go:88-89:
data, err := os.ReadFile(filepath.Join("../../testdata", "raba-vehicle-positions.pb"))
require.NoError(t, err)If the file is missing, the handler silently serves an empty body, and the test would fail at require.NotEmpty(t, firstPoll) with a confusing error message. Use require.NoError for consistency and clearer failure messages. Note: the t variable from the test won't be available inside the HTTP handler closure, so you'll need to capture it or use a different approach (e.g., read the file before creating the server).
Fit and Finish
3. Consider time.Time instead of uint64 for timestamp storage
The feedVehicleTimestamp map still uses uint64 (gtfs_manager.go:65). The IsZero() guard prevents the overflow bug, but time.Time would be more idiomatic and eliminate the class of problems entirely:
feedVehicleTimestamp map[string]time.TimeThen the comparison becomes:
if !vehicleData.CreatedAt.IsZero() && !vehicleData.CreatedAt.After(manager.feedVehicleTimestamp[feedID]) {
// stale
}Not blocking — the current implementation is correct. Just something to consider for clarity.
Verification
- Tests: all pass
- Lint: 0 issues
- Single commit: clean
Strengths
- Clean extraction of
cleanupExpiredVehicles: The helper is well-scoped, properly documented, and handles the nil-map edge case. - Thorough
isVehicleStaletests: All 6 cases covered, including the nil-timestamp edge cases. TestStaleVehicleExpiryrewrite: Deterministic and directly tests the cleanup function without flaky timing.IsZero()guard is exactly right: The simplest correct fix for the overflow bug — no over-engineering.
The two important items are straightforward test hardening fixes. Once those are addressed, this is good to merge.
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Atharva — everything from rounds 1 and 2 has been addressed. Nice work iterating through the feedback.
Prior Feedback Status
Round 1 (5 items — all fixed)
| # | Item | Status |
|---|---|---|
| 1 | uint64 overflow permanently blocks updates for feeds without FeedHeader timestamp |
Fixed — IsZero() guard bypasses freshness comparison |
| 2 | TestRealtimeAcceptsStaleVehicleData doesn't test new behavior |
Fixed — replaced with TestStaleFeedRejected |
| 3 | Add unit tests for isVehicleStale |
Fixed — 6 cases covered |
| 4 | time.Sleep(100ms) fragile in TestStaleVehicleExpiry |
Fixed — calls cleanupExpiredVehicles directly |
| 5 | Extract duplicate cleanup logic | Fixed — cleanupExpiredVehicles extracted |
Round 2 (2 items — all fixed)
| # | Item | Status |
|---|---|---|
| 1 | TestStaleFeedRejected may pass vacuously if RABA data lacks FeedHeader timestamp |
Fixed — require.NotZero assertion added at line 327 |
| 2 | os.ReadFile error silently discarded in test |
Fixed — file read moved before server creation with require.NoError at line 297 |
Round 2's fit-and-finish suggestion (time.Time vs uint64) was non-blocking and the current implementation is correct with the IsZero() guard.
Verification
- Tests: all pass
- Lint: 0 issues
- All 7 prior feedback items resolved
This is ready to merge.
Summary
Prevents stale GTFS-Realtime vehicle position feeds from overwriting newer data by introducing a per-feed freshness guard based on the GTFS-RT FeedMessage timestamp.
This fixes an issue where out-of-order, retried, or delayed realtime feeds could cause vehicles to jump backwards, disappear, or show incorrect positions.
What changed
Why this matters
GTFS-Realtime feeds can arrive out of order due to retries, caching, or upstream restarts. The GTFS-RT spec expects consumers to handle this safely. This change aligns Maglev with that expectation.
Tests
go test -tags sqlite_fts5 ./internal/gtfsRelated issue
Fixes #476