Skip to content

fix(api)!: prevent OSM node ID truncation in match/route annotations#7518

Closed
timpara wants to merge 7 commits into
Project-OSRM:masterfrom
timpara:fix/osm-node-id-overflow-7069
Closed

fix(api)!: prevent OSM node ID truncation in match/route annotations#7518
timpara wants to merge 7 commits into
Project-OSRM:masterfrom
timpara:fix/osm-node-id-overflow-7069

Conversation

@timpara
Copy link
Copy Markdown

@timpara timpara commented May 1, 2026

Issue

Fixes #7069. OSM node IDs above 2^34 come back wrong in /match and /route annotations.

Root cause

Two truncation bugs on the node-ID path:

  1. Storage: PackedOSMIDs packs OSM IDs into 34 bits, so any ID at or above 2^34 is silently masked. The BOOST_ASSERT guard is compiled out in release.
  2. FlatBuffers: Annotation.nodes was [uint] and route_api.hpp used std::vector<uint32_t>, truncating IDs above 2^32 again on the FB path. JSON output was already 64-bit.

Changes

  • Widen PackedOSMIDs from 34 to 40 bits (max ~1.1 * 10^12).
  • Add checkPackedOSMNodeIdFits() and call it at every push site so overflow throws during osrm-extract instead of masking.
  • Change FB Annotation.nodes to [ulong], use std::vector<uint64_t> in route_api.hpp, regenerate fbresult_generated.js.
  • Add unit tests for round-trip above 2^34 and the overflow guard.
  • Update docs/http.md.

Breaking changes

  1. On-disk: .osrm.nbg_nodes layout changes; existing datasets must be re-extracted. Recommend bumping package.json minor before release so the fingerprint rejects stale data.
  2. FB wire: Annotation.nodes changed from uint32 to uint64. FB clients must update accessors.

Notes

  • clang-format was not available in my environment; please run ./scripts/format.sh if needed.
  • No cucumber feature added: the framework auto-assigns small IDs and reworking it for huge IDs is out of scope. Unit tests cover the storage regression; the FB change is exercised by the existing FB cucumber matrix.
  • Vector tile output already uses OSMNodeID::value_type (uint64_t) and benefits transparently.

OSM node IDs above 2^34 were silently truncated when written into the
PackedOSMIDs storage (PackedVector<OSMNodeID, 34, ...>) used for the
.osrm.nbg_nodes file. The truncated values then flowed unchanged through
GetOSMNodeIDOfNode into the annotations.nodes field of /match and /route
responses, as well as vector tiles.

Additionally, the FlatBuffers schema for Annotation.nodes used [uint],
truncating any ID above 2^32 a second time on the FB wire path; the JSON
path was already 64-bit clean once storage stops truncating.

Changes:
- Widen PackedOSMIDs from 34 to 40 bits (max ~1.1e12, decades of headroom).
- Add checkPackedOSMNodeIdFits() guard at every osm_node_ids.push_back
  site so future overflow throws util::exception during osrm-extract
  rather than silently masking values.
- Change Annotation.nodes FlatBuffers field from [uint] to [ulong] and
  fix the corresponding std::vector<uint32_t> in route_api.hpp to
  std::vector<uint64_t>; regenerate fbresult_generated.js accordingly.
- Add unit tests for round-trip of IDs above 2^34 and for the overflow
  guard.
- Document the 64-bit semantics of annotations.nodes in docs/http.md.

This is a breaking change for two reasons:
1. The on-disk packed-vector layout for osm_node_ids changes; existing
   .osrm datasets must be re-extracted with the new osrm-extract.
2. The FlatBuffers Annotation.nodes wire type changes from uint32 to
   uint64; FB clients reading annotations.nodes must update accordingly.

Fixes Project-OSRM#7069
@timpara timpara marked this pull request as draft May 1, 2026 09:59
@DennisOSRM
Copy link
Copy Markdown
Collaborator

Thanks for the contribution. Please consider adding a test that covers the changed behavior.

@DennisOSRM
Copy link
Copy Markdown
Collaborator

This needs a formatting fix to run CI checks:

diff --git a/unit_tests/util/packed_vector.cpp b/unit_tests/util/packed_vector.cpp
index 0762c9b..99425bc 100644
--- a/unit_tests/util/packed_vector.cpp
+++ b/unit_tests/util/packed_vector.cpp
@@ -215,13 +215,13 @@ BOOST_AUTO_TEST_CASE(packed_osm_ids_above_2pow34_roundtrip)
     const std::vector<std::uint64_t> inputs = {
         0ull,
         1ull,
-        (1ull << 33),                                          // 2^33
-        (1ull << 34),                                          // 2^34, the old truncation boundary
-        (1ull << 34) + 1,                                      // just above 2^34
-        20'000'000'000ull,                                     // realistic projected OSM node ID
-        (1ull << 35),                                          // 2^35
-        osrm::extractor::MAX_PACKED_OSM_NODE_ID - 1,           // just below the new limit
-        osrm::extractor::MAX_PACKED_OSM_NODE_ID,               // exactly at the new limit
+        (1ull << 33),                                // 2^33
+        (1ull << 34),                                // 2^34, the old truncation boundary
+        (1ull << 34) + 1,                            // just above 2^34
+        20'000'000'000ull,                           // realistic projected OSM node ID
+        (1ull << 35),                                // 2^35
+        osrm::extractor::MAX_PACKED_OSM_NODE_ID - 1, // just below the new limit
+        osrm::extractor::MAX_PACKED_OSM_NODE_ID,     // exactly at the new limit
     };
 
     for (auto raw : inputs)

This comment was marked as outdated.

DennisOSRM and others added 5 commits May 15, 2026 12:17
Align array entry comments for readability.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e ids

Bump the dataset fingerprint (magic number) to force incompatibility for the updated
packed storage layout (packed_osm_ids). Update docs/http.md to clarify that
Annotation.nodes are 64-bit (flatbuffers: [ulong]) and that JS bindings expose
these values as BigInt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@DennisOSRM DennisOSRM force-pushed the fix/osm-node-id-overflow-7069 branch from 6c88d79 to f62c0d2 Compare May 15, 2026 12:46
@DennisOSRM
Copy link
Copy Markdown
Collaborator

continuing the work in #7552

@DennisOSRM DennisOSRM closed this May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Node IDs > 2^34 are reported by route matching incorrect.

3 participants