Summary
The `shortestPath()` function in AGE appears to use the same VLE (variable-length edge) expansion internally, inheriting the O(n^k) cycle-free expansion issue from #2349. A shortest path query should use BFS (breadth-first search) which terminates as soon as the target is found, but the current implementation expands all paths up to the maximum depth before selecting the shortest.
Reproduction
SELECT * FROM cypher('benchmark', $$
MATCH p = shortestPath((a:Node {bench_id: 1})-[*..10]->(b:Node {bench_id: 50}))
RETURN length(p)
$$) AS (len agtype);
Performance Data
Benchmark W07 (shortest path):
| Approach |
10K scale |
100K scale |
| Cypher shortestPath() |
~23ms |
~23ms |
| BFS CTE with UNION + early termination |
0.09ms |
0.33ms |
The SDK workaround implements BFS as a recursive CTE:
WITH RECURSIVE bfs AS (
SELECT e.end_id AS node_id, 1 AS depth
FROM benchmark."EDGE" e
WHERE e.start_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 1)
UNION
SELECT e.end_id, b.depth + 1
FROM bfs b
JOIN benchmark."EDGE" e ON e.start_id = b.node_id
WHERE b.depth < 10
AND NOT EXISTS (SELECT 1 FROM bfs WHERE node_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 50))
)
SELECT depth FROM bfs
WHERE node_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 50)
LIMIT 1;
Suggested Fix
Implement `shortestPath()` using a proper BFS algorithm in the executor rather than delegating to VLE. BFS guarantees finding the shortest path and terminates at the first match, providing O(V+E) complexity instead of O(n^k).
This is related to #2349 (VLE cycle detection) — fixing VLE would also improve shortestPath, but a dedicated BFS implementation would be even more efficient for the shortest-path use case.
Environment
- PostgreSQL 18.2
- Apache AGE 1.7.0 (built from source, branch `release/PG18/1.7.0`)
- 32-core, 64GB RAM, Linux 6.17