⚡️ Speed up function find_last_node by 21,829%
#223
📄 21,829% (218.29x) speedup for `find_last_node` in `src/algorithms/graph.py`
⏱️ Runtime: 77.5 milliseconds → 353 microseconds (best of 250 runs)
📝 Explanation and details
The optimized code achieves a 218x speedup by eliminating a quadratic time complexity bottleneck in the original implementation.
Key Optimization
Original approach: For each node, the code checks `all(e["source"] != n["id"] for e in edges)`, which requires scanning through all edges for every node. This results in O(N × E) time complexity, where N is the number of nodes and E is the number of edges.
Optimized approach: The code pre-computes a set of all edge sources once (`edge_sources = {e["source"] for e in edges}`), then performs constant-time O(1) lookups using `n["id"] not in edge_sources`. This reduces complexity to O(N + E).
Why This Matters
Set lookup vs. repeated iteration: Python sets use hash tables, providing O(1) average-case membership testing. The original code's nested iteration forces O(E) operations per node.
Dramatic impact on larger graphs: `test_large_linear_chain` (18.3ms → 55.2μs)
Consistent improvements: Even small graphs show 50-80% speedups because hash set construction overhead is minimal, and membership testing is immediately faster than repeated iteration.
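The two approaches can be sketched side by side. This is a minimal illustration, not the actual contents of `src/algorithms/graph.py`: only the `all(...)` check and the `edge_sources` set come from the description above, while the function names `find_last_node_original` and `find_last_node_optimized` and the `next(..., None)` wrapper are assumptions made for the example.

```python
def find_last_node_original(nodes, edges):
    # Assumed shape of the original: for every node, scan all edges
    # to confirm that none of them starts at this node. O(N * E).
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_optimized(nodes, edges):
    # Build the set of source ids once (O(E)), then test each node
    # with an average-case O(1) membership check. O(N + E) overall.
    edge_sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in edge_sources), None)


# Hypothetical three-node chain: 1 -> 2 -> 3
nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
edges = [{"source": 1, "target": 2}, {"source": 2, "target": 3}]

print(find_last_node_original(nodes, edges))
print(find_last_node_optimized(nodes, edges))
```

Both versions return the first node with no outgoing edge, or `None` when every node is a source. The set-based version additionally assumes that node ids are hashable.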
Test Case Performance Patterns
The optimization benefits any workload that calls the function on non-trivial graphs, particularly hot paths where graph analysis is performed repeatedly.
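To check the effect on a particular workload, a rough timing harness along these lines may help. It is a sketch under stated assumptions: `scan_edges` and `lookup_set` are stand-ins for the two approaches described above, `make_linear_chain` and `time_call` are hypothetical helpers, and the linear-chain input only loosely mirrors what `test_large_linear_chain` might exercise. Absolute timings depend on the machine and will not match the figures reported in this PR.

```python
import timeit


def scan_edges(nodes, edges):
    # Stand-in for the original approach: re-scan the edges for each node.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def lookup_set(nodes, edges):
    # Stand-in for the optimized approach: one pass to build the set,
    # then one membership test per node.
    edge_sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in edge_sources), None)


def make_linear_chain(n):
    # Hypothetical input: node i points to node i + 1; the last node is a sink.
    nodes = [{"id": i} for i in range(n)]
    edges = [{"source": i, "target": i + 1} for i in range(n - 1)]
    return nodes, edges


def time_call(func, nodes, edges, repeats=5):
    # Best of `repeats` single-call timings, in seconds.
    return min(timeit.repeat(lambda: func(nodes, edges), number=1, repeat=repeats))


if __name__ == "__main__":
    nodes, edges = make_linear_chain(1000)
    for func in (scan_edges, lookup_set):
        print(f"{func.__name__}: {time_call(func, nodes, edges):.6f} s")
```

Taking the minimum over several runs reduces noise from other processes, which is in the same spirit as the "best of 250 runs" figure reported above.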
✅ Correctness verification report:
To edit these changes, run `git checkout codeflash/optimize-find_last_node-mjj6uva3` and push.