⚡️ Speed up method GlobalMercator.QuadTree by 23% #20

Open
codeflash-ai[bot] wants to merge 1 commit into master from codeflash/optimize-GlobalMercator.QuadTree-mh4zat2s

@codeflash-ai codeflash-ai Bot commented Oct 24, 2025

📄 23% (0.23x) speedup for GlobalMercator.QuadTree in opendm/tiles/gdal2tiles.py

⏱️ Runtime : 4.73 milliseconds → 3.85 milliseconds (best of 224 runs)

📝 Explanation and details

The optimized code achieves a 23% speedup by eliminating expensive string concatenation operations in the QuadTree method.

Key optimizations:

  1. Replaced string concatenation with list operations: The original code used quadKey += str(digit) in a loop, which creates a new string object on each iteration. The optimized version preallocates a list quadKey = [''] * zoom and uses indexed assignment quadKey[zoom - i] = digits[digit], then joins once at the end with ''.join(quadKey).

  2. Pre-cached digit strings: Instead of calling str(digit) repeatedly, the optimized code uses a pre-defined tuple digits = ('0', '1', '2', '3') for constant-time lookup.

  3. Simplified conditional checks: Removed unnecessary != 0 comparisons in the bitwise operations (if tx & mask: instead of if (tx & mask) != 0:).
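
The three changes above can be sketched side by side. This is a reconstruction from the description, not the PR's actual diff: the function names `quadtree_concat` and `quadtree_join` and the module-level `DIGITS` constant are hypothetical, and the real code is a method on `GlobalMercator`.

```python
# Hypothetical stand-alone versions of the method, rebuilt from the text above.

def quadtree_concat(tx: int, ty: int, zoom: int) -> str:
    """Baseline as described: append one digit string per zoom level."""
    quad_key = ""
    ty = (2**zoom - 1) - ty  # flip the TMS row index before encoding
    for i in range(zoom, 0, -1):
        digit = 0
        mask = 1 << (i - 1)
        if (tx & mask) != 0:
            digit += 1
        if (ty & mask) != 0:
            digit += 2
        quad_key += str(digit)  # new str() call and concatenation per level
    return quad_key


DIGITS = ("0", "1", "2", "3")  # pre-built digit strings, indexed by digit value


def quadtree_join(tx: int, ty: int, zoom: int) -> str:
    """Optimized shape as described: fill a preallocated list, join once."""
    quad_key = [""] * zoom
    ty = (2**zoom - 1) - ty
    for i in range(zoom, 0, -1):
        digit = 0
        mask = 1 << (i - 1)
        if tx & mask:  # truthiness check replaces the explicit != 0
            digit += 1
        if ty & mask:
            digit += 2
        quad_key[zoom - i] = DIGITS[digit]  # slot 0 holds the top-level digit
    return "".join(quad_key)


if __name__ == "__main__":
    # Both versions should agree on every valid tile at small zoom levels.
    for zoom in range(6):
        for tx in range(2**zoom):
            for ty in range(2**zoom):
                assert quadtree_concat(tx, ty, zoom) == quadtree_join(tx, ty, zoom)
    print(quadtree_join(5, 2, 3))  # prints 303
```

The index `zoom - i` runs from 0 to `zoom - 1` as `i` counts down, so the list is filled in the same most-significant-first order that the concatenating version produces.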

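Why this works: Python strings are immutable, so each concatenation may copy the string built so far. That is O(n) per operation and O(n²) overall in the worst case, where n is the zoom level (CPython can sometimes extend the string in place, so the measured cost is often lower than the worst case). Indexed list assignment is O(1) and the final join is O(n), giving O(n) in total.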
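
As a worked worst case for the quadratic estimate, assume every `+=` copies the whole string built so far plus the new digit (an upper bound, not a measurement). The k-th append then writes k characters, so building a quadkey of length n = zoom costs

$$
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}
$$

character writes, against n list assignments plus one n-character join for the list version. At zoom 10 that is 55 character writes versus 10 assignments and a 10-character join.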

Performance characteristics: The optimization helps most at higher zoom levels and in batch processing. The generated tests show 16-28% improvements for large-scale operations (zoom 7-10 with multiple tiles). Individual low-zoom calls (zoom 0-3) are often slower, by up to about 30% in these runs, because of the overhead of list allocation and tuple lookup. That penalty is a fraction of a microsecond per call, while the batch tests save from a few microseconds up to about 0.5 ms each.
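
To see where the two string-building patterns cross over, a small `timeit` harness along these lines could be used. It is a sketch under stated assumptions: `build_concat`, `build_join`, `compare` and `DIGITS` are hypothetical names that isolate only the string-building step, not the `QuadTree` method itself, and timings will vary by machine and Python version.

```python
import timeit

DIGITS = ("0", "1", "2", "3")  # assumed lookup table, mirroring the description


def build_concat(values):
    """Build the key by repeated str() conversion and concatenation."""
    out = ""
    for v in values:
        out += str(v)
    return out


def build_join(values):
    """Build the key by filling a preallocated list and joining once."""
    out = [""] * len(values)
    for i, v in enumerate(values):
        out[i] = DIGITS[v]
    return "".join(out)


def compare(n, repeat=5, number=2000):
    """Return best-of-`repeat` seconds per call for both builders at length n."""
    values = [k % 4 for k in range(n)]
    assert build_concat(values) == build_join(values)
    t_concat = min(timeit.repeat(lambda: build_concat(values), repeat=repeat, number=number))
    t_join = min(timeit.repeat(lambda: build_join(values), repeat=repeat, number=number))
    return t_concat / number, t_join / number


if __name__ == "__main__":
    for n in (0, 1, 3, 10, 20):
        t_concat, t_join = compare(n)
        print(f"n={n:2d}  concat={t_concat * 1e9:8.1f} ns  join={t_join * 1e9:8.1f} ns")
```

Taking the minimum over repeats follows the usual `timeit` guidance, since higher values mostly reflect interference from other processes rather than the code under test.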

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 2418 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from opendm.tiles.gdal2tiles import GlobalMercator

# unit tests

# Instantiate the class for use in tests
gm = GlobalMercator()

# ---------------------------
# 1. Basic Test Cases
# ---------------------------

def test_quadtree_zoom0():
    # At zoom level 0, only one tile (0,0)
    codeflash_output = gm.QuadTree(0, 0, 0) # 1.04μs -> 1.45μs (28.6% slower)

def test_quadtree_zoom1_tiles():
    # At zoom level 1, there are 2x2 tiles
    # (tx, ty) in TMS: (0,0) bottom-left, (1,0) bottom-right, (0,1) top-left, (1,1) top-right
    codeflash_output = gm.QuadTree(0, 0, 1) # 2.08μs -> 2.50μs (16.7% slower)
    codeflash_output = gm.QuadTree(1, 0, 1) # 949ns -> 1.08μs (12.3% slower)
    codeflash_output = gm.QuadTree(0, 1, 1) # 868ns -> 919ns (5.55% slower)
    codeflash_output = gm.QuadTree(1, 1, 1) # 711ns -> 761ns (6.57% slower)

def test_quadtree_zoom2_corners():
    # At zoom level 2, 4x4 tiles
    # Check all four corners
    codeflash_output = gm.QuadTree(0, 0, 2) # 2.32μs -> 2.59μs (10.6% slower)
    codeflash_output = gm.QuadTree(3, 0, 2) # 1.27μs -> 1.31μs (2.52% slower)
    codeflash_output = gm.QuadTree(0, 3, 2) # 1.14μs -> 1.10μs (3.55% faster)
    codeflash_output = gm.QuadTree(3, 3, 2) # 979ns -> 969ns (1.03% faster)

def test_quadtree_zoom2_edges():
    # Test edge tiles at zoom 2
    codeflash_output = gm.QuadTree(1, 0, 2) # 2.31μs -> 2.62μs (12.1% slower)
    codeflash_output = gm.QuadTree(2, 0, 2) # 1.23μs -> 1.23μs (0.325% faster)
    codeflash_output = gm.QuadTree(0, 1, 2) # 1.03μs -> 1.06μs (2.45% slower)
    codeflash_output = gm.QuadTree(3, 1, 2) # 977ns -> 975ns (0.205% faster)

def test_quadtree_zoom3_center():
    # Tiles next to the center at zoom 3; (3, 3) maps to '211' after the ty flip
    codeflash_output = gm.QuadTree(3, 3, 3) # 2.61μs -> 2.81μs (7.16% slower)
    codeflash_output = gm.QuadTree(1, 1, 3) # 1.36μs -> 1.37μs (0.659% slower)

# ---------------------------
# 2. Edge Test Cases
# ---------------------------



def test_quadtree_non_integer_input():
    # Non-integer tx/ty/zoom should raise TypeError
    with pytest.raises(TypeError):
        gm.QuadTree(0.5, 0, 1) # 3.80μs -> 3.89μs (2.34% slower)
    with pytest.raises(TypeError):
        gm.QuadTree(0, "1", 1) # 1.25μs -> 1.28μs (2.88% slower)
    with pytest.raises(TypeError):
        gm.QuadTree(0, 1, "2") # 913ns -> 763ns (19.7% faster)


def test_quadtree_large_zoom_max_tile():
    # Test the maximum valid tile at a high zoom
    zoom = 10
    tx = ty = 2**zoom - 1
    codeflash_output = gm.QuadTree(tx, ty, zoom); quadkey = codeflash_output # 5.08μs -> 4.85μs (4.64% faster)

def test_quadtree_large_zoom_min_tile():
    # Test the minimum valid tile at a high zoom
    zoom = 10
    tx = ty = 0
    codeflash_output = gm.QuadTree(tx, ty, zoom); quadkey = codeflash_output # 4.33μs -> 4.32μs (0.116% faster)

# ---------------------------
# 3. Large Scale Test Cases
# ---------------------------

def test_quadtree_large_scale_all_tiles_zoom8():
    # Test all tiles at zoom 8 (256x256 tiles, but test only a sample)
    zoom = 8
    max_tile = 2**zoom - 1
    # Sample 10 tiles across the diagonal
    for i in range(0, max_tile+1, max_tile//9):
        codeflash_output = gm.QuadTree(i, i, zoom); quadkey = codeflash_output # 22.6μs -> 19.1μs (18.1% faster)

def test_quadtree_large_scale_random_tiles_zoom9():
    # Test a few fixed sample tiles at zoom 9 (512x512 tiles)
    zoom = 9
    max_tile = 2**zoom - 1
    # Pick 5 samples (the last one repeats the max tile)
    samples = [(0, 0), (max_tile, max_tile), (max_tile//2, max_tile//2), (123, 456), (511, 511)]
    for tx, ty in samples:
        codeflash_output = gm.QuadTree(tx, ty, zoom); quadkey = codeflash_output # 13.8μs -> 11.8μs (16.6% faster)

def test_quadtree_performance_zoom10():
    # Performance test: generate quadkeys for the first 1000 of the 1024 tiles in a row at zoom 10
    zoom = 10
    ty = 0
    for tx in range(1000):
        codeflash_output = gm.QuadTree(tx, ty, zoom); quadkey = codeflash_output # 2.44ms -> 1.90ms (28.3% faster)

def test_quadtree_consistency():
    # Consistency: quadkey should be unique for each (tx, ty, zoom)
    zoom = 5
    seen = set()
    for tx in range(2**zoom):
        for ty in range(2**zoom):
            codeflash_output = gm.QuadTree(tx, ty, zoom); quadkey = codeflash_output
            seen.add(quadkey)

def test_quadtree_reversibility():
    # Reversibility: If we convert (tx, ty, zoom) to quadkey, then reconstruct tx, ty
    # from quadkey, we should get the same values
    # We'll implement a helper to reconstruct tx, ty from quadkey
    def quadkey_to_tile(quadkey):
        tx = ty = 0
        zoom = len(quadkey)
        for i in range(zoom):
            mask = 1 << (zoom - i - 1)
            digit = int(quadkey[i])
            if digit & 1:
                tx |= mask
            if digit & 2:
                ty |= mask
        # Undo the TMS->Google conversion for ty
        ty = (2**zoom - 1) - ty
        return tx, ty, zoom

    zoom = 7
    for tx in range(0, 2**zoom, 17):  # 8 sample values per axis (64 tiles)
        for ty in range(0, 2**zoom, 17):
            codeflash_output = gm.QuadTree(tx, ty, zoom); quadkey = codeflash_output
            tx2, ty2, zoom2 = quadkey_to_tile(quadkey)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from opendm.tiles.gdal2tiles import GlobalMercator

# unit tests

@pytest.fixture
def mercator():
    # Fixture to provide a GlobalMercator instance
    return GlobalMercator()

# --- Basic Test Cases ---

def test_quadtree_zoom0_origin(mercator):
    # At zoom 0, only one tile (0,0) and quadkey is empty string
    codeflash_output = mercator.QuadTree(0, 0, 0) # 1.15μs -> 1.50μs (23.1% slower)

def test_quadtree_zoom1_tiles(mercator):
    # At zoom 1, tiles are (0,0), (1,0), (0,1), (1,1); ty is flipped before encoding
    # QuadKey for (0,0): '2'
    codeflash_output = mercator.QuadTree(0, 0, 1) # 2.25μs -> 2.65μs (14.8% slower)
    # QuadKey for (1,0): '3'
    codeflash_output = mercator.QuadTree(1, 0, 1) # 946ns -> 1.07μs (11.5% slower)
    # QuadKey for (0,1): '0'
    codeflash_output = mercator.QuadTree(0, 1, 1) # 804ns -> 910ns (11.6% slower)
    # QuadKey for (1,1): '1'
    codeflash_output = mercator.QuadTree(1, 1, 1) # 710ns -> 716ns (0.838% slower)

def test_quadtree_zoom2_corners(mercator):
    # At zoom 2, test all 4 corners
    codeflash_output = mercator.QuadTree(0, 0, 2) # 2.42μs -> 2.73μs (11.4% slower)
    codeflash_output = mercator.QuadTree(3, 0, 2) # 1.28μs -> 1.32μs (3.47% slower)
    codeflash_output = mercator.QuadTree(0, 3, 2) # 1.06μs -> 1.09μs (3.47% slower)
    codeflash_output = mercator.QuadTree(3, 3, 2) # 987ns -> 978ns (0.920% faster)

def test_quadtree_zoom2_edges(mercator):
    # At zoom 2, test edge tiles
    codeflash_output = mercator.QuadTree(1, 0, 2) # 2.38μs -> 2.66μs (10.5% slower)
    codeflash_output = mercator.QuadTree(2, 0, 2) # 1.24μs -> 1.24μs (0.567% faster)
    codeflash_output = mercator.QuadTree(1, 3, 2) # 1.11μs -> 1.04μs (5.93% faster)
    codeflash_output = mercator.QuadTree(2, 3, 2) # 940ns -> 905ns (3.87% faster)

def test_quadtree_typical_tile(mercator):
    # At zoom 3, tile (5,2)
    # Manually calculated quadkey: (zoom 3, tx=5, ty=2)
    # ty = (2^3 - 1) - 2 = 5
    # i=3: mask=4, tx&4=4, ty&4=4 => digit=3
    # i=2: mask=2, tx&2=0, ty&2=0 => digit=0
    # i=1: mask=1, tx&1=1, ty&1=1 => digit=3
    # quadkey: '303'
    codeflash_output = mercator.QuadTree(5, 2, 3) # 2.65μs -> 2.85μs (7.05% slower)

# --- Edge Test Cases ---

def test_quadtree_max_tile_zoom4(mercator):
    # At zoom 4, max tile index is 15 (2^4-1)
    codeflash_output = mercator.QuadTree(15, 15, 4) # 2.83μs -> 2.96μs (4.06% slower)
    codeflash_output = mercator.QuadTree(0, 15, 4) # 1.61μs -> 1.50μs (7.34% faster)
    codeflash_output = mercator.QuadTree(15, 0, 4) # 1.58μs -> 1.46μs (8.80% faster)
    codeflash_output = mercator.QuadTree(0, 0, 4) # 1.41μs -> 1.29μs (9.95% faster)

def test_quadtree_negative_tile_indices(mercator):
    # Negative tile indices should produce a valid quadkey (bitwise ops work)
    # For tx=-1, ty=0, zoom=2
    codeflash_output = mercator.QuadTree(-1, 0, 2); result = codeflash_output # 2.43μs -> 2.65μs (8.24% slower)

def test_quadtree_large_zoom(mercator):
    # At zoom 10, test tile (1023, 1023)
    codeflash_output = mercator.QuadTree(1023, 1023, 10); quadkey = codeflash_output # 4.32μs -> 4.11μs (5.16% faster)

def test_quadtree_zero_tile_large_zoom(mercator):
    # At zoom 10, tile (0,0)
    codeflash_output = mercator.QuadTree(0, 0, 10); quadkey = codeflash_output # 4.18μs -> 4.01μs (4.26% faster)

def test_quadtree_ty_flip(mercator):
    # QuadTree flips ty: (tx, ty, zoom) and (tx, (2**zoom-1)-ty, zoom) produce quadkeys whose digits differ only in the ty bit (0<->2, 1<->3)
    zoom = 3
    tx, ty = 2, 1
    codeflash_output = mercator.QuadTree(tx, ty, zoom); quadkey1 = codeflash_output # 2.76μs -> 3.06μs (9.75% slower)
    codeflash_output = mercator.QuadTree(tx, (2**zoom-1)-ty, zoom); quadkey2 = codeflash_output # 1.51μs -> 1.49μs (1.68% faster)

def test_quadtree_non_integer_input(mercator):
    # Non-integer input should raise TypeError
    with pytest.raises(TypeError):
        mercator.QuadTree(1.5, 2, 3) # 2.72μs -> 3.00μs (9.40% slower)
    with pytest.raises(TypeError):
        mercator.QuadTree(1, "2", 3) # 1.21μs -> 1.29μs (6.05% slower)
    with pytest.raises(TypeError):
        mercator.QuadTree(1, 2, "3") # 916ns -> 787ns (16.4% faster)

def test_quadtree_zoom_negative(mercator):
    # Negative zoom should produce an empty quadkey (since range(zoom,0,-1) is empty)
    codeflash_output = mercator.QuadTree(0, 0, -1) # 1.86μs -> 2.40μs (22.5% slower)

def test_quadtree_zoom_zero_nonzero_tile(mercator):
    # At zoom 0, any tx/ty should produce empty string
    codeflash_output = mercator.QuadTree(10, 10, 0) # 1.03μs -> 1.47μs (30.1% slower)

# --- Large Scale Test Cases ---

def test_quadtree_all_tiles_zoom8(mercator):
    # At zoom 8, there are 256x256 tiles. Test a fixed sample of 10 tiles.
    zoom = 8
    for tx, ty in [(0,0), (255,255), (128,128), (64,192), (192,64), (100,200), (200,100), (50,150), (150,50), (127,127)]:
        codeflash_output = mercator.QuadTree(tx, ty, zoom); quadkey = codeflash_output # 23.6μs -> 20.1μs (17.8% faster)

def test_quadtree_performance_zoom9(mercator):
    # At zoom 9, test a batch of 100 tiles for performance and correctness
    zoom = 9
    for tx in range(0, 100):
        ty = 99 - tx
        codeflash_output = mercator.QuadTree(tx, ty, zoom); quadkey = codeflash_output # 228μs -> 182μs (25.2% faster)

def test_quadtree_unique_keys_zoom7(mercator):
    # At zoom 7, check that quadkeys for tiles (x,0) are unique
    zoom = 7
    keys = set()
    for tx in range(128):
        codeflash_output = mercator.QuadTree(tx, 0, zoom); quadkey = codeflash_output # 246μs -> 200μs (22.8% faster)
        keys.add(quadkey)

def test_quadtree_large_zoom_max_tile(mercator):
    # At zoom 10, max tile index is 1023
    codeflash_output = mercator.QuadTree(1023, 1023, 10); quadkey = codeflash_output # 4.33μs -> 4.14μs (4.64% faster)

def test_quadtree_large_zoom_random_tiles(mercator):
    # Test 10 fixed sample tiles at zoom 9 (some indices exceed the valid 0-511 range)
    zoom = 9
    test_tiles = [(0,0), (511,511), (256,256), (123,456), (456,123), (100,900), (900,100), (50,800), (800,50), (127,127)]
    for tx, ty in test_tiles:
        codeflash_output = mercator.QuadTree(tx, ty, zoom); quadkey = codeflash_output # 26.2μs -> 22.2μs (18.2% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-GlobalMercator.QuadTree-mh4zat2s, commit, and push.

codeflash-ai Bot requested a review from mashraf-222 October 24, 2025 15:01
codeflash-ai Bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels Oct 24, 2025