The Azure Cosmos DB Python SDK does not parallelize cross-partition queries by default. When you execute a query that spans multiple partitions, the SDK queries each partition sequentially, one after another. For certain query patterns, this can result in slower performance than necessary.
This demo shows how to manually parallelize cross-partition queries using feed ranges and Python's asyncio. By querying all partitions concurrently instead of sequentially, you can improve query performance — but only for specific query patterns.
Parallelization via feed ranges is not a general-purpose optimization. It helps in a narrow set of cases and can actively hurt performance and cost in others.
These query patterns produce correct results and genuine speedup when parallelized across feed ranges:
| Pattern | Example | Why it works |
|---|---|---|
| Point lookup (unknown PK) | SELECT * FROM c WHERE c.id = 'HelloWorld' |
Each partition checks independently; only the partition containing the item returns it |
| Strongly filtering query | SELECT * FROM c WHERE c.status = 'active' |
Filter applied identically per partition; results combined without duplication |
| COUNT aggregate | SELECT VALUE COUNT(1) FROM c |
Each partition returns its count; sum client-side for the total |
| SUM aggregate | SELECT VALUE SUM(LENGTH(c.id)) FROM c |
Each partition returns its sum; sum client-side for the total |
| Existence check | SELECT * FROM c WHERE c.email = 'user@example.com' |
Same as point lookup — at most one partition has the result |
| MIN / MAX aggregate | SELECT VALUE MAX(c._ts) FROM c |
Each partition returns its min/max; take the min/max of those client-side |
These query patterns will produce incorrect results, waste RUs, or both when parallelized:
| Pattern | Example | What goes wrong |
|---|---|---|
TOP / LIMIT |
SELECT TOP 100 * FROM c |
TOP 100 is applied per feed range → 10 partitions return 1,000 items total, consuming 10x the RUs |
ORDER BY |
SELECT * FROM c ORDER BY c.createdAt DESC |
Each partition sorts independently; combined results are not sorted — requires full client-side re-sort |
OFFSET...LIMIT |
SELECT * FROM c OFFSET 10 LIMIT 5 |
Pagination semantics break completely — each partition skips/limits independently |
AVG |
SELECT VALUE AVG(c.price) FROM c |
Cannot sum averages — requires tracking both sum and count per partition, then dividing |
| Unfiltered scan | SELECT * FROM c |
Same total work as sequential, but with higher peak RU burst — no speedup benefit |
Rule of thumb: Parallelization works when the per-partition results can be combined without re-processing — either concatenated (filtering queries), summed (
COUNT,SUM), or compared (MIN,MAX). If the query usesTOP,ORDER BY,OFFSET, orLIMIT, do not parallelize.
This demo compares two approaches using a SUM aggregation across all partitions:
- Standard Cross-Partition Query (Sequential): The default SDK behavior — queries partitions one at a time
- Parallelized Query with Feed Ranges (Concurrent): Uses feed ranges and asyncio to query all partitions simultaneously, then sums the per-partition results client-side
The default query — SELECT VALUE SUM(LENGTH(c.id)) FROM c — is schema-agnostic (every document has id), forces the engine to read every document and compute a string length, and produces a SUM that is trivially parallelizable. This makes it an ideal demonstration of the performance difference.
# Create virtual environment
python -m venv venv
# Activate it
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activatepip install -r requirements.txtCopy the example config and update with your settings:
cp config.example.json config.jsonEdit config.json with your Cosmos DB details:
{
"endpoint": "https://your-cosmos-account.documents.azure.com:443/",
"database": "your-database-name",
"container": "your-container-name",
"query": "SELECT VALUE SUM(LENGTH(c.id)) FROM c",
"use_default_credential": true
}Important: Use queries that are suitable for parallelization. See When to Use Parallelization for guidance. The default
SUM(LENGTH(c.id))query is an ideal candidate — it is schema-agnostic, computationally expensive, and trivially parallelizable.
az loginpython parallel_query_demo.pyEdit config.json to customize the demo:
| Field | Description | Example |
|---|---|---|
endpoint |
Cosmos DB endpoint URL | https://your-account.documents.azure.com:443/ |
database |
Database name | your-database |
container |
Container name | your-container |
query |
SQL query to execute | SELECT VALUE SUM(LENGTH(c.id)) FROM c |
use_default_credential |
Use Azure DefaultAzureCredential for authentication | true |
Important: Use queries that are suitable for parallelization — COUNT/SUM aggregates or strongly-filtering WHERE clauses. Do not use TOP, ORDER BY, or OFFSET — these operators do not parallelize correctly across feed ranges.
The demo uses DefaultAzureCredential which tries multiple authentication methods:
- Azure CLI (
az login) - Managed Identity
- Environment variables
- And more
Make sure you're authenticated:
az loginThis output is from a container with 100 million records partitioned by id:
================================================================================
COSMOS DB CROSS-PARTITION QUERY COMPARISON
================================================================================
Query: SELECT VALUE SUM(LENGTH(c.id)) FROM c
Type: Aggregate (COUNT/SUM) — results summed client-side
================================================================================
Container has 10 feed ranges (physical partitions)
[1] Running STANDARD cross-partition query...
✓ Completed in 42.17 seconds
✓ Result: 3,600,000,000
[2] Running PARALLELIZED cross-partition query...
✓ Completed in 5.83 seconds
✓ Result: 3,600,000,000 (summed from 10 partitions)
================================================================================
RESULTS
================================================================================
Standard query time: 42.17 seconds
Parallelized query time: 5.83 seconds
Speedup: 7.23x faster
Performance improvement: 86.2%
Results match: 3,600,000,000
================================================================================
Note: The parallelized query runs SUM(LENGTH(c.id)) on each partition simultaneously and sums the results client-side. This query forces the engine to read every document (it cannot be answered from indexes alone), making the sequential vs parallel difference dramatic.
.
├── parallel_query_demo.py # Main demo with comparison logic
├── config.json # Your configuration (gitignored)
├── config.example.json # Template configuration
├── requirements.txt # Python dependencies
└── README.md # This file
async for item in container.query_items(
query=query,
enable_cross_partition_query=True
):
items.append(item)# Get feed ranges (physical partitions)
feed_ranges = [fr async for fr in container.read_feed_ranges()]
# Query each feed range in parallel
async def query_feed_range(feed_range):
return [item async for item in container.query_items(
query=query,
feed_range=feed_range
)]
# Execute in parallel
results = await asyncio.gather(*[
query_feed_range(fr) for fr in feed_ranges
])- Query Pattern Matters Most: Only suitable queries benefit —
COUNT/SUMaggregates and strongly-filteringWHEREclauses.TOP,ORDER BY,OFFSETwill produce incorrect or wasteful results when parallelized - RU Cost: Parallel queries consume RUs from all partitions simultaneously. Ensure sufficient throughput to avoid throttling (429 errors)
- More Feed Ranges = More Parallelism: Performance scales with the number of physical partitions
- Network Latency: Parallel queries show greater improvement with higher latency
- Aggregate Recombination: Only
COUNTandSUMcan be trivially summed.AVGrequires tracking both sum and count per partition
- Add
WHEREfilters to narrow the result set - Check container throughput and scale if needed
- Run
az loginto authenticate with Azure - Verify you have read permissions on the Cosmos DB account
- Check firewall rules allow your IP address
- Monitor RU consumption in Azure Portal
- Check for throttling (429 errors)
- Consider increasing container throughput
- Verify network connectivity