Parallelizing Cross-Partition Queries in Azure Cosmos DB (Python SDK)

The Problem

The Azure Cosmos DB Python SDK does not parallelize cross-partition queries by default. When you execute a query that spans multiple partitions, the SDK queries each partition sequentially, one after another. For certain query patterns, this can result in slower performance than necessary.

The Solution

This demo shows how to manually parallelize cross-partition queries using feed ranges and Python's asyncio. By querying all partitions concurrently instead of sequentially, you can improve query performance — but only for specific query patterns.

⚠️ When to Use (and NOT Use) Parallelization

Parallelization via feed ranges is not a general-purpose optimization. It helps in a narrow set of cases and can actively hurt performance and cost in others.

✅ Good Candidates for Parallelization

These query patterns produce correct results and genuine speedup when parallelized across feed ranges:

Pattern	Example	Why it works
Point lookup (unknown PK)	`SELECT * FROM c WHERE c.id = 'HelloWorld'`	Each partition checks independently; only the partition containing the item returns it
Strongly filtering query	`SELECT * FROM c WHERE c.status = 'active'`	Filter applied identically per partition; results combined without duplication
COUNT aggregate	`SELECT VALUE COUNT(1) FROM c`	Each partition returns its count; sum client-side for the total
SUM aggregate	`SELECT VALUE SUM(LENGTH(c.id)) FROM c`	Each partition returns its sum; sum client-side for the total
Existence check	`SELECT * FROM c WHERE c.email = 'user@example.com'`	Same as point lookup — at most one partition has the result
MIN / MAX aggregate	`SELECT VALUE MAX(c._ts) FROM c`	Each partition returns its min/max; take the min/max of those client-side

❌ Bad Candidates — Do NOT Parallelize These

These query patterns will produce incorrect results, waste RUs, or both when parallelized:

Pattern	Example	What goes wrong
`TOP` / `LIMIT`	`SELECT TOP 100 * FROM c`	`TOP 100` is applied per feed range → 10 partitions return 1,000 items total, consuming 10x the RUs
`ORDER BY`	`SELECT * FROM c ORDER BY c.createdAt DESC`	Each partition sorts independently; combined results are not sorted — requires full client-side re-sort
`OFFSET...LIMIT`	`SELECT * FROM c OFFSET 10 LIMIT 5`	Pagination semantics break completely — each partition skips/limits independently
`AVG`	`SELECT VALUE AVG(c.price) FROM c`	Cannot sum averages — requires tracking both sum and count per partition, then dividing
Unfiltered scan	`SELECT * FROM c`	Same total work as sequential, but with higher peak RU burst — no speedup benefit

Rule of thumb: Parallelization works when the per-partition results can be combined without re-processing — either concatenated (filtering queries), summed (COUNT, SUM), or compared (MIN, MAX). If the query uses TOP, ORDER BY, OFFSET, or LIMIT, do not parallelize.

What This Demo Shows

This demo compares two approaches using a SUM aggregation across all partitions:

Standard Cross-Partition Query (Sequential): The default SDK behavior — queries partitions one at a time
Parallelized Query with Feed Ranges (Concurrent): Uses feed ranges and asyncio to query all partitions simultaneously, then sums the per-partition results client-side

The default query — SELECT VALUE SUM(LENGTH(c.id)) FROM c — is schema-agnostic (every document has id), forces the engine to read every document and compute a string length, and produces a SUM that is trivially parallelizable. This makes it an ideal demonstration of the performance difference.

Quick Start

1. Create a Virtual Environment

# Create virtual environment
python -m venv venv

# Activate it
# On Windows:
venv\Scripts\activate

# On macOS/Linux:
source venv/bin/activate

2. Install Dependencies

pip install -r requirements.txt

3. Configure Your Cosmos DB Connection

Copy the example config and update with your settings:

cp config.example.json config.json

Edit config.json with your Cosmos DB details:

{
  "endpoint": "https://your-cosmos-account.documents.azure.com:443/",
  "database": "your-database-name",
  "container": "your-container-name",
  "query": "SELECT VALUE SUM(LENGTH(c.id)) FROM c",
  "use_default_credential": true
}

Important: Use queries that are suitable for parallelization. See When to Use Parallelization for guidance. The default SUM(LENGTH(c.id)) query is an ideal candidate — it is schema-agnostic, computationally expensive, and trivially parallelizable.

4. Authenticate with Azure

az login

5. Run the Demo

python parallel_query_demo.py

Configuration Options

Edit config.json to customize the demo:

Field	Description	Example
`endpoint`	Cosmos DB endpoint URL	`https://your-account.documents.azure.com:443/`
`database`	Database name	`your-database`
`container`	Container name	`your-container`
`query`	SQL query to execute	`SELECT VALUE SUM(LENGTH(c.id)) FROM c`
`use_default_credential`	Use Azure DefaultAzureCredential for authentication	`true`

Important: Use queries that are suitable for parallelization — COUNT/SUM aggregates or strongly-filtering WHERE clauses. Do not use TOP, ORDER BY, or OFFSET — these operators do not parallelize correctly across feed ranges.

Authentication

The demo uses DefaultAzureCredential which tries multiple authentication methods:

Azure CLI (az login)
Managed Identity
Environment variables
And more

Make sure you're authenticated:

az login

Example Output

This output is from a container with 100 million records partitioned by id:

================================================================================
COSMOS DB CROSS-PARTITION QUERY COMPARISON
================================================================================
Query: SELECT VALUE SUM(LENGTH(c.id)) FROM c
Type:  Aggregate (COUNT/SUM) — results summed client-side
================================================================================

Container has 10 feed ranges (physical partitions)

[1] Running STANDARD cross-partition query...
    ✓ Completed in 42.17 seconds
    ✓ Result: 3,600,000,000

[2] Running PARALLELIZED cross-partition query...
    ✓ Completed in 5.83 seconds
    ✓ Result: 3,600,000,000 (summed from 10 partitions)

================================================================================
RESULTS
================================================================================
Standard query time:     42.17 seconds
Parallelized query time: 5.83 seconds

Speedup: 7.23x faster
Performance improvement: 86.2%

Results match: 3,600,000,000
================================================================================

Note: The parallelized query runs SUM(LENGTH(c.id)) on each partition simultaneously and sums the results client-side. This query forces the engine to read every document (it cannot be answered from indexes alone), making the sequential vs parallel difference dramatic.

Project Structure

.
├── parallel_query_demo.py       # Main demo with comparison logic
├── config.json                   # Your configuration (gitignored)
├── config.example.json           # Template configuration
├── requirements.txt              # Python dependencies
└── README.md                     # This file

How It Works

Standard Cross-Partition Query

async for item in container.query_items(
    query=query,
    enable_cross_partition_query=True
):
    items.append(item)

Parallelized Query with Feed Ranges

# Get feed ranges (physical partitions)
feed_ranges = [fr async for fr in container.read_feed_ranges()]

# Query each feed range in parallel
async def query_feed_range(feed_range):
    return [item async for item in container.query_items(
        query=query, 
        feed_range=feed_range
    )]

# Execute in parallel
results = await asyncio.gather(*[
    query_feed_range(fr) for fr in feed_ranges
])

Performance Considerations

Query Pattern Matters Most: Only suitable queries benefit — COUNT/SUM aggregates and strongly-filtering WHERE clauses. TOP, ORDER BY, OFFSET will produce incorrect or wasteful results when parallelized
RU Cost: Parallel queries consume RUs from all partitions simultaneously. Ensure sufficient throughput to avoid throttling (429 errors)
More Feed Ranges = More Parallelism: Performance scales with the number of physical partitions
Network Latency: Parallel queries show greater improvement with higher latency
Aggregate Recombination: Only COUNT and SUM can be trivially summed. AVG requires tracking both sum and count per partition

Troubleshooting

Query Hanging or Taking Too Long

Add WHERE filters to narrow the result set
Check container throughput and scale if needed

Authentication Issues

Run az login to authenticate with Azure
Verify you have read permissions on the Cosmos DB account
Check firewall rules allow your IP address

Performance Issues

Monitor RU consumption in Azure Portal
Check for throttling (429 errors)
Consider increasing container throughput
Verify network connectivity

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Parallelizing Cross-Partition Queries in Azure Cosmos DB (Python SDK)

The Problem

The Solution

⚠️ When to Use (and NOT Use) Parallelization

✅ Good Candidates for Parallelization

❌ Bad Candidates — Do NOT Parallelize These

What This Demo Shows

Quick Start

1. Create a Virtual Environment

2. Install Dependencies

3. Configure Your Cosmos DB Connection

4. Authenticate with Azure

5. Run the Demo

Configuration Options

Authentication

Example Output

Project Structure

How It Works

Standard Cross-Partition Query

Parallelized Query with Feed Ranges

Performance Considerations

Troubleshooting

Query Hanging or Taking Too Long

Authentication Issues

Performance Issues

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.gitignore		.gitignore
README.md		README.md
config.example.json		config.example.json
parallel_query_demo.py		parallel_query_demo.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Parallelizing Cross-Partition Queries in Azure Cosmos DB (Python SDK)

The Problem

The Solution

⚠️ When to Use (and NOT Use) Parallelization

✅ Good Candidates for Parallelization

❌ Bad Candidates — Do NOT Parallelize These

What This Demo Shows

Quick Start

1. Create a Virtual Environment

2. Install Dependencies

3. Configure Your Cosmos DB Connection

4. Authenticate with Azure

5. Run the Demo

Configuration Options

Authentication

Example Output

Project Structure

How It Works

Standard Cross-Partition Query

Parallelized Query with Feed Ranges

Performance Considerations

Troubleshooting

Query Hanging or Taking Too Long

Authentication Issues

Performance Issues

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages