
Commit a5db9da

sagebree, Samson Gebre, and Copilot authored
Introduce GA API surface: namespace operations, FetchXML paging, streaming queries, and v0→v1 migration tool (#175)
## Summary

This PR aligns the SDK with its GA public API contract. It introduces a clean operations namespace, replaces the flat `client.*` v0 surface with structured sub-namespaces, adds full FetchXML support with correct paging cookie handling, and ships a migration tool for existing v0 callers.

---

## Changes

### 1. API surface: v0 removal and v1 namespace introduction

All deprecated v0 methods (`create`, `update`, `delete`, `get`, `list`, etc.) have been removed from `DataverseClient` (~570 lines). The client now exposes three clean namespaces:

```python
client.records # CRUD operations
client.query # Query operations (builder, SQL, FetchXML)
client.batch # Bulk/batch operations
```

---

### 2. Record operations (`client.records`)

New methods in `operations/records.py`:

- `create()`, `update()`, `delete()`, `upsert()` — single and bulk variants
- `retrieve()` — returns a `QueryResult`
- `list()` — returns a `QueryResult` (filter, select, top)
- `list_pages()` — lazy iterator yielding one `QueryResult` per HTTP page

---

### 3. Query operations (`client.query`)

- `builder(table)` — fluent `QueryBuilder` with `.where()`, `.select()`, `.order_by()`, `.expand()`, `.execute()`, `.execute_pages()`
- `sql(query)` — pass-through SQL SELECT
- `fetchxml(xml)` — returns an inert `FetchXmlQuery`; no HTTP until `.execute()` or `.execute_pages()` is called

`execute(by_page=True/False)` on `QueryBuilder` is deprecated; use `execute_pages()` instead.

---

### 4. FetchXML support (`models/fetchxml_query.py`)

New `FetchXmlQuery` implementing the correct Dataverse paging cookie algorithm per [public documentation](https://learn.microsoft.com/en-us/power-apps/developer/data-platform/fetchxml/page-results?tabs=webapi):

- Annotation parsed as outer XML `<cookie pagenumber="N" pagingcookie="DOUBLE_ENCODED" />`
- `pagingcookie` attribute extracted and double URL-decoded to get the inner cookie XML
- `pagenumber` from annotation used for next page (trusts server over local counter)
- `morerecords` handled as both `bool` and string `"true"`
- Simple paging fallback (no cookie returned) continues with page increment and emits `UserWarning`
- Safety guards: 32,768-character URL limit (documented Dataverse GET cap per [compose-http-requests-handle-errors](https://learn.microsoft.com/en-us/power-apps/developer/data-platform/webapi/compose-http-requests-handle-errors#maximum-url-length)), 10,000-page circuit breaker
- Input validation: type, non-empty, XML well-formedness, entity element and name presence, URL length pre-check

---

### 5. `QueryResult` (`models/record.py`)

New typed result wrapper returned by all read operations:

```python
result.first() # first Record or None
result.to_dataframe() # pandas DataFrame
len(result) # record count
for record in result: # iterable
```

---

### 6. `DataverseModel` protocol (`models/protocol.py`)

New `DataverseModel` typed protocol for table-bound model classes, for future use.

---

### 7. Filter expressions (`models/filters.py`)

Enhanced `FilterExpression`, `col()`, and `raw()` — now exported from the top-level package:

```python
from PowerPlatform.Dataverse import col, raw, QueryResult, DataverseModel
```

---

### 8. v0→v1 migration tool (`tools/migrate_v0_to_v1.py`)

New CLI tool that rewrites existing v0 call sites to the v1 namespace API. Supports `--dry-run`. Covers create, update, delete, list, get, fetchxml, and query builder patterns.

---

### 9. Tests

Four new phase test files covering the full GA surface:

| File | Scope |
|---|---|
| `test_phase1_ga.py` | v0 removals, deprecation guards |
| `test_phase2_ga.py` | QueryResult, execute(), exports |
| `test_phase3_ga.py` | retrieve(), list(), DataverseModel |
| `test_phase4_ga.py` | fetchxml(), input validation, paging cookie, deprecated odata helpers |

---

### 10. Examples and documentation

- New `examples/advanced/fetchxml.py` — end-to-end FetchXML scenarios (basic, paging, aggregates, link-entity, system tables)
- Updated `examples/advanced/sql_examples.py`, `batch.py`, `walkthrough.py`, and others to v1 API
- `README.md` updated with v1 usage patterns
- SKILL.md updated to reflect GA surface

---------

Co-authored-by: Samson Gebre <sagebree@microsoft.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: sagebree <6541424+sagebree@users.noreply.github.com>
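The paging cookie steps listed in section 4 of the description can be sketched with the standard library alone. This is a minimal illustration, not the code in `models/fetchxml_query.py`: the function names are hypothetical, and the annotation keys are assumed to be the ones the Dataverse Web API documents for FetchXML paging.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

# Assumed annotation keys on the Web API response body.
COOKIE_ANNOTATION = "@Microsoft.Dynamics.CRM.fetchxmlpagingcookie"
MORE_RECORDS = "@Microsoft.Dynamics.CRM.morerecords"


def has_more_records(body: dict) -> bool:
    """Accept the flag either as a real bool or as the string "true"."""
    flag = body.get(MORE_RECORDS, False)
    if isinstance(flag, str):
        return flag.strip().lower() == "true"
    return bool(flag)


def parse_paging_annotation(annotation: str) -> tuple:
    """Return (next page number, inner cookie XML) from the outer <cookie /> element."""
    outer = ET.fromstring(annotation)
    next_page = int(outer.attrib["pagenumber"])  # trust the server's page number
    inner_cookie = unquote(unquote(outer.attrib["pagingcookie"]))  # double URL-decode
    return next_page, inner_cookie


def with_paging(fetch_xml: str, page: int, cookie: str) -> str:
    """Return a copy of the FetchXML with page and paging-cookie set on <fetch>."""
    root = ET.fromstring(fetch_xml.strip())
    root.set("page", str(page))
    root.set("paging-cookie", cookie)  # ElementTree escapes the cookie XML for us
    return ET.tostring(root, encoding="unicode")
```

A caller would check `has_more_records(body)`, read `body.get(COOKIE_ANNOTATION)`, and pass the parsed values to `with_paging()` before issuing the next request. When the annotation is absent, the description says the SDK falls back to incrementing the page number and emits a `UserWarning`; that branch is left out here.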
1 parent 5eae887 commit a5db9da

43 files changed

Lines changed: 7815 additions & 3006 deletions


.claude/skills/dataverse-sdk-use/SKILL.md

Lines changed: 159 additions & 45 deletions
@@ -28,11 +28,23 @@ Use the PowerPlatform Dataverse Client Python SDK to interact with Microsoft Dat
 The SDK supports Dataverse's native bulk operations: Pass lists to `create()`, `update()` for automatic bulk processing, for `delete()`, set `use_bulk_delete` when passing lists to use bulk operation

 ### Paging
-- Control page size with `page_size` parameter
+- Control page size with `page_size` parameter on `records.list()`, `records.list_pages()`, or `QueryBuilder.page_size()`
 - Use `top` parameter to limit total records returned
+- **Preferred**: `client.query.builder(table)....execute_pages()` — composable `where(col(...))` filters, formatted values, expand with nested selects, full pagination control
+- Simple streaming shortcut: `records.list_pages(table, *, filter, select, top, orderby, expand, page_size, count, include_annotations)` — string-based OData filter only, yields one `QueryResult` per page
+- `execute(by_page=True/False)` is **deprecated** and emits `UserWarning`; use `execute_pages()` instead
+- `QueryBuilder.to_dataframe()` is **deprecated**; use `.execute().to_dataframe()` instead
+
+### QueryResult
+- Returned by `records.list()`, `records.retrieve()`, `execute()`, and each page from `list_pages()` / `execute_pages()`
+- Iterable: `for record in result` — each item is a `dict`-like `Record`
+- `.to_dataframe()` — convert to pandas DataFrame
+- `.first()` — return the first record or `None` (safe: returns `None` on empty result)
+- `result[n]` — index access returns a `Record`; `result[n:m]` returns a `QueryResult`
+- `len(result)` — number of records in this result/page

 ### DataFrame Support
-- DataFrame operations are accessed via the `client.dataframe` namespace: `client.dataframe.get()`, `client.dataframe.create()`, `client.dataframe.update()`, `client.dataframe.delete()`
+- DataFrame operations are accessed via the `client.dataframe` namespace: `client.dataframe.create()`, `client.dataframe.update()`, `client.dataframe.delete()`. `client.dataframe.get()` is deprecated; use `client.query.builder(table).where(...).execute().to_dataframe()` instead

 ## Common Operations

@@ -85,28 +97,92 @@ contact_ids = client.records.create("contact", contacts)
 #### Read Records
 ```python
 # Get single record by ID
-account = client.records.get("account", account_id, select=["name", "telephone1"])
-
-# Query with filter (paginated)
-for page in client.records.get(
-    "account",
-    select=["accountid", "name"], # select is case-insensitive (automatically lowercased)
-    filter="statecode eq 0", # filter must use lowercase logical names (not transformed)
-    top=100,
-):
+account = client.records.retrieve("account", account_id, select=["name", "telephone1"])
+
+# With expand — fetch a related record in the same HTTP request
+account = client.records.retrieve(
+    "account", account_id,
+    select=["name"],
+    expand=["primarycontactid"],
+)
+contact = (account.get("primarycontactid") or {})
+print(contact.get("fullname"))
+
+# Simple shortcut — use records.list() only for basic filter + select without composable logic.
+# Follows @odata.nextLink automatically and loads all matching records into memory.
+# For filtering, sorting, expansion, or formatted values, prefer client.query.builder() (see below).
+result = client.records.list("account", filter="statecode eq 0", select=["name", "accountid"])
+for record in result:
+    print(record["name"])
+```
+
+#### Query Builder (Preferred for Filtering, Sorting, Expand, Formatted Values)
+
+Use `client.query.builder()` for any query that goes beyond simple filter + select. It provides composable `where(col(...))` expressions, formatted value support, nested expansion, and streaming — all with a fluent API.
+
+```python
+from PowerPlatform.Dataverse.models.filters import col
+from PowerPlatform.Dataverse.models.query_builder import ExpandOption
+
+# Basic query with composable filter and sort
+result = (client.query.builder("account")
+    .select("accountid", "name", "statecode")
+    .where(col("statecode") == 0)
+    .order_by("name asc")
+    .execute())
+for record in result:
+    print(record["name"])
+
+# Composable filters — AND / OR / NOT using Python operators
+result = (client.query.builder("contact")
+    .select("fullname", "emailaddress1")
+    .where((col("statecode") == 0) & (col("emailaddress1").contains("@contoso.com")))
+    .execute())
+
+# Formatted values — display labels for option sets, currency symbols, etc.
+result = (client.query.builder("account")
+    .select("accountid", "name", "industrycode")
+    .where(col("statecode") == 0)
+    .include_formatted_values()
+    .execute())
+for record in result:
+    label = record.get("industrycode@OData.Community.Display.V1.FormattedValue")
+    print(record["name"], label)
+
+# Navigation property expansion with nested column select
+result = (client.query.builder("account")
+    .select("name")
+    .expand(ExpandOption("primarycontactid").select("fullname", "emailaddress1"))
+    .where(col("statecode") == 0)
+    .execute())
+for record in result:
+    contact = record.get("primarycontactid") or {}  # the lookup may come back as null
+    print(f"{record['name']} - {contact.get('fullname', 'N/A')}")
+
+# Stream large result sets page-by-page (memory-efficient)
+for page in (client.query.builder("account")
+    .select("accountid", "name")
+    .where(col("statecode") == 0)
+    .order_by("name asc")
+    .page_size(500)
+    .execute_pages()):
     for record in page:
         print(record["name"])

-# Query with navigation property expansion (case-sensitive!)
-for page in client.records.get(
-    "account",
-    select=["name"],
-    expand=["primarycontactid"], # Navigation properties are case-sensitive!
-    filter="statecode eq 0", # Column names must be lowercase logical names
-):
-    for account in page:
-        contact = account.get("primarycontactid", {})
-        print(f"{account['name']} - {contact.get('fullname', 'N/A')}")
+# Convert query results to a DataFrame
+df = (client.query.builder("account")
+    .select("accountid", "name")
+    .where(col("statecode") == 0)
+    .execute()
+    .to_dataframe())
+
+# Limit total results
+result = client.query.builder("account").select("name").top(100).execute()
+
+# Simple streaming shortcut via records.list_pages() (string filter only, same params as records.list())
+for page in client.records.list_pages("account", filter="statecode eq 0", select=["name"], page_size=500):
+    for record in page:
+        print(record["name"])
 ```

 #### Create Records with Lookup Bindings (@odata.bind)
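To make the composable `where(col(...))` idea in the hunk above concrete, here is a rough sketch of how Python operators could be mapped onto an OData `$filter` string. The `Col` and `Expr` classes and the `_literal` helper are hypothetical stand-ins written for this illustration; they are not the SDK's `col()` or `FilterExpression`, whose internals the diff does not show.

```python
from dataclasses import dataclass


def _literal(value) -> str:
    """Render a Python value as an OData literal (strings are single-quoted)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: True is also an int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


@dataclass(frozen=True)
class Expr:
    """A filter fragment that composes with &, | and ~."""

    text: str

    def __and__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.text} and {other.text})")

    def __or__(self, other: "Expr") -> "Expr":
        return Expr(f"({self.text} or {other.text})")

    def __invert__(self) -> "Expr":
        return Expr(f"not ({self.text})")


class Col:
    """Hypothetical column reference; comparisons build Expr objects."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, value) -> Expr:  # type: ignore[override]
        return Expr(f"{self.name} eq {_literal(value)}")

    def __gt__(self, value) -> Expr:
        return Expr(f"{self.name} gt {_literal(value)}")

    def contains(self, value: str) -> Expr:
        return Expr(f"contains({self.name},{_literal(value)})")


active_contoso = (Col("statecode") == 0) & Col("emailaddress1").contains("@contoso.com")
print(active_contoso.text)
# (statecode eq 0 and contains(emailaddress1,'@contoso.com'))
```

The parentheses around each comparison matter for the same reason they do in the SKILL.md examples: `&` binds more tightly than `==` in Python, so `col("a") == 0 & col("b") == 1` would not parse as intended.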
@@ -179,18 +255,24 @@ client.records.delete("account", [id1, id2, id3], use_bulk_delete=True)

 The SDK provides DataFrame wrappers for all CRUD operations via the `client.dataframe` namespace, using pandas DataFrames and Series as input/output.

+> **Note:** `client.dataframe.get()` is deprecated. Use `client.query.builder(table).select(...).where(...).execute().to_dataframe()` instead. `QueryBuilder.to_dataframe()` (without `.execute()`) is also deprecated — always call `.execute()` first.
+
 ```python
 import pandas as pd

-# Query records -- returns a single DataFrame
-df = client.dataframe.get("account", filter="statecode eq 0", select=["name"])
+# Query records -- returns a single DataFrame (GA pattern: .execute().to_dataframe())
+from PowerPlatform.Dataverse.models.filters import col
+df = client.query.builder("account").where(col("statecode") == 0).select("name").execute().to_dataframe()
 print(f"Got {len(df)} rows")

-# Limit results with top for large tables
-df = client.dataframe.get("account", select=["name"], top=100)
+# Limit results with top
+df = client.query.builder("account").select("name").top(100).execute().to_dataframe()
+
+# Via records.list() (simpler for basic queries)
+df = client.records.list("account", filter="statecode eq 0", select=["name"]).to_dataframe()

 # Fetch single record as one-row DataFrame
-df = client.dataframe.get("account", record_id=account_id, select=["name"])
+df = client.records.retrieve("account", account_id, select=["name"]).to_dataframe()

 # Create records from a DataFrame (returns a Series of GUIDs)
 new_accounts = pd.DataFrame([
@@ -223,6 +305,34 @@ for record in results:
     print(record["name"])
 ```

+### FetchXML Queries
+
+`client.query.fetchxml(xml)` returns an inert `FetchXmlQuery` object — **no HTTP request is made** until `.execute()` or `.execute_pages()` is called.
+
+```python
+xml = """
+<fetch top="50">
+  <entity name="account">
+    <attribute name="accountid" />
+    <attribute name="name" />
+    <filter>
+      <condition attribute="statecode" operator="eq" value="0" />
+    </filter>
+  </entity>
+</fetch>
+"""
+
+# Load all results into memory (simple, small-to-medium sets)
+query = client.query.fetchxml(xml)
+result = query.execute() # returns QueryResult — all pages fetched upfront
+for record in result:
+    print(record["name"])
+
+# Stream page-by-page (large sets or early exit)
+for page in query.execute_pages(): # yields one QueryResult per HTTP page
+    process(page.to_dataframe())
+```
+
 ### Table Management

 #### Create Custom Tables
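The commit description lists the input checks applied before a FetchXML request is sent (type, non-empty, well-formedness, entity element and name, URL length). A rough standard-library sketch of those checks follows. `validate_fetchxml` and its `collection_url` parameter are hypothetical names for this illustration, and the `?fetchXml=` query parameter is assumed from the Dataverse Web API; the SDK's actual validation may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

MAX_URL_LENGTH = 32_768  # GET URL cap cited in the commit description


def validate_fetchxml(xml: object, collection_url: str) -> str:
    """Run the listed checks and return the request URL, or raise on bad input."""
    if not isinstance(xml, str):
        raise TypeError("FetchXML must be a string")
    if not xml.strip():
        raise ValueError("FetchXML must not be empty")
    try:
        root = ET.fromstring(xml.strip())
    except ET.ParseError as exc:
        raise ValueError(f"FetchXML is not well-formed: {exc}") from exc
    entity = root.find("entity")
    if root.tag != "fetch" or entity is None:
        raise ValueError("expected a <fetch> root with an <entity> child")
    if not entity.get("name"):
        raise ValueError("<entity> needs a non-empty name attribute")
    # Pre-check the encoded URL length before any HTTP request is attempted.
    url = f"{collection_url}?fetchXml={quote(xml.strip(), safe='')}"
    if len(url) > MAX_URL_LENGTH:
        raise ValueError(f"request URL is {len(url)} characters; limit is {MAX_URL_LENGTH}")
    return url
```

Failing early here keeps a malformed query from producing a less specific server-side error, which fits the "inert until executed" design described above.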
@@ -380,7 +490,8 @@ Use `client.batch` to send multiple operations in one HTTP request. All batch me
 batch = client.batch.new()
 batch.records.create("account", {"name": "Contoso"})
 batch.records.update("account", account_id, {"telephone1": "555-0100"})
-batch.records.get("account", account_id, select=["name"])
+batch.records.retrieve("account", account_id, select=["name"], expand=["primarycontactid"], include_annotations="OData.Community.Display.V1.FormattedValue") # single record with expand
+batch.records.list("account", filter="statecode eq 0", select=["name"], orderby=["name asc"], top=50, page_size=25, count=True) # multi-record, single page
 batch.query.sql("SELECT TOP 5 name FROM account")

 result = batch.execute()
@@ -412,7 +523,8 @@ print(f"Succeeded: {len(result.succeeded)}, Failed: {len(result.failed)}")

 **Batch limitations:**
 - Maximum 1000 operations per batch
-- Paginated `records.get()` (without `record_id`) is not supported in batch
+- `batch.records.get()` is deprecated; use `batch.records.retrieve()` for single records
+- `batch.records.list()` returns a single page (no pagination); use `top` to bound results
 - `flush_cache()` is not supported in batch

 ## Error Handling
@@ -430,7 +542,7 @@ from PowerPlatform.Dataverse.core.errors import (
 from PowerPlatform.Dataverse.client import DataverseClient

 try:
-    client.records.get("account", "invalid-id")
+    client.records.retrieve("account", "invalid-id")
 except HttpError as e:
     print(f"HTTP {e.status_code}: {e.message}")
     print(f"Error code: {e.code}")
@@ -464,16 +576,17 @@ except ValidationError as e:

 ### Performance Optimization

-1. **Use bulk operations** - Pass lists to create/update/delete for automatic optimization
-2. **Specify select fields** - Limit returned columns to reduce payload size
-3. **Control page size** - Use `top` and `page_size` parameters appropriately
-4. **Reuse client instances** - Don't create new clients for each operation
-5. **Use production credentials** - ClientSecretCredential or CertificateCredential for unattended operations
-6. **Error handling** - Implement retry logic for transient errors (`e.is_transient`)
-7. **Always include customization prefix** for custom tables/columns
-8. **Use lowercase for column names, match `$metadata` for navigation properties** - Column names in `$select`/`$filter`/record payloads use lowercase LogicalNames. Navigation properties in `$expand` and `@odata.bind` keys are case-sensitive and must match the entity's `$metadata` (PascalCase for custom lookups like `new_CustomerId`, lowercase for system lookups like `parentaccountid`)
-9. **Test in non-production environments** first
-10. **Use named constants** - Import cascade behavior constants from `PowerPlatform.Dataverse.common.constants`
+1. **Prefer `client.query.builder()` for any non-trivial query** — use the builder for filtering, sorting, expansion, or formatted values; `records.list()` is a convenience shortcut for simple filter+select only
+2. **Use bulk operations** - Pass lists to create/update/delete for automatic optimization
+3. **Specify select fields** - Limit returned columns to reduce payload size
+4. **Control page size** - Use `top` and `page_size` parameters appropriately; use `execute_pages()` for large sets
+5. **Reuse client instances** - Don't create new clients for each operation
+6. **Use production credentials** - ClientSecretCredential or CertificateCredential for unattended operations
+7. **Error handling** - Implement retry logic for transient errors (`e.is_transient`)
+8. **Always include customization prefix** for custom tables/columns
+9. **Use lowercase for column names, match `$metadata` for navigation properties** - Column names in `$select`/`$filter`/record payloads use lowercase LogicalNames. Navigation properties in `$expand` and `@odata.bind` keys are case-sensitive and must match the entity's `$metadata` (PascalCase for custom lookups like `new_CustomerId`, lowercase for system lookups like `parentaccountid`)
+10. **Test in non-production environments** first
+11. **Use named constants** - Import cascade behavior constants from `PowerPlatform.Dataverse.common.constants`

 ## Additional Resources

@@ -486,9 +599,10 @@ Load these resources as needed during development:

 ## Key Reminders

-1. **Schema names are required** - Never use display names
-2. **Custom tables need prefixes** - Include customization prefix (e.g., "new_")
-3. **Filter is case-sensitive** - Use lowercase logical names
-4. **Bulk operations are encouraged** - Pass lists for optimization
-5. **No trailing slashes in URLs** - Format: `https://org.crm.dynamics.com`
-6. **Structured errors** - Check `is_transient` for retry logic
+1. **Use `client.query.builder()` for queries** — it's the primary query pattern; `records.list()` is a shortcut for trivial filter+select only
+2. **Schema names are required** - Never use display names
+3. **Custom tables need prefixes** - Include customization prefix (e.g., "new_")
+4. **Filter is case-sensitive** - Use lowercase logical names
+5. **Bulk operations are encouraged** - Pass lists for optimization
+6. **No trailing slashes in URLs** - Format: `https://org.crm.dynamics.com`
+7. **Structured errors** - Check `is_transient` for retry logic
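The SKILL.md guidance to retry on transient errors (`e.is_transient`) can be sketched as a small helper with exponential backoff. Everything here is illustrative: `TransientAwareError` is a hypothetical stand-in for an SDK error that exposes `is_transient`, and `call_with_retry`, its defaults, and the injectable `sleep` parameter are assumptions rather than SDK features.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientAwareError(Exception):
    """Hypothetical stand-in for an SDK error exposing an is_transient flag."""

    def __init__(self, message: str, is_transient: bool) -> None:
        super().__init__(message)
        self.is_transient = is_transient


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying only errors flagged as transient."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Re-raise permanent errors at once, and transient ones on the last attempt.
            if not getattr(exc, "is_transient", False) or attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
    raise RuntimeError("unreachable")  # keeps type checkers satisfied
```

In use, the operation would be a zero-argument callable wrapping an SDK call, for example `call_with_retry(lambda: client.records.retrieve("account", account_id))`. Passing `sleep` in as a parameter keeps the helper easy to test without real delays.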

CHANGELOG.md

Lines changed: 26 additions & 0 deletions
@@ -7,6 +7,32 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]

+### Added
+- `client.records.retrieve(table, record_id, *, select, expand, include_annotations)` — fetch a single record by GUID; returns `None` on 404 instead of raising; `expand` adds `$expand` for navigation property expansion on the single-record GET; `include_annotations` maps to the `Prefer: odata.include-annotations` header for formatted values and lookup labels (#175)
+- `client.records.list(table, *, filter, select, top, orderby, expand, page_size, count, include_annotations)` — eager fetch returning a flat `QueryResult`; GA replacement for `records.get()` without a record ID; `page_size` controls `Prefer: odata.maxpagesize`, `count=True` adds `$count=true`, `include_annotations` requests formatted values (#175)
+- `client.records.list_pages(table, *, filter, select, top, orderby, expand, page_size, count, include_annotations)` — lazy iterator yielding one `QueryResult` per HTTP page; streaming counterpart to `list()`; same parameter set (#175)
+- `client.query.fetchxml(xml)` — FetchXML support returning an inert `FetchXmlQuery`; no HTTP request is made until `.execute()` or `.execute_pages()` is called (#175)
+- `FetchXmlQuery` implements the correct Dataverse paging cookie algorithm: annotation parsed as outer XML, `pagingcookie` attribute double URL-decoded, server-supplied `pagenumber` used for next page, `morerecords` handled as both `bool` and `"true"` string, `UserWarning` emitted on simple paging fallback, 32,768-character URL limit enforced (documented Dataverse GET cap), 10,000-page circuit breaker against runaway iteration (#175)
+- `QueryBuilder.execute_pages()` — lazy per-page streaming returning one `QueryResult` per HTTP page; replaces deprecated `execute(by_page=True)` (#175)
+- `QueryBuilder.where()` — composable filter expressions using `col()` and Python operators (`==`, `>`, `&`, `|`, `~`); replaces deprecated `filter_eq()`, `filter_contains()`, and other `filter_*` helpers (#175)
+- `QueryResult.__getitem__` — index access (`result[0]`) returns a `Record`; slice access (`result[1:5]`) returns a new `QueryResult` (#175)
+- `DataverseModel` structural `Protocol` (`models/protocol.py`) — implement on any entity class to enable typed integration with CRUD operations without specifying table names or serializing manually (#175)
+- `col()`, `raw()`, `QueryResult`, and `DataverseModel` exported from the top-level `PowerPlatform.Dataverse` package (#175)
+- v0→v1 migration tool: installed as the `dataverse-migrate` console script (also runnable via `python -m PowerPlatform.Dataverse.migration.migrate_v0_to_v1`); rewrites v0 call sites to the v1 API with `--dry-run` support; covers `create`, `update`, `delete`, `get`, `list`, `fetchxml`, and query builder patterns; requires the `[migration]` optional extra (`pip install PowerPlatform-Dataverse-Client[migration]`) (#175)
+- Migration tool now auto-rewrites `QueryBuilder.to_dataframe()` to `.execute().to_dataframe()` (inserts `.execute()` when receiver is a recognised builder chain); output improved with `[NEEDS-MANUAL]` label for files that have no auto-rewrites but require manual attention, and a trailing note on `[MIGRATED]` lines when manual items remain (#175)
+
+### Changed
+- `QueryBuilder.execute()` now returns a flat `QueryResult` (all pages collected eagerly) instead of `Iterable[Record]` (#175)
+- `records.get()` deprecation extended: calling with a `record_id` emits `DeprecationWarning` directing callers to `retrieve()`; calling without a `record_id` directs callers to `list()` (#175)
+
+### Deprecated
+- `QueryBuilder.execute(by_page=True)` and `execute(by_page=False)` emit `UserWarning`; use `execute_pages()` and `execute()` respectively (#175)
+- `client.query.odata_select()`, `client.query.odata_expands()`, `client.query.odata_expand()`, `client.query.odata_bind()` emit `DeprecationWarning`; navigation property helpers are replaced by `QueryBuilder.expand()` (#175)
+
+### Removed
+- All v0 flat methods on `DataverseClient` (`create`, `update`, `delete`, `get`, `list`, `query_sql`, etc.) removed (~570 lines); use the `client.records`, `client.query`, and `client.batch` namespaces (#175)
+- `client.query.sql_select()`, `client.query.sql_joins()`, `client.query.sql_join()` removed (#175)
+
 ## [0.1.0b10] - 2026-05-12

 ### Added
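The changelog contrasts eager `list()` with lazy `list_pages()` and mentions a page-count circuit breaker. The difference can be sketched with a generator that follows `@odata.nextLink`. This is an assumption-laden illustration: `iter_pages`, `collect_all`, and the injected `fetch` callable are hypothetical names, and the SDK's own iterator yields `QueryResult` objects rather than plain lists.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = List[Dict[str, Any]]


def iter_pages(
    fetch: Callable[[str], Dict[str, Any]],
    first_url: str,
    max_pages: int = 10_000,
) -> Iterator[Page]:
    """Lazily yield one page of records per request, following @odata.nextLink."""
    url: Optional[str] = first_url
    fetched = 0
    while url is not None:
        if fetched >= max_pages:  # circuit breaker against runaway iteration
            raise RuntimeError(f"stopped after {max_pages} pages; possible paging loop")
        body = fetch(url)  # exactly one HTTP request per yielded page
        fetched += 1
        yield body.get("value", [])
        url = body.get("@odata.nextLink")  # absent on the last page


def collect_all(fetch: Callable[[str], Dict[str, Any]], first_url: str) -> Page:
    """Eager counterpart: drain every page into one flat list."""
    return [record for page in iter_pages(fetch, first_url) for record in page]
```

Because `iter_pages` is a generator, no request is made until the caller asks for the first page, and a caller that breaks out of the loop early never fetches the remaining pages. `collect_all` trades that for convenience by holding every record in memory.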
