Commit e35f297

Author: Samson Gebre
Enhance SKILL.md and QueryBuilder to support advanced streaming and update deprecated methods
- Added simple and advanced streaming options in SKILL.md for records.list_pages() and execute_pages().
- Updated QueryBuilder to replace records.get() with records.list() in documentation and method calls.
- Improved unit tests to validate new streaming functionality and ensure correct method delegation.
1 parent 3ed77be commit e35f297

5 files changed

Lines changed: 203 additions & 134 deletions


.claude/skills/dataverse-sdk-use/SKILL.md

Lines changed: 54 additions & 5 deletions
@@ -30,6 +30,16 @@ The SDK supports Dataverse's native bulk operations: Pass lists to `create()`, `
3030
### Paging
3131
- Control page size with `page_size` parameter
3232
- Use `top` parameter to limit total records returned
33+
- Simple streaming: `records.list_pages(table, filter, select)` — yields one `QueryResult` per HTTP page (3 params only; use builder for advanced options)
34+
- Advanced streaming: `client.query.builder(table)....execute_pages()` — full builder options, one `QueryResult` per page
35+
- `execute(by_page=True/False)` is **deprecated** and emits `UserWarning`; use `execute_pages()` instead
36+
37+
### QueryResult
38+
- Returned by `records.list()`, `records.retrieve()`, `execute()`, and each page from `list_pages()` / `execute_pages()`
39+
- Iterable: `for record in result` — each item is a `dict`-like `Record`
40+
- `.to_dataframe()` — convert to pandas DataFrame
41+
- `.first()` — return the first record or `None`
42+
- `len(result)` — number of records in this result/page
3343

3444
### DataFrame Support
3545
- DataFrame operations are accessed via the `client.dataframe` namespace: `client.dataframe.create()`, `client.dataframe.update()`, `client.dataframe.delete()`. `client.dataframe.get()` is deprecated; use `client.query.builder(table).where(...).execute().to_dataframe()` instead
@@ -90,7 +100,7 @@ account = client.records.retrieve("account", account_id, select=["name", "teleph
90100
# Query with filter — follows @odata.nextLink automatically (multiple HTTP requests if needed),
91101
# loads all matching records into memory, returns a single QueryResult.
92102
# Page size is Dataverse's default (~5000/page); use top to bound total records and round-trips.
93-
# For very large sets where memory is a concern, use query.builder().execute(by_page=True) instead.
103+
# For very large sets where memory is a concern, use records.list_pages() or execute_pages() instead.
94104
result = client.records.list(
95105
"account",
96106
select=["accountid", "name"], # select is case-insensitive (automatically lowercased)
@@ -100,12 +110,22 @@ result = client.records.list(
100110
for record in result:
101111
print(record["name"])
102112

103-
# For large result sets that must be streamed page-by-page (caller controls memory, one page at a time)
113+
# Simple streaming — page-by-page (3 params only; use builder for ordering/expand/count)
114+
for page in client.records.list_pages(
115+
"account",
116+
select=["accountid", "name"],
117+
filter="statecode eq 0",
118+
):
119+
for record in page:
120+
print(record["name"])
121+
122+
# Advanced streaming — full builder options, one QueryResult per HTTP page
123+
from PowerPlatform.Dataverse.models.filters import col
104124
for page in (client.query.builder("account")
105125
.select("accountid", "name")
106126
.where(col("statecode") == 0)
107127
.page_size(500) # optional: override Dataverse default page size
108-
.execute(by_page=True)):
128+
.execute_pages()):
109129
for record in page:
110130
print(record["name"])
111131

@@ -191,13 +211,14 @@ client.records.delete("account", [id1, id2, id3], use_bulk_delete=True)
191211

192212
The SDK provides DataFrame wrappers for all CRUD operations via the `client.dataframe` namespace, using pandas DataFrames and Series as input/output.
193213

194-
> **Note:** `client.dataframe.get()` is deprecated. Use `client.query.builder(table).select(...).filter(...).to_dataframe()` instead.
214+
> **Note:** `client.dataframe.get()` is deprecated. Use `client.query.builder(table).select(...).where(...).to_dataframe()` instead.
195215
196216
```python
197217
import pandas as pd
198218

199219
# Query records -- returns a single DataFrame (GA builder pattern)
200-
df = client.query.builder("account").filter("statecode eq 0").select("name").to_dataframe()
220+
from PowerPlatform.Dataverse.models.filters import col
221+
df = client.query.builder("account").where(col("statecode") == 0).select("name").to_dataframe()
201222
print(f"Got {len(df)} rows")
202223

203224
# Limit results with top
@@ -237,6 +258,34 @@ for record in results:
237258
print(record["name"])
238259
```
239260

261+
### FetchXML Queries
262+
263+
`client.query.fetch_xml(xml)` returns an inert `FetchXmlQuery` object — **no HTTP request is made** until `.execute()` or `.execute_pages()` is called.
264+
265+
```python
266+
xml = """
267+
<fetch top="50">
268+
<entity name="account">
269+
<attribute name="accountid" />
270+
<attribute name="name" />
271+
<filter>
272+
<condition attribute="statecode" operator="eq" value="0" />
273+
</filter>
274+
</entity>
275+
</fetch>
276+
"""
277+
278+
# Load all results into memory (simple, small-to-medium sets)
279+
query = client.query.fetch_xml(xml)
280+
result = query.execute() # returns QueryResult — all pages fetched upfront
281+
for record in result:
282+
print(record["name"])
283+
284+
# Stream page-by-page (large sets or early exit)
285+
for page in query.execute_pages(): # yields one QueryResult per HTTP page
286+
process(page.to_dataframe())
287+
```
288+
240289
### Table Management
241290

242291
#### Create Custom Tables
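
The paging guidance in this SKILL.md diff contrasts loading every page into memory with streaming one page at a time. The sketch below illustrates that trade-off using a stand-in page source rather than the SDK itself; `FakePagedSource`, `list_all`, and `find_first` are hypothetical names invented for illustration, and the request counter only simulates HTTP round-trips.

```python
from typing import Iterator, List, Optional


class FakePagedSource:
    """Hypothetical stand-in for a paged HTTP API (not the SDK's transport)."""

    def __init__(self, rows: List[dict], page_size: int) -> None:
        self._rows = rows
        self._page_size = page_size
        self.requests = 0  # counts simulated round-trips

    def pages(self) -> Iterator[List[dict]]:
        """Lazily yield one page per simulated request."""
        for start in range(0, len(self._rows), self._page_size):
            self.requests += 1
            yield self._rows[start:start + self._page_size]


def list_all(source: FakePagedSource) -> List[dict]:
    """Eager style: walk every page and collect all rows in memory."""
    all_rows: List[dict] = []
    for page in source.pages():
        all_rows.extend(page)
    return all_rows


def find_first(source: FakePagedSource, name: str) -> Optional[dict]:
    """Streaming style: stop as soon as a match is found."""
    for page in source.pages():
        for row in page:
            if row["name"] == name:
                return row
    return None


rows = [{"name": f"acct-{i}"} for i in range(10)]

eager_source = FakePagedSource(rows, page_size=3)
everything = list_all(eager_source)  # consumes all pages

lazy_source = FakePagedSource(rows, page_size=3)
hit = find_first(lazy_source, "acct-1")  # exits during the first page
```

In this toy setup the eager call walks all four pages, while the streaming search stops after the first. That is the kind of early-exit saving the streaming methods are meant to allow, assuming they fetch pages lazily as the diff describes.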

src/PowerPlatform/Dataverse/claude_skill/dataverse-sdk-use/SKILL.md

Lines changed: 54 additions & 5 deletions
@@ -30,6 +30,16 @@ The SDK supports Dataverse's native bulk operations: Pass lists to `create()`, `
3030
### Paging
3131
- Control page size with `page_size` parameter
3232
- Use `top` parameter to limit total records returned
33+
- Simple streaming: `records.list_pages(table, filter, select)` — yields one `QueryResult` per HTTP page (3 params only; use builder for advanced options)
34+
- Advanced streaming: `client.query.builder(table)....execute_pages()` — full builder options, one `QueryResult` per page
35+
- `execute(by_page=True/False)` is **deprecated** and emits `UserWarning`; use `execute_pages()` instead
36+
37+
### QueryResult
38+
- Returned by `records.list()`, `records.retrieve()`, `execute()`, and each page from `list_pages()` / `execute_pages()`
39+
- Iterable: `for record in result` — each item is a `dict`-like `Record`
40+
- `.to_dataframe()` — convert to pandas DataFrame
41+
- `.first()` — return the first record or `None`
42+
- `len(result)` — number of records in this result/page
3343

3444
### DataFrame Support
3545
- DataFrame operations are accessed via the `client.dataframe` namespace: `client.dataframe.create()`, `client.dataframe.update()`, `client.dataframe.delete()`. `client.dataframe.get()` is deprecated; use `client.query.builder(table).where(...).execute().to_dataframe()` instead
@@ -90,7 +100,7 @@ account = client.records.retrieve("account", account_id, select=["name", "teleph
90100
# Query with filter — follows @odata.nextLink automatically (multiple HTTP requests if needed),
91101
# loads all matching records into memory, returns a single QueryResult.
92102
# Page size is Dataverse's default (~5000/page); use top to bound total records and round-trips.
93-
# For very large sets where memory is a concern, use query.builder().execute(by_page=True) instead.
103+
# For very large sets where memory is a concern, use records.list_pages() or execute_pages() instead.
94104
result = client.records.list(
95105
"account",
96106
select=["accountid", "name"], # select is case-insensitive (automatically lowercased)
@@ -100,12 +110,22 @@ result = client.records.list(
100110
for record in result:
101111
print(record["name"])
102112

103-
# For large result sets that must be streamed page-by-page (caller controls memory, one page at a time)
113+
# Simple streaming — page-by-page (3 params only; use builder for ordering/expand/count)
114+
for page in client.records.list_pages(
115+
"account",
116+
select=["accountid", "name"],
117+
filter="statecode eq 0",
118+
):
119+
for record in page:
120+
print(record["name"])
121+
122+
# Advanced streaming — full builder options, one QueryResult per HTTP page
123+
from PowerPlatform.Dataverse.models.filters import col
104124
for page in (client.query.builder("account")
105125
.select("accountid", "name")
106126
.where(col("statecode") == 0)
107127
.page_size(500) # optional: override Dataverse default page size
108-
.execute(by_page=True)):
128+
.execute_pages()):
109129
for record in page:
110130
print(record["name"])
111131

@@ -191,13 +211,14 @@ client.records.delete("account", [id1, id2, id3], use_bulk_delete=True)
191211

192212
The SDK provides DataFrame wrappers for all CRUD operations via the `client.dataframe` namespace, using pandas DataFrames and Series as input/output.
193213

194-
> **Note:** `client.dataframe.get()` is deprecated. Use `client.query.builder(table).select(...).filter(...).to_dataframe()` instead.
214+
> **Note:** `client.dataframe.get()` is deprecated. Use `client.query.builder(table).select(...).where(...).to_dataframe()` instead.
195215
196216
```python
197217
import pandas as pd
198218

199219
# Query records -- returns a single DataFrame (GA builder pattern)
200-
df = client.query.builder("account").filter("statecode eq 0").select("name").to_dataframe()
220+
from PowerPlatform.Dataverse.models.filters import col
221+
df = client.query.builder("account").where(col("statecode") == 0).select("name").to_dataframe()
201222
print(f"Got {len(df)} rows")
202223

203224
# Limit results with top
@@ -237,6 +258,34 @@ for record in results:
237258
print(record["name"])
238259
```
239260

261+
### FetchXML Queries
262+
263+
`client.query.fetch_xml(xml)` returns an inert `FetchXmlQuery` object — **no HTTP request is made** until `.execute()` or `.execute_pages()` is called.
264+
265+
```python
266+
xml = """
267+
<fetch top="50">
268+
<entity name="account">
269+
<attribute name="accountid" />
270+
<attribute name="name" />
271+
<filter>
272+
<condition attribute="statecode" operator="eq" value="0" />
273+
</filter>
274+
</entity>
275+
</fetch>
276+
"""
277+
278+
# Load all results into memory (simple, small-to-medium sets)
279+
query = client.query.fetch_xml(xml)
280+
result = query.execute() # returns QueryResult — all pages fetched upfront
281+
for record in result:
282+
print(record["name"])
283+
284+
# Stream page-by-page (large sets or early exit)
285+
for page in query.execute_pages(): # yields one QueryResult per HTTP page
286+
process(page.to_dataframe())
287+
```
288+
240289
### Table Management
241290

242291
#### Create Custom Tables

src/PowerPlatform/Dataverse/models/query_builder.py

Lines changed: 20 additions & 28 deletions
@@ -50,7 +50,7 @@
5050
from __future__ import annotations
5151

5252
import warnings
53-
from typing import Any, Dict, Iterable, Iterator, List, Optional, TypedDict, Union
53+
from typing import Any, Dict, Iterator, List, Optional, TypedDict, Union
5454

5555
import pandas as pd
5656

@@ -67,7 +67,7 @@ class QueryParams(TypedDict, total=False):
6767
"""Typed dictionary returned by :meth:`QueryBuilder.build`.
6868
6969
Provides IDE autocomplete when passing build results to
70-
``client.records.get()`` manually.
70+
``client.records.list()`` manually.
7171
"""
7272

7373
table: str
@@ -450,7 +450,7 @@ def build(self) -> QueryParams:
450450

451451
# --------------------------------------------------------------- execute
452452

453-
def execute(self, *, by_page=_BY_PAGE_UNSET) -> Union[QueryResult, Iterable[List[Record]]]:
453+
def execute(self, *, by_page=_BY_PAGE_UNSET) -> Union[QueryResult, Iterator[QueryResult]]:
454454
"""Execute the query and return results.
455455
456456
Returns a :class:`~PowerPlatform.Dataverse.models.record.QueryResult`
@@ -460,7 +460,7 @@ def execute(self, *, by_page=_BY_PAGE_UNSET) -> Union[QueryResult, Iterable[List
460460
This method is only available when the QueryBuilder was created
461461
via ``client.query.builder(table)``. Standalone ``QueryBuilder``
462462
instances should use :meth:`build` to get parameters and pass them
463-
to ``client.records.get()`` manually.
463+
to ``client.records.list()`` manually.
464464
465465
At least one of ``select()``, ``where()``, or ``top()`` must be
466466
called before ``execute()``; otherwise a :class:`ValueError` is
@@ -474,7 +474,7 @@ def execute(self, *, by_page=_BY_PAGE_UNSET) -> Union[QueryResult, Iterable[List
474474
:return: :class:`~PowerPlatform.Dataverse.models.record.QueryResult`
475475
with all pages collected (default), or page iterator (deprecated
476476
``by_page=True``).
477-
:rtype: QueryResult or Iterable[List[Record]]
477+
:rtype: QueryResult or Iterator[QueryResult]
478478
:raises ValueError: If no ``select``, ``where``, or ``top``
479479
constraint has been set.
480480
:raises RuntimeError: If the query was not created via
@@ -510,7 +510,7 @@ def execute(self, *, by_page=_BY_PAGE_UNSET) -> Union[QueryResult, Iterable[List
510510
if self._query_ops is None:
511511
raise RuntimeError(
512512
"Cannot execute: query was not created via client.query.builder(). "
513-
"Use build() and pass parameters to client.records.get() instead."
513+
"Use build() and pass parameters to client.records.list() instead."
514514
)
515515

516516
if not self._select and not self._filter_parts and self._top is None:
@@ -522,11 +522,12 @@ def execute(self, *, by_page=_BY_PAGE_UNSET) -> Union[QueryResult, Iterable[List
522522
params = self.build()
523523
client = self._query_ops._client
524524

525-
# Suppress DeprecationWarning from records.get() — execute() is GA;
526-
# records.get() is deprecated but still used as the internal paging mechanism.
527-
with warnings.catch_warnings():
528-
warnings.simplefilter("ignore", DeprecationWarning)
529-
pages = client.records.get(
525+
if use_by_page:
526+
return self.execute_pages()
527+
528+
all_records: List[Record] = []
529+
with client._scoped_odata() as od:
530+
for page in od._get_multiple(
530531
params["table"],
531532
select=params.get("select"),
532533
filter=params.get("filter"),
@@ -536,14 +537,8 @@ def execute(self, *, by_page=_BY_PAGE_UNSET) -> Union[QueryResult, Iterable[List
536537
page_size=params.get("page_size"),
537538
count=params.get("count", False),
538539
include_annotations=params.get("include_annotations"),
539-
)
540-
541-
if use_by_page:
542-
return pages
543-
544-
all_records: List[Record] = []
545-
for page in pages:
546-
all_records.extend(page)
540+
):
541+
all_records.extend(Record.from_api_response(params["table"], row) for row in page)
547542
return QueryResult(all_records)
548543

549544
# ---------------------------------------------------------- execute_pages
@@ -579,7 +574,7 @@ def execute_pages(self) -> Iterator[QueryResult]:
579574
if self._query_ops is None:
580575
raise RuntimeError(
581576
"Cannot execute: query was not created via client.query.builder(). "
582-
"Use build() and pass parameters to client.records.get() instead."
577+
"Use build() and pass parameters to client.records.list() instead."
583578
)
584579

585580
if not self._select and not self._filter_parts and self._top is None:
@@ -591,9 +586,8 @@ def execute_pages(self) -> Iterator[QueryResult]:
591586
params = self.build()
592587
client = self._query_ops._client
593588

594-
with warnings.catch_warnings():
595-
warnings.simplefilter("ignore", DeprecationWarning)
596-
pages = client.records.get(
589+
with client._scoped_odata() as od:
590+
for page in od._get_multiple(
597591
params["table"],
598592
select=params.get("select"),
599593
filter=params.get("filter"),
@@ -603,10 +597,8 @@ def execute_pages(self) -> Iterator[QueryResult]:
603597
page_size=params.get("page_size"),
604598
count=params.get("count", False),
605599
include_annotations=params.get("include_annotations"),
606-
)
607-
608-
for page in pages:
609-
yield QueryResult(page)
600+
):
601+
yield QueryResult([Record.from_api_response(params["table"], row) for row in page])
610602

611603
# ----------------------------------------------------------- to_dataframe
612604

@@ -653,7 +645,7 @@ def to_dataframe(self) -> pd.DataFrame:
653645
if self._query_ops is None:
654646
raise RuntimeError(
655647
"Cannot execute: query was not created via client.query.builder(). "
656-
"Use build() and pass parameters to client.records.get() instead."
648+
"Use build() and pass parameters to client.records.list() instead."
657649
)
658650

659651
result = self.execute()
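
The `execute()` changes in this file keep a deprecated `by_page` keyword working by delegating to `execute_pages()`, with a sentinel default to tell "not passed" apart from an explicit `True` or `False`. A minimal sketch of that pattern follows. `ToyQuery` and `_UNSET` are hypothetical stand-ins, the warning text is invented, and pages are plain lists rather than `QueryResult` objects, so this shows only the general shape, not the SDK's implementation.

```python
import warnings
from typing import Iterator, List, Union

_UNSET = object()  # sentinel: distinguishes "argument omitted" from True/False


class ToyQuery:
    """Hypothetical stand-in for a query object; not the SDK's QueryBuilder."""

    def __init__(self, pages: List[List[dict]]) -> None:
        self._pages = pages

    def execute_pages(self) -> Iterator[List[dict]]:
        """Preferred streaming API: lazily yield one page at a time."""
        for page in self._pages:
            yield list(page)

    def execute(self, *, by_page=_UNSET) -> Union[List[dict], Iterator[List[dict]]]:
        """Return all rows, or delegate to execute_pages() for the legacy flag."""
        if by_page is not _UNSET:
            # Any explicit use of the keyword is flagged, whether True or False.
            warnings.warn(
                "by_page is deprecated; use execute_pages() instead.",
                UserWarning,
                stacklevel=2,
            )
        if by_page is True:
            return self.execute_pages()
        rows: List[dict] = []
        for page in self.execute_pages():
            rows.extend(page)
        return rows


query = ToyQuery([[{"id": 1}, {"id": 2}], [{"id": 3}]])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flat = query.execute()                     # no warning: keyword omitted
    paged = list(query.execute(by_page=True))  # warns, then streams pages
```

Because `execute()` returns the iterator instead of yielding, it stays an ordinary function and can still return a fully collected result on the default path.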

src/PowerPlatform/Dataverse/operations/records.py

Lines changed: 9 additions & 3 deletions
@@ -588,6 +588,9 @@ def list(
588588
GA replacement for ``records.get(table, filter=...)``. All pages are
589589
collected eagerly and returned as a single :class:`QueryResult`.
590590
591+
For advanced query options (ordering, expand, count, page size, or
592+
OData annotations) use ``client.query.builder()`` instead.
593+
591594
:param table: Schema name of the table (e.g. ``"account"``).
592595
:type table: :class:`str`
593596
:param filter: Optional OData filter string or :class:`FilterExpression`.
@@ -643,9 +646,12 @@ def list_pages(
643646
) -> Iterator[QueryResult]:
644647
"""Lazily yield one :class:`QueryResult` per HTTP page.
645648
646-
Symmetric with :meth:`~PowerPlatform.Dataverse.models.query_builder.QueryBuilder.execute_pages`
647-
on the builder. Each iteration triggers a network request via
648-
``@odata.nextLink``. One-shot — do not iterate more than once.
649+
Streaming counterpart to :meth:`list`. Each iteration triggers one
650+
network request via ``@odata.nextLink``. One-shot — do not iterate
651+
more than once.
652+
653+
For advanced query options (ordering, expand, count, page size, or
654+
OData annotations) use ``client.query.builder().execute_pages()`` instead.
649655
650656
:param table: Schema name of the table (e.g. ``"account"``).
651657
:type table: :class:`str`
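
The `list_pages()` docstring above warns that the returned iterator is one-shot. That is standard Python generator behavior, which the sketch below demonstrates with a hypothetical `stream_pages` helper that has no connection to the SDK: once a generator is exhausted, a second pass yields nothing, and the call has to be repeated to iterate again.

```python
from typing import Iterator, List


def stream_pages(rows: List[int], page_size: int) -> Iterator[List[int]]:
    """Hypothetical helper: lazily yield fixed-size pages from a list."""
    for start in range(0, len(rows), page_size):
        yield rows[start:start + page_size]


pages = stream_pages([1, 2, 3, 4, 5], page_size=2)

first_pass = list(pages)   # consumes the generator
second_pass = list(pages)  # generator is already exhausted

# To iterate again, call the function again to get a fresh generator.
fresh_pass = list(stream_pages([1, 2, 3, 4, 5], page_size=2))
```

If the pages are needed more than once, either collect them into a list on the first pass or issue the query again.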
