
Add extended table metadata retrieval and models for columns and option sets #2

Draft
maksii wants to merge 24 commits into main from feature/metadata

Conversation

@maksii
Owner

@maksii maksii commented Feb 28, 2026

Add extended table metadata retrieval and models for columns and option sets

  • Implemented methods to fetch detailed metadata for tables, including columns and relationships.
  • Introduced ColumnMetadata, OptionItem, and OptionSetInfo models to represent column and option set data structures.
  • Updated get method in TableOperations to support optional parameters for including columns and relationships in the response.
  • Enhanced tests to cover new functionality and ensure backward compatibility.

This update improves the SDK's ability to interact with Dataverse metadata, providing richer data for developers.

@maksii
Owner Author

maksii commented Feb 28, 2026

@codex review
@copilot review


Copilot AI commented Feb 28, 2026

@maksii I've opened a new pull request, #3, to work on those changes. Once the pull request is ready, I'll request review from you.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copilot AI and others added 3 commits February 28, 2026 12:30
Co-authored-by: maksii <1761348+maksii@users.noreply.github.com>
Fix spurious `columns_created` in `get()` result and reduce cyclomatic complexity
- Introduced new test data fixtures in `tests/fixtures/test_data.py` for various metadata attributes, including columns, option sets, and relationships.
- Updated `tests/unit/models/test_metadata.py` to utilize the new fixtures for testing `ColumnMetadata` and `OptionSetInfo` classes, improving test coverage for primary name columns, picklist columns, and status/state option sets.
- Enhanced `tests/unit/test_tables_operations.py` to incorporate fixtures for table operations, ensuring accurate testing of column retrieval, relationship listing, and table metadata.
- Refactored existing tests to replace hardcoded data with structured test data from the new fixtures, promoting maintainability and clarity in test cases.

Copilot AI left a comment


Pull request overview

This PR extends the Dataverse SDK’s table metadata APIs to support richer metadata retrieval (columns, relationships, and option sets) and adds new metadata models plus corresponding unit tests.

Changes:

  • Added extended table metadata retrieval to TableOperations.get() via select, include_columns, and include_relationships.
  • Introduced metadata models (ColumnMetadata, OptionItem, OptionSetInfo) and added extensive fixtures/tests for columns and option sets.
  • Implemented new OData metadata helpers for table metadata, columns, relationships, and column option sets.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
|------|-------------|
| `src/PowerPlatform/Dataverse/operations/tables.py` | Extends public table metadata APIs (get/columns/relationships/option sets). |
| `src/PowerPlatform/Dataverse/data/_odata.py` | Adds OData helper methods for metadata retrieval (tables, columns, option sets, relationships). |
| `src/PowerPlatform/Dataverse/models/metadata.py` | Adds dataclass models for column and option set metadata. |
| `src/PowerPlatform/Dataverse/common/constants.py` | Adds OData type constants used for attribute-type casting in metadata calls. |
| `tests/unit/test_tables_operations.py` | Expands table operations tests to cover new metadata functionality and backward compatibility. |
| `tests/unit/models/test_metadata.py` | New unit tests validating metadata model parsing. |
| `tests/fixtures/test_data.py` | Adds realistic fixtures for attributes, option sets, relationships, and table list entries. |
| `tests/fixtures/__init__.py` | Initializes fixtures as a package. |
| `README.md` | Documents new metadata retrieval APIs with examples. |
| `.claude/skills/dataverse-sdk-use/SKILL.md` | Updates skill documentation with new metadata retrieval examples. |
Comments suppressed due to low confidence (3)

src/PowerPlatform/Dataverse/data/_odata.py:1534

  • _get_table_metadata() expands Attributes / relationship collections without projecting fields (no nested $select). Expanding Attributes in particular can return very large payloads for common entities, which can make tables.get(..., include_columns=True) slow/heavy. Consider adding nested $select for expanded collections (or reusing the more selective _get_table_columns endpoint for columns) so the default extended-get remains practical on large schemas.
        expand_parts: List[str] = []
        if include_attributes:
            expand_parts.append("Attributes")
        if include_one_to_many:
            expand_parts.append("OneToManyRelationships")
        if include_many_to_one:
            expand_parts.append("ManyToOneRelationships")
        if include_many_to_many:
            expand_parts.append("ManyToManyRelationships")
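
A minimal sketch of the nested `$select` idea from the comment above, assuming a hypothetical helper `build_expand` and an illustrative set of projected fields. Neither is taken from the SDK; the right projection depends on which fields the models actually read.

```python
from typing import Dict, List, Optional

# Hypothetical projections (assumption): pick the fields the caller really needs.
DEFAULT_PROJECTIONS: Dict[str, List[str]] = {
    "Attributes": ["LogicalName", "SchemaName", "AttributeType"],
    "OneToManyRelationships": ["SchemaName", "ReferencingEntity", "ReferencedEntity"],
}


def build_expand(
    collections: List[str],
    projections: Optional[Dict[str, List[str]]] = None,
) -> str:
    """Build an OData $expand value, nesting $select where a projection is known."""
    projections = DEFAULT_PROJECTIONS if projections is None else projections
    parts = []
    for name in collections:
        fields = projections.get(name)
        if fields:
            parts.append(f"{name}($select={','.join(fields)})")
        else:
            parts.append(name)  # no projection known: expand the full collection
    return ",".join(parts)
```

The returned string would be passed as the `$expand` query parameter, so a call such as `build_expand(["Attributes", "ManyToManyRelationships"])` projects attributes while leaving the many-to-many collection unprojected.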

src/PowerPlatform/Dataverse/operations/tables.py:368

  • The get_column_options() docstring says it only supports Picklist/MultiSelect/Boolean columns, but the implementation (_get_column_optionset) also supports Status and State attribute types (and unit tests exercise those). Please update the docstring to include Status/State so the public API docs match behavior.
        """Get option set values for a Picklist, MultiSelect, or Boolean column.

        This method retrieves the available choices for a column that uses an
        option set. For Picklist and MultiSelect columns, the options are the
        defined choice values. For Boolean columns, the result contains the
        True and False option labels.

src/PowerPlatform/Dataverse/data/_odata.py:1660

  • _get_column_optionset() uses $expand=OptionSet,GlobalOptionSet without a nested $select. This can significantly increase payload size and may omit fields you actually need (e.g., Options / TrueOption / FalseOption) depending on server defaults. Consider using a nested expand with an explicit select (similar to the existing picklist fetch logic earlier in this file that uses OptionSet($select=Options)) to keep responses smaller and more reliable.
        base = f"{self.api}/EntityDefinitions(LogicalName='{table_lower}')/Attributes(LogicalName='{column_lower}')"

        params = {"$select": "LogicalName", "$expand": "OptionSet,GlobalOptionSet"}

        for cast_type in [


maksii and others added 18 commits February 28, 2026 15:39
- Updated the docstring to clarify the types of columns supported, including Picklist, MultiSelect, Boolean, Status, and State.
- Improved descriptions of the return values for better understanding of the method's functionality.
…ce TableInfo model with deprecated property aliases and additional relationship attributes
…osoft#137)

# Fix @odata.bind key casing and harden OData annotation handling

## Summary

The SDK's `_lowercase_keys()` was unconditionally lowercasing all
dictionary keys in record payloads, including `@odata.bind` annotation
keys like `new_CustomerId@odata.bind`. This broke lookup field bindings
because the Dataverse OData parser validates navigation property names
**case-sensitively**.

**Root cause:** Dataverse uses two naming conventions:
- **Structural properties** (columns): LogicalName, always lowercase
(`new_name`, `new_priority`)
- **Navigation properties** (lookups): SchemaName, PascalCase
(`new_CustomerId`, `new_AgentId`)

The OData parser (`Microsoft.OData.Core`) rejects lowercased navigation
property names with: `ODataException: An undeclared property
'new_customerid' which only has property annotations in the payload but
no property value was found in the payload.`

Note: CDS's internal RelationshipService *is* case-insensitive, but it
never runs because the OData parser rejects the payload first.

## Changes

### Bug fixes
- **Preserve `@odata.bind` key casing** -- `_lowercase_keys()` now skips
keys containing `@odata.`, preserving the PascalCase navigation property
name that Dataverse requires
- **Skip `@odata.` keys in `_convert_labels_to_ints()`** -- Previously
made unnecessary HTTP metadata API calls for every `@odata.bind` key
(checking if it's a picklist attribute). These always returned empty
results but wasted an HTTP round-trip per annotation key per record on
every create/update/upsert
- **Fix `_get` `$select` consistency** -- Single-record `_get()` now
lowercases `$select` column names via `_lowercase_list()`, matching the
behavior of `_get_multiple()`

### Developer guardrails
- **Runtime warning for likely-wrong casing** -- `_lowercase_keys()` now
emits a `warnings.warn()` when it detects an `@odata.bind` key where the
navigation property portion is all-lowercase (e.g.,
`new_customerid@odata.bind`), alerting developers before they hit a
cryptic 400 error
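
The key-casing behavior described above could look roughly like the following sketch. The function name `lowercase_keys` (no leading underscore) and the warning text are assumptions for illustration; the SDK's actual `_lowercase_keys()` may differ in detail.

```python
import warnings
from typing import Any, Dict


def lowercase_keys(record: Dict[str, Any]) -> Dict[str, Any]:
    """Lowercase column keys but leave OData annotation keys untouched."""
    result: Dict[str, Any] = {}
    for key, value in record.items():
        if "@odata." in key:
            nav_prop = key.split("@", 1)[0]
            # An all-lowercase navigation property is a likely casing mistake.
            if key.endswith("@odata.bind") and nav_prop and nav_prop == nav_prop.lower():
                warnings.warn(
                    f"'{key}': navigation property names are case-sensitive; "
                    "use the SchemaName casing (e.g. new_CustomerId@odata.bind).",
                    UserWarning,
                    stacklevel=2,
                )
            result[key] = value  # preserve the casing exactly as given
        else:
            result[key.lower()] = value  # structural properties use LogicalName
    return result
```

Warning rather than raising keeps the write path permissive: a navigation property whose SchemaName really is all lowercase still goes through unchanged.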

### Tests
- `test_odata_bind_keys_preserve_case` -- PascalCase `@odata.bind` keys
are preserved through the write path
- `test_odata_bind_lowercase_warns` -- Lowercase nav property in
`@odata.bind` triggers a warning
- `test_odata_bind_pascalcase_no_warning` -- Correct PascalCase does not
trigger false positive
- `test_convert_labels_skips_odata_keys` -- Verifies
`_convert_labels_to_ints` does not call `_optionset_map` for `@odata.`
keys

### Documentation
- **`dataverse-sdk-dev` skill** -- Added "Dataverse Property Naming
Rules" section explaining structural vs navigation property conventions
and implementation rules for contributors
- **`dataverse-sdk-use` skill** -- Added `@odata.bind` usage examples,
400 error troubleshooting guidance, and corrected best practice on
casing

## Before / After

**Before:** SDK sent `{"new_customerid@odata.bind": ...}` -- 400 error

**After:** SDK sends `{"new_CustomerId@odata.bind": ...}` -- success

```python
# User code (unchanged -- SDK now preserves their casing correctly)
client.records.create("new_ticket", {
    "new_name": "TKT-001",
    "new_CustomerId@odata.bind": "/new_customers(guid)",
})
```

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add changelog entry for v0.1.0b6 covering PRs microsoft#115, microsoft#117, microsoft#126, microsoft#137

Co-authored-by: Abel Milash <abelmilash@microsoft.com>
Post-release version bump to 0.1.0b7 after publishing v0.1.0b6 to PyPI.

Co-authored-by: Abel Milash <abelmilash@microsoft.com>
…icrosoft#98)

## Summary

Adds a `client.dataframe` namespace with pandas DataFrame/Series
wrappers for all CRUD operations, plus two advanced example scripts, and
a minor SDK enhancement for table metadata. Users can now query, create,
update, and delete Dataverse records using DataFrame-native inputs and
outputs -- no manual dict conversion required.

## Quick Example

```python
import pandas as pd
from azure.identity import InteractiveBrowserCredential
from PowerPlatform.Dataverse.client import DataverseClient

credential = InteractiveBrowserCredential()

with DataverseClient("https://yourorg.crm.dynamics.com", credential) as client:

    # Query records as a DataFrame (all pages consolidated automatically)
    df = client.dataframe.get("account", select=["name", "telephone1"], top=5)

    # Create records from a DataFrame (returns Series of GUIDs)
    new_records = pd.DataFrame([
        {"name": "Acme Corp", "telephone1": "555-9000"},
        {"name": "Globex Inc", "telephone1": "555-9001"},
    ])
    new_records["accountid"] = client.dataframe.create("account", new_records)

    # Update records (NaN/None skipped by default; use clear_nulls=True to clear fields)
    new_records["telephone1"] = ["555-1111", "555-2222"]
    client.dataframe.update("account", new_records[["accountid", "telephone1"]], id_column="accountid")

    # Delete records
    client.dataframe.delete("account", new_records["accountid"])
```

## Changes

### DataFrame CRUD (`client.dataframe` namespace)
| File | Description |
|------|-------------|
| `src/.../operations/dataframe.py` | `DataFrameOperations` class: `get()`, `create()`, `update()`, `delete()` |
| `src/.../utils/_pandas.py` | `dataframe_to_records()` helper -- normalizes NumPy, datetime, NaN/None |
| `client.py` | Added `self.dataframe = DataFrameOperations(self)` |
| `pyproject.toml` | Added `pandas>=2.0.0` required dependency |
| `README.md` | DataFrame usage examples |
| `operations/__init__.py` | Cleanup (`__all__ = []`) |

### SDK Enhancement: TableInfo primary column metadata (fixes microsoft#148)
| File | Description |
|------|-------------|
| `src/.../data/_odata.py` | `_get_entity_by_table_schema_name()` and `_get_table_info()` now select `PrimaryNameAttribute` and `PrimaryIdAttribute` from EntityDefinitions |
| `src/.../models/table_info.py` | `TableInfo` includes `primary_name_attribute` and `primary_id_attribute` fields |
| `tests/unit/models/test_table_info.py` | Tests for new fields in `from_dict`, `from_api_response`, and legacy key access |

### Advanced Examples
| File | Description |
|------|-------------|
| `examples/advanced/dataframe_operations.py` | DataFrame CRUD walkthrough |
| `examples/advanced/prodev_quick_start.py` | Pro-dev: 4-table system with relationships, DataFrame CRUD, query/analyze. Uses `result.primary_name_attribute` from `tables.create()` |
| `examples/advanced/datascience_risk_assessment.py` | Data science: 5-step risk pipeline with 3 LLM provider options (Azure AI Inference, OpenAI, GitHub Copilot SDK), matplotlib charts |

### Test Files
| File | Tests |
|------|-------|
| `test_dataframe_operations.py` | 44 |
| `test_client_dataframe.py` | 26 |
| `test_pandas_helpers.py` | 33 |
| `test_table_info.py` | +1 (primary fields) |

## API Design

| Method | Input | Output | Underlying API |
|--------|-------|--------|----------------|
| `get(table, ...)` | OData params | `pd.DataFrame` | `records.get()` |
| `get(table, record_id=...)` | GUID | 1-row `pd.DataFrame` | `records.get()` |
| `create(table, df)` | `pd.DataFrame` | `pd.Series` of GUIDs | `CreateMultiple` |
| `update(table, df, id_column)` | `pd.DataFrame` | `None` | `UpdateMultiple` |
| `delete(table, ids)` | `pd.Series` | `Optional[str]` | `BulkDelete` |

### Design Decisions
- **`clear_nulls`**: Default `False` skips NaN (field unchanged). `True`
sends null to clear.
- **Type normalization**: np.int64/float64/bool_/ndarray,
datetime/date/np.datetime64, pd.Timestamp -- all auto-converted.
- **ID validation**: Strip whitespace, report DataFrame index labels in
errors.
- **pandas required**: Core dependency by team decision.
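
To make the `clear_nulls` and type-normalization decisions concrete, here is a simplified sketch over plain Python dicts. `normalize_record` is a hypothetical stand-in, not the SDK's `dataframe_to_records()`, and it deliberately ignores the NumPy and pandas types listed above so it runs on the standard library alone.

```python
import math
from datetime import date, datetime
from typing import Any, Dict


def normalize_record(row: Dict[str, Any], clear_nulls: bool = False) -> Dict[str, Any]:
    """Convert one row to a JSON-ready dict (illustrative helper, plain values only)."""
    record: Dict[str, Any] = {}
    for column, value in row.items():
        is_null = value is None or (isinstance(value, float) and math.isnan(value))
        if is_null:
            if clear_nulls:
                record[column] = None  # explicit null clears the field on update
            continue  # default: omit the key so the field stays unchanged
        if isinstance(value, (datetime, date)):
            value = value.isoformat()  # dates travel as ISO 8601 strings
        record[column] = value
    return record
```

Omitting the key by default means a sparse update frame cannot wipe fields by accident; clearing a field requires opting in with `clear_nulls=True`.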

## Test Results

```
396 passed, 8 warnings (pre-existing deprecation), 4 subtests passed
```

| Check | Result |
|-------|--------|
| Full test suite | 396 pass, 0 fail |
| mypy | 0 errors |
| black / isort | Clean |
| E2E prodev | PASS (4 tables, 3 relationships, 13 records, full CRUD cycle) |
| E2E datascience | PASS (7 accounts, 3 cases, 8 opportunities, risk scoring, charts) |
| PR review threads | 54/54 resolved |

## Issues Addressed
- Fixes microsoft#148: `tables.create()` now exposes `primary_name_attribute` via
Dataverse metadata
- Followup microsoft#147: `QueryBuilder.to_dataframe()` tracked for future work

---------

Co-authored-by: Saurabh Badenkal <sbadenkal@microsoft.com>
Updates CHANGELOG.md for v0.1.0b7 release

Co-authored-by: Abel Milash <abelmilash@microsoft.com>
Bumps version to 0.1.0b8 for next development cycle

Co-authored-by: Abel Milash <abelmilash@microsoft.com>
…rosoft#153)

## Summary

Fix broken cross-references (`<xref:of>`, `<xref:mapping>`, `<xref:to>`)
in the Microsoft Learn API reference docs caused by Sphinx-style
`:type:` and `:rtype:` directives.

## Problem

The Learn doc pipeline processes `:type:` and `:rtype:` Sphinx
directives differently from standard Sphinx. Every word between
`:class:` back-tick references is treated as a separate cross-reference.
For example:

```
:rtype: :class:`list` of :class:`str`
```

Becomes:

```yaml
types:
- <xref:list> <xref:of> <xref:str>
```

There is no type `of` in the Learn cross-reference database, so it
renders as a broken link on the published page.

## Root Cause

This was introduced in commit f0e8987 ("sphinx doc string", 2025-11-17)
which converted the original bracket-notation docstrings (`list[str]`,
`dict or list[dict]`) to Sphinx-style `:class:` syntax. Later commits
that added new APIs (operation namespaces, dataframe, etc.) perpetuated
the same broken pattern.

## Fix

Replaced all 42 occurrences across 6 source files with Python bracket
notation that the Learn pipeline handles correctly:

| Before (broken) | After (correct) |
|---|---|
| `:class:\`list\` of :class:\`str\`` | `list[str]` |
| `:class:\`dict\` or :class:\`list\` of :class:\`dict\`` | `dict or list[dict]` |
| `:class:\`collections.abc.Iterable\` of :class:\`list\` of :class:\`dict\`` | `collections.abc.Iterable[list[dict]]` |
| `:class:\`dict\` mapping :class:\`str\` to :class:\`typing.Any\`` | `dict[str, typing.Any]` |

### Files changed

**Docstring fixes:**
- `src/PowerPlatform/Dataverse/client.py` (14 occurrences)
- `src/PowerPlatform/Dataverse/operations/records.py` (14 occurrences)
- `src/PowerPlatform/Dataverse/operations/tables.py` (7 occurrences)
- `src/PowerPlatform/Dataverse/operations/dataframe.py` (3 occurrences)
- `src/PowerPlatform/Dataverse/operations/query.py` (1 occurrence)
- `src/PowerPlatform/Dataverse/models/table_info.py` (3 occurrences)

**Prevention guidelines:**
- `.claude/skills/dataverse-sdk-dev/SKILL.md` -- added "Docstring Type
Annotations (Microsoft Learn Compatibility)" section
- `src/PowerPlatform/Dataverse/claude_skill/dataverse-sdk-dev/SKILL.md`
-- same section (kept both copies in sync)

## Testing

- All 398 unit tests pass
- Verified zero remaining occurrences of the broken pattern via regex
scan
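
The exact regex used for that scan is not given here. One plausible version, offered as an assumption rather than the author's pattern, flags `:type:` and `:rtype:` lines in which a bare word sits between two `:class:` references:

```python
import re
from typing import List, Tuple

# Assumed pattern: a :type ...: or :rtype: directive containing
# :class:`X` <bare word> :class:`Y`, which Learn renders as a broken xref.
BROKEN = re.compile(r":(?:type[^:]*|rtype):.*:class:`[^`]+`\s+\w+\s+:class:`")


def find_broken_type_lines(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, stripped_line) for each directive with the broken pattern."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), 1)
        if BROKEN.search(line)
    ]
```

Running it over each source file and asserting an empty result would give a cheap regression check; `:param:` lines are ignored on purpose because the problem described above is specific to type directives.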

---------

Co-authored-by: Saurabh Badenkal <sbadenkal@microsoft.com>
## Summary

Add end-to-end relationship tests that validate the full relationship
API lifecycle against a live Dataverse environment. This is the primary
pre-GA validation for Tim's relationship PRs (microsoft#88, microsoft#105, microsoft#114).

## Changes

### New: `tests/e2e/test_relationships_e2e.py`
11 curated e2e tests covering:

| Test Class | Tests | Coverage |
|---|---|---|
| `TestOneToManyCore` | 1 | Full 1:N lifecycle: create, get, delete, field assertions |
| `TestLookupField` | 1 | Convenience `create_lookup_field` to system table |
| `TestManyToMany` | 2 | N:N lifecycle + nonexistent returns None |
| `TestDataThroughRelationships` | 4 | `@odata.bind`, `$expand`, `$filter` on lookup, update binding |
| `TestCascadeBehaviors` | 2 | Restrict blocks delete; Cascade deletes children |
| `TestTypeDetection` | 1 | `get_relationship` distinguishes 1:N vs N:N |

### Updated: `examples/basic/functional_testing.py`
- Added relationship testing section covering 1:N core API, convenience
API, N:N, get, and delete
- Added relationship imports and retry helpers

### Updated: `pyproject.toml`
- Added `[tool.pytest.ini_options]` with `testpaths = ["tests/unit"]`
- Default `pytest` runs only unit tests; e2e tests require explicit
invocation

## How to run e2e tests

```bash
# Set your Dataverse org URL
export DATAVERSE_URL=https://yourorg.crm.dynamics.com

# Run relationship e2e tests
pytest tests/e2e/ -v -s
```

The tests authenticate via `InteractiveBrowserCredential` and
create/delete temporary tables (prefixed `test_E2E*`).

## E2E Test Results (from `.scratch/` comprehensive suite)

Ran 30 tests against `https://aurorabapenv71aff.crm10.dynamics.com`:
- 25/30 passed on first run
- 5 failures were test bugs (not SDK bugs), all fixed:
  - Metadata propagation timing (increased retries)
  - Navigation property name casing (`$expand` needs server-assigned nav prop)
  - `IsValidForAdvancedFind` requires `BooleanManagedProperty` complex type

## Finding: SDK inconsistency to address before GA

`create_one_to_many_relationship()` returns `lookup_schema_name` as the
user-provided SchemaName, but `$expand` requires the server-assigned
`ReferencingEntityNavigationPropertyName` (which may differ in casing).
The e2e tests work around this by calling `get_relationship()` after
create to get the correct nav prop name. This should be harmonized
before GA.

## Checklist

- [x] 398 unit tests pass
- [x] 11 e2e tests collected by pytest
- [x] Default `pytest` excludes e2e (runs unit only)
- [x] Code formatted with black
- [x] Branch rebased on origin/main

---------

Co-authored-by: Saurabh Badenkal <sbadenkal@microsoft.com>
…icrosoft#118)

## Summary

Implements the QueryBuilder feature from the SDK redesign design doc
(ADO PR 1504429):

- **Fluent query builder** via `client.query.builder("table")` with 20
chainable methods including `select`, `filter_eq/ne/gt/ge/lt/le`,
`filter_contains/startswith/endswith`, `filter_in`, `filter_between`,
`filter_null/not_null`, `filter_raw`, `where`, `order_by`, `top`,
`page_size`, `expand`, and `execute`
- **Composable filter expression tree** (`models/filters.py`) with
Python operator overloads (`&`, `|`, `~`) for AND, OR, NOT composition
- **Value auto-formatting** for `str`, `int`, `float`, `bool`, `None`,
`datetime`, `date`, `uuid.UUID`
- 126 new unit tests (57 filters + 69 query builder), 309 total passing

### Usage examples

```python
# Fluent builder
for page in (client.query.builder("account")
             .select("name", "revenue")
             .filter_eq("statecode", 0)
             .filter_gt("revenue", 1000000)
             .order_by("revenue", descending=True)
             .top(100)
             .page_size(50)
             .execute()):
    for record in page:
        print(record["name"])

# Composable expression tree with where()
from PowerPlatform.Dataverse.models.filters import eq, gt, filter_in

for page in (client.query.builder("account")
             .where((eq("statecode", 0) | eq("statecode", 1))
                    & gt("revenue", 100000))
             .execute()):
    for record in page:
        print(record["name"])
```

### Design decisions
- **Regular class, not dataclass** — prevents leaking internal state as
constructor params
- **Unified `_filter_parts` list** — preserves call order when mixing
`filter_*()` and `where()`
- **`execute()` calls `build()` internally** — single source of truth
for filter compilation
- **No public `get()` on QueryOperations** — only `builder()` added;
paginated queries remain on `records.get()`
- **Parenthesized `filter_between`** — `(col ge low and col le high)`
for correct precedence
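
As an illustration of how operator overloads can build such an expression tree, here is a compact sketch. The class names (`Expr`, `Compare`, `BoolOp`, `Not`) and the always-parenthesize rule are assumptions; the real `models/filters.py` may be structured differently and handles more value types.

```python
from dataclasses import dataclass
from typing import Any


class Expr:
    """Base node; the operator overloads build a tree instead of evaluating."""

    def __and__(self, other: "Expr") -> "Expr":
        return BoolOp("and", self, other)

    def __or__(self, other: "Expr") -> "Expr":
        return BoolOp("or", self, other)

    def __invert__(self) -> "Expr":
        return Not(self)

    def to_odata(self) -> str:
        raise NotImplementedError


def format_value(value: Any) -> str:
    """Format a Python value as an OData literal (None, bool, str, numbers only)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"  # escape single quotes
    return str(value)


@dataclass(frozen=True)
class Compare(Expr):
    column: str
    op: str
    value: Any

    def to_odata(self) -> str:
        return f"{self.column} {self.op} {format_value(self.value)}"


@dataclass(frozen=True)
class BoolOp(Expr):
    op: str
    left: Expr
    right: Expr

    def to_odata(self) -> str:
        # Always parenthesize so the output never relies on operator precedence.
        return f"({self.left.to_odata()} {self.op} {self.right.to_odata()})"


@dataclass(frozen=True)
class Not(Expr):
    operand: Expr

    def to_odata(self) -> str:
        return f"not ({self.operand.to_odata()})"


def eq(column: str, value: Any) -> Expr:
    return Compare(column, "eq", value)


def gt(column: str, value: Any) -> Expr:
    return Compare(column, "gt", value)
```

Because Python's `&` and `|` bind tighter than comparison operators, the tree is built from function calls such as `eq(...)` rather than `==`, which is why the usage example above wraps each side in parentheses.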

### Files changed
| File | Description |
|------|-------------|
| `src/.../models/filters.py` | **NEW** — Composable expression tree |
| `src/.../models/query_builder.py` | **NEW** — Fluent QueryBuilder class |
| `src/.../operations/query.py` | Add `builder()` to QueryOperations |
| `src/.../models/__init__.py` | Updated docstring |
| `tests/.../models/test_filters.py` | **NEW** — 57 filter tests |
| `tests/.../models/test_query_builder.py` | **NEW** — 69 builder tests |
| `tests/.../test_query_operations.py` | 6 new integration tests |

### Merge conflict note
`operations/query.py` may conflict with PR microsoft#115 (typed return models) —
resolution is straightforward since we only add a `builder()` method.

## Test plan
- [x] `pytest tests/unit/models/test_filters.py` — 57 passed
- [x] `pytest tests/unit/models/test_query_builder.py` — 69 passed
- [x] `pytest tests/unit/test_query_operations.py` — 9 passed
- [x] `pytest tests/` — 309 passed, 0 failed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: tpellissier <tpellissier@microsoft.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Abel Milash <abelmilash@microsoft.com>
Co-authored-by: Saurabh Badenkal <sbadenkal@microsoft.com>
Co-authored-by: Saurabh Ravindra Badenkal <32964911+saurabhrb@users.noreply.github.com>
…microsoft#129)

## Summary

- Adds `client.batch` namespace -- a deferred-execution batch API that packs multiple Dataverse Web API operations into a single `POST $batch` HTTP request
- Adds `client.batch.dataframe` namespace -- pandas DataFrame wrappers for batch operations
- Adds `client.records.upsert()` and `client.batch.records.upsert()` backed by the `UpsertMultiple` bound action with alternate-key support
- Fixes a bug where alternate key fields were merged into the UpsertMultiple request body, causing `400 Bad Request` on the create path

## Batch API Design

Implements the [Batch API
Design](microsoft#129 (comment))
spec from @sagebree:

| Capability | How to use | Status |
|---|---|---|
| Record CRUD (create / update / delete / get) | `batch.records.*` | Done |
| Upsert by alternate key | `batch.records.upsert(...)` | Done |
| Table metadata (create / delete / columns / relationships) | `batch.tables.*` | Done |
| SQL queries | `batch.query.sql(...)` | Done |
| Atomic write groups | `batch.changeset()` | Done |
| Continue past failures | `batch.execute(continue_on_error=True)` | Done |
| DataFrame integration | `batch.dataframe.create/update/delete` | Done (new) |

**Design constraints enforced:**
- Maximum 1000 operations per batch (validated before sending)
- `records.get` paginated overload not supported -- single-record only
- GET operations cannot be placed inside a changeset (enforced by API
design)
- Content-ID references are only valid within the same changeset
- File upload operations not batchable
- `tables.create` returns no table metadata on success (HTTP 204)
- `tables.add_columns` / `tables.remove_columns` do not flush the
picklist cache
- `client.flush_cache()` not supported in batch (client-side operation)

## What's included

### New: `client.batch` API
- `batch.records.create / get / update / delete / upsert`
- `batch.tables.create / get / list / add_columns / remove_columns /
delete`
- `batch.tables.list(filter=..., select=...)` -- parity with
`client.tables.list()` from microsoft#112
- `batch.tables.create_one_to_many_relationship /
create_many_to_many_relationship / delete_relationship /
get_relationship / create_lookup_field`
- `batch.query.sql`
- `batch.changeset()` context manager for transactional (all-or-nothing)
operations
- Content-ID reference chaining inside changesets (globally unique
across all changesets via shared counter)
- `execute(continue_on_error=True)` for mixed success/failure batches
- `BatchResult` with `.responses`, `.succeeded`, `.failed`,
`.created_ids`, `.has_errors`

### New: `client.batch.dataframe` API
- `batch.dataframe.create(table, df)` -- DataFrame rows to
CreateMultiple batch item
- `batch.dataframe.update(table, df, id_column)` -- DataFrame rows to
update batch items
- `batch.dataframe.delete(table, ids_series)` -- pandas Series to delete
batch items

### Existing: Refactored existing APIs
- Payload generation shared between batch and direct API via `_build_*`
/ `_RawRequest` pattern
- Execution of batch operations deferred to `execute()`

### OData $batch spec compliance
- Audited against [Microsoft Learn
docs](https://learn.microsoft.com/en-us/power-apps/developer/data-platform/webapi/execute-batch-operations-using-web-api)
- `Content-Transfer-Encoding: binary` per part
- `Content-Type: application/http` per part
- `Content-Type: application/json; type=entry` for POST/PATCH bodies
- CRLF line endings throughout
- Absolute URLs in batch parts
- Empty changesets silently skipped (prevents invalid multipart)
- Top-level batch error handling (non-multipart 4xx/5xx raises
`HttpError` with parsed Dataverse error details)
- Accepts `200`, `202 Accepted`, `207 Multi-Status`, and `400` batch
response codes
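
The per-part headers and CRLF rules listed above can be sketched as follows. `serialize_batch` is a hypothetical function covering standalone parts only (no changesets, no Content-ID), not the SDK's serializer, and the example URLs are placeholders.

```python
from typing import List, Optional, Tuple

CRLF = "\r\n"


def serialize_batch(
    requests: List[Tuple[str, str, Optional[str]]],
    boundary: str,
) -> str:
    """Serialize (method, absolute_url, json_body) tuples into a multipart/mixed body."""
    lines: List[str] = []
    for method, url, body in requests:
        lines.append(f"--{boundary}")
        lines.append("Content-Type: application/http")
        lines.append("Content-Transfer-Encoding: binary")
        lines.append("")  # blank line ends the part headers
        lines.append(f"{method} {url} HTTP/1.1")
        if body is not None:
            lines.append("Content-Type: application/json; type=entry")
            lines.append("")  # blank line ends the inner request headers
            lines.append(body)
        else:
            lines.append("")  # bodiless request still ends with a blank line
    lines.append(f"--{boundary}--")  # closing delimiter
    return CRLF.join(lines) + CRLF
```

The resulting string would be sent with an outer `Content-Type: multipart/mixed; boundary=<boundary>` header; joining on `CRLF` in one place is a simple way to guarantee no bare `\n` slips into the payload.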

### Review comment fixes
- Fixed `expected` status codes to include `202`/`207` for all Dataverse
environments
- Fixed `_split_multipart` / `_parse_mime_part` return type annotations:
`List[Tuple[Dict[str, str], str]]`
- Fixed OptionSet string check regression: now uses dict key lookup
instead of JSON string search
- Fixed `_build_get` to lowercase select column names (consistency with
`_get_multiple`)
- Added RFC 3986 `%20` encoding documentation in `_build_sql` docstring
- Fixed content-id response parsing for non-changeset parts
- Fixed test assertions after merge: `data` bytes instead of `json`
kwarg
- Exception type parity: `batch.records.upsert()` raises `TypeError`
(matching `client.records.upsert()`)

### Testing

**Unit tests -- 579 tests passing:**
- `test_batch_operations.py` -- BatchRequest, BatchRecordOperations,
BatchTableOperations, BatchQueryOperations, ChangeSet,
BatchItemResponse, BatchResult
- `test_batch_serialization.py` -- multipart serialization, response
parsing, intent resolution, upsert dispatch, batch size limit,
content-ID uniqueness, top-level error handling
- `test_batch_edge_cases.py` -- 40 edge case tests: empty changeset,
changeset rollback, content-ID in standalone parts, mixed batch,
multiple changesets, batch size limits, top-level errors,
continue-on-error, serialization compliance, multipart parsing,
content-ID references, intent validation
- `test_batch_dataframe.py` -- 18 tests: DataFrame create/update/delete,
validation, NaN handling, empty series, bulk delete
- `test_odata_internal.py` -- `_build_upsert_multiple` body exclusion,
conflict detection, URL/method correctness

**E2E tests -- 14 tests passing against live Dataverse
(`crm10.dynamics.com`):**
1. Basic batch CRUD (single create + CreateMultiple, update, get,
delete)
2. Changeset happy path (create + update via `$ref` content-ID)
3. Changeset rollback (failing op rolls back entire changeset)
4. Multiple changesets (globally unique content-IDs)
5. Continue-on-error (mixed success/failure)
6. Batch SQL query
7. Batch tables.get + tables.list
8. DataFrame batch create
9. DataFrame batch update
10. DataFrame batch delete
11. Mixed batch (changeset + standalone GET)
12. Empty changeset (silently skipped)
13. Content-ID chaining (2 creates + 2 updates via `$ref`)
14. Table setup/teardown

### Examples & docs
- `examples/advanced/batch.py` -- reference examples for all batch
operation types
- `examples/advanced/walkthrough.py` -- batch section added (section 11)
- `examples/basic/functional_testing.py` --
`test_batch_all_operations()` covering all operation categories against
a live environment

---------

Co-authored-by: Samson Gebre <sagebree@microsoft.com>
Co-authored-by: Saurabh Badenkal <sbadenkal@microsoft.com>
…a fetch (microsoft#154)

## Summary

Reduces API calls during picklist label-to-integer resolution by
fetching all picklist attributes and their options for the entire table
in a single API call using the `PicklistAttributeMetadata` cast, instead
of checking each attribute individually. Results are cached with a
1-hour TTL.

## Changes

**`src/PowerPlatform/Dataverse/data/_odata.py`**
- Add `_bulk_fetch_picklists()` — single API call to fetch all picklist
attributes and their options for a table
- Add `_request_metadata_with_retry()` — exponential backoff on
transient metadata errors
- Simplify `_convert_labels_to_ints()` — calls `_bulk_fetch_picklists`
then resolves labels from cache
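
As a rough illustration of the retry behaviour described for `_request_metadata_with_retry()`, the sketch below shows plain exponential backoff. The exception class, attempt count, and delays are assumptions made for the example, not values taken from the SDK.

```python
import time


class TransientMetadataError(Exception):
    """Hypothetical stand-in for a retryable metadata failure."""


def request_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry on transient errors with exponential backoff.

    The wait doubles after each failed attempt: base, 2*base, 4*base, ...
    The last failure is re-raised so callers still see the error.
    """
    for attempt in range(max_attempts):
        try:
            return send()
        except TransientMetadataError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.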

**`tests/unit/data/test_odata_internal.py`**
- Rewrite `TestPicklistLabelResolution` class with 50 unit tests
covering `_bulk_fetch_picklists`, `_request_metadata_with_retry`,
`_convert_labels_to_ints`, integration through
`_create`/`_update`/`_upsert`, and edge cases

**`examples/advanced/walkthrough.py`**
- Add picklist label update test to Section 10 (verifies both create and
update with string labels)

## Performance impact

Cold-cache API calls drop from `n + p` to a constant `1`, where `n` is
the number of string fields and `p` is the number of picklist fields.

| Picklist Columns | Before Calls | Before Time | After Calls | After Time | Speedup |
|-----------------|-------------|-------------|-------------|------------|---------|
| 1 | 2 | 0.6s | 1 | 0.3s | 2x |
| 10 | 11 | 3.3s | 1 | 0.4s | 9x |
| 100 | 101 | 34s | 1 | 0.6s | 55x |
| 250 | 251 | 79s | 1 | 1.2s | 64x |
| 400 | 401 | 119s | 1 | 1.3s | 92x |

Repeat operations use a 1-hour TTL cache (0 API calls, <5ms).
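
The caching idea can be sketched as follows. This is an assumption-laden illustration rather than the SDK's code: `PicklistCache`, `fetch_table`, and the shape of the cached mapping are hypothetical, and label matching is exact here.

```python
import time


class PicklistCache:
    """Per-table cache of picklist options with a time-to-live.

    fetch_table(table) is a hypothetical callable that returns
    {attribute_name: {label: int_value}} for every picklist on the table,
    standing in for the single bulk metadata request.
    """

    def __init__(self, fetch_table, ttl_seconds=3600, clock=time.monotonic):
        self._fetch_table = fetch_table
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # table -> (expires_at, options)

    def options_for(self, table):
        now = self._clock()
        entry = self._entries.get(table)
        if entry is None or entry[0] <= now:
            # Cold or expired: one fetch covers all picklists on the table.
            entry = (now + self._ttl, self._fetch_table(table))
            self._entries[table] = entry
        return entry[1]

    def convert_labels(self, table, record):
        """Return a copy of record with known picklist labels as integers."""
        options = self.options_for(table)
        resolved = dict(record)
        for name, value in record.items():
            if isinstance(value, str) and name in options:
                resolved[name] = options[name].get(value, value)
        return resolved
```

Strings on non-picklist columns, and labels that do not match any option, pass through unchanged in this sketch.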

## Testing

- 660 unit tests passing
- Performance benchmarks verified against live Dataverse environment (5
runs each)

---------

Co-authored-by: Abel Milash <abelmilash@microsoft.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary

Adds support for `"memo"` (or `"multiline"`) column type, enabling
creation of multiline text columns on Dataverse tables. Users can
specify `"memo"` as a column type in `client.tables.create()` and
`client.tables.add_columns()`.

## Changes

**`src/PowerPlatform/Dataverse/data/_odata.py`**
- Add `"memo"` / `"multiline"` handling in `_attribute_payload()`,
generating `MemoAttributeMetadata` with `MaxLength: 4000`, `Format:
Text`, `ImeMode: Auto`
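
A payload along these lines would match the description above. This is a sketch only: `memo_attribute_payload` is a hypothetical name, the display-label structure follows the usual Dataverse Web API label shape, and the real `_attribute_payload()` may set additional fields.

```python
def memo_attribute_payload(schema_name, display_label, language_code=1033):
    """Build an illustrative MemoAttributeMetadata body for a memo column."""
    return {
        "@odata.type": "Microsoft.Dynamics.CRM.MemoAttributeMetadata",
        "SchemaName": schema_name,
        "DisplayName": {
            "@odata.type": "Microsoft.Dynamics.CRM.Label",
            "LocalizedLabels": [
                {
                    "@odata.type": "Microsoft.Dynamics.CRM.LocalizedLabel",
                    "Label": display_label,
                    "LanguageCode": language_code,
                }
            ],
        },
        # Values stated in this change: 4000 characters, plain text, auto IME.
        "MaxLength": 4000,
        "Format": "Text",
        "ImeMode": "Auto",
    }


payload = memo_attribute_payload("new_Notes", "Notes")
```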

**`src/PowerPlatform/Dataverse/operations/tables.py`**
- Document `"memo"` / `"multiline"` in `create()` docstring

**`examples/advanced/walkthrough.py`**
- Add `"new_Notes": "memo"` column to walkthrough table
- Include memo field in record creation with multiline content
- Read back and display memo field in Section 4
- Update memo with new multiline content in Section 5

**`tests/unit/data/test_odata_internal.py`**
- `test_memo_type()` — validates MemoAttributeMetadata payload
- `test_multiline_alias()` — validates `"multiline"` produces identical
result

**`tests/unit/test_tables_operations.py`**
- `test_add_columns_memo()` — validates memo type through
`add_columns()`

**`README.md`** / **SKILL docs**
- List `"memo"` in supported column types

## Testing

- 620 unit tests passing
- E2E memo walkthrough verified against live Dataverse (10 assertions):
  multiline create/read/update, empty string, None, special characters,
  long text (4000 chars), memo not mistaken for picklist label,
  triple-quoted strings, clearing memo to None

---------

Co-authored-by: Abel Milash <abelmilash@microsoft.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary

Adds comprehensive unit tests across 13 test files and enables coverage
reporting and enforcement in both CI pipelines and `pyproject.toml`.
Coverage increased from 72% to 93%.

Coverage enforcement is now built into both CI pipelines and the project
config: overall coverage must stay at or above 90% (`fail_under = 90` in
`pyproject.toml`), and any new code introduced in a PR must also meet
90% (`diff-cover`).

## Test Results

- **1127 passed**, 0 failures
- **93% line coverage** (up from ~72%)

## Changes

### Configuration (2 files)
- `pyproject.toml`: Added `[tool.coverage.run]` (`source =
["src/PowerPlatform"]`) and `[tool.coverage.report]` (`fail_under = 90`,
`show_missing = true`) to centralize coverage config
- `pyproject.toml`: Added `diff-cover` dev dependency for PR-level
coverage enforcement

### CI Pipelines (2 files)
- `.github/workflows/python-package.yml`: Added `PYTHONPATH=src pytest
--cov --cov-report=xml --junitxml=test-results.xml`, `diff-cover` step
to enforce 90% on new changes, `fetch-depth: 0` for full git history,
and artifact upload steps
- `.azdo/ci-pr.yaml`: Same pytest and diff-cover steps, added
`PublishTestResults@2` and `PublishCodeCoverageResults@2` tasks

### New Test Files (6 files)
- `test_auth.py`: Credential validation, token acquisition (2 tests)
- `test_http_client.py`: Timeout selection, session routing, retry
behavior (15 tests)
- `test_http_errors.py`: Error response parsing, correlation IDs,
transient detection (9 tests)
- `test_upload.py`: File upload orchestrator, small upload, chunked
streaming (50+ tests)
- `test_relationships.py`: 1:N and M:N relationship CRUD (25+ tests)
- `test_records_operations.py`: Public records API delegation (30+
tests)

### Expanded Test Files (7 files)
- `test_odata_internal.py`: 36 test classes, 219 tests covering all
`_ODataClient` methods — CRUD, upsert, bulk operations, metadata,
caching, picklist resolution, pagination, SQL queries, alternate keys,
column management. Includes kwarg correctness audit (`json=` vs
`data=`), response content validation, and assertion strengthening.
- `test_batch_serialization.py`: 7 new test classes (26 tests) —
dispatch routing for all intent types, changeset validation, metadata
resolution, continue-on-error header, MIME parsing edge cases
- `test_query_builder.py`, `test_table_info.py`, `test_client.py`,
`test_context_manager.py`, `test_records_operations.py`: Docstrings
added to existing tests

---------

Co-authored-by: Abel Milash <abelmilash@microsoft.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Incorporate latest changes from microsoft/PowerPlatform-DataverseClient-Python:
- Batch API with changeset, upsert, and DataFrame integration (microsoft#129)
- Optimize picklist label resolution with bulk fetch (microsoft#154)
- Add memo/multiline column type support (microsoft#155)
- Add unit test coverage and CI coverage reporting (microsoft#158)

Made-with: Cursor