
feat: add auto-generated Python data structures from ModelPack schema#184

Open
rishi-jat wants to merge 3 commits into modelpack:main from rishi-jat:feat/python-api-from-schema

Conversation

@rishi-jat

@rishi-jat rishi-jat commented Mar 13, 2026

Summary

This PR adds auto-generated Python data structures for the ModelPack specification so downstream Python projects can import the spec types directly instead of copying them manually.

Example usage:

```python
from model_spec.v1 import Model

model = Model.model_validate_json(json_payload)
print(model.descriptor.name)
```

Implementation

Python models are generated from the canonical schema: schema/config-schema.json

Generation uses datamodel-code-generator and produces Pydantic models.

Added files:

  • py/model_spec/v1/models.py – generated Python models
  • py/model_spec/v1/__init__.py – exposes the public API
  • tools/generate_python_models.py – script to regenerate models
  • py/README.md – documentation for usage and regeneration

A Makefile target was added to regenerate the models: make generate-python-api
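For reference, a minimal sketch of what such a generation script might look like — the exact flags, paths, and entry point here are assumptions for illustration, not the committed tools/generate_python_models.py:

```python
# Hypothetical sketch of a generation script (assumption: the real script's
# flags and paths may differ). Invokes datamodel-code-generator to produce
# Pydantic v2 models from the canonical JSON schema.
import subprocess
import sys
from pathlib import Path

SCHEMA = Path("schema/config-schema.json")
OUTPUT = Path("py/model_spec/v1/models.py")

def build_command(schema: Path = SCHEMA, output: Path = OUTPUT) -> list:
    """Assemble the datamodel-code-generator command line."""
    return [
        sys.executable, "-m", "datamodel_code_generator",
        "--input", str(schema),
        "--input-file-type", "jsonschema",
        "--output", str(output),
        "--output-model-type", "pydantic_v2.BaseModel",
    ]

if __name__ == "__main__":
    subprocess.run(build_command(), check=True)
```

Keeping the invocation in a script (rather than only in the Makefile) lets Windows users and CI jobs regenerate the models without make.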

Validation

  • go test ./... passes
  • Python import works: from model_spec.v1 import Model
  • Model generation works: make generate-python-api

fixes #138

Signed-off-by: Rishi Jat <rishijat098@gmail.com>
Copilot AI review requested due to automatic review settings March 13, 2026 08:51
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement by providing auto-generated Python data structures for the ModelPack specification. This change streamlines the development process for Python projects that interact with ModelPack, ensuring consistency and reducing the need for manual type definition. It establishes a robust mechanism for keeping Python models synchronized with the core schema, thereby improving developer experience and reducing potential errors.

Highlights

  • Python Data Structures: Auto-generated Python data structures for the ModelPack specification have been added, allowing downstream Python projects to import spec types directly.
  • Pydantic Model Generation: Pydantic models are generated from the canonical schema/config-schema.json using datamodel-code-generator.
  • New Files Introduced: Key new files include py/model_spec/v1/models.py (generated models), py/model_spec/v1/__init__.py (public API exposure), tools/generate_python_models.py (generation script), and py/README.md (documentation).
  • Makefile Integration: A new Makefile target, generate-python-api, has been added to simplify the regeneration of these Python models.
Changelog
  • Makefile
    • Added a new generate-python-api target to automate the generation of Python API models from the JSON schema.
  • py/README.md
    • Added a new README file providing documentation for the Python ModelPack types, including usage examples and instructions for regeneration.
  • py/model_spec/v1/__init__.py
    • Added a new __init__.py file to expose the generated Pydantic models as part of the model_spec.v1 package.
  • py/model_spec/v1/models.py
    • Added a new models.py file containing the auto-generated Pydantic data structures for the ModelPack specification, derived from config-schema.json.
  • tools/generate_python_models.py
    • Added a new Python script responsible for executing datamodel-code-generator to generate the Pydantic models from the JSON schema.
Activity
  • No human activity has been recorded on this pull request yet.

@rishi-jat
Author

/cc @bergwolf
/cc @gorkem
/cc @aftersnow
/cc @chlins

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This PR introduces auto-generated Python data structures for the ModelPack specification, which is a great addition for Python users. My review focuses on improving the developer experience and correctness of the new Python package. I've suggested making the Python models more idiomatic by using snake_case for field names, improving the documentation to include dependency installation, and ensuring all necessary types are exported from the package.


Copilot AI left a comment


Pull request overview

Adds an auto-generated Python API (Pydantic models) for the canonical ModelPack JSON Schema so downstream Python consumers can import spec-aligned types directly.

Changes:

  • Add a generator script (datamodel-code-generator) to produce Pydantic v2 models from schema/config-schema.json.
  • Commit generated Python models and a small model_spec.v1 import surface.
  • Add Python usage/regeneration docs and a Makefile target to regenerate the models.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

  • tools/generate_python_models.py – Adds a CLI script to generate Pydantic v2 models from the canonical JSON schema.
  • py/model_spec/v1/models.py – Introduces generated Pydantic models representing the v1 schema types.
  • py/model_spec/v1/__init__.py – Exposes a small public API surface for v1 model imports.
  • py/README.md – Documents intended usage and regeneration workflow for the Python types.
  • Makefile – Adds a generate-python-api target to regenerate the Python models.


@aftersnow
Contributor

Thanks for working on this. The schema-driven generation approach here looks useful, but it overlaps quite a bit with #175. I would suggest collaborating with #175 so we can converge on one Python SDK path and avoid maintaining two parallel Python APIs.

@rishi-jat
Author

After reviewing #175 and the current repo structure, here is my assessment.

The Python SDK in #175 is functional and well put together, but it introduces a second representation of the ModelPack specification by manually defining dataclasses that mirror the schema and Go types.

In this repository, schema/config-schema.json is already the canonical source of truth. The Go implementation aligns with that schema. Recreating the same structure manually in Python means the spec now exists in multiple places:

  • JSON schema (canonical)
  • Go structs
  • Python dataclasses (this PR)

This creates a maintenance issue:

  • Any schema change requires manual updates in Python
  • There is no guarantee the Python layer stays aligned with the schema
  • Drift between schema and SDK becomes likely over time

I also noticed:

  • Schema loading is path-based, which is fragile for packaging/install use
  • Validator correctness depends on strict schema dialect alignment
  • Serialization logic is manually trying to mirror Go behavior, which is hard to maintain as the spec evolves
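On the path-based schema loading point: a packaging-safe alternative is to ship the schema as package data and read it via importlib.resources. A minimal sketch, assuming a hypothetical layout in which config-schema.json is bundled inside a model_spec package:

```python
# Sketch: packaging-safe schema loading via importlib.resources rather than a
# filesystem path. Assumes a hypothetical layout where config-schema.json is
# shipped as package data inside the model_spec package.
import json
from importlib import resources

def load_schema(package: str = "model_spec",
                name: str = "config-schema.json") -> dict:
    """Read the bundled JSON schema from package data."""
    with resources.files(package).joinpath(name).open("r", encoding="utf-8") as f:
        return json.load(f)
```

This keeps the loader working from wheels, editable installs, and zipapps alike, since it never assumes a source checkout on disk.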

From what I see in the code and repo design, the intended direction is schema-first rather than handwritten models.

An alternative that seems more aligned with the current design would be to keep the schema as the single source of truth and derive Python types from it, instead of maintaining handwritten spec types in parallel. Supporting pieces like validation, tests, and packaging can still be layered on top of that.

@aftersnow I'd like to check your view on this:

Would it make more sense to avoid maintaining handwritten spec types here and instead keep the schema as the single source of truth, with Python types derived from it?

@pradhyum6144
Contributor

@rishi-jat these are valid points and I agree with most of them.

You're right that handwritten dataclasses create a maintenance burden. I already acknowledged this in my earlier reply; I'm happy to adopt schema-driven generation as the base and layer my contributions on top.

Here's what I think a combined approach looks like:

From #184 (your PR): schema-driven auto-generation via datamodel-code-generator. This becomes the foundation for Python types, keeping them in sync with config-schema.json automatically.

From #175 (my PR):

  • Validator with correct dialect alignment (Draft4Validator matching the schema's draft-04 declaration)
  • Test suite (64 tests covering serialization, validation, and edge cases)
  • pyproject.toml packaging
  • CI workflow (PR #180: pytest across Python 3.10–3.13 plus ruff linting)

I can rebase my PR to use your generated types as the base and keep the validator/tests/packaging on top. Or we can merge into one joint PR, whichever works best for you and the maintainers.
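To illustrate the dialect-alignment point above, a validator pinned to draft-04 could look roughly like this — a hedged sketch using the jsonschema package, not the actual #175 code:

```python
# Sketch of dialect-aligned validation (assumption: the real #175 validator
# may differ). Uses jsonschema's Draft4Validator so the validator dialect
# matches the schema's declared draft-04 $schema, instead of letting the
# library guess a newer draft.
import jsonschema

def validate_instance(instance: dict, schema: dict) -> list:
    """Return a list of validation error messages (empty when valid)."""
    validator = jsonschema.Draft4Validator(schema)
    return [error.message for error in validator.iter_errors(instance)]
```

Pinning the draft matters because keywords like exclusiveMinimum changed meaning between draft-04 and later drafts, so validating with the wrong dialect can silently accept invalid documents.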

@pradhyum6144
Contributor

@aftersnow which would you prefer: two layered PRs or one combined effort?

@rishi-jat
Author

rishi-jat commented Mar 22, 2026

> I can rebase my PR to use your generated types as the base and keep the validator/tests/packaging on top. Or we can merge into one joint PR, whatever works best for you and the maintainers.

This direction makes sense to me.

This PR keeps the schema as the source of truth and generates Python models from it. The other PR’s validator, tests, and packaging can be layered on top instead of maintaining separate handwritten types.

@aftersnow does this approach look right?



Development

Successfully merging this pull request may close these issues.

Is it possible to auto generate Python APIs?

4 participants