fix(schema): make validateConfig validate actual Model structure#191
fix(schema): make validateConfig validate actual Model structure#191rishi-jat wants to merge 1 commit intomodelpack:mainfrom
Conversation
Signed-off-by: Rishi Jat <rishijat098@gmail.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request rectifies a critical flaw in the schema validation process where the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly addresses a significant bug in validateConfig where it was validating against an incorrect data structure, rendering the validation ineffective. The fix, which involves unmarshalling to the proper v1.Model and adding essential structural checks, is sound. The accompanying tests are well-written and cover both failure and success scenarios for the new validation logic. I have one suggestion to refactor the new test function to improve its structure and maintainability.
| func TestValidateConfigParsesModelNotModelConfig(t *testing.T) { | ||
| // This test verifies that validateConfig correctly parses the full Model structure, | ||
| // not just ModelConfig. Previously, validateConfig unmarshaled into ModelConfig, | ||
| // which always succeeded because all fields are optional. | ||
|
|
||
| // Test 1: Incomplete model with only config (should fail) | ||
| invalidJSON := `{ | ||
| "config": {"paramSize": "8b"} | ||
| }` | ||
|
|
||
| err := schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(invalidJSON)) | ||
| if err == nil { | ||
| t.Fatalf("expected validation to fail for incomplete model") | ||
| } | ||
|
|
||
| // Test 2: Config-only JSON (should fail) | ||
| configOnlyJSON := `{ | ||
| "paramSize": "8b", | ||
| "architecture": "transformer" | ||
| }` | ||
|
|
||
| err = schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(configOnlyJSON)) | ||
| if err == nil { | ||
| t.Fatalf("expected failure for config-only JSON without descriptor/modelfs, but got nil") | ||
| } | ||
|
|
||
| // Test 3: Valid full Model (should pass) | ||
| validJSON := `{ | ||
| "descriptor": {"name": "test-model"}, | ||
| "config": {"paramSize": "8b"}, | ||
| "modelfs": {"type": "layers", "diffIds": ["sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef"]} | ||
| }` | ||
|
|
||
| err = schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(validJSON)) | ||
| if err != nil { | ||
| t.Fatalf("expected valid Model to pass, but got error: %v", err) | ||
| } | ||
| } |
There was a problem hiding this comment.
For better test organization and clearer output on failures, it's a good practice to structure multiple test cases within a single test function using t.Run. This will create sub-tests, each with its own name, making it easier to identify which specific scenario fails.
func TestValidateConfigParsesModelNotModelConfig(t *testing.T) {
// This test verifies that validateConfig correctly parses the full Model structure,
// not just ModelConfig. Previously, validateConfig unmarshaled into ModelConfig,
// which always succeeded because all fields are optional.
t.Run("Incomplete model with only config (should fail)", func(t *testing.T) {
invalidJSON := `{
"config": {"paramSize": "8b"}
}`
err := schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(invalidJSON))
if err == nil {
t.Fatalf("expected validation to fail for incomplete model")
}
})
t.Run("Config-only JSON (should fail)", func(t *testing.T) {
configOnlyJSON := `{
"paramSize": "8b",
"architecture": "transformer"
}`
err := schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(configOnlyJSON))
if err == nil {
t.Fatalf("expected failure for config-only JSON without descriptor/modelfs, but got nil")
}
})
t.Run("Valid full Model (should pass)", func(t *testing.T) {
validJSON := `{
"descriptor": {"name": "test-model"},
"config": {"paramSize": "8b"},
"modelfs": {"type": "layers", "diffIds": ["sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef"]}
}`
err := schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(validJSON))
if err != nil {
t.Fatalf("expected valid Model to pass, but got error: %v", err)
}
})
}There was a problem hiding this comment.
Pull request overview
Updates model config validation to parse and sanity-check the full v1.Model document (rather than just ModelConfig) and adds tests intended to prevent incomplete model documents from being accepted.
Changes:
- Change
validateConfigto unmarshal intov1.Modeland add minimal structural checks. - Add a new test covering invalid “config-only / incomplete model” inputs and a valid full-model input.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
schema/validator.go |
Switches pre-validation from ModelConfig to Model and adds required-field checks. |
schema/config_test.go |
Adds a targeted test to ensure validation rejects incomplete model JSON and accepts a valid model JSON. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if model.Descriptor.Name == "" { | ||
| return fmt.Errorf("missing descriptor.name") | ||
| } |
There was a problem hiding this comment.
validateConfig now rejects models where descriptor.name is empty, but both the published docs (docs/config.md) and the JSON schema (schema/config-schema.json) describe descriptor.name as OPTIONAL (only minLength: 1 when present). This makes validation stricter than the schema and can reject schema-valid documents. Either remove this check, or update the schema + docs to make descriptor.name required so all validation layers stay consistent.
| if model.Descriptor.Name == "" { | |
| return fmt.Errorf("missing descriptor.name") | |
| } |
| // Test 3: Valid full Model (should pass) | ||
| validJSON := `{ | ||
| "descriptor": {"name": "test-model"}, |
There was a problem hiding this comment.
This test assumes descriptor.name is required for a valid document, but the current schema/docs treat it as optional. Unless you’re also updating the schema/docs to require name, adjust the tests to validate only the required top-level Model shape (e.g., presence of descriptor, config, and modelfs with non-empty diffIds).
| // Test 3: Valid full Model (should pass) | |
| validJSON := `{ | |
| "descriptor": {"name": "test-model"}, | |
| // Test 3: Valid full Model (should pass). This verifies that a structurally complete | |
| // Model (with descriptor, config, and modelfs with non-empty diffIds) is accepted; | |
| // it does not rely on descriptor.name being required. | |
| validJSON := `{ | |
| "descriptor": {}, |
| var model v1.Model | ||
|
|
||
| err := json.Unmarshal(buf, &mc) | ||
| err := json.Unmarshal(buf, &model) | ||
| if err != nil { | ||
| return fmt.Errorf("config format mismatch: %w", err) | ||
| return fmt.Errorf("invalid model structure: %w", err) | ||
| } | ||
|
|
||
| // Minimal structural validation for required fields | ||
| if model.Descriptor.Name == "" { | ||
| return fmt.Errorf("missing descriptor.name") | ||
| } | ||
| if len(model.ModelFS.DiffIDs) == 0 { | ||
| return fmt.Errorf("missing modelfs.diffIds") |
There was a problem hiding this comment.
Because v1.Model.Descriptor and v1.Model.ModelFS are non-pointer structs, json.Unmarshal can’t distinguish between a missing descriptor/modelfs object and an empty one. That leads to potentially misleading errors (e.g., a document missing descriptor entirely will be reported as “missing descriptor.name”). Consider unmarshalling into a lightweight struct with *json.RawMessage/pointer fields so you can accurately detect missing top-level fields and return a clearer error.
| func validateConfig(buf []byte) error { | ||
| mc := v1.ModelConfig{} | ||
| var model v1.Model | ||
|
|
||
| err := json.Unmarshal(buf, &mc) | ||
| err := json.Unmarshal(buf, &model) | ||
| if err != nil { | ||
| return fmt.Errorf("config format mismatch: %w", err) | ||
| return fmt.Errorf("invalid model structure: %w", err) | ||
| } |
There was a problem hiding this comment.
PR description says the current validateConfig behavior allows invalid/incomplete models to pass because it unmarshals into ModelConfig. However, Validator.Validate always runs JSON Schema validation after validateConfig succeeds, and the schema already requires descriptor, config, and modelfs. As written, this change is effectively introducing additional pre-validation rules (e.g., requiring descriptor.name and relying on Go type unmarshalling for diffIds), not just fixing a no-op. Please clarify the intended behavior and ensure it matches the schema/docs.
|
/cc @chlins |
Problem
validateConfigcurrently unmarshals the full Model JSON intoModelConfig.This is incorrect because the config is nested under the
configfield, not at the root.As a result:
Fix
ModelConfigtov1.Modeldescriptor.namemust be presentmodelfs.diffIdsmust be non-emptyThese checks ensure the input at least conforms to the expected Model shape.
Tests