
fix(schema): make validateConfig validate actual Model structure #191

Open

rishi-jat wants to merge 1 commit into modelpack:main from rishi-jat:fix/validate-config-bug

Conversation

@rishi-jat

Problem

validateConfig currently unmarshals the full Model JSON into ModelConfig.
This is incorrect because the config is nested under the config field, not at the root.

As a result:

  • the unmarshal always succeeds
  • invalid or incomplete models pass validation
  • the function effectively becomes a no-op

Fix

  • Changed unmarshalling target from ModelConfig to v1.Model
  • Added minimal structural checks:
    • descriptor.name must be present
    • modelfs.diffIds must be non-empty

These checks ensure the input at least conforms to the expected Model shape.
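Putting the two bullet points together, the fixed validateConfig could look roughly like the sketch below. This is a minimal, self-contained illustration, not the actual modelpack code: the v1 types (Descriptor, ModelFS, Model) are inlined here with only the fields this PR touches, and field/tag names are assumed from the diff shown later in this conversation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Inlined stand-ins for the v1.Model types referenced in the PR diff
// (the real definitions live in the modelpack v1 package).
type Descriptor struct {
	Name string `json:"name"`
}

type ModelFS struct {
	Type    string   `json:"type"`
	DiffIDs []string `json:"diffIds"`
}

type Model struct {
	Descriptor Descriptor `json:"descriptor"`
	ModelFS    ModelFS    `json:"modelfs"`
}

// validateConfig unmarshals the full Model document (not just the nested
// config object) and applies the minimal structural checks from the PR.
func validateConfig(buf []byte) error {
	var model Model
	if err := json.Unmarshal(buf, &model); err != nil {
		return fmt.Errorf("invalid model structure: %w", err)
	}
	if model.Descriptor.Name == "" {
		return fmt.Errorf("missing descriptor.name")
	}
	if len(model.ModelFS.DiffIDs) == 0 {
		return fmt.Errorf("missing modelfs.diffIds")
	}
	return nil
}

func main() {
	// Incomplete model: config only, no descriptor/modelfs -> rejected.
	if err := validateConfig([]byte(`{"config":{"paramSize":"8b"}}`)); err == nil {
		panic("expected incomplete model to be rejected")
	}
	// Structurally complete model -> accepted.
	ok := `{"descriptor":{"name":"test-model"},"modelfs":{"type":"layers","diffIds":["sha256:abc"]}}`
	if err := validateConfig([]byte(ok)); err != nil {
		panic(err)
	}
	fmt.Println("validation behaves as described")
}
```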

Tests

  • Added negative test to verify incomplete models are rejected
  • Added positive test to verify valid models pass

Signed-off-by: Rishi Jat <rishijat098@gmail.com>
Copilot AI review requested due to automatic review settings March 22, 2026 10:19
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request rectifies a critical flaw in the schema validation process where the validateConfig function was not correctly interpreting the model structure, leading to the acceptance of malformed or incomplete models. The changes ensure that the validation accurately reflects the expected model shape, thereby improving data integrity and reliability within the system.

Highlights

  • Validation Logic Correction: The validateConfig function was updated to correctly unmarshal the full v1.Model structure instead of just ModelConfig, addressing an issue where invalid models were passing validation.
  • Structural Validation Added: Minimal structural checks were introduced to ensure that descriptor.name is present and modelfs.diffIds is not empty, enforcing basic model integrity.
  • Comprehensive Test Coverage: New test cases were added to schema/config_test.go to verify both negative scenarios (incomplete/invalid models failing) and positive scenarios (valid models passing) for the updated validation logic.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly addresses a significant bug in validateConfig where it was validating against an incorrect data structure, rendering the validation ineffective. The fix, which involves unmarshalling to the proper v1.Model and adding essential structural checks, is sound. The accompanying tests are well-written and cover both failure and success scenarios for the new validation logic. I have one suggestion to refactor the new test function to improve its structure and maintainability.

Comment on lines +593 to +630
func TestValidateConfigParsesModelNotModelConfig(t *testing.T) {
	// This test verifies that validateConfig correctly parses the full Model structure,
	// not just ModelConfig. Previously, validateConfig unmarshaled into ModelConfig,
	// which always succeeded because all fields are optional.

	// Test 1: Incomplete model with only config (should fail)
	invalidJSON := `{
		"config": {"paramSize": "8b"}
	}`

	err := schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(invalidJSON))
	if err == nil {
		t.Fatalf("expected validation to fail for incomplete model")
	}

	// Test 2: Config-only JSON (should fail)
	configOnlyJSON := `{
		"paramSize": "8b",
		"architecture": "transformer"
	}`

	err = schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(configOnlyJSON))
	if err == nil {
		t.Fatalf("expected failure for config-only JSON without descriptor/modelfs, but got nil")
	}

	// Test 3: Valid full Model (should pass)
	validJSON := `{
		"descriptor": {"name": "test-model"},
		"config": {"paramSize": "8b"},
		"modelfs": {"type": "layers", "diffIds": ["sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef"]}
	}`

	err = schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(validJSON))
	if err != nil {
		t.Fatalf("expected valid Model to pass, but got error: %v", err)
	}
}
Contributor


Severity: medium

For better test organization and clearer output on failures, it's a good practice to structure multiple test cases within a single test function using t.Run. This will create sub-tests, each with its own name, making it easier to identify which specific scenario fails.

func TestValidateConfigParsesModelNotModelConfig(t *testing.T) {
	// This test verifies that validateConfig correctly parses the full Model structure,
	// not just ModelConfig. Previously, validateConfig unmarshaled into ModelConfig,
	// which always succeeded because all fields are optional.

	t.Run("Incomplete model with only config (should fail)", func(t *testing.T) {
		invalidJSON := `{
			"config": {"paramSize": "8b"}
		}`

		err := schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(invalidJSON))
		if err == nil {
			t.Fatalf("expected validation to fail for incomplete model")
		}
	})

	t.Run("Config-only JSON (should fail)", func(t *testing.T) {
		configOnlyJSON := `{
			"paramSize": "8b",
			"architecture": "transformer"
		}`

		err := schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(configOnlyJSON))
		if err == nil {
			t.Fatalf("expected failure for config-only JSON without descriptor/modelfs, but got nil")
		}
	})

	t.Run("Valid full Model (should pass)", func(t *testing.T) {
		validJSON := `{
			"descriptor": {"name": "test-model"},
			"config": {"paramSize": "8b"},
			"modelfs": {"type": "layers", "diffIds": ["sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef"]}
		}`

		err := schema.ValidatorMediaTypeModelConfig.Validate(strings.NewReader(validJSON))
		if err != nil {
			t.Fatalf("expected valid Model to pass, but got error: %v", err)
		}
	})
}


Copilot AI left a comment


Pull request overview

Updates model config validation to parse and sanity-check the full v1.Model document (rather than just ModelConfig) and adds tests intended to prevent incomplete model documents from being accepted.

Changes:

  • Change validateConfig to unmarshal into v1.Model and add minimal structural checks.
  • Add a new test covering invalid “config-only / incomplete model” inputs and a valid full-model input.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
schema/validator.go Switches pre-validation from ModelConfig to Model and adds required-field checks.
schema/config_test.go Adds a targeted test to ensure validation rejects incomplete model JSON and accepts a valid model JSON.


Comment on lines +125 to +127
	if model.Descriptor.Name == "" {
		return fmt.Errorf("missing descriptor.name")
	}

Copilot AI Mar 22, 2026


validateConfig now rejects models where descriptor.name is empty, but both the published docs (docs/config.md) and the JSON schema (schema/config-schema.json) describe descriptor.name as OPTIONAL (only minLength: 1 when present). This makes validation stricter than the schema and can reject schema-valid documents. Either remove this check, or update the schema + docs to make descriptor.name required so all validation layers stay consistent.

Suggested change
-	if model.Descriptor.Name == "" {
-		return fmt.Errorf("missing descriptor.name")
-	}
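If the maintainers instead take the second option (updating the schema and docs so descriptor.name is required everywhere), the change to schema/config-schema.json might look roughly like the fragment below. This is a hypothetical sketch: the actual schema layout is not shown in this PR, and only the reviewer's statements (descriptor/config/modelfs already required at the top level, name has minLength: 1 when present) are taken as given.

```json
{
  "required": ["descriptor", "config", "modelfs"],
  "properties": {
    "descriptor": {
      "type": "object",
      "required": ["name"],
      "properties": {
        "name": { "type": "string", "minLength": 1 }
      }
    }
  }
}
```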

Copilot uses AI. Check for mistakes.
Comment on lines +619 to +621
// Test 3: Valid full Model (should pass)
validJSON := `{
"descriptor": {"name": "test-model"},

Copilot AI Mar 22, 2026


This test assumes descriptor.name is required for a valid document, but the current schema/docs treat it as optional. Unless you’re also updating the schema/docs to require name, adjust the tests to validate only the required top-level Model shape (e.g., presence of descriptor, config, and modelfs with non-empty diffIds).

Suggested change
-	// Test 3: Valid full Model (should pass)
-	validJSON := `{
-	"descriptor": {"name": "test-model"},
+	// Test 3: Valid full Model (should pass). This verifies that a structurally complete
+	// Model (with descriptor, config, and modelfs with non-empty diffIds) is accepted;
+	// it does not rely on descriptor.name being required.
+	validJSON := `{
+	"descriptor": {},

Comment on lines +117 to +129
+	var model v1.Model

-	err := json.Unmarshal(buf, &mc)
+	err := json.Unmarshal(buf, &model)
 	if err != nil {
-		return fmt.Errorf("config format mismatch: %w", err)
+		return fmt.Errorf("invalid model structure: %w", err)
 	}

+	// Minimal structural validation for required fields
+	if model.Descriptor.Name == "" {
+		return fmt.Errorf("missing descriptor.name")
+	}
+	if len(model.ModelFS.DiffIDs) == 0 {
+		return fmt.Errorf("missing modelfs.diffIds")

Copilot AI Mar 22, 2026


Because v1.Model.Descriptor and v1.Model.ModelFS are non-pointer structs, json.Unmarshal can’t distinguish between a missing descriptor/modelfs object and an empty one. That leads to potentially misleading errors (e.g., a document missing descriptor entirely will be reported as “missing descriptor.name”). Consider unmarshalling into a lightweight struct with *json.RawMessage/pointer fields so you can accurately detect missing top-level fields and return a clearer error.
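One possible shape for this suggestion is sketched below: pointer fields let json.Unmarshal distinguish an absent top-level object (nil pointer) from a present-but-empty one. The rawModel type and its field set are illustrative, not taken from the modelpack codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawModel uses pointers to anonymous structs so a missing "descriptor"
// or "modelfs" key leaves the field nil, while an empty object allocates it.
type rawModel struct {
	Descriptor *struct {
		Name string `json:"name"`
	} `json:"descriptor"`
	ModelFS *struct {
		DiffIDs []string `json:"diffIds"`
	} `json:"modelfs"`
}

func validateConfig(buf []byte) error {
	var m rawModel
	if err := json.Unmarshal(buf, &m); err != nil {
		return fmt.Errorf("invalid model structure: %w", err)
	}
	if m.Descriptor == nil {
		return fmt.Errorf("missing descriptor")
	}
	if m.Descriptor.Name == "" {
		return fmt.Errorf("missing descriptor.name")
	}
	if m.ModelFS == nil {
		return fmt.Errorf("missing modelfs")
	}
	if len(m.ModelFS.DiffIDs) == 0 {
		return fmt.Errorf("missing modelfs.diffIds")
	}
	return nil
}

func main() {
	// A document with no descriptor at all now reports "missing descriptor",
	// not the misleading "missing descriptor.name".
	fmt.Println(validateConfig([]byte(`{"modelfs":{"diffIds":["sha256:1"]}}`)))
}
```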

Comment on lines 116 to +122
 func validateConfig(buf []byte) error {
-	mc := v1.ModelConfig{}
+	var model v1.Model

-	err := json.Unmarshal(buf, &mc)
+	err := json.Unmarshal(buf, &model)
 	if err != nil {
-		return fmt.Errorf("config format mismatch: %w", err)
+		return fmt.Errorf("invalid model structure: %w", err)
 	}

Copilot AI Mar 22, 2026


PR description says the current validateConfig behavior allows invalid/incomplete models to pass because it unmarshals into ModelConfig. However, Validator.Validate always runs JSON Schema validation after validateConfig succeeds, and the schema already requires descriptor, config, and modelfs. As written, this change is effectively introducing additional pre-validation rules (e.g., requiring descriptor.name and relying on Go type unmarshalling for diffIds), not just fixing a no-op. Please clarify the intended behavior and ensure it matches the schema/docs.

@rishi-jat
Author

/cc @chlins
/cc @aftersnow
/cc @bergwolf
