Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
344 changes: 344 additions & 0 deletions schema/architecture-schema.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,344 @@
{
"description": "ModelPack Architecture Configuration Schema",
"$schema": "http://json-schema.org/draft-04/schema#",
"$id": "https://github.com/modelpack/model-spec/architecture",
"type": "object",
"properties": {
"architecture_version": {
"type": "string"
},
"transformer": {
"$ref": "#/$defs/TransformerArchitecture"
}
},
"required": [
"transformer"
],
"additionalProperties": false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Regarding your question about vendor extensions, a common pattern to allow for extensions while maintaining strict validation is to use patternProperties. You could add a patternProperties field to the root object to allow any property prefixed with x- for vendor-specific data. This provides an explicit extension point without opening up the schema completely.

For example:

"patternProperties": {
  "^x-": {}
},
"additionalProperties": false

"$defs": {
"TransformerArchitecture": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": ["decoder"]
},
"vocabulary_size": {
"type": "integer"
},
"hidden_size": {
"type": "integer"
},
Comment on lines +26 to +31
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The vocabulary_size and hidden_size properties should be positive integers. Consider adding minimum: 1 to enforce this and improve validation.

Suggested change
"vocabulary_size": {
"type": "integer"
},
"hidden_size": {
"type": "integer"
},
"vocabulary_size": {
"type": "integer",
"minimum": 1
},
"hidden_size": {
"type": "integer",
"minimum": 1
},

"tokenizer": {
"$ref": "#/$defs/Tokenizer"
},
"token_embedding": {
"$ref": "#/$defs/TokenEmbedding"
},
"position_embedding": {
"$ref": "#/$defs/PositionEmbedding"
},
"normalization": {
"$ref": "#/$defs/Normalization"
},
"uniform_layers": {
"$ref": "#/$defs/UniformLayers"
},
"mixed_layers": {
"$ref": "#/$defs/MixedLayers"
}
},
"required": [
"type",
"vocabulary_size",
"hidden_size",
"tokenizer",
"token_embedding",
"position_embedding",
"normalization"
],
"additionalProperties": false,
"oneOf": [
{
"required": ["uniform_layers"]
},
{
"required": ["mixed_layers"]
}
]
},
"Tokenizer": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": ["bpe"]
},
"library": {
"type": "string",
"enum": ["huggingface"]
},
"revision": {
"type": "string"
}
},
"required": [
"type",
"library"
],
"additionalProperties": false
},
"TokenEmbedding": {
"type": "object",
"properties": {
"has_bias": {
"type": "boolean"
},
"has_norm": {
"type": "boolean"
},
"shared_embedding": {
"type": "boolean"
}
},
"additionalProperties": false
},
"PositionEmbedding": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": ["rope"]
},
"max_position_embeddings": {
"type": "integer"
},
Comment on lines +113 to +115
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The max_position_embeddings property should be a positive integer. Consider adding minimum: 1 to enforce this.

                "max_position_embeddings": {
                    "type": "integer",
                    "minimum": 1
                },

"rope_theta": {
"type": "number"
},
"rope_scaling": {
"type": "object"
}
Comment on lines +119 to +121
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Regarding your open question on rope_scaling: defining it as a generic "type": "object" is inconsistent with the additionalProperties: false policy used elsewhere. This allows any fields, making validation less strict. If the properties are not yet defined, it would be better to make it an empty object with "properties": {}, "additionalProperties": false. If it's meant to be extensible, you could explicitly set "additionalProperties": true or use a oneOf for known scaling types as you suggested.

},
"required": [
"type",
"max_position_embeddings"
],
"additionalProperties": false
},
"Attention": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": ["mha", "gqa", "mla"]
},
"is_causal": {
"type": "boolean"
},
"num_attention_heads": {
"type": "integer"
},
"num_key_value_heads": {
"type": "integer"
},
"head_dim": {
"type": "integer"
},
Comment on lines +139 to +147
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The num_attention_heads, num_key_value_heads, and head_dim properties should be positive integers. Consider adding minimum: 1 to enforce this.

                "num_attention_heads": {
                    "type": "integer",
                    "minimum": 1
                },
                "num_key_value_heads": {
                    "type": "integer",
                    "minimum": 1
                },
                "head_dim": {
                    "type": "integer",
                    "minimum": 1
                },

"is_qkv_merged": {
"type": "boolean"
},
"has_qkv_bias": {
"type": "boolean"
},
"has_output_bias": {
"type": "boolean"
},
"has_pre_norm": {
"type": "boolean"
},
"has_post_norm": {
"type": "boolean"
},
"has_residual": {
"type": "boolean"
}
},
"required": [
"type",
"is_causal",
"num_attention_heads",
"num_key_value_heads"
],
"additionalProperties": false
},
"MLP": {
"type": "object",
"properties": {
"intermediate_size": {
"type": "integer"
},
Comment on lines +178 to +180
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The intermediate_size property should be a positive integer. Consider adding minimum: 1 to enforce this.

                "intermediate_size": {
                    "type": "integer",
                    "minimum": 1
                },

"activation": {
"type": "string"
},
Comment on lines +181 to +183
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The activation property is currently a generic string. To make the schema more robust and self-documenting, consider using an enum with common activation functions, for example: ["silu", "gelu", "relu"].

                "activation": {
                    "type": "string",
                    "enum": ["silu", "gelu", "relu"]
                },

"use_gated_activation": {
"type": "boolean"
},
"is_mlp_merged": {
"type": "boolean"
},
"has_bias": {
"type": "boolean"
},
"has_residual": {
"type": "boolean"
},
"has_pre_norm": {
"type": "boolean"
},
"has_post_norm": {
"type": "boolean"
}
},
"required": [
"intermediate_size",
"activation"
],
"additionalProperties": false
},
"MoE": {
"type": "object",
"properties": {
"num_experts": {
"type": "integer"
},
"top_k": {
"type": "integer"
},
"moe_intermediate_size": {
"type": "integer"
},
Comment on lines +212 to +220
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The num_experts, top_k, and moe_intermediate_size properties should be positive integers. Consider adding minimum: 1 to enforce this.

                "num_experts": {
                    "type": "integer",
                    "minimum": 1
                },
                "top_k": {
                    "type": "integer",
                    "minimum": 1
                },
                "moe_intermediate_size": {
                    "type": "integer",
                    "minimum": 1
                },

"num_shared_experts": {
"type": "integer"
},
"shared_expert_intermediate_size": {
"type": "integer"
},
"scoring_function": {
"type": "string"
},
Comment on lines +227 to +229
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The scoring_function property is a generic string. To improve validation, consider using an enum with known scoring functions, for example ["softmax"].

                "scoring_function": {
                    "type": "string",
                    "enum": ["softmax"]
                },

"norm_topk_prob": {
"type": "boolean"
},
"activation": {
"type": "string"
},
Comment on lines +233 to +235
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The activation property is currently a generic string. To make the schema more robust, consider using an enum with common activation functions, for example: ["silu", "gelu", "relu"].

                "activation": {
                    "type": "string",
                    "enum": ["silu", "gelu", "relu"]
                },

"use_gated_activation": {
"type": "boolean"
},
"has_bias": {
"type": "boolean"
}
},
"required": [
"num_experts",
"top_k",
"moe_intermediate_size",
"scoring_function",
"activation"
],
"additionalProperties": false
},
"Normalization": {
"type": "object",
"properties": {
"type": {
"type": "string",
"enum": ["rmsnorm", "layernorm"]
},
"epsilon": {
"type": "number"
}
},
"required": [
"type"
],
"additionalProperties": false
},
"UniformLayers": {
"type": "object",
"properties": {
"num_layers": {
"type": "integer"
},
Comment on lines +271 to +273
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The num_layers property should be a positive integer. Consider adding minimum: 1 to enforce this.

                "num_layers": {
                    "type": "integer",
                    "minimum": 1
                },

"attention": {
"$ref": "#/$defs/Attention"
},
"mlp": {
"$ref": "#/$defs/MLP"
},
"moe": {
"$ref": "#/$defs/MoE"
}
},
"required": [
"num_layers",
"attention"
],
"additionalProperties": false,
"oneOf": [
{
"required": ["mlp"]
},
{
"required": ["moe"]
}
]
},
"MixedLayers": {
"type": "object",
"properties": {
"num_layers": {
"type": "integer"
},
Comment on lines +301 to +303
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The num_layers property should be a positive integer. Consider adding minimum: 1 to enforce this.

                "num_layers": {
                    "type": "integer",
                    "minimum": 1
                },

"mlp_layers": {
"type": "array",
"items": {
"type": "integer"
}
},
Comment on lines +304 to +309
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The items in mlp_layers represent layer indices and should be non-negative. Consider adding minimum: 0 to the item schema for stricter validation.

                "mlp_layers": {
                    "type": "array",
                    "items": {
                        "type": "integer",
                        "minimum": 0
                    }
                },

"moe_frequency": {
"type": "integer"
},
Comment on lines +310 to +312
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The moe_frequency property should be a positive integer. Consider adding minimum: 1 to enforce this.

                "moe_frequency": {
                    "type": "integer",
                    "minimum": 1
                },

"pre_norm_layers": {
"type": "array",
"items": {
"type": "integer"
}
},
"post_norm_layers": {
"type": "array",
"items": {
"type": "integer"
}
},
Comment on lines +313 to +324
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The items in pre_norm_layers and post_norm_layers represent layer indices and should be non-negative. Consider adding minimum: 0 to their item schemas.

                "pre_norm_layers": {
                    "type": "array",
                    "items": {
                        "type": "integer",
                        "minimum": 0
                    }
                },
                "post_norm_layers": {
                    "type": "array",
                    "items": {
                        "type": "integer",
                        "minimum": 0
                    }
                },

"attention": {
"$ref": "#/$defs/Attention"
},
"mlp": {
"$ref": "#/$defs/MLP"
},
"moe": {
"$ref": "#/$defs/MoE"
}
},
"required": [
"num_layers",
"attention",
"mlp_layers",
"moe_frequency"
],
Comment on lines +335 to +340
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Regarding your question on MixedLayers: you are correct that it does not enforce the presence of mlp or moe configurations. Since mlp_layers and moe_frequency are required, it's logical that their corresponding configuration objects (mlp and moe) should also be required to ensure a complete and valid definition for a mixed-layer model.

            "required": [
                "num_layers",
                "attention",
                "mlp_layers",
                "moe_frequency",
                "mlp",
                "moe"
            ],

"additionalProperties": false
}
}
}
Loading