Model Package Support #27786

Draft: chilo-ms wants to merge 10 commits into main from chi/model_package_1


@chilo-ms (Contributor) commented Mar 20, 2026

Description

To support the model package design, one of the goals for ORT is to automatically select the most suitable compiled EPContext binary from a collection of precompiled variants based on the EP, provider options, metadata, and available devices.

This PR implements the first phase of model package support in ORT. Follow-up PRs may extend it.

A model package is a collection of models, binaries, and metadata files organized in a hierarchically structured directory.
The directory structure is not yet finalized, so the following is just a simple example of a model package directory:

phi-4.ortpackage/
├── manifest.json
└── models/
    └── phi-4/
        ├── metadata.json
        ├── phi-4.generic-gpu/
        │   ├── model.onnx
        │   └── Other config and data files
        ├── phi-4.variant_a/
        │   ├── optimized model.onnx (contains EPContext nodes)
        │   ├── Other config and data files
        │   └── [Compilation artifacts]
        └── phi-4.variant_b/
            ├── optimized model.onnx (contains EPContext nodes)
            ├── Other config and data files
            └── [Compilation artifacts]

A manifest.json should reside at the top level of the model package directory and describe the components of the package.
The following is an example of a manifest.json:

{
    "name": "<my_model_name>",
    "components": [
        {
            "variant_name": "<ep_context_model_1>",
            "file": "<ep_context_model_1 onnx file>",
            "constraints": {
                "ep": "<ep_name>",
                "device": "<device_type>",
                "architecture": "<hardware_architecture>"
            }
        },
        {
            "variant_name": "<ep_context_model_2>",
            "file": "<ep_context_model_2 onnx file>",
            "constraints": {
                "ep": "<ep_name>",
                "device": "<device_type>",
                "architecture": "<hardware_architecture>"
            }
        }
    ]
}
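
To make the template concrete, here is a minimal Python sketch that loads a manifest of this shape and checks for the keys shown above. It is illustrative only: the actual parsing in this PR is done in C++, the schema is not finalized, and the helper `load_manifest`, the `Component` class, and all values (`my_model`, `mul_16.onnx`, `arch2`, and so on) are assumptions made up for the example.

```python
from __future__ import annotations

import json
from dataclasses import dataclass

# Hypothetical manifest following the template above; all values are made up.
MANIFEST_TEXT = """
{
    "name": "my_model",
    "components": [
        {
            "variant_name": "variant_1",
            "file": "mul_1.onnx",
            "constraints": {"ep": "example_ep", "device": "cpu", "architecture": "arch1"}
        },
        {
            "variant_name": "variant_2",
            "file": "mul_16.onnx",
            "constraints": {"ep": "example_ep", "device": "npu", "architecture": "arch2"}
        }
    ]
}
"""


@dataclass(frozen=True)
class Component:
    """One entry of the manifest's "components" array."""
    variant_name: str
    file: str
    constraints: dict


def load_manifest(text: str) -> tuple[str, list[Component]]:
    """Parse manifest text into (name, components); raise ValueError on missing keys."""
    doc = json.loads(text)
    for key in ("name", "components"):
        if key not in doc:
            raise ValueError(f"manifest is missing required key: {key}")
    components = []
    for entry in doc["components"]:
        missing = {"variant_name", "file", "constraints"} - entry.keys()
        if missing:
            raise ValueError(f"component is missing keys: {sorted(missing)}")
        components.append(
            Component(entry["variant_name"], entry["file"], dict(entry["constraints"]))
        )
    return doc["name"], components


name, components = load_manifest(MANIFEST_TEXT)
print(name, [c.variant_name for c in components])
```

Rejecting a manifest with missing keys up front keeps later selection code free of per-field checks; whether ORT treats any of these keys as optional is not stated in this description.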

See the unit test here to better understand how to use a model package.

This pull request extends the execution provider (EP) selection and management infrastructure in ONNX Runtime. It focuses on manifest-based model packaging and more sophisticated device selection, and it refactors the provider selection logic for modularity and future extensibility.

Key changes include:

  • Introduction of model package context and manifest parsing to support selecting model components based on device and EP constraints.
  • Refactoring of the execution provider interface and related classes to support multiple devices per provider.
  • Modularization of EP/device selection, creation, and registration logic in the provider policy context.

Details by area:

Model Package Context and Manifest Support

  • Added new files model_package_context.h and model_package_context.cc to implement manifest parsing, device/EP constraint matching, and component selection logic for model packages. This enables ONNX Runtime to select the most appropriate model variant based on available hardware and EP configuration.
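
The constraint-matching idea can be illustrated with a small Python sketch. This is not the logic in model_package_context.cc: the rule used here (every constraint a component declares must equal the corresponding property of some available device, and the first match in manifest order wins) is an assumption, as are the names `DeviceInfo`, `matches`, and `select_component` and all sample values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical description of one available EP/device pair."""
    ep: str
    device: str
    architecture: str


def matches(constraints: dict, dev: DeviceInfo) -> bool:
    # A component matches when every constraint it declares equals the device's value.
    return all(getattr(dev, key, None) == value for key, value in constraints.items())


def select_component(components: list, devices: list) -> Optional[dict]:
    """Return the first component (in manifest order) that some device satisfies."""
    for component in components:
        if any(matches(component["constraints"], dev) for dev in devices):
            return component
    return None


# Made-up components in the shape of the manifest example.
components = [
    {"variant_name": "variant_1", "file": "mul_1.onnx",
     "constraints": {"ep": "example_ep", "device": "npu", "architecture": "arch1"}},
    {"variant_name": "variant_2", "file": "mul_16.onnx",
     "constraints": {"ep": "example_ep", "device": "cpu", "architecture": "arch2"}},
]
devices = [DeviceInfo(ep="example_ep", device="cpu", architecture="arch2")]

chosen = select_component(components, devices)
print(chosen["variant_name"] if chosen else "no matching variant")
```

A real implementation would likely also weigh provider options and metadata, as the description mentions; the sketch covers only exact matching on the three constraint fields.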

Execution Provider Interface Enhancements

  • Updated the IExecutionProvider class to support construction with a list of OrtEpDevice pointers, and added a GetEpDevices() method to retrieve the supported devices. This allows plugin and bridge EPs to expose multiple devices.
  • Updated plugin EP construction to pass the list of supported devices to the base class.

Provider Policy Context Refactoring

  • Refactored provider policy context logic to modularize device ordering, device selection, telemetry logging, EP creation, and registration. This includes splitting the monolithic SelectEpsForSession into smaller methods: OrderDevices, SelectEpDevices, LogTelemetry, CreateExecutionProviders, RegisterExecutionProviders, and a new flow for model package-based EP selection.
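
As a conceptual illustration of this decomposition, the following Python sketch splits one selection routine into stages named after the methods listed above. Only the stage names come from the PR description; the function bodies, the assumed NPU > GPU > CPU preference, and the dictionary-based session and device records are invented for the example and do not reflect the C++ implementation.

```python
# Conceptual sketch only: stage names mirror the PR description, bodies are invented.
DEVICE_PRIORITY = {"npu": 0, "gpu": 1, "cpu": 2}  # assumed preference order


def order_devices(devices):
    # Stable sort by the assumed priority; unknown device types go last.
    return sorted(devices, key=lambda d: DEVICE_PRIORITY.get(d["device"], len(DEVICE_PRIORITY)))


def select_ep_devices(ordered, max_devices=1):
    # Keep the highest-priority devices.
    return ordered[:max_devices]


def log_telemetry(selected, log):
    log.append("selected: " + ", ".join(f'{d["ep"]}/{d["device"]}' for d in selected))


def create_execution_providers(selected):
    # Group devices by EP name so that one provider can own several devices.
    providers = {}
    for d in selected:
        providers.setdefault(d["ep"], []).append(d["device"])
    return providers


def register_execution_providers(session, providers):
    session["providers"] = providers


def select_eps_for_session(session, devices, log):
    # The former monolithic routine becomes a short pipeline of the stages above.
    ordered = order_devices(devices)
    selected = select_ep_devices(ordered, max_devices=2)
    log_telemetry(selected, log)
    providers = create_execution_providers(selected)
    register_execution_providers(session, providers)


session, log = {}, []
select_eps_for_session(
    session,
    [
        {"ep": "cpu_ep", "device": "cpu"},
        {"ep": "example_ep", "device": "gpu"},
        {"ep": "example_ep", "device": "npu"},
    ],
    log,
)
print(session["providers"])
```

Keeping each stage as its own function lets an alternative flow, such as the model package-based selection, reuse creation and registration while replacing only the selection step.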

These changes collectively lay the groundwork for more flexible, robust, and extensible device and EP selection in ONNX Runtime, especially in scenarios involving packaged models with multiple variants and complex hardware environments.

Motivation and Context

@github-actions bot (Contributor) left a comment

You can commit the suggested changes from lintrunner.

@skottmckay (Contributor)

Top-level manifest.json should define the overall inputs/outputs so a user of the package knows what it does. They shouldn't have to trawl through the information to find the first and last things that will be run to infer this info.

Comment on lines +114 to +122
    {
        "variant_name": "variant_1",
        "file": "mul_1.onnx",
        "constraints": {
            "ep": "example_ep",
            "device": "cpu",
            "architecture": "arch1"
        }
    },

This seems like lower level per-variant info I would have expected to be in the component model's metadata.json not the top level manifest.
