39 changes: 0 additions & 39 deletions demos/using_onnx_model/python/Makefile

This file was deleted.

44 changes: 8 additions & 36 deletions demos/using_onnx_model/python/README.md
@@ -4,10 +4,10 @@ Steps are similar to when you work with IR model format. Model Server accepts ON
Below is a complete functional use case using Python 3.7 or higher.
For this example, let's use a public [ONNX ResNet](https://github.com/onnx/models/tree/main/validated/vision/classification/resnet) model - resnet50-caffe2-v1-9.onnx.

This model requires additional [preprocessing function](https://github.com/onnx/models/tree/main/validated/vision/classification/resnet#preprocessing). Preprocessing can be performed in the client by manipulating data before sending the request. Preprocessing can be also delegated to the server by creating a [DAG](../../../docs/dag_scheduler.md) and using a custom processing node. Both methods will be explained below.
This model requires an additional [preprocessing function](https://github.com/onnx/models/tree/main/validated/vision/classification/resnet#preprocessing). Preprocessing can be performed in the client by manipulating data before sending the request, or it can be delegated to the server by setting preprocessing parameters. Both methods are explained below.

[Option 1: Adding preprocessing to the client side](#option-1-adding-preprocessing-to-the-client-side)
[Option 2: Adding preprocessing to the server side (building DAG)](#option-2-adding-preprocessing-to-the-server-side-building-a-dag)
[Option 2: Adding preprocessing to the server side](#option-2-adding-preprocessing-to-the-server-side)

## Option 1: Adding preprocessing to the client side

@@ -17,9 +17,9 @@ git clone https://github.com/openvinotoolkit/model_server.git
cd model_server/demos/using_onnx_model/python
```

Prepare workspace with the model by running:
Download the classification model:
```bash
make client_preprocessing
curl --fail -L --create-dirs https://github.com/onnx/models/raw/main/validated/vision/classification/resnet/model/resnet50-caffe2-v1-9.onnx -o workspace/resnet50-onnx/1/resnet50-caffe2-v1-9.onnx
```

You should see a `workspace` directory created with the following content:
@@ -53,31 +53,15 @@ Class is with highest score: 309
Detected class name: bee
```
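
Under the hood, the client-side preprocessing referenced above resizes the image, normalizes each color channel and reshapes the data to NCHW before sending the request. Below is a minimal sketch of that logic, assuming `opencv-python` and `numpy` are installed; the authoritative implementation is the [preprocessing function](https://github.com/openvinotoolkit/model_server/blob/main/demos/using_onnx_model/python/onnx_model_demo.py#L26-L33) in `onnx_model_demo.py`, which may differ in details such as interpolation and channel ordering.

```python
import cv2
import numpy as np

def preprocess(image_path, size=224):
    # Load the image and resize it to the spatial size expected by the model.
    img = cv2.imread(image_path).astype(np.float32)
    img = cv2.resize(img, (size, size))
    # Per-channel normalization: subtract the mean, then divide by the scale.
    # The order of these values must match the channel order of the loaded image.
    mean = np.array([123.675, 116.28, 103.53], dtype=np.float32)
    scale = np.array([58.395, 57.12, 57.375], dtype=np.float32)
    img = (img - mean) / scale
    # HWC -> CHW and add a batch dimension to produce the NCHW input tensor.
    return img.transpose(2, 0, 1)[np.newaxis, :]
```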

## Option 2: Adding preprocessing to the server side (building a DAG)
## Option 2: Adding preprocessing to the server side

Prepare workspace with the model, preprocessing node library and configuration file by running:
```bash
make server_preprocessing
```

You should see `workspace` directory created with the following content:
```bash
workspace/
├── config.json
├── lib
│   └── libcustom_node_image_transformation.so
└── resnet50-onnx
└── 1
└── resnet50-caffe2-v1-9.onnx
```

Start the OVMS container with a configuration file option:
Start the OVMS container with additional preprocessing options:
```bash
docker run -d -u $(id -u):$(id -g) -v $(pwd)/workspace:/workspace -p 9001:9001 openvino/model_server:latest \
--config_path /workspace/config.json --port 9001
--model_path /workspace/resnet50-onnx --model_name resnet --port 9001 --layout NHWC:NCHW --mean "[123.675,116.28,103.53]" --scale "[58.395,57.12,57.375]" --shape "(1,224,224,3)" --color_format BGR:RGB
```
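
These startup options replace the DAG-based custom node used in the previous version of this demo: the server now accepts the encoded image directly, converts the layout and color format as configured, and normalizes each color channel by subtracting the corresponding `--mean` value and dividing by the corresponding `--scale` value - the same calculation the custom node previously performed through its `mean_values` and `scale_values` parameters.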

The `onnx_model_demo.py` script can run inference both with and without performing preprocessing. Since in this variant preprocessing is done by the model server (via custom node), there's no need to perform any image preprocessing on the client side. In that case, run without `--run_preprocessing` option. See [preprocessing function](https://github.com/openvinotoolkit/model_server/blob/main/demos/using_onnx_model/python/onnx_model_demo.py#L26-L33) run in the client.
The `onnx_model_demo.py` script can run inference both with and without performing preprocessing. Since in this variant preprocessing is done by the model server, there's no need to perform any image preprocessing on the client side. In that case, run the script without the `--run_preprocessing` option. See the [preprocessing function](https://github.com/openvinotoolkit/model_server/blob/main/demos/using_onnx_model/python/onnx_model_demo.py#L26-L33) that would otherwise run in the client.
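
For reference, the no-preprocessing path in the client reduces to reading the encoded JPEG and sending the raw bytes under the model's input name. Below is a minimal sketch assuming the `ovmsclient` package is installed and the container was started as above; the image path is a placeholder.

```python
from ovmsclient import make_grpc_client

# Read the raw, encoded JPEG - no resizing or normalization on the client side.
with open("path/to/image.jpeg", "rb") as f:
    img = f.read()

client = make_grpc_client("localhost:9001")
# "data" is the model input name; "resnet" is the model name set at server startup.
output = client.predict({"data": img}, "resnet")
# The demo script then reports the class with the highest score.
```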

Run the client without preprocessing:
```bash
@@ -86,15 +70,3 @@ Running without preprocessing on client side
Class is with highest score: 309
Detected class name: bee
```

## Node parameters explanation
The additional preprocessing step applies a subtraction and a division to each pixel value in the image. This calculation is configured by passing two parameters to the _image transformation_ custom node in [config.json](https://github.com/openvinotoolkit/model_server/blob/main/demos/using_onnx_model/python/config.json#L32-L33):
```
"params": {
...
"mean_values": "[123.675,116.28,103.53]",
"scale_values": "[58.395,57.12,57.375]",
...
}
```
For each pixel, the custom node subtracts `123.675` from the blue value, `116.28` from the green value and `103.53` from the red value. Next, it divides in the same color order by the `58.395`, `57.12`, `57.375` values. This way the image data is matched to the input required by the ONNX model.
72 changes: 0 additions & 72 deletions demos/using_onnx_model/python/config.json

This file was deleted.

4 changes: 2 additions & 2 deletions demos/using_onnx_model/python/onnx_model_demo.py
@@ -60,12 +60,12 @@ def getJpeg(path, size):
if args["run_preprocessing"]:
print("Running with preprocessing on client side")
img = getJpeg(args["image_path"], 224)
input_name = "gpu_0/data_0"
input_name = "data"
else:
print("Running without preprocessing on client side")
with open(args["image_path"], "rb") as f:
img = f.read()
input_name = "0"
input_name = "data"

client = make_grpc_client(args["service_url"])
output = client.predict({input_name: img}, "resnet")
4 changes: 4 additions & 0 deletions docs/parameters.md
@@ -10,6 +10,10 @@
| `"shape"` | `tuple/json/"auto"` | `shape` is optional and takes precedence over `batch_size`. The `shape` argument changes the model that is enabled in the model server to fit the parameters. `shape` accepts three forms of the values: * `auto` - The model server reloads the model with the shape that matches the input data matrix. * a tuple, such as `(1,3,224,224)` - The tuple defines the shape to use for all incoming requests for models with a single input. * A dictionary of shapes, such as `{"input1":"(1,3,224,224)","input2":"(1,3,50,50)", "input3":"auto"}` - This option defines the shape of every included input in the model. Some models don't support the reshape operation. If the model can't be reshaped, it remains in the original parameters and all requests with incompatible input format result in an error. See the logs for more information about specific errors. Learn more about supported model graph layers including all limitations at [Shape Inference Document](https://docs.openvino.ai/2025/openvino-workflow/running-inference/model-input-output/changing-input-shape.html). |
| `"batch_size"` | `integer/"auto"` | Optional. By default, the batch size is derived from the model, defined through the OpenVINO Model Optimizer. `batch_size` is useful for sequential inference requests of the same batch size. Some models, such as object detection, don't work correctly with the `batch_size` parameter. With these models, the output's first dimension doesn't represent the batch size. You can set the batch size for these models by using network reshaping and setting the `shape` parameter appropriately. The default option of using the Model Optimizer to determine the batch size uses the size of the first dimension in the first input for the size. For example, if the input shape is `(1, 3, 225, 225)`, the batch size is set to `1`. If you set `batch_size` to a numerical value, the model batch size is changed when the service starts. `batch_size` also accepts a value of `auto`. If you use `auto`, then the served model batch size is set according to the incoming data at run time. The model is reloaded each time the input data changes the batch size. You might see a delayed response upon the first request. |
| `"layout"` | `json/string` | `layout` is an optional argument which allows defining or changing the layout of model input and output tensors. To change the layout (add the transposition step), specify `<target layout>:<source layout>`. Example: `NHWC:NCHW` means that the user will send input data in `NHWC` layout while the model is in `NCHW` layout.<br><br>When specified without the colon separator, it doesn't add a transposition but can determine the batch dimension. E.g. `--layout CN` makes the prediction service treat the second dimension as the batch size.<br><br>When the model has multiple inputs or the output layout has to be changed, use a json format. Set the mapping, such as: `{"input1":"NHWC:NCHW","input2":"HWN:NHW","output1":"CN:NC"}`.<br><br>If not specified, the layout is inherited from the model.<br><br> [Read more](shape_batch_size_and_layout.md#changing-model-input-output-layout) |
| `"mean"` | `array/float/tuple` | Optional. The value(s) subtracted from pixel values when preprocessing input data. It may be a float, a tuple or an array. The number of values should match the number of color channels in the layout. |
| `"scale"` | `array/float/tuple` | Optional. The value(s) that pixel values are divided by when preprocessing input data. It may be a float, a tuple or an array. The number of values should match the number of color channels in the layout. |
| `"color_format"` | `string` | Optional. Defines or changes the color format of model input tensors. To change the color format, specify `<target color format>:<source color format>`, as with the `layout` option. Possible options: `RGB`, `BGR`, `GRAY`, `NV12`, `NV12_2`, `I420` or `I420_3` |
| `"precision"` | `string` | Optional. Changes the precision of model input tensors. Possible options: `f32`, `f16`, `int8`, `uint8`, `int16`, `uint16`, `int32`, `uint32`, `int64`, `uint64` or `bf16` |
| `"model_version_policy"` | `json/string` | Optional. The model version policy lets you decide which versions of a model that the OpenVINO Model Server is to serve. By default, the server serves the latest version. One reason to use this argument is to control the server memory consumption.The accepted format is in json or string. Examples: <br> `{"latest": { "num_versions":2 }` <br> `{"specific": { "versions":[1, 3] } }` <br> `{"all": {} }` |
| `"plugin_config"` | `json/string` | List of device plugin parameters. For full list refer to [OpenVINO documentation](https://docs.openvino.ai/2025/documentation/compatibility-and-support/supported-devices.html) and [performance tuning guide](./performance_tuning.md). Example: <br> `{"PERFORMANCE_HINT": "LATENCY"}` |
| `"nireq"` | `integer` | The size of the internal request queue. When set to 0 or not set, the value is calculated automatically based on available resources.|
12 changes: 6 additions & 6 deletions spelling-whitelist.txt
@@ -9,12 +9,12 @@ src/shape.cpp:438: strIn
src/shape.cpp:488: strIn
src/shape.cpp:507: strIn
src/shape.hpp:121: strIn
src/test/modelconfig_test.cpp:473: OptionA
src/test/modelconfig_test.cpp:479: OptionA
src/test/modelconfig_test.cpp:485: OptionA
src/test/modelconfig_test.cpp:491: OptionA
src/test/modelconfig_test.cpp:497: OptionA
src/test/modelconfig_test.cpp:503: OptionA
src/test/modelconfig_test.cpp:617: OptionA
src/test/modelconfig_test.cpp:623: OptionA
src/test/modelconfig_test.cpp:629: OptionA
src/test/modelconfig_test.cpp:635: OptionA
src/test/modelconfig_test.cpp:641: OptionA
src/test/modelconfig_test.cpp:647: OptionA
src/test/modelinstance_test.cpp:1093: THROUGHTPUT
third_party/aws-sdk-cpp/aws-sdk-cpp.bz
WORKSPACE:98: thirdparty
13 changes: 13 additions & 0 deletions src/BUILD
@@ -635,6 +635,7 @@ ovms_cc_library(
"libovmsschema",
"libovmslayout",
"libovmslayout_configuration",
"libovmscolor_format_configuration",
"libovms_tensorinfo",
"libovmstimer",
"tfs_utils",
@@ -1605,6 +1606,7 @@ ovms_cc_library(
"@com_github_tencent_rapidjson//:rapidjson", # TODO split into parser
"libovmsfilesystem",
"libovmslayout_configuration",
"libovmscolor_format_configuration",
"libovmsmodelversioning",
"libovmsschema",
"libovmsshape",
@@ -1663,6 +1665,17 @@ ovms_cc_library(
],
visibility = ["//visibility:public"],
)
ovms_cc_library(
name = "libovmscolor_format_configuration",
hdrs = ["color_format_configuration.hpp",],
srcs = ["color_format_configuration.cpp",],
deps = [
"//third_party:openvino",
"ovms_header",
"libovmsstatus",
],
visibility = ["//visibility:public"],
)
ovms_cc_library(
name = "libovmsmodelversion",
hdrs = [
4 changes: 4 additions & 0 deletions src/capi_frontend/server_settings.hpp
@@ -216,6 +216,10 @@ struct ModelsSettingsImpl {
std::string batchSize;
std::string shape;
std::string layout;
std::string mean;
std::string scale;
std::string colorFormat;
std::string precision;
std::string modelVersionPolicy;
uint32_t nireq = 0;
std::string targetDevice;
52 changes: 52 additions & 0 deletions src/cli_parser.cpp
@@ -259,6 +259,26 @@ std::variant<bool, std::pair<int, std::string>> CLIParser::parse(int argc, char*
"Resets model layout.",
cxxopts::value<std::string>(),
"LAYOUT")
("mean",
"Resets model mean.",
cxxopts::value<std::string>(),
"MEAN")
("scale",
"Resets model scale.",
cxxopts::value<std::string>(),
"SCALE")
("color_format",
"Resets model color format.",
cxxopts::value<std::string>(),
"COLOR_FORMAT")
("precision",
"Resets model precision.",
cxxopts::value<std::string>(),
"PRECISION")
("resize",
"Resets model resize dimensions.",
cxxopts::value<std::string>(),
"RESIZE")
("model_version_policy",
"Model version policy",
cxxopts::value<std::string>(),
@@ -587,6 +607,38 @@ void CLIParser::prepareModel(ModelsSettingsImpl& modelsSettings, HFSettingsImpl&
modelsSettings.userSetSingleModelArguments.push_back("layout");
}

if (result->count("mean")) {
if (modelsSettings.layout.empty()) {
throw std::logic_error("error parsing options - --mean parameter requires --layout to be set");
}
modelsSettings.mean = result->operator[]("mean").as<std::string>();
modelsSettings.userSetSingleModelArguments.push_back("mean");
}

if (result->count("scale")) {
if (modelsSettings.layout.empty()) {
throw std::logic_error("error parsing options - --scale parameter requires --layout to be set");
}
modelsSettings.scale = result->operator[]("scale").as<std::string>();
modelsSettings.userSetSingleModelArguments.push_back("scale");
}

if (result->count("color_format")) {
if (modelsSettings.layout.empty()) {
throw std::logic_error("error parsing options - --color_format parameter requires --layout to be set");
}
modelsSettings.colorFormat = result->operator[]("color_format").as<std::string>();
modelsSettings.userSetSingleModelArguments.push_back("color_format");
}

if (result->count("precision")) {
if (modelsSettings.layout.empty()) {
throw std::logic_error("error parsing options - --precision parameter requires --layout to be set");
}
modelsSettings.precision = result->operator[]("precision").as<std::string>();
modelsSettings.userSetSingleModelArguments.push_back("precision");
}

if (result->count("model_version_policy")) {
modelsSettings.modelVersionPolicy = result->operator[]("model_version_policy").as<std::string>();
modelsSettings.userSetSingleModelArguments.push_back("model_version_policy");