Implement ObjectDetection support #205

benjamintli · 2026-01-25T21:15:24Z

Summary

This PR adds support for object detection models to Optimum ExecuTorch. It adds ExecuTorchModelForObjectDetection, ObjectDetectionExportableModule, an object-detection task, and a test for DETR-type models.

Notes

ObjectDetectionExportableModule traces an object detection model and stores num_channels, image_size, get_label_ids, and get_label_names:
- num_channels is not consistently defined in the configs of the existing object detection models, so ObjectDetectionExportableModule tries a few different config options and, if none of them resolves, defaults to 3 (RGB), which seems like a sensible default. The priority is: num_channels passed via kwargs -> num_channels defined in the config -> 3.
- image_size is typically not defined in configs either, since some models accept dynamic sizes. There is no sensible default, because users pick a size at inference/deployment time. The image size is therefore passed in via the CLI and is only used for object-detection models.
- get_label_ids and get_label_names are two flat lists that are used to reconstruct id2label. ExecuTorch doesn't seem to support storing dicts as values in constant_methods (they get flattened), so the mapping is stored as two lists instead.
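The two-list workaround for id2label can be illustrated with a minimal sketch. The helper names `flatten_id2label` and `rebuild_id2label` are hypothetical and are not taken from this PR; the sketch only assumes that the ids and names are stored as parallel lists in the same order.

```python
def flatten_id2label(id2label):
    """Split an id2label mapping into two parallel flat lists (hypothetical helper)."""
    # Config keys may arrive as ints or as strings (e.g. after a JSON round trip).
    normalized = {int(k): str(v) for k, v in id2label.items()}
    label_ids = sorted(normalized)
    label_names = [normalized[i] for i in label_ids]
    return label_ids, label_names


def rebuild_id2label(label_ids, label_names):
    """Recombine the two flat lists into a dict of class id -> label (hypothetical helper)."""
    if len(label_ids) != len(label_names):
        raise ValueError("label_ids and label_names must have the same length")
    return dict(zip(label_ids, label_names))


# Illustrative subset of a label map; the real map comes from the model config.
ids, names = flatten_id2label({"75": "remote", 17: "cat", 63: "couch"})
id2label = rebuild_id2label(ids, names)
```

Sorting the ids before splitting keeps the two lists aligned regardless of the order of the original dict.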
ExecuTorchModelForObjectDetection has three attributes: num_channels, image_size, and id2label, which is a dict mapping class ids to labels.
Added timm as a dev dependency for DETR.
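The num_channels fallback order from the notes above (kwargs, then config, then 3) could look roughly like the sketch below. This is not the code from this PR: the function name `resolve_num_channels` and the list of config locations in `CANDIDATE_PATHS` are assumptions made for illustration, and the actual module may probe different attributes.

```python
from types import SimpleNamespace

DEFAULT_NUM_CHANNELS = 3  # RGB

# Assumed locations where a config might store the channel count (illustrative only).
CANDIDATE_PATHS = ("num_channels", "backbone_config.num_channels")


def _get_nested(obj, dotted_path):
    """Follow a dotted attribute path, returning None if any step is missing."""
    for name in dotted_path.split("."):
        obj = getattr(obj, name, None)
        if obj is None:
            return None
    return obj


def resolve_num_channels(config, **kwargs):
    """Resolve num_channels with priority: kwargs -> config -> default of 3."""
    explicit = kwargs.get("num_channels")
    if explicit is not None:
        return int(explicit)
    for path in CANDIDATE_PATHS:
        value = _get_nested(config, path)
        if value is not None:
            return int(value)
    return DEFAULT_NUM_CHANNELS


# Stand-in config objects; real configs would come from the model being exported.
grayscale_config = SimpleNamespace(num_channels=1)
nested_config = SimpleNamespace(backbone_config=SimpleNamespace(num_channels=4))
empty_config = SimpleNamespace()
```

With these stand-ins, an explicit keyword argument wins over the config value, and a config with no channel information falls back to 3.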

Testing

Ran the unit tests locally.
Ran this script: https://gist.github.com/benjamintli/c188ee848fd945ec3ef78558a5803b6c. Output:

Loaded ExecuTorchModelForObjectDetection
=== Running inference ===
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
[cpuinfo_utils.cpp:71] Reading file /sys/devices/soc0/image_version
[cpuinfo_utils.cpp:87] Failed to open midr file /sys/devices/soc0/image_version
logits shape: torch.Size([1, 100, 92])
pred_boxes shape: torch.Size([1, 100, 4])

=== Detections (confidence > 0.7) ===
  remote: 1.00 @ box [38.75445556640625, 70.63130187988281, 177.2999267578125, 117.51690673828125]
  remote: 0.99 @ box [334.3410949707031, 73.98548126220703, 369.325439453125, 188.1864776611328]
  couch: 1.00 @ box [-0.009555816650390625, 1.426863670349121, 639.6910400390625, 474.4981994628906]
  cat: 1.00 @ box [11.712303161621094, 51.7900390625, 314.5272216796875, 469.29779052734375]
  cat: 1.00 @ box [345.2006530761719, 23.41107940673828, 640.120849609375, 370.40679931640625]

=== Drawing bounding boxes ===
Saved result to detections_output.jpg

Done!
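The gist itself is not reproduced here. As a rough, dependency-free sketch of how raw logits and pred_boxes of the shapes above could be turned into the thresholded detections shown, assuming the usual DETR conventions (the last logit is the "no object" class and boxes are normalized center-x, center-y, width, height), one might write something like the following. The function names are hypothetical, and the inputs are plain lists for a single image, i.e. with the batch dimension already removed.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cxcywh_to_xyxy(box, width, height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    return [
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    ]


def postprocess(logits, pred_boxes, id2label, image_size, threshold=0.7):
    """Keep queries whose best real-class probability exceeds the threshold."""
    width, height = image_size
    detections = []
    for query_logits, box in zip(logits, pred_boxes):
        # Assumption: the final entry is the "no object" class, so drop it.
        probs = softmax(query_logits)[:-1]
        score = max(probs)
        class_id = probs.index(score)
        if score > threshold:
            label = id2label.get(class_id, str(class_id))
            detections.append((label, score, cxcywh_to_xyxy(box, width, height)))
    return detections


# Toy input: two queries, three real classes plus "no object"; not real model output.
toy_logits = [[0.0, 10.0, 0.0, 0.0], [0.0, 0.0, 0.0, 10.0]]
toy_boxes = [[0.5, 0.5, 0.5, 0.5], [0.2, 0.2, 0.1, 0.1]]
results = postprocess(toy_logits, toy_boxes, {1: "cat"}, image_size=(640, 480))
```

In the toy input, the second query is dominated by the "no object" class and is filtered out, so only the first query survives the 0.7 threshold.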

This is my first PR in this repo, so I'm open to any feedback!

benjamintli added 2 commits January 25, 2026 16:02

initial implementation of object detection

e1c5699

add id2label, num_channels, image_size to model

ad38485

benjamintli marked this pull request as ready for review January 26, 2026 04:20

change function name

e823121
