Skip to content

[Feature] Add shared OpenTelemetry producer config#297

Open
royischoss wants to merge 1 commit into
mlrun:developmentfrom
royischoss:ceml-708
Open

[Feature] Add shared OpenTelemetry producer config#297
royischoss wants to merge 1 commit into
mlrun:developmentfrom
royischoss:ceml-708

Conversation

@royischoss
Copy link
Copy Markdown
Collaborator

@royischoss royischoss commented May 25, 2026

📝 Description

Adds a top-level telemetry: block to the umbrella chart so any service in the bundle can read the same OpenTelemetry producer-side config. Today it wires MLRUN_TELEMETRY__* env vars into mlrun-api via the existing
mlrun-common-env ConfigMap; Nuclio and other services are left for follow-up.

Mirrors the producer side of the MLRun PR (mlrun/mlrun#9668) and the analogous mlefi PR (IG4-2293, #532) on the IG4 side.


🛠️ Changes Made

File Change
values.yaml Add top-level telemetry: block (enabled / otlpEndpoint / insecure / headersSecretName), sibling to mlrun: / nuclio:
templates/config/mlrun-env-configmap.yaml Append MLRUN_TELEMETRY__* keys with resolution + safety logic
README.md Document the opt-in knobs + truth table + external-endpoint example
Chart.yaml Bump to 0.11.0-rc.38

✅ Checklist

  • I have tested the changes in this PR
  • I confirmed whether my changes require a change in documentation and if so, I created another PR in MLRun for the relevant documentation.
  • I confirmed whether my changes require a changes in QA tests, for example: credentials changes, resources naming change and if so, I updated the relevant Jira ticket for QA.
  • I increased the Chart version in charts/mlrun-ce/Chart.yaml.
  • I confirmed that the installation works both on a local Docker Desktop environment and on a real cluster when using the required prerequisites.
    • If installation issues were found, I updated the relevant Jira ticket with the issue and steps to reproduce, or updated the prerequisites documentation if the issue is related to missing or outdated prerequisites.
  • If needed, update https://github.com/mlrun/ce/blob/development/charts/mlrun-ce/README.md with the relevant installation instructions and version Matrix.
  • If needed, update the following values files for multi namespace support:

🧪 Testing

Live cluster verification on app1.vmdev232ig4.lab.iguaz.io:
- Test 1 (no OTel): mlrun-common-env ConfigMap shows MLRUN_TELEMETRY__ENABLED: "false"
- Test 2 (OTel enabled): ConfigMap shows ENABLED: "true", OTLP_ENDPOINT: otel-collector.mlrun.svc.cluster.local:4317
- Probe pod with same envFrom: configMapRef: mlrun-common-env receives all three env vars at runtime


🔗 References


🚨 Breaking Changes?

  • Yes (explain below)
  • No

🔍️ Additional Notes

  - values.yaml — top-level telemetry: block with "" default for enabled (inherits collector state)
  - templates/config/mlrun-env-configmap.yaml — resolution logic + safety override + endpoint derivation
  - README.md — opt-in docs with truth table and external-endpoint example
  - Chart.yaml — bumped to 0.11.0-rc.38
Copy link
Copy Markdown
Contributor

@davesh0812 davesh0812 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good overall.
please add some unit tests if you want you can use those tests I added in mlefi :

  - test_telemetry_default_inherits_collector_disabled — default values, expect MLRUN_TELEMETRY__ENABLED: "false", no endpoint key.
  - test_telemetry_inherits_collector_enabled — --set opentelemetry.collector.enabled=true, expect ENABLED: "true" + in-cluster endpoint.
  - test_telemetry_external_endpoint — --set telemetry.enabled=true --set telemetry.otlpEndpoint=external.com:4317 --set opentelemetry.collector.enabled=false, expect endpoint passes through.
  - test_telemetry_safety_force_disable — --set telemetry.enabled=true with collector off and no endpoint, expect ENABLED: "false".
  - test_telemetry_headers_secret_emitted_only_when_enabled — headersSecretName set but enabled=false, expect no HEADERS_SECRET_NAME key.

Comment thread CLAUDE.md
<!-- These are Claude-specific behavioral instructions on HOW to respond. AGENTS.md covers WHAT the codebase contains. -->

- Values changes: show `--set` flags or a patch values file overlay, not edits to `values.yaml` directly, unless there is a change with the default value that should be reflected in `values.yaml` (e.g. a new component's `enabled` flag)
-it Values changes: show `--set` flags or a patch values file overlay, not edits to `values.yaml` directly, unless there is a change with the default value that should be reflected in `values.yaml` (e.g. a new component's `enabled` flag)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-it Values changes: show `--set` flags or a patch values file overlay, not edits to `values.yaml` directly, unless there is a change with the default value that should be reflected in `values.yaml` (e.g. a new component's `enabled` flag)
- Values changes: show `--set` flags or a patch values file overlay, not edits to `values.yaml` directly, unless there is a change with the default value that should be reflected in `values.yaml` (e.g. a new component's `enabled` flag)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants