Skip to content

refactor: migrate metrics to lazy init pattern (and fix auto-register circular-import bug) #1079

@ajbozarth

Description

@ajbozarth

Background

Discovered while implementing tracing Phase 1 (#1045 / #1046 / #1047): the auto-register-at-module-load pattern that metrics uses to wire its plugins has two related problems that both have the same fix.

Problem 1: eager init forces tests to use importlib.reload

mellea/telemetry/metrics.py reads env vars and creates the meter provider at import time (lines 110-132 and 271-277):

_METRICS_ENABLED = _OTEL_AVAILABLE and os.getenv("MELLEA_METRICS_ENABLED", "false").lower() in ("true", "1", "yes")
...
if _OTEL_AVAILABLE and _METRICS_ENABLED:
    _meter_provider = _setup_meter_provider()
    _meter = metrics.get_meter("mellea.metrics", version("mellea"))

Because env vars are read at import, tests that want to toggle metrics need importlib.reload(mellea.telemetry.metrics)test_metrics.py alone has ~20 such reloads, and the pattern repeats across the metrics test suite (~50+ reloads total).

This is the same wart tracing had before being fixed in the tracing Phase 1 PR. Logging (#907) uses lazy init for its provider too, though it doesn't have plugins to register so it only exercises half of the pattern metrics needs.

Problem 2: auto-register fails silently from the early import chain

metrics.py ends with:

if _OTEL_AVAILABLE and _METRICS_ENABLED:
    try:
        from mellea.plugins.registry import register
        from mellea.telemetry.metrics_plugins import _METRICS_PLUGIN_CLASSES
        for _plugin_cls in _METRICS_PLUGIN_CLASSES:
            try:
                register(_plugin_cls())
            except ValueError as e:
                warnings.warn(f"{_plugin_cls.__name__} already registered: {e}", ...)
    except ImportError:
        warnings.warn("Metrics are enabled but the plugin framework is not installed. ...", ...)

When MELLEA_METRICS_ENABLED=true is set before import mellea, the import chain looks like:

mellea.core.__init__
  → mellea.core.backend
    → mellea.core.utils
      → from ..telemetry import get_otlp_log_handler
        → mellea.telemetry.__init__
          → mellea.telemetry.metrics  (top-of-module + bottom auto-register)
            → from mellea.plugins.registry import register
            → from mellea.telemetry.metrics_plugins import _METRICS_PLUGIN_CLASSES
              → from mellea.core.base import GenerationMetadata    # mellea.core mid-import → ImportError
            → falls into the bottom `except ImportError` handler
            → warns "plugin framework is not installed" (false — it IS installed; this is a circular import)
            → NO PLUGINS REGISTERED

Repro:

import os
os.environ['MELLEA_METRICS_ENABLED'] = 'true'
import warnings; warnings.filterwarnings('ignore')
import mellea
from mellea.plugins.manager import has_plugins
from mellea.plugins.types import HookType
print(has_plugins(HookType.GENERATION_POST_CALL))  # False — but should be True

The reason metrics tests don't catch this: tests reload mellea.telemetry.metrics AFTER mellea.core is fully loaded, so the import chain is no longer circular and the register call succeeds. But in real usage with the env var set before import, metrics never get wired up.

Scope

Mirror what the tracing Phase 1 PR did for tracing, applied to metrics:

  1. Lazy init: replace the import-time _meter_provider = _setup_meter_provider() with _ensure_meter_provider_initialised() called from a public get_meter() accessor (or from each record_* function). Module-level globals start as None/False.
  2. Move plugin registration into _ensure_meter_provider_initialised(): by the time anyone calls a metrics function, mellea.core has finished importing, so the registry import succeeds.
  3. Add a _plugins_registered flag so re-init (after test state resets) doesn't double-register and warn.
  4. Drop importlib.reload from metrics tests — replace with a _reset_metrics_state() helper that nulls the module globals, mirroring the pattern in tracing.

After: MELLEA_METRICS_ENABLED=true python -c "import mellea; ...has_plugins(...)" returns True.

Reference

The tracing Phase 1 PR is the canonical reference — it does both halves of the pattern metrics needs (lazy provider init + lazy plugin registration in the same function). See mellea/telemetry/tracing.py _ensure_provider_initialised() calling _register_tracing_plugins() for the exact shape to copy. Logging (mellea/telemetry/logging.py) is a partial reference: it has the lazy-provider-init half but no plugins to register.

Phase

Independent of epic #444. Self-contained, can land in parallel with any of the Phase 2 / Phase 3 tracing issues. Touches mellea/telemetry/metrics.py and test/telemetry/test_metrics*.py only.

Acceptance criteria

  • metrics.py uses lazy init; _meter_provider and _meter are populated on first use, not at import time.
  • Plugin registration happens inside the lazy-init path, not at module import.
  • MELLEA_METRICS_ENABLED=true set before import mellea results in registered plugins (verify with has_plugins(GENERATION_POST_CALL) is True).
  • Metrics tests no longer use importlib.reload(mellea.telemetry.metrics) — replaced with state-reset helper.
  • Existing metrics tests pass unchanged in semantics.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions