fix(operator): configure embeddings first; SDK-direct catalog refresh #401
Open
aaronsb wants to merge 2 commits into
Two coupled fixes for Step 6 init failure ("Anthropic requires an
embedding provider") on first-run setup.
1. Reorder guided-init: configure the local embedding profile right
after admin user creation (new Step 4), before AI provider selection.
The wizard already collects the GPU/CPU choice at the top; map that
to a PyTorch device string (mac→mps, nvidia/amd/amd-host→cuda,
cpu→cpu) and pass it to `configure.py embedding --device`. The API
container picks up the active profile + device at startup. Renumber
subsequent steps (AI provider → 5, validate key → 6, model select →
7); Garage and start-app stay at 8 and 9.
2. SDK-direct catalog refresh. `models refresh` previously went through
get_provider(), which instantiates AnthropicProvider — whose __init__
eagerly constructs an OpenAIProvider as the embedding delegate. The
operator container has no loaded EmbeddingModelManager (only the API
container initializes one at startup), so get_embedding_provider()
returns None and the eager fallback fails for lack of an OpenAI key.
New _fetch_catalog_via_sdk() bypasses __init__ via __new__, sets
only the SDK client (or api_key for OpenRouter), and reuses the
existing fetch_model_catalog method. Mirrors the SDK-direct pattern
already used by _validate_provider_key.
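
The `__new__` bypass described in item 2 can be sketched in isolation. This is a minimal illustration, not the code from this PR: `EagerProvider`, its dict-based `client`, and `fetch_catalog_without_init` are hypothetical stand-ins for a provider whose `__init__` does more than the catalog fetch needs.

```python
class EagerProvider:
    """Hypothetical provider whose __init__ has a side effect that can fail."""

    def __init__(self, api_key: str):
        self.client = {"api_key": api_key}
        # Stand-in for eagerly constructing an embedding delegate that is
        # unavailable in the current process.
        raise RuntimeError("requires an embedding provider")

    def fetch_model_catalog(self) -> list[str]:
        # Only self.client is needed here, so nothing else has to be set up.
        return [f"model-for-{self.client['api_key']}"]


def fetch_catalog_without_init(api_key: str) -> list[str]:
    # __new__ allocates the instance without running __init__.
    provider = EagerProvider.__new__(EagerProvider)
    # Set only the attribute that fetch_model_catalog reads.
    provider.client = {"api_key": api_key}
    return provider.fetch_model_catalog()
```

The trade-off is that the caller must know which attributes the reused method reads; any attribute normally set in `__init__` and not set by hand will raise `AttributeError` when accessed.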
Adds a --device flag to `configure.py embedding` so the wizard can
write the chosen device onto the activated profile in one call.
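
As a rough sketch of the mapping and the flag, written in Python for illustration (the wizard itself is a shell script): the mapping values come from the commit message above, while the parser layout, the `choices` restriction, and the function names are assumptions and may not match the real `configure.py`.

```python
import argparse

# Wizard GPU mode -> PyTorch device string, as listed in the commit message.
GPU_MODE_TO_DEVICE = {
    "mac": "mps",
    "nvidia": "cuda",
    "amd": "cuda",
    "amd-host": "cuda",
    "cpu": "cpu",
}


def device_for_gpu_mode(gpu_mode: str) -> str:
    """Translate a wizard GPU mode into a PyTorch device string."""
    try:
        return GPU_MODE_TO_DEVICE[gpu_mode]
    except KeyError:
        raise ValueError(f"unknown GPU mode: {gpu_mode!r}") from None


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser layout: an "embedding" subcommand with --device.
    parser = argparse.ArgumentParser(prog="configure.py")
    sub = parser.add_subparsers(dest="command", required=True)
    emb = sub.add_parser("embedding")
    emb.add_argument(
        "--device",
        choices=sorted(set(GPU_MODE_TO_DEVICE.values())),
        default=None,
        help="PyTorch device to store on the activated profile",
    )
    return parser
```

With this layout, `build_parser().parse_args(["embedding", "--device", device_for_gpu_mode("mac")])` yields `args.device == "mps"`.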
set_model_default fetched the provider/category for the target row and then unpacked the result with `provider, category = row`. When the caller's connection is configured with RealDictCursor (as the operator container's configure.py is; see operator/configure.py line 39), the row is a dict subclass, and tuple unpacking silently yields the column *names* ("provider" and "category") rather than the values.

The clear-existing-default UPDATE then matched zero rows, and the set-new-default UPDATE collided with the still-set old default, violating idx_catalog_default on (provider, category). The API container's path didn't hit this because AGEClient.pool doesn't set a cursor_factory; only this operator-driven path tripped on it.

Replace the SELECT + tuple-unpack with a subquery so the function is cursor-factory-agnostic. As a bonus, the path is now idempotent: setting a model that's already the default no longer races with itself.

The bug manifested at Step 7 of guided init as: "Models command failed: duplicate key value violates unique constraint 'idx_catalog_default'".
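
The pitfall and the subquery approach can be reproduced in a small self-contained sketch. It uses SQLite as a stand-in for PostgreSQL so that it runs anywhere; the `catalog` table, its columns, and the exact SQL are assumptions made for illustration and are not the project's schema or the statements in this PR.

```python
import sqlite3

# Pitfall: unpacking a dict-like row iterates its keys, not its values.
row = {"provider": "anthropic", "category": "extraction"}
provider, category = row
assert (provider, category) == ("provider", "category")


def make_catalog() -> sqlite3.Connection:
    """Build a hypothetical catalog with one default per (provider, category)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE catalog (
            id INTEGER PRIMARY KEY,
            provider TEXT NOT NULL,
            category TEXT NOT NULL,
            is_default INTEGER NOT NULL DEFAULT 0
        );
        CREATE UNIQUE INDEX idx_catalog_default
            ON catalog (provider, category) WHERE is_default = 1;
        INSERT INTO catalog (id, provider, category, is_default) VALUES
            (1, 'anthropic', 'extraction', 1),
            (2, 'anthropic', 'extraction', 0);
        """
    )
    return conn


def set_model_default(conn: sqlite3.Connection, model_id: int) -> None:
    """Make model_id the default without reading the row into Python."""
    with conn:  # one transaction: commit on success, roll back on error
        # Clear any other default in the target row's (provider, category),
        # resolving that pair with subqueries instead of unpacking a row.
        conn.execute(
            """
            UPDATE catalog SET is_default = 0
            WHERE is_default = 1
              AND id <> :id
              AND provider = (SELECT provider FROM catalog WHERE id = :id)
              AND category = (SELECT category FROM catalog WHERE id = :id)
            """,
            {"id": model_id},
        )
        conn.execute(
            "UPDATE catalog SET is_default = 1 WHERE id = :id", {"id": model_id}
        )


def current_defaults(conn: sqlite3.Connection) -> list[int]:
    rows = conn.execute("SELECT id FROM catalog WHERE is_default = 1 ORDER BY id")
    return [r[0] for r in rows]
```

Because the row's values never pass through Python, the result does not depend on whether the cursor returns tuples or dicts, and calling `set_model_default` twice with the same id leaves the table unchanged.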
Summary
Follow-up to #400. Init reached Step 6 (model catalog refresh) and crashed with the same Anthropic chicken-and-egg error: "Anthropic requires an embedding provider".
Two coupled root causes:
1. The operator container can never construct `LocalEmbeddingProvider`. `EmbeddingModelManager` is loaded only by the API container at startup (`api/app/main.py:192`). The operator container imports the same code but never calls `init_embedding_model_manager()`, so the module-level `_model_manager` global stays `None`. `get_embedding_provider()` in the operator therefore always returns `None`, and `AnthropicProvider.__init__` falls into its eager `OpenAIProvider()` fallback, which fails because no OpenAI key is stored on first-run setup.
2. The wizard configured embeddings after extraction model selection. GPU/CPU choice from the start of the wizard wasn't being propagated to the embedding profile's `device` column.

What changed

`operator/configure.py`
- `_fetch_catalog_via_sdk(provider)` helper. Bypasses `AIProvider.__init__` via `__new__`, sets only `self.client` (or `self.api_key` for OpenRouter, which is what its `fetch_model_catalog` actually uses), and calls the existing `fetch_model_catalog` method. No catalog logic is duplicated.
- `models refresh` uses the new helper for openai/anthropic/openrouter; falls back to `get_provider()` for ollama/llamacpp (different requirements, not part of guided wizard).
- `--device` flag on `embedding` subcommand; writes the chosen device onto the activated profile.

`operator/lib/guided-init.sh`
- `GPU_MODE` → PyTorch device string: `mac` → `mps`, `nvidia` → `cuda`, `amd|amd-host` → `cuda` (ROCm PyTorch presents as cuda), `cpu` → `cpu`.

Test plan
- `./operator.sh init` with hot-reload + CPU + Anthropic.
- `device=cpu`.
- `models refresh anthropic`) succeeds without the embedding-provider error.
- `device=mps`.
- `_fetch_catalog_via_sdk` helper.