Ollama cloud models fail: native /api/chat times out, /v1/chat/completions works #6

@bussyjd

Problem

llmspy routes Ollama requests through the native /api/chat endpoint, which times out (or returns errors) for Ollama cloud/remote models (e.g., glm-5:cloud). The OpenAI-compatible /v1/chat/completions endpoint works correctly for the same models.

Reproduction

With Ollama running and glm-5:cloud pulled:

# This times out / fails:
curl http://ollama:11434/api/chat \
  -d '{"model":"glm-5:cloud","messages":[{"role":"user","content":"hi"}],"stream":false}'

# This works:
curl http://ollama:11434/v1/chat/completions \
  -d '{"model":"glm-5:cloud","messages":[{"role":"user","content":"hi"}],"stream":false}'

llmspy returns:

{"responseStatus": {"errorCode": "Error", "message": "Expecting value: line 1 column 1 (char 0)"}}
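For context, the message text above matches what Python's standard `json` decoder reports when asked to parse an empty or non-JSON string. That suggests (but does not prove) that llmspy tried to decode an empty or non-JSON body from Ollama. A minimal sketch, where `describe_parse_failure` is a hypothetical helper and not part of llmspy:

```python
import json


def describe_parse_failure(body: str) -> str:
    """Return the JSON decoder's error message for `body`, or 'parsed OK'."""
    try:
        json.loads(body)
    except json.JSONDecodeError as exc:
        return str(exc)
    return "parsed OK"


# An empty body and an HTML error page both fail at the very first character.
print(describe_parse_failure(""))
print(describe_parse_failure("<html>504 Gateway Timeout</html>"))
print(describe_parse_failure('{"done": true}'))
```

Both failing cases produce `Expecting value: line 1 column 1 (char 0)`, so the error alone cannot tell a timeout apart from a proxy error page.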

Root Cause

In llms/main.py, the OllamaProvider sends chat requests to {api}/api/chat (the native Ollama endpoint). For Ollama cloud/remote models, this endpoint times out or fails as shown in the reproduction above; only the OpenAI-compatible /v1/chat/completions endpoint handles them correctly.
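To illustrate what switching endpoints involves, here is a rough sketch, not the actual OllamaProvider code. The helpers `chat_url` and `extract_content` are hypothetical names. The response shapes assumed are the documented ones: the native endpoint returns a top-level `message`, while the OpenAI-compatible endpoint wraps it in `choices`.

```python
from typing import Any


def chat_url(api_base: str, openai_compat: bool = True) -> str:
    """Build the chat endpoint URL for an Ollama base URL (hypothetical helper)."""
    base = api_base.rstrip("/")
    if openai_compat:
        return f"{base}/v1/chat/completions"
    return f"{base}/api/chat"


def extract_content(response: dict[str, Any]) -> str:
    """Pull the assistant text out of either response shape."""
    if "choices" in response:
        # OpenAI-compatible shape: {"choices": [{"message": {...}}]}
        return response["choices"][0]["message"]["content"]
    # Native Ollama shape: {"message": {...}, "done": true}
    return response["message"]["content"]


print(chat_url("http://ollama:11434/"))
print(chat_url("http://ollama:11434", openai_compat=False))
```

Any code that reads the native response fields would need the same kind of adjustment if the provider moves to /v1/chat/completions.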

Secondary Issue: Silent Model Discovery Failure

When all_models: true is set and Ollama is temporarily unreachable at llmspy startup (e.g., network not ready, binding mismatch), load_models() fails silently. llmspy then has an empty model list for Ollama and returns "Model not found" for all requests, even after Ollama becomes reachable. A restart of llmspy is required to recover.

Suggested Fixes

  1. Use /v1/chat/completions for Ollama — switch to the OpenAI-compatible endpoint which handles both local and cloud models
  2. Retry model discovery — if load_models() fails at startup, retry periodically or on first request rather than failing permanently
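The second suggestion could look something like the sketch below: retry discovery lazily on request, rate-limited so an unreachable Ollama is not hammered. `ModelCache`, its `fetch` callable, and the 30-second interval are all assumptions for illustration; how this would hook into llmspy's load_models() is not shown here.

```python
import time
from typing import Callable, Iterable, Optional


class ModelCache:
    """Lazily (re)loads a model list until discovery succeeds (hypothetical)."""

    def __init__(
        self,
        fetch: Callable[[], Iterable[str]],
        retry_interval: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._retry_interval = retry_interval
        self._clock = clock
        self._models: list[str] = []
        self._last_attempt: Optional[float] = None

    def get(self) -> list[str]:
        if self._models:
            return self._models
        now = self._clock()
        too_soon = (
            self._last_attempt is not None
            and now - self._last_attempt < self._retry_interval
        )
        if too_soon:
            return []
        self._last_attempt = now
        try:
            self._models = list(self._fetch())
        except Exception as exc:
            # Log instead of failing silently, then retry on a later call.
            print(f"model discovery failed, will retry: {exc}")
        return self._models
```

With this shape, a request that arrives after Ollama becomes reachable triggers a fresh discovery attempt instead of returning "Model not found" until restart.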

Environment

  • Ollama with remote/cloud models (glm-5:cloud)
  • llmspy v3.0.34-obol.1
  • Kubernetes (k3d) with ExternalName service routing to host Ollama
