
fix: use Ollama OpenAI-compatible endpoint (/v1) by default#685

Open
justi wants to merge 1 commit into crmne:main from justi:fix/ollama-v1-api-base

Conversation

justi commented Mar 17, 2026

Problem

The Ollama provider inherits from OpenAI but uses the native API base (e.g. http://localhost:11434) which doesn't serve OpenAI-compatible endpoints. This causes:

  1. Model listing fails with 404 — provider tries GET /models but Ollama serves models at /api/tags (native) or /v1/models (OpenAI-compatible)
  2. with_schema silently doesn't work — since models aren't in the registry, they don't have structured_output capability, so response_format is never sent

Steps to reproduce

```ruby
RubyLLM.configure { |c| c.ollama_api_base = "http://localhost:11434" }
RubyLLM.models.refresh!
# => WARN: Failed to fetch Ollama models (RubyLLM::Error: 404 page not found)

chat = RubyLLM.chat(model: "gemma:latest")
# => RubyLLM::ModelNotFoundError: Unknown model: gemma:latest
```

Solution

Automatically append /v1 to ollama_api_base so the provider uses Ollama's OpenAI-compatible endpoint. Idempotent — does not double-append if /v1 is already present (as in the existing test configuration).

Since the Ollama provider subclasses the OpenAI provider (`Ollama < OpenAI`), pointing it at the `/v1` endpoint means all inherited OpenAI logic (chat, models, schemas) works without any other changes.

```
# Before: http://localhost:11434       → GET /models       → 404
# After:  http://localhost:11434/v1    → GET /v1/models    → 200 ✓
```

What works after this fix

  • RubyLLM.models.refresh! fetches Ollama models correctly
  • RubyLLM.chat(model: "gemma:latest") finds the model
  • chat.with_schema(MySchema) sends response_format and returns parsed Hash
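Concretely, once the model is in the registry with the `structured_output` capability, the `/v1/chat/completions` request can carry a `response_format` along these lines. This is an illustrative payload in the OpenAI `json_schema` style; RubyLLM's exact serialization and the schema name/fields here are assumptions, not taken from the diff:

```ruby
require "json"

# Illustrative response_format in the OpenAI structured-output shape;
# the schema below ("my_schema", its properties) is hypothetical.
response_format = {
  "type" => "json_schema",
  "json_schema" => {
    "name" => "my_schema",
    "schema" => {
      "type" => "object",
      "properties" => { "answer" => { "type" => "string" } },
      "required" => ["answer"]
    }
  }
}

puts JSON.generate(response_format)
```

Against the native base URL this field was never sent at all, which is why `with_schema` failed silently rather than loudly.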

Edge cases handled

| Input | Result |
| --- | --- |
| `http://localhost:11434` | `http://localhost:11434/v1` |
| `http://localhost:11434/` | `http://localhost:11434/v1` |
| `http://localhost:11434/v1` | `http://localhost:11434/v1` (no change) |
| `http://localhost:11434/v1/` | `http://localhost:11434/v1` |
| `https://my-ollama.com:8080` | `https://my-ollama.com:8080/v1` |
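The edge cases above boil down to a trailing-slash cleanup plus an idempotent append. A minimal self-contained sketch — the helper name `normalize_ollama_api_base` is hypothetical; the real logic lives in the provider's `#api_base` and may differ in detail:

```ruby
# Hypothetical standalone version of the normalization described above.
def normalize_ollama_api_base(base)
  base = base.chomp("/")                      # trailing-slash cleanup
  base.end_with?("/v1") ? base : "#{base}/v1" # idempotent /v1 append
end

# Walk the edge-case table:
{
  "http://localhost:11434"     => "http://localhost:11434/v1",
  "http://localhost:11434/"    => "http://localhost:11434/v1",
  "http://localhost:11434/v1"  => "http://localhost:11434/v1",
  "http://localhost:11434/v1/" => "http://localhost:11434/v1",
  "https://my-ollama.com:8080" => "https://my-ollama.com:8080/v1"
}.each do |input, expected|
  puts "#{input} -> #{normalize_ollama_api_base(input)} (expected #{expected})"
end
```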

Note on native API

Ollama's native /api/chat endpoint with format param provides better nested JSON Schema enforcement than the OpenAI-compatible endpoint. A future enhancement could use the native endpoint for structured output while keeping /v1 for everything else.
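For reference, a sketch of what such a native-endpoint structured-output request could look like. The payload shape follows Ollama's documented `/api/chat` API, where `format` accepts a JSON Schema in recent Ollama versions; the model name and schema here are illustrative, and no request is actually sent:

```ruby
require "json"

# Illustrative schema; any JSON Schema object works here.
schema = {
  "type" => "object",
  "properties" => {
    "name" => { "type" => "string" },
    "age"  => { "type" => "integer" }
  },
  "required" => ["name", "age"]
}

# Body for a POST to http://localhost:11434/api/chat (Ollama native API).
# Passing the schema in "format" asks Ollama to constrain the output,
# which is the stronger enforcement the note above refers to.
payload = {
  "model"    => "gemma:latest",
  "messages" => [{ "role" => "user", "content" => "Describe a person as JSON." }],
  "format"   => schema,
  "stream"   => false
}

puts JSON.generate(payload)
```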

Tests

5 new test cases for #api_base added to ollama_spec.rb, following existing spec style (subject, let, context, instance_double).

Tested locally with Ollama 0.17.5, gemma:latest (9B), llama3.2:3b (3B).

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 17, 2026 23:33
Copilot AI left a comment

Pull request overview

This PR fixes the Ollama provider’s default API base so it targets Ollama’s OpenAI-compatible endpoints, which enables OpenAI-inherited behavior (model listing, chat, and structured output) to work correctly against a standard Ollama host URL.

Changes:

  • Normalize ollama_api_base to always use the /v1 OpenAI-compatible endpoint (without double-appending and with trailing-slash cleanup).
  • Update Ollama provider specs to cover #api_base normalization cases and to use a local config double (consistent with other provider specs).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `lib/ruby_llm/providers/ollama.rb` | Adjusts `#api_base` to normalize to the `/v1` OpenAI-compatible base URL for Ollama. |
| `spec/ruby_llm/providers/ollama_spec.rb` | Adds test coverage for `#api_base` normalization and refactors header tests to use an isolated config double. |
