---
title: Fallback
description: When a model fails, Manifest retries with a backup
icon: life-buoy
---

What is fallback?

When a model fails (provider outage, rate limit, bad request), Manifest retries with a backup model from the same tier. Your agent gets a response instead of an error.

How it works

1. The primary model returns an error (any HTTP 4xx or 5xx status).
2. Manifest picks the next fallback model from the tier's fallback list. Fallback models are tried in the order you configure them.
3. The original request is forwarded to the backup model.
4. If that model also fails, Manifest continues down the fallback list until a model succeeds or all options are exhausted.
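The steps above can be sketched as a simple loop. This is an illustrative sketch, not Manifest's implementation: `run_with_fallback` and the `send` callback are hypothetical names, `send` stands in for forwarding the original request to a model and returning the HTTP status, and the 424 exception described under "What triggers a fallback" is left out for brevity.

```python
from typing import Callable, Optional, Sequence, Tuple


def run_with_fallback(
    primary: str,
    fallbacks: Sequence[str],
    send: Callable[[str], int],
) -> Tuple[Optional[str], int]:
    """Try the primary model, then each fallback in its configured order.

    `send` is a hypothetical stand-in that forwards the original request
    to the named model and returns the HTTP status code it got back.
    Returns (model_that_succeeded, attempts), or (None, attempts) when
    every model in the chain failed.
    """
    attempts = 0
    for model in [primary, *fallbacks]:
        attempts += 1
        status = send(model)
        if status < 400:  # anything below 400 is treated as success here
            return model, attempts
    return None, attempts  # chain exhausted
```

For example, with a primary that returns 503 and a first fallback that returns 200, the function returns the first fallback's name after two attempts.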

What triggers a fallback

Any HTTP status code of 400 or higher triggers a fallback, with one exception: 424 (Failed Dependency). Manifest itself returns 424 when the entire chain is exhausted, so excluding it prevents infinite loops.

This includes:

| Status | Example |
| --- | --- |
| 400 | Bad request |
| 401 | Authentication error |
| 403 | Forbidden |
| 429 | Rate limited |
| 500 | Internal server error |
| 502 | Bad gateway |
| 503 | Service unavailable |
| 529 | Provider overloaded |
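The trigger rule can be written as a one-line predicate. This is a minimal sketch of the rule as stated in this section; the names `triggers_fallback` and `CHAIN_EXHAUSTED_STATUS` are illustrative and not part of Manifest's API.

```python
# 424 (Failed Dependency) is what Manifest returns when the whole chain
# is exhausted, so it must not start another round of fallbacks.
CHAIN_EXHAUSTED_STATUS = 424


def triggers_fallback(status: int) -> bool:
    """Return True when an HTTP status should cause a fallback attempt."""
    return status >= 400 and status != CHAIN_EXHAUSTED_STATUS
```

Every status in the table above satisfies the predicate, while 424 and any successful status do not.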

Configuration

Fallback models are configured per tier in the Manifest dashboard. Each tier can have up to 5 fallback models, tried in order.

On the hosted dashboard:

1. Go to [app.manifest.build](https://app.manifest.build) and navigate to **Routing**.
2. Click on any tier (Simple, Standard, Complex, or Reasoning).
3. Add up to 5 fallback models.
4. Drag to reorder. Models are tried from top to bottom.

On a local instance:

1. Go to [http://localhost:2099](http://localhost:2099) and navigate to **Routing**.
2. Click on any tier to configure its fallback chain.
3. Add up to 5 fallback models.
4. Drag to reorder priority.
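To make the constraints concrete, here is a small in-memory model of a per-tier fallback list. It is only a sketch of the rules described above (four tiers, at most 5 fallbacks, order matters); the `TierFallbacks` class and the model names are hypothetical, and it does not reflect how Manifest stores this configuration.

```python
from dataclasses import dataclass, field
from typing import List

MAX_FALLBACKS = 5  # per-tier limit stated in the text
TIERS = ("Simple", "Standard", "Complex", "Reasoning")


@dataclass
class TierFallbacks:
    """Ordered fallback list for one tier (hypothetical helper)."""

    tier: str
    models: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.tier not in TIERS:
            raise ValueError(f"unknown tier: {self.tier}")
        if len(self.models) > MAX_FALLBACKS:
            raise ValueError(f"at most {MAX_FALLBACKS} fallback models per tier")

    def add(self, model: str) -> None:
        """Append a model to the end of the chain, enforcing the limit."""
        if len(self.models) >= MAX_FALLBACKS:
            raise ValueError(f"at most {MAX_FALLBACKS} fallback models per tier")
        self.models.append(model)

    def move(self, model: str, new_index: int) -> None:
        """Mimic drag-to-reorder: index 0 is tried first."""
        self.models.remove(model)
        self.models.insert(new_index, model)
```

Moving a model to index 0 corresponds to dragging it to the top of the list, so it becomes the first fallback tried.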

Response headers

When a fallback succeeds, the response includes the standard routing headers plus two extra ones, `X-Manifest-Fallback-From` and `X-Manifest-Fallback-Index`:

| Header | Description |
| --- | --- |
| `X-Manifest-Tier` | The routing tier |
| `X-Manifest-Model` | The model that served the response (the fallback model, not the original) |
| `X-Manifest-Provider` | The provider that handled the request |
| `X-Manifest-Confidence` | Routing confidence score |
| `X-Manifest-Reason` | Why this tier was selected |
| `X-Manifest-Fallback-From` | The primary model that was attempted first |
| `X-Manifest-Fallback-Index` | Position in the fallback chain (0 = first fallback, 1 = second, etc.) |

When the chain is exhausted:

| Header | Description |
| --- | --- |
| `X-Manifest-Fallback-Exhausted` | Set to `true` when all models failed |
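A client can inspect these headers to tell whether a fallback happened. The sketch below assumes the headers arrive as a plain string mapping; `read_fallback_headers` and `FallbackInfo` are hypothetical names, the model names in the usage note are placeholders, and treating the presence of `X-Manifest-Fallback-From` as the signal that a fallback served the response is an assumption based on the tables above.

```python
from typing import Mapping, NamedTuple, Optional


class FallbackInfo(NamedTuple):
    used_fallback: bool            # assumption: Fallback-From present
    exhausted: bool                # X-Manifest-Fallback-Exhausted is "true"
    served_by: Optional[str]       # X-Manifest-Model
    fallback_from: Optional[str]   # X-Manifest-Fallback-From
    fallback_index: Optional[int]  # 0 = first fallback, 1 = second, ...


def read_fallback_headers(headers: Mapping[str, str]) -> FallbackInfo:
    """Summarize Manifest's fallback headers from a response."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    index = h.get("x-manifest-fallback-index")
    return FallbackInfo(
        used_fallback="x-manifest-fallback-from" in h,
        exhausted=h.get("x-manifest-fallback-exhausted", "").lower() == "true",
        served_by=h.get("x-manifest-model"),
        fallback_from=h.get("x-manifest-fallback-from"),
        fallback_index=int(index) if index is not None else None,
    )
```

For a response carrying `X-Manifest-Model: model-b`, `X-Manifest-Fallback-From: model-a`, and `X-Manifest-Fallback-Index: 0`, the result reports that `model-b` served the request as the first fallback after `model-a` failed.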

Fallback vs routing

| | Routing | Fallback |
| --- | --- | --- |
| When | Before the request is sent | After the request fails |
| Goal | Pick the cheapest capable model | Recover from a failure |
| Speed | < 2 ms scoring | Adds one extra round-trip per retry |
| Tier | Assigns a tier | Stays within the same tier |

Routing picks the model. Fallback catches it if that model is down.

Connect at least two providers. With a single provider, fallback can only switch between that provider's models, so a provider-wide outage can still exhaust the whole chain.