Feature Request: Response-Based Fallback for FallbackModel
Summary
Add a fallback_on_response parameter to FallbackModel that allows fallback decisions based on inspecting the ModelResponse content, not just exceptions. This enables fallback when a model returns a successful HTTP response but the semantic content indicates failure (e.g., a builtin tool like web_fetch failed to retrieve a URL).
Motivation
Currently, FallbackModel only supports exception-based fallback via the fallback_on parameter:
```python
fallback_on: Callable[[Exception], bool] | tuple[type[Exception], ...] = (ModelAPIError,)
```

This works well for:
- API errors (rate limits, 5xx responses)
- Network failures
- Authentication issues
However, it cannot handle semantic failures where:
- The model returns HTTP 200 (no exception raised)
- But the response content indicates the operation failed
Real-World Example: WebFetchTool with Google Models
When using WebFetchTool with Google's Gemini models, the model may successfully return a response, but the BuiltinToolReturnPart indicates the URL fetch failed:
```python
BuiltinToolReturnPart(
    tool_name='web_fetch',
    content=[{
        'uri': 'https://example.com',
        'url_retrieval_status': 'URL_RETRIEVAL_STATUS_FAILED'  # Not SUCCESS!
    }]
)
```

In this case, I want to fall back to Anthropic's model (which has different web fetching capabilities), but there's no exception to catch.
Current Workaround
Today, I have to implement manual fallback logic outside of FallbackModel:
```python
class BrandURLExtractionAgent:
    def __init__(self):
        self._google_agent = Agent(model=google_model, builtin_tools=[WebFetchTool()])
        self._anthropic_agent = Agent(model=anthropic_model, builtin_tools=[WebFetchTool()])

    async def get_brand_summary(self, url: str) -> str | None:
        # Try Google first
        try:
            result = await self._google_agent.run(user_prompt=prompt)
            # Manual inspection of response parts
            if self._check_web_fetch_success(result.all_messages(), url):
                return result.output
            # Fall through to fallback...
        except Exception:
            pass
        # Manual fallback to Anthropic
        result = await self._anthropic_agent.run(user_prompt=prompt)
        return result.output

    def _check_web_fetch_success(self, messages: list[ModelMessage], url: str) -> bool:
        for message in messages:
            if isinstance(message, ModelRequest):
                for part in message.parts:
                    if isinstance(part, BuiltinToolReturnPart) and part.tool_name == "web_fetch":
                        # Check if URL was successfully retrieved...
                        pass
        return False
```

This is verbose, error-prone, and doesn't benefit from FallbackModel's clean abstraction.
Proposed API
Add a new fallback_on_response parameter to FallbackModel.__init__:
```python
from collections.abc import Callable

from pydantic_ai.messages import ModelMessage, ModelResponse


class FallbackModel(Model):
    def __init__(
        self,
        default_model: Model | KnownModelName | str,
        *fallback_models: Model | KnownModelName | str,
        fallback_on: Callable[[Exception], bool] | tuple[type[Exception], ...] = (ModelAPIError,),
        # NEW PARAMETER:
        fallback_on_response: Callable[[ModelResponse, list[ModelMessage]], bool] | None = None,
    ):
        ...
```

Parameter Signature
```python
fallback_on_response: Callable[[ModelResponse, list[ModelMessage]], bool] | None = None
```

- `response: ModelResponse` - The model's response to inspect
- `messages: list[ModelMessage]` - Full message history (needed because `BuiltinToolReturnPart` lives in `ModelRequest`, not `ModelResponse`)
- Returns `bool` - `True` to trigger fallback, `False` to accept the response
Usage Example
```python
from pydantic_ai import Agent, WebFetchTool
from pydantic_ai.messages import (
    BuiltinToolReturnPart,
    ModelMessage,
    ModelRequest,
    ModelResponse,
)
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.models.anthropic import AnthropicModel


def web_fetch_failed(response: ModelResponse, messages: list[ModelMessage]) -> bool:
    """Return True if web_fetch tool failed to retrieve content."""
    for message in messages:
        if not isinstance(message, ModelRequest):
            continue
        for part in message.parts:
            if isinstance(part, BuiltinToolReturnPart) and part.tool_name == "web_fetch":
                content = part.content
                if isinstance(content, list):
                    for item in content:
                        status = item.get("url_retrieval_status")
                        if status and status != "URL_RETRIEVAL_STATUS_SUCCESS":
                            return True  # Trigger fallback
    return False  # Accept response


google_model = GoogleModel('gemini-2.0-flash')
anthropic_model = AnthropicModel('claude-3-5-haiku-latest')

fallback_model = FallbackModel(
    google_model,
    anthropic_model,
    fallback_on_response=web_fetch_failed,
)

agent = Agent(
    model=fallback_model,
    builtin_tools=[WebFetchTool()],
)

# Now if Google's web_fetch fails, automatically falls back to Anthropic!
result = await agent.run("Summarize https://example.com")
```

Implementation Notes
Changes to FallbackModel.request()
```python
async def request(
    self,
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> ModelResponse:
    exceptions: list[Exception] = []

    for model in self._models:
        try:
            response = await model.request(messages, model_settings, model_request_parameters)

            # NEW: Check response-based fallback condition
            if self._fallback_on_response is not None:
                if self._fallback_on_response(response, messages):
                    # Optionally log: "Fallback triggered by response inspection"
                    continue  # Try next model

            return response
        except Exception as e:
            if self._should_fallback(e):
                exceptions.append(e)
            else:
                raise

    raise FallbackExceptionGroup("All models failed", exceptions)
```

Streaming Considerations
For request_stream(), response-based fallback is more complex since the response arrives incrementally. Options:
1. Don't support for streaming initially - Document that `fallback_on_response` only works with non-streaming requests
2. Inspect after stream completes - Buffer the full response, then check (but this defeats the purpose of streaming for the fallback check)
3. Support a streaming-aware callback - More complex API
I'd recommend starting with option 1 (non-streaming only) and expanding later if there's demand.
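A minimal sketch of option 1, assuming `UserError` from `pydantic_ai.exceptions` is the appropriate way to reject the unsupported configuration:

```python
from pydantic_ai.exceptions import UserError

# Inside FallbackModel.request_stream(), before attempting any model:
if self._fallback_on_response is not None:
    raise UserError('fallback_on_response is not supported for streaming requests yet')
```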
Alternatives Considered
Alternative 1: Extend fallback_on to accept response
```python
fallback_on: Callable[[Exception | ModelResponse], bool] | ...
```

Rejected because: Mixing exception and response handling in one callback is confusing and breaks existing type signatures.
Alternative 2: Custom exception wrapping
Wrap response inspection failures in a custom exception that fallback_on can catch.
Rejected because: Requires users to raise synthetic exceptions, which is awkward and doesn't fit the "successful response with bad content" mental model.
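For comparison, here is roughly what this alternative would force users to write (a sketch only, assuming `WrapperModel` is used for delegation and reusing the `web_fetch_failed` function from the usage example above):

```python
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.wrapper import WrapperModel


class WebFetchFailed(Exception):
    """Synthetic exception raised only so that fallback_on can catch it."""


class WebFetchCheckingModel(WrapperModel):
    async def request(self, messages, model_settings, model_request_parameters):
        response = await super().request(messages, model_settings, model_request_parameters)
        if web_fetch_failed(response, messages):  # same inspection function as above
            raise WebFetchFailed('web_fetch did not retrieve the URL')
        return response


fallback_model = FallbackModel(
    WebFetchCheckingModel(google_model),
    anthropic_model,
    fallback_on=(WebFetchFailed,),
)
```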
Alternative 3: Output validator with retry
Use @agent.output_validator to raise ModelRetry.
Rejected because: This triggers retry with the same model, not fallback to a different model. Also, the inspection often needs to happen at the message/part level, not the final output level.
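For illustration, the output-validator approach would look something like this (the refusal check is made up); it only sees the final output, and `ModelRetry` re-prompts the same model rather than switching to a different one:

```python
from pydantic_ai import Agent, ModelRetry

agent = Agent(google_model, builtin_tools=[WebFetchTool()])


@agent.output_validator
def check_summary(output: str) -> str:
    # Only the final text output is visible here, not the BuiltinToolReturnPart,
    # and raising ModelRetry re-asks the *same* model instead of falling back.
    if 'unable to access' in output.lower():
        raise ModelRetry('Please fetch the URL and summarize its actual content.')
    return output
```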
Additional Context
This pattern is useful beyond WebFetchTool. Other use cases:
- Citation validation: Fallback if the model's response doesn't include expected citations
- Tool call validation: Fallback if a required tool wasn't called (see the sketch after this list)
- Content quality checks: Fallback if response is too short, contains refusals, or lacks required structure
- Provider-specific quirks: Handle cases where one provider returns empty content while another handles the same prompt fine
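As a sketch of the tool call validation case (the `search_docs` tool name and the `primary_model`/`secondary_model` placeholders are hypothetical), the same `fallback_on_response` hook would cover it:

```python
from pydantic_ai.messages import ModelMessage, ModelResponse, ToolCallPart
from pydantic_ai.models.fallback import FallbackModel


def required_tool_not_called(response: ModelResponse, messages: list[ModelMessage]) -> bool:
    """Trigger fallback if neither the history nor this response called `search_docs`."""
    for message in [*messages, response]:
        if isinstance(message, ModelResponse) and any(
            isinstance(part, ToolCallPart) and part.tool_name == 'search_docs'
            for part in message.parts
        ):
            return False  # Required tool was called; accept the response
    return True  # Required tool never called; fall back


fallback_model = FallbackModel(
    primary_model,
    secondary_model,
    fallback_on_response=required_tool_not_called,
)
```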
Checklist
- Add `fallback_on_response` parameter to `FallbackModel.__init__`
- Implement response inspection in `FallbackModel.request()`
- Add logging when response-based fallback is triggered
- Document the feature in the FallbackModel docs
- Add unit tests for response-based fallback (a rough sketch follows this list)
- Document streaming limitation (if not supported initially)
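A rough sketch of what such a test could look like, using `FunctionModel` to fake a "successful but semantically failed" first response (illustrative only, not a spec):

```python
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.function import AgentInfo, FunctionModel


def bad_content_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    # Returns successfully, but the content signals failure
    return ModelResponse(parts=[TextPart('FETCH_FAILED')])


def good_model(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    return ModelResponse(parts=[TextPart('ok')])


def content_failed(response: ModelResponse, messages: list[ModelMessage]) -> bool:
    return any(isinstance(p, TextPart) and 'FETCH_FAILED' in p.content for p in response.parts)


async def test_response_based_fallback():
    model = FallbackModel(
        FunctionModel(bad_content_model),
        FunctionModel(good_model),
        fallback_on_response=content_failed,
    )
    agent = Agent(model)
    result = await agent.run('hello')
    assert result.output == 'ok'
```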
If this is something you would accept, I'd be interested in contributing it :)
References
- Fallback Model #516 - Original FallbackModel feature request
- FallbackModel does not handle UnexpectedModelBehavior raised during response handling in agent graph #2837 - FallbackModel doesn't handle `UnexpectedModelBehavior` during response handling (related: response-level issues not caught)
- FallbackModel to allow model_settings specific to each model #2119 - Per-model settings for FallbackModel (resolved, shows precedent for extending FallbackModel API)