Add OpenRouter image generation support #3599
base: main
Changes from all commits: a23194b, 1eb728b, 06f868a, e32f767, 967e612, cfbcb51
@@ -73,3 +73,58 @@ model = OpenRouterModel('openai/gpt-5')
 agent = Agent(model, model_settings=settings)
 ...
 ```
+
+## Image Generation
+
+You can use OpenRouter models that support image generation with the `openrouter_modalities` setting:
+
+```python {test="skip"}
+from pydantic_ai import Agent, BinaryImage
+from pydantic_ai.models.openrouter import OpenRouterModelSettings
+
+agent = Agent(
+    model='openrouter:google/gemini-2.5-flash-image-preview',
+    output_type=str | BinaryImage,
+    model_settings=OpenRouterModelSettings(openrouter_modalities=['image', 'text']),
+)
+
+result = agent.run_sync('A cat')
+assert isinstance(result.output, BinaryImage)
+```
+
+You can further customize image generation using `openrouter_image_config`:
+
+```python
+from pydantic_ai.models.openrouter import OpenRouterModelSettings
+
+settings = OpenRouterModelSettings(
+    openrouter_modalities=['image', 'text'],
+    openrouter_image_config={'aspect_ratio': '3:2'}
Collaborator
I want this to be an option on

Collaborator
I'm OK with it also being a model setting if it supports more keys than If you want, you can finish that PR while we're at it to make your life here easier.
+)
+```
+
+> Available aspect ratios: `'1:1'`, `'2:3'`, `'3:2'`, `'3:4'`, `'4:3'`, `'4:5'`, `'5:4'`, `'9:16'`, `'16:9'`, `'21:9'`.
+
+Image generation also works with streaming:
+
+```python {test="skip"}
+from pydantic_ai import Agent, BinaryImage
+from pydantic_ai.models.openrouter import OpenRouterModelSettings
+
+agent = Agent(
+    model='openrouter:google/gemini-2.5-flash-image-preview',
+    output_type=str | BinaryImage,
+    model_settings=OpenRouterModelSettings(
+        openrouter_modalities=['image', 'text'],
+        openrouter_image_config={'aspect_ratio': '3:2'},
+    ),
+)
+
+response = agent.run_stream_sync('A dog')
+for output in response.stream_output():
+    if isinstance(output, str):
+        print(output)
+    elif isinstance(output, BinaryImage):
+        # Handle the generated image
+        print(f'Generated image: {output.media_type}')
+```
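
The documented aspect ratios could be checked before a request is sent. The following is a minimal sketch only: `SUPPORTED_ASPECT_RATIOS` and `checked_image_config` are hypothetical names that do not appear in this PR, and the accepted values are copied from the note in the docs diff above rather than from any upstream schema.

```python
# Values copied from the "Available aspect ratios" note in the docs diff.
# Both names below are hypothetical and not part of pydantic_ai.
SUPPORTED_ASPECT_RATIOS = frozenset(
    {'1:1', '2:3', '3:2', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9'}
)


def checked_image_config(aspect_ratio: str) -> dict[str, str]:
    """Build an image config dict, rejecting ratios the docs do not list."""
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        allowed = ', '.join(sorted(SUPPORTED_ASPECT_RATIOS))
        raise ValueError(f'unsupported aspect ratio {aspect_ratio!r}; expected one of: {allowed}')
    return {'aspect_ratio': aspect_ratio}
```

The returned dict has the same shape as the value passed to `openrouter_image_config` in the examples above, so a typo such as `'3:22'` would fail locally instead of at the provider.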
@@ -7,6 +7,7 @@
 
 from pydantic_ai import (
     Agent,
+    BinaryImage,
     ModelHTTPError,
     ModelMessage,
     ModelRequest,
@@ -406,3 +407,40 @@ class FindEducationContentFilters(BaseModel):
             }
         ]
     )
+
+
+async def test_openrouter_image_generation(allow_model_requests: None, openrouter_api_key: str) -> None:
+    provider = OpenRouterProvider(api_key=openrouter_api_key)
+    model = OpenRouterModel(
+        model_name='google/gemini-2.5-flash-image-preview',
+        provider=provider,
+    )
+    settings = OpenRouterModelSettings(openrouter_modalities=['image', 'text'])
+
+    agent = Agent(model=model, output_type=str | BinaryImage, model_settings=settings)
+
+    result = await agent.run('A cat')
+
+    assert result.response.text == snapshot('Here is a cat for you! ')
+    assert isinstance(result.output, BinaryImage)
+
+
+async def test_openrouter_image_generation_streaming(allow_model_requests: None, openrouter_api_key: str) -> None:
+    provider = OpenRouterProvider(api_key=openrouter_api_key)
+    model = OpenRouterModel(
+        model_name='google/gemini-2.5-flash-image-preview',
+        provider=provider,
+    )
+    settings = OpenRouterModelSettings(
+        openrouter_modalities=['image', 'text'], openrouter_image_config={'aspect_ratio': '3:2'}
+    )
+
+    agent = Agent(model=model, output_type=str | BinaryImage, model_settings=settings)
+
+    async with agent.run_stream('A dog') as result:
+        async for output in result.stream_output():
+            if isinstance(output, str):
Collaborator
This means we may never actually get to the image assertions!
+                assert output == snapshot('Here you go: ')
+            else:
+                assert isinstance(output, BinaryImage)
+                assert output.media_type == snapshot('image/png')
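
One way to address the concern that the image assertions may never run is to collect everything the stream yields and then assert unconditionally, after the loop, that at least one image arrived. This is a rough sketch under stated assumptions, not the change made in this PR: `StubImage` is a hypothetical stand-in for `BinaryImage` so the example runs offline, and `images_from_stream` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass
class StubImage:
    """Hypothetical stand-in for pydantic_ai.BinaryImage (offline sketch only)."""

    data: bytes
    media_type: str


def images_from_stream(outputs: Iterable[Any], image_type: type = StubImage) -> list:
    """Return the image outputs, failing loudly if the stream yielded none."""
    images = [item for item in outputs if isinstance(item, image_type)]
    if not images:
        # Without this check, a text-only stream would pass the test silently.
        raise AssertionError('stream finished without yielding an image')
    return images
```

In the streaming test, the outputs could be gathered with `[output async for output in result.stream_output()]` and passed to such a helper with `image_type=BinaryImage`, so a text-only response fails the test instead of skipping the `else` branch.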
We should also make this work with `builtin_tools=[ImageGenerationTool()]` and document it here: https://ai.pydantic.dev/builtin-tools/#image-generation-tool

As with Google, which doesn't expose that as a tool, using that tool or `BinaryImage` in `output_type` should automatically enable the modality.
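
The automatic enabling requested above might look roughly like the sketch below. Everything in it is an assumption made for illustration: `resolve_modalities` and its parameters are hypothetical, the real model class would need to derive the two flags from `output_type` and `builtin_tools` itself, and letting an explicit `openrouter_modalities` value win is just one possible design choice.

```python
from typing import Optional, Sequence


def resolve_modalities(
    explicit: Optional[Sequence[str]],
    *,
    wants_image_output: bool,
    uses_image_tool: bool,
) -> Optional[list]:
    """Hypothetical helper: decide which modalities to request.

    explicit           -- value of openrouter_modalities, if the user set one
    wants_image_output -- True when BinaryImage appears in output_type
    uses_image_tool    -- True when ImageGenerationTool is in builtin_tools
    """
    if explicit is not None:
        # Assumption: an explicit user setting always takes precedence.
        return list(explicit)
    if wants_image_output or uses_image_tool:
        return ['image', 'text']
    # Nothing asks for images, so leave the request untouched.
    return None
```

With this shape, users who already set `openrouter_modalities` see no change in behavior, while users who only declare `BinaryImage` in `output_type` or pass the tool get image output without extra configuration.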