Add configurable audio encoding for OpenAI models (Data URI support) #3596
base: main
Conversation
This commit introduces `openai_audio_input_encoding` to `OpenAIModelProfile`, allowing users to choose between `'base64'` (default) and `'uri'` encoding for audio inputs. This addresses compatibility issues with providers like Qwen Omni that require Data URI format for audio data.

Key changes:
- Added `openai_audio_input_encoding` to `OpenAIModelProfile`.
- Updated `OpenAIChatModel._map_user_prompt` to respect the configured encoding for `BinaryContent` and `AudioUrl`.
- Added new tests in `tests/models/test_openai_audio.py` covering both encoding modes.
```python
profile = OpenAIModelProfile.from_profile(self.profile)
if profile.openai_audio_input_encoding == 'uri':
    format_to_mime = {'wav': 'audio/wav', 'mp3': 'audio/mpeg'}
    mime_type = format_to_mime.get(
```
We can use item.media_type right?
Yes, good point: I can use `item.media_type` here instead of maintaining my own `format_to_mime` mapping. I'll update the `AudioUrl` handling in `_map_user_prompt` to construct the data URI using `item.media_type`, with a simple fallback if it's missing.
```python
openai_chat_supports_web_search: bool = False
"""Whether the model supports web search in Chat Completions API."""

openai_audio_input_encoding: Literal['base64', 'uri'] = 'base64'
```
This is specific to `OpenAIChatModel` and doesn't affect `OpenAIResponsesModel`, so let's prefix with `openai_chat_`.
Makes sense: this is only used by `OpenAIChatModel`. I'll rename the profile field to `openai_chat_audio_input_encoding` and update the chat mapping to use it, so it's clearly scoped to Chat Completions and doesn't imply anything about `OpenAIResponsesModel`.
| """The encoding to use for audio input. | ||
| - `'base64'`: Raw base64 encoded string. (Default, used by OpenAI) | ||
| - `'uri'`: Data URI (e.g. `data:audio/wav;base64,...`). (Used by Qwen Omni) |
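The two options above amount to a simple branch when the audio payload is serialized. A minimal sketch, assuming a stand-in `Profile` dataclass in place of the real `OpenAIModelProfile` (the field name is taken from the diff; everything else is illustrative):

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class Profile:
    # Hypothetical stand-in for OpenAIModelProfile; only the field
    # from the diff above is modeled here.
    openai_audio_input_encoding: Literal['base64', 'uri'] = 'base64'


def encode_audio_input(data_b64: str, mime: str, profile: Profile) -> str:
    """Return the audio payload in the configured encoding.

    'uri' wraps the base64 string in a data URI; 'base64' (the default)
    passes the raw base64 string through unchanged.
    """
    if profile.openai_audio_input_encoding == 'uri':
        return f'data:{mime};base64,{data_b64}'
    return data_b64
```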
We should still make it so that this is used automatically for Qwen Omni. If that's only a requirement of Qwen's own ChatCompletions-compatible API, we may want a new provider class that can define its own `model_profile` method and be used with `OpenAIChatModel`. We shouldn't set this in the existing `qwen_model_profile` method, as Qwen can also be used with providers that probably do not have this quirk.
@DouweM , For the Qwen Omni integration specifically, I’d like to follow your suggestion and handle the Data URI requirement via a dedicated provider rather than changing the shared qwen_model_profile.
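The provider-level override could look roughly like this. A hedged sketch only: the function name `qwen_model_profile_for_dashscope` and the `'omni'` substring check are assumptions, not the merged implementation.

```python
def qwen_model_profile_for_dashscope(model_name: str) -> dict:
    """Hypothetical provider-scoped profile hook for DashScope.

    Returns a profile dict that enables data URI audio encoding only
    for Qwen Omni models, leaving other Qwen models on the default.
    """
    profile = {'openai_chat_audio_input_encoding': 'base64'}
    if 'omni' in model_name.lower():
        # Qwen Omni's OpenAI-compatible API expects data URIs for audio.
        profile['openai_chat_audio_input_encoding'] = 'uri'
    return profile
```

Keeping this in a dedicated provider (rather than the shared `qwen_model_profile`) means the quirk only applies when the model is actually served through Qwen's own API.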
@Pavanmanikanta98 Thanks, makes sense. It should be just
…i models

- Add `QwenProvider` for DashScope OpenAI-compatible API
- Rename `openai_audio_input_encoding` to `openai_chat_audio_input_encoding`
- Use `item.media_type` for Data URI MIME types instead of hardcoded mapping
- Automatically set Data URI audio encoding for Qwen Omni models
- Add comprehensive tests for `QwenProvider` and audio encoding
- Add Qwen documentation section to OpenAI-compatible models docs

Fixes pydantic#3530
- Include 'qwen' in the model inference options for compatibility with Qwen models.
- Set up environment variable for Qwen API key in test_examples.py to facilitate testing.

This enhances the integration of Qwen models within the existing framework.
- Add tests for initializing QwenProvider with `openai_client` and `http_client` to ensure full branch coverage.
Hi @DouweM, I've addressed your feedback: renamed to `openai_chat_audio_input_encoding`, used `item.media_type` instead of the hardcoded mapping, and added `QwenProvider` with automatic Omni audio encoding. All tests pass.
Fixes #3530
Key Changes:
encoding is 'uri'.