Question
Qwen Omni
According to the Qwen Omni documentation, audio input must be formatted like this:
messages = [
    {
        "role": "user",
        "content": [
            {
                "input_audio": {
                    "data": "data:;base64,{base64_audio}",
                    "format": "wav"
                },
                "type": "input_audio"
            },
            {
                "text": "prompt",
                "type": "text"
            }
        ]
    }
]

I am using BinaryContent in Pydantic AI like this:
response = await agent.run(
    user_prompt=[
        BinaryContent(
            data=base64.b64decode(audio_base64),
            media_type="audio/wav",
        ),
        self.prompt,
    ],
)

But the serialized content becomes:
{"data": "{base64_audio}", "format": "wav"}

The generated output is missing the required prefix:
data:;base64,

This means the final payload is not compatible with Qwen Omni’s expected input format.
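To make the mismatch concrete, here is a minimal sketch that builds the audio part by hand in the shape quoted from the Qwen Omni documentation above. The helper name `qwen_audio_part` and the constant `QWEN_AUDIO_PREFIX` are hypothetical names used only for illustration; nothing here is a Pydantic AI API.

```python
import base64

# Prefix quoted in the question; assumed to be exactly what Qwen Omni requires.
QWEN_AUDIO_PREFIX = "data:;base64,"


def qwen_audio_part(audio_bytes: bytes, audio_format: str = "wav") -> dict:
    """Build an input_audio content part whose data field carries the prefix."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "type": "input_audio",
        "input_audio": {
            "data": QWEN_AUDIO_PREFIX + encoded,
            "format": audio_format,
        },
    }


# Hypothetical four-byte payload standing in for real WAV data.
part = qwen_audio_part(b"RIFF")
print(part["input_audio"]["data"])
```

The only difference from the serialized output shown above is the leading `data:;base64,` in the `data` field.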
Question
Is there a supported way to:
- Make BinaryContent output the audio in the required data:;base64,{...} format?
- Or hook into the serialization layer so I can manually prepend the prefix?
Any recommended workaround or official way to customize the content part format would be helpful.
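As a rough illustration of the second option, the sketch below post-processes an already-serialized message list and prepends the prefix to every `input_audio` part. It assumes the messages are plain dicts in the shape shown earlier; `add_audio_prefix` is a hypothetical helper, and where it could be called inside Pydantic AI (if anywhere) is exactly what this question is asking, so that part is left open.

```python
import copy

# Prefix quoted in the question; assumed to be exactly what Qwen Omni requires.
QWEN_AUDIO_PREFIX = "data:;base64,"


def add_audio_prefix(messages: list[dict]) -> list[dict]:
    """Return a deep copy of messages with the prefix added to input_audio data."""
    patched = copy.deepcopy(messages)
    for message in patched:
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content carries no audio parts
        for part in content:
            if isinstance(part, dict) and part.get("type") == "input_audio":
                audio = part["input_audio"]
                # Skip parts that already look like a data URI, so the
                # function can be applied more than once safely.
                if not audio["data"].startswith("data:"):
                    audio["data"] = QWEN_AUDIO_PREFIX + audio["data"]
    return patched


# Payload in the shape currently produced, with a placeholder base64 string.
serialized = [
    {
        "role": "user",
        "content": [
            {"type": "input_audio", "input_audio": {"data": "UklGRg==", "format": "wav"}},
            {"type": "text", "text": "prompt"},
        ],
    }
]
patched = add_audio_prefix(serialized)
```

Working on a deep copy keeps the original payload untouched, which makes it easy to compare the two versions while debugging.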
Additional Context
python==3.10.8
pydantic-ai-slim[openai]==1.21.0
LLM model: qwen3-omni-flash