Support GPT-5 Freeform Function Calling and Context Free Grammar for tool args and output #3612
Conversation
The response from the model is not correctly handled yet.
Now produces appropriate warnings when misconfigured.
I can now call the model with the following:
```python
import asyncio
import json

from pydantic_core import to_jsonable_python

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel

agent = Agent(OpenAIResponsesModel("gpt-5-mini"), model_settings={"openai_reasoning_effort": "minimal"})


@agent.tool_plain(free_form=True)
def execute_lucene_query(query: str) -> str:
    """Use this to run a lucene query against the system.

    YOU MUST ALWAYS RUN A QUERY BEFORE ANSWERING THE USER.

    Args:
        query: the lucene query to run

    Returns:
        the result of executing the query, or an error message
    """
    return "The query failed to execute, the solr server is unavailable"


async def run() -> None:
    response = await agent.run("Execute the lucene query text:IKEA and give me the results")
    history = response.all_messages()
    as_json = json.dumps(to_jsonable_python(history), indent=2)
    print(as_json)
    print(response.output)


asyncio.run(run())
```
Will also validate the grammar if the dependency is installed.
Going to add the literal version now.
It's provided by the init.
Pyright isn't happy with it.
Going to change all this.
```
PYTHONPATH=pydantic_ai_slim/ uv run coverage run -m pytest tests/test_tools.py && uv run coverage report --include=pydantic_ai_slim/pydantic_ai/tools.py

Name                                    Stmts   Miss Branch BrPart    Cover   Missing
------------------------------------------------------------------------------------
pydantic_ai_slim/pydantic_ai/tools.py     141      1     20      1   98.76%   176, 177->exit
------------------------------------------------------------------------------------
TOTAL                                     141      1     20      1   98.76%
```

`lark` isn't installed.
This makes coverage drop dramatically.
Throwing an exception causes coverage to drop:

```
Name                                    Stmts   Miss Branch BrPart    Cover   Missing
------------------------------------------------------------------------------------
pydantic_ai_slim/pydantic_ai/tools.py     141     31     18      7   73.58%   176, 194-195, 196->202, 199, 202->204, 206, 209->210, 211, 224-228, 317, 320, 376-385, 389-392, 401-404, 417-418, 426, 438, 450, 463, 472->473, 474, 476-481
------------------------------------------------------------------------------------
TOTAL                                     141     31     18      7   73.58%
```

This is due to the exception introducing a new branch which is untested by the original code.
The `tool_argument_name` variable exists to narrow the type of `argument_name` to `str`.
This separates the implementations of regex and lark into two classes. The base class would be empty, so `TextFormat` has become a union type. The OpenAI tool call handling has been changed to silently ignore formats that it does not handle. This is consistent with how the gpt-5 models ignore the `temperature` parameter, which is not supported by reasoning models (see pydantic#2483).
 
…-tools

Conflicts:

- `pydantic_ai_slim/pydantic_ai/agent/__init__.py`
- `pydantic_ai_slim/pydantic_ai/models/openai.py`
- `pydantic_ai_slim/pydantic_ai/profiles/openai.py`
- `pydantic_ai_slim/pydantic_ai/toolsets/function.py`
- `pydantic_ai_slim/pyproject.toml`
- `pyproject.toml`
- `tests/models/test_google.py`
- `tests/models/test_openai.py`
- `tests/models/test_openai_responses.py`
- `tests/test_agent.py`
- `uv.lock`
Refactor freeform function calling to use `Annotated[str, ...]` syntax instead of decorator/kwarg parameters. This allows the same pattern to work for both function tools and output types.

- Rename `RegexTextFormat` -> `RegexGrammar`, `LarkTextFormat` -> `LarkGrammar`
- Add `FreeformText` marker class for unconstrained text input
- Extract `text_format` from type annotations in `_function_schema.py`
- Remove `text_format` parameter from tool decorators and `ToolOutput`
- Update documentation with annotation-based examples
1. The GPT-5 family (`gpt-5`, `gpt-5-mini`, `gpt-5-nano`) all support freeform function calling with context-free grammar constraints. Unfortunately `gpt-5-nano` often struggles with these calls.
2. If the tool or model cannot be used with freeform function calling then it will be invoked in the normal way, which may lead to invalid input.
It would be super cool if we could perform our own agent-side validation of the input by defining `__get_pydantic_core_schema__` on the grammar classes. That way this could work even with models other than gpt-5, provided we share the grammar in the tool description and the model understands what it means, by relying on the same retry behavior as JSON args: https://ai.pydantic.dev/agents/#reflection-and-self-correction. That'd be similar to Prompted Output mode and non-strict JSON tool args, where we rely on the model's understanding instead of strict token constraints.
If we do that, then most of this can be documented outside of the OpenAI context.
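As a rough sketch of what that could look like (the `RegexGrammar` class here is a standalone stand-in for the one in this PR, not the actual implementation): a grammar marker that defines `__get_pydantic_core_schema__` gets picked up automatically by Pydantic when used inside `Annotated`, so invalid tool args would fail validation agent-side and trigger the normal retry flow.

```python
from dataclasses import dataclass
from typing import Annotated, Any

from pydantic import GetCoreSchemaHandler, TypeAdapter, ValidationError
from pydantic_core import core_schema


@dataclass
class RegexGrammar:
    """Stand-in grammar marker that also validates agent-side."""

    pattern: str

    def __get_pydantic_core_schema__(
        self, source_type: Any, handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        # Constrain the underlying str schema with the regex, so a
        # non-matching tool arg raises ValidationError (and would be
        # surfaced to the model as a retry, like bad JSON args).
        return core_schema.str_schema(pattern=self.pattern)


adapter = TypeAdapter(Annotated[str, RegexGrammar(r'^\d{4}-\d{2}-\d{2}$')])
print(adapter.validate_python('2024-05-01'))  # passes validation
try:
    adapter.validate_python('not a date')
except ValidationError:
    print('rejected')
```

A Lark-based marker could do the same by parsing the string in a custom validator instead of using a regex constraint.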
```
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
```

### Freeform Function Calling
The fact that this can be used for output is a bit buried now; I'd like that to be clearer. If we do what I wrote in the other comment about validating on the agent side, this would warrant sections in the Output and Tool docs.
```python
''' # (1)!

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model, output_type=Annotated[str, LarkGrammar(sql_grammar)])
```
Let's try to implement support for this feature in OutlinesModel as well, as I believe Outlines supports similar grammar-based constraints. That'd be a good way of verifying that the implementation is generic enough to work with providers other than OpenAI.
```python
# Extract text format annotation if present
if extracted_format := _extract_text_format(annotation):
    if text_format is not None:
        errors.append('Only one parameter may have a TextFormat annotation')
```
We may be able to weaken this requirement and support multiple grammar-constrained str args, if we can do the validation on our side. Then we'd use OpenAI's custom tools functionality only if there is a single arg with a format annotation.
```python
    Returns:
        The TextFormat instance if found, None otherwise.
    """
    from typing import Annotated, get_args, get_origin
```
Please move imports to the top of the file
```python
# Look for TextFormat in metadata
for item in metadata:
    if isinstance(item, (FreeformText, RegexGrammar, LarkGrammar)):
```
Or if we make these all subclasses of one type that's defined here, the more interesting subtypes can be defined in `tools`.
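The suggestion above could look something like this (class and module names are illustrative, not the PR's actual code): a shared base marker lives next to the schema extraction, and the `isinstance` check collapses to a single type.

```python
from dataclasses import dataclass


@dataclass
class TextFormat:
    """Hypothetical shared base marker, defined alongside the schema extraction."""


@dataclass
class FreeformText(TextFormat):
    """Unconstrained text input."""


@dataclass
class RegexGrammar(TextFormat):
    pattern: str


@dataclass
class LarkGrammar(TextFormat):
    grammar: str


# The metadata scan then only needs the base type, and new grammar
# kinds can be added in `tools` without touching this check:
item = RegexGrammar(r'\d+')
print(isinstance(item, TextFormat))  # True
```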
```python
    pass
elif isinstance(item, responses.ResponseCustomToolCall):
    # Handle custom tool calls (freeform function calling)
    if item.name not in model_request_parameters.tool_defs:
```
Let's make it clear this is for the scenario where the model calls a non-existent tool.
```python
tool = model_request_parameters.tool_defs[item.name]
tool_argument_name = tool.single_string_argument_name
if tool_argument_name is None:
    raise UnexpectedModelBehavior(
```
We shouldn't be able to get here, right? We wouldn't send the tool definition as a custom tool unless there was a `single_string_argument_name`.
I'd prefer to just use `input` as the key in that case, and let the existing tool args validation deal with errors.
```python
ToolCallPart(
    item.name,
    {argument_name: item.input},
    tool_call_id=_combine_tool_call_ids(item.call_id, item.id),
```
We shouldn't need `_combine_tool_call_ids` anymore; we can use both the `tool_call_id` and `id` fields.
```python
if f.text_format:
    if not model_profile.openai_supports_freeform_function_calling:
        raise UserError(
            f'Tool {f.name!r} uses freeform function calling but {self._model_name!r} does not support freeform function calling.'
```
I'd rather not raise an error, as we won't in other model classes either. We can just use the normal behavior, if possible with agent-side grammar validation.
```python
timestamp_pattern = r'^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]) (?:[01]\d|2[0-3]):[0-5]\d$'


@agent.tool_plain
def timestamp_accepting_tool(timestamp: Annotated[str, RegexGrammar(timestamp_pattern)]): ...  # (2)!
```
What do you think about supporting the Pydantic pattern `Annotated[str, Field(pattern=...)]` as well?
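For reference, Pydantic already enforces `Field(pattern=...)` during validation, so a tool arg annotated this way is validated agent-side today; the question is only whether it should also be translated into an OpenAI grammar constraint. A minimal demonstration of the existing behavior, reusing the timestamp pattern from the diff above:

```python
from typing import Annotated

from pydantic import Field, TypeAdapter, ValidationError

timestamp_pattern = r'^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]) (?:[01]\d|2[0-3]):[0-5]\d$'

# Validation with the pattern constraint works out of the box:
Timestamp = Annotated[str, Field(pattern=timestamp_pattern)]
adapter = TypeAdapter(Timestamp)

print(adapter.validate_python('2024-05-01 13:30'))  # passes validation
try:
    adapter.validate_python('yesterday at noon')
except ValidationError:
    print('rejected')
```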
Picking this back up in hopes of bringing it across the finish line.