Description
Your current environment
The bug is reproducible with docker image vllm/vllm-openai:v0.12.0
services:
  vllm-gptoss-large:
    image: vllm/vllm-openai:v0.12.0
    restart: always
    shm_size: '64gb'
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0', '1']
              capabilities: [gpu]
    volumes:
      - ./data/hf:/data
    environment:
      - HF_TOKEN=${HF_TOKEN}
    ports:
      - 8000:8000
    command: ["openai/gpt-oss-120b",
              "--tool-call-parser", "openai",
              "--enable-auto-tool-choice",
              "--reasoning-parser", "openai_gptoss",
              "--tensor-parallel-size", "2",
              "--port", "8000",
              "--api-key", "${VLLM_API_KEY}",
              "--download_dir", "/data"]

🐛 Describe the bug
This bash script can only be executed once: a second run hangs unless the function name in the tool definition is changed to a value that has not been sent before. Without a tool definition, the POST can be sent as often as you like.
#!/bin/bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer ${VLLM_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-120b",
    "stream": false,
    "messages": [
      {"role": "system", "content": "Be a helpful assistant."},
      {"role": "user", "content": "Hi"},
      {"role": "assistant", "content": "How can I help you?"},
      {"role": "user", "content": "Do you like Monty Python?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "CHANGE-NAME-BEFORE-SENDING",
          "description": "Use this tool if you need to extract information from a website.",
          "parameters": {
            "type": "object",
            "properties": {
              "url": {
                "type": "string",
                "description": "The URL to search or extract information from."
              }
            },
            "required": ["url"]
          }
        }
      }
    ]
  }'
The script never finishes waiting for a response, and nvidia-smi shows the cards drawing maximum power. The vLLM logs show that tokens are being generated, so from an external point of view the LLM appears to generate tokens without stopping.
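Since a function name that the server has not yet seen makes the request succeed (as described above), one possible stopgap on the client side is to append a per-request unique suffix to every tool name. This is a minimal sketch of that workaround, not something from the original report; the helper name and suffixing scheme are my own:

```python
import copy
import uuid

def with_unique_tool_names(tools):
    """Return a deep copy of an OpenAI-style tools list whose function
    names carry a per-request unique suffix, so the server never sees a
    tool name it has already processed."""
    fresh = copy.deepcopy(tools)
    suffix = uuid.uuid4().hex[:8]
    for tool in fresh:
        tool["function"]["name"] = f'{tool["function"]["name"]}_{suffix}'
    return fresh

# Example: the original list is left untouched, the copy gets a fresh name.
tools = [{"type": "function",
          "function": {"name": "extract_website",
                       "parameters": {"type": "object"}}}]
print(with_unique_tool_names(tools)[0]["function"]["name"])  # e.g. extract_website_3f2a9c1d
```

Note that the model will then emit tool calls under the suffixed name, so the tool-dispatch code has to strip the suffix again before executing the real tool.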
This is quite weird, because when it is called with the Python SDK, it works fine, e.g.
from openai import OpenAI
from dotenv import load_dotenv
import os

load_dotenv()

client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="http://localhost:8000/v1",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                # Note: in this (repeatable) variant, "description" sits as a
                # sibling of "location" inside "properties", not nested under it.
                "location": {"type": "string"},
                "description": "Location and state, e.g., 'San Francisco, CA'"
            },
            "required": ["location"]
        },
    },
}]

response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[{"role": "user", "content": "How is the weather in Berlin? use the tool get_weather."}],
    tools=tools,
    tool_choice="auto",
    stream=False,
)
print(response.choices[0].message)

In fact this can also be reproduced using n8n's AI Agent nodes, which are based on the TypeScript LangGraph implementation: https://github.com/n8n-io/n8n/blob/master/packages/%40n8n/nodes-langchain/nodes/agents/Agent/agents/ToolsAgent/V1/execute.ts#L34
There, too, the chat window freezes as soon as a tool is attached and the user asks a second question.
The bug really seems to be specific to this model: I tested Mistral and Qwen models and could not reproduce it with them. While debugging the issue, I noticed a sensitivity to the description field inside the tool's parameters schema. To make this clear: the following request can also be sent only once using the OpenAI Python SDK, but works again once the function name is changed:
from openai import OpenAI
from dotenv import load_dotenv
import os

load_dotenv()

client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url=f"https://{os.getenv('API_DOMAIN')}/v1",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "Location and state, e.g., 'San Francisco, CA'"
                },
            },
            "required": ["location"]
        },
    },
}]

response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[{"role": "user", "content": "How is the weather in Berlin? use the tool get_weather."}],
    tools=tools,
    tool_choice="auto",
    stream=False,
)
print(response.choices[0].message)

Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
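For clarity, the only structural difference between the repeatable and the hanging get_weather examples above is where "description" sits inside "parameters"; this sketch (my own summary, not part of the original report) makes the diff explicit:

```python
# Variant from the first SDK example, which can be sent repeatedly:
# "description" is a sibling of "location" inside "properties".
params_ok = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "description": "Location and state, e.g., 'San Francisco, CA'",
    },
    "required": ["location"],
}

# Variant from the second SDK example, which hangs on the second send:
# "description" is nested inside the "location" property, which is the
# standard JSON Schema placement.
params_hang = {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "Location and state, e.g., 'San Francisco, CA'",
        },
    },
    "required": ["location"],
}

# Only the hanging variant annotates the property itself:
print("description" in params_hang["properties"]["location"])  # True
print("description" in params_ok["properties"]["location"])    # False
```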