Conversation

@zhangxinyuehfad
Contributor

What this PR does / why we need it?

Fix Qwen2-Audio-7B-Instruct accuracy test

Backport: #4017

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request fixes an accuracy test for the Qwen2-Audio-7B-Instruct model by setting gpu_memory_utilization to 0.8. This is a common and reasonable adjustment to prevent out-of-memory errors, especially for large multi-modal models. The change is confined to a test configuration file and appears to be a correct and targeted fix. I have reviewed the change and found no issues.
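To illustrate the fix the review describes, here is a minimal, hypothetical sketch of how such a test configuration might cap memory use. The dict keys, model name, and helper function are assumptions for illustration; the actual change lives in a test configuration file in the vllm-ascend repository.

```python
# Hypothetical sketch of an accuracy-test model configuration (names assumed;
# the real change is in the vllm-ascend test suite, not reproduced here).
model_config = {
    "model": "Qwen/Qwen2-Audio-7B-Instruct",
    # Cap accelerator memory use at 80% so the large multi-modal model's
    # KV cache allocation leaves headroom for the audio encoder and
    # activations, avoiding out-of-memory failures during the test.
    "gpu_memory_utilization": 0.8,
}

def build_llm_kwargs(cfg):
    """Return the keyword arguments that would be forwarded to the
    engine constructor (e.g. vllm.LLM) when the test instantiates it."""
    return dict(cfg)
```

Lowering `gpu_memory_utilization` from the default trades a smaller KV cache for stability, which is usually acceptable in an accuracy test where throughput is not measured.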

@github-actions
Copy link

github-actions bot commented Nov 6, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a clear commit message and fill in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.

@wangxiyuan wangxiyuan merged commit d913f94 into vllm-project:v0.11.0-dev Nov 10, 2025
10 checks passed