Closed
Labels: bug (Something isn't working)
Description
Your current environment
The output of `python collect_env.py`
Your output of above commands here
🐛 Describe the bug
While using the LoRA feature in vllm-ascend v0.11.0rc0, I noticed that the responses generated by the LoRA adapter and the base model are identical. This does not happen with standard vLLM. Could there be an issue here? Below is my test startup script, followed by the comparison I used to check the outputs:
vllm serve /llm/model/Qwen1.5-4B-Chat --max-lora-rank 64 --max-loras 1 --max-cpu-loras 100 --enable-lora --lora-modules test-lora=/llm/lora_models/Qwen1.5-4B-Chat_lora_huanhuan
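A minimal sketch of how the two responses can be compared through the OpenAI-compatible API, assuming the server listens on the default port 8000, the base model is addressed by its path, and the prompt text is only illustrative:

```python
# Compare base-model output vs. LoRA-adapter output from the same vLLM server.
# Assumptions: default port 8000, base model addressed by its path, greedy decoding.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

messages = [{"role": "user", "content": "Introduce yourself in one sentence."}]

# Query the base model (served under its model path by default).
base = client.chat.completions.create(
    model="/llm/model/Qwen1.5-4B-Chat",
    messages=messages,
    temperature=0.0,
)

# Query the LoRA adapter registered via --lora-modules test-lora=...
lora = client.chat.completions.create(
    model="test-lora",
    messages=messages,
    temperature=0.0,
)

print("base:", base.choices[0].message.content)
print("lora:", lora.choices[0].message.content)
# On vllm-ascend v0.11.0rc0 the two outputs come back identical;
# on standard vLLM the LoRA adapter produces a different response.
```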