
[Bug]: LoRA not working in v0.11.0rc0 #3668


Description

@chenfengt

Your current environment

The output of `python collect_env.py` was not provided.

🐛 Describe the bug

While using the LoRA feature in vllm-ascend v0.11.0rc0, I noticed that the responses generated by the LoRA adapter and by the base model are identical. This does not happen with standard vllm. Could there be an issue here? Below is my test startup script:
vllm serve /llm/model/Qwen1.5-4B-Chat --max-lora-rank 64 --max-loras 1 --max-cpu-loras 100 --enable-lora --lora-modules test-lora=/llm/lora_models/Qwen1.5-4B-Chat_lora_huanhuan
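
A minimal sketch of how the two outputs can be compared through the OpenAI-compatible API exposed by `vllm serve` (the host/port, prompt, and use of the `openai` Python client are assumptions; the model names come from the startup command above):

```python
# Compare base-model vs LoRA responses via the OpenAI-compatible API.
# Assumptions: server on localhost:8000, test prompt "Who are you?".
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def ask(model_name: str) -> str:
    # Send the same prompt with greedy sampling so the outputs are comparable.
    resp = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": "Who are you?"}],
        temperature=0.0,
        max_tokens=128,
    )
    return resp.choices[0].message.content

base_out = ask("/llm/model/Qwen1.5-4B-Chat")  # base model, served under its path
lora_out = ask("test-lora")                   # LoRA adapter from --lora-modules

print("base:", base_out)
print("lora:", lora_out)
print("identical:", base_out == lora_out)
```

With standard vllm this prints `identical: False`; with vllm-ascend v0.11.0rc0 it prints `identical: True`.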
