[Bug]: Online API Inference Generates Repetitive "batis)batis)" Pattern Instead of Coherent Text #409

@cheliu0086

Your current environment

`python collect_env.py` failed because this is an NVIDIA L20 machine with no Ascend modules.

🐛 Describe the bug

Online API inference generates a repetitive "batis)batis)" pattern instead of coherent text.
I followed the "quick start" guide.

Other information:
Offline inference (offline_inference.py) works correctly.
Source code version: tag v0.1.0rc4

key logs:
```
curl http://localhost:7800/v1/completions -H "Content-Type: application/json" -d '{
"model": "vllm_cpu_offload",
"prompt": "Shanghai is a",
"max_tokens": 7,
"temperature": 0
}'
```
{"id":"cmpl-42e9ba776e284460a48bcbdafb848d0e","object":"text_completion","created":1764055654,"model":"vllm_cpu_offload","choices":[{"index":0,"text":"batis)batis)batis)batis","logprobs":null,"finish_reason":"length","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":4,"total_tokens":11,"completion_tokens":7,"prompt_tokens_details":null},"kv_transfer_params":null}(modelscope-py310)
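
For anyone trying to reproduce this, the request above can also be sent from Python, which makes it easier to flag the degenerate output automatically. This is only a rough sketch: the URL, model name, and prompt are copied from the curl command, while `build_payload`, `complete`, and `repeated_unit` are hypothetical helper names made up for illustration, not part of vLLM.

```python
import json
import urllib.request

# Endpoint and model name are taken from the curl command in this report.
URL = "http://localhost:7800/v1/completions"


def build_payload(prompt="Shanghai is a", max_tokens=7):
    """Build the same JSON body as the curl request (greedy decoding)."""
    return {
        "model": "vllm_cpu_offload",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0,
    }


def complete(payload, url=URL):
    """POST the payload and return the text of the first choice.

    Assumes the server answers in the response shape shown in this report.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["choices"][0]["text"]


def repeated_unit(text, min_repeats=3):
    """Return the shortest unit that `text` repeats from its start, else None.

    The text must contain at least `min_repeats` full copies of the unit;
    a trailing partial copy (cut off by max_tokens) is allowed.
    """
    for size in range(1, len(text) // min_repeats + 1):
        unit = text[:size]
        # Repeat the unit past the end of the text, then compare prefixes.
        if (unit * (len(text) // size + 1)).startswith(text):
            return unit
    return None
```

With the server running, `repeated_unit(complete(build_payload()))` should return a non-`None` unit whenever the bug reproduces. Running the same check against other prompts may help show whether the pattern depends on the input.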

More details are in the attachment:
log20251125.txt
