To create a public link, set `share=True` in `launch()`.
```

# 4_fastchat
[FastChat](https://github.com/lm-sys/FastChat) is an open-source platform for training, serving, and evaluating large language model chatbots. It allows an inference engine backend to be integrated into the platform as a worker, providing services compatible with the OpenAI API.

In [examples/python/4_fastchat/dashinfer_worker.py](../../examples/python/4_fastchat/dashinfer_worker.py), we provide sample code demonstrating how to implement a worker using FastChat and DashInfer. Users can simply substitute `dashinfer_worker` for the default `fastchat.serve.model_worker` in the FastChat serving stack, as sketched below, obtaining a solution that is both compatible with the OpenAI API and optimized for efficient CPU inference.
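A FastChat deployment runs three components: a controller, one or more model workers, and an OpenAI API server. The following is a minimal sketch of the substitution; the controller and API server commands are standard FastChat, while the `dashinfer_worker.py` arguments are assumptions modeled on `fastchat.serve.model_worker`, so consult the example script for its actual flags.

```shell
# Standard FastChat controller.
python -m fastchat.serve.controller

# DashInfer-backed worker in place of fastchat.serve.model_worker.
# The arguments below are assumptions; see dashinfer_worker.py for the
# actual interface.
python dashinfer_worker.py --model-path qwen/Qwen-7B-Chat <dashinfer_json_config_file>

# Standard FastChat OpenAI-compatible API server.
python -m fastchat.serve.openai_api_server --host localhost --port 8000
```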
## Step 4: Send HTTP Request via cURL to Access OpenAI API-Compatible Endpoint
```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen-7B-Chat",
        "messages": [{"role": "user", "content": "Hello! What is your name?"}]
      }'
```
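On success, the endpoint returns a standard OpenAI-style `chat.completion` JSON object, with the model's reply in `choices[0].message.content`.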
## Quick Start with Docker
Furthermore, we provide a convenient Docker image, enabling rapid deployment of an HTTP service that integrates `dashinfer_worker` and is compatible with the OpenAI API. Execute the launch command, replacing the bracketed placeholders with actual paths:

- `<host_path_to_your_model>`: the path on the host where the ModelScope/HuggingFace model resides.
- `<container_path_to_your_model>`: the path inside the container at which the ModelScope/HuggingFace model is mounted.
- `<host_path_to_dashinfer_json_config_file>`: the location of the DashInfer JSON configuration file on the host.
- `<container_path_to_dashinfer_json_config_file>`: the destination path of the DashInfer JSON configuration file inside the container.
- The `-m` flag denotes the ModelScope/HuggingFace model path inside the container, which is determined by the host-to-container binding specified with `-v`. If it refers to a standard ModelScope/HuggingFace model ID (e.g., `qwen/Qwen-7B-Chat`), there is no need to bind the model path from the host; the container will download the model automatically.

Below is an example of launching a Qwen-7B-Chat model service, with the host defaulting to localhost and the port to 8000.
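The sketch below shows how the pieces above might fit together. Here `<dashinfer_fastchat_image>` is a placeholder for the actual image tag, and the entrypoint arguments are assumptions derived from the placeholder descriptions above, so consult the repository for the exact command.

```shell
# Illustrative sketch only: <dashinfer_fastchat_image> and the entrypoint
# arguments are placeholders/assumptions, not the official invocation.
docker run -d --network host \
    -v <host_path_to_your_model>:<container_path_to_your_model> \
    -v <host_path_to_dashinfer_json_config_file>:<container_path_to_dashinfer_json_config_file> \
    <dashinfer_fastchat_image> \
    -m <container_path_to_your_model> \
    <container_path_to_dashinfer_json_config_file>
```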