Many open source projects provide OpenAI API-compatible `completions` and `chat/completions` endpoints, but do not support the `embeddings` endpoint.
The goal of this project is to provide an OpenAI API-compatible version of the `embeddings` endpoint, serving open source sentence-transformers models and other models supported by LangChain's [HuggingFaceEmbeddings](https://api.python.langchain.com/en/latest/embeddings/langchain.embeddings.huggingface.HuggingFaceEmbeddings.html), `HuggingFaceInstructEmbeddings`, and `HuggingFaceBgeEmbeddings` classes.
To run the embeddings endpoint locally as a standalone FastAPI server, follow these steps:
1. Install the dependencies by executing the following commands:
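   The repository lists the exact commands; a typical Python setup, assuming a standard `requirements.txt` (an assumption, not confirmed by this excerpt), might look like:

   ```bash
   # Hypothetical setup; defer to the repository's actual instructions.
   python -m venv venv
   source venv/bin/activate
   pip install -r requirements.txt
   ```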
2. Run the server with the desired model using the following command, which enables normalized embeddings (omit `NORMALIZE_EMBEDDINGS` if the model does not support embedding normalization):
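   The project's exact launch command is not reproduced here; as an illustrative sketch only, assuming the app is exposed as `app.main:app` (a hypothetical module path) and that the model is selected via a `MODEL` environment variable (also an assumption), it might resemble:

   ```bash
   # Illustrative only: MODEL and the app module path are assumptions,
   # not the project's documented interface. NORMALIZE_EMBEDDINGS is the
   # variable named in this README.
   MODEL=sentence-transformers/all-MiniLM-L6-v2 \
   NORMALIZE_EMBEDDINGS=1 \
   uvicorn app.main:app --host 0.0.0.0 --port 8000
   ```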
If a GPU is detected in the runtime environment, the server will automatically run in `cuda` mode. However, you can set the `DEVICE` environment variable to choose between `cpu` and `cuda`. Here's an example of how to run the server with your desired configuration:
This setup allows you to seamlessly switch between CPU and GPU modes, giving you control over the server's performance based on your specific requirements.
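A hypothetical invocation showing the `DEVICE` override (the environment variable comes from this README; the `uvicorn app.main:app` module path is an assumption):

```bash
# Force CPU execution even when a GPU is available.
DEVICE=cpu uvicorn app.main:app --host 0.0.0.0 --port 8000

# Explicitly select GPU execution.
DEVICE=cuda uvicorn app.main:app --host 0.0.0.0 --port 8000
```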
3. You will see the following output in your console once the server has started:
   ```bash
   INFO:     Started server process [19705]
   INFO:     Waiting for application startup.
   INFO:     Application startup complete.
   INFO:     Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
   ```
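Once the server is running, you can exercise the OpenAI-compatible endpoint. Assuming it exposes the standard OpenAI-style `/v1/embeddings` route (an assumption; check the project's route definitions), a request might look like:

```bash
# Hypothetical request; the route and payload mirror the OpenAI embeddings API.
curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "The quick brown fox", "model": "text-embedding-ada-002"}'
```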
## AWS Lambda Function
To get started:
1. Install the dependencies by executing the following command: