LLMs (Large Language Models) are increasingly used to power chatbots and other knowledge assistants. While these models are pre-trained on vast swaths of information, they are not trained on your own private data or knowledge base. To overcome this, you need to provide this data to the LLM (a process called context augmentation). This tutorial showcases a particular method of context augmentation called Retrieval-Augmented Generation (RAG), which indexes your data and attaches relevant data as context when users send queries to the LLM.
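To make the retrieval step concrete, here is a minimal, self-contained sketch of the retrieve-then-augment loop. This is purely illustrative and not part of the deployment: the documents, the bag-of-words "embedding", and the prompt format are all toy stand-ins for the real components used later in this tutorial.

```file {title="rag_sketch.py"}
"""Toy illustration of RAG (illustrative only, not part of the deployment)."""
import math
from collections import Counter

DOCUMENTS = [
    "LKE is Akamai's managed Kubernetes service.",
    "Milvus stores vector embeddings for similarity search.",
]

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a simple bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm

def build_prompt(query: str) -> str:
    # Retrieve the most relevant document and attach it to the query as context.
    context = max(DOCUMENTS, key=lambda doc: cosine(embed(query), embed(doc)))
    return f"Context: {context}\n\nQuestion: {query}"

# In a real RAG pipeline, this augmented prompt is what gets sent to the LLM.
print(build_prompt("What does Milvus store?"))
```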
Follow this tutorial to deploy a RAG pipeline on Akamai’s LKE service using our latest GPU instances. Once deployed, you will have a web chatbot that can respond to queries using data from your own custom data source, as shown in the screenshot below.
## Diagram
- **Kubeflow:** This open-source software platform includes a suite of applications used for machine learning tasks. It is designed to run on Kubernetes. While each application can be installed individually, this tutorial installs all default applications and makes specific use of the following:
- **KServe:** Serves machine learning models. This tutorial installs the Llama 3 LLM to KServe, which then serves it to other applications, such as the chatbot UI.
- **Kubeflow Pipeline:** Used to deploy pipelines: reusable machine learning workflows built using the Kubeflow Pipelines SDK. In this tutorial, a pipeline is used to run LlamaIndex to process the dataset and store embeddings.
- **Meta’s Llama 3 LLM:** The [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) model is used as the LLM. You should review and agree to the licensing agreement before deploying.
- **Milvus:** An open-source vector database designed for generative AI workloads. This tutorial uses Milvus to store embeddings generated by LlamaIndex and make them available to queries sent to the Llama 3 LLM.
- **Open WebUI:** A self-hosted AI chatbot application that’s compatible with LLMs like Llama 3 and includes a built-in inference engine for RAG solutions. Users interact with this interface to query the LLM. It can be configured to send queries straight to Llama 3 or to first load data from Milvus and send that context along with the query.
{{< note >}}
The configuration instructions in this document are expected to not expose any services publicly.
Securing this configuration for a production deployment is outside the scope of this document.
{{< /note >}}
## Set up infrastructure
The first step is to provision the infrastructure needed for this tutorial and configure it with kubectl, so that you can manage the cluster locally and install software through Helm. As part of this process, we also install the NVIDIA GPU Operator so that the NVIDIA cards within the GPU worker nodes can be used on Kubernetes.
1. **Provision an LKE cluster.** We recommend using at least three **RTX4000 Ada x1 Medium** GPU plans (plan ID: `g2-gpu-rtx4000a1-m`), though you can adjust this as needed. For reference, Kubeflow recommends 32 GB of RAM and 16 CPU cores for its own applications alone. This tutorial has been tested using Kubernetes v1.31, though other versions should also work. To learn more about provisioning a cluster, see the [Create a cluster](https://techdocs.akamai.com/cloud-computing/docs/create-a-cluster) guide.
{{< note noTitle=true >}}
GPU plans are available in a limited number of data centers. Review the [GPU product documentation](https://techdocs.akamai.com/cloud-computing/docs/gpu-compute-instances#availability) to learn more about availability.
{{< /note >}}
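If the GPU Operator is not yet installed, a typical installation uses NVIDIA's public Helm chart. The release name and `gpu-operator` namespace below are common defaults, not requirements; adjust them to match your setup:

```command
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia && helm repo update
helm install gpu-operator nvidia/gpu-operator -n gpu-operator --create-namespace
```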
You can confirm that the operator has been installed on your cluster by reviewing your pods. You should see a number of pods in the `gpu-operator` namespace.
```command
kubectl get pods -n gpu-operator
```
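To further verify that the GPUs themselves are schedulable, you can check each node's allocatable resources. Once the operator is running, the `GPU` column should show a non-zero count:

```command
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
```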
### Deploy Kubeflow
Next, let’s deploy Kubeflow on the LKE cluster. These instructions deploy all of the default Kubeflow applications.

After the installation completes, check the status of the pods on your cluster:

```command
kubectl get pods -A
```
You may notice a status of `CrashLoopBackOff` on one or more pods. This can be caused by a temporary issue with a persistent volume attaching to a worker node and should resolve within a minute or so.
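If a pod remains in this state, you can inspect its events to confirm the cause:

```command
kubectl describe pod <pod-name> -n <namespace>
```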
### Install Llama 3 LLM on KServe
After Kubeflow has been installed, we can now deploy the Llama 3 LLM to KServe. This tutorial uses HuggingFace (a platform that provides pre-trained AI models) to deploy Llama 3 to the LKE cluster. Specifically, these instructions use the [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) model.
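As a rough sketch of what a KServe deployment of this model can look like, here is a minimal example using the KServe Python SDK and its Hugging Face serving runtime. This is not the tutorial's exact manifest: the service name, namespace, resource sizing, and the `hf-secret` Secret holding a Hugging Face access token are all illustrative assumptions.

```file {title="llama3_isvc_sketch.py"}
"""Sketch: deploy Meta-Llama-3-8B on KServe's Hugging Face runtime."""
from kubernetes import client
from kserve import (
    KServeClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1ModelFormat,
    V1beta1ModelSpec,
    V1beta1PredictorSpec,
    constants,
)

isvc = V1beta1InferenceService(
    api_version=constants.KSERVE_GROUP + "/v1beta1",
    kind="InferenceService",
    metadata=client.V1ObjectMeta(name="llama3", namespace="default"),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            model=V1beta1ModelSpec(
                model_format=V1beta1ModelFormat(name="huggingface"),
                args=[
                    "--model_name=llama3",
                    "--model_id=meta-llama/Meta-Llama-3-8B",
                ],
                # Assumes a Secret named "hf-secret" holding your Hugging Face
                # token; the gated Llama 3 model requires an accepted license.
                env=[
                    client.V1EnvVar(
                        name="HF_TOKEN",
                        value_from=client.V1EnvVarSource(
                            secret_key_ref=client.V1SecretKeySelector(
                                name="hf-secret", key="HF_TOKEN"
                            )
                        ),
                    )
                ],
                resources=client.V1ResourceRequirements(
                    limits={"cpu": "6", "memory": "24Gi", "nvidia.com/gpu": "1"}
                ),
            )
        )
    ),
)

KServeClient().create(isvc)
```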
Kubeflow Pipeline pulls together the entire workflow for ingesting data from our data source. The pipeline performs the following steps:
1. Downloads a zip archive from the specified URL.
1. Uses LlamaIndex to read the Markdown files within the archive.
1. Generates embeddings from the content of those files.
1. Stores the embeddings within the Milvus database collection.
Keep this workflow in mind when going through the Kubeflow Pipeline setup steps. If you require a different pipeline workflow, adjust the Python file and Kubeflow Pipeline configuration discussed in this section.
This tutorial employs a Python script to create the YAML file used within Kubeflow Pipelines.

1. Install the Kubeflow Pipelines SDK on your local machine:

```command
pip install kfp
```
1. Use the following Python script to generate a YAML file for the Kubeflow Pipeline. This script configures the pipeline to download the Markdown data you wish to ingest, read the content using LlamaIndex, generate embeddings of the content, and store the embeddings in the Milvus database. Replace values as needed before proceeding.
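A condensed sketch of such a script, using the Kubeflow Pipelines (`kfp`) v2 SDK, is shown below. The dataset URL, embedding model, Milvus address, and collection name are illustrative assumptions to replace with your own values:

```file {title="generate_pipeline_sketch.py"}
"""Condensed sketch of a Kubeflow Pipeline that ingests Markdown data.

Illustrative only: the Milvus address, collection name, and embedding
model are assumptions to replace with your own values.
"""
from kfp import compiler, dsl


@dsl.component(
    base_image="python:3.11",
    packages_to_install=[
        "requests",
        "llama-index",
        "llama-index-embeddings-huggingface",
        "llama-index-vector-stores-milvus",
    ],
)
def ingest_markdown(dataset_url: str, milvus_uri: str, collection: str):
    import io
    import tempfile
    import zipfile

    import requests
    from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    from llama_index.vector_stores.milvus import MilvusVectorStore

    # 1. Download the zip archive and unpack it.
    workdir = tempfile.mkdtemp()
    archive = requests.get(dataset_url, timeout=120)
    zipfile.ZipFile(io.BytesIO(archive.content)).extractall(workdir)

    # 2. Read the Markdown files with LlamaIndex.
    documents = SimpleDirectoryReader(workdir, required_exts=[".md"], recursive=True).load_data()

    # 3. Generate embeddings (BAAI/bge-large-en-v1.5 produces 1024-dim vectors).
    embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")

    # 4. Store the embeddings in the Milvus collection.
    vector_store = MilvusVectorStore(uri=milvus_uri, collection_name=collection, dim=1024, overwrite=True)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    VectorStoreIndex.from_documents(documents, storage_context=storage_context, embed_model=embed_model)


@dsl.pipeline(name="rag-ingest")
def rag_ingest(
    dataset_url: str,
    milvus_uri: str = "http://milvus.milvus.svc.cluster.local:19530",
    collection: str = "rag",
):
    ingest_markdown(dataset_url=dataset_url, milvus_uri=milvus_uri, collection=collection)


if __name__ == "__main__":
    compiler.Compiler().compile(rag_ingest, package_path="rag-ingest-pipeline.yaml")
```

Running a script like this writes the pipeline definition (here, `rag-ingest-pipeline.yaml`), which you can then upload when creating a pipeline in the Kubeflow Pipelines UI.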
Despite the naming, these RAG pipeline files are not related to the Kubeflow pipeline.
1. Create a new directory on your local machine and navigate to that directory.
1. Create a `pipeline-requirements.txt` file with the following contents:
```file {title="pipeline-requirements.txt"}
requests
llama-index-llms-openai-like
```
1. Create a file called `rag_pipeline.py` with the following contents. The filenames of both the `pipeline-requirements.txt` and `rag_pipeline.py` files should not be changed, as they are referenced within the Open WebUI Pipeline configuration file.
```file {title="rag_pipeline.py"}
"""
Now that the chatbot has been configured, the final step is to access the chatbot.
1. The first time you access this interface, you are prompted to create an admin account. Do this now, and then continue once you are successfully logged in using that account.
1. You should now be presented with the chatbot interface. Within the dropdown menu, you should be able to select from several models. Select one and ask it a question.
- The **llama3** model uses only the information it was trained on (not your own custom data). If you ask this model a question, the data from your own dataset is not used.
- The **RAG Pipeline** model that you defined in a previous section does use data from your custom dataset. Ask it a question relevant to your data, and the chatbot should respond with an answer informed by the custom dataset you configured.
{{< note type="warning" title="This app is no longer available for deployment" >}}
ARK: Survival Evolved has been removed from the App Marketplace and can no longer be deployed. This guide is retained for reference only. For information on how to deploy and set up ARK: Survival Evolved manually on a Compute Instance, see our [Creating a Dedicated ARK Server on Ubuntu](/docs/guides/create-an-ark-server-on-ubuntu) guide.
{{< /note >}}
[ARK: Survival Evolved](http://playark.com/ark-survival-evolved/) is a multiplayer action-survival game released in 2017. The game places you on a series of fictional islands inhabited by dinosaurs and other prehistoric animals. In ARK, the main objective is to survive. ARK is an ongoing battle where animals and other players have the ability to destroy you. To survive, you must build structures, farm resources, breed dinosaurs, and even set up trading hubs with neighboring tribes.
{{< note type="warning" title="This app is no longer available for deployment" >}}
Budibase has been removed from the App Marketplace and can no longer be deployed. This guide is retained for reference only.
{{< /note >}}
[Budibase](https://github.com/Budibase/budibase) is an open-source, low-code platform for building modern business applications. Build, design, and automate different types of applications, including admin panels, forms, internal tools, and client portals. Using Budibase helps developers avoid spending weeks building simple CRUD applications and, instead, allows them to complete many projects in significantly less time.