Commit 1e6155a (parent 825e528): Incorporated some of the comments

7 files changed: +341 −75 lines changed

modules/developer-lightspeed/con-about-lightspeed-stack-and-llama-stack.adoc

Lines changed: 7 additions & 5 deletions
@@ -3,15 +3,17 @@
 [id="con-about-lightspeed-stack-and-llama-stack_{context}"]
 = About {lcs-name} and Llama Stack
 
-The **{lcs-name} ({lcs-short})** and Llama Stack deploy together as sidecar containers to augment {rhdh-short} functionality.
+The {lcs-name} ({lcs-short}) and Llama Stack deploy together as sidecar containers to augment {rhdh-short} functionality.
 
-{lcs-short} acts as an intermediary service layer for interfacing with Large Language Model (LLM) providers. {lcs-short} handles LLM provider setup, authentication, and includes key functionalities such as question validation, user feedback collection, and Retrieval Augmented Generation (RAG).
+{lcs-short} serves as the Llama Stack service intermediary, managing configurations for key components. These components include the Large Language Model (LLM), inference providers, tool runtime providers, safety providers, and Retrieval Augmented Generation (RAG) settings.
 
-Llama Stack provides the model server functionality that {lcs-short} uses to process requests. This service requires a Kubernetes Secret to securely store environment variables for the chosen LLM provider.
+* **{lcs-short}** manages authentication, user feedback collection, MCP server configuration, and caching.
 
-The {ls-brand-name} plugin within {rhdh-short} communicates with the {lcs-short} sidecar to send prompts and receives responses from the configured LLM service. The {lcs-short} sidecar centralizes the LLM interaction logic and configuration alongside your {rhdh-short} instance.
+* Llama Stack provides the inference functionality that {lcs-short} uses to process requests. The service requires a **Kubernetes Secret** to securely store environment variables for the chosen LLM provider.
+
+* The {ls-brand-name} plugin in {rhdh-short} sends prompts and receives LLM responses through the {lcs-short} sidecar. {lcs-short} then uses the Llama Stack sidecar service to perform inference and MCP tool calling.
 
 [NOTE]
 ====
-{ls-brand-name} is a Developer Preview release. You must manually deploy the {lcs-name} and Llama Stack sidecar containers, and then install the {ls-brand-name} plugin on your {rhdh-short} instance.
+{ls-brand-name} is a Developer Preview release. You must manually deploy the {lcs-name} and Llama Stack sidecar containers, and install the {ls-brand-name} plugin on your {rhdh-short} instance.
 ====
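One of the bullets above notes that the Llama Stack service requires a Kubernetes Secret for the chosen LLM provider's environment variables. A minimal sketch follows, assuming an OpenAI-compatible provider; the Secret name and key are illustrative assumptions, not the shipped configuration:

```yaml
# Hypothetical example: a Secret holding LLM provider credentials for the
# Llama Stack sidecar. The name and key are assumptions; substitute the
# environment variables that your provider expects.
apiVersion: v1
kind: Secret
metadata:
  name: llama-stack-provider-secrets   # assumed name; reference it from the sidecar's envFrom
type: Opaque
stringData:
  OPENAI_API_KEY: <your-provider-api-key>   # placeholder; never commit real keys
```

The sidecar container would then consume this Secret through `envFrom` or `secretKeyRef` entries in its Deployment, keeping credentials out of the configuration files.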

modules/developer-lightspeed/con-rag-embeddings.adoc

Lines changed: 2 additions & 2 deletions
@@ -3,6 +3,6 @@
 [id="con-rag-embeddings_{context}"]
 = Retrieval Augmented Generation embeddings
 
-The {product} documentation set has been added as a Retrieval-Augmented Generation (RAG) data source.
+The {product} documentation set serves as the Retrieval-Augmented Generation (RAG) data source.
 
-The {lcs-name} ({lcs-short}) sidecar handles the RAG process, using specialized Llama Stack components to manage the documentation data and generate vector embeddings.
+RAG initialization occurs through an init container, which copies the RAG data to a shared volume. The Llama Stack sidecar then mounts this volume to access the data. The Llama Stack service uses the resulting RAG embeddings in the vector database as a tool, which allows the service to provide references to the product documentation during inference.
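The init-container flow described here can be sketched as the following Pod spec fragment. All image names, volume names, and mount paths are assumptions for illustration only:

```yaml
# Illustrative sketch of the RAG init-container pattern: an init container
# copies bundled RAG data to a shared emptyDir volume, which the Llama Stack
# sidecar then mounts read-only. Names and paths are assumptions.
spec:
  initContainers:
    - name: rag-data-init
      image: <rag-content-image>          # image that bundles the RAG data
      command: ["sh", "-c", "cp -a /rag/. /shared/rag-data/"]
      volumeMounts:
        - name: rag-data
          mountPath: /shared/rag-data
  containers:
    - name: llama-stack
      image: <llama-stack-image>
      volumeMounts:
        - name: rag-data
          mountPath: /rag-data
          readOnly: true
  volumes:
    - name: rag-data
      emptyDir: {}
```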

modules/developer-lightspeed/con-supported-architecture.adoc

Lines changed: 2 additions & 7 deletions
@@ -3,14 +3,9 @@
 [id="con-supported-architecture_{context}"]
 = Supported architecture for {ls-brand-name}
 
-{ls-short} is available as a plugin on all platforms that host {product-very-short}. It requires the use of {lcs-name}, which runs as a sidecar container. {lcs-short} acts as the intermediary layer for interfacing with Large Language Model (LLM) providers.
+{ls-short} is available as a plugin on all platforms that host {product-very-short}. It requires two sidecar containers: the {lcs-name} and the Llama Stack service.
 
-{ls-short} handles functionalities like question validation, feedback, and Retrieval-Augmented Generation (RAG) by leveraging components often built with Llama Stack.
-
-[NOTE]
-====
-Currently, the provided {lcs-short} image is built for x86 platforms. To use other platforms (for example, arm64), you must ensure that you enable emulation.
-====
+The {lcs-short} container acts as the intermediary layer, which interfaces with and manages the Llama Stack service.
 
 .Additional resources
 * link:https://access.redhat.com/support/policy/updates/developerhub[{product} Life Cycle and supported platforms]

modules/developer-lightspeed/proc-configuring-mcp-components.adoc

Lines changed: 18 additions & 9 deletions
@@ -3,11 +3,12 @@
 [id="proc-configure-mcp-components_{context}"]
 = Configuring the Model Context Protocol (MCP)
 
-You can configure the Model Context Protocol (MCP) to leverage {ls-short}. With MCP configuration, the deployed {lcs-short} and Llama Stack server utilize external tools and retrieve real-time data. This allows the virtual assistant to execute complex actions and incorporate current operational context into its responses. You must synchronize settings across three components: the Llama Stack tool definition, the {lcs-short} server endpoint definition, and the {ls-brand-name} Plugin authorization token.
+To leverage Model Context Protocol (MCP) servers, you must configure them within the Llama Stack service. Configuring the MCP in {lcs-short} enables the deployed {lcs-short} and Llama Stack to use external tools and retrieve real-time data. This integration allows the virtual assistant to execute complex actions and incorporate current operational context into its responses.
+
+Configuration requires synchronizing settings across three components: the Llama Stack tool definition, the {lcs-short} MCP server endpoint definition, and the {ls-brand-name} plugin MCP authorization token.
 
 .Procedure
-. Configure the {llama-name} tool definition.
-Define the MCP provider in the `tool_runtime` section of the Llama Stack configuration file (`run.yaml`).
+. Configure the Llama Stack tool definition by defining the MCP provider in the `tool_runtime` section of the Llama Stack configuration file (`run.yaml`), as shown in the following code:
 +
 [source,yaml]
 ----
@@ -18,24 +19,32 @@ providers:
     config: {}
 ----
 
-. Configure the {lcs-short} server endpoint.
-In the {lcs-short} configuration file (`lightspeed-stack.yaml`), define the MCP server endpoints. The `provider_id` must match the Llama Stack definition (`model-context-protocol`).
-
+. Configure the {lcs-short} MCP server endpoint by defining the MCP server endpoints in the {lcs-short} configuration file (`lightspeed-stack.yaml`), as shown in the following code:
++
 [source,yaml]
 ----
 mcp_servers:
   - name: mcp::backstage
     provider_id: model-context-protocol
    url: https://rhdh-mcp-proxy-apicast-production.apps.rosa.redhat-ai-dev.m6no.p3.openshiftapps.com:443/api/mcp-actions/v1
 ----
++
+[IMPORTANT]
+====
+`provider_id` must match the Llama Stack definition (`model-context-protocol`).
+====
 
-. Configure the {ls-brand-name} plugin authorization token.
-In the {ls-brand-name} plugin configuration file (`lightspeed-app-config.yaml`), specify the MCP servers and provide the token for authorization. This token is used when the plugin makes requests to the {lcs-short} `/v1/streaming_query` endpoint.
+. Configure the {ls-brand-name} plugin authorization token by specifying the MCP servers and providing the token for authentication in the {ls-brand-name} plugin configuration file (`lightspeed-app-config.yaml`), as shown in the following code:
 +
 [source,yaml]
 ----
 lightspeed:
   mcpServers:
     - name: mcp::backstage
      token: ${MCP_TOKEN}
-----
+----
++
+[IMPORTANT]
+====
+`name` must match the {lcs-name} definition (`mcp::backstage`).
+====
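For orientation, the `tool_runtime` fragment referenced in step 1 (shown only as `config: {}` in this hunk) might expand to something like the following sketch. The `provider_type` value is assumed from upstream Llama Stack conventions and should be verified against your Llama Stack version:

```yaml
# Sketch of a run.yaml tool_runtime provider definition for MCP.
# provider_type is an assumption based on upstream Llama Stack naming.
providers:
  tool_runtime:
    - provider_id: model-context-protocol
      provider_type: remote::model-context-protocol
      config: {}
```

The `provider_id` here is the value that both `lightspeed-stack.yaml` (`mcp_servers[].provider_id`) and, transitively, the plugin configuration must agree on.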

modules/developer-lightspeed/proc-gathering-feedback.adoc

Lines changed: 10 additions & 4 deletions
@@ -3,11 +3,15 @@
 [id="proc-gathering-feedback_{context}"]
 = Gathering feedback in {ls-short}
 
-Feedback collection is an optional feature configured on the {lcs-short}. This feature gathers user feedback by providing thumbs-up/down ratings and text comments directly from the chat window. {lcs-short} gathers the feedback, along with the user's query and the response of the model, and stores it as a JSON file within the local file system of the Pod for later collection and analysis by the platform administrator. This can be useful for assessing model performance and improving your users' experience. The collected feedback is stored in the cluster where {product-very-short} and {lcs-short} are deployed, and as such, is only accessible by the platform administrators for that cluster. For users that intend to have their data removed, they must request their respective platform administrator to perform that action as {company-name} does not collect (or have access to) any of this data.
+Feedback collection is an optional feature configured on the {lcs-short}. This feature gathers user feedback by providing thumbs-up/down ratings and text comments directly from the chat window.
+
+{lcs-short} collects the feedback, the user's query, and the response of the model, storing the data as a JSON file on the local file system of the Pod. A platform administrator must later collect and analyze this data to assess model performance and improve the user experience.
+
+The collected data resides in the cluster where {product-very-short} and {lcs-short} are deployed, making it accessible only to platform administrators for that cluster. For data removal, users must request this action from their platform administrator, as {company-name} neither collects nor accesses this data.
 
 .Procedure
 
-* To enable or disable feedback, in your {lcs-short} configuration file, add the following settings:
+. To enable or disable feedback collection, in the {lcs-short} configuration file (`lightspeed-stack.yaml`), add the following settings:
 +
 [source,yaml]
 ----
@@ -16,6 +20,8 @@ llm_providers:
 lightspeed_stack:
   ......
   user_data_collection:
-    feedback_disabled: <true/false>
-    feedback_storage: "/app-root/tmp/data/feedback"
+    feedback_enabled: true
+    feedback_storage: "/tmp/data/feedback"
+    transcripts_enabled: true
+    transcripts_storage: "/tmp/data/transcripts"
 ----
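Because the feedback and transcript files live on the Pod's local file system, a platform administrator might copy them off for analysis with commands along these lines. The namespace, Pod, and container names are placeholders, and the source paths assume the `feedback_storage` and `transcripts_storage` settings shown above:

```shell
# Hypothetical example: copy collected data off the running sidecar.
# <namespace>, <rhdh-pod>, and the container name are placeholders.
kubectl cp <namespace>/<rhdh-pod>:/tmp/data/feedback ./feedback -c lightspeed-stack
kubectl cp <namespace>/<rhdh-pod>:/tmp/data/transcripts ./transcripts -c lightspeed-stack
```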
