35 changes: 35 additions & 0 deletions ai/gen-ai-agents/voice-ai-agent/LICENSE
@@ -0,0 +1,35 @@
Copyright (c) 2025 Oracle and/or its affiliates.

The Universal Permissive License (UPL), Version 1.0

Subject to the condition set forth below, permission is hereby granted to any
person obtaining a copy of this software, associated documentation and/or data
(collectively the "Software"), free of charge and under any and all copyright
rights in the Software, and any and all patent rights owned or freely
licensable by each licensor hereunder covering either (i) the unmodified
Software as contributed to or provided by such licensor, or (ii) the Larger
Works (as defined below), to deal in both

(a) the Software, and
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
one is included with the Software (each a "Larger Work" to which the Software
is contributed by such licensors),

without restriction, including without limitation the rights to copy, create
derivative works of, display, perform, and distribute the Software and make,
use, sell, offer for sale, import, export, have made, and have sold the
Software and the Larger Work(s), and to sublicense the foregoing rights on
either these or other terms.

This license is subject to the following condition:
The above copyright notice and either this complete permission notice or at
a minimum a reference to the UPL must be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
52 changes: 52 additions & 0 deletions ai/gen-ai-agents/voice-ai-agent/README.md
@@ -0,0 +1,52 @@
# Voice AI Agent (OCI Realtime Speech + Generative AI Agent)

**Author:** msliwins
**Last review date:** 2025-12-05

A small voice assistant that:

1. Listens to your microphone with VAD (voice activity detection),
2. Streams audio to **OCI Realtime Speech** for STT,
3. Sends the recognized text to an **OCI Generative AI Agent Endpoint**,
4. Uses **OCI Text-to-Speech** to speak the answer back.

Everything runs in a loop until you stop it with `Ctrl+C`.
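
The four steps above can be sketched as a single turn function plus an outer loop. This is a minimal illustration, not the code in `main.py`: the names `run_turn`, `run_forever`, and the four callables (`record`, `transcribe`, `ask_agent`, `speak`) are hypothetical stand-ins for the microphone, OCI Realtime Speech, agent endpoint, and Text-to-Speech calls.

```python
from typing import Callable, Optional


def run_turn(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    ask_agent: Callable[[str], str],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run one listen -> STT -> agent -> TTS turn and return the answer."""
    audio = record()                    # step 1: VAD-gated microphone capture
    text = transcribe(audio).strip()    # step 2: speech-to-text
    if not text:
        return None                     # nothing recognized, skip the agent call
    answer = ask_agent(text)            # step 3: agent endpoint
    speak(answer)                       # step 4: text-to-speech playback
    return answer


def run_forever(record, transcribe, ask_agent, speak) -> None:
    """Repeat turns until the user presses Ctrl+C."""
    try:
        while True:
            run_turn(record, transcribe, ask_agent, speak)
    except KeyboardInterrupt:
        print("Stopped.")
```

Passing the four stages in as callables keeps the loop testable without a microphone or an OCI connection.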

---

## Features

- 🎙️ Voice Activity Detection (VAD)
  Automatically starts recording when you speak and stops after a short silence.

- 🧠 Generative AI Agent integration
  Uses an OCI Generative AI Agent Endpoint to handle conversation and tools.

- 🗣️ Text-to-Speech
  Uses OCI AI Speech to synthesize responses and plays them locally.

- 🔁 Persistent agent session
  A single agent session is reused across turns to keep conversational context.

- 🧪 Debug traces
  Optionally saves agent traces to `traces.json` for debugging.
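
The README does not say how the VAD decides that speech has started or stopped. One common approach is an energy threshold, sketched below under assumptions: 16-bit mono PCM frames, and the hypothetical names `rms`, `capture_utterance`, `threshold`, and `max_silent_frames`. The script itself may use a different method.

```python
import math
from array import array
from typing import Iterable, List


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a 16-bit PCM frame (native byte order)."""
    samples = array("h", frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def capture_utterance(
    frames: Iterable[bytes],
    threshold: float = 500.0,      # assumed loudness cutoff, tune per microphone
    max_silent_frames: int = 10,   # assumed length of the "short silence"
) -> List[bytes]:
    """Collect frames from the first loud frame until enough quiet frames in a row."""
    recorded: List[bytes] = []
    silent_run = 0
    started = False
    for frame in frames:
        loud = rms(frame) >= threshold
        if not started:
            if not loud:
                continue           # still waiting for speech to begin
            started = True
        recorded.append(frame)
        silent_run = 0 if loud else silent_run + 1
        if silent_run >= max_silent_frames:
            break                  # trailing silence reached, end the utterance
    return recorded
```

The trailing quiet frames are kept in the result so that the end of a word is not clipped.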

---

## Project Structure (key files)

- `main.py` – entry-point script; runs the whole loop.
- `requirements.txt` – Python dependencies.
- `example.env` – template of environment variables with placeholder values.

---

## Requirements

- Python 3.11+ (recommended)
- A valid OCI tenancy and a user with:
  - Permission to use **AI Speech** (STT + TTS),
  - Permission to use **Generative AI Agent Runtime**.
- A configured `~/.oci/config` containing the profile named in your `OCI_PROFILE` environment variable.
- A working microphone. The script currently runs on Windows only, because it plays audio with `winsound`.
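
As a rough sketch of how the settings in `example.env` could be checked at startup, the snippet below parses `KEY=VALUE` text and fails early when a variable is missing. The helper names `parse_env_text` and `load_settings` are hypothetical, and `main.py` may load its configuration differently (for example with a dotenv library).

```python
import os
from typing import Dict, Mapping

# Variable names taken from example.env.
REQUIRED_VARS = ("OCI_PROFILE", "OCI_COMPARTMENT_OCID", "OCI_AGENT_ENDPOINT_ID")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

The resulting `OCI_PROFILE` value would then typically be passed as `profile_name` to `oci.config.from_file` in the OCI Python SDK.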
3 changes: 3 additions & 0 deletions ai/gen-ai-agents/voice-ai-agent/example.env
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
OCI_PROFILE=PHOENIX
OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..
OCI_AGENT_ENDPOINT_ID=ocid1.genaiagentendpoint.oc1.phx.