Commit 43559e4

Merge branch 'main' into dependabot/pip/ai/generative-ai-service/Document-Processing-with-GenAI/smart-invoice-extraction/pillow-10.3.0
2 parents dedb4f8 + 523fc71 commit 43559e4

171 files changed, +3366 -522 lines changed

ai/ai-document-understanding/README.md

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ Reviewed: 22.09.2025
 
 ## GitHub
 
-- [Enhanced Document Understanding with LLMs](https://github.com/oracle-devrel/technology-engineering/tree/main/ai/generative-ai-service/doc-understanding-and-genAI)
+- [Enhanced Document Understanding with LLMs](https://github.com/oracle-devrel/technology-engineering/tree/main/ai/generative-ai-service/Document%20Processing%20with%20GenAI/doc-understanding-and-genAI)
 - A Streamlit-based app comparing and expanding on traditional Document Understanding (OCI DU) + LLM approach vs. a multimodal LLM for extracting structured data from documents (PDFs, images). This is aimed at highlighting the strengths of each of our services and the power GenAI brings in combining these approaches for the best handling of complex documents.
 - [Invoice Document Processing from Gmail into ERP systems using OCI Document Understanding & Oracle Integration Cloud](https://github.com/oracle-devrel/technology-engineering/tree/main/ai/ai-document-understanding/ai-email-invoice)
 - Explore how we can process invoice documents from Gmail into an ERP System in real-time using OCI Document Understanding and Oracle Integration Cloud (OIC). This solution combines a low-code approach to capture Gmail messages in real-time with Google Cloud Pub/Sub Adapter, extract invoice data with AI Document Understanding and create invoices in ERP systems using Oracle Integration Cloud ERP adapters.

ai/ai-speech/README.md

Lines changed: 7 additions & 3 deletions
@@ -1,8 +1,12 @@
 # AI Speech
 
-OCI Speech is an AI service that applies automatic speech recognition technology to transform audio-based content into text. Developers can easily make API calls to integrate OCI Speech’s pre-trained models into their applications. OCI Speech can be used for accurate, text-normalized, time-stamped transcription via the console and REST APIs as well as command-line interfaces or SDKs. You can also use OCI Speech in an OCI Data Science notebook session. With OCI Speech, you can filter profanities, get confidence scores for both single words and complete transcriptions, and more.
+OCI Speech offers speech-to-text (STT) capabilities for files and real-time streams, as well as text-to-speech (TTS) functionality - All in one solution.
 
-Reviewed: 11.06.2026
+It’s accessible via Console, REST, CLI, and SDKs. Outputs are written to your Object Storage bucket as JSON (with word-level timestamps & confidences) and optionally SRT for captions.
+
+Recent updates include Live Transcribe for real-time ASR and Whisper model support for multilingual transcription alongside Oracle’s native ASR models.
+
+Reviewed: 25.09.2025
 
 # Table of Contents
 
@@ -28,10 +32,10 @@ Reviewed: 11.06.2026
 
 # Useful Links
 
+- [OCI Speech Release Notes](https://docs.oracle.com/en-us/iaas/releasenotes/services/speech/index.htm)
 - [AI Solutions Hub](https://www.oracle.com/artificial-intelligence/solutions/)
 - [Oracle AI Speech on oracle.com](https://www.oracle.com/artificial-intelligence/speech/)
 - [Oracle AI Speech documentation](https://docs.oracle.com/en-us/iaas/Content/speech/home.htm)
-- [Oracle Speech AI service now supports diarization](https://blogs.oracle.com/ai-and-datascience/post/oracle-speech-ai-service-now-supports-diarization)
 - [OCI Speech supports the Whisper model](https://blogs.oracle.com/ai-and-datascience/post/oci-speech-supports-the-whisper-model)
 - [OCI Speech supports text-to-speech and real-time transcription with customized vocabulary](https://blogs.oracle.com/ai-and-datascience/post/oci-speech-texttospeech-realtime-transcription-custom-vocab)
 

ai/ai-speech/podcast-generator/README.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ The application is designed to streamline podcast production through advanced AI
 This application is built using Oracle Visual Builder Cloud Service (VBCS), a powerful low-code platform that simplifies development and accelerates the creation of robust applications without extensive coding. With this low-code approach, even complex workflows are straightforward to set up, allowing developers to focus on leveraging AI's potential for high-quality audio synthesis.
 This AI-powered solution not only automates and optimizes the podcast creation process but also allows content creators to deliver professional audio content at scale efficiently.
 
-Reviewed: 24.04.2025
+Reviewed: 29.09.2025
 
 
 # When to use this asset?

ai/gen-ai-agents/agentsOCI-OpenAI-gateway/api/routers/chat.py

Lines changed: 194 additions & 0 deletions
@@ -6,6 +6,7 @@
 import json
 import yaml
 import logging
+from urllib.parse import urlparse
 from typing import Annotated, Any, Dict, List, Optional, Tuple, Union
 
 import oci
@@ -133,6 +134,185 @@ def _extract_user_text(messages: List[Dict[str, Any]] | List[Any]) -> str:
         )
     return ""
 
+def _normalize_source_location(source_location: Any) -> dict:
+    """
+    Returns a dict with display_name and url (when present).
+    Handles:
+    - OCI SDK objects with .url
+    - dict-like with 'url'
+    - JSON-stringified dicts
+    - raw URLs
+    - plain strings / paths
+    """
+    display_name = None
+    url_value = None
+
+    try:
+        # 1) SDK object with attribute 'url'
+        if hasattr(source_location, "url"):
+            url_value = getattr(source_location, "url") or None
+
+        # 2) dict-like
+        if url_value is None:
+            if isinstance(source_location, dict):
+                url_value = source_location.get("url")
+            else:
+                # 3) JSON-like string? try parse
+                if isinstance(source_location, str) and source_location.strip().startswith("{"):
+                    try:
+                        parsed = json.loads(source_location)
+                        if isinstance(parsed, dict):
+                            url_value = parsed.get("url")
+                            source_location = parsed
+                    except Exception:
+                        pass
+
+        # 4) If it's a URL string
+        if url_value is None and isinstance(source_location, str):
+            if source_location.startswith("http://") or source_location.startswith("https://"):
+                url_value = source_location
+
+        # Decide display_name
+        candidate_for_name = url_value or (source_location if isinstance(source_location, str) else None)
+        if candidate_for_name:
+            if isinstance(candidate_for_name, str) and (
+                candidate_for_name.startswith("http://") or candidate_for_name.startswith("https://")
+            ):
+                path = urlparse(candidate_for_name).path or ""
+                base = os.path.basename(path) or path.strip("/")
+                display_name = base or candidate_for_name
+            else:
+                display_name = os.path.basename(candidate_for_name) or str(candidate_for_name)
+        else:
+            display_name = None
+
+    except Exception as e:
+        logging.getLogger(__name__).warning(f"Failed to normalize source_location: {e}")
+        display_name = None
+        url_value = None
+
+    return {"display_name": display_name, "url": url_value}
+
+def _extract_citations_from_response(result, agent_name: str = "OCI Agent") -> Optional[Dict[str, Any]]:
+    try:
+        if not result or not hasattr(result, 'message') or not result.message:
+            return None
+
+        message = result.message
+        if not hasattr(message, 'content') or not message.content:
+            return None
+
+        content = message.content
+        if not hasattr(content, 'paragraph_citations') or not content.paragraph_citations:
+            return None
+
+        paragraph_citations = []
+        for para_citation in content.paragraph_citations:
+            if hasattr(para_citation, 'paragraph') and hasattr(para_citation, 'citations'):
+                paragraph = para_citation.paragraph
+                citations = para_citation.citations
+
+                citation_list = []
+                for citation in citations:
+                    normalized_loc = _normalize_source_location(getattr(citation, 'source_location', None))
+                    citation_dict = {
+                        "source_text": getattr(citation, 'source_text', None),
+                        "title": getattr(citation, 'title', None),
+                        "doc_id": getattr(citation, 'doc_id', None),
+                        "page_numbers": getattr(citation, 'page_numbers', None),
+                        "metadata": getattr(citation, 'metadata', None),
+                        "location_display": normalized_loc.get("display_name"),
+                        "location_url": normalized_loc.get("url"),
+                    }
+                    citation_list.append(citation_dict)
+
+                paragraph_dict = {
+                    "paragraph": {
+                        "text": getattr(paragraph, 'text', '') or '',
+                        "start": getattr(paragraph, 'start', 0),
+                        "end": getattr(paragraph, 'end', 0)
+                    },
+                    "citations": citation_list
+                }
+                paragraph_citations.append(paragraph_dict)
+
+        if paragraph_citations:
+            return {"paragraph_citations": paragraph_citations, "agent_name": agent_name}
+
+        return None
+    except Exception as e:
+        logging.getLogger(__name__).warning(f"Failed to extract citations: {e}")
+        return None
+
+def _format_citations_for_display(citations: Dict[str, Any], agent_name: str = "OCI Agent") -> str:
+    """
+    Renders like:
+
+    --- Citations from [Agent Name] ---
+
+    1. Text: "..."
+    Sources:
+    1. Title: ...
+    Location: document.pdf
+    Document ID: ...
+    Pages: [1, 2]
+    Source: ...
+    Metadata: {...}
+
+    --- End Citations ---
+    """
+    if not citations or "paragraph_citations" not in citations:
+        return ""
+
+    agent = citations.get("agent_name") or agent_name
+    blocks = []
+    blocks.append(f"\n\n--- Citations from [{agent}] ---\n")
+
+    for idx, para_citation in enumerate(citations["paragraph_citations"], start=1):
+        p = para_citation.get("paragraph", {}) or {}
+        text = (p.get("text") or "").strip()
+
+        line = []
+        # Ensure quoted text; json.dumps gives safe quoting and escapes
+        #line.append(f"{idx}. Text: {json.dumps(text) if text else '\"\"'}")
+        line.append(" Sources:")
+
+        for jdx, c in enumerate(para_citation.get("citations", []) or [], start=1):
+            title = c.get("title")
+            loc_display = c.get("location_display")
+            doc_id = c.get("doc_id")
+            pages = c.get("page_numbers")
+            source_text = c.get("source_text")
+            metadata = c.get("metadata")
+
+            line.append(f" {jdx}. " + (f"Title: {title}" if title else "Title: (unknown)"))
+            if loc_display:
+                line.append(f" Location: {loc_display}")
+            if doc_id:
+                line.append(f" Document ID: {doc_id}")
+            if pages:
+                try:
+                    pages_str = json.dumps(pages, ensure_ascii=False)
+                except Exception:
+                    pages_str = str(pages)
+                line.append(f" Pages: {pages_str}")
+            if source_text:
+                st = (source_text or "").strip()
+                if len(st) > 500:
+                    st = st[:500].rstrip() + "…"
+                #line.append(f" Source: {st}")
+            if metadata:
+                try:
+                    md_str = json.dumps(metadata, ensure_ascii=False)
+                    line.append(f" Metadata: {md_str}")
+                except Exception:
+                    pass
+
+        blocks.append("\n".join(line) + "\n")
+
+    blocks.append("--- End Citations ---")
+    return "\n".join(blocks)
+
 def _resolve_endpoint_ocid(region: str, endpoint_ocid: Optional[str], agent_ocid: Optional[str], compartment_ocid: Optional[str]) -> str:
     if endpoint_ocid:
         return endpoint_ocid
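
Review note: the display-name branch of `_normalize_source_location` is easiest to check in isolation. The following is a minimal sketch using only the standard library; `display_name_for` is a hypothetical helper written for illustration and is not part of this commit, and the sample URLs are made up.

```python
import os
from urllib.parse import urlparse


def display_name_for(location: str) -> str:
    """Hypothetical helper mirroring the display-name branch in the diff."""
    if location.startswith(("http://", "https://")):
        # For URLs, prefer the last path segment, then the whole path,
        # and fall back to the full URL when there is no path at all.
        path = urlparse(location).path or ""
        base = os.path.basename(path) or path.strip("/")
        return base or location
    # For plain strings and paths, use the final path component.
    return os.path.basename(location) or location


print(display_name_for("https://example.com/docs/report.pdf"))  # report.pdf
print(display_name_for("https://example.com/docs/"))            # docs
print(display_name_for("https://example.com"))                  # https://example.com
print(display_name_for("bucket/folder/invoice.pdf"))            # invoice.pdf
```

A URL with a trailing slash falls back to the stripped path, and a bare host falls back to the full URL, so the "Location:" line is only omitted when no usable string is found.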
@@ -227,6 +407,13 @@ async def chat_completions(
             text = ""
             if getattr(result, "message", None) and getattr(result.message, "content", None):
                 text = getattr(result.message.content, "text", "") or ""
+
+            agent_name = agent_cfg.get("name", "OCI Agent")
+            citations = _extract_citations_from_response(result, agent_name)
+
+            if citations:
+                citation_text = _format_citations_for_display(citations, agent_name)
+                text += citation_text
         except oci.exceptions.ServiceError as se:
             raise HTTPException(status_code=502, detail=f"Agent chat failed ({se.status}): {getattr(se,'message',str(se))}")
 
@@ -257,6 +444,13 @@ async def chat_completions(
     result = runtime.chat(agent_endpoint_id=endpoint_id, chat_details=chat_details).data
     text = getattr(getattr(result, "message", None), "content", None)
     text = getattr(text, "text", "") if text else ""
+
+    citations = _extract_citations_from_response(result, "OCI Agent")
+
+    if citations:
+        citation_text = _format_citations_for_display(citations, "OCI Agent")
+        text += citation_text
+
     tag = f"oci:agentendpoint:{endpoint_id}"
     if getattr(chat_request, "stream", False):
         return StreamingResponse(_stream_one_chunk(text, tag), media_type="text/event-stream", headers={"x-oci-session-id": session_id})
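
Review note: both call sites depend on the guarded attribute walk in `_extract_citations_from_response`, which returns nothing when any level of the response is missing. The sketch below exercises the same pattern against a stand-in object. `citation_titles` and `fake_result` are hypothetical and exist only for illustration; the attribute names (`message`, `content`, `paragraph_citations`, `citations`, `title`) are taken from the diff, not from the OCI SDK documentation.

```python
from types import SimpleNamespace
from typing import Any, List


def citation_titles(result: Any) -> List[str]:
    """Hypothetical helper: collect citation titles with a guarded walk."""
    # Each getattr defaults to None, so a missing level yields an empty list.
    content = getattr(getattr(result, "message", None), "content", None)
    titles: List[str] = []
    for para in getattr(content, "paragraph_citations", None) or []:
        for citation in getattr(para, "citations", None) or []:
            titles.append(getattr(citation, "title", None) or "(unknown)")
    return titles


# Stand-in for an agent response (assumed shape, following the diff).
fake_result = SimpleNamespace(
    message=SimpleNamespace(
        content=SimpleNamespace(
            text="Answer text",
            paragraph_citations=[
                SimpleNamespace(
                    paragraph=SimpleNamespace(text="Answer text", start=0, end=11),
                    citations=[SimpleNamespace(title="Invoice policy"), SimpleNamespace()],
                )
            ],
        )
    )
)

print(citation_titles(fake_result))        # ['Invoice policy', '(unknown)']
print(citation_titles(SimpleNamespace()))  # []
```

Because a response without citations produces an empty result, the `if citations:` guard at both call sites leaves `text` unchanged in that case.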
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 Luigi Saetta
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

ai/gen-ai-agents/custom-rag-agent/README.md

Lines changed: 3 additions & 3 deletions
@@ -1,11 +1,11 @@
-![UI](images/ui_image.png)
-
 # Custom RAG agent
 This repository contains the code for the development of a **custom RAG Agent**, based on **OCI Generative AI**, **Oracle 23AI** Vector Store and **LangGraph**
 
 **Author**: L. Saetta
 
-**Last updated**: 11/09/2025
+**Reviewed**: 23.09.2025
+
+![UI](images/ui_image.png)
 
 ## Design and implementation
 * The agent is implemented using **LangGraph**

ai/gen-ai-agents/custom-rag-agent/llm_with_mcp.py

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ async def _list_tools(self):
         Fetch tools from the MCP server using FastMCP. Must be async.
         """
         jwt = self.jwt_supplier()
-
+
         logger.info("Listing tools from %s ...", self.mcp_url)
 
         # FastMCP requires async context + await for client ops.
