|
1 | 1 | # DeepAgent + DocumentAgent Integration Design |
2 | 2 |
|
3 | | -**Version:** 1.0 |
4 | | -**Date:** October 7, 2025 |
| 3 | +**Version:** 1.1 |
| 4 | +**Date:** October 8, 2025 |
5 | 5 | **Author:** AI Architecture Team |
6 | | -**Status:** Design Proposal |
| 6 | +**Status:** Design Proposal (Enhanced Addendum Included) |
| 7 | + |
| 8 | +### Version History |
| 9 | +| Version | Date | Summary | |
| 10 | +|---------|------|---------| |
| 11 | +| 1.0 | 2025-10-07 | Initial integration design (tool architecture, QA, performance, security) | |
| 12 | +| 1.1 | 2025-10-08 | Added advanced tagging + confirmation workflow, domain-specific pipelines, Postgres + pgvector persistence API, hybrid RAG architecture, compliance & cross‑document reasoning use cases, >99% accuracy requirements pipeline, retrieval evaluation strategy | |
7 | 13 |
|
8 | 14 | --- |
9 | 15 |
|
@@ -1723,4 +1729,322 @@ ERROR_MESSAGES = { |
1723 | 1729 |
|
1724 | 1730 | --- |
1725 | 1731 |
|
1726 | | -**End of Design Document** |
| 1732 | +## 14. Enhanced Capabilities Addendum (v1.1) |
| 1733 | + |
| 1734 | +This addendum incorporates the newly requested advanced capabilities. Section 14.1 maps each request to the design element that addresses it.
| 1735 | + |
| 1736 | +### 14.1 Requirement Mapping (User Requests → Design Elements) |
| 1737 | +| # | User Requirement | Design Element(s) Added | |
| 1738 | +|---|------------------|-------------------------| |
| 1739 | +| 1 | Context-aware tagging + user confirmation | Section 14.2 Tagging & Confirmation Workflow | |
| 1740 | +| 2 | Tag-driven domain pipelines | Section 14.3 Domain-Specific Processing Matrix | |
| 1741 | +| 3 | >99% accuracy for requirements | Section 14.4 High-Accuracy Requirements Pipeline | |
| 1742 | +| 4 | Persist structured requirements in Postgres (external repo) | Section 14.5 Persistence & External API Contracts | |
| 1743 | +| 4b | Store requirements as embeddings in pgvector | Section 14.6 Embedding & Vector Index Strategy | |
| 1744 | +| 5 | Embed other doc types (standards/howto/templates) | Section 14.6 (document_type expansion) | |
| 1745 | +| 6 | Hybrid RAG across all doc types | Section 14.7 Hybrid Retrieval Architecture | |
| 1746 | +| 7 | Compliance check (requirements vs standards/templates) | Section 14.8 Use Case Flow 1 | |
| 1747 | +| 8 | Q&A over standards + related templates/howtos | Section 14.8 Use Case Flow 2 | |
| 1748 | +| 9 | Standards inter-relationship exploration | Section 14.8 Use Case Flow 3 | |
| 1749 | + |
| 1750 | +### 14.2 Tagging & Confirmation Workflow |
| 1751 | +Objective: Automatically classify each uploaded document into one (or multiple) semantic types: `requirements_spec`, `standard`, `howto`, `template`, `policy`, `guideline`, `unknown`. |
| 1752 | + |
| 1753 | +Workflow Steps: |
| 1754 | +1. **Initial Rapid Heuristic Pass** (deterministic): |
| 1755 | + - File name / path regex (e.g. `(spec|srs|requirements)` → requirements_spec; `(iso|iec|ieee|nist)` → standard). |
| 1756 | + - Heading density & patterns (e.g. high ratio of imperative "shall" → requirements_spec; presence of numbered normative clauses like "3.2.1" with normative modal verbs → standard). |
| 1757 | + - Keyword priors with TF-IDF or BM25 quick scan. |
| 1758 | +2. **LLM Tagging Pass** (contextual): Provide top N headings + first 2 pages + any strong heuristic signals. Return JSON: |
| 1759 | + ```json |
| 1760 | + {"primary_tag": "requirements_spec", "alt_tags": ["template"], "confidence": 0.91, "rationale": "Contains 'shall' density 4.2%, structured numbered sections"} |
| 1761 | + ``` |
| 1762 | +3. **Conflict Resolver:** If heuristic primary ≠ LLM primary and both confidences < threshold (e.g. 0.8) → ask user. |
| 1763 | +4. **User Confirmation Loop:** DeepAgent presents a summary: |
| 1764 | + > Detected: requirements_spec (91% confidence). Alternate: template (42%). Confirm? (Yes / choose correct tag / multi-select) |
| 1765 | +5. **Correction Handling:** If user overrides, store override in `document_tag_overrides` table (Postgres) with pattern signature (hash of top headings) to auto-apply next time (active learning). |
| 1766 | +6. **Multi-Tag Support:** Some documents may legitimately be both `standard` and `template` (rare). We allow up to 2 tags, one primary. Pipelines prioritize primary. |
| 1767 | +7. **Persistence of Tag Decision:** Store final tag(s) + confidence + rationale for audit. |
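The deterministic pass (step 1) and the conflict resolver (step 3) can be sketched as below. This is a minimal illustration, not the project's implementation: `FILENAME_RULES`, `heuristic_tag`, and `needs_confirmation` are hypothetical names, only the file-name regexes from step 1 are modeled, and the 0.8 threshold is the example value from step 3.

```python
import re

# File-name patterns quoted from step 1; the rule list itself is an assumption.
FILENAME_RULES = [
    (re.compile(r"(spec|srs|requirements)", re.IGNORECASE), "requirements_spec"),
    (re.compile(r"(iso|iec|ieee|nist)", re.IGNORECASE), "standard"),
]


def heuristic_tag(file_name: str) -> str:
    """Return the tag of the first matching file-name rule, else 'unknown'."""
    for pattern, tag in FILENAME_RULES:
        if pattern.search(file_name):
            return tag
    return "unknown"


def needs_confirmation(h_tag: str, h_conf: float,
                       llm_tag: str, llm_conf: float,
                       threshold: float = 0.8) -> bool:
    """Step 3: ask the user only when the passes disagree and both are unsure."""
    return h_tag != llm_tag and h_conf < threshold and llm_conf < threshold
```

Substring regexes like these are deliberately cheap and will over-match (for example `iso` inside an unrelated word), which is why the workflow follows them with the LLM pass and the confirmation loop.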
| 1768 | + |
| 1769 | +Data Model (Tagging Metadata): |
| 1770 | +```sql |
| 1771 | +CREATE TABLE document_tags ( |
| 1772 | + document_id UUID PRIMARY KEY, |
| 1773 | + file_name TEXT NOT NULL, |
| 1774 | + primary_tag TEXT NOT NULL, |
| 1775 | + secondary_tag TEXT NULL, |
| 1776 | + heuristic_confidence REAL, |
| 1777 | + llm_confidence REAL, |
| 1778 | + final_confidence REAL, |
| 1779 | + rationale TEXT, |
| 1780 | + created_at TIMESTAMPTZ DEFAULT now() |
| 1781 | +); |
| 1782 | +``` |
| 1783 | + |
| 1784 | +### 14.3 Domain-Specific Processing Matrix |
| 1785 | +| Tag | Extraction Pipeline | Specialized Prompts | Extra Validation | Output Artifacts | |
| 1786 | +|-----|---------------------|---------------------|------------------|------------------| |
| 1787 | +| requirements_spec | High-accuracy multi-pass | Requirements schema, atomicity rules | Duplicate ID check, modal verb density, coverage vs TOC | Structured requirements JSON, embeddings | |
| 1788 | +| standard | Clause segmentation, normative language parser | Normative vs informative discrimination | Clause numbering integrity, cross-reference validation | Clause graph, embeddings | |
| 1789 | +| howto | Procedure step parser, imperative detection | Step normalization & tool references | Ordered step continuity, missing prerequisite detection | Steps list, embeddings | |
| 1790 | +| template | Placeholder field extraction | Variable slot detection | Placeholder coverage ratio, duplicate placeholder detection | Template slots, embeddings | |
| 1791 | +| policy/guideline | Policy statement extraction | Risk/compliance phrasing patterns | Policy classification consistency | Policy items, embeddings | |
| 1792 | + |
| 1793 | +### 14.4 High-Accuracy Requirements Pipeline (>99%) |
| 1794 | +Stages (multi-pass): |
| 1795 | +1. **Ingestion & Normalization** (Docling → Markdown → canonical whitespace, remove page artifacts). |
| 1796 | +2. **Section Structuring Pass** (existing DocumentAgent) with chunk overlap for context. |
| 1797 | +3. **Requirements Extraction Pass A (Baseline)** – Strict JSON schema. |
| 1798 | +4. **Requirements Extraction Pass B (Refinement)** – Feed ambiguous or low-confidence items with clarifying meta-prompt; unify style. |
| 1799 | +5. **Deduplication & Canonicalization:** Hash normalized body; unify IDs; if conflicting IDs with different bodies → create variant list + resolution heuristic (prefer longer, more specific, or user-confirmed). |
| 1800 | +6. **Atomicity Validator:** Split compound statements containing multiple modal verbs (`shall`,`must`,`should`) separated by conjunctions. |
| 1801 | +7. **Category Classifier (functional vs non-functional + subcategories):** Lightweight model + LLM tie-break. |
| 1802 | +8. **Confidence Scoring Ensemble:** Combine: (a) extraction model self-score, (b) heuristic quality metrics (length, modality strength, ambiguity penalty), (c) duplication penalty. |
| 1803 | +9. **Human-in-the-Loop Optional Gate:** For items < threshold (e.g. 0.85), present batch diff to user. |
| 1804 | +10. **Persistence & Embedding:** After acceptance, store in Postgres, generate embeddings, index in pgvector. |
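Stages 5 and 6 lend themselves to a small sketch. The code below assumes each requirement is a dict with a `body` key and uses hypothetical helper names (`body_hash`, `deduplicate`, `is_compound`); it keeps the first occurrence of each normalized body and only flags compound statements instead of splitting them.

```python
import hashlib
import re

# Modal verbs listed in the Atomicity Validator stage.
MODALS = re.compile(r"\b(shall|must|should)\b", re.IGNORECASE)


def body_hash(body: str) -> str:
    """Hash a case- and whitespace-normalized requirement body."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(requirements: list[dict]) -> list[dict]:
    """Keep the first requirement seen for each normalized body hash."""
    seen: set[str] = set()
    unique = []
    for req in requirements:
        digest = body_hash(req["body"])
        if digest not in seen:
            seen.add(digest)
            unique.append(req)
    return unique


def is_compound(body: str) -> bool:
    """Flag statements with more than one modal verb as candidates for splitting."""
    return len(MODALS.findall(body)) > 1
```

Counting modal verbs is only a first filter; deciding where to split a flagged statement would still need the conjunction analysis described in stage 6.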
| 1805 | + |
| 1806 | +Error Handling & Correction: |
| 1807 | +- Retry with reduced chunk size on context errors. |
| 1808 | +- Fall back to a minimal parser if the LLM output is still invalid JSON after N retries (skeleton insertion + mark `needs_review=true`).
| 1809 | + |
| 1810 | +### 14.5 Persistence & External API Contracts |
| 1811 | +External Postgres (other repo) is exposed via REST (or gRPC). We define a client in this repo with resilient calls + exponential backoff. |
| 1812 | + |
| 1813 | +#### Core Tables (Proposed) |
| 1814 | +```sql |
| 1815 | +CREATE EXTENSION IF NOT EXISTS "vector"; -- pgvector |
| 1816 | + |
| 1817 | +CREATE TABLE documents ( |
| 1818 | + document_id UUID PRIMARY KEY, |
| 1819 | + file_name TEXT NOT NULL, |
| 1820 | + primary_tag TEXT NOT NULL, |
| 1821 | + secondary_tag TEXT, |
| 1822 | + source_path TEXT, |
| 1823 | + version TEXT, |
| 1824 | + checksum TEXT, |
| 1825 | + size_bytes BIGINT, |
| 1826 | + processed_at TIMESTAMPTZ DEFAULT now() |
| 1827 | +); |
| 1828 | + |
| 1829 | +CREATE TABLE requirements ( |
| 1830 | + requirement_id UUID PRIMARY KEY, |
| 1831 | + document_id UUID REFERENCES documents(document_id) ON DELETE CASCADE, |
| 1832 | + external_req_id TEXT, -- original numbering if present |
| 1833 | + body TEXT NOT NULL, |
| 1834 | + category TEXT, -- functional / non-functional |
| 1835 | + subcategory TEXT, -- performance / security etc. |
| 1836 | + confidence REAL, |
| 1837 | + needs_review BOOLEAN DEFAULT FALSE, |
| 1838 | + metadata JSONB, |
| 1839 | + created_at TIMESTAMPTZ DEFAULT now() |
| 1840 | +); |
| 1841 | + |
| 1842 | +CREATE TABLE knowledge_clauses ( |
| 1843 | + clause_id UUID PRIMARY KEY, |
| 1844 | + document_id UUID REFERENCES documents(document_id) ON DELETE CASCADE, |
| 1845 | + tag TEXT, -- standard / howto / template |
| 1846 | + clause_number TEXT, |
| 1847 | + title TEXT, |
| 1848 | + content TEXT, |
| 1849 | + metadata JSONB, |
| 1850 | + created_at TIMESTAMPTZ DEFAULT now() |
| 1851 | +); |
| 1852 | + |
| 1853 | +-- Unified embedding store |
| 1854 | +CREATE TABLE embeddings ( |
| 1855 | + embedding_id UUID PRIMARY KEY, |
| 1856 | + parent_type TEXT NOT NULL CHECK (parent_type IN ('requirement','clause','template_slot')), |
| 1857 | + parent_id UUID NOT NULL, |
| 1858 | + document_id UUID NOT NULL REFERENCES documents(document_id) ON DELETE CASCADE, |
| 1859 | + vector vector(1536) NOT NULL, -- dimension depends on model (e.g. text-embedding-3-small, or text-embedding-3-large with dimensions=1536)
| 1860 | + tag TEXT, -- reuse primary_tag or refined semantic tag |
| 1861 | + chunk_index INT, |
| 1862 | + text_excerpt TEXT, |
| 1863 | + metadata JSONB, |
| 1864 | + created_at TIMESTAMPTZ DEFAULT now() |
| 1865 | +); |
| 1866 | + |
| 1867 | +CREATE INDEX ON embeddings USING ivfflat (vector vector_cosine_ops) WITH (lists=100); |
| 1868 | +CREATE INDEX embeddings_tag_idx ON embeddings(tag); |
| 1869 | +CREATE INDEX requirements_doc_idx ON requirements(document_id); |
| 1870 | +CREATE INDEX knowledge_doc_idx ON knowledge_clauses(document_id); |
| 1871 | +``` |
| 1872 | + |
| 1873 | +#### External REST API (Contract) |
| 1874 | +| Endpoint | Method | Purpose | Request | Response | |
| 1875 | +|----------|--------|---------|---------|----------| |
| 1876 | +| `/documents` | POST | Register processed doc | file metadata + tags | `{document_id}` | |
| 1877 | +| `/requirements/batch` | POST | Bulk insert requirements | list of requirement objects | counts + failed IDs | |
| 1878 | +| `/clauses/batch` | POST | Bulk insert standard/howto/template clauses | objects | counts | |
| 1879 | +| `/embeddings/batch` | POST | Bulk insert vectors | dimension + vectors | success/fail | |
| 1880 | +| `/retrieval/hybrid` | POST | Hybrid search (query + filters) | query JSON | ranked results | |
| 1881 | +| `/compliance/check` | POST | Requirements vs standard sections | requirement IDs + standard ref | compliance summary | |
| 1882 | +| `/standards/graph` | GET | Return standards relationship graph | query params | node/edge JSON | |
| 1883 | + |
| 1884 | +Request JSON examples and detailed schemas would be placed in `doc/api/` (future work). |
| 1885 | + |
| 1886 | +#### Client Pseudocode |
| 1887 | +```python |
| 1888 | +class ExternalKnowledgeStoreClient: |
| 1889 | +    def __init__(self, base_url: str, api_key: str | None = None, timeout=30): ...
| 1890 | +
| 1891 | +    def register_document(self, meta: dict) -> str: ...
| 1892 | +    def upsert_requirements(self, reqs: list[dict]) -> dict: ...
| 1893 | +    def upsert_clauses(self, clauses: list[dict]) -> dict: ...
| 1894 | +    def upsert_embeddings(self, embeddings: list[dict]) -> dict: ...
| 1895 | +    def hybrid_search(self, query: str, k: int = 15, filters: dict | None = None) -> list[dict]: ...
| 1896 | +    def compliance_check(self, requirement_ids: list[str], standard_ref: str) -> dict: ...
| 1897 | +``` |
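The "resilient calls + exponential backoff" mentioned at the start of 14.5 could wrap each client method in a helper along these lines. This is a sketch under assumptions: `call_with_backoff` is a hypothetical name, the retried exception types and delay values are illustrative, and `sleep` is injectable so the helper can be tested without waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 0.5, max_delay: float = 8.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying transient errors with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent callers.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

A caller might write `call_with_backoff(lambda: client.upsert_requirements(reqs))`, where `client` is an instance of the class above.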
| 1898 | + |
| 1899 | +### 14.6 Embedding & Vector Index Strategy |
| 1900 | +Embedding Model Options: |
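| 1901 | +- Default: OpenAI `text-embedding-3-large` (3072 dims natively; request `dimensions=1536` to match the `vector(1536)` column, since pgvector's ivfflat index supports at most 2,000 dimensions) OR a local Qwen/Instructor variant if privacy constraints apply.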
| 1902 | +- Domain adaptation: Fine-tune or use contrastive re-ranking for standards. |
| 1903 | + |
| 1904 | +Chunking Strategy: |
| 1905 | +| Doc Type | Unit | Avg Tokens | Overlap | Notes | |
| 1906 | +|----------|------|------------|---------|-------| |
| 1907 | +| requirements_spec | Individual requirement | 30–120 | 0 | Each requirement atomic -> direct embedding | |
| 1908 | +| standard | Clause / subclause | 80–250 | 25 tokens | Preserve normative boundaries | |
| 1909 | +| howto | Step group (5–7 steps) | 60–150 | 20 tokens | Provide local context | |
| 1910 | +| template | Placeholder + surrounding context | 40–90 | 15 tokens | Capture variable semantics |
| 1911 | +| policy/guideline | Policy statement | 50–160 | 15 tokens | Keep actionable text intact |
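One way to read the Overlap column is that each chunk repeats the last few tokens of its predecessor. The sketch below illustrates that reading with a plain sliding window over pre-split tokens; `chunk_tokens` is a hypothetical helper, and a real pipeline would first cut at clause or step boundaries and use the embedding model's tokenizer.

```python
def chunk_tokens(tokens: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks
```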
| 1912 | + |
| 1913 | +Embedding Ingestion Pipeline: |
| 1914 | +1. Normalize text (unicode NFC, preserve casing, strip page numbers). |
| 1915 | +2. Generate vector. |
| 1916 | +3. Compute lexical signature (top 12 stemmed tokens) for hybrid BM25 fusion. |
| 1917 | +4. Persist to Postgres `embeddings` table. |
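Steps 1 and 3 of the ingestion pipeline can be approximated with the standard library. The sketch below is an assumption-laden illustration: the page-number regex is a guess at typical artifacts, `normalize` and `lexical_signature` are hypothetical names, and stemming is omitted (a production version would plug in a real stemmer before counting).

```python
import re
import unicodedata
from collections import Counter

# Lines that consist only of a page marker such as "12" or "Page 3 of 10".
PAGE_NUMBER = re.compile(r"^\s*(page\s+)?\d+(\s+of\s+\d+)?\s*$", re.IGNORECASE)


def normalize(text: str) -> str:
    """Apply Unicode NFC and drop page-number lines; casing is preserved."""
    text = unicodedata.normalize("NFC", text)
    kept = [line for line in text.splitlines() if not PAGE_NUMBER.match(line)]
    return "\n".join(kept).strip()


def lexical_signature(text: str, top_n: int = 12) -> list[str]:
    """Return the top_n most frequent lowercase tokens (unstemmed in this sketch)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tok for tok in tokens if len(tok) > 2)
    return [tok for tok, _ in counts.most_common(top_n)]
```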
| 1918 | + |
| 1919 | +### 14.7 Hybrid Retrieval Architecture |
| 1920 | +Hybrid = Vector Similarity + Lexical (BM25) + Metadata Filters + (Optional) Reranker. |
| 1921 | + |
| 1922 | +Retrieval Steps: |
| 1923 | +1. **Lexical Candidate Generation:** Use Postgres full-text search (`tsvector`, optionally `pg_trgm` for fuzzy matching) or an external BM25 index. Note that Postgres `ts_rank` scoring is not BM25.
| 1924 | +2. **Vector Similarity Search:** ivfflat (cosine) top K. |
| 1925 | +3. **Score Fusion:** Reciprocal Rank Fusion (RRF) or Weighted Sum: |
| 1926 | + `final = w_vec * norm(vector_score) + w_lex * norm(bm25_score) + w_meta * meta_boost` |
| 1927 | +4. **Optional Cross-Encoder Re-rank:** For top 50 using a local mini LM (e.g. `bge-reranker-base`). |
| 1928 | +5. **Diversity Filter:** Remove near-duplicate (cosine > 0.95) keeping highest rank. |
| 1929 | +6. **Return:** Structured results with provenance: `{parent_type, parent_id, document_id, score, snippet}`. |
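For the score-fusion step, Reciprocal Rank Fusion needs only the rank positions from each retriever, which avoids normalizing cosine and lexical scores onto a common scale. A minimal sketch follows; the function name is hypothetical and `k = 60` is the constant commonly used with RRF, not a value taken from this design.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Items that appear in both the vector and lexical lists accumulate two contributions, so they tend to rise above items found by only one retriever.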
| 1930 | + |
| 1931 | +Representative Hybrid Query (Illustrative): |
| 1932 | +```sql |
| 1933 | +WITH vec AS (
| 1934 | +  SELECT parent_id, 1 - (vector <=> CAST(:q_vec AS vector)) AS vscore
| 1935 | +  FROM embeddings
| 1936 | +  WHERE tag = ANY(:tags)
| 1937 | +  ORDER BY vector <=> CAST(:q_vec AS vector)
| 1938 | +  LIMIT 100
| 1939 | +),
| 1940 | +lex AS (
| 1941 | +  SELECT parent_id, ts_rank_cd(tsv, plainto_tsquery(:q_text)) AS lscore
| 1942 | +  FROM lexical_index
| 1943 | +  WHERE tsv @@ plainto_tsquery(:q_text)
| 1944 | +  ORDER BY lscore DESC LIMIT 100
| 1945 | +)
| 1946 | +SELECT coalesce(vec.parent_id, lex.parent_id) AS parent_id,
| 1947 | +       coalesce(vscore, 0) AS vscore,
| 1948 | +       coalesce(lscore, 0) AS lscore,
| 1949 | +       (0.6 * coalesce(vscore, 0) + 0.4 * coalesce(lscore, 0)) AS final_score
| 1950 | +FROM vec FULL OUTER JOIN lex USING (parent_id)
| 1951 | +ORDER BY final_score DESC
| 1952 | +LIMIT 25;
| 1953 | +``` |
| 1954 | + |
| 1955 | +### 14.8 Advanced Use Case Flows |
| 1956 | + |
| 1957 | +#### 1. Compliance / Conformance Checking (Requirements vs Standard) |
| 1958 | +Flow: |
| 1959 | +1. User: *"Do our login requirements comply with ISO-27001 section 9.2?"* |
| 1960 | +2. Agent: Retrieve requirements tagged `authentication` + standard clauses referencing access control. |
| 1961 | +3. Alignment Heuristic: |
| 1962 | + - Semantic similarity (embedding cos > threshold) |
| 1963 | + - Keyword obligation coverage (presence of MUST/SHALL vs passive wording) |
| 1964 | + - Gap detection (standard clause concepts missing in requirement set) |
| 1965 | +4. Output categories: |
| 1966 | + - `fully_covered`, `partially_covered`, `missing`, `over_specified`. |
| 1967 | +5. Summarize gaps + propose draft requirements (LLM generative assist flagged as `suggested_draft`). |
| 1968 | + |
| 1969 | +#### 2. Standards Q&A with Related Templates & HowTos |
| 1970 | +Flow: |
| 1971 | +1. User question → Hybrid retrieval across `standard` + `template` + `howto`. |
| 1972 | +2. Group results by type; build answer plan: |
| 1973 | + - Normative definition excerpts |
| 1974 | + - Concrete procedural template placeholders |
| 1975 | + - Practical steps from howto. |
| 1976 | +3. LLM synthesizes final answer citing sources (document_id + clause_number / step number). |
| 1977 | + |
| 1978 | +#### 3. Standards Relationship Exploration |
| 1979 | +Data Prep: |
| 1980 | +- Build a *standards graph* (background job): nodes = clauses; edges: semantic similarity > 0.88 OR explicit cross-reference anchor. |
| 1981 | +- Store edges in `standards_graph_edges` (source_clause_id, target_clause_id, edge_type, weight). |
| 1982 | +Interactive Flow: |
| 1983 | +1. User: *"How does ISO-27001 relate to NIST 800-53 on incident response?"* |
| 1984 | +2. Retrieve subgraph filtered by tags & topic embeddings (incident response cluster labels). |
| 1985 | +3. Summarize: overlapping concepts, unique requirements, divergence notes. |
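The Data Prep step for the standards graph can be sketched as follows. The edge fields mirror the `standards_graph_edges` columns named above; everything else is an assumption for illustration, including the function names, the `cross_reference` and `semantic` edge-type labels, the weight of 1.0 for explicit references, and the all-pairs comparison (a background job over many clauses would use the vector index instead).

```python
import math
from itertools import combinations


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_edges(clauses: dict[str, list[float]],
                cross_refs: set[tuple[str, str]],
                threshold: float = 0.88) -> list[dict]:
    """Emit explicit cross-reference edges plus semantic edges above the threshold."""
    edges = [
        {"source_clause_id": s, "target_clause_id": t,
         "edge_type": "cross_reference", "weight": 1.0}
        for s, t in sorted(cross_refs)
    ]
    for (a, vec_a), (b, vec_b) in combinations(sorted(clauses.items()), 2):
        if (a, b) in cross_refs or (b, a) in cross_refs:
            continue  # already linked explicitly
        sim = cosine(vec_a, vec_b)
        if sim > threshold:
            edges.append({"source_clause_id": a, "target_clause_id": b,
                          "edge_type": "semantic", "weight": sim})
    return edges
```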
| 1986 | + |
| 1987 | +### 14.9 Orchestration Pseudocode (High-Level) |
| 1988 | +```python |
| 1989 | +def process_document(file_path: str, session_id: str): |
| 1990 | +    raw_meta = gather_basic_metadata(file_path)
| 1991 | +    heuristic_tag, h_conf = heuristic_classifier(file_path)
| 1992 | +    llm_tag, llm_conf, rationale = llm_classifier(file_path)
| 1993 | +    final_tag, final_conf = resolve_tag(heuristic_tag, h_conf, llm_tag, llm_conf)
| 1994 | +    if needs_user_confirmation(final_conf):
| 1995 | +        prompt_user_for_tag_confirmation(session_id, candidates=[heuristic_tag, llm_tag])
| 1996 | +        final_tag = await_user_choice(session_id)
| 1997 | +    doc_id = external_client.register_document({...})
| 1998 | +    pipeline = select_pipeline(final_tag)
| 1999 | +    structured = pipeline.run(file_path)
| 2000 | +    if final_tag == 'requirements_spec':
| 2001 | +        refined = high_accuracy_refinement(structured)
| 2002 | +        external_client.upsert_requirements(refined.requirements)
| 2003 | +        embed_and_store(refined.requirements, doc_id)
| 2004 | +    else:
| 2005 | +        clauses = normalize_non_requirements(structured)
| 2006 | +        external_client.upsert_clauses(clauses)
| 2007 | +        embed_and_store(clauses, doc_id)
| 2008 | +    return summary(structured)
| 2009 | +``` |
| 2010 | + |
| 2011 | +### 14.10 Evaluation & QA Extensions |
| 2012 | +New Metrics: |
| 2013 | +| Aspect | Metric | Target | |
| 2014 | +|--------|--------|--------| |
| 2015 | +| Tagging | Primary tag accuracy (manual validation) | ≥95% | |
| 2016 | +| Tagging | Confirmation intervention rate | <25% (improves with learning) | |
| 2017 | +| Requirements | Extraction accuracy (precision/recall) | ≥99% / ≥98% | |
| 2018 | +| Embeddings | Retrieval nDCG@10 (bench queries) | ≥0.82 | |
| 2019 | +| Hybrid Search | Latency (p95) | <700ms (warm) | |
| 2020 | +| Compliance | Gap detection F1 | ≥0.9 | |
| 2021 | + |
| 2022 | +Automated Evaluation Harness: |
| 2023 | +- Golden dataset of annotated documents (requirements, standards) with expected outputs. |
| 2024 | +- Periodic CI job runs extraction + retrieval benchmarks; publishes `TEST_EXECUTION_REPORT.md` deltas. |
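The retrieval target in the metrics table is stated as nDCG@10, which the harness could compute per benchmark query as sketched below. The function names are hypothetical, gains are linear in the relevance grade (one common convention), and the ideal ranking is built from all judged grades for the query, not only the retrieved ones.

```python
import math


def dcg(relevances: list[float]) -> float:
    """Discounted cumulative gain with a log2 position discount."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg_at_k(ranked_relevances: list[float], judged_relevances: list[float],
              k: int = 10) -> float:
    """nDCG@k: DCG of the system ranking divided by DCG of the ideal ranking."""
    ideal = dcg(sorted(judged_relevances, reverse=True)[:k])
    return dcg(ranked_relevances[:k]) / ideal if ideal > 0 else 0.0
```

Here `ranked_relevances` holds the relevance grade of each returned result in rank order, and the benchmark score would be the mean of `ndcg_at_k` over all golden queries.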
| 2025 | + |
| 2026 | +### 14.11 Risks & Mitigations (Addendum) |
| 2027 | +| Risk | Impact | Mitigation | |
| 2028 | +|------|--------|------------| |
| 2029 | +| External DB downtime | Lost persistence / user blockage | Local queue + retry DLQ; show degraded-mode notice | |
| 2030 | +| Tag misclassification | Wrong pipeline reduces accuracy | Confirmation loop + override memory + continuous learning | |
| 2031 | +| Vector drift (model change) | Retrieval inconsistency | Versioned embeddings (store `embedding_model_version`) + background re-indexer | |
| 2032 | +| Hybrid query latency spike | Poor UX | Adaptive K reduction + caching top lexical candidates | |
| 2033 | +| Over-generation in compliance suggestions | False confidence | Flag AI-suggested items; require explicit user accept | |
| 2034 | + |
| 2035 | +### 14.12 Implementation Phasing Extension |
| 2036 | +Add to original phases: |
| 2037 | +- **Phase 5 (Week 9-10)**: Tagging confirmation loop + external API client + requirements persistence. |
| 2038 | +- **Phase 6 (Week 11-12)**: pgvector embeddings + hybrid retrieval MVP. |
| 2039 | +- **Phase 7 (Week 13-14)**: Compliance engine + standards graph builder. |
| 2040 | +- **Phase 8 (Week 15)**: Evaluation harness automation + performance tuning. |
| 2041 | + |
| 2042 | +### 14.13 Summary of Addendum |
| 2043 | +The enhanced design introduces a *closed-loop knowledge lifecycle*: |
| 2044 | +`Document Ingestion → Context-Aware Tagging (+User Confirmation) → Domain Pipeline → High-Accuracy Structuring → Persistent Knowledge Graph (Postgres + pgvector) → Hybrid Retrieval → Cross-Document Reasoning (Compliance, Relationships, Q&A)`. |
| 2045 | + |
| 2046 | +This augments the original architecture without breaking existing abstractions: new functionality slots into **pre-tool (tagging)**, **mid-pipeline (high-accuracy refinement)**, and **post-processing (persistence + retrieval)** stages. |
| 2047 | + |
| 2048 | +--- |
| 2049 | + |
| 2050 | +**End of Design Document (v1.1 with Addendum)** |