Commit 1711d6c

NSHkr authored and committed
docs for foundation infra, updated infra api, addendum for ast layer docs
1 parent 10b9ad5 commit 1711d6c

5 files changed: +1379 -0 lines changed
Lines changed: 161 additions & 0 deletions
This is an important clarification. The "Infrastructure Layer Design Document" (`10_gemini_plan.md`) describes new components that will reside **within the Foundation Layer (Layer 1)**, specifically under `lib/elixir_scope/foundation/infrastructure/`. It is not a separate numbered layer in the 9-layer model but an enhancement *to* Layer 1.

The AST Layer (Layer 2) consumes services from Layer 1. Enhancements to Layer 1 (like the new Infrastructure components) will therefore benefit Layer 2 mainly *indirectly*, by making the Layer 1 services it depends on more resilient and performant. Layer 2 components will generally not call, for example, `ElixirScope.Foundation.Infrastructure.CircuitBreakerWrapper.execute/3` directly. Instead, a Layer 1 service like `ElixirScope.Foundation.Events` (which Layer 2 *does* use) would internally use these infrastructure wrappers to protect its own operations (e.g., writing to a database).

However, there are specific points of integration and considerations for the AST Layer. The technical addendum to the AST Layer documentation below reflects these.

---

## Technical Addendum: AST Layer Integration with Enhanced Foundation Layer

**Document Version:** 1.1 (AST Layer)
**Date:** May 2025
**Context:** This addendum updates the AST Layer (Layer 2) technical documentation to reflect the integration of, and impact from, the new Infrastructure sub-layer within the Foundation Layer (Layer 1), as detailed in `docs/FOUNDATION_OTP_IMPLEMENT_NOW/10_gemini_plan.md`.

**Primary Impact Points:**
1. **Enhanced Reliability of Foundation Services:** AST Layer components can expect greater stability and resilience from the Foundation services they consume (e.g., `ElixirScope.Foundation.Config`, `ElixirScope.Foundation.Events`).
2. **Coordination with Foundation Infrastructure Services:** The AST Layer's own management components (e.g., `AST.Repository.MemoryManager`) may need to coordinate with or report to new Foundation-level infrastructure services (e.g., `Foundation.Infrastructure.MemoryManager`, `Foundation.Infrastructure.HealthAggregator`).
3. **Refined Error Handling Expectations:** AST Layer components should be prepared to handle potentially new error types surfaced by Foundation services that now use infrastructure protections (e.g., an error indicating a circuit breaker is open in a Foundation service).
4. **Telemetry and Observability:** AST Layer components can contribute more detailed telemetry, knowing that the Foundation Layer has enhanced capabilities for its collection and analysis (via `Foundation.Infrastructure.PerformanceMonitor`).

---

### Revisions and Considerations for AST Layer Documentation:

#### 1. Document: `AST_TECH_SPEC.md`

* **Section 1: Architecture Overview**
    * **Current Consideration:** General dependency on Foundation Layer.
    * **Impact of Enhanced L1:** The Foundation Layer is now internally more robust due to its own infrastructure components (circuit breakers, rate limiters, connection pools).
    * **Revised Consideration / Action Item:**
        * Update text to note that L1 services consumed by L2 (Config, Events, Utils) are now built on a more resilient internal infrastructure, leading to higher expected reliability for L2 interactions with L1.

* **Section 2.1: Repository System (`core.ex`, `enhanced.ex`)**
    * **Current Consideration:** `Repository.Core` is a GenServer managing ETS tables. `enhanced.ex` builds on this.
    * **Impact of Enhanced L1:**
        * If `Repository.Core` or `Enhanced` were to perform operations that could benefit from protection (e.g., complex state computations that could hang, or hypothetical writes to an external metadata store not covered by `EventStore`), they *could* internally use `ElixirScope.Foundation.Infrastructure.execute_protected/2`.
        * However, for standard ETS operations, this is likely overkill and adds unnecessary overhead. ETS is already highly performant and local.
    * **Revised Consideration / Action Item:**
        * Primarily, these components benefit from more reliable `ConfigServer` access if they fetch dynamic configurations.
        * No direct changes to their API for L1 infra, but internal implementation *could* adopt it for specific complex, state-altering `handle_call/cast` implementations if deemed necessary for self-protection, though this is less common for primarily ETS-bound GenServers.
        * Ensure `health_check` and `get_statistics` (from `MODULE_INTERFACES_DOCUMENTATION.md` for `Repository.Core`) are robust and detailed enough for consumption by `Foundation.Infrastructure.HealthAggregator` and `PerformanceMonitor`.

* **Section 2.1.3: Memory Manager (`memory_manager/`)**
    * **Current Consideration:** AST Layer has its own `MemoryManager` subsystem for its ETS tables.
    * **Impact of Enhanced L1:** Foundation Layer will have `ElixirScope.Foundation.Infrastructure.MemoryManager` for global/system-wide memory pressure.
    * **Revised Consideration / Action Item:**
        * The AST Layer's `MemoryManager.Monitor` should not only monitor AST-specific ETS tables but also consider subscribing to or querying the `Foundation.Infrastructure.MemoryManager` for global memory pressure signals.
        * `AST.Repository.MemoryManager.PressureHandler` should incorporate global pressure level information from the Foundation's `MemoryManager` into its decision-making for AST cache trimming, compression, etc.
        * A clear protocol for interaction/coordination between L1 and L2 memory managers needs to be defined (e.g., L2 MM reports its usage to L1 MM; L1 MM can signal L2 MM to take action).
        * Update `MODULE_INTERFACES_DOCUMENTATION.md` for `AST.MemoryManager.*` to reflect this potential coordination.

* **Section 2.4: Query System (`executor.ex`)**
    * **Current Consideration:** Executes queries against repository data (ETS).
    * **Impact of Enhanced L1:**
        * If query execution becomes a source of high load or involves potentially slow/complex computations (beyond simple ETS lookups), the `Query.Executor` could internally use `Foundation.Infrastructure.RateLimiter` to protect itself from too many concurrent complex queries, or `CircuitBreakerWrapper` if a query involves a risky computation.
    * **Revised Consideration / Action Item:**
        * For now, assume queries are primarily fast ETS reads. If future query types become computationally expensive or state-altering within the Executor GenServer, consider applying infrastructure protections internally to the `Executor`'s `handle_call` for those specific query types.
        * The `Query.Cache` can be an important part of this, since serving repeated queries from the cache reduces the load that reaches the `Executor`.

* **Section 6: Integration Points > Foundation Layer Integration**
    * **Current Consideration:** Lists `DataAccess`, `Utils`, `Config`.
    * **Impact of Enhanced L1:** New infrastructure components exist within L1.
    * **Revised Consideration / Action Item:**
        * Add a note: "The Foundation Layer now includes an `ElixirScope.Foundation.Infrastructure` sub-layer providing resilience patterns. While AST Layer components typically consume higher-level Foundation services (Config, Events), these services are now internally more robust. Direct use of `Foundation.Infrastructure` components by AST Layer is possible for specific advanced use cases but should be carefully evaluated for necessity."

* **Section 7: Implementation Guidelines > Error Handling & Monitoring**
    * **Current Consideration:** General guidelines.
    * **Impact of Enhanced L1:** L1 provides more structured error types from its infra and better monitoring.
    * **Revised Consideration / Action Item:**
        * AST Layer components should be prepared to handle specific `ElixirScope.Foundation.Types.Error` instances returned by L1 services that might indicate underlying infrastructure issues (e.g., circuit breaker open, rate limited).
        * AST Layer components should emit detailed telemetry that can be consumed by `Foundation.Infrastructure.PerformanceMonitor`. Standardize AST telemetry event names/payloads for this purpose.

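The error-handling expectation in Section 7 could look roughly like the sketch below. It is illustrative only: `Sketch.FoundationError` is a hypothetical stand-in for `ElixirScope.Foundation.Types.Error` (whose real fields are not shown in this addendum), and the `:circuit_breaker_open` / `:rate_limited` atoms and the 1000 ms retry delay are assumptions.

```elixir
defmodule Sketch.FoundationError do
  # Hypothetical stand-in for ElixirScope.Foundation.Types.Error;
  # the real struct's field names may differ.
  defstruct [:error_type, :message]
end

defmodule Sketch.FoundationResult do
  alias Sketch.FoundationError

  # Maps the result of a Foundation call to an action for the AST-side caller.
  def classify({:ok, value}, _fallback), do: {:ok, value}

  # Breaker open: keep working with a fallback value (e.g., last known config).
  def classify({:error, %FoundationError{error_type: :circuit_breaker_open}}, fallback),
    do: {:degraded, fallback}

  # Rate limited: suggest a retry delay in milliseconds (assumed value).
  def classify({:error, %FoundationError{error_type: :rate_limited}}, _fallback),
    do: {:retry_later, 1_000}

  # Anything else is passed through unchanged.
  def classify({:error, other}, _fallback), do: {:error, other}
end
```

Keeping the classification in one small module would let AST components react to infrastructure-related errors without each caller matching on the error struct itself.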
#### 2. Document: `MODULE_INTERFACES_DOCUMENTATION.md`

* **Section 1.1: Core Repository Interface (`ElixirScope.AST.Repository.Core`)**
    * **Current API:** `health_check(pid())`, `get_statistics(pid())`.
    * **Impact of Enhanced L1:** These will be consumed by `Foundation.Infrastructure.HealthAggregator` and `PerformanceMonitor`.
    * **Revised Consideration / Action Item:**
        * Ensure `health_check` returns a standardized format, e.g., `{:ok, %{status: :healthy | :degraded, details: map()}}` or `{:error, reason}`.
        * Ensure `get_statistics` provides metrics relevant for performance monitoring (e.g., ETS table sizes, hit/miss rates if applicable, queue lengths for its GenServer messages).

* **Section 8: Memory Management Interfaces (e.g., `ElixirScope.AST.Repository.MemoryManager.Monitor`)**
    * **Current API:** e.g., `get_memory_usage_report()`, `get_pressure_level()`.
    * **Impact of Enhanced L1:** Need to coordinate with `Foundation.Infrastructure.MemoryManager`.
    * **Revised Consideration / Action Item:**
        * Add functions or mechanisms for `AST.MemoryManager.Monitor` to report its specific memory usage (for AST data) to `Foundation.Infrastructure.MemoryManager`.
        * `AST.Repository.MemoryManager.PressureHandler` might subscribe to notifications from `Foundation.Infrastructure.MemoryManager` or periodically query its global pressure state to influence AST-specific cleanup actions.
        * The `handle_pressure_level/1` in `AST.PressureHandler` should take global context into account.

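As a minimal sketch of the standardized `health_check` format suggested for `Repository.Core`, the module below derives a result of the shape `{:ok, %{status: :healthy | :degraded, details: map()}}` or `{:error, reason}` from a statistics map. The module name, the `:message_queue_len` key, and the queue-length threshold are assumptions for illustration, not part of the documented interface.

```elixir
defmodule Sketch.HealthCheck do
  @type result ::
          {:ok, %{status: :healthy | :degraded, details: map()}} | {:error, term()}

  # Builds a health result from a statistics map. A message queue longer
  # than `max_queue` is reported as :degraded (assumed heuristic).
  @spec from_stats(term(), non_neg_integer()) :: result()
  def from_stats(%{message_queue_len: len} = stats, max_queue) when is_integer(len) do
    status = if len > max_queue, do: :degraded, else: :healthy
    {:ok, %{status: status, details: stats}}
  end

  # Statistics without the expected key cannot be assessed.
  def from_stats(_other, _max_queue), do: {:error, :invalid_statistics}
end
```

A single result shape like this would let an aggregator treat all AST GenServers uniformly, whatever statistics each one exposes in `details`.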
#### 3. Document: `ETS_SCHEMA.md`

* **Section 8: Memory Management**
    * **Current Consideration:** Describes AST-specific LRU eviction and memory pressure response.
    * **Impact of Enhanced L1:** These local strategies now operate within a system that has a global `Foundation.Infrastructure.MemoryManager`.
    * **Revised Consideration / Action Item:**
        * The `ElixirScope.AST.MemoryPressure.handle_memory_pressure/1` logic should be updated to potentially be triggered or influenced by signals from `Foundation.Infrastructure.MemoryManager` in addition to its own local monitoring. For example, if Foundation L1 signals "critical pressure," AST's Memory Manager might trigger more aggressive cleanup than its local thresholds would indicate.

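One way to combine local and global pressure, sketched under assumptions: the four level names and the cleanup actions below are hypothetical, and the rule "act on the more severe of the two levels" is one possible reading of the guidance above, not a defined protocol.

```elixir
defmodule Sketch.Pressure do
  # Assumed pressure levels, ordered from least to most severe.
  @levels [:low, :medium, :high, :critical]

  # Returns the more severe of the AST-local and Foundation-global levels.
  def effective(local, global) when local in @levels and global in @levels do
    Enum.max_by([local, global], &rank/1)
  end

  defp rank(level), do: Enum.find_index(@levels, &(&1 == level))

  # Hypothetical mapping from effective level to an AST cleanup action.
  def cleanup_action(:low), do: :none
  def cleanup_action(:medium), do: :evict_lru
  def cleanup_action(:high), do: :evict_and_compress
  def cleanup_action(:critical), do: :emergency_clear
end
```

With this rule, a global `:critical` signal overrides a local `:low` reading, which matches the example of cleanup that is more aggressive than local thresholds alone would indicate.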
#### 4. Document: `SUPERVISION_TREE.md`

* **Section 8: Error Recovery Patterns**
    * **Current Consideration:** Mentions an AST-local `ElixirScope.AST.CircuitBreaker`.
    * **Impact of Enhanced L1:** Foundation now provides `ElixirScope.Foundation.Infrastructure.CircuitBreakerWrapper`.
    * **Revised Consideration / Action Item:**
        * Clarify the role of any *AST-local* circuit breakers versus the *Foundation-level* ones.
        * AST-local CBs would be for protecting internal computational flows or non-critical internal tasks within Layer 2.
        * If an AST component needs to call an external service (which should be rare, usually L1 handles this), it *should* use `ElixirScope.Foundation.Infrastructure.CircuitBreakerWrapper.execute/3`.
        * More commonly, AST components benefit because the L1 services they call (e.g., `ConfigServer` hypothetically calling an external persistence layer for its config) are *already protected* by Foundation's CBs.
        * The `ElixirScope.AST.RecoveryPatterns.handle_repository_failure/1` might involve checking the health of underlying Foundation services before attempting complex recovery, as L1 issues could be the root cause.

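The idea of checking Foundation health before attempting recovery can be sketched as follows. This is not the real `handle_repository_failure/1`: the function name, its two-argument shape, and the `{:recovered, _}` / `{:deferred, _}` return values are hypothetical, and the health functions are passed in so the sketch does not depend on any Foundation API.

```elixir
defmodule Sketch.Recovery do
  # `checks` is a keyword list of service name => zero-arity health function
  # returning the standardized health result. Recovery only runs when every
  # Foundation service reports :healthy; otherwise it is deferred.
  def recover_if_foundation_healthy(checks, recover_fun) when is_function(recover_fun, 0) do
    unhealthy =
      for {name, check} <- checks,
          not match?({:ok, %{status: :healthy}}, check.()),
          do: name

    case unhealthy do
      [] -> {:recovered, recover_fun.()}
      names -> {:deferred, names}
    end
  end
end
```

Deferring when an L1 service is unhealthy avoids repeating an expensive AST-side recovery whose root cause lies in the Foundation Layer.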
#### 5. Documents: `REQ-*.md` (e.g., `REQ-01-CORE-REPOSITORY.md`, `REQ-04-ADVANCED-FEATURES.md`)

* **General Impact:** Non-functional requirements related to reliability, performance, and monitoring can be met with higher confidence or to a greater degree due to the enhanced Foundation Layer.
* **`REQ-01-CORE-REPOSITORY.md`:**
    * **FR-1.4 (Lifecycle Management - Monitoring):** `Repository.Core` statistics collection can be more effectively utilized by `Foundation.Infrastructure.PerformanceMonitor`.
* **`REQ-04-ADVANCED-FEATURES.md`:**
    * **FR-4.2 (Memory Management System):** The description of this system *within AST* needs to clearly define its interaction and responsibility boundaries with `Foundation.Infrastructure.MemoryManager`. It should not duplicate efforts but rather manage AST-specific data structures based on its own heuristics *and* global signals from L1.
    * **NFR-4.5 (System Reliability - Fault tolerance):** The "graceful handling of memory exhaustion" will be a joint effort between L2's Memory Manager and L1's global Memory Manager.

---

### New Integration Patterns & API Considerations for AST Layer (Layer 2)

1. **Health Reporting to Foundation:**
    * Key AST GenServers (e.g., `Repository.Core`, `PatternMatcher.Core`, `Query.Executor`, `AST.MemoryManager.Monitor`, `FileWatcher.Core`) MUST implement a `health_check/0` function or respond to a `:health_check` call.
    * This health check should return `{:ok, details_map}` or `{:error, reason_map}`.
    * `Foundation.Infrastructure.HealthAggregator` will be configured to call these endpoints.
    * **Action Item:** Define a standardized health check response format/behaviour for ElixirScope services.

2. **Performance Telemetry for Foundation:**
    * AST components performing significant work (parsing, complex queries, pattern matching) SHOULD emit detailed telemetry events.
    * Event names should follow a convention allowing `Foundation.Infrastructure.PerformanceMonitor` to easily subscribe and aggregate them (e.g., `[:elixir_scope, :ast, :parser, :parse_file_duration]`, `[:elixir_scope, :ast, :query, :execute_duration]`).
    * Payloads should include relevant metadata (e.g., file size for parsing, query complexity for queries).
    * **Action Item:** Define standard AST-specific telemetry events and payloads.

3. **Memory Usage Reporting to Foundation:**
    * `ElixirScope.AST.Repository.MemoryManager.Monitor` SHOULD periodically report the memory usage specifically consumed by AST's ETS tables and caches to `Foundation.Infrastructure.MemoryManager`.
    * This allows the L1 `MemoryManager` to have a global view including L2's significant contribution.
    * **Action Item:** Define an API on `Foundation.Infrastructure.MemoryManager` for other layers/components to report their specialized memory consumption.

4. **Consuming Global Memory Pressure Signals:**
    * `ElixirScope.AST.Repository.MemoryManager.PressureHandler` SHOULD be able to subscribe to or poll `Foundation.Infrastructure.MemoryManager` for global memory pressure level updates.
    * When L1 signals high/critical pressure, L2's `PressureHandler` must trigger more aggressive local cleanup within AST's data stores, even if its own local thresholds haven't been met.
    * **Action Item:** Define the mechanism (callback, subscription, polling) for L2 MM to receive global pressure signals from L1 MM.

5. **Direct Use of Foundation Infrastructure (Limited Cases):**
    * While uncommon, if an AST component *itself* makes external calls or manages a pool of unique, long-lived, non-GenServer resources (highly unlikely for AST's domain), it *could* directly use `Foundation.Infrastructure.CircuitBreakerWrapper` or `ConnectionManager`.
    * Example: If `AST.Parsing.Parser` had to fetch some schema from a remote URL (not current design), that call would be wrapped.
    * **Action Item:** General guidance: AST components should rely on L1 services to handle external interactions. Direct use of L1 Infrastructure by L2 needs strong justification.

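The telemetry convention from item 2 could be wrapped in a small helper like the sketch below. The helper module and its `span/4` function are hypothetical. The emitter is passed in as a three-argument function so the example runs without dependencies; in an application using the `:telemetry` library, `&:telemetry.execute/3` has the matching argument order (event name, measurements, metadata).

```elixir
defmodule Sketch.AstTelemetry do
  # Runs `fun`, then emits `event` with the elapsed time (native time units)
  # as the measurement and the caller-supplied metadata. Returns fun's result.
  def span(event, metadata, emit, fun)
      when is_list(event) and is_map(metadata) and is_function(emit, 3) do
    start = System.monotonic_time()
    result = fun.()
    duration = System.monotonic_time() - start
    emit.(event, %{duration: duration}, metadata)
    result
  end
end

# Usage with a stand-in emitter that sends each event to the calling process.
parent = self()
emit = fn event, measurements, metadata -> send(parent, {event, measurements, metadata}) end

Sketch.AstTelemetry.span(
  [:elixir_scope, :ast, :parser, :parse_file_duration],
  %{file_size: 2_048},
  emit,
  fn -> :parsed end
)
```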
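For the memory usage reporting in item 3, the figure to report can be derived from ETS itself: `:ets.info(table, :memory)` returns a table's size in machine words, and `:erlang.system_info(:wordsize)` gives the bytes per word. The sketch below only builds the report; the report shape and module name are assumptions, and how it is delivered to `Foundation.Infrastructure.MemoryManager` is left open because that API is still an action item.

```elixir
defmodule Sketch.AstMemoryReport do
  # Total bytes used by the given ETS tables. Tables that no longer exist
  # (:ets.info/2 returns :undefined) are skipped.
  def ets_bytes(tables) do
    word_size = :erlang.system_info(:wordsize)

    Enum.reduce(tables, 0, fn table, acc ->
      case :ets.info(table, :memory) do
        :undefined -> acc
        words -> acc + words * word_size
      end
    end)
  end

  # Hypothetical report shape for the L1 memory manager.
  def build_report(tables) do
    %{source: :ast_layer, bytes: ets_bytes(tables), table_count: length(tables)}
  end
end

# Usage with a throwaway table standing in for an AST repository table.
table = :ets.new(:sketch_ast_modules, [:set, :public])
:ets.insert(table, {:my_module, %{ast: {:defmodule, [], []}}})
report = Sketch.AstMemoryReport.build_report([table])
```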
---
**Conclusion:**

The enhancements to the Foundation Layer (Layer 1) with a dedicated Infrastructure sub-layer significantly bolster the overall robustness and manageability of the ElixirScope platform. For the AST Layer (Layer 2), this primarily translates to:

* **Increased reliability** of the L1 services it depends upon.
* New **coordination points** with L1's infrastructure services for global concerns like memory management and health aggregation.
* The **API contract for L2 consuming L1 services like `Config`, `Events`, `Telemetry` remains stable**.
* The **new `ElixirScope.Foundation.Infrastructure` API is primarily for L1 internal use** or very specific, advanced L2 scenarios.

This addendum provides the necessary pointers to update AST Layer documentation and consider these integration points during its ongoing development and refinement. The AST Layer's own internal mechanisms for memory management, caching, and error handling remain important but will now operate within a more globally aware and resilient system.
