| 1 | +# Profiling Results: Validation Performance Analysis |
| 2 | + |
| 3 | +**Date**: 2025-01-XX |
| 4 | +**Branch**: `optimization/phase-3` |
| 5 | +**Workload**: 500-node scale-free graph, 10× validation runs |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Executive Summary |
| 10 | + |
| 11 | +**Key Finding**: 76% of validation time is spent in NetworkX graph algorithms: |
| 12 | +- `eccentricity()`: 4.684s / 6.138s total (76%) |
| 13 | +- `_single_shortest_path_length()`: 2.758s self-time (45%) |
| 14 | +- Field caching works as intended: the 2nd run rounds to 0.000s at the profiler's millisecond display precision (100% cache hits) |
| 15 | + |
| 16 | +**Bottleneck**: `estimate_coherence_length()` → diameter calculation → APSP, O(N × (N + M)) with BFS (O(N³) on dense graphs) |
| 17 | + |
| 18 | +--- |
| 19 | + |
| 20 | +## Detailed Profile: Full Validation (10 runs) |
| 21 | + |
| 22 | +### Top Functions by Cumulative Time |
| 23 | + |
| 24 | +| Function | cumtime | tottime | calls | Source | |
| 25 | +|----------|---------|---------|-------|--------| |
| 26 | +| `run_structural_validation` | 6.138s | 0.000s | 10 | aggregator.py:124 | |
| 27 | +| `eccentricity` | 4.684s | 0.023s | 20 | networkx/distance_measures.py:317 | |
| 28 | +| `shortest_path_length` | 4.600s | 0.006s | 10K | networkx/shortest_paths/generic.py:178 | |
| 29 | +| `single_source_shortest_path_length` | 4.584s | 0.603s | 10K | networkx/unweighted.py:19 | |
| 30 | +| **`_single_shortest_path_length`** | **3.979s** | **2.758s** | **5M** | **networkx/unweighted.py:61** | |
| 31 | +| `diameter` | 2.339s | 0.000s | 10 | networkx/distance_measures.py:408 | |
| 32 | +| `compute_structural_potential` | 1.428s | 0.100s | 1 | fields.py:309 | |
| 33 | +| `_dijkstra_multisource` | 1.150s | 0.637s | 500 | networkx/weighted.py:784 | |
| 34 | + |
| 35 | +### Primitive Operations (High Self-Time) |
| 36 | + |
| 37 | +| Operation | tottime | calls | Type | |
| 38 | +|-----------|---------|-------|------| |
| 39 | +| `set.add()` | 0.491s | 5M | Builtin | |
| 40 | +| `list.append()` | 0.450s | 5M | Builtin | |
| 41 | +| `lambda` (edge weight) | 0.363s | 1.5M | NetworkX | |
| 42 | +| `len()` | 0.298s | 3.8M | Builtin | |
| 43 | + |
| 44 | +**Interpretation**: |
| 45 | +- 2.758s self-time in `_single_shortest_path_length` = actual BFS work |
| 46 | +- 0.637s self-time in Dijkstra = distance computations |
| 47 | +- Remaining time = Python overhead (sets, lists, len checks) |
| 48 | + |
| 49 | +--- |
| 50 | + |
| 51 | +## Field Caching Performance: Second Run |
| 52 | + |
| 53 | +### Total Time: 0.000s (100% cache hits) |
| 54 | + |
| 55 | +| Function | cumtime | calls | Role | |
| 56 | +|----------|---------|-------|------| |
| 57 | +| `cache.wrapper` | 0.000s | 40 | Check cache | |
| 58 | +| `_generate_cache_key` | 0.000s | 40 | Hash inputs | |
| 59 | +| `get()` | 0.000s | 40 | Retrieve value | |
| 60 | +| `openssl_md5` | 0.000s | 40 | Hash computation | |
| 61 | + |
| 62 | +**Evidence**: Field caching works as intended. On cached graphs the lookup path (key hashing plus retrieval) rounds to 0.000s, so its overhead is below what the profiler displays. |
| 63 | + |
| 64 | +--- |
| 65 | + |
| 66 | +## Performance Breakdown by Component |
| 67 | + |
| 68 | +### 1. NetworkX Graph Algorithms: 76% (4.684s / 6.138s) |
| 69 | + |
| 70 | +**Functions**: |
| 71 | +- `eccentricity()` (diameter calculation): 4.684s cumulative |
| 72 | +- `shortest_path_length()`: 4.600s cumulative |
| 73 | +- BFS internal: 2.758s self-time |
| 74 | + |
| 75 | +**Why Expensive**: |
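| 76 | +- Exact diameter requires All-Pairs Shortest Paths (APSP) |
| 77 | +- NetworkX eccentricity(v) = max(shortest_path_length(v, target) for all targets), i.e. one BFS per node |
| 78 | +- Complexity: O(N × (N + M)) for unweighted BFS, which is O(N³) in the worst case (dense graph, M ≈ N²) |
| 79 | +- 500 nodes → 500 BFS traversals and 500² = 250K source-target distances per pass; the 20 `eccentricity` passes account for the 10K `shortest_path_length` calls in the profile table |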
| 80 | + |
| 81 | +**Optimization Opportunities**: |
| 82 | +1. **Approximate diameter** (2-sweep BFS heuristic): O(N + M) vs O(N × (N + M)) exact |
| 83 | +2. **Cache graph-level metrics** (diameter, eccentricity) separately |
| 84 | +3. **Lazy diameter** - only compute if needed for ξ_C validation |
| 85 | + |
| 86 | +### 2. Field Computation (First Run): 23% (1.428s / 6.138s) |
| 87 | + |
| 88 | +**Functions**: |
| 89 | +- `compute_structural_potential()`: 1.428s (Φ_s) |
| 90 | +- Uses Dijkstra for distance matrix: 1.150s |
| 91 | + |
| 92 | +**Why Reasonable**: |
| 93 | +- First computation on uncached graph |
| 94 | +- Dijkstra with a binary heap is O((N + M) log N) per source; 500 sources = O(N × (N + M) log N) |
| 95 | +- Includes inverse-square distance weighting |
| 96 | + |
| 97 | +**Already Optimized**: |
| 98 | +- ✅ Cache decorator applied |
| 99 | +- ✅ NumPy vectorization for distance matrix operations |
| 100 | +- ✅ No obvious low-hanging fruit |
| 101 | + |
| 102 | +### 3. Cache System: <1% (0.000s) |
| 103 | + |
| 104 | +**Already Optimal**: Negligible overhead, perfect hit rate on repeated calls. |
| 105 | + |
| 106 | +--- |
| 107 | + |
| 108 | +## Optimization Priorities (Based on Profile Data) |
| 109 | + |
| 110 | +### HIGH PRIORITY 🔴 |
| 111 | + |
| 112 | +#### 1. Replace Exact Diameter with Approximation |
| 113 | +**Impact**: ~4.5s → ~0.05s (99% reduction) |
| 114 | +**Effort**: Medium |
| 115 | +**Risk**: Low (approximate ξ_C sufficient) |
| 116 | + |
| 117 | +**Implementation**: |
| 118 | +```python |
| 119 | +import networkx as nx |
| 120 | + |
| 121 | +def approximate_diameter(G): |
| 122 | +    """2-sweep BFS lower bound on the diameter of a connected, non-empty graph. |
| 123 | +    Complexity: O(N + M) vs O(N × (N + M)) exact. |
| 124 | +    Accuracy: never above the true diameter and at least half of it; often exact. |
| 125 | +    """ |
| 126 | +    # 1. Arbitrary start node; calling nx.eccentricity() on every node here would repeat the exact APSP |
| 127 | +    u = next(iter(G.nodes)) |
| 128 | + |
| 129 | +    # 2. BFS from u, find farthest v |
| 130 | +    lengths = nx.single_source_shortest_path_length(G, u) |
| 131 | +    v, d1 = max(lengths.items(), key=lambda x: x[1]) |
| 132 | + |
| 133 | +    # 3. BFS from v, diameter ≈ max distance (eccentricity of v) |
| 134 | +    lengths2 = nx.single_source_shortest_path_length(G, v) |
| 135 | +    d2 = max(lengths2.values()) |
| 136 | +    return max(d1, d2) |
| 137 | +``` |
| 138 | + |
| 139 | +**Validation**: Benchmark against exact diameter on test graphs. |
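
The validation step could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's benchmark: `two_sweep_diameter` and `compare_diameters` are hypothetical names, and `nx.barabasi_albert_graph` is assumed as the scale-free generator because the profile does not say which one was used.

```python
import networkx as nx


def two_sweep_diameter(G, start=None):
    """Hypothetical helper: 2-sweep BFS lower bound on the diameter (connected G)."""
    u = next(iter(G.nodes)) if start is None else start
    dist_u = nx.single_source_shortest_path_length(G, u)
    v = max(dist_u, key=dist_u.get)              # farthest node from u
    dist_v = nx.single_source_shortest_path_length(G, v)
    return max(dist_v.values())                  # eccentricity of v


def compare_diameters(n_nodes=200, m_edges=2, seeds=(0, 1, 2)):
    """Return (seed, exact, approx) for a few seeded Barabasi-Albert graphs."""
    rows = []
    for seed in seeds:
        G = nx.barabasi_albert_graph(n_nodes, m_edges, seed=seed)
        rows.append((seed, nx.diameter(G), two_sweep_diameter(G)))
    return rows


if __name__ == "__main__":
    for seed, exact, approx in compare_diameters():
        print(f"seed={seed} exact={exact} approx={approx}")
```

Because the 2-sweep result is the eccentricity of a real node, it can never exceed the exact diameter, and it is at least half of it. A benchmark can therefore assert `exact / 2 <= approx <= exact` before looking at how often the two agree.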
| 140 | + |
| 141 | +#### 2. Cache Graph-Level Metrics Separately |
| 142 | +**Impact**: ~20% reduction if diameter reused |
| 143 | +**Effort**: Low |
| 144 | +**Risk**: Very Low |
| 145 | + |
| 146 | +**Implementation**: |
| 147 | +- Add `@cache_tnfr_computation(dependencies={'graph_topology'})` to diameter wrapper |
| 148 | +- Store in graph cache with longer TTL |
| 149 | +- Invalidate only on topology changes |
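
A topology-keyed cache along these lines might look like the sketch below. The signature of `@cache_tnfr_computation` is not shown in this report, so the sketch uses a plain dictionary instead; `topology_fingerprint`, `cached_diameter` and `_DIAMETER_CACHE` are hypothetical names, and the graph is assumed to be undirected.

```python
import hashlib

import networkx as nx

_DIAMETER_CACHE = {}  # hypothetical store: topology fingerprint -> diameter


def topology_fingerprint(G):
    """Hash node and edge identities only, so attribute changes do not invalidate."""
    h = hashlib.md5()
    for node in sorted(map(repr, G.nodes)):
        h.update(node.encode())
    h.update(b"|")
    # Sort endpoints inside each edge so (u, v) and (v, u) hash identically.
    for edge in sorted("-".join(sorted(map(repr, e))) for e in G.edges):
        h.update(edge.encode())
    return h.hexdigest()


def cached_diameter(G):
    """Recompute the diameter only when the node/edge set has changed."""
    key = topology_fingerprint(G)
    if key not in _DIAMETER_CACHE:
        _DIAMETER_CACHE[key] = nx.diameter(G)
    return _DIAMETER_CACHE[key]
```

The fingerprint costs one pass over nodes and edges plus a sort, which is small next to an exact diameter. Whether it is worth paying on every call, rather than invalidating explicitly on topology edits, is a design choice this sketch does not settle.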
| 150 | + |
| 151 | +### MEDIUM PRIORITY 🟡 |
| 152 | + |
| 153 | +#### 3. Vectorize Phase Operations |
| 154 | +**Impact**: ~10-15% reduction (phase gradient/curvature) |
| 155 | +**Effort**: Medium |
| 156 | +**Risk**: Low |
| 157 | + |
| 158 | +**Target**: Batch phase difference computations in `compute_phase_gradient` |
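
As an illustration of what batching could mean here, the sketch below computes wrapped phase differences for all edges in one NumPy pass. The actual definition used by `compute_phase_gradient` is not shown in this report; the sketch assumes the gradient at a node is the mean absolute wrapped phase difference to its neighbours, and `phase_gradient_batched` and `wrap_phase` are hypothetical names.

```python
import numpy as np


def wrap_phase(delta):
    """Wrap angle differences into [-pi, pi)."""
    return (delta + np.pi) % (2.0 * np.pi) - np.pi


def phase_gradient_batched(phases, edges):
    """Mean |wrapped phase difference| per node (assumed definition).

    phases: (N,) array of node phases, indexed 0..N-1.
    edges:  (E, 2) integer array of undirected edges.
    """
    phases = np.asarray(phases, dtype=float)
    edges = np.asarray(edges, dtype=int).reshape(-1, 2)
    src, dst = edges[:, 0], edges[:, 1]
    n = phases.shape[0]

    # One vectorized pass over all edges instead of a Python loop per node.
    diff = np.abs(wrap_phase(phases[dst] - phases[src]))

    # Each undirected edge contributes to both of its endpoints.
    total = (np.bincount(src, weights=diff, minlength=n)
             + np.bincount(dst, weights=diff, minlength=n))
    degree = np.bincount(src, minlength=n) + np.bincount(dst, minlength=n)

    # Isolated nodes keep a gradient of 0.
    return np.divide(total, degree, out=np.zeros(n), where=degree > 0)
```

Node labels are assumed to be consecutive integers; a real graph would need a label-to-index mapping built once and reused.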
| 159 | + |
| 160 | +#### 4. Early Exit for Grammar Validation |
| 161 | +**Impact**: Variable (10-30% if errors common) |
| 162 | +**Effort**: Low |
| 163 | +**Risk**: Very Low |
| 164 | + |
| 165 | +**Implementation**: Add `stop_on_first_error=True` flag |
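
A flag of this kind might behave as in the sketch below. The grammar validator's real interface is not shown in this report, so `validate_sequence`, the `Rule` type and the example rule are all hypothetical.

```python
from typing import Callable, Iterable, List, Optional

# A rule returns an error message for a bad token, or None if the token passes.
Rule = Callable[[str], Optional[str]]


def validate_sequence(tokens: Iterable[str], rules: List[Rule],
                      stop_on_first_error: bool = False) -> List[str]:
    """Apply every rule to every token and collect error messages.

    With stop_on_first_error=True, return as soon as one error is found,
    skipping all remaining rule evaluations.
    """
    errors: List[str] = []
    for token in tokens:
        for rule in rules:
            message = rule(token)
            if message is not None:
                errors.append(message)
                if stop_on_first_error:
                    return errors
    return errors


def no_empty_token(token: str) -> Optional[str]:
    """Example rule (hypothetical): reject empty tokens."""
    return "empty token" if token == "" else None
```

The saving depends on how early the first error occurs, which is consistent with the variable impact estimate: valid inputs still pay for the full pass.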
| 166 | + |
| 167 | +### LOW PRIORITY 🟢 |
| 168 | + |
| 169 | +#### 5. NumPy/Numba JIT for BFS |
| 170 | +**Impact**: ~20% (if replacing NetworkX) |
| 171 | +**Effort**: High |
| 172 | +**Risk**: High (correctness, maintenance) |
| 173 | + |
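| 174 | +**Decision**: Defer. NetworkX BFS is pure Python (hence the `set.add()` / `list.append()` self-time in the profile), so a compiled BFS could help, but the approximate diameter removes most BFS calls first. |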
| 175 | + |
| 176 | +--- |
| 177 | + |
| 178 | +## Recommended Next Steps |
| 179 | + |
| 180 | +1. **Implement approximate diameter** (Issue #1) |
| 181 | + - Create `fast_diameter()` helper |
| 182 | + - Add benchmark comparing exact vs approximate |
| 183 | + - Update `estimate_coherence_length()` to use approximation |
| 184 | + - Measure speedup on 100, 500, 1K node graphs |
| 185 | + |
| 186 | +2. **Add graph-level metric caching** (Issue #2) |
| 187 | + - Wrap diameter in cached function |
| 188 | + - Test invalidation on topology changes |
| 189 | + |
| 190 | +3. **Profile after optimizations** |
| 191 | + - Re-run this script |
| 192 | + - Verify NetworkX time <20% total |
| 193 | + - Document speedup in OPTIMIZATION_PROGRESS.md |
| 194 | + |
| 195 | +4. **Benchmark at scale** |
| 196 | + - Test 1K, 2K, 5K node graphs |
| 197 | + - Measure O(N + M) scaling for approximate diameter |
| 198 | + - Compare O(N × (N + M)) exact vs O(N + M) approximate curves |
| 199 | + |
| 200 | +--- |
| 201 | + |
| 202 | +## Tools & Commands |
| 203 | + |
| 204 | +### Run This Profile |
| 205 | +```powershell |
| 206 | +$env:PYTHONPATH=(Resolve-Path -Path ./src).Path |
| 207 | +& "C:/Program Files/Python313/python.exe" profile_validation.py |
| 208 | +``` |
| 209 | + |
| 210 | +### Analyze with snakeviz (Visual) |
| 211 | +```powershell |
| 212 | +# Install snakeviz |
| 213 | +pip install snakeviz |
| 214 | + |
| 215 | +# Generate profile |
| 216 | +python -m cProfile -o profile.stats profile_validation.py |
| 217 | + |
| 218 | +# Visualize |
| 219 | +snakeviz profile.stats |
| 220 | +``` |
| 221 | + |
| 222 | +### Line-by-line profiling (optional) |
| 223 | +```powershell |
| 224 | +# Install line_profiler |
| 225 | +pip install line_profiler |
| 226 | + |
| 227 | +# Decorate target function with @profile |
| 228 | +# Run with kernprof |
| 229 | +kernprof -l -v profile_validation.py |
| 230 | +``` |
| 231 | + |
| 232 | +--- |
| 233 | + |
| 234 | +## References |
| 235 | + |
| 236 | +- **NetworkX Performance**: https://networkx.org/documentation/stable/reference/algorithms/shortest_paths.html |
| 237 | +- **Diameter Approximation**: Magnien et al. "Fast computation of empirically tight bounds for the diameter of massive graphs" (2009) |
| 238 | +- **BFS Complexity**: O(N + M) unweighted; Dijkstra (weighted) is O((N + M) log N) with a binary heap, O(N log N + M) with a Fibonacci heap |
| 239 | + |
| 240 | +--- |
| 241 | + |
| 242 | +**Next Document**: `docs/DIAMETER_OPTIMIZATION.md` (implementation plan) |