Commit 6de37e1
feat: add workflow parallelism controls and benchmarking tools
- Configure workflow parallelism (limit: 10), mutex locks, and timeouts
- Add idempotency and retry logic to convert/register scripts
- Create pipeline_utils module for shared logging and metrics
- Add baseline benchmarking and load-testing tools
- Include unit and integration tests with CI validation
- Update workflow template with production resource limits
1 parent 8198ca6 commit 6de37e1

19 files changed: +1649 -617 lines
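The convert/register retry and idempotency changes themselves are not part of the excerpt below. As a rough sketch of the pattern the commit message describes (the `with_retries` name, its parameters, and the backoff policy are hypothetical illustrations, not code from this commit):

```python
import logging
import time

logger = logging.getLogger("pipeline")

def with_retries(fn, *, attempts: int = 3, backoff_s: float = 2.0):
    """Call fn(), retrying failed attempts with exponential backoff.

    Hypothetical sketch; re-raises the last exception once all
    attempts are exhausted.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts:
                raise
            delay = backoff_s * 2 ** (attempt - 1)
            logger.warning("attempt %d/%d failed (%s); retrying in %.1fs",
                           attempt, attempts, exc, delay)
            time.sleep(delay)
```

Idempotency would pair with this by checking for an existing output (an already-converted Zarr store or an already-registered STAC item, say) before doing any work, so a retried or re-delivered event becomes a no-op.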
Lines changed: 47 additions & 0 deletions

```yaml
name: Workflow Parallelism Smoke Test

on:
  pull_request:
    paths:
      - 'workflows/**'
      - 'scripts/convert.py'
      - 'scripts/register.py'
  push:
    branches:
      - feat/workflow-parallelism

jobs:
  smoke-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install uv
          uv sync

      - name: Validate workflow template YAML
        run: |
          python -c "import yaml; yaml.safe_load(open('workflows/base/workflowtemplate.yaml'))"

      - name: Test convert.py imports
        run: |
          uv run python -c "from scripts.convert import run_conversion; print('✓ convert.py imports OK')"

      - name: Test register.py imports
        run: |
          uv run python -c "from scripts.register import run_registration; print('✓ register.py imports OK')"

      - name: Validate kustomize overlays
        run: |
          if command -v kustomize &> /dev/null; then
            kustomize build workflows/overlays/high-throughput
          else
            echo "Kustomize not available; skipping validation"
          fi
```

.gitignore

Lines changed: 2 additions & 0 deletions

```diff
@@ -61,3 +61,5 @@ Thumbs.db
 # Project-specific
 *.zarr
 out/
+reports/*
+!reports/README.md
```

Makefile

Lines changed: 18 additions & 0 deletions

```diff
@@ -49,3 +49,21 @@ clean: ## Clean generated files and caches
 	find . -type f -name '*.pyc' -delete 2>/dev/null || true
 	rm -rf .pytest_cache .mypy_cache .ruff_cache htmlcov .coverage
 	@echo "✓ Clean complete"
+
+test: ## Run tests with pytest
+	@echo "🧪 Running tests..."
+	uv run pytest tests/ -v
+
+validate-workflows: ## Validate workflow YAML files
+	@echo "✓ Validating workflow templates..."
+	@python3 -c "import yaml; yaml.safe_load(open('workflows/base/workflowtemplate.yaml'))" && echo "  ✓ workflowtemplate.yaml"
+	@python3 -c "import yaml; yaml.safe_load(open('workflows/base/sensor.yaml'))" && echo "  ✓ sensor.yaml"
+	@python3 -c "import yaml; yaml.safe_load(open('workflows/base/eventsource.yaml'))" && echo "  ✓ eventsource.yaml"
+
+apply-staging: ## Apply staging overlay to devseed-staging
+	@echo "📦 Applying staging overlay..."
+	kubectl apply -k workflows/overlays/staging
+
+apply-production: ## Apply production overlay to devseed
+	@echo "📦 Applying production overlay..."
+	kubectl apply -k workflows/overlays/production
```

pyproject.toml

Lines changed: 2 additions & 1 deletion

```diff
@@ -5,7 +5,7 @@ build-backend = "hatchling.build"
 [project]
 name = "data-pipeline"
 version = "1.0.0"
-description = "Minimal event-driven Argo Workflows pipeline for Sentinel-2 GeoZarr conversion and STAC registration"
+description = "Minimal event-driven Argo Workflows pipeline for Sentinel GeoZarr conversion and STAC registration"
 readme = "README.md"
 requires-python = ">=3.13"
 license = { text = "MIT" }
@@ -44,6 +44,7 @@ dev = [
     "mypy>=1.11.0",
     "pre-commit>=3.7.0",
     "types-boto3>=1.0.2",
+    "matplotlib>=3.7.0",
 ]

 [project.urls]
```
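matplotlib joins the dev extras, presumably to render the `baseline-plots.png` analysis output described in `reports/README.md` below. A minimal sketch of such a plot, assuming the capture layout documented there (the values here are illustrative, not real benchmark data):

```python
import os

import matplotlib
matplotlib.use("Agg")  # render headlessly, e.g. in CI
import matplotlib.pyplot as plt

# Illustrative captures following the layout in reports/README.md;
# a real run would load them from a baseline-metrics-*.json file.
captures = [{"workflows": {"running": n}} for n in (0, 4, 9, 10, 10, 6, 1)]
running = [c["workflows"]["running"] for c in captures]

os.makedirs("reports/analysis", exist_ok=True)
plt.plot(running, marker="o")
plt.xlabel("capture index")
plt.ylabel("running workflows")
plt.title("Concurrent workflows over time")
plt.savefig("reports/analysis/baseline-plots.png")
```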

reports/README.md

Lines changed: 54 additions & 0 deletions

# Benchmarking Reports

Performance metrics and analysis for workflow scaling tests.

## Structure

```
reports/
├── baseline/    # Baseline workflow metrics
│   └── baseline-metrics-*.json
└── analysis/    # Statistical analysis outputs
    ├── baseline-stats.json
    └── baseline-plots.png
```

## Metrics Files

Each `baseline-metrics-*.json` file contains:

- **metadata**: Benchmark run details (start/end time, duration, namespace)
- **captures**: Time-series data points, each recording:
  - **workflows**: Argo workflow counts (total, running, succeeded, failed)
  - **nodes**: Kubernetes node resource allocation
  - **rabbitmq**: Queue depths and consumer counts (if available)
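To make that layout concrete, a minimal sketch of reading one metrics file and scanning its captures; only the key names listed above come from this commit, so the exact nesting is an assumption (as is the example filename, which just follows the documented timestamp pattern):

```python
import json

with open("reports/baseline/baseline-metrics-20250101-120000.json") as f:
    metrics = json.load(f)

print("namespace:", metrics["metadata"].get("namespace"))
for capture in metrics["captures"]:
    wf = capture["workflows"]  # assumed nesting: one snapshot per capture
    print(f"running={wf['running']} succeeded={wf['succeeded']} failed={wf['failed']}")
```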
## Analysis Files

Each `baseline-stats.json` file contains:

- **metadata**: Benchmark run details
- **workflow_stats**: Aggregate statistics
  - `peak_concurrent`: Maximum concurrent workflows
  - `total_completed`: Total succeeded workflows
  - `total_failed`: Total failed workflows
  - `avg_concurrent`: Average concurrent workflows
- **rabbitmq_stats**: RabbitMQ statistics (if available)
  - `peak_queue_depth`: Maximum queue depth
  - `avg_queue_depth`: Average queue depth
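As a sketch of how these aggregates could fall out of the captures (assuming the capture layout above; the actual `analyze_baseline.py` implementation is not shown in this excerpt):

```python
def summarize(captures: list[dict]) -> dict:
    """Derive the workflow_stats aggregates from raw captures (sketch)."""
    running = [c["workflows"]["running"] for c in captures]
    # Assumes succeeded/failed counts are cumulative per snapshot,
    # so the final capture carries the totals.
    final = captures[-1]["workflows"]
    return {
        "peak_concurrent": max(running),
        "avg_concurrent": sum(running) / len(running),
        "total_completed": final["succeeded"],
        "total_failed": final["failed"],
    }
```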
## Usage

```bash
# Capture metrics during a workflow burst
python tools/benchmark_baseline.py --duration 1800 --interval 30

# Analyze results
python tools/analyze_baseline.py reports/baseline/baseline-metrics-*.json
```

## Notes

- Metrics files are timestamped: `baseline-metrics-YYYYMMDD-HHMMSS.json`
- The reports directory is excluded from git (see `.gitignore`), except for this README
- Production metrics live in the monitoring system (Prometheus/Grafana)
