Commit 84b32c0

feat: Add comprehensive multi-format documentation build system

- Added comprehensive build-docs.sh script with multi-format support (HTML, Markdown, PDF)
- Enhanced CI workflow for automated documentation generation and artifact upload
- Added LaTeX configuration for professional PDF generation
- Improved documentation generator with robust cleanup and validation
- Updated Sphinx configuration with proper warning suppression
- Added comprehensive error handling and dependency checking
- Implemented structured artifact packaging with manifest generation

Features:
- Automated cleanup of temporary files (.dot, build artifacts, cache)
- Multi-format documentation generation (HTML, Markdown, PDF)
- Professional CI/CD pipeline with artifact upload
- Comprehensive dependency validation
- Structured output with version tagging and manifest
- Robust error handling and fallback mechanisms

1 parent d1976c0 commit 84b32c0

390 files changed: +60582 -767 lines changed

.github/workflows/python-docs.yml

Lines changed: 197 additions & 19 deletions
@@ -1,43 +1,221 @@
-# Python Auto Documentation
-# This workflow auto-generates documentation using Sphinx
-name: Python Auto Documentation
+# Comprehensive Documentation Generation Pipeline
+# Generates HTML, Markdown, and PDF documentation with proper cleanup
+name: Documentation Generation & Deployment
 permissions:
   contents: read
+  actions: write
 
 on:
   push:
+    branches: [ main, develop, bugfix-* ]
     paths:
       - '**.py'
+      - 'doc/**'
+      - 'scripts/generate-docs.py'
   pull_request:
     paths:
       - '**.py'
+      - 'doc/**'
+      - 'scripts/generate-docs.py'
+
+env:
+  PYTHON_VERSION: '3.11'
+  DOCS_SOURCE: 'doc/codeDocs'
+  DOCS_OUTPUT: 'documentation-artifacts'
 
 jobs:
-  build-docs:
+  build-documentation:
+    name: Build Multi-Format Documentation
     runs-on: ubuntu-latest
+    outputs:
+      docs-version: ${{ steps.version.outputs.version }}
+
     steps:
-      - name: Checkout code
+      - name: Checkout repository
         uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
       - name: Setup Python Environment
         uses: ./.github/actions/setup-python-env
         with:
-          python-version: '3.11'
+          python-version: ${{ env.PYTHON_VERSION }}
           install-dev-reqs: 'false'
           install-docs-reqs: 'true'
+
       - name: Install system dependencies
         run: |
-          sudo apt-get update
-          sudo apt-get install -y graphviz
-      - name: Generate enhanced documentation
-        run: |
-          # Run our enhanced documentation generator
-          python scripts/generate-docs.py --src src --output doc/codeDocs
-          # Generate API documentation, excluding conf.py
-          sphinx-apidoc -o doc/codeDocs/ src/ --force --no-toc --module-first
-          # Build the enhanced documentation
-          sphinx-build -W -b html doc/codeDocs/ doc/codeDocs/_build/html
-      - name: Upload documentation artifact
+          sudo apt-get update -qq
+          sudo apt-get install -y --no-install-recommends \
+            graphviz \
+            pandoc \
+            texlive-latex-recommended \
+            texlive-fonts-recommended \
+            texlive-latex-extra \
+            latexmk
+
+      - name: Set documentation version
+        id: version
+        run: |
+          VERSION="$(date +'%Y.%m.%d')-$(git rev-parse --short HEAD)"
+          echo "version=$VERSION" >> $GITHUB_OUTPUT
+          echo "Documentation version: $VERSION"
+
+      - name: Clean previous builds
+        run: |
+          echo "🧹 Cleaning previous documentation builds..."
+          rm -rf doc/codeDocs/_build/ || true
+          rm -rf doc/codeDocs/_static/diagrams/*.png || true
+          rm -rf doc/codeDocs/_static/diagrams/*.dot || true
+          find doc/codeDocs/ -name '*.rst' -not -name 'index.rst' -not -name 'overview.rst' -delete || true
+          mkdir -p ${{ env.DOCS_OUTPUT }}
+
+      - name: Build comprehensive documentation
+        run: |
+          cd ${{ github.workspace }}
+          ./scripts/build-docs.sh
+
+          # Update manifest with CI information
+          if [ -f documentation-output/manifest.json ]; then
+            python3 -c "import json; manifest = json.load(open('documentation-output/manifest.json')); manifest.update({'ci_run_number': '${{ github.run_number }}', 'ci_sha': '${{ github.sha }}', 'ci_ref': '${{ github.ref }}', 'repository': '${{ github.repository }}'}); json.dump(manifest, open('documentation-output/manifest.json', 'w'), indent=2)"
+          fi
+
+      - name: Build HTML documentation
+        run: |
+          echo "🌐 Building HTML documentation..."
+          cd ${{ env.DOCS_SOURCE }}
+          sphinx-build -W -b html . _build/html
+          echo "HTML documentation built successfully"
+
+      - name: Build Markdown documentation
+        run: |
+          echo "📝 Building Markdown documentation..."
+          cd ${{ env.DOCS_SOURCE }}
+          sphinx-build -b markdown . _build/markdown
+          # Create comprehensive README
+          cat > _build/markdown/README.md << 'EOF'
+          # unstructuredDataHandler Documentation
+
+          **Version:** ${{ steps.version.outputs.version }}
+          **Generated:** $(date -u '+%Y-%m-%d %H:%M:%S UTC')
+          **Repository:** ${{ github.repository }}
+          **Branch:** ${{ github.ref_name }}
+
+          This directory contains the complete documentation in Markdown format.
+
+          ## Navigation
+
+          - [Main Documentation](index.md) - Start here
+          - [System Overview](overview.md) - Architecture and design
+          - [API Reference](modules.md) - Complete API documentation
+
+          ## Module Documentation
+
+          EOF
+          find _build/markdown -name '*.md' -not -name 'README.md' | sort | while read file; do
+            basename="$(basename "$file" .md)"
+            echo "- [$basename]($file)" >> _build/markdown/README.md
+          done
+          echo "Markdown documentation built successfully"
+
+      - name: Build PDF documentation
+        run: |
+          echo "📄 Building PDF documentation..."
+          cd ${{ env.DOCS_SOURCE }}
+          sphinx-build -b latex . _build/latex
+          cd _build/latex
+          # Build PDF with error handling
+          make all-pdf || {
+            echo "⚠️ PDF generation failed, creating fallback PDF from HTML"
+            cd ../html
+            # Fallback: convert HTML to PDF using pandoc
+            find . -name '*.html' -exec basename {} .html \; | head -1 | xargs -I {} \
+              pandoc {}.html -o ../../_build/unstructuredDataHandler-docs.pdf --pdf-engine=xelatex || \
+              echo "⚠️ PDF generation skipped - LaTeX not fully configured"
+          }
+          echo "PDF documentation processing completed"
+
+      - name: Package documentation artifacts
+        run: |
+          echo "📦 Packaging documentation artifacts..."
+          cd ${{ env.DOCS_SOURCE }}/_build
+
+          # HTML Documentation
+          if [ -d "html" ]; then
+            tar -czf "$GITHUB_WORKSPACE/${{ env.DOCS_OUTPUT }}/html-docs-${{ steps.version.outputs.version }}.tar.gz" -C html .
+            cp -r html "$GITHUB_WORKSPACE/${{ env.DOCS_OUTPUT }}/html/"
+          fi
+
+          # Markdown Documentation
+          if [ -d "markdown" ]; then
+            tar -czf "$GITHUB_WORKSPACE/${{ env.DOCS_OUTPUT }}/markdown-docs-${{ steps.version.outputs.version }}.tar.gz" -C markdown .
+            cp -r markdown "$GITHUB_WORKSPACE/${{ env.DOCS_OUTPUT }}/markdown/"
+          fi
+
+          # PDF Documentation
+          if [ -f "latex/unstructureddatahandler.pdf" ]; then
+            cp "latex/unstructureddatahandler.pdf" "$GITHUB_WORKSPACE/${{ env.DOCS_OUTPUT }}/unstructuredDataHandler-docs-${{ steps.version.outputs.version }}.pdf"
+          elif [ -f "unstructuredDataHandler-docs.pdf" ]; then
+            cp "unstructuredDataHandler-docs.pdf" "$GITHUB_WORKSPACE/${{ env.DOCS_OUTPUT }}/"
+          fi
+
+          # Create manifest
+          cat > "$GITHUB_WORKSPACE/${{ env.DOCS_OUTPUT }}/manifest.json" << EOF
+          {
+            "version": "${{ steps.version.outputs.version }}",
+            "generated_at": "$(date -u -Iseconds)",
+            "repository": "${{ github.repository }}",
+            "branch": "${{ github.ref_name }}",
+            "commit": "${{ github.sha }}",
+            "formats": {
+              "html": "html/",
+              "markdown": "markdown/",
+              "pdf": "unstructuredDataHandler-docs-${{ steps.version.outputs.version }}.pdf"
+            }
+          }
+          EOF
+
+          echo "📊 Documentation packaging summary:"
+          ls -la "$GITHUB_WORKSPACE/${{ env.DOCS_OUTPUT }}/"
+
+      - name: Cleanup temporary files
+        run: |
+          echo "🧹 Cleaning up temporary files..."
+          cd ${{ env.DOCS_SOURCE }}
+          # Remove build artifacts but keep source
+          rm -rf _build/doctrees/ || true
+          rm -rf _build/latex/*.aux _build/latex/*.log _build/latex/*.out _build/latex/*.toc || true
+          find _static/diagrams/ -name '*.dot' -delete || true
+          # Clean Python cache
+          find . -type d -name '__pycache__' -exec rm -rf {} + || true
+          find . -name '*.pyc' -delete || true
+          echo "Cleanup completed"
+
+      - name: Upload HTML Documentation
+        uses: actions/upload-artifact@v4
+        with:
+          name: html-documentation
+          path: ${{ env.DOCS_OUTPUT }}/html/
+          retention-days: 90
+
+      - name: Upload Markdown Documentation
+        uses: actions/upload-artifact@v4
+        with:
+          name: markdown-documentation
+          path: ${{ env.DOCS_OUTPUT }}/markdown/
+          retention-days: 90
+
+      - name: Upload PDF Documentation
+        uses: actions/upload-artifact@v4
+        if: hashFiles('documentation-artifacts/*.pdf') != ''
+        with:
+          name: pdf-documentation
+          path: ${{ env.DOCS_OUTPUT }}/*.pdf
+          retention-days: 90
+
+      - name: Upload Complete Documentation Archive
         uses: actions/upload-artifact@v4
         with:
-          name: sphinx-docs
-          path: doc/codeDocs/_build/html
+          name: complete-documentation-${{ steps.version.outputs.version }}
+          path: ${{ env.DOCS_OUTPUT }}/
+          retention-days: 90
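
The "Build comprehensive documentation" step merges CI metadata into `documentation-output/manifest.json` with a single-line `python3 -c` command. A more readable equivalent might look like the sketch below. The helper name `update_manifest` is hypothetical, and reading the values from the runner's default `GITHUB_*` environment variables (instead of `${{ }}` interpolation) is an assumption made for illustration; neither is part of this commit.

```python
import json
import os
from pathlib import Path


def update_manifest(path, env=None):
    """Merge CI metadata into an existing manifest file (hypothetical helper).

    Returns the updated manifest dict, or None when the file does not exist.
    """
    env = os.environ if env is None else env
    manifest_path = Path(path)
    if not manifest_path.is_file():
        # Mirrors the `[ -f ... ]` guard in the workflow step: nothing to update.
        return None

    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    manifest.update(
        {
            # Same keys as the inline one-liner; values come from default
            # GitHub Actions environment variables (an assumption here).
            "ci_run_number": env.get("GITHUB_RUN_NUMBER", ""),
            "ci_sha": env.get("GITHUB_SHA", ""),
            "ci_ref": env.get("GITHUB_REF", ""),
            "repository": env.get("GITHUB_REPOSITORY", ""),
        }
    )
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest


if __name__ == "__main__":
    update_manifest("documentation-output/manifest.json")
```

Saved as a script, this could be called from the step in place of the one-liner, which also avoids nesting workflow expressions inside a quoted Python string.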

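
The "Build Markdown documentation" step appends one link per generated file to `_build/markdown/README.md`. Since `find _build/markdown` prints paths that begin with `_build/markdown/`, the links it writes appear to be relative to the Sphinx source directory and not to the README's own directory. A minimal sketch of generating README-relative links follows; `module_links` is a hypothetical helper, not something defined in this commit.

```python
from pathlib import Path


def module_links(markdown_dir):
    """Return Markdown list items for every .md file under markdown_dir.

    Link targets are relative to markdown_dir, where README.md lives.
    """
    root = Path(markdown_dir)
    links = []
    # Sort on the POSIX string form so the order is stable across platforms.
    for md_file in sorted(root.rglob("*.md"), key=lambda p: p.as_posix()):
        if md_file.name == "README.md":
            continue  # same exclusion as `-not -name 'README.md'`
        rel = md_file.relative_to(root).as_posix()
        links.append(f"- [{md_file.stem}]({rel})")
    return links
```

The returned lines could then be appended to the README, for example with `readme.write_text(readme.read_text() + "\n".join(module_links(root)) + "\n")`.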
doc/CodeDocs/_static/diagrams/agents_deepagent_calls.dot

Lines changed: 0 additions & 44 deletions
This file was deleted.

doc/CodeDocs/_static/diagrams/app_calls.dot

Lines changed: 0 additions & 8 deletions
This file was deleted.

doc/CodeDocs/_static/diagrams/architecture.dot

Lines changed: 0 additions & 27 deletions
This file was deleted.

doc/CodeDocs/_static/diagrams/llm_base_calls.dot

Lines changed: 0 additions & 26 deletions
This file was deleted.

doc/CodeDocs/_static/diagrams/parsers_base_parser_calls.dot

Lines changed: 0 additions & 17 deletions
This file was deleted.
