Architecture documentation analysis extension plugin for Markitdown

A lot of information is contained in the images and diagrams of an architecture description for an IT system.

This plugin for markitdown supports the architectural analysis of documents. The main difference from the standard behavior is that images and drawings are not ignored: the plugin uses multimodal LLM calls to describe them and adds the descriptions as footnotes to the extracted markdown.
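The footnote idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not the plugin's actual code: `add_image_footnotes`, the `[^imgN]` label scheme, and the `describe` callback (a stand-in for the multimodal LLM call) are all hypothetical names.

```python
import re
from typing import Callable

# Matches Markdown images of the form ![alt](target); the target may be a path or a data URI.
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def add_image_footnotes(markdown: str, describe: Callable[[str], str]) -> str:
    """Put a footnote reference after each image and append the descriptions at the end."""
    notes: list[str] = []

    def annotate(match: re.Match) -> str:
        index = len(notes) + 1
        # `describe` is a hypothetical stand-in for the multimodal LLM call.
        notes.append(f"[^img{index}]: {describe(match.group(2))}")
        return f"{match.group(0)}[^img{index}]"

    body = IMAGE_PATTERN.sub(annotate, markdown)
    if not notes:
        return body
    return body.rstrip("\n") + "\n\n" + "\n".join(notes) + "\n"
```

In a real setup, `describe` would send the image to a multimodal model and return its architectural description; here any function that maps an image target to a string can be passed in.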

Features:

[.docx, .pdf] architectural descriptions of the images contained in a document are added as footnotes
[.docx] the standard behavior is changed so that data URIs are kept and can be processed by the image description extension
[.pdf] (planned) enhanced PDF handling using the new layout and extraction capabilities of pymupdf
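To illustrate why keeping data URIs matters, the sketch below shows how embedded images could be recovered from the extracted markdown for later description. It assumes base64-encoded `data:image/...` URIs; the function name `extract_data_uri_images` and the pattern are hypothetical and not taken from the plugin.

```python
import base64
import re

# Assumption: images are embedded as base64 data URIs, e.g. data:image/png;base64,AAAA...
DATA_URI_PATTERN = re.compile(r"data:(image/[\w.+-]+);base64,([A-Za-z0-9+/=]+)")


def extract_data_uri_images(markdown: str) -> list[tuple[str, bytes]]:
    """Return (mime type, raw bytes) for every base64 image data URI found in the text."""
    return [
        (match.group(1), base64.b64decode(match.group(2)))
        for match in DATA_URI_PATTERN.finditer(markdown)
    ]
```

If the converter dropped or truncated the data URIs, there would be no image bytes left to hand to the description step.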

Installation

python3 -m venv .venv
. .venv/bin/activate
pip3 install --upgrade pip
pip3 install -e ."[all]"

For development, to do unit testing:

pip3 install -e ."[dev]"

Usage

Show plugins:

. .venv/bin/activate
markitdown -x markitdown-arch-plugin --list-plugins > /dev/stdout

The output should contain the line

  * arch_plugin         (package: markitdown_arch_plugin)

Running with the plugin enabled:

markitdown -x markitdown-arch-plugin --use-plugins _source_.docx > _target_.md

Tips for Confluence export

Do not export the full Confluence space, but pick root nodes with important content subtrees.
Export to Word works best: unnecessary content can be removed easily in docx format, which is difficult with pdf.
Recommended export parameters:
- Choose "External links only"
- Uncheck the table of contents ("Inhaltsverzeichnis") macro

tsdicloud/markitdown-arch-plugin

Folders and files

Latest commit

History

Repository files navigation

Architecture documentation analysis extension plugin for Markitdown

Installation

Usage

Tips for Confluence export

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages