
A #markitdown-plugin specialized for technical specifications, for example enhanced Markdown with image descriptions and more


tsdicloud/markitdown-arch-plugin


Architecture documentation analysis extension plugin for Markitdown

Much of the information in an architecture description of an IT system is contained in its images and diagrams.

This markitdown plugin supports the architectural analysis of documents. Unlike the standard conversion, it does not ignore images and drawings: it uses multimodal LLM calls to generate image descriptions and adds them as footnotes to the extracted markdown.

Features:

  • [.docx, .pdf] for images contained in the documents, architectural descriptions are added as footnotes
  • [.docx] the standard behavior is changed so that data URIs are kept, to be processed by the image description extension
  • [.pdf] (planned) enhanced PDF handling using the new layout and extraction capabilities of pymupdf
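
The footnote feature can be pictured with a minimal sketch. This is not the plugin's actual implementation: the function name `add_image_footnotes`, the footnote label scheme `img1`, `img2`, ... and the `describe` callback (a stand-in for the multimodal LLM call) are all assumptions made for illustration. The sketch also assumes that the embedded data URI is dropped and only the alt text is kept next to the footnote reference.

```python
import re
from typing import Callable, List

# Matches markdown images whose target is a base64 data URI,
# e.g. ![alt](data:image/png;base64,AAAA)
DATA_URI_IMAGE = re.compile(
    r"!\[(?P<alt>[^\]]*)\]\((?P<uri>data:image/[^;)]+;base64,[^)]+)\)"
)


def add_image_footnotes(markdown: str, describe: Callable[[str], str]) -> str:
    """Replace each data-URI image with a footnote reference and
    append the generated descriptions as footnotes.

    `describe` is a hypothetical callback that receives the data URI
    and returns a textual description (in practice an LLM call).
    """
    footnotes: List[str] = []

    def replace(match: "re.Match[str]") -> str:
        number = len(footnotes) + 1
        footnotes.append(f"[^img{number}]: {describe(match.group('uri'))}")
        # Keep the alt text (or a generic label) in place of the image.
        alt = match.group("alt") or f"image {number}"
        return f"{alt}[^img{number}]"

    body = DATA_URI_IMAGE.sub(replace, markdown)
    if not footnotes:
        return markdown  # nothing to describe, leave the text untouched
    return body.rstrip("\n") + "\n\n" + "\n".join(footnotes) + "\n"
```

Passing the description step in as a callback keeps the text rewriting testable without an LLM: any function that maps a data URI to a string can be used in its place.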

Installation

python3 -m venv .venv
. .venv/bin/activate
pip3 install --upgrade pip
pip3 install -e ."[all]"

For development and unit testing:

pip3 install -e ."[dev]"

Usage

Show plugins:

. .venv/bin/activate
markitdown -x markitdown-arch-plugin --list-plugins > /dev/stdout

The output should contain the line

  * arch_plugin         (package: markitdown_arch_plugin)

Running with the plugin enabled:

markitdown -x markitdown-arch-plugin --use-plugins _source_.docx > _target_.md

Tips for Confluence export

  • Do not export the full Confluence space, but pick root nodes whose subtrees contain the important content.
  • Export to Word works best: unnecessary content is easy to remove in the docx format, but difficult to remove in a PDF.
  • Recommended export parameters:
    • Choose "External links only"
    • Uncheck the Table of Contents (Inhaltsverzeichnis) macro
