Discovered: Dec 14, 2024 21:17 microsoft/markitdown: Python tool for converting files and office documents (e.g. PDF, Word, Excel, images, audio with speech transcription, HTML including Wikipedia, CSV, JSON, XML) to Markdown. Via Daring Fireball, via MarkItDown: Python Tool for Converting Files and Office Documents to Markdown <– sounds great!

QUOTE:

The MarkItDown library is a utility tool for converting various files to Markdown (e.g., for indexing, text analysis, etc.)

It presently supports: PDF (.pdf), PowerPoint (.pptx), Word (.docx), Excel (.xlsx), Images (EXIF metadata, and OCR), Audio (EXIF metadata, and speech transcription), HTML (special handling of Wikipedia, etc.), Various other text-based formats (csv, json, xml, etc.)
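To make the quoted description concrete, here is a minimal sketch of calling the library from Python. It assumes the `markitdown` package is installed and relies on the `MarkItDown().convert(...)` call and `text_content` attribute shown in the project's README; the wrapper name `convert_to_markdown` and its `converter` parameter are hypothetical helpers added for illustration, not part of the library.

```python
import os


def convert_to_markdown(path, converter=None):
    """Return the Markdown text for the file at `path`.

    `converter` is a hypothetical injection point (not a MarkItDown feature):
    any object with a `convert(source)` method whose result exposes
    `text_content` will work, which keeps the helper easy to test.
    """
    if converter is None:
        # Imported lazily so this module still loads when the
        # markitdown package is not installed.
        from markitdown import MarkItDown

        converter = MarkItDown()

    # Accept both plain strings and pathlib.Path objects.
    result = converter.convert(os.fspath(path))
    return result.text_content
```

Calling `convert_to_markdown("report.xlsx")` (a made-up file name) should hand back the spreadsheet as Markdown text, ready for indexing or text analysis.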
