markitdown
Convert files and office documents to Markdown — supporting PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPubs, and more.
Convert any document or file format to clean Markdown
Trigger phrases
Phrases that activate this skill when typed to Claude Code:
- convert this PDF to Markdown
- extract text from this document
- markitdown convert
- turn this Word file into Markdown
- extract content from PowerPoint
What it does
markitdown is a Claude Code skill from K-Dense AI’s scientific-agent-skills repo. It turns Claude into a document conversion specialist using Microsoft’s MarkItDown library — converting PDF, DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, ZIP, YouTube URLs, and EPub files to clean Markdown. For images it applies OCR to extract text; for audio files it applies transcription.
A session produces a Markdown file from the input document, preserving heading structure, tables, lists, and code blocks where the source format supports them — ready for processing by Claude or other text-based tools.
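For orientation, here is a minimal sketch of the kind of conversion the skill drives, using MarkItDown's Python API. It assumes the `markitdown` package is installed. The helper names `markdown_path_for` and `convert_file`, and the rule of writing the output next to the source, are hypothetical and not part of the skill or the library.

```python
from pathlib import Path


def markdown_path_for(source: str) -> Path:
    """Return the sibling .md path for a source file (hypothetical naming rule)."""
    return Path(source).with_suffix(".md")


def convert_file(source: str) -> Path:
    """Convert one local file to Markdown and write it next to the source."""
    # Imported lazily so the path helper stays usable without the library.
    from markitdown import MarkItDown

    converter = MarkItDown()
    result = converter.convert(source)  # the library picks a converter for the input
    out_path = markdown_path_for(source)
    out_path.write_text(result.text_content, encoding="utf-8")
    return out_path
```

Calling `convert_file("reports/q3.pdf")` would write `reports/q3.md`, assuming the input file exists and the PDF dependencies are installed.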
When to use it
Reach for it when:
- You have a PDF, Word document, or PowerPoint that you want to process with Claude and need the content as plain text
- You’re building a document ingestion pipeline that needs to normalize heterogeneous file formats to a single Markdown representation
- You want to extract and structure content from office documents for downstream analysis or search indexing
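The ingestion-pipeline case above can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's own implementation: `SUPPORTED_SUFFIXES`, `find_documents`, `markitdown_convert`, and `ingest` are hypothetical names, and the suffix list is only a sample. The converter is passed in as a callable so the batch logic can be exercised without MarkItDown installed.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Assumed sample of extensions to pick up; adjust to what your pipeline handles.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".pptx", ".xlsx", ".html", ".csv", ".json", ".xml", ".epub"}


def find_documents(root: Path, suffixes: Iterable[str] = SUPPORTED_SUFFIXES) -> List[Path]:
    """Recursively list files under root whose extension is in suffixes."""
    wanted = {s.lower() for s in suffixes}
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix.lower() in wanted)


def markitdown_convert(path: Path) -> str:
    """Default converter: run MarkItDown on one file and return its Markdown."""
    from markitdown import MarkItDown  # lazy import; only needed for real runs

    return MarkItDown().convert(str(path)).text_content


def ingest(root: Path, convert: Callable[[Path], str] = markitdown_convert) -> Dict[str, str]:
    """Map each document's path (relative to root) to its Markdown text."""
    corpus = {}
    for path in find_documents(root):
        try:
            corpus[path.relative_to(root).as_posix()] = convert(path)
        except Exception as exc:  # one unreadable file should not stop the batch
            print(f"skipped {path}: {exc}")
    return corpus
```

A real pipeline would likely reuse a single `MarkItDown` instance across files rather than creating one per call; the per-call version is kept here only for brevity.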
When not to reach for it:
- You need to create or edit Word/PowerPoint files — MarkItDown is read-only (extraction only)
- The document is image-heavy and its visual content matters: OCR extracts text but doesn’t reconstruct the visual layout
Install
Copy the SKILL.md from K-Dense AI’s markitdown folder into .claude/skills/markitdown/ in your project, then install the library with pip install markitdown. OCR and audio transcription require additional optional dependencies.
What a session looks like
A typical session has three phases:
- Input specification. Provide the file path, URL, or YouTube link. Claude detects the format and selects the appropriate MarkItDown converter — no manual format specification needed for most common formats.
- Conversion. MarkItDown runs the conversion, handling multi-page PDFs, embedded images (OCR), tables (converted to Markdown table syntax), and hyperlinks. Claude reports on any conversion warnings or elements that couldn’t be extracted cleanly.
- Output. The Markdown text is returned or saved to a file. For large documents, Claude provides a structural summary (section headings, table count) to help navigate the extracted content.
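The structural summary mentioned in the output phase (section headings, table count) can be approximated with plain string processing over the extracted Markdown. The sketch below is one possible approach, not what the skill necessarily does: `summarize_markdown` and `is_table_separator` are hypothetical names, and tables are counted by their header separator rows (such as `| --- | --- |`).

```python
import re

FENCE = "`" * 3  # code-fence marker, built this way to avoid a literal fence here
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def is_table_separator(line: str) -> bool:
    """True for a Markdown table separator row; a bare '---' rule is not one."""
    stripped = line.strip()
    if "|" not in stripped:
        return False
    cells = [cell.strip() for cell in stripped.strip("|").split("|")]
    return all(re.fullmatch(r":?-+:?", cell) for cell in cells)


def summarize_markdown(text: str) -> dict:
    """Collect (level, title) headings and count tables, skipping fenced code."""
    headings, tables, in_fence = [], 0, False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
        elif is_table_separator(line):
            tables += 1
    return {"headings": headings, "table_count": tables}
```

Counting separator rows rather than pipe-delimited lines keeps a ten-row table from being counted ten times.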
Receipts
Where it works well:
- Academic PDF papers where the text layer is clean — MarkItDown extracts body text and preserves paragraph structure reliably for PDFs generated from LaTeX or modern word processors
- Excel/XLSX files — tables are converted to Markdown table syntax correctly, including multi-sheet workbooks where each sheet becomes a section
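For multi-sheet workbooks, the per-sheet sections can be separated again after conversion. The sketch below assumes each sheet appears under a level-2 heading (`## SheetName`) in the output; that heading level is an assumption to check against your MarkItDown version, and `split_sheets` is a hypothetical helper.

```python
def split_sheets(markdown: str, marker: str = "## ") -> dict:
    """Split Markdown into {heading: body} chunks at lines starting with marker."""
    sheets, current = {}, None
    for line in markdown.splitlines():
        if line.startswith(marker):
            # A new section begins; its title is the text after the marker.
            current = line[len(marker):].strip()
            sheets[current] = []
        elif current is not None:
            sheets[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sheets.items()}
```

Text before the first matching heading is dropped, which suits workbook output where every table belongs to a sheet.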
Where it backfires:
- Scanned PDFs without a text layer — OCR quality depends on scan resolution and language, and complex scientific notation or tables often extract incorrectly
- Heavily formatted PowerPoint slides where visual layout carries the meaning — extracted text loses the spatial relationships between elements
Pattern that works: for complex PDFs, extract to Markdown first and verify the heading structure before processing — MarkItDown uses font size heuristics for heading detection that occasionally promote body text to headings in PDFs with unusual typography.
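One way to run the heading check described above is to flag headings that read like body text before any further processing. The rules and thresholds below (word count, trailing punctuation, lowercase first letter) are illustrative assumptions, and `suspicious_headings` is a hypothetical helper; tune both to your documents.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def suspicious_headings(markdown: str, max_words: int = 12) -> list:
    """Return (line_number, text, reason) for headings that read like body text."""
    flagged = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = HEADING_RE.match(line)
        if not match:
            continue
        text = match.group(2)
        if len(text.split()) > max_words:
            flagged.append((number, text, "too long"))
        elif text.endswith((".", ",", ";")):
            flagged.append((number, text, "ends like a sentence"))
        elif text[0].islower():
            flagged.append((number, text, "starts lowercase"))
    return flagged
```

An empty result does not prove the structure is right; it only means none of these simple rules fired, so a quick manual scan of the heading list is still worthwhile.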
Source and attribution
Originally authored by K-Dense Inc. The canonical SKILL.md lives in the markitdown folder of their public scientific-agent-skills repository.
License: MIT. Install, adapt, and redistribute with attribution preserved.
This page documents the skill from a practitioner’s perspective. For the formal spec and any updates, defer to the source repo.