markitdown

Convert files and office documents to Markdown — supporting PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPubs, and more.

Convert any document or file format to clean Markdown

Source: K-Dense AI
License: MIT
Receipts: firsthand ✓

Trigger phrases

Phrases that activate this skill when typed to Claude Code:

  • convert this PDF to Markdown
  • extract text from this document
  • markitdown convert
  • turn this Word file into Markdown
  • extract content from PowerPoint

What it does

markitdown is a Claude Code skill from K-Dense AI’s scientific-agent-skills repo. It turns Claude into a document conversion specialist using Microsoft’s MarkItDown library — converting PDF, DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, ZIP, YouTube URLs, and EPub files to clean Markdown. For images it applies OCR to extract text; for audio files it applies transcription.

A session produces a Markdown file from the input document, preserving heading structure, tables, lists, and code blocks where the source format supports them — ready for processing by Claude or other text-based tools.

When to use it

Reach for it when:

  • A source document is too large for WebFetch’s 10MB content cap (model cards, filings, decks) — convert it locally and read the Markdown instead
  • You have a PDF, Word document, or PowerPoint that you want to process with Claude and need the content as plain text
  • You’re building a document ingestion pipeline that needs to normalize heterogeneous file formats to a single Markdown representation
  • You want to extract and structure content from office documents for downstream analysis or search indexing
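
The ingestion-pipeline case can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: `normalize_tree`, `markitdown_converter`, and the `SUPPORTED` set are hypothetical names, and the converter is passed in as a callable so the walk can be exercised without MarkItDown installed. The only library calls assumed are `MarkItDown().convert(path)` and the `text_content` attribute of its result.

```python
from pathlib import Path
from typing import Callable

# Illustrative subset of the formats listed above; adjust to your corpus.
SUPPORTED = {".pdf", ".docx", ".pptx", ".xlsx", ".html", ".csv", ".json", ".xml", ".epub"}


def markitdown_converter() -> Callable[[Path], str]:
    """Return a converter backed by MarkItDown, imported lazily."""
    # Requires `pip install markitdown` plus the extras for your formats.
    from markitdown import MarkItDown

    md = MarkItDown()
    return lambda path: md.convert(str(path)).text_content


def normalize_tree(src: Path, dst: Path, convert: Callable[[Path], str]) -> list[Path]:
    """Convert every supported file under src into a Markdown file under dst."""
    written = []
    for path in sorted(src.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in SUPPORTED:
            continue
        rel = path.relative_to(src)
        # Keep the original extension in the name so a.pdf and a.docx do not collide.
        out = dst / rel.with_name(rel.name + ".md")
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(convert(path), encoding="utf-8")
        written.append(out)
    return written
```

A call such as `normalize_tree(Path("raw"), Path("markdown"), markitdown_converter())` would mirror the source tree, with the directory names here being placeholders.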

When not to reach for it:

  • You need to create or edit Word/PowerPoint files — MarkItDown is read-only (extraction only)
  • The document is image-heavy and the visual content matters: OCR extracts text but doesn’t reconstruct the visual layout

Install

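Copy SKILL.md from K-Dense AI’s markitdown folder into .claude/skills/markitdown/ in your project, then install the library with pip install markitdown. Format support comes from optional extras: the receipts below used markitdown[pdf] for PDF conversion, and OCR and audio transcription require additional optional dependencies.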

What a session looks like

A typical session has three phases:

  1. Input specification. Provide the file path, URL, or YouTube link. Claude detects the format and selects the appropriate MarkItDown converter — no manual format specification needed for most common formats.
  2. Conversion. MarkItDown runs the conversion, handling multi-page PDFs, embedded images (OCR), tables (converted to Markdown table syntax), and hyperlinks. Claude reports on any conversion warnings or elements that couldn’t be extracted cleanly.
  3. Output. The Markdown text is returned or saved to a file. For large documents, Claude provides a structural summary (section headings, table count) to help navigate the extracted content.
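
The structural summary in phase 3 could look something like the sketch below. It is an assumption about what such a summary might contain, not what the skill actually emits: `structural_summary` is a hypothetical helper that collects ATX headings and counts runs of pipe-delimited rows as tables, skipping anything inside fenced code.

```python
import re


def structural_summary(markdown: str) -> dict:
    """Collect (level, title) headings and count pipe tables in Markdown text."""
    headings = []
    table_count = 0
    in_table = False
    in_fence = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            # Toggle fenced-code state; fence contents are never summarized.
            in_fence = not in_fence
            in_table = False
            continue
        if in_fence:
            continue
        match = re.match(r"^(#{1,6})\s+(.*\S)", stripped)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
        # Heuristic: a run of consecutive |...| lines counts as one table.
        is_row = len(stripped) > 1 and stripped.startswith("|") and stripped.endswith("|")
        if is_row and not in_table:
            table_count += 1
        in_table = is_row
    return {"headings": headings, "tables": table_count}
```

Feeding it the `text_content` of a conversion result gives a quick outline to check before reading a long extraction in full.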

Receipts

Firsthand, 2026-05-31 — MarkItDown 0.1.6, markitdown[pdf], Python 3.14.2, isolated venv, offline (no OCR/transcription).

The test case was a real dead end. During a 15-agent Opus 4.8 fact-check, a verifier couldn’t confirm the system card’s “code summary honesty 3.7%” figure: WebFetch caps fetched content at 10MB and the system card is 19.5MB, so the claim shipped unverified. MarkItDown converted the PDF locally:

  • Input: 20,430,397 bytes (~19.5MB), PDF 1.4, 124+ pages (the Opus 4.8 system card).
  • Convert: 66 s on the pdfminer backend, exit code 0.
  • Output: 6,355 lines / 64,335 words / 434,861 chars of Markdown; tables preserved as pipe tables (e.g. | Without thinking | 10.8% | 3.7% | 3.1% | 2.5% |).
  • Payoff: the previously unverifiable claim came back quotable from the primary source: “…fails to raise the important events to the user only 3.7% of the time, down 5-fold from Mythos Preview…”, which pinned a comparison baseline the workflow couldn’t find.
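
Once the Markdown exists, checking a figure like the one above is a plain text search. The sketch below shows one way to do it; `find_mentions` and `parse_pipe_row` are hypothetical helpers, and the sample row is the one quoted in the output bullet (its column meanings are not given here, so the code treats cells as opaque strings).

```python
def find_mentions(markdown: str, needle: str, context: int = 1) -> list[tuple[int, str]]:
    """Return (1-based line number, snippet) for each line containing needle."""
    lines = markdown.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if needle in line:
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            hits.append((i + 1, "\n".join(lines[lo:hi])))
    return hits


def parse_pipe_row(row: str) -> list[str]:
    """Split one Markdown pipe-table row into stripped cell values."""
    return [cell.strip() for cell in row.strip().strip("|").split("|")]


row = "| Without thinking | 10.8% | 3.7% | 3.1% | 2.5% |"
cells = parse_pipe_row(row)  # first cell is the row label, the rest are values
```

Searching for the literal figure first and then reading the surrounding lines keeps the quote tied to its context in the primary document.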

Gotchas hit:

  • stderr floods with non-fatal Could not get FontBBox warnings; redirect it (2>/dev/null), or a successful run reads like a crash.
  • On Python 3.14, pip install prints harmless Cache entry deserialization failed warnings.
  • markitdown[all] risks missing prebuilt wheels on a new interpreter; [pdf] installs clean.
  • OCR and audio transcription require an LLM client and send content out, so plain conversion of text-bearing files is the offline default.
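
For the stderr gotcha, one alternative to discarding stderr entirely is to judge success by exit code and set aside only the known-harmless lines. A minimal sketch, assuming the `markitdown` command is on PATH and using its `-o` output option; `split_stderr` and `run_markitdown` are hypothetical helper names, and the marker string is the warning text quoted above.

```python
import subprocess

# Substrings that identify warnings observed to be non-fatal.
NOISE_MARKERS = ("Could not get FontBBox",)


def split_stderr(stderr: str, noise=NOISE_MARKERS):
    """Separate known-harmless warning lines from everything else."""
    noisy, other = [], []
    for line in stderr.splitlines():
        bucket = noisy if any(marker in line for marker in noise) else other
        bucket.append(line)
    return noisy, other


def run_markitdown(src: str, dst: str):
    """Run the markitdown CLI and report exit code, noise count, and other stderr lines."""
    proc = subprocess.run(
        ["markitdown", src, "-o", dst],
        capture_output=True,
        text=True,
    )
    noisy, other = split_stderr(proc.stderr)
    return proc.returncode, len(noisy), other
```

With this shape, a return code of 0 plus an empty third element indicates a clean run even when hundreds of font warnings were printed.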

Not yet tested firsthand: scanned (no-text-layer) PDFs, image-OCR quality, and layout-heavy PPTX. MarkItDown leans on font-size heuristics for heading detection, so verify heading structure on unusual typography before trusting it.

Full write-up: MarkItDown read the 19MB PDF WebFetch wouldn’t.

Source and attribution

Originally authored by K-Dense Inc. The canonical SKILL.md lives in the markitdown folder of their public scientific-agent-skills repository.

License: MIT. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner’s perspective. For the formal spec and any updates, defer to the source repo.