# markitdown

> Convert files and office documents to Markdown — supporting PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPubs, and more.

**Use case**: Convert any document or file format to clean Markdown

**Canonical URL**: https://agentcookbooks.com/skills/markitdown/

**Topics**: claude-code, skills, science

**Trigger phrases**: "convert this PDF to Markdown", "extract text from this document", "markitdown convert", "turn this Word file into Markdown", "extract content from PowerPoint"

**Source**: [K-Dense AI](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/markitdown)

**License**: MIT

---

## What it does

`markitdown` is a Claude Code skill from K-Dense AI's [scientific-agent-skills repo](https://github.com/K-Dense-AI/scientific-agent-skills). It turns Claude into a document conversion specialist using Microsoft's MarkItDown library — converting PDF, DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, ZIP, YouTube URLs, and EPub files to clean Markdown. For images it applies OCR to extract text; for audio files it applies transcription.

A session produces a Markdown file from the input document, preserving heading structure, tables, lists, and code blocks where the source format supports them — ready for processing by Claude or other text-based tools.
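
A minimal sketch of the conversion the skill drives, assuming the `markitdown` package is installed. It uses the library's `MarkItDown().convert()` call and the `text_content` attribute of the result. The helper names `convert_to_markdown` and `save_markdown` are illustrative and not part of the skill.

```python
from pathlib import Path


def convert_to_markdown(source: str) -> str:
    """Convert a local file path or URL to Markdown text."""
    # Imported lazily so this module still loads where markitdown is absent.
    from markitdown import MarkItDown

    converter = MarkItDown()
    result = converter.convert(source)
    return result.text_content


def save_markdown(markdown: str, destination: Path) -> Path:
    """Write Markdown text to disk, creating parent folders as needed."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_text(markdown, encoding="utf-8")
    return destination
```

With a hypothetical input file, a call would look like `save_markdown(convert_to_markdown("paper.pdf"), Path("out/paper.md"))`.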

## When to use it

Reach for it when:

- You have a PDF, Word document, or PowerPoint that you want to process with Claude and need the content as plain text
- You're building a document ingestion pipeline that needs to normalize heterogeneous file formats to a single Markdown representation
- You want to extract and structure content from office documents for downstream analysis or search indexing
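
The ingestion-pipeline case above could be sketched roughly as follows. This is one possible shape, not the skill's own implementation: `normalize_directory`, `markitdown_convert`, and the suffix list are hypothetical names chosen for illustration, and the converter is passed in as a callable so the pipeline can be exercised without the library installed.

```python
from pathlib import Path
from typing import Callable, Dict, List, Sequence, Tuple

# Illustrative subset of formats; adjust to whatever your pipeline ingests.
DEFAULT_SUFFIXES = (".pdf", ".docx", ".pptx", ".xlsx", ".html", ".csv")


def markitdown_convert(path: str) -> str:
    """Default converter; assumes the markitdown package is installed."""
    from markitdown import MarkItDown

    return MarkItDown().convert(path).text_content


def normalize_directory(
    input_dir: Path,
    output_dir: Path,
    convert: Callable[[str], str] = markitdown_convert,
    suffixes: Sequence[str] = DEFAULT_SUFFIXES,
) -> Tuple[List[Path], Dict[Path, str]]:
    """Convert every matching file under input_dir to a .md file in output_dir."""
    written: List[Path] = []
    failures: Dict[Path, str] = {}
    for source in sorted(input_dir.rglob("*")):
        if not source.is_file() or source.suffix.lower() not in suffixes:
            continue
        relative = source.relative_to(input_dir)
        # Keep the original suffix so report.pdf and report.docx do not collide.
        target = output_dir / relative.parent / (relative.name + ".md")
        try:
            markdown = convert(str(source))
        except Exception as exc:  # record the failure and keep going
            failures[source] = str(exc)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(markdown, encoding="utf-8")
        written.append(target)
    return written, failures
```

Returning the failures alongside the written paths lets a caller retry or route problem files elsewhere instead of aborting the whole batch.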

When *not* to reach for it:

- You need to create or edit Word/PowerPoint files — MarkItDown is read-only (extraction only)
- You're working with image-heavy documents where the visual content matters: OCR extracts text but doesn't reconstruct the visual layout

## Install

Copy the `SKILL.md` from K-Dense AI's [markitdown folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/markitdown) into `.claude/skills/markitdown/` in your project, then install the underlying Python library with `pip install markitdown`. OCR and audio transcription require additional optional dependencies.
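
If you prefer to script the setup, a small sketch like the one below copies a downloaded `SKILL.md` into the expected folder and checks whether the library is importable. The function names `install_skill` and `library_available` are hypothetical, and the location of the downloaded `SKILL.md` is up to you.

```python
import importlib.util
import shutil
from pathlib import Path


def install_skill(skill_md: Path, project_root: Path) -> Path:
    """Copy SKILL.md into <project_root>/.claude/skills/markitdown/."""
    target_dir = project_root / ".claude" / "skills" / "markitdown"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "SKILL.md"
    shutil.copyfile(skill_md, target)
    return target


def library_available() -> bool:
    """Return True if the markitdown package can be imported here."""
    return importlib.util.find_spec("markitdown") is not None
```

If `library_available()` returns `False`, run `pip install markitdown` in the same environment Claude Code uses.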

## What a session looks like

A typical session has three phases:

1. **Input specification.** Provide the file path, URL, or YouTube link. Claude detects the format and selects the appropriate MarkItDown converter — no manual format specification needed for most common formats.
2. **Conversion.** MarkItDown runs the conversion, handling multi-page PDFs, embedded images (OCR), tables (converted to Markdown table syntax), and hyperlinks. Claude reports on any conversion warnings or elements that couldn't be extracted cleanly.
3. **Output.** The Markdown text is returned or saved to a file. For large documents, Claude provides a structural summary (section headings, table count) to help navigate the extracted content.
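
The structural summary in the output phase can be approximated in a few lines. This is a sketch under stated assumptions, not what Claude runs internally: it treats ATX-style `#` lines as headings, counts one table per pipe-delimited separator row such as `| --- | --- |`, and skips fenced code blocks. The name `summarize_markdown` is illustrative.

```python
import re
from typing import Dict, List, Tuple

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the sketch readable


def is_table_delimiter(line: str) -> bool:
    """True for a table separator row such as '| --- | :---: |'."""
    stripped = line.strip()
    if "|" not in stripped:
        return False  # a bare '---' is a horizontal rule, not a table
    cells = [cell.strip() for cell in stripped.strip("|").split("|")]
    return all(re.fullmatch(r":?-+:?", cell) for cell in cells)


def summarize_markdown(markdown: str) -> Dict[str, object]:
    """Collect (level, title) headings and count tables outside code fences."""
    headings: List[Tuple[int, str]] = []
    table_count = 0
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
        elif is_table_delimiter(line):
            table_count += 1
    return {"headings": headings, "table_count": table_count}
```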

## Receipts

**Where it works well:**
- Academic PDF papers where the text layer is clean — MarkItDown extracts body text and preserves paragraph structure reliably for PDFs generated from LaTeX or modern word processors
- Excel/XLSX files — tables are converted to Markdown table syntax correctly, including multi-sheet workbooks where each sheet becomes a section
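
Because each sheet becomes its own section, the converted workbook can be split back into per-sheet chunks. The sketch below assumes each sheet is introduced by a level-2 heading (`## SheetName`); check that against your actual output, since the heading level is an assumption here and `split_sections` is a hypothetical helper.

```python
from typing import Dict, List, Optional


def split_sections(markdown: str, level: int = 2) -> Dict[str, str]:
    """Map each heading of the given level to the text that follows it."""
    prefix = "#" * level + " "
    sections: Dict[str, str] = {}
    current: Optional[str] = None
    buffer: List[str] = []
    for line in markdown.splitlines():
        if line.startswith(prefix):
            if current is not None:
                sections[current] = "\n".join(buffer).strip()
            current = line[len(prefix):].strip()
            buffer = []
        elif current is not None:
            buffer.append(line)
    if current is not None:
        sections[current] = "\n".join(buffer).strip()
    return sections
```

Text that appears before the first matching heading is ignored, and a repeated sheet name would overwrite the earlier entry.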

**Where it backfires:**
- Scanned PDFs without a text layer — OCR quality depends on scan resolution and language, and complex scientific notation or tables often extract incorrectly
- Heavily formatted PowerPoint slides where visual layout carries the meaning — extracted text loses the spatial relationships between elements
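
One rough way to catch the scanned-PDF case early is to compare the amount of extracted text with the page count. The heuristic below is an assumption, not a MarkItDown feature: the caller supplies `page_count` from another tool, and the default threshold of 100 non-whitespace characters per page is arbitrary and worth tuning on your own documents.

```python
def likely_missing_text_layer(
    text: str, page_count: int, min_chars_per_page: int = 100
) -> bool:
    """Flag output that is suspiciously sparse for the number of pages."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    visible = sum(1 for ch in text if not ch.isspace())
    return visible / page_count < min_chars_per_page
```

A `True` result suggests routing the file through a dedicated OCR step before relying on the extracted Markdown.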

**Pattern that works:** for complex PDFs, extract to Markdown first and verify the heading structure before processing — MarkItDown uses font size heuristics for heading detection that occasionally promote body text to headings in PDFs with unusual typography.
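
The heading check can be partly automated. The sketch below flags headings that look like promoted body text, using two illustrative heuristics (more than `max_words` words, or ending in sentence punctuation). Both rules and the name `suspicious_headings` are assumptions to adapt, and the check does not skip fenced code blocks.

```python
import re
from typing import List, Tuple

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def suspicious_headings(
    markdown: str, max_words: int = 12
) -> List[Tuple[int, str, List[str]]]:
    """Return (line number, heading text, reasons) for doubtful headings."""
    flagged = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        match = HEADING.match(line)
        if not match:
            continue
        text = match.group(2)
        reasons = []
        if len(text.split()) > max_words:
            reasons.append("long")
        if text.endswith((".", ",", ";")):
            reasons.append("ends with sentence punctuation")
        if reasons:
            flagged.append((number, text, reasons))
    return flagged
```

Reviewing only the flagged lines is usually faster than reading the whole outline, though a clean result does not guarantee the structure is right.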

## Source and attribution

Originally authored by [K-Dense Inc.](https://github.com/K-Dense-AI). The canonical SKILL.md lives in the [`markitdown` folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/markitdown) of their public scientific-agent-skills repository.

License: MIT. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner's perspective. For the formal spec and any updates, defer to the source repo.