# pyhealth

> Comprehensive healthcare AI toolkit for developing, testing, and deploying ML models with clinical data — covering EHR data (MIMIC-III/IV, eICU, OMOP), clinical prediction tasks (mortality, readmission, drug recommendation), and deep learning models for healthcare.

**Use case**: Train and evaluate ML models on EHR and clinical datasets

**Canonical URL**: https://agentcookbooks.com/skills/pyhealth/

**Topics**: claude-code, skills, science, clinical

**Trigger phrases**: "EHR machine learning", "MIMIC dataset", "clinical prediction model", "readmission prediction", "drug recommendation model"

**Source**: [K-Dense AI](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/pyhealth)

**License**: MIT

---

## What it does

`pyhealth` is a Claude Code skill from K-Dense AI's [scientific-agent-skills repo](https://github.com/K-Dense-AI/scientific-agent-skills). It turns Claude into a PyHealth expert for healthcare ML — covering EHR dataset loading and processing (MIMIC-III/IV, eICU, OMOP CDM), medical coding systems (ICD-9/10, NDC, ATC codes), clinical prediction tasks (in-hospital mortality, 30-day readmission, drug recommendation, length of stay), physiological signal processing (EEG, ECG), and deep learning model implementations (RETAIN, SafeDrug, Transformer-based clinical models, GNNs for drug-drug interactions).

A session produces a complete clinical ML pipeline: dataset loading from structured EHR data, task definition, model training, and evaluation with clinically appropriate metrics (AUROC, AUPRC).

## When to use it

Reach for it when:

- You're working with MIMIC-III/IV, eICU, or OMOP CDM data and need a standardized preprocessing pipeline that handles the complexity of clinical data (visit aggregation, code mapping, patient-level splits)
- You want established deep learning models for clinical prediction tasks without implementing them from scratch
- You're benchmarking a new clinical prediction model against established baselines (RETAIN, Transformer, GRU-based models)

When *not* to reach for it:

- DICOM medical imaging — use `pydicom`
- Clinical text NLP (note processing, information extraction from free text) — use `transformers` with a clinical BERT variant

## Install

Copy the `SKILL.md` from K-Dense AI's [pyhealth folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/pyhealth) into `.claude/skills/pyhealth/` in your project. MIMIC and eICU datasets require PhysioNet data access agreements; PyHealth handles the parsing once you have the raw CSV files.

Once installed, phrases like "EHR machine learning", "MIMIC dataset", "clinical prediction model", or "readmission prediction" activate the skill.

## What a session looks like

A typical session has three phases:

1. **Dataset and task setup.** Specify the dataset (MIMIC-III, eICU) and the prediction task. Claude sets up the PyHealth dataset object that handles patient visit aggregation, medical code mapping, and train/validation/test splitting by patient ID (not by visit).
2. **Model selection and training.** Claude selects an appropriate model architecture from PyHealth's model zoo — RETAIN for interpretability, Transformer for sequence modeling, GNN for drug-drug interaction tasks — and configures the training loop with appropriate loss function and evaluation metrics.
3. **Evaluation.** Results are computed with AUROC and AUPRC (both critical for imbalanced clinical outcomes), and Claude flags any data leakage risks in the splitting strategy specific to the clinical task.
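The imbalance point in phase 3 is easy to demonstrate without any clinical data. The sketch below (plain Python, not PyHealth) computes AUROC as the Mann-Whitney rank statistic and shows why a trivial "everyone survives" classifier looks accurate on an imbalanced outcome but is useless for ranking patient risk:

```python
import random

def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored above a randomly chosen negative
    (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy imbalanced task: ~10% positives, mimicking in-hospital mortality rates.
random.seed(0)
labels = [1 if random.random() < 0.10 else 0 for _ in range(1000)]

# A degenerate model that always predicts "survives" scores ~90% accuracy...
always_negative = [0.0] * len(labels)
accuracy = sum((s >= 0.5) == y for y, s in zip(labels, always_negative)) / len(labels)

# ...but its AUROC is exactly 0.5: no better than chance at ranking risk.
print(accuracy)
print(auroc(labels, always_negative))  # 0.5
```

This is why the skill reports AUROC and AUPRC rather than accuracy; AUPRC behaves analogously but is even more sensitive to the positive class.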

## Receipts

**Where it works well:**
- Benchmarking clinical prediction models on MIMIC — PyHealth's standardized task definitions and data processing make comparison to published results meaningful rather than confounded by preprocessing differences
- Multi-task clinical prediction where one model needs to predict multiple outcomes — PyHealth's task definition layer handles this cleanly

**Where it backfires:**
- Clinical datasets have severe class imbalance (in-hospital mortality rates of 5–15%) that requires careful metric selection; models that optimize accuracy rather than AUROC/AUPRC look good on paper but are clinically useless
- PyHealth's data loading assumes specific CSV formats from MIMIC; minor version differences in the downloaded files require preprocessing fixes that aren't always documented
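One concrete example of that preprocessing gap: MIMIC-III exports typically use uppercase column headers (`SUBJECT_ID`, `HADM_ID`), while some demo and MIMIC-IV-style exports use lowercase, which can break loaders expecting one casing. A hypothetical normalization pass, stdlib only — this is not PyHealth API, just a sketch of the kind of fix that may be needed:

```python
import csv
import io

def normalize_header(csv_text: str) -> str:
    """Uppercase a CSV's header row so loaders expecting MIMIC-III-style
    column names (SUBJECT_ID, HADM_ID, ...) can find them."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    rows[0] = [col.upper() for col in rows[0]]  # only the header changes
    out = io.StringIO()
    csv.writer(out).writerows(rows)
    return out.getvalue()

raw = "subject_id,hadm_id,icd9_code\n10006,142345,99591\n"
# Header becomes SUBJECT_ID,HADM_ID,ICD9_CODE; data rows are untouched.
print(normalize_header(raw))
```

Always diff your CSV headers against the version PyHealth's loader was written for before filing a bug.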

**Pattern that works:** always split by patient ID, not by visit — splitting by visit allows the same patient to appear in train and test sets, which artificially inflates performance and produces models that don't generalize.
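PyHealth provides its own patient-level splitter for its sample datasets; the sketch below is a simplified stand-in (not the library API) that makes the invariant explicit — every visit from a given patient lands in exactly one split:

```python
import random
from collections import defaultdict

def split_by_patient(visits, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split visit records into train/val/test so that all visits from a
    given patient land in the same split. `visits` is a list of dicts with
    a 'patient_id' key (a simplified stand-in for PyHealth's sample dicts)."""
    by_patient = defaultdict(list)
    for v in visits:
        by_patient[v["patient_id"]].append(v)
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)       # shuffle patients, not visits
    n = len(patients)
    cut1 = int(ratios[0] * n)
    cut2 = cut1 + int(ratios[1] * n)
    groups = (patients[:cut1], patients[cut1:cut2], patients[cut2:])
    return tuple([v for p in g for v in by_patient[p]] for g in groups)

# 100 visits spread over 20 patients (5 visits each).
visits = [{"patient_id": f"p{i % 20}", "visit_id": i} for i in range(100)]
train, val, test = split_by_patient(visits)

# The leakage check: no patient appears in more than one split.
ids = [set(v["patient_id"] for v in s) for s in (train, val, test)]
assert ids[0].isdisjoint(ids[1]) and ids[0].isdisjoint(ids[2]) and ids[1].isdisjoint(ids[2])
```

Run the disjointness assertion at the end as a sanity check in any pipeline, whether or not you use PyHealth's splitter.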

## Source and attribution

Originally authored by [K-Dense Inc.](https://github.com/K-Dense-AI). The canonical SKILL.md lives in the [`pyhealth` folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/pyhealth) of their public scientific-agent-skills repository.

License: MIT. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner's perspective. For the formal spec and any updates, defer to the source repo.