# transformers

> Pre-trained transformer models for NLP, computer vision, audio, and multimodal tasks — use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning on custom datasets.

**Use case**: Load and fine-tune Hugging Face transformer models

**Canonical URL**: https://agentcookbooks.com/skills/transformers/

**Topics**: claude-code, skills, science, ml-libraries

**Trigger phrases**: "use a Hugging Face model", "fine-tune this model", "text classification with transformers", "load a pretrained model", "sentiment analysis"

**Source**: [K-Dense AI](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/transformers)

**License**: Apache-2.0

---

## What it does

`transformers` is a Claude Code skill from K-Dense AI's [scientific-agent-skills repo](https://github.com/K-Dense-AI/scientific-agent-skills). It turns Claude into a Hugging Face Transformers expert covering model loading via `AutoModel`/`pipeline`, tokenization, inference, and fine-tuning with the `Trainer` API — across NLP (generation, classification, QA, translation, summarization), vision (image classification, object detection, segmentation), audio (ASR, audio classification), and multimodal tasks.

A session produces complete Python code — model loading, tokenization, and inference via `pipeline()`, or a full fine-tuning setup with the `Trainer` configured for your task and hardware.
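
At its simplest, that output is a single `pipeline()` call. A minimal sketch, assuming the stock sentiment checkpoint — in practice the skill picks a model suited to your actual task:

```python
from transformers import pipeline

# Sentiment analysis with a small, widely used checkpoint.
# The model name is illustrative, not mandated by the skill.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("The fine-tuning run converged faster than expected."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```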

## When to use it

Reach for it when:

- You need to run inference with a pretrained model from the Hugging Face Hub on text, images, or audio
- You're fine-tuning a pretrained model on a custom dataset for a specific classification or generation task
- You need multimodal inference (vision-language models, audio-text models) from a single unified library

When *not* to reach for it:

- Structured training loops with multi-GPU support, logging callbacks, and experiment tracking — use `pytorch-lightning`, which wraps Transformers models cleanly
- Graph neural networks on structured data — use `torch-geometric`

## Install

Copy the `SKILL.md` from K-Dense AI's [transformers folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/transformers) into `.claude/skills/transformers/` in your project. A Hugging Face token (`HF_TOKEN` environment variable) is required for gated models.
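
For gated checkpoints, a minimal sketch of authenticating with that token before the first download; note that recent `huggingface_hub` versions also read `HF_TOKEN` from the environment automatically:

```python
import os

from huggingface_hub import login

# Reads the token exported as HF_TOKEN; raises KeyError if it is unset.
login(token=os.environ["HF_TOKEN"])
```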

## What a session looks like

A typical session has three phases:

1. **Task and model selection.** Describe the task and data domain. Claude selects an appropriate pretrained model from the Hub (citing its model card), the matching tokenizer, and the `AutoModel` class with the correct task head.
2. **Inference or fine-tuning setup.** For inference, Claude writes a `pipeline()` call or explicit tokenize-forward-decode loop. For fine-tuning, Claude configures a `Trainer` with the dataset, training arguments (learning rate, batch size, number of epochs), and evaluation metrics.
3. **Hardware and optimization.** Claude adds device placement (`.to("cuda")`), half-precision inference (`torch.float16`), and batching setup appropriate to the available hardware.
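
Phases 2 and 3 combined, as a sketch for text classification; the checkpoint is illustrative, and the fp16 cast assumes a CUDA GPU:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    # fp16 halves memory on GPU; fall back to fp32 on CPU, where fp16 ops are spotty
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

# Explicit tokenize -> forward -> decode, for when pipeline() is too coarse.
inputs = tokenizer(
    ["This skill saved me an afternoon.", "The gated-model dance is annoying."],
    padding=True,
    truncation=True,
    return_tensors="pt",
).to(device)

with torch.no_grad():
    logits = model(**inputs).logits

predicted = logits.argmax(dim=-1)
print([model.config.id2label[int(i)] for i in predicted])
```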

## Receipts

**Where it works well:**
- Zero-shot classification and named entity recognition via `pipeline()` — the abstraction is clean and the pretrained models perform surprisingly well out of the box on common tasks
- Fine-tuning BERT-family models on text classification — the Trainer API handles the training loop, evaluation, and checkpoint saving reliably
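
For reference, a minimal self-contained shape of that Trainer setup; toy data stands in for your dataset, and `eval_strategy` is spelled `evaluation_strategy` on older transformers releases:

```python
import numpy as np
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Toy dataset just to make the sketch runnable; substitute your own data.
raw = Dataset.from_dict({"text": ["great", "terrible"] * 8, "label": [1, 0] * 8})
ds = raw.map(
    lambda x: tokenizer(x["text"], truncation=True, padding="max_length", max_length=32),
    batched=True,
)
split = ds.train_test_split(test_size=0.25)

def compute_metrics(eval_pred):
    # Simple accuracy; swap in evaluate.load("accuracy") for richer metrics.
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

args = TrainingArguments(
    output_dir="out",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=1,
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=split["train"],
    eval_dataset=split["test"],
    compute_metrics=compute_metrics,
)
trainer.train()
```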

**Where it backfires:**
- Large model inference without quantization exhausts GPU memory quickly; Claude doesn't always proactively recommend `bitsandbytes` quantization for 7B+ models (a fix is sketched after this list)
- Some gated models on the Hub require manual license acceptance through the web UI before the token grants download access — a workflow friction point that surprises first-time users
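
The quantization fix mentioned above, as a sketch; it assumes a CUDA GPU with `bitsandbytes` and `accelerate` installed, and the 7B checkpoint named here is illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization via bitsandbytes; cuts a 7B model's weight
# memory roughly fourfold relative to fp16.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # illustrative 7B checkpoint
    quantization_config=quant_config,
    device_map="auto",  # requires accelerate
)
```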

**Pattern that works:** start with the `pipeline()` API to verify the model works for your task before writing custom tokenization and model code — it's much faster to prototype with and easier to debug.

## Source and attribution

Originally authored by [K-Dense Inc.](https://github.com/K-Dense-AI). The canonical SKILL.md lives in the [`transformers` folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/transformers) of their public scientific-agent-skills repository.

License: Apache-2.0. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner's perspective. For the formal spec and any updates, defer to the source repo.