# torch-geometric

> Guide for building Graph Neural Networks with PyTorch Geometric (PyG) — covering node classification, link prediction, graph classification, message passing networks, heterogeneous graphs, and neighbor sampling for graph-structured data.

**Use case**: Build graph neural networks with PyTorch Geometric

**Canonical URL**: https://agentcookbooks.com/skills/torch-geometric/

**Topics**: claude-code, skills, science, ml-libraries

**Trigger phrases**: "graph neural network", "GNN with PyG", "node classification", "link prediction", "torch_geometric"

**Source**: [K-Dense AI](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/torch-geometric)

**License**: MIT

---

## What it does

`torch-geometric` is a Claude Code skill from K-Dense AI's [scientific-agent-skills repo](https://github.com/K-Dense-AI/scientific-agent-skills). It turns Claude into a PyTorch Geometric (PyG) expert covering graph data structures (`Data`, `HeteroData`), 60+ GNN layer implementations (GCN, GAT, GraphSAGE, GIN, MPNN), tasks (node classification, link prediction, graph classification), scalable mini-batch training with neighbor sampling, and heterogeneous graph workflows.

A session produces PyG code: a graph dataset setup, a GNN model class inheriting from `torch.nn.Module` using PyG message-passing layers, a training loop with mini-batch sampling, and evaluation metrics appropriate to the task.

## When to use it

Reach for it when:

- You have graph-structured data (molecular graphs, citation networks, social networks, knowledge graphs) and need a GNN
- You're implementing node classification, link prediction, or graph-level property prediction on relational data
- You need scalable GNN training with neighbor sampling because the graph is too large for full-batch training to fit in GPU memory

When *not* to reach for it:

- Standard tabular ML without graph structure — use `scikit-learn`
- Molecular ML where pre-built featurization and MoleculeNet benchmarks matter more than custom GNN architecture — use `deepchem`

## Install

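Copy the `SKILL.md` from K-Dense AI's [torch-geometric folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/torch-geometric) into `.claude/skills/torch-geometric/` in your project. PyG installs on top of an existing PyTorch installation, and its optional compiled extensions (such as `pyg_lib` and `torch_scatter`) need wheels that match your PyTorch and CUDA versions. Ask Claude to generate the `pip install` command for your environment, and check it against the PyG installation docs before running it.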
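
Before asking for an install command, it can help to know which PyTorch build you actually have. The snippet below is a minimal sketch that reports the PyTorch version and a CUDA tag. The helper name `describe_torch_build` and the `cu121`-style tag format are assumptions for illustration, and builds other than CPU or CUDA (for example ROCm) are not handled.

```python
import torch


def describe_torch_build() -> dict:
    """Report the installed PyTorch version and a CUDA build tag (hypothetical helper)."""
    # torch.version.cuda is None on builds without CUDA, else a string like "12.1".
    cuda = torch.version.cuda
    cuda_tag = "cpu" if cuda is None else "cu" + cuda.replace(".", "")
    # Drop any local build suffix such as "+cu121" from the version string.
    torch_version = torch.__version__.split("+")[0]
    return {"torch": torch_version, "cuda_tag": cuda_tag}


if __name__ == "__main__":
    print(describe_torch_build())
```

Paste the printed values into your prompt so the generated command targets the build you have rather than a guessed one.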


## What a session looks like

A typical session has three phases:

1. **Graph data setup.** Claude creates a `torch_geometric.data.Data` object from your node features, edge index, and labels, or loads a built-in dataset (Cora, TUDataset, OGB) for benchmarking.
2. **GNN architecture.** Claude writes a `nn.Module` using PyG's message-passing layers — selecting GCN, GAT, or GraphSAGE based on the task and graph properties — with skip connections and batch normalization where appropriate.
3. **Training and evaluation.** Claude sets up a training loop that batches with `DataLoader` (or `NeighborLoader` for large graphs) and pairs it with a loss and evaluation metric suited to the task (accuracy, AUC-ROC, mean average precision).

## Receipts

**Where it works well:**
- Citation network node classification (Cora, CiteSeer, PubMed) — a standard GCN or GAT implementation with PyG reliably achieves competitive accuracy and serves as a working starting point for custom architectures
- Molecular property prediction using graph-level models — the PyG ecosystem's built-in molecular datasets and GNN layers reduce the boilerplate significantly

**Where it backfires:**
- Very large graphs (millions of nodes) require careful neighbor sampling setup; full-batch training, the simplest default, runs out of GPU memory unless you switch to `NeighborLoader`
- Heterogeneous graphs with many edge types require the `HeteroData` API, which has a steeper learning curve and more verbose code

**Pattern that works:** always validate your graph construction (`edge_index` shape, node feature dimensions, label alignment) before training. Indexing errors in graph data often fail silently and produce garbage results unless you check for them explicitly.
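
One way to make those checks explicit is a small helper that runs before training. The sketch below uses plain PyTorch tensors; `validate_graph` is a hypothetical name, and it assumes node-level labels (one label per node), which would not hold for graph-level tasks. PyG's own `Data.validate()` performs related checks and may be enough on its own.

```python
import torch


def validate_graph(x: torch.Tensor, edge_index: torch.Tensor, y: torch.Tensor) -> None:
    """Raise ValueError if the tensors do not describe a consistent graph."""
    num_nodes = x.size(0)
    if edge_index.dim() != 2 or edge_index.size(0) != 2:
        raise ValueError(
            f"edge_index must have shape [2, num_edges], got {tuple(edge_index.shape)}"
        )
    if edge_index.dtype != torch.long:
        raise ValueError(f"edge_index must be torch.long, got {edge_index.dtype}")
    if edge_index.numel() > 0:
        # Every endpoint must refer to an existing node.
        if edge_index.min() < 0 or edge_index.max() >= num_nodes:
            raise ValueError("edge_index refers to a node outside [0, num_nodes)")
    # Assumes node-level labels: one label per node.
    if y.size(0) != num_nodes:
        raise ValueError(f"expected {num_nodes} labels, got {y.size(0)}")


x = torch.randn(4, 8)
y = torch.tensor([0, 0, 1, 1])
good_edges = torch.tensor([[0, 1, 2], [1, 2, 3]])
bad_edges = torch.tensor([[0, 1, 2], [1, 2, 4]])  # node 4 does not exist

validate_graph(x, good_edges, y)  # passes silently
try:
    validate_graph(x, bad_edges, y)
except ValueError as err:
    print(f"caught: {err}")
```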

## Source and attribution

Originally authored by [K-Dense Inc.](https://github.com/K-Dense-AI). The canonical SKILL.md lives in the [`torch-geometric` folder](https://github.com/K-Dense-AI/scientific-agent-skills/tree/main/scientific-skills/torch-geometric) of their public scientific-agent-skills repository.

License: MIT. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner's perspective. For the formal spec and any updates, defer to the source repo.