torch-geometric
Guide for building Graph Neural Networks with PyTorch Geometric (PyG) — covering node classification, link prediction, graph classification, message passing networks, heterogeneous graphs, and neighbor sampling for graph-structured data.
Build graph neural networks with PyTorch Geometric
Trigger phrases
Phrases that activate this skill when typed to Claude Code:
- "graph neural network"
- "GNN with PyG"
- "node classification"
- "link prediction"
- "torch_geometric"
What it does
torch-geometric is a Claude Code skill from K-Dense AI’s scientific-agent-skills repo. It turns Claude into a PyTorch Geometric (PyG) expert covering graph data structures (Data, HeteroData), 60+ GNN layer implementations (GCN, GAT, GraphSAGE, GIN, MPNN), tasks (node classification, link prediction, graph classification), scalable mini-batch training with neighbor sampling, and heterogeneous graph workflows.
A session produces PyG code: a graph dataset setup, a GNN model class inheriting from torch.nn.Module using PyG message-passing layers, a training loop with mini-batch sampling, and evaluation metrics appropriate to the task.
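The message-passing layers the skill covers all reduce to some form of normalized neighborhood aggregation. As a point of reference, here is the GCN propagation rule H' = D^(-1/2)(A + I)D^(-1/2) H W computed by hand in plain NumPy; the toy 3-node graph and all variable names are illustrative, not part of the skill itself:

```python
import numpy as np

# Toy graph: 3 nodes, undirected edges 0-1 and 1-2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
A_hat = A + np.eye(3)                # add self-loops
deg = A_hat.sum(axis=1)              # degrees of A_hat
D_inv_sqrt = np.diag(deg ** -0.5)    # D^{-1/2}
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization

H = np.random.randn(3, 4)            # node features: 3 nodes, 4 dims
W = np.random.randn(4, 2)            # layer weights: 4 -> 2

H_next = A_norm @ H @ W              # one round of message passing
print(H_next.shape)                  # (3, 2)
```

PyG's GCNConv performs this same computation sparsely over an edge_index rather than a dense adjacency matrix.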
When to use it
Reach for it when:
- You have graph-structured data (molecular graphs, citation networks, social networks, knowledge graphs) and need a GNN
- You’re implementing node classification, link prediction, or graph-level property prediction on relational data
- You need scalable GNN training with neighbor sampling for graphs too large to fit full batch training in GPU memory
When not to reach for it:
- Standard tabular ML without graph structure — use scikit-learn
- Molecular ML where pre-built featurization and MoleculeNet benchmarks matter more than custom GNN architecture — use deepchem
Install
Copy the SKILL.md from K-Dense AI’s torch-geometric folder into .claude/skills/torch-geometric/ in your project. PyG installation requires matching the PyTorch and CUDA versions — Claude will generate the correct pip install command for your environment.
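As a sketch of what that generated install command typically looks like: recent PyG releases install against an existing torch with plain pip, and the wheel index below follows PyG's documented pattern for the optional compiled extensions. The torch 2.4.0 + CUDA 12.1 versions shown are an example only; substitute your own:

```shell
# Core library (assumes torch is already installed)
pip install torch_geometric

# Optional compiled extensions (needed for features like neighbor sampling);
# match the torch/CUDA tags to your environment
pip install pyg_lib torch_scatter torch_sparse \
    -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
```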
What a session looks like
A typical session has three phases:
- Graph data setup. Claude creates a torch_geometric.data.Data object from your node features, edge index, and labels, or loads a built-in dataset (Cora, TUDataset, OGB) for benchmarking.
- GNN architecture. Claude writes an nn.Module using PyG’s message-passing layers — selecting GCN, GAT, or GraphSAGE based on the task and graph properties — with skip connections and batch normalization where appropriate.
- Training and evaluation. A training loop with DataLoader for mini-batch sampling (NeighborLoader for large graphs) is set up with the appropriate loss and evaluation metric (accuracy, AUC-ROC, mean average precision).
Receipts
Where it works well:
- Citation network node classification (Cora, CiteSeer, PubMed) — a standard GCN or GAT implementation with PyG reliably achieves competitive accuracy and serves as a working starting point for custom architectures
- Molecular property prediction using graph-level models — the PyG ecosystem’s built-in molecular datasets and GNN layers reduce the boilerplate significantly
Where it backfires:
- Very large graphs (millions of nodes) require careful neighbor sampling setup; the default full-batch training will OOM without the NeighborLoader
- Heterogeneous graphs with many edge types require the HeteroData API, which has a steeper learning curve and more verbose code
Pattern that works: always validate your graph construction (edge_index shape, node feature dimensions, label alignment) before training — indexing errors in graph data are silent and produce garbage results without explicit checks.
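A handful of cheap assertions catches most of those silent indexing errors before training starts. Sketched here as a hypothetical helper over NumPy arrays to keep it self-contained; the same checks apply directly to data.edge_index, data.x, and data.y tensors in PyG:

```python
import numpy as np

def validate_graph(edge_index, num_nodes, x, y):
    """Sanity-check graph construction before training (illustrative helper)."""
    assert edge_index.shape[0] == 2, "edge_index must have shape [2, num_edges]"
    assert edge_index.min() >= 0, "negative node index in edge_index"
    assert edge_index.max() < num_nodes, "edge_index references a missing node"
    assert x.shape[0] == num_nodes, "node feature rows must match num_nodes"
    assert y.shape[0] == num_nodes, "labels must align with nodes"

# A tiny valid graph passes...
edge_index = np.array([[0, 1, 1, 2],
                       [1, 0, 2, 1]])
validate_graph(edge_index, 3, np.zeros((3, 8)), np.zeros(3))

# ...while a transposed edge_index (a common construction bug) fails fast
try:
    validate_graph(edge_index.T, 3, np.zeros((3, 8)), np.zeros(3))
except AssertionError as e:
    print("caught:", e)
```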
Source and attribution
Originally authored by K-Dense Inc. The canonical SKILL.md lives in the torch-geometric folder of their public scientific-agent-skills repository.
License: MIT. Install, adapt, and redistribute with attribution preserved.
This page documents the skill from a practitioner’s perspective. For the formal spec and any updates, defer to the source repo.