rdkit

Cheminformatics toolkit for fine-grained molecular control: SMILES/SDF parsing, descriptors (MW, LogP, TPSA), fingerprints, substructure search, 2D/3D coordinate generation, and molecular similarity, aimed at custom algorithms and non-default settings.

Fine-grained cheminformatics and molecular analysis with RDKit

Source: K-Dense AI
License: MIT
First documented

Trigger phrases

Phrases that activate this skill when typed to Claude Code:

  • parse SMILES
  • compute molecular descriptors
  • substructure search
  • generate 3D conformer
  • molecular fingerprints

What it does

rdkit is a Claude Code skill from K-Dense AI’s scientific-agent-skills repo. It directs Claude to write RDKit code across the cheminformatics toolkit: SMILES and SDF parsing, molecular descriptor computation (MW, LogP, TPSA, HBD/HBA, rotatable bonds), fingerprint generation (Morgan, which is RDKit's ECFP-style fingerprint, plus MACCS keys and the RDKit topological fingerprint), substructure search with SMARTS patterns, 2D coordinate and 3D conformer generation (ETKDG), molecular similarity (Tanimoto, Dice), and reaction handling.

A session produces Python code that takes molecular inputs (SMILES strings, SDF files) and returns the requested chemical properties, filtered compound sets, or visualization files.
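
As a rough illustration of the fingerprint and similarity items in that list, the sketch below compares two molecules with Morgan fingerprints. The example molecules, the radius of 2, and the 2048-bit size are assumptions chosen for illustration, not settings taken from the skill.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

# Example inputs (assumed for illustration): aspirin and salicylic acid.
aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
salicylic_acid = Chem.MolFromSmiles("O=C(O)c1ccccc1O")

# Morgan generator; radius=2 roughly corresponds to ECFP4.
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fp_a = generator.GetFingerprint(aspirin)
fp_b = generator.GetFingerprint(salicylic_acid)

# Both metrics operate on the same pair of bit vectors.
tanimoto = DataStructs.TanimotoSimilarity(fp_a, fp_b)
dice = DataStructs.DiceSimilarity(fp_a, fp_b)
print(f"Tanimoto={tanimoto:.3f}  Dice={dice:.3f}")
```

Dice is never lower than Tanimoto for the same pair of fingerprints, so thresholds tuned for one metric should not be reused for the other.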

When to use it

Reach for it when:

  • You need advanced molecular control — custom sanitization, non-standard valences, or specialized fingerprint parameters
  • You’re implementing a custom cheminformatics algorithm that requires access to the RDKit C++ layer through Python
  • You’re doing substructure searches with complex SMARTS patterns that need precise control over the matching behavior
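
The custom-sanitization case in the first bullet might look like the sketch below: parse without sanitization, inspect the reported problems, repair them, then sanitize explicitly. The input SMILES and the repair rule (charging an over-valent neutral nitrogen) are assumptions for illustration; a real workflow would need its own repair policy.

```python
from rdkit import Chem, RDLogger

RDLogger.DisableLog("rdApp.*")  # silence parser messages for this demo

# Neutral four-coordinate nitrogen: rejected by default sanitization.
smiles = "CN(C)(C)C"

# Default parsing sanitizes and returns None on a valence error.
default_result = Chem.MolFromSmiles(smiles)

# Parse without sanitization so the problem can be inspected and repaired.
mol = Chem.MolFromSmiles(smiles, sanitize=False)
problems = Chem.DetectChemistryProblems(mol)

for problem in problems:
    if problem.GetType() == "AtomValenceException":
        atom = mol.GetAtomWithIdx(problem.GetAtomIdx())
        # Assumed repair: treat an over-valent neutral nitrogen as a cation.
        if atom.GetAtomicNum() == 7 and atom.GetFormalCharge() == 0:
            atom.SetFormalCharge(1)

Chem.SanitizeMol(mol)  # raises if any problem remains
repaired_smiles = Chem.MolToSmiles(mol)
print(repaired_smiles)
```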

When not to reach for it:

  • Standard drug discovery workflows with sensible defaults — use datamol (a Pythonic RDKit wrapper)
  • Molecular ML with diverse featurization and MoleculeNet benchmarks — use deepchem

Install

Copy the SKILL.md from K-Dense AI’s rdkit folder into .claude/skills/rdkit/ in your project. RDKit is best installed via conda: conda install -c conda-forge rdkit.


What a session looks like

A typical session has three phases:

  1. Input specification. Provide SMILES strings, an SDF file path, or a compound library. Claude sets up the molecule loading with appropriate sanitization flags and handles invalid SMILES gracefully.
  2. Computation. Claude generates the RDKit code for the requested operation: descriptor calculation via Descriptors, fingerprint generation via AllChem or rdFingerprintGenerator, substructure search via HasSubstructMatch, or 3D conformer embedding via AllChem.EmbedMolecule(mol, AllChem.ETKDGv3()), with explicit hydrogens added through Chem.AddHs before embedding.
  3. Output. Results are returned as a pandas DataFrame, an SDF file, or a molecular visualization (2D depiction via Draw.MolToImage). Invalid molecules are flagged in the output rather than silently dropped.
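
The three phases above can be sketched in a few lines. This is a minimal illustration, not the skill's actual output: the function name `describe` and the row layout are hypothetical, and the choice of descriptors is an assumption.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def describe(smiles_list):
    """Return one dict per input SMILES; invalid inputs are flagged, not dropped."""
    rows = []
    for smiles in smiles_list:
        mol = Chem.MolFromSmiles(smiles)  # phase 1: load with default sanitization
        if mol is None:
            rows.append({"smiles": smiles, "valid": False})
            continue
        rows.append({                     # phase 2: compute descriptors
            "smiles": smiles,
            "valid": True,
            "mw": Descriptors.MolWt(mol),
            "logp": Descriptors.MolLogP(mol),
            "tpsa": Descriptors.TPSA(mol),
        })
    return rows                           # phase 3: tabular output

rows = describe(["CCO", "not_a_smiles"])
for row in rows:
    print(row)
```

The list of dicts can be passed directly to pandas.DataFrame(rows) when a DataFrame is wanted; rows for invalid molecules then show missing values in the descriptor columns.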

Receipts

Where it works well:

  • Lipinski Rule of Five filtering on a compound library — descriptor computation across thousands of molecules is fast and the filtering logic is clean
  • SMARTS-based substructure search for functional group identification — RDKit’s SMARTS matching is comprehensive and Claude knows the common SMARTS patterns for standard pharmacophore features
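
Both cases can be sketched briefly. The helper name `passes_ro5`, the "at most one violation" convention, and the carboxylic acid SMARTS are assumptions for illustration; other Rule of Five variants and SMARTS definitions are in common use.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def passes_ro5(mol, max_violations=1):
    """Lipinski Rule of Five check, allowing up to max_violations violations."""
    violations = sum([
        Descriptors.MolWt(mol) > 500,
        Descriptors.MolLogP(mol) > 5,
        Lipinski.NumHDonors(mol) > 5,
        Lipinski.NumHAcceptors(mol) > 10,
    ])
    return violations <= max_violations

# One common SMARTS for a neutral carboxylic acid (assumed pattern).
carboxylic_acid = Chem.MolFromSmarts("[CX3](=O)[OX2H1]")

aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
long_alkane = Chem.MolFromSmiles("C" * 60)  # heavy and very lipophilic

print(passes_ro5(aspirin), passes_ro5(long_alkane))
print(aspirin.HasSubstructMatch(carboxylic_acid))
print(aspirin.GetSubstructMatches(carboxylic_acid))  # atom index tuples
```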

Where it backfires:

  • 3D conformer generation quality degrades for highly flexible molecules (>10 rotatable bonds) — ETKDG produces geometries but the ensemble may not represent the true conformational distribution
  • Some exotic tautomers and charged species require manual sanitization overrides that are non-obvious from the error messages
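
For the flexible-molecule caveat, one hedge is to embed several conformers and look at the rotatable bond count before trusting any single geometry. The sketch below assumes a hypothetical helper `embed_ensemble`, an arbitrary conformer count and seed, and MMFF as the force field; none of these come from the skill itself.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolDescriptors

def embed_ensemble(smiles, n_confs=10, seed=42):
    """Embed several ETKDGv3 conformers and relax them with MMFF."""
    base = Chem.MolFromSmiles(smiles)
    if base is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # Count rotatable bonds on the hydrogen-free graph.
    n_rot = rdMolDescriptors.CalcNumRotatableBonds(base)
    mol = Chem.AddHs(base)  # explicit hydrogens are needed for sensible 3D geometry
    params = AllChem.ETKDGv3()
    params.randomSeed = seed  # fixed seed for reproducible embeddings
    conf_ids = list(AllChem.EmbedMultipleConfs(mol, n_confs, params))
    # Each result is a (status, energy) pair; status 0 means converged.
    results = AllChem.MMFFOptimizeMoleculeConfs(mol, maxIters=500)
    energies = [energy for _status, energy in results]
    return mol, conf_ids, energies, n_rot

mol, conf_ids, energies, n_rot = embed_ensemble("CCCCCCCCCCCCO")
print(f"{len(conf_ids)} conformers, {n_rot} rotatable bonds")
if n_rot > 10:
    print("Highly flexible: treat the ensemble as a sample, not a distribution.")
```

A small ensemble like this shows whether the low-energy geometries agree with each other; it does not replace proper conformational sampling for highly flexible molecules.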

Pattern that works: always check that mol is not None after parsing SMILES. Chem.MolFromSmiles returns None for invalid SMILES instead of raising an exception, and passing None to downstream RDKit calls produces cryptic argument-type errors rather than informative ones.
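
One way to apply this pattern is a small wrapper that turns the None result into an explicit error at the point of parsing. The name `parse_smiles` is hypothetical.

```python
from rdkit import Chem

def parse_smiles(smiles):
    """Parse a SMILES string, raising ValueError instead of returning None."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")
    return mol

benzene = parse_smiles("c1ccccc1")
print(benzene.GetNumAtoms())

try:
    parse_smiles("C1CC")  # unclosed ring, so parsing fails
except ValueError as err:
    print(err)
```

Raising suits scripts that should stop on bad input; for bulk library processing, recording the failure and continuing is usually the better fit.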

Source and attribution

Originally authored by K-Dense Inc. The canonical SKILL.md lives in the rdkit folder of their public scientific-agent-skills repository.

License: MIT. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner’s perspective. For the formal spec and any updates, defer to the source repo.