database-lookup

Search 78 public scientific, biomedical, materials science, and economic databases via REST APIs — covering physics, earth science, chemistry, biology/genomics, disease/clinical, regulatory, economics, and demographics databases.

Query 78 public scientific databases from a single skill

Source K-Dense AI
License MIT
First documented

Trigger phrases

Phrases that activate this skill when typed to Claude Code:

  • look up in PubChem
  • query UniProt
  • search ClinicalTrials
  • Materials Project lookup
  • database query

What it does

database-lookup is a Claude Code skill from K-Dense AI’s scientific-agent-skills repo. It turns Claude into a multi-database query agent covering 78 public scientific databases via REST APIs — including chemistry (PubChem, ChEMBL, DrugBank, KEGG, ZINC, BindingDB), biology/genomics (UniProt, STRING, Ensembl, NCBI Gene, GEO, PDB, AlphaFold, Human Protein Atlas), disease/clinical (ClinicalTrials.gov, OMIM, ClinVar, TCGA, DisGeNET), materials (Materials Project, COD), regulatory (FDA, USPTO), and economics (FRED, World Bank).

A session produces structured query results from the appropriate database — protein records, compound properties, clinical trial listings, variant annotations, or economic indicators — returned as a pandas DataFrame or a formatted dict of extracted fields rather than the raw API response.

When to use it

Reach for it when:

  • You need data from a specific public database and don’t want to write API integration code for it
  • You’re pulling data across multiple databases in a single research workflow (e.g., compound from PubChem → target from UniProt → trials from ClinicalTrials.gov)
  • You need economic or regulatory data (FDA drug approvals, USPTO patents, World Bank indicators) alongside scientific data in the same pipeline

When not to reach for it:

  • Deep literature search across academic papers — use paper-lookup or literature-review
  • Comprehensive genomics workflows requiring sequence analysis — use biopython or gget

Install

Copy the SKILL.md from K-Dense AI’s database-lookup folder into .claude/skills/database-lookup/ in your project.


What a session looks like

A typical session has three phases:

  1. Database and query specification. Describe what you’re looking for — a compound name/ID, gene symbol, disease, clinical trial criterion, or economic indicator. Claude identifies which of the 78 databases is most appropriate and confirms the query parameters.
  2. API retrieval. Claude generates and executes the REST API call to the appropriate database, handling authentication (API keys where needed), pagination, and rate limits.
  3. Structured output. Results are returned as a pandas DataFrame or formatted dict with the relevant fields extracted — not the raw JSON response. Claude explains which fields were returned and flags any unexpected empty results.
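
Phases 2 and 3 can be sketched for one database as follows. This is an illustrative example, not code from the skill's SKILL.md: it assumes PubChem's PUG REST property endpoint, and the helper names (`build_property_url`, `extract_rows`, `lookup_compound`) are hypothetical. Only the standard library is used so that the sketch runs without extra installs.

```python
import json
import urllib.parse
import urllib.request

# Assumption: PubChem PUG REST base URL; no API key is needed for this endpoint.
PUBCHEM_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

DEFAULT_PROPERTIES = ("MolecularFormula", "MolecularWeight", "InChIKey")


def build_property_url(name, properties=DEFAULT_PROPERTIES):
    """Phase 2a: build the REST URL for a compound-by-name property query."""
    quoted = urllib.parse.quote(name, safe="")
    return f"{PUBCHEM_BASE}/compound/name/{quoted}/property/{','.join(properties)}/JSON"


def extract_rows(payload, properties=DEFAULT_PROPERTIES):
    """Phase 3: keep only the requested fields instead of the raw response."""
    records = payload.get("PropertyTable", {}).get("Properties", [])
    return [
        {"CID": rec.get("CID"), **{p: rec.get(p) for p in properties}}
        for rec in records
    ]


def lookup_compound(name, properties=DEFAULT_PROPERTIES):
    """Phase 2b: execute the call (needs network access) and return flat rows."""
    url = build_property_url(name, properties)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return extract_rows(json.load(resp), properties)
```

The rows returned by `lookup_compound("aspirin")` are plain dicts, so `pandas.DataFrame(rows)` turns them into the DataFrame form described above. An empty list signals an empty result that should be flagged rather than silently passed on.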

Receipts

Where it works well:

  • Compound lookups by name or InChI across PubChem and ChEMBL — Claude correctly identifies which database is most relevant and retrieves the right record with little need for disambiguation
  • ClinicalTrials.gov queries by condition and intervention — structured results with trial phase, enrollment, status, and primary outcomes in a clean format
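
A condition-plus-intervention query with pagination might look like the sketch below. It assumes the ClinicalTrials.gov API v2 `studies` endpoint with its `query.cond`, `query.intr`, `pageSize`, and `pageToken` parameters, and the usual `protocolSection` layout of the response. The function names are hypothetical, and the skill's own implementation may differ.

```python
import json
import urllib.parse
import urllib.request

# Assumption: ClinicalTrials.gov API v2 studies endpoint.
CTGOV_STUDIES = "https://clinicaltrials.gov/api/v2/studies"


def build_studies_url(condition, intervention, page_size=20, page_token=None):
    """Build the query URL; page_token continues a previous result page."""
    params = {"query.cond": condition, "query.intr": intervention, "pageSize": page_size}
    if page_token:
        params["pageToken"] = page_token
    return f"{CTGOV_STUDIES}?{urllib.parse.urlencode(params)}"


def flatten_study(study):
    """Pull phase, enrollment, status, and primary outcomes out of one study."""
    proto = study.get("protocolSection", {})
    design = proto.get("designModule", {})
    outcomes = proto.get("outcomesModule", {}).get("primaryOutcomes", [])
    return {
        "nct_id": proto.get("identificationModule", {}).get("nctId"),
        "status": proto.get("statusModule", {}).get("overallStatus"),
        "phases": design.get("phases", []),
        "enrollment": design.get("enrollmentInfo", {}).get("count"),
        "primary_outcomes": [o.get("measure") for o in outcomes],
    }


def fetch_json(url):
    """Default fetcher; requires network access."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def search_trials(condition, intervention, max_pages=3, fetch=fetch_json):
    """Follow nextPageToken until it is absent or max_pages is reached."""
    rows, token = [], None
    for _ in range(max_pages):
        page = fetch(build_studies_url(condition, intervention, page_token=token))
        rows.extend(flatten_study(s) for s in page.get("studies", []))
        token = page.get("nextPageToken")
        if not token:
            break
    return rows
```

Passing the fetcher as a parameter keeps the paging logic testable without network access, and the `max_pages` cap bounds the number of requests for broad conditions.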

Where it backfires:

  • Some databases require API keys (DrugBank, BindingDB full access) that are not included in the skill — the skill queries what’s publicly available but flags when full data requires credentials
  • Cross-database joins (e.g., linking a PubChem CID to a UniProt target to TCGA expression data) require multiple query steps and intermediate identifier mapping that can introduce mismatches
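
One way to keep cross-database mismatches visible is to audit every mapping step instead of dropping unmatched rows silently. The sketch below is a generic illustration with a hypothetical helper name, and it assumes the identifier mapping has already been retrieved as a dict from source ID to a list of target IDs.

```python
def join_with_audit(rows, mapping, key, new_key):
    """Attach mapped identifiers to rows and report the IDs that failed to map.

    rows:    list of dicts from the first database
    mapping: dict of source ID -> list of target IDs (one-to-many allowed)
    key:     field in each row holding the source ID
    new_key: field name to store the mapped target ID under
    """
    joined, unmapped = [], []
    for row in rows:
        targets = mapping.get(row[key])
        if not targets:
            # Record the miss instead of dropping the row without a trace.
            unmapped.append(row[key])
            continue
        for target in targets:
            # One output row per target, so one-to-many mappings stay explicit.
            joined.append({**row, new_key: target})
    return joined, unmapped
```

Reviewing the `unmapped` list after each hop (compound to target, target to expression data) shows where records were lost before the next query runs.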

Pattern that works: specify the primary identifier type upfront (PubChem CID, UniProt accession, Ensembl gene ID) rather than a name where possible — identifier-based queries are unambiguous and avoid the false matches that gene and compound name lookups can produce.
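
A small pre-check can tell whether a query string already looks like one of these identifiers. The sketch below is an assumption-laden illustration, not part of the skill: the UniProt pattern follows the accession format UniProt publishes, the Ensembl pattern assumes gene IDs of the form `ENS` + optional species code + `G` + 11 digits, and a bare positive integer is treated as a PubChem CID even though other databases also use integer IDs.

```python
import re

# Assumption: UniProt's published accession format (6 or 10 characters).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
# Assumption: Ensembl gene IDs such as ENSG... (human) or ENSMUSG... (mouse).
ENSEMBL_GENE = re.compile(r"ENS[A-Z]*G[0-9]{11}")
# Assumption: a bare positive integer means a PubChem CID in this workflow.
PUBCHEM_CID = re.compile(r"[1-9][0-9]*")


def classify_identifier(text):
    """Return the identifier type, or "name" when nothing matches."""
    value = text.strip()
    if PUBCHEM_CID.fullmatch(value):
        return "pubchem_cid"
    if ENSEMBL_GENE.fullmatch(value):
        return "ensembl_gene_id"
    if UNIPROT_ACCESSION.fullmatch(value):
        return "uniprot_accession"
    # Anything else is treated as a free-text name that needs disambiguation.
    return "name"
```

Inputs classified as `"name"` are the ones worth confirming with the user before querying, since gene symbols and compound names can map to several records.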

Source and attribution

Originally authored by K-Dense Inc. The canonical SKILL.md lives in the database-lookup folder of their public scientific-agent-skills repository.

License: MIT. Install, adapt, and redistribute with attribution preserved.

This page documents the skill from a practitioner’s perspective. For the formal spec and any updates, defer to the source repo.