skill-stocktake

A Claude Code skill from Affaan M's everything-claude-code repo that audits installed skills and commands against a quality checklist via sequential subagent batches. It has two modes: Quick Scan (re-evaluates only skills changed since the cached results.json, 5–10 min) and Full Stocktake (complete review, 20–30 min). Verdicts: Keep / Improve / Update / Retire / Merge into [X], each with a mandatory decision-enabling reason.

Audit the installed skill set for overlap, staleness, and bloat with explicit retire / merge / improve verdicts

Source Affaan M
License MIT
First documented
Receipts TODO

Trigger phrases

Phrases that activate this skill when typed to Claude Code:

  • audit my installed Claude skills
  • what skills should I retire
  • find overlap in my skill collection

What it does

skill-stocktake is the post-install audit skill in Affaan M’s everything-claude-code (see skills/skill-stocktake). It backs a /skill-stocktake slash command that evaluates every installed skill and command against a quality checklist using sequential subagent batches. It has two modes: Quick Scan (re-evaluates only skills changed since the last run, 5–10 min, diffing against the cached results.json) and Full Stocktake (complete review, 20–30 min, run when no cache exists or when invoked as /skill-stocktake full).
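The mode rule above can be sketched in a few lines of Python. This is an illustration, not the skill's own code: the function names are hypothetical, and the assumption that the cache keeps one mtime string per skill is mine, since the page does not spell out the results.json schema.

```python
from pathlib import Path


def choose_mode(cache_path: Path, args: list[str]) -> str:
    """Return "full" or "quick" following the rule described above."""
    # An explicit `full` argument always forces a Full Stocktake.
    if "full" in args:
        return "full"
    # Without a cache there is nothing to diff against, so run a full review.
    if not cache_path.exists():
        return "full"
    return "quick"


def changed_skills(current: dict[str, str], cached: dict[str, str]) -> list[str]:
    """Names whose mtime differs from the cached value, plus names not cached yet.

    Assumption: both dicts map skill name -> UTC mtime string.
    """
    return sorted(name for name, mtime in current.items() if cached.get(name) != mtime)
```

Under these assumptions a Quick Scan would pass only the names returned by changed_skills to the evaluation step and carry the cached verdicts forward for everything else.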

The flow has four phases:

  1. Inventory. Scan ~/.claude/skills/ plus the optional {cwd}/.claude/skills/, extract frontmatter, and collect UTC mtimes.
  2. Quality evaluation. Launch a general-purpose Agent / Task subagent per chunk of ~20 skills, applying a four-item checklist: content overlap, MEMORY.md / CLAUDE.md overlap, freshness of technical references (checked via WebSearch), and usage frequency.
  3. Summary table. One row per skill with its verdict and reason.
  4. Consolidation. Retire / Merge with explicit justification, Improve with target size and section references, Update with sources checked.
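As a rough illustration of the inventory phase and the ~20-skill chunking, here is a minimal Python sketch. The skill itself does this with its shipped shell scripts; everything below is an assumption made for the example, including the one-folder-per-skill layout with a SKILL.md file, the naive key: value frontmatter parser, and all function names.

```python
from datetime import datetime, timezone
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` pairs between the leading `---` fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing fence: treat the file as having no frontmatter


def inventory(roots: list[Path]) -> list[dict[str, str]]:
    """Collect name, description, and UTC mtime for each SKILL.md under roots."""
    rows = []
    for root in roots:
        if not root.is_dir():
            continue  # the project-level directory is optional
        for skill_file in sorted(root.glob("*/SKILL.md")):
            meta = read_frontmatter(skill_file.read_text(encoding="utf-8"))
            mtime = datetime.fromtimestamp(skill_file.stat().st_mtime, tz=timezone.utc)
            rows.append({
                "name": meta.get("name", skill_file.parent.name),
                "description": meta.get("description", ""),
                "mtime": mtime.strftime("%Y-%m-%dT%H:%M:%SZ"),
                "path": str(skill_file),
            })
    return rows


def chunks(rows: list, size: int = 20) -> list[list]:
    """Split the inventory into batches of at most `size` skills."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]
```

A call such as inventory([Path.home() / ".claude" / "skills", Path.cwd() / ".claude" / "skills"]) would cover the two locations named above, and each list returned by chunks would go to one subagent invocation.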

Verdicts fall into five categories (Keep, Improve, Update, Retire, Merge into [X]), and the skill is unusually strict about reason quality. Every reason must be self-contained and decision-enabling. “Superseded” is not acceptable for a Retire verdict: the reason must state the specific defect and what covers the same need instead. “Too long” is not acceptable for Improve: the reason must name a specific section, line range, target size, and the action to take. The skill also forbids “unchanged” as the only justification when a Quick Scan re-evaluates a skill whose mtime changed but whose content did not.
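One way to picture the reason-quality rule is as a lint pass over each verdict. The sketch below is hypothetical: the skill enforces the rule through its prompt, not through code, and the set of bare reasons and the digit heuristic for Improve are assumptions chosen for illustration.

```python
import re

# Assumption: the one-phrase reasons the text above calls out as too thin.
BARE_REASONS = {"superseded", "unchanged", "too long"}


def reason_problems(verdict: str, reason: str) -> list[str]:
    """Return a list of problems; an empty list means the reason passes."""
    problems = []
    normalized = reason.strip().strip(".").lower()
    if not normalized:
        problems.append("reason is empty")
    elif normalized in BARE_REASONS:
        problems.append(f"'{reason.strip()}' alone is not decision-enabling")
    # Heuristic: an Improve reason should carry a number (line range or target size).
    if verdict == "Improve" and not re.search(r"\d", reason):
        problems.append("Improve should name a line range or target size")
    # A Merge verdict is only actionable if it names the target skill.
    if verdict.startswith("Merge into") and not verdict[len("Merge into"):].strip():
        problems.append("Merge must name the target skill")
    return problems
```

A check like this catches only the obvious failures. Whether a longer reason actually states the defect and the replacement still needs a human or model reviewer.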

When to use it

  • Periodic audit (monthly, quarterly) of the installed skill set
  • After bulk-importing skills — check overlap and decide what to retire
  • When the skill directory feels noisy and you don’t know which skills are actually used
  • Before sharing a ~/.claude/ setup with a team — clean out retire/merge candidates first
  • Diagnosing context bloat at the skill-set level (pair it with context-budget for token-level analysis)

When not to reach for it:

  • Searching for a new skill before creating one — that’s skill-scout
  • Authoring a new skill — separate workflow
  • Auditing context-window overhead in tokens — that’s context-budget
  • Single-skill quality check — overhead exceeds value

Install

The skill lives in affaan-m/everything-claude-code at skills/skill-stocktake/. Copy that folder to ~/.claude/skills/skill-stocktake/. It ships shell scripts that the slash command calls: scripts/scan.sh for inventory, scripts/quick-diff.sh for the Quick Scan diff, and scripts/save-results.sh for cache writes. The results cache lives at ~/.claude/skills/skill-stocktake/results.json. The quality evaluation uses the Agent / Task tool (general-purpose subagent), so no extra MCP server is needed.

What a session looks like

  1. Run the audit. /skill-stocktake — defaults to Quick Scan if results.json exists, Full Stocktake otherwise.
  2. Inventory phase. The scan output names the paths that were scanned and how many files were found, followed by a table of every skill with its 7-day and 30-day usage and its frontmatter description.
  3. Quality evaluation in chunks. General-purpose subagent gets ~20 skills + the checklist per invocation. Per-skill verdict + reason returned as JSON. Intermediate results saved with status: "in_progress" after each chunk; resume detection skips already-evaluated skills.
  4. Summary table. Skill | 7d use | Verdict | Reason. The reason column is what makes the verdict reviewable.
  5. Consolidation step. For Retire / Merge candidates, the skill presents detailed justification per file (what defect, what alternative covers it, what dependencies break on removal) before any deletion. Explicit user confirmation required for every removal.
  6. Improve verdicts. Specific change descriptions — section name, line range, target size, rationale. Operator decides whether to act.
  7. Cache the results. results.json updated with evaluated_at (real UTC timestamp, not date-only), enabling the next Quick Scan to diff against this state.
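The chunked evaluation, intermediate saves, and resume behaviour in steps 3 and 7 can be sketched as a small loop. This is a sketch under stated assumptions, not the skill's implementation: the evaluate callback stands in for the subagent call, the "skills" key and the "completed" status value are hypothetical, and only status: "in_progress" and evaluated_at come from the description above.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable

Verdicts = dict[str, dict[str, str]]  # skill name -> {"verdict": ..., "reason": ...}


def utc_now() -> str:
    """Full UTC timestamp, not a date-only string."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def run_stocktake(
    names: list[str],
    evaluate: Callable[[list[str]], Verdicts],
    cache_path: Path,
    chunk_size: int = 20,
) -> dict:
    """Evaluate skills in chunks, saving after each chunk so a rerun can resume."""
    cache: dict = {"status": "in_progress", "skills": {}}
    if cache_path.exists():
        loaded = json.loads(cache_path.read_text(encoding="utf-8"))
        # Resume only from an interrupted run; a finished run starts fresh here.
        if loaded.get("status") == "in_progress":
            cache = loaded
    done = cache.setdefault("skills", {})
    pending = [n for n in names if n not in done]  # skip already-evaluated skills
    for start in range(0, len(pending), chunk_size):
        batch = pending[start:start + chunk_size]
        done.update(evaluate(batch))  # stand-in for one subagent invocation
        cache["evaluated_at"] = utc_now()
        cache_path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    cache["status"] = "completed"  # assumed name for the final status
    cache["evaluated_at"] = utc_now()
    cache_path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    return cache
```

Writing the cache after every chunk means an interrupted run loses at most one batch of work, which is the point of the resume detection in step 3.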

The discipline that makes it work: decision-enabling reasons. “Superseded” or “unchanged” alone gives the operator no basis for action — they have to re-read the skill to decide. The strict reason requirements turn the audit output into a queue of actionable items the operator can work through without revisiting source.

Receipts

TODO: to be filled in from a real session. Once the audit has been run against a real skill collection, this section will capture how many skills landed in each verdict bucket (Keep / Improve / Update / Retire / Merge) and which verdict was most common, whether Quick Scan diffed correctly or missed an mtime-only change that mattered, whether the chunk size of ~20 stayed inside subagent context limits without truncation, and which Retire reasons actually held up when the operator went to delete the skill (some “superseded” claims fall apart on inspection).

Source and attribution

From Affaan M’s everything-claude-code — an MIT-licensed skill collection covering harness construction, agent ops, video, payments, and platform-specific patterns.

License: MIT.

Quoting the reason-quality rule verbatim: “The reason field must be self-contained and decision-enabling: do NOT write ‘unchanged’ alone — always restate the core evidence.” That’s the wedge — audits that emit one-word verdicts decay into noise; audits that emit reviewable justifications stay useful between runs.