safety-guard

A Claude Code skill from Affaan M's everything-claude-code repo that prevents destructive operations in production / autonomous-agent sessions via PreToolUse hooks. Three modes — Careful (intercept destructive bash patterns like rm -rf / git push --force / DROP TABLE), Freeze (lock edits to a specific directory tree), Guard (both, with read-everywhere / write-restricted).

Block destructive commands and restrict edits to a directory when running agents autonomously

Source Affaan M

License MIT

First documented 2026-05-12

Receipts TODO

Agents Security

Trigger phrases

Phrases that activate this skill when typed to Claude Code:

block rm -rf in this session
lock edits to src/components
safety guard for autonomous agent

What it does

safety-guard is the destructive-operation guard skill in Affaan M’s everything-claude-code — see skills/safety-guard. It intercepts destructive commands and out-of-scope edits via PreToolUse hooks against Bash, Write, Edit, and MultiEdit tool calls. Three modes give a graduated response: Careful (intercept and warn), Freeze (lock edits to one tree), Guard (both combined).

Careful mode watches a hard-coded set of dangerous patterns: rm -rf (especially with /, ~, or project root), git push --force, git reset --hard, git checkout . (discard all changes), DROP TABLE / DROP DATABASE, docker system prune, kubectl delete, chmod 777, sudo rm, npm publish (the accidental-publish guard), and any command with --no-verify. On match, the hook shows what the command does, asks for confirmation, and suggests a safer alternative.
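The Careful-mode matching described above can be sketched as a small rule table. This is a minimal illustration, not the skill's actual implementation: the regexes, the Rule type, and match_rule are all assumptions introduced here, and only a subset of the listed patterns is covered.

```python
import re
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    pattern: re.Pattern
    effect: str
    alternative: str


# Illustrative subset; these regexes are assumptions, not the skill's own.
RULES = [
    Rule(re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+"),
         "recursively deletes files without prompting",
         "delete a narrower path, or move it to a trash directory first"),
    Rule(re.compile(r"\bgit\s+push\b.*--force(?!-with-lease)\b"),
         "overwrites remote history",
         "git push --force-with-lease"),
    Rule(re.compile(r"\bgit\s+reset\s+--hard\b"),
         "discards uncommitted changes",
         "git stash, then reset"),
    Rule(re.compile(r"\bdrop\s+(table|database)\b", re.IGNORECASE),
         "permanently removes a table or database",
         "take a backup or rename the object first"),
    Rule(re.compile(r"--no-verify\b"),
         "skips git hooks",
         "fix the failing hook instead of bypassing it"),
]


def match_rule(command: str) -> Optional[Rule]:
    """Return the first rule whose pattern occurs in the command, else None."""
    for rule in RULES:
        if rule.pattern.search(command):
            return rule
    return None


hit = match_rule("git push --force origin main")
if hit:
    print(f"This command {hit.effect}. Safer: {hit.alternative}")
```

Each rule carries its own explanation and safer alternative, so the warning shown to the operator comes from the same table as the match. A regex table like this misses split flags such as rm -r -f, which is one reason to review real logs before trusting any pattern set.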

Freeze mode locks file edits to a specific directory tree: /safety-guard freeze src/components/ blocks any Write, Edit, or MultiEdit outside the named subtree. This is useful when an agent should focus on one area without touching unrelated code. Guard mode combines both: agents can read anything but only write to the named directory, and destructive commands are blocked everywhere. Logs go to ~/.claude/safety-guard.log. To unlock, run /safety-guard off.
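The Freeze-mode restriction comes down to a path containment test. The sketch below shows one way to write it, assuming paths are resolved relative to a project root; is_within and its parameters are hypothetical names, not part of the skill.

```python
from pathlib import Path


def is_within(target: str, allowed_root: str, project_root: str = ".") -> bool:
    """Return True if target resolves to a path inside allowed_root."""
    base = Path(project_root).resolve()
    root = (base / allowed_root).resolve()
    candidate = (base / target).resolve()
    # Compare whole path components so that src/components-old does not
    # pass as a child of src/components.
    return candidate == root or root in candidate.parents


print(is_within("src/components/Button.tsx", "src/components/"))
print(is_within("src/components/../api/users.ts", "src/components/"))
```

Resolving both paths before comparing means a ../ segment cannot be used to step out of the frozen tree, and comparing components rather than string prefixes avoids matching sibling directories with similar names.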

When to use it

Working on a production system where a wrong command has expensive consequences
Running agents autonomously (codex -a never mode and similar) where the operator isn’t reviewing every action
Focused refactors where the agent should touch only one directory
Sensitive operations — migrations, deploys, data changes — where the destructive-command set is the right safety net
Pairing with llm-trading-agent-security or production-audit for high-stakes work

When not to reach for it:

Solo interactive sessions where the operator is reviewing every command
Exploratory work where the agent legitimately needs to touch many directories
Cases where the Freeze-mode restriction would block legitimate edits, so the overhead exceeds the value
Code review for security findings — that’s production-audit or skill-security-auditor

Install

The skill lives in affaan-m/everything-claude-code at skills/safety-guard/. Drop the folder into ~/.claude/skills/safety-guard/. The PreToolUse hooks must be wired into ~/.claude/settings.json against the Bash, Write, Edit, and MultiEdit matchers: the skill ships the hook script(s), and the operator adds the matcher entries. Logs land at ~/.claude/safety-guard.log by default.
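As a rough guide to the wiring step, the sketch below builds the hooks section that would go into ~/.claude/settings.json and prints it as JSON. The nesting follows the general Claude Code hooks settings layout (a PreToolUse list of matcher entries, each with command hooks). The script path hook.py is a placeholder assumption, since the name of the script the skill ships is not stated here.

```python
import json

# Tool names taken from the text above; the script path is a placeholder.
TOOL_MATCHERS = ["Bash", "Write", "Edit", "MultiEdit"]


def build_hooks_fragment(hook_command: str) -> dict:
    """Build a PreToolUse section with one matcher entry per guarded tool."""
    entries = [
        {
            "matcher": tool,
            "hooks": [{"type": "command", "command": hook_command}],
        }
        for tool in TOOL_MATCHERS
    ]
    return {"hooks": {"PreToolUse": entries}}


# Hypothetical path: substitute the script the skill actually ships.
fragment = build_hooks_fragment("python3 ~/.claude/skills/safety-guard/hook.py")
print(json.dumps(fragment, indent=2))
```

The printed fragment is meant to be merged by hand into an existing settings.json rather than written over it, so any hooks already configured there are preserved.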

What a session looks like

Operator enables a mode. /safety-guard careful for warn-on-destructive, /safety-guard freeze src/api/ for directory lock, /safety-guard guard --dir src/api/ --allow-read-all for both.
Agent runs a command. PreToolUse hook intercepts the Bash invocation. If the command matches a watched pattern (rm -rf node_modules), the hook shows the operator what the command does and asks for confirmation. Operator approves or rejects.
Agent tries an out-of-scope edit. In Freeze or Guard mode, an Edit to src/components/Header.tsx while frozen on src/api/ gets blocked. The hook explains the restriction and points the agent at the allowed directory.
Log entry. Every block goes to ~/.claude/safety-guard.log — the audit trail for what was blocked and when. Useful when reviewing whether the policy was right or too restrictive.
Unlock when done. /safety-guard off removes the active mode.
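The session flow above can be sketched as a single hook entry point. This is an illustration under stated assumptions, not the skill's code: the regex, the mode and frozen_dir parameters, and the log line format are invented for the example. The event fields tool_name and tool_input, and exit code 2 as the blocking signal, follow the Claude Code hooks convention; the JSON printed for the confirmation case is an assumption to verify against the current hooks documentation.

```python
import json
import re
import sys
import time
from pathlib import Path

# Hypothetical stand-ins for the skill's real rule set.
DESTRUCTIVE = re.compile(r"\brm\s+-\w*r\w*f|\bgit\s+reset\s+--hard\b|--no-verify\b")
EDIT_TOOLS = {"Write", "Edit", "MultiEdit"}


def decide(event: dict, mode: str, frozen_dir: str = "") -> tuple:
    """Return (verdict, reason); verdict is 'allow', 'ask', or 'deny'."""
    tool = event.get("tool_name", "")
    tool_input = event.get("tool_input", {})
    if tool == "Bash" and mode in ("careful", "guard"):
        command = tool_input.get("command", "")
        if DESTRUCTIVE.search(command):
            # Careful asks for confirmation; Guard blocks outright.
            verdict = "ask" if mode == "careful" else "deny"
            return verdict, f"destructive pattern in: {command}"
    if tool in EDIT_TOOLS and mode in ("freeze", "guard"):
        root = Path(frozen_dir).resolve()
        target = Path(tool_input.get("file_path", "")).resolve()
        if target != root and root not in target.parents:
            return "deny", f"{target} is outside the frozen directory {root}"
    return "allow", ""


def log_line(verdict: str, tool: str, reason: str, now=None) -> str:
    """Format one audit entry (the format itself is an assumption)."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    return f"{stamp} {verdict.upper()} {tool} {reason}"


def main(mode="guard", frozen_dir="src/api",
         log_path="~/.claude/safety-guard.log") -> int:
    event = json.load(sys.stdin)  # the hook payload arrives on stdin
    verdict, reason = decide(event, mode, frozen_dir)
    if verdict == "allow":
        return 0
    with open(Path(log_path).expanduser(), "a", encoding="utf-8") as log:
        log.write(log_line(verdict, event.get("tool_name", ""), reason) + "\n")
    if verdict == "deny":
        print(reason, file=sys.stderr)  # stderr is relayed to the agent
        return 2  # exit code 2 blocks the tool call
    print(json.dumps({"hookSpecificOutput": {
        "hookEventName": "PreToolUse",
        "permissionDecision": "ask",
        "permissionDecisionReason": reason,
    }}))
    return 0

# A real hook script would end with: sys.exit(main())
```

Keeping decide free of I/O makes the policy easy to test against recorded events, which is also how a too-restrictive rule found in the log can be reproduced and adjusted.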

The discipline that makes it work is prevention before execution. The hooks run before the destructive action lands, so there’s no “the agent already did the rm -rf and now we’re recovering from backups” failure mode. The cost is friction; the payoff is avoiding irrecoverable state.

Receipts

TODO: to be filled in from a real session. Once the guard has been enabled in a real autonomous session, this section will capture:

How many destructive-pattern matches fired in a day of autonomous work, and whether any false positives caused unnecessary friction
Whether Freeze mode blocked a legitimate cross-directory edit the agent wanted to make, and whether that was the right call
The actual content of ~/.claude/safety-guard.log for a representative session
Whether any destructive pattern slipped through the regex (the most likely candidates are spacing variants such as rm -rf written with a double space)
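To make the spacing-variant concern concrete, the sketch below contrasts a literal substring check with a token-based check built on the standard shlex module. It is a hypothetical alternative for comparison, not a description of how the skill matches commands; is_recursive_force_rm is a name introduced here.

```python
import shlex


def is_recursive_force_rm(command: str) -> bool:
    """Detect rm with both recursive and force flags, in any spelling."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unbalanced quotes: fail closed and flag the command
    for i, tok in enumerate(tokens):
        if tok.rsplit("/", 1)[-1] != "rm":
            continue
        flags = set()
        # Scans to the end of the line, so it may over-report on chained commands.
        for arg in tokens[i + 1:]:
            if arg == "--recursive":
                flags.add("r")
            elif arg == "--force":
                flags.add("f")
            elif arg.startswith("-") and not arg.startswith("--"):
                # -R is equivalent to -r for rm.
                flags.update("r" if c == "R" else c for c in arg[1:])
        if {"r", "f"} <= flags:
            return True
    return False


variant = "rm  -rf build"  # note the double space
print("rm -rf" in variant)             # the literal check misses it
print(is_recursive_force_rm(variant))  # the token check catches it
```

Tokenizing first makes the check insensitive to spacing, flag order, and split flags, at the cost of treating the command line more loosely than a shell parser would.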

Source and attribution

From Affaan M’s everything-claude-code — an MIT-licensed skill collection covering harness construction, agent ops, video, payments, and platform-specific patterns.

License: MIT.