
18 pre-commit rules that block the commit, not the launch

The usual indie path is “add the privacy policy and the consumer-protection labels sometime before launch.” That path ends with a signup form shipping without consent tags (a GDPR breach), or a featured card shipping without an “Ad” label (a regulator notice), discovered, if at all, after the site is live. This setup moves all of it to the commit boundary: 18 rules live in .githooks/pre-commit, run on every git commit, and any failure blocks the commit (exit 1). The rules are compliance-grained, not just code-quality-grained, and the hook itself doubles as a machine-readable compliance spec.

What I ran

Not a Claude Code skill activation — a cross-project transfer of a compliance-as-code setup from a sister Astro 6 site. The project spec defines 18 pre-commit rules. A single script in .githooks/pre-commit runs all of them on every commit; the first failure blocks (exit 1).

Activation on a fresh clone is one line:

git config core.hooksPath .githooks

That points Git at .githooks/ instead of the default .git/hooks/, so the rules are version-controlled and travel with the repo rather than living in each developer’s local clone.
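One thing the one-liner does not cover: Git ignores a hook file that is not executable, so a fresh clone can look protected while running nothing. A minimal sketch of a setup check, assuming it is run against the repository root; activate_hooks is a hypothetical helper name, not part of the original setup.

```shell
# Sketch: activate the versioned hooks and confirm Git will actually run them.
# activate_hooks is a hypothetical helper, not part of the original repo.
activate_hooks() {
  repo_dir=${1:-.}
  git -C "$repo_dir" config core.hooksPath .githooks || return 1
  # Git ignores hook files that are not executable, so check the bit too.
  if [ -x "$repo_dir/.githooks/pre-commit" ]; then
    echo "pre-commit active via $(git -C "$repo_dir" config --get core.hooksPath)"
  else
    echo "pre-commit is missing or not executable; try: chmod +x .githooks/pre-commit" >&2
    return 1
  fi
}
```

Calling activate_hooks with no argument checks the current directory.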

What happened

The 18 rules, with the regulatory or quality concern each one enforces:

 #  Rule                                                 Concern
 1  API-key leak prevention                              Security
 2  Trademark phrasing ("X recommends Y")                Brand/IP
 3  Sponsored content without an "Ad" label              Consumer protection
 4  Affiliate link without disclosure                    Consumer protection
 5  Signup code without 3 consent-tag attributes         GDPR art. 7
 6  Individual service prices in .md files               Data isolation
 7  Privacy notice without controller details            GDPR art. 13
 8  Pre-checked optional consent                         GDPR art. 7
10  Sitemap without an exclude list                      SEO/privacy
11  "Pay-to-remove" terms                                Consumer manipulation
12  Manipulative marketing terms                         Consumer manipulation
13  Sensitive-topic content without expert review        E-E-A-T / liability
14  District page without unique local content           Thin-content
15  Special venue without permit-source URL OR data >60d Liability
16  Sensitive term without an official-source link       E-E-A-T
17  Dev-only debug attributes in production code         Cleanup
18  Broken internal links                                Quality

(The numbering is the source spec’s, and it skips 9, so the table shows 17 rows under the spec’s “18 rules” label. I kept it as-is rather than inventing a rule to fill the gap.)

The concrete one worth reading is rule #5, the three-consent check. Every signup form has to ship with three separate data-consent-* attributes, and the hook fails the commit if any one is missing:

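# Rule #5 excerpt. "$f" (the file under check) and fail() come from the
# surrounding hook script, which is not shown here.
if grep -qE '(action="/api/subscribe"|data-form="subscribe")' "$f"; then
  MISSING=""
  # Collect the name of every consent attribute that is absent from the file.
  grep -q 'data-consent-newsletter' "$f" || MISSING="$MISSING newsletter"
  grep -q 'data-consent-local'      "$f" || MISSING="$MISSING local"
  grep -q 'data-consent-profiling'  "$f" || MISSING="$MISSING profiling"
  if [ -n "$MISSING" ]; then
    fail 5 "Signup form ($f) is missing consent tags:$MISSING"
  fi
fi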

The check is dumb on purpose: detect a signup form, then assert that three named attributes are present somewhere in the same file. A forgotten consent attribute can no longer slip into a commit by accident: the commit doesn’t land without all three unless someone deliberately bypasses the hook with git commit --no-verify.
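The excerpt relies on a fail helper and a file variable that the post does not show. Here is a minimal sketch of how that wiring could look, with the rule #5 body copied from the excerpt; fail, check_signup_consent and run_rule5 are assumed names and shapes, not the original script.

```shell
# Hypothetical skeleton around the rule #5 excerpt. fail() and run_rule5() are
# assumptions about how such a hook could be wired, not the original script.

fail() {
  # $1 = rule number, $2 = message; report on stderr and stop with exit 1.
  echo "rule #$1: $2" >&2
  exit 1
}

check_signup_consent() {
  f=$1
  if grep -qE '(action="/api/subscribe"|data-form="subscribe")' "$f"; then
    MISSING=""
    grep -q 'data-consent-newsletter' "$f" || MISSING="$MISSING newsletter"
    grep -q 'data-consent-local'      "$f" || MISSING="$MISSING local"
    grep -q 'data-consent-profiling'  "$f" || MISSING="$MISSING profiling"
    if [ -n "$MISSING" ]; then
      fail 5 "Signup form ($f) is missing consent tags:$MISSING"
    fi
  fi
}

# Reads one file path per line on stdin and checks each existing file.
# The explicit subshell contains fail's exit; its status becomes the return value.
run_rule5() {
  (
    while IFS= read -r f; do
      [ -f "$f" ] && check_signup_consent "$f"
    done
    exit 0
  )
}
```

In a hook, the staged file list could be piped in with git diff --cached --name-only --diff-filter=ACM | run_rule5 || exit 1. Note that this version reads the working-tree copy of each staged file, which can differ from the staged content after a partial git add.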

Rule #15 is the one that reaches past the working tree into the data. It runs a Node script over the data file: every special-category venue needs a permit-source URL, and every venue needs a data_updated date less than 60 days old. Stale data blocks the commit until a refresh runs — the staleness check is enforced at the same boundary as everything else, so data rot can’t accumulate silently between launches.
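The source runs this check as a Node script, which is not shown. The freshness half of the rule is plain date arithmetic, sketched here in shell instead. Assumptions: GNU date (the -d flag), ISO dates in data_updated, and a record exactly 60 days old counted as stale, following the "less than 60 days old" wording; is_stale and MAX_AGE_DAYS are hypothetical names.

```shell
# Sketch of the 60-day freshness test behind rule #15. The real rule is a Node
# script; this is only the date arithmetic, assuming GNU date and YYYY-MM-DD.
MAX_AGE_DAYS=60

# is_stale UPDATED [TODAY]
# Returns 0 when UPDATED is MAX_AGE_DAYS or more days before TODAY,
# 1 when the data is fresh, 2 when a date cannot be parsed.
is_stale() {
  updated=$1                      # value of a venue's data_updated field
  today=${2:-$(date -u +%F)}      # injectable "today" keeps the check testable
  updated_s=$(date -u -d "$updated" +%s) || return 2
  today_s=$(date -u -d "$today" +%s) || return 2
  age_days=$(( (today_s - updated_s) / 86400 ))
  [ "$age_days" -ge "$MAX_AGE_DAYS" ]
}
```

A hook would call is_stale once per venue record and fail rule 15 on the first stale one. The permit-source URL half of the rule is a separate presence check and is left out of this sketch.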

The whole 18-rule pass costs about 150ms per commit. That’s the entire price: a sub-quarter-second tax in exchange for a hardcoded guard against the most common regulatory mistakes, one that fires on every commit with no compliance team in the loop.

Where it drifted

The framing that makes this more than a linting setup: .githooks/pre-commit is a machine-readable compliance spec.

A written compliance spec — the kind that lives in a doc and explains why the site needs an “Ad” label on sponsored cards and three consent attributes on every signup — has a known failure mode. It documents the “why,” and then the code drifts away from it. Six months later the doc says one thing and the templates do another, and nobody notices until a complaint arrives. The doc rotted because nothing was checking that the code still matched.

The hook closes that gap by being a second form of the same source of truth. The written spec documents the “why”; the hook runs the “how.” They can’t silently disagree, because the hook executes on every commit and the commit fails the moment the code stops matching the rule. The spec can’t rot away from the code when the spec is also the thing gating the code.

The other piece is the boundary choice. Most of these failures are cheap to fix at the commit boundary and expensive at the launch boundary. A missing consent attribute is a one-line edit when the form is the file you’re committing; it’s a GDPR breach when it’s already live and someone has subscribed through it. A featured card without an “Ad” label is a one-character class change at commit time; live, it’s a consumer-protection notice and a complaint process that can run up to 18 months with a fine reaching 4% of turnover. The hook doesn’t reduce the work of compliance — it relocates it to the point where the work is smallest and the cost of a miss is lowest.

This sits alongside the deterministic hooks already in the cookbook — the frontmatter guard is the same shape (cheap structural check, hoisted from build-time to edit-time), and the content-judge hook is the probabilistic counterpart. It’s worth being clear about what kind of hook this is, though: these are Git pre-commit hooks (.githooks/pre-commit, fired by git commit), not Claude Code’s native PreToolUse/PostToolUse hooks from the hooks cookbook, and not the regex-to-message rules in hookify-rules. Three distinct mechanisms that all carry the word “hook.” If you want guardrails on the Claude Code side of the same boundary, the git-guardrails-claude-code skill covers that surface.

What I would change

Two concrete moves.

Split the fast checks from the slow ones. Most rules are greppable and instant; rule #15 shells out to a Node script over a data file. Right now all 18 run on every commit and the total is ~150ms, which is fine — but as the data file grows, #15 is the rule that will drift that number upward. The move is to keep the regex rules in the always-on pre-commit pass and gate the data-scanning rule (#15) behind a pre-push hook or CI, where a slower run is acceptable. The commit-boundary tax stays cheap; the expensive check still blocks before anything ships.
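One way to sketch that split is a shared function that both hooks call with a stage name. Everything here is hypothetical: run_stage, the placeholder check functions, and VENUE_CHECK_CMD, which stands in for whatever command runs the real data scan.

```shell
# Sketch: one rules file, two thin hooks. All names are hypothetical.
check_api_keys()       { :; }   # placeholders for the fast, grep-based rules
check_signup_consent() { :; }
check_venue_data() {
  # Stand-in for the rule #15 data scan; defaults to a no-op in this sketch.
  ${VENUE_CHECK_CMD:-true}
}

# run_stage commit|push
# Runs the rules assigned to that stage; the first failure returns 1.
run_stage() {
  stage=$1
  case $stage in
    commit) rules="check_api_keys check_signup_consent" ;;
    push)   rules="check_venue_data" ;;
    *) echo "unknown stage: $stage" >&2; return 2 ;;
  esac
  for rule in $rules; do
    "$rule" || return 1
  done
}
```

With this shape, .githooks/pre-commit would end in run_stage commit || exit 1 and a new .githooks/pre-push in run_stage push || exit 1. Git looks up both hooks in the same core.hooksPath directory, and a non-zero exit from pre-push aborts the push.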

Make the executable hook the single source of the rule list. The framing above — the hook is the compliance spec — only fully holds if the human-readable rule list and the script that enforces it are the same artifact. The moment they’re maintained separately, a table can drift from the checks it claims to describe, and the “spec can’t rot” property leaks at exactly that seam. The move is to generate the rule list from the same registry the script iterates — one array of { id, concern, check }, the script runs the checks, a build step renders the list — so there’s genuinely one source of truth instead of a doc and a script kept in sync by hand.
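A rough sketch of that registry in shell, assuming each check is a function that returns non-zero on a violation. The two entries, the function names, and the id|concern|check line format are illustrative choices, not the original hook's layout.

```shell
# Sketch of a single rule registry: one line per rule, id|concern|check.
# Entries and function names are hypothetical placeholders.
RULES='1|Security|check_api_keys
5|GDPR art. 7|check_signup_consent'

check_api_keys()       { :; }   # real checks would grep the staged files
check_signup_consent() { :; }

# Run every registered check in order; stop at the first failure.
run_rules() {
  while IFS='|' read -r id concern check; do
    if ! "$check"; then
      echo "rule #$id failed ($concern)" >&2
      return 1
    fi
  done <<EOF
$RULES
EOF
}

# Render the human-readable list from the same registry.
render_rules() {
  while IFS='|' read -r id concern check; do
    printf '%s | %s | %s\n' "$id" "$concern" "$check"
  done <<EOF
$RULES
EOF
}
```

The hook would call run_rules, and a build step could redirect render_rules into the published rule list, so adding a rule means editing one line of RULES and writing its function.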