# Six personas in parallel, three findings with receipts

> Six agency-agent personas dispatched as general-purpose subagents — emulated, not installed. Three findings: a frontmatter gap, a schema gap, three a11y gaps.

**Canonical URL**: https://agentcookbooks.com/blog/six-personas-in-parallel-three-findings-emulated-not-installed/

**Published**: 2026-05-09

**Tags**: claude-code, skills, agents

---

The wiki had eighteen skills shipping with `## Receipts` sections that read "TODO — to be filled in from a real session" — a transparency choice over filler, but a section that says "real notes coming" can only stay there so long. To cut the bucket, I dispatched six [agency-agent personas](/skills/) in parallel, each scoped to a specific artifact on the live site. Roughly ~390K agent tokens, ~12 minutes wall time, three real findings worth the receipts. The non-obvious part: an agency-agent "persona" dispatched via the Agent tool is not the installed `~/.claude/agents/` file. It's a general-purpose subagent that read the persona prompt. The wiki receipts have to say so.

## What I ran

The setup: an `Explore` agent first, to count the buckets. 160 skill pages on the wiki at the time. 18 with a `## Receipts` section reading "TODO — to be filled in from a real session." 5 already firsthand. 137 still plausible-generic from earlier bulk imports. Triage said: don't run all 18. Split into *naturally applicable today* (six personas with an obvious live target on this site), *contrived* (skills I'd have to invent a target for), and *external setup* (skills that need separate environment, e.g. Python and an LLM API key).

The six dispatched personas — all from [github.com/msitarzewski/agency-agents](https://github.com/msitarzewski/agency-agents), MIT-licensed:

- [`code-reviewer`](/skills/code-reviewer/) → review of commit `be11144` (4 freshly-imported Osmani skill pages)
- [`evidence-collector`](/skills/evidence-collector/) → HTML evidence pass on `/skills/seo/` + `/skills/topic/seo/`
- [`accessibility-auditor`](/skills/accessibility-auditor/) → live homepage + `/skills/` against WCAG 2.2 AA
- [`technical-writer`](/skills/technical-writer/) → review of `receipts-drafts/_TEMPLATE.md`
- [`agents-orchestrator`](/skills/agents-orchestrator/) → meta-review of this exact dispatch flow
- [`codebase-onboarding-engineer`](/skills/codebase-onboarding-engineer/) → cold-read of the repo

Three more skills got receipts the same session, but those I ran myself (against the working tree, the audit itself, and the recent commit history) rather than dispatching a persona.

The trigger phrase for each was explicit: *"Read the agency-agents persona file at this URL. Take that as your role. Audit X. Stay in lane — report findings, don't fix anything."* Then a tight scope: one URL, one commit, one file.

## What happened

Three findings worth the receipts.

**Finding 1 — `code-reviewer` caught a frontmatter gap on the most recent commit.** The four Addy Osmani skill pages added in `be11144` were missing `receiptsStatus: "todo"` on every frontmatter block. The schema at `src/content.config.ts:31` defines the field as `z.enum(['firsthand','todo','generic'])` and optional. Without it, the index table renders a `—` dash instead of the visible-TODO chip the transparency policy ships. Caught and fixed in the same commit that consolidated all nine receipts. Persona output, verbatim:

> Important #1: missing `receiptsStatus: "todo"` on all 4 frontmatter blocks. Schema in `src/content.config.ts:31` defines `receiptsStatus: z.enum(['firsthand','todo','generic'])` as optional; without it, these 4 files land in the `—` column on the index instead of the visible TODO bucket the visible-TODO transparency policy is designed to surface.

That's the kind of finding that sits in the gap between "ships and renders" and "ships as designed." A unit test wouldn't catch it; a frontmatter-validation rule could but doesn't exist; a human reviewing four similar files at once tends to skim. The persona caught it on the second file and verified across all four.

**Finding 2 — `evidence-collector` flagged the canonical slug buried in alphabetical order.** On `/skills/topic/seo/`, 28 skill cards in alphabetical order put the canonical slug `seo` at position 4 (after `ai-seo`, `programmatic-seo`, `schema-markup`). The hand-written lede at the top names five anchor skills — `seo-audit`, `ai-seo`, `schema-markup`, `programmatic-seo`, `seo-cluster` — but not the canonical `seo` itself. Among 25+ `seo-*` siblings, the broadest entry isn't elevated. The persona also flagged a schema asymmetry: skill detail pages emit `BreadcrumbList`, topic pages don't.

Both findings are open work. The first is a sort-key fix; the second is a layout edit. Neither blocks ship.

**Finding 3 — `accessibility-auditor` found three real WCAG AA gaps on the live site.** Skip link missing (`<a href="#main">` should be the first focusable element; the `id="main"` was there, the link wasn't). Pagefind search input lacking a server-rendered accessible name (the SSR shell is just `<div id="ac-pagefind-search">`; the `<input>` is JS-injected post-mount, and Pagefind UI 1.5.x sets `aria-label="Search"` by default but the SSR markup couldn't be verified). Cloudflare email-obfuscation footer link reading literally "email protected" to screen readers — that one is invisible from a source-code audit because the obfuscation happens edge-side, not in the Astro build.

Two of the three landed in the same `2a5a1d7` commit: skip link in `src/layouts/Base.astro`, `aria-label="Email Agent Cookbooks"` on the footer link. Pagefind verification deferred (it needs DevTools post-deploy).

## Where it drifted

The honesty footnote is the part the wiki receipts had to be written carefully around.

An agency-agent "persona" file is a system prompt in a markdown frontmatter block, designed to be installed at `~/.claude/agents/<name>.md` and selected via the Agent tool's `subagent_type` parameter. When you dispatch via Claude Code without that file installed, what runs is a *general-purpose subagent* given the persona's text as part of its prompt. Same model, same tools, but: no `subagent_type` lookup, no description-based dispatch matching, no `tools` field enforcement from the agent definition. The behavior approximates the persona; it isn't the persona.

This matters for the wiki receipts because every page above is framed as "persona dispatched on X." Read literally, that implies the installed-agent path. The receipts now say "persona-emulation dispatch — general-purpose agent given the persona prompt, not the installed agent file." Three lines longer; honest.

The receipts that got promoted to firsthand on the wiki call this out explicitly in their Receipts sections. Future runs of this kind of dispatch should name the emulation up front — what's running is a general-purpose subagent reading the persona prompt, not the persona's installed agent file.

The other thing that drifted: the cost. ~390K agent tokens for six dispatches isn't free. If the goal had been "generic plausible content for the chip bucket," this would be the wrong way to spend the budget. The justification was that each finding above is *specifically true about this site* — a frontmatter gap on a specific commit, a slug buried in a specific topic page, three a11y gaps on specific URLs — not a generic "what a code reviewer would say." Generic findings are cheap. Specific findings, run in parallel, are the price you pay for scaling firsthand audits past one human.

## What I'd change

Three things.

**Don't dispatch personas to fill TODO buckets.** Dispatch them to find a *specific* thing on a *specific* artifact. The bucket-filling framing is the trap — it leads to plausible-generic receipts that read confident but wouldn't grep-verify. The framing that worked: "audit commit X for frontmatter gaps." The framing that wouldn't have: "give me a code review for the wiki."

**Inherit the evidence rubric in the prompt.** Each persona dispatch named the artifact (URL, commit hash, file path) and required findings to cite specifics that could be re-verified. Without that, semantic classifications drift toward judgment calls that don't ground out. The dispatched agent inherits whatever evidence rubric the prompt makes explicit; if the prompt doesn't carry it, the agent fills the gap with judgment.

**Cap the parallel batch.** Six was the right number for this site at this state — three personas would have under-used the parallel cost; ten would have over-spent. The triage step (run all 18? no — six naturally-applicable) was the actual decision; the parallel dispatch was the mechanical part. Push back on "run them all" before it becomes a habit. The same triage shape showed up earlier in [Four SEO skills on one homepage: only Sweep 4 found the gap](/blog/four-seo-skills-only-sweep-4-found-the-gap/) — running multiple lenses on one target only pays when each lens is differentiated, not when they overlap.