# ce-agent-native-architecture

> A skill from Every's compound-engineering plugin for building applications where agents are first-class citizens — features are prompt-defined outcomes, not hand-written code. It teaches five principles (Parity, Granularity, Composability, Emergent Capability, Improvement Over Time); its companion `ce-agent-native-audit` scores a codebase against eight concrete principles using parallel subagents and produces a prioritized gap matrix.

**Use case**: Design — or audit — systems where agents drive features through tools and prompts instead of hardcoded code paths

**Canonical URL**: https://agentcookbooks.com/skills/ce-agent-native-architecture/

**Topics**: claude-code, skills, agent-native, architecture, agents

**Trigger phrases**: "make this app agent-native", "design tools an agent can drive", "audit our agent-native architecture"

**Source**: [Every](https://github.com/EveryInc/compound-engineering-plugin/tree/main/plugins/compound-engineering/skills/ce-agent-native-architecture)

**License**: MIT

---

## What it does

`ce-agent-native-architecture` is the design skill in [Every's compound-engineering plugin](https://github.com/EveryInc/compound-engineering-plugin) — see [skills/ce-agent-native-architecture](https://github.com/EveryInc/compound-engineering-plugin/tree/main/plugins/compound-engineering/skills/ce-agent-native-architecture). It helps you architect systems where **agents drive functionality rather than traditional code** — where, in the plugin's framing, "features are prompt-defined outcomes" and "new features = new prompts, not new code." You reach for it when designing an agent-powered app, building tools for an agent to operate, or refactoring an existing system toward autonomous operation. A session starts from an intake menu (tool design, execution patterns, testing, and so on) and applies the relevant patterns to your context.

The framework rests on five principles: **Parity** ("whatever the user can do through the UI, the agent can achieve through tools"), **Granularity** (tools as atomic primitives), **Composability** (new features as new prompts, not new code), **Emergent Capability** (the agent accomplishes things you didn't explicitly design for), and **Improvement Over Time** (the app gets better through accumulated context and prompt refinement, without shipping code). The contrast with traditional architecture is the point: behavior emerges from agents looping over well-designed tools and clear system prompts, instead of living in hand-written feature code.
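A minimal sketch of Granularity and Composability follows; it is not the plugin's own code. It assumes a hypothetical in-memory store (`DB`), a hypothetical `TOOLS` registry with a `tool` decorator, and a hypothetical `FEATURES` dict. Tools are small functions that each do one thing, and a feature is only a prompt that names an outcome. A real agent loop would hand the prompt and the tool list to a model, which is left out here.

```python
from typing import Callable, Dict

# Hypothetical in-memory store standing in for a real database.
DB = {"contacts": {}, "deals": {}}

# Hypothetical registry of atomic tools the agent may call.
TOOLS: Dict[str, Callable[..., dict]] = {}


def tool(fn):
    """Register a function as an agent-callable primitive."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def create_contact(contact_id: str, name: str) -> dict:
    DB["contacts"][contact_id] = {"id": contact_id, "name": name}
    return DB["contacts"][contact_id]


@tool
def create_deal(deal_id: str, contact_id: str, stage: str = "lead") -> dict:
    DB["deals"][deal_id] = {"id": deal_id, "contact_id": contact_id, "stage": stage}
    return DB["deals"][deal_id]


@tool
def update_deal_stage(deal_id: str, stage: str) -> dict:
    DB["deals"][deal_id]["stage"] = stage
    return DB["deals"][deal_id]


@tool
def list_deals() -> dict:
    return {"deals": list(DB["deals"].values())}


# A feature is a prompt describing an outcome, not a new code path.
# Adding a feature means adding an entry here; the tools stay unchanged.
FEATURES = {
    "weekly_pipeline_digest": "List every deal grouped by stage and summarize the pipeline.",
}
```

The design choice to notice is that no tool encodes a workflow. The digest exists only as a prompt over primitives the agent already has.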

Its companion **`ce-agent-native-audit`** turns those principles into a scored review. It deploys parallel subagents — one per principle — that enumerate the relevant code artifacts, score compliance against concrete evidence (e.g. "7 out of 10 tools are proper primitives"), identify specific gaps, and compile a prioritized recommendation matrix. It audits **eight** principles: Action Parity, Tools as Primitives, Context Injection, Shared Workspace, CRUD Completeness, UI Integration, Capability Discovery, and Prompt-Native Features. It ships with `disable-model-invocation: true` — you run it deliberately; it doesn't auto-trigger.
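To make the scoring idea concrete, here is a rough sketch of an evidence-based score and a prioritized matrix. It assumes each subagent reports a count of compliant artifacts out of the total it enumerated. The `PrincipleScore` type, the weakest-first ordering rule, and every number except the 7-of-10 example are illustrative assumptions, not the plugin's actual output format or ranking method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrincipleScore:
    principle: str
    compliant: int  # artifacts that satisfy the principle
    total: int      # artifacts enumerated for the principle

    @property
    def ratio(self) -> float:
        # Treat "nothing to check" as fully compliant to avoid dividing by zero.
        return self.compliant / self.total if self.total else 1.0


def prioritize(scores: List[PrincipleScore]) -> List[PrincipleScore]:
    """Order principles from weakest to strongest compliance (assumed rule)."""
    return sorted(scores, key=lambda s: s.ratio)


def render_matrix(scores: List[PrincipleScore]) -> str:
    """Render the prioritized scores as a markdown table."""
    lines = ["| Principle | Score | Compliance |", "| --- | --- | --- |"]
    for s in prioritize(scores):
        lines.append(f"| {s.principle} | {s.compliant}/{s.total} | {s.ratio:.0%} |")
    return "\n".join(lines)


scores = [
    PrincipleScore("Tools as Primitives", 7, 10),   # the example quoted above
    PrincipleScore("CRUD Completeness", 3, 4),      # made-up illustrative numbers
    PrincipleScore("Capability Discovery", 2, 8),   # made-up illustrative numbers
]
print(render_matrix(scores))
```

Sorting by ratio is the simplest possible ranking. A real audit would likely also weigh how costly each gap is to close.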

## When to use it

- Designing a new agent-powered product — what the tool surface should be, how the execution loop runs
- Deciding whether a feature should be code or a prompt-defined outcome over existing tools
- Auditing an existing codebase for how agent-ready it actually is (run `ce-agent-native-audit` for the scored matrix)
- Planning a refactor toward autonomous operation and needing a prioritized gap list

When *not* to reach for it:

- Designing an agent's *internal* harness (tool granularity, observation shape, recovery contract). That is a different concern, covered by agent-harness-construction (see the comparison below)
- Pure CRUD apps with no agent in the loop — the principles don't pay off
- Multi-agent orchestration or parallel dispatch. That is an operational pattern, not a question of how the application itself is architected

How it differs from neighbors on this wiki: [agent-harness-construction](/skills/agent-harness-construction/) (Affaan M) is about giving *one agent* a fighting chance to finish — tool granularity, observation fields, recovery contract. [hexagonal-architecture](/skills/hexagonal-architecture/) is about decoupling domain logic from frameworks. `ce-agent-native-architecture` sits upstream of both: should this product's features live in code at all, or emerge from an agent operating your tools?

## Install

Part of the compound-engineering plugin (not a standalone `~/.claude/skills/` drop-in). In Claude Code:

```
/plugin marketplace add EveryInc/compound-engineering-plugin
/plugin install compound-engineering
```

Then invoke `/ce-agent-native-architecture` to design, and `/ce-agent-native-audit` to score an existing codebase. The plugin prefixes every skill name with `ce-` and is MIT-licensed.

## What a session looks like

1. **State the system.** "We're building a CRM where users manage contacts, deals, and tasks."
2. **Intake menu.** Pick the concern — start with tool design.
3. **Apply Parity.** List what a user can do in the UI (create contact, move deal stage, snooze task) and confirm each is reachable as a tool the agent can call.
4. **Granularity + Composability.** Model tools as primitives (`create_contact`, `update_deal_stage`) so a new feature like "weekly pipeline digest" is a *prompt* composing existing tools, not a new code path.
5. **Run `/ce-agent-native-audit`.** Eight subagents score the codebase, one per principle. For example: Tools as Primitives scores 7/10 (three tools encode business logic), CRUD Completeness flags that deals have no delete tool, and Capability Discovery scores low (users can't find what the agent can do).
6. **Prioritized matrix.** The audit ranks the gaps; you close CRUD Completeness and the three non-primitive tools first.

The discipline that makes it work: Parity first. If the agent can't do something the UI can, every later principle is built on a hole. The audit's value is that it scores against *enumerated code artifacts* rather than vibes — "7 of 10 tools are primitives" is a number you can act on.
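The Parity and CRUD Completeness checks from the session above can be sketched as simple set comparisons. This is an illustration under stated assumptions, not how the audit subagents are implemented. The `UI_ACTIONS` mapping, the `AGENT_TOOLS` set, the `snooze_task` tool name, and the `verb_entity` naming convention are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical mapping from each UI action to the tool that should back it.
UI_ACTIONS: Dict[str, str] = {
    "create contact": "create_contact",
    "move deal stage": "update_deal_stage",
    "snooze task": "snooze_task",
}

# Hypothetical set of tools currently exposed to the agent.
AGENT_TOOLS: Set[str] = {
    "create_contact",
    "update_deal_stage",
    "create_deal",
    "read_deal",
    "update_deal",
}


def parity_gaps(ui_actions: Dict[str, str], agent_tools: Set[str]) -> List[str]:
    """Return UI actions that have no backing agent tool."""
    return sorted(action for action, name in ui_actions.items() if name not in agent_tools)


def crud_gaps(entity: str, agent_tools: Set[str]) -> List[str]:
    """Return CRUD verbs missing for an entity, assuming verb_entity tool names."""
    verbs = ("create", "read", "update", "delete")
    return [verb for verb in verbs if f"{verb}_{entity}" not in agent_tools]


print(parity_gaps(UI_ACTIONS, AGENT_TOOLS))  # UI actions the agent cannot perform
print(crud_gaps("deal", AGENT_TOOLS))        # CRUD verbs with no deal tool
```

With these sample inputs, the parity check surfaces the snooze action and the CRUD check surfaces the missing delete tool for deals, which mirrors the gap described in the session.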

## Receipts

_TODO — to be filled in from a real session. Once a real codebase has been designed or audited with these principles, this section will capture: which of the eight dimensions the audit scored lowest, whether "features = prompts, not code" held up or leaked back into hardcoded paths under deadline, whether the parallel-subagent audit's scores matched a human read of the same code, and how actionable the recommendation matrix actually was when it came time to prioritize the gaps._

## Source and attribution

From [Every's compound-engineering plugin](https://github.com/EveryInc/compound-engineering-plugin/tree/main/plugins/compound-engineering/skills/ce-agent-native-architecture) — the MIT-licensed plugin (Dan Shipper & Kieran Klaassen) that packages the Plan → Work → Assess → Compound workflow Every uses across its products.

License: MIT.

Quoting the anchoring principle verbatim: *"Whatever the user can do through the UI, the agent can achieve through tools."* Parity is the principle the other four — and all eight audit dimensions — ultimately serve.