# llm-trading-agent-security

> A Claude Code skill from Affaan M's everything-claude-code repo for autonomous trading agent defense — layered controls treating prompt hygiene, hard spend limits, pre-send simulation, circuit breakers, MEV protection, and wallet isolation as independent layers. No single check is enough when an injection turns directly into asset loss.

**Use case**: Layer prompt-injection, spend-limit, simulation, and wallet-isolation defenses on an agent that signs transactions

**Canonical URL**: https://agentcookbooks.com/skills/llm-trading-agent-security/

**Topics**: claude-code, skills, crypto, agents, security

**Trigger phrases**: "harden my trading agent against prompt injection", "spend limits for an autonomous LLM trader", "pre-send simulation for agent transactions"

**Source**: [Affaan M](https://github.com/affaan-m/everything-claude-code/tree/main/skills/llm-trading-agent-security)

**License**: MIT

---

## What it does

`llm-trading-agent-security` is the autonomous-trading-agent defense skill in [Affaan M's everything-claude-code](https://github.com/affaan-m/everything-claude-code) — see [skills/llm-trading-agent-security](https://github.com/affaan-m/everything-claude-code/tree/main/skills/llm-trading-agent-security). It treats the threat model as harsher than normal LLM apps: an injection or bad tool path turns directly into asset loss, so no single check is sufficient. The skill is review / hardening focused, not exploit construction.

The defense layers are independent by design:

- **Prompt hygiene.** Sanitize external data for injection patterns such as `ignore previous instructions`, `send .* to 0x...`, and `approve .* for` before it enters the execution-capable context.
- **Hard spend limits.** A per-tx ceiling plus a 24-hour ceiling, enforced *outside* the model's output; the model can't talk its way past `SpendLimitGuard.check_and_record`.
- **Pre-send simulation.** `w3.eth.call(tx)` returns the simulated outcome; `min_amount_out` is mandatory, and anything below the threshold is rejected.
- **Circuit breakers.** Halt on consecutive losses or on hourly drawdown beyond a threshold.
- **Wallet isolation.** A dedicated hot wallet holding session funds only, never the primary treasury.
- **MEV and deadline protection.** Private RPC or Flashbots, per-strategy slippage bps, and explicit deadlines.
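
As a minimal sketch of the prompt-hygiene layer, the following checks external text against the injection patterns this page lists and raises on any match. The pattern list is taken from the description here, not from the upstream source file, so the skill's actual regexes may differ; `InjectionDetected` is a hypothetical exception name.

```python
import re

# Patterns as described on this page; the upstream skill's regexes may differ.
INJECTION_PATTERNS = [
    r"ignore .* instructions",
    r"send .* to 0x[0-9a-fA-F]{40}",
    r"transfer .* to",
    r"approve .* for",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in INJECTION_PATTERNS]


class InjectionDetected(ValueError):
    """Hypothetical exception: external text looks like an instruction."""


def sanitize_onchain_data(text: str) -> str:
    """Return text unchanged if clean; raise before it can reach a prompt."""
    for pattern in _COMPILED:
        if pattern.search(text):
            raise InjectionDetected(f"blocked by pattern: {pattern.pattern}")
    return text
```

Broad patterns such as `transfer .* to` will also flag some benign text, which is one reason this layer works as a first filter rather than as the only defense.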

Every layer comes with Python code: a regex sanitizer, a `SpendLimitGuard` class with an explicit `SpendLimitError`, a `safe_execute` that requires `expected_min_out` and refuses to run without it, a `TradingCircuitBreaker` with `MAX_CONSECUTIVE_LOSSES` and `MAX_HOURLY_LOSS_PCT` thresholds, a wallet key loaded from an env var (never from code or logs), and MEV protection via `PRIVATE_RPC = "https://rpc.flashbots.net"`. The pre-deploy checklist is the final gate: every layer must be present and tested, with no fallback to unmetered access when any layer is unreachable.
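
The spend-limit layer could look roughly like the sketch below. The class, method, and exception names come from the skill; the constructor parameters, the injectable `clock`, and the in-memory deque are assumptions made for illustration. A real deployment would probably persist the window so that a restart does not reset it.

```python
import time
from collections import deque

WINDOW_SECONDS = 24 * 3600  # 24-hour rolling window


class SpendLimitError(Exception):
    """Raised when a spend would exceed a configured ceiling."""


class SpendLimitGuard:
    def __init__(self, max_per_tx_usd, max_daily_usd, clock=time.time):
        self.max_per_tx_usd = max_per_tx_usd
        self.max_daily_usd = max_daily_usd
        self._clock = clock      # injectable for tests (assumption)
        self._spends = deque()   # (timestamp, usd_amount), oldest first

    def check_and_record(self, usd_amount):
        # "not > 0" also rejects NaN, which would slip past plain comparisons.
        if not usd_amount > 0:
            raise SpendLimitError("amount must be a positive number")
        if usd_amount > self.max_per_tx_usd:
            raise SpendLimitError("per-transaction ceiling exceeded")

        now = self._clock()
        # Drop spends that have aged out of the rolling window.
        while self._spends and self._spends[0][0] <= now - WINDOW_SECONDS:
            self._spends.popleft()

        spent = sum(amount for _, amount in self._spends)
        if spent + usd_amount > self.max_daily_usd:
            raise SpendLimitError("24-hour ceiling exceeded")

        # Record only after every check has passed.
        self._spends.append((now, usd_amount))
```

Because the guard raises instead of returning a flag, calling code cannot ignore a rejection by accident.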

## When to use it

- Building an AI agent that signs and sends transactions — the threat model is harsher than non-financial LLM apps
- Auditing an existing trading bot or on-chain execution assistant
- Designing wallet key management for an agent — env vars, dedicated hot wallet, never the treasury
- Giving an LLM access to order placement, swaps, or treasury operations
- Hardening against prompt injection from external data sources (news, social, webhook payloads) that the agent reads

When *not* to reach for it:

- Solidity contract review — that's `defi-amm-security`
- Non-trading agent payments — that's `agent-payment-x402`
- Pure model-output safety filtering — the skill is execution-layer defense
- Exploit construction or offensive security — defensive review only

## Install

The skill lives in [affaan-m/everything-claude-code](https://github.com/affaan-m/everything-claude-code) at `skills/llm-trading-agent-security/`. Drop the folder into `~/.claude/skills/llm-trading-agent-security/`. The skill consists of patterns and Python code, so runtime dependencies depend on which layers the operator implements: `web3.py` for simulation and chain reads, `eth_account` for wallet loading, and the operator's preferred LLM SDK for the model integration. Private mempool or Flashbots access requires a separate RPC endpoint.

## What a session looks like

1. **Audit the data path.** Every external string that flows into the model prompt — token names, pair labels, webhook payloads, social-media inputs — gets a `sanitize_onchain_data` pass. The regex catches `ignore .* instructions`, `send .* to 0x[0-9a-fA-F]{40}`, `transfer .* to`, `approve .* for`. Any match raises before the prompt is built.
2. **Wire the spend limit guard.** `SpendLimitGuard.check_and_record(usd_amount)` runs before any signed transaction. It applies a per-tx ceiling and a 24-hour rolling window. The guard is enforced outside the model's output, so the model can't override it.
3. **Pre-send simulation.** `safe_execute` is mandatory and calls `self.w3.eth.call(tx)` first. `expected_min_out` is required; the call is refused without it. If the simulated output is below the threshold, it raises `SlippageError` and nothing is sent.
4. **Circuit breaker.** `TradingCircuitBreaker.check(portfolio_value)` halts on consecutive losses ≥ 3 or hourly PnL below -5%. Invalid hour-start values also halt. The halt is sticky: a manual reset is required.
5. **Wallet isolation.** The private key comes from an env var (`TRADING_WALLET_PRIVATE_KEY`), and a missing key fails immediately. Use a dedicated hot wallet with session funds only, never the primary treasury.
6. **MEV / deadline.** Private RPC or Flashbots; per-strategy slippage bps (`stable: 10`, `volatile: 50`); explicit deadline 60 seconds out.
7. **Audit log every decision.** Not just successful sends — every rejected action, every halt, every sanitization match. Recovery later requires the audit trail.
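
Step 4 can be sketched as follows. The class name, the two threshold constants, and `check(portfolio_value)` appear in the skill; `record_trade`, `reset`, the `clock` parameter, and the internal bookkeeping are hypothetical and only illustrate the halting rules described above. Rejecting an invalid current value, not just an invalid hour-start value, is an extra assumption.

```python
import math
import time


class TradingCircuitBreaker:
    MAX_CONSECUTIVE_LOSSES = 3    # halt at 3 losing trades in a row
    MAX_HOURLY_LOSS_PCT = 0.05    # halt when hourly PnL falls below -5%

    def __init__(self, clock=time.time):
        self._clock = clock       # injectable for tests (assumption)
        self.consecutive_losses = 0
        self.halted = False
        self.halt_reason = None
        self.hour_start_time = None
        self.hour_start_value = None

    @staticmethod
    def _valid(value):
        return isinstance(value, (int, float)) and math.isfinite(value) and value > 0

    def _halt(self, reason):
        self.halted, self.halt_reason = True, reason
        return False

    def record_trade(self, pnl):
        # A losing trade extends the streak; anything else clears it.
        self.consecutive_losses = self.consecutive_losses + 1 if pnl < 0 else 0

    def check(self, portfolio_value):
        """Return True if trading may continue, False once halted."""
        if self.halted:           # sticky: only reset() clears it
            return False
        now = self._clock()
        if self.hour_start_time is None or now - self.hour_start_time >= 3600:
            self.hour_start_time, self.hour_start_value = now, portfolio_value
        if not self._valid(self.hour_start_value) or not self._valid(portfolio_value):
            return self._halt("invalid portfolio value")
        if self.consecutive_losses >= self.MAX_CONSECUTIVE_LOSSES:
            return self._halt("consecutive losses")
        loss_pct = (self.hour_start_value - portfolio_value) / self.hour_start_value
        if loss_pct > self.MAX_HOURLY_LOSS_PCT:
            return self._halt("hourly drawdown")
        return True

    def reset(self):
        """Manual reset by an operator; never called by the agent itself."""
        self.halted, self.halt_reason = False, None
        self.consecutive_losses = 0
        self.hour_start_time = self.hour_start_value = None
```

Keeping `reset` as a separate, operator-only call is what makes the halt sticky: nothing in `check` can clear it.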

The discipline that makes it work is layered independence. The skill is explicit that no single check is enough. If prompt hygiene misses an injection and the model is compromised, the spend limit still holds. If the spend limit is somehow bypassed, simulation rejects the transaction. Each layer is enforced outside the model's output, so the agent can't talk its way past the next one.
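
One way to picture the independence is a gate that runs every check in code, outside the model, and logs each rejection before anything is sent. In this sketch `simulate` and `send` are hypothetical callables standing in for the real chain calls (for example, a wrapper that decodes the result of `w3.eth.call`), `checks` is an assumed list of named guard functions, and `min_out_from_bps` is an illustrative helper for turning a per-strategy slippage setting into `expected_min_out`.

```python
import logging
from typing import Any, Callable, Iterable, Optional, Tuple

audit_log = logging.getLogger("trading.audit")


class SlippageError(Exception):
    """Simulated output fell below the required minimum."""


def min_out_from_bps(quoted_out: int, slippage_bps: int) -> int:
    """Minimum acceptable output for a quote, e.g. 50 bps = 0.5% tolerance."""
    return quoted_out * (10_000 - slippage_bps) // 10_000


def safe_execute(
    tx: dict,
    expected_min_out: Optional[int],
    checks: Iterable[Tuple[str, Callable[[dict], None]]],
    simulate: Callable[[dict], int],
    send: Callable[[dict], Any],
) -> Any:
    # Refuse outright when no minimum output was supplied.
    if expected_min_out is None:
        audit_log.warning("rejected: expected_min_out missing")
        raise ValueError("expected_min_out is required")

    # Each guard raises on failure; one failing layer is enough to stop.
    for name, check in checks:
        try:
            check(tx)
        except Exception as exc:
            audit_log.warning("rejected by %s: %s", name, exc)
            raise

    # Simulate before sending, and compare against the required minimum.
    simulated_out = simulate(tx)
    if simulated_out < expected_min_out:
        audit_log.warning("rejected: simulated %s < min %s", simulated_out, expected_min_out)
        raise SlippageError(f"simulated {simulated_out} < min {expected_min_out}")

    result = send(tx)
    audit_log.info("sent: %s", result)
    return result
```

The model only ever proposes `tx`; it has no parameter through which it could skip a check, which is the property the paragraph above describes.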

## Receipts

_TODO — to be filled in from a real session. Once the layered defenses have been applied to a real trading agent, this section will capture: how many injection-pattern matches the regex actually caught in a week of live external-data ingestion, whether the pre-send simulation rejected at least one transaction that would've gone through under spot pricing, the circuit-breaker thresholds that turned out to be too tight or too loose for the strategy's normal volatility, and whether wallet isolation forced an architectural change (separate keystore, separate signing path) or just stayed an env-var swap._

## Source and attribution

From [Affaan M's everything-claude-code](https://github.com/affaan-m/everything-claude-code/tree/main/skills/llm-trading-agent-security) — an MIT-licensed skill collection covering harness construction, agent ops, video, payments, and platform-specific patterns.

License: MIT.

Quoting the layered-defense rule verbatim: *"Layer the defenses. No single check is enough."* That's the wedge — single-layer defenses against a model with transaction-signing authority all reduce to "the model didn't decide to do the bad thing today"; the skill enforces independent checks so a single failure doesn't cascade to asset loss.