# foundation-models-on-device

> A Claude Code skill from Affaan M's everything-claude-code repo for Apple's FoundationModels framework — on-device LLM in iOS 26+. Covers availability checks (deviceNotEligible / appleIntelligenceNotEnabled / modelNotReady), basic and multi-turn sessions, guided generation with @Generable / @Guide, custom tool calling, and snapshot streaming for real-time UI updates.

**Use case**: Build on-device LLM features (text gen, structured output, tool calling) with FoundationModels instead of cloud APIs

**Canonical URL**: https://agentcookbooks.com/skills/foundation-models-on-device/

**Topics**: claude-code, skills, ios

**Trigger phrases**: "use Apple Intelligence on-device LLM", "structured output with @Generable", "custom tool calling in FoundationModels"

**Source**: [Affaan M](https://github.com/affaan-m/everything-claude-code/tree/main/skills/foundation-models-on-device)

**License**: MIT

---

## What it does

`foundation-models-on-device` is the Apple Intelligence skill in [Affaan M's everything-claude-code](https://github.com/affaan-m/everything-claude-code); see [skills/foundation-models-on-device](https://github.com/affaan-m/everything-claude-code/tree/main/skills/foundation-models-on-device). It covers Apple's FoundationModels framework for running an LLM on-device in iOS 26+: text generation, structured output with `@Generable`, custom tool calling, and snapshot streaming, all with no cloud dependency.

The availability gate is the first pattern: `SystemLanguageModel.default.availability` returns `.available`, `.unavailable(.deviceNotEligible)`, `.unavailable(.appleIntelligenceNotEnabled)`, or `.unavailable(.modelNotReady)`. The skill is explicit that every entry point must check availability before creating a `LanguageModelSession` — silently failing in `unavailable` states is the most common bug.
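
As a minimal sketch of that gate, the function below maps each availability state named above to a status message and only creates a session when the model is ready. The function names and message strings are hypothetical and not taken from the skill.

```swift
import FoundationModels

/// Returns nil when the model is ready, otherwise a user-facing explanation.
/// The message strings are illustrative placeholders.
func unavailableMessage(for availability: SystemLanguageModel.Availability) -> String? {
    switch availability {
    case .available:
        return nil
    case .unavailable(.deviceNotEligible):
        return "This device does not support Apple Intelligence."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Turn on Apple Intelligence in Settings to use this feature."
    case .unavailable(.modelNotReady):
        return "The model is still downloading. Try again shortly."
    case .unavailable(_):
        // Catch-all for any reason not listed above.
        return "The on-device model is unavailable."
    }
}

/// Creates a session only after the availability gate passes.
func makeSessionIfAvailable() -> LanguageModelSession? {
    guard unavailableMessage(for: SystemLanguageModel.default.availability) == nil else {
        return nil
    }
    return LanguageModelSession()
}
```

Returning an optional message keeps the check reusable: a SwiftUI view can show the message when it is non-nil and render the feature otherwise.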

Sessions are either single-turn (create a new session per call) or multi-turn (reuse one session to keep conversation context). Instructions go on session init and cover role, task, style, and safety guards ("Respond with 'I can't help with that' for dangerous requests").

Guided generation is the structured-output pattern: `@Generable` on a struct, `@Guide` on each property with optional constraints (`.range(0...20)` for numeric values, `.count(3)` for arrays, `description:` for semantic guidance). `try await session.respond(to: prompt, generating: CatProfile.self)` returns a response whose `.content` is the typed struct.

Custom `Tool` types with `@Generable` `Arguments` let the model invoke domain-specific code, such as recipe search or calendar event creation, without leaving the device.
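
The sketch below puts the session and guided-generation patterns together. The `CatProfile` properties, the instruction text, and the prompts are assumptions made for illustration; only the type name and the constraint kinds come from the description above.

```swift
import FoundationModels

// Hypothetical instructions covering role, style, and a safety guard.
let catInstructions = "You write short, friendly profiles for rescue cats. "
    + "Respond with 'I can't help with that' for dangerous requests."

// Hypothetical schema; the property names are assumptions.
@Generable
struct CatProfile {
    @Guide(description: "The cat's name")
    var name: String

    @Guide(description: "Age in years", .range(0...20))
    var age: Int

    @Guide(description: "Personality traits", .count(3))
    var traits: [String]
}

/// Single-turn: a fresh session for one structured request.
func generateCatProfile() async throws -> CatProfile {
    let session = LanguageModelSession(instructions: catInstructions)
    let response = try await session.respond(
        to: "Generate a profile for a shy tabby.",
        generating: CatProfile.self
    )
    return response.content
}

/// Multi-turn: the same session is reused so the second prompt sees the first exchange.
func suggestNames() async throws -> [String] {
    let session = LanguageModelSession(instructions: catInstructions)
    let first = try await session.respond(to: "Suggest a name for a grey kitten.")
    let second = try await session.respond(to: "Give me one more in the same style.")
    return [first.content, second.content]
}
```

Both functions assume the availability check has already passed before they are called.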

## When to use it

- Building AI-powered features with Apple Intelligence on-device, where privacy and offline support are the main draw
- Generating or summarizing text without cloud dependency or per-token cost
- Extracting structured data from natural language input via `@Generable`
- Implementing custom tool calling for domain-specific AI actions where the tool runs on-device
- Streaming structured responses for real-time UI updates (snapshot streaming)
- Privacy-sensitive workflows where data leaving the device is a non-starter

When *not* to reach for it:

- Pre-iOS-26 deployment targets: the framework requires iOS 26 or later
- Devices without Apple Intelligence eligibility: the availability check reports `.unavailable(.deviceNotEligible)`, so no session should be created
- Frontier-model reasoning tasks: on-device models are smaller than cloud frontier models, and the skill is honest about that limit
- Cross-platform LLM integration: this is Apple-specific

## Install

From [affaan-m/everything-claude-code](https://github.com/affaan-m/everything-claude-code) at `skills/foundation-models-on-device/`. Drop the folder into `~/.claude/skills/foundation-models-on-device/`. The skill is markdown + Swift code patterns; the runtime is Xcode 26+ targeting iOS 26+, on an Apple Intelligence-eligible device. The framework itself ships with iOS — no SDK install — but the availability check is mandatory because not every iOS 26 device qualifies.

## What a session looks like

1. **Add the availability check.** Switch over `SystemLanguageModel.default.availability` — render the content for `.available` and a clear status message for each `unavailable` case (device not eligible / Apple Intelligence off in settings / model still downloading / other).
2. **Create the session.** Use a fresh `LanguageModelSession()` for one-shot calls; for conversational features, create one session with `instructions:` and reuse it across turns. Instructions cover role + task + style + safety.
3. **Define the `@Generable` type.** Struct with `@Guide` on each property. Constraints where the schema benefits — `.range(0...20)` for ages, `description:` strings that guide the generation semantically.
4. **Request structured output.** `try await session.respond(to: prompt, generating: CatProfile.self)` returns a `LanguageModelSession.Response<CatProfile>` whose `.content` is typed as the struct. No JSON parsing.
5. **Add tool calling if needed.** Define a `Tool` struct with `name`, `description`, a `@Generable` `Arguments` type, and a `call(arguments:)` method. The model decides when to invoke the tool; the on-device runtime handles the round-trip.
6. **Stream if the UI needs it.** Snapshot streaming for real-time updates — partial structured outputs surface as the model generates them.
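
Steps 5 and 6 can be sketched together as follows. `RecipeSearchTool`, `MealPlan`, and the canned search result are hypothetical. The sketch also assumes the released iOS 26 SDK, where `call(arguments:)` may return a `String` and each streamed element is a snapshot whose `content` holds the partially generated value; earlier betas shaped both differently.

```swift
import FoundationModels

// Hypothetical tool: the name, arguments, and canned result are illustrative.
struct RecipeSearchTool: Tool {
    let name = "searchRecipes"
    let description = "Search the user's saved recipes by keyword."

    @Generable
    struct Arguments {
        @Guide(description: "Keyword to search for")
        var query: String
    }

    func call(arguments: Arguments) async throws -> String {
        // A real app would query its local store here.
        "Recipes matching '\(arguments.query)': Lemon Pasta, Garlic Bread"
    }
}

// Hypothetical structured output for the streamed response.
@Generable
struct MealPlan {
    @Guide(description: "Title of the plan")
    var title: String

    @Guide(description: "Dishes to cook", .count(3))
    var dishes: [String]
}

/// Streams a MealPlan; the model may call RecipeSearchTool along the way.
func streamMealPlan() async throws {
    let session = LanguageModelSession(
        tools: [RecipeSearchTool()],
        instructions: "Plan meals using the user's saved recipes."
    )
    let stream = session.streamResponse(
        to: "Plan a pasta dinner.",
        generating: MealPlan.self
    )
    for try await snapshot in stream {
        // Properties are optional until the model has generated them.
        print(snapshot.content.title ?? "(title pending)")
    }
}
```

In a SwiftUI app the loop body would typically assign the partial value to view state instead of printing it, so the UI fills in as fields arrive.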

The discipline that makes it work: structured-output-first. Using `@Generable` for anything beyond pure text generation avoids the brittle JSON-parsing layer that plagues cloud-LLM apps — the framework guarantees the response is a valid `CatProfile` or it throws.
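
To illustrate the "valid value or it throws" contract, here is a small sketch that extracts structured data from free text and treats a thrown error as the only failure path. `EventDetails` and `extractEventDetails(from:)` are hypothetical names, and the fallback of returning `nil` is one possible design, not the skill's prescribed one.

```swift
import Foundation
import FoundationModels

// Hypothetical schema for pulling event details out of free text.
@Generable
struct EventDetails {
    @Guide(description: "Short title for the event")
    var title: String

    @Guide(description: "Number of attendees mentioned", .range(1...100))
    var attendeeCount: Int
}

/// Returns typed details, or nil when generation throws, so the caller can
/// fall back to manual entry. There is no separate JSON decoding step.
func extractEventDetails(from text: String) async -> EventDetails? {
    let session = LanguageModelSession(
        instructions: "Extract event details from the user's message."
    )
    do {
        let response = try await session.respond(to: text, generating: EventDetails.self)
        return response.content
    } catch let error as LanguageModelSession.GenerationError {
        // Framework-reported failures, for example a guardrail or context-size error.
        print("Generation failed: \(error.localizedDescription)")
        return nil
    } catch {
        print("Unexpected error: \(error.localizedDescription)")
        return nil
    }
}
```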

## Receipts

_TODO — to be filled in from a real session. Once the framework has been used in a real iOS 26 app, this section will capture: which `availability` state actually fired most often on real devices (the upstream just shows the switch — receipts will show which devices land in `.deviceNotEligible` and which in `.appleIntelligenceNotEnabled`), how the on-device model handled a non-trivial `@Generable` schema with multiple `@Guide` constraints, whether the snapshot streaming UX kept pace with model generation speed, and the actual per-request latency on a real device for a structured-output call._

## Source and attribution

From [Affaan M's everything-claude-code](https://github.com/affaan-m/everything-claude-code/tree/main/skills/foundation-models-on-device) — an MIT-licensed skill collection covering harness construction, agent ops, video, payments, and platform-specific patterns.

License: MIT.

Quoting the availability-check rule verbatim: *"Always check model availability before creating a session."* That rule is the foundation: every other pattern in the skill assumes the device can run the model, and the availability check is what makes the rest safe to write.