foundation-models-on-device
A Claude Code skill from Affaan M's everything-claude-code repo for Apple's FoundationModels framework — on-device LLM in iOS 26+. Covers availability checks (deviceNotEligible / appleIntelligenceNotEnabled / modelNotReady), basic and multi-turn sessions, guided generation with @Generable / @Guide, custom tool calling, and snapshot streaming for real-time UI updates.
Build on-device LLM features (text gen, structured output, tool calling) with FoundationModels instead of cloud APIs
Trigger phrases
Phrases that activate this skill when typed to Claude Code:
- use Apple Intelligence on-device LLM
- structured output with @Generable
- custom tool calling in FoundationModels
What it does
foundation-models-on-device is the Apple Intelligence skill in Affaan M’s everything-claude-code — see skills/foundation-models-on-device. It covers Apple’s FoundationModels framework for on-device LLM in iOS 26+ — text generation, structured output with @Generable, custom tool calling, and snapshot streaming, all running on-device with no cloud dependency.
The availability gate is the first pattern: SystemLanguageModel.default.availability returns .available, .unavailable(.deviceNotEligible), .unavailable(.appleIntelligenceNotEnabled), or .unavailable(.modelNotReady). The skill is explicit that every entry point must check availability before creating a LanguageModelSession — silently failing in unavailable states is the most common bug.
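A minimal sketch of that gate, assuming a hypothetical helper named `unavailabilityMessage()` that the UI calls before offering any model-backed feature. The helper name and the message strings are illustrative; only the `SystemLanguageModel.default.availability` cases come from the framework.

```swift
import FoundationModels

/// Hypothetical helper: returns nil when the on-device model is ready,
/// otherwise a user-facing explanation of why it is not.
func unavailabilityMessage() -> String? {
    switch SystemLanguageModel.default.availability {
    case .available:
        return nil
    case .unavailable(.deviceNotEligible):
        return "This device does not support Apple Intelligence."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Turn on Apple Intelligence in Settings to use this feature."
    case .unavailable(.modelNotReady):
        return "The model is not ready yet (it may still be downloading). Try again later."
    case .unavailable(_):
        // Any reason not handled explicitly above.
        return "The on-device model is currently unavailable."
    }
}
```

A caller would create a `LanguageModelSession` only when this returns nil, and show the message otherwise.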
Sessions are either single-turn (create a new session per call) or multi-turn (reuse for conversation context). Instructions go on session init and cover role, task, style, and safety guards (“Respond with ‘I can’t help with that’ for dangerous requests”).
Guided generation is the structured-output pattern: @Generable on a struct, @Guide on each property with optional constraints (.range(0...20) for numeric properties, .count(3) for arrays, description: for semantic guidance). try await session.respond(to: prompt, generating: CatProfile.self) returns the typed struct directly.
Custom Tool types with Generable Arguments let the model invoke domain-specific code (recipe search, calendar event creation) without leaving the device.
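The guided-generation pattern described above can be sketched as follows. The `CatProfile` fields, the instruction text, the prompt, and the function name `makeCatProfile()` are assumptions made for illustration; the sketch also assumes the caller has already confirmed that the model is available.

```swift
import FoundationModels

// Illustrative schema: the property names and guide texts are assumptions.
@Generable(description: "Basic profile information about a cat")
struct CatProfile {
    @Guide(description: "The cat's name")
    var name: String

    @Guide(description: "The age of the cat in years", .range(0...20))
    var age: Int

    @Guide(description: "A one-sentence description of the cat's personality")
    var profile: String
}

/// Hypothetical one-shot call: a fresh session per request (single-turn).
func makeCatProfile() async throws -> CatProfile {
    let session = LanguageModelSession(
        instructions: "You write short, friendly profiles for rescue cats."
    )
    let response = try await session.respond(
        to: "Generate a profile for a cute rescue cat.",
        generating: CatProfile.self
    )
    // response.content is already a CatProfile; no JSON decoding step.
    return response.content
}
```

For a multi-turn feature, the same session object would be stored and reused across calls so that earlier turns stay in context.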
When to use it
- Building AI-powered features with Apple Intelligence on-device — privacy and offline support are the wedge
- Generating or summarizing text without cloud dependency or per-token cost
- Extracting structured data from natural language input via @Generable
- Implementing custom tool calling for domain-specific AI actions where the tool runs on-device
- Streaming structured responses for real-time UI updates (snapshot streaming)
- Privacy-sensitive workflows where data leaving the device is a non-starter
When not to reach for it:
- Pre-iOS-26 deployment targets — the framework is gated
- Devices without Apple Intelligence eligibility — the availability check will refuse
- Frontier-model reasoning tasks — on-device models are smaller than cloud frontier models, and the skill is honest about that limit
- Cross-platform LLM integration — this is Apple-specific
Install
The skill lives in affaan-m/everything-claude-code at skills/foundation-models-on-device/. Drop the folder into ~/.claude/skills/foundation-models-on-device/. The skill is markdown plus Swift code patterns; building requires Xcode 26+ targeting iOS 26+, and running requires an Apple Intelligence-eligible device. The framework itself ships with iOS, so there is no SDK to install, but the availability check is mandatory because not every iOS 26 device qualifies.
What a session looks like
- Add the availability check. Switch over SystemLanguageModel.default.availability: render the content for .available and a clear status message for each unavailable case (device not eligible / Apple Intelligence off in settings / model still downloading / other).
- Create the session. Single-turn for one-shot calls (LanguageModelSession()), multi-turn with instructions: for conversational features. Instructions cover role + task + style + safety.
- Define the @Generable type. Struct with @Guide on each property. Constraints where the schema benefits: .range(0...20) for ages, description: strings that guide the generation semantically.
- Request structured output. try await session.respond(to: prompt, generating: CatProfile.self) returns a Response<CatProfile> with .content typed as the struct. No JSON parsing.
- Add tool calling if needed. Define a Tool struct with name, description, @Generable Arguments, and a call(arguments:) method. The model decides when to invoke the tool; the on-device runtime handles the round-trip.
- Stream if the UI needs it. Snapshot streaming for real-time updates: partial structured outputs surface as the model generates them.
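The tool-calling step might look like the sketch below. `RecipeSearchTool`, its hard-coded recipe list, and `suggestRecipe()` are hypothetical stand-ins for a real on-device data source. The `String` return type assumes the shipping iOS 26 SDK; early betas returned a dedicated `ToolOutput` type instead, so adjust to whatever your SDK version declares.

```swift
import Foundation
import FoundationModels

// Hypothetical tool: searches a small in-memory list instead of a real store.
struct RecipeSearchTool: Tool {
    let name = "searchRecipes"
    let description = "Finds recipes in the app's local collection that match a keyword."

    @Generable
    struct Arguments {
        @Guide(description: "Keyword to look for, such as an ingredient")
        var searchTerm: String
    }

    // Stand-in data; a real app would query its own database here.
    private let recipes = ["Lemon pasta", "Tomato soup", "Lemon tart"]

    func call(arguments: Arguments) async throws -> String {
        let matches = recipes.filter {
            $0.localizedCaseInsensitiveContains(arguments.searchTerm)
        }
        return matches.isEmpty ? "No recipes found." : matches.joined(separator: ", ")
    }
}

/// Hypothetical caller: registers the tool and lets the model decide when to use it.
func suggestRecipe() async throws -> String {
    let session = LanguageModelSession(
        tools: [RecipeSearchTool()],
        instructions: "Help the user pick a recipe. Use the search tool to look up recipes."
    )
    let response = try await session.respond(to: "What can I make with lemons?")
    return response.content
}
```

The app never calls `call(arguments:)` directly; the session invokes it when the model emits a matching tool call and feeds the result back into generation.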
The discipline that makes it work: structured-output-first. Using @Generable for anything beyond pure text generation avoids the brittle JSON-parsing layer that plagues cloud-LLM apps — the framework guarantees the response is a valid CatProfile or it throws.
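Snapshot streaming applies the same structured-output-first idea to partial results. The sketch below assumes the shipping iOS 26 SDK, where each stream element is a snapshot whose `content` is the `PartiallyGenerated` form of the type (every property optional until the model has produced it); earlier betas yielded the partial value directly. `CatProfile`, `streamCatProfile`, and the `onUpdate` callback are illustrative names.

```swift
import FoundationModels

// Illustrative schema, repeated so the sketch is self-contained.
@Generable(description: "Basic profile information about a cat")
struct CatProfile {
    @Guide(description: "The cat's name")
    var name: String

    @Guide(description: "The age of the cat in years", .range(0...20))
    var age: Int
}

/// Hypothetical streaming call: forwards each partial profile to the UI layer.
func streamCatProfile(
    onUpdate: (CatProfile.PartiallyGenerated) -> Void
) async throws {
    let session = LanguageModelSession()
    let stream = session.streamResponse(
        to: "Generate a profile for a cute rescue cat.",
        generating: CatProfile.self
    )
    // Errors thrown by the framework during generation surface from this loop.
    for try await snapshot in stream {
        // Fields such as snapshot.content.name are nil until generated.
        onUpdate(snapshot.content)
    }
}
```

In a SwiftUI app, `onUpdate` would typically assign the partial value to observable state so the view fills in field by field.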
Receipts
TODO — to be filled in from a real session. Once the framework has been used in a real iOS 26 app, this section will capture: which availability state actually fired most often on real devices (the upstream just shows the switch — receipts will show which devices land in .deviceNotEligible and which in .appleIntelligenceNotEnabled), how the on-device model handled a non-trivial @Generable schema with multiple @Guide constraints, whether the snapshot streaming UX kept pace with model generation speed, and the actual per-request latency on a real device for a structured-output call.
Source and attribution
From Affaan M’s everything-claude-code — an MIT-licensed skill collection covering harness construction, agent ops, video, payments, and platform-specific patterns.
License: MIT.
Quoting the availability-check rule verbatim: “Always check model availability before creating a session.” That’s the wedge — every other pattern in the skill assumes the device can run the model; the availability check is what makes the rest safe to write.