Apple has quietly shipped one of the most significant frameworks for iOS developers in years. With iOS 26, the Foundation Models framework gives you direct access to Apple's on-device ~3B-parameter large language model — the same one powering Apple Intelligence — right from your Swift code.
No API keys. No cloud costs. No internet required. And it's completely free.
Let's break down what this means and how to start building with it.
What Is the Foundation Models Framework?
The Foundation Models framework exposes Apple's on-device LLM to third-party developers. Unlike cloud-based models like ChatGPT or Claude that run on remote servers, Apple's model runs entirely on the user's device using Apple silicon (CPU, GPU, and Neural Engine).
This gives you:
- Privacy by default — all data stays on-device
- Zero latency from network calls — inference happens locally
- Offline support — works without internet
- Free inference — no per-token costs, no API billing
- Deep Swift integration — the API feels native, not bolted on
The model specializes in language understanding, structured output generation, and tool calling. It's not designed as a general-knowledge chatbot, but rather as an engine for building intelligent features tailored to your app.
Getting Started: Your First On-Device AI Feature
Here's how simple it is to generate a response from Apple's on-device LLM:
import FoundationModels
let session = LanguageModelSession()
let response = try await session.respond(to: "What's a good name for a travel app?")
print(response.content)
That's it. Three lines of code and you're running AI inference on-device.
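You can also steer the model for an entire session by passing instructions when you create it. Here's a minimal sketch using the same instructions-closure style as the tool-calling example later in this post; the instruction wording is just an example:
import FoundationModels

// Instructions set the model's role and ground rules for the whole session;
// each respond(to:) call then supplies the per-request prompt.
let session = LanguageModelSession {
    """
    You are a branding assistant. Keep suggestions short,
    memorable, and family-friendly.
    """
}

let response = try await session.respond(to: "What's a good name for a travel app?")
print(response.content)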
Streaming Responses
For a ChatGPT-like experience where text appears token by token:
let session = LanguageModelSession()
let stream = session.streamResponse(to: "Suggest 5 creative app names for a fitness tracker")
for try await partial in stream {
    // Each element is a snapshot of everything generated so far, not a delta.
    print(partial.content)
}
Because each partial result is a snapshot of the full response so far, you can hand it straight to your UI, which makes for a much smoother UX than waiting for the entire response to complete.
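Here's a minimal SwiftUI sketch of that pattern. The view and property names are illustrative, and it assumes the partial result exposes its text via .content as in the loop above:
import SwiftUI
import FoundationModels

struct NameIdeasView: View {
    @State private var suggestions = ""
    private let session = LanguageModelSession()

    var body: some View {
        ScrollView {
            Text(suggestions)
                .frame(maxWidth: .infinity, alignment: .leading)
                .padding()
        }
        .task {
            do {
                let stream = session.streamResponse(
                    to: "Suggest 5 creative app names for a fitness tracker"
                )
                for try await partial in stream {
                    suggestions = partial.content // overwrite with the latest snapshot
                }
            } catch {
                suggestions = "Generation failed: \(error.localizedDescription)"
            }
        }
    }
}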
Guided Generation: The Killer Feature
Here's where Foundation Models really shines compared to typical LLM APIs. Guided Generation lets you get structured, type-safe outputs directly as Swift types.
Instead of parsing messy JSON strings, you define your output structure using the @Generable macro:
import FoundationModels
@Generable
struct MovieRecommendation {
    let title: String

    @Guide(description: "A brief one-sentence summary")
    let summary: String

    @Guide(.anyOf(["PG", "PG-13", "R", "G"]))
    let rating: String
}
Then generate structured output:
let session = LanguageModelSession()
let movie: MovieRecommendation = try await session.respond(
    to: "Recommend an action movie from the 2020s",
    generating: MovieRecommendation.self
).content
print(movie.title) // e.g., "Top Gun: Maverick"
print(movie.rating) // "PG-13" — guaranteed to be one of the allowed values
The @Guide macro lets you:
- Add natural language descriptions to guide the model
- Constrain values to a specific set with .anyOf()
- Control array lengths with .count()
- Enforce string patterns with regex
This is constrained decoding built directly into the framework — the model is literally forced to produce valid output matching your Swift types at the token level. No more hoping the model returns valid JSON.
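As a rough sketch of how those guides combine in one type (the type and property names here are illustrative, not from Apple's docs):
import FoundationModels

@Generable
struct TripPlan {
    @Guide(description: "The destination city")
    let city: String

    // Constrain the array to exactly three suggestions
    @Guide(.count(3))
    let activities: [String]

    // Constrain the value to a fixed set, as with the movie rating above
    @Guide(.anyOf(["budget", "mid-range", "luxury"]))
    let priceTier: String
}

let session = LanguageModelSession()
let plan: TripPlan = try await session.respond(
    to: "Plan a weekend in Lisbon",
    generating: TripPlan.self
).content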
Tool Calling: Give the Model Superpowers
At roughly 3B parameters, the on-device model doesn't know everything. But with Tool Calling, you can extend its capabilities by giving it access to your app's data and APIs.
Here's an example — a health coach that reads HealthKit data:
import FoundationModels
struct BloodPressureTool: Tool {
    let name = "getBloodPressure"
    let description = "Fetches the user's latest blood pressure reading"

    func call(arguments: EmptyInput) async throws -> String {
        // Fetch the latest reading from HealthKit (hard-coded here for brevity)
        let systolic = 120
        let diastolic = 80
        return "Systolic: \(systolic) mmHg, Diastolic: \(diastolic) mmHg"
    }
}
let session = LanguageModelSession(tools: [BloodPressureTool()]) {
    """
    You're a health coach. Help users manage their health
    based on their blood pressure data.
    """
}
let response = try await session.respond(to: "How's my blood pressure looking?")
The model will automatically decide when to call your tool, fetch the data, and incorporate it into a natural language response. This is incredibly powerful for building context-aware features.
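Tools can also take arguments that the model fills in from the conversation. Here's a rough sketch following the same shape as the tool above; the tool, its argument names, and the weather data are hypothetical, and it assumes the arguments type is declared with @Generable as in Apple's tool-calling examples:
import FoundationModels

struct CityWeatherTool: Tool {
    let name = "getCityWeather"
    let description = "Fetches current weather conditions for a given city"

    @Generable
    struct Arguments {
        @Guide(description: "The city to look up")
        let city: String
    }

    func call(arguments: Arguments) async throws -> String {
        // Look the city up in your own data source; hard-coded for brevity.
        return "Weather in \(arguments.city): 22°C, clear skies"
    }
}

let session = LanguageModelSession(tools: [CityWeatherTool()]) {
    "You're a travel assistant. Use the weather tool when users ask about conditions."
}

let response = try await session.respond(to: "Should I pack a jacket for Lisbon?")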
Specialized Adapters: Content Tagging Out of the Box
Beyond the general-purpose model, Apple provides specialized adapters for specific tasks. The content tagging adapter is built-in and optimized for:
- Topic tag generation
- Entity extraction
- Topic detection
let taggingModel = SystemLanguageModel(useCase: .contentTagging)
let session = LanguageModelSession(model: taggingModel)
@Generable
struct Tags {
    let topics: [String]
}
let result: Tags = try await session.respond(
to: "Apple announced new MacBook Pro with M5 chip at their spring event",
generating: Tags.self
).content
print(result.topics) // ["Apple", "MacBook Pro", "M5", "Product Launch"]
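The same tagging session can pull out entities alongside topic tags. A rough sketch, with an illustrative type and property names:
@Generable
struct ContentAnalysis {
    @Guide(description: "Broad topic tags for the text")
    let topics: [String]

    @Guide(description: "Named entities mentioned in the text, such as people, products, or companies")
    let entities: [String]
}

let analysis: ContentAnalysis = try await session.respond(
    to: "Apple announced new MacBook Pro with M5 chip at their spring event",
    generating: ContentAnalysis.self
).content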
Custom Adapter Training: Teach the Model New Tricks
For advanced use cases, Apple provides a Python-based adapter training toolkit that lets you fine-tune the on-device model with your own data using LoRA (Low-Rank Adaptation).
When should you consider training a custom adapter?
- The model needs to become a subject-matter expert for your domain
- You need a specific output style, format, or policy
- Prompt engineering isn't achieving the required accuracy
- You want lower latency by reducing prompt length
Key things to know about adapters:
- Each adapter is ~160MB in storage
- Adapters are compatible with a single specific model version — you must retrain when Apple updates the base model
- Deploy via the Background Assets framework (don't bundle in your app)
- Requires a Mac with Apple silicon and 32GB+ of RAM, or a Linux machine with GPUs
- You need the Foundation Models Framework Adapter Entitlement for production deployment
Apple recommends exhausting prompt engineering and tool calling before jumping to adapter training. It's powerful but comes with ongoing maintenance.
Availability and Requirements
Before creating a session, always check availability:
let availability = SystemLanguageModel.default.availability
switch availability {
case .available:
    // Ready to use
    break
case .unavailable(.deviceNotEligible):
    // Device doesn't support Apple Intelligence
    break
case .unavailable(.appleIntelligenceNotEnabled):
    // User needs to enable Apple Intelligence in Settings
    break
case .unavailable(.modelNotReady):
    // Model is still downloading
    break
default:
    break
}
Requirements:
- iOS 26, iPadOS 26, macOS 26, or visionOS 26
- Apple Intelligence-compatible device (iPhone 15 Pro or later, M-series Macs/iPads)
- Apple Intelligence must be enabled in Settings
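When the model is unavailable, degrade gracefully rather than failing outright. Here's a minimal sketch, where summarize(_:) stands in for your AI-powered feature and basicSummary(of:) is a hypothetical non-AI fallback:
import FoundationModels

func summarize(_ text: String) async throws -> String {
    // Fall back to a non-AI code path when the model isn't usable on this device.
    guard case .available = SystemLanguageModel.default.availability else {
        return basicSummary(of: text)
    }

    let session = LanguageModelSession()
    let response = try await session.respond(to: "Summarize in two sentences: \(text)")
    return response.content
}

// Hypothetical non-AI fallback, e.g. the first couple of sentences.
func basicSummary(of text: String) -> String {
    text.split(separator: ".").prefix(2).joined(separator: ".") + "."
}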
Practical Use Cases
Here are some real-world features you can build today:
Smart Journaling: Auto-generate mood tags and summaries from journal entries using guided generation.
Recipe Parsing: Point the model at unstructured recipe text and extract structured ingredient lists and steps (sketched below).
Customer Support: Build an in-app assistant that uses tool calling to access order history and FAQs without any cloud dependency.
Content Moderation: Use the content tagging adapter to automatically classify user-generated content.
Personalized Learning: Generate quiz questions based on study material, all processed locally.
Workout Insights: Combine tool calling with HealthKit to generate natural language summaries of fitness data.
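The recipe-parsing idea, for instance, maps directly onto guided generation. A rough sketch, with illustrative type and property names and made-up input text:
import FoundationModels

@Generable
struct Recipe {
    let title: String

    @Guide(description: "Each ingredient with its quantity, e.g. '200g spaghetti'")
    let ingredients: [String]

    @Guide(description: "Preparation steps in order")
    let steps: [String]
}

// Any unstructured text your app captured (pasted, scanned, or shared).
let rawRecipeText = "Boil 200g spaghetti. Fry pancetta, mix with eggs and pecorino..."

let session = LanguageModelSession()
let recipe: Recipe = try await session.respond(
    to: "Extract the recipe from this text: \(rawRecipeText)",
    generating: Recipe.self
).content

print(recipe.ingredients)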
Limitations to Keep in Mind
The Foundation Models framework is powerful, but it's not a silver bullet:
- Context window is limited — the ~3B model can't handle massive prompts
- Not for general world knowledge — it's optimized for tasks, not trivia
- Apple explicitly warns against using it for code generation, math calculations, or factual Q&A
- Device-dependent — older devices can't run it, so always have a fallback
- Adapter retraining — every OS update with a new model version means retraining your adapters
What This Means for iOS Development
The Foundation Models framework represents a fundamental shift. For the first time, iOS developers have access to a production-quality LLM with:
- Zero marginal cost per inference
- Native Swift API that feels like any other Apple framework
- Type-safe outputs through guided generation
- Built-in privacy without any extra work
This isn't just another AI API wrapper. It's a deeply integrated, first-party framework that makes on-device intelligence a realistic feature for apps of all sizes — from indie side projects to enterprise applications.
If you haven't started experimenting with it yet, now's the time. The barrier to adding AI features to your iOS app has never been lower.
Requirements: Xcode 26 + iOS 26 SDK + Apple Intelligence-enabled device
Documentation: Foundation Models | Apple Developer Documentation