Localight is a simple SwiftUI chatbot app for iOS 26, powered entirely by Apple’s on-device Foundation Models. Designed for demonstration purposes, Localight offers fast, private, and completely offline AI chat — no internet connection or server required.
Localight showcases how to integrate Apple's local LLM into a native iOS experience using SwiftUI and the new Foundation Models framework.
- 🧠 On-device LLM: Uses Apple’s local Foundation Models for text generation.
- 🔐 Privacy-first: All conversations stay on your device. No data is sent to the cloud.
- ⚡ Fast & Offline: No internet needed. Responses are generated locally.
- 💬 Minimalist Chat UI: Clean SwiftUI interface for interacting with the model.
- 🗑️ No history: Conversations are not saved after the app closes.
- Import the Library: To work with Foundation Models, you must import the framework in every file where you use it:

  ```swift
  import FoundationModels
  ```
- Check Availability: The key object is `SystemLanguageModel`. Its availability indicates whether the model is `.available` or `.unavailable`; when it is unavailable, a reason is also provided.
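The check above can be sketched as follows; this is a minimal example using the shared `SystemLanguageModel.default` instance and its `availability` property:

```swift
import FoundationModels

// Decide whether to enable the chat UI based on model availability.
switch SystemLanguageModel.default.availability {
case .available:
    print("Model is ready for prompting.")
case .unavailable(let reason):
    // Reasons include an ineligible device, Apple Intelligence
    // being disabled, or the model not being downloaded yet.
    print("Model is unavailable: \(reason)")
}
```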
- Create a LanguageModelSession: To start prompting, create a `LanguageModelSession`. When you create a session, you can provide instructions that tell the model what its role is and how it should respond. These instructions should never be editable by the user:

  ```swift
  let session = LanguageModelSession(instructions: "You are the best friend.")
  ```
- Generate a Response: Finally, call `respond(to:)` and read the result as a `String`:

  ```swift
  let response = try await session.respond(to: promptAsString).content
  ```
- Stream a Response: Alternatively, call `streamResponse(to:)` to receive the result as a stream of partial snapshots:

  ```swift
  let stream = session.streamResponse(to: promptAsString)
  do {
      // Each snapshot contains the cumulative text generated so far.
      for try await chunk in stream {
          self.streamingResponse = chunk.content
      }
      // collect() returns the complete response once streaming has finished.
      let response = try await stream.collect().content
  } catch {
      // Handle generation errors here.
  }
  ```
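To show how the streamed snapshots can drive a chat UI, here is a minimal SwiftUI sketch; the `ChatView` name, the `@State` property, and the hard-coded prompt are illustrative assumptions, not code from Localight:

```swift
import SwiftUI
import FoundationModels

struct ChatView: View {
    @State private var streamingResponse = ""
    private let session = LanguageModelSession(instructions: "You are the best friend.")

    var body: some View {
        ScrollView {
            Text(streamingResponse)
                .padding()
        }
        .task {
            do {
                // Update the UI with each cumulative snapshot as it arrives.
                for try await chunk in session.streamResponse(to: "Tell me a joke") {
                    streamingResponse = chunk.content
                }
            } catch {
                streamingResponse = "Generation failed: \(error.localizedDescription)"
            }
        }
    }
}
```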
Apple’s on-device Foundation Models operate with a limited context window per session.
The context window defines how many tokens the model can process within a single LanguageModelSession.
- A token is a unit of text processed by the model.
- In Western languages (e.g. English or German), 1 token ≈ 3–4 characters.
- In East Asian languages (e.g. Japanese or Chinese), 1 token ≈ 1 character.
- The system model currently supports up to 4,096 tokens per session.
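Given the characters-per-token ratios above, a rough client-side estimate can help you anticipate when a prompt is approaching the limit. This heuristic is an assumption for illustration, not the framework's actual tokenizer:

```swift
// Rough estimate from the ~3–4 characters-per-token ratio for Western
// languages; the model's real tokenizer may count differently.
func estimatedTokenCount(for text: String, charactersPerToken: Double = 3.5) -> Int {
    Int((Double(text.count) / charactersPerToken).rounded(.up))
}

let estimate = estimatedTokenCount(for: String(repeating: "a", count: 350))
// estimate == 100
```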
If this limit is exceeded, the framework throws `LanguageModelSession.GenerationError.exceededContextWindowSize(_:)`.
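One way to recover from that error is to retry the prompt in a fresh session; this sketch reuses the `session`, instructions, and `promptAsString` values from the steps above:

```swift
do {
    let response = try await session.respond(to: promptAsString).content
    print(response)
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
    // The transcript no longer fits in the 4,096-token window.
    // One simple recovery: start a fresh session and retry the prompt.
    let freshSession = LanguageModelSession(instructions: "You are the best friend.")
    let response = try await freshSession.respond(to: promptAsString).content
    print(response)
}
```

Apple's TN3193 technote also suggests condensing the previous transcript into the new session so the conversation keeps some continuity.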
For more details, see Apple’s official documentation: TN3193 – Managing the on-device foundation model’s context window