Command Line
Run models from a terminal on Windows ARM64, or via Docker on Linux ARM64. Best for trying things out fast.
Local Server
OpenAI-compatible API on Windows ARM64 and Linux ARM64.
Python SDK
Huggingface style API for scripting and notebooks, available on Windows ARM64 and Linux ARM64.
Linux (Docker)
Docker image for Linux ARM64 with NPU access. Best for Dragonwing IoT and similar platforms.
Android SDK
Kotlin SDK from Maven Central, plus a prebuilt demo APK for Snapdragon 8 Elite.
Before you start
- Platform & runtime — pick where you’ll run (Windows ARM64, Android, Linux ARM64) and which runtime fits your model. See Platforms & runtimes.
- Models — see Models for tested examples on each runtime.
- Bring your own model — already have a GGUF or AI Hub bundle on disk? See Run a local model.
Was this page helpful?