> ## Documentation Index > Fetch the complete documentation index at: https://geniex.aihub.qualcomm.com/llms.txt > Use this file to discover all available pages before exploring further. # Quickstart > Run your first model from the GenieX CLI. ## **Prerequisites** * The CLI installed — see [Install](/en/run/cli/install). * Interactive shell from container (Docker only) — see [Run interactively](/en/run/linux/install#run-interactively). * Familiarity with [runtime choice](/en/get-started/platforms#geniex-runtimes) — `qairt` (Qualcomm AI Engine Direct) for Qualcomm AI Hub Models, `llama_cpp` for any GGUF. ## **Run your first model** ### **Qualcomm AI Engine Direct runtime (Qualcomm AI Hub)** **Language model:** ```powershell windows theme={"dark"} geniex infer ai-hub-models/Qwen3-4B ``` **Multimodal model:** ```powershell windows theme={"dark"} geniex infer ai-hub-models/Qwen2.5-VL-7B-Instruct ``` ### **llama.cpp runtime (GGUF)** Pick `Q4_0` when prompted — it has the best Hexagon NPU support. **Language model:** ```powershell windows theme={"dark"} geniex infer unsloth/Qwen3.5-0.8B-GGUF ``` **Multimodal model:** ```powershell windows theme={"dark"} geniex infer Qwen/Qwen3-VL-2B-Instruct-GGUF ``` When prompted: * **Model type** — `vlm` for vision-language models, `llm` for text-only models. For `Qwen3.5` and `Gemma4`, pick `llm` for now (multimodal support coming soon). * **Precision (Quantization)** — `Q4_0` for best Hexagon NPU performance. To try other GGUF models, copy any compatible GGUF path from Hugging Face and substitute it into the command above. See [Run a GGUF model from Hugging Face](/en/models/supported#run-a-gguf-model-from-hugging-face). ## **Run a local model** Already have a model on disk, or want to self-convert a bundle from Hugging Face? Use `geniex pull` with `--local-path` to register it, then run it like any other model. See: * [Run a local Qualcomm AI Engine Direct bundle](/en/models/supported#run-a-local-qualcomm-ai-engine-direct-bundle) — self-converted from Hugging Face, an extracted bundle directory, or an AI Hub `.zip`. * [Run a local GGUF model](/en/models/supported#run-a-local-gguf-model) — a directory containing your `.gguf` file. ## **Next steps** Expose an OpenAI-compatible HTTP API on `localhost:18181`. Every command, every flag.

Was this page helpful?

Yes