> ## Documentation Index
> Fetch the complete documentation index at: https://geniex.aihub.qualcomm.com/llms.txt
> Use this file to discover all available pages before exploring further.

# What is GenieX

> On-device AI inference runtime for Qualcomm Snapdragon — run frontier LLMs and VLMs across CLI, Python, Android, and Docker.

GenieX is an on-device Gen AI inference runtime built for Qualcomm platforms. It is the easiest way to run frontier language and vision-language models locally on Hexagon NPU, Adreno GPU, or CPU with a few lines of code. It is the community version of Qualcomm GENIE.

## **Architecture**

<img src="https://mintcdn.com/qualcomm-0801e48b/ewrmU9zMnfZyH0O6/Mintlify-image/geniex_arch_v2.png?fit=max&auto=format&n=ewrmU9zMnfZyH0O6&q=85&s=332ac0c099324f4820ddb8581c67cd68" alt="GenieX architecture stack: CLI, Python API, Java API, Docker, and Serve interfaces sit on the GenieX SDK, which dispatches to the llama.cpp runtime (GGML over CPU/GPU/HTP kernels) or the Qualcomm AI Engine Direct runtime on NPU. Targets Windows, Android, and Linux." style={{ borderRadius: '0.5rem' }} width="3424" height="1936" data-path="Mintlify-image/geniex_arch_v2.png" />

GenieX exposes **five entry points**, all over a single SDK:

* **CLI** — run and serve models straight from the terminal.
* **Python** — embed inference in your apps with the Python SDK.
* **Java/Kotlin** — the Android SDK for on-device mobile apps.
* **Docker** — a containerized image for reproducible deployments.
* **OpenAI-compatible server** — a drop-in local server for existing OpenAI clients.

Under the hood, that SDK dispatches to either the **llama.cpp runtime** (GGML kernels for CPU / GPU / Hexagon HTP) or the **[Qualcomm® AI Engine Direct](https://www.qualcomm.com/developer/software/qualcomm-ai-engine-direct-sdk) runtime** (NPU-only). The same SDK runs on Windows ARM64, Android, and Linux ARM64.

<Note>**Qualcomm AI Engine Direct** is the official name of what is also known as the *Qualcomm AI Engine Direct SDK*, *Qualcomm AI Runtime*, and *QAIRT*. Throughout these docs we use the official name.</Note>

## **Why two runtimes?**

So you get both **broad model coverage** and **optimal performance** in one stack:

* **Most models just work** — point GenieX at almost any GGUF on Hugging Face and it runs on CPU / GPU / NPU via llama.cpp.
* **Qualcomm® AI Hub Models run optimally** — models published to [Qualcomm AI Hub](https://aihub.qualcomm.com/) are pre-compiled per chipset and run through Qualcomm AI Engine Direct on the Hexagon NPU for peak on-device performance.

See [Platforms & runtimes](/en/get-started/platforms#geniex-runtimes) for when to pick which.

## **What you can do with GenieX**

* **Run models locally** on Snapdragon X (Windows ARM64), Snapdragon 8 Elite (Android), and Dragonwing IoT chipsets.
* **Pick a runtime** — `llama.cpp` for any community GGUF model, Qualcomm AI Engine Direct (`qairt`) for Qualcomm AI Hub pre-compiled NPU bundles.
* **Build apps** through the CLI, an OpenAI-compatible local server, the Python SDK, the Android SDK, or a Docker image.

## **Pick where to start**

<CardGroup cols={2}>
  <Card title="Quickstart" href="/en/get-started/quickstart" icon="rocket">
    Choose your interface and get to first inference in minutes.
  </Card>

  <Card title="Platforms & runtimes" href="/en/get-started/platforms" icon="microchip">
    Snapdragon platforms GenieX supports, and when to pick llama.cpp vs Qualcomm AI Engine Direct.
  </Card>

  <Card title="Models" href="/en/models/supported" icon="cube">
    Tested LLMs and VLMs across the llama.cpp and Qualcomm AI Engine Direct runtimes.
  </Card>
</CardGroup>

## **Community**

<CardGroup cols={2}>
  <Card title="Report an issue" href="https://github.com/qualcomm/GenieX/issues" icon="github">
    File a bug, request a feature, or browse open issues on GitHub.
  </Card>

  <Card title="Join Slack" href="https://aihub.qualcomm.com/community/slack" icon="slack">
    Collaborate with the GenieX team and other developers.
  </Card>
</CardGroup>

## **Legal**

<CardGroup cols={2}>
  <Card title="License" href="https://github.com/qualcomm/GenieX/blob/main/LICENSE" icon="scale-balanced">
    GenieX is released under the BSD 3-Clause License.
  </Card>

  <Card title="Terms of Use" href="https://www.qualcomm.com/site/terms-of-use" icon="file-contract">
    Qualcomm site terms of use.
  </Card>
</CardGroup>

<br />

<div class="feedback-wrapper">
  <span class="feedback-label">Was this page helpful?</span>

  <div class="feedback-toggle">
    <input type="radio" name="feedback" id="feedback-yes" class="feedback-input" />

    <label for="feedback-yes" class="feedback-button">
      <img src="https://mintlify.s3.us-west-1.amazonaws.com/qualcomm-0801e48b/Images/FeedBack/thumbs-up.svg" alt="Thumbs up" class="feedback-icon" noZoom />

      Yes
    </label>

    <input type="radio" name="feedback" id="feedback-no" class="feedback-input" />

    <label for="feedback-no" class="feedback-button">
      <img src="https://mintlify.s3.us-west-1.amazonaws.com/qualcomm-0801e48b/Images/FeedBack/thumbs-down.svg" alt="Thumbs down" class="feedback-icon" noZoom />

      No
    </label>
  </div>
</div>
