
unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly #479

@mathieu-benoit

Description


I'm getting this error:

Failed to generate a response: error response: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp exit status: exit status 0xc0000005

I'm on Docker Desktop (DD) on Windows, using WSL2 (ECI, Enhanced Container Isolation, disabled in this case).
DMR (Docker Model Runner) is enabled via Docker Desktop.
I'm able to run docker model pull commands, and docker model list shows:

MODEL NAME           PARAMETERS  QUANTIZATION    ARCHITECTURE  MODEL ID      CREATED       CONTEXT  SIZE
gemma3               3.88 B      MOSTLY_Q4_K_M   gemma3        a353a8898c9d  2 months ago           2.31 GiB
qwen2.5:0.5B-F16     494.03 M    F16             qwen2         3e1aad67b4cc  7 months ago           942.43 MiB
smollm2:360M-Q4_K_M  361.82 M    IQ2_XXS/Q4_K_M  llama         354bf30d0aa3  8 months ago           256.35 MiB

But when running docker model run ai/qwen2.5:0.5B-F16 "who are you", I get the error described above: unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly.
Also, docker model status reports:

Docker Model Runner is running

Status:
llama.cpp: running llama.cpp latest-cpu (sha256:ea16f02ab4b7ce60f05a2cc3d08d2643e53f2c7bb9187c6644fbf108d898739d) version: unknown
vllm: not installed
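
For what it's worth, the same request can also be sent directly to DMR's OpenAI-compatible endpoint, which might help show whether the 500 comes from the docker model CLI or from the backend itself. A minimal PowerShell sketch, assuming host-side TCP support is enabled on the default port 12434 (port and path are assumptions on my side):

# Sketch only: assumes DMR's host-side TCP access on port 12434 and the
# /engines/v1/chat/completions path; adjust if your setup differs.
$body = @{
    model    = "ai/qwen2.5:0.5B-F16"
    messages = @(@{ role = "user"; content = "who are you" })
} | ConvertTo-Json -Depth 5
Invoke-RestMethod -Uri "http://localhost:12434/engines/v1/chat/completions" `
    -Method Post -ContentType "application/json" -Body $body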

Looking at docker model logs -f, I'm seeing this:

-------------------------------------------------------------------------------->8
[2025-11-20T20:49:58.776194000Z][inference] Running on system with 32265 MB RAM
[2025-11-20T20:49:58.780206900Z][inference.model-manager] Successfully initialized store
[2025-11-20T20:49:58.780206900Z][inference] 2 backends available
-------------------------------------------------------------------------------->8
[2025-11-20T20:56:13.809536300Z][inference] Running on system with 4094 MB VRAM
[2025-11-20T20:56:13.834988700Z][inference] Running on system with 32265 MB RAM
[2025-11-20T20:56:13.836031600Z][inference.model-manager] Successfully initialized store
[2025-11-20T20:56:13.836578100Z][inference] 2 backends available
[2025-11-20T20:56:17.663784000Z][inference] Reconciling service state on initialization
[2025-11-20T20:56:17.664289000Z][inference] Reconciling service state on settings change
[2025-11-20T20:56:17.664289000Z][inference.inference-llama.cpp] downloadLatestLlamaCpp: latest, cpu, C:\Program Files\Docker\Docker\resources\model-runner\bin, <HOME>\.docker\bin\inference\com.docker.llama-server.exe
[2025-11-20T20:56:18.991409200Z][inference.inference-llama.cpp][W] could not get llama.cpp version: exit status 0xc0000005
[2025-11-20T20:56:18.991409200Z][inference.inference-llama.cpp] failed to ensure latest llama.cpp: bundled llama.cpp version is up to date, no need to update
[2025-11-20T20:56:19.174425300Z][inference.inference-llama.cpp][W] Failed to determine if llama-server is built with GPU support: exit status 0xc0000005
[2025-11-20T20:56:19.174425300Z][inference.inference-llama.cpp] installed llama-server with gpuSupport=false
[2025-11-20T20:56:19.174425300Z][inference][W] Backend installation failed for vllm: not implemented
[2025-11-20T20:56:36.211617900Z][inference.model-manager] Listing available models
[2025-11-20T20:56:36.212180800Z][inference.model-manager] Successfully listed models, count: 0

And this:

[2025-12-02T15:57:10.302145800Z][inference.model-manager] Successfully listed models, count: 3
[2025-12-02T15:57:15.206081100Z][inference.model-manager] Getting model by reference: ai/qwen2.5:0.5B-F16
[2025-12-02T15:57:18.599097500Z][inference.model-manager] Getting model by reference: ai/qwen2.5:0.5B-F16
[2025-12-02T15:57:18.600749200Z][inference.model-manager] Getting model by reference: ai/qwen2.5:0.5B-F16
[2025-12-02T15:57:18.602932000Z][inference.model-manager] Checking model by reference: sha256:3e1aad67b4cc8e3dca660fe65f9f73edb598474284256ffdd9ba460b5b35ff26
[2025-12-02T15:57:23.575350900Z][inference] Loading sha256:3e1aad67b4cc8e3dca660fe65f9f73edb598474284256ffdd9ba460b5b35ff26, which will require 1563 MB RAM and 129 MB VRAM on a system with 32265 MB RAM and 4094 MB VRAM
[2025-12-02T15:57:23.575350900Z][inference] Loading llama.cpp backend runner with model sha256:3e1aad67b4cc8e3dca660fe65f9f73edb598474284256ffdd9ba460b5b35ff26 in completion mode
[2025-12-02T15:57:23.575350900Z][inference.openai-recorder][W] SetConfigForModel called with nil config for model sha256:3e1aad67b4cc8e3dca660fe65f9f73edb598474284256ffdd9ba460b5b35ff26
[2025-12-02T15:57:23.583948900Z][inference.inference-llama.cpp] llama.cpp args: [-ngl 999 --metrics --model C:\\Users\\<USER>\\.docker\\models\\bundles\\sha256\\3e1aad67b4cc8e3dca660fe65f9f73edb598474284256f...[truncated] --host C:\\Users\\<USER>\\AppData\\Local\\Docker\\run\\inference-0.sock --ctx-size 4096 --jinja]
[2025-12-02T15:57:24.117776700Z][inference][W] Backend llama.cpp running model ai/qwen2.5:0.5B-F16 exited with error: llama.cpp terminated unexpectedly: llama.cpp exit status: exit status 0xc0000005
[2025-12-02T15:57:24.577909900Z][inference.model-manager] Getting model by reference: sha256:3e1aad67b4cc8e3dca660fe65f9f73edb598474284256ffdd9ba460b5b35ff26
[2025-12-02T15:57:24.581195900Z][inference.openai-recorder][W] No records found for model: sha256:3e1aad67b4cc8e3dca660fe65f9f73edb598474284256ffdd9ba460b5b35ff26
[2025-12-02T15:57:24.581195900Z][inference][W] Initialization for llama.cpp backend runner with model sha256:3e1aad67b4cc8e3dca660fe65f9f73edb598474284256ffdd9ba460b5b35ff26 in completion mode failed: llama.cpp terminated unexpectedly: llama.cpp exit status: exit status 0xc0000005
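
Note that the version probe already fails with 0xc0000005 (a Windows access violation) before any model is loaded, so it might be worth checking whether the bundled binary crashes the same way when run outside DMR. A rough PowerShell sketch, using the binary path from the log above and assuming the bundled server accepts llama.cpp's usual --version flag:

# Assumption: com.docker.llama-server.exe accepts llama.cpp's --version flag;
# the path below is the one reported in docker model logs.
& "$env:USERPROFILE\.docker\bin\inference\com.docker.llama-server.exe" --version
"exit code: 0x{0:x}" -f $LASTEXITCODE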

More info about my setup:

Processor	Intel(R) Core(TM) Ultra 7 155H (1.40 GHz)
System type	64-bit operating system, x64-based processor
