Releases: jeffbolznv/llama.cpp

b6795

18 Oct 14:53
ee09828

HIP: fix GPU_TARGETS (#16642)

b6791

17 Oct 19:00
66b0dbc

llama-model: fix inconsistent ctxs <-> bufs order (#16581)

b6782

16 Oct 20:21
1bb4f43

mtmd : support home-cooked Mistral Small Omni (#14928)

b6745

12 Oct 20:11
a31cf36

metal : add opt_step_adamw and op_sum (#16529)

* scaffold to support opt step adamw on metal (not written so far)

* add opt-step-adamw kernel for metal

* pass op->src[4] as a separate buffer to the pipeline

* add bounds check to opt-step-adamw kernel

* complete scaffold for GGML_OP_SUM

* naive GGML_OP_SUM kernel

* remove unwanted comment

* change OP_SUM capability gate

* Add has_simdgroup_reduction to both ops to pass CI
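
For orientation, here is a minimal Python sketch of what the two ops compute. It uses the textbook AdamW update with bias correction and a plain sequential sum; it is not the Metal kernel from this PR. The function names, argument order, and default hyperparameters are assumptions made for illustration.

```python
import math


def adamw_step(x, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, wd=0.0):
    """Apply one textbook AdamW update in place.

    x: parameters, g: gradients, m/v: first/second moment estimates,
    t: 1-based step counter. All names here are illustrative assumptions.
    """
    bc1 = 1.0 - beta1 ** t  # bias correction for the first moment
    bc2 = 1.0 - beta2 ** t  # bias correction for the second moment
    # A GPU kernel would typically run one thread per element and
    # bounds-check the index; the loop range plays that role here.
    for i in range(len(x)):
        m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
        v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i]
        m_hat = m[i] / bc1
        v_hat = v[i] / bc2
        # Decoupled weight decay, then the Adam step.
        x[i] = x[i] * (1.0 - lr * wd) - lr * m_hat / (math.sqrt(v_hat) + eps)


def op_sum(values):
    """Naive reduction: add every element into a single scalar."""
    total = 0.0
    for val in values:
        total += val
    return total
```

A sequential sum like this is the simplest possible reduction; a tuned GPU version would normally split the work across threads and combine partial sums.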

b6744

12 Oct 19:23
81d54bb

webui: remove client-side context pre-check and rely on backend for l…

b6644

30 Sep 13:09

ggml : bump version to 0.9.4 (ggml/1363)

b6618

28 Sep 16:56
d9e0e7c

ci : fix musa docker build (#16306)

Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>

b6604

27 Sep 16:26
3f81b4e

vulkan: support GET_ROWS for k-quants (#16235)

The dequantize functions are copied from mul_mm_funcs.comp with only two
changes: adding a_offset and dividing iqs by 2. It's probably possible to call
these functions from mul_mm_funcs and avoid the duplication, but I didn't go
that far in this change.
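
As a rough illustration of what GET_ROWS does on quantized data, the Python sketch below gathers rows by index and dequantizes them on the way out. The block format is a toy one (one scale plus small integers per block), chosen only to keep the example short; it is not the k-quant layout, and every name in it is an assumption rather than code from the shader.

```python
BLOCK = 4  # toy block size (assumption; real quant formats use larger blocks)


def quantize_row(row):
    """Toy block quantization: one float scale and signed ints per block."""
    blocks = []
    for i in range(0, len(row), BLOCK):
        chunk = row[i:i + BLOCK]
        amax = max(abs(c) for c in chunk)
        scale = amax / 127.0 if amax > 0 else 1.0
        blocks.append((scale, [round(c / scale) for c in chunk]))
    return blocks


def dequantize_row(blocks):
    """Expand every block back to floats: value = scale * quantized int."""
    out = []
    for scale, q in blocks:
        out.extend(scale * qi for qi in q)
    return out


def get_rows(qmatrix, row_ids):
    """GET_ROWS: pick the requested rows and return them dequantized."""
    return [dequantize_row(qmatrix[r]) for r in row_ids]
```

For example, `get_rows(qmatrix, [2, 0])` returns row 2 followed by row 0, each as a list of floats close to the original values.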

b6548

22 Sep 16:17
37a23c1

common : enable `--offline` mode without curl support (#16137)

* common : use the json parser

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* common : enable --offline mode without CURL support

This change refactors the download logic to properly support offline mode
even when the project is built without CURL.

Without this commit, using `--offline` would give the following error:

    error: built without CURL, cannot download model from the internet

even if all the files are already cached.
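
The ordering problem described above can be sketched in Python as follows. This is a simplified model, not the project's download code: the function name, the boolean inputs, and the return values are hypothetical, and the branches other than "offline with a cached file" are assumptions. The point is only that the cache is consulted before the CURL capability check.

```python
def resolve_model(cached: bool, offline: bool, has_curl: bool) -> str:
    """Decide where a model file comes from (hypothetical helper)."""
    if offline:
        if cached:
            # Offline with a cached file needs no network, so CURL is irrelevant.
            return "cache"
        raise RuntimeError("offline mode: model is not in the local cache")
    if not has_curl:
        # Assumption: network access is still impossible without CURL.
        raise RuntimeError(
            "error: built without CURL, cannot download model from the internet"
        )
    return "download"
```

With this ordering, `resolve_model(cached=True, offline=True, has_curl=False)` returns `"cache"` instead of raising the CURL error.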

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

b6530

21 Sep 15:40
28baac9

ci : migrate ggml ci to self-hosted runners (#16116)

* ci : migrate ggml ci to self-hosted runners

* ci : add T4 runner

* ci : add instructions for adding self-hosted runners

* ci : disable test-backend-ops from debug builds due to slowness

* ci : add AMD V710 runner (vulkan)

* cont : add ROCM workflow

* ci : switch to qwen3 0.6b model

* cont : fix the context size