Releases · jeffbolznv/llama.cpp
b6795
b6791
llama-model: fix inconsistent ctxs <-> bufs order (#16581)
b6782
mtmd : support home-cooked Mistral Small Omni (#14928)
b6745
metal : add opt_step_adamw and op_sum (#16529)
* scaffold to support opt step adamw on metal (not written so far)
* add opt-step-adamw kernel for metal
* pass op->src[4] as a separate buffer to the pipeline
* add bounds check to opt-step-adamw kernel
* complete scaffold for GGML_OP_SUM
* naive GGML_OP_SUM kernel
* remove unwanted comment
* change OP_SUM capability gate
* add has_simdgroup_reduction to both ops to pass CI
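For context on the new kernel, here is a minimal C++ sketch of the decoupled AdamW update it computes per tensor element; the function name and parameter layout are illustrative assumptions, not the actual Metal kernel interface:

```cpp
#include <cmath>

// Illustrative AdamW step over n parameters w with gradients g and
// moment buffers m, v; alpha = learning rate, wd = weight decay, t >= 1.
void adamw_step(float* w, const float* g, float* m, float* v, int n,
                float alpha, float beta1, float beta2, float eps,
                float wd, int t) {
    const float bc1 = 1.0f - std::pow(beta1, (float) t); // bias correction
    const float bc2 = 1.0f - std::pow(beta2, (float) t);
    for (int i = 0; i < n; ++i) {
        m[i] = beta1 * m[i] + (1.0f - beta1) * g[i];        // 1st moment
        v[i] = beta2 * v[i] + (1.0f - beta2) * g[i] * g[i]; // 2nd moment
        const float mhat = m[i] / bc1;
        const float vhat = v[i] / bc2;
        // decoupled weight decay, then the Adam update
        w[i] = w[i] * (1.0f - alpha * wd)
             - alpha * mhat / (std::sqrt(vhat) + eps);
    }
}
```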
b6744
webui: remove client-side context pre-check and rely on backend for l…
b6644
ggml : bump version to 0.9.4 (ggml/1363)
b6618
ci : fix musa docker build (#16306)
Signed-off-by: Xiaodong Ye <yeahdongcn@gmail.com>
b6604
vulkan: support GET_ROWS for k-quants (#16235)
The dequantize functions are copy/pasted from mul_mm_funcs.comp with very few changes: add a_offset and divide iqs by 2. It's probably possible to call these functions from mul_mm_funcs and avoid the duplication, but I didn't go that far in this change.
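As a rough sketch of what GET_ROWS means for a quantized tensor, here is a hedged C++ analogue that gathers and dequantizes whole rows; the Q8_0-style block below is a simplification, and the real k-quant layouts (and the Vulkan shader) are considerably more involved:

```cpp
#include <cstdint>

// Simplified Q8_0-style block: one scale plus 32 int8 quants.
struct block_q8 {
    float  d;
    int8_t qs[32];
};

// For each requested row index, dequantize that row into f32 output.
void get_rows_q8(const block_q8* src, const int32_t* rows, int n_rows,
                 int blocks_per_row, float* dst) {
    for (int r = 0; r < n_rows; ++r) {
        const block_q8* row = src + (int64_t) rows[r] * blocks_per_row;
        float* out = dst + (int64_t) r * blocks_per_row * 32;
        for (int b = 0; b < blocks_per_row; ++b) {
            for (int i = 0; i < 32; ++i) {
                out[b * 32 + i] = row[b].d * row[b].qs[i]; // dequantize
            }
        }
    }
}
```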
b6548
common : enable `--offline` mode without curl support (#16137)
* common : use the json parser
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* common : enable --offline mode without CURL support
This change refactors the download logic to properly support offline mode
even when the project is built without CURL.
Without this commit, using `--offline` would give the following error:
    error: built without CURL, cannot download model from the internet
even if all the files are already cached.
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
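A minimal sketch of the control flow this refactor implies, assuming a hypothetical resolve_model helper rather than the actual llama.cpp download code: consult the cache first, and treat a missing CURL build as an error only when a real download is unavoidable.

```cpp
#include <filesystem>
#include <stdexcept>
#include <string>

// Hypothetical helper, not llama.cpp API: return a usable local model path.
std::string resolve_model(const std::string& cache_path,
                          bool offline, bool built_with_curl) {
    if (std::filesystem::exists(cache_path)) {
        return cache_path; // cached file works offline, with or without CURL
    }
    if (offline) {
        throw std::runtime_error("--offline: model not found in cache");
    }
    if (!built_with_curl) {
        // only an actual download requires CURL
        throw std::runtime_error(
            "built without CURL, cannot download model from the internet");
    }
    // ... download to cache_path via CURL ...
    return cache_path;
}
```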
b6530
ci : migrate ggml ci to self-hosted runners (#16116)
* ci : migrate ggml ci to self-hosted runners
* ci : add T4 runner
* ci : add instructions for adding self-hosted runners
* ci : disable test-backend-ops from debug builds due to slowness
* ci : add AMD V710 runner (vulkan)
* cont : add ROCM workflow
* ci : switch to qwen3 0.6b model
* cont : fix the context size