Releases: ggml-org/llama.cpp

b7200

29 Nov 21:41
ab49f09

server: move server-context to its own cpp|h (#17595)

* git mv

* add server-context.h

* clean up headers

* cont : cleanup

* also expose server_response_reader (to be used by CLI)

* fix windows build

* decouple server_routes and server_http

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

b7199

29 Nov 18:37
8c32d9d

server: explicitly set the function name in lambda (#17538)

As explained in [1], __func__ inside a lambda expands to "operator()", so the actual debug message looks like:
	"res    operator(): operator() : queue result stop"

With the function name set explicitly, the message is easier to use for debugging:
	"res    operator(): recv : queue result stop"

The leftmost "operator()" is generated by 'RES_DBG() ... __func__'.

[1]: https://clang.llvm.org/extra/clang-tidy/checks/bugprone/lambda-function-name.html

Signed-off-by: Haiyue Wang <haiyuewa@163.com>
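
The effect described in this commit message can be reproduced in isolation. The following is a minimal sketch, not the actual patch: `tag`, `implicit_name`, `explicit_name`, and `func_name` are hypothetical names chosen for illustration, and the `"operator()"` value of `__func__` inside a lambda is what GCC, Clang, and MSVC produce in practice rather than something the C++ standard guarantees.

```cpp
#include <string>

// Hypothetical helper standing in for a debug macro: formats "<func> : <msg>".
static std::string tag(const char * func, const char * msg) {
    return std::string(func) + " : " + msg;
}

// Inside a lambda, __func__ names the closure's call operator, so the
// message carries "operator()" instead of a meaningful name
// (behaviour of GCC/Clang/MSVC; the exact string is implementation-defined).
static std::string implicit_name() {
    auto recv = [] { return tag(__func__, "queue result stop"); };
    return recv();
}

// Passing an explicit name makes the message point at the logical function.
static std::string explicit_name() {
    auto recv = [] {
        static const char * func_name = "recv"; // assumption: illustrative name
        return tag(func_name, "queue result stop");
    };
    return recv();
}
```

Calling `implicit_name()` yields a message tagged with the call operator, while `explicit_name()` yields one tagged with `recv`, which mirrors the before/after messages quoted above.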

b7198

29 Nov 16:57
0874693

common : fix json schema with '\' in literals (#17307)

* Fix json schema with '\' in literals

* Add "literal string with escapes" test

b7197

29 Nov 14:27
7d2add5

sycl : support to malloc memory on device more than 4GB, update the d…

b7196

29 Nov 14:23
f698a79

ggml: replace hwcap with riscv_hwprobe for RVV detection (#17567)

Signed-off-by: Wang Yang <yangwang@iscas.ac.cn>

b7195

29 Nov 09:01
47a268e

Vulkan: MMVQ Integer Dot K-Quant and MUL_MAT_ID support (#16900)

* vulkan: split mul_mmq_funcs for mul_mat_vecq use

* add mxfp4 mmvq

* add q2_k mmvq

* add q3_k mmvq

* add q4_k and q5_k mmvq

* add q6_k mmvq

* handle 4x4 quants per mmvq thread

* enable MUL_MAT_ID mmvq support

* enable subgroup optimizations for mul_mat_vec_id shaders

* device tuning

* request prealloc_y sync after quantization

* fix indentation

* fix llvmpipe test failures

* fix mul_mat_id mmvq condition

* fix unused variable warning

b7194

29 Nov 08:30
59d8d4e

vulkan: improve topk perf for large k, fix overflow in unit tests (#1…

b7192

28 Nov 20:21
03914c7

common : move all common_chat_parse_* to chat-parser.cpp. (#17481)

b7191

28 Nov 19:57
3ce7a65

server: fix: /metrics endpoint returning JSON-escaped Prometheus form…