Misc. bug: kv_unified = true despite not setting --kv-unified #17450

@wallentri88

Description

Name and Version

latest master (96ac5a2)

Operating systems

No response

Which llama.cpp modules do you know to be affected?

llama-server

Command line

Problem description & steps to reproduce

Try to run any model using llama-server. The documentation says the --kv-unified argument should enable the unified KV cache and that it should be disabled by default. However, it is enabled by default (the startup log reports kv_unified = true), and I did not find a way to disable it.
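A minimal reproduction sketch (the model path is a placeholder; any GGUF model should do, and the -c / -np values simply mirror the n_ctx and n_seq_max entries in the log output below):

```shell
# Start the server without passing --kv-unified.
./llama-server -m ./models/any-model.gguf -c 131072 -np 4
# The startup log nevertheless prints "kv_unified = true".
```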

First Bad Commit

probably this one? #16736

Relevant log output

llama_context: constructing llama_context
llama_context: n_seq_max     = 4
llama_context: n_ctx         = 131072
llama_context: n_ctx_seq     = 131072
llama_context: n_batch       = 2048
llama_context: n_ubatch      = 2048
llama_context: causal_attn   = 1
llama_context: flash_attn    = enabled
llama_context: kv_unified    = true
llama_context: freq_base     = 150000.0
llama_context: freq_scale    = 0.03125

Metadata

Assignees: no one assigned
Labels: low severity
