Vitali Lovich 3e9a2860e9
llama : expose llama_model_n_head_kv in the API (#11997)
It's useful to be able to have this from the library layer as it's a key
parameter of the model (e.g. to figure out how much KV cache memory is
needed).
2025-02-25 11:29:33 +02:00
..
2025-02-08 16:49:38 +02:00