llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-04-19 04:56:11 +00:00

master

6408210082 · main : Fix Ctrl+D/newline handling (#12951) · Updated 2025-04-18 20:02:55 +00:00

ggml-quants 8a86b95e87 · quantize : --pure option for disabling k-quant mixtures · Updated 2023-10-28 20:37:03 +00:00 mirrors	3715 3
apply-3585 de7e0912b6 · convert : ignore tokens if their IDs are within [0, vocab_size) · Updated 2023-10-28 12:01:36 +00:00 mirrors	3718 1
sampling-greedy-with-probs bbfc62ac2f · sampling : temp == 0.0 -> no probs, temp < 0.0 -> probs · Updated 2023-10-28 11:04:57 +00:00 mirrors	3726 3
cuda-multi-gpu cd3e20fb50 · cuda : fix multi-gpu with tensor cores · Updated 2023-10-27 20:11:50 +00:00 mirrors	3725 3
cuda-quantum-batch 49af767fad · build : add compile option to force use of MMQ kernels · Updated 2023-10-27 10:21:04 +00:00 mirrors	3727 7
cuda-batched-gemm d798a17c34 · cuda : add TODO for calling cublas from kernel + using mem pool · Updated 2023-10-24 13:33:24 +00:00 mirrors	3741 10
cuda-batched-gemm-deq 6966474928 · cuda : play with faster Q4_0 dequantization · Updated 2023-10-24 07:29:40 +00:00 mirrors	3741 8
upd-issue-templates b9bb4cbe86 · Separate bug and enhancement template + no default title · Updated 2023-10-23 15:59:11 +00:00 mirrors	3741 1
server-rev c0f4d54870 · server : add comment about changing slot_state to bool · Updated 2023-10-22 19:24:39 +00:00 mirrors	3747 72
perf-study cb79f8a2d8 · llama : add SKIP_KQ_KQV option · Updated 2023-10-22 06:58:29 +00:00 mirrors	3747 3
sampling-refactor 56ba00b923 · sampling : hide prev behind API and apply #3661 · Updated 2023-10-20 15:53:27 +00:00 mirrors	3750 6
speculative-tree ad2727d091 · Merge branch 'master' into speculative-tree · Updated 2023-10-18 07:50:58 +00:00 mirrors	3761 18
llava-fix-offloading 932589c0ef · Honor -ngl option for Cuda offloading in llava · Updated 2023-10-14 00:12:10 +00:00 mirrors	3775 1
rev-sampling 5261aee8d8 · sampling : one sequence per sampling context · Updated 2023-10-12 17:36:44 +00:00 mirrors	3778 1
batched-bench 2fcdf869cd · batched-bench : add mmq CLI arg · Updated 2023-10-11 16:42:33 +00:00 mirrors	3790 7
alloc-assert-fix ee7456926e · ggml-alloc : fix assert in debug builds · Updated 2023-10-09 12:33:12 +00:00 mirrors	3799 1
fix-kv-cache-access ee268b5446 · llama : no longer perform uninitialized access to the KV cache · Updated 2023-10-08 08:49:38 +00:00 mirrors	3806 5
fix-refact acead654d2 · Merge branch 'master' into fix-refact · Updated 2023-10-08 08:25:16 +00:00 mirrors	3806 4
metal-improve-batching 6b9554a740 · metal : print more GPU info + disable mul_mm for MTLGPUFamiliy < Apple7 · Updated 2023-10-08 06:55:13 +00:00 mirrors	3813 5
gguf-fix-publish ba44776dc2 · bump version · Updated 2023-10-07 18:47:48 +00:00 mirrors	3812 6

