llama.cpp

Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2025-04-19 13:06:10 +00:00

History

opencl: fix for small models (#11950)

* opencl: fix small shape gemv, remove unused extensions

* opencl: fix `transpose_16`, `dump_tensor`, enforce subgroup size

* opencl: fix for token length < 4

* opencl: use wave size of 64 for all Adreno GPUs

---------

Co-authored-by: Shawn Gu <quic_shawngu@quicinc.com>
Co-authored-by: Skyler Szot <quic_sszot@quicinc.com>

2025-02-24 14:47:07 -07:00

Name            Last commit                                            Date
cmake           cmake: add ggml find package (#11369)                  2025-01-26 12:07:48 -04:00
include         ggml-cpu: Support s390x SIMD Instruction Set (#12019)  2025-02-22 21:39:24 +00:00
src             opencl: fix for small models (#11950)                  2025-02-24 14:47:07 -07:00
.gitignore      vulkan : cmake integration (#8119)                     2024-07-13 18:12:39 +02:00
CMakeLists.txt  ggml-cpu: Support s390x SIMD Instruction Set (#12019)  2025-02-22 21:39:24 +00:00