llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-04-20 21:46:07 +00:00

History

* imatrix: load

* imatrix: WIP

* imatrix: Add Q2_K quantization

* imatrix: also guard against Q2_K_S quantization without importance matrix

* imatrix: guard even more against low-bit quantization misuse

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2024-01-14 09:45:56 +02:00

benchmark-matmult.cpp

2-bit quantizations (#4897 )

2024-01-14 09:45:56 +02:00

CMakeLists.txt

build : link against build info instead of compiling against it (#3879 )

2023-11-02 08:50:16 +02:00