llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-04-23 20:36:04 +00:00


convert_hf : faster lazy safetensors (#8482)

* convert_hf : faster lazy safetensors

This makes '--dry-run' much, much faster.

* convert_hf : fix memory leak in lazy MoE conversion

The '_lazy' queue was sometimes self-referential,
which created reference cycles among objects old enough
to escape garbage collection, potentially until memory was exhausted.
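
As a rough illustration of the failure mode described above, the following sketch shows how a queue that contains its own owner keeps that owner alive until the cyclic garbage collector runs. `LazyNode` and `make_cycle` are hypothetical names invented for this example; they are not the gguf-py classes, and the sketch does not reproduce the actual fix.

```python
import gc
import weakref
from collections import deque


class LazyNode:
    """Hypothetical stand-in for a lazily evaluated tensor (not the gguf-py class)."""

    def __init__(self):
        # Queue of pending work; in this sketch it may end up holding its owner.
        self.queue = deque()


def make_cycle():
    node = LazyNode()
    node.queue.append(node)  # the queue references its owner: a reference cycle
    return weakref.ref(node)  # weak reference lets us observe the object's lifetime


def demo():
    gc.disable()  # rule out an automatic collection during the check
    try:
        ref = make_cycle()
        # Reference counting alone cannot free the node, because the cycle
        # keeps its count above zero after the local name goes away.
        alive_before = ref() is not None
        gc.collect()  # a full collection finds and breaks the cycle
        alive_after = ref() is not None
    finally:
        gc.enable()
    return alive_before, alive_after


alive_before, alive_after = demo()
```

In CPython, objects that survive early collections are promoted to older generations, which are scanned less often; this is one plausible reading of "old enough" in the commit message. Avoiding the self-reference in the first place lets plain reference counting free the objects immediately.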

2024-07-15 23:13:10 -04:00

__init__.py

convert-hf : support direct Q8_0 conversion (#7234)

2024-05-13 14:10:51 -04:00

constants.py

Refactor lora adapter support (#8332)

2024-07-15 20:50:47 +02:00

gguf_reader.py

py : type-check all Python scripts with Pyright (#8341)

2024-07-07 15:04:39 -04:00

gguf_writer.py

Refactor lora adapter support (#8332)