mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2025-04-19 21:16:06 +00:00

* gguf-py : support lazy tensor splitting

  Splitting usually involves returning tuples of tensors, which need to be
  handled properly to avoid early eager evaluation.

* gguf-py : fix flake8 lint
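
For illustration only, here is a minimal sketch of the idea behind lazy splitting. It is not the gguf-py implementation: `LazyValue`, `split_eager`, `split_lazy`, and `EVAL_LOG` are hypothetical names, plain Python lists stand in for tensors, and the number of parts is assumed to be known up front (a real implementation would presumably derive it from shape metadata). The point it tries to show is that a function returning a tuple can hand back a tuple of deferred elements that share one underlying evaluation, instead of forcing the split as soon as the tuple is unpacked.

```python
from typing import Any, Callable, List, Optional, Tuple


class LazyValue:
    """Hypothetical deferred value: computed on first use, then cached."""

    def __init__(self, thunk: Callable[[], Any]) -> None:
        self._thunk: Optional[Callable[[], Any]] = thunk
        self._done = False
        self._value: Any = None

    def evaluate(self) -> Any:
        if not self._done:
            assert self._thunk is not None
            self._value = self._thunk()
            self._done = True
            self._thunk = None  # drop the closure once the value is cached
        return self._value


# Records every time the real (eager) split runs, to make laziness observable.
EVAL_LOG: List[str] = []


def split_eager(data: list, parts: int) -> Tuple[list, ...]:
    """Eager split of a list into `parts` equal chunks (stand-in for a tensor split)."""
    EVAL_LOG.append("split")
    size = len(data) // parts
    return tuple(data[i * size:(i + 1) * size] for i in range(parts))


def split_lazy(source: LazyValue, parts: int) -> Tuple[LazyValue, ...]:
    """Return a tuple of lazy chunks without evaluating `source`."""
    # One shared lazy value holds the whole result tuple, so the split runs at most once.
    shared = LazyValue(lambda: split_eager(source.evaluate(), parts))
    # Each element is its own lazy value that indexes into the shared result.
    return tuple(LazyValue(lambda i=i: shared.evaluate()[i]) for i in range(parts))


source = LazyValue(lambda: list(range(6)))
a, b, c = split_lazy(source, 3)  # unpacking works, nothing is computed yet
assert EVAL_LOG == []
assert b.evaluate() == [2, 3]    # first use triggers the single shared split
assert a.evaluate() == [0, 1]    # reuses the cached tuple
assert EVAL_LOG == ["split"]
```

Routing every element through one shared lazy value keeps the elements consistent with each other and avoids repeating the split once per element.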