mirror of https://github.com/ggerganov/llama.cpp.git
synced 2025-04-15 19:16:09 +00:00

* initial sampling changes:

* completed top nsigma sampler implementation

* apply parameter to only llama-cli

* updated readme

* added tests and fixed nsigma impl

* cleaned up pr

* format

* format

* format

* removed commented tests

* cleanup pr and remove explicit floats

* added top-k sampler to improve performance

* changed sigma to float

* fixed string format to float

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update common/sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update src/llama-sampling.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* added llama_sampler_init

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
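The commit message names a top nsigma sampler but does not describe it. As a rough illustration only: top-nsigma filtering is generally understood to keep the tokens whose logits lie within n standard deviations of the maximum logit and to mask the rest. The sketch below assumes that definition and uses the population standard deviation. `top_n_sigma_filter` is a hypothetical standalone helper, not a function from `src/llama-sampling.cpp`, and the code in this commit may differ (for example, the message mentions a top-k step added for performance, which is not shown here).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper for illustration; not part of the llama.cpp API.
// Returns a copy of `logits` in which every entry below
// (max_logit - n * sigma) is set to -infinity, where sigma is the
// population standard deviation of the logits.
std::vector<float> top_n_sigma_filter(const std::vector<float> & logits, float n) {
    assert(!logits.empty());

    const float max_logit = *std::max_element(logits.begin(), logits.end());

    // Mean of the logits.
    float mean = 0.0f;
    for (float l : logits) {
        mean += l;
    }
    mean /= static_cast<float>(logits.size());

    // Population variance and standard deviation.
    float var = 0.0f;
    for (float l : logits) {
        var += (l - mean) * (l - mean);
    }
    const float sigma = std::sqrt(var / static_cast<float>(logits.size()));

    const float threshold = max_logit - n * sigma;

    // Mask tokens that fall below the threshold.
    std::vector<float> out(logits);
    for (float & l : out) {
        if (l < threshold) {
            l = -std::numeric_limits<float>::infinity();
        }
    }
    return out;
}
```

A masked logit of -infinity contributes zero probability after softmax, so only the surviving tokens can be sampled. A smaller n keeps fewer tokens; a larger n keeps more.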