* sampling : refactor + optimize penalties sampler
ggml-ci
* common : apply ignore_eos as logit bias
ggml-ci
* batched : remove penalties sampler
* params : allow penalty_last_n == -1 to be equal to context size
ggml-ci
* common : by default, move the penalties at the end of the sampling chain
ggml-ci
* common : ignore all EOG tokens
Co-authored-by: Diego Devesa <slarengh@gmail.com>
* common : move back the penalties at the front of the sampling chain
ggml-ci
* readme : restore hint about --ignore-eos flag [no ci]
* llama : minor
ggml-ci
* webui : update
---------
Co-authored-by: Diego Devesa <slarengh@gmail.com>