mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-04-20 21:46:07 +00:00

common : fix mirostat state when using multiple sequences (#3543)

* Fix mirostat state when using multiple sequences

* Fix mirostat by completely refactoring sampling!

* Try to fix zig build.

* Export function to fetch/create default sampler states

Code formatting cleanups and add some comments

Silence a warning about id not being used when logging is disabled

* Apply some renaming suggestions.

Fix comments that were out of sync with the pull.

* Use more consistent naming convention for sampling contexts

2023-10-11 22:35:46 +03:00

CMakeLists.txt

llama : custom attention mask + parallel decoding + no context swaps (#3228)

2023-09-28 19:04:36 +03:00

parallel.cpp

common : fix mirostat state when using multiple sequences (#3543)

2023-10-11 22:35:46 +03:00

README.md

llama : custom attention mask + parallel decoding + no context swaps (#3228)

2023-09-28 19:04:36 +03:00

README.md

llama.cpp/example/parallel

Simplified simulation of serving incoming requests in parallel.