llama.cpp: fix warning message (#11839)

There was a typo-like error that caused the warning to print the same number twice when a request is received with n_predict greater than the server-side config.

Before the fix:
```
slot launch_slot_: id  0 | task 0 | n_predict = 4096 exceeds server configuration, setting to 4096
```

After the fix:
```
slot launch_slot_: id  0 | task 0 | n_predict = 8192 exceeds server configuration, setting to 4096
```
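To see the difference in isolation, here is a minimal, self-contained sketch of the clamping logic. This is not the actual server code: plain printf stands in for SLT_WRN, and slot_state is a hypothetical reduction of the slot object in server.cpp, kept only to mirror the names in the diff below.

```cpp
#include <cstdio>

// Hypothetical reduction of the slot state in server.cpp (names follow the diff).
struct slot_state {
    int n_predict;                     // server-side limit (from server config)
    struct { int n_predict; } params;  // per-request parameters
};

int main() {
    slot_state slot = { /*n_predict=*/4096, { /*params.n_predict=*/8192 } };

    if (slot.n_predict > 0 && slot.params.n_predict > slot.n_predict) {
        // Buggy version: passes the server limit for BOTH %d placeholders,
        // so the log reads "n_predict = 4096 ... setting to 4096".
        printf("n_predict = %d exceeds server configuration, setting to %d\n",
               slot.n_predict, slot.n_predict);

        // Fixed version: the requested value first, then the limit,
        // giving "n_predict = 8192 ... setting to 4096".
        printf("n_predict = %d exceeds server configuration, setting to %d\n",
               slot.params.n_predict, slot.n_predict);

        slot.params.n_predict = slot.n_predict;  // clamp to the server limit
    }
    return 0;
}
```

With these inputs the buggy call prints 4096 twice, while the fixed call prints the requested 8192 followed by the 4096 limit, matching the before/after logs above.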
Author: Oleksandr Kuvshynov (committed by GitHub)
Date: 2025-02-13 01:25:34 -05:00
parent 3e69319772
commit e4376270d9


```diff
@@ -2073,8 +2073,8 @@ struct server_context {
         if (slot.n_predict > 0 && slot.params.n_predict > slot.n_predict) {
             // Might be better to reject the request with a 400 ?
-            SLT_WRN(slot, "n_predict = %d exceeds server configuration, setting to %d", slot.n_predict, slot.n_predict);
+            SLT_WRN(slot, "n_predict = %d exceeds server configuration, setting to %d", slot.params.n_predict, slot.n_predict);
             slot.params.n_predict = slot.n_predict;
         }

         if (slot.params.ignore_eos && has_eos_token) {
```