diff --git a/docs/function-calling.md b/docs/function-calling.md index 92cb6531a..c3873c3fa 100644 --- a/docs/function-calling.md +++ b/docs/function-calling.md @@ -287,30 +287,32 @@ Here are some models known to work (w/ chat template override when needed): llama-server --jinja -fa -hf bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M llama-server --jinja -fa -hf bartowski/Mistral-Nemo-Instruct-2407-GGUF:Q6_K_L -llama-server --jinja -fa -hf bartowski/functionary-small-v3.2-GGUF:Q4_K_M llama-server --jinja -fa -hf bartowski/Llama-3.3-70B-Instruct-GGUF:Q4_K_M -# Native support for DeepSeek R1 works best w/ our own template (official template buggy) +# Native support for DeepSeek R1 works best w/ our template override (official template is buggy, although we do work around it) llama-server --jinja -fa -hf bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF:Q6_K_L \ ---chat-template-file models/templates/llama-cpp-deepseek-r1.jinja + --chat-template-file models/templates/llama-cpp-deepseek-r1.jinja llama-server --jinja -fa -hf bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF:Q4_K_M \ ---chat-template-file models/templates/llama-cpp-deepseek-r1.jinja + --chat-template-file models/templates/llama-cpp-deepseek-r1.jinja # Native support requires the right template for these GGUFs: +llama-server --jinja -fa -hf bartowski/functionary-small-v3.2-GGUF:Q4_K_M + --chat-template-file models/templates/meetkai-functionary-medium-v3.2.jinja + llama-server --jinja -fa -hf bartowski/Hermes-2-Pro-Llama-3-8B-GGUF:Q4_K_M \ ---chat-template-file <( python scripts/get_chat_template.py NousResearch/Hermes-2-Pro-Llama-3-8B tool_use ) + --chat-template-file models/templates/NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use.jinja llama-server --jinja -fa -hf bartowski/Hermes-3-Llama-3.1-8B-GGUF:Q4_K_M \ ---chat-template-file <( python scripts/get_chat_template.py NousResearch/Hermes-3-Llama-3.1-8B tool_use ) + --chat-template-file models/templates/NousResearch-Hermes-3-Llama-3.1-8B-tool_use.jinja llama-server --jinja -fa -hf bartowski/firefunction-v2-GGUF -hff firefunction-v2-IQ1_M.gguf \ ---chat-template-file <( python scripts/get_chat_template.py fireworks-ai/llama-3-firefunction-v2 tool_use ) + --chat-template-file models/templates/fireworks-ai-llama-3-firefunction-v2.jinja llama-server --jinja -fa -hf bartowski/c4ai-command-r7b-12-2024-GGUF:Q6_K_L \ ---chat-template-file <( python scripts/get_chat_template.py CohereForAI/c4ai-command-r7b-12-2024 tool_use ) + --chat-template-file models/templates/CohereForAI-c4ai-command-r7b-12-2024-tool_use.jinja # Generic format support llama-server --jinja -fa -hf bartowski/phi-4-GGUF:Q4_0 @@ -318,6 +320,8 @@ llama-server --jinja -fa -hf bartowski/gemma-2-2b-it-GGUF:Q8_0 llama-server --jinja -fa -hf bartowski/c4ai-command-r-v01-GGUF:Q2_K ``` +To get the official template from original HuggingFace repos, you can use [scripts/get_chat_template.py](../scripts/get_chat_template.py) (see examples invocations in [models/templates/README.md](../models/templates/README.md)) + > [!TIP] > If there is no official `tool_use` Jinja template, you may want to set `--chat-template chatml` to use a default that works with many models (YMMV!), or write your own (e.g. we provide a custom [llama-cpp-deepseek-r1.jinja](../models/templates/llama-cpp-deepseek-r1.jinja) for DeepSeek R1 distills)