llama.cpp/examples at 296901983700f3c37449bcb555d85d27150a679d

mirrors/llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-04-16 11:36:08 +00:00

Ivy233 02082f1519

clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend (#12566)

* [Fix] When clip-quantize-cli is compiled and run in a CUDA environment, ggml_fp16_to_fp32 reports an error when it tries to access video memory, so quantization has to run on the CPU backend.
After the fix, quantization runs on the CPU backend automatically and is no longer bound to CUDA.

* [Fix] Roll back the signature and implementation of clip_model_load, and change the call in clip_model_quantize to clip_init.

2025-03-26 15:06:04 +01:00

llama : add llama_vocab, functions -> methods, naming (#11110)

2025-01-12 11:32:42 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

convert-llama2c-to-ggml

llama : add llama_vocab, functions -> methods, naming (#11110)

2025-01-12 11:32:42 +02:00

cvector-generator

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

deprecation-warning

Update deprecation-warning.cpp (#10619)

2024-12-04 23:19:20 +01:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : add llama_vocab, functions -> methods, naming (#11110)

2025-01-12 11:32:42 +02:00

common : refactor '-o' option (#12278)

2025-03-10 13:34:13 +02:00

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

ggml : move AMX to the CPU backend (#10570)

2024-11-29 21:54:58 +01:00

GGUF: C++ refactor, backend support, misc fixes (#11030)

2025-01-07 18:01:58 +01:00

GGUF: C++ refactor, backend support, misc fixes (#11030)

2025-01-07 18:01:58 +01:00

ci : use -no-cnv in gguf-split tests (#11254)

2025-01-15 18:28:35 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)

2024-06-13 00:41:52 +01:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

clip: Fix llama-llava-clip-quantize-cli quantization error under CUDA backend (#12566)

2025-03-26 15:06:04 +01:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

docs : bring llama-cli conversation/template docs up-to-date (#12426)

2025-03-17 21:14:32 +01:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

ggml : portability fixes for VS 2017 (#12150)

2025-03-04 18:53:26 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

rpc-server : add support for the SYCL backend (#10934)

2024-12-23 10:39:30 +02:00

run: de-duplicate fmt and format functions and optimize (#11596)

2025-03-25 18:46:11 +01:00

save-load-state

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

server : Add verbose output to OAI compatible chat endpoint. (#12246)

2025-03-23 19:30:26 +01:00

llama : add llama_vocab, functions -> methods, naming (#11110)

2025-01-12 11:32:42 +02:00

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

simple-cmake-pkg

repo : update links to new url (#11886)

2025-02-15 16:40:57 +02:00

speculative : fix seg fault in certain cases (#12454)

2025-03-18 19:35:11 +02:00

speculative-simple

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

2025-03-13 12:35:44 +02:00

[SYCL] Optimize mul_mat for Q4_0 on Intel GPU (#12035)

2025-02-24 22:33:23 +08:00

llama : add llama_vocab, functions -> methods, naming (#11110)

2025-01-12 11:32:42 +02:00

llama-tts : avoid crashes related to bad model file paths (#12482)

2025-03-21 11:12:45 +02:00

chat-13B.bat

Create chat-13B.bat (#592)

2023-03-29 20:21:09 +03:00

chat-13B.sh

build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)

2024-06-13 00:41:52 +01:00

chat-persistent.sh

scripts : fix pattern and get n_tokens in one go (#10221)

2024-11-09 09:06:54 +02:00

chat-vicuna.sh

build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)

2024-06-13 00:41:52 +01:00

chat.sh

build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)

2024-06-13 00:41:52 +01:00

CMakeLists.txt

tts : add OuteTTS support (#10784)

2024-12-18 19:27:21 +02:00

convert_legacy_llama.py

metadata: Detailed Dataset Authorship Metadata (#8875)

2024-11-13 21:10:38 +11:00

json_schema_pydantic_example.py

py : type-check all Python scripts with Pyright (#8341)

2024-07-07 15:04:39 -04:00

json_schema_to_grammar.py

tool-call: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars (#12034)

2025-03-05 13:05:13 +00:00

llama.vim

repo : update links to new url (#11886)

2025-02-15 16:40:57 +02:00

llm.vim

llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879)

2023-08-30 09:50:55 +03:00

Miku.sh

build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)

2024-06-13 00:41:52 +01:00

pydantic_models_to_grammar_examples.py

repo : update links to new url (#11886)

2025-02-15 16:40:57 +02:00

pydantic_models_to_grammar.py

pydantic : replace uses of __annotations__ with get_type_hints (#8474)

2024-07-14 19:51:21 -04:00

reason-act.sh

build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)

2024-06-13 00:41:52 +01:00

regex_to_grammar.py

py : switch to snake_case (#8305)

2024-07-05 07:53:33 +03:00

server_embd.py

py : type-check all Python scripts with Pyright (#8341)

2024-07-07 15:04:39 -04:00

server-llama2-13B.sh

build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)

2024-06-13 00:41:52 +01:00

ts-type-to-grammar.sh

JSON schema conversion: ⚡️ faster repetitions, min/maxLength for strings, cap number length (#6555)

2024-04-12 19:43:38 +01:00