Mirror of https://github.com/ggerganov/llama.cpp.git, synced 2025-04-17 12:06:10 +00:00

Default Branch

971f245b3b · llama : recognize IBM Granite 3.3 FIM tokens () · Updated 2025-04-17 08:37:05 +00:00

Branches

4baa85633a · Fix build · Updated 2023-05-07 01:44:07 +00:00 · 4632 behind, 5 ahead

31ff9e2e83 · ci : add cublas to windows release · Updated 2023-05-03 21:21:20 +00:00 · 4647 behind, 1 ahead

102cd98074 · ggml : Q4_3c using 2x "Full range" approach · Updated 2023-04-23 11:56:44 +00:00 · 4728 behind, 8 ahead

71e6ae3779 · ggml : continue from (wip) · Updated 2023-04-22 15:49:07 +00:00 · 4728 behind, 7 ahead

a0242a833c · Minor, plus rebase on master · Updated 2023-04-22 14:07:10 +00:00 · 4728 behind, 2 ahead

4b8d5e3890 · llama : quantize attention results · Updated 2023-04-22 08:35:13 +00:00 · 4733 behind, 1 ahead

1506737499 · Add mmap pages stats (disabled by default) · Updated 2023-04-16 16:22:30 +00:00 · 4783 behind, 1 ahead

36ddd12924 · llama : add flash attention (demo) · Updated 2023-04-05 19:12:04 +00:00 · 4849 behind, 1 ahead

c9c820ff36 · Added support for _POSIX_MAPPED_FILES if defined in source () · Updated 2023-03-28 21:26:25 +00:00 · 5083 behind, 8 ahead

4aeee216fd · Regroup q4_1 dot addition for better numerics. · Updated 2023-03-24 20:20:57 +00:00 · 4964 behind, 2 ahead

66ea164e1d · Kahan summation on Q4_1 · Updated 2023-03-23 03:28:51 +00:00 · 4991 behind, 2 ahead

711224708d · Break up loop for numeric stability · Updated 2023-03-23 02:14:44 +00:00 · 4991 behind, 2 ahead

dev · 3a0dcb3920 · Implement server mode. · Updated 2023-03-22 17:34:19 +00:00 · 4992 behind, 5 ahead

a169bb889c · Gate signal support on being on a unixoid system. () · Updated 2023-03-13 03:08:01 +00:00 · 5098 behind, 0 ahead · Included