Johannes Gäßler
254a7a7a5f
CUDA full GPU acceleration, KV cache in VRAM (#1827)
* Fixed CUDA RoPE
* ggml_cuda_mul_mat_vec_p021
* ggml_cuda_scale
* ggml_cuda_diag_mask_inf
* ggml_is_permuted
* ggml_cuda_cpy
* flatten rows for ggml_cuda_op
* Added a --low-vram option
* Fixed Windows performance
* Fixed LLAMA_CUDA_DMMV_Y > 1 for WizardLM
2023-06-14 19:47:19 +02:00