rpc : update README for cache usage
Commit c875e03f96 (parent ab6ab8f809)
@@ -72,3 +72,14 @@ $ bin/llama-cli -m ../models/tinyllama-1b/ggml-model-f16.gguf -p "Hello, my name
This way you can offload model layers to both local and remote devices.
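For example, a command along these lines splits the layers between the local backend and two remote `rpc-server` instances; this is only a sketch, and the server addresses, prompt, and layer count are illustrative rather than taken from this commit:

```bash
# Offload layers across local and remote devices (addresses are illustrative)
$ bin/llama-cli -m ../models/tinyllama-1b/ggml-model-f16.gguf -p "Hello" \
    --rpc 192.168.88.10:50052,192.168.88.11:50052 -ngl 99
```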
### Local cache
The RPC server can use a local cache to store large tensors and avoid transferring them over the network.
This can speed up model loading significantly, especially when using large models.
To enable the cache, use the `-c` option:
```bash
$ bin/rpc-server -c
```
By default, the cache is stored in the `$HOME/.cache/llama.cpp/rpc` directory and can be controlled via the `LLAMA_CACHE` environment variable.
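If the default location is not suitable, the cache directory can be redirected through that variable when launching the server. A minimal sketch, assuming the target path exists (the path itself is illustrative, and the exact layout created under it may vary):

```bash
# Keep the RPC tensor cache on a larger/faster drive (path is illustrative)
$ LLAMA_CACHE=/mnt/ssd/llama-cache bin/rpc-server -c
```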