Merge pull request #25444 from justinjfu:gpu_docs_update

PiperOrigin-RevId: 705938411
jax authors 2024-12-13 11:05:31 -08:00
commit 99b390ce96

@@ -44,11 +44,8 @@ example, we can add this to the top of a Python file:
 ```python
 import os
 os.environ['XLA_FLAGS'] = (
-    '--xla_gpu_enable_triton_softmax_fusion=true '
     '--xla_gpu_triton_gemm_any=True '
-    '--xla_gpu_enable_async_collectives=true '
     '--xla_gpu_enable_latency_hiding_scheduler=true '
-    '--xla_gpu_enable_highest_priority_async_stream=true '
 )
 ```
@@ -58,9 +55,6 @@ training on Nvidia GPUs](https://github.com/NVIDIA/JAX-Toolbox/blob/main/rosetta
 ### Code generation flags
-* **--xla_gpu_enable_triton_softmax_fusion** This flag enables an automatic
-  softmax fusion, based on pattern-matching backed by Triton code generation.
-  The default value is False.
 * **--xla_gpu_triton_gemm_any** Use the Triton-based GEMM (matmul) emitter for
   any GEMM that it supports. The default value is False.
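
For illustration (not part of the diff), a minimal sketch of how a single retained flag from the list above could be set; the flag name comes from the docs text, while the surrounding setup is assumed. `XLA_FLAGS` must be set before JAX initializes its backend.

```python
import os

# Sketch: opt in to the Triton-based GEMM emitter via XLA_FLAGS.
# Set this before the first `import jax` so XLA picks it up at backend init.
os.environ['XLA_FLAGS'] = '--xla_gpu_triton_gemm_any=True'

print(os.environ['XLA_FLAGS'])
```

Multiple flags are combined into the same space-separated string, as shown in the snippet earlier in this document.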