rocm_jax/jax at 4b49c0352355c4d71c701aba74e52c678d6627fd

mirrors/rocm_jax

mirror of https://github.com/ROCm/jax.git synced 2025-04-16 11:56:07 +00:00

Jevin Jiang 4b49c03523 Open source TPU-friendly ragged paged attention kernel.

Key features:
* ***Support mixed prefill and decode*** to increase throughput for inference (e.g., ***5x*** speedup compared to the padded Multi-Queries Paged Attention implementation for llama-3-8b).
* ***No explicit `swapaxes`*** for `seq_len` and `num_head` before or after the kernel. The kernel takes `num_head` in the second-minor dimension, as it naturally is. We fold the swapaxes into strided loads/stores in the kernel and apply the transpose on the fly.
* ***No GMM (Grouped Matmul) metadata required!*** We calculate the metadata on the fly in the kernel. This can give a ***10%*** speedup!
* ***Increase MXU utilization 8x in GQA*** by grouping shared q heads for MXU in decode.
* ***Minimize recompilation:*** The only factors that can cause recompilation are the model specs, `max_num_batched_tokens`, and `max_num_seqs` in the mixed-engine setting.
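
The GQA bullet above can be illustrated with a small sketch. It is not the kernel's code: it uses plain NumPy rather than Pallas, and all sizes and names (`group_size`, `scores_per_head`, `scores_grouped`) are hypothetical, chosen only for illustration. The idea shown is that in decode each sequence contributes a single query token, so a per-head score computation is a one-row product; stacking the q heads that share a kv head turns it into a `[group_size, head_dim] x [head_dim, kv_len]` matmul with the same result.

```python
import numpy as np

# Hypothetical sizes for illustration only (not taken from the kernel).
num_kv_heads = 2
group_size = 4                       # q heads that share one kv head
num_q_heads = num_kv_heads * group_size
head_dim = 8
kv_len = 16

rng = np.random.default_rng(0)
# One decode token: a single row of length head_dim per q head.
q = rng.standard_normal((num_q_heads, head_dim))
k = rng.standard_normal((num_kv_heads, kv_len, head_dim))


def scores_per_head(q, k, group_size):
    """Baseline: one [1, head_dim] x [head_dim, kv_len] product per q head."""
    out = np.empty((q.shape[0], k.shape[1]))
    for h in range(q.shape[0]):
        kv = h // group_size         # kv head shared by this q head
        out[h] = q[h] @ k[kv].T
    return out


def scores_grouped(q, k, group_size):
    """Grouped: one [group_size, head_dim] x [head_dim, kv_len] product per kv head."""
    n_kv, kv_len, d = k.shape
    q_grouped = q.reshape(n_kv, group_size, d)
    out = np.einsum("ngd,nkd->ngk", q_grouped, k)
    return out.reshape(n_kv * group_size, kv_len)


baseline = scores_per_head(q, k, group_size)
grouped = scores_grouped(q, k, group_size)
```

Both functions produce the same attention scores; the grouped form simply presents the matrix unit with `group_size` rows per product instead of one, which is the kind of reshaping the bullet credits for higher MXU utilization.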

PiperOrigin-RevId: 734269519

2025-03-06 13:36:45 -08:00

Minor bug fixes in error checking

2025-03-06 06:57:52 -08:00

example_libraries

update example optimizers library docstring

2024-09-03 23:40:47 -07:00

Open source TPU-friendly ragged paged attention kernel.

2025-03-06 13:36:45 -08:00

Create jax wheel build target.

2025-02-25 09:30:08 -08:00

Update references to the GitHub url in JAX codebase to reflect move from google/jax to jax-ml/jax

2024-09-20 07:52:33 -07:00

Refactor Jax FFI lowering to prepare for implementing CPU/GPU callbacks using XLA's FFI.

2025-02-21 09:45:59 -08:00

Start adding primitive registration helper functions to lax.linalg.

2025-02-21 04:05:34 -08:00

Finalize deprecation of some symbols from jax.lib.xla_client

2024-12-23 10:14:16 -08:00

block_scale_config

2025-02-13 04:35:06 +00:00

jax.numpy ndim/shape/size: deprecate non-array input

2025-03-04 10:42:32 -08:00

Update references to the GitHub url in JAX codebase to reflect move from google/jax to jax-ml/jax

2024-09-20 07:52:33 -07:00

feat(gh-13291): Add exponential distribution functions: cdf, logcdf, sf, logsf, and ppf

2025-02-01 12:51:11 +05:00

Create jax wheel build target.

2025-02-25 09:30:08 -08:00

__init__.py

Add jax.copy_to_host_async(tree).

2025-02-27 01:22:15 -08:00

ad_checkpoint.py

Cleanup: fix unused imports & mark exported names

2024-10-16 17:42:41 -07:00

api_util.py

[better_errors] Continue adding debug info to Jaxprs (step 7)

2025-02-09 18:14:33 +02:00

BUILD

Add fuser to jax.experimental.pallas

2025-03-03 17:26:44 -08:00

cloud_tpu_init.py

Cleanup: fix unused imports & mark exported names

2024-10-16 17:42:41 -07:00

collect_profile.py

Use pathlib for profiler log_dir

2024-07-28 18:47:45 +03:00

core.py

[better_errors] Merge the JaxprDebugInfo and TracingDebugInfo into core.DebugInfo

2025-02-02 06:23:03 +02:00

custom_batching.py

Use PEP484-style exports in several submodules

2024-08-14 08:59:56 -07:00

custom_derivatives.py

Reverts 342cb7b99a09180472823a33c7cdad8a8db77875

2025-03-05 10:22:40 -08:00

custom_transpose.py

Use PEP484-style exports in several submodules

2024-08-14 08:59:56 -07:00

debug.py

…

distributed.py

add distributed.is_initialized

2025-02-18 16:47:19 -08:00

dlpack.py

Use PEP484-style exports in several submodules

2024-08-14 08:59:56 -07:00

dtypes.py

Cleanup: fix unused imports & mark exported names

2024-10-16 17:42:41 -07:00

errors.py

Add jax.errors.JaxRuntimeError as a public alias for the XlaRuntimeError class.

2024-09-26 08:39:30 -07:00

export.py

[export] Add support for serialization for some custom PyTree nodes

2024-10-21 11:38:13 +02:00

ffi.py

[xla:python] Add a mechanism for "batch partitioning" of FFI calls.

2025-02-07 09:14:06 -08:00

flatten_util.py

…

monitoring.py

Add JAX events that have time spans, not only durations.

2025-01-07 23:08:30 -08:00

profiler.py

Update references to the GitHub url in JAX codebase to reflect move from google/jax to jax-ml/jax

2024-09-20 07:52:33 -07:00

py.typed

…

random.py

Roll back multinomial change from https://github.com/jax-ml/jax/pull/25688

2025-02-05 09:13:56 -08:00

sharding.py

Replace Auto/User/Collective AxisTypes names with Hidden/Visible/Collective.

2025-01-16 17:55:54 -08:00

stages.py

Update references to the GitHub url in JAX codebase to reflect move from google/jax to jax-ml/jax

2024-09-20 07:52:33 -07:00

test_util.py

Update references to the GitHub url in JAX codebase to reflect move from google/jax to jax-ml/jax

2024-09-20 07:52:33 -07:00

tree_util.py

Update references to the GitHub url in JAX codebase to reflect move from google/jax to jax-ml/jax

2024-09-20 07:52:33 -07:00

tree.py

Add jax.tree shortcuts for .*_with_path calls, for convenience of users.

2024-12-12 15:13:32 -08:00

typing.py

Fix typo in jax.typing module doc

2024-09-11 23:34:03 +10:00

util.py

Update references to the GitHub url in JAX codebase to reflect move from google/jax to jax-ml/jax

2024-09-20 07:52:33 -07:00

version.py

Update version numbers after 0.5.1 release.

2025-02-24 16:18:25 -05:00