15678 Commits

Author SHA1 Message Date
Mathew Odden
6b35155294
Fix invalid lowerings for ROCm in Pallas (#223)
popcount and clz were effectively broken on ROCm,
since math_dialect had incorrect lowerings.

Use the device intrinsics for these functions, as
well as for exp and absf, which fixes some accuracy issues in
the pallas tests.

Docs for OCML/OCKL

- https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/doc/OCML.md
- https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/doc/OCKL.md
2025-02-14 11:27:52 -06:00
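Note on 6b35155294: when checking lowerings such as the ones fixed here, it helps to have a reference for what popcount and clz should return. The following is a minimal plain-Python sketch of the expected semantics for unsigned 32-bit inputs. The names `popcount32` and `clz32` are illustrative only (they are not part of Pallas, OCML, or OCKL), and the sketch assumes the convention that clz of zero returns the bit width.

```python
def popcount32(x: int) -> int:
    """Number of set bits in the low 32 bits of x."""
    return bin(x & 0xFFFFFFFF).count("1")


def clz32(x: int) -> int:
    """Number of leading zero bits in the low 32 bits of x.

    Assumption: clz32(0) == 32, i.e. zero has no set bit at all.
    """
    x &= 0xFFFFFFFF
    # int.bit_length() is the position of the highest set bit (0 for x == 0).
    return 32 - x.bit_length()
```

A device result can be compared against these functions element by element in a test.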
Peter Hawkins
c25fb92c44 Release JAX 0.5.0 2025-01-17 10:28:03 -05:00
jax authors
a527aba646 Reverts f1b894d14a28ac22a037fb79177b991275c75a18
PiperOrigin-RevId: 716653711
2025-01-17 07:00:31 -08:00
Yash Katariya
ce85b89884 [sharding_in_types] Error out for reshape for splits like this: (4, 6, 8) -> (4, 4, 2, 6)
PiperOrigin-RevId: 716653203
2025-01-17 06:58:29 -08:00
Benjamin Chetioui
d3be190efb [Mosaic GPU] Delete unused declarations of mosaic_gpu_memcpy_async_h2d.
PiperOrigin-RevId: 716616807
2025-01-17 04:34:48 -08:00
Sergei Lebedev
d34c40f6b6 [mosaic_gpu] Added a serialization pass
The pass adds versioning to the Mosaic GPU IR in the lowered custom calls
and can apply forward/backward migration rules. Currently, no rules are
necessary since we are at version 1.

PiperOrigin-RevId: 716596848
2025-01-17 03:12:51 -08:00
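Note on d34c40f6b6: the idea of versioning a serialized payload and applying migration rules can be sketched generically as below. This is not the Mosaic GPU pass itself; `CURRENT_VERSION`, `UPGRADES`, `serialize`, and `deserialize` are hypothetical names, the payload is a plain dict rather than IR, and only upgrades of older payloads are shown.

```python
from typing import Callable, Dict

CURRENT_VERSION = 1  # the commit message states the IR is at version 1

# Hypothetical registry: UPGRADES[v] turns a version-v payload into version v + 1.
# It is empty here because no migration rules are needed at version 1.
UPGRADES: Dict[int, Callable[[dict], dict]] = {}


def serialize(payload: dict) -> dict:
    """Attach the current version to the payload."""
    return {"version": CURRENT_VERSION, "payload": payload}


def deserialize(blob: dict) -> dict:
    """Return the payload, upgraded step by step to CURRENT_VERSION."""
    version, payload = blob["version"], blob["payload"]
    if version > CURRENT_VERSION:
        raise ValueError(f"payload version {version} is newer than {CURRENT_VERSION}")
    while version < CURRENT_VERSION:
        payload = UPGRADES[version](payload)
        version += 1
    return payload
```

Applying rules one version at a time keeps each rule small, at the cost of chaining several rules for very old payloads.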
Yash Katariya
af667199db [sharding_in_types] Rename .at[...].get(out_spec) to .at[...].get(out_sharding).
PiperOrigin-RevId: 716466870
2025-01-16 18:56:52 -08:00
Yash Katariya
97cd748376 Rename out_type -> out_sharding parameter on einsum
PiperOrigin-RevId: 716454800
2025-01-16 18:16:52 -08:00
Yash Katariya
49224d6cdb Replace Auto/User/Collective AxisTypes names with Hidden/Visible/Collective.
Replace `with set_mesh(mesh):` with `with use_mesh(mesh):` context manager

Also expose `AxisTypes` and `use_mesh` into public API via `jax.sharding.AxisTypes` and `jax.sharding.use_mesh`.

PiperOrigin-RevId: 716446406
2025-01-16 17:55:54 -08:00
Parker Schuh
f2f552c108 Allow resharding between tokens on a single device
and multiple devices.

Whenever this happens we can essentially introduce an effects barrier
instead of doing the normal device -> host -> device transfer.

Fixes https://github.com/jax-ml/jax/issues/25671.

PiperOrigin-RevId: 716309978
2025-01-16 11:24:22 -08:00
Yash Katariya
b23c42372b [sharding_in_types] If an indexing operation hits into gather_p, error out saying to use .at[...].get(out_spec=...) instead.
This will basically drop the gather operation into full auto mode and add a sharding constraint on the output given by the user via `out_spec`.

Co-authored-by: Matthew Johnson <mattjj@google.com>
PiperOrigin-RevId: 716295953
2025-01-16 10:51:15 -08:00
Yash Katariya
0df4475aeb Make result_handler of _DeferredShardArg a method instead of a property. Also play some code golf.
PiperOrigin-RevId: 716273533
2025-01-16 09:53:48 -08:00
Dimitar (Mitko) Asenov
5e27efd0e0 [MosaicGPU] Cleanup imports in dialect_lowering.py
PiperOrigin-RevId: 716244938
2025-01-16 08:26:02 -08:00
Benjamin Chetioui
6746d63364 [Mosaic GPU][NFC] Clean up import to align with stylistic guidance.
PiperOrigin-RevId: 716233876
2025-01-16 07:50:04 -08:00
Benjamin Chetioui
d3bf243342 [Mosaic GPU] Add layout inference for splat arith.ConstantOps and vector.SplatOps.
PiperOrigin-RevId: 716224880
2025-01-16 07:18:35 -08:00
Dimitar (Mitko) Asenov
24884071b9 [MosaicGPU] Remove the single_thread context from top-level dialect code.
- Change the `async_load` lowering to manage the single thread context.
- Use a predicate for the top-level arrive_expect. If we want to hide this further, we can have a warp-group level op that lowers to a single-threaded context.

PiperOrigin-RevId: 716219730
2025-01-16 06:59:32 -08:00
Yash Katariya
c6b5ac5c7b [sharding_in_types] Expand reshape's sharding rule to add support for the following cases:
* Split on 1 dimension only and the splitting dimension should be unsharded.

  `operand.shape = (4@x, 6@y, 8), new_shape = (4@x, 6@y, 2, 2, 2)`

* Merging into 1 dimension only and all the merging dimensions should be unsharded.

  `operand.shape = (4@y, 2, 3, 8), new_shape = (4@y, 6, 8)`

* Split into singleton dimensions i.e. adding extra dims of size 1

  `operand.shape = (4@x, 6@y, 8@z), new_shape = (1, 4@x, 1, 6@y, 1, 8@z, 1)`

* Merge singleton dimensions i.e. removing extra dims of size 1

  `operand.shape = (1, 4@x, 6, 1, 8, 1), new_shape = (1, 4@x, 6, 8)`

* Identity reshape

  `operand.shape = (4@(x,y), 6), new_shape = (4@(x,y), 6)`

These cases are unambiguous to handle. In all other cases, we error out and ask the user to provide the out_sharding.

PiperOrigin-RevId: 716216240
2025-01-16 06:47:26 -08:00
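Note on c6b5ac5c7b: the five cases listed in this commit can be told apart from the shapes alone. The sketch below is a simplified, shape-only classifier, not the actual sharding rule: `classify_reshape` is a hypothetical helper, it ignores the `@axis` annotations entirely, and so it does not check the "should be unsharded" conditions that the real rule requires.

```python
from math import prod


def classify_reshape(old, new):
    """Classify a reshape as identity, singleton, split, merge, or ambiguous."""
    old, new = tuple(old), tuple(new)
    if prod(old) != prod(new):
        raise ValueError(f"cannot reshape {old} to {new}")
    if old == new:
        return "identity"

    # Adding or removing size-1 dimensions leaves the non-1 dims unchanged.
    def strip(shape):
        return tuple(d for d in shape if d != 1)

    if strip(old) == strip(new):
        return "singleton"

    # Trim the common prefix and suffix; what remains is the reshaped part.
    limit = min(len(old), len(new))
    i = 0
    while i < limit and old[i] == new[i]:
        i += 1
    j = 0
    while j < limit - i and old[-1 - j] == new[-1 - j]:
        j += 1
    mid_old = old[i:len(old) - j]
    mid_new = new[i:len(new) - j]

    if len(mid_old) == 1 and len(mid_new) > 1:
        return "split"  # one dimension split into several
    if len(mid_new) == 1 and len(mid_old) > 1:
        return "merge"  # several dimensions merged into one
    return "ambiguous"  # the real rule asks the user for out_sharding here
```

With this classifier, the reshape `(4, 6, 8) -> (4, 4, 2, 6)` mentioned in ce85b89884 falls into the ambiguous bucket, because more than one dimension changes on both sides.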
Dimitar (Mitko) Asenov
ce03cf976e [MosaicGPU] Move gpu_address_space_to_nvptx inside utils.py and use it.
PiperOrigin-RevId: 716214822
2025-01-16 06:41:51 -08:00
Dimitar (Mitko) Asenov
22417ae28e [MosaicGPU] Extract code into a new method BarrierRef.from_dialect_barrier_memref and implement support for 1D barrier memrefs.
PiperOrigin-RevId: 716180182
2025-01-16 04:30:43 -08:00
Benjamin Chetioui
bc7204f003 [Mosaic GPU] Allow querying layouts from a FuncOp's block arguments if set.
The motivation behind this change is twofold:

1. it simplifies test writing (no need to produce arbitrary, manual, non-splat
   constants to produce arguments with a strided layout);
2. it'll allow running layout inference on different `FuncOp`s in isolation,
   before inlining.

While the primary motivation is to simplify test writing for upcoming changes,
`2.` is useful if we ever intend to call functions whose body's layout we have
inferred from other functions. It's not clear to me that we have a use case for
that, but the theoretical benefit is worth pointing out.

Crucially, layout inference does not set default layouts for `FuncOp`s, since
the caller may choose a different layout for its arguments. As a result, there
is also no layout inference rule for `func.FuncOp`.

PiperOrigin-RevId: 716158516
2025-01-16 03:05:41 -08:00
Sharad Vikram
0ac63157f5 [Pallas TPU] Add helpers file with copy_ref function
PiperOrigin-RevId: 716030813
2025-01-15 18:34:58 -08:00
jax authors
2fa1002054 Merge pull request #25911 from hawkinsp:version
PiperOrigin-RevId: 715882985
2025-01-15 11:43:23 -08:00
Zachary Garrett
f7d097f7cc Make utils for reporting function name work with functools.partial by using the inner .func attribute if the object doesn't have a __name__ attribute. functools.partial objects do not have __name__ attributes by default.
PiperOrigin-RevId: 715881812
2025-01-15 11:40:59 -08:00
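Note on f7d097f7cc: the fallback described here can be sketched in a few lines. `fun_name` and its `default` argument are hypothetical and stand in for the JAX utilities the commit changes; only the use of `functools.partial` and its `.func` attribute comes from the commit message.

```python
import functools


def fun_name(f, default="<unnamed function>"):
    """Best-effort name for f, looking through functools.partial objects."""
    name = getattr(f, "__name__", None)
    if name is not None:
        return name
    # functools.partial objects have no __name__ by default; use the wrapped function.
    if isinstance(f, functools.partial):
        return fun_name(f.func, default)
    return default
```

For example, `fun_name(functools.partial(pow, 2))` reports the name of the wrapped builtin `pow` instead of failing with an `AttributeError`.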
Peter Hawkins
3a8f31aa83 Update the JAX version to 0.5.0.
This is because of the breaking change to PRNG key semantics, and the version follows JAX's new effver versioning scheme (https://jax.readthedocs.io/en/latest/jep/25516-effver.html).
2025-01-15 14:08:15 -05:00
jax authors
41993fdb24 Merge pull request #25755 from ROCm:ci_rnn_final-upstream
PiperOrigin-RevId: 715856939
2025-01-15 10:40:54 -08:00
jax authors
ca012d7ad6 Merge pull request #25864 from jax-ml:yet-more-linearization-fixes
PiperOrigin-RevId: 715840148
2025-01-15 10:00:31 -08:00
Zac Mustin
2d72e8de84 Jax: Stop returning a list of cost-analyses.
As it stands, there is only ever one element in this list (see b/384741132) and only the 0th element is ever used so we can simplify.

This is a potentially breaking change for external users, but (as stated in the [documentation](https://jax.readthedocs.io/en/latest/aot.html#debug-information-and-analyses-when-available)) no guarantees are made on this type, which is intended for debugging purposes and not intended to be a reliable public API.

PiperOrigin-RevId: 715837855
2025-01-15 09:53:59 -08:00
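Note on 2d72e8de84: external code that has to run against JAX versions on both sides of this change could normalize the result with a small shim. This is a sketch under the assumption stated in the commit message (the old return value is a one-element list and the new one is that element itself); `as_single_cost_analysis` is a hypothetical helper, not a JAX API.

```python
def as_single_cost_analysis(cost_analysis):
    """Return the single cost-analysis object, whichever form was returned."""
    if isinstance(cost_analysis, (list, tuple)):
        # Old behavior: a list whose 0th element is the only one ever used.
        return cost_analysis[0] if cost_analysis else None
    # New behavior (or None when no analysis is available): pass through.
    return cost_analysis
```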
jax authors
70c1ee5d9c Merge pull request #25876 from gnecula:debug_info_3
PiperOrigin-RevId: 715831527
2025-01-15 09:35:03 -08:00
jax authors
2e5e4799fd Merge pull request #25880 from jakevdp:fix-gather
PiperOrigin-RevId: 715804120
2025-01-15 08:10:44 -08:00
Dougal
9fe553ca49 More linearization fixes 2025-01-15 10:27:21 -05:00
Sergei Lebedev
afcb21ddf1 [pallas:mosaic_gpu] Fixed a crash in MLIR Python bindings
The error message produced by MLIR is not really clear, but AFAICT the crash
was caused by the "temporary module" hack we use in the lax.cond lowering
rule.

PiperOrigin-RevId: 715785632
2025-01-15 07:09:43 -08:00
Benjamin Chetioui
cdf490a5d0 [Mosaic GPU][NFC] Address some previous stylistic comments.
PiperOrigin-RevId: 715772455
2025-01-15 06:21:23 -08:00
George Necula
f9dfe7f646 [better_errors] More cleanup 2025-01-15 10:22:29 +00:00
jax authors
c4406d2759 [pallas] Fix bad rebase, deleted lowering for a print
PiperOrigin-RevId: 715694818
2025-01-15 01:18:30 -08:00
jax authors
c18492be65 [pallas][mosaic kernel export] Add initial support for exporting a dynamic-shape (placeholder-bound) kernel out of Mosaic via Pallas, as both MLIR and jaxpr.
PiperOrigin-RevId: 715629439
2025-01-14 20:34:11 -08:00
Ruturaj4
fe68eb8b25 [ROCm] Implement RNN support 2025-01-14 19:04:49 -06:00
Justin Fu
cc9f6e7528 [Pallas] Fix GQA triton kernel test.
PiperOrigin-RevId: 715576240
2025-01-14 16:40:55 -08:00
Jevin Jiang
6851700ed4 [Mosaic TPU] Append dump id to timestamp to make dump list ordered
PiperOrigin-RevId: 715488504
2025-01-14 12:44:10 -08:00
Jake VanderPlas
54fbf0b3f2 Indexing: avoid dynamic_slice when mode='clip'
This causes issues in the backward pass, where the mode is effectively 'promise_in_bounds'.
2025-01-14 11:20:50 -08:00
George Necula
f1b894d14a Reverts 391bad8ff59c07c8fad7b8ce05cd0e29dee4cf1a
PiperOrigin-RevId: 715435319
2025-01-14 10:31:59 -08:00
Justin Fu
b6acb9cb7a Fix remat bug on primitives with multiple outputs.
Addresses https://github.com/jax-ml/jax/issues/25841

PiperOrigin-RevId: 715434084
2025-01-14 10:26:58 -08:00
jax authors
f270739f9f Merge pull request #25872 from gnecula:jax2tf_doc
PiperOrigin-RevId: 715411235
2025-01-14 09:24:20 -08:00
Yash Katariya
b7e06f1937 Remove dead codepaths now that MemorySpaceDescription works in OSS
PiperOrigin-RevId: 715410774
2025-01-14 09:22:26 -08:00
jax authors
ee724565bf Merge pull request #25827 from gnecula:debug_info_2
PiperOrigin-RevId: 715407809
2025-01-14 09:12:37 -08:00
Yash Katariya
c72ed260fe [sharding_in_types] Handle ShapeDtypeStruct inputs with sharding_in_types by registering the sharding on the aval properly created by SDS in its pytype_aval_mapping.
Also, if we are running under full auto mode, don't error out if primitives don't have a sharding rule registered.

PiperOrigin-RevId: 715383866
2025-01-14 08:03:50 -08:00
Dougal
7d11d12bcd Mention expected tangent aval in error message, see #25517. 2025-01-14 08:51:12 -05:00
George Necula
b30df36d7d [better_errors] Add debug_info to DynamicJaxprTrace and JaxprStackFrame
This is part of a sequence of changes to ensure that the debugging information
is propagated properly.

Additional cleanup:
* Rename `result_paths` to `result_paths_thunk` in `TracingDebugInfo` to clarify the
  difference from the similar field in `JaxprDebugInfo`
* Added more type declarations
2025-01-14 13:49:18 +00:00
George Necula
36533b9eb5 [jax2tf] Fix bitrot in docs 2025-01-14 11:36:14 +00:00
Bart Chrzaszcz
74e912c3c0 #sdy dynamically choose which custom_partitioning API to use based on the current
value of the `use_shardy_partitioner` feature flag.

Previously, the way the API worked depended on the value of the flag when the partitioning was defined. But we should allow this to be dynamically swapped in and out when the function is actually called. This change allows for that.

PiperOrigin-RevId: 715293018
2025-01-14 02:11:55 -08:00
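Note on 74e912c3c0: the difference between reading a flag when something is defined and reading it when it is called can be shown with a toy example. Everything below is hypothetical and unrelated to the real `custom_partitioning` code: `config` stands in for the `use_shardy_partitioner` flag, and the two `_partition_*` functions stand in for the two API variants.

```python
class _Config:
    use_shardy_partitioner = False  # stand-in for the real feature flag


config = _Config()


def _partition_gspmd(x):
    return ("gspmd", x)


def _partition_shardy(x):
    return ("shardy", x)


def make_static_partitioner():
    """Old style: the flag is read once, when the partitioner is defined."""
    impl = _partition_shardy if config.use_shardy_partitioner else _partition_gspmd
    return impl


def dynamic_partitioner(x):
    """New style: the flag is read on every call."""
    impl = _partition_shardy if config.use_shardy_partitioner else _partition_gspmd
    return impl(x)
```

If the flag is flipped after `make_static_partitioner()` has run, the static version keeps using the variant chosen at definition time, while `dynamic_partitioner` follows the current flag value.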
jax authors
4f2f5fa53a Merge pull request #25798 from gnecula:fix_fori_error
PiperOrigin-RevId: 715258789
2025-01-14 00:01:30 -08:00