rocm_jax

mirror of https://github.com/ROCm/jax.git synced 2025-04-16 11:56:07 +00:00

Author	SHA1	Message	Date
jax authors	4988adccf1	Merge pull request #27010 from mattjj:direct-linearize-fixes-3 PiperOrigin-RevId: 734747001	2025-03-07 18:15:02 -08:00
Matthew Johnson	fe26c19b92	[direct-linearize] fix name_stack bugs Surprisingly, the bug was tracked down to #26111 aka cl/730939406, specifically the new implementation of reset_name_stack in source_info_util.py. To repro, use the before-this-commit implementation of reset_name_stack (left commented-out in the file), and run `JAX_USE_DIRECT_LINEARIZE=1 python tests/name_stack_test.py NameStackTransformationTest.test_nested_jit_stack`	2025-03-08 01:51:19 +00:00
jax authors	4660d7b6dd	Merge pull request #27005 from mattjj:direct-linearize-fixes-2 PiperOrigin-RevId: 734736244	2025-03-07 17:17:45 -08:00
Matthew Johnson	251b93ebd7	fixups that we meant to include in #26427 Co-authored-by: Dougal Maclaurin <dougalm@google.com>	2025-03-08 00:03:26 +00:00
Jevin Jiang	041f575747	Support MHA in ragged paged attention for packed type PiperOrigin-RevId: 734695213	2025-03-07 14:47:04 -08:00
jax authors	6095af050f	Merge pull request #26427 from mattjj:direct-linearize-fixes PiperOrigin-RevId: 734687601	2025-03-07 14:22:16 -08:00
jax authors	d849779689	Merge pull request #27001 from mattjj:yash-scan PiperOrigin-RevId: 734685031	2025-03-07 14:14:30 -08:00
jax authors	1870176eb3	Merge pull request #26979 from mattjj:26936 PiperOrigin-RevId: 734674945	2025-03-07 13:43:55 -08:00
Matthew Johnson	f4f31f89ae	[scan] when num_trips==0, don't generate weird size-zero reshapes	2025-03-07 21:35:40 +00:00
Matthew Johnson	7c2f842353	shard_map and other fixes to direct-linearize Co-authored-by: Dougal Maclaurin <dougalm@google.com>	2025-03-07 21:02:40 +00:00
Matthew Johnson	0e30a3ace9	[mutable-arrays] read values should have the same explicit sharding as ref fixes #26936	2025-03-07 20:53:29 +00:00
Hyeontaek Lim	178278863d	[JAX] Fix api_benchmark broken by https://github.com/jax-ml/jax/pull/26569 `pjit_check_aval_sharding` expects `names: Sequence[str]`. PiperOrigin-RevId: 734614264	2025-03-07 10:49:53 -08:00
jax authors	ccf7278292	Add the len(arg) to the error message for static_argnums Helps reduce the confusion on what is considered an argnum. Ideally there should be static_argkwg PiperOrigin-RevId: 734591856	2025-03-07 09:49:49 -08:00
Yash Katariya	9f37b5197f	[sharding_in_types] Fix a bug where `empty_array` in scan was created with the wrong spec when `unroll > 1`. PiperOrigin-RevId: 734591110	2025-03-07 09:47:32 -08:00
Christos Perivolaropoulos	eeccc67c0b	[mgpu] Debug print arrays. PiperOrigin-RevId: 734576543	2025-03-07 08:58:25 -08:00
Adam Paszke	1bef8b61af	[Mosaic GPU] Add a better explanation for the transposed layout Thanks to @bchetioui for the discussion! PiperOrigin-RevId: 734564672	2025-03-07 08:19:32 -08:00
Adam Paszke	402389290c	[Mosaic TPU] Enable all conversions involving fp8 types on TPUv5+ PiperOrigin-RevId: 734558364	2025-03-07 07:59:31 -08:00
Sergei Lebedev	928caf83ee	[pallas:mosaic_gpu] `copy_smem_to_gmem` now allows skipping `cp.async.commit_group` This feature is necessary to fix the SMEM->GMEM waiting behavior in `emit_pipeline`, which used a pessimistic condition prior to this change, since every copy was its own commit group. PiperOrigin-RevId: 734553668	2025-03-07 07:43:54 -08:00
Adam Paszke	65462fe684	[Mosaic GPU] Add a new layout to help with transposing WGMMA results PiperOrigin-RevId: 734553651	2025-03-07 07:42:01 -08:00
Yash Katariya	f8b98993b8	Add a divisibility check so that we make sure that sharding evenly divides the shape (until this restriction is lifted) to make sure we don't create bad shardings. Also improve dynamic_update_slice sharding error by printing `aval.str_short()` instead of full sharding because it's concise and gives more info than the current error (i.e. it adds shape too to the error message). Also make some formatting changes in scan lowering to make it easier to debug. PiperOrigin-RevId: 734542862	2025-03-07 07:01:34 -08:00
Dan Foreman-Mackey	b7ecfdfd95	Update ad.backward_pass to support non-linear functions of constants.	2025-03-07 09:54:06 -05:00
Adam Paszke	85c6b6a128	[Mosaic GPU] Add support for tiling stores to refs using small tiling The difficulty here is that our register tiling is based on the (64, 8) shape, while the memory tiling is now (8, swizzle // bytewidth). Before, we would assume that each register tile fits neatly within a single memory tile, but now it is obviously not the case. Luckily, it wasn't too hard to add. PiperOrigin-RevId: 734517000	2025-03-07 05:19:11 -08:00
jax authors	de78d2cc71	Merge pull request #26950 from lockwo:Owen/add-pmap-typehint PiperOrigin-RevId: 734500798	2025-03-07 04:10:35 -08:00
Daniel Suo	e6db7a9d99	Dedup non-ref constants closed in cond branch functions. PiperOrigin-RevId: 734497907	2025-03-07 04:01:42 -08:00
jax authors	bf95bf49d4	Update XLA dependency to use revision `f1213b83af`. PiperOrigin-RevId: 734484617	2025-03-07 03:00:30 -08:00
shuw	ccbe9f7cd6	Fix lint	2025-03-07 04:52:58 +00:00
Zac Mustin	8095d842c8	roofline: Support computing flops for unary ops. PiperOrigin-RevId: 734351741	2025-03-06 17:44:36 -08:00
Jevin Jiang	ff4310f640	[Mosaic TPU] Support fp8 upcast to f32 PiperOrigin-RevId: 734345644	2025-03-06 17:19:15 -08:00
Yash Katariya	e9486920e8	Auto complete specs in a sharding if aval.ndim > len(sharding.spec) with `None`. So that for a 2D input, P('data') continues to work. PiperOrigin-RevId: 734325209	2025-03-06 16:10:14 -08:00
jax authors	4cab118344	Merge pull request #26927 from skye:merge_release PiperOrigin-RevId: 734323206	2025-03-06 16:06:09 -08:00
jax authors	cd7f03f272	Updates the Colocated Python's serialization (and deserialization) implementation to utilize the recently added support for string arrays. Currently the serialized data and its length are being carried in two separate arrays, a fixed-width bytes array (with a hard-coded max size) and a uint32 array respectively. PiperOrigin-RevId: 734299259	2025-03-06 14:57:52 -08:00
Jake VanderPlas	b441b2b7a5	Prevent tracer leaks in scipy.special.expn	2025-03-06 14:38:11 -08:00
Jevin Jiang	4b49c03523	Open source TPU-friendly ragged paged attention kernel. Key features: * *Support mixed prefill and decode* to increase throughput for inference. (e.g., *5x* speedup compared to padded Multi-Queries Paged Attention implementation for llama-3-8b.) * *No explicit `swapaxes`* for `seq_len` and `num_head` in pre/post kernel. The kernel takes `num_head` in 2nd minor as it naturally was. We fold swapaxes to strided load/store in the kernel and apply transpose on the fly. * *No GMM (Grouped Matmul) Metadata required!* We calculate the metadata on the fly in the kernel. This can speed up *10%*! * *Increase MXU utilization 8x in GQA* by grouping shared q heads for MXU in decode. * *Minimize recompilation:* The only factors that can cause recompilation are model specs, `max_num_batched_tokens` and `max_num_seqs` in the setting of mixed engine. PiperOrigin-RevId: 734269519	2025-03-06 13:36:45 -08:00
Dimitar (Mitko) Asenov	5d64b3d2dd	[Mosaic GPU] Fix `scf.ForOp` lowering to put lowered ops at the right place. Without this fix, lowerings of ops within the `for` body are always appended at the end, even if they have users earlier in the body. This caused an `operand #0 does not dominate this use` error. The fix was tested in the upcoming (but not yet submitted) `test_realistic_matmul` in Pallas with Workgroup semantics. PiperOrigin-RevId: 734157829	2025-03-06 08:40:19 -08:00
Ayaka	8c89da7cdc	Minor bug fixes in error checking PiperOrigin-RevId: 734126415	2025-03-06 06:57:52 -08:00
Nitin Srinivasan	623865fe95	Build JAX wheels instead of installing it from the source repository This change allows us to get rid of extra env vars which used to control whether to install `jax` at head. Now, `jax` will be built and consumed in the same way as the other wheels in the continuous jobs. PiperOrigin-RevId: 734123590	2025-03-06 06:48:16 -08:00
Sergei Lebedev	2a34019388	[pallas:mosaic_gpu] Added WG lowering rule for `lax.bitcast_convert_type_p` PiperOrigin-RevId: 734081448	2025-03-06 04:09:55 -08:00
Chris Jones	d6b97c2026	[pallas] Add support for `pl.dot` with `int8` inputs. PiperOrigin-RevId: 734081057	2025-03-06 04:08:04 -08:00
jax authors	16bb919020	Update XLA dependency to use revision `6e396aae2e`. PiperOrigin-RevId: 734059108	2025-03-06 02:40:28 -08:00
Benjamin Chetioui	fe577b5dc4	[Pallas/Mosaic GPU] Enable `ops_test` for Mosaic GPU. For now, most of the tests are skipped. PiperOrigin-RevId: 734026728	2025-03-06 00:45:05 -08:00
Yash Katariya	a67ab9fade	Just use `jit` as the string in error messages instead of `jit` and `pjit` based on resource_env. This is to start deprecating the need for `with mesh` and replace it with `use_mesh(mesh)`. PiperOrigin-RevId: 733959962	2025-03-05 20:09:30 -08:00
Yash Katariya	ba5349f896	Add a note about uneven sharding and with_sharding_constraint. Fixes https://github.com/jax-ml/jax/issues/26946 PiperOrigin-RevId: 733953836	2025-03-05 19:35:03 -08:00
jax authors	c16f37d89d	Set `USERPROFILE` for Windows builds to fix CI issue. This change fixes https://github.com/jax-ml/jax/actions/runs/13686468791/job/38270929632. From the [documentation](https://docs.python.org/3/library/os.path.html#os.path.expanduser): `On Windows, USERPROFILE will be used if set, otherwise a combination of HOMEPATH and HOMEDRIVE will be used.` PiperOrigin-RevId: 733935305	2025-03-05 18:09:14 -08:00
Jacob Burnim	016b351f00	[Pallas] Adds a simple dynamic race detector for TPU interpret mode. PiperOrigin-RevId: 733885890	2025-03-05 15:15:21 -08:00
jax authors	8571ad9ff2	Merge pull request #26952 from garymm:vmap-arg PiperOrigin-RevId: 733865978	2025-03-05 14:19:11 -08:00
jax authors	0913cd7583	Fix build rule for free-threaded python builds. PiperOrigin-RevId: 733857126	2025-03-05 13:54:24 -08:00
Gary Miguel	69d66f66df	vmap mismatch size error message: handle *args Fixes: https://github.com/jax-ml/jax/issues/26908	2025-03-05 13:08:54 -08:00
jax authors	3edc068f8c	Fix ambiguous cpu definition for JAX wheels. Should fix the error in https://github.com/jax-ml/jax/actions/runs/13682579939/job/38258344926. PiperOrigin-RevId: 733838895	2025-03-05 12:59:21 -08:00
Owen Lockwood	3e4dc0d490	add pmap axes hints	2025-03-05 12:14:24 -08:00
Adam Paszke	8df00e2666	[Mosaic GPU] Remove support for large tiles on Blackwell We don't have many Blackwell kernels yet, so let's begin the deprecation there! Small tiles have clearer semantics when it comes to transposes too, which allows us to enable more test cases. PiperOrigin-RevId: 733786884	2025-03-05 10:34:53 -08:00
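
Commit e9486920e8 above describes completing a sharding spec with `None` when `aval.ndim > len(sharding.spec)`, so that `P('data')` keeps working for a 2D input. The padding rule can be sketched in plain Python as follows. This is only an illustration, not JAX's implementation: the helper name `complete_spec` and the use of a plain tuple in place of a `PartitionSpec` are assumptions made here.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical stand-in for one entry of a partition spec:
# a mesh axis name, or None for "not sharded along this dimension".
Axis = Optional[str]


def complete_spec(spec: Sequence[Axis], ndim: int) -> Tuple[Axis, ...]:
    """Pad `spec` with None so it has exactly one entry per array dimension."""
    if len(spec) > ndim:
        raise ValueError(
            f"spec has {len(spec)} entries but the array has only {ndim} dimensions"
        )
    # Trailing dimensions not mentioned in the spec are left unsharded.
    return tuple(spec) + (None,) * (ndim - len(spec))


print(complete_spec(("data",), 2))  # ('data', None)
```

Under this sketch a spec that already has one entry per dimension is returned unchanged, and a spec longer than the array's rank is rejected rather than truncated.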


26262 Commits