rocm_jax

mirror of https://github.com/ROCm/jax.git synced 2025-04-24 21:46:05 +00:00

Author	SHA1	Message	Date
Matthew Johnson	66a6eb299e	add autodiff rules for jax.lax.ragged_all_to_all collective also update the ragged_all_to_all docstring. pseudocode in the style of the shard_map tutorial would be better and cleaner, but it needs the context of the tutorial to explain; i'll add ra2a to the shmap tutorial in the future. PiperOrigin-RevId: 735957604	2025-03-11 18:22:02 -07:00
Yash Katariya	f45cbf3342	Fix a bug where `full` and `use_mesh` outside jit did not work because the `shard` passed to `make_array_from_callback` was sharded on all devices instead of just 1 device. This is because `convert_element_type` returning an output on all devices of the mesh because of the surrounding `use_mesh` context. PiperOrigin-RevId: 735909962	2025-03-11 15:25:46 -07:00
Pearu Peterson	82b2591b21	Fix scipy.special.gammainc/gammaincc evaluation at boundary points	2025-03-11 21:18:47 +02:00
jax authors	c2c68c018f	Merge pull request #27059 from jakevdp:fix-while-loop PiperOrigin-RevId: 735828960	2025-03-11 11:32:00 -07:00
Gunhyun Park	d191927b24	Fix syntax error and typos for composite primitive docstring. PiperOrigin-RevId: 735808000	2025-03-11 10:37:07 -07:00
Jake VanderPlas	4ae3211ea2	jax.disable_jit: ensure while_loop behaves similarly to non-disable_jit version	2025-03-11 09:53:34 -07:00
Praveen Narayanan	b6d4fe5387	Define lax.ragged_dot_general and express lax.ragged_dot in terms of it. PiperOrigin-RevId: 735471245	2025-03-10 12:25:22 -07:00
jax authors	14b215fe76	Merge pull request #27032 from dfm:lax-dtype PiperOrigin-RevId: 735424674	2025-03-10 10:18:58 -07:00
Dan Foreman-Mackey	21884d4a14	Move (most) jaxlib linalg custom call registration into JAX. My motivation here is to fix the plugin support for batch partitionable custom calls. Since plugin support for custom call partitioners is provided via register_plugin_callback in xla_bridge, instead of xla_client itself, it's much more straightforward to register the custom calls in JAX. It would be possible to refactor things differently, but it actually seems like a reasonable choice to use the supported APIs from `jax.ffi` instead of `xla_client` so that we can take advantage of any new features we might add there in the future. This is all still a little bit brittle and I'd eventually like to migrate to a version where the XLA FFI library provides a mechanism for exporting handlers, but this change is still compatible with any future changes like that. PiperOrigin-RevId: 735381736	2025-03-10 08:17:44 -07:00
Dan Foreman-Mackey	4eada56027	Avoid using array operations within lax.py operations.	2025-03-10 11:04:32 -04:00
jax authors	6095af050f	Merge pull request #26427 from mattjj:direct-linearize-fixes PiperOrigin-RevId: 734687601	2025-03-07 14:22:16 -08:00
Matthew Johnson	f4f31f89ae	[scan] when num_trips==0, don't generate weird size-zero reshapes	2025-03-07 21:35:40 +00:00
Matthew Johnson	7c2f842353	shard_map and other fixes to direct-linearize Co-authored-by: Dougal Maclaurin <dougalm@google.com>	2025-03-07 21:02:40 +00:00
Yash Katariya	9f37b5197f	[sharding_in_types] Fix a bug where `empty_array` in scan was created with the wrong spec when `unroll > 1`. PiperOrigin-RevId: 734591110	2025-03-07 09:47:32 -08:00
Yash Katariya	f8b98993b8	Add a divisibility check to make sure that the sharding evenly divides the shape (until this restriction is lifted), so that we don't create bad shardings. Also improve the dynamic_update_slice sharding error by printing `aval.str_short()` instead of the full sharding, because it's concise and gives more info than the current error (i.e. it also adds the shape to the error message). Also make some formatting changes in scan lowering to make it easier to debug. PiperOrigin-RevId: 734542862	2025-03-07 07:01:34 -08:00
Daniel Suo	e6db7a9d99	Dedup non-ref constants closed in cond branch functions. PiperOrigin-RevId: 734497907	2025-03-07 04:01:42 -08:00
Yash Katariya	766315f791	Make sure concat + vmap of sharded input and replicated input works properly. In this case, the example boils down to: ``` inp1 = f32[16@x, 4] inp2 = f32[4] def f(x: f32[4], y: f32[4]): return jnp.concat([x, y], axis=-1) vmap(f, in_axes=(0, None))(inp1, inp2) ``` This example was breaking in the concat batching rule because we didn't broadcast with the right sharding. PiperOrigin-RevId: 733536944	2025-03-04 18:35:13 -08:00
Jake VanderPlas	84ca80d215	doc: in lax.cond, note that both branches will be traced	2025-03-03 13:05:24 -08:00
George Necula	a6c47d6f36	Use the same name for aliased Vars when pretty-printing Jaxprs. Add a mechanism for using the same Var names for Vars that are aliased. In this PR, we use this for `pjit`, such that the following `print(jax.make_jaxpr(lambda a: jax.jit(lambda a: a + 1)(a))(0.))` prints: ``` { lambda ; a:f32[]. let b:f32[] = pjit[ name=<lambda> jaxpr={ lambda ; a:f32[]. let b:f32[] = add a 1.0 in (b,) } ] a in (b,) } ``` instead of the previous: ``` { lambda ; a:f32[]. let b:f32[] = pjit[ name=<lambda> jaxpr={ lambda ; c:f32[]. let d:f32[] = add c 1.0 in (d,) } ] a in (b,) } ``` The same mechanism could be used for other higher-order primitives, e.g., cond, and others. Also add some typing declarations and rename APIs to use "shared jaxpr" in lieu of "top-level jaxpr" for those Jaxprs that are used multiple times and are printed first. I presume that the term "top-level jaxpr" was picked because these are printed first at top-level. But this is confusing, because they are really subjaxprs. In fact, there was already a function `core.pp_toplevel_jaxpr` for printing the top-level Jaxpr, and there was also `core.pp_top_level_jaxpr` (which now is named `core.pp_shared_jaxpr`).	2025-03-03 11:38:51 +01:00
Yash Katariya	53494ade2d	`PRNGKeyArray.aval` should have the correct logical sharding. This required refactoring code so that we don't hit recursion errors. PiperOrigin-RevId: 732536521	2025-03-01 18:18:19 -08:00
Peter Hawkins	1e5d9a9158	Add an allow_negative_indices option to lax.dynamic_slice and lax.dynamic_update_slice. The goal of this change is to avoid generating code to wrap negative indices back into range in cases where we know it doesn't matter. Change scan to pass allow_negative_indices=False to avoid emitting index wrapping code for each scan argument. PiperOrigin-RevId: 731812827	2025-02-27 12:04:28 -08:00
Dan Foreman-Mackey	f93c2a1aa5	Add and test support for partitioning of batch dimensions in lax.linalg. On CPU and GPU, almost all of the primitives in lax.linalg are backed by custom calls that support simple semantics when batch dimensions are sharded. Before this change, all linalg operations on CPU and GPU will insert an `all-gather` before being executed when called on sharded inputs, even when that shouldn't be necessary. This change adds support for this type of partitioning, to cover a wide range of use cases. There are a few remaining GPU ops that don't support partitioning either because they are backed by HLO ops that don't partition properly (Cholesky factorization and triangular solves), or because they're still using descriptors with problem dimensions in kernel. I'm going to fix these in follow up changes. PiperOrigin-RevId: 731732301	2025-02-27 08:16:16 -08:00
Dan Foreman-Mackey	553b441fef	Use LAPACK trsm kernel even for batched solves. Depending on the platform and linked LAPACK library, this change seems to improve (or at least not degrade) performance across a wide range of problem and batch sizes. On colab, the performance is not dramatically improved for most input shapes, but on my Mac, this improves the performance of batched triangular solves by a factor of a few up to an order of magnitude across all the problems that I tried. PiperOrigin-RevId: 730971127	2025-02-25 11:49:01 -08:00
Dan Foreman-Mackey	2ce88c950a	Deprecate alpha argument to trsm LAPACK kernel. (Part of general cleanups of the lax.linalg submodule.) This is always set to 1 and I don't see any benefit to keeping this argument around. This can be done in a forward and backward compatible way following these docs: https://docs.jax.dev/en/latest/export/export.html#ensuring-forward-and-backward-compatibility We start by updating the FFI handler to remove the explicit alpha argument, but allow it to accept (but ignore) extra input arguments. Then we only pass alpha when lowering in forward compatibility mode, or when the jaxlib version is old (I'm using >0.5.1 as the cutoff assuming that this change doesn't make it into the upcoming release). Then, the forward compatibility lowering can be removed after at least 21 days, and the kernel can be updated at least 180 days after 0.5.2 is released. PiperOrigin-RevId: 730928808	2025-02-25 10:04:29 -08:00
Dan Foreman-Mackey	62530d5922	Update JVP rule for lax.linalg.lu to use vmap instead of broadcasted_iotas. PiperOrigin-RevId: 730497540	2025-02-24 10:09:41 -08:00
Dan Foreman-Mackey	6bd99207d5	Fix rank promotion error in JVP of batched eigh. PiperOrigin-RevId: 730475017	2025-02-24 09:08:55 -08:00
Dan Foreman-Mackey	ae656e1574	Update lax.linalg.svd primitive to use registration helper functions. PiperOrigin-RevId: 730466560	2025-02-24 08:44:06 -08:00
jax authors	c74f497eaf	Merge pull request #25053 from JanLuca:gesvd PiperOrigin-RevId: 730445233	2025-02-24 07:38:15 -08:00
jax authors	c17ea805f3	Merge pull request #26569 from gnecula:debug_info_arg_names PiperOrigin-RevId: 730432019	2025-02-24 06:48:41 -08:00
Yash Katariya	7d3c63eded	[sharding_in_types] Add more reshape sharding support * Allow merging and splitting only if major most dim is sharded since that involves no data movement. This only happens if `dimensions` is None i.e. if the input array is in row-major order. * Merging: If only the major most dim is sharded of the merge block then that sharding is propagated to the merge block output * Splitting: If the dimension being split is sharded, then the sharding is propagated to the major most dimension post split only if the spec divides the new shape exactly. PiperOrigin-RevId: 730291595	2025-02-23 21:39:23 -08:00
George Necula	1be801bac8	[better_errors] Cleanup use of DebugInfo.arg_names and result_paths Previously, we represented a missing arg name with `None`, and a missing result path with the empty string. We now adopt the same convention for arg names and use empty strings. This simplifies the typing, and prevents the string "None" from appearing in error messages. I changed how we encode the result paths. Previously for a function that returns a single array the path was the empty string (the same as for an unknown path). And for a function that returns a pair of arrays it was `([0], [1])`. Now we add the "result" prefix: `("result",)` for a function returning a single array and `(result[0], result[1])` for a function returning a pair of arrays. Finally, in debug_info_test, I removed the `check_tracer_arg_name` so that all spied tracers are printed with the argument name they depend on.	2025-02-23 08:27:56 +02:00
Yash Katariya	d695aa4c63	[sharding_in_types] Add sharding rules for the following primitives: * `bitcast_convert_element_type` * `cumsum` * `cumlogsumexp` * `cumprod` * `cummax` * `cummin` * `reduce_window` * `reduce_window_sum` * `reduce_window_max` * `reduce_window_min` * `select_and_gather_add` For `reduce_window_...` primitives only trivial windowing is supported along non-replicated dimensions. We can relax the other NotImplemented case in the future. PiperOrigin-RevId: 729910108	2025-02-22 10:45:58 -08:00
Jan Naumann	e03fe3a06d	Implement SVD algorithm based on QR for CPU targets. In a recent jax release the SvdAlgorithm parameter was added to the jax.lax.linalg.svd function. Currently, CPU targets still support only the divide and conquer algorithm from LAPACK (gesdd). This commit adds the functionality to select the QR based algorithm on CPU as well. Mainly it adds the wrapper code to call the gesvd function of LAPACK using the FFI interface. Signed-off-by: Jan Naumann <j.naumann@fu-berlin.de>	2025-02-22 15:24:57 +01:00
Dan Foreman-Mackey	c4418c1010	Update several remaining lax.linalg primitives to use registration helper functions. In this change, we update schur, triangular_solve, tridiagonal, and tridiagonal_solve. I batched these ones since they're all pretty straightforward. PiperOrigin-RevId: 729572705	2025-02-21 10:18:30 -08:00
Dan Foreman-Mackey	ed10003adc	Update lax.linalg.qr primitives to use registration helper functions. PiperOrigin-RevId: 729551997	2025-02-21 09:15:01 -08:00
Dan Foreman-Mackey	09325d925f	Update internal unop primitive helper to pass kwargs to dtype rule. To be consistent with other rule registration helpers, `unop_dtype_rule` should pass through its kwargs to the `result_dtype` callable. PiperOrigin-RevId: 729483613	2025-02-21 04:52:51 -08:00
Dan Foreman-Mackey	126909b62a	Update lax.linalg.lu primitive to use registration helper functions. PiperOrigin-RevId: 729483456	2025-02-21 04:50:46 -08:00
Dan Foreman-Mackey	a981e1c4b9	Start adding primitive registration helper functions to lax.linalg. As part of my efforts to simplify the primitive implementations in lax.linalg, I've found that all of the primitives share some common logic when it comes to impls, abstract_evals, and batching. This change adds some helper functions and starts the process of abstracting the primitive definitions to simplify and reduce duplication. I will continue with the rest of the primitives in lax.linalg, but I didn't want to overload the first diff. PiperOrigin-RevId: 729471970	2025-02-21 04:05:34 -08:00
Robert David	08de0128b6	Fix head comment: was referring to nonexistent parameters. PiperOrigin-RevId: 729231457	2025-02-20 13:29:40 -08:00
Yash Katariya	8305803b76	[sharding_in_types] Initial support for partial-auto/explicit shard_map + sharding-in-types. If an axis in `shmap(..., auto=...)` is an explicit axis in the outer mesh context, then that axis is treated as Explicit instead of Auto. PiperOrigin-RevId: 728920514	2025-02-19 20:04:54 -08:00
jax authors	cb0d326e16	Merge pull request #26591 from jakevdp:lax-docs PiperOrigin-RevId: 728908919	2025-02-19 19:22:48 -08:00
Roy Frostig	ae10f2da13	fix scan doc on the `unroll` argument. Looks like a typo worth fixing.	2025-02-19 11:01:44 -08:00
Yash Katariya	66d04f85e6	Error out if going from `Manual` -> `Auto/Explicit` AxisTypes in the `auto_axes` and `explicit_axes` API that do `mesh_cast` implicitly. Also, improve the error raised by canonicalize_sharding to include the api name and current source location. PiperOrigin-RevId: 728701237	2025-02-19 09:21:53 -08:00
Yash Katariya	a3edfb43ef	Now that sharding_in_types config flag is True, remove the config and all the conditionals PiperOrigin-RevId: 728653433	2025-02-19 06:53:35 -08:00
Jake VanderPlas	7f115fbb64	jax.lax: improve docs for comparison operators	2025-02-18 13:48:59 -08:00
jax authors	72f0a90ee6	Merge pull request #26401 from jakevdp:numpy-consts PiperOrigin-RevId: 728292846	2025-02-18 11:32:25 -08:00
Jake VanderPlas	29771dd06c	jax.lax: improve docs for bitwise operators.	2025-02-14 14:17:30 -08:00
Jake VanderPlas	33b989ac9e	refactor: import numpy objects directly in jax.numpy	2025-02-14 12:47:58 -08:00
Jake VanderPlas	531443c434	jax.lax: improve docs for pow & related functions	2025-02-14 08:40:19 -08:00
George Necula	a0812cd57e	[better_errors] Make it explicit that debug_info is not None. Now all internal uses of lu.wrap_init and core.Jaxpr are with actual debug info. This enables us to clean up the type declarations and to remove the checks whether debug_info is present. For usage outside of the JAX internals, we change `jax.extend.linear_util.wrap_init` to be usable without debug_info, for temporary backwards compatibility. We emit a deprecation warning and fill-in some fake debugging info. See https://github.com/jax-ml/jax/issues/26480 for more details. PiperOrigin-RevId: 726770483	2025-02-13 22:07:04 -08:00

1747 Commits