rocm_jax

mirror of https://github.com/ROCm/jax.git synced 2025-04-24 14:56:05 +00:00

Author	SHA1	Message	Date
Yash Katariya	177e1f6ed9	Canonicalize PartitionSpec so that we can delete ParsedPartitionSpec. We need to do this after sharding-in-types to speed up NamedSharding construction and remove a lot of tech debt and unnecessary complexity. * `_partitions` is now canonicalized and only contains `tuples`, `singular strings`, `None` or `UNCONSTRAINED`. No more empty tuples (`P((), 'x')`) and singleton tuples. * Cache the creation of sharding on ShapedArray since it's expensive to do it many times. * Change the `__hash__` and `__eq__` of `NamedSharding` to depend on `self.spec` instead of `self._parsed_pspec`. PiperOrigin-RevId: 731745062	2025-02-27 08:59:25 -08:00
Tom Hennigan	1becb57ac9	Add `jax.copy_to_host_async(tree)`. A relatively common pattern I've observed is the following: ```python _, metrics = some_jax_function() with profiler.Trace('compute_metrics'): jax.block_until_ready(metrics) with profiler.Trace('copy_to_host'): metrics = jax.device_get(metrics) ``` We are missing an opportunity here to more eagerly begin the D2H copy of the metrics (e.g. overlap it with closing the "compute_metrics" context manager, etc.). The intention of `jax.copy_to_host_async(x)` is to make it simple to begin D2H transfers as early as possible. Adapting the above code: ```python _, metrics = some_jax_function() # Begin D2H copies as early as we can. jax.copy_to_host_async(metrics) with profiler.Trace('compute_metrics'): jax.block_until_ready(metrics) with profiler.Trace('copy_to_host'): metrics = jax.device_get(metrics) ``` PiperOrigin-RevId: 731626446	2025-02-27 01:22:15 -08:00
Peter Hawkins	256e37af5f	Port many uses of contextlib.contextdecorator to explicit context manager classes. contextdecorator turns out to be slower than just writing a decorator class explicitly. Since we use many decorators per-equation, this causes a measurable speed difference in certain benchmarks. PiperOrigin-RevId: 730939406	2025-02-25 10:31:05 -08:00
George Necula	1be801bac8	[better_errors] Cleanup use of DebugInfo.arg_names and result_paths Previously, we represented a missing arg name with `None`, and a missing result path with the empty string. We now adopt the same convention for arg names and use empty strings. This simplifies the typing, and prevents the string "None" from appearing in error messages. I changed how we encode the result paths. Previously for a function that returns a single array the path was the empty string (the same as for an unknown path). And for a function that returns a pair of arrays it was `([0], [1])`. Now we add the "result" prefix: `("result",)` for a function returning a single array and `(result[0], result[1])` for a function returning a pair of arrays. Finally, in debug_info_test, I removed the `check_tracer_arg_name` so that all spied tracers are printed with the argument name they depend on.	2025-02-23 08:27:56 +02:00
Yash Katariya	1081c1f11a	Relax the check in `_mapped_axis_spec` to allow `()` and `None` to be treated the same PiperOrigin-RevId: 728746291	2025-02-19 11:23:17 -08:00
Yash Katariya	401fa9019c	Mark `in_shardings` and `out_shardings` as Any for typing reasons since they can take pytrees. Fixes https://github.com/jax-ml/jax/issues/26609 PiperOrigin-RevId: 728730349	2025-02-19 10:46:09 -08:00
Yash Katariya	a3edfb43ef	Now that sharding_in_types config flag is True, remove the config and all the conditionals PiperOrigin-RevId: 728653433	2025-02-19 06:53:35 -08:00
Sergei Lebedev	a73456d54d	Removed unused ``# type: ignore`` comments. For future reference, this can be done via python -m mypy jax --warn-unused-ignores > /tmp/unused.txt; while IFS=: read file line rest; do echo "$file:$line"; gsed -i "${line}s/ *\# type: ignore\(\[[^]]*\]\)*//" "$file"; done < /tmp/unused.txt	2025-02-13 21:12:27 +00:00
George Necula	1e813e1693	[better_errors] Continue adding debug info to Jaxprs (step 4) This follows after #26078, #26313, #26348, adding `debug_info` to more calls to `lu.wrap_init`. As part of this I have changed the primitive `custom_transpose` to take the `transpose` parameter as a `lu.WrappedFun`, which carries debug info. Previously, this was a `Callable`. These changes ensure that all the `lu.wrap_init` and `Jaxpr` are called with debug_info in the `api_test.py:CustomTransposeTest`.	2025-02-08 09:13:55 +02:00
jax authors	5d647ccfa1	Merge pull request #26348 from gnecula:debug_info_jaxpr_3 PiperOrigin-RevId: 723920031	2025-02-06 06:59:18 -08:00
Michael Hudgins	2e808f2836	Merge pull request #26279 from MichaelHudgins:tsan-resultstore PiperOrigin-RevId: 723918760	2025-02-06 14:55:57 +00:00
George Necula	904b74860c	[better_errors] Continue adding debug info to Jaxprs (step 3) This follows after #26078 and #26313, adding `debug_info` to more calls to `lu.wrap_init`. As part of this I have changed the primitives `custom_vjp_call_jaxpr` and `custom_lin` to take the `bwd` parameter as a `lu.WrappedFun`, which carries debug info. Previously, this was a `Callable`, but in almost all cases it was really `lu.WrappedFun.call_wrapped`.	2025-02-06 16:26:49 +02:00
George Necula	abcaec7081	[better_errors] Add debug info to the Jaxprs formed for AD Following #26078, we add debug info to more calls of lu.wrap_init.	2025-02-05 19:21:02 +02:00
jax authors	414449e142	Merge pull request #26078 from gnecula:debug_info_jaxpr PiperOrigin-RevId: 723151082	2025-02-04 10:54:26 -08:00
George Necula	d12aead696	[better_errors] Add debug info to more Jaxprs and WrappedFun (step 1) The plan is for all `core.Jaxpr` and `lu.WrappedFun` to carry non-None debug info. We change `lu.wrap_init` to construct the result paths thunk whenever it is passed a `debug_info`. The goal is to make sure that all `WrappedFun` have a debug info with result paths support. We change some calling conventions for internal functions to not pass along a separate debug_info if we have a `WrappedFun` or a `Jaxpr`. We obtain several improvements in presence of debug infos in debug_info_test.py	2025-02-04 10:02:35 +02:00
Yash Katariya	bc1a706688	[sharding_in_types] Add a canonicalize_value step before dispatching `bind` so that we can insert `mesh_cast`s under the following conditions: * When current_mesh is Manual and aval mesh is Auto * When current mesh is set and aval mesh is unset * Final style primitives skip this canonicalization and they are free to add it in their own `bind` method. * `mesh_cast` is skipped from this canonicalization to avoid recursion errors. This is required to make sure that after we hit abstract_eval rule and check_jaxpr, everything is properly typed in JAX's type system. `Auto` right now is a bit more permissive because we need to keep the current code at HEAD working but `Explicit` and `Manual` are very strict. PiperOrigin-RevId: 722868091	2025-02-03 18:00:19 -08:00
George Necula	c70de6deed	[better_errors] Merge the JaxprDebugInfo and TracingDebugInfo into core.DebugInfo Previously, we had two almost identical classes: `TracingDebugInfo` and `JaxprDebugInfo`. The only difference was that `TracingDebugInfo` had a thunk to return the result paths, while `JaxprDebugInfo` had the result paths resolved to a tuple. The separation of these types provided some clarity, but also led to code duplication and required conversions as the debugging info goes from `WrappedFun` to a `Jaxpr` and then to `WrappedFun` again.	2025-02-02 06:23:03 +02:00
George Necula	32c98b9a76	[better_errors] Refactor more uses of pe.tracing_debug_info (part 3) We replace uses of `pe.tracing_debug_info` with `api_util.tracing_debug_info`, which uses the actual args and kwargs, instead of `in_tree` to manufacture fake args and kwargs. This ends up being more accurate, especially for `arg_names`; see changes in debug_info_test.py. This means that we have to construct the debug info further upstream, before flattening args. This will later help populate debug info in `WrappedFun` and `Jaxpr`. This is part 3 of a series (following #26097, #26099) for jit, pmap, checkify, and the custom_partitioning (the last few uses). In order to land this, I had to remove a safety check that the number of `arg_names` and `result_paths` in a Jaxpr's debug info match the number of Jaxpr invars and outvars, respectively. Additionally, I added two accessors `safe_arg_names` and `safe_result_paths` to ensure that the arg names and result paths match the expected length. These accessors return no-op results when the lengths are not as expected. From my testing, this happens only in Jaxprs that are not used for lowering, hence there is no actual user-visible change here. Simply, more internal Jaxprs are getting debug_info and in some cases the `arg_names` and `result_paths` are not correct. Still, this change is worth it because the `func_src_info` is the most useful part of the debug info (used for leaked tracers), and that is accurate. We will fix the `arg_names` and `result_paths` in a future change. One can see in the changes in debug_info_test.py the improvements in the user-visible debug info, including for `pjit` and `pmap` cases when it was wrong.	2025-01-30 07:40:05 +02:00
Yash Katariya	dcb28f1218	[sharding_in_types] Add vmap + explicit sharding support. The main changes are: * Track `explicit_mesh_axis` on `AxisData`. * Modify `unmapped_aval` to take the above explicit mesh axis and insert it into the right place in the sharding so out_shardings are correct. * Make `matchaxis` also handle shardings correctly * All mapped dimensions should be sharded the same way * spmd_axis_name and explicit sharded arrays cannot be used together * `out_shardings` parameter on `dot_general`, `broadcast_in_dim`, `reshape`, `reshard` and `mesh_cast` is handled correctly in presence of vmap. This should eventually help us get rid of `spmd_axis_name` from `vmap`. PiperOrigin-RevId: 721007659	2025-01-29 09:34:27 -08:00
Yash Katariya	23d360bded	Remove axis_name from unmapped_aval PiperOrigin-RevId: 718558713	2025-01-22 15:49:04 -08:00
Yash Katariya	d50d1e2c40	Don't allow users to query `tracer.sharding` even under sharding in types mode. Instead, users should do `tracer.aval.sharding` so that code behaves the same under jit and eager mode. PiperOrigin-RevId: 717638986	2025-01-20 15:12:47 -08:00
Yash Katariya	799eb98cac	Add `reshard` API in experimental. Currently for sharding_in_types we have 2 APIs: `mesh_cast` and `reshard`. Both work in sharding_in_types mode and affect the sharding of the aval. Following are the semantics of both: * `mesh_cast`: AxisTypes between src and dst mesh must differ. There should be no "visible" data movement. The shape of the aval doesn't change. * `reshard`: Mesh should be the same between src and dst (same axis_names, axis_sizes and axis_types). Data movement is allowed. The shape of the aval doesn't change. We might make `reshard` == `device_put`, hence the API is in experimental. This decision can be taken at a later point in time. The reason not to just give `device_put` this power is because `device_put` does a lot of stuff right now (and is going to get even more powers in the near future like cross-host transfers) and its semantics would be very confusing if we keep piling sharding-in-types stuff on it. PiperOrigin-RevId: 717588253	2025-01-20 11:39:25 -08:00
George Necula	dcf72b01f4	[better_errors] Improvements in propagation of debugging info Added some documentation for `TracingDebugInfo` (docstring, comments about `arg_names`, since it was not obvious to me that this would flatten the non-static arguments). Laying the ground for the unification of the old `api_util.debug_info` and `partial_eval.tracing_debug_info`: we rename the former to `api_util.tracing_debug_info`, we push inside the calls to `fun_sourceinfo` and `fun_signature` (which were done by the callers until now), and we rewrite the latter in terms of the former. We leave for a future PR the actual replacing of the latter with the former throughout. In the process of above, cleaned up the one case when `partial_eval.tracing_debug_info` received None for the `in_tree` and `out_tracer_thunk`. The function contained catch-all exception clauses to handle those, but doing so it masked other places where we fail to collect debug info due to programming mistakes. E.g., in one place we passed a `WrappedFun` instead of a `Callable`, resulting in missing debugging info. Added more type declarations. Added a `state_test` with a failure to track debugging information, manifested with a leaked tracer without function provenance. Fixing this in a subsequent PR.	2025-01-20 15:09:51 +01:00
Yash Katariya	c7f8d17f5a	Expose hidden_axes via jax namespace as public API. Also mention it as a workaround for primitives we don't support yet. PiperOrigin-RevId: 716839003	2025-01-17 16:48:58 -08:00
jax authors	ee724565bf	Merge pull request #25827 from gnecula:debug_info_2 PiperOrigin-RevId: 715407809	2025-01-14 09:12:37 -08:00
Yash Katariya	c72ed260fe	[sharding_in_types] Handle ShapeDtypeStruct inputs with sharding_in_types by registering the sharding on the aval properly created by SDS in its pytype_aval_mapping. Also, if we are running under full auto mode, don't error out if primitives don't have a sharding rule registered. PiperOrigin-RevId: 715383866	2025-01-14 08:03:50 -08:00
Dougal	7d11d12bcd	Mention expected tangent aval in error message, see #25517.	2025-01-14 08:51:12 -05:00
George Necula	b30df36d7d	[better_errors] Add debug_info to DynamicJaxprTrace and JaxprStackFrame This is part of a sequence of changes to ensure that the debugging information is propagated properly. Additional cleanup: * Rename `result_paths` to `result_paths_thunk` in `TracingDebugInfo` to clarify the difference from the similar field in `JaxprDebugInfo` * Added more type declarations	2025-01-14 13:49:18 +00:00
Jake VanderPlas	ccc3a29537	Internal: use a single registry for abstractify APIs	2024-12-23 08:44:35 -08:00
Jake VanderPlas	c560f8e06c	Unify abstractify & shaped_abstractify rules	2024-12-20 04:28:19 -08:00
Jake VanderPlas	5dc37d3f70	Remove internal uses of api_util.shaped_abstractify	2024-12-19 07:06:36 -08:00
Yash Katariya	473e2bf527	Put abstract_mesh on every eqn so that we can preserve it during `eval_jaxpr` and `check_jaxpr` roundtrip. Also allow users to enter into `Auto`/`User` mode inside jit along all or some axes. Add checks to make sure that avals inside a context match the surrounding context. This check happens inside `abstract_eval` rules but maybe we need a more central place for it which we can create later on. PiperOrigin-RevId: 707128096	2024-12-17 09:17:21 -08:00
Jake VanderPlas	40367a9eaf	Cleanup: remove uses of no-op raise_to_shaped	2024-12-12 09:49:06 -08:00
Yash Katariya	deab6fbd80	Remove _pjit_lower_cached cache. We can simplify the caching of jit as we have downstream caches and a cpp cache too. If you drop out of cpp cache, things are going to be slow anyways. PiperOrigin-RevId: 700052522	2024-11-25 11:40:50 -08:00
jax authors	4363bb65d7	Merge pull request #24770 from jakevdp:extended-device-get PiperOrigin-RevId: 695671688	2024-11-12 03:58:23 -08:00
Jake VanderPlas	58dee3ea33	jax.device_get: handle generic extended dtypes	2024-11-07 16:01:22 -08:00
Yash Katariya	0bb30f0777	Propagate CopySemantics from python to C++ transfer APIs so that device_put works correctly in presence of copy/donate options that user specified. This change only supports pinned_host -> pinned_host copies on the same device. HBM -> HBM copies don't work yet and donation also doesn't work in PJRT. This CL also sets up the plumbing from JAX to PJRT so that in the future support for missing features can be added easily. Fixes https://github.com/jax-ml/jax/issues/24521 PiperOrigin-RevId: 694274616	2024-11-07 15:51:54 -08:00
Peter Hawkins	0e8acff5c6	Reverts a913fbf2fddc5b8c1b6c85b159d0eeb1bf65d461 PiperOrigin-RevId: 693360032	2024-11-05 08:32:25 -08:00
jax authors	a913fbf2fd	rollback due to data race Reverts ab47d4687f647de3aa145a9a782fb7b4aaf92af4 PiperOrigin-RevId: 693191298	2024-11-04 21:05:33 -08:00
Peter Hawkins	ab47d4687f	[JAX] [XLA:Python] Move JAX configuration objects into C++. A noticeable amount of time during JAX tracing is spent getting and setting the value of config.State objects, in particular the thread-local values within that state. If we move that logic into C++, we can speed up that code. There are two main ways we can get a speedup: * Python thread-local state is based around a dictionary and isn't terribly fast. * we can have the C++ jit dispatch path directly access the configuration items it needs to include in its cache key. We spend a considerable amount of time in effect eagerly computing cache keys via update_thread_local_jit_state, although most of that is pointless work. Instead, we can have `jit` simply pull the config items it needs on demand. PiperOrigin-RevId: 693114411	2024-11-04 15:39:06 -08:00
Yash Katariya	fff33f90b2	Add `compiler_options` argument to `jax.jit`. This exists on `Compiled` object via AOT too i.e. `jit(f).lower(*args).compile(compiler_options={})` PiperOrigin-RevId: 692283964	2024-11-01 14:01:19 -07:00
Yash Katariya	07858fa98d	[sharding_in_types] Allow `device_put` to reshard inputs. `device_put` is a good choice for resharding since it already handles transpose correctly because it tracks the `src` sharding too. PiperOrigin-RevId: 692274137	2024-11-01 13:25:08 -07:00
Dougal Maclaurin	48f24b6acb	Remove ConcreteArray from JAX. It's easy to do trace-time concretization without it. PiperOrigin-RevId: 691929385	2024-10-31 14:06:54 -07:00
Jake VanderPlas	0181cb396d	Re-land #24589 with fixes to handle `dtype` that is not compatible with NumPy. Previously, this change did not account for the fact that `device_get` may be called on objects that have a non-NumPy-compatible `dtype` attribute, such as tensorflow tensors. This change adds new dtype handling aimed at being robust to this case. Reverts 2bed1e88e4276558e4dd5e6a6d5afe6f2396a25d PiperOrigin-RevId: 691568933	2024-10-30 15:13:00 -07:00
Thomas Köppe	2bed1e88e4	Reverts 6dd1417d4a0a9ee31d8a014352b3a0fb2bcfcbaf PiperOrigin-RevId: 691417832	2024-10-30 07:54:00 -07:00
jax authors	6dd1417d4a	Merge pull request #24589 from jakevdp:device-get-key PiperOrigin-RevId: 691154098	2024-10-29 14:03:18 -07:00
Jake VanderPlas	b9ad519a29	Implement device_get for typed PRNG keys	2024-10-29 12:34:46 -07:00
Dougal Maclaurin	c36e1f7c1a	Make trace dispatch purely a function of context rather than a function of both context and data. This lets us delete a lot of machinery for managing data-dependent tracing: levels, sublevels, post_process_call, new_base_main, custom_bind and so on. PiperOrigin-RevId: 691086496	2024-10-29 11:04:31 -07:00
jax authors	47bacfab5e	Merge pull request #24031 from garymm:garymm/vmap-error-msg PiperOrigin-RevId: 689940504	2024-10-25 15:59:57 -07:00
Gary Miguel	9f7f08eccb	Fix vmap error message when args passed by keyword See the new test for a case that used to produce the wrong message. Fixes: #24406	2024-10-25 15:17:03 -07:00


539 Commits