rocm_jax

mirror of https://github.com/ROCm/jax.git synced 2025-04-25 00:36:06 +00:00

Author	SHA1	Message	Date
Yash Katariya	49224d6cdb	Replace Auto/User/Collective AxisTypes names with Hidden/Visible/Collective. Replace the `with set_mesh(mesh):` context manager with `with use_mesh(mesh):`. Also expose `AxisTypes` and `use_mesh` in the public API via `jax.sharding.AxisTypes` and `jax.sharding.use_mesh`. PiperOrigin-RevId: 716446406	2025-01-16 17:55:54 -08:00
Jake VanderPlas	de3191fab3	Cleanup: fix unused imports & mark exported names	2024-10-16 17:42:41 -07:00
Jake VanderPlas	49ad220e57	Finalize deprecation of XLACompatibleSharding PiperOrigin-RevId: 681156145	2024-10-01 14:02:34 -07:00
Michael Hudgins	d4d1518c3d	Update references to the GitHub url in JAX codebase to reflect move from google/jax to jax-ml/jax PiperOrigin-RevId: 676843138	2024-09-20 07:52:33 -07:00
Yash Katariya	daa69da321	Introduce `jax.sharding.AbstractMesh(shape_tuple: tuple[tuple[str, int], ...])` and allow `with_sharding_constraint` and `shard_map` to accept an abstract mesh as input (`with_sharding_constraint` is via `NamedSharding(abstract_mesh, pspec)`).

Semantics

Inside jit, we never need to talk about concrete devices, so the semantics stay the same as today, i.e. we can lower a NamedSharding with an abstract mesh using only the mesh axis names, the axis sizes, and the PartitionSpec. The only restriction is that the number of devices needs to be consistent throughout the program while we are tracing. During compilation, the order of devices throughout the program needs to be consistent (same as before this change).

Outside jit, i.e. in eager mode, if a `shard_map` or `with_sharding_constraint` contains an AbstractMesh, then the input to those primitives should contain a concrete Mesh with the same shape and names as the abstract mesh.

Why do this?

There are cases where you want to change the devices in the mesh but keep the mesh shape the same (axis names and axis sizes). But this leads to a device mismatch error if you have `with_sharding_constraint` or `shard_map` in your computation, because they embed concrete devices in their signature.

So to fix the error, you need to change the mesh in `wsc` and `shmap`, which leads to a tracing cache miss (because the function id is now different) and consequently a lowering-to-StableHLO cache miss. Explaining via an example:

```
import jax
import numpy as np
from jax.lax import with_sharding_constraint
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

mesh1 = Mesh(jax.devices()[:2], 'x')
mesh2 = Mesh(jax.devices()[2:4], 'x')

arr_mesh1 = jax.device_put(np.arange(8), NamedSharding(mesh1, P()))
arr_mesh2 = jax.device_put(np.arange(8), NamedSharding(mesh2, P()))

@jax.jit
def f(x):
  y = with_sharding_constraint(x, NamedSharding(mesh1, P('x')))
  return y * 2

f(arr_mesh1)
f(arr_mesh2)  # DEVICE MISMATCH ERROR!
```

The same problem exists for `shard_map`, since it takes a mesh with concrete devices in its signature.

Okay, so how do you fix this?

As mentioned above, we need the above program to work and get tracing and lowering cache hits (cache hits are the most important part here).

The approach in this change allows `with_sharding_constraint` to accept a `NamedSharding(abstract_mesh, pspec)` as input. This leads to no errors downstream, and we get tracing and lowering cache hits since we no longer encode the concrete devices, just the axis names and axis sizes of the mesh.

The important part is that the concrete device information should only come from the arguments. Inside `jax.jit`, you should never reference concrete devices.

```
mesh1 = Mesh(jax.devices()[:2], 'x')
mesh2 = Mesh(jax.devices()[2:4], 'x')

arr_mesh1 = jax.device_put(np.arange(8), NamedSharding(mesh1, P()))
arr_mesh2 = jax.device_put(np.arange(8), NamedSharding(mesh2, P()))

# Creating the abstract mesh from mesh1, but since both meshes have the same
# shape (axis names and axis sizes), it should be ok.
abstract_mesh = jax.sharding.AbstractMesh(mesh1.shape_tuple)

@jax.jit
def f(x):
  y = with_sharding_constraint(x, NamedSharding(abstract_mesh, P('x')))
  return y * 2

f(arr_mesh1)
f(arr_mesh2)  # tracing and lowering cache hit
```

One caveat is that this only works with `jax.NamedSharding`, but that's fine because `NamedSharding` is the most used `Sharding` in JAX.

What about `shard_map`?

shard_map's signature will be: `shmap(f, mesh: Mesh | AbstractMesh, in_specs: Specs, out_specs: Specs)`.

```
from jax.experimental.shard_map import shard_map

mesh1 = Mesh(jax.devices()[:2], 'x')
mesh2 = Mesh(jax.devices()[2:4], 'x')

arr_mesh1 = jax.device_put(np.arange(8), NamedSharding(mesh1, P()))
arr_mesh2 = jax.device_put(np.arange(8), NamedSharding(mesh2, P()))

# Creating the abstract mesh from mesh1, but since both meshes have the same
# shape (axis names and axis sizes), it should be ok.
abstract_mesh = jax.sharding.AbstractMesh(mesh1.shape_tuple)

@jax.jit
def f(x):
  # shard_map returns a function, so it has to be applied to x.
  y = shard_map(lambda x: x, mesh=abstract_mesh, in_specs=P('x'), out_specs=P('x'))(x)
  return y * 2

f(arr_mesh1)
f(arr_mesh2)  # tracing and lowering cache hit
```

This is a fully backwards-compatible change. So your current code will continue to work as is, but you can opt into this new behavior and get all the benefits! PiperOrigin-RevId: 662670932	2024-08-13 15:18:08 -07:00
Yash Katariya	1edd649de4	Deprecate `XLACompatibleSharding` in favor of `jax.sharding.Sharding`. PiperOrigin-RevId: 640544939	2024-06-05 09:07:27 -07:00
Yash Katariya	14451492c9	Delete OpShardingSharding export since it has been 3 months since it was deprecated. Also remove deprecation warnings for MeshPspecSharding. PiperOrigin-RevId: 538880293	2023-06-08 13:53:38 -07:00
Peter Hawkins	b4402185db	Move PartitionSpec into its own file (jax/_src/partition_spec.py). No functional changes intended. A subsequent change will move ParsedPartitionSpec and array mapping utilities here also. PiperOrigin-RevId: 522393166	2023-04-06 11:43:25 -07:00
Yash Katariya	7442faa715	Remove MeshPspecSharding since it has been more than 3 months since it was deprecated (Nov 2, 2022). The replacement name is NamedSharding. PiperOrigin-RevId: 520072687	2023-03-28 10:47:42 -07:00
Peter Hawkins	1925aa1109	Split Sharding subclasses out of _src/sharding.py into _src/sharding_impls.py. By defining the Sharding base class in its own module, we can pull it out into a separate Bazel submodule, which will help pytype inference when defining Array. PiperOrigin-RevId: 516223009	2023-03-13 08:50:18 -07:00
Yash Katariya	0ffdeb3de2	Rename `jax.sharding.OpShardingSharding` to `jax.sharding.GSPMDSharding`. `jax.sharding.OpShardingSharding` will be removed in 3 months from Feb 17, 2023. PiperOrigin-RevId: 510556189	2023-02-17 17:11:06 -08:00
Peter Hawkins	8268cd562d	Add infrastructure for managing deprecations. Use it to deprecate jax.experimental.PartitionSpec, jax.interpreters.pxla.PartitionSpec, jax.interpreters.pxla.Mesh. PiperOrigin-RevId: 508349776	2023-02-09 05:48:40 -08:00
Jake VanderPlas	26f2f97805	Document why 'import name as name' is used	2022-12-14 15:07:04 -08:00
Yash Katariya	934bc4e1b3	Move `PartitionSpec` and `Mesh` out of experimental and into the `sharding` namespace. The new API endpoint is `jax.sharding.PartitionSpec` and `jax.sharding.Mesh`. PiperOrigin-RevId: 492358238	2022-12-01 19:28:32 -08:00
Yash Katariya	cc5af7ed98	Rename `ReshapeableDevicesSharding` to `PositionalSharding` and add an alias `NamedSharding` for `MeshPspecSharding`. `MeshPspecSharding` name will be replaced with `NamedSharding` in 3 months. PiperOrigin-RevId: 485753078	2022-11-02 19:13:13 -07:00
Matthew Johnson	95eb4249bb	tweaks to DevicesSharding: 1. rename DevicesSharding -> ReshapeableDevicesSharding; 2. fix repr to print device order faithfully; 3. respect shape of np.ndarray argument to __init__	2022-10-25 14:28:48 -07:00
Matthew Johnson	43098f906a	initial commit of DevicesSharding (fka SimpleSharding); need to add tests! Co-authored-by: Yash Katariya <yashkatariya@google.com> Co-authored-by: Sharad Vikram <sharad.vikram@gmail.com>	2022-10-18 21:10:24 -07:00
Yash Katariya	9e4114f0f1	Move `array.py` and `sharding.py` from `experimental/` to `_src/`. PiperOrigin-RevId: 477201711	2022-09-27 10:06:52 -07:00

18 Commits
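
Commit daa69da321 above argues that keying on a concrete mesh causes cache misses when only the devices change, while keying on axis names and sizes gives cache hits. The following is a toy sketch of that argument in plain Python, not JAX's actual caching code: `ToyMesh`, `ToyAbstractMesh`, and `toy_trace` are hypothetical names invented for illustration, and `functools.lru_cache` merely stands in for the tracing and lowering caches.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class ToyMesh:
    """Hypothetical stand-in for a concrete mesh: axis shape plus device ids."""
    shape_tuple: tuple[tuple[str, int], ...]
    device_ids: tuple[int, ...]


@dataclass(frozen=True)
class ToyAbstractMesh:
    """Hypothetical stand-in for an abstract mesh: axis names and sizes only."""
    shape_tuple: tuple[tuple[str, int], ...]


@lru_cache(maxsize=None)
def toy_trace(mesh, spec):
    # Stands in for tracing/lowering; the cache key is (mesh, spec).
    return f"lowered for axes {mesh.shape_tuple} with spec {spec!r}"


# Two meshes with the same shape but different devices.
mesh1 = ToyMesh((("x", 2),), (0, 1))
mesh2 = ToyMesh((("x", 2),), (2, 3))

# Keyed on concrete meshes: the device ids differ, so the second call misses.
toy_trace(mesh1, "x")
toy_trace(mesh2, "x")
concrete_info = toy_trace.cache_info()

# Keyed on abstract meshes: only names and sizes are compared, so it hits.
toy_trace.cache_clear()
toy_trace(ToyAbstractMesh(mesh1.shape_tuple), "x")
toy_trace(ToyAbstractMesh(mesh2.shape_tuple), "x")
abstract_info = toy_trace.cache_info()

print("concrete:", concrete_info)
print("abstract:", abstract_info)
```

In this toy model the concrete-mesh calls record two misses, while the abstract-mesh calls record one miss and one hit, mirroring the "tracing and lowering cache hit" comments in the commit message. The real JAX caches key on more than the mesh, so treat this only as an illustration of the idea.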