rocm_jax

mirror of https://github.com/ROCm/jax.git synced 2025-04-22 09:56:06 +00:00

Author	SHA1	Message	Date
Yash Katariya	53494ade2d	`PRNGKeyArray.aval` should have the correct logical sharding. This required refactoring code so that we don't hit recursion errors. PiperOrigin-RevId: 732536521	2025-03-01 18:18:19 -08:00
Jake VanderPlas	ccc3a29537	Internal: use a single registry for abstractify APIs	2024-12-23 08:44:35 -08:00
Jake VanderPlas	676070f4cd	Refactor: move shaped_abstractify to core	2024-12-18 19:14:46 -08:00
Yash Katariya	0bb30f0777	Propagate CopySemantics from python to C++ transfer APIs so that device_put works correctly in presence of copy/donate options that user specified. This change only supports pinned_host -> pinned_host copies on the same device. HBM -> HBM copies don't work yet and donation also doesn't work in PJRT. This CL also sets up the plumbing from JAX to PJRT so that in the future support for missing features can be added easily. Fixes https://github.com/jax-ml/jax/issues/24521 PiperOrigin-RevId: 694274616	2024-11-07 15:51:54 -08:00
Yash Katariya	66c6292e6a	Make committed a public property of jax.Array. Why? Because users need to know if an array is committed or not since JAX raises errors based on committedness of a jax.Array. JAX also makes decisions about dispatching based on committedness of a jax.Array. But the placement of such arrays on devices is an internal implementation detail. PiperOrigin-RevId: 686329828	2024-10-15 19:46:10 -07:00
Yash Katariya	6e1c23610d	If input layouts are specified via `in_shardings` to `jit` and the array that the jitted function is called with is uncommitted, reshard the input array to the layout specified by the user. Not doing the resharding, leads to incorrect outputs on GPU and a crash on TPU which is not good. Fixes: https://github.com/google/jax/issues/23100 PiperOrigin-RevId: 665000157	2024-08-19 15:10:32 -07:00
Jake VanderPlas	613a00044c	[array API] add device property & to_device method	2024-07-23 11:12:35 -07:00
Junwhan Ahn	5046cedbfc	Make `pxla.shard_arg` batch calls to `xc.copy_array_to_devices_with_sharding` This CL changes `shard_arg_handlers` to be batched, in that it now receives a list of objects and a list of shardings and returns a list of array. This makes it possible to batch backend calls whenever it's beneficial to do so. Based on the above, the batched shard arg for arrays leverages the newly added `xla::ifrt::Client::CopyArrays()` (https://github.com/tensorflow/tensorflow/pull/69096) to make bulk copy cheaper in some backend implementations. Since `Client::CopyArrays()` requires batched arrays to have the same set of source/destination devices, `PyArray::BatchedCopyToDeviceWithSharding()` internally groups arrays by their source/destination devices and memory kinds. The grouping is pushed all the way to C++ for performance in case we have lots of arrays. PiperOrigin-RevId: 643097852	2024-06-13 13:10:10 -07:00
Yash Katariya	1273028018	Simplify extended dtypes rules part 1. Start by removing sharding specific rules from EDtypes. This is because we always want to replicate the trailing dims introduced by Edtypes. PiperOrigin-RevId: 639920049	2024-06-03 14:52:50 -07:00
Matthew Johnson	89f26db36d	start adding EArray, a jax.Array analog that can contain extended dtypes	2024-04-06 13:09:25 -07:00

10 Commits