The motivation here is to gradually replace all dynamic lookups on `jax.config`
with statically-typed state objects, which are more type checker/IDE friendly.
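A hedged sketch of the difference (the attribute and the internal module path below are illustrative):

```python
import jax

# Dynamic lookup: the attribute is resolved at runtime from a big config
# registry, so type checkers and IDEs see the result as Any.
x64_enabled = jax.config.jax_enable_x64

# Statically-typed state object: a module-level object whose .value has a
# known type (sketch only; the import path below is internal/illustrative).
# from jax._src import config
# x64_enabled: bool = config.enable_x64.value
```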
PiperOrigin-RevId: 571932143
The new efficient-transpose path, enabled by setting check_rep=True in the shard_map call, kept working. But the change inadvertently broke the check_rep=False path, and because most tests set check_rep=True, we didn't notice it in the tests!
The issue was that with check_rep=False, we need the shard_map transpose rule to insert psums corresponding to in_specs with fan-out, and correspondingly to insert divisions for out_specs with fan-in-consensus. (With the new check_rep=True path that this change adds, those extra operations aren't necessary, as the body itself transposes correctly.) But the PR accidentally removed them!
The fix was simple: track whether we've applied the efficient-transpose body rewrite (i.e. whether we're on the new body-is-transposable path or the old needs-extra-operations path) by adding a boolean parameter `rewrite` to the shard_map primitive, and if the rewrite hasn't been applied, include the explicit psum/div operations in the transpose rule.
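As a minimal sketch of the two paths (the mesh layout, specs, and body here are illustrative, not the code from this change):

```python
import jax
import jax.numpy as jnp
from jax.experimental import mesh_utils
from jax.experimental.shard_map import shard_map
from jax.sharding import Mesh, PartitionSpec as P

mesh = Mesh(mesh_utils.create_device_mesh((jax.device_count(),)), ('x',))

def body(x):
  return jnp.sin(x)

# New path: check_rep=True enables the efficient-transpose body rewrite.
f_new = shard_map(body, mesh=mesh, in_specs=P('x'), out_specs=P('x'),
                  check_rep=True)
# Old path: with check_rep=False, the transpose rule itself must insert
# the psum/div operations described above.
f_old = shard_map(body, mesh=mesh, in_specs=P('x'), out_specs=P('x'),
                  check_rep=False)

x = jnp.arange(float(jax.device_count()))
jax.grad(lambda v: f_new(v).sum())(x)  # exercises the transpose rule
jax.grad(lambda v: f_old(v).sum())(x)
```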
Reverts 8a04dfd830ff89f46e1fe3e866ee4fb2da9c90aa
PiperOrigin-RevId: 561805840
Change flags to use the newer definition style, where the flag is read via a typed FlagHolder object returned by the DEFINE_... function. The advantage is that `flag.value` has a type known to the type checker, rather than being read as an attribute of a gigantic config dictionary.
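For example, in the ABSL style (the flag name here is illustrative):

```python
from absl import flags

# DEFINE_... returns a typed FlagHolder, so _FOO.value is known to be a
# bool by the type checker, instead of an untyped attribute of flags.FLAGS.
_FOO = flags.DEFINE_bool('foo', False, 'An example flag.')

def f() -> None:
  if _FOO.value:  # typed read: bool
    ...
```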
For jax.config flags, define a typed FlagHolder object that is returned when defining a flag, matching the ABSL API.
Move a number of flags into the files that consume them. There's no reason to define every flag in `config.py`.
This PR does not change the similar "state" objects in `jax.config`. Changing those is for a future PR.
PiperOrigin-RevId: 551604974
--
b243ea79ae7c9e2c2aa85e264b8dca8fc4c61b7b by Jake VanderPlas <jakevdp@google.com>:
Rename opaque dtype to extended dtype.
This includes three deprecations:
- jax.core.is_opaque_dtype(dt) is deprecated in favor of jnp.issubdtype(dt, jax.dtypes.extended)
- jax.core.has_opaque_dtype(x) is deprecated in favor of jnp.issubdtype(x.dtype, jax.dtypes.extended)
- the allow_opaque_dtype argument to jax.core.canonicalize_dtype is now allow_extended_dtype
Because jax.core is explicitly excluded from the API deprecation policy, these changes will not be
subject to a standard 3-month deprecation period.
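A hedged before/after sketch (using `jax.random.key` simply to obtain a value with an extended dtype):

```python
import jax
import jax.numpy as jnp

x = jax.random.key(0)   # new-style PRNG key; its dtype is an extended dtype
dt = x.dtype

# Deprecated: jax.core.is_opaque_dtype(dt)
assert jnp.issubdtype(dt, jax.dtypes.extended)

# Deprecated: jax.core.has_opaque_dtype(x)
assert jnp.issubdtype(x.dtype, jax.dtypes.extended)
```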
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/jax/pull/16824 from jakevdp:extended-dtype b243ea79ae7c9e2c2aa85e264b8dca8fc4c61b7b
PiperOrigin-RevId: 550674205
Notable changes:
* use PEP 585 type names
* use PEP 604 type union syntax where `from __future__ import annotations` is present.
* use f-strings in more places.
* remove redundant arguments to open().
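An illustrative before/after for the first two items (the signature is made up):

```python
from __future__ import annotations

# Before: from typing import Dict, List, Optional
#   def total(x: Optional[Dict[str, List[int]]]) -> Optional[int]: ...

# After: PEP 585 built-in generics and PEP 604 union syntax.
def total(x: dict[str, list[int]] | None) -> int | None:
    return None if x is None else sum(len(v) for v in x.values())
```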
This makes it closer to numpy, with dtypes.OpaqueDtype analogous to np.dtype,
and dtypes.opaque analogous to np.number. This will let us replace the
dtypes.is_opaque_dtype function with jnp.issubdtype(dtype, dtypes.opaque).
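The numpy side of the analogy, for reference (the jax side is left in comments because these names were later renamed, as noted above):

```python
import numpy as np

# numpy: an abstract scalar category used as the second argument to issubdtype.
assert np.issubdtype(np.dtype('float32'), np.number)

# jax, as described in this change (later renamed to dtypes.extended):
#   jnp.issubdtype(some_dtype, dtypes.opaque)
```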
There are a few cases in which JAX computes `max(v, 0)`, most
notably when computing the sizes of strided accesses, dilated
convolutions, and padding, and for the size of jnp.arange.
Until now these cases were supported
for shape polymorphism only when we could tell statically
that the size is >= 0. Here we add support in the
symbolic expressions for a `non_negative` operator,
which essentially implements `max(v, 0)`; with this
we can now support the general case for `jnp.arange`, with
simpler code.
We could add a general `max` operator, and we may do so in the
future, but for now `non_negative` suffices.
Note that this fixes a couple of bugs
* for core.dilated_dim we had the code "if d == 0 then 0 else ...",
but this works only if we can tell statically whether `d == 0`, and
it produced wrong results when `d` was symbolic and could take
the value 0.
* for core.stride_dim we did not correctly handle the case when
`d < window_size`.
Handling the above fundamentally requires a `max(d, 0)` operation.
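A plain-Python sketch of the fixed formulas (the function names mirror the core.py helpers, but this is an illustration, not JAX's actual code; in JAX, `non_negative` operates on symbolic dimensions):

```python
def non_negative(v):
  # The new symbolic operator: max(v, 0).
  return max(v, 0)

def dilated_dim(d, dilation):
  # Size of a dimension of size d after dilation. The old special case for
  # d == 0 cannot be decided for symbolic d; clamping handles it instead:
  # for d == 0 this yields max(1 - dilation, 0) == 0, as required.
  return non_negative(1 + dilation * (d - 1))

def stride_dim(d, window_size, stride):
  # Number of windows of size window_size at the given stride; clamping
  # also covers d < window_size, where the count must be 0.
  return non_negative((d - window_size) // stride + 1)
```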
Previously we had a number of APIs in core.py that operated on dimensions
and shapes and delegated to instances of DimensionHandler. We remove most
of those APIs because by now they ended up doing very little, e.g.,
`core.sum_dim` was the same as `operator.add`, and `core.sum_shape` was
the same as `tuple(map(operator.add, s1, s2))`.
We also remove the whole `DimensionHandler` machinery, because by now
the only other use of non-constant dimensions through this mechanism
is the symbolic dimensions used for shape polymorphism, and those
now support full operator overloading. (When we introduced `DimensionHandler`,
we also had the masking transformation around, which needed it.)
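Illustratively, with full operator overloading on symbolic dimensions, plain Python expressions replace the removed helpers (the values here are concrete, but could be symbolic dimensions):

```python
import operator

d1, d2 = 3, 4
s1, s2 = (2, 3), (4, 5)

d1 + d2                           # was core.sum_dim(d1, d2)
tuple(map(operator.add, s1, s2))  # was core.sum_shape(s1, s2)
```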
Shape polymorphism relies on a number of functions defined
in core.py. Over time we have accumulated some duplicate functionality
in those functions. Here we do some cleanups:
* remove symbolic_equal_dim and symbolic_equal_shape in favor of the
newer definitely_equal and definitely_equal_shape
* remove is_special_dim_size, which checks that a value is a
dimension expression (not a constant). Some uses are replaced
with `not is_constant_dim` and others with `is_dim`.
* introduce concrete_dim_or_error to check that a value is
a dimension
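As a comment-only sketch of the renames (the exact public exposure of these helpers has varied across versions):

```python
# Before                               After
# core.symbolic_equal_dim(d1, d2)      core.definitely_equal(d1, d2)
# core.symbolic_equal_shape(s1, s2)    core.definitely_equal_shape(s1, s2)
# core.is_special_dim_size(d)          not core.is_constant_dim(d), or
#                                      core.is_dim(d), depending on the use
```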
The advantage (already being realized) is that the batching rules
become much simpler: we just batch along the stacked axis as always,
and when a reduction is about to occur, also mask out the padding
elements, replacing them with the identity element of the reduction.
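A minimal sketch of the masking idea (the data and lengths are hypothetical; this is not the internal batching-rule code):

```python
import jax.numpy as jnp

# A ragged batch: 3 rows padded to length 4, with true lengths per row.
data = jnp.arange(12.0).reshape(3, 4)
lengths = jnp.array([2, 4, 1])

# Mask out padding, replacing it with the reduction's identity (0 for sum),
# then reduce along the stacked axis as usual.
mask = jnp.arange(data.shape[1]) < lengths[:, None]
row_sums = jnp.sum(jnp.where(mask, data, 0.0), axis=1)
```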
This commit
- Changes the intended representation of data for piles and the
corresponding BatchTracers.
- Re-defines ConcatAxis as RaggedAxis to represent the metadata.
- Updates `defreducer` to require the identity function (in case
masking is needed), and supplies it everywhere.
- Flushes batching.segment_sum, as it is dead code now.
- Deletes unpack_concat_axes and reassemble_concat_axes, because they
are irrelevant to the padded representation.
This was called shape_poly.compute_dim_values. We rename it to
shape_poly.unify_avals_with_args and we add better error reporting to it.
Now it will identify the arg/kwarg where there is a shape discrepancy.
This is intended to be a pure refactoring, in preparation for adding
support for shape polymorphism to jax_export.call_exported.
Why? np.prod is generally used here for static operations on shapes, but it
has an unfortunate corner-case behavior: np.prod([]) returns a float.
math.prod is available as of Python 3.8 and is a better solution here.
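The corner case in question:

```python
import math
import numpy as np

np.prod([])    # 1.0 -- a float, which then contaminates shape arithmetic
math.prod([])  # 1   -- an int, the right identity for static shape math
```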