Introduces `jax.config.array_garbage_collection_guard`, a tristate config that sets up a garbage collection guard for `jax.Array`s. The possible values are listed below, followed by a usage sketch:
* allow: `jax.Array`s are allowed to be garbage collected. This is the default value.
* log: whenever a `jax.Array` is garbage collected, a log entry is generated with the array's traceback.
* fatal: crash fatally when a `jax.Array` is garbage collected. This is meant for mature code bases that do tight memory management and are free of reference cycles.
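A minimal usage sketch, assuming the option is set through `jax.config.update` like other JAX config options (the exact config key string is an assumption here):
```py
import jax

# Assumed config key; the option is described above as
# jax.config.array_garbage_collection_guard, and the exact string accepted by
# jax.config.update may differ.
jax.config.update("jax_array_garbage_collection_guard", "log")

# From here on, a garbage-collected jax.Array should produce a log entry with
# the array's traceback.
```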
PiperOrigin-RevId: 687003464
We're seeing test failures from tests that assume this dialect exists. But given that we plan to enable it at some point, we may as well include it in the build.
The size impact is small (around 400K uncompressed).
PiperOrigin-RevId: 679608092
The goal of this change is to catch PRs that introduce new warnings sooner.
To make it easier to pass the environment variable, rename the jax_test Bazel test macro to jax_multiplatform_test and introduce a new jax_py_test macro that wraps py_test. Add code to both to set the environment variable.
Add code to suppress some new warnings uncovered in CI.
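As a minimal sketch of the mechanism involved, assuming the environment variable simply drives Python's warning filters (the variable name below is hypothetical, not the one used by this change):
```py
import os
import warnings

# Hypothetical variable name, for illustration only.
if os.environ.get("JAX_TEST_WARNINGS_AS_ERRORS") == "1":
    # Promote warnings to errors so that newly introduced warnings fail CI.
    warnings.simplefilter("error")
```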
PiperOrigin-RevId: 678352286
We already had most of the relevant pieces and only needed
to connect them together. The most sensitive change is perhaps that
I needed to expose one more symbol from the XLA GPU plugin, but I don't
think that should be a problem.
The OpenXLA project is working on an open source, MLIR-based, named-axis propagation (and, in the future, SPMD partitioning) system that will be dialect agnostic (it would work for any dialect: MHLO, StableHLO, YourDialect). We plan on having frontends like JAX and PyTorch target this when using XLA and wanting SPMD propagation/partitioning. See www.github.com/openxla/shardy for more info.
Currently Shardy is implemented inside the XLA compiler, requiring us to round-trip between StableHLO and HLO with `mhlo.sharding`s. But we will eventually make Shardy the first pass in the XLA pipeline, while the program is still in StableHLO. Partitioning (the system that adds collectives like all-gathers/all-reduces) will still be the GSPMD partitioner, but next year the Shardy partitioner will be developed, allowing propagation and partitioning to be completely in MLIR and the first passes in the pipeline. We'd then have:
1. Traced jaxpr
2. Jaxpr -> StableHLO
3. StableHLO with Shardy propagation
4. StableHLO with Shardy partitioning
5. StableHLO -> HLO
6. XLA optimizations
The following test:
```py
def test_sdy_lowering(self):
  mesh = jtu.create_global_mesh((4, 2), ('x', 'y'))
  np_inp = np.arange(16).reshape(8, 2)
  s = jax.sharding.NamedSharding(mesh, P('x', 'y'))
  arr = jax.device_put(np_inp, s)

  @partial(jax.jit, out_shardings=s)
  def f(x):
    return x * 2

  print(f.lower(arr).as_text())
```
outputs:
```
module @jit_f attributes {mhlo.num_partitions = 8 : i32, mhlo.num_replicas = 1 : i32} {
  sdy.mesh @mesh = <"x"=4, "y"=2>
  func.func public @main(%arg0: tensor<8x2xi64> {mhlo.layout_mode = "{1,0}", sdy.sharding = #sdy.sharding<@mesh, [{"x"}, {"y"}]>}) -> (tensor<8x2xi64> {jax.result_info = "", mhlo.layout_mode = "default", sdy.sharding = #sdy.sharding<@mesh, [{"x"}, {"y"}]>}) {
    %c = stablehlo.constant dense<2> : tensor<i64>
    %0 = stablehlo.broadcast_in_dim %c, dims = [] : (tensor<i64>) -> tensor<8x2xi64>
    %1 = stablehlo.multiply %arg0, %0 : tensor<8x2xi64>
    return %1 : tensor<8x2xi64>
  }
}
```
Shardy will initially be hidden behind the `jax_use_shardy_partitioner` flag before being enabled by default in the future.
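A brief opt-in sketch using the flag named above:
```py
import jax

# Opt in to Shardy-based propagation/partitioning during lowering.
jax.config.update("jax_use_shardy_partitioner", True)
```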
PiperOrigin-RevId: 655127611
The recommended source of JAX wheels is `pip`, and NVIDIA dependencies are installed automatically when JAX is installed via `pip install`. `libdevice` is installed from the `nvidia-cuda-nvcc-cu12` package.
PiperOrigin-RevId: 647328834
This lets us avoid bundling yet another copy of LLVM with JAX packages,
and so we can finally start building Mosaic GPU by default.
PiperOrigin-RevId: 638569750
JAX has stopped generating code that directly uses
the DUCC FFT custom calls.
The six-month backwards-compatibility window has also expired.
PiperOrigin-RevId: 638132572
The new "typed" API that XLA provides for foreign function calls is
header-only and packaging it as part of jaxlib could simplify the open
source workflow for building custom calls.
It's not completely obvious that we need to include this, because jaxlib
isn't strictly required as a _build_ dependency for FFI calls, although
it typically will be required as a _run-time_ dependency. Also, it
probably wouldn't be too painful for external projects to use the
headers directly from the openxla/xla repo.
All that being said, I wanted to figure out how to do this, and it has
been requested a few times.
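As a sketch of how an external project might consume the bundled headers, assuming a helper such as `include_dir()` is available to locate them (the exact import path is an assumption and has moved between releases):
```py
# Assumes a helper like jax.ffi.include_dir() exists to report where the
# bundled XLA FFI headers are installed; the import path may differ by version.
import jax

include_dir = jax.ffi.include_dir()
# Add this directory to your C++ compiler's include path, e.g. -I<include_dir>.
print(include_dir)
```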
The XLA:GPU custom call design is far from ideal, as there's apparently no way to figure
out the CUDA context that will be used to run an HLO module before the custom call is
first called. So we can't preload the kernel onto the GPU, or else we'll get invalid-handle
errors because the load and the launch happen in different CUDA contexts...
Also fix up build_wheel.py to match the rename of the runtime lib.
PiperOrigin-RevId: 629401858
The stock MLIR pipeline was a good way to get the prototype off the ground, but
its default passes can be problematic. In particular, each gpu.launch op is compiled
into a sequence of instructions that load the kernel onto the GPU, run the kernel,
and immediately unload it again. This has the correct semantics, but loading the
kernel is both expensive and forces a synchronization point, which leads to performance
issues.
To resolve this, I implemented a new MLIR pass that finds the gpu.launch ops and splits
each function that contains one into two functions: one that preloads the kernel onto the
GPU, and another that consumes the handle produced by the first. We call
the first function at compile time, while only the second one is used at run time.
There are other overheads in MLIR's implementation of kernel launch, but I will
fix those later.
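The pass itself lives in MLIR/C++; the Python-level sketch below only illustrates the load-once/launch-many split it introduces and is not the actual implementation.
```py
class PreloadedKernel:
    """Conceptual stand-in for the two functions produced by the pass."""

    def __init__(self, load_fn):
        # "Compile-time" half: load the kernel module onto the GPU once and
        # keep the resulting handle.
        self.handle = load_fn()

    def launch(self, launch_fn, *args):
        # "Run-time" half: only launches using the cached handle, so there is
        # no per-call load/unload and no extra synchronization point.
        return launch_fn(self.handle, *args)
```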
PiperOrigin-RevId: 627670773
The IFRT `PluginProgram` is simply a wrapper for arbitrary byte-strings: an IFRT backend that recognizes `PluginProgram` can interpret the byte-string in any way it sees fit.
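IFRT is a C++ API; the sketch below only mirrors the idea of an opaque byte-string wrapper and is not the actual interface.
```py
from dataclasses import dataclass

@dataclass
class PluginProgramSketch:
    # Arbitrary payload; only a backend that recognizes this program type
    # decides how to interpret the bytes.
    data: bytes
```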
PiperOrigin-RevId: 621258245
This dialect doesn't build on Windows, but we don't support GPUs on Windows anyway, so we can simply exclude it from the build.
CI failures look like this:
```
C:\npm\prefix\bazel.CMD run --verbose_failures=true //jaxlib/tools:build_wheel -- --output_path=C:\a\jax\jax\jax\dist --jaxlib_git_hash=5f19f7712b485493ac141c44eea3b3eb1ffdfb59 --cpu=AMD64
b"external/triton/lib/Dialect/TritonGPU/Transforms/Utility.cpp(70): error C2672: 'mlir::Block::walk': no matching overloaded function found\r\nexternal/triton/lib/Dialect/TritonGPU/Transforms/Utility.cpp(70): error C2783: 'RetT mlir::Block::walk(FnT &&)': could not deduce template argument for 'ArgT'\r\nexternal/llvm-project/mlir/include\\mlir/IR/Block.h(289): note: see declaration of 'mlir::Block::walk'\r\nexternal/triton/lib/Dialect/TritonGPU/Transforms/Utility.cpp(110): error C2672: 'mlir::OpState::walk': no matching overloaded function found\r\nexternal/triton/lib/Dialect/TritonGPU/Transforms/Utility.cpp(110): error C2783: 'enable_if<llvm::function_traits<decay<FnT>::type,std::is_class<T>::value>::num_args==1,RetT>::type mlir::OpState::walk(FnT &&)': could not deduce template argument for 'RetT'\r\n with\r\n [\r\n T=decay<FnT>::type\r\n ]\r\nexternal/llvm-project/mlir/include\\mlir/IR/OpDefinition.h(165): note: see declaration of 'mlir::OpState::walk'\r\nexternal/llvm-project/mlir/include\\mlir/IR/PatternMatch.h(357): error C2872: 'detail': ambiguous symbol\r\nexternal/llvm-project/mlir/include\\mlir/Rewrite/FrozenRewritePatternSet.h(15): note: could be 'mlir::detail'\r\nbazel-out/x64_windows-opt/bin/external/triton/include\\triton/Dialect/Triton/IR/Ops.h.inc(5826): note: or 'mlir::triton::detail'\r\nexternal/triton/lib/Dialect/TritonGPU/Transforms/Utility.cpp(712): note: see reference to class template instantiation 'mlir::OpRewritePattern<mlir::scf::ForOp>' being compiled\r\nexternal/triton/lib/Dialect/TritonGPU/Transforms/Utility.cpp(741): error C2672: 'mlir::Block::walk': no matching overloaded function found\r\nexternal/triton/lib/Dialect/TritonGPU/Transforms/Utility.cpp(741): error C2783: 'RetT mlir::Block::walk(FnT &&)': could not deduce template argument for 'ArgT'\r\nexternal/llvm-project/mlir/include\\mlir/IR/Block.h(289): note: see declaration of 'mlir::Block::walk'\r\n"
output = subprocess.check_output(cmd)
```
PiperOrigin-RevId: 609153322
I re-used the same trick we use for the TPU dialect. Specifically, `_triton_ext` no longer depends on `:triton_dialect_capi`. Instead:
* we include the Triton dialect C bindings in `:jaxlib_mlir_capi_objects`, and
* `_triton_ext` depends on `:jaxlib_mlir_capi_objects` plus a header-only cc_library that provides the Triton dialect C bindings.
This is a fork of #19680 with a few internal-only fixes.
PiperOrigin-RevId: 604929377
Currently, the ml_dtypes C++ sources are included in the set of sources at jaxlib build time. This is unnecessary, and can lead to problematic version skew in some cases (e.g. nightly builds).
PiperOrigin-RevId: 595725529
JAX isn't using this, and in fact our code to build it wasn't including the C++ parts, so it was broken anyway. Remove it until someone actually needs it for something.
PiperOrigin-RevId: 587323808
With this change, the existing plugin discovery mechanism can discover local plugins without a pip install.
Update jax_plugins/cuda/__init__.py to return without registering the plugin if the .so file does not exist.
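A minimal sketch of that early-return guard, assuming the standard `initialize()` entry point used by jax_plugins packages (the shared-library name below is illustrative):
```py
import os

def initialize():
    # Illustrative library name; the real plugin checks for its actual PJRT
    # plugin shared library next to this __init__.py.
    so_path = os.path.join(os.path.dirname(__file__), "pjrt_plugin_cuda.so")
    if not os.path.exists(so_path):
        # The plugin binary was not built/installed locally; skip registration.
        return
    # ... proceed to register the PJRT plugin found at so_path ...
```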
PiperOrigin-RevId: 582431300
With this change, `python3 build/build.py --enable_cuda --build_gpu_plugin --gpu_plugin_cuda_version=12` will generate three wheels:
|                         | size | wheel name                                                                |
|-------------------------|------|---------------------------------------------------------------------------|
| jaxlib w/o cuda kernels | 76M  | jaxlib-0.4.20.dev20231101-cp310-cp310-manylinux2014_x86_64.whl            |
| cuda pjrt               | 73M  | jax_cuda12_pjrt-0.4.20.dev20231101-py3-none-manylinux2014_x86_64.whl      |
| cuda kernels            | 6.6M | jax_cuda12_plugin-0.4.20.dev20231101-cp310-cp310-manylinux2014_x86_64.whl |
The size of jaxlib with cuda kernels and pjrt is 119M.
The cuda kernel wheel contains all of the CUDA kernels. A plugin_setup.py and a plugin_pyproject.toml are added for this new package.
PiperOrigin-RevId: 579861480
The way MLIR dialects can be extended in Python has recently
changed (in https://github.com/llvm/llvm-project/pull/68853), so we have
to update our bindings.
PiperOrigin-RevId: 576060814
* Use object-oriented pathlib.Path paths.
* Change the copy_files() helper to copy many files in one call (see the sketch after this list).
* Make copy_files() also create the output directory, if needed.
* Format the file with pyink --pyink-indentation=2.
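A minimal sketch of what such a helper can look like (not the exact build_wheel.py code):
```py
import pathlib
import shutil

def copy_files(srcs, dst_dir):
    """Copy many files into dst_dir, creating the directory if needed."""
    dst = pathlib.Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for src in srcs:
        src = pathlib.Path(src)
        shutil.copy(src, dst / src.name)
```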
This is intended to flag cases where the wrong CUDA libraries are used (see the sketch after this list), either because:
* the user self-installed CUDA and that installation is too old, or
* the user used the pip package installation, but due to LD_LIBRARY_PATH overrides or similar we didn't end up using the pip-installed version.
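A hypothetical illustration of this kind of check; the names, threshold, and message below are made up rather than taken from the actual implementation:
```py
import warnings

MIN_CUDA_RUNTIME = 12010  # e.g. CUDA 12.1, encoded as major * 1000 + minor * 10

def check_cuda_runtime_version(reported: int) -> None:
    # `reported` would come from the CUDA libraries that actually got loaded.
    if reported < MIN_CUDA_RUNTIME:
        warnings.warn(
            f"The loaded CUDA runtime reports version {reported}, older than "
            f"the minimum supported {MIN_CUDA_RUNTIME}. This can happen with a "
            "stale self-installed CUDA or an LD_LIBRARY_PATH override that "
            "shadows the pip-installed libraries."
        )
```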
PiperOrigin-RevId: 568910422