rocm_jax

mirror of https://github.com/ROCm/jax.git synced 2025-04-16 20:06:05 +00:00

Author	SHA1	Message	Date
Dan Foreman-Mackey	4f394828e1	Fix C++ registration of FFI handlers and consolidate gpu/linalg kernel implementation. This change does a few things (arguably too many): 1. The key change here is that it fixes the handler registration in `jaxlib/gpu/gpu_kernels.cc` for the two handlers that use the XLA FFI API. A previous attempt at this change caused downstream issues because of duplicate registrations, but we were able to fix that directly in XLA. 2. A second related change is to declare and define the XLA FFI handlers consistently using the `XLA_FFI_DECLARE_HANDLER_SYMBOL` and `XLA_FFI_DEFINE_HANDLER_SYMBOL` macros. We need to use these macros instead of the `XLA_FFI_DEFINE_HANDLER` version which produces a lambda, so that when XLA checks the address of the handler during registration it is consistent. Without this change, the downstream tests would continue to fail. 3. The final change is to consolidate the `cholesky_update_kernel` and `lu_pivot_kernels` implementations into a common `linalg_kernels` target. This makes the implementation of the `_linalg` nanobind module consistent with the other targets within `jaxlib/gpu`, and (I think!) makes the details easier to follow. This last change is less urgent, but it was what I set out to do so that's why I'm suggesting them all together, but I can split this in two if that would be preferred. PiperOrigin-RevId: 651107659	2024-07-10 12:09:12 -07:00
jax authors	e8b06ccf56	Cholesky rank-1 update kernel for JAX. PiperOrigin-RevId: 633722940	2024-05-14 15:21:38 -07:00
Chris Jones	4ac2bdc2b1	[jax_triton] Add user-specified `name` field to serialized format. PiperOrigin-RevId: 557415723	2023-08-16 02:53:51 -07:00
Chris Jones	31b862dd56	[jax_triton] Split C++ only parts of Triton custom callback from Python parts. Register callback with default call target name from C++, enabling Triton calls with the default name to work in C++ only contexts (e.g. serving). PiperOrigin-RevId: 545211452	2023-07-03 06:52:32 -07:00
Chris Jones	d4e2464340	[jax_triton] Expose Triton custom call callback in header file. This allows users to register the callback from C++ when not using the default call target name. PiperOrigin-RevId: 544029098	2023-06-28 05:32:02 -07:00
Chris Jones	f238667492	Make JAX-Triton calls serializable. PiperOrigin-RevId: 542524794	2023-06-22 04:57:14 -07:00
Sharad Vikram	bf8ed6a543	Move triton_kernel_call_lib to jaxlib. PiperOrigin-RevId: 534934592	2023-05-24 12:11:21 -07:00
Peter Hawkins	3bb7386149	[JAX] Improve handling of metadata in compilation cache. Metadata, in particular code location information is present in the HLO generated by JAX. The compilation cache uses the serialized HLO as a cache key, which begs the question: should code location information be part of that key? Simply changing the line number on which a function appears shouldn't necessarily cause a cache miss. There are pros and cons: the main advantage of excluding metadata is that we will get more cache hits, and the main disadvantage is that debug information and profiling data in the HLO might become confusing, since it may refer to a different program entirely, or to a version of a program that does not correspond to the current state of the source tree. We argue that saving compilation time is the more important concern. This change adds a tiny MLIR pass that strips Locations from a StableHLO module, and applies it in the compilation cache if metadata stripping is enabled. PiperOrigin-RevId: 525534901	2023-04-19 13:27:04 -07:00
Qiao Zhang	4d1c4bc761	Add CUDNN custom call for LSTM. Exposed as jax.experimental.rnn module. PiperOrigin-RevId: 491445515	2022-11-28 14:31:48 -08:00
jax authors	d1fbdbc1cf	Rollback of "Add CUDNN custom call for LSTM. Exposed as jax.experimental.rnn module." PiperOrigin-RevId: 490499003	2022-11-23 07:48:05 -08:00
Qiao Zhang	78963b6020	Add CUDNN custom call for LSTM. Exposed as jax.experimental.rnn module. PiperOrigin-RevId: 490387796	2022-11-22 18:53:29 -08:00
Peter Hawkins	a852710a09	Merge CUDA and ROCM kernel code in jaxlib. The code for both CUDA and ROCM is almost identical, so with a small shim library to handle the differences we can share almost everything. PiperOrigin-RevId: 483666051	2022-10-25 07:23:34 -07:00

12 Commits