The pass adds versioning to the Mosaic GPU IR in the lowered custom calls
and can apply forward/backward migration rules. Currently, no rules are
necessary since we are at version 1.
PiperOrigin-RevId: 716596848
Replace `with set_mesh(mesh):` with `with use_mesh(mesh):` context manager
Also expose `AxisTypes` and `use_mesh` into public API via `jax.sharding.AxisTypes` and `jax.sharding.use_mesh`.
PiperOrigin-RevId: 716446406
This commit stores the GCS upload URI as a step output and then maps the job output to it so that we can make it available when a workflow call is made to `build_artifacts.yml`.
This helps in the case where we want to debug a workflow such as https://github.com/jax-ml/jax/blob/main/.github/workflows/wheel_tests_continuous.yml where artifact build jobs and test jobs are run separately. Previously, we could not re-try a failing test workflow as the upload URI contains `${{ github.workflow }}/${{ github.run_number }}/${{ github.run_attempt }}`. In GitHub, `github.run_attempt` is mapped to a workflow and not individual jobs so even a re-trigger of a test job alone would lead to the `run_attempt` value being increased which in turn invalidates the GCS download URI that it reads.
By storing the the upload URI as a step output, we freeze the upload/download URIs until the build artifact jobs are re-run. However, note that this still has a edge case where things can break - `run-pytest-cuda` job in `wheel_tests_continuous.yml` depends on both `build-jaxlib-artifact` and `build-cuda-artifacts` but consumes the upload URI from the output of `build-jaxlib-artifact` alone. This is done on the assumption that both these jobs will have uploaded to the same location. However, that would not be the case if one of these jobs fail and have to re-run. We are working on a longterm solution for this case but in the meantime, the recommendation for now is just to re-run the whole set of jobs again.
PiperOrigin-RevId: 716348745
and multiple devices.
Whenever this happens we can essentially introduce an effects barrier
instead of doing the normal device -> host -> device transfer.
Fixes https://github.com/jax-ml/jax/issues/25671.
PiperOrigin-RevId: 716309978
This will basically drop the gather operation into full auto mode and add a sharding constraint on the output given by the user via `out_spec`.
Co-authored-by: Matthew Johnson <mattjj@google.com>
PiperOrigin-RevId: 716295953
1. For apply, llvm::StringMap()::insert(MapEntryTy*) will cause dangling reference if not constructing mlir::tpu::extensions::rules() with const-reference. However, if we do construct it with const-reference, the signature is not const-qualified and fails to compile. Hence, change it to llvm::StringMap()::insert(std::pair<...>) and get extension rules by const-reference.
2. Pass default tiling to infer rule, we need it to infer single op. See infer of tpu::MatmulOp.
PiperOrigin-RevId: 716274818
- Change the `async_load` lowering to manage the single thread context.
- Use a predicate for the top-level arrive_expect. If we want to hide this further, we can have a warp-group level op that lowers to a single-threaded context.
PiperOrigin-RevId: 716219730
* Split on 1 dimension only and the splitting dimension should be unsharded.
`operand.shape = (4@x, 6@y, 8), new_shape = (4@x, 6@y, 2, 2, 2)`
* Merging into 1 dimension only and all the merging dimensions should be unsharded.
`operand.shape = (4@y, 2, 3, 8), new_shape = (4@y, 6, 8)`
* Split into singleton dimensions i.e. adding extra dims of size 1
`operand.shape = (4@x, 6@y, 8@z), new_shape = (1, 4@x, 1, 6@y, 1, 8@z, 1)`
* Merge singleton dimensions i.e. removing extra dims of size 1
`operand.shape = (1, 4@x, 6, 1, 8, 1), new_shape = (1, 4@x, 6, 8)`
* Identity reshape
`operand.shape = (4@(x,y), 6), new_shape = (4@(x,y), 6)`
These cases are unambiguous to handle. In all other cases, we error out and ask the user to provide the out_sharding.
PiperOrigin-RevId: 716216240
The motivation behind this change is twofold:
1. it simplifies test writing (no need to produce arbitrary, manual, non-splat
constants to produce arguments with a strided layout);
2. it'll allow running layout inference on different `FuncOp`s in isolation,
before inlining.
While the primary motivation is to simplify test writing for upcoming changes,
`2.` is useful if we ever intend to call functions whose body's layout we have
inferred from other functions. It's not clear to me that we have a use case for
that, but the theoretical benefit is worth pointing out.
Crucially, layout inference does not set default layouts for `FuncOp`s, since
the caller may choose a different layout for its arguments. As a result, there
is also no layout inference rule for `func.FuncOp`.
PiperOrigin-RevId: 716158516
As it stands, there is only ever one element in this list (see b/384741132) and only the 0th element is ever used so we can simplify.
This is a potentially breaking change for external users, but (as stated in the [documentation](https://jax.readthedocs.io/en/latest/aot.html#debug-information-and-analyses-when-available)) no guarantees are made on this type, which is intended for debugging purposes and not intended to be a reliable public API.
PiperOrigin-RevId: 715837855
The error message produced by MLIR is not really clear, but AFAICT the crash
was caused by the "temporary module" hack we use in the lax.cond lowering
rule.
PiperOrigin-RevId: 715785632