13 Commits

Author SHA1 Message Date
Michael Hudgins
ecf7fde714 Add B200 testing to continuous workflow 2025-03-17 20:19:20 +00:00
Nitin Srinivasan
12760af236 Add custom job names to group different matrix combinations in the Actions dashboard
PiperOrigin-RevId: 736481804
2025-03-13 06:23:04 -07:00
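A minimal sketch of how a custom job name can label each matrix combination in the Actions dashboard. The job ID, matrix keys, and matrix values are hypothetical and are not taken from the actual workflow:

```yaml
jobs:
  run-pytest:  # hypothetical job ID
    # Without `name`, GitHub derives the label from the job ID and matrix values.
    name: "Pytest CPU (${{ matrix.runner }}, Python ${{ matrix.python }})"
    runs-on: ${{ matrix.runner }}
    strategy:
      matrix:
        runner: [ubuntu-latest, windows-latest]  # illustrative values
        python: ["3.10", "3.12"]
    steps:
      - run: echo "Testing on ${{ matrix.runner }} with Python ${{ matrix.python }}"
```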
Nitin Srinivasan
7ac6355262 Add TPU test jobs to the new CI continuous and nightly/release test workflows
Also, modify the TPU presubmit workflow to reuse the `build_artifacts.yml` and `pytest_tpu.yml`

PiperOrigin-RevId: 735832964
2025-03-11 11:42:21 -07:00
Nitin Srinivasan
623865fe95 Build JAX wheels instead of installing it from the source repository
This change allows us to get rid of the extra env vars that used to control whether to install `jax` at head. Now, `jax` will be built and consumed in the same way as the other wheels in the continuous jobs.

PiperOrigin-RevId: 734123590
2025-03-06 06:48:16 -08:00
Nitin Srinivasan
771306bab3 Use ${{ !cancelled() }} instead of ${{ always() }}
`${{ always() }}` makes it difficult to cancel a workflow. See https://github.com/orgs/community/discussions/26303

PiperOrigin-RevId: 731044750
2025-02-25 15:06:38 -08:00
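As a rough illustration of the difference, assuming a two-job workflow with hypothetical job IDs: a job guarded by `always()` runs even after the workflow run has been cancelled, while `!cancelled()` still runs after an upstream failure but respects a cancellation.

```yaml
jobs:
  build:  # hypothetical job ID
    runs-on: ubuntu-latest
    steps:
      - run: echo "build"
  test:  # hypothetical job ID
    needs: build
    # Runs whether `build` succeeded or failed, but not if the run was cancelled.
    # The ${{ }} wrapper is required here because a bare leading `!` is YAML tag syntax.
    if: ${{ !cancelled() }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "test"
```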
Nitin Srinivasan
cf01fdfe6a Use the 64 core Windows runner to build artifacts
Now that we have disabled RBE on Windows, we need to use the bigger machine to build fast.

PiperOrigin-RevId: 731012952
2025-02-25 13:42:16 -08:00
Michael Hudgins
c664a0cd44 [CI] Enable workflow_dispatch for the continuous workflow
This enables testing changes that have a high chance of breaking a longer running test.

PiperOrigin-RevId: 729515893
2025-02-21 06:59:20 -08:00
Nitin Srinivasan
93831bdde7 Download and use jax wheels from GCS bucket for nightly/release test workflows
Unlike continuous workflows, when testing nightly/release artifacts, we want to download and install the `jax` wheels found in the GCS bucket instead of installing `jax` from HEAD.

It looks like the `env` setting in the calling workflow isn't passed over to the called workflows, so we define a new workflow input, `install-jax-current-commit`, to control the `jax` install behavior.

PiperOrigin-RevId: 726086522
2025-02-12 09:32:05 -08:00
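A sketch of the input-based approach, shown as two YAML documents in one block. Only the input name `install-jax-current-commit` and the file name `pytest_cpu.yml` come from the commit messages in this log; the input type, default, job IDs, and the `INSTALL_JAX_AT_HEAD` variable are assumptions for illustration.

```yaml
# Called (reusable) workflow, e.g. pytest_cpu.yml
on:
  workflow_call:
    inputs:
      install-jax-current-commit:
        description: "Whether to install jax at the current commit (assumed wording)"
        type: string    # assumed type
        default: "1"    # assumed default
jobs:
  run-tests:  # hypothetical job ID
    runs-on: ubuntu-latest
    env:
      # Hypothetical env var: the input is re-exported inside the called workflow.
      INSTALL_JAX_AT_HEAD: ${{ inputs.install-jax-current-commit }}
    steps:
      - run: echo "INSTALL_JAX_AT_HEAD=$INSTALL_JAX_AT_HEAD"
---
# Calling workflow: `env` set here is not propagated, so the value is passed via `with`.
on:
  schedule:
    - cron: "0 0 * * *"  # illustrative trigger
jobs:
  run-pytest-cpu:  # hypothetical job ID
    uses: ./.github/workflows/pytest_cpu.yml
    with:
      install-jax-current-commit: "0"
```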
Nitin Srinivasan
30acd383fb Run test jobs irrespective of whether the build job succeeds or fails
This lets us avoid losing test coverage if a single unrelated build job fails. E.g., the Windows build job fails but everything else succeeds. In this case, we still want to run the tests for other platforms.

Also, if a build job fails, its corresponding test job will also report a failure because it cannot download the wheel artifact, so we should still be able to tell the source of the failure easily.

PiperOrigin-RevId: 725754098
2025-02-11 13:37:30 -08:00
Nitin Srinivasan
e8d40ff1a7 Fix typo and improve readability of workflow documentation
PiperOrigin-RevId: 718838936
2025-01-23 06:24:55 -08:00
Nitin Srinivasan
9aad6a6827 Add job that runs Bazel single accelerator and multi-accelerator CUDA tests (non-RBE)
PiperOrigin-RevId: 718637923
2025-01-22 17:51:45 -08:00
Nitin Srinivasan
12beb00bb3 Set timeout for artifact building and "run tests" steps
Also, use a conditional expression in the continuous workflow to control concurrent runs. We don't want to cancel runs on multiple pushes to the main or release branches.

PiperOrigin-RevId: 716780290
2025-01-17 13:24:45 -08:00
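One way such a setup could look, as a sketch only: the group key, the `release/` branch naming, the job ID, and the timeout value are assumptions, not the actual workflow contents.

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  # Cancel a superseded in-progress run only when the ref is neither main
  # nor a release branch (branch naming is an assumption).
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' && !contains(github.ref, 'release/') }}

jobs:
  build-artifact:  # hypothetical job ID
    runs-on: ubuntu-latest
    steps:
      - name: Build artifact
        timeout-minutes: 60  # illustrative limit; the step fails if it runs longer
        run: echo "build"
```

Note that GitHub allows at most one running and one pending run per concurrency group, so with a shared group key an older pending run can still be replaced by a newer one even when `cancel-in-progress` evaluates to false.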
Nitin Srinivasan
c78487d23d Add GitHub Actions workflows for running continuous tests with Pytest
Changes:
- Adds `wheel_tests.yml`, which will be used to run continuous jobs that build artifacts and run CPU/CUDA tests. Jobs will run via workflow calls to `build_artifacts.yml`/`pytest_cpu.yml`/`pytest_gpu.yml`.
- Adds CUDA tests on H100 GPUs
- Makes script executable
- Changes the names of GPU scripts and workflows to CUDA to make it clearer what is being tested
PiperOrigin-RevId: 715500412
2025-01-14 13:10:51 -08:00
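The workflow-call structure described in this commit can be sketched roughly as follows. The three called file names come from the commit message; the job IDs are hypothetical, and any `with:` inputs the called workflows may require are omitted because they are not listed here.

```yaml
# Sketch of a caller such as wheel_tests.yml
on:
  push:
    branches: [main]  # illustrative trigger

jobs:
  build-artifact:  # hypothetical job ID
    uses: ./.github/workflows/build_artifacts.yml
  run-pytest-cpu:  # hypothetical job ID
    needs: build-artifact
    uses: ./.github/workflows/pytest_cpu.yml
  run-pytest-cuda:  # hypothetical job ID
    needs: build-artifact
    uses: ./.github/workflows/pytest_gpu.yml
```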