Skye Wanderman-Milne
120125f3dd
Make pytest-xdist work on TPU and update Cloud TPU CI.
...
This change also marks multiaccelerator test files in a way pytest can
understand (if pytest is installed).
Running each single-device test on a single TPU chip cuts the test
suite from 1h 45m to 35m (both timings include slow tests).
I tried bazel first, which already supported parallel execution
across TPU cores, but it still takes 2h 20m! I'm not sure why it's so
slow. Bazel appears to create many new test processes over time,
whereas pytest reuses the processes it was started with. Starting and
stopping the TPU runtime takes a few seconds each time, so that may be
adding up. Single-process bazel also appears to be slower than
single-process pytest, which I haven't looked into yet.
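A minimal sketch of how a test file might be marked only when pytest is available, as the message above describes. The marker name `multiaccelerator` is taken from the wording of this commit; the test class is purely hypothetical, and the real test files may do this differently.

```python
import unittest

# pytest is optional here: the file must still import and run without it.
try:
    import pytest
except ImportError:
    pytest = None

# A module-level `pytestmark` applies the marker to every test in this file.
# It is only defined when pytest could be imported.
if pytest is not None:
    pytestmark = pytest.mark.multiaccelerator


class ExampleMultiDeviceTest(unittest.TestCase):
    """Hypothetical placeholder standing in for a real multi-device test."""

    def test_placeholder(self):
        self.assertEqual(1 + 1, 2)
```

With a marker like this in place, something like `pytest -n 4 -m "not multiaccelerator"` would run only the unmarked tests across four pytest-xdist workers. The exact invocation used by the CI is not shown in this log.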
2022-11-18 22:05:13 +00:00
Skye Wanderman-Milne
b4564a2a57
TPU CI: don't notify when testing the workflow from a branch
2022-11-16 21:27:24 +00:00
Skye Wanderman-Milne
8bed9bac81
Update GitHub Actions workflows using Ratchet
...
https://opensource.google/documentation/reference/github/services#actions
mandates using a specific commit for non-Google actions in workflow
files. I used https://github.com/sethvargo/ratchet to update all our
workflow files. Example command: `ratchet pin cloud-tpu-ci-nightly.yml`
Ratchet also appears to auto-format the YAML files. This makes the diff
confusing, but I'm OK with the final result.
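To illustrate what pinning to a specific commit means, here is a rough sketch of a checker that lists `uses:` references not pinned to a full 40-character commit SHA. It is not part of Ratchet, the helper name and sample workflow text are hypothetical, and the SHA shown is a placeholder rather than a real commit.

```python
import re

# Matches `uses: owner/repo@ref` (optionally preceded by a list dash).
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
# A full commit SHA is exactly 40 lowercase hex characters.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return (action, ref) pairs whose ref is not a full commit SHA."""
    found = []
    for line in workflow_text.splitlines():
        m = USES_RE.match(line)
        if m and not SHA_RE.match(m.group(2)):
            found.append((m.group(1), m.group(2)))
    return found


# Hypothetical workflow snippet; the 40-character SHA is a placeholder.
example = """\
steps:
  - uses: actions/checkout@v3
  - uses: styfle/cancel-workflow-action@0123456789abcdef0123456789abcdef01234567
"""

print(unpinned_actions(example))
```

This sketch flags every action that uses a tag or branch ref, whereas the policy quoted above applies to non-Google actions, so a real check would likely need an allowlist.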
2022-11-16 18:45:59 +00:00
Yash Katariya
a419e1917a
Use jax.Array by default for doctests
...
PiperOrigin-RevId: 488719467
2022-11-15 11:52:22 -08:00
Skye Wanderman-Milne
5da7976093
Send message to internal chat room on Cloud TPU CI failure
2022-11-14 19:44:45 +00:00
Skye Wanderman-Milne
52775c42e4
Add .github/workflows/self_hosted_runner_utils/README.md
...
This was meant to be part of https://github.com/google/jax/pull/13000, oops
2022-11-04 17:12:54 +00:00
jax authors
3db2a59f76
Merge pull request #13097 from jakevdp:actions-permissions
...
PiperOrigin-RevId: 486160888
2022-11-04 09:36:32 -07:00
Skye Wanderman-Milne
8c22e34e22
Add GitHub Actions workflow that runs on a self-hosted TPU VM runner.
...
This also includes some utilities for setting up the self-hosted
runner. Googlers, see go/jax-self-hosted-runners for more setup info.
The workflow is pretty basic currently. We can and should add more
functionality later, such as email notifications. I kept it simple
here for easier reviewing.
Testing:
- Sample workflow run in my fork: https://github.com/skye/jax/actions/runs/3333614180
- Sample PR attempt: (will add soon but I did verify validate_job.sh blocks pull_request workflows)
2022-11-03 21:15:57 +00:00
Jake VanderPlas
8057e2805b
CI: set explicit permissions for ci-build action
2022-11-03 13:21:58 -07:00
dependabot[bot]
cef5f20dbb
Bump styfle/cancel-workflow-action from 0.10.1 to 0.11.0
...
Bumps [styfle/cancel-workflow-action](https://github.com/styfle/cancel-workflow-action) from 0.10.1 to 0.11.0.
- [Release notes](https://github.com/styfle/cancel-workflow-action/releases)
- [Commits](https://github.com/styfle/cancel-workflow-action/compare/0.10.1...0.11.0)
---
updated-dependencies:
- dependency-name: styfle/cancel-workflow-action
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-10-17 17:18:44 +00:00
jax authors
6c3c51e8f3
Merge pull request #12591 from sudhakarsingh27:add_pytest_run_for_jaxlib_release
...
PiperOrigin-RevId: 478608240
2022-10-03 14:34:32 -07:00
dependabot[bot]
8f71b03662
Bump styfle/cancel-workflow-action from 0.10.0 to 0.10.1
...
Bumps [styfle/cancel-workflow-action](https://github.com/styfle/cancel-workflow-action) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/styfle/cancel-workflow-action/releases)
- [Commits](https://github.com/styfle/cancel-workflow-action/compare/0.10.0...0.10.1)
---
updated-dependencies:
- dependency-name: styfle/cancel-workflow-action
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-10-03 17:11:55 +00:00
Sudhakar
4fbc9a10d1
Add multihost GPU CI run with last public jaxlib release
2022-09-29 17:06:56 -07:00
Peter Hawkins
ba557d5e1b
Change JAX's copyright attribution from "Google LLC" to "The JAX Authors.".
...
See https://opensource.google/documentation/reference/releasing/contributions#copyright for more details.
PiperOrigin-RevId: 476167538
2022-09-22 12:27:19 -07:00
Yash Katariya
160e14308c
[Rollback] Add a github presubmit build which runs with jax.Array flag enabled for OSS coverage.
...
PiperOrigin-RevId: 473161716
2022-09-08 22:09:44 -07:00
Yash Katariya
49672cd2bc
Add a github presubmit build which runs with jax.Array flag enabled for OSS coverage.
...
PiperOrigin-RevId: 473100614
2022-09-08 15:32:31 -07:00
Sudhakar
5f1858f533
Add pytest marker inside the test only if pytest is present in the env
2022-09-06 11:45:59 -07:00
Roy Frostig
fe2de26b2c
Test the RNG key upgrade in one of our GitHub CI runs
...
Choosing the "numpy-dispatch" configuration since it tends to be our
frequent pick for future-proofing.
2022-08-30 21:43:44 -07:00
Sudhakar
a571db18db
Enable one gpu per process in multinode GPU CI
2022-08-29 09:00:19 -07:00
Sudhakar
4b1a2eaaec
combine gpu tests
2022-08-25 15:27:07 -07:00
Sudhakar
c2e521807c
Add support to test gpu jaxlib nightly in CI instead of prebuilt jax/jaxlib
2022-08-19 11:08:11 -07:00
Jake VanderPlas
7ec6acd981
nightly multiprocess test: create issue on failure
2022-08-09 19:12:32 -07:00
Sudhakar Singh
efb37ff784
Bump EnricoMi/publish-unit-test-result-action from 1 to 2
2022-08-08 10:55:53 -07:00
Sudhakar Singh
1565fd2525
Squashed commit of the following:
...
commit 0b4c3f05a49037be93eb0612113e193f3a8d61c5
Author: Sudhakar Singh <sudhakars@nvidia.com>
Date: Thu Aug 4 09:53:04 2022 -0700
change the path
commit 2c629739c1cfa45d848a2cf7109d329c1262e6ac
Author: Sudhakar Singh <sudhakars@nvidia.com>
Date: Wed Aug 3 16:37:46 2022 -0700
rename file to reflect current objective
commit ef46bcae6cd66d6fe7b04bd6d8aeed42c4f3ddfa
Author: Sudhakar Singh <sudhakars@nvidia.com>
Date: Wed Aug 3 15:56:32 2022 -0700
correct formatting
commit e5da60ad855592d5f150612f65ad679872160132
Author: Sudhakar Singh <sudhakars@nvidia.com>
Date: Wed Aug 3 15:26:32 2022 -0700
Add multi-node multi-GPU JAX tests
This adds a multi-node multi-GPU test for `jax.distributed.initialize`.
Presently, this is expected to run on a nightly basis. Under the hood,
SLURM is used to launch the `pytest <test_name>` commands on multiple
nodes.
Resolves: #11648
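As a rough illustration of the setup described above, the sketch below derives the arguments for `jax.distributed.initialize` from SLURM's per-task environment variables. The helper names are hypothetical and the coordinator address is an assumption left to the caller. The actual test scripts added by this commit are not shown here and may differ.

```python
import os


def slurm_process_info(env=None):
    """Read this task's rank and the total task count from SLURM's environment."""
    env = os.environ if env is None else env
    return int(env["SLURM_PROCID"]), int(env["SLURM_NTASKS"])


def init_distributed(coordinator_address, env=None):
    """Initialize JAX's distributed runtime using values derived from SLURM.

    `coordinator_address` is a "host:port" string chosen by the caller; how it
    is obtained is an assumption and is not specified in the commit message.
    """
    import jax  # imported lazily so slurm_process_info works without JAX

    process_id, num_processes = slurm_process_info(env)
    jax.distributed.initialize(
        coordinator_address=coordinator_address,
        num_processes=num_processes,
        process_id=process_id,
    )
    return process_id, num_processes
```

Each task launched by SLURM would call `init_distributed` once before running its tests, so that all processes agree on the same coordinator and process count.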
2022-08-04 10:13:50 -07:00
jax authors
9f96a0474e
Merge pull request #11624 from alexalemi:notifications
...
PiperOrigin-RevId: 463390452
2022-07-26 11:43:51 -07:00
Alex Alemi
31d2a74aeb
remove broken-main-notify
2022-07-26 14:29:31 -04:00
Jake VanderPlas
993196c451
CI: make parse_logs more robust to errors
2022-07-20 10:33:32 -07:00
Jake VanderPlas
2eaa44e8d8
[CI] upstream-dev: run all tests
2022-07-18 11:06:56 -07:00
Yash Katariya
90433c0518
Update ci-build.yaml
2022-07-15 16:03:16 -07:00
jax authors
5a10c1af3c
Merge pull request #11487 from google:upstream-dev-logs
...
PiperOrigin-RevId: 461241333
2022-07-15 14:43:37 -07:00
Jake VanderPlas
b0cd7de999
CI: use minimum jaxlib in upstream-ci build
2022-07-14 14:34:47 -07:00
Jake VanderPlas
b97f6f4819
upstream-dev: add failure information to CI issues
2022-07-13 14:05:11 -07:00
Jake VanderPlas
917c4e9dfc
add upstream-nightly CI job
2022-07-12 17:04:53 -07:00
Arjun Sharda
cc8e302933
Update ci-build.yaml
...
Update ci-build.yaml
2022-06-28 12:45:59 -07:00
dependabot[bot]
19ae5d8581
Bump styfle/cancel-workflow-action from 0.9.1 to 0.10.0
...
Bumps [styfle/cancel-workflow-action](https://github.com/styfle/cancel-workflow-action) from 0.9.1 to 0.10.0.
- [Release notes](https://github.com/styfle/cancel-workflow-action/releases)
- [Commits](https://github.com/styfle/cancel-workflow-action/compare/0.9.1...0.10.0)
---
updated-dependencies:
- dependency-name: styfle/cancel-workflow-action
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-06-27 17:17:21 +00:00
dependabot[bot]
11777fab8c
Bump actions/setup-python from 3 to 4
...
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 3 to 4.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v3...v4)
---
updated-dependencies:
- dependency-name: actions/setup-python
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-06-13 17:09:30 +00:00
Sharad Vikram
143ed40a78
Add collect_profile script
2022-06-03 17:56:17 -07:00
Alex Alemi
4ca92edaec
Remove pull request notification.
...
PiperOrigin-RevId: 452280419
2022-06-01 06:20:09 -07:00
Alex Alemi
e8a92f33b0
Remove bodies from the notification messages as they tend to break bash syntax.
...
PiperOrigin-RevId: 452068897
2022-05-31 09:39:25 -07:00
Alex Alemi
57da8d941b
Attempt again to fix the YAML syntax issues in the pull request notifier and the trigger for the broken-main notifier.
2022-05-27 16:40:48 -04:00
Alex Alemi
8e34061739
Fixing the interpolation errors and yaml syntax errors.
2022-05-27 14:07:21 -04:00
Alex Alemi
6c7542e4a3
adding notification workflows.
2022-05-27 12:19:47 -04:00
Jeppe Klitgaard
a11f15e3ec
feat: officially support Python 3.10
2022-05-07 13:43:12 +01:00
dependabot[bot]
bff0845794
Bump actions/cache from 2 to 3
...
Bumps [actions/cache](https://github.com/actions/cache) from 2 to 3.
- [Release notes](https://github.com/actions/cache/releases)
- [Commits](https://github.com/actions/cache/compare/v2...v3)
---
updated-dependencies:
- dependency-name: actions/cache
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-03-21 17:13:18 +00:00
dependabot[bot]
16b147861d
Bump actions/checkout from 2 to 3
...
Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 3.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v2...v3)
---
updated-dependencies:
- dependency-name: actions/checkout
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-03-07 17:12:34 +00:00
dependabot[bot]
680c06ddc4
Bump actions/setup-python from 2 to 3
...
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 2 to 3.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v2...v3)
---
updated-dependencies:
- dependency-name: actions/setup-python
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-02-28 17:11:41 +00:00
Jake VanderPlas
f2222bb1cf
CI: error if docstring rewrite fails
2022-02-07 14:43:00 -08:00
Yash Katariya
0bb7d204ab
Move serialization/de-serialization of GDA into jax.
...
PiperOrigin-RevId: 414607092
2021-12-06 20:05:02 -08:00
Peter Hawkins
70b8a6a806
Add a prototype IREE backend for JAX.
...
This is to support experimentation with the combination of JAX/IREE. Many things do not work yet.
PiperOrigin-RevId: 409980064
2021-11-15 07:57:04 -08:00
Peter Hawkins
8f6e077d9a
Adds an initial prototype of an alternate JAX compilation path that emits the MLIR MHLO/CHLO dialects instead of classic XLA HLO.
...
This lowering is missing a number of features, but it is complete enough that many tests pass, and that I would like to start checking it in.
PiperOrigin-RevId: 409134016
2021-11-11 06:37:12 -08:00