Frederic Bastien
0a55822c87
The date trick doesn't work, so try to sort instead.
2023-04-12 07:43:13 -07:00
Frederic Bastien
951f5c5ad1
Try to fix strange error where DATE ends up being empty. All is working locally.
2023-04-11 12:06:05 -07:00
Yu-Hang "Maxin" Tang
d8592ebd47
Fix typo in ci config
...
There appears to be a typo since branch refs are stored in `refs/heads` (plural) directory.
2023-04-08 10:25:15 -07:00
jax authors
830d41d5f8
Merge pull request #15441 from nouiz:nightly_ci_use_the_nightly_whl
...
PiperOrigin-RevId: 522632873
2023-04-07 10:49:05 -07:00
Skye Wanderman-Milne
74a5c0d125
Add nightly to TPU test matrix
2023-04-06 23:13:55 +00:00
Frederic Bastien
dbc8f7bba1
Use the nightly whl in the CI.
2023-04-06 11:06:02 -07:00
Yash Katariya
d27a80dbfa
Rename gda_serialization to array_serialization but keep gda_serialization around until it is included in a jax release so that OSS projects can be moved to array_serialization
...
PiperOrigin-RevId: 521055760
2023-03-31 18:07:51 -07:00
Skye Wanderman-Milne
6927e5d4bf
Add 1-hour timeout to each Cloud TPU CI job.
...
Sometimes they hang, and the default timeout is 6 hours, which is way too long.
2023-03-29 14:23:45 -07:00
Skye Wanderman-Milne
ef5e4a4035
Remove 'pjrt_c_api_unimplemented' pytest mark.
...
Instead, we skip tests that the PJRT C API doesn't support. We had
this tag for feature development so it was easy to broadly disable,
but now we don't expect to need to do that.
2023-03-24 23:14:54 +00:00
jax authors
61a5686f51
Merge pull request #15175 from nouiz:nightly_ci_keep_alive
...
PiperOrigin-RevId: 519167122
2023-03-24 09:52:49 -07:00
Frederic Bastien
229a4cfdb4
remove another dependency not currently needed.
2023-03-24 08:04:27 -07:00
Frederic Bastien
f3be75cb53
WAR ssh timeout like:
...
client_loop: send disconnect: Broken pipe
https://github.com/google/jax/actions/runs/4500333187/jobs/7919324156#step:8:42
2023-03-23 11:58:57 -07:00
Frederic Bastien
406fd791a7
Add missing file
2023-03-23 07:44:40 -07:00
Frederic Bastien
21cbe7b78e
WAR the dependency issue in the nightly CI container.
2023-03-22 12:30:48 -07:00
Lee J.O'Riordan
1d2183e8c1
Ensure wheel cached only on release
2023-03-15 14:00:18 -04:00
Lee J.O'Riordan
94235fc9cd
Add support for Github Actions build of jaxlib on Winx64
2023-03-15 14:00:04 -04:00
Geoffrey Martin-Noble
32d11fd8ff
Update links from iree-org/iree to openxla/iree
...
As described in https://github.com/openxla/iree/issues/12102, IREE has
moved to the openxla GitHub organization as part of joining the OpenXLA
project.
PiperOrigin-RevId: 511350414
2023-02-21 17:40:05 -08:00
Skye Wanderman-Milne
1476a85225
Add --pre to nightly libtpu pip install command.
...
This is necessary to make sure we pick up the nightly "dev" versions
hosted on GCP and not the fake package at
https://pypi.org/project/libtpu-nightly/.
2023-02-21 23:38:39 +00:00
Skye Wanderman-Milne
93cd07efb8
Add PJRT C API to Cloud TPU test matrix
...
Also shortens the job names so the full name is visible from the
github UI (this was driving me crazy), and marks a new test that can't
be run on the PJRT C API yet.
Example run: https://github.com/google/jax/actions/runs/4019968334
2023-01-27 01:06:21 +00:00
Skye Wanderman-Milne
582578220d
[TPU CI] Send chat notification on cancellation as well as failure.
...
In particular, this makes it notify on timeouts (which usually
indicates a test hang, but should be addressed in any case).
2023-01-25 22:12:30 +00:00
Jake VanderPlas
e24a0e5bf2
CI: adjust permissions for upstream-nightly build
2023-01-20 14:01:48 -08:00
Jake VanderPlas
25c9621295
CI: update deprecated uses of set-output
2023-01-20 09:51:05 -08:00
Leopold Cambier
056702c1cb
Multinodes CICD on GPUs using on-demand cluster and e2e tests using T5X
2023-01-17 16:29:30 -08:00
Jake VanderPlas
aa34ea7b1c
JAX github actions: Update Python versions in test matrix for better coverage
...
PiperOrigin-RevId: 496501739
2022-12-19 15:10:33 -08:00
Jake VanderPlas
ad40b0842d
CI: use Python 3.11 for upstream-nightly action
2022-12-19 10:47:09 -08:00
Yash Katariya
e6e1836711
Copybara import of the project:
...
--
20080434922caf49181c456785ab78b90a4907e3 by Anselm Levskaya <levskaya@google.com>:
Revert to old test runners to investigate runner queue failure.
PiperOrigin-RevId: 496099919
2022-12-17 09:28:51 -08:00
Jake VanderPlas
9f811ba54d
Address drastic slowdown in mypy runtime
2022-12-16 14:48:26 -08:00
Anselm Levskaya
2008043492
Revert to old test runners to investigate runner queue failure.
2022-12-15 18:51:34 -08:00
Skye Wanderman-Milne
8d4b50e397
[TPU CI] Run build matrix on v3-8 as well as v4-8
...
We're seeing failures on v3-8 that don't appear on the current v4-8
testing. v3-8 also exposes 8 devices (vs. v4-8 exposes 4), and some
tests need 8 devices to run.
I just added a v3-8 runner VM.
Also adds a missing pip install command (I only caught this with a
fresh runner since it only needs to be installed once).
2022-12-09 22:32:09 +00:00
jax authors
23b808f7d0
Merge pull request #13446 from google:maxfail
...
PiperOrigin-RevId: 493414635
2022-12-06 14:34:01 -08:00
Jake VanderPlas
cb62a31653
Drop support for Python 3.7
2022-11-29 15:01:47 -08:00
Jake VanderPlas
1647c5960e
CI: bump timeout for pre-commit
2022-11-28 13:26:44 -08:00
Anselm Levskaya
074e4ec813
Enable faster test-runners for PR/push CI runs.
2022-11-23 14:07:08 -08:00
Skye Wanderman-Milne
246614ed5c
Add --maxfail=20 to Cloud TPU CI.
...
This prevents spamming the test output with 100s of failures when something fundamental is broken.
Also updates some `python3` commands to use `python` for consistency.
2022-11-23 00:47:54 +00:00
jax authors
dd902fde21
Merge pull request #13317 from google:xdist_tpu
...
PiperOrigin-RevId: 490366370
2022-11-22 16:40:00 -08:00
Roy Frostig
35634fcc2a
exercise config.jax_threefry_partitionable in one of the CI runs
2022-11-21 15:30:58 -08:00
Skye Wanderman-Milne
120125f3dd
Make pytest-xdist work on TPU and update Cloud TPU CI.
...
This change also marks multiaccelerator test files in a way pytest can
understand (if pytest is installed).
By running single-device tests on a single TPU chip, running the test
suite goes from 1hr 45m to 35m (both timings are running slow tests).
I tried using bazel at first, which already supported parallel
execution across TPU cores, but somehow it still takes 2h 20m! I'm not
sure why it's so slow. It appears that bazel creates many new test
processes over time, vs. pytest reuses the number of processes
initially specified, and starting and stopping the TPU runtime takes a
few seconds so that may be adding up. It also appears that
single-process bazel is slower than single-process pytest, which I
haven't looked into yet.
2022-11-18 22:05:13 +00:00
Skye Wanderman-Milne
0a886c34fa
Include which jaxlib/libtpu version failed (latest or nightly) in TPU CI chat notification
2022-11-16 21:38:36 +00:00
Skye Wanderman-Milne
b4564a2a57
TPU CI: don't notify when testing the workflow from a branch
2022-11-16 21:27:24 +00:00
Skye Wanderman-Milne
8bed9bac81
Update Github Actions workflows using Ratchet
...
https://opensource.google/documentation/reference/github/services#actions
mandates using a specific commit for non-Google actions in workflow
files. I used https://github.com/sethvargo/ratchet to update all our
workflow files. Example command: `ratchet pin cloud-tpu-ci-nightly.yml`
Ratchet appears to also auto-format the YAML files. It makes the diff
confusing but I'm ok with the final result.
2022-11-16 18:45:59 +00:00
Yash Katariya
a419e1917a
Use jax.Array by default for doctests
...
PiperOrigin-RevId: 488719467
2022-11-15 11:52:22 -08:00
Skye Wanderman-Milne
5da7976093
Send message to internal chat room on Cloud TPU CI failure
2022-11-14 19:44:45 +00:00
Skye Wanderman-Milne
52775c42e4
Add .github/workflows/self_hosted_runner_utils/README.md
...
This was meant to be part of https://github.com/google/jax/pull/13000, oops
2022-11-04 17:12:54 +00:00
jax authors
3db2a59f76
Merge pull request #13097 from jakevdp:actions-permissions
...
PiperOrigin-RevId: 486160888
2022-11-04 09:36:32 -07:00
Skye Wanderman-Milne
8c22e34e22
Add Github Actions workflow that runs on a self-hosted TPU VM runner.
...
This also includes some utilities for setting up the self-hosted
runner. Googlers, see go/jax-self-hosted-runners for more setup info.
The workflow is pretty basic currently. We can and should add more
functionality later, such as email notifications. I kept it simple
here for easier reviewing.
Testing:
- Sample workflow run in my fork: https://github.com/skye/jax/actions/runs/3333614180
- Sample PR attempt: (will add soon but I did verify validate_job.sh blocks pull_request workflows)
2022-11-03 21:15:57 +00:00
Jake VanderPlas
8057e2805b
CI: set explicit permissions for ci-build action
2022-11-03 13:21:58 -07:00
dependabot[bot]
cef5f20dbb
Bump styfle/cancel-workflow-action from 0.10.1 to 0.11.0
...
Bumps [styfle/cancel-workflow-action](https://github.com/styfle/cancel-workflow-action) from 0.10.1 to 0.11.0.
- [Release notes](https://github.com/styfle/cancel-workflow-action/releases)
- [Commits](https://github.com/styfle/cancel-workflow-action/compare/0.10.1...0.11.0)
---
updated-dependencies:
- dependency-name: styfle/cancel-workflow-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-10-17 17:18:44 +00:00
jax authors
6c3c51e8f3
Merge pull request #12591 from sudhakarsingh27:add_pytest_run_for_jaxlib_release
...
PiperOrigin-RevId: 478608240
2022-10-03 14:34:32 -07:00
dependabot[bot]
8f71b03662
Bump styfle/cancel-workflow-action from 0.10.0 to 0.10.1
...
Bumps [styfle/cancel-workflow-action](https://github.com/styfle/cancel-workflow-action) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/styfle/cancel-workflow-action/releases)
- [Commits](https://github.com/styfle/cancel-workflow-action/compare/0.10.0...0.10.1)
---
updated-dependencies:
- dependency-name: styfle/cancel-workflow-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2022-10-03 17:11:55 +00:00
Sudhakar
4fbc9a10d1
Add multihost GPU CI run with last public jaxlib release
2022-09-29 17:06:56 -07:00