rocm_jax

mirror of https://github.com/ROCm/jax.git synced 2025-04-16 03:46:06 +00:00

Author	SHA1	Message	Date
Peter Hawkins	14cb7453f0	Add a C++ implementation of a topological sort. This is an exact port of the current Python implementation to C++ for speed. I am being careful not to change the topological order we return in any way in this change, although we may do so in a future change. PiperOrigin-RevId: 737014989	2025-03-14 16:04:25 -07:00
Peter Hawkins	8ab33669e2	Add a variant of safe_map() that has no return value, named foreach(). This avoids a bunch of list bookkeeping in cases where we are iterating only for a side effect and do not care about the results. I would have named this iter() to match OCaml's list.iter(), but unfortunately iter() is a Python builtin. PiperOrigin-RevId: 736859418	2025-03-14 07:42:48 -07:00
Peter Hawkins	0389d617c8	Add a unittest test extension that runs test cases in parallel using threads. This change does not yet do the work necessary to make any tests pass with threading enabled, which will come in future changes. This approach is broadly inspired by `a6d205dd4c/testtools/testsuite.py (L113)` and by unittest-ft. We add a custom TestResult class that batches up any test result actions and applies them under a lock. We also add a custom TestSuite class that runs individual test cases in parallel using a thread-pool. We need a reader-writer lock to implement a `@jtu.thread_hostile_test` decorator, which we do by adding bindings around absl::Mutex to jaxlib. PiperOrigin-RevId: 713312937	2025-01-08 09:11:47 -08:00
Peter Hawkins	70b7d50181	Switch jaxlib to use nanobind instead of pybind11. nanobind has a number of advantages (https://nanobind.readthedocs.io/en/latest/why.html), notably speed of compilation and dispatch, but the main reason to do this for these bindings is because nanobind can target the Python Stable ABI starting with Python 3.12. This means that we will not need to ship per-Python version CUDA plugins starting with Python 3.12. PiperOrigin-RevId: 559898790	2023-08-24 16:07:56 -07:00
Peter Hawkins	bfa113ba60	Remove references to Python 3.8. Remove the old build scripts/Dockerfile, since they are unused and broken. PiperOrigin-RevId: 542870354	2023-06-23 08:48:57 -07:00
Peter Hawkins	74384e6a87	Add a C++ safe_zip implementation. Benchmark results on my workstation:
```
name                                 old cpu/op   new cpu/op   delta
safe_zip/arg_lengths:0/num_args:1    1.22µs ± 1%  0.28µs ± 8%  -77.33%  (p=0.008 n=5+5)
safe_zip/arg_lengths:1/num_args:1    1.28µs ± 1%  0.34µs ± 6%  -73.18%  (p=0.008 n=5+5)
safe_zip/arg_lengths:2/num_args:1    1.28µs ± 1%  0.38µs ± 5%  -70.26%  (p=0.008 n=5+5)
safe_zip/arg_lengths:5/num_args:1    1.38µs ± 1%  0.51µs ± 3%  -63.26%  (p=0.008 n=5+5)
safe_zip/arg_lengths:10/num_args:1   1.61µs ± 1%  0.69µs ± 3%  -56.93%  (p=0.008 n=5+5)
safe_zip/arg_lengths:100/num_args:1  5.39µs ± 1%  3.83µs ± 2%  -29.03%  (p=0.008 n=5+5)
safe_zip/arg_lengths:0/num_args:2    1.46µs ± 1%  0.32µs ± 4%  -78.30%  (p=0.008 n=5+5)
safe_zip/arg_lengths:1/num_args:2    1.52µs ± 1%  0.39µs ± 4%  -74.20%  (p=0.008 n=5+5)
safe_zip/arg_lengths:2/num_args:2    1.53µs ± 1%  0.44µs ± 4%  -71.38%  (p=0.008 n=5+5)
safe_zip/arg_lengths:5/num_args:2    1.66µs ± 2%  0.60µs ± 3%  -63.96%  (p=0.008 n=5+5)
safe_zip/arg_lengths:10/num_args:2   1.90µs ± 1%  0.82µs ± 3%  -56.66%  (p=0.008 n=5+5)
safe_zip/arg_lengths:100/num_args:2  6.51µs ± 1%  4.80µs ± 0%  -26.23%  (p=0.016 n=5+4)
safe_zip/arg_lengths:0/num_args:3    1.62µs ± 1%  0.36µs ± 4%  -77.95%  (p=0.008 n=5+5)
safe_zip/arg_lengths:1/num_args:3    1.68µs ± 1%  0.44µs ± 3%  -73.75%  (p=0.008 n=5+5)
safe_zip/arg_lengths:2/num_args:3    1.69µs ± 1%  0.50µs ± 3%  -70.48%  (p=0.008 n=5+5)
safe_zip/arg_lengths:5/num_args:3    1.83µs ± 1%  0.68µs ± 2%  -62.73%  (p=0.008 n=5+5)
safe_zip/arg_lengths:10/num_args:3   2.12µs ± 1%  0.96µs ± 1%  -54.71%  (p=0.008 n=5+5)
safe_zip/arg_lengths:100/num_args:3  7.34µs ± 2%  5.89µs ± 1%  -19.74%  (p=0.008 n=5+5)
```
In addition, improve the length mismatch error for safe_map and define __module__ on both functions. PiperOrigin-RevId: 523475834	2023-04-11 12:43:04 -07:00
Peter Hawkins	0dbd467cea	Add a C++ implementation of safe map. Before (argument names reversed, oops, fixed in code):
```
name                                 time/op
safe_map/num_args:0/arg_lengths:1    1.43µs ± 1%
safe_map/num_args:1/arg_lengths:1    1.61µs ± 1%
safe_map/num_args:2/arg_lengths:1    1.72µs ± 0%
safe_map/num_args:5/arg_lengths:1    2.14µs ± 1%
safe_map/num_args:10/arg_lengths:1   2.87µs ± 1%
safe_map/num_args:100/arg_lengths:1  15.6µs ± 1%
safe_map/num_args:0/arg_lengths:2    1.65µs ± 0%
safe_map/num_args:1/arg_lengths:2    1.83µs ± 1%
safe_map/num_args:2/arg_lengths:2    1.97µs ± 1%
safe_map/num_args:5/arg_lengths:2    2.41µs ± 1%
safe_map/num_args:10/arg_lengths:2   3.22µs ± 2%
safe_map/num_args:100/arg_lengths:2  17.0µs ± 2%
safe_map/num_args:0/arg_lengths:3    1.83µs ± 1%
safe_map/num_args:1/arg_lengths:3    2.02µs ± 1%
safe_map/num_args:2/arg_lengths:3    2.16µs ± 1%
safe_map/num_args:5/arg_lengths:3    2.63µs ± 1%
safe_map/num_args:10/arg_lengths:3   3.48µs ± 1%
safe_map/num_args:100/arg_lengths:3  18.1µs ± 1%
```
After:
```
name                                 time/op
safe_map/num_args:0/arg_lengths:1    409ns ± 1%
safe_map/num_args:1/arg_lengths:1    602ns ± 5%
safe_map/num_args:2/arg_lengths:1    777ns ± 4%
safe_map/num_args:5/arg_lengths:1    1.21µs ± 3%
safe_map/num_args:10/arg_lengths:1   1.93µs ± 2%
safe_map/num_args:100/arg_lengths:1  14.7µs ± 0%
safe_map/num_args:0/arg_lengths:2    451ns ± 1%
safe_map/num_args:1/arg_lengths:2    652ns ± 0%
safe_map/num_args:2/arg_lengths:2    850ns ± 4%
safe_map/num_args:5/arg_lengths:2    1.32µs ± 3%
safe_map/num_args:10/arg_lengths:2   2.11µs ± 2%
safe_map/num_args:100/arg_lengths:2  16.0µs ± 1%
safe_map/num_args:0/arg_lengths:3    496ns ± 1%
safe_map/num_args:1/arg_lengths:3    718ns ± 5%
safe_map/num_args:2/arg_lengths:3    919ns ± 4%
safe_map/num_args:5/arg_lengths:3    1.43µs ± 2%
safe_map/num_args:10/arg_lengths:3   2.30µs ± 2%
safe_map/num_args:100/arg_lengths:3  17.3µs ± 1%
```
PiperOrigin-RevId: 523263207	2023-04-10 18:09:56 -07:00

7 Commits
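
For readers unfamiliar with the helpers these commits mention, the following is a minimal pure-Python sketch of what safe_zip(), safe_map(), and foreach() plausibly do, based only on the commit messages above: zip and map variants that reject mismatched argument lengths, plus a side-effect-only variant with no return value. It is not the C++ code from these commits. The helper `_as_checked_lists`, the error message wording, and the behavior for zero iterables are assumptions made for illustration.

```python
from typing import Any, Callable, Iterable


def _as_checked_lists(name: str, args: tuple[Iterable[Any], ...]) -> list[list[Any]]:
    # Hypothetical helper: materialize every argument and verify equal lengths.
    lists = [list(a) for a in args]
    for i, lst in enumerate(lists[1:], start=2):
        if len(lst) != len(lists[0]):
            # The message text is illustrative, not the one used by jaxlib.
            raise ValueError(
                f"{name}() argument {i} has length {len(lst)}, "
                f"but argument 1 has length {len(lists[0])}"
            )
    return lists


def safe_zip(*args: Iterable[Any]) -> list[tuple[Any, ...]]:
    """Like zip(), but returns a list and raises on a length mismatch."""
    return list(zip(*_as_checked_lists("safe_zip", args)))


def safe_map(f: Callable[..., Any], *args: Iterable[Any]) -> list[Any]:
    """Like map(), but returns a list and raises on a length mismatch."""
    return [f(*xs) for xs in zip(*_as_checked_lists("safe_map", args))]


def foreach(f: Callable[..., Any], *args: Iterable[Any]) -> None:
    """Like safe_map(), but called only for side effects; builds no result list."""
    for xs in zip(*_as_checked_lists("foreach", args)):
        f(*xs)
```

In this sketch, foreach() skips building the output list that safe_map() would allocate, which is the bookkeeping the 8ab33669e2 commit message says it avoids.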