* improve an escaped tracer error message
Before this commit, encountering an escaped tracer in a specific way
would lead to an opaque internal error. This change
1. raises an UnexpectedTracerError instead, and
2. includes in the error message the user source line that created the
tracer.
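For context, a minimal hypothetical reproduction of the failure mode (not
code from this change): a tracer escapes a jitted function through a Python
side effect, and using it later raises the improved error.
```python
import jax

leaked = []

@jax.jit
def f(x):
    leaked.append(x)  # x is a tracer during tracing; appending leaks it
    return x + 1

f(1.0)
leaked[0] + 1  # raises UnexpectedTracerError, citing the line that made x
```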
* deflake
* replace _live property with _assert_live method
Thanks @jekbradbury!
This is normally unnecessary, because the XLA translation usually
doesn't bind any of the primitives in the jaxpr, but this is not true in
the case of scan! Its translation rule re-evaluates the jaxpr as a
function, and if it contains collectives such as `axis_index` it can
fail due to the axis being missing.
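For concreteness, a sketch of the scan-plus-collective interaction described
above (assuming the vmap collective support added elsewhere in this change
set):
```python
import jax
import jax.numpy as jnp
from jax import lax

def body(carry, _):
    # A collective inside the scanned jaxpr: when scan's translation rule
    # re-evaluates this body, the axis 'i' must still be bound.
    return carry + lax.axis_index('i'), None

def f(x):
    out, _ = lax.scan(body, x, None, length=3)
    return out

print(jax.vmap(f, axis_name='i')(jnp.zeros(4)))  # [0., 3., 6., 9.]
```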
Some of the vmap and gmap collective tests have been failing on master,
and I can't seem to reproduce the failures locally. Hopefully, if
this happens again, this extra bit of information will be useful in
debugging the problem.
* applied simple find+sed for 'master' -> 'main'
* Rename master->main in JAX API and internals (#4178)
* Started with #4174
* Renamed Trace.master to Trace.main
* Renamed core.new_master and core.new_base_master
Co-authored-by: George Necula <gcnecula@gmail.com>
This allows executing collectives over the gmapped axes. This requires
some extra manipulation of the gmapped jaxpr, since gmap exposes a
single logical axis name but evaluates the program using multiple
"physical" axes.
This also fixes some bugs around handling `multiple_returns` in the
vmap collective implementation.
Before this change, there were two versions of the axis_index primitive:
one used with omnistaging and one without. That made bookkeeping hard and
buggy. This change defines the axis_index_p primitive in core.py. Some of
its rules are still changed when omnistaging is enabled.
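As a usage reminder, `lax.axis_index` is the user-facing wrapper around
`axis_index_p` (a minimal sketch):
```python
import jax
import jax.numpy as jnp
from jax import lax

# Each mapped instance observes its own position along the axis named 'i'.
out = jax.pmap(lambda x: x + lax.axis_index('i'), axis_name='i')(
    jnp.zeros(jax.device_count()))
# On n devices: [0., 1., ..., n-1.]
```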
In the original usage of TypedJaxpr, literals could not be tracers
because TypedJaxprs were only produced by initial-style transformations
of jaxprs. But TypedJaxpr is now used in several other ways, e.g. in
make_jaxpr, and moreover its avals are redundant. It should probably be
renamed ClosedJaxpr, since it mainly serves to package a jaxpr together
with its constant arrays. The check that literals are not tracers was
limiting the utility of TypedJaxpr, and it was only added relatively
recently anyway.
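For example, `make_jaxpr` returns this jaxpr-plus-constants package (a hedged
sketch; `literals` is the field holding the packaged constants):
```python
import jax
import jax.numpy as jnp

typed = jax.make_jaxpr(lambda x: x + jnp.ones(3))(jnp.zeros(3))
print(typed.jaxpr)     # the underlying jaxpr
print(typed.literals)  # the constant arrays packaged alongside it
```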
* Add experimental __array_module__ method
xref https://github.com/google/jax/issues/1565
`__array_module__` (see [NEP 37](https://numpy.org/neps/nep-0037-array-module.html))
is an experimental alternative to `__array_function__` and `__array_ufunc__`
for "duck array" compatibility with NumPy that promises to be much less
invasive.
Example usage:
```python
import numpy as np

def duckarray_stack(arrays):
    """This "stack" function should work with any array library, including JAX."""
    npx = np.get_array_module(*arrays)
    arrays = [npx.asarray(arr) for arr in arrays]
    shapes = {arr.shape for arr in arrays}
    if len(shapes) != 1:
        raise ValueError('all input arrays must have the same shape')
    expanded_arrays = [arr[npx.newaxis, ...] for arr in arrays]
    return npx.concatenate(expanded_arrays, axis=0)
```
Support for this protocol has *not* yet been implemented in NumPy, but it can
be tested with https://github.com/seberg/numpy-dispatch.
My reasoning for merging it into JAX (on an experimental basis with no
guarantees, of course) is that:
1. It's not invasive -- the implementation is small and self-contained.
2. No backwards compatibility issues. Unlike `__array_function__` and
`__array_ufunc__`, `__array_module__` will always require an explicit
opt-in by libraries that use it by calling `get_array_module()`.
3. Other NumPy developers
[want evidence](https://github.com/numpy/numpy/pull/16935#issuecomment-673951287)
that this is actually feasible.
4. Scikit-Learn developers like @thomasjpfan are interested in exploring
supporting scikit-learn on top of NumPy-like libraries like JAX, and
experimental support for this protocol will make that easier.
Note: this PR does add `numpy-dispatch` as an optional testing requirement in
order to verify that this works. If desired, we could remove this from CI, but
installing numpy-dispatch (and its build requirement Cython) appears to add
only a few seconds of build time.
* don't explicitly list cython
* remove UnshapedArray from _JAX_ARRAY_TYPES
* Remove incorrect note about metaclasses
* remove unnecessary numpy_dispatch.ensure_dispatching()
This adds support for the basic (associative and commutative)
collectives to vmap. Supporting more complex collectives will
require more complicated rules. Also, at the moment it is not
possible to use collectives inside `custom_vjp` rules, which we might
want to fix in the future.
This feature is also omnistaging-only.
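A minimal example of one of the basic collectives this enables (`psum` is
associative and commutative):
```python
import jax
import jax.numpy as jnp
from jax import lax

def normalize(x):
    # psum reduces across the vmapped axis named 'i'.
    return x / lax.psum(x, axis_name='i')

print(jax.vmap(normalize, axis_name='i')(jnp.arange(4.0)))
# [0.         0.16666667 0.33333334 0.5       ]
```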
Co-authored-by: Matthew Johnson <mattjj@google.com>
This change, when enabled, stages out all primitive calls in the dynamic
scope of a jitted, pmapped, or control-flow function, rather than staging
out based only on data dependence. One improvement is that jitted
functions can consume less memory, by avoiding instantiating large
constants at trace time, and cause less memory fragmentation as well. It
also simplifies several internals.
See https://github.com/google/jax/pull/3370 for more information.
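To make the trace-time constant point concrete (a hypothetical sketch, not
code from the change itself):
```python
import jax
import jax.numpy as jnp

@jax.jit
def f(x):
    # Without omnistaging, this constant is materialized in device memory
    # at trace time because it has no data dependence on x; with
    # omnistaging, the ones() call itself is staged into the jaxpr.
    big = jnp.ones((1000, 1000))
    return x * big
```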
* moved check_jaxpr code around to match eval_jaxpr
This change is mostly stylistic; it brings check_jaxpr closer to
eval_jaxpr (and the other jaxpr interpreters) in organization. There's
also a slight tweak to an error message, which lets us drop some
slightly redundant code.
* fixes and tweaks
revert find_top_trace change from #3197
The previous version was written and tested for performance; the revised
version caused at least a 25% slowdown in the dispatch time of
`lax.add(1, 2)` (and so likely a much bigger slowdown for the
find_top_trace timing alone).
Instead, we can just change the error message in xla.abstractify, since
invalid types lead to abstractification errors when we apply primitive
impls.
For a computation of the form:
>>> f = lambda x: x ** 2
>>> f = jax.jit(f)
>>> while run:
...   x = f(x)
JAX must currently always have two copies of `x` in device memory since there
is no reliable way in Python to determine whether there will be future uses of
`x`. This causes two classes of problem:
1. Users at the limit of available device memory are constrained by the
additional copy of their parameters and other state, while they typically
only require one copy. Eliminating the extra copy typically frees 100M+ of
device memory and is a critical optimization for larger models to match
state-of-the-art performance in other frameworks.
2. This constant alloc/free of the input/output buffers can cause memory
fragmentation on some platforms (although having a reusing allocator and
limiting run-ahead may be a better solution for this problem).
We propose fixing this by using input/output aliasing as supported by XLA. We
will support this in JAX by allowing certain arguments of jit/pmap decorated
functions to be donated and reused as outputs:
>>> f = lambda x: x ** 2
>>> f = jit(f, donate_argnums=0)
>>> while run:
...   x = f(x)
JAX will determine that the donated input `x` can alias with the output of the
function, and it will instruct XLA that it _must_ write the result to this
buffer. If a user tries to reuse a buffer after it has been donated, they get
an error that the buffer is invalid:
>>> y = f(x)
>>> jax.device_get(x)
...
RuntimeError: Invalid argument: CopyToHostAsync() called on invalid buffer.
The semantics of `donate_argnums` follow those of `static_argnums`: the
parameter identifies positional arguments to the computation that are to be
donated to the computation and used as part of the output.
One feature that is also enabled by this is invalidating buffers that should
only be used once, for example PRNGKeys:
>>> @partial(jit, donate_argnums=0)
... def move(x):
...   # Do something complex enough for JAX to just optimize it away.
...   return tree_map(lambda x: x + x - x, x)
>>> def safe_eager_uniform(key, *a, **k):
...   assert hasattr(key, 'device_buffer'), "random must run eagerly"
...   key = move(key)
...   return jax.random.uniform(key, *a, **k)
This is not a complete answer to random safety, since it is still possible to
reuse a key as part of a traced computation; however, it can be used to support
this feature (somewhat inefficiently) in eager mode.
* Added argument check to all primitives.
The issue that inspired this is that `lax.tie_in` is easy to misuse: if
the first argument is not a JAX type, it silently disappears. This means
that `lax.tie_in((x, x), const)` is the same as `const`, even though `x`
is a tracer.
Previously this error would be caught only if core.skip_checks == False,
because then `bind` checks its arguments. I have essentially made that
argument check in `bind` unconditional.
In case this is considered too inefficient, we can instead add argument
checking to individual primitives, e.g., tie_in. For most primitives,
if a non-JAX array is passed, the `impl` rule would fire and `numpy`
would report the error somehow.
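As a rough illustration of what the unconditional check in `bind` catches (a
hypothetical example; the exact error text may differ by version):
```python
from jax import lax

# A tuple is not a valid JAX type, so the argument check raises a
# TypeError rather than letting the value silently slip through.
lax.add((1.0, 2.0), 3.0)  # TypeError: ... is not a valid JAX type
```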
* Merged find_top_trace with check_args
This was previously merged as #2948 but reverted pending fixes in some
user code.