Change the JAX type promotion table to prefer inexact types during type promotion.
NumPy's type promotion rules tend to promote aggressively to float64, which isn't accelerator-friendly: not all accelerators (e.g., TPUs) support 64-bit floating point types, and even on accelerators that do (e.g., GPUs), promotion to a 64-bit type comes with a significant performance cost.
This change makes JAX's type promotion between inexact and exact types closer to PyTorch's promotion semantics, which are a better fit for modern accelerators. For example:
```
import numpy as onp
from jax import numpy as np
In [1]: onp.promote_types(onp.float32, onp.int32)
Out[1]: dtype('float64')
In [2]: onp.promote_types(onp.float16, onp.int64)
Out[2]: dtype('float64')
In [3]: np.promote_types(onp.float32, onp.int32)
Out[3]: dtype('float32')
In [4]: np.promote_types(onp.float16, onp.int64)
Out[4]: dtype('float16')
```
This change is in preparation for enabling x64 mode by default on all platforms.
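For reference, here is a minimal sketch of opting in to x64 mode today via the existing `jax_enable_x64` flag (it can also be set with the `JAX_ENABLE_X64` environment variable); the array example is illustrative:
```
# Sketch: opting in to 64-bit values with the jax_enable_x64 flag.
# The flag must be set before any JAX computations run.
from jax.config import config
config.update("jax_enable_x64", True)

from jax import numpy as np

# With x64 enabled, the default floating point dtype is float64.
x = np.ones(3)
print(x.dtype)  # float64
```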
* Moved all notebooks to docs/notebooks.
Now that all notebooks are in the same place, they are all subject to auto-doc generation at readthedocs.io and to automated testing with Travis.
Some notebooks are too slow to execute, so they are excluded via `exclude_patterns` in docs/conf.py (see the sketch after this list).
Cleaned up the section headings in the notebooks so that they render well on readthedocs.io.
* Increased the cell timeout for executing notebooks
* Also excluded the neural network notebook from auto-generation (it was timing out)
* Disabled the score_matching notebook in auto-doc generation (Travis does not have sklearn)
* Cleaned up use of section levels
* Renamed `ma` to `multiply_add` and `sq_add` to `square_add`
* Other minor clarifications
* Separated the Colabs into Tutorials and Advanced Tutorials
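A minimal sketch of the docs/conf.py settings involved; the notebook filenames and the timeout value are illustrative, and this assumes nbsphinx executes the notebooks:
```
# docs/conf.py (sketch; filenames and timeout value are illustrative)

# Notebooks excluded from auto-doc generation:
exclude_patterns = [
    "notebooks/neural_network_and_data_loading.ipynb",  # too slow, times out
    "notebooks/score_matching.ipynb",  # needs sklearn, not installed on Travis
]

# Per-cell execution timeout in seconds (assuming nbsphinx runs the notebooks):
nbsphinx_timeout = 600
```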
Using threading within a traced context still won't work, but that is perhaps less important than the ability to call jit-compiled computations from separate threads.
(Revives https://github.com/google/jax/pull/734.)
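A minimal sketch of the usage pattern this enables; the function and the worker setup are illustrative:
```
# Sketch: invoking a jit-compiled function concurrently from several threads.
import threading

import jax
from jax import numpy as np

@jax.jit
def f(x):
    return np.sin(x) + x * 2.0

def worker(i):
    x = np.arange(1000.0) + i
    # Each thread calls the same compiled computation independently.
    print(f(x).sum())

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```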