mirror of
https://github.com/ROCm/jax.git
synced 2025-04-24 23:06:05 +00:00
1150 lines
79 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {
|
||
"id": "aPUwOm-eCSFD",
|
||
"tags": [
|
||
"remove-cell"
|
||
]
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Configure ipython to hide long tracebacks.\n",
|
||
"import sys\n",
|
||
"ipython = get_ipython()\n",
|
||
"\n",
|
||
"def minimal_traceback(*args, **kwargs):\n",
|
||
" etype, value, tb = sys.exc_info()\n",
|
||
" value.__cause__ = None # suppress chained exceptions\n",
|
||
" stb = ipython.InteractiveTB.structured_traceback(etype, value, tb)\n",
|
||
" del stb[3:-1]\n",
|
||
" return ipython._showtraceback(etype, value, stb)\n",
|
||
"\n",
|
||
"ipython.showtraceback = minimal_traceback"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "LQHmwePqryRU"
|
||
},
|
||
"source": [
|
||
"# How to Think in JAX\n",
|
||
"\n",
"[Open in Colab](https://colab.research.google.com/github/google/jax/blob/main/docs/notebooks/thinking_in_jax.ipynb)\n",
|
||
"\n",
|
||
"JAX provides a simple and powerful API for writing accelerated numerical code, but working effectively in JAX sometimes requires extra consideration. This document is meant to help build a ground-up understanding of how JAX operates, so that you can use it more effectively."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "nayIExVUtsVD"
|
||
},
|
||
"source": [
|
||
"## JAX vs. NumPy\n",
|
||
"\n",
|
||
"**Key Concepts:**\n",
|
||
"\n",
|
||
"- JAX provides a NumPy-inspired interface for convenience.\n",
"- Through duck-typing, JAX arrays can often be used as drop-in replacements for NumPy arrays.\n",
|
||
"- Unlike NumPy arrays, JAX arrays are always immutable.\n",
|
||
"\n",
"NumPy provides a well-known, powerful API for working with numerical data. For convenience, JAX provides `jax.numpy`, which closely mirrors the NumPy API and provides easy entry into JAX. Almost anything that can be done with `numpy` can be done with `jax.numpy`:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {
|
||
"id": "kZaOXL7-uvUP",
|
||
"outputId": "17a9ee0a-8719-44bb-a9fe-4c9f24649fef"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "\n",
|
||
"text/plain": [
|
||
"<Figure size 432x288 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light",
|
||
"tags": []
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"import matplotlib.pyplot as plt\n",
|
||
"import numpy as np\n",
|
||
"\n",
|
||
"x_np = np.linspace(0, 10, 1000)\n",
|
||
"y_np = 2 * np.sin(x_np) * np.cos(x_np)\n",
|
||
"plt.plot(x_np, y_np);"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {
|
||
"id": "18XbGpRLuZlr",
|
||
"outputId": "9e98d928-1925-45b1-d886-37956ca95e7c"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"image/png": "\n",
|
||
"text/plain": [
|
||
"<Figure size 432x288 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light",
|
||
"tags": []
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"import jax.numpy as jnp\n",
|
||
"\n",
|
||
"x_jnp = jnp.linspace(0, 10, 1000)\n",
|
||
"y_jnp = 2 * jnp.sin(x_jnp) * jnp.cos(x_jnp)\n",
|
||
"plt.plot(x_jnp, y_jnp);"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "kTZcsCJiuPG8"
|
||
},
|
||
"source": [
|
||
"The code blocks are identical aside from replacing `np` with `jnp`, and the results are the same. As we can see, JAX arrays can often be used directly in place of NumPy arrays for things like plotting.\n",
|
||
"\n",
|
||
"The arrays themselves are implemented as different Python types:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {
|
||
"id": "PjFFunI7xNe8",
|
||
"outputId": "e1706c61-2821-437a-efcd-d8082f913c1f"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"numpy.ndarray"
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"type(x_np)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {
|
||
"id": "kpv5K7QYxQnX",
|
||
"outputId": "8a3f1cb6-c6d6-494c-8efe-24a8217a9d55"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"jax.interpreters.xla._DeviceArray"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"type(x_jnp)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "Mx94Ri7euEZm"
|
||
},
|
||
"source": [
|
||
"Python's [duck-typing](https://en.wikipedia.org/wiki/Duck_typing) allows JAX arrays and NumPy arrays to be used interchangeably in many places.\n",
|
||
"\n",
|
||
"However, there is one important difference between JAX and NumPy arrays: JAX arrays are immutable, meaning that once created their contents cannot be changed.\n",
|
||
"\n",
|
||
"Here is an example of mutating an array in NumPy:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {
|
||
"id": "fzp-y1ZVyGD4",
|
||
"outputId": "300a44cc-1ccd-4fb2-f0ee-2179763f7690"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"[10 1 2 3 4 5 6 7 8 9]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# NumPy: mutable arrays\n",
|
||
"x = np.arange(10)\n",
|
||
"x[0] = 10\n",
|
||
"print(x)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "nQ-De0xcJ1lT"
|
||
},
|
||
"source": [
|
||
"The equivalent in JAX results in an error, as JAX arrays are immutable:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {
|
||
"id": "pCPX0JR-yM4i",
|
||
"outputId": "02a442bc-8f23-4dce-9500-81cd28c0b21f",
|
||
"tags": [
|
||
"raises-exception"
|
||
]
|
||
},
|
||
"outputs": [
|
||
{
|
||
"ename": "TypeError",
|
||
"evalue": "ignored",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
|
||
"\u001b[0;32m<ipython-input-7-6b90817377fe>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# JAX: immutable arrays\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mjnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0marange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mx\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m10\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
|
||
"\u001b[0;31mTypeError\u001b[0m: '<class 'jax.interpreters.xla._DeviceArray'>' object does not support item assignment. JAX arrays are immutable; perhaps you want jax.ops.index_update or jax.ops.index_add instead?"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# JAX: immutable arrays\n",
|
||
"x = jnp.arange(10)\n",
|
||
"x[0] = 10"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "yRYF0YgO3F4H"
|
||
},
|
||
"source": [
|
||
"For updating individual elements, JAX provides an [indexed update syntax](https://jax.readthedocs.io/en/latest/jax.ops.html#indexed-update-operators) that returns an updated copy:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {
|
||
"id": "8zqPEAeP3UK5",
|
||
"outputId": "7e6c996d-d0b0-4d52-e722-410ba78eb3b1"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"[0 1 2 3 4 5 6 7 8 9]\n",
|
||
"[10 1 2 3 4 5 6 7 8 9]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"y = x.at[0].set(10)\n",
|
||
"print(x)\n",
|
||
"print(y)"
|
||
]
|
||
},
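The `.at` property supports other functional updates besides `set`. The following is a minimal sketch, not part of the original notebook, that assumes the `add` and `multiply` update methods together with slice indexing. Each call returns a new array and leaves the original untouched:

```python
import jax.numpy as jnp

x = jnp.arange(5)  # five consecutive integers starting at 0

# Each update returns a new array; x itself is never modified.
x_set = x.at[0].set(10)        # replace a single element
x_add = x.at[1:3].add(5)       # add a value to every element of a slice
x_mul = x.at[-1].multiply(2)   # scale the last element

print(x)      # the original array, unchanged
print(x_set)
print(x_add)
print(x_mul)
```

Because every update produces a fresh array, the usual pattern is to rebind the name (`x = x.at[0].set(10)`) when the old value is no longer needed.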
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "886BGDPeyXCu"
|
||
},
|
||
"source": [
|
||
"## NumPy, lax & XLA: JAX API layering\n",
|
||
"\n",
|
||
"**Key Concepts:**\n",
|
||
"\n",
|
||
"- `jax.numpy` is a high-level wrapper that provides a familiar interface.\n",
|
||
"- `jax.lax` is a lower-level API that is stricter and often more powerful.\n",
|
||
"- All JAX operations are implemented in terms of operations in [XLA](https://www.tensorflow.org/xla/) – the Accelerated Linear Algebra compiler."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "BjE4m2sZy4hh"
|
||
},
|
||
"source": [
|
||
"If you look at the source of `jax.numpy`, you'll see that all the operations are eventually expressed in terms of functions defined in `jax.lax`. You can think of `jax.lax` as a stricter, but often more powerful, API for working with multi-dimensional arrays.\n",
|
||
"\n",
|
||
"For example, while `jax.numpy` will implicitly promote arguments to allow operations between mixed data types, `jax.lax` will not:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {
|
||
"id": "c6EFPcj12mw0",
|
||
"outputId": "730e2ca4-30a5-45bc-923c-c3a5143496e2"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray(2., dtype=float32)"
|
||
]
|
||
},
|
||
"execution_count": 9,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"import jax.numpy as jnp\n",
|
||
"jnp.add(1, 1.0) # jax.numpy API implicitly promotes mixed types."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {
|
||
"id": "0VkqlcXL2qSp",
|
||
"outputId": "601b0562-3e6a-402d-f83b-3afdd1e7e7c4",
|
||
"tags": [
|
||
"raises-exception"
|
||
]
|
||
},
|
||
"outputs": [
|
||
{
|
||
"ename": "TypeError",
|
||
"evalue": "ignored",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
|
||
"\u001b[0;32m<ipython-input-10-63245925fccf>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mjax\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mlax\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mlax\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0madd\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1.0\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# jax.lax API requires explicit type promotion.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
|
||
"\u001b[0;31mTypeError\u001b[0m: add requires arguments to have the same dtypes, got int32, float32."
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"from jax import lax\n",
|
||
"lax.add(1, 1.0) # jax.lax API requires explicit type promotion."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "aC9TkXaTEu7A"
|
||
},
|
||
"source": [
|
||
"If using `jax.lax` directly, you'll have to do type promotion explicitly in such cases:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {
|
||
"id": "3PNQlieT81mi",
|
||
"outputId": "cb3ed074-f410-456f-c086-23107eae2634"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray(2., dtype=float32)"
|
||
]
|
||
},
|
||
"execution_count": 11,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"lax.add(jnp.float32(1), 1.0)"
|
||
]
|
||
},
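When the operands are arrays rather than Python scalars, one way to do the promotion by hand is to compute a common dtype and cast both sides before calling into `jax.lax`. This is a sketch under that assumption; the variable names `a`, `b`, and `common` are illustrative and do not come from the notebook:

```python
import jax.numpy as jnp
from jax import lax

a = jnp.arange(3)   # integer dtype
b = jnp.ones(3)     # floating-point dtype

# Ask JAX which dtype its promotion rules would pick for this pair,
# then cast both operands explicitly before calling the strict lax API.
common = jnp.promote_types(a.dtype, b.dtype)
result = lax.add(lax.convert_element_type(a, common),
                 lax.convert_element_type(b, common))

print(result, result.dtype)
```

This is roughly the work that `jnp.add` does implicitly on your behalf.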
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "M3HDuM4x2eTL"
|
||
},
|
||
"source": [
|
||
"Along with this strictness, `jax.lax` also provides efficient APIs for some more general operations than are supported by NumPy.\n",
|
||
"\n",
|
||
"For example, consider a 1D convolution, which can be expressed in NumPy this way:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {
|
||
"id": "Bv-7XexyzVCN",
|
||
"outputId": "f5d38cd8-e7fc-49e2-bff3-a0eee306cb54"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray([1., 3., 4., 4., 4., 4., 4., 4., 4., 4., 3., 1.], dtype=float32)"
|
||
]
|
||
},
|
||
"execution_count": 12,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"x = jnp.array([1, 2, 1])\n",
|
||
"y = jnp.ones(10)\n",
|
||
"jnp.convolve(x, y)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "0GPqgT7S0q8r"
|
||
},
|
||
"source": [
|
||
"Under the hood, this NumPy operation is translated to a much more general convolution implemented by [`lax.conv_general_dilated`](https://jax.readthedocs.io/en/latest/_autosummary/jax.lax.conv_general_dilated.html):"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"metadata": {
|
||
"id": "pi4f6ikjzc3l",
|
||
"outputId": "b9b37edc-b911-4010-aaf8-ee8f500111d7"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray([1., 3., 4., 4., 4., 4., 4., 4., 4., 4., 3., 1.], dtype=float32)"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"from jax import lax\n",
|
||
"result = lax.conv_general_dilated(\n",
|
||
" x.reshape(1, 1, 3).astype(float), # note: explicit promotion\n",
|
||
" y.reshape(1, 1, 10),\n",
|
||
" window_strides=(1,),\n",
" padding=[(len(y) - 1, len(y) - 1)]) # equivalent of mode='full' in NumPy\n",
|
||
"result[0, 0]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "7mdo6ycczlbd"
|
||
},
|
||
"source": [
"This is a batched convolution operation designed to be efficient for the types of convolutions often used in deep neural nets. It requires much more boilerplate, but is far more flexible and scalable than the convolution provided by NumPy (see [Convolutions in JAX](https://jax.readthedocs.io/en/latest/notebooks/convolutions.html) for more detail on JAX convolutions).\n",
|
||
"\n",
|
||
"At their heart, all `jax.lax` operations are Python wrappers for operations in XLA; here, for example, the convolution implementation is provided by [XLA:ConvWithGeneralPadding](https://www.tensorflow.org/xla/operation_semantics#convwithgeneralpadding_convolution).\n",
|
||
"Every JAX operation is eventually expressed in terms of these fundamental XLA operations, which is what enables just-in-time (JIT) compilation."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "NJfWa2PktD5_"
|
||
},
|
||
"source": [
|
||
"## To JIT or not to JIT\n",
|
||
"\n",
|
||
"**Key Concepts:**\n",
|
||
"\n",
|
||
"- By default JAX executes operations one at a time, in sequence.\n",
|
||
"- Using a just-in-time (JIT) compilation decorator, sequences of operations can be optimized together and run at once.\n",
"- Not all JAX code can be JIT compiled, because JIT compilation requires array shapes to be static and known at compile time.\n",
|
||
"\n",
|
||
"The fact that all JAX operations are expressed in terms of XLA allows JAX to use the XLA compiler to execute blocks of code very efficiently.\n",
|
||
"\n",
"For example, consider this function that standardizes each column of a 2D matrix (subtracting the column mean and dividing by the column standard deviation), expressed in terms of `jax.numpy` operations:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"metadata": {
|
||
"id": "SQj_UKGc-7kQ"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import jax.numpy as jnp\n",
|
||
"\n",
|
||
"def norm(X):\n",
|
||
" X = X - X.mean(0)\n",
|
||
" return X / X.std(0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "0yVo_OKSAolW"
|
||
},
|
||
"source": [
|
||
"A just-in-time compiled version of the function can be created using the `jax.jit` transform:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"metadata": {
|
||
"id": "oHLwGmhZAnCY"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"from jax import jit\n",
|
||
"norm_compiled = jit(norm)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "Q3H9ig5GA2Ms"
|
||
},
|
||
"source": [
|
||
"This function returns the same results as the original, up to standard floating-point accuracy:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"metadata": {
|
||
"id": "oz7zzyS3AwMc",
|
||
"outputId": "914f9242-82c4-4365-abb2-77843a704e03"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"True"
|
||
]
|
||
},
|
||
"execution_count": 16,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.random.seed(1701)\n",
|
||
"X = jnp.array(np.random.rand(10000, 10))\n",
|
||
"np.allclose(norm(X), norm_compiled(X), atol=1E-6)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "3GvisB-CA9M8"
|
||
},
|
||
"source": [
"Because compilation fuses operations, avoids allocating temporary arrays, and applies a host of other optimizations, execution can be much faster in the JIT-compiled case, roughly an order of magnitude in the timings below (note the use of `block_until_ready()` to account for JAX's [asynchronous dispatch](https://jax.readthedocs.io/en/latest/async_dispatch.html)):"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"metadata": {
|
||
"id": "6mUB6VdDAEIY",
|
||
"outputId": "5d7e1bbd-4064-4fe3-f3d9-5435b5283199"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"100 loops, best of 3: 4.3 ms per loop\n",
|
||
"1000 loops, best of 3: 452 µs per loop\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"%timeit norm(X).block_until_ready()\n",
|
||
"%timeit norm_compiled(X).block_until_ready()"
|
||
]
|
||
},
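Outside a notebook, where the `%timeit` magic is not available, a similar measurement can be sketched with `time.perf_counter`. The example below is an illustration rather than the notebook's own benchmark: it redefines `norm` so that it is self-contained, generates the input with `jax.random` instead of NumPy, and makes one warm-up call so that tracing and compilation are not included in the timed call:

```python
import time

import jax
import jax.numpy as jnp

def norm(X):
    X = X - X.mean(0)
    return X / X.std(0)

norm_compiled = jax.jit(norm)
X = jax.random.uniform(jax.random.PRNGKey(0), (10000, 10))

# Warm-up call: triggers tracing and compilation so they are not timed below.
norm_compiled(X).block_until_ready()

start = time.perf_counter()
out = norm_compiled(X).block_until_ready()  # wait until the result is ready
elapsed = time.perf_counter() - start

print(f"steady-state call: {elapsed * 1e3:.3f} ms")
```

A single timed call is noisy; in practice you would repeat the measurement several times and report the best or the median.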
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "B1eGBGn0tMba"
|
||
},
|
||
"source": [
|
||
"That said, `jax.jit` does have limitations: in particular, it requires all arrays to have static shapes. That means that some JAX operations are incompatible with JIT compilation.\n",
|
||
"\n",
|
||
"For example, this operation can be executed in op-by-op mode:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"metadata": {
|
||
"id": "YfZd9mW7CSKM",
|
||
"outputId": "899fedcc-0857-4381-8f57-bb653e0aa2f1"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray([-0.10570311, -0.59403396, -0.8680282 , -0.23489487], dtype=float32)"
|
||
]
|
||
},
|
||
"execution_count": 18,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"def get_negatives(x):\n",
|
||
" return x[x < 0]\n",
|
||
"\n",
|
||
"x = jnp.array(np.random.randn(10))\n",
|
||
"get_negatives(x)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "g6niKxoQC2mZ"
|
||
},
|
||
"source": [
"But it raises an error if you attempt to execute it in jit mode:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"metadata": {
|
||
"id": "yYWvE4rxCjPK",
|
||
"outputId": "765b46d3-49cd-41b7-9815-e8bb7cd80175",
|
||
"tags": [
|
||
"raises-exception"
|
||
]
|
||
},
|
||
"outputs": [
|
||
{
|
||
"ename": "IndexError",
|
||
"evalue": "ignored",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||
"\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)",
|
||
"\u001b[0;32m<ipython-input-19-ec8799cf80d7>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mjit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mget_negatives\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
|
||
"\u001b[0;31mIndexError\u001b[0m: Array boolean indices must be concrete."
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"jit(get_negatives)(x)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "vFL6DNpECfVz"
|
||
},
|
||
"source": [
|
||
"This is because the function generates an array whose shape is not known at compile time: the size of the output depends on the values of the input array, and so it is not compatible with JIT."
|
||
]
|
||
},
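One common way around this restriction is to keep the output the same shape as the input and mask the unwanted entries instead of dropping them. The sketch below is an alternative formulation, not something the notebook itself proposes, and it changes the result: non-negative entries are replaced with zero rather than removed. The function name `keep_negatives` is hypothetical:

```python
import jax
import jax.numpy as jnp

@jax.jit
def keep_negatives(x):
    # The output has the same shape as x, so its shape is known at compile
    # time even though the values that survive depend on the data.
    return jnp.where(x < 0, x, 0.0)

x = jnp.array([-1.0, 2.0, -3.0, 4.0])
print(keep_negatives(x))
```

Whether a masked result is an acceptable substitute depends on what the downstream code does with it; reductions such as sums usually work unchanged, while code that relies on the length of the filtered array does not.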
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "BzBnKbXwXjLV"
|
||
},
|
||
"source": [
|
||
"## JIT mechanics: tracing and static variables\n",
|
||
"\n",
|
||
"**Key Concepts:**\n",
|
||
"\n",
|
||
"- JIT and other JAX transforms work by *tracing* a function to determine its effect on inputs of a specific shape and type.\n",
|
||
"\n",
"- Variables that you don't want to be traced can be marked as *static*.\n",
|
||
"\n",
|
||
"To use `jax.jit` effectively, it is useful to understand how it works. Let's put a few `print()` statements within a JIT-compiled function and then call the function:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"metadata": {
|
||
"id": "TfjVIVuD4gnc",
|
||
"outputId": "df6ad898-b047-4ad1-eb18-2fbcb3fd2ab3"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Running f():\n",
|
||
" x = Traced<ShapedArray(float32[3,4])>with<DynamicJaxprTrace(level=0/1)>\n",
|
||
" y = Traced<ShapedArray(float32[4])>with<DynamicJaxprTrace(level=0/1)>\n",
|
||
" result = Traced<ShapedArray(float32[3])>with<DynamicJaxprTrace(level=0/1)>\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray([0.25773212, 5.3623195 , 5.4032435 ], dtype=float32)"
|
||
]
|
||
},
|
||
"execution_count": 20,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"@jit\n",
|
||
"def f(x, y):\n",
|
||
" print(\"Running f():\")\n",
|
||
" print(f\" x = {x}\")\n",
|
||
" print(f\" y = {y}\")\n",
|
||
" result = jnp.dot(x + 1, y + 1)\n",
|
||
" print(f\" result = {result}\")\n",
|
||
" return result\n",
|
||
"\n",
|
||
"x = np.random.randn(3, 4)\n",
|
||
"y = np.random.randn(4)\n",
|
||
"f(x, y)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "Ts1fP45A40QV"
|
||
},
|
||
"source": [
"Notice that the print statements execute, but rather than printing the data we passed to the function, they print *tracer* objects that stand in for them.\n",
|
||
"\n",
|
||
"These tracer objects are what `jax.jit` uses to extract the sequence of operations specified by the function. Basic tracers are stand-ins that encode the **shape** and **dtype** of the arrays, but are agnostic to the values. This recorded sequence of computations can then be efficiently applied within XLA to new inputs with the same shape and dtype, without having to re-execute the Python code.\n",
|
||
"\n",
|
||
"When we call the compiled function again on matching inputs, no re-compilation is required and nothing is printed because the result is computed in compiled XLA rather than in Python:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"metadata": {
|
||
"id": "xGntvzNH7skE",
|
||
"outputId": "66694b8b-181f-4635-a8e2-1fc7f244d94b"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray([1.4344584, 4.3004413, 7.9897013], dtype=float32)"
|
||
]
|
||
},
|
||
"execution_count": 21,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"x2 = np.random.randn(3, 4)\n",
|
||
"y2 = np.random.randn(4)\n",
|
||
"f(x2, y2)"
|
||
]
|
||
},
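The caching behavior can be made visible by recording a Python side effect, which only runs while the function is being traced. The following sketch uses a hypothetical function `add_one` and a list `trace_log` that are not part of the notebook; it calls the function twice with matching shapes and once with a new shape:

```python
import jax
import jax.numpy as jnp

trace_log = []  # records the input shape each time the function is traced

@jax.jit
def add_one(x):
    trace_log.append(x.shape)  # Python side effect: runs only during tracing
    return x + 1

add_one(jnp.zeros(3))  # first call: traced and compiled
add_one(jnp.ones(3))   # same shape and dtype: compiled version is reused
add_one(jnp.zeros(4))  # new shape: traced and compiled again

print(trace_log)
```

The log contains one entry per distinct input shape, not one per call, which is the same effect as the missing `print` output above.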
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "9EB9WkRX7fm0"
|
||
},
|
||
"source": [
|
||
"The extracted sequence of operations is encoded in a JAX expression, or *jaxpr* for short. You can view the jaxpr using the `jax.make_jaxpr` transformation:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"metadata": {
|
||
"id": "89TMp_Op5-JZ",
|
||
"outputId": "151210e2-af6f-4950-ac1e-9fdb81d4aae1"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"{ lambda ; a b.\n",
|
||
" let c = add a 1.0\n",
|
||
" d = add b 1.0\n",
|
||
" e = dot_general[ dimension_numbers=(((1,), (0,)), ((), ()))\n",
|
||
" precision=None ] c d\n",
|
||
" in (e,) }"
|
||
]
|
||
},
|
||
"execution_count": 22,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"from jax import make_jaxpr\n",
|
||
"\n",
|
||
"def f(x, y):\n",
|
||
" return jnp.dot(x + 1, y + 1)\n",
|
||
"\n",
|
||
"make_jaxpr(f)(x, y)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "0Oq9S4MZ90TL"
|
||
},
|
||
"source": [
|
||
"Note one consequence of this: because JIT compilation is done *without* information on the content of the array, control flow statements in the function cannot depend on traced values. For example, this fails:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"metadata": {
|
||
"id": "A0rFdM95-Ix_",
|
||
"outputId": "d7ffa367-b241-488e-df96-ad0576536605",
|
||
"tags": [
|
||
"raises-exception"
|
||
]
|
||
},
|
||
"outputs": [
|
||
{
|
||
"ename": "ConcretizationTypeError",
|
||
"evalue": "ignored",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||
"\u001b[0;31mConcretizationTypeError\u001b[0m Traceback (most recent call last)",
|
||
"\u001b[0;32m<ipython-input-23-acbedba5ce66>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0;34m-\u001b[0m\u001b[0mx\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mneg\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
|
||
"\u001b[0;31mConcretizationTypeError\u001b[0m: Abstract tracer value encountered where concrete value is expected.\n\nThe problem arose with the bool function. \n\nWhile tracing the function f at <ipython-input-23-acbedba5ce66>:1, this concrete value was not available in Python because it depends on the value of the arguments to f at <ipython-input-23-acbedba5ce66>:1 at flattened positions [1], and the computation of these values is being staged out (that is, delayed rather than executed eagerly).\n\nYou can use transformation parameters such as static_argnums for jit to avoid tracing particular arguments of transformed functions, though at the cost of more recompiles.\n\nSee https://jax.readthedocs.io/en/latest/faq.html#abstract-tracer-value-encountered-where-concrete-value-is-expected-error for more information.\n\nEncountered tracer value: Traced<ShapedArray(bool[], weak_type=True)>with<DynamicJaxprTrace(level=0/1)>"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"@jit\n",
|
||
"def f(x, neg):\n",
|
||
" return -x if neg else x\n",
|
||
"\n",
|
||
"f(1, True)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "DkTO9m8j-TYI"
|
||
},
|
||
"source": [
|
||
"If there are variables that you would not like to be traced, they can be marked as static for the purposes of JIT compilation:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"metadata": {
|
||
"id": "K1C7ZnVv-lbv",
|
||
"outputId": "cdbdf152-30fd-4ecb-c9ec-1d1124f337f7"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray(-1, dtype=int32)"
|
||
]
|
||
},
|
||
"execution_count": 24,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"from functools import partial\n",
|
||
"\n",
|
||
"@partial(jit, static_argnums=(1,))\n",
|
||
"def f(x, neg):\n",
|
||
" return -x if neg else x\n",
|
||
"\n",
|
||
"f(1, True)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "dD7p4LRsGzhx"
|
||
},
|
||
"source": [
|
||
"Note that calling a JIT-compiled function with a different static argument results in re-compilation, so the function still works as expected:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"metadata": {
|
||
"id": "sXqczBOrG7-w",
|
||
"outputId": "3a3f50e6-d1fc-42bb-d6df-eb3d206e4b67"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray(1, dtype=int32)"
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"f(1, False)"
|
||
]
|
||
},
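Marking an argument as static is not the only option. If the flag should stay a traced value, for instance to avoid recompiling for each value, the branch can be expressed with `jax.lax.cond` so that the choice is made inside the compiled computation. This is a sketch of that alternative, with the hypothetical name `f_cond`; it is not the approach the notebook takes:

```python
import jax
from jax import lax

@jax.jit
def f_cond(x, neg):
    # Both branches are traced once; the compiled code selects one of them
    # at run time based on the value of `neg`.
    return lax.cond(neg, lambda v: -v, lambda v: v, x)

print(f_cond(1, True))
print(f_cond(1, False))
```

The trade-off is that both branches must return values of the same shape and dtype, whereas a static argument lets ordinary Python control flow build a different computation for each value.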
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "ZESlrDngGVb1"
|
||
},
|
||
"source": [
|
||
"Understanding which values and operations will be static and which will be traced is a key part of using `jax.jit` effectively."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "r-RCl_wD5lI7"
|
||
},
|
||
"source": [
|
||
"## Static vs Traced Operations\n",
|
||
"\n",
|
||
"**Key Concepts:**\n",
|
||
"\n",
|
||
"- Just as values can be either static or traced, operations can be static or traced.\n",
|
||
"\n",
|
||
"- Static operations are evaluated at compile-time in Python; traced operations are compiled & evaluated at run-time in XLA.\n",
|
||
"\n",
|
||
"- Use `numpy` for operations that you want to be static; use `jax.numpy` for operations that you want to be traced.\n",
|
||
"\n",
|
||
"This distinction between static and traced values makes it important to think about how to keep a static value static. Consider this function:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"metadata": {
|
||
"id": "XJCQ7slcD4iU",
|
||
"outputId": "a89a5614-7359-4dc7-c165-03e7d0fc6610",
|
||
"tags": [
|
||
"raises-exception"
|
||
]
|
||
},
|
||
"outputs": [
|
||
{
|
||
"ename": "ConcretizationTypeError",
|
||
"evalue": "ignored",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||
"\u001b[0;31mConcretizationTypeError\u001b[0m Traceback (most recent call last)",
|
||
"\u001b[0;32m<ipython-input-26-5fa933a68063>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mjnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mones\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 9\u001b[0;31m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
|
||
"\u001b[0;31mConcretizationTypeError\u001b[0m: Abstract tracer value encountered where concrete value is expected.\n\nThe error arose in jax.numpy.reshape.\n\nWhile tracing the function f at <ipython-input-26-5fa933a68063>:4, this value became a tracer due to JAX operations on these lines:\n\n operation c:int32[] = reduce_prod[ axes=(0,) ] b:int32[2]\n from line <ipython-input-26-5fa933a68063>:6 (f)\n\nSee https://jax.readthedocs.io/en/latest/faq.html#abstract-tracer-value-encountered-where-concrete-value-is-expected-error for more information.\n\nEncountered tracer value: Traced<ShapedArray(int32[])>with<DynamicJaxprTrace(level=0/1)>"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"import jax.numpy as jnp\n",
|
||
"from jax import jit\n",
|
||
"\n",
|
||
"@jit\n",
|
||
"def f(x):\n",
|
||
" return x.reshape(jnp.array(x.shape).prod())\n",
|
||
"\n",
|
||
"x = jnp.ones((2, 3))\n",
|
||
"f(x)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "ZO3GMGrHBZDS"
|
||
},
|
||
"source": [
|
||
"This fails with a `ConcretizationTypeError`: `reshape` requires a shape made of concrete integers, but it received a tracer instead. Let's add some print statements to the function to understand why this is happening:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 27,
|
||
"metadata": {
|
||
"id": "Cb4mbeVZEi_q",
|
||
"outputId": "f72c1ce3-950c-400f-bfea-10c0d0118911"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"x = Traced<ShapedArray(float32[2,3])>with<DynamicJaxprTrace(level=0/1)>\n",
|
||
"x.shape = (2, 3)\n",
|
||
"jnp.array(x.shape).prod() = Traced<ShapedArray(int32[])>with<DynamicJaxprTrace(level=0/1)>\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"@jit\n",
|
||
"def f(x):\n",
|
||
" print(f\"x = {x}\")\n",
|
||
" print(f\"x.shape = {x.shape}\")\n",
|
||
" print(f\"jnp.array(x.shape).prod() = {jnp.array(x.shape).prod()}\")\n",
|
||
" # comment this out to avoid the error:\n",
|
||
" # return x.reshape(jnp.array(x.shape).prod())\n",
|
||
"\n",
|
||
"f(x)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "viSQPc3jEwJr"
|
||
},
|
||
"source": [
|
||
"Notice that although `x` is traced, `x.shape` is a static value. However, when we use `jnp.array` and `jnp.prod` on this static value, it becomes a traced value, at which point it cannot be used in a function like `reshape()` that requires a static input (recall: array shapes must be static).\n",
|
||
"\n",
|
||
"A useful pattern is to use `numpy` for operations that should be static (i.e. done at compile-time), and use `jax.numpy` for operations that should be traced (i.e. compiled and executed at run-time). For this function, it might look like this:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"metadata": {
|
||
"id": "GiovOOPcGJhg",
|
||
"outputId": "399ee059-1807-4866-9beb-1c5131e38e15"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"DeviceArray([1., 1., 1., 1., 1., 1.], dtype=float32)"
|
||
]
|
||
},
|
||
"execution_count": 28,
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"from jax import jit\n",
|
||
"import jax.numpy as jnp\n",
|
||
"import numpy as np\n",
|
||
"\n",
|
||
"@jit\n",
|
||
"def f(x):\n",
|
||
" return x.reshape((np.prod(x.shape),))\n",
|
||
"\n",
|
||
"f(x)"
|
||
]
|
||
},
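{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same split applies when a static quantity feeds into a traced computation. The cell below is a minimal sketch under that assumption: `flatten_and_scale` is a hypothetical helper (not part of JAX) that computes the element count with `numpy`, so it stays a concrete Python integer, and then uses `jax.numpy` for the arithmetic on the array values:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import jax.numpy as jnp\n",
"from jax import jit\n",
"\n",
"@jit\n",
"def flatten_and_scale(x):\n",
"  # Static: x.shape is a tuple of Python ints, so np.prod runs in Python while tracing.\n",
"  n = int(np.prod(x.shape))\n",
"  # Traced: the reshape and division on x are staged out and executed by XLA.\n",
"  return x.reshape((n,)) / jnp.sqrt(n)\n",
"\n",
"flatten_and_scale(jnp.ones((2, 3)))"
]
},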
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "C-QZ5d1DG-dv"
|
||
},
|
||
"source": [
|
||
"For this reason, a standard convention in JAX programs is to `import numpy as np` and `import jax.numpy as jnp` so that both interfaces are available, giving finer control over whether operations are performed in a static manner (with `numpy`, once at compile-time) or a traced manner (with `jax.numpy`, optimized at run-time)."
|
||
]
|
||
}
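,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough check of this convention, the sketch below records, while the function is being traced, whether each way of computing the size yields a tracer. The `is_traced` dictionary and the function name `g` are illustrative assumptions; writing to a dictionary inside a jitted function is a side effect that only runs at trace time, so it is used here purely for inspection:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import jax\n",
"import jax.numpy as jnp\n",
"\n",
"is_traced = {}  # hypothetical scratch dict, filled in while g is traced\n",
"\n",
"@jax.jit\n",
"def g(x):\n",
"  static_size = np.prod(x.shape)           # computed by NumPy in Python\n",
"  traced_size = jnp.array(x.shape).prod()  # staged out as a JAX operation\n",
"  # Record only booleans so that no tracer escapes the function.\n",
"  is_traced['np.prod'] = isinstance(static_size, jax.core.Tracer)\n",
"  is_traced['jnp.prod'] = isinstance(traced_size, jax.core.Tracer)\n",
"  return x.reshape((static_size,))\n",
"\n",
"g(jnp.ones((2, 3)))\n",
"is_traced"
]
}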
|
||
],
|
||
"metadata": {
|
||
"colab": {
|
||
"collapsed_sections": [],
|
||
"name": "thinking_in_jax.ipynb",
|
||
"provenance": []
|
||
},
|
||
"jupytext": {
|
||
"formats": "ipynb,md:myst"
|
||
},
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.7.6"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 0
|
||
}
|