
# Libc mem* benchmarks

This framework has been designed to evaluate and compare relative performance of
memory function implementations on a particular host.

It will also be used to track the performance of implementations over time.

## Quick start

### Setup

Since **Python 2** is [being deprecated](https://www.python.org/doc/sunset-python-2/), it is
advised to use **Python 3**.

Then make sure that `matplotlib`, `scipy` and `numpy` are set up correctly:

```shell
apt-get install python3-pip
pip3 install matplotlib scipy numpy
```

To get good reproducibility it is important to make sure that the system runs in
`performance` mode. This is achieved by running:

```shell
cpupower frequency-set --governor performance
```
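
To confirm that the setting took effect, the active governor can be read back from the standard Linux `cpufreq` sysfs files. The helper below is only a sketch: the name `governor_summary` is hypothetical, and it assumes the kernel exposes `scaling_governor` under `/sys/devices/system/cpu`, which may not be the case in virtual machines or containers.

```shell
# Print each distinct scaling governor and the number of CPUs using it.
# The sysfs root is a parameter only so that the function can be pointed
# at another directory (for example in a test).
governor_summary() (
  root="${1:-/sys/devices/system/cpu}"
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
    # Skip entries that cannot be read (e.g. no cpufreq support).
    if [ -r "$f" ]; then
      cat "$f"
    fi
  done | sort | uniq -c
)

governor_summary
```

If every CPU is in the expected mode, the output is a single line ending in `performance`.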

### Run and display `memcpy` benchmark

The following commands will run the benchmark and display a 95% confidence
interval curve of **time per copied byte**. The graph also shows **host
information** and the **benchmarking configuration**.

```shell
cd llvm-project
cmake -B/tmp/build -Sllvm -DLLVM_ENABLE_PROJECTS=libc -DCMAKE_BUILD_TYPE=Release
make -C /tmp/build -j display-libc-memcpy-benchmark-small
```

## Benchmarking regimes

Profiling the size distributions of calls into libc functions showed that most
operations act on a small number of bytes.

Function           | % of calls with size ≤ 128 | % of calls with size ≤ 1024
------------------ | --------------------------: | ---------------------------:
memcpy             | 96%                         | 99%
memset             | 91%                         | 99.9%
memcmp<sup>1</sup> | 99.5%                       | ~100%

Benchmarking configurations come in two flavors:

 - [small](libc/utils/benchmarks/configuration_small.json)
    - Exercises sizes up to `1KiB`, representative of normal usage
    - The data is kept in the `L1` cache to prevent measuring the memory
      subsystem
 - [big](libc/utils/benchmarks/configuration_big.json)
    - Exercises sizes up to `32MiB` to test large operations
    - Caching effects can show up here, which prevents comparing different
      hosts

_<sup>1</sup> - The size refers to the size of the buffers to compare and not
the number of bytes until the first difference._
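
Percentages like those in the table above can be derived from a list of observed call sizes. The sketch below is not the profiler that produced the table: the helper name `size_shares` is hypothetical, the input format (one size in bytes per line) is an assumption, and the sample sizes are made up for illustration.

```shell
# Read one size (in bytes) per line on stdin and print the percentage of
# sizes that are at most 128 bytes and at most 1024 bytes.
size_shares() {
  awk '
    { total++; if ($1 <= 128) small++; if ($1 <= 1024) medium++ }
    END {
      if (total == 0) { print "no data"; exit }
      printf "<=128: %.1f%%  <=1024: %.1f%%\n", 100 * small / total, 100 * medium / total
    }'
}

# Made-up sizes, for illustration only.
printf '%s\n' 8 16 64 128 512 1024 4096 65536 | size_shares
```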

## Benchmarking targets

The benchmarking process occurs in two steps:

1. Benchmark the functions and produce a `json` file
2. Display (or render) the `json` file

Targets are of the form `<action>-libc-<function>-benchmark-<configuration>`, where:

 - `action` is one of:
    - `run`, runs the benchmark and writes the `json` file
    - `display`, displays the graph on screen
    - `render`, renders the graph on disk as a `png` file
 - `function` is one of: `memcpy`, `memcmp`, `memset`
 - `configuration` is one of: `small`, `big`
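
The naming scheme above can be expanded mechanically. The following sketch simply enumerates every combination of the values listed. The helper name `list_benchmark_targets` is hypothetical and not part of the build system.

```shell
# Print every target name that follows the
# <action>-libc-<function>-benchmark-<configuration> scheme.
list_benchmark_targets() {
  for action in run display render; do
    for fn in memcpy memcmp memset; do
      for configuration in small big; do
        printf '%s-libc-%s-benchmark-%s\n' "$action" "$fn" "$configuration"
      done
    done
  done
}

list_benchmark_targets
```

The list can then be filtered and handed to `make`, for instance `make -C /tmp/build $(list_benchmark_targets | grep '^run-.*-small$')` to run the three `small` benchmarks.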

## Superposing curves

It is possible to **merge** several `json` files into a single graph. This is
useful to **compare** implementations.

In the following example we superpose the curves for `memcpy`, `memset` and
`memcmp`:

```shell
make -C /tmp/build run-libc-memcpy-benchmark-small run-libc-memcmp-benchmark-small run-libc-memset-benchmark-small
python3 libc/utils/benchmarks/render.py3 /tmp/last-libc-memcpy-benchmark-small.json /tmp/last-libc-memcmp-benchmark-small.json /tmp/last-libc-memset-benchmark-small.json
```

## Useful `render.py3` flags

 - `--output=/tmp/benchmark_curve.png` saves the produced graph to the given file.
 - `--headless` prevents the graph from appearing on the screen.
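
The two flags are useful together when rendering on a machine without a display. The sketch below only prints the command so that it can be inspected before being run. It assumes that `--headless` and `--output` can be combined, and that the `run` targets wrote their results to `/tmp/last-libc-<function>-benchmark-small.json` as in the superposition example. The helper name `render_command` is hypothetical.

```shell
# Print (without executing) a render command that superposes the curves of
# the given functions and writes the graph to the given png file.
render_command() {
  output="$1"
  shift
  cmd="python3 libc/utils/benchmarks/render.py3 --headless --output=$output"
  for fn in "$@"; do
    cmd="$cmd /tmp/last-libc-$fn-benchmark-small.json"
  done
  printf '%s\n' "$cmd"
}

render_command /tmp/benchmark_curve.png memcpy memcmp memset
```

Once the printed command looks right, it can be run from the `llvm-project` directory, for example by piping the output to `sh`.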


## Under the hood

To learn more about the design decisions behind the benchmarking framework,
have a look at the [RATIONALE.md](RATIONALE.md) file.