OpenMP Command-Line Argument Reference
======================================
Welcome to the OpenMP in LLVM command-line argument reference. This is
not a complete list of arguments, but it covers the essential command-line
arguments you may need when compiling and linking OpenMP programs.
Section :ref:`general_command_line_arguments` lists OpenMP command-line options
for multicore programming, while :ref:`offload_command_line_arguments` lists
options relevant to OpenMP target offloading.

.. _general_command_line_arguments:

OpenMP Command-Line Arguments
-----------------------------

``-fopenmp``
^^^^^^^^^^^^
Enable the OpenMP compilation toolchain. The compiler will parse OpenMP
compiler directives and generate parallel code.

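As a minimal sketch of what this option affects, the function below uses a
``parallel for`` directive with a reduction. The function name ``sum_squares``
and the file name ``example.c`` are made up for illustration. Without
``-fopenmp`` the directive is ignored and the loop runs serially, so the result
is the same either way.

.. code-block:: c

   /* Sum the squares 0^2 + 1^2 + ... + (n-1)^2.
      With -fopenmp the iterations are divided among threads and the
      per-thread partial sums are combined by the reduction clause.
      Without -fopenmp the pragma is ignored and the loop runs serially. */
   long sum_squares(long n) {
     long sum = 0;
   #pragma omp parallel for reduction(+ : sum)
     for (long i = 0; i < n; ++i)
       sum += i * i;
     return sum;
   }

Placed in ``example.c`` together with a ``main`` function that calls
``sum_squares``, this could be built with ``clang -fopenmp example.c``.
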
``-fopenmp-extensions``
^^^^^^^^^^^^^^^^^^^^^^^
Enable all ``Clang`` extensions for OpenMP directives and clauses. A list of
current extensions and their implementation status can be found on the
`support <https://clang.llvm.org/docs/OpenMPSupport.html#openmp-extensions>`_
page.

``-fopenmp-simd``
^^^^^^^^^^^^^^^^^
This option enables OpenMP only for single instruction, multiple data
(SIMD) constructs.

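The following sketch shows the kind of construct this option is meant for: a
loop annotated with the ``simd`` directive. The function name ``saxpy`` and the
file name ``example.c`` are illustrative only.

.. code-block:: c

   /* Compute y[i] = a * x[i] + y[i] for i in [0, n).
      The simd directive tells the compiler that the iterations may be
      executed with SIMD instructions. If the pragma is ignored, the loop
      still computes the same result. */
   void saxpy(int n, float a, const float *x, float *y) {
   #pragma omp simd
     for (int i = 0; i < n; ++i)
       y[i] = a * x[i] + y[i];
   }

A file containing this function could be compiled with, for example,
``clang -fopenmp-simd -O2 -c example.c``.
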
``-static-openmp``
^^^^^^^^^^^^^^^^^^
Use the static OpenMP host runtime while linking.

``-fopenmp-version=<arg>``
^^^^^^^^^^^^^^^^^^^^^^^^^^
Set the OpenMP version to a specific version ``<arg>`` of the OpenMP standard.
For example, you may use ``-fopenmp-version=45`` to select version 4.5 of
the OpenMP standard. The default value is ``-fopenmp-version=51`` for ``Clang``.

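One way to check which version a translation unit was compiled for is the
``_OPENMP`` macro, which the OpenMP specification defines as the release date
of the supported specification in ``yyyymm`` form (``201511`` for 4.5,
``202011`` for 5.1). The sketch below maps that value back to the form used by
``-fopenmp-version``. The function names are hypothetical helpers, not part of
any OpenMP or LLVM API.

.. code-block:: c

   /* Map a value of the _OPENMP macro to the matching -fopenmp-version
      argument. Returns 0 for a value this sketch does not know. */
   int openmp_version_arg(long macro) {
     switch (macro) {
     case 201511:
       return 45; /* OpenMP 4.5 */
     case 201811:
       return 50; /* OpenMP 5.0 */
     case 202011:
       return 51; /* OpenMP 5.1 */
     case 202111:
       return 52; /* OpenMP 5.2 */
     default:
       return 0;
     }
   }

   /* Version this translation unit was compiled for, or 0 when the
      _OPENMP macro is not defined. */
   int compiled_openmp_version(void) {
   #ifdef _OPENMP
     return openmp_version_arg(_OPENMP);
   #else
     return 0;
   #endif
   }

With ``clang -fopenmp -fopenmp-version=45``, ``compiled_openmp_version()``
should return ``45``. With plain ``-fopenmp`` and the default described above,
it should return ``51``.
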
.. _offload_command_line_arguments:

Offloading Specific Command-Line Arguments
------------------------------------------

.. _fopenmp-targets:

``-fopenmp-targets``
^^^^^^^^^^^^^^^^^^^^
| Specify which OpenMP offloading targets should be supported. For example, you
  may specify ``-fopenmp-targets=amdgcn-amd-amdhsa,nvptx64``. This option can
  often be omitted when :ref:`offload_arch` is provided.
| It is also possible to offload to CPU architectures, for instance with
  ``-fopenmp-targets=x86_64-pc-linux-gnu``.

.. _offload_arch:

``--offload-arch``
^^^^^^^^^^^^^^^^^^
| Specify the device architecture for OpenMP offloading. For instance, use
  ``--offload-arch=sm_80`` to target an Nvidia Tesla A100,
  ``--offload-arch=gfx90a`` to target an AMD Instinct MI250X, or
  ``--offload-arch=sm_80,gfx90a`` to target both.
| It is also possible to specify :ref:`fopenmp-targets` without specifying
  ``--offload-arch``. In that case, the compiler driver runs the executables
  ``amdgpu-arch`` or ``nvptx-arch`` to detect the device architecture
  automatically.
| Finally, the device architecture is also inferred automatically with
  ``--offload-arch=native``.

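To make the offloading options more concrete, here is a minimal sketch of a
function with a target region. The function name ``vector_add`` and the file
name ``example.c`` are made up for illustration, and the build commands assume
an LLVM installation with offloading support for the chosen architecture.

.. code-block:: c

   /* Add two vectors of length n inside a target region.
      The map clauses copy a and b to the device and copy c back to the
      host. If the code is compiled without OpenMP, the pragma is ignored
      and the loop runs serially on the host. */
   void vector_add(int n, const double *a, const double *b, double *c) {
   #pragma omp target teams distribute parallel for \
       map(to : a[0 : n], b[0 : n]) map(from : c[0 : n])
     for (int i = 0; i < n; ++i)
       c[i] = a[i] + b[i];
   }

A program containing this function could be built for one of the GPUs mentioned
above with ``clang -fopenmp --offload-arch=sm_80 example.c``, or with
``clang -fopenmp --offload-arch=native example.c`` to let the driver detect the
architecture.
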
``--offload-device-only``
^^^^^^^^^^^^^^^^^^^^^^^^^
Compile only the code that goes on the device. This option is mainly for
debugging purposes. It is primarily used for inspecting the intermediate
representation (IR) output when compiling for the device. It may also be used
if device-only runtimes are created.

``--offload-host-only``
^^^^^^^^^^^^^^^^^^^^^^^
Compile only the code that goes on the host. With this option enabled, the
``.llvm.offloading`` section with embedded device code will not be included in
the intermediate representation.

``--offload-host-device``
^^^^^^^^^^^^^^^^^^^^^^^^^
Compile the target regions for both the host and the device. This is the
default behavior.

``-Xopenmp-target <arg>``
^^^^^^^^^^^^^^^^^^^^^^^^^
Pass an argument ``<arg>`` to the offloading toolchain, for instance
``-Xopenmp-target -march=sm_80``.

``-Xopenmp-target=<triple> <arg>``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Pass an argument ``<arg>`` to the offloading toolchain for the target
``<triple>``. This is especially useful when an argument must differ for each
triple. For instance, ``-Xopenmp-target=nvptx64 --offload-arch=sm_80
-Xopenmp-target=amdgcn --offload-arch=gfx90a`` specifies the device
architecture for each target. Alternatively, :ref:`Xarch_host` and
:ref:`Xarch_device` can pass an argument to the host and device compilation
toolchain.

``-Xoffload-linker<triple> <arg>``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Pass an argument ``<arg>`` to the offloading linker for the target specified in
``<triple>``.

.. _Xarch_device:

``-Xarch_device <arg>``
^^^^^^^^^^^^^^^^^^^^^^^
Pass an argument ``<arg>`` to the device compilation toolchain.

.. _Xarch_host:

``-Xarch_host <arg>``
^^^^^^^^^^^^^^^^^^^^^
Pass an argument ``<arg>`` to the host compilation toolchain.

``-foffload-lto[=<arg>]``
^^^^^^^^^^^^^^^^^^^^^^^^^
Enable device link time optimization (LTO) and select the LTO mode ``<arg>``.
Select either ``-foffload-lto=thin`` or ``-foffload-lto=full``. Thin LTO takes
less time while still achieving some performance gains. If no argument is set,
this option defaults to ``-foffload-lto=full``.

``-fopenmp-offload-mandatory``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| This option avoids generating the host fallback code that is
  executed when offloading to the device fails. That is
  helpful when the target contains code that cannot be compiled for the host, for
  instance, if it contains unguarded device intrinsics.
| This option can also be used to reduce compile time.
| This option should not be used to verify that the code is being
  offloaded to the device. Instead, set the environment variable
  ``OMP_TARGET_OFFLOAD='MANDATORY'`` to confirm that the code is being offloaded to
  the device.

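As a complement to ``OMP_TARGET_OFFLOAD``, a program can also ask the OpenMP
runtime where a target region ran. The sketch below uses the standard OpenMP
routine ``omp_is_initial_device``. The function name
``target_region_ran_on_host`` is a hypothetical helper, and the OpenMP parts
are guarded by ``_OPENMP`` so that the file still compiles without OpenMP
enabled.

.. code-block:: c

   #ifdef _OPENMP
   #include <omp.h>
   #endif

   /* Return 1 if the target region below executed on the host and 0 if it
      executed on an offload device. Without OpenMP there is no target
      region, so the function reports the host. */
   int target_region_ran_on_host(void) {
     int on_host = 1;
   #ifdef _OPENMP
   #pragma omp target map(tofrom : on_host)
     {
       /* Nonzero when the calling code runs on the host device. */
       on_host = omp_is_initial_device() ? 1 : 0;
     }
   #endif
     return on_host;
   }

When a program built with offloading support runs with
``OMP_TARGET_OFFLOAD='MANDATORY'`` set and a device available, this function is
expected to return ``0``.
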
``-fopenmp-target-debug[=<arg>]``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Enable debugging in the device runtime library (RTL). Note that it is
necessary both to configure debugging in the device runtime at compile time with
``-fopenmp-target-debug=<arg>`` and to enable debugging at runtime with the
environment variable ``LIBOMPTARGET_DEVICE_RTL_DEBUG=<arg>``. Further, as of
July 2023 this is only supported for Nvidia targets. Alternatively, the
environment variable ``LIBOMPTARGET_DEBUG`` can be set to debug both Nvidia and
AMD GPU targets. For more information, see the
`debugging instructions <https://openmp.llvm.org/design/Runtimes.html#debugging>`_.
The debugging instructions list the supported debugging arguments.

``-fopenmp-target-jit``
^^^^^^^^^^^^^^^^^^^^^^^
| Emit code that is Just-in-Time (JIT) compiled for OpenMP offloading. Embed
  LLVM-IR for the device code in the object files rather than binary code for the
  respective target. At runtime, the LLVM-IR is optimized again and compiled for
  the target device. The optimization level can be set at runtime with
  ``LIBOMPTARGET_JIT_OPT_LEVEL``, for instance,
  ``LIBOMPTARGET_JIT_OPT_LEVEL=3`` corresponding to optimization level ``-O3``.
  See the
  `OpenMP JIT details <https://openmp.llvm.org/design/Runtimes.html#libomptarget-jit-pre-opt-ir-module>`_
  for instructions on extracting the embedded device code before or after the
  JIT and more.
| We want to emphasize that JIT for OpenMP offloading is good for debugging as
  the target IR can be extracted, modified, and injected at runtime.

``--offload-new-driver``
^^^^^^^^^^^^^^^^^^^^^^^^
In upstream LLVM, OpenMP only uses the new driver. However, this option must
be enabled for experimental linking with CUDA or HIP files.

``--offload-link``
^^^^^^^^^^^^^^^^^^
Use the new offloading linker ``clang-linker-wrapper`` to perform the link job.
``clang-linker-wrapper`` is the default offloading linker for OpenMP. This option
enables the new offloading linker in toolchains that do not use it
automatically. It must be enabled when linking with CUDA or HIP files.

``-nogpulib``
^^^^^^^^^^^^^
Do not link the device library for CUDA or HIP device compilation.

``-nogpuinc``
^^^^^^^^^^^^^
Do not include the default CUDA or HIP headers, and do not add CUDA or HIP
include paths.