Both input and output of ballot are lane-masks:
result is lane-mask with 'S32/S64 LLT and SGPR bank'
input is lane-mask with 'S1 LLT and VCC reg bank'.
Ballot copies bits from input lane-mask for
all active lanes and puts 0 for inactive lanes.
GlobalISel did not set 0 in result for inactive lanes
for non-constant input.
This change implements support of metadata strings in operand bundle
values. It makes possible calls like:
call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]
It requires some extension of the bitcode serialization. As SSA values
and metadata are stored in different tables, there must be a way to
distinguish them during deserialization. It is implemented by putting a
special marker before the metadata index. The marker cannot be treated
as a reference to any SSA value, so it unambiguously identifies
metadata. It allows extending the bitcode serialization without breaking
compatibility.
Metadata as operand bundle values are intended to be used in
floating-point function calls. They would represent the same information
as now is passed by the constrained intrinsic arguments.
Update the llvm/docs/Coroutines.rst docs to include a full description
of Custom ABI objects. This documentation describes the how ABI objects
allow users (plugin libraries) to create custom ABI objects for their
needs.
This is part of a series of patches that tries to improve DILocation bug
detection in Debugify. This first patch adds the necessary CMake flag to
LLVM and a variable defined by that flag to LLVM's config header, allowing
the next patch to track information without affecting normal builds.
This series of patches adds a "DebugLoc coverage tracking" feature, that
inserts conditionally-compiled tracking information into DebugLocs (and
by extension, to Instructions), which is used by Debugify to provide
more accurate and detailed coverage reports. When enabled, this features
tracks whether and why we have intentionally dropped a DebugLoc,
allowing Debugify to ignore false positives. An optional additional
feature allows also storing a stack trace of the point where a DebugLoc
was unintentionally dropped/not generated, which is used to make fixing
detected errors significantly easier. The goal of these features is to
provide useful tools for developers to fix existing DebugLoc errors and
allow reliable detection of regressions by either manual inspection or
an automated script.
Implement the intrinsic `llvm.spv.handle.fromBinding`, which returns the
handle for a global resource. This involves creating a global variable
that matches the return-type, set, and binding in the call, and
returning the handle to that resource.
This commit implements the scalar version. It does not handle arrays of
resources yet. It also does not handle storage buffers yet. We do not
have the type for the storage buffers designed yet.
Part of #81036
This adds `.insn` support for assembling instructions of 48- and
64-bits (only when giving an explicit length). Disassembly already
knows to bunch up the instruction bits for these instructions.
This changes some error messages so they are a little clearer.
Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
...and add shared libs as a suggestion.
* Mark options, option values and program names as plain text.
* Add a blank line between the option and the explanatory text
so that it doesn't get printed on the same line.
(this seems to be the original intent of the rst source anyway)
* Update the phrasing of a couple of the options.
* Add BUILD_SHARED_LIBS to suggestions.
This is intended to solve a problem with lowering atomics in
OpenMP and C++ common to AMDGPU and NVPTX.
In OpenCL and CUDA, it is undefined behavior for an atomic instruction
to modify an object in thread private memory. In OpenMP, it is defined.
Correspondingly, the hardware does not handle this correctly. For
AMDGPU,
32-bit atomics work and 64-bit atomics are silently dropped. We
therefore
need to codegen this by inserting a runtime address space check,
performing
the private case without atomics, and fallback to issuing the real
atomic
otherwise. This metadata allows us to avoid this extra check and branch.
Handle this by introducing metadata intended to be applied to atomicrmw,
indicating they cannot access the forbidden address space.
global_wb with scopes lower than SCOPE_SYS is unnecessary for
correctness.
I was initially optimistic they would be very cheap no-ops but they can
actually be quite expensive so let's avoid them.
Adds hidden kernel arguments to the function signature and marks them
inreg if they should be preloaded into user SGPRs. The normal kernarg
preloading logic then takes over with some additional checks for the
correct implicitarg_ptr alignment.
Special care is needed so that metadata for the hidden arguments is not
added twice when generating the code object.
The SPIRV backend has a special type named `spirv.Image`. This type is
meant to correspond to the OpTypeImage instruction in SPIR-V, but there
is one difference. The access qualifier operand in OpTypeImage is
optional. On top of that, the access qualifiers are only valid for
kernels, and not for shaders.
We want to reuse this type when generating shader from HLSL, but we
can't use the access qualifier. This commit make the access qualifer
optional in the target extension type.
The same is done for `spirv.SampledImage`.
Contributes to #81036
This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.
The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
This adds a flag to lit for detecting and updating failing tests when
possible to do so automatically. The flag uses a plugin architecture
where config files can add additional auto-updaters for the types of
tests in the test suite. When a test fails with `--update-tests` enabled
lit passes the test RUN invocation and output to each registered test
updater until one of them signals that it updated the test (or all test
updaters have been run). As such it is the responsibility of the test
updater to only update tests where it is reasonably certain that it will
actually fix the test, or come close to doing so.
Initially adds support for UpdateVerifyTests and UpdateTestChecks. The
flag is currently only implemented for lit's internal shell, so
`--update-tests` implies `LIT_USE_INTERNAL_SHELL=1`.
Builds on work in #97369Fixes#81320
Converts AMDGPUResourceUsageAnalysis pass from Module to MachineFunction
pass. Moves function resource info propagation to to MC layer (through
helpers in AMDGPUMCResourceInfo) by generating MCExprs for every
function resource which the emitters have been prepped for.
Fixes https://github.com/llvm/llvm-project/issues/64863
Remove the following intrinsics which can be trivially replaced with an
`addrspacecast`
* llvm.nvvm.ptr.gen.to.global
* llvm.nvvm.ptr.gen.to.shared
* llvm.nvvm.ptr.gen.to.constant
* llvm.nvvm.ptr.gen.to.local
* llvm.nvvm.ptr.global.to.gen
* llvm.nvvm.ptr.shared.to.gen
* llvm.nvvm.ptr.constant.to.gen
* llvm.nvvm.ptr.local.to.gen
Also, cleanup the NVPTX lowering of `addrspacecast` making it more
concise.
This was reverted to avoid conflicts while reverting #107655. Re-landing
unchanged.
I'd like to nominate Sergey Zverev as an Intel representative to replace
Andy Kaylor, who will be leaving the security group. Sergey is the one
of the main security points of contact for the Intel compiler team.
We already disallow accessing the callee's allocas from a tail-called
function, because their stack memory will have been de-allocated before
the tail call. I think this should apply to byval arguments too, as they
also occupy space in the caller's stack frame.
This was originally part of #109943, spilt out for separate review.
* Markdown is the most common format on GitHub and most contributors are
more familiar with it than RST.
* This leads to mistakes in the RST syntax and/or folks just using
Markdown syntax and assuming it works.
* The release notes have a high number of edits and a high number of
views, we should optimise for making the common path easy. That is,
adding a bullet point and a link.
* Though GitHub can render RST and Markdown, its support for Markdown is
more complete (and neither handle the Sphinx directives well).
* We already have some Markdown docs in the llvm docs.
To keep the original formatting we do need some Sphinx directives still,
and those are provided by MyST which is already enabled.
https://myst-parser.readthedocs.io/en/latest/
I did have to enable an extension so we can substitute in the release
version.
https://myst-parser.readthedocs.io/en/latest/syntax/optional.html#substitutions-with-jinja2
Needing to use MyST means there is some special knowledge needed if you
want to do advanced things, but at least the basics remain Markdown.
Even in RST form, you still had to look up Sphinx syntax.
I also make use of a nested directive
https://myst-parser.readthedocs.io/en/latest/syntax/roles-and-directives.html#nesting-directives
to implement the prerelease warning.
The note about sections referred to another note that got removed in
4c72deb613d9d8838785b431facb3eb480fb2f51. I presume accidentally, so I
have restored that.
I also removed the "Update on required toolchains to build LLVM" header
because the section is now empty.
The other difference is that the table of contents now has a heading
"Contents". This is the default and I could not find a way to remove
that name. Otherwise it's the same table as you'd get from the RST
document.
Two main goals of this PR are:
* to support "Arithmetic with Overflow" intrinsics, including the
special case when those intrinsics are being generated by the
CodeGenPrepare pass during translations with optimization;
* to redirect intrinsics with aggregate return type to be lowered via
GlobalISel operations instead of SPIRV-specific unfolding/lowering (see
https://github.com/llvm/llvm-project/pull/95012).
There is a new test case
`llvm/test/CodeGen/SPIRV/passes/translate-aggregate-uaddo.ll` that
describes and checks the general logics of the translation.
This PR continues a series of PRs aimed to identify and fix flaws in
code emission, to improve pass rates for the mode with expensive checks
set on (see https://github.com/llvm/llvm-project/pull/101732,
https://github.com/llvm/llvm-project/pull/104104,
https://github.com/llvm/llvm-project/pull/106966), having in mind the
ultimate goal of proceeding towards the non-experimental status of
SPIR-V Backend.
The reproducers are:
1) consider `llc -O3 -mtriple=spirv64-unknown-unknown ...` with:
```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
br label %l1
l1:
%e = phi i32 [ %a, %entry ], [ %i, %body ]
%i = add nsw i32 %e, 1
%fl = icmp eq i32 %i, 0
br i1 %fl, label %exit, label %body
body:
store i8 42, ptr addrspace(4) %p
br label %l1
exit:
ret i32 %i
}
```
2) consider `llc -O0 -mtriple=spirv64-unknown-unknown ...` with:
```
define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
entry:
br label %l1
l1: ; preds = %body, %entry
%e = phi i32 [ %a, %entry ], [ %math, %body ]
%0 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %e, i32 1)
%math = extractvalue { i32, i1 } %0, 0
%ov = extractvalue { i32, i1 } %0, 1
br i1 %ov, label %exit, label %body
body: ; preds = %l1
store i8 42, ptr addrspace(4) %p, align 1
br label %l1
exit: ; preds = %l1
ret i32 %math
}
```
This reapplies commit 1911a50fae8a441b445eb835b98950710d28fc88 with a
minor fix in lld/ELF/LTO.cpp which sets Options.BBAddrMap when
`--lto-basic-block-sections=labels` is passed.
This feature is supported via the newer option
`-fbasic-block-address-map`. Using the old option still works by
delegating to the newer option, while a warning is printed to show
deprecation.
After changes in #109651.
Warning, treated as error:
/home/davspi01/work/open_source/llvm-project/llvm/docs/RISCVUsage.rst::Anonymous hyperlink mismatch: 1 references but 0 targets.
In typical RST fashion, all that was missing was a space between
the last word and the opening `<` of the link.
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
This preliminary work adds the intrinsic to llvm and expands using atan
intrinsic for DXIL backend, since DXIL has no atan2 op.
Part 1 for Implement the atan2 HLSL Function #70096.
(reland #108865 reverted in #109842 due to doc build break)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
This preliminary work adds the intrinsic to llvm and expands using atan
intrinsic for DXIL backend, since DXIL has no atan2 op.
Part 1 for Implement the atan2 HLSL Function #70096.