Switch to using __tsan_init rather than RunningOnValgrind as the means
for detecting TSan instumented binaries. RunningOnValgrind is present in
other libraries (such as Google perftools tcmalloc). An exe that links
with a tcmalloc static library and exports symbols with -rdynamic will
appear to be TSan instrumented even when it is not resulting in "Unable
to fint TSan function ..." messages.
Fixes issue #122319.
Even though `__builtin_wasm_throw`, which is lowered down to
`llvm.wasm.throw`, throws,
```cpp
try {
__builtin_wasm_throw(0, obj);
} catch (...) {
}
```
does not generate `invoke`. This is because we have assumed the
intrinsic cannot be invoked, which doesn't make much sense. (See #128104
for the historical context)
#128104 made `llvm.wasm.throw` intrinsic invokable in the backend. This
actually generates `invoke`s in Clang for `__builtin_wasm_throw`.
While we're at it, this also generates `invoke`s for
`__builtin_wasm_rethrow`, which is actually not used anywhere in C++
support. I haven't deleted it just in case in may have uses later. (For
example, to support rethrow functionality that carries stack trace with
exnref)
Depends on #128104 for the CI to pass.
Fixes#124710.
Enable gisel selection for uaddsat and usubsat in true16 flow
This patch includes:
1. Added VGPR_16_Lo128/VGPR_16 to register bank and update register info
for recognizing 16bit regclass id and bit width
2. uaddsat/usubsat test update
cvise reimplements creduce in Python and bundles clang-delta and other
tools. In my experience, it is generally a more robust reduction tool
that is better maintained. I renamed the script to make it tool-neutral,
which also opens up the possibility that we teach it how to
automatically transition over to llvm-reduce and opt/llc to handle LLVM
backend crashes, but that is potential future work.
Internally, the variable names still say "creduce". I kept using the
verb "reduce" because "vise" is not a verb, but the external facing text
has been updated.
Both fileno and fdopen require interfacing with the opaque FILE struct,
so they shouldn't be enabled in overlay mode. This patch moves both into
fullbuild only on all platforms.
Fixes#128643
When some resource types were present, but not all of them, we were
ending up in a situation where we would fail to initialize the `FirstX`
variables and get incorrect iterators.
Fixes#128560.
In #128754, `DataLayoutTypeInterface` was changed to give
`getPreferredAlignment` a default implemention. As a result, table-gen
no longer declared `getPreferredAlignment` when defining a class that
contained `[DeclareTypeInterfaceMethods<DataLayoutTypeInterface>]` in
the table-gen definition. That means all of the definitions in
`CIRTypes.cpp`, such as `PointerType::getPreferredAligment`, were
compilation errors.
Delete all the definitions of `getPreferredAlignment`. I verified that
the default implementation does the exact same thing as the explicit
overrides that are being deleted.
The existing OneShotModuleBufferize will analyze and bufferize
operations which are in nested symbol tables (e.g. nested
`builtin.module`, `gpu.module`, or similar operations). This
behavior is untested and likely unintentional given other
limitations of OneShotModuleBufferize (`func.call` can't call
into nested symbol tables). This change reverses the existing
behavior so that the operations considered by the analysis and
bufferization exclude any operations in nested symbol table
scopes. Users who desire to bufferize nested modules can still do
so by applying the transformation in a pass pipeline or in a
custom pass. This further enables controlling the order in which
modules are bufferized as well as allowing use of different
options for different kinds of modules.
Fixes#128071
The current behavior lets intrinsics that don't map to a DXILOP slip
through. Nothing catches this until we hit the DXIL validator. This
change fails earlier so we don't encode invalid llvm intrinsics that can
slip through because of clang builtins like `__builtin_reduce_and`
example:
https://hlsl.godbolt.org/z/13rPj18vn
Many interface methods did not prefix the `mlir` namespace, which
prevented inheriting from this interface from an interface defined
outside the `mlir` namespace. Prefix namespaces everywhere to enable
this.
This adds support cbuffers based on llvm/wg-hlsl#171 - the type argument
of the CBuffer TargetExtType is either a `dx.Layout` type which reports
its own size, or it's a normal type and we can simply refer to
DataLayout.
Support the type `bool` and the literals `true` and `false`. Add the
type `cir::BoolType` and the attribute `cir::BoolAttr` to ClangIR. Add
code in all the necessary places in ClangIR CodeGen to handle and to
recognize the type and the attribute.
Add test cases to existing tests func-simple.cpp and
global-var-simple.cpp.
Many types don't have a preferred alignment, but often specifying an ABI
alignment is required to implement APIs on top of data layouts. Default
the preferred alignment to `getABIAlignment` to simplify things.
There had been concern raised about possible confusion with "rvv". After
internal discussion, we decided to go with an alternate prefix to reduce
possible confusion going forward. The specification document
(https://github.com/rivosinc/rivos-custom-extensions) has been updated.
And also add the XRivosVizip extension to the documentation. I'd missed
that in the initial commit.
RISC-V has multiple decoder tables because there is no guarantee that
non-standard extensions do not overlap with each other.
Qualcomm's Xqci family of extensions are intended to be implemented
together, and therefore we want a single decode table for this group of
extensions. This should be more efficient overall, and allows us to use
tablegen's existing mechanism that finds overlapping encodings within
the group.
To implement this, the key addition is `TRY_TO_DECODE_FEATURE_ANY`,
which will use the provided decoder table if any of the features from
the FeatureBitset (first argument) are enabled, rather than if all are
enabled.
We currently have two operands upstream that are an unsigned immediate
with a range constraint - `uimm8ge32` (for `cm.jalt`) and `uimm5gt3`
(for `qc.shladd`).
Both of these were using `decodeUImmOperand<N>` for decoding. For `Zcmt`
this worked, because the generated decoder automatically checked for
`cm.jt` first because the 8 undefined bits in `cm.jalt` are `000?????`
in `cm.jt` (this is to do with the range lower-bound being a
power-of-two). For Zcmt, this patch is NFC.
We have less luck with `Xqciac` - `qc.shladd` is being decoded where the
`uimm5` field is 3 or lower. This patch fixes this by introducing a
`decodeUImmOperandGE<Width, LowerBound>` helper, which will corretly
return `MCDisassembler::Fail` when the immediate is below the lower
bound.
I have added a test to show the encoding where `uimm5` is equal to 3 is
no longer disassembled as `qc.shladd`.
This refactors the response handlers for reverse request to follow the
same architecture as the request handlers. With only two implementation
that might be overkill, but it reduces code duplication and improves
error reporting by storing the sequence ID. This PR also fixes an
unchecked Expected in the old callback for unknown sequence IDs.
DATA_FILES CMake argument never existed in the new YAML-based hdrgen
version of add_gen_header function, and thus its uses added in
b1fd6f0996a9d6e6ebfa0cc3df0fe499c5ccdf65 were always dead code.
Remove them to clean up the function implementation.
Co-authored-by: Alexey Samsonov <samsonov@google.com>
When processing BOLTed binaries with BAT section, we used to
indiscriminately use `BAT->getFallthroughsInTrace` to record
fall-throughs, even if the function is not covered by BAT.
Fix that by using non-BAT CFG-based `getFallthroughsInTrace` if the
function is not in BAT.
Test Plan: updated bolt-address-translation-yaml.test
BOLT used to mark multi-entry functions non-simple in non-relocation
mode with the reasoning that we can't move them due to potentially
undetected references. However, in aggregation mode it doesn't apply as
BOLT doesn't perform optimizations.
Relax this constraint in case of an aggregation job.
Test Plan: added entry-point-fallthru.s
Creates an interface target for `LinalgRelayoutInterface`. This is
primarily to reduce the dependency of `Tensor` on `Linalg` to the
required minimum. For context and rationale, see:
* https://github.com/llvm/llvm-project/issues/127668
Note, I also took the liberty of renaming `LinalgRelayoutInterface` as
`RelayoutOpInterface` (removed `Linalg`, added `Op`).
Problem identified by Joseph. The openmp device runtime uses
__scoped_atomic_load_n and similar which presently hit
```
error: large atomic operation may incur significant performance
penalty; the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Werror,-Watomic-alignment]
```
This is because the spirv class doesn't set the corresponding field. The
base does, but only if there's a host toolchain, which there isn't.
We prefer to emit addi instead of ori because its more compressible, but
this can pessimize the sext.w removal pass.
If the input to the OR is known to be a sign extended 32 bit value, we
can use addiw instead of addi which will give more power to the sext.w
removal pass. As it is known to produce sign a sign extended value and
only consume the lower 32 bits.
Fixes#128468.
`llvm.wasm.throw` intrinsic can throw but it was not invokable. Not sure
what the rationale was when it was first written that way, but I think
at least in Emscripten's C++ exception support with the Wasm port of
libunwind, `__builtin_wasm_throw`, which is lowered down to
`llvm.wasm.rethrow`, is used only within `_Unwind_RaiseException`, which
is an one-liner and thus does not need an `invoke`:
720e97f76d/system/lib/libunwind/src/Unwind-wasm.c (L69)
(`_Unwind_RaiseException` is called by `__cxa_throw`, which is generated
by the `throw` C++ keyword)
But this does not address other direct uses of the builtin in C++, whose
use I'm not sure about but is not prohibited. Also other language
frontends may need to use the builtin in different functions, which has
`try`-`catch`es or destructors.
This makes `llvm.wasm.throw` invokable in the backend. To do that, this
adds a custom lowering routine to `SelectionDAGBuilder::visitInvoke`,
like we did for `llvm.wasm.rethrow`.
This does not generate `invoke`s for `__builtin_wasm_throw` yet, which
will be done by a follow-up PR.
Addresses #124710.
Unsafe operation in methods that are already annotated with
clang::unsafe_buffer_usage attribute, should not trigger a warning. This
is because, the developer has already identified the method as unsafe
and warning at every unsafe operation is redundant.
rdar://138644831
---------
Co-authored-by: MalavikaSamak <malavika2@apple.com>
Enough suport to parse correctly formed directives of !$OMP ASSUME and
!$OMP ASSUMES with teh related clauses that go with them: ABSENT,
CONTAINS, NO_OPENPP, NO_OPENMP_ROUTINES, NO_PARALLELISM and HOLDS.
Tests added for unparsing and dump parse-tree.
Semantics support is very minimal and no specific tests added.
The lowering will hit a TODO, and there are tests in Lower/OpenMP/Todo
to make it clear that this is currently expected behaviour.
---------
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Co-authored-by: Krzysztof Parzyszek <Krzysztof.Parzyszek@amd.com>
This commit adds the concept of a controlflow extension to the dialect
and updates the validation pass to check conf_if and while_loop are
supported only in the presence of the controlflow extension.
Signed-off-by: Luke Hutton <luke.hutton@arm.com>
Co-authored-by: Luke Hutton <luke.hutton@arm.com>
XOR(X, SUB(0, X)) corresponds to a bitwise-negated BLSMSK instruction
(i.e., x ^ (x - 1)). On its own, this transformation is probably not
really profitable but when the XOR operation is an operand of an AND
operation, we can use an ANDN instruction to reduce the number of
emitted instructions by one.
Fixes#103501.
This commit updates the Hexagon backend to handle
vxi1 call operands. It ensures compatibility for
vector types of sizes 4, 8, 16, 32, 64, and 128 x i1 when HVX is
enabled.
~Fixes #59009 and #118879~