This adds a feature to hdrgen to emit JSON summaries of header
files for build system integration. For now the summaries have
only the basic information about each header that is relevant for
build and testing purposes: the standards and includes lists.
There are a lot of lldb commands whose result is really one or more
ValueObjects that we then print with the ValueObjectPrinter. Now that we
have the ability to access the SBCommandReturnObject through a callback
(#125006), we can store the resultant ValueObjects in the return object,
allowing an IDE to access the SBValues and do its own rich formatting.
rdar://143965453
Looks like https://github.com/llvm/llvm-project/pull/124936 was reverted
(for modifying JSON output), but the test for JSON output with errors
was deleted in https://github.com/llvm/llvm-project/pull/126587 (to
attempt to fix failing build)
This will add back a test and a new one for llvm-dwarfdump to validate
the JSON for errors. One case where the sub-categories will eventually
appear and another where not.
test plan:
ninja check-llvm-tools-llvm-dwarfdump
Summary:
The scan operation implemented here only works if there are contiguous
ones in the executation mask that can be used to propagate the result.
There are two solutions to this, one is to enter 'whole-wave-mode' and
forcibly turn them back on, or to do this serially. This implementation
does the latter because it's more portable, but checks to see if the
parallel fast-path is applicable.
Needs to be backported for correct behavior and because it fixes a
failing libc test.
This patch adds codegen for all Arm9.6-a compare-and-branch
instructions, that operate on full w or x registers. The instruction
variants operating on half-words (cbh) and bytes (cbb) are added in a
subsequent patch.
Since CB doesn't use standard 4-bit Arm condition codes but a reduced
set of conditions, encoded in 3 bits, some conditions are expressed by
modifying operands, namely incrementing or decrementing immediate
operands and swapping register operands. To invert a CB instruction it's
therefore not enough to just modify the condition code which doesn't
play particularly well with how the backend is currently organized. We
therefore introduce a number of pseudos which operate on the standard
4-bit condition codes and lower them late during codegen.
Add option `AllowedTypes` which allow users to specify types they want
to exclude from const-correctness check.
Small real-world example:
```cpp
#include <mutex>
int main() {
std::mutex m;
std::lock_guard<std::mutex> l(m); // we want to ignore it since std::lock_guard is already immutable.
}
```
Closes issue https://github.com/llvm/llvm-project/issues/122592
Like a C++ member variable, every Objective-C++ instance variable must
be a RefPtr, Ref CheckedPtr, or CheckedRef to an object, not a raw
pointer or reference.
Move the most common if statement to the top and the least common ones
to the bottom. This should save CPU cycles during compilation.
This patch also prefixes the llvm variables with the LLVM prefix to make
the naming convention in this function more uniform. For example `C` to
`LLVMC`.
In order to facilitate cloning of recursive cycles, we first identify
backedges using a standard DFS search from the root callers, then
initially defer recursively invoking the cloning function via those
edges. This is because the cloning opportunity along the backedge may
not be exposed until the current node is cloned for other non-backedge
callers that are cold after the earlier recursive cloning, resulting
in a cold predecessor of the backedge. So we recursively invoke the
cloning function for the backedges during the cloning of the current
node for its caller edges (which were sorted to enable handling cold
callers first).
There was no significant time or memory overhead measured for several
large applications.
In removePartiallyOverlappedStores we iterate over
InstOverlapIntervalsTy which is a DenseMap. Change that map into using
MapVector to ensure that we apply the transforms in a deterministic
order. I've only seen that the order matters if starting to use names
for the instructions created when doing the transforms. But such things
are a bit annoying when debugging etc.
Tests have been re-generated with recent scheduler changes.
Original message:
SelectionDAG will not reassociate adds to the end of a chain if
there are multiple users of later additions. This prevents isel
from folding the immediate into a load/store address.
One easy way to see this is accessing an array in a struct with
two different indices. An ADDI will be used to get to the start
of the array then 2 different SHXADD instructions will be used to
add the scaled indices. Finally the SHXADD will be used by different
load instructions. We can remove the ADDI by folding the offset into
each load.
This patch adds a new pass that analyzes how an ADDI constant
propagates through address arithmetic. If the arithmetic is only
used by a load/store and the offset is small enough, we can adjust
the load/store offset and remove the ADDI.
This pass is placed before MachineCSE to allow cleanups if some
instructions become common after removing offsets from their inputs.
This pass gives ~3% improvement on dynamic instruction count on
541.leela_r and 544.nab_r from SPEC2017 for the train data set. There's
a ~1% improvement on 557.xz_r.
This pr fixes how iteration domain of linalg.generic is collapsed when
fusing with tensor.expand_shape. Previously, the output_shape for
tensor.expand shape was infered, which doesn't always work except some
special cases.
This patch makes the logic explicitly set the bounds of the new
collapsed iteration domain, because we already know them.
---------
Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>
This patch implements a new subclass of the Value class used for Sandbox
IR Values that we don't support, like metadata or inline asm. The goal
is to never have null sandboxir::Value objects, because this is not the
expected behavior.
There was a discrepancy between the type-converter and rewrite-pattern
parts of conversion to LLVM used in various GPU targets, at least ROCDL
and NVVM:
- The TypeConverter part was handling vectors of arbitrary rank,
converting them to nests of `!llvm.array< ... >` with a vector at the
inner-most dimension:
8337d01e30/mlir/lib/Conversion/LLVMCommon/TypeConverter.cpp (L629-L655)
- The rewrite pattern part was not handling `llvm.array`:
8337d01e30/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp (L594-L596)
That led to conversion failures when lowering `math` dialect ops on
rank-2 vectors, as in the testcase being added in this PR.
This PR fixes this by reusing a shared utility already used in other
conversions to LLVM:
8337d01e30/mlir/lib/Conversion/LLVMCommon/VectorPattern.cpp (L80-L104)
---------
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
Using LLVM build itself for PGO training is convenient and a great
starting point but it also has several issues:
* LLVM build implicitly depends on tools other than CMake and C/C++
compiler and if those tools aren't available in PATH, the build will
fail.
* LLVM build also requires standard headers and libraries which may not
always be available in the default location requiring an explicit
sysroot.
* Building a single configuration (-DCMAKE_BUILD_TYPE=Release) only
exercises the -O3 pipeline and can pesimize other configurations.
* Building for the host target doesn't exercise all other targets.
* Since LLVMSupport is a static library, this doesn't exercise the
linker (beyond what the CMake itself does).
Rather than using LLVM build, ideally we would provide a more minimal,
purpose built corpus. While we're working on building such a corpus,
provide a CMake option that lets vendors disable the use LLVM build for
PGO training.
SelectionDAG will not reassociate adds to the end of a chain if
there are multiple users of later additions. This prevents isel
from folding the immediate into a load/store address.
One easy way to see this is accessing an array in a struct with
two different indices. An ADDI will be used to get to the start
of the array then 2 different SHXADD instructions will be used to
add the scaled indices. Finally the SHXADD will be used by different
load instructions. We can remove the ADDI by folding the offset into
each load.
This patch adds a new pass that analyzes how an ADDI constant
propagates through address arithmetic. If the arithmetic is only
used by a load/store and the offset is small enough, we can adjust
the load/store offset and remove the ADDI.
This pass is placed before MachineCSE to allow cleanups if some
instructions become common after removing offsets from their inputs.
This pass gives ~3% improvement on dynamic instruction count on
541.leela_r and 544.nab_r from SPEC2017 for the train data set. There's
a ~1% improvement on 557.xz_r.
b41b86a907f653f79bab10d4c80b3a41d146c71b added a new textual header
`clang/Lex/HLSLRootSignatureTokenKinds.def` but did not add it to
`clang`'s module map. This causes build failure when building llvm with
`-DLLVM_ENABLE_MODULES=ON`. This PR adds the new textual header to the
module map and fixes the build break.
Fixing rdar://145148093.
For targets that support IEEE fminimum_num/fmaximum_num, the
corresponding *_min_num_fXY/*_max_num_fXY instructions themselves
already did the canonicalization for the inputs. As a result, we do not
need to explicitly canonicalize the inputs for fminnum/fmaxnum.
- Support finding implementors of a protocol and discovering subclasses for ObjC interfaces via the implementations call
- Support jumping to the overridden method when you trigger goto definition on an override
- Properly find references to overridden methods
Querying TTI creates a Subtarget object, but an llvm.memcpy declaration
doesn't have target-cpu and target-feature attributes like functions
with definitions. This can cause a warning to be printed on RISC-V
because the target-abi in the Module requires floating point, but the
subtarget features don't enable floating point. So far we've only seen
this in LTO when an -mcpu is not supplied for the TargetMachine.
To fix this, get TTI for the calling function instead.
Fixes the issue reported here
https://github.com/llvm/llvm-project/issues/69780#issuecomment-2665273161
This merges several falloff and noreturn-related warnings and
removes unused diagnostic arguments.
Changes:
- `warn_maybe_falloff_nonvoid_function` and
`warn_falloff_nonvoid_function`, `warn_maybe_falloff_nonvoid_coroutine`
and `warn_falloff_nonvoid_coroutine`,
`warn_maybe_falloff_nonvoid_lambda` and `warn_falloff_nonvoid_lambda`
were combined into `warn_falloff_nonvoid`,
- `err_maybe_falloff_nonvoid_block` and `err_falloff_nonvoid_block` were
combined into `err_falloff_nonvoid`
- `err_noreturn_block_has_return_expr` and
`err_noreturn_lambda_has_return_expr` were merged into
`err_noreturn_has_return_expr` with the same semantics as
`warn_falloff_nonvoid` or `err_falloff_nonvoid`.
- Removed some diagnostic args that weren’t being used by the diagnostics.
Minor NFC refactor before making better variable mask combining decisions - isTargetShuffleVariableMask doesn't discriminate between fast (AND, PSHUFB etc.) and slow (VPERMV3 etc.) variable shuffles, so an opaque HasVariableMask is only of limited use.
The mapping of IR ExitBB to a VPBB isn't used. It also sets an incorrect
VPBB for the ExitBB; the regions successor is the middle block, no the
exit block.
It also unnecessarily triggers an assertion after 38376dee922.
This refactor includes the following changes:
- Refactor similar tests using `types::for_each` to remove redundant code;
- Explicitly include the missing header `type_algorithms.h` in some test files;
- Some tests scattered in different test functions with ad-hoc names
(e.g., `test5()`, `test6()`) but belong to the same kind are now grouped
into one function (`test_struct_array()`).
for `trunc nuw` saves a instruction and otherwise only other
instructions without the select, same behavior as for bit test before.
proof: https://alive2.llvm.org/ce/z/a6QmyV