While reference comparators are a terrible idea and it's not entirely
clear whether they are supported, fixing the unintended ABI break is
straightforward so we should do it as a first step.
Fixes #118559
Add DAG legalization support for expanding i1 SETCC nodes using
appropriate logical operations to simulate integer comparisons. Use
these expansions to handle i1 SETCC in NVPTX.
Fixes #58428 and #57405
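The idea of expanding i1 comparisons into logical operations can be sketched in plain C++. The identities below are the standard ones for one-bit values; the function names are illustrative and are not taken from the patch, and the patch may select different but equivalent node sequences. For unsigned comparisons a set bit means 1, for signed comparisons it means -1, so the ordered predicates flip.

```cpp
#include <cassert>

// Unsigned i1: the only values are 0 and 1.
constexpr bool ult_i1(bool a, bool b) { return !a && b; }  // a <u  b
constexpr bool ule_i1(bool a, bool b) { return !a || b; }  // a <=u b
constexpr bool ugt_i1(bool a, bool b) { return a && !b; }  // a >u  b
constexpr bool uge_i1(bool a, bool b) { return a || !b; }  // a >=u b

// Signed i1: a set bit denotes -1, so the ordering is reversed.
constexpr bool slt_i1(bool a, bool b) { return a && !b; }  // a <s  b
constexpr bool sle_i1(bool a, bool b) { return a || !b; }  // a <=s b
constexpr bool sgt_i1(bool a, bool b) { return !a && b; }  // a >s  b
constexpr bool sge_i1(bool a, bool b) { return !a || b; }  // a >=s b

// Equality does not depend on signedness.
constexpr bool eq_i1(bool a, bool b) { return !(a ^ b); }
constexpr bool ne_i1(bool a, bool b) { return a ^ b; }

// Exhaustively compare the logical forms against real integer comparisons.
inline bool matches_integer_semantics() {
  for (int ai = 0; ai <= 1; ++ai) {
    for (int bi = 0; bi <= 1; ++bi) {
      const bool a = ai != 0, b = bi != 0;
      const int sa = a ? -1 : 0, sb = b ? -1 : 0;  // signed interpretation
      if (ult_i1(a, b) != (ai < bi)) return false;
      if (ule_i1(a, b) != (ai <= bi)) return false;
      if (ugt_i1(a, b) != (ai > bi)) return false;
      if (uge_i1(a, b) != (ai >= bi)) return false;
      if (slt_i1(a, b) != (sa < sb)) return false;
      if (sle_i1(a, b) != (sa <= sb)) return false;
      if (sgt_i1(a, b) != (sa > sb)) return false;
      if (sge_i1(a, b) != (sa >= sb)) return false;
      if (eq_i1(a, b) != (ai == bi)) return false;
      if (ne_i1(a, b) != (ai != bi)) return false;
    }
  }
  return true;
}
```

Calling `matches_integer_semantics()` checks all four input combinations for every predicate.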
It's really not useful at all to run benchmarks without --show-all since
you don't get the benchmark output. And since --show-all is the suggested
default way to run benchmarks, it's not necessary anymore to mention it
right below.
The increasing_allocator<T> class, originally introduced to test shrink_to_fit()
for std::vector, std::vector<bool>, and std::basic_string, has duplicated
definitions across several test files. Given the potential utility of this
class for capacity-related tests in various sequence containers, this patch
refactors the definition of increasing_allocator<T> into a single, reusable
location.
This is a prep patch for improving spread(4,8) shuffles. I also think it
improves the readability of the existing code, but the primary
motivation is simply staging work.
Given a `noexcept` operator with an operand that calls `delete`, Clang
was not considering whether the selected `operator delete` function was
a destroying delete or not when inspecting whether the deleted object
type has a throwing destructor. Thus, the operator would return `false`
for a type with a potentially throwing destructor even though that
destructor would not be called due to the destroying delete. Clang now
takes the kind of delete operator into consideration.
Fixes #118660
Add a `clock_gettime` emulation layer and use it to implement the `time`
entrypoint.
For windows, the monotonic clock is emulated using `QPC`.
The realtime clock is emulated using `GetSystemTimePreciseAsFileTime`.
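The arithmetic behind such an emulation layer can be sketched portably. The code below is not the patch's implementation: `TimeSpec`, `monotonic_from_ticks`, and `realtime_from_filetime` are hypothetical names, and the raw inputs stand in for what `QueryPerformanceCounter`/`QueryPerformanceFrequency` and `GetSystemTimePreciseAsFileTime` would return. It relies on two facts: QPC reports ticks at a fixed frequency, and a FILETIME counts 100 ns intervals since 1601-01-01 UTC.

```cpp
#include <cstdint>

// Hypothetical stand-in for struct timespec.
struct TimeSpec {
  std::int64_t tv_sec;
  std::int64_t tv_nsec;
};

// Convert a performance-counter reading to seconds + nanoseconds.
// Splitting off whole seconds first keeps the multiplication small:
// the remainder is below `freq`, so rem * 1e9 fits in 64 bits for
// realistic counter frequencies.
constexpr TimeSpec monotonic_from_ticks(std::int64_t ticks, std::int64_t freq) {
  return TimeSpec{ticks / freq, (ticks % freq) * 1000000000LL / freq};
}

// Number of 100 ns intervals between 1601-01-01 and 1970-01-01 (UTC).
constexpr std::int64_t kEpochDiff100ns = 116444736000000000LL;

// Convert a 64-bit FILETIME value to time since the Unix epoch.
// Assumes the timestamp is not earlier than 1970.
constexpr TimeSpec realtime_from_filetime(std::uint64_t filetime) {
  return TimeSpec{
      (static_cast<std::int64_t>(filetime) - kEpochDiff100ns) / 10000000LL,
      ((static_cast<std::int64_t>(filetime) - kEpochDiff100ns) % 10000000LL) * 100};
}
```

For instance, 25,000,000 ticks at a 10 MHz counter frequency correspond to 2 seconds and 500,000,000 nanoseconds.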
This PR introduces a new checker
`[alpha.webkit.MemoryUnsafeCastChecker]` that warns on all downcasts from a base type to a derived type.
rdar://137766829
This is very similar to 'gang', except with fewer restrictions, and only an
interaction with 'num_workers', plus disallowing 'gang' and 'worker' in
its associated statement. This patch implements it the same way 'gang'
was implemented.
If this test fails, you're likely going to see something like
"AssertionError: A != B", which doesn't really explain why the test
failed.
Instead of ignoring the error, we should assert that it succeeded. This
will lead to a better error message, for example:
`AssertionError: 'memory write failed for 0x102d7c018' is not success`
Specifically, usernames containing `handle`, such as `chandlerc`, often
end up in paths, including the path of this test file which contains the
word `overflow`. Combined, they create a match for `handle.*overflow` in
the filename on my system (but likely not many others), leading this
test to mysteriously fail for unfortunate usernames like mine. =D
No discussion of the amount of time I spent debugging this please. =[
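The accidental match is easy to reproduce in isolation. This is a minimal sketch using `std::regex` rather than whatever tool the test actually runs, and both paths are made up for illustration.

```cpp
#include <regex>
#include <string>

// Returns true if the text contains a match for the pattern from the test.
inline bool matches_handle_overflow(const std::string& text) {
  static const std::regex pattern("handle.*overflow");
  return std::regex_search(text, pattern);
}

// Hypothetical paths: "chandlerc" contains "handle", and the file name
// contains "overflow", so the first path matches as a whole.
inline const char* path_with_unlucky_username() {
  return "/home/chandlerc/src/llvm-project/test/overflow.c";
}
inline const char* path_with_other_username() {
  return "/home/alice/src/llvm-project/test/overflow.c";
}
```

Only the first path matches, so a check that is meant to reject `handle.*overflow` in the output fails purely because of where the checkout lives.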
Version of SYCL was changed according to the latest agreement:
The lower 2 digits are not formally specified, but we plan to use these
to identify the month in which we submit the specification for
ratification, which is similar to the C++ macro __cplusplus.
Since the SYCL 2020 specification was submitted for ratification in
December of 2020, the macro's value is now 202012 for SYCL 2020.
See the PR for details:
https://github.com/KhronosGroup/SYCL-Docs/pull/634
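A small sketch of how the value decomposes, assuming the macro in question is `SYCL_LANGUAGE_VERSION` (the commit message does not name it). The helper names are illustrative only.

```cpp
// Split a YYYYMM-style version value into its parts.
constexpr long version_year(long value) { return value / 100; }
constexpr long version_month(long value) { return value % 100; }

// Value described above for SYCL 2020.
constexpr long kSycl2020Version = 202012;

static_assert(version_year(kSycl2020Version) == 2020, "year part");
static_assert(version_month(kSycl2020Version) == 12, "month part (December)");

// Typical feature gate, mirroring how __cplusplus is commonly tested.
#if defined(SYCL_LANGUAGE_VERSION) && SYCL_LANGUAGE_VERSION >= 202012
constexpr bool kHasSycl2020 = true;
#else
constexpr bool kHasSycl2020 = false;  // not compiling as SYCL 2020
#endif
```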
This allows us to handle small coerced structs that are passed as
[2 x i64]. This is one of the last big reasons for -O0 fallbacks
in some of my testing.
Patch [1] updated the intrinsic interface for ld1/st1. According to
ARM's documentation, "If the intrinsic also has a vnum argument, the ZA
slice number is calculated by adding vnum to slice." However, "vnum" does
not work in our current implementation; this patch fixes that.
[1] ee31ba0dd9
Adds a ValueBoundsOpInterface implementation for scf.forall ops. The
implementation supports bounding for induction variables, results,
and block args of the forall op. Induction variables are given upper and
lower bounds based on the lower and upper loop bounds, and dimensions of
the results and init block arguments are constrained to be equal to the
matching dims of the shared_outs operand.
Signed-off-by: Max Dawkins <maxdawkins19@gmail.com>
Co-authored-by: Max Dawkins <maxdawkins19@gmail.com>
An array SUM with the specified constant DIM argument
may be expanded into hlfir.elemental with a reduction loop
inside it processing all elements of the specified dimension.
The expansion allows further optimization of the cases like
`A=SUM(B+1,DIM=1)` in the optimized bufferization pass
(given that it can prove there are no read/write conflicts).
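The shape of the expanded computation for `A=SUM(B+1,DIM=1)` can be sketched in C++ as an outer elemental loop over the result with an inner reduction loop over the summed dimension. This only illustrates the loop structure, not the HLFIR the pass produces; the function name and the flat column-major layout are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// b holds an n-by-m array in column-major order: B(i, j) is b[i + j * n].
// Returns A of size m with A(j) = sum over i of (B(i, j) + 1).
inline std::vector<int> sum_dim1_of_b_plus_1(const std::vector<int>& b,
                                             std::size_t n, std::size_t m) {
  std::vector<int> a(m);
  for (std::size_t j = 0; j < m; ++j) {    // one elemental iteration per result element
    int acc = 0;
    for (std::size_t i = 0; i < n; ++i) {  // reduction over dimension 1
      acc += b[i + j * n] + 1;             // B+1 is computed inline, no temporary array
    }
    a[j] = acc;
  }
  return a;
}
```

Because `B+1` is evaluated inside the reduction loop, no temporary array for the operand is needed, which is the kind of result the further optimization is after.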
The optimized bufferization pass cannot optimize very simple cases of
elemental assignments because of the suboptimal order of its checks.
This patch relies
on the fact that in a legal program the lhs and rhs of an assignment
have matching shapes, when lhs is not an allocatable and rhs is a result
of an elemental array operation.
The test `SocketTest::TCPListen0MultiListenerGetListeningConnectionURI`
is failing on hosts that do not map `localhost` to both an ipv4 and ipv6
address. For example this build
https://lab.llvm.org/buildbot/#/builders/195/builds/1909.
To fix this, I added a helper to validate if the host has an /etc/hosts
entry for both ipv4 and ipv6, otherwise we skip the test.
This is an alternative to #117866 that works by demanding a valid vtype
instead of using a separate pass.
The main advantage of this is that it allows coalesceVSETVLIs to just
reuse an existing vsetvli later in the block.
To do this we need to first transfer the vsetvli info to some arbitrary
valid state in transferBefore when we encounter a vector copy. Then we
add a new vill demanded field that will happily accept any other known
vtype, which allows us to coalesce these where possible.
Note we also need to check for vector copies in computeVLVTYPEChanges,
otherwise the pass will completely skip over functions that only have
vector copies and nothing else.
This is one part of a fix for #114518. We still need to check whether
there are other cases where vector copies/whole register moves are
inserted after vsetvli insertion.
The patch makes InstrProfWriter::writeImpl less monolithic by adding
InstrProfWriter::writeBinaryIds to serialize binary IDs. This way,
InstrProfWriter::writeImpl can simply call the new function instead of
handling all the details within writeImpl.
The following pattern: `(C2 << X) << C1` will usually be transformed
into `(C2 << C1) << X`, essentially swapping `X` and `C1`.
However, this should only be done when `C1` is an immediate constant,
otherwise this can lead to both constants being swapped forever.
This fixes #118798.
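The reordering itself is value-preserving, which a quick sketch on 32-bit unsigned integers can confirm. The function names are illustrative; both shift amounts are assumed to be below the bit width so that each individual shift is well defined.

```cpp
#include <cstdint>

// Original form: (C2 << X) << C1.
constexpr std::uint32_t shl_original(std::uint32_t c2, unsigned x, unsigned c1) {
  return (c2 << x) << c1;
}

// Reordered form: (C2 << C1) << X.
constexpr std::uint32_t shl_swapped(std::uint32_t c2, unsigned x, unsigned c1) {
  return (c2 << c1) << x;
}

// Check equality for every pair of in-range shift amounts.
inline bool forms_agree(std::uint32_t c2) {
  for (unsigned x = 0; x < 32; ++x)
    for (unsigned c1 = 0; c1 < 32; ++c1)
      if (shl_original(c2, x, c1) != shl_swapped(c2, x, c1)) return false;
  return true;
}
```

Since the two forms are symmetric, a rewrite that fires whenever the outer shift amount is a constant would also fire on its own output if `X` were a constant too, which is the endless swapping described above.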
Fix -Wunused-private-field incorrectly suppressing warnings for friend
defaulted comparison operators. The warning should only be suppressed
when the defaulted comparison is a class member function.
Fixes #116270
This patch updates the matchSplatAsGather function so we can handle vectors of different sizes. The goal is to improve the code gen for @llvm.experimental.vector.match on RISCV.
Currently, we use a scalar extract and splat instead of vrgather, and the patch changes that.
The Lexer used in getRawToken is not told to keep whitespace, so when it
skips over escaped newlines, it also ignores whitespace, regardless of
getRawToken's IgnoreWhiteSpace parameter.
Instead of letting this case fall through to lexing, check
for whitespace after skipping over any escaped newlines.
This one is a bit complicated, as it has some interesting interactions,
as 'gang' Sema is required to look at its containing compute construct.
Except in the case of a combined construct, they are the same. This
resulted in a large refactor of the checking code for CheckGangExpr,
plus some additional work on the diagnostics for its interaction with
'num_gangs' and 'vector'/'worker'.
This is a follow up to #112015 and it reduces the unnecessary
duplication of source locations further.
We do not need to allocate source location space in the serialized PCMs
for module maps used only to find textual headers. Those module maps are
never referenced from anywhere in the serialized ASTs and are re-read in
other compilations.
This change should not affect correctness of Clang compilations or
clang-scan-deps in any way.
We do need the InputFile entry in the serialized AST because
clang-scan-deps relies on it. The previous patch introduced a mechanism
to do exactly that.
We have found that this finally removes any duplication of the module
maps we use internally in our build system.