The patch adds a nullptr check before accessing the loop blocks in
'hasPossiblyDistributableLoop' function. The existing check for the
loop’s containment in the region does not capture nullptr cases when the
region covers the entire function. Therefore, it’s better to exit if the
basic block isn’t part of any loop
Fixes#113772.
This PR resolves a crash triggered by a forward reference to an enum
type in a function parameter list. The fix includes setting `Invalid`
when `TagUseKind` is `Declaration` to ensure correct error handling.
Fixes#112208
This patch intersects attributes of two calls to avoid introducing UB.
It also skips incompatible call pairs in GVN/NewGVN. However, I cannot
provide negative tests for these changes.
Fixes https://github.com/llvm/llvm-project/issues/113997.
In https://github.com/llvm/llvm-project/pull/109837, it sets a global
variable(`PGOInstrumentColdFunctionOnly`) in PassBuilderPipelines.cpp
which introduced a data race detected by TSan. To fix this, I decouple
the flag setting, the flags are now set
separately(`instrument-cold-function-only-path` is required to be used
with `--pgo-instrument-cold-function-only`).
Relands #114356. Compared to the last version, this patch only merges
poison-generating/nsz flags from the select to fix LV regression in
`llvm/test/Transforms/PhaseOrdering/AArch64/predicated-reduction.ll`.
As a drive-by, remove unused functor inside the unordered_set benchmark.
That benchmark still isn't very exhaustive, but that can be addressed
separately.
_SC_OPEN_MAX is quite high on FreeBSD, which makes this close() loop a
quite obvious problem when attempting to do any kind of debugging in a
process that uses StartSubprocess. Switch to using close_range(2)
instead to close them all in a single syscall and dramatically reduce
the runtime and syscall trace noise
Linux has an equivalent syscall, but I do not have the capacity to test
that it works there, so this is limited to SANITIZER_FREEBSD for the
time being.
In case where a fir.global might be duplicated in an inner module
(gpu.module), the conversion pattern will be applied on the module and
the gpu module version of the global and try to generate multiple comdat
with the same symbol name. This is what we have in the implementation of
CUDA Fortran.
Just check for the presence of the `ComdatSelectorOp` before creating a
new one.
If we know the output arity at tablegen time, we can often just call
.result or .results directly.
This saves almost 1s in a JAX-based LLM benchmark building a mixture of
upstream dialects and StableHLO.
This PR removes the `HeaderFileInfo::Framework` member and reduces the
size of this data type from 32B to 16B. This should improve Clang's
memory usage in situations where it keeps track of lots of header files.
NFCI. Depends on #114459.
This PR removes the `-index-header-map` functionality from Clang. AFAIK
this was only used internally at Apple and is now dead code. The main
motivation behind this change is to enable the removal of
`HeaderFileInfo::Framework` member and reducing the size of that data
structure.
rdar://84036149
We shouldn't be including headers directly from asm-generic for macros.
It's safer to get those through the correct primary header where
possible (e.g. fcntl instead of asm-generic/fcntl).
For our public headers we may need to include the asm-generic
headers instead of defining all the macros ourselves, but that's
something for a followup PR.
Previously shm_test was including <asm-generic/fcntl.h> for the macro
FD_CLOEXEC. This patch adds that macro to the list we have defined, and
redirects the test to use the correct proxy header.
The proxy header definition for mode_t was using an incorrect form for
its dependency on fcntl_overlay. The relative paths for dependencies can
only go down, not up so "../" doesn't work. This patch fixes it to be
absolute.
Update VPlan to include the scalar loop header. This allows retiring
VPLiveOut, as the remaining live-outs can now be handled by adding
operands to the wrapped phis in the scalar loop header.
Note that the current version only includes the scalar loop header, no
other loop blocks and also does not wrap it in a region block.
PR: https://github.com/llvm/llvm-project/pull/109975
This is a fixed version of #106185, which was reverted in #113978 due to
a buildbot failure.
Motivation example:
```
> cat test.cpp
extern "C" [[gnu::weak]] void f() {}
void alias() __attribute__((alias("f")));
int main() { auto p = alias; p(); }
> clang test.cpp -fsanitize=cfi-icall -flto=thin -fuse-ld=lld
> ./a.out
[1] 1868 illegal hardware instruction ./a.out
```
If the address of a function was only taken through its alias, the
function was not considered exported and therefore was not included in
the CFI jumptable. This resulted in `@llvm.type.test()` being lowered to
`false`, and consequently the indirect call to the function was
eventually optimized to `ubsantrap()`.
Fixes: https://github.com/llvm/llvm-project/issues/71030
Bug only happens in 64bit involving spills. Since we don't know when the
spill will happen, all instructions in the chain used to deduce sign
extension for eliminating 'extsw' will need to be promoted to 64-bit
pseudo instructions.
The following instruction will promoted in PPCMIPeepholes: EXTSH, LHA,
ISEL to EXTSH8, LHA8, ISEL8
We want to support CFI instrumentation for the bitcode section, without
miscompiling the object code portion of a FatLTO object. We can reuse
the existing mechanisms in the LowerTypeTestsPass to do that, by just
adding the pass to the FatLTO pipeline after the EmbedBitcodePass with
the correct options set.
Fixes#112053
Unifies parsing and printing for DLTI attributes. Introduces a format of
`#dlti.attr<key1 = val1, ..., keyN = valN>` syntax for all queryable
DLTI attributes similar to that of the DictionaryAttr, while retaining
support for specifying key-value pairs with `#dlti.dl_entry` (whether to
retain this is TBD).
As the new format does away with most of the boilerplate, it is much easier
to parse for humans. This makes an especially big difference for nested
attributes.
Updates the DLTI-using tests and includes fixes for misc error checking/
error messages.
When expanding an atomicrmw with a cmpxchg, preserve any metadata
attached to it. This will avoid unwanted double expansions
in a future commit.
The initial load should also probably receive the same metadata
(which for some reason is not emitted as an atomic).
This reverts commit 5ce8a93ebb7688c66bd08fa27d2e04bca33b6763.
Commit 1bc58a258e2edb6221009a26d0f0037eda6c7c47 was reverted in
a9a8351ef181610559687679a40efe24649f9600, so the bazel fix is no longer
needed.
It's possible that we encounter Irreducible control flow, due to which,
we may find that a few predecessors of BB are not a part of the CurLoop.
Currently we crash in the function for such cases. This patch ensures
that we only return Predecessors that are a part of CurLoop and
gracefully ignore other Predecessors.
For example, consider Irreducible IR of this form:
```
define i64 @baz() {
bb:
br label %bb1
bb1: ; preds = %bb3, %bb
br label %bb3
bb2: ; No predecessors!
br label %bb3
bb3: ; preds = %bb2, %bb1
%load = load ptr addrspace(1), ptr addrspace(1) null, align 8
br label %bb1
}
```
This crashes when `collectTransitivePredecessors` is called on the
`%bb1<Header>, %bb3<latch>` loop, because the loop body has a
predecessor `%bb2` which is not a part of the loop.
See https://godbolt.org/z/E9fM1q3cT for the crash
My earlier patch https://github.com/llvm/llvm-project/pull/98200 caused a regression because it unconditionally unblocked synchronous signals, even if the user program had deliberately blocked them. This patch fixes the issue by checking the current signal mask, as suggested by Vitaly. It also adds tests.
Fixes#113385
---------
Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
The nested switch exists to share setting IntrinsicsTypes to {ResultType}.
clz/ctz return before we reach that so they can just be in the top
level switch.