This patch fixes:
llvm/lib/Target/AMDGPU/SILateBranchLowering.cpp:173:64: error:
comparison of integers of different signs: 'unsigned int' and 'int'
[-Werror,-Wsign-compare]
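A warning like this is usually resolved by giving both operands of the
comparison the same signedness. A generic sketch of that pattern (not the
actual SILateBranchLowering change) is:
```
// Generic -Wsign-compare fix: compare values of matching signedness.
bool indexInRange(unsigned Idx, int Count) {
  // Comparing Idx < Count directly mixes unsigned and signed operands;
  // checking the sign and casting keeps the comparison well-defined and
  // warning-free.
  return Count >= 0 && Idx < static_cast<unsigned>(Count);
}
```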
* libcxx/test/support/min_allocator.h
+ Fix `tiny_size_allocator::rebind` which mistakenly said `T` instead of
`U`.
* libcxx/test/std/algorithms/alg.modifying.operations/alg.partitions/stable_partition.pass.cpp
+ `std::stable_partition` requires bidirectional iterators.
* libcxx/test/std/containers/sequences/vector.bool/max_size.pass.cpp
+ Fix allocator type given to `std::vector<bool>`. The element types are
required to match, [N5008](https://isocpp.org/files/papers/N5008.pdf)
\[container.alloc.reqmts\]/5: "*Mandates:* `allocator_type::value_type`
is the same as `X::value_type`."
* libcxx/test/std/time/time.clock/time.clock.utc/types.compile.pass.cpp
+ Mark `is_steady` as `[[maybe_unused]]`, as it appears within
`LIBCPP_STATIC_ASSERT` only.
* libcxx/test/std/algorithms/alg.modifying.operations/alg.rotate/rotate.pass.cpp
* libcxx/test/std/algorithms/alg.modifying.operations/alg.swap/swap_ranges.pass.cpp
* libcxx/test/std/utilities/utility/utility.swap/swap_array.pass.cpp
+ Fix MSVC warning C4127 "conditional expression is constant".
`TEST_STD_AT_LEAST_23_OR_RUNTIME_EVALUATED` was introduced for this
purpose, so it should be used consistently.
* libcxx/test/std/numerics/numeric.ops/numeric.ops.gcd/gcd.pass.cpp
+ Fix `gcd()` precondition violation for `signed char`. This test case
was causing `-128` to be passed as a `signed char` to `gcd()`, which is
forbidden (see the sketch after this list).
* libcxx/test/std/containers/sequences/array/assert.iterators.pass.cpp
* libcxx/test/std/containers/sequences/vector/vector.modifiers/assert.push_back.invalidation.pass.cpp
* libcxx/test/std/input.output/iostream.format/print.fun/no_file_description.pass.cpp
+ Split some REQUIRES and XFAIL lines. This is a "nice to have" for
MSVC's internal test harness, which is extremely simple and looks for
exact comment matches to skip tests. We can recognize the specific lines
"REQUIRES: has-unix-headers" and "XFAIL: msvc", but it's a headache to
maintain if they're chained with other conditions.
* libcxx/test/support/sized_allocator.h
+ Fix x86 truncation warnings. `std::allocator` takes `std::size_t`, so
we need to `static_cast`.
* libcxx/test/std/input.output/file.streams/fstreams/ifstream.members/offset_range.pass.cpp
+ Fix x86 truncation warning. `std::min()` is returning
`std::streamoff`, which was being unnecessarily narrowed to
`std::size_t`.
* libcxx/test/std/algorithms/alg.sorting/alg.merge/inplace_merge_comp.pass.cpp
+ Fix MSVC warning C4127 "conditional expression is constant" for an
always-true branch. This was very recently introduced by #129008, which
made `N` constexpr. As `N` is a local constant defined just nine lines
above, we don't need to test whether 100 is greater than 0.
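As a minimal sketch of the `gcd()` item above (an illustration of the
precondition, not the actual test code):
```
#include <numeric>
#include <type_traits>

int main() {
  signed char m = -128; // SCHAR_MIN on typical platforms
  signed char n = 4;
  // std::gcd requires |m| and |n| to be representable in
  // common_type_t<M, N>, which is signed char here; |-128| == 128 is not,
  // so std::gcd(m, n) would violate the precondition.
  static_assert(std::is_same_v<std::common_type_t<signed char, signed char>,
                               signed char>);
  // Widening the arguments first keeps the precondition satisfied:
  return std::gcd(static_cast<int>(m), static_cast<int>(n)) == 4 ? 0 : 1;
}
```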
The llvm.amdgcn.cs.chain intrinsic has a 'flags' operand which may
indicate that we want to reallocate the VGPRs before performing the
call.
A call with the following arguments:
```
llvm.amdgcn.cs.chain %callee, %exec, %sgpr_args, %vgpr_args,
/*flags*/0x1, %num_vgprs, %fallback_exec, %fallback_callee
```
is supposed to do the following:
- copy the SGPR and VGPR args into their respective registers
- try to change the VGPR allocation
- if the allocation has succeeded, set EXEC to %exec and jump to
%callee, otherwise set EXEC to %fallback_exec and jump to
%fallback_callee
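In C-like terms, the intended behaviour roughly amounts to the sketch
below (illustrative only; the names are made up and this is not compiler
code):
```
#include <cstdint>

struct ChainTarget {
  uint64_t Exec;    // value to install in EXEC
  void (*Callee)(); // jump target
};

// Models the choice made by the expanded S_ALLOC_VGPR + S_CSELECT_B32/64
// sequence: on successful VGPR reallocation use the requested EXEC and
// callee, otherwise fall back.
ChainTarget selectChainTarget(bool AllocSucceeded, uint64_t Exec,
                              void (*Callee)(), uint64_t FallbackExec,
                              void (*FallbackCallee)()) {
  if (AllocSucceeded)
    return {Exec, Callee};
  return {FallbackExec, FallbackCallee};
}
```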
This patch implements the dynamic VGPR behaviour by generating an
S_ALLOC_VGPR followed by S_CSELECT_B32/64 instructions for the EXEC and
callee. The rest of the call sequence is left undisturbed (i.e.
identical to the case where the flags are 0 and we don't use dynamic
VGPRs). We achieve this by introducing some new pseudos
(SI_CS_CHAIN_TC_Wn_DVGPR) which are expanded in the SILateBranchLowering
pass, just like the simpler SI_CS_CHAIN_TC_Wn pseudos. The main reason
is so that we don't risk other passes (particularly the PostRA
scheduler) introducing instructions between the S_ALLOC_VGPR and the
jump. Such instructions might end up using VGPRs that have been
deallocated, or the wrong EXEC mask. Once the whole backend treats
S_ALLOC_VGPR and changes to EXEC as barriers for instructions that use
VGPRs, we could in principle move the expansion earlier (but in the
absence of a good reason for that my personal preference is to keep it
later in order to make debugging easier).
Since the expansion happens after register allocation, we're careful to
select constants to immediate operands instead of letting ISel generate
S_MOVs which could interfere with register allocation (i.e. make it look
like we need more registers than we actually do).
For GFX12, S_ALLOC_VGPR only works in wave32 mode, so we bail out during
ISel in wave64 mode. However, we can define the pseudos for wave64 too
so it's easy to handle if future generations support it.
---------
Co-authored-by: Ana Mihajlovic <Ana.Mihajlovic@amd.com>
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Similar to other operations, s8, s16, s32 and s64 vector elements are
clamped to legal vector sizes, odd numbers of elements are widened to the
next power of 2, and s128 is scalarized.
This helps legalize cttz as well as ctpop.
Like other operations larger than i64, we scalarize i128 and allow the
pieces to legalize from there. This also helps with v2i64 udiv by
constant, which needs to legalize a umulh.
I was trying a change that caused some of the error messages in this
test to change. I was having trouble updating the test because FileCheck
kept scanning ahead when I missed updating a line. Checking the line number
should make it easier to update.
This moves two functions from Platform to Host:
1. GetCurrentXcodeToolchainDirectory
2. GetCurrentCommandLineToolsDirectory.
These two functions caused a layering violation in the Swift fork, which
added a dependency from lldbHost to lldbPlatform. As shown by this PR,
there's no need for these two functions to live in Platform, and we
already have similar functions in Host.
We have various layering violations but this one is particularly bad,
because lldb-dap started depending on lldbHost. On the Swift fork, this
library was depending on lldbPlatform which pulled in various Swift
files, which libLLDB needs, but lldb-dap itself does not. We were
missing RPATHs to resolve them, so in the current nightly, lldb-dap
crashes because the dynamic loader can't find the missing Swift libs.
rdar://146537366
Before this patch, whole program devirtualization is suppressed on a
class if any superclass is visible to regular object files, by recording
the class GUID in `VisibleToRegularObjSymbols`.
This patch suppresses whole program devirtualization on a class if the
LTO unit doesn't have the prevailing definition of this class (e.g., the
prevailing definition is in a shared library).
Implementation summaries:
1. In llvm/lib/LTO/LTO.cpp, `IsVisibleToRegularObj` is updated to look
at the global resolution's `IsPrevailing` bit for ThinLTO and
regularLTO.
2. In llvm/tools/llvm-lto2/llvm-lto2.cpp, three command-line options are
added so `llvm-lto2` can override `Conf.HasWholeProgramVisibility`,
`Conf.ValidateAllVtablesHaveTypeInfos` and `Conf.AllVtablesHaveTypeInfos`.
The test case is reduced from a small C++ program (main.cc, lib.cc/h
pasted below in [1]). To reproduce the program failure without this
patch, compile lib.cc into a shared library, and provide it to a ThinLTO
build of main.cc (commands are pasted in [2]).
[1]
* lib.h
```
#include <cstdio>

class Derived {
public:
  void dispatch();
  virtual void print();
  virtual void sum();
};

void Derived::dispatch() {
  static_cast<Derived*>(this)->print();
  static_cast<Derived*>(this)->sum();
}

void Derived::sum() {
  printf("Derived::sum\n");
}

__attribute__((noinline)) void* create(int i);
__attribute__((noinline)) void* getPtr(int i);
```
* lib.cc
```
#include "lib.h"
#include <cstdio>
#include <iostream>

class Derived2 : public Derived {
public:
  void print() override {
    printf("DerivedSharedLib\n");
  }
  void sum() override {
    printf("DerivedSharedLib::sum\n");
  }
};

void Derived::print() {
  printf("Derived\n");
}

__attribute__((noinline)) void* create(int i) {
  if (i & 1)
    return new Derived2();
  return new Derived();
}
```
* main.cc
```
#include "lib.h"

class DerivedN : public Derived {
public:
};

__attribute__((noinline)) void* getPtr(int x) {
  return new DerivedN();
}

int main() {
  Derived* b = static_cast<Derived*>(create(201));
  b->dispatch();
  delete b;
  Derived* a = static_cast<Derived*>(getPtr(202));
  a->dispatch();
  delete a;
  return 0;
}
```
[2]
```
# Compile lib.cc into lib.o and link it into a shared library.
$ ./bin/clang++ -O2 -fPIC -c lib.cc -o lib.o
$ ./bin/clang++ -shared -o libdata.so lib.o
# Provide the shared library in `-ldata`
$ ./bin/clang++ -v -g -ldata --save-temps -fno-discard-value-names -Wl,-mllvm,-print-before=wholeprogramdevirt -Wl,-mllvm,-wholeprogramdevirt-check=trap -Rpass=wholeprogramdevirt -Wl,--lto-whole-program-visibility -Wl,--lto-validate-all-vtables-have-type-infos -mllvm -disable-icp=true -Wl,-mllvm,-disable-icp=false -flto=thin -fwhole-program-vtables -fno-split-lto-unit -fuse-ld=lld main.cc -L . -o main >/tmp/wholeprogramdevirt.ir 2>&1
# Running the program hits the trap inserted by `-Wl,-mllvm,-wholeprogramdevirt-check=trap`
$ LD_LIBRARY_PATH=. ./main
DerivedSharedLib
Trace/breakpoint trap (core dumped)
```
The non-freeze poison argument to select can be one of the following: a
global, a constant, or a noundef argument.
Alive2 test validation: https://alive2.llvm.org/ce/z/jbtCS6
This adds an explicit handler for:
- llvm.aarch64.neon.ld1x2, llvm.aarch64.neon.ld1x3,
llvm.aarch64.neon.ld1x4
- llvm.aarch64.neon.ld2, llvm.aarch64.neon.ld3, llvm.aarch64.neon.ld4
- llvm.aarch64.neon.ld2lane, llvm.aarch64.neon.ld3lane,
llvm.aarch64.neon.ld4lane
- llvm.aarch64.neon.ld2r, llvm.aarch64.neon.ld3r, llvm.aarch64.neon.ld4r
instead of relying on the default strict handler.
Updates the tests from https://github.com/llvm/llvm-project/pull/125267
This patch does two things:
1. It implements an ephemeral values cache analysis pass that collects the ephemeral values of a function and caches them for fast lookups. The collection of the ephemeral values is done lazily when the user calls `EphemeralValuesCache::ephValues()`.
2. It adds caching of ephemeral values using the `EphemeralValuesCache` to speed up `CallAnalyzer::analyze()`. Without caching, this can take a long time to run in cases where the function contains a large number of `@llvm.assume()` calls and a large number of callsites. The time is spent in `collectEphemeralValues()`.
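A rough sketch of the lazy collect-then-cache behaviour described in (1);
everything here other than the `ephValues()` name is an assumption for
illustration, not the actual LLVM class:
```
#include <optional>
#include <set>

class Instruction; // stand-in for the real IR instruction type

class EphemeralValuesCacheSketch {
  std::optional<std::set<const Instruction *>> EphVals;

  std::set<const Instruction *> collect() const {
    return {}; // placeholder for the real walk over @llvm.assume users
  }

public:
  const std::set<const Instruction *> &ephValues() {
    if (!EphVals)
      EphVals = collect(); // collected lazily on first query, then reused
    return *EphVals;
  }
};
```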
TARGET dummy arrays can be accessed indirectly, so it is unsafe
to repack them.
INTENT(OUT) dummy arrays that require finalization on entry
to their subroutine must be copied-in by `fir.pack_arrays`.
In addition, based on my testing results, I think it will be useful
to document that `LOC` and `IS_CONTIGUOUS` will have different values
for the repacked arrays. I still need to decide where to document
this, so just added a note in the design doc for the time being.
This pull request is the third part of an ongoing effort to extend PGO
instrumentation to GPU device code and depends on
https://github.com/llvm/llvm-project/pull/93365. This PR makes the
following changes:
- Allows PGO flags to be supplied to GPU targets
- Pulls version global from device
- Modifies `__llvm_write_custom_profile` and `lprofWriteDataImpl` to
allow the PGO version to be overridden
This properly implements getCommonNNS, which computes the common
NestedNameSpecifier; the previous implementation was a bare-minimum
placeholder.
For some reason the MD tests don't appear to have ever run, despite
having checks. This patch adds a new set of RUN lines that will
exercise the markdown generation.
The `--disable_verify` flag is implemented for ELF and is used to
disable LLVM module verification.
93afd8f9ac/lld/ELF/Options.td (L661)
This allows us to quickly suppress verification errors.
I can't figure out why this would be necessary. Nothing is checking if
libpthread is available, nothing in lldb-dap is relying on libpthread
directly and nothing else in LLDB is doing this.
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin functions like succ_begin for now.
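A small before/after illustration (the container and element types here
are arbitrary examples):
```
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SmallVector.h"

void copyInto(llvm::DenseSet<int> &Dest, const llvm::SmallVector<int> &Src) {
  // Before: Dest.insert(Src.begin(), Src.end());
  Dest.insert_range(Src); // C++23-style range insertion
}
```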
Also use ErrnoSetterMatcher to verify the function return values and to
verify/clear out errno values. Fix a bug in the ErrnoSetterMatcher
error-reporting machinery to properly convert errno values into errno
names, making error messages easier to debug.
Update the initial VPlan construction in the VPlan native path to match
the inner-loop path, in that it bails out when encountering constructs it
cannot handle, like non-intrinsic calls.
Fixes https://github.com/llvm/llvm-project/issues/131071.
It turns out the trailing objects are uninitialized,
and the APValue assignment operator requires a fully initialized object.
Additionally, do some drive-by post-commit-review fixes.
Fixes #125810
---
This patch resolves an issue in Clang where the `-Wunused-variable`
warning was suppressed for structured bindings with elements marked
`[[maybe_unused]]`, causing the entire declaration to be treated as used
and preventing the warning from being emitted.
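An illustrative reproduction of the described behaviour (an assumed shape
of the issue; per-element attributes on structured bindings need C++26 or
Clang's extension):
```
#include <utility>

void f() {
  auto [a [[maybe_unused]], b] = std::pair{1, 2};
  // Neither binding is used. Before this patch, the attribute on 'a' made
  // the whole declaration count as used, so no -Wunused-variable warning
  // was emitted even though 'b' is not annotated.
}
```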
Collect profiles for functions we encounter when collecting a contextual profile that are not associated with a call site. This is expected to happen for signal handlers, but it also, problematically, currently happens for memcpy/memmove/memset, which are inserted after profile instrumentation.
Collecting a "regular" flat profile in these cases would hide the problem: that we lose better profiling opportunities.
Use the ErrnoCheckingTest harness added in
d039af33096c0a83b03475a240d5e281e2271c44 for all unistd tests that
verify errno values. Stop explicitly setting errno to zero in test code,
as the harness does it.
The harness also verifies that errno is zero at the end of each test
case, so update the ASSERT_ERRNO_EQ and ASSERT_ERRNO_FAILURE macros to
clear its value after verification (similar to how ErrnoSetterMatcher
does).
Update the CMake and Bazel rules for those tests. In the latter case,
remove commented out tests that are currently unsupported in Bazel,
since they get stale quickly.