This reverts commit cf1964af5a461196904b663ede04c26555fcff69.
This causes breakage on all the non-x86 buildbots as they don't have the i686
target enabled. This was missed in pre-commit CI.
Although 32-bit targets are currently not officially supported, add a type conversion in the AllocMemOp lowering when calling the `malloc` function on 32-bit targets. This fixes a type mismatch, and this fix makes it easier to potentially support such targets in the future.
This involves making sure the `LLVMTypeConverter` has the necessary information to know the target bit width.
Co-authored-by: Valentin Clement (バレンタイン クレメン) <clementval@gmail.com>
I think it is generally better for tests to have descriptive function
names so that we can insert new tests in the middle and don't have to
renumber all tests.
Also recently I added a (named) test to this file in #129020 so I think
it's consistent for other tests to be named.
I think the chance of this changing the tests in meaningful ways
is very low. This was done with perl, plus a few minor adjustments to a few
tests that produce new undefs. Only one test had a minor codegen
change with the switch, which I dropped from the change.
This is a follow up from 39bf765bb671fa7df3fe6c164cc9532fcb8653bd,
for the other case handled here. We would create CopyToReg marked
as uniform, even though the end phi would need to use VGPRs due
to another divergent input. There's no directly observable change in
the final output of the new test, but it does hit this case.
Preparation for CFI Index refactoring,
which will fix O(N^2) in ThinLTO indexing.
We need a data structure to look up by GUID.
Wrapping allows us to change the implementation
with minimal changes to users.
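A minimal sketch of the wrapping idea, assuming a string set as the current backing container. The class name `NameIndex` and its methods are hypothetical and are not taken from the patch; the point is only that callers see a narrow interface, so the container behind it can later be swapped (for example, for one keyed by GUID) without touching them.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Hypothetical wrapper: illustrative names, not the ones used in the patch.
class NameIndex {
  // Implementation detail. Callers never name this type, so it can be
  // replaced later without changing any user code.
  std::set<std::string> Names;

public:
  void insert(std::string Name) { Names.insert(std::move(Name)); }
  bool contains(const std::string &Name) const {
    return Names.count(Name) != 0;
  }
  std::size_t size() const { return Names.size(); }
};
```

Users would call only `insert`, `contains` and `size`, so a change of backing container stays local to the class.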
Use it instead of libc_support_library macros for all helper libraries
that are used for unit tests. See #130327 for the rationale why we want
to do this. With this change, we can additionally ensure that no
testonly library will end up being a dependency of production libraries.
We can take advantage of the attribute `alloc_size`. For example,
```
void * malloc(size_t size) __attribute__((alloc_size(1)));
std::span<char>{(char *)malloc(x), x}; // this is safe
```
rdar://136634730
Handle CXXUnresolvedConstructExpr in tryToFindPtrOrigin so that a Ref,
RefPtr, CheckedRef, CheckedPtr, ... constructed in such a way that its
type is unresolved at the AST level is still treated as a safe pointer
origin.
Also fix a bug where isPtrOfType did not recognize
DeducedTemplateSpecializationType.
My previous patch to add DISubrangeType (#126772) had a couple of minor
errors. This patch corrects them.
1. When using a DISubrangeType as an array index type, the wrong tag was
written into the DIE.
2. I'd intended for subranges to use bit strides, not byte strides --
but neglected to actually implement this. Ada needs bit strides.
This patch adds a new test that checks both these things.
Finally, this patch adds some documentation for DISubrangeType.
* strcpy doesn't need to depend on memcpy
* qsort_test_helper has been generalized and doesn't need to depend on
qsort.
This is a small cleanup to unblock the work outlined in #130327.
This PR adds an initial implementation of the map modifiers close,
present and ompx_hold. This primarily required adding the appropriate
map type flags to the map type bits. In the case of ompx_hold, it also
required adding the map type to the OpenMP dialect. Close has a bit of a
problem when used with the ALWAYS map type on descriptors, so we will
likely have to make sure close and always are not applied to the
descriptor simultaneously in the future, when we apply always to the
descriptors to facilitate movement of descriptor information to the
device for consistency. We may, however, find an alternative to this
with further investigation. For the moment, it is a TODO/Note to keep
track of it.
The first argument can be the address of a scalar, an array element, or
even just the address of the first element of an array. Update lowering
to not trigger elemental lowering.
These are macOS tests only and are currently failing on the x86_64 CI
and on arm64 on recent versions of macOS/Xcode.
The tests are failing because we're stopping in:
```
Process 17458 stopped
* thread #1: tid = 0xbda69a, 0x00000002735bd000
libsystem_malloc.dylib`purgeable_print_self.cold.1, stop reason = EXC_BREAKPOINT (code=1, subcode=0x2735bd000)
```
instead of the libsanitizers library. This seems to be related to
`-fsanitize-trivial-abi` support.
Skip these for now until we figure out the root cause.
This reverts commit 6fa1bfad65edefe3f4c17740f05297d34e833b47.
Revert it due to breaking buildbot:
Z:\b\llvm-clang-x86_64-sie-win\llvm-project\clang\tools\amdgpu-arch\AMDGPUArchByHIP.cpp(104):
error C2039: 'parse': is not a member of 'llvm::VersionTuple'
https://lab.llvm.org/buildbot/#/builders/46/builds/13184
This change moves GlobalUniqueCallSite into NVPTXISelLowering. In
processes where multiple compilations occur, this makes call site
enumeration local to each compilation, which ensures that call site
numbers are consistently sequential within each compilation and are
independent of other compilations happening in parallel.
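As a rough illustration of the difference, assuming a simplified model in which each compilation owns one lowering object: the class `LoweringState` and its method below are hypothetical stand-ins, not the actual NVPTX code.

```cpp
#include <cassert>

// Hypothetical stand-in for a per-compilation lowering object.
class LoweringState {
  // Counter is a member rather than a process-wide global, so every
  // compilation numbers its call sites starting from zero.
  unsigned NextCallSite = 0;

public:
  unsigned getAndIncrementCallSite() { return NextCallSite++; }
};
```

With a shared global counter, two compilations running in the same process would interleave their numbers; with a member, each instance produces its own sequence.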
Inspired by PR #127944, this patch adds an option to print profile metadata inline with respect to the instruction (or function) it annotates - this saves one time from having to search up and down large textual modules to find this info.
This paper made it a constraint violation for the same identifier
within a TU to have both internal and external linkage. It was
previously UB.
Clang does not correctly diagnose the constraint in some cases,
documented in the added test case.
This paper removes UB around use of void expressions. Previously, code
like this had undefined behavior:
```
void foo(void) {
  (void)(void)1;
  extern void x;
  x;
}
```
and this is now well-defined in C2y. Functionally, this now means that
it is valid to use `void` as a `_Generic` association.
This patch explicitly links the pthread library when building
shared flang-rt.
On AIX, for example, `libpthread.a` must be linked in explicitly in
order to resolve the references to the `pthread_*` functions in
`include/flang-rt/runtime/lock.h`.
After #130165.
In the code `VRMap[CurStageNum][Def] = VRMap[CurStageNum][LoopVal]`,
`VRMap[CurStageNum][LoopVal]` is evaluated to a reference before
`VRMap[CurStageNum][Def]`, which may rehash the DenseMap.
The reference can then be left dangling.
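The usual way to avoid this pattern is to read the source entry into a local by value before touching the destination. The sketch below uses `std::unordered_map` purely as a stand-in so that it compiles without LLVM headers; unlike DenseMap, `std::unordered_map` keeps references to elements valid across a rehash, so it does not reproduce the bug, only the shape of the fix. The helper name `copyEntry` is hypothetical.

```cpp
#include <cassert>
#include <unordered_map>

// Stand-in for one stage of the register map.
using RegMap = std::unordered_map<int, unsigned>;

// Copies Map[From] into Map[To]. The source is read into a local by value
// first, so a later insertion (and any rehash it triggers) cannot
// invalidate what is being assigned.
void copyEntry(RegMap &Map, int To, int From) {
  unsigned Value = Map[From]; // value copy, not a reference
  Map[To] = Value;            // may insert and grow the table
}
```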
Instead of returning the number of bytes emitted, just take the iterator
by reference so the increments in emitULEB128 will update the copy in
the caller.
Also pass the iterator by reference to emitNumToSkip so we don't need a
separate I += 3 in the caller.
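A minimal sketch of the by-reference style, assuming a byte vector as the table. The signature below is illustrative and is not the one in the patch; the encoding itself is standard ULEB128.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Writes Value as ULEB128 through I. Because I is taken by reference, the
// caller's iterator already points past the emitted bytes on return, so no
// byte count has to be returned and added by the caller.
void emitULEB128(std::vector<uint8_t>::iterator &I, uint64_t Value) {
  do {
    uint8_t Byte = Value & 0x7f; // low seven bits
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // continuation bit: more bytes follow
    *I++ = Byte;
  } while (Value != 0);
}
```

The caller must ensure the destination range is large enough; this sketch does no bounds checking.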
Removes the condition that checks that the operand is not indexed by
reduction iterators, which allows for more fine-grained control via the
reshape fusion control function. For example, users could allow fusing
reshapes that expand the M/N dims of a matmul but not the K dims (or
preserve the current behavior by not fusing at all).
---------
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
Add ROCDL support for the following GFX950 instructions:
CVT_SCALE_PK_FP8_F16
CVT_SCALE_PK_BF8_F16
CVT_SCALE_PK_FP8_BF16
CVT_SCALE_PK_BF8_BF16
CVT_SCALE_SR_FP8_F16
CVT_SCALE_SR_BF8_F16
CVT_SCALE_SR_FP8_BF16
CVT_SCALE_SR_BF8_BF16
CVT_SCALE_PK_F16_FP8
CVT_SCALE_PK_F16_BF8
CVT_SCALE_F16_FP8
CVT_SCALE_F16_BF8
`inst->getFnAttr(Kind)` falls back to checking whether the parent has the
attribute, which breaks round-tripping the LLVM IR. This change checks
only the call attribute list (no fallback to parent queries).
It's possible to argue that this small optimization isn't harmful, but
it seems too early if it's breaking round-trip behavior.