532491 Commits

Author SHA1 Message Date
Alan Zhao
c5b3fe2094
[clang] Automatically add the returns_twice attribute to certain functions even if -fno-builtin is set (#133511)
Certain functions require the `returns_twice` attribute in order to
produce correct codegen. However, `-fno-builtin` removes all knowledge
of functions that require this attribute, so this PR modifies Clang to
add the `returns_twice` attribute even if `-fno-builtin` is set. This
behavior is also consistent with what GCC does.
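
As a rough illustration of why the attribute matters, the sketch below uses `setjmp`, the classic function that returns twice. The names `g_env`, `fail`, and `attempt` are hypothetical and are not taken from the patch; which function names Clang hard-codes is not shown here.

```cpp
#include <csetjmp>

// Hypothetical illustration: setjmp returns once directly (with 0) and a
// second time when longjmp is called (with a non-zero value).
static std::jmp_buf g_env;

static void fail() {
  std::longjmp(g_env, 1); // transfers control back to the setjmp call site
}

// Returns 0 on the normal path and 1 when control came back via longjmp.
int attempt(bool shouldFail) {
  if (setjmp(g_env) != 0)
    return 1; // second return from the same call site
  if (shouldFail)
    fail();
  return 0;
}
```

Because control can re-enter the function at the `setjmp` call, the compiler has to be conservative about what it keeps in registers across that call, which is what `returns_twice` communicates.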

It's not (easily) possible to get the builtin information from
`Builtins.td` because `-fno-builtin` causes Clang to never initialize
any builtins, so functions never get tokenized as functions/builtins
that require `returns_twice`. Therefore, the most straightforward
solution is to explicitly hard code the function names that require
`returns_twice`.

Fixes #122840
2025-03-31 09:42:34 -07:00
Jonas Devlieghere
94b04b4119
[lldb] Include the version in the lldbassert error message (#133740)
Include the LLDB version in the lldbassert error message, and prompt
users to include it in the bug report. The majority of users that bother
filing a bug report just copy-paste the stack trace and often forget to
include this important detail. By putting it after the backtrace and
before the prompt, I'm hoping it'll get copy-pasted in.

rdar://146793016
2025-03-31 09:40:33 -07:00
Han-Chung Wang
66b0b0466b
[MLIR][NFC] Fix incomplete boundary comments. (#133516)
I observed that we have the boundary comments in the codebase like:

```
//===----------------------------------------------------------------------===//
// ...
//===----------------------------------------------------------------------===//
```

I also observed that there are incomplete boundary comments. The
revision is generated by a script that completes the boundary comments.

```
//===----------------------------------------------------------------------===//
// ...

...
```

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-03-31 09:29:54 -07:00
3405691582
c180e249d0
Fix crash lowering stack guard on OpenBSD/aarch64. (#125416)
TargetLoweringBase::getIRStackGuard refers to a platform-specific guard
variable. Before this change, TargetLoweringBase::getSDagStackGuard only
referred to a different variable.

This means that SelectionDAGBuilder's getLoadStackGuard does not get
memory operands. However, AArch64InstrInfo::expandPostRAPseudo assumes
that the passed MachineInstr has nonzero memoperands, causing a
segfault.

We have two possible options here: either disabling the LOAD_STACK_GUARD
node entirely in AArch64TargetLowering::useLoadStackGuardNode or just
making the platform-specific values match across TargetLoweringBase.
Here, we try the latter.
2025-03-31 09:17:55 -07:00
Fraser Cormack
87602f6d03
[libclc] Fix unresolved reference to missing table (#133691)
Splitting the 'ln_tbl' into two in db98e292 wasn't done thoroughly
enough as some references to the old table still remained. This commit
fixes the unresolved references by updating to the new split table.
2025-03-31 16:55:23 +01:00
Fraser Cormack
3fd0eaae52
[libclc][amdgpu] Implement native_exp2 via AMD builtin (#133696)
This came up during a discussion on #129679, which has been split out as
a preparatory commit.

An example of the AMDGPU codegen is:

    define <2 x float> @_Z10native_expDv2_f(<2 x float> %val) {
      %mul = fmul afn <2 x float> %val, splat (float 0x3FF7154760000000)
      %0 = extractelement <2 x float> %mul, i64 0
      %1 = tail call float @llvm.amdgcn.exp2.f32(float %0)
      %vecinit.i = insertelement <2 x float> poison, float %1, i64 0
      %2 = extractelement <2 x float> %mul, i64 1
      %3 = tail call float @llvm.amdgcn.exp2.f32(float %2)
      %vecinit2.i = insertelement <2 x float> %vecinit.i, float %3, i64 1
      ret <2 x float> %vecinit2.i
    }

    define <2 x float> @_Z11native_exp2Dv2_f(<2 x float> %x) {
      %0 = extractelement <2 x float> %x, i64 0
      %1 = tail call float @llvm.amdgcn.exp2.f32(float %0)
      %vecinit = insertelement <2 x float> poison, float %1, i64 0
      %2 = extractelement <2 x float> %x, i64 1
      %3 = tail call float @llvm.amdgcn.exp2.f32(float %2)
      %vecinit2 = insertelement <2 x float> %vecinit, float %3, i64 1
      ret <2 x float> %vecinit2
    }
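
The constant `0x3FF7154760000000` in the first listing is log2(e) rounded to float (about 1.442695), so that function computes e^x as 2^(x * log2(e)). A minimal sketch of the same identity in plain C++ follows; `exp_via_exp2` and `kLog2E` are hypothetical names, and `std::exp2` stands in for the AMD builtin.

```cpp
#include <cmath>

// Hypothetical helper: computes e^x as 2^(x * log2(e)), mirroring the
// fmul + exp2 pattern in the first IR listing.
float exp_via_exp2(float x) {
  constexpr float kLog2E = 1.44269504f; // log2(e) rounded to float
  return std::exp2(x * kLog2E);
}
```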
2025-03-31 16:54:04 +01:00
Paul Bowen-Huggett
ea06f7f96f
[RISCV] For RV32C, disassembly of c.slli should fail when immediate > 31 (#133713)
Fixes #133712.

The change causes `c.slli` instructions whose immediate has bit 5 set to
be rejected when disassembling RV32C. It adds a test that exhaustively
covers `c.slli` for 32-bit targets, plus a minor tweak that makes the
debug output a little more readable.

The spec. (20240411) says:

> For RV32C, shamt[5] must be zero; the code points with shamt[5]=1 are
designated for custom extensions. For RV32C and RV64C, the shift amount
must be non-zero; the code points with shamt=0 are HINTs. For all base
ISAs, the code points with rd=x0 are HINTs, except those with shamt[5]=1
in RV32C.
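
The rule quoted above can be sketched as a small standalone check. This is not the LLVM decoder; `cSlliShamt` and `isValidCSlliShamt` are hypothetical helpers that assume the standard CI-format layout of `c.slli` (shamt[5] in bit 12, shamt[4:0] in bits 6:2).

```cpp
#include <cstdint>

// Hypothetical decoder helper, not the LLVM implementation. Extracts the
// 6-bit shift amount from a 16-bit c.slli encoding: shamt[5] is bit 12 and
// shamt[4:0] is bits 6:2.
constexpr unsigned cSlliShamt(uint16_t insn) {
  return (((insn >> 12) & 0x1u) << 5) | ((insn >> 2) & 0x1Fu);
}

// Returns true if the shift amount is acceptable for the target. On RV32C,
// shamt[5] must be zero, so any shift amount above 31 is rejected.
constexpr bool isValidCSlliShamt(uint16_t insn, bool isRV32) {
  return !(isRV32 && cSlliShamt(insn) > 31);
}
```

For example, `0x0506` encodes `c.slli x10, 1` and passes on both targets, while `0x1502` (the same instruction with shamt = 32) passes only when `isRV32` is false.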
2025-03-31 08:51:34 -07:00
Jonas Devlieghere
799e905364
[lldb] Create a default rate limit constant in Progress (NFC) (#133506)
In #133211, Greg suggested making the rate limit configurable through a
setting. Although adding the setting is easy, the two places where we
currently use rate limiting aren't tied to a particular debugger.
Although it'd be possible to hook up, given how few progress events
currently implement rate limiting, I don't think it's worth threading
this through, if that's even possible.

I still think it's a good idea to be consistent and make it easy to pick
the same rate limiting value, so I've moved it into a constant in the
Progress class.
2025-03-31 08:29:20 -07:00
Jay Foad
bd862a459d
[AMDGPU] Add subtarget feature for v_lshl_add_u64. NFC. (#133723) 2025-03-31 16:25:00 +01:00
Ramkumar Ramachandra
c20bea09c2
[LV] Regen a test with UTC (#133432) 2025-03-31 16:24:45 +01:00
LLVM GN Syncbot
b91f978647 [gn build] Port 50949ebf523c 2025-03-31 15:20:59 +00:00
Yuval Deutscher
945c494e2c
[lldb] Use correct path for lldb-server executable (#131519)
Hey,

This solves an issue where running lldb-server-20 with a non-absolute
path (for example, when it's installed into `/usr/bin` and the user runs
it as `lldb-server-20 ...` and not `/usr/bin/lldb-server-20 ...`) fails
with `error: spawn_process failed: execve failed: No such file or
directory`. The underlying issue is that when run that way, it attempts
to execute a binary named `lldb-server-20` from its current directory.
This is also a mild security hazard because lldb-server is often being
run as root in the directory /tmp, meaning that an unprivileged user can
create the file /tmp/lldb-server-20 and lldb-server will execute it as
root. (although, well, it's a debugging server we're talking about, so
that may not be a real concern)

I haven't previously contributed to this project; if you want me to
change anything in the code please don't hesitate to let me know.
2025-03-31 08:20:40 -07:00
Jonas Devlieghere
50949ebf52
[lldb] Expose the Target API mutex through the SB API (#133295)
Expose the target API mutex through the SB API. This is motivated by
lldb-dap, which is built on top of the SB API and needs a way to execute
a series of SB API calls in an atomic manner (see #131242).

We can solve this problem by either introducing an additional layer of
locking at the DAP level or by exposing the existing locking at the SB
API level. This patch implements the second approach.

This was discussed in an RFC on Discourse [0]. The original
implementation exposed a move-only lock rather than a mutex [1] which
doesn't work well with SWIG 4.0 [2]. This implements the alternative
solution of exposing the mutex rather than the lock. The SBMutex
conforms to the BasicLockable requirement [3] (which is why the methods
are called `lock` and `unlock` rather than Lock and Unlock) so it can be
used as `std::lock_guard<lldb::SBMutex>` and
`std::unique_lock<lldb::SBMutex>`.
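
To show what BasicLockable conformance buys, here is a minimal sketch with a hypothetical `DemoMutex` class. It is a stand-in, not `lldb::SBMutex`, and `runAtomically` is likewise invented for illustration.

```cpp
#include <mutex>

// Hypothetical stand-in for a BasicLockable wrapper; this is NOT
// lldb::SBMutex. BasicLockable only requires lower-case lock() and unlock().
class DemoMutex {
public:
  void lock() {
    m_impl.lock();
    ++m_lock_count;
  }
  void unlock() { m_impl.unlock(); }
  int lockCount() const { return m_lock_count; }

private:
  std::recursive_mutex m_impl;
  int m_lock_count = 0; // counts successful lock() calls, for demonstration
};

// Runs two placeholder "API calls" while holding the mutex.
int runAtomically(DemoMutex &mutex) {
  std::lock_guard<DemoMutex> guard(mutex); // calls mutex.lock()
  int first = 1;  // placeholder for the first call
  int second = 2; // placeholder for the second call
  return first + second;
} // guard's destructor calls mutex.unlock()
```

Because the standard lock wrappers look up `lock` and `unlock` by exactly those names, a type that spelled them `Lock` and `Unlock` would not compile with `std::lock_guard`.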

[0]: https://discourse.llvm.org/t/rfc-exposing-the-target-api-lock-through-the-sb-api/85215/6
[1]: https://github.com/llvm/llvm-project/pull/131404
[2]: https://discourse.llvm.org/t/rfc-bumping-the-minimum-swig-version-to-4-1-0/85377/9
[3]: https://en.cppreference.com/w/cpp/named_req/BasicLockable
2025-03-31 08:19:41 -07:00
Rahul Joshi
74b7abf154
[IRBuilder] Add new overload for CreateIntrinsic (#131942)
Add a new `CreateIntrinsic` overload with no `Types`, useful for
creating calls to non-overloaded intrinsics that don't need additional
mangling.
2025-03-31 08:10:34 -07:00
Tom Tromey
68947342b7
Add support for fixed-point types (#129596)
This adds DWARF generation for fixed-point types. This feature is needed
by Ada.

Note that a pre-existing GNU extension is used in one case. This has
been emitted by GCC for years, and is needed because standard DWARF is
otherwise incapable of representing these types.
2025-03-31 07:42:21 -07:00
Devon Loehr
4007de00a0
Enable unnecessary-virtual-specifier by default (#133265)
This turns on the unnecessary-virtual-specifier warning in general, but
disables it when building LLVM. It also tweaks the warning description
to be slightly more accurate.
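
A minimal sketch of the kind of code the warning is about, assuming it targets virtual functions that are first declared inside a `final` class and therefore can never be overridden. The class names are hypothetical.

```cpp
// Hypothetical classes for illustration only.
struct Shape {
  virtual ~Shape() = default;
  virtual int sides() const = 0;
};

struct Triangle final : Shape {
  // Overrides a base-class virtual, so virtual dispatch is meaningful here.
  int sides() const override { return 3; }

  // Declared virtual for the first time in a final class: no derived class
  // can exist, so nothing can ever override it.
  virtual int id() const { return 7; }
};
```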

Background: I've been working on cleaning up this warning in two
codebases: LLVM and chromium (plus its dependencies). The chromium
cleanup has been straightforward. Git archaeology shows that there are
two reasons for the warnings: classes to which `final` was added after
they were initially committed, and classes with virtual destructors that
nobody remarks on. Presumably the latter case is because people are just
very used to destructors being virtual.

The LLVM cleanup was more surprising: I discovered that we have an [old
policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers)
about including out-of-line virtual functions in every class with a
vtable, even `final` ones. This means our codebase has many virtual
"anchor" functions which do nothing except control where the vtable is
emitted, and which trigger the warning. I looked into alternatives to
satisfy the policy, such as using destructors instead of introducing a
new function, but it wasn't clear if they had larger implications.

Overall, it seems like the warning is genuinely useful in most codebases
(evidenced by chromium and its dependencies), and LLVM is an unusual
case. Therefore we should enable the warning by default, and turn it off
only for LLVM builds.
2025-03-31 16:28:53 +02:00
Brox Chen
a61cc1b99a
[AMDGPU][True16][CodeGen] Skip combineDpp with t16 instructions (#128918)
We only emit v_mov_b32/64_dpp. Don't combine t16 instructions with mov
dpp. Update the test inputs to be legal.

It is future work to emit v_mov_b16_dpp, and then update GCNDPPCombine
to combine it with the 16-bit instructions.
2025-03-31 10:18:25 -04:00
Tejas Vipin
8078665bca
[libc][math][c23] Add hypotf16 function (#131991)
Implement hypot for Float16 along with tests.
2025-03-31 10:06:28 -04:00
Phoebe Wang
c7572ae213
[X86][AVX10] Re-target mavx10.1 and emit warning for mavx10.x-256/512 and m[no-]evex512 (#132542)
The 256-bit maximum vector register size control was removed from the AVX10
whitepaper, ref: https://cdrdv2.intel.com/v1/dl/getContent/784343

- Re-target m[no-]avx10.1 to enable AVX10.1 with 512-bit maximum vector
register size;
- Emit warning for mavx10.x-256, noting AVX10/256 is not supported;
- Emit warning for mavx10.x-512, noting to use m[no-]avx10.x instead;
- Emit warning for m[no-]evex512, noting AVX10/256 is not supported;

This patch only changes Clang driver behavior. The features
avx10.x-256/512 remain unchanged and will be removed in the next release.
2025-03-31 22:05:50 +08:00
Stefan Pintilie
8d69e953b5
[RISCV] Add combine for shadd family of instructions. (#130829)
For example, in the following situation:
  %6:gpr = SLLI %2:gpr, 2
  %7:gpr = ADDI killed %6:gpr, 24
  %8:gpr = ADD %0:gpr, %7:gpr

If we swap the two add instructions, we can merge the shift and add. The
final code will look something like this:
  %7 = SH2ADD %0, %2
  %8 = ADDI %7, 24
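
The reordering is valid because addition is associative and commutative in wrapping arithmetic. A small sketch that models both sequences follows; `beforeCombine` and `afterCombine` are hypothetical names, with `a` standing for `%0` and `b` for `%2`.

```cpp
#include <cstdint>

// Hypothetical model of the original sequence. Unsigned arithmetic wraps,
// matching register semantics.
constexpr uint64_t beforeCombine(uint64_t a, uint64_t b) {
  uint64_t t6 = b << 2;  // SLLI: shift left by 2
  uint64_t t7 = t6 + 24; // ADDI: add the immediate
  return a + t7;         // ADD: add the other register
}

// Hypothetical model of the combined sequence.
constexpr uint64_t afterCombine(uint64_t a, uint64_t b) {
  uint64_t t7 = (b << 2) + a; // shift-by-2-and-add in one instruction
  return t7 + 24;             // ADDI: add the immediate last
}

static_assert(beforeCombine(100, 3) == afterCombine(100, 3),
              "both sequences compute a + (b << 2) + 24");
```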
2025-03-31 10:02:12 -04:00
Simon Pilgrim
96efb21e88 [X86] Add regression test for insert_subvector(x,extract_subvector(broadcast)) pattern identified in #133083
Infinite loop check
2025-03-31 15:01:31 +01:00
Michael Buch
0794d5cfba
[clang][Sema] Fix typo in 'offsetof' diagnostics (#133448)
Before:
```
offset of on non-POD type
```
After:
```
offsetof on non-POD type
```

---------

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2025-03-31 14:56:29 +01:00
sstwcw
ab7cee8a0e [clang-format] Handle C++ keywords in other languages better (#132941)
There is some code to make sure that C++ keywords that are identifiers
in the other languages are not treated as keywords.  Right now, the kind
is set to identifier, and the identifier info is cleared.  The latter is
probably so that the code for identifying C++ structures does not
recognize those structures by mistake when formatting a language that
does not have those structures.  But we did not find a case where the
language can contain such a sequence of tokens and the code parses it as
a C++ structure using the identifier info instead of the token kind,
without checking the language setting.  However,
there are places where the code checks whether the identifier info field
is null or not.  They are places where an identifier and a keyword are
treated the same way.  For example, the name of a function in
JavaScript.  This patch removes the lines that clear the identifier
info.  This way, a C++ keyword gets treated in the same way as an
identifier in those places.

JavaScript

New

```JavaScript
async function
union(
    myparamnameiswaytooloooong) {
}
```

Old

```JavaScript
async function
    union(
        myparamnameiswaytooloooong) {
}
```

Java

New

```Java
enum union { ABC, CDE }
```

Old

```Java
enum
union { ABC, CDE }
```
2025-03-31 13:54:49 +00:00
sstwcw
cb54026d92
[clang-format] Recognize wait fork in Verilog (#132042)
before

```Verilog
wait fork
  ;
  wait fork
    ;
    wait fork
      ;
```

after

```Verilog
wait fork;
wait fork;
wait fork;
```

The `wait fork` statement should not start a block. Previously the
formatter treated the `fork` part as the start of a new block. Now the
problem is fixed.
2025-03-31 13:53:23 +00:00
Simon Pilgrim
9b32f3d096
[DAG] visitEXTRACT_SUBVECTOR - don't return early on failure of EXTRACT_SUBVECTOR(INSERT_SUBVECTOR()) -> BITCAST fold (#133695)
Always allow later folds to try to match as well.
2025-03-31 14:32:43 +01:00
David Green
b9b9addae6 [AArch64] Add bitcast + extend tests. NFC 2025-03-31 14:22:49 +01:00
David Green
2e54b4f9ea [ARM] Silence signed comparison warning. NFC
After f4ec179bf5295f92aa0346392a58fad54f9b458e, AbsImm is no longer signed and cannot be < 0.
2025-03-31 13:59:58 +01:00
Simon Pilgrim
aad9630e42
[X86] combineINSERT_SUBVECTOR - pull out common variables. NFC. (#133705)
Reduces diff for an updated version of #133083
2025-03-31 13:23:06 +01:00
David Sherwood
f4d25c498a
[LV][NFC] Regenerate some SVE tests using --filter-out-after option (#132174)
I recently added a new option to update_test_checks.py that can
filter out all CHECK lines after a certain point. We usually don't
care about checking for the original scalar loop after the vector
loop because it doesn't change. Cutting out unnecessary CHECK
lines makes the files smaller and hopefully the tests run quicker.
2025-03-31 12:40:41 +01:00
Alexey Bataev
78777a204a
[LV]Split store-load forward distance analysis from other checks, NFC (#121156)
The patch splits the store-load forwarding distance analysis from other
dependency analysis in LAA. It currently supports only power-of-2
distances; the split is required to support non-power-of-2 distances in
the future.

Part of #100755
2025-03-31 07:28:44 -04:00
Matt Arsenault
f82283a84e
llvm-reduce: Use 80 dashes for section separator in status printing (#133686) 2025-03-31 18:06:37 +07:00
Jacek Caban
606e0b4806
[ARM64EC] Add support for function aliases on ARM64EC (#132295)
Required for mingw-w64, which uses the alias attribute in its CRT.

Follows ARM64EC mangling rules by mangling the alias symbol and emitting
an unmangled anti-dependency alias. Since metadata is not allowed on
GlobalAlias objects, extend arm64ec_unmangled_name to support multiple
unmangled names and attach the alias anti-dependency name to the target
function's metadata.
2025-03-31 12:56:09 +02:00
Frank Schlimbach
1dee12531d
[mlir][mpi] Lowering MPI_Allreduce (#133133)
Lowering of mpi.all_reduce to LLVM function call
2025-03-31 12:51:45 +02:00
Zhaoxin Yang
0ec94983c4
[lld][LoongArch] Relax TLSDESC code sequence (#123677)
Relax TLSDESC code sequence.

Original code sequence:
  * pcalau12i  $a0, %desc_pc_hi20(sym_desc)
  * addi.d     $a0, $a0, %desc_pc_lo12(sym_desc)
  * ld.d       $ra, $a0, %desc_ld(sym_desc)
  * jirl       $ra, $ra, %desc_call(sym_desc)

When the sequence cannot be converted to LE/IE, it is relaxed to:
  * pcaddi     $a0, %desc_pcrel_20(sym_desc)
  * ld.d       $ra, $a0, %desc_ld(sym_desc)
  * jirl       $ra, $ra, %desc_call(sym_desc)

TODO: The transition from TLSDESC GD/LD to IE/LE will be implemented in a
future patch.
2025-03-31 17:47:50 +08:00
Pavel Labath
9d61eaa9ec
[lldb] Make GetRowForFunctionOffset compatible with discontinuous functions (#133250)
The function had special handling for -1, but that is incompatible with
functions whose entry point is not the first address. Use std::nullopt
instead.
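
A minimal sketch of the sentinel problem, under the assumption that offsets are measured from the entry point, so code placed before the entry point can legitimately sit at offset -1. The function `describeOffset` is hypothetical and is not the LLDB API.

```cpp
#include <optional>
#include <string>

// Hypothetical illustration: with std::optional, "no offset given" is
// expressed as std::nullopt, so -1 stays available as a real offset.
std::string describeOffset(std::optional<int> offset) {
  if (!offset)
    return "unspecified";
  return "offset " + std::to_string(*offset);
}
```

With this shape, `describeOffset(std::nullopt)` and `describeOffset(-1)` are distinguishable, which a -1 sentinel cannot offer.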
2025-03-31 11:45:11 +02:00
MingYan
5c65a32177
[RISCV] Vectorize phi for loop carried @llvm.vp.reduce.* (#131974)
LLVM vector predication reduction intrinsics return a scalar result, but
on RISC-V vector reduction instructions write the result in the first
element of a vector register. So when a reduction in a loop uses a
scalar phi, we end up with unnecessary scalar moves:
```asm
loop:
    vmv.s.x v8, zero
    vredsum.vs v8, v10, v8
    vmv.x.s a0, v8
```
This mainly affects vector predication reduction. This tries to
vectorize any scalar phis that feed into a vector predication reduction
in RISCVCodeGenPrepare, converting:
```llvm
vector.body:
%red.phi = phi i32 [ ..., %entry ], [ %red, %vector.body ]
%red = tail call i32 @llvm.vp.reduce.add.nxv4i32(i32 %red.phi, <vscale x 4 x i32> %wide.load, <vscale x 4 x i1> splat (i1 true), i32 %evl)
```
to
```llvm
vector.body:
%red.phi = phi <vscale x 2 x i32> [ ..., %entry ], [ %acc.vec, %vector.body]
%phi.scalar = extractelement <vscale x 2 x i32> %red.phi, i64 0
%acc = tail call i32 @llvm.vp.reduce.add.nxv4i32(i32 %phi.scalar, <vscale x 4 x i32> %wide.load, <vscale x 4 x i1> splat (i1 true), i32 %evl)
%acc.vec = insertelement <vscale x 2 x i32> poison, i32 %acc, i64 0
```
Which eliminates the scalar -> vector -> scalar crossing during
instruction selection.

---------

Co-authored-by: yanming <ming.yan@terapines.com>
2025-03-31 09:13:46 +01:00
Mariya Podchishchaeva
842b57b775
Reland [MS][clang] Add support for vector deleting destructors (#133451)
Although the standard makes it UB to delete an array of objects through
a pointer whose static type doesn't match its dynamic type, MSVC
supports an extension that allows it.
Aside from array deletion not working correctly in that case, not having
this extension implemented currently causes clang to generate code that
is incompatible with the code generated by MSVC, because clang always
puts the scalar deleting destructor in the vftable. This PR aims to
resolve these problems.

It was reverted due to link time errors in chromium with sanitizer
coverage enabled, which is fixed by
https://github.com/llvm/llvm-project/pull/131929.

The second commit of this PR also contains a fix for a runtime failure
in chromium reported in
https://github.com/llvm/llvm-project/pull/126240#issuecomment-2730216384.

Fixes https://github.com/llvm/llvm-project/issues/19772
2025-03-31 10:03:39 +02:00
Florian Hahn
809f857d2c
[VPlan] Support early-exit loops in optimizeForVFAndUF. (#131539)
Update optimizeForVFAndUF to support early-exit loops by handling
BranchOnCond(Or(..., CanonicalIV == TripCount)) via SCEV

PR: https://github.com/llvm/llvm-project/pull/131539
2025-03-31 07:55:48 +01:00
T-Gruber
d63cc4c876
[analyzer] Unknown array lvalue element in Store (#133381)
Remove the early return for BaseRegions of type ElementRegion. Return
meaningful MemRegionVal for these cases as well.
Previous discussion:
https://discourse.llvm.org/t/lvalueelement-returns-unknownval-for-multi-dimensional-arrays/85476
2025-03-31 08:44:28 +02:00
Kazu Hirata
fff8f035ac
[polly] Use DenseMap::insert_range (NFC) (#133657) 2025-03-30 22:58:03 -07:00
Kazu Hirata
2fc08d4c31
[Vectorize] Use DenseMap::insert_range (NFC) (#133656) 2025-03-30 22:57:45 -07:00
Kazu Hirata
60199ee539
[clang] Use DenseMap::insert_range (NFC) (#133655) 2025-03-30 22:57:25 -07:00
Craig Topper
6fb674174e [RISCV] Fix the operand types for shift instructions in RISCVInstrInfoSFB.td. NFC
Due to a copy paste mistake we used simm12 instead of the correct
type. This doesn't matter in practice because we only generate these
instructions with C++ code and we expand them before the AsmPrinter.
2025-03-30 22:38:26 -07:00
Fangrui Song
c6b3fd7999 [MC] maybeParseSectionType: test CommentString instead of AllowAtInIdentifier
Rework https://reviews.llvm.org/D31026
AllowAtInIdentifier is a misnomer: it should be false for ELF targets,
but is currently true as a hack to parse expr@specifier.
2025-03-30 22:27:47 -07:00
Fangrui Song
04a67528d3
[MC] Simplify MCBinaryExpr/MCUnaryExpr printing by reducing parentheses (#133674)
The existing pretty printer generates excessive parentheses for
MCBinaryExpr expressions. This update removes unnecessary parentheses
of MCBinaryExpr with +/- operators and MCUnaryExpr.
Since relocatable expressions only use + and -, this change improves
readability in most cases.

Examples:

- (SymA - SymB) + C now prints as SymA - SymB + C.
  This updates the output of -fexperimental-relative-c++-abi-vtables for
  AArch64 and x86 to `.long _ZN1B3fooEv@PLT-_ZTV1B-8`
- expr + (MCTargetExpr) now prints as expr + MCTargetExpr, with this
  change primarily affecting AMDGPUMCExpr.
2025-03-30 22:03:14 -07:00
Craig Topper
c9095aa310
[RISCV] Cleanup assembler predicates after #133377. (#133652)
Make isSImm12 look more like isUImm20LUI.
Move variables closer to their use.
Fold some function calls into if statements.
2025-03-30 21:10:08 -07:00
Lang Hames
dad86f5931 [ORC] MapperJITLinkMemoryManager should deinitialize on abandon, not deallocate.
The JITLinkMemoryManager::InFlightAlloc::abandon method should only abandon
memory for the current allocation, not any other allocations. In
MapperJITLinkMemoryManager this corresponds to the deinitialize operation, not
the deallocate operation (which releases whole slabs of memory that may be
shared by many allocations).

No testcase: This was spotted by inspection. The failing program was linking
concurrently when one linker instance raised an error. Through the call to
abandon an entire underlying slab was deallocated, resulting in segfaults in
other concurrent links that were sharing that slab.
2025-03-31 14:08:42 +11:00
Han-Kuan Chen
65734de9b9
[SLP] NFC. Remove the redundant MainOp and AltOp find process. (#133642) 2025-03-31 10:26:45 +08:00
Kazu Hirata
6257621f41
[llvm] Use llvm::append_range (NFC) (#133658) 2025-03-30 18:43:02 -07:00
Matt Arsenault
94122d58fc
Lint: Replace -lint-abort-on-error cl::opt with pass parameter (#132933) 2025-03-31 08:42:51 +07:00