532872 Commits

Author SHA1 Message Date
Ryotaro Kasuga
91f3965be4
[LoopInterchange] Fix the vectorizable check for a loop (#133667)
In the profitability check for vectorization, the dependency matrix was
not handled correctly. This can result in a wrong decision: it may
say "this loop can be vectorized" when in fact it cannot. The root cause
is that the check returns early when it finds '=' or 'I'
in the dependency matrix. To make sure that we can actually vectorize
the loop, we need to check all the rows of the matrix. This patch fixes
the process of checking whether we can vectorize the loop or not. Now it
won't make a wrong decision for a loop that cannot be vectorized.
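
The difference between an early-return check and an all-rows check can be
sketched as below. This is a simplified standalone model, not the
LoopInterchange code: the `DepMatrix` alias, both function names, and the rule
that a loop level counts as vectorizable only when every row holds '=' or 'I'
at that level are assumptions drawn from the description above.

```cpp
#include <string>
#include <vector>

// Hypothetical model: each row is one dependence, each character is the
// direction entry for one loop level ('=', 'I', '<', '>', '*', ...).
using DepMatrix = std::vector<std::string>;

// Early-return shape: answers "yes" as soon as one row looks fine.
bool canVectorizeEarlyReturn(const DepMatrix &M, unsigned LoopId) {
  for (const std::string &Row : M) {
    char Dir = Row[LoopId];
    if (Dir == '=' || Dir == 'I')
      return true; // stops before inspecting the remaining rows
  }
  return false;
}

// All-rows shape: every row must hold '=' or 'I' at this loop level.
bool canVectorizeAllRows(const DepMatrix &M, unsigned LoopId) {
  for (const std::string &Row : M) {
    char Dir = Row[LoopId];
    if (Dir != '=' && Dir != 'I')
      return false;
  }
  return true;
}
```

For the two-row matrix {"==", "<="}, the early-return version accepts loop
level 0 because of the first row, while the all-rows version rejects it
because of the '<' in the second row.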

Related: #131130
2025-04-03 16:21:19 +09:00
Yingwei Zheng
b6c0ce0bb6
[IR][NFC] Use SwitchInst::defaultDestUnreachable (#134199) 2025-04-03 14:47:47 +08:00
Iris
3295970d84
[ConstantFolding] Add support for sinh and cosh intrinsics in constant folding (#132671)
Closes #132503.
2025-04-03 08:34:09 +02:00
Hua Tian
7e65944292
[llvm][CodeGen] avoid repeated interval calculation in window scheduler (#132352)
Some new registers are reused when replacing some old ones in
certain use cases of ModuloScheduleExpander. It is necessary to
avoid repeated interval calculations for these registers.
2025-04-03 14:25:55 +08:00
Nikita Popov
b384d6d6cc
[CodeGen] Don't include CGDebugInfo.h in CodeGenFunction.h (NFC) (#134100)
This is an expensive header, only include it where needed. Move some
functions out of line to achieve that.

This reduces time to build clang by ~0.5% in terms of instructions
retired.
2025-04-03 08:04:19 +02:00
Jason Molenda
a19c018379 Revert "[lldb][debugserver] Save and restore the SVE/SME register state (#134184)"
This reverts commit 4e40c7c4bd66d98f529a807dbf410dc46444f4ca.

arm64 CI is getting a failure in
lldb-api.tools/lldb-server.TestGdbRemoteRegisterState.py
with this commit, need to investigate and re-land.
2025-04-02 23:01:51 -07:00
Snehasish Kumar
7f2abe8fd1
Revert "[Metadata] Preserve MD_prof when merging instructions when one is missing." (#134200)
Reverts llvm/llvm-project#132433

I suspect this change caused a failure in the bolt build bot.
https://lab.llvm.org/buildbot/#/builders/113/builds/6621

```
!9185 = !{!"branch_weights", i32 3912, i32 802}
Wrong number of operands
!9185 = !{!"branch_weights", i32 3912, i32 802}
fatal error: error in backend: Broken module found, compilation aborted!
```
2025-04-02 22:11:17 -07:00
Craig Topper
f404826842
[RISCV] Don't allow '-' after 'ra' in Zcmp/Xqccmp register list. (#134182)
Move the parsing of '-' under the check that we parsed a comma.
Unfortunately, this leads to a poor error, but I still have more known
issues in this code and may end up with an overall restructuring and
want to think about wording.
2025-04-02 21:51:31 -07:00
Craig Topper
3ea7902494
[RISCV] Move S0 register list check for qc.cm.pushfp to after we parsed the whole register list. (#134180)
This is more of a semantic check. The diagnostic location has been
changed to point at the register list start instead of the
closing brace or whatever character might be there instead of a brace
if it's malformed.
2025-04-02 21:48:48 -07:00
Sam Elliott
4998273189
Reland [RISCV] Add Xqci Insn Formats (#134134)
This adds the following instruction formats from the Xqci Spec:
- QC.EAI
- QC.EI
- QC.EB
- QC.EJ
- QC.ES

The update to the THead test is because the largest number of operands
for a valid instruction has been bumped by this change.

This reverts commit 68fb7a5a1d203dde7badf67031bdd9eb650eef5d. This
relands commit 0cfabd37df9940346f3bf8a4d74c19e1f48a00e9.
2025-04-02 21:37:44 -07:00
Jacob Lalonde
b8d8405238
[LLDB] Expose checking if the symbol file exists/is loaded via SBModule (#134163)
The motivation for this patch is that in Statistics.cpp we [check to see
if the module symfile is
loaded](990a086d9d/lldb/source/Target/Statistics.cpp (L353C60-L353C75))
to calculate how much debug info has been loaded. I have an external
utility that only wants to look at the loaded debug info, which isn't
exposed by the SBAPI.
2025-04-02 21:27:44 -07:00
Reid Kleckner
e3c0565b74
Reapply "[cmake] Refactor clang unittest cmake" (#134195)
This reapplies 5ffd9bdb50b57 (#133545) with fixes.

The BUILD_SHARED_LIBS=ON build was fixed by adding missing LLVM
dependencies to the InterpTests binary in
unittests/AST/ByteCode/CMakeLists.txt.
2025-04-02 21:07:30 -07:00
Matt Arsenault
3140d51cf3
llvm-reduce: Remove unsupported from bitcode uselistorder test (#134185)
This was disabled due to flakiness but I'm currently unable to
reproduce.

I'm nervous the original issue still exists. However, I downgraded the
tripped assert in 8c18c25b1b22ea710edb40a4f167a6a8bfe6ff9d to a warning
since the same assert can trigger for illegitimate reasons.

Fixes #64157
2025-04-03 11:04:02 +07:00
Jason Molenda
4e40c7c4bd
[lldb][debugserver] Save and restore the SVE/SME register state (#134184)
debugserver isn't saving and restoring the SVE/SME register state around
inferior function calls.

Making arbitrary function calls while in Streaming SVE mode is generally
a poor idea because a NEON instruction can be hit and crash the
expression execution, which is how I missed this, but they should be
handled correctly if the user knows it is safe to do.

rdar://146886210
2025-04-02 20:37:07 -07:00
LU-JOHN
6a46c6c865
Ensure KnownBits passed when calculating from range md has right size (#132985)
KnownBits passed to computeKnownBitsFromRangeMetadata must have the same
bit width as the range metadata bit width. Otherwise the calculated
results will be incorrect.

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-04-03 10:17:14 +07:00
Younan Zhang
dcc2182bce
[Clang] Fix a lambda pattern comparison mismatch after ecc7e6ce4 (#133863)
In ecc7e6ce4, we tried to inspect the `LambdaScopeInfo` on stack to
recover the instantiating lambda captures. However, there was a mismatch
in how we compared the pattern declarations of lambdas: the constraint
instantiation used a tailored `getPatternFunctionDecl()` which is
localized in SemaLambda that finds the very primal template declaration
of a lambda, while `FunctionDecl::getTemplateInstantiationPattern` finds
the latest template pattern of a lambda. This difference causes issues
when lambdas are nested, as we always want the primary template
declaration.

This corrects that by moving `Sema::addInstantiatedCapturesToScope` from
SemaConcept to SemaLambda, allowing it to use the localized version of
`getPatternFunctionDecl`.

It is also worth exploring coalescing the implementation of
`getPatternFunctionDecl` with
`FunctionDecl::getTemplateInstantiationPattern`. But I’m leaving that
for the future, as I’d like to backport this fix (ecc7e6ce4 made the
issue more visible in clang 20, sorry!), and changing Sema’s ABI would
not be suitable in that regard. Hence, no release note.

Fixes https://github.com/llvm/llvm-project/issues/133719
2025-04-03 11:15:42 +08:00
Pengcheng Wang
4986a79648
[TableGen] Emit llvm::is_contained for CheckOpcode predicate (#134057)
When the list is large, using `llvm::is_contained` is of higher
performance than a sequence of comparisons. When the list is small,
the `llvm::is_contained` can be inlined and unrolled, which has the
same effect as using a sequence of comparisons.

And the generated code is more readable.
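
The two predicate shapes being compared can be sketched as below. This is a
standalone illustration, not TableGen output: `isContained` is a stand-in
written here so the example builds without LLVM headers, and the `Opcode`
values and predicate names are hypothetical.

```cpp
#include <algorithm>
#include <initializer_list>

// Stand-in for a membership helper in the style of llvm::is_contained.
template <typename T>
bool isContained(std::initializer_list<T> Set, const T &Value) {
  return std::find(Set.begin(), Set.end(), Value) != Set.end();
}

// Hypothetical opcode values, for illustration only.
enum Opcode : unsigned { ADD = 1, SUB = 2, MUL = 3, DIV = 4 };

// Predicate written as a sequence of comparisons.
bool isAddOrSubChain(unsigned Opc) { return Opc == ADD || Opc == SUB; }

// Same predicate written as a membership test.
bool isAddOrSubContained(unsigned Opc) {
  return isContained<unsigned>({ADD, SUB}, Opc);
}
```

Both functions return the same result for every opcode; the membership form
stays a single line as the opcode list grows.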
2025-04-03 11:11:36 +08:00
Owen Pan
4fe0d74275
[clang-format] Fix a bug in annotating braces (#134039)
Fix #133873
2025-04-02 20:08:56 -07:00
Jerry-Ge
94dbe5e405
[mlir][tosa] Remove extra whitespace in the PadOp example (#134113)
Trivial cleanup change.

Signed-off-by: Jerry Ge <jerry.ge@arm.com>
2025-04-02 19:40:54 -07:00
Jonas Devlieghere
18c43d01fc
[lldb-dap] Add a -v/--version command line argument (#134114)
Add a -v/--version command line argument to print the version of both
the lldb-dap binary and the liblldb it's linked against.

This is motivated by me trying to figure out which lldb-dap I had in my
PATH.
2025-04-02 18:40:37 -07:00
tangaac
ff0c2fbd8e
[LoongArch] Pre-commit tests for vector absolute difference (#132898) 2025-04-03 09:19:59 +08:00
Mircea Trofin
d59b2c4def
[ctxprof][nfc] Make computeImportForFunction a member of ModuleImportsManager (#134011) 2025-04-02 18:18:17 -07:00
Mircea Trofin
02467f9e21
[ctxprof] Option to move a whole tree to its own module (#133992)
Modules may contain a mix of functions that participate or don't participate in callgraphs covered by a contextual profile. We currently have been importing all the functions under a context root in the module defining that root, but if the other functions there are covered by flat profiles, the result is difficult to reason about.

This patch allows moving everything under a context root (and that root) in its own module. For now, we expect a module with a filename matching the GUID of the function be present in the set of modules known by the linker. This mechanism can be improved in a later patch.

Subsequent patches will handle implementing "move" instead of "import" semantics for the root function (because we want to make sure only one version of the root exists - so the optimizations we perform are actually the ones being observed at runtime).
2025-04-02 18:15:48 -07:00
Ankur Ahir
fb7135ec52
[Clang] fixed clang frontend crash with friend class declaration and overload == (#133878) 2025-04-03 09:11:27 +08:00
Jon Roelofs
749c20b3e0
[LIT] Add a test for lit.Test.toMetricValue. NFC 2025-04-02 17:35:14 -07:00
Joseph Huber
e5809f0172
[LLVM] Only build the GPU loader utility if it has LLVM-libc (#134141)
Summary:
There were some discussions about this being included by default. I need
to fix this up and codify the use of LLVM libc inside of LLVM. For now,
just turn it off unless the user requested the `libc` GPU stuff. This
matches the old behavior.
2025-04-02 19:26:19 -05:00
David Peixotto
b55bab2292
[lldb] Fix plugin manager test failure on windows (#134173)
This is an attempt to fix a test failure from #133794 when running on
windows builds. I suspect we are running into a case where the
[ICF](https://learn.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=msvc-170)
optimization kicks in and combines the CreateSystemRuntimePlugin*
functions into a single address. This means that we cannot uniquely
unregister the plugin based on its create function address.

The fix is to have each create function return a different (bogus) value.
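
A minimal sketch of the idea, assuming a registry that identifies plugins by
the address of their create function. The `CreateFn` type, the `Registry`
struct, and both create functions are hypothetical and are not the lldb
plugin manager API.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical create-callback type; the registry is keyed by its address.
using CreateFn = int (*)();

// Distinct return values keep the function bodies different, so a linker
// that folds identical code has no reason to merge them into one address.
int createPluginA() { return 1; }
int createPluginB() { return 2; }

struct Registry {
  std::vector<CreateFn> Callbacks;

  void add(CreateFn Fn) { Callbacks.push_back(Fn); }

  // Removes the first entry whose address matches; returns false if absent.
  bool remove(CreateFn Fn) {
    auto It = std::find(Callbacks.begin(), Callbacks.end(), Fn);
    if (It == Callbacks.end())
      return false;
    Callbacks.erase(It);
    return true;
  }
};
```

If the two create functions shared one address, `remove(createPluginB)` could
erase the entry registered for plugin A instead, which is the kind of
ambiguity the message above suspects.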
2025-04-02 17:22:46 -07:00
Matt Arsenault
7559c64c5e
CloneModule: Map global initializers after mapping the function (#134082) 2025-04-03 07:17:12 +07:00
Sterling-Augustine
f68a5185d0
Allow this test to pass when the source is on a read-only filesystem (#134179)
llc attempts to create an empty file in the current directory, but it
can't do that on a read-only file system. Send that empty-output to
stdout, which prevents this failure.
2025-04-02 16:49:57 -07:00
Craig Topper
ffed17624e [RISCV] Correct the error location for the X26 check in parseRegListCommon.
We should point to the start of the reglist not the closing parenthesis.

I also moved the check to after we finish parsing the closing brace.
The diagnostic mentions '{ra, s0-s10} or {x1, x8-x9, x18-x26}' so we
should be sure that's what we parsed.
2025-04-02 16:46:21 -07:00
Jeremy Day
be3abfc00f
[lldb] Clear thread name container before writing UTF8 bytes (#134150)
`llvm::convertUTF16ToUTF8String` opens with an assertion that the output
container is empty:
3bdf9a0880/llvm/lib/Support/ConvertUTFWrapper.cpp (L83-L84)

It's not clear to me why this function requires the output container to
be empty instead of just overwriting it, but the callsite in
`TargetThreadWindows::GetName` may reuse the container without clearing
it out first, resulting in an assertion failure:
```
 # Child-SP          RetAddr               Call Site
00 000000d2`44b8ea48 00007ff8`beefc12e     ntdll!NtTerminateProcess+0x14
01 000000d2`44b8ea50 00007ff8`bcf518ab     ntdll!RtlExitUserProcess+0x11e
02 000000d2`44b8ea80 00007ff8`bc0e0143     KERNEL32!ExitProcessImplementation+0xb
03 000000d2`44b8eab0 00007ff8`bc0e4c49     ucrtbase!common_exit+0xc7
04 000000d2`44b8eb10 00007ff8`bc102ae6     ucrtbase!abort+0x69
05 000000d2`44b8eb40 00007ff8`bc102cc1     ucrtbase!common_assert_to_stderr<wchar_t>+0x6e
06 000000d2`44b8eb80 00007fff`b8e27a80     ucrtbase!wassert+0x71
07 000000d2`44b8ebb0 00007fff`b8b821e1     liblldb!llvm::convertUTF16ToUTF8String+0x30 [D:\r\_work\swift-build\swift-build\SourceCache\llvm-project\llvm\lib\Support\ConvertUTFWrapper.cpp @ 88] 
08 000000d2`44b8ec30 00007fff`b83e9aa2     liblldb!lldb_private::TargetThreadWindows::GetName+0x1b1 [D:\r\_work\swift-build\swift-build\SourceCache\llvm-project\lldb\source\Plugins\Process\Windows\Common\TargetThreadWindows.cpp @ 198] 
09 000000d2`44b8eca0 00007ff7`2a3c3c14     liblldb!lldb::SBThread::GetName+0x102 [D:\r\_work\swift-build\swift-build\SourceCache\llvm-project\lldb\source\API\SBThread.cpp @ 432] 
0a 000000d2`44b8ed70 00007ff7`2a3a5ac6     lldb_dap!lldb_dap::CreateThread+0x1f4 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\JSONUtils.cpp @ 877] 
0b 000000d2`44b8ef10 00007ff7`2a3b0ab5     lldb_dap!`anonymous namespace'::request_threads+0xa6 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\lldb-dap.cpp @ 3906] 
0c 000000d2`44b8f010 00007ff7`2a3b0fe8     lldb_dap!lldb_dap::DAP::HandleObject+0x1c5 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\DAP.cpp @ 796] 
0d 000000d2`44b8f130 00007ff7`2a3a8b96     lldb_dap!lldb_dap::DAP::Loop+0x78 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\DAP.cpp @ 812] 
0e 000000d2`44b8f1d0 00007ff7`2a4b5fbc     lldb_dap!main+0x1096 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\lldb-dap.cpp @ 5319] 
0f (Inline Function) --------`--------     lldb_dap!invoke_main+0x22 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78] 
10 000000d2`44b8fb80 00007ff8`bcf3e8d7     lldb_dap!__scrt_common_main_seh+0x10c [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
11 000000d2`44b8fbc0 00007ff8`beefbf6c     KERNEL32!BaseThreadInitThunk+0x17
12 000000d2`44b8fbf0 00000000`00000000     ntdll!RtlUserThreadStart+0x2c
```

This stack trace was captured from the lldb distributed in the Swift
toolchain. The issue is easy to reproduce by resuming from a breakpoint
twice in VS Code.

I've verified that clearing out the container here fixes the assertion
failure.
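
The failure mode and the fix can be sketched in isolation as below. This is a
model only: `convertToUTF8` is a hypothetical ASCII-only stand-in that copies
just the empty-output assertion described above, and `ThreadNameCache` is a
hypothetical caller, not `TargetThreadWindows`.

```cpp
#include <cassert>
#include <string>

// Stand-in converter that, like the function described above, asserts that
// its output container is empty. ASCII-only to keep the sketch short.
bool convertToUTF8(const std::u16string &Src, std::string &Out) {
  assert(Out.empty() && "output container must be empty");
  for (char16_t C : Src) {
    if (C > 0x7F)
      return false; // the sketch does not handle non-ASCII code units
    Out.push_back(static_cast<char>(C));
  }
  return true;
}

// Caller that reuses one buffer across calls and clears it first.
struct ThreadNameCache {
  std::string Name;

  const std::string &update(const std::u16string &Raw) {
    Name.clear(); // without this, a second call would trip the assertion
    if (!convertToUTF8(Raw, Name))
      Name.clear();
    return Name;
  }
};
```

Calling `update` twice on the same `ThreadNameCache` works because the buffer
is emptied before each conversion.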
2025-04-02 16:35:02 -07:00
Craig Topper
40a0e34304 [RISCV] Use location of negative sign if present for error in parseZcmpStackAdj
As far as the user is concerned the negative sign and the number
are a single value so the error should point to the beginning.
2025-04-02 16:19:16 -07:00
Craig Topper
1edb6b0af1 [RISCV] Fix crash in parseZcmpStackAdj if token is not an integer. 2025-04-02 16:06:15 -07:00
Andy Kaylor
c57b9c233a
[CIR] Generate the nsw flag correctly for unary ops (#133815)
A previous checkin used a workaround to generate the nsw flag where
needed for unary ops. This change upstreams a subsequent change that was
made in the incubator to generate the flag correctly.
2025-04-02 15:48:55 -07:00
Jonas Devlieghere
3f7ca88267
[lldb-dap] Add progress events to the packet list (#134157)
Before #134048, TestDAP_Progress relied on wait_for_event to block until
the progressEnd came in. However, progress events were not added to the
packet list, so this call would always time out. This PR makes it so
that packets are added to the packet list, and you can block on them.
2025-04-02 15:33:07 -07:00
Yaxun (Sam) Liu
dedb632b83
[HIP] Claim --offload-compress for -M (#133456)
CMake automatically generates dependency files with all compilation
options provided by users. When users use `--offload-compress` for HIP
compilation, it causes warnings when CMake generates dependency files.
Claim this option to suppress warnings.
2025-04-02 18:28:56 -04:00
Matheus Izvekov
f302f35526
[clang] Track final substitution for Subst* AST nodes (#132748) 2025-04-02 19:27:29 -03:00
Jorge Gorbe Moya
990a086d9d [bazel] Add missing dep after 51d1c7288662ea801b07133fd2d22aff6bac50e2 2025-04-02 15:17:06 -07:00
Dave Lee
93d3775da8
[lldb] Fix tagged-pointer info address parsing (#134123)
Change `objc tagged-pointer info` to call
`OptionArgParser::ToRawAddress`.

Previously `ToAddress` was used, but it calls `FixCodeAddress`, which
can erroneously mutate the bits of a tagged pointer.
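
Why an address fix-up step is harmful for a tagged pointer can be sketched as
below. Everything here is an assumption for illustration: the 48-bit
`AddressMask` and the two free functions are hypothetical models and do not
reproduce what `FixCodeAddress` or `ToRawAddress` do in lldb.

```cpp
#include <cstdint>

// Hypothetical mask: treat only the low 48 bits as address bits.
constexpr std::uint64_t AddressMask = (std::uint64_t{1} << 48) - 1;

// Model of a fix-up step that strips everything outside the address bits.
std::uint64_t fixCodeAddress(std::uint64_t Addr) { return Addr & AddressMask; }

// Model of a raw parse that leaves every bit untouched.
std::uint64_t toRawAddress(std::uint64_t Addr) { return Addr; }
```

Under this model a value such as 0x8000000000000012 keeps its high tag bit
through `toRawAddress` but comes back as 0x12 from `fixCodeAddress`, so the
tag information is lost.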
2025-04-02 15:16:58 -07:00
Hansang Bae
8100bd58a3
[OpenMP] 6.0 (TR11) Memory Management Update (#97106)
TR11 introduced changes to support target memory management in a unified
way by defining a series of API routines and additional traits. Host
runtime is oblivious to how actual memory resources are mapped when
using the new API routines, so it can only support how the composed
memory space is maintained, and the offload backend must handle which
memory resources are actually used to allocate memory from the memory
space.

Here is a summary of the implementation.
* Implemented 12 API routines to get/manipulate memory space/allocator.
* Memory space composed with a list of devices has a state with resource
description, and runtime is responsible for maintaining the allocated
memory space objects.
* Defined interface with offload runtime to access memory resource list,
and to redirect calls to omp_alloc/omp_free since it requires
backend-specific information.
* Value of omp_default_mem_space changed from 0 to 99, and
omp_null_mem_space took the value 0 as defined in the language.
* New allocator traits were introduced, but how to use them is up to the
offload backend.
* Added basic tests for the new API routines.
2025-04-02 17:16:30 -05:00
Sami Tolvanen
acc6bcdc50
Support alternative sections for patchable function entries (#131230)
With -fpatchable-function-entry (or the patchable_function_entry
function attribute), we emit records of patchable entry locations to the
__patchable_function_entries section. Add an additional parameter to the
command line option that allows one to specify a different default
section name for the records, and an identical parameter to the function
attribute that allows one to override the section used.

The main use case for this change is the Linux kernel using prefix NOPs
for ftrace, and thus depending on __patchable_function_entries to locate
traceable functions. Functions that are not traceable currently disable
entry NOPs using the function attribute, but this creates a
compatibility issue with -fsanitize=kcfi, which expects all indirectly
callable functions to have a type hash prefix at the same offset from
the function entry.

Adding a section parameter would allow the kernel to distinguish between
traceable and non-traceable functions by adding entry records to
separate sections while maintaining a stable function prefix layout for
all functions. LKML discussion:

https://lore.kernel.org/lkml/Y1QEzk%2FA41PKLEPe@hirez.programming.kicks-ass.net/
2025-04-02 21:53:55 +00:00
Theo de Magalhaes
76fa9530c9
[clang] add support for -Wpadded on Windows (#130182)
Implements the -Wpadded warning for --target=x86_64-windows-msvc etc.

Fixes #61702.
2025-04-02 14:46:58 -07:00
Florian Hahn
380defd4b3
[VPlan] Update VPInterleaveRecipe to take debug loc directly as arg (NFC) 2025-04-02 22:46:38 +01:00
Chris B
81601cf3ab
[Docs] Clarify that reassoc isn't just for reassociation (#133168)
The `reassoc` fast-math flag allows a much wider array of algebraic
transformations than just strictly reassociations. In some cases it does
commutations, distributions, and folds away redundant inverse
operations...
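
As a small illustration of why such rewrites need an explicit flag,
floating-point addition is not associative under IEEE-754 double arithmetic.
The sketch below is plain C++ with hypothetical function names and does not
involve the `reassoc` flag itself.

```cpp
// Sums three doubles with two different groupings.
double sumLeftFirst(double A, double B, double C) { return (A + B) + C; }
double sumRightFirst(double A, double B, double C) { return A + (B + C); }
```

With A = 1e20, B = -1e20 and C = 1.0, `sumLeftFirst` yields 1.0 while
`sumRightFirst` yields 0.0, because -1e20 + 1.0 rounds back to -1e20. A
transformation that regroups the operands therefore changes the result.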

While it might make sense to fix the flag naming at some point, in the
meantime we should at least have the docs be accurate to avoid
confusion.
2025-04-02 16:43:10 -05:00
Nirvedh Meshram
42b3f91fd6
[mlir] Vectorize tensor.pad with low padding for unit dims (#133808)
We currently do not have masked vectorization support for tensor.pad with
low padding. However, we can allow this in the special case where the
result dimension after padding is a unit dim. The reason is when we
actually have a low pad on a unit dim, the input size of that dimension
will be (or should be for correct IR) dynamically zero and hence we will
create a zero mask which is correct. If the low pad is dynamically zero
then the lowering is correct as well.

---------

Signed-off-by: Nirvedh <nirvedh@gmail.com>
2025-04-02 16:32:36 -05:00
Valentin Clement (バレンタイン クレメン)
db21ae7803
[flang][cuda] Support any_sync and ballot_sync (#134135) 2025-04-02 14:26:09 -07:00
Brox Chen
066787b9bd
[AMDGPU][True16][CodeGen] fold clamp update for true16 (#128919)
Check through COPY for possible clamp folding for v_mad_mixhi_f16 isel
2025-04-02 17:10:53 -04:00
Craig Topper
38937ac24c [RISCV] Check line and column for errors in rv(32/64)zcmp-invalid.s. NFC
Same for the Xqccmp version.
2025-04-02 14:00:08 -07:00
Florian Hahn
4b67c53e20
[VPlan] Use recipe debug loc instead of instr DLs in more cases (NFC)
Update both VPInterleaveRecipe and VPReplicateRecipe codegen to use
debug location directly from the recipe, not the underlying instruction.
This removes another dependency on underlying instructions.
2025-04-02 21:51:17 +01:00
vporpo
a1b0b4997e
[SandboxVec][NFC] Replace std::regex with llvm::Regex (#134110) 2025-04-02 13:46:56 -07:00