58671 Commits

Author SHA1 Message Date
Fangrui Song
329a675006 ELFObjectWriter: Simplify STT_SECTION adjustment. NFC 2025-04-12 20:54:05 -07:00
Fangrui Song
b864405c9b MCAsmInfo: Remove unused UseParensForDollarSignNames
Follow-up to 3acccf042ab8a7b7e663bb2b2fac328d9bf65b38
2025-04-11 22:09:24 -07:00
donald chen
7a6a79551d
[NFC][equivalenceClass] Refactor coding style in EquivalenceClasses.h. (#135467) 2025-04-12 13:08:50 +08:00
Valentin Clement (バレンタイン クレメン)
2837fd7e5a
[flang][openacc] Allow if_present multiple times on host_data and update (#135422)
Similar to #135415.

The spec has no strict restriction limiting the host_data and update
directives to a single `if_present` clause. Allowing this clause
multiple times does not change its semantics. This patch relaxes the
rules in ACC.td since there is no restriction in the standard.

The OpenACC dialect represents the `if_present` clause with a `UnitAttr`
so the attribute will be set if there is one or more `if_present` clause.
2025-04-11 14:01:03 -07:00
Valentin Clement (バレンタイン クレメン)
609361ab39
[flang][openacc] Allow finalize clause on exit data more than once (#135415)
The spec has no strict restriction limiting the `exit data` directive
to a single `finalize` clause. Allowing this clause multiple times does
not change its semantics. This patch relaxes the rules in `ACC.td`
since there is no restriction in the standard.

The OpenACC dialect represents the `finalize` clause with a `UnitAttr` so the
attribute will be set if there is one or more `finalize` clause.
2025-04-11 13:54:48 -07:00
Victor Vianna
90a202f2ff
[cpp23] Remove usage of std::aligned_union<> in llvm (#135146)
std::aligned_union<> is deprecated in C++23. See more details in the
linked bug.

Bug: https://crbug.com/388068052
2025-04-11 16:16:33 -04:00
Shilei Tian
a45b133d40
[AMDGPU][Verifier] Mark calls to entry functions as invalid in the IR verifier (#134910) 2025-04-11 15:32:37 -04:00
Valentin Clement (バレンタイン クレメン)
8fb6bb3e23
[flang][openacc] Allow multiple device_type clauses on init and shutdown (#135314)
Relax the restriction for init and shutdown directives for device_type
clause. The clause can be allowed multiple times.
2025-04-11 10:15:17 -07:00
Harrison Hao
e36e57b478
[BUILD] Fix unicode build issue. (#135315)
Fix Unicode build failure:
```
C4819	The file contains a character that cannot be represented in the current code page (936). Save the file in Unicode format to prevent data loss	...\llvm-project\llvm\include\llvm\Support\Compiler.h
```
2025-04-11 20:35:16 +08:00
Simon Pilgrim
1d3d3f404e
[DAG] SDPatternMatch::ReassociatableOpc_match - pull out repeated pattern count expression. NFC. (#135187)
Minor tidy-up to reduce template noise.

CC @esan5
2025-04-11 10:53:28 +01:00
Fangrui Song
34fb673b08 MCStreamer: Remove Mach-O specific functions from derived MCObjectStreamer 2025-04-10 23:15:41 -07:00
Fangrui Song
c04d9d57ee
MCAsmStreamer: Replace the MCInstPrinter * parameter with unique_ptr
... to clarify ownership, aligning with other parameters. Using
`std::unique_ptr` encourages users to manage `createMCInstPrinter` with
a unique_ptr instead of a raw pointer, reducing the risk of memory
leaks.
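
The ownership point can be sketched in isolation. This is a minimal illustration, not the LLVM code: `Printer`, `Streamer`, `createPrinter`, and `LivePrinters` are hypothetical stand-ins for the printer, the streamer, and a raw-pointer factory.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-ins; these are not the LLVM classes.
int LivePrinters = 0; // number of Printer objects currently alive

struct Printer {
  Printer() { ++LivePrinters; }
  ~Printer() { --LivePrinters; }
};

struct Streamer {
  // Taking std::unique_ptr by value makes the ownership transfer explicit.
  explicit Streamer(std::unique_ptr<Printer> P) : IP(std::move(P)) {}
  std::unique_ptr<Printer> IP;
};

// Hypothetical factory that returns an owning raw pointer.
Printer *createPrinter() { return new Printer(); }

int liveAfterScope() {
  {
    // Wrap the raw pointer right away so every exit path frees it.
    std::unique_ptr<Printer> P(createPrinter());
    Streamer S(std::move(P));
  } // S is destroyed here and releases the Printer.
  return LivePrinters;
}
```

With a raw-pointer parameter, a caller that bails out before handing the printer over has to remember to delete it; with the `unique_ptr` parameter the compiler enforces the handover.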

* llvm-mc: fix a leak and update llvm/test/tools/llvm-mc/disassembler-options.test
* #121078 copied the llvm-mc code to CodeGenTargetMachineImpl and made
  the same mistake. Fixed by 2b8cc651dca0c000ee18ec79bd5de4826156c9d6

Using unique_ptr requires #include MCInstPrinter.h in a few translation
units.

* Delete a createAsmStreamer overload I deprecated in 2024
* SystemZMCTargetDesc.cpp: rename to `createSystemZAsmStreamer` to fix
  an overload conflict.

Pull Request: https://github.com/llvm/llvm-project/pull/135128
2025-04-10 21:25:35 -07:00
Fangrui Song
f2ff298867 [MC] Remove deprecated createAsmStreamer/createMCObjectStreamer with 3 trailing bool
They were deprecated around 867faeec054abb4c035673189c1169fef45f54c8
(June 2024)
2025-04-10 09:31:13 -07:00
zhijian lin
378ac572ac
Reland "[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR." (#135056)
A new ISD::POISON SDNode is introduced to represent the poison value in
the IR, replacing the previous use of ISD::UNDEF
2025-04-10 11:29:14 -04:00
u4f3
5978bb2936
[DeadArgElim] fix verifier failure when changing musttail's function signature (#127366)
This commit is for #107569 and #126817.

Stop changing the function signatures of a musttail call's caller and
callee when the calling convention is neither swifttailcc nor tailcc. The
verifier makes sure that a musttail call's caller and callee share exactly
the same signature; see commit 9ff2eb1 and #54964.

Otherwise just make sure the return type is the same and then process
musttail like usual calls.

close #107569, #126817
2025-04-10 07:08:09 -07:00
Abhishek Kaushik
5543d9ded7
[RegAlloc][NFC] Use std::move to avoid copy (#134533) 2025-04-10 14:45:02 +05:30
Fangrui Song
3fd0d22d74 AArch64AsmParser: Restore Lsym@page-offset support
https://github.com/llvm/llvm-project/pull/134202 removed support for
`sym@page-offset` in instruction operands. This change is generally
reasonable since subtracting an offset from a symbol typically doesn’t
make sense for Mach-O due to its .subsections_via_symbols mechanism, which treats
them as separate atoms.

However, BoringSSL relies on a temporary symbol with a negative offset,
which can be meaningful when the symbol and the referenced location are
within the same atom.
```
../../third_party/boringssl/src/gen/bcm/p256-armv8-asm-apple.S:1160:25: error: unexpected token in argument list
 adrp x23,Lone_mont@PAGE-64
```

It's worth noting that expressions involving @ can be complex and
brittle in MCParser, and much of the Mach-O @ offsets remains
under-tested.

* Allow default argument for parsePrimaryExpr. The argument, used by the niche llvm-ml,
  should not require other targets to adapt.
2025-04-09 23:13:06 -07:00
Matt Arsenault
f819f46284
Reapply "Inline: Propagate callsite nofpclass attribute" (#135018)
This reverts commit 3f38cd07d820248fd2043efb1341fabaac2d84a6.

Fix case where inner callsite has nofpclass but callsite does not.
2025-04-10 07:15:58 +02:00
donald chen
27ca4837ee
[EquivalenceClasses] Introduce erase member function (#134660)
Introduce 'erase(const ElemTy &V)' member function to allow the deletion
of a certain value from EquivClasses. This is essential for certain
scenarios that require modifying the contents of EquivClasses.

---------

Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-04-10 12:42:10 +08:00
Alan Li
2795abb2f8
[GISel][AMDGPU] Expand ShuffleVector (#124527)
This patch dismantles G_SHUFFLE_VECTOR before lowering. The original
lowering would emit extract vector element ops. We found that by using
unmerged values the build vector op combine could find ways to fold.

Only enabled on AMDGPU.

This resolves #123631
2025-04-09 17:51:24 -07:00
Florian Hahn
7cbf78ec74
[VPlan] Remove no-op addMetadata for VPWidenGEP/VPWidenIntOrFPInd (NFC).
GEPs and truncates should not have any metadata that can be propagated at
the moment, so addMetadata is a no-op. Remove the calls.

This patch also adds assertions to the recipes' constructors, to ensure
no metadata is accidentally dropped.
2025-04-09 22:03:43 +01:00
Jakub Kuderski
ef1088f703
Revert "[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR." (#135060)
Reverts llvm/llvm-project#125883

This PR causes crashes in RISC-V codegen around f16/f64 poison values:
https://github.com/llvm/llvm-project/pull/125883#issuecomment-2787048206
2025-04-09 14:40:56 -04:00
Mircea Trofin
6d4d017fd2
[llvm-extract] Delete dead CG Profile edges (#134940)
When `llvm-extract`-ing a function, and the `CG Profile` flag is present in the original module, we end up with lots of `!{null, null, i64 1234}` entries for call edges that have disappeared as a result of the removed functions.

This patch fixes that by adding a pass to `llvm-extract` that finds `CG Profile` edges with one or both operands `null` and removes them. This results in a cleaner output.
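
The filtering step can be sketched on plain data. This is an illustration under assumptions, not the pass itself: `Function`, `CGProfileEdge`, and `pruneDeadEdges` are hypothetical names, and a null pointer stands in for a `null` metadata operand.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one `CG Profile` entry: caller, callee, count.
struct Function {};
struct CGProfileEdge {
  const Function *Caller;
  const Function *Callee;
  std::uint64_t Count;
};

// Drop every edge whose caller or callee (or both) is null.
void pruneDeadEdges(std::vector<CGProfileEdge> &Edges) {
  Edges.erase(std::remove_if(Edges.begin(), Edges.end(),
                             [](const CGProfileEdge &E) {
                               return !E.Caller || !E.Callee;
                             }),
              Edges.end());
}

// Builds three edges, two of them dead, and returns how many survive.
std::size_t prunedSizeExample() {
  Function F, G;
  std::vector<CGProfileEdge> Edges = {
      {&F, &G, 10}, {nullptr, nullptr, 1234}, {&F, nullptr, 7}};
  pruneDeadEdges(Edges);
  return Edges.size();
}
```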
2025-04-09 10:36:12 -07:00
Hans Wennborg
0eb560a4de
[Coroutines] Don't assert if coro-early runs more than once (#134854)
The pass may run more than once, for example during ThinLTO (see bug).
Maybe that means those pass pipelines aren't optimal, but the pass
should be resilient against that.

Fixes #134054
2025-04-09 18:59:02 +02:00
Matt Arsenault
2a7f12e37b
CodeGen: Trim redundant template argument from defusechain_iterator (#135024)
Only one of ByOperand, ByInstr, or ByBundle should be true. Replace
ByBundle with !ByInstr, and assert that both are not used.
2025-04-09 18:28:00 +02:00
Matt Arsenault
840b366d47
CodeGen: Remove redundant arguments to defusechain_instr_iterator (#135023)
ByOperand must be false, this is implied by the iterator type.
The instr_iterator cases are a separate implementation from the single
operand defusechain_iterator.

Additionally ByInstr and ByBundle are mutually exclusive.
2025-04-09 18:24:28 +02:00
Matt Arsenault
d99cdd7fba MachineRegisterInfo: Remove trailing whitespace 2025-04-09 16:09:24 +02:00
Zhaoxuan Jiang
e24c9e7a0c
[IR] improve hashing quality for ValueInfo (#132917)
The current hashing quality for `ValueInfo` is poor because it uses
pointers as the hash value, which can negatively impact performance in
various places that use a `DenseSet`/`Map` of `ValueInfo`. In one
observed case, `ModuleSummaryIndex::propagateAttributes()` was taking
about 25 minutes to complete on a ThinLTO application. Profiling
revealed that the majority of this time was spent operating on the
`MarkedNonReadWriteOnly` set.

With the improved hashing, the execution time for `propagateAttributes`
is dramatically reduced to less than 10 seconds.
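
Why a raw pointer value makes a poor hash can be shown with synthetic addresses. This is a sketch under assumptions and not the hash used by the patch: it assumes a power-of-two table indexed by the low bits of the hash, 16-byte-aligned keys, and a simple shift-xor mix chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Identity hash: the address itself is the hash value.
std::uint64_t identityHash(std::uint64_t Addr) { return Addr; }

// Illustrative shift-xor mix that discards the always-zero alignment bits.
std::uint64_t mixedHash(std::uint64_t Addr) {
  return (Addr >> 4) ^ (Addr >> 9);
}

// Count how many of NumBuckets (a power of two) are hit by 64 synthetic,
// 16-byte-aligned addresses starting at 0x1000.
std::size_t bucketsUsed(std::uint64_t (*Hash)(std::uint64_t),
                        std::uint64_t NumBuckets) {
  std::set<std::uint64_t> Used;
  for (std::uint64_t I = 0; I < 64; ++I)
    Used.insert(Hash(0x1000 + 16 * I) & (NumBuckets - 1));
  return Used.size();
}
```

With 64 buckets, `bucketsUsed(identityHash, 64)` is 4 because the low four bits of every address are zero, while `bucketsUsed(mixedHash, 64)` is 64. Long collision chains of the first kind are what turn set operations into near-linear scans.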
2025-04-09 06:44:35 -07:00
Akshat Oke
2f6b06b264
[CodeGen][NPM] Port PostRAHazardRecognizer to NPM (#130066) 2025-04-09 16:36:22 +05:30
NimishMishra
53fa92dcad
[mlir][llvm][OpenMP] Hoist __atomic_load alloca (#132888)
Current implementation of `__atomic_compare_exchange` uses an alloca for
`__atomic_load`, leading to issues like
https://github.com/llvm/llvm-project/issues/120724. This PR hoists this
alloca to `AllocaIP`.


Fixes: https://github.com/llvm/llvm-project/issues/120724
2025-04-09 03:01:44 -07:00
Fangrui Song
c46be969f0 [MC] Optimize isInSection
Remove one call of getFragment. The `SetUsed` bit isn't needed here.
2025-04-08 23:42:02 -07:00
Cyndy Ishida
bada5eb009
[llvm-nm] Fix how inlined dylibs are reported from tbd files (#134498)
An Inlined library is a dylib that is reexported from an umbrella or
top-level library. When this is encoded in tbd files, ensure we are
reading the symbol table from the inlined library when
`--add-inlinedinfo` is used as opposed to the top-level lib.

resolves: rdar://147767733
2025-04-08 20:56:54 -07:00
Fangrui Song
e6b55cd73b MCExpr: Remove unused VK_GOT/VK_GOTPCREL 2025-04-08 20:35:42 -07:00
Fangrui Song
50428fb5e9
[WebAssembly] Add WebAssembly::Specifier
Move wasm-specific members outside of MCSymbolRefExpr::VariantKind (a
legacy interface I am eliminating). Most changes are mechanical and
similar to what I've done for many ELF targets (e.g. X86 #132149)

Notes:

* `fixSymbolsInTLSFixups` is replaced with `setTLS` in
  `WebAssemblyWasmObjectWriter::getRelocType`, similar to what I've done
  for many ELF targets.
* `SymA->setUsedInGOT()` in `recordRelocation` is moved to
  `getRelocType`.

While here, rename "Modifier" to "Specifier":

> "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.

Pull Request: https://github.com/llvm/llvm-project/pull/133116
2025-04-08 19:44:40 -07:00
Fangrui Song
02b377d8f7
[llc] Add -M for InstPrinter options
For many targets, llvm-objdump and llvm-mc
(https://reviews.llvm.org/D103004) support -M no-aliases (e.g.
`RISCVInstPrinter::applyTargetSpecificCLOption`).

This patch implements -M for llc.

While here, rename "DisassemblerOptions" in llvm-mc to the more
appropriate "InstPrinterOptions". For llvm-mc --assemble, there is no
disassembler involved.

Pull Request: https://github.com/llvm/llvm-project/pull/121078
2025-04-08 19:34:03 -07:00
Mircea Trofin
442050ce8f
[ctxprof] Flatten indirect call info in pre-thinlink compilation (#134766)
Same idea as in #134723 - flatten indirect call info in `"VP"` `MD_prof` metadata for the thinlinker, for cases that aren't covered by a contextual profile. If we don't ICP an indirect call target in the specialized module, the call will fall to the copy of that target outside the specialized module. If the graph under that target also has some indirect calls, in the absence of this pass, we'd have a steeper performance regression - because none of those would have a chance to be ICPed.
2025-04-08 17:33:37 -07:00
Mircea Trofin
4c90d977db
[ctxprof] Use the flattened contextual profile pre-thinlink (#134723)
Flatten the profile pre-thinlink so that ThinLTO has something to work with for the parts of the binary that aren't covered by contextual profiles. Post-thinlink, the flattener is re-run and will actually change profile info, but just for the modules containing contextual trees ("specialized modules"). For the rest, the flattener just yanks out the instrumentation.
2025-04-08 17:30:49 -07:00
Alex MacLean
a6853cd9af
[NVPTX] Auto-Upgrade llvm.nvvm.atomic.load.{inc,dec}.32 (#134111)
These intrinsics can be upgraded to an atomicrmw instruction.
2025-04-08 13:44:11 -07:00
Kevin McAfee
c13436e516
[SDAG][NVPTX] Add TLI hook to get preferred FP->INT opcode (#132470)
Extract the logic for choosing FP_TO_UINT vs FP_TO_SINT opcodes into a
TLI hook. This hook can be overridden by targets that prefer not to use
the default behavior of replacing FP_TO_UINT with FP_TO_SINT when both
are custom.
Implement an override for NVPTX to only change opcode when
FP_TO_UINT is not legal and FP_TO_SINT is legal.
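
The shape of such a hook can be sketched with a virtual method. This is a minimal model based only on the description above, not the actual TargetLowering interface: `LoweringInfo`, `PickyLoweringInfo`, `Action`, and `preferredFPToIntOpcode` are hypothetical names, and the default policy is modeled exactly as worded in this message.

```cpp
enum class Opcode { FP_TO_SINT, FP_TO_UINT };
enum class Action { Legal, Custom, Expand };

// Hypothetical stand-in for a target lowering description.
struct LoweringInfo {
  Action SIntAction = Action::Legal;
  Action UIntAction = Action::Legal;
  virtual ~LoweringInfo() = default;

  // Default policy as described: swap FP_TO_UINT for FP_TO_SINT when both
  // are custom.
  virtual Opcode preferredFPToIntOpcode(Opcode Op) const {
    if (Op == Opcode::FP_TO_UINT && UIntAction == Action::Custom &&
        SIntAction == Action::Custom)
      return Opcode::FP_TO_SINT;
    return Op;
  }
};

// Override in the spirit of the NVPTX change: only switch opcodes when
// FP_TO_UINT is not legal and FP_TO_SINT is legal.
struct PickyLoweringInfo : LoweringInfo {
  Opcode preferredFPToIntOpcode(Opcode Op) const override {
    if (Op == Opcode::FP_TO_UINT && UIntAction != Action::Legal &&
        SIntAction == Action::Legal)
      return Opcode::FP_TO_SINT;
    return Op;
  }
};
```

When both actions are `Custom`, the base class returns `FP_TO_SINT` while the override keeps `FP_TO_UINT`, which is the difference the hook exists to express.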
2025-04-08 13:32:39 -07:00
Douglas
156e2532ed
Revert "Rename F_no_mmap to F_mmap" (#134924)
Reverts llvm/llvm-project#134787

Causes the LIT test `lld\test\ELF\link-open-file.test` to fail on the
`llvm-clang-x86_64-sie-win` Build Bot. First instance of the failure
observed in: https://lab.llvm.org/buildbot/#/builders/46/builds/14847
2025-04-08 13:20:42 -07:00
Alexandre Ganea
661f90ad08 [SandboxIR] Fix warning when building on Windows with clang-cl. NFC.
This fixes:
```
[2295/3381] Building CXX object lib\Passes\CMakeFiles\LLVMPasses.dir\PassBuilder.cpp.obj
In file included from C:\git\llvm-project\llvm\lib\Passes\PassBuilder.cpp:368:
In file included from C:\git\llvm-project\llvm\include\llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h:16:
C:\git\llvm-project\llvm\include\llvm/SandboxIR/Context.h(73,16): warning: unqualified friend declaration referring to type outside of the nearest enclosing namespace is a Microsoft extension; add a nested name specifier [-Wmicrosoft-unqualified-friend]
   73 |   friend class Region;            // For LLVMCtx.
      |                ^
      |                ::llvm::
1 warning generated.
```
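
The kind of change the warning asks for can be sketched as follows. This is an illustration with hypothetical namespaces `outer` and `inner`, not the SandboxIR headers: the friend declaration names the class with a nested name specifier so that it refers to the class in the outer namespace on every compiler.

```cpp
namespace outer {
class Region; // declared in the outer namespace

namespace inner {
class Context {
  int Secret = 42;
  // Qualified name: befriends outer::Region. An unqualified
  // `friend class Region;` would, per the standard, name
  // outer::inner::Region instead.
  friend class ::outer::Region;
};
} // namespace inner

class Region {
public:
  // Allowed only because Context befriends outer::Region.
  static int peek(const inner::Context &C) { return C.Secret; }
};
} // namespace outer
```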
2025-04-08 14:42:31 -04:00
Alexandre Ganea
446e793d77 [SandboxIR] Fix warning when building on Windows with clang-cl. NFC.
This fixes:
```
[1230/3381] Building CXX object lib\Transforms\Vectorize\CMakeFiles\LLVMVectorize.dir\SandboxVectorizer\VecUtils.cpp.obj
In file included from C:\git\llvm-project\llvm\lib\Transforms\Vectorize\SandboxVectorizer\VecUtils.cpp:9:
In file included from C:\git\llvm-project\llvm\include\llvm/Transforms/Vectorize/SandboxVectorizer/VecUtils.h:17:
C:\git\llvm-project\llvm\include\llvm/SandboxIR/Type.h(55,16): warning: unqualified friend declaration referring to type outside of the nearest enclosing namespace is a Microsoft extension; add a nested name specifier [-Wmicrosoft-unqualified-friend]
   55 |   friend class CallBase;           // For LLVMTy.
      |                ^
      |                ::llvm::
C:\git\llvm-project\llvm\include\llvm/SandboxIR/Type.h(60,16): warning: unqualified friend declaration referring to type outside of the nearest enclosing namespace is a Microsoft extension; add a nested name specifier [-Wmicrosoft-unqualified-friend]
   60 |   friend class CmpInst;            // For LLVMTy. TODO: Cleanup after
      |                ^
      |                ::llvm::
2 warnings generated.
```
2025-04-08 14:42:24 -04:00
Fangrui Song
7117dea043
AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function; use lowerDSOLocalEquivalent in more cases
https://reviews.llvm.org/D17938 introduced lowerRelativeReference to
give ConstantExpr sub (A-B) special semantics in ELF: when `A` is an
`unnamed_addr` function, create a PLT-generating relocation. This was
intended for C++ relative vtables, but C++ relative vtable ended up
using DSOLocalEquivalent (lowerDSOLocalEquivalent).

This special treatment of `unnamed_addr` seems unusual.
Let's remove it. Only COFF needs an overload to generate a @IMGREL32
relocation specifier (llvm/test/MC/COFF/cross-section-relative.ll).

Pull Request: https://github.com/llvm/llvm-project/pull/134781
2025-04-08 10:11:20 -07:00
Stephen Tozer
e3d114ceb8
[DebugInfo][Reassociate] Propagate source loc when negating mul factor (#134679)
As part of RemoveFactorFromExpression, we attempt to remove a factor
from a mul/fmul expression; this may involve generating new
instructions, e.g. to negate the result if the factor was negative in
the original expression. When this happens, the new instructions should
have a DebugLoc set from the instruction that the factored expression is
being used to compute.

Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-08 17:45:54 +01:00
Dmitry Chestnykh
d6c8e8908d
Rename F_no_mmap to F_mmap (#134787)
The `F_no_mmap` flag was introduced by
6814232429
2025-04-08 19:22:03 +03:00
Fangrui Song
26475f5bdd
[AArch64] Refactor @plt, @gotpcrel, and @AUTH to use parseDataExpr
Following PR #132569 (RISC-V), which added `parseDataExpr` for parsing
expressions in data directives (e.g., `.word`), this PR migrates AArch64
`@plt`, `@gotpcrel`, and `@AUTH` from the `parsePrimaryExpr` workaround
to `parseDataExpr`. The goal is to align with the GNU assembler model,
where relocation specifiers apply to the entire operand rather than
individual terms, reducing complexity, which is especially evident in `@AUTH`
parsing.

Note: AArch64 ELF lacks an official syntax for data directives
(#132570). A prefix notation might be a preferable future direction.
I recommend `%specifier(expr)`.

AsmParser's `@specifier` parsing is suboptimal, necessitating lexer
workarounds. `@` might appear multiple times in an operand.
We should not use `@` beyond the existing AArch64 Mach-O instruction
operands.

In the test elf-reloc-ptrauth.s, many errors are now reported at parse
time.

Pull Request: https://github.com/llvm/llvm-project/pull/134202
2025-04-08 09:09:19 -07:00
Krzysztof Drewniak
4a7b34d03c
Revert "[AMDGPU] Add buffer.fat.ptr.load.lds intrinsic wrapping raw rsrc version (#133015)" (#134871)
This reverts commit d1a05721172272f7aab685b56d99e86814a15bff.

There was further discussion on the PR about whether the intrinsics
should exist in this form.
2025-04-08 11:00:41 -05:00
Mircea Trofin
b2dea4fd22
[ctxprof] root autodetection mechanism (#133147)
This is an optional mechanism that automatically detects roots. It's a best-effort mechanism, and its main goal is to *avoid* pointing at the message pump function as a root. This is the function that polls message queue(s) in an infinite loop, and is thus a bad root (it never exits).

At a high level, when collection is requested - which should happen when a server has already been set up and is handling requests - we spend a bit of time sampling all the server's threads. Each sample is a stack which we insert in a `PerThreadCallsiteTrie`. After a while, we run the root detection logic for each `PerThreadCallsiteTrie`. We then traverse all the `FunctionData`, find the ones matching the detected roots, and allocate a `ContextRoot` for them. From here, we special case `FunctionData` objects, in `__llvm_ctx_profile_get_context`, that have a `CtxRoot` and route them to `__llvm_ctx_profile_start_context`.

For this to work, on the llvm side, we need to have all functions call `__llvm_ctx_profile_release_context` because they _might_ be roots. This comes at a slight (percentages) penalty during collection - which we can afford since the overall technique is ~5x faster than normal instrumentation. We can later explore conditionally enabling autoroot detection and avoiding this penalty, if desired.

Note that functions that `musttail call` can't have their return instrumented this way, and a subsequent patch will harden the mechanism against this case.

The mechanism could be used in combination with explicit root specification, too.
2025-04-08 06:59:38 -07:00
Mircea Trofin
6a3e5f89bb
[ctxprof] Only prune the profile in modules containing only context trees (#134340)
We will subsequently treat the whole profile as "flat" in the frontend (i.e., flatten and combine with the flat profile section), so we can have a profile for ThinLTO for parts of the application that don't come under the contextual profile. After ThinLTO, we will treat the module(s) containing contextual trees differently: they'll have only the contextual profile pertinent to them. The rest of the modules (non-contextual) will proceed "as usual", off the flattened profile.

This patch implements pruning of the contextual profile to enable the above.
2025-04-07 19:52:03 -07:00
Rahul Joshi
bb1f32ded0
[NFC][LLVM] Change initialize<PassName>PassOnce to return void (#134500)
- The return value of these functions (called using `llvm::call_once`)
is never used, so make these functions return void.
2025-04-07 18:10:06 -07:00