532828 Commits

Author SHA1 Message Date
Timm Baeder
7267dbfe10
[clang][bytecode] Fix comparing the addresses of union members (#133852)
Union members get the same address, so we can't just use
`Pointer::getByteOffset()`.
2025-04-01 09:00:46 +02:00
Frank Schlimbach
49f080afc4
[mlir][mpi] Mandatory Communicator (#133280)
This is replacing #125361
- communicator is mandatory
- new mpi.comm_world
- new mp.comm_split
- lowering and test

---------

Co-authored-by: Sergio Sánchez Ramírez <sergio.sanchez.ramirez+git@bsc.es>
2025-04-01 08:58:55 +02:00
Jonas Devlieghere
aa889ed129
[lldb] Fix statusline terminal resizing
Simplify and fix the logic to clear the old statusline when the terminal
window dimensions have changed. I accidentally broke the terminal
resizing behavior when addressing code review feedback.

I'd really like to figure out a way to test this. PExpect isn't a good
fit for this, because I really need to check the result, rather than the
control characters, as the latter doesn't tell me whether any part of
the old statusline is still visible.
2025-03-31 23:53:35 -07:00
Kazu Hirata
fe3e9c2b46
[Analysis] Avoid repeated hash lookups (NFC) (#133045) 2025-03-31 23:17:44 -07:00
Owen Pan
d3be29642f
[clang-format] Correctly annotate pointer/reference in _Generic (#133673)
Fix #133663
2025-03-31 23:16:41 -07:00
Jean-Didier PAILLEUX
bae3577002
[flang] Define ERF, ERFC and ERFC_SCALED intrinsics with Q and D prefix (#125217)
`ERF`, `ERFC` and `ERFC_SCALED` intrinsics prefixed by `Q` and `D` are
missing. Codes such as `CP2K`(https://github.com/cp2k/cp2k) and
`TurboRVB`(https://github.com/sissaschool/turborvb) use these intrinsics
just like defined in the GNU standard and here:
https://www.ibm.com/docs/fr/xl-fortran-aix/16.1.0?topic=reference-intrinsic-procedures
These intrinsics are based on the existing intrinsics but apply a
restriction on the type kind.

- `DERF`, `DERFC` and `DERFC_SCALED` are for double précision only.
- `QERF`, `QERFC` and `QERFC_SCALED` are for quad précision only.
2025-04-01 08:07:26 +02:00
Thirumalai Shaktivel
091dcb8fc2
[Flang] Make a private copy for the common block variables in copyin clause (#111359)
Fixes: https://github.com/llvm/llvm-project/issues/82949
2025-04-01 11:35:44 +05:30
Kazu Hirata
2de7b6ca4e
[ExecutionEngine] Use DenseMap::insert_range (NFC) (#133847)
We can safely switch to insert_range here because LR starts out empty.
Also, *Result is a DenseMap, so we know that the keys are unique.
2025-03-31 22:11:34 -07:00
Kazu Hirata
4d68cf384d
[lldb] Use DenseMap::insert_range (NFC) (#133846) 2025-03-31 22:11:22 -07:00
Kazu Hirata
ee3c892b35
[clang-tidy] Use DenseMap::insert_range (NFC) (#133844)
We can safely switch to insert_range here because
SyntheticStmtSourceMap starts out empty in the constructor.  Also
TheCFG->synthetic_stmts() comes from DenseMap, so we know that the
keys are unique.  That is, operator[] and insert are equivalent in
this particular case.
2025-03-31 22:11:06 -07:00
Craig Topper
eb2aba4a64 [RISCV] Remove extra call to MatchRegisterName in parseRegListCommon. NFC
Update RegEnd after each call to MatchRegisterName end of calling it
again.
2025-03-31 21:55:24 -07:00
Craig Topper
e3adf6bbfc
[RISCV] Use decodeCLUIImmOperand when disassembling C_LUI_HINT. (#133789)
This correctly rejects imm==0 and prints 1048575 instead of -1.

I've modified the test to only have each hex pattern once with different
check lines before it. This ensures we don't have more invalid messages
printed than we're checking for.
2025-03-31 21:49:07 -07:00
Kazu Hirata
b3c7d59516
[lld] Use DenseMap::insert_range (NFC) (#133845) 2025-03-31 21:03:26 -07:00
Craig Topper
386aca4a3c
[RISCV] Correct disassembly of cm.push/pop for RVE. (#133816)
We shouldn't disassemble any encoding that refers to registers x16-x31
with RV32E.
2025-03-31 20:54:19 -07:00
Craig Topper
ea68b22881
[RISCV] Prevent disassembling RVC hint instructions with x16-x31 for RVE. (#133805)
We can't ignore the return value form the GPR decode function, as it
contains the RVE check.
2025-03-31 20:49:51 -07:00
Craig Topper
27b49288f7
[RISCV] Add exhaustive diassember tests for c.slli64. NFC (#133820)
The c.slli encoding with a shift of 0 is c.slli64 for RV128 and a hint
for RV32 and RV64. Add a test for this encoding to the exhaustive c.slli
test.
2025-03-31 20:47:08 -07:00
Fangrui Song
dd862356e2
AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function
https://reviews.llvm.org/D17938 introduced lowerRelativeReference to
give ConstantExpr sub (A-B) special semantics in ELF: when `A` is an
`unnamed_addr` function, create a PLT-generating relocation. This was
intended for C++ relative vtables, but C++ relative vtable ended up
using DSOLocalEquivalent (lowerDSOLocalEquivalent).

This special treatment of `unnamed_addr` seems unusual.
Let's remove it. Only COFF needs an overload to generate a @IMGREL32
relocation specifier (llvm/test/MC/COFF/cross-section-relative.ll).

Pull Request: https://github.com/llvm/llvm-project/pull/132684
2025-03-31 20:44:29 -07:00
Fangrui Song
71cf592191 [IR] Fix -Wunused-but-set-variable 2025-03-31 20:23:34 -07:00
Shoreshen
145b4a3950
[AMDGPU][CodeGenPrepare] Narrow 64 bit math to 32 bit if profitable (#130577)
For Add, Sub, Mul with Int64 type, if profitable, then do:
1. Trunc operands to Int32 type
2. Apply 32 bit Add/Sub/Mul
3. Zext to Int64 type
2025-04-01 11:18:17 +08:00
John Harrison
a417a868cd
[lldb-dap] Enable runInTerminal tests on macOS. (#133824)
These tests are currently filtered on macOS if your on an M1 (or newer)
device. These tests do work on macOS, for me at least on M1 Max with
macOS 15.3.2 and Xcode 16.2.

Enabling them again, but if we have CI problems with them we can keep
them disabled.
2025-03-31 19:50:36 -07:00
Jonas Devlieghere
0b8c8ed042
[lldb] Fix use-after-free in SBMutexTest (#133840)
The `locked` variable can be accessed from the asynchronous thread until
the call to f.wait() completes. However, the variable is scoped in a
lexical block that ends before that, leading to a use-after-free.
2025-03-31 19:36:05 -07:00
Maksim Panchenko
b2d272ccfb
[BOLT][X86] Fix getTargetSymbol() (#133834)
In 96e5ee2, I inadvertently broke the way non-trivial symbol references
got updated from non-optimized code. The breakage was a consequence of
`getTargetSymbol(MCExpr *)` not returning a symbol when the parameter
was a binary expression. Fix `getTargetSymbol()` to cover such cases.
2025-03-31 18:31:33 -07:00
Craig Topper
508a6b2e01
[RISCV] Use decodeUImmLog2XLenNonZeroOperand in decodeRVCInstrRdRs1UImm. NFC (#133759)
decodeUImmLog2XLenNonZeroOperand already contains the uimm5 check for
RV32 so we can reuse it. This makes C_SLLI_HINT code more similar to the
tblgen code for C_SLLI.
2025-03-31 17:59:02 -07:00
YunQiang Su
f9282475b3 Revert "LLVM/Test: Add vectorizing testcases for fminimumnum and fminimumnum (#133690)"
This reverts commit de053bb4b0db64aebdff7719ff6ce75487f6ba5d.
2025-04-01 08:48:10 +08:00
Matt Arsenault
f77f2b9c56
llvm-reduce: Try to preserve instruction metadata as argument attributes (#133557)
Fixes #131825
2025-04-01 07:34:31 +07:00
Yaxun (Sam) Liu
0248d277ca
Reland [HIP] fix host min/max in header (#133590)
CUDA defines min/max functions for host in global namespace. HIP header
needs to define them too to be compatible. Currently only min/max(int,
int) is defined. This causes wrong result for arguments that are out of
range for int. This patch defines host min/max functions to be
compatible with CUDA.

Since some HIP apps defined min/max functions by themselves, newly added
min/max function are under the control of macro
`__HIP_DEFINE_EXTENDED_HOST_MIN_MAX__`, which is 0 by default. In the
future, this will change to 1 by
default after most existing HIP apps adopt this change.

Also allows users to define
`__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__` to disable host max/min in
global namespace.

min/max functions with mixed signed/unsigned integer parameters are not
defined unless
`__HIP_DEFINE_MIXED_HOST_MIN_MAX__` is defined.

Fixes: SWDEV-446564
2025-03-31 20:28:29 -04:00
Mikhail R. Gadelha
091051fb7f
[libc] Add myself as maintainer of the riscv port (#133757)
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-03-31 20:13:44 -04:00
Mariusz Borsa
02837acaaf
[Sanitizers][Darwin][Test] Remove community incompliant internal link from sources (#133187)
The malloc_zone.cpp test currently fails on Darwin hosts, in
SanitizerCommon
tests with lsan enabled.

Need to XFAIL this test to buy time to investigate this failure. Also
we're trying to bring the number of test failing on Darwin bots to 0, to
get clearer signal of any new failures.

rdar://145873843

Co-authored-by: Mariusz Borsa <m_borsa@apple.com>
2025-03-31 17:06:41 -07:00
YunQiang Su
de053bb4b0
LLVM/Test: Add vectorizing testcases for fminimumnum and fminimumnum (#133690)
Vectorizing of fminimumnum and fminimumnum have not support yet. Let's
add the testcase for it now, and we will update the testcase when we
support it.
2025-04-01 08:00:22 +08:00
Jakub Kuderski
66db3ccd8c
[mlir] Update vector return types for .getMixed* methods (NFC) (#133821)
Drop small size to make vector types match the generic helper
`getMixedValues` in `StaticValueUtils.h`.

This saves some needles vector copies. I didn't find any local variables
that need updating.
2025-03-31 19:56:46 -04:00
Alexey Bataev
cf6a452cc7
[SLP]Fix same/alternate analysis in split node analysis for compares
getSameOpcode in some cases may consider 2 compares as having same
opcode, even though previously they were considered as alternate. It may
happen, because getSameOpcode looses info about previous instructions
and their states. Need to use isAlternateInstruction function instead
for the correct analysis.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/133769
2025-03-31 19:33:40 -04:00
Matt Arsenault
5b8d8bb90a
Inliner: Fix missing test coverage for incompatible gc rejection (#133708) 2025-04-01 06:24:59 +07:00
Zequan Wu
3483740289
Reland "Symbolize line zero as if no source info is available (#124846)" (#133798)
This land commits 23aca2f88dd5d2447e69496c89c3ed42a56f9c31 and
1b15a89a23c631a8e2d096dad4afe456970572c0.
https://github.com/llvm/llvm-project/pull/128619 makes symbolizer to
always use debug info when available so we can reland this chagnge.
2025-03-31 19:13:46 -04:00
Matthias Braun
5d1f27f349
GlobalISel: neg (and x, 1) --> SIGN_EXTEND_INREG x, 1 (#131367)
The pattern
```LLVM
%shl = shl i32 %x, 31
%ashr = ashr i32 %shl, 31
```
would be combined to `G_EXT_INREG %x, 1` by GlobalISel. However
InstCombine normalizes this pattern to:
```LLVM
%and = and i32 %x, 1
%neg = sub i32 0, %and
```
This adds a combiner for this variant as well.
2025-03-31 16:06:51 -07:00
Jonas Devlieghere
46457ed1df
[lldb] Convert Breakpoint & Watchpoints structs to classes (NFC) (#133780)
Convert Breakpoint & Watchpoints structs to classes to provide proper
access control. This is in preparation for adopting SBMutex to protect
the underlying SBBreakpoint and SBWatchpoint.
2025-03-31 16:04:31 -07:00
Craig Topper
40c859a704
[TableGen] Use size returned by encodeULEB128 to simplify some code. NFC (#133750)
We can use the length to insert all the bytes at once instead of
partially decoding them to insert one byte at a time.
2025-03-31 15:58:36 -07:00
John Harrison
4492632432
[lldb-dap] Do not take ownership of stdin. (#133811)
There isn't any benefit to taking ownership of stdin and it may cause
issues if `Transport` is dealloced.
2025-03-31 15:51:07 -07:00
Tom Stellard
7793bae97d
[workflows] Add missing -y option to apt-get for abi tests (#133337) 2025-03-31 15:30:05 -07:00
Ryosuke Niwa
6ff33edcdc
[alpha.webkit.NoUnretainedMemberChecker] Ignore system-header-defined ivar / property of a forward declared type (#133755)
Prior to this PR, we were emitting warnings for Objective-C ivars and
properties if the forward declaration of the type appeared first in a
non-system header. This PR fixes the checker so tha we'd ignore ivars
and properties defined for a forward declared type.
2025-03-31 14:59:41 -07:00
Keith Smiley
f30c6a047d
[bazel] Format BUILD files with buildifier (#133802) 2025-03-31 14:38:58 -07:00
Florian Hahn
32f24029c7
Reapply "[EquivalenceClasses] Replace findValue with contains (NFC)."
This reverts the revert commit 616f447fc84bdc7655117f1b303d895dc3b93e4d.

It includes updates to remaining users in Polly and Clang, to avoid
failures when building those projects.
2025-03-31 22:27:59 +01:00
Farzon Lotfi
bdae91b08b
Revert "[Clang][Cmake] fix libtool duplicate member name warnings" (#133795)
Reverts llvm/llvm-project#133619
2025-03-31 17:00:38 -04:00
Paul Osmialowski
cb7c223625
[clang][driver] Fix -fveclib=ArmPL issue: with -nostdlib do not link against libm (#133578)
Although combining -fveclib=ArmPL with -nostdlib is a rare situation, it
should still be supported correctly and should effect in avoidance of
linking against libm.
2025-03-31 21:55:58 +01:00
Sandeep Dasgupta
eefefb5da7
Fix sub-channel quantized type documentation (#133765)
fixes the issue reported in
https://github.com/llvm/llvm-project/pull/120172#issuecomment-2748367578
2025-03-31 16:45:54 -04:00
Sandeep Dasgupta
baacd1287b
Fix printing of mlirUniformQuantizedSubChannelTypeGetNumBlockSizes in 32-bit machine. (#133763)
Fixes the issue reported in
https://github.com/llvm/llvm-project/pull/120172#issuecomment-2763212827

cc @mgorny
2025-03-31 16:45:43 -04:00
Finn Plummer
5e2860a8d3
Revert "[HLSL][RootSignature] Implement parsing of a DescriptorTable with empty clauses" (#133790)
Reverts llvm/llvm-project#133302

Reverting to inspect build failures that were introduced from use of the
`clang::Preprocessor` in unit testing, as well as, the warning about an
unused declaration. See linked issue for failures.
2025-03-31 13:38:09 -07:00
Tom Yang
a8d2d169c7
Parallelize module loading in POSIX dyld code (#130912)
This patch improves LLDB launch time on Linux machines for **preload
scenarios**, particularly for executables with a lot of shared library
dependencies (or modules). Specifically:
* Launching a binary with `target.preload-symbols = true` 
* Attaching to a process with `target.preload-symbols = true`.
It's completely controlled by a new flag added in the first commit
`plugin.dynamic-loader.posix-dyld.parallel-module-load`, which *defaults
to false*. This was inspired by similar work on Darwin #110646.

Some rough numbers to showcase perf improvement, run on a very beefy
machine:
* Executable with ~5600 modules: baseline 45s, improvement 15s
* Executable with ~3800 modules: baseline 25s,  improvement 10s
* Executable with ~6650 modules: baseline 67s, improvement 20s
* Executable with ~12500 modules: baseline 185s, improvement 85s
* Executable with ~14700 modules: baseline 235s, improvement 120s
A lot of targets we deal with have a *ton* of modules, and unfortunately
we're unable to convince other folks to reduce the number of modules, so
performance improvements like this can be very impactful for user
experience.

This patch achieves the performance improvement by parallelizing
`DynamicLoaderPOSIXDYLD::RefreshModules` for the launch scenario, and
`DynamicLoaderPOSIXDYLD::LoadAllCurrentModules` for the attach scenario.
The commits have some context on their specific changes as well --
hopefully this helps the review.

# More context on implementation

We discovered the bottlenecks by via `perf record -g -p <lldb's pid>` on
a Linux machine. With an executable known to have 1000s of shared
library dependencies, I ran
```
(lldb) b main
(lldb) r
# taking a while
```
and showed the resulting perf trace (snippet shown)
```
Samples: 85K of event 'cycles:P', Event count (approx.): 54615855812
  Children      Self  Command          Shared Object              Symbol
-   93.54%     0.00%  intern-state     libc.so.6                  [.] clone3
     clone3
     start_thread
     lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*)                                                                           r
     std::_Function_handler<void* (), lldb_private::Process::StartPrivateStateThread(bool)::$_0>::_M_invoke(std::_Any_data const&)
     lldb_private::Process::RunPrivateStateThread(bool)                                                                                          n
   - lldb_private::Process::HandlePrivateEvent(std::shared_ptr<lldb_private::Event>&)
      - 93.54% lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*)
         - 93.54% lldb_private::ThreadList::ShouldStop(lldb_private::Event*)
            - lldb_private::Thread::ShouldStop(lldb_private::Event*)                                                                             *
               - 93.53% lldb_private::StopInfoBreakpoint::ShouldStopSynchronous(lldb_private::Event*)                                            t
                  - 93.52% lldb_private::BreakpointSite::ShouldStop(lldb_private::StoppointCallbackContext*)                                     i
                       lldb_private::BreakpointLocationCollection::ShouldStop(lldb_private::StoppointCallbackContext*)                           k
                       lldb_private::BreakpointLocation::ShouldStop(lldb_private::StoppointCallbackContext*)                                     b
                       lldb_private::BreakpointOptions::InvokeCallback(lldb_private::StoppointCallbackContext*, unsigned long, unsigned long)    i
                       DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long, unsigned lo
                     - DynamicLoaderPOSIXDYLD::RefreshModules()                                                                                  O
                        - 93.42% DynamicLoaderPOSIXDYLD::RefreshModules()::$_0::operator()(DYLDRendezvous::SOEntry const&) const                 u
                           - 93.40% DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, bools
                              - lldb_private::DynamicLoader::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, boos
                                 - 83.90% lldb_private::DynamicLoader::FindModuleViaTarget(lldb_private::FileSpec const&)                        o
                                    - 83.01% lldb_private::Target::GetOrCreateModule(lldb_private::ModuleSpec const&, bool, lldb_private::Status*
                                       - 77.89% lldb_private::Module::PreloadSymbols()
                                          - 44.06% lldb_private::Symtab::PreloadSymbols()
                                             - 43.66% lldb_private::Symtab::InitNameIndexes()
...
```
We saw that majority of time was spent in `RefreshModules`, with the
main culprit within it `LoadModuleAtAddress` which eventually calls
`PreloadSymbols`.

At first, `DynamicLoaderPOSIXDYLD::LoadModuleAtAddress` appears fairly
independent -- most of it deals with different files and then getting or
creating Modules from these files. The portions that aren't independent
seem to deal with ModuleLists, which appear concurrency safe. There were
members of `DynamicLoaderPOSIXDYLD` I had to synchronize though: namely
`m_loaded_modules` which `DynamicLoaderPOSIXDYLD` maintains to map its
loaded modules to their link addresses. Without synchronizing this, I
ran into SEGFAULTS and other issues when running `check-lldb`. I also
locked the assignment and comparison of `m_interpreter_module`, which
may be unnecessary.

# Alternate implementations

When creating this patch, another implementation I considered was
directly background-ing the call to `Module::PreloadSymbol` in
`Target::GetOrCreateModule`. It would have the added benefit of working
across platforms generically, and appeared to be concurrency safe. It
was done via `Debugger::GetThreadPool().async` directly. However, there
were a ton of concurrency issues, so I abandoned that approach for now.

# Testing

With the feature active, I tested via `ninja check-lldb` on both Debug
and Release builds several times (~5 or 6 altogether?), and didn't spot
additional failing or flaky tests.

I also tested manually on several different binaries, some with around
14000 modules, but just basic operations: launching, reaching main,
setting breakpoint, stepping, showing some backtraces.

I've also tested with the flag off just to make sure things behave
properly synchronously.
2025-03-31 13:29:31 -07:00
Luke Lau
6afe5e5d1a
[LV][EVL] Peek through combination tail-folded + predicated masks (#133430)
If a recipe was predicated and tail folded at the same time, it will
have a mask like

    EMIT vp<%header-mask> = icmp ule canonical-iv, backedge-tc
    EMIT vp<%mask> = logical-and vp<%header-mask>, vp<%pred-mask>

When converting to an EVL recipe, if the mask isn't exactly just the
header-mask we copy the whole logical-and.
We can remove this redundant logical-and (because it's now covered by
EVL) and just use vp<%pred-mask> instead.

This lets us remove the widened canonical IV in more places.
2025-03-31 21:28:39 +01:00
Florian Hahn
4e8fbc6071
[LV] Add epilogue vectorization tests for FindLastIV reductions.
Add missing test coverage for #126836.
2025-03-31 21:23:35 +01:00
Valentin Clement (バレンタイン クレメン)
0b31f08537
[flang][cuda] Add support for NV_CUDAFOR_DEVICE_IS_MANAGED (#133778)
Add support for the environment variable `NV_CUDAFOR_DEVICE_IS_MANAGED`
as described in the documentation:
https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html#controlling-device-data-is-managed.

This mainly switch device allocation to managed allocation.
2025-03-31 13:17:21 -07:00