llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 03:06:06 +00:00

Author	SHA1	Message	Date
Timm Baeder	7267dbfe10	[clang][bytecode] Fix comparing the addresses of union members (#133852 ) Union members get the same address, so we can't just use `Pointer::getByteOffset()`.	2025-04-01 09:00:46 +02:00
Frank Schlimbach	49f080afc4	[mlir][mpi] Mandatory Communicator (#133280 ) This replaces #125361 - communicator is mandatory - new mpi.comm_world - new mpi.comm_split - lowering and test --------- Co-authored-by: Sergio Sánchez Ramírez <sergio.sanchez.ramirez+git@bsc.es>	2025-04-01 08:58:55 +02:00
Jonas Devlieghere	aa889ed129	[lldb] Fix statusline terminal resizing Simplify and fix the logic to clear the old statusline when the terminal window dimensions have changed. I accidentally broke the terminal resizing behavior when addressing code review feedback. I'd really like to figure out a way to test this. PExpect isn't a good fit for this, because I really need to check the result, rather than the control characters, as the latter doesn't tell me whether any part of the old statusline is still visible.	2025-03-31 23:53:35 -07:00
Kazu Hirata	fe3e9c2b46	[Analysis] Avoid repeated hash lookups (NFC) (#133045 )	2025-03-31 23:17:44 -07:00
Owen Pan	d3be29642f	[clang-format] Correctly annotate pointer/reference in _Generic (#133673 ) Fix #133663	2025-03-31 23:16:41 -07:00
Jean-Didier PAILLEUX	bae3577002	[flang] Define ERF, ERFC and ERFC_SCALED intrinsics with Q and D prefix (#125217 ) `ERF`, `ERFC` and `ERFC_SCALED` intrinsics prefixed by `Q` and `D` are missing. Codes such as `CP2K` (https://github.com/cp2k/cp2k) and `TurboRVB` (https://github.com/sissaschool/turborvb) use these intrinsics as defined in the GNU standard and here: https://www.ibm.com/docs/fr/xl-fortran-aix/16.1.0?topic=reference-intrinsic-procedures These intrinsics are based on the existing intrinsics but apply a restriction on the type kind. - `DERF`, `DERFC` and `DERFC_SCALED` are for double precision only. - `QERF`, `QERFC` and `QERFC_SCALED` are for quad precision only.	2025-04-01 08:07:26 +02:00
Thirumalai Shaktivel	091dcb8fc2	[Flang] Make a private copy for the common block variables in copyin clause (#111359 ) Fixes: https://github.com/llvm/llvm-project/issues/82949	2025-04-01 11:35:44 +05:30
Kazu Hirata	2de7b6ca4e	[ExecutionEngine] Use DenseMap::insert_range (NFC) (#133847 ) We can safely switch to insert_range here because LR starts out empty. Also, *Result is a DenseMap, so we know that the keys are unique.	2025-03-31 22:11:34 -07:00
Kazu Hirata	4d68cf384d	[lldb] Use DenseMap::insert_range (NFC) (#133846 )	2025-03-31 22:11:22 -07:00
Kazu Hirata	ee3c892b35	[clang-tidy] Use DenseMap::insert_range (NFC) (#133844 ) We can safely switch to insert_range here because SyntheticStmtSourceMap starts out empty in the constructor. Also TheCFG->synthetic_stmts() comes from DenseMap, so we know that the keys are unique. That is, operator[] and insert are equivalent in this particular case.	2025-03-31 22:11:06 -07:00
Craig Topper	eb2aba4a64	[RISCV] Remove extra call to MatchRegisterName in parseRegListCommon. NFC Update RegEnd after each call to MatchRegisterName instead of calling it again.	2025-03-31 21:55:24 -07:00
Craig Topper	e3adf6bbfc	[RISCV] Use decodeCLUIImmOperand when disassembling C_LUI_HINT. (#133789 ) This correctly rejects imm==0 and prints 1048575 instead of -1. I've modified the test to only have each hex pattern once with different check lines before it. This ensures we don't have more invalid messages printed than we're checking for.	2025-03-31 21:49:07 -07:00
Kazu Hirata	b3c7d59516	[lld] Use DenseMap::insert_range (NFC) (#133845 )	2025-03-31 21:03:26 -07:00
Craig Topper	386aca4a3c	[RISCV] Correct disassembly of cm.push/pop for RVE. (#133816 ) We shouldn't disassemble any encoding that refers to registers x16-x31 with RV32E.	2025-03-31 20:54:19 -07:00
Craig Topper	ea68b22881	[RISCV] Prevent disassembling RVC hint instructions with x16-x31 for RVE. (#133805 ) We can't ignore the return value from the GPR decode function, as it contains the RVE check.	2025-03-31 20:49:51 -07:00
Craig Topper	27b49288f7	[RISCV] Add exhaustive disassembler tests for c.slli64. NFC (#133820 ) The c.slli encoding with a shift of 0 is c.slli64 for RV128 and a hint for RV32 and RV64. Add a test for this encoding to the exhaustive c.slli test.	2025-03-31 20:47:08 -07:00
Fangrui Song	dd862356e2	AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function https://reviews.llvm.org/D17938 introduced lowerRelativeReference to give ConstantExpr sub (A-B) special semantics in ELF: when `A` is an `unnamed_addr` function, create a PLT-generating relocation. This was intended for C++ relative vtables, but C++ relative vtable ended up using DSOLocalEquivalent (lowerDSOLocalEquivalent). This special treatment of `unnamed_addr` seems unusual. Let's remove it. Only COFF needs an overload to generate a @IMGREL32 relocation specifier (llvm/test/MC/COFF/cross-section-relative.ll). Pull Request: https://github.com/llvm/llvm-project/pull/132684	2025-03-31 20:44:29 -07:00
Fangrui Song	71cf592191	[IR] Fix -Wunused-but-set-variable	2025-03-31 20:23:34 -07:00
Shoreshen	145b4a3950	[AMDGPU][CodeGenPrepare] Narrow 64 bit math to 32 bit if profitable (#130577 ) For Add, Sub, Mul with Int64 type, if profitable, then do: 1. Trunc operands to Int32 type 2. Apply 32 bit Add/Sub/Mul 3. Zext to Int64 type	2025-04-01 11:18:17 +08:00
John Harrison	a417a868cd	[lldb-dap] Enable runInTerminal tests on macOS. (#133824 ) These tests are currently filtered on macOS if you're on an M1 (or newer) device. These tests do work on macOS, for me at least on M1 Max with macOS 15.3.2 and Xcode 16.2. Enabling them again, but if we have CI problems with them we can keep them disabled.	2025-03-31 19:50:36 -07:00
Jonas Devlieghere	0b8c8ed042	[lldb] Fix use-after-free in SBMutexTest (#133840 ) The `locked` variable can be accessed from the asynchronous thread until the call to f.wait() completes. However, the variable is scoped in a lexical block that ends before that, leading to a use-after-free.	2025-03-31 19:36:05 -07:00
Maksim Panchenko	b2d272ccfb	[BOLT][X86] Fix getTargetSymbol() (#133834 ) In 96e5ee2, I inadvertently broke the way non-trivial symbol references got updated from non-optimized code. The breakage was a consequence of `getTargetSymbol(MCExpr *)` not returning a symbol when the parameter was a binary expression. Fix `getTargetSymbol()` to cover such cases.	2025-03-31 18:31:33 -07:00
Craig Topper	508a6b2e01	[RISCV] Use decodeUImmLog2XLenNonZeroOperand in decodeRVCInstrRdRs1UImm. NFC (#133759 ) decodeUImmLog2XLenNonZeroOperand already contains the uimm5 check for RV32 so we can reuse it. This makes C_SLLI_HINT code more similar to the tblgen code for C_SLLI.	2025-03-31 17:59:02 -07:00
YunQiang Su	f9282475b3	Revert "LLVM/Test: Add vectorizing testcases for fminimumnum and fminimumnum (#133690 )" This reverts commit de053bb4b0db64aebdff7719ff6ce75487f6ba5d.	2025-04-01 08:48:10 +08:00
Matt Arsenault	f77f2b9c56	llvm-reduce: Try to preserve instruction metadata as argument attributes (#133557 ) Fixes #131825	2025-04-01 07:34:31 +07:00
Yaxun (Sam) Liu	0248d277ca	Reland [HIP] fix host min/max in header (#133590 ) CUDA defines min/max functions for host in global namespace. HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong results for arguments that are out of range for int. This patch defines host min/max functions to be compatible with CUDA. Since some HIP apps define min/max functions themselves, the newly added min/max functions are under the control of macro `__HIP_DEFINE_EXTENDED_HOST_MIN_MAX__`, which is 0 by default. In the future, this will change to 1 by default after most existing HIP apps adopt this change. Also allows users to define `__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__` to disable host max/min in global namespace. min/max functions with mixed signed/unsigned integer parameters are not defined unless `__HIP_DEFINE_MIXED_HOST_MIN_MAX__` is defined. Fixes: SWDEV-446564	2025-03-31 20:28:29 -04:00
Mikhail R. Gadelha	091051fb7f	[libc] Add myself as maintainer of the riscv port (#133757 ) Co-authored-by: Joseph Huber <huberjn@outlook.com>	2025-03-31 20:13:44 -04:00
Mariusz Borsa	02837acaaf	[Sanitizers][Darwin][Test] Remove community incompliant internal link from sources (#133187 ) The malloc_zone.cpp test currently fails on Darwin hosts, in SanitizerCommon tests with lsan enabled. Need to XFAIL this test to buy time to investigate this failure. Also we're trying to bring the number of tests failing on Darwin bots to 0, to get a clearer signal of any new failures. rdar://145873843 Co-authored-by: Mariusz Borsa <m_borsa@apple.com>	2025-03-31 17:06:41 -07:00
YunQiang Su	de053bb4b0	LLVM/Test: Add vectorizing testcases for fminimumnum and fminimumnum (#133690 ) Vectorizing of fminimumnum and fminimumnum is not supported yet. Let's add the test cases for it now, and we will update them when support is added.	2025-04-01 08:00:22 +08:00
Jakub Kuderski	66db3ccd8c	[mlir] Update vector return types for `.getMixed`* methods (NFC) (#133821 ) Drop small size to make vector types match the generic helper `getMixedValues` in `StaticValueUtils.h`. This saves some needless vector copies. I didn't find any local variables that need updating.	2025-03-31 19:56:46 -04:00
Alexey Bataev	cf6a452cc7	[SLP]Fix same/alternate analysis in split node analysis for compares getSameOpcode in some cases may consider 2 compares as having the same opcode, even though previously they were considered as alternate. It may happen because getSameOpcode loses info about previous instructions and their states. Need to use the isAlternateInstruction function instead for the correct analysis. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/133769	2025-03-31 19:33:40 -04:00
Matt Arsenault	5b8d8bb90a	Inliner: Fix missing test coverage for incompatible gc rejection (#133708 )	2025-04-01 06:24:59 +07:00
Zequan Wu	3483740289	Reland "Symbolize line zero as if no source info is available (#124846 )" (#133798 ) This relands commits 23aca2f88dd5d2447e69496c89c3ed42a56f9c31 and 1b15a89a23c631a8e2d096dad4afe456970572c0. https://github.com/llvm/llvm-project/pull/128619 makes the symbolizer always use debug info when available, so we can reland this change.	2025-03-31 19:13:46 -04:00
Matthias Braun	5d1f27f349	GlobalISel: neg (and x, 1) --> SIGN_EXTEND_INREG x, 1 (#131367 ) The pattern
```LLVM
%shl = shl i32 %x, 31
%ashr = ashr i32 %shl, 31
```
would be combined to `G_SEXT_INREG %x, 1` by GlobalISel. However, InstCombine normalizes this pattern to:
```LLVM
%and = and i32 %x, 1
%neg = sub i32 0, %and
```
This adds a combiner for this variant as well.	2025-03-31 16:06:51 -07:00
Jonas Devlieghere	46457ed1df	[lldb] Convert Breakpoint & Watchpoints structs to classes (NFC) (#133780 ) Convert Breakpoint & Watchpoints structs to classes to provide proper access control. This is in preparation for adopting SBMutex to protect the underlying SBBreakpoint and SBWatchpoint.	2025-03-31 16:04:31 -07:00
Craig Topper	40c859a704	[TableGen] Use size returned by encodeULEB128 to simplify some code. NFC (#133750 ) We can use the length to insert all the bytes at once instead of partially decoding them to insert one byte at a time.	2025-03-31 15:58:36 -07:00
John Harrison	4492632432	[lldb-dap] Do not take ownership of stdin. (#133811 ) There isn't any benefit to taking ownership of stdin and it may cause issues if `Transport` is deallocated.	2025-03-31 15:51:07 -07:00
Tom Stellard	7793bae97d	[workflows] Add missing -y option to apt-get for abi tests (#133337 )	2025-03-31 15:30:05 -07:00
Ryosuke Niwa	6ff33edcdc	[alpha.webkit.NoUnretainedMemberChecker] Ignore system-header-defined ivar / property of a forward declared type (#133755 ) Prior to this PR, we were emitting warnings for Objective-C ivars and properties if the forward declaration of the type appeared first in a non-system header. This PR fixes the checker so that we ignore ivars and properties defined for a forward declared type.	2025-03-31 14:59:41 -07:00
Keith Smiley	f30c6a047d	[bazel] Format BUILD files with buildifier (#133802 )	2025-03-31 14:38:58 -07:00
Florian Hahn	32f24029c7	Reapply "[EquivalenceClasses] Replace findValue with contains (NFC)." This reverts the revert commit 616f447fc84bdc7655117f1b303d895dc3b93e4d. It includes updates to remaining users in Polly and Clang, to avoid failures when building those projects.	2025-03-31 22:27:59 +01:00
Farzon Lotfi	bdae91b08b	Revert "[Clang][Cmake] fix libtool duplicate member name warnings" (#133795 ) Reverts llvm/llvm-project#133619	2025-03-31 17:00:38 -04:00
Paul Osmialowski	cb7c223625	[clang][driver] Fix -fveclib=ArmPL issue: with -nostdlib do not link against libm (#133578 ) Although combining -fveclib=ArmPL with -nostdlib is a rare situation, it should still be supported correctly and should result in not linking against libm.	2025-03-31 21:55:58 +01:00
Sandeep Dasgupta	eefefb5da7	Fix sub-channel quantized type documentation (#133765 ) fixes the issue reported in https://github.com/llvm/llvm-project/pull/120172#issuecomment-2748367578	2025-03-31 16:45:54 -04:00
Sandeep Dasgupta	baacd1287b	Fix printing of `mlirUniformQuantizedSubChannelTypeGetNumBlockSizes` in 32-bit machine. (#133763 ) Fixes the issue reported in https://github.com/llvm/llvm-project/pull/120172#issuecomment-2763212827 cc @mgorny	2025-03-31 16:45:43 -04:00
Finn Plummer	5e2860a8d3	Revert "[HLSL][RootSignature] Implement parsing of a DescriptorTable with empty clauses" (#133790 ) Reverts llvm/llvm-project#133302 Reverting to inspect build failures that were introduced by the use of `clang::Preprocessor` in unit testing, as well as the warning about an unused declaration. See linked issue for failures.	2025-03-31 13:38:09 -07:00
Tom Yang	a8d2d169c7	Parallelize module loading in POSIX dyld code (#130912 ) This patch improves LLDB launch time on Linux machines for preload scenarios, particularly for executables with a lot of shared library dependencies (or modules). Specifically: * Launching a binary with `target.preload-symbols = true` * Attaching to a process with `target.preload-symbols = true`. It's completely controlled by a new flag added in the first commit, `plugin.dynamic-loader.posix-dyld.parallel-module-load`, which defaults to false. This was inspired by similar work on Darwin #110646. Some rough numbers to showcase perf improvement, run on a very beefy machine: * Executable with ~5600 modules: baseline 45s, improvement 15s * Executable with ~3800 modules: baseline 25s, improvement 10s * Executable with ~6650 modules: baseline 67s, improvement 20s * Executable with ~12500 modules: baseline 185s, improvement 85s * Executable with ~14700 modules: baseline 235s, improvement 120s A lot of targets we deal with have a ton of modules, and unfortunately we're unable to convince other folks to reduce the number of modules, so performance improvements like this can be very impactful for user experience. This patch achieves the performance improvement by parallelizing `DynamicLoaderPOSIXDYLD::RefreshModules` for the launch scenario, and `DynamicLoaderPOSIXDYLD::LoadAllCurrentModules` for the attach scenario. The commits have some context on their specific changes as well -- hopefully this helps the review. # More context on implementation We discovered the bottlenecks via `perf record -g -p <lldb's pid>` on a Linux machine. With an executable known to have 1000s of shared library dependencies, I ran
```
(lldb) b main
(lldb) r
# taking a while
```
and showed the resulting perf trace (snippet shown)
```
Samples: 85K of event 'cycles:P', Event count (approx.): 54615855812
Children Self Command Shared Object Symbol
- 93.54% 0.00% intern-state libc.so.6 [.] clone3
clone3
start_thread
lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void)
std::_Function_handler<void (), lldb_private::Process::StartPrivateStateThread(bool)::$_0>::_M_invoke(std::_Any_data const&)
lldb_private::Process::RunPrivateStateThread(bool)
- lldb_private::Process::HandlePrivateEvent(std::shared_ptr<lldb_private::Event>&)
- 93.54% lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event)
- 93.54% lldb_private::ThreadList::ShouldStop(lldb_private::Event)
- lldb_private::Thread::ShouldStop(lldb_private::Event)
- 93.53% lldb_private::StopInfoBreakpoint::ShouldStopSynchronous(lldb_private::Event)
- 93.52% lldb_private::BreakpointSite::ShouldStop(lldb_private::StoppointCallbackContext)
lldb_private::BreakpointLocationCollection::ShouldStop(lldb_private::StoppointCallbackContext)
lldb_private::BreakpointLocation::ShouldStop(lldb_private::StoppointCallbackContext)
lldb_private::BreakpointOptions::InvokeCallback(lldb_private::StoppointCallbackContext, unsigned long, unsigned long)
DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit(void, lldb_private::StoppointCallbackContext, unsigned long, unsigned lo
- DynamicLoaderPOSIXDYLD::RefreshModules()
- 93.42% DynamicLoaderPOSIXDYLD::RefreshModules()::$_0::operator()(DYLDRendezvous::SOEntry const&) const
- 93.40% DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, bool
- lldb_private::DynamicLoader::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, boo
- 83.90% lldb_private::DynamicLoader::FindModuleViaTarget(lldb_private::FileSpec const&)
- 83.01% lldb_private::Target::GetOrCreateModule(lldb_private::ModuleSpec const&, bool, lldb_private::Status
- 77.89% lldb_private::Module::PreloadSymbols()
- 44.06% lldb_private::Symtab::PreloadSymbols()
- 43.66% lldb_private::Symtab::InitNameIndexes()
...
```
We saw that the majority of the time was spent in `RefreshModules`, with the main culprit within it being `LoadModuleAtAddress`, which eventually calls `PreloadSymbols`. At first, `DynamicLoaderPOSIXDYLD::LoadModuleAtAddress` appears fairly independent -- most of it deals with different files and then getting or creating Modules from these files. The portions that aren't independent seem to deal with ModuleLists, which appear concurrency-safe. There were members of `DynamicLoaderPOSIXDYLD` I had to synchronize though: namely `m_loaded_modules`, which `DynamicLoaderPOSIXDYLD` maintains to map its loaded modules to their link addresses. Without synchronizing this, I ran into segfaults and other issues when running `check-lldb`. I also locked the assignment and comparison of `m_interpreter_module`, which may be unnecessary. # Alternate implementations When creating this patch, another implementation I considered was directly backgrounding the call to `Module::PreloadSymbols` in `Target::GetOrCreateModule`. It would have the added benefit of working across platforms generically, and appeared to be concurrency-safe. It was done via `Debugger::GetThreadPool().async` directly. However, there were a ton of concurrency issues, so I abandoned that approach for now. # Testing With the feature active, I tested via `ninja check-lldb` on both Debug and Release builds several times (~5 or 6 altogether?), and didn't spot additional failing or flaky tests. I also tested manually on several different binaries, some with around 14000 modules, but just basic operations: launching, reaching main, setting breakpoint, stepping, showing some backtraces. I've also tested with the flag off just to make sure things behave properly synchronously.	2025-03-31 13:29:31 -07:00
Luke Lau	6afe5e5d1a	[LV][EVL] Peek through combination tail-folded + predicated masks (#133430 ) If a recipe was predicated and tail folded at the same time, it will have a mask like `EMIT vp<%header-mask> = icmp ule canonical-iv, backedge-tc` `EMIT vp<%mask> = logical-and vp<%header-mask>, vp<%pred-mask>` When converting to an EVL recipe, if the mask isn't exactly just the header-mask we copy the whole logical-and. We can remove this redundant logical-and (because it's now covered by EVL) and just use vp<%pred-mask> instead. This lets us remove the widened canonical IV in more places.	2025-03-31 21:28:39 +01:00
Florian Hahn	4e8fbc6071	[LV] Add epilogue vectorization tests for FindLastIV reductions. Add missing test coverage for #126836.	2025-03-31 21:23:35 +01:00
Valentin Clement (バレンタインクレメン)	0b31f08537	[flang][cuda] Add support for NV_CUDAFOR_DEVICE_IS_MANAGED (#133778 ) Add support for the environment variable `NV_CUDAFOR_DEVICE_IS_MANAGED` as described in the documentation: https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html#controlling-device-data-is-managed. This mainly switches device allocation to managed allocation.	2025-03-31 13:17:21 -07:00
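
The three-step narrowing listed in 145b4a3950 ([AMDGPU][CodeGenPrepare]) can be sketched in plain C++. This is only an illustration of the trunc / 32-bit op / zext sequence, not the pass itself. The function names and the legality check (both operands below 2^31, so a 32-bit add cannot wrap) are assumptions chosen to keep the example self-contained; the commit does not say how the pass decides legality or profitability.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical legality check: if both operands are below 2^31, their sum
// fits in 32 bits, so performing the add in 32 bits cannot wrap.
bool canNarrowAdd(uint64_t a, uint64_t b) {
  return a < (1ull << 31) && b < (1ull << 31);
}

// Illustrative trunc -> 32-bit add -> zext sequence.
uint64_t narrowedAdd(uint64_t a, uint64_t b) {
  assert(canNarrowAdd(a, b) && "narrowing would change the result");
  uint32_t lo_a = static_cast<uint32_t>(a);  // 1. trunc operands to 32 bits
  uint32_t lo_b = static_cast<uint32_t>(b);
  uint32_t sum = lo_a + lo_b;                // 2. 32-bit add
  return static_cast<uint64_t>(sum);         // 3. zext back to 64 bits
}
```

When the check holds, `narrowedAdd(a, b)` equals the plain 64-bit `a + b`; when it does not, the narrow result could differ, which is why a transformation of this kind has to prove the range first.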
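
The equivalence behind 5d1f27f349 (GlobalISel: neg (and x, 1)) can be checked with a small C++ sketch. Both helper names are hypothetical and exist only for this illustration. The shift form is written with an unsigned left shift to avoid signed overflow, and it relies on the conversion to `int32_t` being modular and the right shift of a negative value being arithmetic, which C++20 guarantees and mainstream compilers also do in earlier standards.

```cpp
#include <cstdint>

// shl by 31 followed by ashr by 31: copies bit 0 into every bit of the word,
// i.e. a sign extension from a 1-bit value.
int32_t signExtendBit0(int32_t x) {
  uint32_t shl = static_cast<uint32_t>(x) << 31;
  return static_cast<int32_t>(shl) >> 31;
}

// The form InstCombine normalizes to: 0 - (x & 1).
int32_t negAnd1(int32_t x) {
  return 0 - (x & 1);
}
```

Both functions return -1 when bit 0 of `x` is set and 0 otherwise, which is why the same combine can apply to either form.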
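
The idea in 40c859a704 ([TableGen] Use size returned by encodeULEB128) is to use the encoder's returned length to append all bytes in one call. A minimal sketch follows, assuming a stand-in encoder written for this example rather than LLVM's own `encodeULEB128`; `encodeUleb128` and `appendUleb128` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in ULEB128 encoder (not LLVM's): writes the encoding of `value`
// into `buf` and returns the number of bytes written. A 64-bit value needs
// at most 10 bytes.
size_t encodeUleb128(uint64_t value, uint8_t *buf) {
  size_t n = 0;
  do {
    uint8_t byte = value & 0x7f;  // low 7 bits of the remaining value
    value >>= 7;
    if (value != 0)
      byte |= 0x80;               // continuation bit: more bytes follow
    buf[n++] = byte;
  } while (value != 0);
  return n;
}

// Append the whole encoding at once, using the returned length, instead of
// re-decoding the buffer to push one byte at a time.
void appendUleb128(std::vector<uint8_t> &table, uint64_t value) {
  uint8_t buf[10];
  size_t len = encodeUleb128(value, buf);
  assert(len <= sizeof(buf));
  table.insert(table.end(), buf, buf + len);
}
```

Because the encoder already reports how many bytes it produced, the caller does not need to inspect continuation bits to find the end of the encoding.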

532828 Commits