llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 23:36:35 +00:00

Author	SHA1	Message	Date
Nikita Popov	979c275097	[IR] Store Triple in Module (NFC) (#129868 ) The module currently stores the target triple as a string. This means that any code that wants to actually use the triple first has to instantiate a Triple, which is somewhat expensive. The change in #121652 caused a moderate compile-time regression due to this. While it would be easy enough to work around, I think that architecturally, it makes more sense to store the parsed Triple in the module, so that it can always be directly queried. For this change, I've opted not to add any magic conversions between std::string and Triple for backwards-compatibilty purses, and instead write out needed Triple()s or str()s explicitly. This is because I think a decent number of them should be changed to work on Triple as well, to avoid unnecessary conversions back and forth. The only interesting part in this patch is that the default triple is Triple("") instead of Triple() to preserve existing behavior. The former defaults to using the ELF object format instead of unknown object format. We should fix that as well.	2025-03-06 10:27:47 +01:00
Ricardo Jesus	f01e760c08	[AArch64][SVE] Improve fixed-length addressing modes. (#129732 ) When compiling VLS SVE, the compiler often replaces VL-based offsets with immediate-based ones. This leads to a mismatch in the allowed addressing modes due to SVE loads/stores generally expecting immediate offsets relative to VL. For example, given: ```c svfloat64_t foo(const double *x) { svbool_t pg = svptrue_b64(); return svld1_f64(pg, x+svcntd()); } ``` When compiled with `-msve-vector-bits=128`, we currently generate: ```gas foo: ptrue p0.d mov x8, #2 ld1d { z0.d }, p0/z, [x0, x8, lsl #3] ret ``` Instead, we could be generating: ```gas foo: ldr z0, [x0, #1, mul vl] ret ``` Likewise for other types, stores, and other VLS lengths. This patch achieves the above by extending `SelectAddrModeIndexedSVE` to let constants through when `vscale` is known.	2025-03-06 09:27:07 +00:00
Nikolas Klauser	0cceac6bbd	[libc++] Remove a few unused includes from <__concepts/*> (#129883 )	2025-03-06 10:22:18 +01:00
Simon Pilgrim	2c7e7b5627	[X86] Extend shuf128(concat(x,y),concat(z,w)) -> shuf128(widen(y),widen(w)) folds to peek through bitcasts (#129896 ) Peek through bitcasts when looking for freely accessible upper subvectors	2025-03-06 09:21:49 +00:00
Andrzej Warzyński	620c38371d	[mlir][nfc] De-duplicate tests from `Type::isIntOrFloat` (#129710 ) This PR makes sure that we always use `Type::isIntOrFloat` rather than re-implementing this condition inline. Also, it removes `isScalarType` that effectively re-implemented this method.	2025-03-06 09:04:30 +00:00
lakshayk-nv	d61d219739	Adding support in llvm-exegesis for Aarch64 for handling FPR64/128, PPR16 and ZPR128 reg class. (#127564 ) Current implementation (for Aarch64) in llvm-exegesis only supports GRP32 and GPR64 bit register class, thus for opcodes variants which used FPR64/128, PPR16 and ZPR128, llvm-exegesis throws warning "setReg is not implemented". This code will handle the above register class and initialize the registers using appropriate base instruction class.	2025-03-06 09:02:54 +00:00
Fraser Cormack	a2b0576172	[libclc] Stop installing CLC headers (#126908 ) The libclc headers are an implementation detail and are not intended to be used by others as OpenCL headers. The only artifacts of libclc we want to publish are the LLVM bytecode libraries. As the headers have been incidentally broken by recent changes, this commit takes the step to stop installing the headers at all. Downstreams can use clang's own OpenCL headers, and/or its -fdeclare-opencl-builtins flag. Fixes #119967.	2025-03-06 08:52:23 +00:00
Kunwar Grover	8e0a63ddad	[mlir][docs] Add docs on canonicalizers being folders or patterns (#129517 ) If a transformation should be a canonicalization is an orthogonal question to if a transformation should be implemented as a `RewritePattern` or a `fold` method. The later is an implementation detail. This patch adds a suggestion to always implement a canonicalization as a `fold` pattern if possible, as they are a restricted subset of a `RewritePattern`. This has been a common source of confusion, as to when to implement a canonicalization as a fold method or a RewritePattern.	2025-03-06 08:41:47 +00:00
Narayan	6311e3fcc8	[ValueTracking] ComputeNumSignBitsImpl - add basic handling of BITCAST nodes (#127218 ) When a wider scalar/vector type containing all sign bits is bitcast to a narrower vector type, we can deduce that the resulting narrow elements will also be all sign bits. This matches existing behavior in SelectionDAG and helps optimize cases involving SSE intrinsics where sign-extended values are bitcast between different vector types. The current implementation fails to recognize that an arithmetic right shift is redundant when applied to elements that are already known to be all sign bits. This PR improves ComputeNumSignBitsImpl to track this information through bitcasts, enabling the optimization of such cases. ``` %ext = sext <1 x i1> %cmp to <1 x i8> %sub = bitcast <1 x i8> %ext to <4 x i2> %sra = ashr <4 x i2> %sub, <i2 1, i2 1, i2 1, i2 1> ; Can be simplified to just: %sub = bitcast <1 x i8> %ext to <4 x i2> ``` Closes #87624	2025-03-06 08:30:36 +00:00
Simon Pilgrim	1987d18012	[X86] isFreeToSplitVector - use a SDValue instead of SDNode argument. NFC. (#129906 ) Every caller was having to call getNode() - so move that insider the helper.	2025-03-06 08:29:04 +00:00
Luke Lau	c6e2cbe5fd	[LV] Regenerate select-cmp-predicated.ll with UTC. NFC The main select-cmp.ll tests seem to be generated with UTC after it should probably be converted to UTC beforehand.	2025-03-06 16:20:03 +08:00
Lu Weining	bae6644e12	[LoongArch] Relax the restrictions of inlineasm operand modifier 'u' and 'w' (#129864 ) - Allow 'u' and 'w' on LASX, LSX or floating point register operands. - Also add missing description in LangRef. Fixes #129863.	2025-03-06 16:17:12 +08:00
Fangrui Song	687854aea8	[MC] Remove unneeded VK_None argument from MCSymbolRefExpr::create. NFC	2025-03-06 00:00:05 -08:00
Matthias Springer	a6151f4e23	[mlir][IR] Move `match` and `rewrite` functions into separate class (#129861 ) The vast majority of rewrite / conversion patterns uses a combined `matchAndRewrite` instead of separate `match` and `rewrite` functions. This PR optimizes the code base for the most common case where users implement a combined `matchAndRewrite`. There are no longer any `match` and `rewrite` functions in `RewritePattern`, `ConversionPattern` and their derived classes. Instead, there is a `SplitMatchAndRewriteImpl` class that implements `matchAndRewrite` in terms of `match` and `rewrite`. Details: * The `RewritePattern` and `ConversionPattern` classes are simpler (fewer functions). Especially the `ConversionPattern` class, which now has 5 fewer functions. (There were various `rewrite` overloads to account for 1:1 / 1:N patterns.) * There is a new class `SplitMatchAndRewriteImpl` that derives from `RewritePattern` / `OpRewritePatern` / ..., along with a type alias `RewritePattern::SplitMatchAndRewrite` for convenience. * Fewer `llvm_unreachable` are needed throughout the code base. Instead, we can use pure virtual functions. (In cases where users previously had to implement `rewrite` or `matchAndRewrite`, etc.) * This PR may also improve the number of [`-Woverload-virtual` warnings](https://discourse.llvm.org/t/matchandrewrite-hiding-virtual-functions/84933) that are produced by GCC. (To be confirmed...) Note for LLVM integration: Patterns with separate `match` / `rewrite` implementations, must derive from `X::SplitMatchAndRewrite` instead of `X`. --------- Co-authored-by: River Riddle <riddleriver@gmail.com>	2025-03-06 08:48:51 +01:00
Jim Lin	87976ca45f	[M68k] Suppress compilation warning `enumerated mismatch in conditional expression`. (NFC)	2025-03-06 15:06:46 +08:00
Fangrui Song	fe56c4c019	[MC] Remove unneeded VK_None argument from MCSymbolRefExpr::create. NFC	2025-03-05 23:14:04 -08:00
Balazs Benics	7e5821bae8	Reapply "[analyzer] Handle [[assume(cond)]] as __builtin_assume(cond)" (#129234 ) This is the second attempt to bring initial support for [[assume()]] in the Clang Static Analyzer. The first attempt (#116462) was reverted in 2b9abf0db2d106c7208b4372e662ef5df869e6f1 due to some weird failure in a libcxx test involving `#pragma clang loop vectorize(enable) interleave(enable)`. The failure could be reduced into: ```c++ template <class ExecutionPolicy> void transform(ExecutionPolicy) { #pragma clang loop vectorize(enable) interleave(enable) for (int i = 0; 0;) { // The DeclStmt of "i" would be added twice in the ThreadSafety analysis. // empty } } void entrypoint() { transform(1); } ``` As it turns out, the problem with the initial patch was this: ```c++ for (const auto Attr : AS->getAttrs()) { if (const auto AssumeAttr = dyn_cast<CXXAssumeAttr>(Attr)) { Expr *AssumeExpr = AssumeAttr->getAssumption(); if (!AssumeExpr->HasSideEffects(Ctx)) { childrenBuf.push_back(AssumeExpr); } } // Visit the actual children AST nodes. // For CXXAssumeAttrs, this is always a NullStmt. llvm::append_range(childrenBuf, AS->children()); // <--- This was not meant to be part of the "for" loop. children = childrenBuf; } return; ``` The solution was simple. Just hoist it from the loop. I also had a closer look at `CFGBuilder::VisitAttributedStmt`, where I also spotted another bug. We would have added the CFG blocks twice if the AttributedStmt would have both the `[[fallthrough]]` and the `[[assume()]]` attributes. With my fix, it will only once add the blocks. Added a regression test for this. Co-authored-by: Vinay Deshmukh <vinay_deshmukh AT outlook DOT com>	2025-03-06 08:09:09 +01:00
Maksim Panchenko	a28daa7c1a	[BOLT][AArch64] Keep relocations for linker-relaxed instructions. NFCI (#129980 ) We used to filter out relocations corresponding to NOP+ADR instruction pairs that were a result of linker "relaxation" optimization. However, these relocations will be useful for reversing the linker optimization. Keep the relocations and ignore them while symbolizing ADR instruction operands.	2025-03-05 23:06:01 -08:00
quic_hchandel	6e7e46cafe	[RISCV] Add Qualcomm uC Xqcibm (Bit Manipulation) extension (#129504 ) This extension adds thirty eight bit manipulation instructions. The current spec can be found at: https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.6 This patch adds assembler only support. Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>	2025-03-06 12:01:53 +05:30
Lang Hames	b18e5b6a36	Re-apply "[ORC] Remove the Triple argument from LLJITBuilder::..." with fixes. This re-applies f905bf3e1ef860c4d6fe67fb64901b6bbe698a91, which was reverted in c861c1a046eb8c1e546a8767e0010904a3c8c385 due to compiler errors, with a fix for MLIR.	2025-03-06 17:17:05 +11:00
chrisPyr	038fff3f24	[NFC][BOLT] Make file-local cl::opt global variables static (#126472 ) #125983	2025-03-05 22:11:05 -08:00
Fangrui Song	75f6fe2ee5	[MC] Remove unneeded MCSymbolRefExpr::create overload and add comments The StringRef overload is often error-prone as users might forget to register the MCSymbol. Add comments to MCTargetExpr and MCSymbolRefExpr::VariantKind. In the distant future the VariantKind parameter might be removed.	2025-03-05 22:10:08 -08:00
Owen Pan	a6ccda28f7	[clang-format][NFC] Use better names for a couple of data members	2025-03-05 21:45:01 -08:00
Lang Hames	c861c1a046	Revert "[ORC] Remove the Triple argument from LLJITBuilder::ObjectLinking..." This reverts commit f905bf3e1ef860c4d6fe67fb64901b6bbe698a91 while I fix some compile errors reported on the buildbots (see e.g. https://lab.llvm.org/buildbot/#/builders/53/builds/13369).	2025-03-06 16:22:39 +11:00
Peter Collingbourne	c72ebeeb7f	ADT: Switch to a raw pointer for DoubleAPFloat::Floats. In order for the union APFloat::Storage to permit access to the semantics field when another union member is stored there, all members of Storage must be standard layout. This is not necessarily the case for DoubleAPFloat which may be non-standard layout because there is no requirement that its std::unique_ptr member is standard layout. Fix this by converting Floats to a raw pointer. Reviewers: arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/129981	2025-03-05 21:15:08 -08:00
Lang Hames	f905bf3e1e	[ORC] Remove the Triple argument from LLJITBuilder::ObjectLinkingLayerCreator. ExecutionSession can provide the Triple, so this argument has been redundant for a while, and no in-tree clients use it.	2025-03-06 16:13:10 +11:00
Lang Hames	a22881c9db	[ORC-RT] Fix type name in comment. NFC.	2025-03-06 16:13:10 +11:00
Jim Lin	316f68f7f2	[RISCV] Remove the TODO for folding bswap and shift from the test. (NFC) (#129972 ) https://reviews.llvm.org/D122655 has supported it.	2025-03-06 13:02:17 +08:00
Jim Lin	463a0964f4	[RISCV] Remove the TODO for fmaximum/fminimum from the tests. (NFC) (#129969 ) https://reviews.llvm.org/D156069 has supported it.	2025-03-06 13:01:58 +08:00
Jinsong Ji	e4c3d258b7	[NFC][c-index-test] factor data len out (#129971 ) Follow up of #129922	2025-03-05 23:21:07 -05:00
Thurston Dang	c14d3dff97	[msan][NFCI] Add tests for Arm NEON vector load (#125267 ) Forked from llvm/test/CodeGen/AArch64/arm64-ld1.ll Incorrectly handled by handleUnknownInstruction: - llvm.aarch64.neon.ld1x2, llvm.aarch64.neon.ld1x3, llvm.aarch64.neon.ld1x4 - llvm.aarch64.neon.ld2, llvm.aarch64.neon.ld3, llvm.aarch64.neon.ld4 - llvm.aarch64.neon.ld2lane, llvm.aarch64.neon.ld3lane, llvm.aarch64.neon.ld4lane - llvm.aarch64.neon.ld2r, llvm.aarch64.neon.ld3r, llvm.aarch64.neon.ld4r	2025-03-05 20:03:38 -08:00
A. Jiang	27e686c788	[libc++] Verify that LWG4140 is implemented (#128624 ) According to the commit history, the constructors removed by LWG4140 have never been added to libc++. Existence of non-public or deleted default constructor is observable, this patch tests that there's no such default constructor at all.	2025-03-06 11:12:17 +08:00
Matt Arsenault	a216358ce7	AMDGPU: Replace amdgpu-no-agpr with amdgpu-agpr-alloc (#129893 ) This performs the minimal replacment of amdgpu-no-agpr to amdgpu-agpr-alloc=0. Most of the test diffs are due to the new attribute sorting later alphabetically. We could do better by trying to perform range merging in the attributor, and trying to pick non-0 values.	2025-03-06 09:17:51 +07:00
Matt Arsenault	f4ba2bf236	AMDGPU: Add amdgpu-agpr-alloc attribute to control AGPR allocation (#128034 ) This provides a range to decide how to subdivide the vector register budget on gfx90a+. A single value declares the minimum AGPRs that should be allocatable. Eventually this should replace amdgpu-no-agpr. I want this primarily for testing agpr allocation behavior. We should have a heuristic try to detect a reasonable number of AGPRs to keep allocatable.	2025-03-06 09:13:59 +07:00
Jinsong Ji	560cfd5099	c-index-test: fix buffer overflow (#129922 ) We should not try to overwrite the pointer of struct, also need to add 1 for end of line.	2025-03-05 20:52:11 -05:00
Valentin Clement (バレンタインクレメン)	2130285564	[flang][cuda] Make sure allocator id is set for pointer allocate (#129950 )	2025-03-05 17:29:09 -08:00
Prabhuk	45ca613c13	[clang] Use TargetInfo to decide Mangling for C (#129920 ) Instead of hardcoding the decision on what mangling scheme to use based on targets, use TargetInfo to make the decision.	2025-03-05 17:26:52 -08:00
A. Jiang	c28c508962	[libc++] Implement part of P2562R1: constexpr `ranges::stable_partition` (#129839 )	2025-03-06 09:23:55 +08:00
Joseph Huber	12c5a46c30	[Clang] Fix incorrect condition on ballot Summary: Somehow these got the `!` dropped and it wasn't tested because the existing test only used the 32-bit variant.	2025-03-05 19:15:23 -06:00
Deric C.	b4ecebe745	[HLSL] [DXIL] Implement the AddUint64 HLSL function and the UAddc DXIL op (#127137 ) Fixes #99205. - Implements the HLSL intrinsic `AddUint64` used to perform unsigned 64-bit integer addition by using pairs of unsigned 32-bit integers instead of native 64-bit types - The LLVM intrinsic `uadd_with_overflow` is used in the implementation of `AddUint64` in `CGBuiltin.cpp` - The DXIL op `UAddc` was defined in `DXIL.td`, and a lowering of the LLVM intrinsic `uadd_with_overflow` to the `UAddc` DXIL op was implemented in `DXILOpLowering.cpp` Notes: - `__builtin_addc` was not able to be used to implement `AddUint64` in `hlsl_intrinsics.h` because its `CarryOut` argument is a pointer, and pointers are not supported in HLSL - A lowering of the LLVM intrinsic `uadd_with_overflow` to SPIR-V [already exists](https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/SPIRV/llvm-intrinsics/uadd.with.overflow.ll) - When lowering the LLVM intrinsic `uadd_with_overflow` to the `UAddc` DXIL op, the anonymous struct type `{ i32, i1 }` is replaced with a named struct type `%dx.types.i32c`. This aspect of the implementation may be changed when issue #113192 gets addressed - Fixes issues mentioned in the comments on the original PR #125319 --------- Co-authored-by: Finn Plummer <50529406+inbelic@users.noreply.github.com> Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com> Co-authored-by: Chris B <beanz@abolishcrlf.org> Co-authored-by: Justin Bogner <mail@justinbogner.com>	2025-03-05 17:04:10 -08:00
Sirraide	5b3ba261c4	[Clang] [Sema] Allow non-local/non-variable declarations in for loop (#129737 ) Currently, we error on non-variable or non-local variable declarations in `for` loops such as `for (struct S {}; 0; ) {}`. However, this is valid in C23, so this patch changes the error to a compatibilty warning and also allows this as an extension in earlier language modes. This also matches GCC’s behaviour.	2025-03-06 02:03:22 +01:00
Shafik Yaghmour	4b454afc45	Convert unreachable return statement into llvm_unreachable (#129627 ) Static analysis flags the final return statement in `ReadExtensionBlock` as unreachable and indeed it is since there is no way to exit the `while(true)` loop besides a return statement. So I am converting it into a `llvm_unreachable` to explicitly document this.	2025-03-05 16:59:44 -08:00
Jonas Devlieghere	b5e70d0682	[lldb-dap] Use LLDB_INVALID_LINE_NUMBER & LLDB_INVALID_COLUMN_NUMBER (#129948 ) Consistently use LLDB_INVALID_LINE_NUMBER & LLDB_INVALID_COLUMN_NUMBER when parsing line and column numbers respectively.	2025-03-05 16:19:36 -08:00
lntue	f2bebdc688	[libc][math] Add skip accurate pass option for exp, log, and powf functions. (#129831 )	2025-03-05 19:18:25 -05:00
Amara Emerson	5422e2c681	[AArch64][GlobalISel] On Darwin don't fold globals into the offset of prefetches. ld64 doesn't currently support the PAGEOFF relocations on anything but load/stores so we need to bail-out here to fix the build failures on greendragon. rdar://145495288	2025-03-05 16:06:08 -08:00
choikwa	45759fe5b4	[AMDGPU] Filter candidates of LiveRegOptimizer for profitable cases (#124624 ) It is known that for vector whose element fits in i16 will be split and scalarized in SelectionDag's type legalizer (see SIISelLowering::getPreferredVectorAction). LRO attempts to undo the scalarizing of vectors across basic block boundary and shoehorn Values in VGPRs. LRO is beneficial for operations that natively work on illegal vector types to prevent flip-flopping between unpacked and packed. If we know that operations on vector will be split and scalarized, then we don't want to shoehorn them back to packed VGPR. Operations that we know to work natively on illegal vector types usually come in the form of intrinsics (MFMA, DOT8), buffer store, shuffle, phi nodes to name a few.	2025-03-05 18:44:48 -05:00
Bruno Cardoso Lopes	aea74034eb	[MLIR][LLVMIR] Add elementtype attribute (#129918 ) These are very common when using intrinsics (e.g. ARM NEON). For more context: ClangIR has currently been blocked on such intrinsics emission because of this lacking capability.	2025-03-05 15:21:45 -08:00
Zhen Wang	d1abbb4dc5	[flang][cuda] Change induction variable from i32 to index for doconcurrent inside cuf kernel directive (#129924 ) Use `index` instead of `i32` for induction variables for doconcurrent inside cuf kernel directive. Regular do loop inside cuf kernel directive also uses `index`: ``` cuf.kernel<<<, >>> (%arg0 : index) = ... ```	2025-03-05 14:50:42 -08:00
Joseph Huber	77363f7518	[libc] Allow building the GPU targets that don't have CRT (#129945 ) Summary: If there's no subdirectory we can't make an alias. This allows it to at least continue.	2025-03-05 16:39:35 -06:00
Vy Nguyen	275eab91ed	[LLDB]Fix test crash (#129921 ) Use the `SubsystemRAII` to unregister the fake manager at end of tests (Should fix https://github.com/llvm/llvm-project/issues/129910)	2025-03-05 17:36:06 -05:00

1 2 3 4 5 ...

529559 Commits