llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 22:06:06 +00:00

Author	SHA1	Message	Date
tdanyluk	76d2e0881e	[mlir] fix references of attributes which are not defined earlier (#134364 ) If an attribute is not defined earlier in the same file, but just referenced from its dialect directly, then the correct check is currently not emitted. What would it emit for #toy.shape<[1, 2, 3]>: Earlier: // CHECK: #[['?']]<[1, 2, 3]> Now: // CHECK: #toy.shape<[1, 2, 3]>	2025-04-08 17:34:20 +02:00
Christian Sigg	4e9cfcf6af	[llvm][bazel] Fix BUILD after 561506144531cf0a760bb437fd74c683931c60ae.	2025-04-08 17:28:20 +02:00
Sirraide	6c74fe9087	[Clang] [NFC] Tablegen component diags headers (#134777 ) The component diagnostic headers (i.e. `DiagnosticAST.h` and friends) all follow the same format, and there’s enough of them (and in them) to where updating all of them has become rather tedious (at least it was for me while working on #132348), so this patch instead generates all of them (or rather their contents) via Tablegen. Also, it seems that `%enum_select` currently wouldn’t work in `DiagnosticCommonKinds.td` because the infrastructure for that was missing from `DiagnosticIDs.h`; this patch should fix that as well.	2025-04-08 17:21:45 +02:00
Matt Arsenault	34e8f00066	Attributor: Propagate align to cmpxchg instructions (#134838 ) Fixes #134480	2025-04-08 22:15:50 +07:00
Matt Arsenault	66f0343609	Attributor: Propagate align to atomicrmw instructions (#134837 ) Partially fixes #134480	2025-04-08 22:12:20 +07:00
Matt Arsenault	2cf4254466	Attributor: Add baseline tests for propagating align to atomics (#134836 )	2025-04-08 22:08:11 +07:00
Adrian Prantl	5615061445	[dsymutil] Avoid copying binary swiftmodules built from textual (#134719 ) .swiftinterface files into the dSYM bundle. These typically come only from the SDK (since textual interfaces require library evolution) and thus are a waste of space to copy into the bundle. The information about this is being parsed out of the control block, which means duplicating 5 constants from the Swift frontend. If a file cannot be parsed, dsymutil errs on the side of copying the file anyway. rdar://138186524	2025-04-08 08:03:32 -07:00
Matt Arsenault	dfe4d9187c	GCStrategy: Use Twine properly for error message (#132760 )	2025-04-08 21:57:29 +07:00
Christopher McGirr	ae3faea1f2	[MLIR][mlir-opt] move action debugger hook flag (#134842 ) Currently if a developer uses the flag `--mlir-enable-debugger-hook` the debugger hook is not actually enabled. It seems the DebugConfig and the MainMLIROptConfig are not connected. To fix this we can move the `enableDebuggerHook` CL Option to the DebugConfigCLOptions struct so that it can get registered and enabled along with the other debugger flags. AFAICS there are no other uses of the flag so this should be safe. This also adds a small LIT test to check that the hook is enabled by checking the std::cerr output for the log message.	2025-04-08 16:54:11 +02:00
Alan Li	b5045ae9bc	[MLIR][Fix] Fix missing dep in AMDGPUDialect. (#134862 ) Issue introduced in https://github.com/llvm/llvm-project/pull/133498	2025-04-08 10:46:55 -04:00
Michael Liao	4f77e50042	[MLIR][AMDGPU] Fix shared build. NFC	2025-04-08 10:46:15 -04:00
Han-Kuan Chen	2347aa1fcc	[SLP][REVEC] Fix the mismatch between the result of getAltInstrMask and the VecTy argument of TargetTransformInfo::isLegalAltInstr. (#134795 ) We cannot determine ScalarTy from VL because some ScalarTy is determined from VL[0]->getType(), while others are determined from getValueType(VL[0]). Fix "Mask and VecTy are incompatible".	2025-04-08 22:29:11 +08:00
Han-Kuan Chen	97c4cb4d13	[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134763 )	2025-04-08 22:29:03 +08:00
Philip Reames	c1e95b2e5e	[RISCV] Fix matching bug in VLA shuffle lowering (#134750 ) Fix https://github.com/llvm/llvm-project/issues/134126. The matching code was previously written as if we were mutating the indices to replace undef elements with preferred values, but the actual lowering code just took a prefix of the index vector. This resulted in us using undef indices for lanes which should have been defined, resulting in incorrect codegen. Longer term, we probably should rewrite the mask, but this seemed like an easier tactical fix.	2025-04-08 07:20:25 -07:00
Michael Kruse	8b11c39a0f	[llvm-mt] Do not build llvm-mt if not functional (#134631 ) llvm-mt requires libxml2 to work, so do not even build it without libxml2. CMake 3.31 and later prefer llvm-mt.exe over Microsoft's mt.exe if available and using clang-cl.exe as CMAKE_CXX_COMPILER. When CMake picks up llvm-mt.exe without libxml2, any build will fail with the message ``` llvm-mt: error: no libxml2 ``` Any test except `--help` already uses `REQUIRES: libxml2`. There is no point in having a non-functional executable. Not building llvm-mt.exe will force CMake to use Microsoft's `mt.exe` instead. Fixes: #134237	2025-04-08 16:16:53 +02:00
Mircea Trofin	b2dea4fd22	[ctxprof] root autodetection mechanism (#133147 ) This is an optional mechanism that automatically detects roots. It's a best-effort mechanism, and its main goal is to avoid pointing at the message pump function as a root. This is the function that polls message queue(s) in an infinite loop, and is thus a bad root (it never exits). High-level, when collection is requested - which should happen when a server has already been set up and is handling requests - we spend a bit of time sampling all the server's threads. Each sample is a stack which we insert in a `PerThreadCallsiteTrie`. After a while, we run for each `PerThreadCallsiteTrie` the root detection logic. We then traverse all the `FunctionData`, find the ones matching the detected roots, and allocate a `ContextRoot` for them. From here, we special case `FunctionData` objects, in `__llvm_ctx_profile_get_context`, that have a `CtxRoot` and route them to `__llvm_ctx_profile_start_context`. For this to work, on the llvm side, we need to have all functions call `__llvm_ctx_profile_release_context` because they _might_ be roots. This comes at a slight (percentages) penalty during collection - which we can afford since the overall technique is ~5x faster than normal instrumentation. We can later explore conditionally enabling autoroot detection and avoiding this penalty, if desired. Note that functions that `musttail call` can't have their return instrumented this way, and a subsequent patch will harden the mechanism against this case. The mechanism could be used in combination with explicit root specification, too.	2025-04-08 06:59:38 -07:00
Shilei Tian	f19c6f23ab	[Clang][AMDGPU] Improve error message when device libraries for COV6 are missing (#134745 ) #130963 switches the default to COV6, which requires ROCm 6.3. Currently, if the device libraries for COV6 are not found, the error message is not very helpful. This PR provides a more informative error message in such cases.	2025-04-08 09:57:43 -04:00
Romaric Jodin	0e98817458	libclc: frexp: fix implementation regarding denormals (#134823 ) Devices not supporting denormals can compare them true against zero. This leads to results not matching the CTS expectation, whether or not denormals are supported. For example, for 0x1.008p-140 we get {0x1.008p-140, 0} while the CTS expects {0x1.008p-1, -139} when supporting denormals, or {0, 0} when not supporting denormals (flushed to zero). Ref #129871	2025-04-08 14:50:26 +01:00
Christian Sigg	3a6b9b3a87	[mlir][bazel] Fix after dae0ef53a0b99c6c2b74143baee5896e8bc5c8e7 Remove unnecessary include.	2025-04-08 15:47:14 +02:00
Hans Wennborg	35b3886382	[win/arm64] Enable tail call with inreg arguments when possible (#134671 ) Tail calls were disabled from callers with inreg parameters in 5dc8aeb with a fixme to check if the callee also takes an inreg parameter. The issue is that inreg parameters (which are passed in x0 or x1 for free and member functions respectively) are supposed to be returned (in x0) at the end of the function. In case of a tail call, that means the callee needs to return the same value as the caller would. We can check for that case, and it's not as niche as it sounds, as that's how Clang will lower one function with an sret return value calling another, such as: ``` struct T { int x; }; struct S { T foo(); T bar(); }; T S::foo() { return bar(); } // foo's sret argument will get passed directly to bar ``` Fixes #133098	2025-04-08 15:25:28 +02:00
wldfngrs	fdf20941a8	[libc][math] Fix signaling NaN handling for math functions. (#133347 ) Add tests for signaling NaNs, and fix function behavior for handling signaling NaN input. Fixes https://github.com/llvm/llvm-project/issues/124812	2025-04-08 15:23:38 +02:00
Alan Li	dae0ef53a0	[MLIR][AMDGPU] Add a wrapper for global LDS load intrinsics in AMDGPU (#133498 ) Defining a new `amdgpu.global_load` op, which is a thin wrap around ROCDL `global_load_lds` intrinsic, along with its lowering logics to `rocdl.global.load.lds`.	2025-04-08 09:18:30 -04:00
Nico Weber	94b9d75c6d	[gn] port 65813e0e94c04	2025-04-08 09:16:37 -04:00
Jay Foad	008c875be8	[AMDGPU] Fix excessive stack usage in SIInsertWaitcnts::run (#134835 ) Noticed on Windows when running LLVM as part of a graphics driver, with total stack usage limited to about 128 KB. In some cases this function would overflow the stack. On Linux this reduces stack usage in this function from about 32 KB to about 0.5 KB.	2025-04-08 14:08:42 +01:00
Kajetan Puchalski	7e1b76c2d7	Revert "[flang] Use precompiled parsing headers" (#134851 ) Reverts llvm/llvm-project#130600 Reverting on account of Windows issues with ccache, will bring it back along with #131137 once those are resolved.	2025-04-08 13:47:25 +01:00
TatWai Chong	728320f946	[mlir][tosa] Increase test coverage for profile-based validation (#134754 ) Add more tests to increase test coverage.	2025-04-08 13:33:16 +01:00
Akshat Oke	fcaefc2c19	[AMDGPU][NPM] Port SIPreEmitPeephole to NPM (#130065 )	2025-04-08 17:58:48 +05:30
Joseph Huber	79cb6f05da	[Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (#134713 ) Summary: These two tools do the same thing, we should unify them into a single tool. We create symlinks for backward compatibility and provide a way to get the old vendor specific behavior with `--amdgpu-only` and `--nvptx-only`.	2025-04-08 07:27:12 -05:00
David Spickett	db7fb704f6	[lldb][test] Explain why TestExprFromNonZeroFrame is disabled on Windows It's not scientific but I think the PDB we produce on the Windows on Arm bot simply doesn't have the information needed. Could also be that clang is producing some DWARF, but link.exe is dropping it from the final executable, the effect is the same.	2025-04-08 12:17:07 +00:00
Kajetan Puchalski	25e08c0b9c	Revert "[CMake] Fix using precompiled headers with ccache" (#134848 ) Reverts llvm/llvm-project#131397 Reverting for now on account of build bot failures on certain platforms.	2025-04-08 13:13:49 +01:00
Florian Hahn	a51e282784	[LV] Check if plan has an early exit via plan's exit blocks. (NFC) (#134720 ) Add a dedicated function to check if a plan is for a loop with an early exit. This can easily be determined by checking the exit blocks. This allows removing a use of Legal->hasUncountableEarlyExit() from InnerLoopVectorizer. PR: https://github.com/llvm/llvm-project/pull/134720	2025-04-08 12:52:38 +01:00
Michael Klemm	69c4e172d9	[Flang][OpenMP] Add semantic tests for threadprivate variables with host assoc (#134680 )	2025-04-08 13:22:05 +02:00
Omair Javaid	c2c1031e90	[Flang][Windows] Fix test_errors.py by enforcing UTF-8 encoding (#134625 ) This patch fixes UnicodeDecodeError on Windows in test_errors.py. This issue was observed on the flang-arm64-windows-msvc buildbot. Semantics/OpenMP/interop-construct.f90 was crashing due to Python defaulting to the cp1252 codec on Windows. I have fixed this by explicitly setting encoding="utf-8" when reading source files and invoking subprocess.run() in test_errors.py. flang-arm64-windows-msvc was running on staging master, which resulted in this issue not being fixed earlier. https://lab.llvm.org/staging/#/builders/206	2025-04-08 16:16:26 +05:00
Kajetan Puchalski	e8dc8add3c	[CMake] Fix using precompiled headers with ccache (#131397 ) Using precompiled headers with ccache requires special accommodations. Add the required ccache options, clang and gcc compiler flags to CMake. Refactor ccache configuration to pass options directly on the command line for versions of ccache that support it. --------- Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>	2025-04-08 12:09:52 +01:00
Paul Walker	e06a9ca2cb	[LLVM][CodeGen][SVE] Improve lowering of fixed length masked mem ops. (#134402 ) Converting fixed length masks, as used by MLOAD, to scalable vectors is done by comparing the mask to zero. When the mask is the result of a compare we can instead promote the operands and regenerate the original compare. At worst this reduces the dependency chain and in most cases removes the need for multiple compares.	2025-04-08 12:09:10 +01:00
Nikolas Klauser	483edfeeb5	[libc++] Use __add_pointer and __remove_pointer builtins when they are fixed (#134147 )	2025-04-08 13:05:24 +02:00
Nathan Gauër	739062d2c3	[SPIR-V] Add spv.gep support for ptrcast legal (#134388 ) Adds support for the spv.gep intrinsic to the spv ptrcast legalization step. Those intrinsics are generated by the backend thus not directly visible in the tests. This is a pre-requisite to implement addrspacecast legalization for logical SPIR-V.	2025-04-08 12:55:37 +02:00
Jonathan Thackray	204d8c0d58	[clang][llvm] Fix AArch64 MOP4{A/S} intrinsic tests (NFC) (#134746 ) Fix some of the recently-added tests (PRs #127797, #128854, #129226 and #129230) which were incorrectly defined.	2025-04-08 11:45:47 +01:00
Simon Tatham	7af2b51e76	[AArch64][v8.5A] Omit BTI for non-addr-taken static fns on Linux (#134669 ) This is a conditional revert of cca40aa8d8aa732, which made LLVM's branch-target-enforcement mode generate BTI at the start of _every_ function, even in the case where the function has internal linkage and its address is never taken for use in an indirect call. The rationale was that it might turn out at link time that a direct call to the function spanned a larger distance than the range of a BL instruction (say, if the translation unit generated multiple code sections and the linker put them a very long way apart). Then the linker might insert a long-branch thunk using an indirect call instruction. SYSVABI64 has now clarified that in this situation the static linker may not assume that the target function is safe to call directly. If it needs to use this strategy, it's responsible for also generating a 'landing pad' near the target function, with a BTI followed by a direct branch, and using that as the target of the long-distance indirect call. LLD complies with this spec as of commit 098b0d18add97de. So if we're compiling in a mode that respects SYSVABI64, such as targeting Linux, it's safe to leave out the BTI at the start of a function with internal linkage if we can prove that its address isn't either used in an indirect call in _this_ translation unit or passed out of the object. Therefore, this patch goes back to the behavior before cca40aa8d8aa732, leaving out BTIs in functions that can't be called indirectly, but only if the target triple is Linux. (I wasn't able to find a more precise query for "is this a SYSVABI64-compliant platform?", but Linux certainly is, and this check at least fails in the safe direction - if in doubt, we put in all the BTIs that might be necessary.)	2025-04-08 11:44:12 +01:00
Paul Walker	1997073a54	[LLVM][InstCombine][SVE] Refactor sve.mul/fmul combines. (#134116 ) After https://github.com/llvm/llvm-project/issues/126928 it's now possible to rewrite the existing combines, which mostly only handle cases where an operand is an identity value, to use existing simplify code to unlock general constant folding.	2025-04-08 11:38:27 +01:00
Simon Pilgrim	83fbe67986	[X86] combineX86ShufflesRecursively - iteratively peek through bitcasts to free subvector widening/narrowing sources. (#134701 ) Generalizes the existing code to repeatedly peek through mixed bitcast/insert_subvector/extract_subvector chains to find the source of the shuffle operand.	2025-04-08 11:28:40 +01:00
Anatoly Trosinenko	8521bd2424	[BOLT][AArch64] Handle PAuth call instructions in isIndirectCall (#133227 ) Handle `BLRA*` opcodes in AArch64MCPlusBuilder::isIndirectCall, update getRegUsedAsCallDest accordingly.	2025-04-08 13:23:10 +03:00
MisakaVan	ff5b649a84	[libc++] Fix a comment typo in __tree (#134831 ) "Returns true is __root is a proper red black tree" -> "Returns true if __root is a proper red black tree"	2025-04-08 12:17:43 +02:00
Ramkumar Ramachandra	6a42fb8fbf	[LV] Clarify code in isPredicatedInst (NFC) (#134251 )	2025-04-08 10:46:17 +01:00
Jay Foad	6f93c0676f	[AMDGPU] Make a few WaitcntBrackets methods const. NFC. (#134824 )	2025-04-08 10:44:02 +01:00
Jakub Ficek	a5509d62a7	[clang] fp options fix for __builtin_convertvector (#134102 ) Add missing CGFPOptionsRAII for fptoi and itofp cases	2025-04-08 11:36:48 +02:00
Tom Eccles	4c09ae0b2e	[flang][OpenMP] Lowering for CANCEL and CANCELLATIONPOINT (#134248 ) These will still hit TODOs in OpenMPToLLVMIRConversion.cpp	2025-04-08 10:29:18 +01:00
Tom Eccles	446d4f51eb	[flang][OpenMP][Lower] fix statement context cleanup insertion point (#133891 ) The statement context is used for lowering clauses for openmp operations using generalised helpers from flang lowering. The statement context stores closures which generate code for cleaning up temporary values generated by the lowering helper. These closures are run when the statement construct is destroyed. Keeping the statement context local to the clause or operation being lowered without any special handling was not correct because any cleanup code would be generated at the insertion point when that statement context went out of scope (which would in general be inside of the newly created container operation). It would be better to generate the cleanup code after the newly created operation (clause processing is synchronous even for deferred tasks). Currently supported clauses are mostly populated with simple scalar values that require no cleanup. Even the simple array sections added by #132994 needed no cleanup because indexing the right values of the array did not create any temporaries. Supporting array sections with vector indexing will generate hlfir.destroy operations for cleanup. This patch fixes where those will be created. Those hlfir.destroy operations don't generate any FIR (or LLVM) code, but the issue still exists theoretically. I wasn't able to find any clauses which have any cleanup to use to test this PR. It is probably NFC for the current lowering. This will be tested in [the PR adding vector subscripting of array sections](https://github.com/llvm/llvm-project/pull/133892).	2025-04-08 10:27:27 +01:00
Nathan Gauër	fe4f666363	[CI] Always upload queue/running count (#134814 ) Before this commit, we only pushed a queue/running count when the value was not zero. This makes building Grafana alerting a bit harder. Changing this to always upload a value for watched workflows.	2025-04-08 11:16:24 +02:00
David Green	c23e1cb936	[BasicAA] Treat ExtractValue(Argument) similar to Argument in relation to function-local objects. (#134716 ) This is a much smaller, technically orthogonal patch similar to #134505. It states that an extractvalue(Argument) can be treated like an Argument for alias analysis, where the extractelement acts like a phi / copy. No inttoptr here.	2025-04-08 10:05:58 +01:00
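
The frexp expectation quoted in commit 0e98817458 can be checked on a host with a short sketch. This is not the libclc implementation: it assumes Python's `math.frexp`, which works in double precision, where 0x1.008p-140 is a normal number rather than a float32 denormal. The decomposition it returns is the same pair the CTS expects on devices that support denormals.

```python
import math

# Input quoted in the commit message. It is subnormal in float32,
# but Python floats are doubles, so it is an ordinary normal value here.
x = float.fromhex("0x1.008p-140")

# frexp returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
mantissa, exponent = math.frexp(x)
print(mantissa, exponent)  # 0.5009765625 -139

# Same pair as the CTS expectation {0x1.008p-1, -139}.
assert mantissa == float.fromhex("0x1.008p-1")
assert exponent == -139

# A device that flushes the input to zero should report {0, 0} instead.
flushed_mantissa, flushed_exponent = math.frexp(0.0)
print(flushed_mantissa, flushed_exponent)  # 0.0 0
```

The buggy result described in the message, {0x1.008p-140, 0}, corresponds to returning the input unchanged with a zero exponent, which matches neither case above.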


533385 Commits