llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 06:26:07 +00:00

Author	SHA1	Message	Date
Mark de Wever	658f848fed	[NFC][libc++][test] Refactor new ftm generator tests. (#134490 ) This uses the python unit test framework instead of just asserts. This improves the diagnostics when a test fails.	2025-04-08 19:38:26 +02:00
Andy Kaylor	4928093a21	[CIR] Upstream support for address of and dereference (#134317 ) This adds support for handling the address of and dereference unary operations in ClangIR code generation. This also adds handling for nullptr and proper initialization via the NullToPointer cast.	2025-04-08 10:32:03 -07:00
Min-Yih Hsu	9bfb4b8fb1	[MachineScheduler] Add more debug prints w.r.t hazards and pending SUnits (#134328 ) While we already have some detailed debug messages on the candidate selection process -- which selects a SUnit from the Available queue, we didn't say much about why a SUnit was _not_ moved from Pending queue to Available queue in the first place, which is just as important as why we scheduled a node IMHO. Therefore, I added some debug prints for this very purpose. I decided to print these extra messages by default (instead of being guarded by command line like `-misched-detail-resource-booking`) because we have been printing some of the hazard remarks, so I thought we might as well print these new messages -- which are mostly about hazard -- by default.	2025-04-08 10:31:05 -07:00
Matthias Springer	b7b3758e88	[mlir][IR] Add `VectorTypeElementInterface` with `!llvm.ptr` (#133455 ) This commit extends the MLIR vector type to support pointer-like types such as `!llvm.ptr` and `!ptr.ptr`, as indicated by the newly added `VectorTypeElementInterface`. This makes the LLVM dialect closer to LLVM IR. LLVM IR already supports pointers as vector element type. Only integers, floats, pointers and index are valid vector element types for now. Additional vector element types may be added in the future after further discussions. The interface is still evolving and may eventually turn into one of the alternatives that were discussed on the RFC. This commit also disallows `!llvm.ptr` as an element type of `!llvm.vec`. This type exists due to limitations of the MLIR vector type. RFC: https://discourse.llvm.org/t/rfc-allow-pointers-as-element-type-of-vector/85360	2025-04-08 19:21:45 +02:00
Valentin Clement (バレンタインクレメン)	5ebe22a35d	[flang][cuda] Add async id to allocators (#134724 ) Add async id to allocators in preparation for stream allocation.	2025-04-08 10:16:59 -07:00
Fangrui Song	7117dea043	AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function; use lowerDSOLocalEquivalent in more cases https://reviews.llvm.org/D17938 introduced lowerRelativeReference to give ConstantExpr sub (A-B) special semantics in ELF: when `A` is an `unnamed_addr` function, create a PLT-generating relocation. This was intended for C++ relative vtables, but C++ relative vtable ended up using DSOLocalEquivalent (lowerDSOLocalEquivalent). This special treatment of `unnamed_addr` seems unusual. Let's remove it. Only COFF needs an overload to generate a @IMGREL32 relocation specifier (llvm/test/MC/COFF/cross-section-relative.ll). Pull Request: https://github.com/llvm/llvm-project/pull/134781	2025-04-08 10:11:20 -07:00
Mark de Wever	6d2b767678	[NFC][libc++] Removes Clang 16 work-arounds. (#91636 ) This was noticed while reviewing the implementation status of P1614R2 The Mothership has Landed Drive-by: Add some missing _LIBCPP_HIDE_FROM_ABIs.	2025-04-08 19:10:11 +02:00
Erich Keane	231aa3070d	[OpenACC][CIR] Basic infrastructure for OpenACC lowering (#134717 ) This is the first of a few patches that will do infrastructure work to enable the OpenACC lowering via the OpenACC dialect. At the moment this just gets the various function calls that will end up generating OpenACC, plus some tests to validate that we're doing the diagnostics in OpenACC specific locations. Additionally, this adds Stmt and Decl files for CIRGen.	2025-04-08 10:06:28 -07:00
Alexey Bataev	edcbd4a211	[SLP][NFC]Extract a check for strided loads into separate function, NFC Reviewers: hiraditya, RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/134876	2025-04-08 13:02:31 -04:00
Alexey Bataev	02a708b93b	[SLP][NFC]Extract TryToFindDuplicates lambda into a separate function, NFC Reviewers: RKSimon, hiraditya Reviewed By: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/134873	2025-04-08 13:01:54 -04:00
Morris Hafner	441f87968d	[CIR] Upstream CmpOp (#133159 ) This patch adds support for comparison operators with ClangIR, both integral and floating point. --------- Co-authored-by: Morris Hafner <mhafner@nvidia.com> Co-authored-by: Henrich Lauko <xlauko@mail.muni.cz> Co-authored-by: Andy Kaylor <akaylor@nvidia.com>	2025-04-08 09:53:54 -07:00
k-kashapov	271399831b	[MSan] Change overflow_size_tls type to IntPtrTy (#117689 ) As discussed in https://github.com/llvm/llvm-project/pull/109284#discussion_r1838819987: Changed `__msan_va_arg_overflow_size_tls` type from `Int64Ty` to `IntPtrTy`.	2025-04-08 09:51:13 -07:00
Adrian Prantl	2721d50d87	Revert "[dsymutil] Avoid copying binary swiftmodules built from textual" This reverts commit 39ace8a63012af7d6ad7bf065c233fd3d5df44a3. while investigating Linux bot failures.	2025-04-08 09:49:36 -07:00
Jacob Lalonde	f869d6efee	[LLDB][Minidump]Update MinidumpFileBuilder to read and write in chunks (#129307 ) I recently received an internal error report that LLDB was OOM'ing when creating a Minidump. In my 64b refactor we made a decision to acquire buffers the size of the largest memory region so we could read all of the contents in one call. This made error handling very simple (and simpler coding for me!) but had the trade off of large allocations if huge pages were enabled. This patch is one I've had on the back burner for awhile, but we can read and write the Minidump memory sections in discrete chunks which we already do for writing to disk. I had to refactor the error handling a bit, but it remains the same. We make a best effort attempt to read as much of the memory region as possible, but fail immediately if we receive an error writing to disk. I did not add new tests for this because our existing test suite is quite good, but I did manually verify a few Minidumps couldn't read beyond the red_zone. ``` (lldb) reg read $sp rsp = 0x00007fffffffc3b0 (lldb) p/x 0x00007fffffffc3b0 - 128 (long) 0x00007fffffffc330 (lldb) memory read 0x00007fffffffc330 0x7fffffffc330: 60 c3 ff ff ff 7f 00 00 60 cd ff ff ff 7f 00 00 `.......`....... 0x7fffffffc340: 60 c3 ff ff ff 7f 00 00 65 e6 26 00 00 00 00 00 `.......e.&..... (lldb) memory read 0x00007fffffffc329 error: could not parse memory info (Success!) ``` I'm not sure how to quantify the memory improvement other than we would allocate the largest size regardless of the size. So a 2gb unreadable region would cause a 2gb allocation even if we were reading 4096 kb. Now we will take the range size or the max chunk size of 128 mb.	2025-04-08 09:47:52 -07:00
Stephen Tozer	e3d114ceb8	[DebugInfo][Reassociate] Propagate source loc when negating mul factor (#134679 ) As part of RemoveFactorFromExpression, we attempt to remove a factor from a mul/fmul expression; this may involve generating new instructions, e.g. to negate the result if the factor was negative in the original expression. When this happens, the new instructions should have a DebugLoc set from the instruction that the factored expression is being used to compute. Found using https://github.com/llvm/llvm-project/pull/107279.	2025-04-08 17:45:54 +01:00
Simon Pilgrim	46d4c3b1f6	[X86] combineX86ShuffleChain - always prefer VPERMQ/PD for unary subvector shuffles on AVX2+ targets (#134849 ) When combining 2 x 128-bit subvectors, don't assume that if the node is already a X86ISD::VPERM2X128 node then there's nothing to do. Fix issue where if we'd somehow combined to X86ISD::VPERM2X128 (typically if the 2 operands had then simplified to a common operand), we can't canonicalise back to X86ISD::VPERMI on AVX2+ targets. This matches the v4f64/v4i64 shuffle lowering preference for VPERMQ/PD over VPERM2F128/I128.	2025-04-08 17:30:35 +01:00
Dmitry Chestnykh	d6c8e8908d	Rename `F_no_mmap` to `F_mmap` (#134787 ) The `F_no_mmap` flag was introduced by `6814232429`	2025-04-08 19:22:03 +03:00
Thurston Dang	df0ccf6df0	[asan] Disable TestCases/Linux/asan_rt_confict_test-2.cpp to fix build TestCases/Linux/asan_rt_confict_test-2.cpp started failing in https://lab.llvm.org/buildbot/#/builders/66/builds/12265/steps/9/logs/stdio The only change is "[LLD][ELF] Allow merging XO and RX sections, and add --[no-]xosegment flag (#132412)" (`2c1bdd4a08`). Based on the test case (which deliberately tries to mix static and dynamically linked ASan), I suspect it's actually the test case that needs to be fixed (probably with a different error message check). This patch disables TestCases/Linux/asan_rt_confict_test-2.cpp to make the buildbots green while I investigate.	2025-04-08 16:16:22 +00:00
Nikolas Klauser	16d10546d2	[libc++] Remove _LIBCPP_METHOD_TEMPLATE_IMPLICIT_INSTANTIATION_VIS (#111964 ) This macro isn't required if we define all the functions inline. In fact, quite a few of the marked functions have already been inlined. This patch basically only moves code around and adds `_LIBCPP_HIDE_FROM_ABI` to the places where it's been missing so far. This also removes inlining hints, since it drops `inline` in some places, but that shouldn't make much of a difference. The functions tend to be either really small, so should be inlined anyways, or are big enough that they shouldn't be inlined even with an inlinehint.	2025-04-08 18:16:18 +02:00
Matt Arsenault	3f38cd07d8	Revert "Inline: Propagate callsite nofpclass attribute" This reverts commit b0cb672b9968eeee6eb022e98476957dbdf8e6e2. Breaks bot	2025-04-08 23:15:00 +07:00
Fangrui Song	26475f5bdd	[AArch64] Refactor @plt, @gotpcrel, and @AUTH to use parseDataExpr Following PR #132569 (RISC-V), which added `parseDataExpr` for parsing expressions in data directives (e.g., `.word`), this PR migrates AArch64 `@plt`, `@gotpcrel`, and `@AUTH` from the `parsePrimaryExpr` workaround to `parseDataExpr`. The goal is to align with the GNU assembler model, where relocation specifiers apply to the entire operand rather than individual terms, reducing complexity-especially evident in `@AUTH` parsing. Note: AArch64 ELF lacks an official syntax for data directives (#132570). A prefix notation might be a preferable future direction. I recommend `%specifier(expr)`. AsmParser's `@specifier` parsing is suboptimal, necessitating lexer workarounds. `@` might appear multiple times in an operand. We should not use `@` beyond the existing AArch64 Mach-O instruction operands. In the test elf-reloc-ptrauth.s, many errors are now reported at parse time. Pull Request: https://github.com/llvm/llvm-project/pull/134202	2025-04-08 09:09:19 -07:00
Nico Weber	bb7ff134dc	[gn] port 6c74fe9087	2025-04-08 12:03:45 -04:00
Stephen Tozer	84fde791a1	[Reassociate] Apply Debugloc to instrs produced when optimizing add (#134676 ) Currently in Reassociate we may create a set of new instructions when optimizing an `add`, but we do not set DebugLocs on the new instructions; this patch propagates the add's DebugLoc to the new instructions. Found using #107279.	2025-04-08 17:02:16 +01:00
Krzysztof Drewniak	4a7b34d03c	Revert "[AMDGPU] Add buffer.fat.ptr.load.lds intrinsic wrapping raw rsrc version (#133015 )" (#134871 ) This reverts commit d1a05721172272f7aab685b56d99e86814a15bff. There was further discussion on the PR about whether the intrinsics should exist in this form.	2025-04-08 11:00:41 -05:00
Nuno Lopes	b416e7f592	[CI] adjust the undef warning regex so it doesn't catch %undef in .ll files	2025-04-08 16:56:38 +01:00
Matt Arsenault	b0cb672b99	Inline: Propagate callsite nofpclass attribute (#134800) Fixes #134070	2025-04-08 22:53:17 +07:00
Congcong Cai	bd49d278c6	[clang-tidy][NFC] update test name and config for bugprone-unintended-char-ostream-output (#134868 )	2025-04-08 23:46:13 +08:00
tdanyluk	76d2e0881e	[mlir] fix references of attributes which are not defined earlier (#134364 ) If an attribute is not defined earlier in the same file, but just referenced from its dialect directly, then the correct check is currently not emitted. What would it emit for #toy.shape<[1, 2, 3]>: Earlier: // CHECK: #[['?']]<[1, 2, 3]> Now: // CHECK: #toy.shape<[1, 2, 3]>	2025-04-08 17:34:20 +02:00
Christian Sigg	4e9cfcf6af	[llvm][bazel] Fix BUILD after 561506144531cf0a760bb437fd74c683931c60ae.	2025-04-08 17:28:20 +02:00
Sirraide	6c74fe9087	[Clang] [NFC] Tablegen component diags headers (#134777 ) The component diagnostic headers (i.e. `DiagnosticAST.h` and friends) all follow the same format, and there’s enough of them (and in them) to where updating all of them has become rather tedious (at least it was for me while working on #132348), so this patch instead generates all of them (or rather their contents) via Tablegen. Also, it seems that `%enum_select` currently wouldn’t work in `DiagnosticCommonKinds.td` because the infrastructure for that was missing from `DiagnosticIDs.h`; this patch should fix that as well.	2025-04-08 17:21:45 +02:00
Matt Arsenault	34e8f00066	Attributor: Propagate align to cmpxchg instructions (#134838 ) Fixes #134480	2025-04-08 22:15:50 +07:00
Matt Arsenault	66f0343609	Attributor: Propagate align to atomicrmw instructions (#134837 ) Partially fixes #134480	2025-04-08 22:12:20 +07:00
Matt Arsenault	2cf4254466	Attributor: Add baseline tests for propagating align to atomics (#134836 )	2025-04-08 22:08:11 +07:00
Adrian Prantl	5615061445	[dsymutil] Avoid copying binary swiftmodules built from textual (#134719 ) .swiftinterface files into the dSYM bundle. These typically come only from the SDK (since textual interfaces require library evolution) and thus are a waste of space to copy into the bundle. The information about this is being parsed out of the control block, which means duplicating 5 constants from the Swift frontend. If a file cannot be parsed, dsymutil errs on the side of copying the file anyway. rdar://138186524	2025-04-08 08:03:32 -07:00
Matt Arsenault	dfe4d9187c	GCStrategy: Use Twine properly for error message (#132760 )	2025-04-08 21:57:29 +07:00
Christopher McGirr	ae3faea1f2	[MLIR][mlir-opt] move action debugger hook flag (#134842 ) Currently if a developer uses the flag `--mlir-enable-debugger-hook` the debugger hook is not actually enabled. It seems the DebugConfig and the MainMLIROptConfig are not connected. To fix this we can move the `enableDebuggerHook` CL Option to the DebugConfigCLOptions struct so that it can get registered and enabled along with the other debugger flags. AFAICS there are no other uses of the flag so this should be safe. This also adds a small LIT test to check that the hook is enabled by checking the std::cerr output for the log message.	2025-04-08 16:54:11 +02:00
Alan Li	b5045ae9bc	[MLIR][Fix] Fix missing dep in AMDGPUDialect. (#134862 ) Issue introduced in https://github.com/llvm/llvm-project/pull/133498	2025-04-08 10:46:55 -04:00
Michael Liao	4f77e50042	[MLIR][AMDGPU] Fix shared build. NFC	2025-04-08 10:46:15 -04:00
Han-Kuan Chen	2347aa1fcc	[SLP][REVEC] Fix the mismatch between the result of getAltInstrMask and the VecTy argument of TargetTransformInfo::isLegalAltInstr. (#134795 ) We cannot determine ScalarTy from VL because some ScalarTy is determined from VL[0]->getType(), while others are determined from getValueType(VL[0]). Fix "Mask and VecTy are incompatible".	2025-04-08 22:29:11 +08:00
Han-Kuan Chen	97c4cb4d13	[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134763 )	2025-04-08 22:29:03 +08:00
Philip Reames	c1e95b2e5e	[RISCV] Fix matching bug in VLA shuffle lowering (#134750 ) Fix https://github.com/llvm/llvm-project/issues/134126. The matching code was previously written as if we were mutating the indices to replace undef elements with preferred values, but the actual lowering code just took a prefix of the index vector. This resulted in us using undef indices for lanes which should have been defined, resulting in incorrect codegen. Longer term, we probably should rewrite the mask, but this seemed like an easier tactical fix.	2025-04-08 07:20:25 -07:00
Michael Kruse	8b11c39a0f	[llvm-mt] Do not build llvm-mt if not functional (#134631 ) llvm-mt requires libxml2 to work, so do not even build it without libxml2. CMake 3.31 and later prefer llvm-mt.exe over Microsoft's mt.exe if available and using clang-cl.exe as CMAKE_CXX_COMPILER. When CMake picks up llvm-mt.exe without libxml2, any build will fail with the message ``` llvm-mt: error: no libxml2 ``` Any test except `--help` already uses `REQUIRES: libxml2`. There is no point in having a non-functional executable. Not building llvm-mt.exe will force CMake to use Microsoft's `mt.exe` instead. Fixes: #134237	2025-04-08 16:16:53 +02:00
Mircea Trofin	b2dea4fd22	[ctxprof] root autodetection mechanism (#133147 ) This is an optional mechanism that automatically detects roots. It's a best-effort mechanism, and its main goal is to avoid pointing at the message pump function as a root. This is the function that polls message queue(s) in an infinite loop, and is thus a bad root (it never exits). High-level, when collection is requested - which should happen when a server has already been set up and handling requests - we spend a bit of time sampling all the server's threads. Each sample is a stack which we insert in a `PerThreadCallsiteTrie`. After a while, we run for each `PerThreadCallsiteTrie` the root detection logic. We then traverse all the `FunctionData`, find the ones matching the detected roots, and allocate a `ContextRoot` for them. From here, we special case `FunctionData` objects, in `__llvm_ctx_profile_get_context`, that have a `CtxRoot` and route them to `__llvm_ctx_profile_start_context`. For this to work, on the llvm side, we need to have all functions call `__llvm_ctx_profile_release_context` because they _might_ be roots. This comes at a slight (percentages) penalty during collection - which we can afford since the overall technique is ~5x faster than normal instrumentation. We can later explore conditionally enabling autoroot detection and avoiding this penalty, if desired. Note that functions that `musttail call` can't have their return instrumented this way, and a subsequent patch will harden the mechanism against this case. The mechanism could be used in combination with explicit root specification, too.	2025-04-08 06:59:38 -07:00
Shilei Tian	f19c6f23ab	[Clang][AMDGPU] Improve error message when device libraries for COV6 are missing (#134745 ) #130963 switches the default to COV6, which requires ROCm 6.3. Currently, if the device libraries for COV6 are not found, the error message is not very helpful. This PR provides a more informative error message in such cases.	2025-04-08 09:57:43 -04:00
Romaric Jodin	0e98817458	libclc: frexp: fix implementation regarding denormals (#134823 ) Devices that do not support denormals can compare them equal to zero. This leads to results that do not match the CTS expectation, whether or not denormals are supported. For example for 0x1.008p-140 we get {0x1.008p-140, 0} while the CTS expects {0x1.008p-1, -139} when supporting denormals, or {0, 0} when not supporting denormals (flushed to zero). Ref #129871	2025-04-08 14:50:26 +01:00
Christian Sigg	3a6b9b3a87	[mlir][bazel] Fix after dae0ef53a0b99c6c2b74143baee5896e8bc5c8e7 Remove unnecessary include.	2025-04-08 15:47:14 +02:00
Hans Wennborg	35b3886382	[win/arm64] Enable tail call with inreg arguments when possible (#134671 ) Tail calls were disabled from callers with inreg parameters in 5dc8aeb with a fixme to check if the callee also takes an inreg parameter. The issue is that inreg parameters (which are passed in x0 or x1 for free and member functions respectively) are supposed to be returned (in x0) at the end of the function. In case of a tail call, that means the callee needs to return the same value as the caller would. We can check for that case, and it's not as niche as it sounds, as that's how Clang will lower one function with an sret return value calling another, such as: ``` struct T { int x; }; struct S { T foo(); T bar(); }; T S::foo() { return bar(); } // foo's sret argument will get passed directly to bar ``` Fixes #133098	2025-04-08 15:25:28 +02:00
wldfngrs	fdf20941a8	[libc][math] Fix signaling NaN handling for math functions. (#133347 ) Add tests for signaling NaNs, and fix function behavior for handling signaling NaN input. Fixes https://github.com/llvm/llvm-project/issues/124812	2025-04-08 15:23:38 +02:00
Alan Li	dae0ef53a0	[MLIR][AMDGPU] Add a wrapper for global LDS load intrinsics in AMDGPU (#133498 ) Defining a new `amdgpu.global_load` op, which is a thin wrapper around ROCDL `global_load_lds` intrinsic, along with its lowering logic to `rocdl.global.load.lds`.	2025-04-08 09:18:30 -04:00
Nico Weber	94b9d75c6d	[gn] port 65813e0e94c04	2025-04-08 09:16:37 -04:00
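
The chunked copy described in commit f869d6efee (read and write each memory region in bounded pieces, use the smaller of the range size and a 128 MB cap as the buffer size, stop reading on a read failure but fail immediately on a write error) can be sketched roughly as follows. This is only an illustration of the idea under stated assumptions, not LLDB's implementation: `copy_region_in_chunks`, `read_memory`, and `write_out` are hypothetical names, and the read callback is assumed to signal failure with `OSError` or a short or empty result.

```python
# Illustrative sketch only; all names here are hypothetical, not LLDB APIs.
MAX_CHUNK_SIZE = 128 * 1024 * 1024  # the 128 MB cap mentioned in the commit message


def copy_region_in_chunks(read_memory, write_out, start, size,
                          max_chunk=MAX_CHUNK_SIZE):
    """Copy [start, start + size) in bounded chunks; return bytes copied.

    read_memory(addr, n) returns up to n bytes (assumed to raise OSError
    or return fewer bytes when memory is unreadable).
    write_out(data) persists the bytes; its exceptions are not caught,
    so a write failure aborts the copy immediately.
    """
    # The buffer is never larger than the region itself or the cap.
    chunk_size = min(size, max_chunk)
    copied = 0
    while copied < size:
        want = min(chunk_size, size - copied)
        try:
            data = read_memory(start + copied, want)
        except OSError:
            break  # best effort: keep what was read so far
        if not data:
            break  # nothing more is readable
        write_out(data)  # write errors propagate to the caller
        copied += len(data)
        if len(data) < want:
            break  # short read: the rest of the region is unreadable
    return copied


# Usage: a 1024-byte fake address space, copied with a tiny 64-byte cap.
memory = bytes(range(256)) * 4
out = bytearray()
copied = copy_region_in_chunks(lambda addr, n: memory[addr:addr + n],
                               out.extend, 0, 1000, max_chunk=64)
print(copied, bytes(out) == memory[:1000])
```

With this shape, peak allocation is bounded by the cap rather than by the largest region, which is the memory improvement the commit message describes.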

533412 Commits