llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 09:56:06 +00:00

Author	SHA1	Message	Date
Simon Pilgrim	5e0e04f087	[X86] combineX86ShufflesRecursively - replace Root node argument with opcode/valuetype/ismaskedshuffle data. NFC. (#132437) Preparatory cleanup patch to make it easier for combineX86ShufflesRecursively/combineX86ShuffleChain to handle length changing shuffles further up the shuffle chain than what combineX86ShuffleChainWithExtract can manage. Instead of passing the original Root node, pass the root opcode and the current effective value type (which may have widened as we recurse through EXTRACT_SUBVECTOR/TRUNCATE nodes etc.).	2025-03-25 10:20:23 +00:00
Ramkumar Ramachandra	e8d882a95b	[LV] Audit and fix nits in cl::opts (NFC) (#130601 ) Non-static cl::opts should be under the llvm namespace.	2025-03-25 10:19:45 +00:00
Benjamin Maxwell	107260cc29	[AArch64][SME2] Don't preserve ZT0 around SME ABI routines (#132722 ) This caused ZT0 to be preserved around `__arm_tpidr2_save` in functions with "aarch64_new_zt0". The block in which `__arm_tpidr2_save` is called is added by the SMEABIPass and may be reachable in cases where ZA has not been enabled* (so using `str zt0` is invalid). * (when za_save_buffer is null and num_za_save_slices is zero)	2025-03-25 10:09:25 +00:00
Florian Hahn	9c7e38896f	[VPlan] Split off reduction printing tests, add find-last-IV test. Splits off reduction printing tests, to limit growth and add test case for printing find-last-IV (https://github.com/llvm/llvm-project/pull/132689)	2025-03-25 10:06:28 +00:00
Luke Hutton	d4570ea813	[mlir][tosa] Disallow invalid datatype combinations in the validation pass (#131595) This commit checks whether the operands/results of an operator can be found in the profile compliance mapping; if they cannot, the operator is considered invalid. As a result, operator datatype combinations that are not listed under the "Supported Data Types" of the TOSA specification are disallowed and the validation pass results in failure. Signed-off-by: Luke Hutton <luke.hutton@arm.com>	2025-03-25 10:05:39 +00:00
Akshat Oke	f8e908a0ed	[AMDGPU][NPM] Port SIInsertHardClauses to NPM (#130062 )	2025-03-25 15:33:32 +05:30
Felipe de Azevedo Piovezan	07c82b1622	[lldb] Implement missing queue overloads from ThreadMemory (#132906) This commit makes ThreadMemory a real "forwarder" class by implementing the missing queue methods: they will just call the corresponding backing thread method. To make this patch NFC(*) and not change the behavior of the Python OS plugin, NamedThreadMemoryWithQueue also overrides these methods to simply call the `Thread` method, just as it was doing before. This also makes it obvious that there are missing pieces of this class if it were to provide full queue support. (*) This patch is NFC in the sense that all llvm.org plugins will not have any behavior change, but downstream consumers of ThreadMemory will benefit from the newly implemented forwarding methods.	2025-03-25 06:52:07 -03:00
Felipe de Azevedo Piovezan	65ad02b882	[lldb][NFC] Break ThreadMemory into smaller abstractions (#132905) ThreadMemory attempts to be a class abstracting the notion of a "fake" Thread that is backed by a "real" thread. According to its documentation, it is meant to be a class forwarding most methods to the backing thread, but it does so only for a handful of methods. Along the way, it also tries to represent a Thread that may or may not have a different name, and may or may not have a different queue from the backing thread. This can be problematic for a couple of reasons: 1. It forces all users into this optional behavior. 2. The forwarding behavior is incomplete: not all methods are currently being forwarded properly. Some of them involve queues and seem to have been intentionally left unimplemented. This commit creates the following separation: ThreadMemory <- ThreadMemoryProvidingName <- ThreadMemoryProvidingNameAndQueue. ThreadMemory captures the notion of a backed thread that _really_ forwards all methods to the backing thread. (Missing methods should be implemented in a later commit, and allowing them to be implemented without changing behavior of other derived classes is the main purpose of this refactor.) ThreadMemoryProvidingName is a ThreadMemory that allows users to override the thread name. If a name is present, it is used; otherwise the forwarding behavior is used. ThreadMemoryProvidingNameAndQueue is a ThreadMemoryProvidingName that allows users to override queue information. If queue information is present, it is used; otherwise, the forwarding behavior is used. With this separation, we can more explicitly implement missing methods of the base class and override them, if needed, in ThreadMemoryProvidingNameAndQueue. But this commit really is NFC: no new methods are implemented and no method implementation is changed.	2025-03-25 06:50:52 -03:00
Dhruv Srivastava	e6e8252ba0	[lldb][AIX] Minor AIX specific changes (#132718 ) This PR is in reference to porting LLDB on AIX. Link to discussions on llvm discourse and github: 1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640 2. https://github.com/llvm/llvm-project/issues/101657 The complete changes for porting are present in this draft PR: https://github.com/llvm/llvm-project/pull/102601 AIX build specific changes	2025-03-25 15:16:23 +05:30
Simon Pilgrim	6984cfea6c	[X86] Ensure concat(blendi(),blendi()) -> vselect() uses legal select mask types For 256-bit selections, we could be using sub-i8/vXi8 selection condition masks - extend these to i8 and then extract the lowest mask subvector Fixes #132844	2025-03-25 09:14:08 +00:00
Fraser Cormack	d46a699953	[libclc] Move asin/acos/atan to the CLC library (#132788) This commit simultaneously moves these three functions to the CLC library and optimizes them for vector types by avoiding scalarization.	2025-03-25 09:11:32 +00:00
Martin Storsjö	64779455b8	Revert "[YAML][NFC] precommit wrong test case (#131782)" This reverts commit cb4ae35de0b4c19149379f16c7b279d80a669f9d. That commit broke compilation with GCC: ../unittests/Support/YAMLIOTest.cpp:1280:20: error: explicit specialization of ‘template<class T> struct llvm::yaml::MappingTraits’ outside its namespace must use a nested-name-specifier [-fpermissive] 1280 | template <> struct MappingTraits<V> { | ^~~~~~~~~~~~~~~~	2025-03-25 10:36:14 +02:00
Martin Storsjö	b33bec9b21	Revert "[SLP] Make getSameOpcode support interchangeable instructions. (#127450)" This reverts commit 71a0cfd93263552ddc0bfd2ea7b0abe9a578f87e. This commit triggers failed asserts when compiling ffmpeg. The issue is reproducible with a small standalone reproducer like this: void make_filters_from_proto(int filter[][2], int bands) { int c, q, n; for (;; q++) { n = 0; for (; n < 7; n++) { int theta = (q * (n - 6) + (n >> 1) - 3) % bands; if (theta) c = theta; filter[q][n][0] = c; } } } $ clang -target x86_64-linux-gnu -c repro.c -O3 clang: ../lib/Transforms/Vectorize/SLPVectorizer.cpp:989: llvm::SmallVector<llvm::Value*> {anonymous}::BinOpSameOpcodeHelper::InterchangeableInfo::getOperand(llvm::Instruction*) const: Assertion `FromCIValue.isZero() && "Cannot convert the instruction."' failed. The same issue also reproduces for a large number of other target triples, aarch64-linux-gnu and others.	2025-03-25 10:22:44 +02:00
Martin Storsjö	dd059338a2	Revert "[Vectorize] Fix a warning" This reverts commit 4c68061254c896214b7ad5ab807ac4ba11517812. Reverting as part of a revert of a preceding commit.	2025-03-25 10:21:05 +02:00
Aiden Grossman	e696f4e500	[llvm-exegesis] Fix LBR checks/test This patch fixes the LBR check in the local lit config. The test would segfault as the loop body would be basically empty, causing a divide by zero error. More investigation is needed there so we do not actually hit that assertion and report a cleaner error somewhere. Specifying an actual opcode to benchmark fixes the problem. The test would also fail as -mcpu was set to the default x86 CPU rather than the one currently being run on, so it would always fail to find a perf counter. This patch fixes that by simply removing the -mcpu flag. Given these issues, I'm not sure these tests have ever run in the ~5 years they have been in tree. There were some issues reported in #132861, so I guess we'll see if there are further issues when the testing becomes more broad.	2025-03-25 08:10:58 +00:00
Ricardo Jesus	847e46ca01	[AArch64] Add initial support for -mcpu=olympus. (#132368 ) This patch adds support for the NVIDIA Olympus core. This does not add any special tuning decisions, and those may come later.	2025-03-25 08:09:04 +00:00
Timm Baeder	9b060d1e6a	[clang][bytecode] Fix zero-init of atomic floating point objects (#132782 ) We can't pass the AtomicType along to ASTContext::getFloatTypeSemantics.	2025-03-25 08:05:04 +01:00
Congcong Cai	cb4ae35de0	[YAML][NFC] precommit wrong test case (#131782 )	2025-03-25 14:44:12 +08:00
Timm Baeder	bcedb368e3	[clang][bytecode] Support composite arrays in memcpy op (#132775 ) See the attached test case.	2025-03-25 07:17:10 +01:00
Timm Bäder	1e2ad6793a	Revert "[clang][bytecode] Implement __builtin_{wcscmp,wcsncmp} (#132723 )" This reverts commit f7aea4d081f77dba48b0fc019f59b678fb679aa8. This broke the clang-solaris11-sparcv9 builder: https://lab.llvm.org/buildbot/#/builders/13/builds/6151	2025-03-25 07:15:30 +01:00
Kazu Hirata	fac8fe9cf9	[Target] Use Set::insert_range (NFC) (#132879) We can use Set::insert_range to collapse: for (auto Elem : Range) Set.insert(Elem); down to: Set.insert_range(Range); In some cases, we can further fold that into the set declaration.	2025-03-24 22:42:04 -07:00
Kazu Hirata	75210df5a2	[AMDGPU] Avoid repeated map lookups (NFC) (#132877 )	2025-03-24 22:41:27 -07:00
Paul Schwabauer	cca0f8113e	[PATCH] [clang][modules] Fix serialization and de-serialization of PCH module file refs (#105994 ) (#132802 ) The File ID is incorrectly calculated, resulting in an out-of-bounds access. The test code is more complex because the File fetching only happens in specific scenarios. --------- Co-authored-by: ShaderKeeper <no-reply@shaderkeeper.com> Co-authored-by: Chuanqi Xu <yedeng.yd@linux.alibaba.com>	2025-03-25 13:24:21 +08:00
Jim Lin	26a52f828d	[RISCV] RISCVInstrInfoSFB.td shouldn't be included in Vendor extensions section. NFC. RISCVInstrInfoSFB.td is for Short Forward Branch, not a kind of Vendor extension.	2025-03-25 13:00:34 +08:00
Sameer Sahasrabuddhe	f6a3cd54bc	[clang] ``noconvergent`` does not affect calls to convergent functions (#132701 ) When placed on a function, the ``clang::noconvergent`` attribute ensures that the function is not assumed to be convergent. But the same attribute has no effect on function calls. A call is convergent if the callee is convergent. This is based on the fact that in LLVM, a call always inherits all the attributes of the callee. Only ``convergent`` is an attribute in LLVM IR, and there is no equivalent of ``clang::noconvergent``.	2025-03-25 10:44:08 +05:30
Fangrui Song	9ee950be95	MCValue: Simplify code with getSubSym The MCValue::SymB MCSymbolRefExpr member might be replaced with a MCSymbol in the future. Reduce direct access.	2025-03-24 21:52:40 -07:00
Lang Hames	473b059505	[ORC] Add ExecutorAddrRange::fromPtrRange convenience method. This can be used to construct an ExecutorAddrRange from a pair of pointers, or a pointer and a size. This will be used to reduce boilerplate in an upcoming patch.	2025-03-25 15:45:48 +11:00
Matt Arsenault	37b5f77f8b	llvm-reduce: Fix asserting on TargetExtTypes that do not support zeroinit (#132733 ) So far I've been unsuccessful in finding an example where the used constant value is directly observed in the output. This avoids an assert in an intermediate step of value replacement.	2025-03-25 11:40:55 +07:00
Matt Arsenault	bfb549ff33	llvm-reduce: Fix operand reduction asserting on target ext types (#132732 ) Not all TargetExtTypes support zeroinit, so use poison as a substitute if unavailable.	2025-03-25 11:38:04 +07:00
Rahul Joshi	eeb4132b8d	[NFC] Fix macro redefinition warning in NewPMDriver.cpp (#132854 )	2025-03-24 20:16:48 -07:00
tangaac	a6d366268d	[LoongArch] Pre-commit tests for vector shift (#132702 )	2025-03-25 10:31:54 +08:00
Owen Pan	da7f1564a8	[clang-format] Don't wrap before attributes in parameter lists (#132519 ) Fix #132240	2025-03-24 19:18:13 -07:00
Chuanqi Xu	e5f100676e	[clangd] [C++20] [Modules] Add modules suffix for 'Header' Source Switch (#131591) Support the trivial "header"/source switch for module interfaces. I initially thought the naming was bad and we should rename it. But later I felt it is better to split patches as much as possible. From the code it looks like there are problems, e.g., `isHeaderFile`. But let's try to fix them in different patches.	2025-03-25 09:43:13 +08:00
Valentin Clement (バレンタインクレメン)	5be9082fed	[flang][cuda] Carry over the dynamic shared memory size to gpu.launch_func (#132837 )	2025-03-24 18:37:19 -07:00
Kazu Hirata	4c68061254	[Vectorize] Fix a warning This patch fixes: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:855:52: error: unused variable 'SupportedOp' [-Werror,-Wunused-const-variable]	2025-03-24 17:38:47 -07:00
Paul Kirth	c1ed4f6423	[clang-doc] Format test files (#132428) Many of the test files had inconsistent formatting. This patch ran clang-format over them using the project's .clang-format file, with column limit = 0, to prevent test directives from being split over multiple lines.	2025-03-24 17:27:16 -07:00
Han-Kuan Chen	71a0cfd932	[SLP] Make getSameOpcode support interchangeable instructions. (#127450 ) We use the term "interchangeable instructions" to refer to different operators that have the same meaning (e.g., `add x, 0` is equivalent to `mul x, 1`). Non-constant values are not supported, as they may incur high costs with little benefit. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>	2025-03-25 08:24:46 +08:00
Paul Kirth	ece59a8cb9	Reland Support for mustache templating language (#132467) The last version of this patch had memory leaks due to using the BumpPtrAllocator for data types that required destructors to run to release heap memory (e.g. via std::vector and std::string). This version avoids that by using smart pointers, and dropping support for BumpPtrAllocator. We should refactor this code to use the BumpPtrAllocator again, but that can be addressed in future patches, since those are more invasive changes that need to refactor many of the core data types to avoid owning allocations. Adds Support for the Mustache Templating Language. See specs here: https://mustache.github.io/mustache.5.html This patch implements support+tests for the majority of the features of the language including: - Variables - Comments - Lambdas - Sections This is meant as a library to support places where we have to generate HTML, such as in clang-doc. Co-authored-by: Peter Chou <peter.chou@mail.utoronto.ca>	2025-03-24 17:23:25 -07:00
Joseph Huber	25bf4e262c	[Offload] Remove handling for COV4 binaries from offload/ (#131033 ) Summary: We moved from cov4 to cov5 a long time ago, and it guards simplifying some front end code, so we should be able to move up with this.	2025-03-24 18:58:20 -05:00
Shilei Tian	ff8aa300d6	[AMDGPU] Remove outdated COV6 warning (#132814 )	2025-03-24 19:57:07 -04:00
Paul Kirth	0aa4c35beb	[libc][__support] Fix -Wimplicit-int-conversion warning (#132839 ) Newer versions of clang now warn about these, so use explicit conversion instead.	2025-03-24 16:47:07 -07:00
David Benjamin	e6de45a229	[tsan] Don't treat uncontended pthread_once as a potentially blocking region (#132477 ) guard_acquire is a helper function used to implement TSan's __cxa_guard_acquire and pthread_once interceptors. https://reviews.llvm.org/D54664 introduced optional hooks to support cooperative multi-threading. It worked by marking the entire guard_acquire call as a potentially blocking region. In principle, only the contended case needs to be a potentially blocking region. This didn't matter for __cxa_guard_acquire because the compiler emits an inline fast path before calling __cxa_guard_acquire. That is, once we call __cxa_guard_acquire at all, we know we're in the contended case. https://reviews.llvm.org/D107359 then unified the __cxa_guard_acquire and pthread_once interceptors, adding the hooks to pthread_once. However, unlike __cxa_guard_acquire, pthread_once callers are not expected to have an inline fast path. The fast path is inside the function. As a result, TSan unnecessarily calls into the cooperative multi-threading engine on every pthread_once call, despite applications generally expecting pthread_once to be fast after initialization. Fix this by deferring the hooks to the contended case inside guard_acquire.	2025-03-24 19:30:15 -04:00
Joseph Huber	ef2735d243	[Flang] Detect endianness in the preprocessor (#132767 ) Summary: Currently we use `TestBigEndian` in CMake to determine endianness. This doesn't work on all platforms and is deprecated since CMake 3.20. Instead of using CMake, we can just use the GNU/Clang preprocessor definitions. The only difficulty is MSVC, mostly because they don't support the same macros. But, as far as I'm aware, MSVC / Windows targets are always little endian, and if not we can just override it for that specific target in the future.	2025-03-24 18:29:05 -05:00
Krzysztof Parzyszek	c221d64206	[flang] Remove mentions of evaluate::Variable<T> (#132805 ) The template itself was not defined anywhere. The closest thing was a forward declaration in flang/include/flang/Evaluate/variable.h.	2025-03-24 18:26:57 -05:00
Thurston Dang	3ce3d889f6	[asan] Re-exec without ASLR if needed on 64-bit Linux (#132682 ) This generalizes https://github.com/llvm/llvm-project/pull/131975 to non-32-bit Linux (i.e., 64-bit Linux). This works around an edge case in 64-bit Linux, whereby the memory layout is incompatible if the stack size is unlimited AND ASLR entropy is 31+ bits (see https://github.com/google/sanitizers/issues/856#issuecomment-2747076811). More generally, this "re-exec without ASLR if layout is incompatible" is a hammer that can work around most shadow mapping issues, without incurring the overhead of using a dynamic shadow.	2025-03-24 16:24:38 -07:00
joaosaffran	567b0f8923	[HLSL] Add support to branch/flatten attributes to switch (#131739 ) closes: [#125754](https://github.com/llvm/llvm-project/issues/125754) --------- Co-authored-by: joaosaffran <joao.saffran@microsoft.com>	2025-03-24 16:17:19 -07:00
Jannick Kremer	20fc2d5aa5	[libclang/python] Add some bindings to the `Cursor` interface (#132377 ) Make Cursor hashable Add `Cursor.has_attr()` Add `Cursor.specialized_template` This covers the Cursor interface changes added by #120590 --------- Co-authored-by: Mathias Stearn <redbeard0531@gmail.com>	2025-03-25 00:08:32 +01:00
Sarah Spall	14d25613cb	[HLSL] Finish exposing half types and intrinsics always (#132804 ) We previously made an implementation error when adding half overloads for HLSL library functionality. The half type is always defined in HLSL and half intrinsics should not be conditionally included. When native 16-bit types are disabled half is a unique 32-bit float type with lesser promotion rank than float. Apply pattern #81782 to intrinsics added in #95999. Closes #132793	2025-03-24 15:34:58 -07:00
LLVM GN Syncbot	0adc672ed4	[gn build] Port 9a82f742b497	2025-03-24 21:56:36 +00:00
Helena Kotas	9a82f742b4	[HLSL][NFC] Refactor HLSLExternalSemaSource (#131032) Moving builder classes into separate files `HLSLBuiltinTypeDeclBuilder.cpp`/`.h`, changing some `HLSLExternalSemaSource` methods to private and removing unused methods. This is prep work before we start adding more builtin types and methods, like textures, resource constructors or matrices. For example constructors could make use of the `BuiltinTypeMethodBuilder`, but this helper class was defined in `HLSLExternalSemaSource.cpp` after the method that creates a constructor. Rather than reshuffling the code in one big source file I am moving the builders into a separate cpp & header file and placing the helper class declarations up top. Currently the new header only exposes `BuiltinTypeDeclBuilder` to `HLSLExternalSemaSource`. In the future we might decide to expose more helper classes as needed.	2025-03-24 14:56:05 -07:00
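
For reference, the approach described in ef2735d243 (detecting endianness with GNU/Clang preprocessor definitions instead of CMake's `TestBigEndian`) might look roughly like the sketch below. The macro name `SKETCH_LITTLE_ENDIAN` and both helper names are hypothetical and are not taken from the Flang sources; the MSVC branch simply encodes the commit message's assumption that MSVC / Windows targets are little endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Compile-time detection using the GNU/Clang predefined byte-order macros.
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    defined(__ORDER_BIG_ENDIAN__)
#  if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#    define SKETCH_LITTLE_ENDIAN 1   // hypothetical macro name
#  elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#    define SKETCH_LITTLE_ENDIAN 0
#  else
#    error "unsupported byte order"
#  endif
#elif defined(_MSC_VER)
// Assumption stated in the commit message: MSVC targets are little endian.
#  define SKETCH_LITTLE_ENDIAN 1
#else
#  error "cannot determine endianness from the preprocessor"
#endif

// Compile-time answer derived from the macros above.
constexpr bool isLittleEndianHost = SKETCH_LITTLE_ENDIAN != 0;

// Runtime cross-check: look at the first byte in memory of the value 1.
inline bool runtimeIsLittleEndian() {
  const std::uint32_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}
```

The runtime helper is only there to sanity-check the macro-based result when building natively; it is not part of the technique the commit describes.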

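
The loop-to-`insert_range` rewrite described in fac8fe9cf9 can be illustrated with a small stand-in container. `SketchSet` below is a hypothetical wrapper around `std::set` written for this illustration only; it is not LLVM's set implementation, and it models nothing beyond `insert`, `insert_range`, `contains` and `size`.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a set type that offers insert_range.
template <typename T> class SketchSet {
  std::set<T> Storage;

public:
  void insert(const T &V) { Storage.insert(V); }

  // Insert every element of a range in one call.
  template <typename Range> void insert_range(const Range &R) {
    Storage.insert(R.begin(), R.end());
  }

  bool contains(const T &V) const { return Storage.count(V) != 0; }
  std::size_t size() const { return Storage.size(); }
};

// Before: element-by-element loop.
inline SketchSet<int> collectWithLoop(const std::vector<int> &Range) {
  SketchSet<int> Set;
  for (int Elem : Range)
    Set.insert(Elem);
  return Set;
}

// After: the loop collapses into a single insert_range call.
inline SketchSet<int> collectWithInsertRange(const std::vector<int> &Range) {
  SketchSet<int> Set;
  Set.insert_range(Range);
  return Set;
}
```

Both helpers produce the same set contents; the second form only removes the boilerplate loop.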

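
Commit 473b059505 describes a convenience constructor that builds an address range from a pair of pointers, or from a pointer and a size. The sketch below shows the general shape of such a helper using a hypothetical `SketchAddrRange` type; it is not the ORC `ExecutorAddrRange` class, and it assumes the "size" argument is an element count, which the commit message does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical address-range type for illustration only.
struct SketchAddrRange {
  std::uintptr_t Start = 0;
  std::uintptr_t End = 0;

  // Build a range from a [First, Last) pair of pointers.
  template <typename T>
  static SketchAddrRange fromPtrRange(const T *First, const T *Last) {
    return {reinterpret_cast<std::uintptr_t>(First),
            reinterpret_cast<std::uintptr_t>(Last)};
  }

  // Build a range from a pointer and an element count (assumed meaning of
  // "size" in this sketch).
  template <typename T>
  static SketchAddrRange fromPtrRange(const T *First, std::size_t Count) {
    return fromPtrRange(First, First + Count);
  }

  std::size_t sizeInBytes() const { return End - Start; }
};
```

Either overload replaces the two explicit pointer-to-address conversions a caller would otherwise write by hand.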
532151 Commits