llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-25 01:06:05 +00:00

Author	SHA1	Message	Date
Luke Lau	ffaaaceaa1	[RISCV] Add test for vmv.s.x into a zeroinitializer vector. NFC This is generated by the loop vectorizer for out-of-loop add reductions with some starting value	2025-04-02 16:09:44 +01:00
Simon Pilgrim	3843dfeaf7	[X86] Add demanded elts test coverage for vXi16 VPERMW nodes Requested for #133923	2025-04-02 15:32:01 +01:00
yonghong-song	f99072bd8c	[Clang][BPF] Add tests for btf_type_tag c2x-style attributes (#133666 ) For btf_type_tag implementation, in order to have the same results with clang (__attribute__((btf_type_tag("...")))), gcc intends to use c2x syntax '[[...]]'. Clang also supports similar c2x syntax. Currently, the clang selftest contains the following five tests: ``` attr-btf_type_tag-func.c attr-btf_type_tag-similar-type.c attr-btf_type_tag-var.c attr-btf_type_tag-func-ptr.c attr-btf_type_tag-typedef-field.c ``` Tests attr-btf_type_tag-func.c and attr-btf_type_tag-var.c already have c2x syntax test. Test attr-btf_type_tag-func-ptr.c does not support c2x syntax when '__attribute__((...))' is replaced with '[[...]]'. This should not be an issue since we do not have use cases for function pointer yet. This patch added '[[...]]' syntax for ``` attr-btf_type_tag-similar-type.c attr-btf_type_tag-typedef-field.c ```	2025-04-02 07:31:32 -07:00
Alexey Bataev	48a4b14cb6	[SLP]Fix whole vector registers calculations for compares Need to check that the calculated number of the elements is not larger than the original number of scalars to prevent a compiler crash. Fixes #134013	2025-04-02 07:26:40 -07:00
Yingwei Zheng	65ed35393c	[IR] Add helper `CmpPredicate::dropSameSign` (#134071 ) Address review comment https://github.com/llvm/llvm-project/pull/133711#discussion_r2024519641	2025-04-02 22:25:01 +08:00
Fraser Cormack	f186041553	[libclc] Move sinh, cosh & tanh to the CLC library (#134063 ) This commit also vectorizes the builtins.	2025-04-02 15:22:42 +01:00
Fraser Cormack	d51525ba36	[libclc] Move lgamma, lgamma_r & tgamma to CLC library (#134053 ) Also enable half-precision variants of tgamma, which were previously missing. Note that unlike recent work, these builtins are not vectorized as part of this commit. Ultimately all three call into lgamma_r, which has heavy control flow (including switch statements) that would be difficult to vectorize. Additionally the lgamma_r algorithm is copyrighted to SunPro so may need a rewrite in the future anyway. There are no codegen changes (to non-SPIR-V targets) with this commit, aside from the new half builtins.	2025-04-02 15:20:32 +01:00
Vy Nguyen	87bebd37ff	[LLDB][NFC]Move fields that might be referenced in scope-exit to beginning (#133785 ) Details: The ScopedDispatcher's dtor may reference these fields so we need the fields' dtor to be invoked after the dispatcher's.	2025-04-02 10:19:12 -04:00
Igor Wodiany	200b589a1b	[mlir][spirv] Fix ambiguous conversion between SmallVector and TypeRange (#134087 ) This addresses buildbot failures caused by #133702.	2025-04-02 15:09:36 +01:00
vdonaldson	8a0f694381	[flang] Legacy ASSIGN statement target processing (#133737 ) Like other target statements, the statement associated with the label in a legacy ASSIGN statement could be inside a construct. Constructs containing such a target must therefore be marked as unstructured, fairly similar to how targets are processed in `markBranchTarget`.	2025-04-02 09:52:13 -04:00
David Green	c51b24c36a	[AArch64] Use getVectorInstrCost in div cost The costs of ExtractElement and InsertElement should be obtained via getVectorInstrCost.	2025-04-02 14:51:22 +01:00
Kareem Ergawy	de6c9096ba	[flang][OpenMP] Handle "loop-local values" in `do concurrent` nests (#127635 ) Extends `do concurrent` mapping to handle "loop-local values". A loop-local value is one that is used exclusively inside the loop but allocated outside of it. This usually corresponds to temporary values that are used inside the loop body for initializing other variables, for example. After collecting these values, the pass localizes them to the loop nest by moving their allocations. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635 (this PR)	2025-04-02 15:43:19 +02:00
مهدي شينون (Mehdi Chinoune)	666df54ea6	[flang] Fold double bessel functions on Windows. (#130253 ) There are no functions for `float`. see: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/bessel-functions-j0-j1-jn-y0-y1-yn	2025-04-02 14:43:09 +01:00
Nikolas Klauser	d8d561a388	[libc++][NFC] Remove _LIBCPP_DISABLE_EXTENSION_WARNINGS (#133693 ) We only use `_LIBCPP_DISABLE_EXTENSION_WARNINGS` in a single place while we use extensions all over the place. The warnings are already disabled, since libc++'s headers are system headers, so this shouldn't be in any way observable by users.	2025-04-02 15:40:00 +02:00
Igor Wodiany	2a90631841	[mlir][spirv] Allow yielding values from selection regions (#133702 ) There are cases in SPIR-V shaders where values need to be yielded from the selection region to make valid MLIR. For example (part of the SPIR-V shader decompiled to GLSL): ``` bool _115; if (_107) { // ... float _200 = fma(...); // ... _115 = _200 < _174; } else { _115 = _107; } bool _123; if (_115) { // ... float _213 = fma(...); // ... _123 = _213 < _174; } else { _123 = _115; } ``` This patch extends `mlir.selection` so it can return values. `mlir.merge` is used as a "yield" operation. This maintains compatibility with code that does not yield any values, and preserves the assumption that `mlir.merge` is the only operation in the merge block of the selection region.	2025-04-02 14:35:22 +01:00
Michael Klemm	581f8bccb5	[clang][OpenMP] Fix bug #62099 - use hash value when inode ID cannot be determined (#131646 ) When creating the name of an outlined region, Clang uses the file's inode ID to generate a unique name. When the file does not exist, this causes a fatal abort of the compiler. This PR switches to a hash value that is used instead. --------- Co-authored-by: Michael Kruse <github@meinersbur.de>	2025-04-02 15:09:22 +02:00
Maxim Zhukov	2b7daaf967	[sanitizer][CFI] Add support to build CFI with sanitize-coverage (#131296 ) Added ability to build together with -fsanitize=cfi and -fsanitize-coverage=trace-cmp at the same time.	2025-04-02 16:05:44 +03:00
Jean-Didier PAILLEUX	c309abd925	[flang] Implement !DIR$ NOVECTOR and !DIR$ NOUNROLL[_AND_JAM] (#133885 ) Hi, This patch implements support for the following directives : - `!DIR$ NOUNROLL_AND_JAM` to disable unrolling and jamming on a DO LOOP. - `!DIR$ NOUNROLL` to disable unrolling on a DO LOOP. - `!DIR$ NOVECTOR` to disable vectorization on a DO LOOP.	2025-04-02 14:30:01 +02:00
dianqk	842785adf7	[MachineInstr] Remove the code that was accidentally added in #132536 (NFC)	2025-04-02 20:08:37 +08:00
Ryotaro Kasuga	cf976bfdeb	[LoopInterchange] Add tests for the vectorization profitability (NFC) (#133665 ) There is a problem with the current profitability check for vectorization in LoopInterchange. There are both false positives and false negatives. The former means that the heuristic may say that "an exchange is necessary to vectorize the innermost loop" even though it's already possible. The latter means that the heuristic may miss a case where an exchange is necessary to vectorize the innermost loop. Note that this is not a dependency analysis problem. This is caused by incorrect handling of the dependency matrix in the profitability check, so these problems can occur even if the analysis is accurate (no overestimation). This patch adds tests to clarify the cases that should be fixed. The root cause of these cases is that the heuristic doesn't handle the direction of a dependency correctly.	2025-04-02 21:02:30 +09:00
cor3ntin	14335be078	[Clang][NFC] Minor constraint satisfaction checking cleanup (#134059 ) We had a weird, incorrect, "ConstraintEvaluator" object that was not useful for anything, so I removed that. I also changed the CheckConstraintSatisfaction overload that just took an Expr* as this did not make much sense at all. Satisfaction checking is still fairly wrong, we do not follow the standard that requires we only substitute into the mapping of the normal form, so we produce errors for incorrect substitution into concepts id, even though we should not.	2025-04-02 13:49:48 +02:00
Lyle Dean	a0b75b9d99	[Clang] add emit -Wignored-base-class-qualifiers diagnostic for cv-qualified base classes (#132116 ) Split diagnosing base class qualifiers from the ``-Wignored-qualifiers`` diagnostic group into a new ``-Wignored-base-class-qualifiers`` diagnostic group (which is grouped under ``-Wignored-qualifiers``). Fixes #131935	2025-04-02 07:31:42 -04:00
NAKAMURA Takumi	3cc7148fe0	[bazel] Update for #134043	2025-04-02 20:29:52 +09:00
Aaron Ballman	574e43dffd	[C23] Allow casting from a null pointer constant to nullptr_t (#133742 ) C23 allows a cast of a null pointer constant to nullptr_t. e.g., (nullptr_t)0 or (nullptr_t)(void *)0. Fixes #133644	2025-04-02 07:28:45 -04:00
Akshat Oke	7c4009f21b	[AMDGPU] AMDGPUSetWavePriority: Remove unused variable NFC (#134069 ) Fixes a13a51b91fcb2797cc5a16bd8bc7ad714bb15df6 (#130064)	2025-04-02 16:52:05 +05:30
Aaron Ballman	53f3031005	[C99] Fix definitions of INTn_C macros (#133916 ) C99 introduced macros of the form `INTn_C(v)` which expand to a signed or unsigned integer constant with the specified value `v` and the type `int_leastN_t`. Clang's initial implementation of these macros used token pasting to form the integer constants, but this means that users cannot define a macro named `L`, `U`, `UL`, etc before including `<stdint.h>` (in freestanding mode, where Clang's header is being used) because that could form invalid token pasting results. The new definitions now use the predefined `__INTn_C` macros instead of using token pasting. This matches the behavior of GCC. Fixes #85995	2025-04-02 07:21:15 -04:00
Han-Kuan Chen	5bbcc765cc	[SLP][REVEC] getNumElements should not be used as VF when REVEC is enabled. (#134031 )	2025-04-02 19:04:07 +08:00
Akshat Oke	a13a51b91f	[AMDGPU][NPM] Port AMDGPUSetWavePriority to NPM (#130064 )	2025-04-02 16:28:05 +05:30
Pavel Labath	d7afafdbc4	[lldb] Return const UnwindPlan pointers from FuncUnwinders (#133247 ) These plans are cached and accessed from multiple threads. Modifying them would be a Bad Idea(tm).	2025-04-02 12:48:57 +02:00
Yingwei Zheng	f066d7504e	[Reland][SCEV] teach isImpliedViaOperations about samesign (#133711 ) This patch relands https://github.com/llvm/llvm-project/pull/124270. Closes https://github.com/llvm/llvm-project/issues/126409. The root cause is that we incorrectly preserve the samesign flag after truncating operands of an icmp: https://alive2.llvm.org/ce/z/4NE9gS --------- Co-authored-by: Ramkumar Ramachandra <ramkumar.ramachandra@codasip.com>	2025-04-02 18:45:33 +08:00
Kareem Ergawy	ef56b53712	[flang][OpenMP] Extend `do concurrent` mapping to multi-range loops (#127634 ) Adds support for converting multi-range loops to OpenMP (on the host only for now). The changes here "prepare" a loop nest for collapsing by sinking iteration variables to the innermost `fir.do_loop` op in the nest. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 (this PR) - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 12:43:04 +02:00
Longsheng Mou	7d441d9892	[mlir] Use `dyn_cast` instead of `cast` in MathToVCIX conversion (#134047 ) Fixes #131093.	2025-04-02 18:30:42 +08:00
Simon Pilgrim	2426ac647f	[X86] Add demanded elts for v8f32 VPERMV node Based off #133923 - test to ensure the VPERMV node as only the lower 128-bit source elements are demanded.	2025-04-02 11:18:47 +01:00
Matt Arsenault	54385f5ebe	llvm-reduce: Increase operands-to-args test coverage (#133853 ) This wasn't checking the output for all functions. --match-full-lines is also particularly hazardous for the interestingness checks for avoiding asserts and broken IR. Also add tests for some of the filtered function user types. This wasn't covered, and is overly conservative.	2025-04-02 17:02:53 +07:00
Miro Hrončok	e0f8898e1d	Avoid a race condition in opt-viewer/optrecord (#131214 ) See https://bugzilla.redhat.com/2336915 See https://reviews.llvm.org/D41784?id= See https://github.com/androm3da/optviewer-demo/issues/4#issuecomment-718787822 Fixes https://github.com/llvm/llvm-project/issues/62403. The race condition happened when the demangler_proc was being set. The locking mechanism itself happened too late. This way, the lock always exists (to avoid a race when creating it) and is always used when creating demangler_proc.	2025-04-02 11:52:41 +02:00
Tom Eccles	9b2fd1a6ec	[flang][OpenMP] Bump default OpenMP version to 3.1 (#133745 ) Precise OpenMP standards support information is being documented in #132707 Flang now has good support for OpenMP Version 3.1 and earlier.	2025-04-02 10:43:48 +01:00
Kareem Ergawy	3f8bfc9f7f	[flang][OpenMP] Map simple `do concurrent` loops to OpenMP host constructs (#127633 ) Upstreams one more part of the ROCm `do concurrent` to OpenMP mapping pass. This PR adds support for converting simple loops to the equivalent OpenMP constructs on the host: `omp parallel do`. Towards that end, we have to collect more information about loop nests, for which we add new utils in the `looputils` name space. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 (this PR) - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 11:26:58 +02:00
Luke Lau	8107b430ed	[VPlan] Simplify select c, x, x -> x (#133731 ) As noted in 1a9358c090d0507be21c5e9b2d97a23ef1de8ab0, some simplifications can produce a redundant select where the true and false operands are the same, which this patch removes. The is_fpclass test was changed so the condition wasn't made dead.	2025-04-02 10:26:48 +01:00
Fraser Cormack	dd19e7eaaa	[libclc] Move cbrt to the CLC library; vectorize (#133940 )	2025-04-02 10:18:24 +01:00
Nikita Popov	9356091a98	[GlobalMerge][PPC] Don't merge globals in llvm.metadata section (#131801 ) The llvm.metadata section is not emitted and has special semantics. We should not merge globals in it, similarly to how we already skip merging of `llvm.xyz` globals. Fixes https://github.com/llvm/llvm-project/issues/131394.	2025-04-02 10:40:53 +02:00
Sirraide	10c6ebc427	Reapply "[Clang] [NFC] Introduce a helper for emitting compatibility diagnostics (#132348 )" (#134043 ) This reapplies #132348 with a fix to the python bindings tests, reverting `076397ff32`.	2025-04-02 10:40:05 +02:00
Kareem Ergawy	41d718b1cf	[flang][OpenMP] Upstream `do concurrent` loop-nest detection. (#127595 ) Upstreams the next part of do concurrent to OpenMP mapping pass (from AMD's ROCm implementation). See https://github.com/llvm/llvm-project/pull/126026 for more context. This PR adds loop nest detection logic. This enables us to discover multi-range do concurrent loops and then map them as "collapsed" loop nests to OpenMP. This is a follow up for https://github.com/llvm/llvm-project/pull/126026, only the latest commit is relevant. This is a replacement for https://github.com/llvm/llvm-project/pull/127478 using a `/user/<username>/<branchname>` branch. PR stack: - https://github.com/llvm/llvm-project/pull/126026 - https://github.com/llvm/llvm-project/pull/127595 (this PR) - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 10:12:52 +02:00
Matt Arsenault	cde2ea377d	llvm-reduce: Defer a shouldKeep call in operand reduction (#133387 ) Ideally shouldKeep is only called in contexts that will successfully do something.	2025-04-02 15:00:41 +07:00
Harmen Stoppels	bd788dbf51	[AMDGPU] Remove detection of hip runtime for Spack (#133263 ) There is special logic to detect the hip runtime when llvm is installed with Spack. It works by matching the install prefix of llvm against `llvm-amdgpu-` followed by effectively globbing for ``` <llvm dir>/../hip-x.y.z-/ ``` and checking there is exactly one such directory. I would suggest to remove autodetection for the following reasons: 1. In the Spack ecosystem it's by design that every package lives in its own prefix, and can only know where its dependencies are installed, it has no clue what its dependents are and where they are installed. This heuristic detection breaks that invariant, since `hip` is a dependent of `llvm`, and can be surprising to Spack users. 2. The detection can lead to false positives, since users can be using an llvm installed "upstream" with their own build of hip locally, and they may not realize that clang is picking up upstream hip instead of their local copy. 3. It only works if the directory name is `llvm-amdgpu-*` which happens to be the name of AMD's fork of `llvm`, so it makes no sense that this code lives in the main LLVM repo for which the Spack package name is `llvm`. Feels wrong that LLVM knows about Spack package names, which can change over time. 4. Users can change the install directory structure, meaning that this detection is not robust under config changes in Spack.	2025-04-02 14:52:27 +07:00
Mariya Podchishchaeva	8a691cc615	[MS][clang] Make sure vector deleting dtor calls correct operator delete (#133950 ) During additional testing I spotted that vector deleting dtor calls operator delete, not operator delete[] when performing array deletion. This patch fixes that.	2025-04-02 09:25:43 +02:00
Kareem Ergawy	5d364481e3	[flang][OpenMP] Upstream first part of `do concurrent` mapping (#126026 ) This PR starts the effort to upstream AMD's internal implementation of `do concurrent` to OpenMP mapping. This replaces #77285 since we extended this WIP quite a bit on our fork over the past year. An important part of this PR is a document that describes the current status downstream, the upstreaming status, and next steps to make this pass much more useful. In addition to this document, this PR also contains the skeleton of the pass (no useful transformations are done yet) and some testing for the added command line options. This looks like a huge PR but a lot of the added stuff is documentation. It is also worth noting that the downstream pass has been validated on https://github.com/BerkeleyLab/fiats. For the CPU mapping, this achieved performance speed-ups that match pure OpenMP; for GPU mapping we are still working on extending our support for implicit memory mapping and locality specifiers. PR stack: - https://github.com/llvm/llvm-project/pull/126026 (this PR) - https://github.com/llvm/llvm-project/pull/127595 - https://github.com/llvm/llvm-project/pull/127633 - https://github.com/llvm/llvm-project/pull/127634 - https://github.com/llvm/llvm-project/pull/127635	2025-04-02 09:24:38 +02:00
nawrinsu	730e8a4a59	[OpenMP] Add memory allocation using hwloc (#132843 ) This patch adds support for memory allocation using hwloc. To enable memory allocation using hwloc, env KMP_TOPOLOGY_METHOD=hwloc needs to be used. If hwloc is not supported/available, allocation will fall back to the default path.	2025-04-02 00:17:50 -07:00
Sudharsan Veeravalli	536fe74aaa	[RISCV] Modify register type of extd* Xqcibm instructions (#134027 ) The v0.8 spec specifies that rs1 cannot be x31 (t6) since these instructions operate on a pair of registers (rs1 and rs1 + 1) with no wrap around. The latest spec can be found here: https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.8.0	2025-04-02 12:14:50 +05:30
Matt Arsenault	09e19cfacf	llvm-reduce: Do not reduce alloca array sizes to 0 (#132864 ) Fixes #64340	2025-04-02 13:44:45 +07:00
Ryotaro Kasuga	528e408b94	[LoopInterchange] Add an option to control the cost heuristics applied (#133664 ) LoopInterchange has several heuristic functions to determine if exchanging two loops is profitable or not. Whether or not to use each heuristic and the order in which to use them were fixed, but #125830 allows them to be changed internally at will. This patch adds a new option to control them via the compiler option. The previous patch also added an option to prioritize the vectorization heuristic. This patch also removes it to avoid conflicts between it and the newly introduced one, e.g., both `-loop-interchange-prioritize-vectorization=1` and `-loop-interchange-profitabilities='cache,vectorization'` are specified.	2025-04-02 15:41:40 +09:00
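
The token-pasting hazard described in 53f3031005 ([C99] Fix definitions of INTn_C macros) can be reproduced in a few lines. This is only a sketch in C++: the macro names `JOIN`, `JOIN_IMPL`, `OLD_STYLE_INT_C` and `DIRECT_INT_C` are hypothetical and are not the text of Clang's `<stdint.h>`, and `L` is assumed to be the suffix in play. When the suffix is passed through a two-level join, it is macro-expanded before pasting, so a user macro named `L` changes the result.

```cpp
#include <cassert>
#include <type_traits>

// Two-level join: the arguments of JOIN are fully macro-expanded
// before JOIN_IMPL pastes them together.
#define JOIN_IMPL(a, b) a##b
#define JOIN(a, b) JOIN_IMPL(a, b)

// Hypothetical "old style" constant macro: the suffix travels through JOIN,
// so it is subject to macro expansion.
#define OLD_STYLE_INT_C(v) JOIN(v, L)

// Hypothetical "direct" constant macro: the suffix is an operand of ##,
// so it is pasted as-is and never macro-expanded.
#define DIRECT_INT_C(v) v##L

// A user macro that happens to be named L (assumption for this demo).
#define L 0

constexpr auto old_style_value = OLD_STYLE_INT_C(5);  // 5 ## 0 -> the token 50
constexpr auto direct_value = DIRECT_INT_C(5);        // 5 ## L -> the token 5L

#undef L

// The old-style macro silently produced a different value and type.
static_assert(old_style_value == 50, "suffix was replaced by the user macro");
static_assert(std::is_same<decltype(old_style_value), const int>::value,
              "no long suffix was applied");
static_assert(direct_value == 5, "value is preserved");
static_assert(std::is_same<decltype(direct_value), const long>::value,
              "the L suffix was applied");

// Runtime check mirroring the compile-time assertions above.
void check_int_c_demo() {
  assert(old_style_value == 50);
  assert(direct_value == 5);
}
```

With a user macro such as `#define L 0x10` the old-style form would not compile at all, which is closer to the failure the commit message describes; `0` is used here only so the sketch stays buildable.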
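
Commit 8a691cc615 ([MS][clang] Make sure vector deleting dtor calls correct operator delete) is about which deallocation function runs for array deletion. The expected language-level behavior can be observed with class-specific operators, as in the sketch below. The type `Tracked` and its counters are hypothetical names introduced for illustration; the example does not exercise the Microsoft ABI code path that the commit changes.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type that counts which deallocation function is used.
struct Tracked {
  static int scalar_deletes;
  static int array_deletes;

  ~Tracked() {}  // non-trivial destructor, as in the array-deletion case

  static void* operator new(std::size_t n) { return ::operator new(n); }
  static void* operator new[](std::size_t n) { return ::operator new[](n); }

  static void operator delete(void* p) {
    ++scalar_deletes;
    ::operator delete(p);
  }
  static void operator delete[](void* p) {
    ++array_deletes;
    ::operator delete[](p);
  }
};

int Tracked::scalar_deletes = 0;
int Tracked::array_deletes = 0;

// Performs one scalar deletion and one array deletion.
void run_deletes() {
  delete new Tracked;       // should use Tracked::operator delete
  delete[] new Tracked[3];  // should use Tracked::operator delete[]
}

// Resets the counters, runs the deletions, and checks the counts.
void check_delete_demo() {
  Tracked::scalar_deletes = 0;
  Tracked::array_deletes = 0;
  run_deletes();
  assert(Tracked::scalar_deletes == 1);
  assert(Tracked::array_deletes == 1);
}
```

A conforming implementation increments each counter exactly once; the bug described in the commit would correspond to the array deletion being routed to `operator delete` instead.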
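
For a0b75b9d99 (the `-Wignored-base-class-qualifiers` diagnostic group), the following sketch shows the kind of code involved: a base class named through a cv-qualified alias. The type names are hypothetical. The C++ rules ignore the qualifier on the base class, which is why the derived class can still modify the inherited member; whether a given compiler version warns here depends on its flags and is not assumed.

```cpp
#include <cassert>
#include <type_traits>

struct Base {
  int value = 0;
};

// A cv-qualified alias for the class type (hypothetical name).
using ConstBase = const Base;

// The const on the base class type has no effect: Derived simply derives
// from Base, and the inherited member stays modifiable.
struct Derived : ConstBase {
  void bump() { ++value; }
};

static_assert(std::is_base_of<Base, Derived>::value,
              "the qualifier does not change the base class");

// Modifies the inherited member through a non-const Derived object.
int bump_once() {
  Derived d;
  d.bump();
  return d.value;
}

// Runtime check of the behavior described above.
void check_base_demo() { assert(bump_once() == 1); }
```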

532875 Commits