Tracks the registers that explicit and hidden kernel arguments are
preloaded into, using new code object metadata.
IR arguments may be split across multiple parts by isel, and SGPR tuple
alignment means that an argument may be spread across multiple
registers.
To support this, some of the utilities for hidden kernel arguments are
moved to `AMDGPUArgumentUsageInfo.h`. Additional bookkeeping is also
needed for tracking purposes.
This is crucial when recovering from fatal loader errors. Without it,
the `Lexer` keeps yielding more tokens and the compiler may access
invalid `ASTReader` state.
rdar://133388373
Currently, these breakpoints are accumulated every time a new
process is created (e.g. through a `run`). Depending on the
circumstances, the old breakpoints may even be left enabled, interfering
with subsequent processes. This is addressed by removing the breakpoints
in `ProcessGDBRemote::Clear`.
Note that these breakpoints are more of a PlatformDarwin thing, so in
the future we should look into moving them there.
No longer require -fopenmp or -fopenacc with -E, unless specific version
number options are also required for predefined macros. This means that
most source can be preprocessed with -E and then later compiled with
-fopenmp, -fopenacc, or neither.
Consequently, OpenMP conditional compilation lines (`!$`) are also
passed through to -E output. The tricky part of this patch was handling
the fact that those conditional lines can also contain regular Fortran
line continuation, which now has to be deferred when `!$` lines are
interspersed.
This patch introduces a new option `-preserve-merged-debug-info` to
preserve an arbitrary but deterministic version of debug information
when DILocations are merged. This is intended to be used in production
environments from which sample-based profiles are derived, such as
AutoFDO and MemProf.
With this patch we have seen a 0.2% improvement on an internal workload
at Google when generating AutoFDO profiles. It also significantly helps
MemProf by preserving debug info for merged call instructions used in
the contextual profile.
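A minimal sketch of the idea, assuming a hypothetical
`PreserveMergedDebugInfo` flag for the plumbing (`DILocation::getMergedLocation`
is the existing LLVM API; the selection rule below is illustrative, not the
patch's actual implementation):
```cpp
#include "llvm/IR/DebugInfoMetadata.h"
#include <utility>
using namespace llvm;

static DILocation *mergeLocations(DILocation *A, DILocation *B,
                                  bool PreserveMergedDebugInfo) {
  if (A == B)
    return A;
  if (PreserveMergedDebugInfo) {
    // Keep one of the two locations, chosen by a deterministic rule, so
    // repeated builds agree and sample profiles still see a real source line.
    if (std::make_pair(A->getLine(), A->getColumn()) <=
        std::make_pair(B->getLine(), B->getColumn()))
      return A;
    return B;
  }
  // Default behavior: the merged location typically loses the line number
  // (line 0) when the inputs differ.
  return DILocation::getMergedLocation(A, B);
}
```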
---------
Co-authored-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Compute the result types and bail out before modifying any IR. That is
more efficient when type conversion fails, because no modifications have
to be rolled back.
Note: This is in preparation for the One-Shot Dialect Conversion
refactoring.
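As a rough illustration of the ordering (a minimal sketch; `MyOp`,
`MyLoweredOp`, and the pattern are made up, not the pattern being refactored):
```cpp
#include "mlir/Transforms/DialectConversion.h"
using namespace mlir;

// MyOp and MyLoweredOp are hypothetical ops used only for illustration.
struct MyOpLowering : public OpConversionPattern<MyOp> {
  using OpConversionPattern::OpConversionPattern;

  LogicalResult
  matchAndRewrite(MyOp op, OpAdaptor adaptor,
                  ConversionPatternRewriter &rewriter) const override {
    // 1. Compute all result types up front; on failure nothing has been
    //    modified yet, so nothing needs to be rolled back.
    SmallVector<Type> resultTypes;
    if (failed(getTypeConverter()->convertTypes(op->getResultTypes(),
                                                resultTypes)))
      return failure();

    // 2. Only now start modifying IR.
    rewriter.replaceOpWithNewOp<MyLoweredOp>(op, resultTypes,
                                             adaptor.getOperands());
    return success();
  }
};
```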
When a pending relocation is created, it is also marked as optional or
not. A relocation can be optional when it is added as part of an
optimization (e.g., in `scanExternalRefs`).
When BOLT tries to `flushPendingRelocations`, it safely skips any
optional relocations that cannot be encoded because they are out of
range. A prerequisite for that is the use of the `-force-patch` flag;
otherwise, BOLT bails out with a relevant message.
Background:
BOLT, as part of `scanExternalRefs`, identifies external references from
calls and creates pending relocations for them. When flushed, those
relocations update references to point to the optimized functions.
This optimization can be disabled using `--no-scan`.
BOLT can assert if any of these pending relocations cannot be encoded.
This patch does not disable the optimization; instead, it applies it
selectively when a pending relocation is optional and `-force-patch` is
enabled.
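A heavily simplified sketch of the flushing logic described above (the types
and helpers here are illustrative stand-ins, not BOLT's actual interfaces):
```cpp
#include <cstdint>
#include <functional>
#include <vector>

struct PendingReloc {
  uint64_t Offset;
  int64_t Addend;
  bool Optional; // added speculatively, e.g. by the scanExternalRefs pass
};

// Returns false when a required relocation cannot be encoded; callers would
// then bail out with a relevant error message.
bool flushPendingRelocations(
    const std::vector<PendingReloc> &Relocs, bool ForcePatch,
    const std::function<bool(const PendingReloc &)> &FitsEncoding) {
  for (const PendingReloc &R : Relocs) {
    if (!FitsEncoding(R)) {
      if (R.Optional && ForcePatch)
        continue; // safe to drop: the relocation was only an optimization
      return false;
    }
    // ... encode and write the relocation ...
  }
  return true;
}
```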
After https://github.com/llvm/llvm-project/pull/133220 we had some empty
complex literals (`tensor<0xcomplex<f32>>`) failing to parse.
This was largely due to the ambiguity of `shape.empty()`, which can mean
either a splat (`dense<1>`) or an empty literal (`dense<>`). The type's
element count (numel) is now used to disambiguate during verification.
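A small sketch of the disambiguation rule (a hypothetical helper, not the
actual DenseElementsAttr verification code):
```cpp
#include "llvm/ADT/ArrayRef.h"
#include "mlir/IR/BuiltinTypes.h"

// With no nesting in the parsed literal, decide between a splat (dense<1>)
// and an empty literal (dense<>) using the type's element count.
static bool literalIsSplat(llvm::ArrayRef<int64_t> parsedShape,
                           mlir::ShapedType type) {
  if (!parsedShape.empty())
    return false;                    // an explicit, nested element list
  return type.getNumElements() != 0; // numel == 0 means an empty literal
}
```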
The try-compile mechanism requires that `CMAKE_REQUIRED_FLAGS` is a
space-separated string instead of a list of flags. The original code
expanded `BUILTIN_FLAGS` into `CMAKE_REQUIRED_FLAGS` as a
space-separated string and then would overwrite `CMAKE_REQUIRED_FLAGS`
with `TARGET_${arch}_CFLAGS` prepended to the unexpanded
`BUILTIN_CFLAGS_${arch}`. As a result, only the first two arguments were
passed to the try-compile invocation, and the remaining arguments listed
in `BUILTIN_CFLAGS_${arch}` were dropped.
This patch appends `TARGET_${arch}_CFLAGS` and `BUILTIN_CFLAGS_${arch}` to
`CMAKE_REQUIRED_FLAGS` before expanding `CMAKE_REQUIRED_FLAGS` into a
space-separated string. This passes any pre-set required flags, in addition
to all of the builtin and target flags, to the Float16 detection.
This way we emit the error message that explains the full syntax
for a register list.
parseZcmpStackAdj had to be modified to not assume the previous
operand had been successfully parsed as a register list.
There were some remaining headers that were not guarded with
_LIBCPP_HAS_LOCALIZATION, leading to errors when trying to use modules
on platforms that don't support localization (since all the headers get
pulled in when building the 'std' module). This patch brings these
headers in line with what we do for every other header that depends on
localization.
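The guard pattern in question looks roughly like this (an illustrative
sketch, not a verbatim copy of any libc++ header):
```cpp
// <__locale_dir/some_header> -- sketch only; the header name is made up.
#include <__config>

#if _LIBCPP_HAS_LOCALIZATION

// ... declarations that require localization support ...

#endif // _LIBCPP_HAS_LOCALIZATION
```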
This patch also requires including <picolibc.h> from
<__configuration/platform.h> in order to define _NEWLIB_VERSION. In the
long term, we should use a better approach for doing that, such as
defining a macro in the __config_site header.
The preprocessor can perform macro replacement within identifiers when
they are split up by Fortran line continuation, but it fails to perform
macro replacement on a continued identifier when none of its individual
parts is replaced on its own.
The optional second argument to IEEE_SUPPORT_FLAG (and related functions
from the intrinsic IEEE_ARITHMETIC module) is needed only for its type,
not its value. Restrictions on local objects as arguments to function
references in specification expressions shouldn't apply to it.
Define a new attribute for dummy data object characteristics to
distinguish such arguments, set it for the appropriate intrinsic
function references, and test it during specification expression
validation.
The RUNTIME_CHECK in question doesn't allow for the possibility that an
allocatable or pointer component could be processed by defined I/O.
Remove it in favor of a dynamic allocation check.
Fortran::runtime::Descriptor::BytesFor() only works for Fortran
intrinsic types for which a C++ type counterpart exists, so it crashes
on some types that are legitimate Fortran types like REAL(2). Move some
logic from Evaluate into a new header in flang/Common, then use it to
avoid this needless dependence on C++.
A function or subroutine can allow an object of the same name to appear
in its scope, so long as the name is not used. This is similar to the
case of a name being imported from multiple distinct modules, and it is
implemented with the same representation.
It's not clear whether this is conforming behavior or a common
extension.
Add function and subroutine forms of FSEEK and FTELL as intrinsic
procedures. Accept common aliases from legacy compilers as well.
A separate patch to llvm-test-suite will enable tests for these
procedures once this patch has merged.
Depends on https://github.com/llvm/llvm-project/pull/132423; CI builds
will likely fail until that patch is merged and this PR is rebased.
This is a mostly straightforward replacement of the previous
`std::pair<int, std::set<std::pair<...>>>` data structure used in
`SLPVectorizerPass::vectorizeStores()` with slightly more readable
alternatives.
I had made that change in my local tree to help me better understand the
code. It's not very invasive, so I thought I'd create a PR for it.
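For illustration, the kind of change involved (the field names and element
types here are hypothetical, not the actual ones used in `vectorizeStores()`):
```cpp
#include <set>
#include <tuple>
#include <utility>

// Before: callers must remember what .first and .second mean at each level.
using StoreGroups = std::pair<int, std::set<std::pair<unsigned, int>>>;

// After (sketch): the same data, but every component has a name.
struct StoreDistance {
  unsigned StoreIndex; // hypothetical field
  int Distance;        // hypothetical field
  bool operator<(const StoreDistance &O) const {
    return std::tie(StoreIndex, Distance) < std::tie(O.StoreIndex, O.Distance);
  }
};

struct StoreGroup {
  int BaseIndex;                   // hypothetical field
  std::set<StoreDistance> Members;
};
```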
Add an initial CFG simplification transform, which removes the dead
edges of blocks terminated with BranchOnCond true.
At the moment, this removes the edge between the middle block and the
scalar preheader when folding the tail.
PR: https://github.com/llvm/llvm-project/pull/106748
As far as I understand, the first operand of dbg_declare should be a
pointer (inside a metadata wrapper). However, using a non-pointer is
currently not rejected, and we have some tests that use non-pointer
types. As far as I can tell, these tests either meant to use dbg_value
or are just incorrect hand-crafted tests.
Ran into this while trying to `fix` #134008.
Introduce members() iterator-helper to shorten the members_{begin,end}
idiom. A previous attempt at this patch was #130319, which had to be
reverted due to unit-test failures when attempting to call members() on
the end iterator. In this patch, members() accepts either an ECValue or
an ElemTy, which is more intuitive and doesn't suffer from the same
issue.
Follow up to #134170.
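A usage sketch of the before/after (treat the exact signatures as
approximate; the point is the shorter range-based idiom):
```cpp
#include "llvm/ADT/EquivalenceClasses.h"

void visitClassOf(llvm::EquivalenceClasses<int> &EC, int V) {
  // Before: spell out the iterator pair by hand.
  for (auto MI = EC.findLeader(V), ME = EC.member_end(); MI != ME; ++MI)
    (void)*MI;

  // After: members() yields the same range directly from the element.
  for (int Member : EC.members(V))
    (void)Member;
}
```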
We should be using LLVM intrinsics instead of plain `fir.call`s when we
can. The existing code creates a declaration for the LLVM intrinsic and
a regular `fir.call`, which makes it hard for consumers of the IR to find
all the intrinsic calls.
For vector deleting dtors support, we now also search for and save
`operator delete[]`. Avoid diagnosing a deleted `operator delete[]` when
doing that, because vector deleting dtors are only called when `delete[]`
is present, and whenever `delete[]` is present in the TU it will be
diagnosed correctly.
Fixes https://github.com/llvm/llvm-project/issues/134265
This change exposes a low-level helper that is used to implement
`forEachArgumentWithParamType` but can also be used without matchers,
e.g. if performance is a concern.
Commit f5ee10538b68835112323c241ca7db67ca78bf62 introduced a copy of the
implementation of the `forEachArgumentWithParamType` matcher that was
needed for optimizing performance of `-Wunsafe-buffer-usage`.
This change shares the code between the two so that we do not repeat
ourselves and any bugfixes or changes will be picked up by both
implementations in the future.
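For reference, a hedged sketch of how the matcher itself is typically used
(the newly exposed low-level helper provides the same argument/parameter-type
pairing without going through the matcher framework; its exact signature is
not shown here):
```cpp
#include "clang/ASTMatchers/ASTMatchers.h"
using namespace clang::ast_matchers;

// Bind each call argument together with the type of the parameter it
// initializes.
static const auto CallWithArgParamTypes = callExpr(forEachArgumentWithParamType(
    expr().bind("arg"), qualType().bind("paramType")));
```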
Fixes #134356.
We accidentally skipped checking derived-to-base conversions because
deduction did not strip sugar in the relevant code. This caused
deduction failures when a parameter type had an attribute.
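A hypothetical illustration of the kind of code affected (the specific
attribute spelling below is an assumption for illustration, not taken from
the issue):
```cpp
// Deducing T requires a derived-to-base step, and the parameter's pointer
// type carries attribute sugar (a nullability attribute, chosen only for
// illustration).
template <class T> struct Base {};
struct Derived : Base<int> {};

template <class T> void consume(Base<T> *_Nonnull) {}

void test(Derived *d) {
  consume(d); // should deduce T = int via the derived-to-base conversion
}
```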
Improved "options" sections of various checks:
1. Added Options keyword to be a delimiter between "body" and "options"
parts of docs
2. Added default values where were absent.
3. Changed double-tick to single-tick in default values.
---------
Co-authored-by: EugeneZelenko <eugene.zelenko@gmail.com>
Details: detailed_command_telemetry (bool) and command_id (int) could
already have been freed when the dispatcher's dtor runs, so we should
just copy them into the lambda since they are cheap.
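The general pattern, sketched with hypothetical names (this is not the
actual LLDB telemetry code):
```cpp
#include <functional>

struct Dispatcher {
  bool detailed_command_telemetry = true;
  int command_id = 0;

  std::function<void()> makeExitCallback() {
    // Risky: capturing `this` (or references to members) dangles if the
    // callback runs after the dispatcher has been destroyed.
    //   return [this] { report(detailed_command_telemetry, command_id); };

    // Safe and cheap: copy the two small values into the lambda.
    return [detailed = detailed_command_telemetry, id = command_id] {
      report(detailed, id);
    };
  }

  static void report(bool /*detailed*/, int /*id*/) {}
};
```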
- Moved existing llvm/test/CodeGen/X86/powi.ll file to
llvm/test/CodeGen/X86/powi-const.ll.
- Added new testcases for powi into llvm/test/CodeGen/X86/powi.ll.
The original code is essentially performing isel during legalisation,
with the AArch64-specific nodes offering no additional value compared to
ISD::SETCC.
This patch updates flang to follow clang's behavior when processing the
`-mcode-object-version` option.
The option is now used to populate an LLVM module flag called
`amdhsa_code_object_version`, which is expected by the backend. The patch
also updates the driver to add the `--amdhsa-code-object-version` option
to the frontend invocation for device compilation of AMDGPU targets.
Recently, we have added a set of complex intrinsics for the TMA,
tcgen05, and Cvt families of instructions.
This patch captures the key learnings from our experience
so far and documents them as guidelines for future design.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
This PR fixes a bug in a canonicalization pattern for the operation
`shape.shape_of` (shape of reshape).
```
// Before
func.func @f(%arg0: tensor<?x1xf32>, %arg1: tensor<3xi32>) -> tensor<3xindex> {
%reshape = tensor.reshape %arg0(%arg1) : (tensor<?x1xf32>, tensor<3xi32>) -> tensor<?x1x1xf32>
%0 = shape.shape_of %reshape : tensor<?x1x1xf32> -> tensor<3xindex>
return %0 : tensor<3xindex>
}
// This will error out as follows:
error: 'tensor.cast' op operand type 'tensor<3xi32>' and result type 'tensor<3xindex>' are cast incompatible
%0 = shape.shape_of %reshape : tensor<?x1x1xf32> -> tensor<3xindex>
^
note: see current operation: %0 = "tensor.cast"(%arg1) : (tensor<3xi32>) -> tensor<3xindex>
```
```
// After
func.func @f(%arg0: tensor<?x1xf32>, %arg1: tensor<3xi32>) -> tensor<3xindex> {
%0 = arith.index_cast %arg1 : tensor<3xi32> to tensor<3xindex>
return %0 : tensor<3xindex>
}
```
See file canonicalize.mlir in the change list for an example.
For context, this bug was found while running a test on Keras 3: the
canonicalizer errors out due to an invalid tensor.cast operation when
the batch size is dynamic. The invalid op casts its tensor<3xi32>
operand to tensor<3xindex>.
This change is related to a previous PR:
https://github.com/llvm/llvm-project/pull/98531
---------
Co-authored-by: Alaa Ali <alaaali@ah-alaaali-l.dhcp.mathworks.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>