llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-28 04:16:06 +00:00

Author	SHA1	Message	Date
Steven Perron	756fe54dc7	[SPIRV] Add write to image buffer for shaders. (#115927 ) This commit adds an intrinsic that will write to an image buffer. We chose to match the name of the DXIL intrinsic for simplicity in clang. We cannot reuse the existing openCL write_image function because that is not a reserved name in HLSL. There is not much common code to factor out.	2024-11-18 09:06:05 -05:00
Antonio Frighetto	9fc4654462	[LangRef] Fix mislabeling in calling convention name (NFC) We have explained how musttail can be guaranteed when the calling convention is not `swifttailcc` or `tailcc`, ensure what needs to adhere when it is the opposite case.	2024-11-18 11:02:10 +01:00
Freddy Ye	97836bed63	Reland "[X86] Support -march=diamondrapids (#113881 )" (#116564 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368	2024-11-18 10:40:32 +08:00
Freddy Ye	90e92239bd	Revert "[X86] Support -march=diamondrapids (#113881 )" (#116563 ) This reverts commit 826b845c9e97448395431be3e4e5da585bd98c5e.	2024-11-18 08:45:28 +08:00
Freddy Ye	826b845c9e	[X86] Support -march=diamondrapids (#113881 ) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368	2024-11-18 08:31:17 +08:00
Florian Hahn	c95daac4c1	[LangRef] Spell out alias attribute/metadata violations are UB. (#116220 ) Update the documentation for the noalias attribute, !alias.scope and !loop.parallel_accesses metadata to clarify they are UB on voilation the noalias property. PR: https://github.com/llvm/llvm-project/pull/116220 --------- Co-authored-by: Nuno Lopes <nuno.lopes@tecnico.ulisboa.pt>	2024-11-16 13:38:58 +00:00
Alex Bradbury	298127dcbe	Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583 ) Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the test case. Supersedes the draft PR #94992, taking a different approach following feedback: * Lower in PreISelIntrinsicLowering * Don't require that the number of bytes to set is a compile-time constant * Define llvm.memset_pattern rather than llvm.memset_pattern.inline As discussed in the [RFC thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496), the intent is that the intrinsic will be lowered to loops, a sequence of stores, or libcalls depending on the expected cost and availability of libcalls on the target. Right now, there's just a single lowering path that aims to handle all cases. My intent would be to follow up with additional PRs that add additional optimisations when possible (e.g. when libcalls are available, when arguments are known to be constant etc).	2024-11-15 15:21:39 +00:00
Alex Bradbury	0fb8fac5d6	Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583 )" This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3. Recent scheduling changes means tests need to be re-generated. Reverting to green while I do that.	2024-11-15 14:48:32 +00:00
Alex Bradbury	7ff3a9acd8	[IR] Initial introduction of llvm.experimental.memset_pattern (#97583 ) Supersedes the draft PR #94992, taking a different approach following feedback: * Lower in PreISelIntrinsicLowering * Don't require that the number of bytes to set is a compile-time constant * Define llvm.memset_pattern rather than llvm.memset_pattern.inline As discussed in the [RFC thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496), the intent is that the intrinsic will be lowered to loops, a sequence of stores, or libcalls depending on the expected cost and availability of libcalls on the target. Right now, there's just a single lowering path that aims to handle all cases. My intent would be to follow up with additional PRs that add additional optimisations when possible (e.g. when libcalls are available, when arguments are known to be constant etc).	2024-11-15 14:07:46 +00:00
joaosaffran	bc6c068127	[HLSL] Adding HLSL `clip` function. (#114588 ) Adding HLSL `clip` function. - adding llvm intrinsic - adding sema checks - adding dxil lowering - ading spirv lowering - adding sema tests - adding codegen tests - adding lowering tests Closes #99093 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com>	2024-11-14 23:34:07 -08:00
Matin Raayai	bb3f5e1fed	Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234 ) Following discussions in #110443, and the following earlier discussions in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html, https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine` interface classes. More specifically: 1. Makes `TargetMachine` the only class implemented under `TargetMachine.h` in the `Target` library. 2. `TargetMachine` contains target-specific interface functions that relate to IR/CodeGen/MC constructs, whereas before (at least on paper) it was supposed to have only IR/MC constructs. Any Target that doesn't want to use the independent code generator simply does not implement them, and returns either `false` or `nullptr`. 3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming aims to make the purpose of `LLVMTargetMachine` clearer. Its interface was moved under the CodeGen library, to further emphasis its usage in Targets that use CodeGen directly. 4. Makes `TargetMachine` the only interface used across LLVM and its projects. With these changes, `CodeGenCommonTMImpl` is simply a set of shared function implementations of `TargetMachine`, and CodeGen users don't need to static cast to `LLVMTargetMachine` every time they need a CodeGen-specific feature of the `TargetMachine`. 5. More importantly, does not change any requirements regarding library linking. cc @arsenm @aeubanks	2024-11-14 13:30:05 -08:00
Justin Fargnoli	2e9f8696e9	Reland "[LLVM] Add IRNormalizer Pass" (#113780 ) `IRNormalizer` will reorder instructions. Thus, we need to invalidate analyses. Done in cd500d28cba3177c213f2f2faf50f14ea56e230b. This should resolve the [BuildBot failure](https://github.com/llvm/llvm-project/pull/68176#issuecomment-2428243474). --- Original PR: #68176 Original commit: 1295d2e6da2fe90f3b770ab1d35bf5caecd38bed Reverted with: 8a12e0131f3d84b470fac63af042aa96a1b19f56 --- Add the llvm-canon tool. Description from the [original PR](https://reviews.llvm.org/D66029#change-wZv3yOpDdxIu): > Added a new llvm-canon tool which aims to transform LLVM Modules into a canonical form by reordering and renaming instructions while preserving the same semantics. This tool makes it easier to spot semantic differences while diffing two modules which have undergone different transformation passes. The current version of this tool can: - Reorder instructions within a function. - Rename instructions based on the operands. - Sort commutative operands. This code was originally written by @michalpaszkowski and [submitted to mainline LLVM](`14d358537f`). However, it was quickly [reverted](`335de55fa3`) to do BuildBot errors. Michal presented his version of the tool in [LLVM-Canon: Shooting for Clear Diffs](https://www.youtube.com/watch?v=c9WMijSOEUg). @AidanGoldfarb and I ported the code to the new pass manager, added more tests, and fixed some bugs related to PHI nodes that may have been the root cause of the BuildBot errors that caused the patch to be reverted. Additionally, we rewrote the implementation of instruction reordering to fix cases where the original algorithm would break use-def chains. Note that this is @AidanGoldfarb and I's first time submitting to LLVM. Please liberally critique the PR! CC @plotfi for initial review. --------- Co-authored-by: Aidan <aidan.goldfarb@mail.mcgill.ca>	2024-11-14 09:56:22 -08:00
Graham Hunter	ed5aaddd7b	[IR] Vector extract last active element intrinsic (#113587 ) As discussed in #112738, it may be better to have an intrinsic to represent vector element extracts based on mask bits. This intrinsic is for the case of extracting the last active element, if any, or a default value if the mask is all-false. The target-agnostic SelectionDAG lowering is similar to the IR in #106560.	2024-11-14 17:48:43 +00:00
Diana Picus	2aa6cedfa8	[AMDGPU] Clarify amdgpu.cs.chain + init whole wave. NFC (#115452 ) Add some docs clarifying how inactive lanes are handled in the amdgpu_cs_chain calling convention when the llvm.amdgcn.init.whole.wave intrinsic is used.	2024-11-14 10:10:33 +01:00
Ricardo Jesus	e52238b59f	[AArch64] Add @llvm.experimental.vector.match (#101974 ) This patch introduces an experimental intrinsic for matching the elements of one vector against the elements of another. For AArch64 targets that support SVE2, the intrinsic lowers to a MATCH instruction for supported fixed and scalar vector types.	2024-11-14 09:00:19 +00:00
Rakshit Patel	c63e83f495	[lit] Add --report-failures-only option for lit test reports (#115439 ) - Add option (--report-failures-only) to generate a reduced report for lit tests that only includes failing tests - This is a continuation of proposed patches by @gregbedwell here: - https://reviews.llvm.org/D143516 - https://reviews.llvm.org/D143519 --------- Co-authored-by: Greg Bedwell <greg.bedwell@sony.com> Co-authored-by: James Henderson <James.Henderson@sony.com>	2024-11-13 08:30:33 +00:00
Alex Bradbury	2baead09b2	[docs] Add blank line before bulletpoint list to fix HowToAddABuilder The bulletpoint list wasn't rendering properly due to a missing blank line.	2024-11-13 05:26:02 +00:00
Shilei Tian	de0fd64bed	[AMDGPU] Introduce a new generic target `gfx9-4-generic` (#115190 ) This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.	2024-11-12 23:11:05 -05:00
Alex Bradbury	8da61a3434	[llvm][docs] Expand HowToAddABuilder with guidance on testing locally (#115024 ) With <https://github.com/llvm/llvm-zorg/pull/289> and <https://github.com/llvm/llvm-zorg/pull/293> landed, it's now reasonable to ask people to test their builder configurations locally. This patch adds documentation on how to do so.	2024-11-12 22:02:20 +00:00
Tex Riddell	5c2a133b13	Emit constrained atan2 intrinsic for clang builtin (#113636 ) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 - `Builtins.td` - Add f16 support for libm atan2 builtin - `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin - `clang/test/CodeGenCXX/builtin-calling-conv.cpp` - Use erff instead of atan2 for clang builtin to lib call calling convention check, now that atan2 maps to an intrinsic. - add atan2 cases to llvm.experimental.constrained tests for more backends: ARM, PowerPC, RISCV, SystemZ. - LangRef.rst: add llvm.experimental.constrained.atan2, revise llvm.atan2 description. Last part of Implement the atan2 HLSL Function. Fixes #70096.	2024-11-12 13:34:29 -08:00
Steven Perron	ba572abeb4	[SPIRV] Add reads from image buffer for shaders. (#115178 ) This commit adds an intrinsic that will read from an image buffer. We chose to match the name of the DXIL intrinsic for simplicity in clang. We cannot reuse the existing openCL readimage function because that is not a reserved name in HLSL. I considered trying to refactor generateReadImageInst, so that we could share code between the two implementations. However, most of the code in generateReadImageInst is concerned with trying to figure out which type of image read is being done. Once we factor out the code that will be common, then we end up with just a single call to the MIRBuilder being common.	2024-11-12 14:04:45 -05:00
Fangrui Song	5a094241de	[LangRef] Clarify RISC-V v? constraints Pull Request: https://github.com/llvm/llvm-project/pull/115820	2024-11-12 09:20:54 -08:00
David Spickett	7c04da12f0	[llvm][docs] Add terminology note to Buildbot docs (#115856 ) Choosing another term for this one document would only create confusion, and vendoring Buildbot to change it is a lot of work (as explained in the linked Buildbot issue).	2024-11-12 12:45:43 +00:00
Stephen Tozer	6d23ac1aa2	[DebugInfo] Update policy for when to merge locations (#115349 ) Following discussions on PR #114231 this patch changes the policy on merging locations, making the rule that new instructions should use a merge of the locations of all the instructions whose output is produced by the new instructions; in the case where only one instruction's output is produced, as in most InstCombine optimizations, we use only that instruction's location.	2024-11-12 09:16:59 +00:00
Carlos Alberto Enciso	5e7662efec	[llvm-debuginfo-analyzer] Incorrect DW_AT_call_line/DW_AT_call_file. (#115701 ) The code dealing with DW_AT_call_line/DW_AT_call_file is in the wrong place. The correct functions were call, but with incorrect values: DW_AT_call_line <-- Filename Index DW_AT_call_file <-- Line number	2024-11-11 13:00:24 +00:00
Thorsten Schütt	a5d09f4ad9	[GlobalISel] Add G_STEP_VECTOR instruction (#115598 ) aka llvm.stepvector Intrinsic	2024-11-11 10:45:02 +01:00
Luke Lau	5ca082cdfe	[LangRef] Fix evl type on float VP reduction intrinsics (#115421 ) Looks like a search-and-replace typo	2024-11-11 13:13:08 +08:00
Will	ff0698b258	[LangRef] Fix examples for float to int saturating intrinsics (#115629 ) As per the [LangRef:Simple Constants](https://llvm.org/docs/LangRef.html#simple-constants), exact decimal values of floating-point constants are required. For instance, 23.9 is a repeating decimal in binary and results in the reported error. https://godbolt.org/z/1h7ETPnf6 Fixes #113529.	2024-11-10 16:51:29 +01:00
Durgadoss R	4edd711b4d	[NVPTX] Add TMA bulk tensor prefetch intrinsics (#115527 ) This patch adds NVVM intrinsics and NVPTX codegen for: * cp.async.bulk.tensor.prefetch.1D -> 5D variants, supporting both Tile and Im2Col modes. These intrinsics optionally support cache_hints as indicated by the boolean flag argument. * Lit tests are added for all combinations of these intrinsics in cp-async-bulk-tensor-prefetch.ll. * The generated PTX is verified with a 12.3 ptxas executable. * Added docs for these intrinsics in NVPTXUsage.rst file. * PTX Spec reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async-bulk-prefetch-tensor Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2024-11-10 13:44:42 +05:30
T-Tie	c17a914675	[RISCV] Add Smdbltrp and Ssdbltrp extension (#111837 ) Smdbltrp and Ssdbltrp supports are added in this PR. Specification link(Smdbltrp) : [https://github.com/riscv/riscv-isa-manual/blob/main/src/smdbltrp.adoc](url) Specification link(Ssdbltrp) : [https://github.com/riscv/riscv-isa-manual/blob/main/src/ssdbltrp.adoc](url)	2024-11-08 15:01:51 +08:00
Min-Yih Hsu	e8b70e9744	[TableGen] Make `!and` and `!or` short-circuit (#113963 ) The idea is that by preemptively simplifying the result of `!and` and `!or`, we can fold some of the conditional operators, like `!if` or `!cond`, as early as possible.	2024-11-07 10:22:03 -08:00
Sjoerd Meijer	6720ce75f6	[Docs][llvm-exegesis] Clarify AArch64 support (#114989 ) Claiming AArch64 support for llvm-exegesis is a bit of a stretch in my opinion as only a couple of opcodes with GPR64 operands will work for snippet benchmarking, so I propose to clarify that AArch64 support is very experimental. Also added some clarifications about its libpfm4 dependency.	2024-11-07 10:48:52 +00:00
Durgadoss R	1b01064faa	[NVPTX] Add TMA bulk tensor copy intrinsics (#96083 ) This patch adds NVVM intrinsics and NVPTX codegen for: * cp.async.bulk.tensor.S2G.1D -> 5D variants, supporting both Tile and Im2Col modes. These intrinsics optionally support cache_hints as indicated by the boolean flag argument. * cp.async.bulk.tensor.G2S.1D -> 5D variants, with support for both Tile and Im2Col modes. The Im2Col variants have an extra set of offsets as parameters. These intrinsics optionally support multicast and cache_hints, as indicated by the boolean arguments at the end of the intrinsics. * The backend looks through these flag arguments and lowers to the appropriate PTX instruction. * Lit tests are added for all combinations of these intrinsics in cp-async-bulk-tensor-g2s/s2g.ll. * The generated PTX is verified with a 12.3 ptxas executable. * Added docs for these intrinsics in NVPTXUsage.rst file. * PTX Spec reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async-bulk-tensor Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2024-11-07 15:21:53 +05:30
abhishek-kaushik22	d2aff182d3	Revert "TLS loads opimization (hoist)" (#114740 ) This reverts commit c31014322c0b5ae596da129cbb844fb2198b4ef4. Based on the discussions in #112772, this pass is not needed after the introduction of `llvm.threadlocal.address` intrinsic. Fixes https://github.com/llvm/llvm-project/issues/112771.	2024-11-07 10:10:28 +01:00
Andrzej Warzyński	41248b598b	[docs] Update docs on code-review process (#111735 ) Clarify expectations for handling new comments post-LGTM but pre-commit. This change aims to standardize expectations when new comments are added after a patch has received LGTM but before it has been committed. Currently, approaches to this vary, and this update seeks to clarify best practices.	2024-11-06 07:39:43 +00:00
Rahul Joshi	b8ac87f34a	[LLVM][AsmParser] Add support for C style comments (#111554 ) Add support for C style comments in LLVM assembly. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2024-11-05 13:28:22 -08:00
Matt Arsenault	0b40f97929	AMDGPU: Treat uint32_max as the default value for amdgpu-max-num-workgroups (#113751 ) 0 does not make sense as a value for this to be, much less the default. Also stop emitting each individual field if it is the default, rather than if any element was the default. Also fix the name of the test since it didn't exactly match the real attribute name.	2024-11-05 12:50:44 -08:00
walter erquinigo	e952728f88	[LLDB] Retry Add a target.launch-working-dir setting This retries the PR 113521 skipping a test in a remote environment.	2024-11-05 13:29:51 -05:00
Finn Plummer	3cdac06708	[HLSL][SPIRV][DXIL] Implement `dot4add_i8packed` intrinsic (#113623 ) - create a clang built-in in Builtins.td - link dot4add_i8packed in hlsl_intrinsics.h - add lowering to spirv backend through expansion of operation as OPSDot is missing up to SPIRV 1.6 in SPIRVInstructionSelector.cpp - add lowering to spirv backend using OpSDot in applicable SPIRV version or if SPV_KHR_integer_dot_product is enabled - add dot4add_i8packed intrinsic to IntrinsicsDirectX.td and mapping to DXIL.td op Dot4AddI8Packed - add tests for HLSL intrinsic lowering to dx/spv intrinsic in dot4add_i8packed.hlsl - add tests for sema checks in dot4add_i8packed-errors.hlsl - add test of spir-v lowering in SPIRV/dot4add_i8packed.ll - add test to dxil lowering in DirectX/dot4add_i8packed.ll Resolves #99220	2024-11-05 10:29:08 -08:00
Walter Erquinigo	5d39e0c7e1	Revert "[LLDB] Add a target.launch-working-dir setting" (#114973 ) Reverts llvm/llvm-project#113521 due to build bot failures mentioned in the original PR.	2024-11-05 07:12:20 -05:00
Walter Erquinigo	6620cd2523	[LLDB] Add a target.launch-working-dir setting (#113521 ) Internally we use bazel in a way in which it can drop you in a LLDB session with the target launched in a particular cwd, which is needed for things to work. We've been making this automation work via `process launch -w`. However, if later the user wants to restart the process with `r`, then they end up using a different cwd for relaunching the process. As a way to fix this, I'm adding a target-level setting that allows configuring a default cwd used for launching the process without needing the user to specify it manually.	2024-11-05 06:33:25 -05:00
Carlo Cabrera	6d2f4dd79d	[llvm][docs] update links to `clang-format-diff.py` and `git-clang-format` (#114646 ) Point to github instead of phabricator.	2024-11-05 11:06:54 +01:00
Jay Foad	4831e0aa88	[IR] Disallow recursive types (#114799 ) StructType::setBody is the only mechanism that can potentially create recursion in the type system. Add a runtime check that it is not actually used to create recursion. If the check fails, report an error from LLParser, BitcodeReader and IRLinker. In all other cases assert that the check succeeds. In future StructType::setBody will be removed in favor of specifying the body when the type is created, so any performance hit from this runtime check will be temporary.	2024-11-05 09:41:10 +00:00
Kyungwoo Lee	ffcf3c8688	[CGData][llvm-cgdata] Support for stable function map (#112664 ) This introduces a new cgdata format for stable function maps. The raw data is embedded in the __llvm_merge section during compile time. This data can be read and merged using the llvm-cgdata tool, into an indexed cgdata file. Consequently, the tool is now capable of handling either outlined hash trees, stable function maps, or both, as they are orthogonal. Depends on #112662. This is a patch for https://discourse.llvm.org/t/rfc-global-function-merging/82608.	2024-11-04 17:32:50 -08:00
Louis Dionne	6127724786	[cmake] Remove obsolete files, docs and CMake variables related to the standalone build (#112741 ) The runtimes used to support a build mode called the "Standalone build", which isn't supported anymore (and hasn't been for a few years). However, various places in the code still contained stuff whose only purpose was to support that build mode, and some outdated documentation. This patch cleans that up (although I probably missed some). - Remove HandleOutOfTreeLLVM.cmake which isn't referenced anymore - Remove the LLVM_PATH CMake variable which isn't used anymore - Update some outdated documentation referencing standalone builds	2024-11-04 17:53:38 -05:00
Alex MacLean	ed19ef740b	[NVPTX][docs] Add isspacep.* to usage doc (#114839 )	2024-11-04 12:11:32 -08:00
Jay Foad	f8559751fc	[llvm-project] Fix typo "propogate" (#114795 )	2024-11-04 15:33:19 +00:00
zhijian lin	a51712751c	[PowerPC][LLC] Utilize PPC::getNormalizedPPCTargetCPU() to set CPU (#113943 ) Utilize common API in PPCTargetParser (https://github.com/llvm/llvm-project/pull/97541) to set default CPU with same interfaces for LLC. This will update AIX default CPU to pwr7 and LoP powerppc64 default CPU to ppc64.	2024-11-04 09:40:54 -05:00
Rajat Bajpai	7603feac78	[Documentation] Update parameter and function attribute section in LangRef (#114007 ) Update the documentation for parameter and function attributes to include support for target-dependent string attributes.	2024-11-02 11:19:21 +01:00
Alex MacLean	8ff60c4d47	[NVPTX] Add support for nvvm.flo.[us] intrinsics (#114489 ) Add support for '`llvm.nvvm.flo.[su].*`' intrinsics which correspond to a PTX `bfind` instruction. See [PTX ISA 9.7.1.16. Integer Arithmetic Instructions: bfind] (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#integer-arithmetic-instructions-bfind) The '`llvm.nvvm.flo.u`' family of intrinsics identifies the bit position of the leading one, returning either it's offset from the most or least significant bit. The '`llvm.nvvm.flo.s`' family of intrinsics identifies the bit position of the leading non-sign bit, returning either it's offset from the most or least significant bit.	2024-11-01 16:35:43 -07:00

1 2 3 4 5 ...

11331 Commits