[NVPTX] Add ApplyPriority intrinsics
This PR adds applypriority.* intrinsics with the relevant eviction
priorities.
* The lowering from NVVM to NVPTX is handled directly in tablegen.
* Lit tests are added as part of applypriority.ll
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst.
For more information, refer to the PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-applypriority>`_.
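For reference, a rough IR-level shape of the new intrinsics (the exact
names and signatures are the ones exercised in applypriority.ll; treat
the spellings below as illustrative):
```
; Apply the evict_normal priority to %size bytes starting at the given
; pointer, for the generic and global address spaces respectively.
declare void @llvm.nvvm.applypriority.L2.evict.normal(ptr, i64)
declare void @llvm.nvvm.applypriority.global.L2.evict.normal(ptr addrspace(1), i64)
```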
---------
Co-authored-by: abmajumder <abmajumder@nvidia.com>
This commit adds support for the tcgen05.{ld, st} instructions, with
lit tests under tcgen05-ld.ll and tcgen05-st.ll and intrinsics
documentation in NVPTXUsage.rst.
Xqccmp is a new spec by Qualcomm that makes a vendor-specific effort to
solve the push/pop + frame pointers issue. Broadly, it takes the Zcmp
instructions and reverses the order they push/pop registers in, which
ends up matching the frame pointer convention.
This extension adds a new instruction not present in Zcmp,
`qc.cm.pushfp`, which will set `fp` to the incoming `sp` value after it
has pushed the registers.
This change duplicates the Zcmp implementation, with minor changes to
mnemonics (for the `qc.` prefix), predicates, and the addition of
`qc.cm.pushfp`. There is also new logic to prevent combining Xqccmp and
Zcmp. Xqccmp is kept separate from Xqci for decoding/encoding etc., as the
specs are separate today.
Specification:
https://github.com/quic/riscv-unified-db/releases/tag/Xqccmp_extension-0.1.0
The documentation claims that llvm.minnum and llvm.maxnum ignore sNaN,
while the current code may behave differently:
- As the final fallback, the lowering uses the libc calls
  fmin(3)/fmax(3), and C23 clarifies that fmin(3)/fmax(3) return NaN for
  sNaN vs NUM.
- On some architectures, such as AArch64, it converts to `fmaxnm`, which
  returns qNaN for sNaN vs NUM.
- On RISC-V (2019 spec and later), it converts to `fmax`, which returns
  NUM for sNaN vs NUM.
Now that we have introduced llvm.minimumnum and llvm.maximumnum, which
follow IEEE 754-2019's minimumNumber/maximumNumber, it is time to
clarify llvm.minnum and llvm.maxnum. Since their final fallback is
fmin(3)/fmax(3), it is reasonable to follow the behavior of
fmin(3)/fmax(3).
C23 clarifies the behavior for sNaN and +0.0/-0.0:
(NUM or NaN) vs sNaN -> qNaN
+0.0 vs -0.0 -> either one of +0.0/-0.0
This matches IEEE 754-2008's maxNum and minNum, although not every
implementation works as expected.
Since some architectures, such as AArch64, MIPSr6, and LoongArch, have
instructions that implement +0.0 > -0.0, let's define llvm.minnum and
llvm.maxnum as IEEE 754-2008 with +0.0 > -0.0.
Architectures without such instructions can implement an `NSZ` flavor to
speed things up, and frontends such as Clang can call the intrinsics
with the `nsz` fast-math flag.
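A small IR illustration of the clarified semantics and the `nsz` escape
hatch (the clamp example itself is mine, not taken from the patch):
```
declare float @llvm.maxnum.f32(float, float)
declare float @llvm.minnum.f32(float, float)

define float @clamp01(float %x) {
  ; Under the clarified semantics, +0.0 > -0.0 and an sNaN operand yields
  ; a qNaN result, matching fmin(3)/fmax(3).
  %lo = call float @llvm.maxnum.f32(float %x, float 0.0)
  ; Callers that do not care about signed-zero ordering can relax the
  ; +0.0 > -0.0 requirement with the 'nsz' fast-math flag.
  %hi = call nsz float @llvm.minnum.f32(float %lo, float 1.0)
  ret float %hi
}
```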
This implements assembler support for the XRivosVisni custom/vendor
extension from Rivos Inc., which is defined in:
https://github.com/rivosinc/rivos-custom-extensions (See
src/xrivosvisni.adoc)
Codegen support will follow in separate changes.
Replace some more nvvm.annotations with function attributes,
auto-upgrading the annotations as needed. These new attributes will be
more idiomatic and compile-time efficient than the annotations.
- !"maxntid[xyz]" -> "nvvm.maxntid"
- !"reqntid[xyz]" -> "nvvm.reqntid"
- !"cluster_dim_[xyz]" -> "nvvm.cluster_dim"
There had been concern raised about possible confusion with "rvv". After
internal discussion, we decided to go with an alternate prefix to reduce
possible confusion going forward. The specification document
(https://github.com/rivosinc/rivos-custom-extensions) has been updated.
This also adds the XRivosVizip extension to the documentation; I'd
missed that in the initial commit.
This meeting never quite took off the way I had hoped, and I haven't had
time for it in quite a while, so I am removing it from the Getting
Involved page.
- Rename anonymous namespace section and rework it to
cover visibility more broadly.
- Add language suggesting restricting visibility as much as
possible, using various C++ facilities.
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
This patch adds intrinsics for the tcgen05.cp and
tcgen05.shift instructions.
Lit tests are added and verified with a
ptxas-12.8 executable.
Docs are updated in the NVPTXUsage.rst file.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
gfx940 and gfx941 are no longer supported. This is the last one of a
series of PRs to remove them from the code base.
The ISA documentation still contains a lot of links and file names with
the "gfx940" identifier. Changing them to "gfx942" is probably not worth
the cost of breaking all URLs to these pages that users might have saved
in the past.
For SWDEV-512631
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.
This PR removes all documentation occurrences of gfx940/gfx941 except
for the gfx940 ISA description, which will be the subject of a separate
PR.
For SWDEV-512631
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.
This PR removes all non-documentation occurrences of gfx940/gfx941 from
the llvm directory, and the remaining occurrences in clang.
Documentation changes will follow.
For SWDEV-512631
Attempting to pass a `ptr addrspace(7)` to functions that take `ptr`
arguments produces undesirable `addrspacecast(addrspacecast(p8 x to p7)
to p0) => addrspacecast(p8 x to p0)` folds. This results in illegal GEP
operations on buffer resources, which can't be GEP'd. (However, note
that, while unimplemented, an addrspacecast from ptr addrspace(7) to ptr
is legal - it's just an effective address computation.)
To resolve this problem, and thus prevent illegal
`getelementptr T, ptr addrspace(8) %x, ...` instructions from being
produced, this commit extends amdgcn.make.buffer.rsrc so that it is also
overloaded on its result type, auto-upgrading old manglings.
The logic for handling a make.buffer.rsrc in instruction selection
remains untouched and expects the output type to be a ptr addrspace(8),
as does the Clang lowering for its builtin (the pointer-to-pointer
version might want a different name in Clang). LowerBufferFatPointers
has been updated to lower
amdgcn.make.buffer.rsrc.p7.p* to amdgcn.make.buffer.rsrc.p8.p*.
This'll also make exposing buffer fat pointers in Clang easier, since
you don't have to cast between a `__amdgcn_rsrc_t` and a pointer.
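Roughly, in IR (operand order and types follow the existing
make.buffer.rsrc signature; treat the details as illustrative):
```
; Resource (addrspace 8) result, as before.
declare ptr addrspace(8) @llvm.amdgcn.make.buffer.rsrc.p8.p1(ptr addrspace(1), i16, i32, i32)
; New fat-pointer (addrspace 7) result, so callers can GEP and load directly.
declare ptr addrspace(7) @llvm.amdgcn.make.buffer.rsrc.p7.p1(ptr addrspace(1), i16, i32, i32)

define float @load_elt(ptr addrspace(1) %base, i32 %numrecords) {
  %fat = call ptr addrspace(7) @llvm.amdgcn.make.buffer.rsrc.p7.p1(
      ptr addrspace(1) %base, i16 0, i32 %numrecords, i32 0)
  ; GEPs stay on the fat pointer; LowerBufferFatPointers later rewrites
  ; this in terms of the addrspace(8) resource.
  %elt = getelementptr float, ptr addrspace(7) %fat, i32 4
  %v = load float, ptr addrspace(7) %elt
  ret float %v
}
```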
The last use was removed in:
commit fa6ea7a419f37befbed04368bcb8af4c718facbb
Author: Arthur Eubanks <aeubanks@google.com>
Date: Mon Mar 20 11:18:35 2023 -0700
Update LangRef and code using `Dereferenceable` in assume bundles to
only use the information if it is safe at the point of use.
`Dereferenceable` in an assume bundle is only guaranteed at the point of
the assumption, but may not be guaranteed at later points, because the
pointer may have been freed.
Update code using `Dereferenceable` to only use it if the pointer cannot
be freed. This can further be refined to check if the pointer could be
freed between assume and use.
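For example (a sketch using the standard assume operand bundle syntax):
```
declare void @llvm.assume(i1)

define i32 @f(ptr %p, i1 %c) {
  ; %p is known dereferenceable for 4 bytes at this point...
  call void @llvm.assume(i1 true) ["dereferenceable"(ptr %p, i64 4)]
  br i1 %c, label %later, label %exit

later:
  ; ...but if %p may have been freed on the way here, the assumption must
  ; not be used to speculate or hoist this load.
  %v = load i32, ptr %p
  ret i32 %v

exit:
  ret i32 0
}
```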
This follows up on https://github.com/llvm/llvm-project/pull/123196.
With that change, it should be safe to expose dereferenceable
assumptions more widely as in
https://github.com/llvm/llvm-project/pull/121789
PR: https://github.com/llvm/llvm-project/pull/126117
By far the most important part of this patch is updating
GettingInvolved.rst to include the invite link, but I've grepped for any
other discord.com links.
I'm no Discord expert, but from my experience (confirmed via @preames
kindly testing as well) the direct channel links provide a confusing
experience if you haven't already found and used an invite link to the
LLVM Discord server. If you're logged into Discord but not a member of
LLVM's server, the web app opens and then...nothing. No channel opens, no
prompt to join the server or even a hint that you need to find an invite
link (and if you're not used to Discord, you likely don't even know
that's necessary).
This patch addresses the issue by providing the invite link where
Discord is mentioned.
Model C/C++ `errno` macro by adding a corresponding `errno`
memory location kind to the IR. Preliminary work to separate
`errno` writes from other memory accesses, to the benefit of
alias analyses and optimization correctness.
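A sketch of what this enables at the IR level (the `errnomem` location
name and attribute spelling below are assumptions based on this
description, not a confirmed final syntax):
```
; A libm-style routine that computes its result purely from its argument
; but may set errno: every other memory location is unaffected.
declare double @cbrt(double) memory(errnomem: write)
```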
Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
- Change InstrInfoEmitter to emit OpName as an enum class
instead of an anonymous enum in the OpName namespace.
- This will help clearly distinguish between values that are
OpNames vs just operand indices and should help avoid
bugs due to confusion between the two.
- Rename OpName::OPERAND_LAST to NUM_OPERAND_NAMES.
- Emit declaration of getOperandIdx() along with the OpName
enum so it doesn't have to be repeated in various headers.
- Also updated AMDGPU, RISCV, and WebAssembly backends
to conform to the new definition of OpName (mostly
mechanical changes).
Replace some more nvvm.annotations with function attributes,
auto-upgrading the annotations as needed. These new attributes will be
more idiomatic and compile-time efficient than the annotations.
- !"maxclusterrank" / !"cluster_max_blocks" -> "nvvm.maxclusterrank"
- !"minctasm" -> "nvvm.minctasm"
- !"maxnreg" -> "nvvm.maxnreg"
This adds the `llvm.sincospi` intrinsic, legalization, and lowering
(mostly reusing the lowering for sincos and frexp).
The `llvm.sincospi` intrinsic takes a floating-point value and returns
both the sine and cosine of the value multiplied by pi. It computes the
result more accurately than the naive approach of doing the
multiplication ahead of time, especially for large input values.
```
declare { float, float } @llvm.sincospi.f32(float %Val)
declare { double, double } @llvm.sincospi.f64(double %Val)
declare { x86_fp80, x86_fp80 } @llvm.sincospi.f80(x86_fp80 %Val)
declare { fp128, fp128 } @llvm.sincospi.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 } @llvm.sincospi.ppcf128(ppc_fp128 %Val)
declare { <4 x float>, <4 x float> } @llvm.sincospi.v4f32(<4 x float> %Val)
```
Currently, the default lowering of this intrinsic relies on the
`sincospi[f|l]` functions being available in the target's runtime (e.g.
libc).
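A minimal usage sketch (the function and value names are mine):
```
declare { float, float } @llvm.sincospi.f32(float)

define void @sincospi_f32(float %x, ptr %sinp, ptr %cosp) {
  %r = call { float, float } @llvm.sincospi.f32(float %x)
  %s = extractvalue { float, float } %r, 0
  %c = extractvalue { float, float } %r, 1
  store float %s, ptr %sinp
  store float %c, ptr %cosp
  ret void
}
```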
[NVPTX] Add Prefetch intrinsics
This PR adds prefetch intrinsics with the relevant eviction priorities.
* Lit tests are added as part of prefetch.ll
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst.
For more information, refer to the PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu>`_.
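For reference, a rough IR-level shape of the new intrinsics (the full
set of names and address spaces is defined in prefetch.ll; the spellings
below are illustrative):
```
; Prefetch into L1/L2 for the generic and global address spaces, with an
; optional L2 eviction-priority variant.
declare void @llvm.nvvm.prefetch.L1(ptr)
declare void @llvm.nvvm.prefetch.global.L2(ptr addrspace(1))
declare void @llvm.nvvm.prefetch.global.L2.evict.last(ptr addrspace(1))
```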
---------
Co-authored-by: abmajumder <abmajumder@nvidia.com>
* Drop ".Z" in milestone name since we've been doing X.Y releases
instead of X.Y.Z releases since LLVM 18
* Add "LLVM" prefix since that's what release milestones are named
* Use a numbered list to make it clearer that there are two steps
needed, and add some more details to the first step
Based on some feedback in Discord about a PR where a reviewer asked the
author to move the formatting changes to a new PR, which appears to
contradict the current form of this document.
I've added an explanation here, before the point where the author would
be committing any of the formatting changes.
There are other ways this can go, for example some projects don't want
the churn of formatting, or you can pre-emptively send a formatting PR,
but I don't think enumerating them all here will help the audience for
this text.
So I've recommended one path that will start them off well, and they
can branch off if the reviewers make requests.
There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:
* Don't infer initializes on byval arguments. initializes on byval
refers to the original caller memory (or having both attributes is made
a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
byval refers to the callee copy. This matches the semantics of readonly
on byval. This is slightly more powerful, for example, we could do a
backend optimization where byval + initializes will allocate the full
size of byval on the stack but not copy over the parts covered by
initializes.
I went with the second variant here, skipping byval + initializes in DSE
(FunctionAttrs already doesn't propagate initializes past byval). I'm
open to going in the other direction though.
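A sketch of the case DSE now has to be careful about (the helper and the
exact attribute spelling are illustrative, following my reading of the
LangRef syntax):
```
; With the second variant, 'initializes' on a byval parameter describes
; the callee's private copy, not the caller's memory.
declare void @init(ptr byval([8 x i8]) initializes((0, 8)))

define void @caller() {
  %buf = alloca [8 x i8]
  store i8 1, ptr %buf
  ; The byval copy reads %buf at the call site, so the store above is not
  ; dead even though the callee initializes bytes [0, 8) of its own copy.
  call void @init(ptr byval([8 x i8]) %buf)
  ret void
}
```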
Fixes https://github.com/llvm/llvm-project/issues/126181.
This patch adds intrinsics for the tcgen05 wait,
fence, and commit PTX instructions.
Lit tests are added and verified with a
ptxas-12.8 executable.
Docs are updated in the NVPTXUsage.rst file.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>