11443 Commits

Author SHA1 Message Date
bodqhrohro
78c96aa24f
[docs] Fix typo in GettingStarted.rst Unlinke -> Unlike (NFC) (#128616) 2025-02-27 18:50:10 +00:00
Abhilash Majumder
241a56dfad
[NVPTX] Add Intrinsics for applypriority.* (#127989)
\[NVPTX\] Add ApplyPriority intrinsics

This PR adds applypriority.\* intrinsics with relevant eviction
priorities.

* The lowering is handled from nvvm to nvptx tablegen directly.
* Lit tests are added as part of applypriority.ll
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst.

For more information, refer to the PTX ISA

`<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-applypriority>`_.

---------

Co-authored-by: abmajumder <abmajumder@nvidia.com>
2025-02-27 16:10:29 +05:30
Pradeep Kumar
f5ee401545
[LLVM][NVPTX] Add codegen support for tcgen05.{ld, st} instructions (#126740)
This commit adds support for tcgen05.{ld, st} instructions with lit
tests under tcgen05-ld.ll and tcgen05-st.ll and intrinsics documentation
under NVPTXUsage.rst
2025-02-27 14:56:35 +05:30
Omar Hossam
fcc8802133
[Docs] Fix typo in GetElementPtr.rst (#127393)
I couldn't find the verb "indices", and it was actually
a bit confusing for me reading this.
I think this should be "indexes" instead.
2025-02-27 10:21:56 +01:00
Sam Elliott
5066d7b601
[RISCV] Add Xqccmp 0.1 Assembly Support (#128731)
Xqccmp is a new spec by Qualcomm that makes a vendor-specific effort to
solve the push/pop + frame pointers issue. Broadly, it takes the Zcmp
instructions and reverse the order they push/pop registers in, which
ends up matching the frame pointer convention.

This extension adds a new instruction not present in Zcmp,
`qc.cm.pushfp`, which will set `fp` to the incoming `sp` value after it
has pushed the registers.

This change duplicates the Zcmp implementation, with minor changes to
mnemonics (for the `qc.` prefix), predicates, and the addition of
`qc.cm.pushfp`. There is also new logic to prevent combining Xqccmp and
Zcmp. Xqccmp is kept separate to Xqci for decoding/encoding etc, as the
specs are separate today.

Specification:
https://github.com/quic/riscv-unified-db/releases/tag/Xqccmp_extension-0.1.0
2025-02-26 20:03:02 -08:00
YunQiang Su
363b05944f
LangRef: Clarify llvm.minnum and llvm.maxnum about sNaN and signed zero (#112852)
The documents claims that it ignores sNaN, while in the current code it
may be different.

- as the finally callback, it use libc call fmin(3)/fmax(3). while C23
clarifies that fmin(3)/fmax(3) should return NaN for sNaN vs NUM.
- on some architectures, such as aarch64, it converts to `fmaxnm`, which
returns qNaN for sNaN vs NUM.
- on RISC-V (SPEC 2019+), it converts to `fmax`, which returns NUM for
sNaN vs NUM.

Since we have introduced llvm.minimumnum and llvm.maximumnum, which
follow IEEE 754-2019's minimumNumber/maximumNumber.

So, it's time for us to clarify llvm.minnum and llvm.maxnum. Since the
final fallback of llvm.minnum and llvm.maxnum is
fmin(3)/fmax(3), so that it is reasonable to follow the behaviors of
fmin(3)/fmax(3).

Although C23 clarified the behavior about sNaN and +0.0/-0.0:
     (NUM or NaN) vs sNaN -> qNaN
     +0.0 vs -0.0 -> either one of +0.0/-0.0
It is the same the IEEE754-2008's maxNUM and minNUM.
Not all implementation work as expected.
     
Since some architectures such as aarch64/MIPSr6/LoongArch, have
instructions that implements +0.0>-0.0.
So Let's define llvm.minnum and llvm.maxnum to IEEE754-2008 with
+0.0>-0.0.

The architectures without such instructions can implements `NSZ` flavor
to speed up,
and the frontend, such as clang, can call them with `nsz` attribute.
2025-02-27 11:22:30 +08:00
AdityaK
74306afe87
Fix the schedule of vectorizer improvement monthly sync 2025-02-26 10:22:25 -08:00
Justin Bogner
870b376f00
[DirectX] Support the CBufferLoadLegacy operation (#128699)
Fixes #112992
2025-02-26 09:43:30 -08:00
Craig Topper
5d501c6137
[RISCV][Docs] RISCV -> RISC-V in RISCVUsage.rst. NFC (#128906) 2025-02-26 09:30:09 -08:00
Philip Reames
8039f8e139
[RISCV][MC] Add assembler support for XRivosVisni (#128773)
This implements assembler support for the XRivosVisni custom/vendor
extension from Rivos Inc. which is defined in:
https://github.com/rivosinc/rivos-custom-extensions (See
src/xrivosvisni.adoc)

Codegen support will follow in separate changes.
2025-02-26 08:55:35 -08:00
Alex MacLean
6c2e170d04
[NVPTX] Convert vector function nvvm.annotations to attributes (#127736)
Replace some more nvvm.annotations with function attributes,
auto-upgrading the annotations as needed. These new attributes will be
more idiomatic and compile-time efficient than the annotations.

- !"maxntid[xyz]" -> "nvvm.maxntid"
- !"reqntid[xyz]" -> "nvvm.reqntid"
- !"cluster_dim_[xyz]" -> "nvvm.cluster_dim"
2025-02-26 08:45:27 -08:00
Luke Quinn
aace6a2f9d
[RISCV] Xqcia 0.4 The spec was recently updated, this changes the name in the TD files associated and increments the Extension number in the clang driver. This is mostly a MC change as there is no other generated code for these instructions yet.
Signed-off-by: Luke Quinn <quic_lquinn@quicinc.com>
2025-02-26 08:09:20 -05:00
Philip Reames
00f02fed88
[RISCV] Change the vendor prefix for Rivos from "rv." to "ri." (#128761)
There had been concern raised about possible confusion with "rvv". After
internal discussion, we decided to go with an alternate prefix to reduce
possible confusion going forward. The specification document
(https://github.com/rivosinc/rivos-custom-extensions) has been updated.

And also add the XRivosVizip extension to the documentation. I'd missed
that in the initial commit.
2025-02-25 11:27:18 -08:00
Florian Hahn
47822c80c1
[LangRef] Clarify that the pointer after an object must be valid. (#127892)
In some places, we rely on the assumption that the pointer after the
object must also be valid and not overflow, but it does not seem to be
spelled out clearly in LangRef, unless I missed a reference.

The GetElementPtr section mentions that the maximum object size is half
the pointer index type space, but then the pointer past the object may
wrap. Clarify that the pointer after the object must also be valid.

This should match Alive2's semantics:
https://alive2.llvm.org/ce/z/Dk8QFL
(https://github.com/AliveToolkit/alive2/blob/master/tools/transform.cpp#L1288)

PR: https://github.com/llvm/llvm-project/pull/127892
2025-02-24 21:22:59 +00:00
Andy Kaylor
20fd7f0a76
Remove floating-point working group meeting (#128258)
This meeting never quite took off the way I had hoped, and I haven't had
time for it in quite a while, so I am removing it from the Getting
Involved page.
2025-02-24 09:06:44 -08:00
Rahul Joshi
5bddadf783
[CodingStandard] Rework anonymous namespace section to cover visibility more broadly (#126775)
- Rename anonymous namespace section and rework it to
  cover visibility more broadly.
- Add language suggesting restricting visibility as much as
  possible, using various C++ facilities.

---------

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2025-02-24 08:24:54 -08:00
quic_hchandel
538b898a83
[RISCV] Add Qualcomm uC Xqcilia (Large Immediate Arithmetic) extension (#124706)
This extension adds eight 48 bit large arithmetic instructions.

The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest

This patch adds assembler only support.
2025-02-24 08:04:29 -08:00
Dmitry Nechitaev
14f33c6bc1
[llvm-objcopy][mach-o] Fix section finding logic for object files (#127604)
Fix section finding logic for object files.
As by product, make --update-section functional when the input is an object file.

This PR fixes #127495
2025-02-23 11:17:58 -08:00
Tom Stellard
ca0406dd1c
DeveloperPolicy: Update commit access requirements (#127006)
See https://discourse.llvm.org/t/rfc-commit-access-criteria/84073
2025-02-21 14:29:59 -08:00
Carl Ritson
581599096e [AMDGPU] Expand IR Attribute table to handle longer names (NFC) 2025-02-21 16:34:54 +09:00
Dmitry Sidorov
55fa2fa348
[SPIR-V] Add SPV_INTEL_bindless_images extension (#127737)
Adds instructions to convert convert unsigned integer handles to images,
samplers and sampled images.

Spec:

https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_bindless_images.asciidoc

---------

Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
2025-02-20 10:27:15 +01:00
Florian Hahn
2d6330f83a [LangRef] Fix indent for deref assume bundle note added in 65640c1d4c.
Fixes odd whitespace rendering.
2025-02-20 08:55:12 +01:00
Durgadoss R
3ce2e4df5d
[NVPTX] Add tcgen05.cp/shift intrinsics (#127669)
This patch adds intrinsics for tcgen05.cp and
tcgen05.shift instructions.

lit tests are added and verified with a
ptxas-12.8 executable.

Docs are updated in the NVPTXUsage.rst file.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-02-19 17:55:25 +05:30
Fabian Ritter
c442b39770
[AMDGPU][docs][NFC] Replace gfx940 with gfx942 in the gfx940 ISA doc (#126906)
gfx940 and gfx941 are no longer supported. This is the last one of a
series of PRs to remove them from the code base.

The ISA documentation still contains a lot of links and file names with
the "gfx940" identifier. Changing them to "gfx942" is probably not worth
the cost of breaking all URLs to these pages that users might have saved
in the past.

For SWDEV-512631
2025-02-19 10:37:48 +01:00
Fabian Ritter
db597084c5
[AMDGPU][docs] Replace gfx940 and gfx941 with gfx942 in llvm/docs (#126887)
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all documentation occurrences of gfx940/gfx941 except
for the gfx940 ISA description, which will be the subject of a separate
PR.

For SWDEV-512631
2025-02-19 10:31:47 +01:00
Fabian Ritter
8615f9aaff
[AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (#126763)
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all non-documentation occurrences of gfx940/gfx941 from
the llvm directory, and the remaining occurrences in clang.

Documentation changes will follow.

For SWDEV-512631
2025-02-19 10:20:48 +01:00
Krzysztof Drewniak
f7d03707d1
[AMDGPU] Generalize amdgcn.make.buffer.rsrc to fat pointers (#126828)
Attempting to pass a `ptr addrspace(7)` to functions that take `ptr`
arguments produces undesirable `addrspacecast(addrspacecast(p8 x to p7)
to p0) => addrspacecast(p8 x to p0)` folds. This results in illegal GEP
operations on buffer resources, which can't be GEP'd. (However, note
that, while unimplemneted, addressspacecast from ptr addrspace(7) to ptr
is legal - it's just an effective address computation)

To resolve this problem, and thus prevent illegal
`getelementptr T, ptr addrspace(8) %x, ...` s from being produces, this
commit extends amdgcn.make.buffer.rsrc to also be variadic in its result
type, auto-upgrading old manglings.

The logic for handling a make.buffer.rsrc in instruction selection
remains untouched and expects the output type to be a ptr addrspace(8),
as does the Clang lowering for its builtin (the pointer-to-pointer
version might want a different name in clang). LowerBufferFatPointers
has been updated to lower
amdgcn.make.buffer.rsrc.p7.p* to amdgcn.make.buffer.rsrc.p8.p* .

This'll also make exposing buffer fat pointers in Clang easier, since
you don't have to cast between a `__amdgcn_rsrc_t` and a pointer.
2025-02-18 14:15:28 -06:00
Kazu Hirata
5d4eb08379
[Analysis] Remove skipSCC (#127412)
The last use was removed in:

  commit fa6ea7a419f37befbed04368bcb8af4c718facbb
  Author: Arthur Eubanks <aeubanks@google.com>
  Date:   Mon Mar 20 11:18:35 2023 -0700
2025-02-18 09:59:12 -08:00
Paul Walker
df300a4a67 [llvm][docs] Fix typo in Backporting section of GitHub.rst. 2025-02-18 12:39:42 +00:00
Jonas Devlieghere
f71b83b359
[lldb] Add a release note for #127419 2025-02-17 18:29:24 -08:00
Dinu Blanovschi
9c9157b256
Fix typo in LangImpl03.rst (#127389) 2025-02-17 12:11:36 +00:00
Nikita Popov
d8b2e432d6
[IR] Remove mul constant expression (#127046)
Remove support for the mul constant expression, which has previously
already been marked as undesirable. This removes the APIs to create mul
expressions and updates tests to stop using mul expressions.

Part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
2025-02-14 09:28:57 +01:00
Florian Hahn
65640c1d4c
[AssumeBundles] Dereferenceable used in bundle only applies at assume. (#126117)
Update LangRef and code using `Dereferenceable` in assume bundles to
only use the information if it is safe at the point of use.

`Dereferenceable` in an assume bundle is only guaranteed at the point of
the assumption, but may not be guaranteed at later points, because the
pointer may have been freed.

Update code using `Dereferenceable` to only use it if the pointer cannot
be freed. This can further be refined to check if the pointer could be
freed between assume and use.

This follows up on https://github.com/llvm/llvm-project/pull/123196.

With that change, it should be safe to expose dereferenceable
assumptions more widely as in
https://github.com/llvm/llvm-project/pull/121789

PR: https://github.com/llvm/llvm-project/pull/126117
2025-02-13 20:41:23 +01:00
Abhilash Majumder
55f3df875d
[NVPTX] Fix and refine prefetch.* intrinsics (#126899)
This is follow-up PR from #125887  which fixes the intrinsic failures .

---------

Co-authored-by: abmajumder <abmajumder@nvidia.com>
2025-02-13 17:54:01 +01:00
Alex Bradbury
62eddf4792 [docs] Fix typo in HowToAddABuilder 2025-02-13 15:03:52 +00:00
Alex Bradbury
db2953d801
[doc] Add Discord invite link alongside channel links (#126352)
By far the most important part of this patch is updating
GettingInvolved.rst to include the invite link, but I've grepped for any
other discord.com links.

I'm no Discord expert, but from my experience (confirmed via @preames
kindly testing as well) the direct channel links provide a confusing
experience if you haven't already found and used an invite link to the
LLVM Discord server. If you're logged into Discord but not a member of
LLVM's sever, the web app opens and then...nothing. No channel opens, no
prompt to join the server or even a hint that you need to find an invite
link (and if you're not used to Discord, you likely don't even know
that's necessary).

This patch addresses the issue by providing the invite link where
Discord is mentioned.
2025-02-13 15:00:21 +00:00
Antonio Frighetto
ff585feacf [IR][ModRef] Introduce errno memory location
Model C/C++ `errno` macro by adding a corresponding `errno`
memory location kind to the IR. Preliminary work to separate
`errno` writes from other memory accesses, to the benefit of
alias analyses and optimization correctness.

Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
2025-02-13 12:13:39 +01:00
Rahul Joshi
bee9664970
[TableGen] Emit OpName as an enum class instead of a namespace (#125313)
- Change InstrInfoEmitter to emit OpName as an enum class
  instead of an anonymous enum in the OpName namespace.
- This will help clearly distinguish between values that are 
  OpNames vs just operand indices and should help avoid
  bugs due to confusion between the two.
- Rename OpName::OPERAND_LAST to NUM_OPERAND_NAMES.
- Emit declaration of getOperandIdx() along with the OpName
  enum so it doesn't have to be repeated in various headers.
- Also updated AMDGPU, RISCV, and WebAssembly backends
  to conform to the new definition of OpName (mostly
  mechanical changes).
2025-02-12 08:19:30 -08:00
Alex MacLean
a282b6c486
[NVPTX] Convert scalar function nvvm.annotations to attributes (#125908)
Replace some more nvvm.annotations with function attributes,
auto-upgrading the annotations as needed. These new attributes will be
more idiomatic and compile-time efficient than the annotations.

- !"maxclusterrank" / !"cluster_max_blocks" -> "nvvm.maxclusterrank"
- !"minctasm" -> "nvvm.minctasm"
- !"maxnreg" -> "nvvm.maxnreg"
2025-02-12 07:33:22 -08:00
Yingwei Zheng
34534442a8
[Docs][LangRef] Fix broken ref to pointer capture. NFC (#126910) 2025-02-12 23:14:58 +08:00
Paul Walker
563d54569e [NFC][LLVM][LangRef] Fix typos within partial.reduce.add documentation. 2025-02-12 11:51:26 +00:00
Paul Walker
01afa8fc0b
[NFC][LLVM][LangRef] Improve documentation for partial.reduce.add. (#126728) 2025-02-12 11:33:24 +00:00
Stanislav Mekhanoshin
7639242155
[AMDGPU] Create new directive .amdhsa_inst_pref_size (#126622)
The field INST_PREF_SIZE is available since gfx11.
2025-02-11 08:35:45 -08:00
Benjamin Maxwell
701223ac20
[IR] Add llvm.sincospi intrinsic (#125873)
This adds the `llvm.sincospi` intrinsic, legalization, and lowering
(mostly reusing the lowering for sincos and frexp).

The `llvm.sincospi` intrinsic takes a floating-point value and returns
both the sine and cosine of the value multiplied by pi. It computes the
result more accurately than the naive approach of doing the
multiplication ahead of time, especially for large input values.

```
declare { float, float }          @llvm.sincospi.f32(float  %Val)
declare { double, double }        @llvm.sincospi.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincospi.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincospi.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincospi.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincospi.v4f32(<4 x float>  %Val)
```

Currently, the default lowering of this intrinsic relies on the
`sincospi[f|l]` functions being available in the target's runtime (e.g.
libc).
2025-02-11 09:01:30 +00:00
Abhilash Majumder
6a961dc03d
[NVPTX] Add intrinsics for prefetch.* (#125887)
\[NVPTX\] Add Prefetch intrinsics

This PR adds prefetch intrinsics with the relevant eviction priorities.
* Lit tests are added as part of prefetch.ll
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst.

For more information, refer PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu>`_.

---------

Co-authored-by: abmajumder <abmajumder@nvidia.com>
2025-02-11 14:24:46 +05:30
Rahul Joshi
0f674cce82
[NFC][LLVM] Remove unused TargetIntrinsicInfo class (#126003)
Remove `TargetIntrinsicInfo` class as its practically unused (its pure
virtual with no subclasses) and its references in the code.
2025-02-10 14:56:30 -08:00
Nico Weber
308d28667c
[llvm][docs] Tweak backporting instructions a bit (#126519)
* Drop ".Z" in milestone name since we've been doing X.Y releases
instead of X.Y.Z releases since LLVM 18

* Add "LLVM" prefix since that's what release milestones are named

* Use a numbered list to make it clearer that there are two steps
needed, and add some more details to the first step
2025-02-10 10:58:16 -05:00
David Spickett
f845497f3b
[llvm][Docs] Explain how to handle excessive formatting changes (#126239)
Based on some feedback in Discord about a PR where a reviewer asked the
author to move the formatting changes to a new PR, which appears to
contradict the current form of this document.

I've added an explanation here, before the point where the author would
be committing any of the formatting changes.

There are other ways this can go, for example some projects don't want
the churn of formatting, or you can pre-emptively send a formatting PR,
but I don't think enumerating them all here will help the audience for
this text.

So I've recomended one path that will start them off well, and can
branch off if the reviewers make requests.
2025-02-10 10:32:45 +00:00
Nikita Popov
2d31a12dbe
[DSE] Don't use initializes on byval argument (#126259)
There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:

* Don't infer initializes on byval arguments. initializes on byval
refers to the original caller memory (or having both attributes is made
a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
byval refers to the callee copy. This matches the semantics of readonly
on byval. This is slightly more powerful, for example, we could do a
backend optimization where byval + initializes will allocate the full
size of byval on the stack but not copy over the parts covered by
initializes.

I went with the second variant here, skipping byval + initializes in DSE
(FunctionAttrs already doesn't propagate initializes past byval). I'm
open to going in the other direction though.

Fixes https://github.com/llvm/llvm-project/issues/126181.
2025-02-10 10:34:03 +01:00
Durgadoss R
f3040498f0
[NVPTX] Add tcgen05 wait/fence/commit intrinsics (#126091)
This patch adds intrinsics for tcgen05 wait,
fence and commit PTX instructions.

lit tests are added and verified with a
ptxas-12.8 executable.

Docs are updated in the NVPTXUsage.rst file.

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2025-02-07 22:10:25 +05:30