11216 Commits

Author SHA1 Message Date
Rakshit Patel
c63e83f495
[lit] Add --report-failures-only option for lit test reports (#115439)
- Add option (--report-failures-only) to generate a reduced report for
lit tests that only includes failing tests
- This is a continuation of proposed patches by @gregbedwell here:
    - https://reviews.llvm.org/D143516
    - https://reviews.llvm.org/D143519

---------

Co-authored-by: Greg Bedwell <greg.bedwell@sony.com>
Co-authored-by: James Henderson <James.Henderson@sony.com>
2024-11-13 08:30:33 +00:00
Alex Bradbury
2baead09b2 [docs] Add blank line before bulletpoint list to fix HowToAddABuilder
The bulletpoint list wasn't rendering properly due to a missing blank
line.
2024-11-13 05:26:02 +00:00
Shilei Tian
de0fd64bed
[AMDGPU] Introduce a new generic target gfx9-4-generic (#115190)
This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.
2024-11-12 23:11:05 -05:00
Alex Bradbury
8da61a3434
[llvm][docs] Expand HowToAddABuilder with guidance on testing locally (#115024)
With <https://github.com/llvm/llvm-zorg/pull/289> and <https://github.com/llvm/llvm-zorg/pull/293> landed, it's now reasonable to ask people to test their builder configurations locally. This patch adds documentation on how to do so.
2024-11-12 22:02:20 +00:00
Tex Riddell
5c2a133b13
Emit constrained atan2 intrinsic for clang builtin (#113636)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constraint atan2 intrinsic for clang builtin
- `clang/test/CodeGenCXX/builtin-calling-conv.cpp` - Use erff instead of
atan2 for clang builtin to lib call calling convention check, now that
atan2 maps to an intrinsic.
- add atan2 cases to llvm.experimental.constrained tests for more
backends: ARM, PowerPC, RISCV, SystemZ.
- LangRef.rst: add llvm.experimental.constrained.atan2, revise
llvm.atan2 description.

Last part of Implement the atan2 HLSL Function. Fixes #70096.
2024-11-12 13:34:29 -08:00
Steven Perron
ba572abeb4
[SPIRV] Add reads from image buffer for shaders. (#115178)
This commit adds an intrinsic that will read from an image buffer. We
chose to match the name of the DXIL intrinsic for simplicity in clang.

We cannot reuse the existing openCL readimage function because that is
not a reserved name in HLSL.

I considered trying to refactor generateReadImageInst, so that we could
share code between the two implementations. However, most of the code in
generateReadImageInst is concerned with trying to figure out which type
of image read is being done. Once we factor out the code that will be
common, then we end up with just a single call to the MIRBuilder being
common.
2024-11-12 14:04:45 -05:00
Fangrui Song
5a094241de
[LangRef] Clarify RISC-V v? constraints
Pull Request: https://github.com/llvm/llvm-project/pull/115820
2024-11-12 09:20:54 -08:00
David Spickett
7c04da12f0
[llvm][docs] Add terminology note to Buildbot docs (#115856)
Choosing another term for this one document would only create confusion,
and vendoring Buildbot to change it is a lot of work (as explained in
the linked Buildbot issue).
2024-11-12 12:45:43 +00:00
Stephen Tozer
6d23ac1aa2
[DebugInfo] Update policy for when to merge locations (#115349)
Following discussions on PR #114231 this patch changes the policy on
merging locations, making the rule that new instructions should use a
merge of the locations of all the instructions whose output is produced
by the new instructions; in the case where only one instruction's output
is produced, as in most InstCombine optimizations, we use only that
instruction's location.
2024-11-12 09:16:59 +00:00
Carlos Alberto Enciso
5e7662efec
[llvm-debuginfo-analyzer] Incorrect DW_AT_call_line/DW_AT_call_file. (#115701)
The code dealing with DW_AT_call_line/DW_AT_call_file is in the wrong
place. The correct functions were call, but with incorrect values:
  DW_AT_call_line <-- Filename Index
  DW_AT_call_file <-- Line number
2024-11-11 13:00:24 +00:00
Thorsten Schütt
a5d09f4ad9
[GlobalISel] Add G_STEP_VECTOR instruction (#115598)
aka llvm.stepvector Intrinsic
2024-11-11 10:45:02 +01:00
Luke Lau
5ca082cdfe
[LangRef] Fix evl type on float VP reduction intrinsics (#115421)
Looks like a search-and-replace typo
2024-11-11 13:13:08 +08:00
Will
ff0698b258
[LangRef] Fix examples for float to int saturating intrinsics (#115629)
As per the [LangRef:Simple
Constants](https://llvm.org/docs/LangRef.html#simple-constants), exact
decimal values of floating-point constants are required. For instance,
23.9 is a repeating decimal in binary and results in the reported error.

https://godbolt.org/z/1h7ETPnf6

Fixes #113529.
2024-11-10 16:51:29 +01:00
Durgadoss R
4edd711b4d
[NVPTX] Add TMA bulk tensor prefetch intrinsics (#115527)
This patch adds NVVM intrinsics and NVPTX codegen for:
* cp.async.bulk.tensor.prefetch.1D -> 5D variants, supporting both Tile
  and Im2Col modes. These intrinsics optionally support cache_hints as
  indicated by the boolean flag argument.
* Lit tests are added for all combinations of these intrinsics in cp-async-bulk-tensor-prefetch.ll.
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst file.
* PTX Spec reference: 
  https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async-bulk-prefetch-tensor

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-11-10 13:44:42 +05:30
T-Tie
c17a914675
[RISCV] Add Smdbltrp and Ssdbltrp extension (#111837)
Smdbltrp and Ssdbltrp supports are added in this PR.
Specification link(Smdbltrp) :
[https://github.com/riscv/riscv-isa-manual/blob/main/src/smdbltrp.adoc](url)
Specification link(Ssdbltrp) :
[https://github.com/riscv/riscv-isa-manual/blob/main/src/ssdbltrp.adoc](url)
2024-11-08 15:01:51 +08:00
Min-Yih Hsu
e8b70e9744
[TableGen] Make !and and !or short-circuit (#113963)
The idea is that by preemptively simplifying the result of `!and` and `!or`, we can fold
some of the conditional operators, like `!if` or `!cond`, as early as
possible.
2024-11-07 10:22:03 -08:00
Sjoerd Meijer
6720ce75f6
[Docs][llvm-exegesis] Clarify AArch64 support (#114989)
Claiming AArch64 support for llvm-exegesis is a bit of a stretch in my
opinion as only a couple of opcodes with GPR64 operands will work for
snippet benchmarking, so I propose to clarify that AArch64 support is
very experimental. Also added some clarifications about its libpfm4
dependency.
2024-11-07 10:48:52 +00:00
Durgadoss R
1b01064faa
[NVPTX] Add TMA bulk tensor copy intrinsics (#96083)
This patch adds NVVM intrinsics and NVPTX codegen for:
* cp.async.bulk.tensor.S2G.1D -> 5D variants, supporting both Tile and
   Im2Col modes. These intrinsics optionally support cache_hints as
   indicated by the boolean flag argument.
* cp.async.bulk.tensor.G2S.1D -> 5D variants, with support for both Tile
  and Im2Col modes. The Im2Col variants have an extra set of offsets as
  parameters. These intrinsics optionally support multicast and cache_hints,
  as indicated by the boolean arguments at the end of the intrinsics.
* The backend looks through these flag arguments and lowers to the
   appropriate PTX instruction.
* Lit tests are added for all combinations of these intrinsics in
  cp-async-bulk-tensor-g2s/s2g.ll.
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst file.
* PTX Spec reference:
  https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async-bulk-tensor

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
2024-11-07 15:21:53 +05:30
abhishek-kaushik22
d2aff182d3
Revert "TLS loads opimization (hoist)" (#114740)
This reverts commit c31014322c0b5ae596da129cbb844fb2198b4ef4.

Based on the discussions in #112772, this pass is not needed after the
introduction of `llvm.threadlocal.address` intrinsic.

Fixes https://github.com/llvm/llvm-project/issues/112771.
2024-11-07 10:10:28 +01:00
Andrzej Warzyński
41248b598b
[docs] Update docs on code-review process (#111735)
Clarify expectations for handling new comments post-LGTM but pre-commit.

This change aims to standardize expectations when new comments are added
after a patch has received LGTM but before it has been committed.
Currently, approaches to this vary, and this update seeks to clarify
best practices.
2024-11-06 07:39:43 +00:00
Rahul Joshi
b8ac87f34a
[LLVM][AsmParser] Add support for C style comments (#111554)
Add support for C style comments in LLVM assembly.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-11-05 13:28:22 -08:00
Matt Arsenault
0b40f97929
AMDGPU: Treat uint32_max as the default value for amdgpu-max-num-workgroups (#113751)
0 does not make sense as a value for this to be, much less the default.
Also stop emitting each individual field if it is the default, rather than
if any element was the default. Also fix the name of the test since it didn't
exactly match the real attribute name.
2024-11-05 12:50:44 -08:00
walter erquinigo
e952728f88 [LLDB] Retry Add a target.launch-working-dir setting
This retries the PR 113521 skipping a test in a remote environment.
2024-11-05 13:29:51 -05:00
Finn Plummer
3cdac06708
[HLSL][SPIRV][DXIL] Implement dot4add_i8packed intrinsic (#113623)
- create a clang built-in in Builtins.td
- link dot4add_i8packed in hlsl_intrinsics.h
- add lowering to spirv backend through expansion of operation as OPSDot
is missing up to SPIRV 1.6 in SPIRVInstructionSelector.cpp
- add lowering to spirv backend using OpSDot in applicable SPIRV version
or if SPV_KHR_integer_dot_product is enabled
- add dot4add_i8packed intrinsic to IntrinsicsDirectX.td and mapping to
DXIL.td op Dot4AddI8Packed

- add tests for HLSL intrinsic lowering to dx/spv intrinsic in
dot4add_i8packed.hlsl
- add tests for sema checks in dot4add_i8packed-errors.hlsl
- add test of spir-v lowering in SPIRV/dot4add_i8packed.ll
- add test to dxil lowering in DirectX/dot4add_i8packed.ll
    
 Resolves #99220
2024-11-05 10:29:08 -08:00
Walter Erquinigo
5d39e0c7e1
Revert "[LLDB] Add a target.launch-working-dir setting" (#114973)
Reverts llvm/llvm-project#113521 due to build bot failures mentioned in
the original PR.
2024-11-05 07:12:20 -05:00
Walter Erquinigo
6620cd2523
[LLDB] Add a target.launch-working-dir setting (#113521)
Internally we use bazel in a way in which it can drop you in a LLDB
session with the target launched in a particular cwd, which is needed
for things to work. We've been making this automation work via `process
launch -w`. However, if later the user wants to restart the process with
`r`, then they end up using a different cwd for relaunching the process.
As a way to fix this, I'm adding a target-level setting that allows
configuring a default cwd used for launching the process without needing
the user to specify it manually.
2024-11-05 06:33:25 -05:00
Carlo Cabrera
6d2f4dd79d
[llvm][docs] update links to clang-format-diff.py and git-clang-format (#114646)
Point to github instead of phabricator.
2024-11-05 11:06:54 +01:00
Jay Foad
4831e0aa88
[IR] Disallow recursive types (#114799)
StructType::setBody is the only mechanism that can potentially create
recursion in the type system. Add a runtime check that it is not
actually used to create recursion.

If the check fails, report an error from LLParser, BitcodeReader and
IRLinker. In all other cases assert that the check succeeds.

In future StructType::setBody will be removed in favor of specifying the
body when the type is created, so any performance hit from this runtime
check will be temporary.
2024-11-05 09:41:10 +00:00
Kyungwoo Lee
ffcf3c8688
[CGData][llvm-cgdata] Support for stable function map (#112664)
This introduces a new cgdata format for stable function maps. The raw
data is embedded in the __llvm_merge section during compile time. This
data can be read and merged using the llvm-cgdata tool, into an indexed
cgdata file. Consequently, the tool is now capable of handling either
outlined hash trees, stable function maps, or both, as they are
orthogonal.

Depends on #112662.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-04 17:32:50 -08:00
Louis Dionne
6127724786
[cmake] Remove obsolete files, docs and CMake variables related to the standalone build (#112741)
The runtimes used to support a build mode called the "Standalone build",
which isn't supported anymore (and hasn't been for a few years).
However, various places in the code still contained stuff whose only
purpose was to support that build mode, and some outdated documentation.
This patch cleans that up (although I probably missed some).

- Remove HandleOutOfTreeLLVM.cmake which isn't referenced anymore
- Remove the LLVM_PATH CMake variable which isn't used anymore
- Update some outdated documentation referencing standalone builds
2024-11-04 17:53:38 -05:00
Alex MacLean
ed19ef740b
[NVPTX][docs] Add isspacep.* to usage doc (#114839) 2024-11-04 12:11:32 -08:00
Jay Foad
f8559751fc
[llvm-project] Fix typo "propogate" (#114795) 2024-11-04 15:33:19 +00:00
zhijian lin
a51712751c
[PowerPC][LLC] Utilize PPC::getNormalizedPPCTargetCPU() to set CPU (#113943)
Utilize common API in PPCTargetParser
(https://github.com/llvm/llvm-project/pull/97541) to set default CPU
with same interfaces for LLC.
This will update AIX default CPU to pwr7 and LoP powerppc64 default CPU
to ppc64.
2024-11-04 09:40:54 -05:00
Rajat Bajpai
7603feac78
[Documentation] Update parameter and function attribute section in LangRef (#114007)
Update the documentation for parameter and function attributes to
include support for target-dependent string attributes.
2024-11-02 11:19:21 +01:00
Alex MacLean
8ff60c4d47
[NVPTX] Add support for nvvm.flo.[us] intrinsics (#114489)
Add support for '`llvm.nvvm.flo.[su].*`' intrinsics which correspond to
a PTX `bfind` instruction.
See [PTX ISA 9.7.1.16. Integer Arithmetic Instructions: bfind]
(https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#integer-arithmetic-instructions-bfind)

The '`llvm.nvvm.flo.u`' family of intrinsics identifies the bit position
of the leading one, returning either it's offset from the most or least
significant bit.

The '`llvm.nvvm.flo.s`' family of intrinsics identifies the bit position
of the leading non-sign bit, returning either it's offset from the most
or least significant bit.
2024-11-01 16:35:43 -07:00
David Spickett
a1c6dc223e
[llvm][docs] Add Approvals section to GitHub guide (#113434)
Based on feedback that when reading the document as a guide, it's odd
that it skips right from updating the PR to merging it.

The section is a link to the existing Code Review guide's text on the
topic.

I have updated that to mention required reviewers, which some
subprojects do use (libcxx is one) but most don't.

Also we use the words "accepted" and "approved" interchangeably, so I've
swapped one instance so it's consistent between paragraphs.
2024-10-31 15:24:33 +00:00
Aaron Ballman
0ab44fd246
Replace documentation mentions of IRC with Discord (#114276)
This does not touch code owners or credits files that list IRC handles,
that can be done separately if we want to make that change.

See
https://discourse.llvm.org/t/rfc-remove-irc-as-a-recommended-communication-channel/82808/3
for the RFC.
2024-10-31 09:22:46 -04:00
Jan Svoboda
3f17613509
[docs] Point to Discourse for creating RFCs (#114341) 2024-10-31 05:35:57 -07:00
Wanyi
efc6d33be9
[lldb] Fix write only file action to truncate the file (#112657)
When `FileAction` opens file with write access, it doesn't clear the
file nor append to the end of the file if it already exists. Instead, it
writes from cursor index 0.

For example, by using the settings `target.output-path` and
`target.error-path`, lldb will redirect process stdout/stderr to files.
It then calls this function to write to the files which the above
symptoms appear.

## Test
- Added unit test checking the file flags
- Added 2 api tests checking
  - File content overwritten if the file path already exists
- Stdout and stderr redirection to the same file doesn't change its
behavior
2024-10-29 14:22:51 -04:00
Benjamin Maxwell
c3260c65e8
[IR] Add llvm.sincos intrinsic (#109825)
This adds the `llvm.sincos` intrinsic, legalization, and lowering.

The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).

```
declare { float, float }          @llvm.sincos.f32(float  %Val)
declare { double, double }        @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 }    @llvm.sincos.f80(x86_fp80  %Val)
declare { fp128, fp128 }          @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 }  @llvm.sincos.ppcf128(ppc_fp128  %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float>  %Val)
```

The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
2024-10-29 10:52:20 +00:00
David Spickett
a8398bd817 [llvm][docs] Update list of llvm-lit options
Fixes #62899

In this commit I have updated the list of options
to include any missing options and re-rordered
some of them to match the order in lit's --help.

Where there was a larger description in this document
I've used that instead of the --help description.

This *does not* include --use-unique-output-file-name
as this was only added recently and we are still
debating whether it will be kept.
2024-10-29 10:35:27 +00:00
Alex Bradbury
7544d3af0e
[RISCV] Mark RVB23U64 and RVB23S64 as non-experimental (#113918)
The specification was recently ratified

<https://github.com/riscv/riscv-profiles/blob/main/src/rvb23-profile.adoc>.
2024-10-29 07:57:34 +00:00
Alex Bradbury
ba7555e640
[RISCV] Mark the RVA23S64 and RVA23U64 profiles as non-experimental (#113826)
All of the extensions used by these profile are themselves
non-experimental, and RVA23 was just ratified

<https://riscv.org/announcements/2024/10/risc-v-announces-ratification-of-the-rva23-profile-standard/>.

<https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc>

We lack a way of expressing `Ss1p13` (supervisor architecture 1.13), but
this is a problem we have for RVA22 (Ss1p12) and RVA20 (Ss1p11) so I
don't feel it's a blocker.
2024-10-28 12:56:47 +00:00
dong-miao
75c75fc16e
[RISCV]Add svvptc extension (#113882) 2024-10-28 22:54:51 +11:00
Brandon Wu
f5d8a485e2
[RISCV] Fix typo in UserGuides.rst. NFC (#113861) 2024-10-28 18:30:21 +08:00
Alex Bradbury
35f6cc6af0
[RISCV] Add the Sha extension (#113820)
This was introduced in the now-ratified RVA23 profile (and also added to
the RVA22 text) as a simple way of referring to H plus the set of
supervisor extensions required by RVA23.
https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc

This patch simply defines the extension. The next patch will adjust the
RVA23 profile to use it, and at that point I think we will be ready to
mark RVA23 as non-experimental.

Note that I haven't made it so if you enable all extensions that
constitute Sha, Sha is implied. Per #76893 (adding 'B'), the concern is
making this implication might break older external assemblers. Perhaps
this is less of a concern given the relative frequency of
`-march=${foo}_zba_zbb_zbs` vs the collection of H extensions. If we did
want to add that implication, we'd probably want to add it in a separate
patch so it can be easily reverted if found to cause problems.
2024-10-28 07:42:33 +00:00
Freddy Ye
d3f70db51c
[X86][MC] Support instructions of MSR_IMM (#113524)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-10-28 12:59:51 +08:00
Freddy Ye
5aa1275d03
[X86] Support SM4 EVEX version intrinsics/instructions. (#113402)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
2024-10-28 10:46:16 +08:00
Alex MacLean
fb33af08e4
[NVPTX] Remove nvvm.ldg.global.* intrinsics (#112834)
Remove these intrinsics which can be better represented by load
instructions with `!invariant.load` metadata:

- llvm.nvvm.ldg.global.i
- llvm.nvvm.ldg.global.f
- llvm.nvvm.ldg.global.p
2024-10-27 16:14:13 -07:00
davidtrevelyan
4102625380
[rtsan][llvm][NFC] Rename sanitize_realtime_unsafe attr to sanitize_realtime_blocking (#113155)
# What

This PR renames the newly-introduced llvm attribute
`sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise,
sibling variables such as `SanitizeRealtimeUnsafe` are renamed to
`SanitizeRealtimeBlocking` respectively. There are no other functional
changes.


# Why?

- There are a number of problems that can cause a function to be
real-time "unsafe",
- we wish to communicate what problems rtsan detects and *why* they're
unsafe, and
- a generic "unsafe" attribute is, in our opinion, too broad a net -
which may lead to future implementations that need extra contextual
information passed through them in order to communicate meaningful
reasons to users.
- We want to avoid this situation and make the runtime library boundary
API/ABI as simple as possible, and
- we believe that restricting the scope of attributes to names like
`sanitize_realtime_blocking` is an effective means of doing so.

We also feel that the symmetry between `[[clang::blocking]]` and
`sanitize_realtime_blocking` is easier to follow as a developer.

# Concerns

- I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been
part of the tree for a few weeks now (introduced here:
https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't
been released in version 20 yet, am I correct in considering this to not
be a breaking change?
2024-10-26 13:06:11 +01:00