521792 Commits

Author SHA1 Message Date
Valentin Clement (バレンタイン クレメン)
81333cfc52
[flang][cuda] Relax host array check for cuda constant (#120333)
Arrays with the CONSTANT attribute declared in a module spec part are
device arrays and should not trigger the host array check.
2024-12-17 17:04:32 -08:00
tianleliu
d7fe2cf8a2
[InstCombine] Widen Sel width after Cmp to generate Max/Min intrinsics. (#118932)
When Sel and its Cmp operate on different integer types, widen the Sel
to the Cmp's width:

From: (K and N mean width, K < N; a and b are src operands.)
bN = Ext(bK)
cond = Cmp(aN, bN)
aK = Trunc aN
retK = Sel(cond, aK, bK)
To:
bN = Ext(bK)
cond = Cmp(aN, bN)
retN = Sel(cond, aN, bN)
retK = Trunc retN

Although the Sel's operands become wider, giving the Sel the same
type width as the Cmp allows combining them into max/min intrinsics
and also yields better performance for SIMD instructions.
References of correctness: https://alive2.llvm.org/ce/z/Y4Kegm
                           https://alive2.llvm.org/ce/z/qFtjtR
References for generated code comparison:
                           https://gcc.godbolt.org/z/o97svGvYM
                           https://gcc.godbolt.org/z/59Ynj91ov
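
The From/To rewrite above can be checked exhaustively for small widths. The following is a minimal sketch, not the InstCombine code: it assumes K = 4 and N = 8, a sign extension for Ext, and a signed greater-than for Cmp. All helper names are hypothetical.

```python
K, N = 4, 8  # assumed widths for illustration, K < N

def trunc(x, bits):
    """Keep the low `bits` bits of x (unsigned representation)."""
    return x & ((1 << bits) - 1)

def as_signed(x, bits):
    """Interpret an unsigned `bits`-wide value as two's complement."""
    sign = 1 << (bits - 1)
    return (x ^ sign) - sign

def sext(x, from_bits, to_bits):
    """Sign-extend from `from_bits` to `to_bits`, returned as unsigned."""
    return trunc(as_signed(x, from_bits), to_bits)

def before(a_n, b_k):
    b_n = sext(b_k, K, N)
    cond = as_signed(a_n, N) > as_signed(b_n, N)
    a_k = trunc(a_n, K)
    return a_k if cond else b_k      # Sel on K-bit operands

def after(a_n, b_k):
    b_n = sext(b_k, K, N)
    cond = as_signed(a_n, N) > as_signed(b_n, N)
    ret_n = a_n if cond else b_n     # Sel on N-bit operands, i.e. a signed max
    return trunc(ret_n, K)

# Compare both forms over every (aN, bK) input pair.
mismatches = [(a, b) for a in range(1 << N) for b in range(1 << K)
              if before(a, b) != after(a, b)]
```

The two forms agree because truncating the extended `bN` gives back `bK`, so the truncation can be moved past the select regardless of the predicate.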
2024-12-18 09:02:11 +08:00
Thurston Dang
c48d45e6a3
[sanitizer] Refactor -f(no-)?sanitize-recover parsing (#119819)
This moves the -f(no-)?sanitize-recover parsing into a generic
parseSanitizerArgs function, and then applies it to parse
-f(no-)?sanitize-recover and -f(no-)?sanitize-trap.

N.B. parseSanitizeTrapArgs does *not* remove non-TrappingSupported
arguments. This maintains the legacy behavior of '-fsanitize=undefined
-fsanitize-trap=undefined' (clang/test/Driver/fsanitize.c), which is
that vptr is not enabled at all (not even in recover mode) in order to
avoid the need for a ubsan runtime.
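
As an illustration of the shape of this refactoring, here is a minimal sketch of one generic parser serving both flag families. It is not the clang driver code: `parse_sanitizer_args` is a hypothetical Python stand-in, and it does not model sanitizer group expansion (such as `undefined`) or the TrappingSupported handling mentioned above.

```python
def parse_sanitizer_args(args, positive, negative, default=frozenset()):
    """Apply -f<flag>=a,b and -fno-<flag>=a,b style arguments in order.

    `positive` and `negative` are the flag spellings without the '=' part.
    Later arguments override earlier ones.
    """
    enabled = set(default)
    for arg in args:
        for flag, add in ((positive, True), (negative, False)):
            prefix = flag + "="
            if arg.startswith(prefix):
                names = set(arg[len(prefix):].split(","))
                enabled = (enabled | names) if add else (enabled - names)
    return enabled

args = [
    "-fsanitize-recover=alignment,null",
    "-fno-sanitize-recover=null",
    "-fsanitize-trap=undefined",
]
# The same helper is reused for both option pairs.
recover = parse_sanitizer_args(args, "-fsanitize-recover", "-fno-sanitize-recover")
trap = parse_sanitizer_args(args, "-fsanitize-trap", "-fno-sanitize-trap")
```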
2024-12-17 16:35:11 -08:00
Drew Kersnar
9d11aa175b
[NVPTX] Remove extra semicolon (#120336)
Fix bug in this change:
https://github.com/llvm/llvm-project/pull/119622#issuecomment-2549896245
2024-12-17 16:03:40 -08:00
Jon Roelofs
01d7a187a4
[llvm] Add missing dependency of libLLVMCodeGen on vt_gen
```
llvm-project/llvm/include/llvm/CodeGenTypes/MachineValueType.h:43:10: fatal error: 'llvm/CodeGen/GenVT.inc' file not found
   43 | #include "llvm/CodeGen/GenVT.inc"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
```

rdar://141643651
2024-12-17 17:02:55 -07:00
Paul Kirth
f8d9f8ed95
[clang-doc] Add test for functions with builtin return types (#120318)
This is a precommit test for #120308, since we lack non-template
functions that use builtin types.
2024-12-17 15:55:05 -08:00
Teresa Johnson
a15e7b11da
[MemProf] Add option to hint allocations at a given cold byte percentage (#120301)
Optionally unconditionally hint allocations as cold or not cold during
the matching step if the percentage of bytes allocated is at least that
of the given threshold.
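
A minimal sketch of the threshold test described above, under stated assumptions: the function name, the `(is_cold, bytes)` input shape, and the choice to return `None` when the threshold is not met (meaning "fall back to the usual handling") are all hypothetical and not taken from the MemProf sources.

```python
def hint_from_cold_bytes(contexts, threshold_pct):
    """contexts: iterable of (is_cold, bytes_allocated) for one allocation site.

    Returns "cold" when the cold share of bytes reaches threshold_pct,
    otherwise None so that the caller can apply its regular logic.
    """
    contexts = list(contexts)
    total = sum(size for _, size in contexts)
    cold = sum(size for is_cold, size in contexts if is_cold)
    if total == 0:
        return None
    # Integer comparison avoids floating-point rounding at the boundary.
    if cold * 100 >= total * threshold_pct:
        return "cold"
    return None
```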
2024-12-17 15:53:56 -08:00
Jie Fu
f3a8f87979 [NVPTX] Remove extra ';' outside of a function (NFC)
/llvm-project/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp:224:2:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
};
 ^
1 error generated.
2024-12-18 07:42:32 +08:00
Drew Kersnar
932d9c13fa
[NVPTX] Generalize and extend upsizing when lowering 8/16-bit-element vector loads/stores (#119622)
This addresses the following issue I opened:
https://github.com/llvm/llvm-project/issues/118851.

This change generalizes the Type Legalization mechanism that currently
handles `v8[i/f/bf]16` upsizing to include loads _and_ stores of `v8i8`
+ `v16i8`, allowing all of the mentioned vectors to be lowered to ptx as
vectors of `b32`. This extension also allows us to remove the DagCombine
that only handled exactly `load v16i8`, thus centralizing all the
upsizing logic into one place.

Test changes include adding v8i8, v16i8, and v8i16 cases to
load-store.ll, and updating the CHECKs for other tests to match the
improved codegen.
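
To make the upsizing concrete, the sketch below reinterprets a vector of 8-bit or 16-bit lanes as 32-bit words, which is the data-layout idea behind lowering `v8i8`, `v16i8`, and `v8i16` as vectors of `b32`. It is an illustration only: the function name is hypothetical and little-endian lane packing is assumed.

```python
import struct

def upsize_to_b32(lanes, elem_bits):
    """Pack 8- or 16-bit unsigned lanes into a list of 32-bit words."""
    fmt = {8: "B", 16: "H"}[elem_bits]
    raw = struct.pack(f"<{len(lanes)}{fmt}", *lanes)
    if len(raw) % 4 != 0:
        raise ValueError("vector size is not a multiple of 32 bits")
    return list(struct.unpack(f"<{len(raw) // 4}I", raw))

v16i8 = upsize_to_b32(list(range(16)), 8)   # 16 x i8  -> 4 x b32
v8i8 = upsize_to_b32([1] * 8, 8)            # 8 x i8   -> 2 x b32
v8i16 = upsize_to_b32([1] * 8, 16)          # 8 x i16  -> 4 x b32
```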
2024-12-17 15:23:22 -08:00
Aiden Grossman
8a62104f64
[Github] Use hashed dependencies in docs job (#120319)
This patch forces the docs test build job to use the hashed dependencies
file rather than the normal requirements.txt. This ensures that we get
the exact transitive closure specified rather than whatever the
dependency solver feels like it should use in the CI job.
2024-12-17 14:46:13 -08:00
alx32
bb4007e562
[DWARFVerifier] Disable failing test llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h (#120322)
Disabling and forward fixing later.
2024-12-17 14:42:33 -08:00
Craig Topper
09f449263e
[RISCV] Check for register where immediate should be in RISCVInstrInfo::verifyInstruction. (#120286)
The generic verifier will do this if the operand type is
OPERAND_IMMEDIATE, but we use our own custom operand types. Immediate
operands are still allowed to be globals, constant pools, blockaddress,
etc. so we can't check !isImm().
2024-12-17 14:39:00 -08:00
Farzon Lotfi
c03fc929ff
[DirectX] Add support for vector_reduce_add (#117646)
Use of `vector_reduce_add` will make it easier to write more intrinsics
in `hlsl_intrinsics.h`.
2024-12-17 17:32:50 -05:00
Justin Bogner
65d2177ae1
[DXIL] Simplify MDBuilder in resource unit tests. NFC (#120275) 2024-12-17 15:16:58 -07:00
Nick Desaulniers
7153a21916
[libc][docs] update sphinx requirement hashes (#120315)
Link: #120274
2024-12-17 14:14:03 -08:00
Alex MacLean
9f231a8500
[NVPTX] Prefer ValueType when defining DAG patterns (NFC) (#120161)
Replace uses of register class in dag patterns with value types. These
types are much more concise and in cases where a single register class
maps to multiple types, they avoid the need for both.
2024-12-17 13:49:31 -08:00
Valentin Clement (バレンタイン クレメン)
15c61a208f
[flang][cuda] Do not consider SHARED array as host array (#120306)
Update `FindHostArray` so that it does not return SHARED arrays as host
arrays.
2024-12-17 13:42:14 -08:00
Valentin Clement (バレンタイン クレメン)
97b7bace67
[flang][cuda] Allow host array with PARAMETER attribute in device context (#120298)
Host arrays are normally not allowed in device context unless they have
a `PARAMETER` attribute. This patch updates the check so that no error
is emitted in that case.
2024-12-17 13:41:24 -08:00
Florian Hahn
eb59fe8d04
[VPlan] Remove redundant assignment in VPReductionPHIRecipe (NFC)
Suggested post-commit for 0e528ac404e13ed2d952a2d83aaf8383293c851e.
2024-12-17 21:32:40 +00:00
Joseph Huber
958de20b30
[libc] Enable 'timespec_get' for the GPU build (#120304)
Summary:
Currently fails to build libc++ because this is missing.
2024-12-17 15:31:07 -06:00
Nick Desaulniers
1d06157b9e
[libc] fix -Wgcc-compat (#120303)
I don't quite recall why I added those in the first place. These tests build
without diagnostics for both clang and GCC with this fix.

Fixes: #114653
2024-12-17 13:30:20 -08:00
Nico Weber
cde996c31d [lld/COFF] Remove needless indirection
`symtab.ctx.symtab` is just `symtab`. Looks like #119296 added
this using a global find-and-replace.

This was the only instance of `symtab.ctx.symtab` in lld/.

No behavior change.
2024-12-17 16:27:16 -05:00
Michael Maitland
169c32eb49
[RISCV][VLOPT] Enable the RISCVVLOptimizer by default (#119461)
Now that we have testing of all instructions in the isSupportedInstr
switch, and better coverage of getOperandInfo, I think it is a good time
to enable this by default.
2024-12-17 16:19:35 -05:00
Joseph Huber
b0fbddde38 [OpenMP] Only put retain for NVPTX so it can be optimized out for AMD
Summary:
This is a hack that only NVPTX needs.
2024-12-17 15:16:51 -06:00
Kazu Hirata
7f2fb8061e
[memprof] Don't use Frame::hash or hashCallStacks in unit test (#119984)
This patch checks the result of YAML parsing at the level of
MemProfRecord instead of IndexedMemProfRecord, thereby avoiding use of
Frame::hash and hashCallStacks.  This makes sense because we
ultimately care about consumers like MemProfiler.cpp obtaining
MemProfRecord correctly; IndexedMemProfData and hash values are just
intermediaries.

Once this patch lands, we call Frame::hash and hashCallStacks only
when adding Frames or call stacks to their respective data structures.
In other words, the hash functions are pretty much business internal
to IndexedMemProfRecord.
2024-12-17 13:12:29 -08:00
Michael Maitland
48c20e7106
[RISCV][VLOPT] Do not optimize VL when isVectorOpUsedAsScalarOp (#120291)
This does not have tests, so we will remove this for now and add it back
later with tests.
2024-12-17 16:10:02 -05:00
Valentin Clement (バレンタイン クレメン)
bbeafe4b94
[flang][cuda] Apply implict data attribute to local arrays (#120293)
Add the implicit data attribute to local arrays that don't have one.
This simplifies the host array detection in semantics.
2024-12-17 12:56:39 -08:00
Teresa Johnson
d7d0e740cc
[MemProf] Refactor single alloc type handling and use in more cases (#120290)
Emit message when we have aliased contexts that are conservatively
hinted not cold. This is not a change in behavior, just in message when
the -memprof-report-hinted-sizes flag is enabled.
2024-12-17 12:50:49 -08:00
Philip Reames
984cb791db
[RISCV] Use vmv.v.x to materialize masks in deinterleave2 lowering (#118500)
This is a follow up to 2af2634 to use vmv.v.x of i8 constants instead of
the prior vid/vand/vmsne sequence. The advantage of the vmv.v.x sequence
is that it's always m1 (so cheaper at high LMUL), and can be
rematerialized by the register allocator if needed to locally reduce
register pressure.
2024-12-17 12:50:09 -08:00
Florian Hahn
4ad0fdd163
[VPlan] Remove reverse() of predecessors from VPInstruction::generate.
This was originally done to reduce the diff for the change. Remove it
and update the remaining tests. NFC modulo reordering of incoming
values.

Clean up after https://github.com/llvm/llvm-project/pull/114292.
2024-12-17 20:44:32 +00:00
Mark Danial
2a0091fb4a
[AIX] fix unsupported diff flag on AIX (-strip-trailing-cr) (#120276)
https://github.com/llvm/llvm-project/pull/119666 adds the
`-strip-trailing-cr` flag to diff, which is not supported on AIX. Switch
to the Python implementation of diff instead.
2024-12-17 15:43:50 -05:00
Shourya Goel
c98e79d856
[libc][complex] Implement different flavors of the cproj function (#119722)
Refer to section 7.3.9.5 of ISO/IEC 9899:2023.
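
For readers without the standard at hand, the projection semantics can be sketched as follows. This is a Python illustration of the behavior the C standard specifies for `cproj`, not the llvm-libc implementation; the different flavors differ only in floating-point type.

```python
import math

def cproj(z: complex) -> complex:
    """Project z onto the Riemann sphere.

    If either part is infinite, the result is +inf with a zero imaginary
    part that keeps the sign of z's imaginary part. Otherwise z is
    returned unchanged.
    """
    if math.isinf(z.real) or math.isinf(z.imag):
        return complex(math.inf, math.copysign(0.0, z.imag))
    return z

finite = cproj(complex(1.0, 2.0))
neg_inf = cproj(complex(-math.inf, -3.0))
nan_inf = cproj(complex(math.nan, math.inf))
```

Note that a NaN real part with an infinite imaginary part still projects to infinity, since only one part needs to be infinite.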
2024-12-18 02:04:50 +05:30
Philipp van Kempen
e8ce6c4e69
[RISCV] Fix typo in CV_SH_rr_inc pattern (#120246)
This typo in
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td#L701:L701
caused a compiler crash in 'RISC-V Assembly Printer' because
CV_SH_ri_inc was selected, leading to `getImmOpValue` being called for a
register operand.

This bug did not affect the Assembler output and therefore does not
trigger any existing unit tests, but is visible by examining the final
MIR function.
2024-12-17 12:20:17 -08:00
Jacek Caban
16ef239520 [LLD][COFF] Introduce hybrid symbol table for EC input files on ARM64X (#119294) 2024-12-17 21:19:01 +01:00
Roland McGrath
0b91d77bf4
[libc] Use __attribute__((__nothrow__)) for __NOEXCEPT in C (#114653)
Consistent with glibc headers, where `noexcept` is used in C++
(or `throw()` in older C++ which llvm-libc doesn't support) in
the public function declarations, `__attribute__((__nothrow__))` is
used in C for compilers that support it.
2024-12-17 12:18:37 -08:00
Peter Klausler
a957cedea9
[flang] Handle substring in data statement constant (#120130)
The case of a constant substring wasn't handled in the parser for data
statement constants.

Fixes https://github.com/llvm/llvm-project/issues/119005.
2024-12-17 12:10:50 -08:00
Peter Klausler
b2c363e261
[flang] Fix generic resolution with actual/dummy procedure incompatibility (#120105)
We generally allow any legal procedure pointer target as an actual
argument for association with a dummy procedure, since many actual
procedures are underspecified EXTERNALs. But for proper generic
resolution, it is necessary to disallow incompatible functions with
explicit result types.

Fixes https://github.com/llvm/llvm-project/issues/119151.
2024-12-17 12:10:29 -08:00
Alexey Bataev
0e11e19416 [SLP][NFC]Remove undef and update tests 2024-12-17 11:45:20 -08:00
Nico Weber
c9a5a6d18b [lld/COFF] Remove unused InputFile::LazyObjectKind
Its use was removed in d496abbe2a037. No behavior change.
2024-12-17 14:29:31 -05:00
serge-sans-paille
ec636cf3c5
[llvm-split][nfc] Harmonize help and error message (#120062)
Some error / help messages refer to options with a single dash while
the help text refers to options with a double dash.
2024-12-17 19:29:18 +00:00
Malte Dehling
e5521fae94
[mlir-tblgen] Fix bug in emitEnumDoc (#118131)
Fixes a crash (assertion failure) in `mlir-tblgen -emit-enum-doc` caused
by calling `EnumAttr()` for the wrong type of `Record *`: `EnumAttr`
rather than `EnumAttrInfo` as asserted.

Compare the corresponding line in `emitDialectDoc()`:

0ad6be1927/mlir/tools/mlir-tblgen/OpDocGen.cpp (L532)

Co-authored-by: Malte Dehling <m.dehling@samsung.com>
2024-12-17 14:28:00 -05:00
Jonas Devlieghere
83643ddf2f
[lldb] Improve error reporting in GetLocation_DW_OP_addr (#120162)
Instead of simply raising an error flag, use an llvm::Expected to
propagate a meaningful error to the caller, who can report it.

rdar://139705570
2024-12-17 11:25:04 -08:00
joaosaffran
56cb554291
[NFC] Updating Debug Info generation for 'this' (#119445)
This PR updates the debug info generation for `this`. The change is
required to fix the generation of debug information for the HLSL
RWBuffer type, and was requested in another PR:
https://github.com/llvm/llvm-project/pull/119041/files

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2024-12-17 11:10:05 -08:00
alx32
ad32576cff
[DWARFVerifier] Allow overlapping ranges for ICF-merged functions (#117952)
This patch modifies the DWARF verifier to handle a valid case where two
or more functions have identical address ranges due to being merged by
ICF (Identical Code Folding). Previously, the verifier would incorrectly
report these as errors, but functions merged via ICF (such as when using
LLD's --keep-icf-stabs option) can legitimately share the same address
range.

A new test case has been added to verify this behavior using YAML-based
DWARF data that simulates two DW_TAG_subprogram entries with identical
address ranges. The test ensures that the verifier correctly identifies
this as a valid case and doesn't emit any errors, while still
maintaining the existing verification for truly invalid overlapping
ranges in other scenarios. Before this change, the newly added test case
would have failed, with `llvm-dwarfdump` marking the overlapping address
ranges in the DWARF as an error.

We also modify the existing tests `llvm-dwarfutil/ELF/X86/verify.test` and 
`llvm/test/tools/llvm-dwarfdump/X86/verify_parent_zero_length.yaml`
which rely on the existence of the error that we're trying to
suppress. We slightly change one offset so that the ranges don't
perfectly overlap and an error is still generated.
2024-12-17 11:00:56 -08:00
Brox Chen
de2acda3df
[AMDGPU][True16][MC] support more VOP3 inst in true16/fake16 format (#113603)
Support true16 and fake16 format for more VOP3 instructions in MC

This patch updates the true16 and fake16 vop_profile for the following
instructions and updates the asm/dasm tests:
v_mad_u16
v_mad_i16
v_med3_f16
v_med3_i16
v_med3_u16
v_max3_f16
v_max3_i16
v_max3_u16
v_min3_f16
v_min3_i16
v_min3_u16
v_med3_num_f16
2024-12-17 13:58:01 -05:00
Nico Weber
4c2a46f5fe [lld/COFF] Make test/COFF/start-lib.ll use split-file
The two input files were only used by this one test, so put them inline.

No behavior change.
2024-12-17 13:55:50 -05:00
Florian Hahn
641fbf1524
[TySan] Add initial Type Sanitizer runtime (#76261)
This patch introduces the runtime components for type sanitizer: a
sanitizer for type-based aliasing violations.

It is based on Hal Finkel's https://reviews.llvm.org/D32197.

C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
these given TBAA metadata added by Clang. Roughly, a pointer of given
type cannot be used to access an object of a different type (with, of
course, certain exceptions). Unfortunately, there's a lot of code in the
wild that violates these rules (e.g. for type punning), and such code
often must be built with -fno-strict-aliasing. Performance is often
sacrificed as a result. Part of the problem is the difficulty of finding
TBAA violations. Hopefully, this sanitizer will help.

For each TBAA type-access descriptor, encoded in LLVM's IR using
metadata, the corresponding instrumentation pass generates descriptor
tables. Thus, for each type (and access descriptor), we have a unique
pointer representation. Excepting anonymous-namespace types, these
tables are comdat, so the pointer values should be unique across the
program. The descriptors refer to other descriptors to form a type
aliasing tree (just like LLVM's TBAA metadata does). The instrumentation
handles the "fast path" (where the types match exactly and no
partial-overlaps are detected), and defers to the runtime to handle all
of the more-complicated cases. The runtime, of course, is also
responsible for reporting errors when those are detected.

The runtime uses essentially the same shadow memory region as tsan, and
we use 8 bytes of shadow memory, the size of the pointer to the type
descriptor, for every byte of accessed data in the program. The value 0
is used to represent an unknown type. The value -1 is used to represent
an interior byte (a byte that is part of a type, but not the first
byte). The instrumentation first checks for an exact match between the
type of the current access and the type for that address recorded in the
shadow memory. If it matches, it then checks the shadow for the
remainder of the bytes in the type to make sure that they're all -1. If
not, we call the runtime. If the exact match fails, we next check if the
value is 0 (i.e. unknown). If it is, then we check the shadow for the
remainder of the bytes in the type (to make sure they're all 0). If
they're not, we call the runtime. We then set the shadow for the access
address and set the shadow for the remaining bytes in the type to -1
(i.e. marking them as interior bytes). If the type indicated by the
shadow memory for the access address is neither an exact match nor 0, we
call the runtime.
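
The instrumentation's decision procedure in the paragraph above can be modeled as follows. This is a sketch under simplifying assumptions, not the TySan implementation: the shadow is a Python dict keyed by address, type descriptors are small positive integers standing in for descriptor pointers, and the function name is hypothetical.

```python
UNKNOWN, INTERIOR = 0, -1

def check_access(shadow, addr, size, type_desc):
    """Return "ok" if the fast path accepts the access, else "runtime"."""
    first = shadow.get(addr, UNKNOWN)
    rest = [shadow.get(addr + i, UNKNOWN) for i in range(1, size)]
    if first == type_desc:
        # Exact match: the remaining bytes must all be interior bytes.
        return "ok" if all(s == INTERIOR for s in rest) else "runtime"
    if first == UNKNOWN:
        # Unknown type: the remaining bytes must be unknown as well.
        if any(s != UNKNOWN for s in rest):
            return "runtime"
        # Record the type for this address and mark interior bytes.
        shadow[addr] = type_desc
        for i in range(1, size):
            shadow[addr + i] = INTERIOR
        return "ok"
    # Neither an exact match nor unknown: defer to the runtime.
    return "runtime"

shadow = {}
first_write = check_access(shadow, 100, 4, 7)     # unknown memory, gets typed
same_type = check_access(shadow, 100, 4, 7)       # exact match
other_type = check_access(shadow, 100, 4, 9)      # different descriptor
mid_object = check_access(shadow, 102, 2, 7)      # starts on an interior byte
```

A result of "runtime" does not mean an error; it only means the full TBAA check is needed.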

The instrumentation pass inserts calls to the memset intrinsic to set
the memory updated by memset, memcpy, and memmove, as well as
allocas/byval (and for lifetime.start/end) to reset the shadow memory to
reflect that the type is now unknown. The runtime intercepts memset,
memcpy, etc. to perform the same function for the library calls.

The runtime essentially repeats these checks, but uses the full TBAA
algorithm, just as the compiler does, to determine when two types are
permitted to alias. In a situation where access overlap has occurred and
aliasing is not permitted, an error is generated.

As a note, this implementation does not use the compressed shadow-memory
scheme discussed previously
(http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That
scheme would not handle the struct-path (i.e. structure offset)
information that our TBAA represents. I expect we'll want to further
work on compressing the shadow-memory representation, but I think it
makes sense to do that as follow-up work.

This includes build fixes for Linux from Mingjie Xu.

Depends on #76260 (Clang support), #76259 (LLVM support)


PR: https://github.com/llvm/llvm-project/pull/76261
2024-12-17 18:49:50 +00:00
Simon Pilgrim
5d4e4b3503 [X86] LowerShift - use getConstant directly to create vector splat constants. NFC. 2024-12-17 18:41:09 +00:00
Tristan Ross
7477b61b24
[libc] Add unistd overlay (#119312)
Reverts the revert #119295 of #118882 by expanding #118882 with
additional fixes for the issues that made CI unhappy.
2024-12-17 10:40:22 -08:00
Nick Desaulniers
4c5ddc9ed4
[libc][docs] add redirect for math/index.html (#120274)
commit a9aff440d9dd ("[libc][docs] reorganize documentation (#118836)")

moved https://libc.llvm.org/math/index.html to
https://libc.llvm.org/headers/math/index.html which makes links from
various slide decks stale.

There's an extension for sphinx that can generate redirects. Add a dependency
on that, then use it to create a redirect so that those older links still work.

I was able to install this sphinx extension via:

    $ sudo apt install python3-sphinx-reredirects

We may need to install this on whatever server generates the llvm
documentation.
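
For reference, a redirect of this kind is typically configured in `conf.py` along the following lines. This is a sketch that assumes the `sphinx_reredirects` extension and a target path relative to the old page; the exact entries in the llvm-libc configuration may differ.

```python
# conf.py fragment (sketch). In a real conf.py the extension would be
# appended to the existing `extensions` list rather than replacing it.
extensions = ["sphinx_reredirects"]

# Map the old document name to its new location, relative to the old page.
redirects = {
    "math/index": "../headers/math/index.html",
}
```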
2024-12-17 10:37:21 -08:00