527289 Commits

Author SHA1 Message Date
klensy
4ee173a168
add me to mailmap (#126226)
This should let buildbot find the proper email address.


f1a84bbe55/master/buildbot/changes/gitpoller.py (L418)

At least buildbot parses user names and mails with respect to mailmap.

Co-authored-by: klensy <nightouser@gmail.com>
2025-02-13 17:49:48 +00:00
Alexey Bataev
d18b1ebef5 [SLP]Check if vector user exist before accessing it
Check that the vector user exists before accessing it, to avoid a
compiler crash.
Fixes #126581
2025-02-13 09:44:34 -08:00
Sylvestre Ledru
c81139f417
libc/cmake: don't fail if LLVM_VERSION_SUFFIX isn't defined (#126359)
Closes: #126358

cc @samvangysegem

---------

Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-02-13 18:42:28 +01:00
Joel E. Denny
eb8ffd617a
[flang] AliasAnalysis: Handle fir.load on fir.alloca (#117785)
For example, determine that the address in p below cannot alias the
address of v:

```
subroutine test()
  real, pointer :: p
  real, target :: t
  real :: v
  p => t
  v = p
end subroutine test
```
2025-02-13 12:40:03 -05:00
Slava Zakharin
660cdace55
[flang] Fixed write past allocated descriptor in PointerAssociateRemapping. (#127000)
The pointer descriptor might be smaller than the target descriptor,
so `operator=` would write beyond the pointer descriptor.
2025-02-13 09:39:36 -08:00
Martin Erhart
9a63a2c4ba
[mlir][index] Add CAPI (#127039) 2025-02-13 17:37:49 +00:00
Stanislav Mekhanoshin
07405ca036
[AMDGPU] clang-format SIProgramInfo.h. NFC. (#127033) 2025-02-13 09:35:29 -08:00
Simon Pilgrim
4a97ce5f75 [X86] X86FixupVectorConstantsPass - pull out getPrimitiveSizeInBits call. NFC. 2025-02-13 17:25:08 +00:00
Kazu Hirata
4bda95304f
[llvm-profgen] Avoid repeated hash lookups (NFC) (#127028) 2025-02-13 09:12:33 -08:00
Kazu Hirata
9a59145d8e
[memprof] Avoid repeated map lookups (NFC) (#127027) 2025-02-13 09:12:04 -08:00
Kazu Hirata
fec04f286e
[FileCheck] Avoid repeated hash lookups (NFC) (#127026) 2025-02-13 09:11:43 -08:00
Kazu Hirata
e7bf6a4e04
[CodeGen] Avoid repeated map lookups (NFC) (#127025) 2025-02-13 09:11:17 -08:00
Kazu Hirata
44b61e056d
[Analysis] Avoid repeated hash lookups (NFC) (#127024) 2025-02-13 09:10:57 -08:00
Kazu Hirata
d096f45322
[clang-scan-deps] Avoid repeated map lookups (NFC) (#127023) 2025-02-13 09:10:38 -08:00
Ilia Kuklin
f30c891464
[lldb] Analyze enum promotion type during parsing (#115005)
The information about an enum's best promotion type is discarded after
compilation and is not present in debug info. This patch repeats the
same analysis of each enum value as in the front-end to determine the
best promotion type during DWARF info parsing.

Fixes #86989
2025-02-13 22:08:31 +05:00
Craig Topper
e750c7e636
[RISCV] Set Feature32Bit/Feature64Bit based on triple for -mcpu=help. (#127031)
llvm-mc keeps going after printing help text and creates an assembler.
If we don't set one of the XLen sized feature bits we trip a fatal error
in RISCVFeatures::validate.

llvm-mc should probably be fixed, but I don't know if it's the only tool
with this issue.
2025-02-13 09:07:23 -08:00
Ellis Hoag
79fff6aa32
[lld][BP] Avoid ordering ICF'ed sections (#126327)
ICF runs before BPSectionOrderer. When a section is ICF'ed, it seems
that the original sections are marked as not live, but are still kept
around. Prior to this patch, those ICF'ed sections would be passed to BP
and ordered before being skipped when writing the output. Now, these
sections are no longer passed to BP, saving runtime and possibly
improving BP's output.

In a large binary, I found that the number of sections ordered using BP
decreased, while the number of duplicate sections drastically decreased
as expected.
```
Functions for startup: 50755 -> 50520
Functions for compression: 165734 -> 105328
Duplicate functions: 1827231 -> 55230
```
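
A minimal sketch of the filtering idea described above, not the lld implementation. The `Section` class and its `live` flag are hypothetical stand-ins for however the linker marks sections that ICF has folded away.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    live: bool  # assumed flag: False once ICF has folded the section away


def sections_for_ordering(sections):
    """Keep only live sections, so folded duplicates never reach the orderer."""
    return [s for s in sections if s.live]
```

Filtering before ordering means the orderer spends no time on sections that would be skipped when the output is written.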
2025-02-13 08:57:44 -08:00
Abhilash Majumder
55f3df875d
[NVPTX] Fix and refine prefetch.* intrinsics (#126899)
This is a follow-up PR to #125887 that fixes the intrinsic failures.

---------

Co-authored-by: abmajumder <abmajumder@nvidia.com>
2025-02-13 17:54:01 +01:00
Piotr Zegar
a663e78a6e
[clang-tidy] Add recursion protection in ExceptionSpecAnalyzer (#66810)
Normally, endless recursion should not happen in ExceptionSpecAnalyzer,
but if the AST is malformed (missing include), it could cause a crash.

I ran into this issue when, due to a missing include, constructor
arguments were parsed as FieldDecl.
Since checking for recursion costs nothing, do it in the check just in
case.
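
A generic sketch of this kind of recursion guard, assuming a simple dependency graph rather than a Clang AST. The `can_throw` function and the graph layout are hypothetical and only illustrate the technique: track the nodes currently being analyzed and refuse to re-enter them.

```python
def can_throw(node, graph, in_progress=None):
    """Return True if `node` or anything it depends on is marked as throwing.

    `graph` maps a node name to (throws, [dependencies]). The `in_progress`
    set is the recursion guard: a node already being analyzed is not re-entered.
    """
    if in_progress is None:
        in_progress = set()
    if node in in_progress:
        return False  # cycle detected: stop instead of recursing forever
    in_progress.add(node)
    throws, deps = graph[node]
    result = throws or any(can_throw(d, graph, in_progress) for d in deps)
    in_progress.discard(node)
    return result
```

On well-formed input the guard never triggers, so it only costs a set lookup per visited node.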

Fixes #111436
2025-02-13 17:51:28 +01:00
Georgiy Samoylov
1138a4964a
[lldb] Fix build problem in llgs tests for RISC-V (#127091)
During testing of LLDB on RISC-V target, tests from the llgs category
were built with an error: `Error when building test subject.`

```
llvm-project/lldb/test/API/tools/lldb-server/main.cpp:151:40: error: missing ')' after '__builtin_debugtrap'
  151 | #elif __has_builtin(__builtin_debugtrap())
      |                     ~~~~~~~~~~~~~~~~~~~^
llvm-project/lldb/test/API/tools/lldb-server/main.cpp:151:20: note: to match this '('
  151 | #elif __has_builtin(__builtin_debugtrap())
      |                    ^
```

This patch fixes this error.
2025-02-13 16:48:03 +00:00
Vyacheslav Levytskyy
2f8de7b466
[SPIR-V] Type inference must realize that a <1 x Type> vector type is not a legal vector type in LLT (#124560)
In this PR we account for possible <1 x LLVM Type> input to ensure that
we produce legal vector types during type inference.

We modify an LLVM type to conform with future transformations in
IRTranslator, if it's a <1 x Type> vector type, replacing it by the
element type, because <1 x Type> vector type is not a legal vector type
in LLT and IRTranslator will represent it as the scalar eventually.
2025-02-13 17:46:42 +01:00
Jay Foad
ba45592377
[AMDGPU] Try to fix -mattr=dumpcode on big-endian hosts (#127073)
Blind fix for #116982 failing on big-endian buildbots.
2025-02-13 16:44:22 +00:00
Kazu Hirata
88015d12ca [mlir] Fix a warning
This patch fixes:

  mlir/lib/Conversion/ComplexCommon/DivisionConverter.cpp:61:2: error:
  extra ';' outside of a function is incompatible with C++98
  [-Werror,-Wc++98-compat-extra-semi]
2025-02-13 08:36:07 -08:00
Craig Topper
8da8ff8768
[flang][RISCV] Add target-abi ModuleFlag. (#126188)
This is needed to generate proper ABI flags in the ELF header for LTO
builds. If these flags aren't set correctly, we can't link with objects
that were built with the correct flags.

For non-LTO builds the mcpu/mattr in the TargetMachine will cause the
backend to infer an ABI. For LTO builds the mcpu/mattr aren't set.

I've only added lp64, lp64f, and lp64d ABIs. ilp32* requires riscv32
which is not yet supported in flang. lp64e requires a different
DataLayout string and would need additional plumbing.

Fixes #115679
2025-02-13 08:08:09 -08:00
Mikhail Goncharov
21811818d6 [bazel] port aecb764cc2e026ecb5c418dd56f2722c6f263e8b 2025-02-13 17:05:33 +01:00
David Green
b2165f214e
[CostModel] Account for power-2 urem in funnel shift costs (#127037)
As can be seen in https://godbolt.org/z/qvMqY79cK, a urem by a power-2
constant will be code-generated as an And of a mask. The cost model for
funnel shifts tries to account for that by passing OP_PowerOf2 as the
operand info for the second operand. As far as I can tell, returning a
lower cost for urem with an OP_PowerOf2 is only implemented on X86
though.

This patch short-cuts that by calling getArithmeticInstrCost(And, ..)
directly when we know the type size will be a power of 2. This is an
alternative to the patch in #126912, which is a more general solution for
power-2 udiv/urem costs; this patch more narrowly fixes just funnel shifts.
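
To illustrate why the urem is as cheap as an And here, a small sketch of the arithmetic, assuming a power-of-2 bit width. The helper names `urem_pow2` and `fshl` are illustrative; this models the semantics only, not the cost model code.

```python
def urem_pow2(x, n):
    """x % n for a power-of-two n, computed as a bit mask."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    return x & (n - 1)


def fshl(a, b, shift, bits=32):
    """Funnel shift left: concatenate a:b, shift left, keep the top `bits`."""
    mask = (1 << bits) - 1
    s = urem_pow2(shift, bits)  # shift amount is taken modulo the bit width
    if s == 0:
        return a & mask
    return ((a << s) | ((b & mask) >> (bits - s))) & mask
```

The modulo on the shift amount is the only urem in the expansion, and for a power-of-2 width it reduces to a single mask.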
2025-02-13 16:05:00 +00:00
Hyunsung Lee
de09986596
[mlir][math] powf(a, b) drop support when a < 0 (#126338)
Related: #124402

- the implementation of `powf(a, b)` was inefficient because it handled
the `a < 0` case
  - thus drop support for the `a < 0` case

However, some special cases are still in use, such as:
  - `a < 0` with `b = 0`, `b = 0.5`, `b = 1` or `b = 2`
  - these special cases are converted into simpler ops.
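
A rough sketch of what converting those special exponents into simpler ops can look like. The `simplify_powf` helper is hypothetical and is not the MLIR lowering; returning `None` for the general case and NaN for the square root of a negative base are assumptions of this sketch.

```python
import math


def simplify_powf(a, b):
    """Return a cheaper equivalent of pow(a, b) for a few constant exponents.

    Hypothetical helper; returns None when no special case applies.
    """
    if b == 0.0:
        return 1.0          # anything to the power 0 is 1
    if b == 1.0:
        return a            # identity
    if b == 2.0:
        return a * a        # a single multiply
    if b == 0.5:
        # Square root; a negative base has no real result (assumed NaN here).
        return math.sqrt(a) if a >= 0 else math.nan
    return None
```

None of these replacements needs a logarithm of `a`, which is why they stay cheap even when `a` is negative.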
2025-02-13 08:01:47 -08:00
Vitaly Buka
a1345eb240
Revert "[libclang] Always Dup in createRef(StringRef)" (#127076)
Reverts llvm/llvm-project#125020


https://lab.llvm.org/buildbot/#/builders/24/builds/5252/steps/12/logs/stdio

```
==c-index-test==2512295==ERROR: AddressSanitizer: heap-use-after-free on address 0xe19338c27992 at pc 0xc66be4784830 bp 0xe0e33660df00 sp 0xe0e33660d6e8
READ of size 23 at 0xe19338c27992 thread T1
    #0 0xc66be478482c in printf_common(void*, char const*, std::__va_list) /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors_format.inc:563:9
    #1 0xc66be478643c in vprintf /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:1699:1
    #2 0xc66be478643c in printf /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:1757:1
    #3 0xc66be4839384 in FilteredPrintingVisitor /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/c-index-test/c-index-test.c:1359:5
    #4 0xe4e3454f12e8 in clang::cxcursor::CursorVisitor::Visit(CXCursor, bool) /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/libclang/CIndex.cpp:227:11
    #5 0xe4e3454f48a8 in bool clang::cxcursor::CursorVisitor::visitPreprocessedEntities<clang::PreprocessingRecord::iterator>(clang::PreprocessingRecord::iterator, clang::PreprocessingRecord::iterator, clang::PreprocessingRecord&, clang::FileID) CIndex.cpp
    
0xe19338c27992 is located 82 bytes inside of 105-byte region [0xe19338c27940,0xe19338c279a9)
freed by thread T1 here:
    #0 0xc66be480040c in free /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:51:3
    #1 0xc66be4839728 in GetCursorSource c-index-test.c
    #2 0xc66be4839368 in FilteredPrintingVisitor /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/c-index-test/c-index-test.c:1360:12
    #3 0xe4e3454f12e8 in clang::cxcursor::CursorVisitor::Visit(CXCursor, bool) /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/libclang/CIndex.cpp:227:11
    #4 0xe4e3454f48a8 in bool clang::cxcursor::CursorVisitor::visitPreprocessedEntities<clang::PreprocessingRecord::iterator>(clang::PreprocessingRecord::iterator, clang::PreprocessingRecord::iterator, clang::PreprocessingRecord&, clang::FileID) CIndex.cpp


previously allocated by thread T1 here:
    #0 0xc66be4800680 in malloc /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:67:3
    #1 0xe4e3456379b0 in safe_malloc /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/Support/MemAlloc.h:26:18
    #2 0xe4e3456379b0 in createDup /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/libclang/CXString.cpp:95:40
    #3 0xe4e3456379b0 in clang::cxstring::createRef(llvm::StringRef) /home/b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/clang/tools/libclang/CXString.cpp:90:10
```
2025-02-13 07:42:40 -08:00
Alexey Bataev
2ad816648f
[SLP]Improved reduction cost/codegen
The SLP vectorizer is able to combine several reductions from the list of
(potentially) reduced values with different opcodes or value kinds.
Currently, these reductions are handled independently of each other.
Instead, the compiler can combine them into wide vector operations and
then perform only a single reduction.
E.g., if the SLP vectorizer currently emits something like:
```
%r1 = reduce.add(<4 x i32> %v1)
%r2 = reduce.add(<4 x i32> %v2)
%r = add i32 %r1, %r2
```

it can be emitted as:
```
%v = add <4 x i32> %v1, %v2
%r = reduce.add(<4 x i32> %v)
```

This can improve performance in some cases.
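
The two forms above compute the same value because wrapping integer addition is associative and commutative. A small sketch that models `<4 x i32>` vectors as Python lists (function names are illustrative only):

```python
MASK32 = 0xFFFFFFFF  # model i32 wraparound arithmetic


def reduce_add(vec):
    """Horizontal add of all lanes, wrapping to 32 bits."""
    total = 0
    for lane in vec:
        total = (total + lane) & MASK32
    return total


def vector_add(a, b):
    """Lane-wise add of two equal-length vectors, wrapping to 32 bits."""
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def separate_reductions(v1, v2):
    # Two reductions followed by a scalar add (the "before" form).
    return (reduce_add(v1) + reduce_add(v2)) & MASK32


def combined_reduction(v1, v2):
    # One vector add followed by a single reduction (the "after" form).
    return reduce_add(vector_add(v1, v2))
```

The combined form trades one horizontal reduction for one lane-wise vector add.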

AVX512, -O3+LTO
Metric: size..text

Program                                                                                           size..text
                                                                                                  results     results0    diff
                      test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-matrix.test     4553.00     4615.00  1.4%
                                 test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test   412708.00   416820.00  1.0%
        test-suite :: SingleSource/UnitTests/Vector/AVX512BWVL/Vector-AVX512BWVL-mask_set_bw.test    12901.00    12981.00  0.6%
                        test-suite :: MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow.test    22717.00    22813.00  0.4%
                             test-suite :: MultiSource/Benchmarks/mediabench/gsm/toast/toast.test    39722.00    39850.00  0.3%
                      test-suite :: MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm.test    39725.00    39853.00  0.3%
test-suite :: SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-builtin-bitops-1.test    15918.00    15967.00  0.3%
                                       test-suite :: External/SPEC/CFP2006/433.milc/433.milc.test   155491.00   155587.00  0.1%
                                     test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test   227894.00   227942.00  0.0%
                                    test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test  1062188.00  1062364.00  0.0%
                                test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test   793672.00   793720.00  0.0%
                              test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test   657371.00   657403.00  0.0%
                             test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test   657371.00   657403.00  0.0%
                   test-suite :: External/SPEC/CINT2017speed/600.perlbench_s/600.perlbench_s.test  2074917.00  2074933.00  0.0%
                    test-suite :: External/SPEC/CINT2017rate/500.perlbench_r/500.perlbench_r.test  2074917.00  2074933.00  0.0%
                                     test-suite :: MultiSource/Applications/JM/lencod/lencod.test   855219.00   855203.00 -0.0%

Benchmarks/Shootout-C++ - same transformed reduction
Adobe-C++/loop_unroll - same transformed reductions, new vector code
AVX512BWVL/Vector-AVX512BWVL-mask_set_bw - same transformed reductions
FreeBench/fourinarow - same transformed reductions
MiBench/telecomm-gsm - same transformed reductions
execute/GCC-C-execute-builtin-bitops-1 - same transformed reductions
CFP2006/433.milc - better vector code, several x i64 reductions + trunc
to i32 gets trunced to x i32 reductions
ImageProcessing/Blur - same transformed reductions
Benchmarks/7zip - same transformed reductions, extra 4 x vectorization
CINT2006/464.h264ref - same transformed reductions
CINT2017rate/525.x264_r
CINT2017speed/625.x264_s - same transformed reductions
CINT2017speed/600.perlbench_s
CINT2017rate/500.perlbench_r - transformed same reduction
JM/lencod - extra 4 x vectorization

RISC-V, SiFive-p670, -O3+LTO

Metric: size..text

Program                                                                                           size..text
                                                                                                  results    results0   diff
test-suite :: SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-builtin-bitops-1.test    8990.00    9514.00   5.8%
                                test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test  588504.00  588488.00  -0.0%
                    test-suite :: MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame.test  147464.00  147440.00  -0.0%
              test-suite :: MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan.test   21496.00   21492.00  -0.0%
                                     test-suite :: MicroBenchmarks/ImageProcessing/Blur/blur.test  165420.00  165372.00  -0.0%
                                    test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test  843928.00  843648.00  -0.0%
                                    test-suite :: External/SPEC/CINT2006/458.sjeng/458.sjeng.test  100712.00  100672.00  -0.0%
                      test-suite :: MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm.test   24384.00   24336.00  -0.2%
                             test-suite :: MultiSource/Benchmarks/mediabench/gsm/toast/toast.test   24380.00   24332.00  -0.2%
             test-suite :: SingleSource/UnitTests/Vectorizer/VPlanNativePath/outer-loop-vect.test   10348.00   10316.00  -0.3%
                                 test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test  221304.00  220480.00  -0.4%
                      test-suite :: SingleSource/Benchmarks/Shootout-C++/Shootout-C++-matrix.test    3750.00    3736.00  -0.4%
                            test-suite :: SingleSource/Regression/C/Regression-C-DuffsDevice.test     678.00     370.00 -45.4%

execute/GCC-C-execute-builtin-bitops-1 - extra 4 x reductions, same
transformed reductions
CINT2006/464.h264ref - extra 4 x reductions, same transformed reductions
MiBench/consumer-lame - 2 4 x i1 merged to 8 x i1 reductions (bitcast + ctpop)
MiBench/automotive-susan - same transformed reductions
ImageProcessing/Blur - same transformed reductions
Benchmarks/7zip - same transformed reductions
CINT2006/458.sjeng - 2 4 x i1 merged to 8 x i1 reductions (bitcast + ctpop)
MiBench/telecomm-gsm - same transformed reductions
Benchmarks/mediabench - same transformed reductions
Vectorizer/VPlanNativePath - same transformed reductions
Adobe-C++/loop_unroll - extra 4 x reductions, same transformed reductions
Benchmarks/Shootout-C++ - extra 4 x reductions, same transformed reductions
Regression/C/Regression-C-DuffsDevice - same transformed reductions

Reviewers: hiraditya, topperc, preames

Pull Request: https://github.com/llvm/llvm-project/pull/118293
2025-02-13 10:36:28 -05:00
Robert Imschweiler
41e49fadd4
[AMDGPU] Fix llvm.amdgcn.workitem.id-unsupported-calling-convention.ll (#127041)
Follow-up fix for #126058. (@arsenm)
2025-02-13 22:23:47 +07:00
Robert Imschweiler
0da8d0f9b7
[AMDGPU] Change handling of unsupported non-compute shaders with HSA (#126798)
Previous handling in `SITargetLowering::LowerFormalArguments` only
reported a diagnostic message and continued execution by returning a
non-usable `SDValue`. This results in llvm crashing later with an
unrelated error. This commit changes the detection of an unsupported
non-compute shader to be a fatal error right away.

As an example situation, take the usage of an `amdgpu_ps` function and
the `amdgcn-unknown-amdhsa` target triple.
```
define amdgpu_ps void @foo(ptr %p, i32 %i) {
        store i32 %i, ptr %p
        ret void
}
```
Compiling this code (with `llc -mtriple=amdgcn-unknown-amdhsa
-mcpu=gfx942`, for example) fails with:
```
error: <unknown>:0:0: in function foo void (ptr, i32): unsupported non-compute shaders with HSA

llc:
[...]/git/trunk21.0/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:11790:
void llvm::SelectionDAGISel::LowerArguments(const llvm::Function&):
Assertion `InVals.size() == Ins.size() && "LowerFormalArguments didn't emit the correct number of values!"' failed.
[...]
```
2025-02-13 22:23:08 +07:00
Donát Nagy
d2240cd314
[NFC] [analyzer] Add ArrayBound tests to document casting bug (#127062)
Add a few security.ArrayBound testcases that document the false
positives caused by the fact that the analyzer doesn't model a cast from
`signed char` to `unsigned char`.
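
For context, a sketch of the value conversion that goes unmodeled, assuming 8-bit chars. After the cast the value is always in 0..255, so it cannot be a negative index. The helper names are illustrative.

```python
def to_unsigned_char(value):
    """Convert a signed char value (-128..127) to unsigned char (0..255)."""
    if not -128 <= value <= 127:
        raise ValueError("not a signed char value")
    return value & 0xFF  # two's-complement reinterpretation of the low 8 bits


def index_in_bounds(index, size):
    """True if `index` is a valid index into an array of `size` elements."""
    return 0 <= index < size
```

An analysis that keeps the pre-cast range (-128..127) for the converted value would report a possible negative index into a 256-element array, even though no such index can occur.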
2025-02-13 16:09:09 +01:00
Alex Bradbury
62eddf4792 [docs] Fix typo in HowToAddABuilder 2025-02-13 15:03:52 +00:00
Dave Lee
277cb60d9a
[lldb] Use LLDB_LOG_ERROR in ObjectFilePECOFF.cpp (NFC) (#126972) 2025-02-13 07:03:00 -08:00
Alex Bradbury
db2953d801
[doc] Add Discord invite link alongside channel links (#126352)
By far the most important part of this patch is updating
GettingInvolved.rst to include the invite link, but I've grepped for any
other discord.com links.

I'm no Discord expert, but from my experience (confirmed via @preames
kindly testing as well) the direct channel links provide a confusing
experience if you haven't already found and used an invite link to the
LLVM Discord server. If you're logged into Discord but not a member of
LLVM's server, the web app opens and then...nothing. No channel opens, no
prompt to join the server or even a hint that you need to find an invite
link (and if you're not used to Discord, you likely don't even know
that's necessary).

This patch addresses the issue by providing the invite link where
Discord is mentioned.
2025-02-13 15:00:21 +00:00
Ivan Butygin
aecb764cc2
[mlir][gpu] GPUToROCDL/NVVM: use generic llvm conversion interface instead of hardcoded conversions. (#124439)
Using `ConvertToLLVMPatternInterface` removes hardcoded dialect
conversions from the passes and, more importantly, lets downstream
projects inject their own op/type translations by registering the
corresponding interface.

Add an `allowed-dialects` option so users can control which dialects can
be used to populate conversions.
2025-02-13 17:53:12 +03:00
Jay Foad
3e54964dcf
[AMDGPU] Simplify OtherPredicates handling in MadFmaMixPats. NFC. (#127044)
This removes some of the complexity added by ad6cd7e8b259 by setting
OtherPredicates outside MadFmaMixPats rather than inside it.
2025-02-13 14:51:42 +00:00
Alexey Bataev
7d1db31aa0 [SLP]Check the first instruction instead the first scalar for subvectors
Check the first instruction instead of the first scalar for
subvectors when trying to find a fully matched vectorized node in the
graph.

Fixes #126909.
2025-02-13 06:40:37 -08:00
Oleg Shyshkov
95bb61bba9 [mlir][bazel] Port 1935f84856a9297e725770e6f4b9c50fbcec365c 2025-02-13 14:36:02 +00:00
Nikita Popov
8600d89e55 [DSE] Add test for interaction with return-only captures (NFC)
Regression test for the miscompile reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577
2025-02-13 15:19:05 +01:00
Fabian Ritter
a33a84ee63
[AMDGPU][NFC] Replace gfx940 and gfx941 with gfx942 in llvm/test (#125711)
gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base.

This PR uses gfx942 instead of gfx940 and gfx941 in the test RUN-lines (unless there is already a RUN-line for gfx942).

The only notable difference in the test output is that gfx942 does not force the use of sc0 and sc1 on stores while gfx940 and gfx941 do (cf. https://reviews.llvm.org/D149986).

For SWDEV-512631
2025-02-13 15:17:12 +01:00
Joseph Huber
c4ed95c85b [Flang] Fix leftover use of 'OPT_nogpulib'
Summary:
This didn't show up as a failure in precommit and I don't build flang so
this slipped by.
2025-02-13 08:14:12 -06:00
Jan Leyonberg
f63e3b15f9
[Flang] Generate math ops for non-precise calls to acosh, asin, asinh and atanh intrinsic calls (#126932)
This patch changes the codegen for non-precise acosh, asin, asinh and
atanh calls to generate math ops instead. This wasn't done before
because the math dialect did not have the corresponding operations at
the time.
2025-02-13 09:10:46 -05:00
zhijian lin
7763119c6e
[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#116984)
ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated in
favor of ISD::UADDO_CARRY and ISD::USUBO_CARRY. This patch lowers UADDO,
UADDO_CARRY, USUBO and USUBO_CARRY.
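
A small sketch of the arithmetic behind the carry-based nodes, assuming 32-bit limbs. The carry is an explicit value passed from one add to the next. The Python function names are illustrative and do not model the PowerPC lowering itself.

```python
MASK32 = 0xFFFFFFFF


def uaddo_carry(a, b, carry_in):
    """Add two 32-bit values plus a carry-in bit; return (sum, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add64(a, b):
    """64-bit add built from two 32-bit limbs chained through the carry."""
    lo, carry = uaddo_carry(a & MASK32, b & MASK32, 0)
    hi, _ = uaddo_carry((a >> 32) & MASK32, (b >> 32) & MASK32, carry)
    return (hi << 32) | lo
```

Making the carry an ordinary result and operand is what lets wider additions be expressed as a chain of narrower ones.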
2025-02-13 09:09:17 -05:00
Joseph Huber
f6e3d33c00
[Clang][NFC] Introduce --offloadlib positive flag for nogpulib and alias to --no-offloadlib (#126567)
Summary:
We support `nogpulib` to disable implicit libraries. In the future we
will want to change the default linking of these libraries based on the
user language. This patch just introduces a positive variant so now we
can do `-nogpulib -gpulib` to disable it.

A later patch will make the default a variable in the ROCmToolChain,
depending on the target languages.
2025-02-13 07:59:08 -06:00
Jan Voung
27e78e68a6
[clang][dataflow] Remove a deprecated CachedConstAccessorsLattice API (#127001)
We've already migrated known users from the old to the new version of
getOrCreateConstMethodReturnStorageLocation. The conversion is pretty
straightforward as well, if there are out-of-tree users:

Previously: use CallExpr as argument
New: get the direct Callee from CallExpr, null check, and use that as
the argument instead.
2025-02-13 08:57:54 -05:00
Nikita Popov
1e64ea9914 Revert "[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit ee655ca27aad466bcc54f6eba03f7e564940ad5a.

A miscompilation has been reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577
2025-02-13 14:56:12 +01:00
Matt Arsenault
43780f4f92 RegAllocGreedy: Use Register type 2025-02-13 20:49:27 +07:00
Uday Bondhugula
6fa671f9e6
[MLIR][Affine] Fix fusion crash from memory space int assumption (#127032)
Fix a fusion crash caused by the assumption that memory spaces are
always int attributes.

Fixes: https://github.com/llvm/llvm-project/issues/118759
2025-02-13 19:03:08 +05:30
Qi Zhao
e8dba3a4ff [LoongArch] Add test for stackmaps. NFC 2025-02-13 20:58:26 +08:00