532919 Commits

Author SHA1 Message Date
Stephen Tozer
2334fd2ea3 [Dexter] Update Dexter tests to use new dexter test substitutions
Following commit b8fc288, which changed some dexter test substitutions to
be specific to C and C++, some tests that had been added since the original
patch was written were still using the old substitution; this patch updates
them to use the new ones.
2025-04-03 16:05:42 +01:00
gbMattN
61ef286506
Fix signed/unsigned mismatch warning (#134255) 2025-04-03 15:56:33 +01:00
Felipe de Azevedo Piovezan
c14b6e90bd
[lldb][NFC] Move ShouldShow/ShouldSelect logic into StopInfo (#134160)
This NFC patch simplifies the main loop in HandleProcessStateChanged
event by moving duplicated code into the StopInfo class, also allowing
StopInfo subclasses to override behavior.

More specifically, two functions are created:

* ShouldShow: should a Thread with such a StopInfo be printed when
the debugger stops? Currently, no StopInfo subclasses override this, but
a subsequent patch will fix a bug by making StopInfoBreakpoint check
whether the breakpoint is internal.

* ShouldSelect: should a Thread with such a StopInfo be selected? This
is currently overridden by StopInfoUnixSignal but will, in the future,
be overridden by StopInfoBreakpoint.
2025-04-03 07:41:29 -07:00
Sergio Afonso
f59b5b8d59
[MLIR][OpenMP] Fix standalone distribute on the device (#133094)
This patch updates the handling of target regions to set trip counts and
kernel execution modes properly, based on clang's behavior. This fixes a
race condition on `target teams distribute` constructs with no `parallel
do` loop inside.

This is how kernels are classified, after changes introduced in this
patch:

```f90
! Exec mode: SPMD.
! Trip count: Set.
!$omp target teams distribute parallel do
do i=...
end do

! Exec mode: Generic-SPMD.
! Trip count: Set (outer loop).
!$omp target teams distribute
do i=...
  !$omp parallel do private(idx, y)
  do j=...
  end do
end do

! Exec mode: Generic-SPMD.
! Trip count: Set (outer loop).
!$omp target teams distribute
do i=...
  !$omp parallel
    ...
  !$omp end parallel
end do

! Exec mode: Generic.
! Trip count: Set.
!$omp target teams distribute
do i=...
end do

! Exec mode: SPMD.
! Trip count: Not set.
!$omp target parallel do
do i=...
end do

! Exec mode: Generic.
! Trip count: Not set.
!$omp target
  ...
!$omp end target
```

For the split `target teams distribute + parallel do` case, clang
produces a Generic kernel which gets promoted to Generic-SPMD by the
openmp-opt pass. We can't currently replicate that behavior in flang
because our codegen for these constructs results in the introduction of
calls to the `kmpc_distribute_static_loop` family of functions, instead
of `kmpc_distribute_static_init`, which currently prevent promotion of
the kernel to Generic-SPMD.

For the time being, instead of relying on the openmp-opt pass, we look
at the MLIR representation to find the Generic-SPMD pattern and directly
tag the kernel as such during codegen. This is what we were already
doing, but incorrectly matching other kinds of kernels as such in the
process.
2025-04-03 15:41:00 +01:00
Jonas Devlieghere
51c2750599
[lldb] Update examples in docs/use/python-reference.rst to work with Python 3 (#134204)
The examples on this page were using the Python 2-style print. I ran the
updated code examples under Python 3 to confirm they are still
up-to-date.
2025-04-03 07:40:00 -07:00
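To illustrate the Python 2 to Python 3 print change described in 51c2750599, here is a minimal sketch. The `describe_frame` helper and its arguments are hypothetical and are not taken from python-reference.rst; they only serve to show the function-call form of `print`.

```python
# Python 2 accepted the statement form:   print "frame:", name
# Python 3 only accepts the function form used below.

def describe_frame(name, line):
    """Hypothetical helper: format a frame description as a string."""
    return "frame: %s at line %d" % (name, line)

# print is an ordinary function in Python 3, so the parentheses are required.
print(describe_frame("main", 42))
```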
Jake Egan
50fe5b90e7
[sanitizer_common][NFC] Fix sanitizer_symbolizer_libcdep.cpp formatting (#133930) 2025-04-03 10:39:49 -04:00
Stephen Tozer
b8fc288c46
[Dexter] Replace clang with clang++ in various cross project tests (#65987)
This patch replaces invocations of clang with clang++ for a set of
c++ files in the dexter cross-project tests. As a small additional change,
this patch removes -lstdc++ from a test that did not appear to require it.
2025-04-03 15:37:43 +01:00
Nick Sarnie
008040482b
[clang] Add SPIR-V to some OpenMP clang tests (#133503)
Just to get some more coverage.

Some of the behavior might be weird and change in the future, but let's
lock down what happens today to at least prevent regressions.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-04-03 14:36:46 +00:00
gbMattN
59074a3760
[ASan] Add metadata to renamed instructions so ASan doesn't use the incorrect name (#119387)

Clang needs variables to be represented with unique names. This means
that if a variable shadows another, it's given a different name
internally to ensure it has a unique name. If ASan tries to use this
name when printing an error, it will print the modified unique name,
rather than the variable's source code name.

Fixes #47326
2025-04-03 15:27:14 +01:00
Luke Lau
b61e3874fa Revert "[InstCombine] Match scalable splats in m_ImmConstant (#132522)"
This reverts commit df9e5ae5b40c4d245d904a2565e46f5b7ab9c7c8.

This is triggering an assertion failure on llvm-test-suite with
-enable-vplan-native-path:
https://lab.llvm.org/buildbot/#/builders/198/builds/3365
2025-04-03 15:16:56 +01:00
Sergio Afonso
18dd299fb1
[Flang][MLIR][OpenMP] Host-evaluation of omp.loop bounds (#133908)
This patch updates Flang lowering and kernel flags identification in
MLIR so that loop bounds on `target teams loop` constructs are evaluated
on the host, making the trip count available to the corresponding
`__tgt_target_kernel` call emitted for the target region.

This is necessary in order to properly execute these constructs as
`target teams distribute parallel do`.

Co-authored-by: Kareem Ergawy <kareem.ergawy@amd.com>
2025-04-03 15:06:19 +01:00
GeorgeKA
7145ead280
[Clang] Add warning message for C++17 alias template CTAD (#133806)
Alias template class template argument deduction is a documented C++20
feature. C++17 also happens to support it, but there is no message
output to indicate the officially supported version. This PR adds that.

Also updated relevant CTAD test cases.

Closes #125913
2025-04-03 15:58:52 +02:00
Aaron Puchert
9e0ca5720b
[X86] When expanding LCMPXCHG16B_SAVE_RBX, substitute RBX in base (#134109)
The pseudo-instruction LCMPXCHG16B_SAVE_RBX is used when RBX serves as
frame base pointer. At a very late stage it is then translated into a
regular LCMPXCHG16B, preceded by copying the actual argument into RBX,
and followed by restoring the register to the base pointer.

However, in case the `cmpxchg` operates on a local variable, RBX might
also be used as a base for the memory operand in frame finalization, and
we've overwritten RBX with the input operand for `cmpxchg16b`. So we
have to rewrite the memory operand base to use the saved value of RBX.

Fixes #119959.
2025-04-03 15:56:53 +02:00
Frank Schlimbach
586c5e3083
[mlir][mpi] fixing in-place and 0d mpi.all_reduce (#134225)
* inplace allreduce needs special MPI token MPI_IN_PLACE as send buffer
* 0d tensors have no sizes/strides in LLVM memref struct
2025-04-03 15:53:40 +02:00
Ramkumar Ramachandra
6bbdc70066
[LV] Use getCallWideningDecision in more places (NFC) (#134236) 2025-04-03 14:53:19 +01:00
Steven Perron
a77d807781
[SPIRV] Add spirv.VulkanBuffer types to the backend (#133475)
Adds code to expand the `llvm.spv.resource.handlefrombinding` and
`llvm.spv.resource.getpointer` when the resource type is
`spirv.VulkanBuffer`.

It gets expanded as a storage buffer or uniform buffer depending on the
storage class used.

This is implementing part of

https://github.com/llvm/wg-hlsl/blob/main/proposals/0018-spirv-resource-representation.md.
2025-04-03 09:44:07 -04:00
Simon Pilgrim
9df324e90b
[X86] Add growShuffleMask helper to grow the shuffle mask for a larger value type. NFC. (#134243)
Prep work for #133947
2025-04-03 14:41:24 +01:00
Anatoly Trosinenko
c818ae7399
[BOLT] Gadget scanner: detect non-protected indirect calls (#131899)
Implement the detection of non-protected indirect calls and branches
similar to pac-ret scanner.
2025-04-03 16:40:34 +03:00
Lukacma
ae8ad8649d
[Clang][AArch64] Model ZT0 table using inaccessible memory (#133727)
This patch changes how the ZT0 table is modelled at LLVM-IR level. Currently
accesses to ZT0 are represented at LLVM-IR level as memory reads and
writes. This patch changes that and models them as purely Inaccessible
memory accesses without any unmodeled side-effects.
2025-04-03 14:22:48 +01:00
Pavel Labath
0509932bb6 [lldb] Initialize active_row pointer variable
Its value is not set on all control flow paths. I believe this should
fix the failure on some buildbots after #133247.
2025-04-03 15:15:09 +02:00
Nikita Popov
efbbdd69c7
[ADT] Make DenseMap::init() private (NFC) (#134229)
I believe this method was not supposed to be public, as it has
additional preconditions (it will misbehave when called on a non-empty
DenseMap).

The public API for this is reserve().
2025-04-03 15:14:45 +02:00
Elen Kalda
c2355892a4
[mlir][tosa] Add ERROR_IF checks to TRANSPOSE_CONV2D verifier (#133234)
This patch extends the verifier with the following checks:

ERROR_IF(out_pad_top <= -KH || out_pad_bottom <= -KH);
ERROR_IF(out_pad_left <= -KW || out_pad_right <= -KW);
ERROR_IF(stride_y < 1 || stride_x < 1);
ERROR_IF(OH != (IH - 1) * stride_y + out_pad_top + out_pad_bottom + KH);
ERROR_IF(OW != (IW - 1) * stride_x + out_pad_left + out_pad_right + KW);
ERROR_IF(BC != OC && BC != 1);

Signed-off-by: Elen Kalda <elen.kalda@arm.com>
2025-04-03 14:04:28 +01:00
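The ERROR_IF conditions listed in c2355892a4 can be written out as a small checker. This is a sketch only: the function name, the argument order, and the returned message strings are assumptions made for illustration, and it is not the MLIR verifier itself. The symbols (IH, OH, KH, BC, OC and so on) follow the commit message.

```python
def transpose_conv2d_errors(IH, IW, OH, OW, KH, KW, BC, OC,
                            out_pad_top, out_pad_bottom,
                            out_pad_left, out_pad_right,
                            stride_y, stride_x):
    """Return a list naming every ERROR_IF condition that fires."""
    errors = []
    # Output padding must stay strictly greater than minus the kernel size.
    if out_pad_top <= -KH or out_pad_bottom <= -KH:
        errors.append("out_pad_top/bottom <= -KH")
    if out_pad_left <= -KW or out_pad_right <= -KW:
        errors.append("out_pad_left/right <= -KW")
    # Strides must be at least 1.
    if stride_y < 1 or stride_x < 1:
        errors.append("stride < 1")
    # Output height and width must match the shape formula.
    if OH != (IH - 1) * stride_y + out_pad_top + out_pad_bottom + KH:
        errors.append("OH mismatch")
    if OW != (IW - 1) * stride_x + out_pad_left + out_pad_right + KW:
        errors.append("OW mismatch")
    # Bias channels must equal the output channels or be broadcast from 1.
    if BC != OC and BC != 1:
        errors.append("BC mismatch")
    return errors


# A 4x4 input with a 3x3 kernel, stride 2 and no output padding
# gives (4 - 1) * 2 + 0 + 0 + 3 = 9 rows and columns.
ok = transpose_conv2d_errors(4, 4, 9, 9, 3, 3, 8, 8, 0, 0, 0, 0, 2, 2)
bad = transpose_conv2d_errors(4, 4, 8, 9, 3, 3, 8, 8, 0, 0, 0, 0, 2, 2)
```

With these inputs `ok` is empty and `bad` reports only the output height mismatch.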
Koakuma
ebacd46996
[SPARC][MC] Add tests for VIS family instructions
Also fix up any mistakes/typos in instruction definitions.

Reviewers: rorth, s-barannikov, brad0, MaskRay

Reviewed By: s-barannikov

Pull Request: https://github.com/llvm/llvm-project/pull/130967
2025-04-03 19:55:18 +07:00
Alex Bradbury
2a9948f038
Revert "[CLANG-CL] ignores wpadded" (#134239)
Reverts llvm/llvm-project#130182

This is causing failures on RISC-V and ppc builders as mentioned on
https://github.com/llvm/llvm-project/pull/130182#issuecomment-2775516899

Reverting so the issue can be fixed by the patch author without time pressure (as noted in that PR, it seems a value is uninitialised).
2025-04-03 13:47:59 +01:00
Simon Pilgrim
52f3cad9ff
[X86] getFauxShuffleMask - move INSERT_SUBVECTOR(SRC0, EXTRACT_SUBVECTOR(SRC1)) matching behind common one use bitcast checks (#134227)
No need to ignore one use checks for the INSERT_SUBVECTOR(SRC0, EXTRACT_SUBVECTOR(SRC1)) fold

Noticed while working on the #133947 regressions
2025-04-03 13:17:14 +01:00
Paul Walker
41a6bb4c05
[LLVM][CodeGen][SVE] Prefer NEON instructions when zeroing Z registers. (#133929)
Several implementations have zero-latency instructions to zero
registers. To date, no implementation has a dedicated SVE instruction but
we can use the NEON equivalent because it is defined to zero bits
128..VL regardless of the immediate used.

NOTE: The relevant instruction is not available in streaming mode, where
the original SVE DUP instruction remains in use.
2025-04-03 13:15:05 +01:00
Ilya Biryukov
722346c7bc
[Tooling] Handle AttributedType in getFullyQualifiedType (#134228)
Before this change the code used to add extra qualifiers, e.g.
`std::unique_ptr<int> _Nonnull` became `::std::std::unique_ptr<int>
_Nonnull`
when adding a global namespace qualifier was requested.
2025-04-03 14:14:34 +02:00
Michael Buch
739fe98080 [lldb][test] TestExprFromNonZeroFrame.py: fix windows build
On Windows this test was failing to link with the following error:
```

make: Entering directory 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/commands/expression/expr-from-non-zero-frame/TestExprFromNonZeroFrame.test'
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe -gdwarf -O0   -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/../../../../..//include -IC:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb/include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\commands\expression\expr-from-non-zero-frame -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make -include C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/test_common.h -fno-limit-debug-info   -MT main.o -MD -MP -MF main.d -c -o main.o C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\commands\expression\expr-from-non-zero-frame/main.c
C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe main.o -gdwarf -O0   -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/../../../../..//include -IC:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb/include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\commands\expression\expr-from-non-zero-frame -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make -include C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/test_common.h -fno-limit-debug-info     -fuse-ld=lld  --driver-mode=g++ -o "a.out"
lld-link: error: undefined symbol: printf
>>> referenced by main.o:(func)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [Makefile.rules:530: a.out] Error 1
make: Leaving directory 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/commands/expression/expr-from-non-zero-frame/TestExprFromNonZeroFrame.test'
```
2025-04-03 12:21:46 +01:00
Aaron Ballman
7febd78f1e
No longer diagnose __auto_type as the auto extension (#134129)
Given:

  __auto_type x = 12;
  decltype(auto) y = 12;

-Wc++98-compat would diagnose both x and y with:

'auto' type specifier is incompatible with C++98

This patch silences the diagnostic in those cases. decltype(auto) is
still diagnosed with:

'decltype(auto)' type specifier is incompatible with C++ standards
before C++14

as expected but no longer produces the extraneous diagnostic about use
of 'auto'.

Fixes #47900
2025-04-03 07:13:30 -04:00
Paul Walker
ee4e8197fa
[LLVM][AArch64][SVE] Mark DUP immediate instructions with isAsCheapAsAMove. (#133945)
Doing this means we'll regenerate an immediate rather than copy the
result of an existing one, reducing instruction dependency chains.
2025-04-03 11:42:07 +01:00
Simi Pallipurath
cb0d1305d1
[Clang][ARM] Ensure both -mno-unaligned-access and -munaligned-access are passed to multilib selection logic (#134099)
Previously, the alignment option was passed to multilib selection logic only
when -mno-unaligned-access was explicitly specified on the command line.

Now this change ensures both -mno-unaligned-access and -munaligned-access
are passed to the multilib selection logic, which now also considers the
target architecture when determining alignment access policy.
2025-04-03 11:16:05 +01:00
David Green
6c27817294
[SelectionDAG] Use SimplifyDemandedBits from SimplifyDemandedVectorElts Bitcast. (#133717)
This adds a call to SimplifyDemandedBits from bitcasts with scalar input
types in SimplifyDemandedVectorElts, which can help simplify the input
scalar.
2025-04-03 11:14:08 +01:00
Michael Buch
554f4d1a57
[lldb][Target] RunThreadPlan to save/restore the ExecutionContext's frame if one exists (#134097)
When using `SBFrame::EvaluateExpression` on a frame that's not the
currently selected frame, we would sometimes run into errors such as:
```
error: error: The context has changed before we could JIT the expression!
error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression
```

During expression parsing, we call `RunStaticInitializers`. On our
internal fork this happens quite frequently because any usage of, e.g.,
function pointers, will inject ptrauth fixup code into the expression.
The static initializers are run using `RunThreadPlan`. The
`ExecutionContext::m_frame_sp` going into the `RunThreadPlan` is the
`SBFrame` that we called `EvaluateExpression` on. LLDB then tries to
save this frame to restore it after the thread-plan ran (the restore
occurs by unconditionally overwriting whatever is in
`ExecutionContext::m_frame_sp`). However, if the `selected_frame_sp` is
not the same as the `SBFrame`, then `RunThreadPlan` would set the
`ExecutionContext`'s frame to a different frame than what we started
with. When we `PrepareToExecuteJITExpression`, LLDB checks whether the
`ExecutionContext` frame changed from when we initially
`EvaluateExpression`, and if it did, bails out with the error above.

One such test-case is attached. This currently passes regardless of the
fix because our ptrauth static initializers code isn't upstream yet. But
the plan is to upstream it soon.

This patch addresses the issue by saving/restoring the frame of the
incoming `ExecutionContext`, if such frame exists. Otherwise, fall back
to using the selected frame.

rdar://147456589
2025-04-03 11:10:16 +01:00
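The save/restore behaviour described in 554f4d1a57 can be modelled with a toy example. Everything here is hypothetical: `ExecutionContext` is a stand-in class that only tracks a frame, and `run_thread_plan` is not LLDB's `RunThreadPlan`. The sketch only shows the rule "restore the incoming context's frame if it exists, otherwise fall back to the selected frame".

```python
class ExecutionContext:
    """Toy stand-in for an execution context; it only tracks a frame."""

    def __init__(self, frame=None):
        self.frame = frame


def run_thread_plan(exe_ctx, selected_frame, plan):
    """Run plan(exe_ctx), then restore the frame we started with."""
    # Prefer the frame of the incoming context; fall back to the
    # selected frame only when the context has no frame of its own.
    saved = exe_ctx.frame if exe_ctx.frame is not None else selected_frame
    try:
        plan(exe_ctx)  # the plan may switch exe_ctx.frame
    finally:
        exe_ctx.frame = saved
    return exe_ctx.frame


def switching_plan(ctx):
    # Simulates a thread plan that leaves a different frame in the context.
    ctx.frame = "frame #0"


# The expression was evaluated on frame #2 while frame #0 is selected.
ctx = ExecutionContext("frame #2")
restored = run_thread_plan(ctx, "frame #0", switching_plan)
```

In this model `restored` is still `"frame #2"`, so a later "did the frame change?" check would pass.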
Yingwei Zheng
61907ebd76
[Clang][CodeGen] Do not use the GEP result to infer offset and result type (#134221)
If `CreateConstInBoundsGEP2_32` returns a constant null/gep, the cast to
GetElementPtrInst will fail.
This patch uses two static helpers
`GEPOperator::accumulateConstantOffset/GetElementPtrInst::getIndexedType`
to infer offset and result type instead of depending on the GEP result.

This patch is extracted from
https://github.com/llvm/llvm-project/pull/130734.
2025-04-03 18:03:42 +08:00
Camsyn
ecc35456d7
[Utils] Fix incorrect LCSSA PHI nodes when splitting critical edges with MergeIdenticalEdges (#131744)
This PR fixes incorrect LCSSA PHI node generation when splitting
critical edges with both
`PreserveLCSSA` and `MergeIdenticalEdges` enabled. The bug caused PHI
nodes in the split block
to miss predecessors when multiple identical edges were merged.
2025-04-03 12:02:03 +02:00
Simon Pilgrim
bf516098fb
[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV/VPERMV3 nodes if the upper elements are not demanded (#133923)
With AVX512VL targets, use 128/256-bit VPERMV/VPERMV3 nodes when we only need the lower elements.
2025-04-03 11:01:08 +01:00
Hsiangkai Wang
2e7ed78cff
[mlir][spirv] Add instruction OpGroupNonUniformRotateKHR (#133428)
Add an instruction under the extension SPV_KHR_subgroup_rotate.

The specification for the extension is here:

https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_subgroup_rotate.html
2025-04-03 11:00:29 +01:00
Pavel Labath
662d385c7b
[lldb/telemetry] Report exit status only once (#134078)
SetExitStatus can be called a second time when we reap the debug
server process. This shouldn't be interesting as at that point, we've
already told everyone that the process has exited.

I believe/hope this will also help with sporadic shutdown crashes that
have cropped up recently. They happen because the debug server is
monitored from a detached thread, so this code can be called after main
returns (and starts destroying everything). This isn't a real fix for
that though, as the situation can still happen (it's just that it
usually happens after the exit status has already been set). I think the
real fix for that is to make sure these threads terminate before we
start shutting everything down.
2025-04-03 11:59:02 +02:00
Vladislav Dzhidzhoev
094904303d Revert "[lldb] Return *const* UnwindPlan pointers from FuncUnwinders (#133247)"
This reverts commit d7afafdbc464e65c56a0a1d77bad426aa7538306.

Caused remote Linux to Linux buildbot failure
https://lab.llvm.org/buildbot/#/builders/195/builds/7046.
2025-04-03 11:33:11 +02:00
Jack Frankland
6f324bd39b
[mlir][tosa] Remove Convolution Type Verifiers (#134077)
Remove the test in the convolution verifier that checks the input and
output element types of convolution operations conform to the
constraints imposed by the TOSA 1.0 specification.

These checks are too strict for users of the TOSA dialect who wish to
allow more types than those allowed by the spec, and they cause
compatibility issues with earlier TOSA implementations which allowed more
type combinations.

Users who do wish to constrain the convolution type combinations to only
those allowed by the TOSA 1.0 spec should run the TOSA validation pass
which already performs these checks.

Signed-off-by: Jack Frankland <jack.frankland@arm.com>
2025-04-03 10:30:10 +01:00
Simon Pilgrim
6ec66a2292 [X86] Move VPERMV3(X,M,Y) -> VPERMV(M,CONCAT(X,Y)) fold after general VPERMV3 canonicalization
Pulled out of #133923 - this prevents regressions with SimplifyDemandedVectorEltsForTargetNode exposing VPERMV3(X,M,X) repeated operand patterns which were getting concatenated to wider VPERMV nodes before simpler canonicalizations could clean them up.
2025-04-03 10:24:02 +01:00
Romaric Jodin
7baa7edc00
[libclc]: clspv: add a dummy implementation for mul_hi (#134094)
clspv uses a better implementation that does not use a bigger size when
one is not available.
Add a dummy implementation for mul_hi to avoid overriding the
implementation of clspv with the one in libclc.
2025-04-03 10:18:39 +01:00
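For context on 7baa7edc00: `mul_hi` returns the upper half of a full-width product. The sketch below contrasts computing it through a wider intermediate with computing it from 16-bit halves, for unsigned 32-bit inputs. It assumes the message refers to a wider intermediate type and is a generic illustration; it is neither the clspv nor the libclc implementation, and both function names are made up here.

```python
MASK32 = 0xFFFFFFFF


def mul_hi_wide(a, b):
    """Reference: form the full 64-bit product and keep the upper 32 bits."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def mul_hi_narrow(a, b):
    """Same result, with every intermediate value fitting in 32 bits."""
    al, ah = a & 0xFFFF, (a >> 16) & 0xFFFF
    bl, bh = b & 0xFFFF, (b >> 16) & 0xFFFF
    lo_lo = al * bl
    hi_lo = ah * bl
    lo_hi = al * bh
    hi_hi = ah * bh
    # Collect everything that lands in bits 16..47 of the product.
    # The sum cannot exceed 2**32 - 1, so it fits in 32 bits.
    cross = (lo_lo >> 16) + (hi_lo & 0xFFFF) + lo_hi
    return (hi_hi + (hi_lo >> 16) + (cross >> 16)) & MASK32


samples = [0, 1, 0xFFFF, 0x10000, 0x12345678, 0x80000000, 0xFFFFFFFF]
agree = all(mul_hi_wide(x, y) == mul_hi_narrow(x, y)
            for x in samples for y in samples)
```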
Simon Pilgrim
edc22c64e5
[X86] getFauxShuffleMask - only handle VTRUNC nodes with matching src/dst sizes (#134161)
Cleanup work for #133947 - we need to handle VTRUNC nodes with large
source vectors directly to allow us to widen the size of the shuffle
combine

We currently discard these results in combineX86ShufflesRecursively
anyhow as we don't allow inputs from getTargetShuffleInputs to be larger
than the shuffle value type
2025-04-03 09:42:27 +01:00
Carlos Galvez
6333fa5160
[clang-tidy] Fix broken HeaderFilterRegex when read from config file (#133582)
PR https://github.com/llvm/llvm-project/pull/91400 broke the usage of
HeaderFilterRegex via config file, because it is now created at a
different point in the execution and leads to a different value.

The result of that is that using HeaderFilterRegex only in the config
file does NOT work, in other words clang-tidy stops triggering warnings
on header files, thereby losing a lot of coverage.

This patch reverts the logic so that the header filter is created upon
calling the getHeaderFilter() function.

Additionally, this patch adds 2 unit tests to prevent regressions in the
future:

- One of them, "simple", tests the most basic use case with a single
top-level .clang-tidy file.

- The second one, "inheritance", demonstrates that the subfolder only
gets warnings from headers within it, and not from parent headers.

Fixes #118009
Fixes #121969
Fixes #133453

Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
2025-04-03 09:28:34 +02:00
Dmitry Polukhin
e1aaee7ea2
[modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214)
Fix for regression #130917; the changes in #111992 were too broad. This change reduces the scope of the previous fix. Added `ExternalASTSource::wasThisDeclarationADefinition` to detect cases when a FunctionDecl lost its body due to declaration merges.
2025-04-03 08:27:13 +01:00
Yingwei Zheng
73e1710a4d
[SimplifyCFG] Remove unused variable. NFC. (#134211) 2025-04-03 15:22:51 +08:00
Juan Manuel Martinez Caamaño
041e84261a
[Clang][AMDGPU] Expose buffer load lds as a clang builtin (#132048)
CK is using either inline assembly or inline LLVM-IR builtins to
generate buffer_load_dword lds instructions.

This patch exposes this instruction as a Clang builtin available on gfx9 and gfx10.

Related to SWDEV-519702 and SWDEV-518861
2025-04-03 09:22:38 +02:00
Ryotaro Kasuga
91f3965be4
[LoopInterchange] Fix the vectorizable check for a loop (#133667)
In the profitability check for vectorization, the dependency matrix was
not handled correctly. This can result in a wrong decision: it may
say "this loop can be vectorized" when in fact it cannot. The root cause
of this is that the check process returns early when it finds '=' or 'I'
in the dependency matrix. To make sure that we can actually vectorize
the loop, we need to check all the rows of the matrix. This patch fixes
the process of checking whether we can vectorize the loop or not. Now it
won't make a wrong decision for a loop that cannot be vectorized.

Related: #131130
2025-04-03 16:21:19 +09:00
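The difference between the early return and the full scan described in 91f3965be4 can be shown with a simplified model. This is not the LoopInterchange code: the function names are hypothetical, the matrix is a plain list of rows, and '<' is used here as an assumed example of an entry that is neither '=' nor 'I'.

```python
def can_vectorize_early_return(dep_matrix, loop_id):
    """Flawed shape: answers 'yes' as soon as one row has '=' or 'I'."""
    for row in dep_matrix:
        if row[loop_id] in ("=", "I"):
            return True
    return False


def can_vectorize_all_rows(dep_matrix, loop_id):
    """Checks every row: each one must have '=' or 'I' for this loop."""
    return all(row[loop_id] in ("=", "I") for row in dep_matrix)


# Two dependences over a two-level loop nest; column 0 is the loop of interest.
# The first row is fine for that loop, the second one is not.
dep_matrix = [
    ["=", "<"],
    ["<", "="],
]
early = can_vectorize_early_return(dep_matrix, 0)
full = can_vectorize_all_rows(dep_matrix, 0)
```

Here `early` is `True` while `full` is `False`, which is the kind of wrong decision the commit message describes.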
Yingwei Zheng
b6c0ce0bb6
[IR][NFC] Use SwitchInst::defaultDestUnreachable (#134199) 2025-04-03 14:47:47 +08:00
Iris
3295970d84
[ConstantFolding] Add support for sinh and cosh intrinsics in constant folding (#132671)
Closes #132503.
2025-04-03 08:34:09 +02:00