497771 Commits

Author SHA1 Message Date
luolent
a98a6e95be
Add clarifying parenthesis around non-trivial conditions in ternary expressions. (#90391)
Fixes [#85868](https://github.com/llvm/llvm-project/issues/85868)

Parentheses are added as requested on ternary operators with non-trivial conditions.

I used this [precedence table](https://en.cppreference.com/w/cpp/language/operator_precedence) for reference, to make sure we get the expected behavior on each change.
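
As an illustration of the kind of change described above (not code taken from the patch), the sketch below shows the same ternary written with and without parentheses around its condition. The function names and the flag/mask parameters are hypothetical. Both spellings parse identically, because `!=` binds tighter than `?:`; the parentheses only make the grouping visible to the reader.

```cpp
#include <cassert>

// Hypothetical flag check without parentheses around the condition.
// `!=` has higher precedence than `?:`, so the condition is still the
// whole comparison `(flags & mask) != 0`.
int pickImplicit(unsigned flags, unsigned mask, int ifSet, int ifClear) {
  return (flags & mask) != 0 ? ifSet : ifClear;
}

// The same check with clarifying parentheses around the condition.
int pickExplicit(unsigned flags, unsigned mask, int ifSet, int ifClear) {
  return ((flags & mask) != 0) ? ifSet : ifClear;
}

// Sanity check that both spellings agree for one input.
void checkSameResult(unsigned flags, unsigned mask) {
  assert(pickImplicit(flags, mask, 1, 2) == pickExplicit(flags, mask, 1, 2));
}
```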
2024-05-04 18:38:45 +01:00
krzysdz
028f1b0781
[libc++] Fix P1206R7 feature test macros (#90914)
- Add missing `__cpp_lib_containers_ranges` feature test macro
- Constrain `__cpp_lib_ranges_to_container` to the `<ranges>` header,
since the standard does not list it in containers' headers

Ref:
- https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1206r7.pdf#section.18
- https://eel.is/c++draft/support.limits#lib:__cpp_lib_containers_ranges
- https://eel.is/c++draft/support.limits#lib:__cpp_lib_ranges_to_container
2024-05-04 18:23:49 +02:00
Karl-Johan Karlsson
cb015b9ec9
[clang][CodeGen] Propagate pragma set fast-math flags to floating point builtins (#90377)
This is a fix for issue #87758, where fast-math flags are not
propagated to all builtins.

It seems that pragmas with fast-math flags were only propagated to calls
of unary floating point builtins. This patch also propagates them to
binary and ternary floating point builtins.
2024-05-04 17:47:48 +02:00
Kazu Hirata
7ee6288312
[Support] Use StringRef::operator== instead of StringRef::equals (NFC) (#91042)
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.

- StringRef::operator== outnumbers StringRef::equals by a factor of 25
  under llvm/ in terms of their usage.

- The elimination of StringRef::equals brings StringRef closer to
  std::string_view, which has operator== but not equals.

- S == "foo" is more readable than S.equals("foo"), especially for
  !Long.Expression.equals("str") vs Long.Expression != "str".
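
The readability point in the last bullet can be sketched with `std::string_view`, which the commit message names as the model for `StringRef`. The helper names below are hypothetical and the snippet does not use LLVM itself; it only shows the operator spelling that the change moves towards.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical lookup over a small fixed set of option names, written with
// operator== as std::string_view (and StringRef) allow.
int optionIndex(std::string_view name) {
  assert(!name.empty() && "option name must not be empty");
  if (name == "foo")
    return 0;
  if (name == "bar")
    return 1;
  return -1;
}

// The negated form reads as a plain inequality rather than !x.equals("str").
bool isNotStr(std::string_view name) { return name != "str"; }
```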
2024-05-04 08:46:48 -07:00
Matt Stephanson
76aa042dde
[libc++] Adjust some of the [rand.dist] critical values that are too strict (#88669)
- Most critical values are determined empirically by running each test 51
  times with a different PRNG seed and finding the smallest symmetric
  interval around the median that contains 90% of the sample means,
  variances, etc.

- For the Kolmogorov-Smirnov tests, the alpha=0.1 critical value for large
  N is 1.224/sqrt(N).

- For normally distributed variates, the sample kurtosis is distributed as
  Normal(0, 24/N). For N=1e5, this gives a 90% confidence interval of
  0+/-0.0255. For Binomial(40, 0.25), which is approximately normal, the
  kurtosis is -0.0167, so the relative 90% CI is large, on the order of
  0.0255/0.0167 = 153%. In most cases the distribution of the sample
  kurtosis isn't known analytically, but similarly large relative
  tolerances can be expected if the kurtosis is near zero.
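
The arithmetic in the bullets above can be reproduced with a short sketch. The function names are hypothetical and are not part of the libc++ tests. The only inputs not stated in the commit message are the standard-normal 95th percentile (about 1.645, which gives a two-sided 90% interval) and the textbook formula for the excess kurtosis of a binomial distribution.

```cpp
#include <cassert>
#include <cmath>

// Large-N Kolmogorov-Smirnov critical value at alpha = 0.1, using the
// 1.224/sqrt(N) rule quoted in the commit message.
double ksCriticalValue(double n) {
  assert(n > 0.0);
  return 1.224 / std::sqrt(n);
}

// Half-width of the two-sided 90% interval for the sample excess kurtosis of
// normal variates, assuming it is distributed as Normal(0, 24/N).
// 1.645 is the 95th percentile of the standard normal distribution.
double kurtosisHalfWidth90(double n) {
  assert(n > 0.0);
  return 1.645 * std::sqrt(24.0 / n);
}

// Excess kurtosis of Binomial(trials, p):
// (1 - 6p(1-p)) / (trials * p * (1-p)).
double binomialExcessKurtosis(double trials, double p) {
  assert(trials > 0.0 && p > 0.0 && p < 1.0);
  const double pq = p * (1.0 - p);
  return (1.0 - 6.0 * pq) / (trials * pq);
}
```

With N=1e5 the half-width comes out near 0.0255, and Binomial(40, 0.25) gives an excess kurtosis near -0.0167, so their ratio is roughly 1.53, matching the 153% figure quoted above.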
2024-05-04 13:59:38 +02:00
Simon Pilgrim
caacf8685a
[DAG] Fold freeze(shuffle(x,y,m)) -> shuffle(freeze(x),freeze(y),m) (#90952)
If the shuffle mask contains no undef elements, then we can move the freeze through a shuffle node.

This requires special case handling to create a new ShuffleVectorSDNode.

Includes VECTOR_SHUFFLE support for isGuaranteedNotToBeUndefOrPoison / canCreateUndefOrPoison.
2024-05-04 12:03:10 +01:00
orbiri
1e3c630fd1
[MLIR] Extend floating point parsing support (#90442)
Parsing support for floating point types was missing a few features:
1. Parsing floating point attributes from integer literals was supported
only for types with a bitwidth less than or equal to 64.
2. Downstream users could not use `AsmParser::parseFloat` to parse float
types which are printed as integer literals.

This commit addresses both these points. It extends
`Parser::parseFloatFromIntegerLiteral` to support arbitrary bitwidth,
and exposes a new API to parse arbitrary floating point given an
fltSemantics as input. The usage of this new API is introduced in the
Test Dialect.
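
To illustrate what parsing a float from an integer literal means, here is a minimal sketch that assumes the literal carries the raw IEEE-754 bit pattern of the value. It covers only 32-bit floats and uses hypothetical helper names; it is not the MLIR `AsmParser` API and says nothing about how the commit handles arbitrary bitwidths or `fltSemantics`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Reinterpret a 32-bit pattern as a float without violating aliasing rules.
float floatFromBits(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Hypothetical helper: read a hexadecimal integer literal such as
// "0x3F800000" and treat it as the bit pattern of a float.
float floatFromHexLiteral(const std::string &text) {
  assert(text.size() > 2 && text[0] == '0' &&
         (text[1] == 'x' || text[1] == 'X'));
  return floatFromBits(
      static_cast<std::uint32_t>(std::stoul(text, nullptr, 16)));
}
```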
2024-05-04 12:58:14 +02:00
Nikita Kniazev
294eecd4cb
[clang][docs] fix rendering issue in UsersManual.rst (#90308) 2024-05-04 09:57:38 +02:00
Kristof Beyls
554459a02f
[BOLT] Fix runOnEachFunctionWithUniqueAllocId (#90039)
When runOnEachFunctionWithUniqueAllocId is invoked with
ForceSequential=true, then the current implementation runs the function
with AllocId==0, which is the Id for the shared, non-unique, default
AnnotationAllocator.

However, the documentation for runOnEachFunctionWithUniqueAllocId
states:
```
/// Perform the work on each BinaryFunction except those that are rejected
/// by SkipPredicate, and create a unique annotation allocator for each
/// task. This should be used whenever the work function creates annotations to
/// allow thread-safe annotation creation.
```

Therefore, even when ForceSequential==true, a unique AllocId should be
used, i.e. different from 0.

In the current upstream BOLT this is presumably not depended on, but it
is needed to reduce memory usage for analyses that use a lot of
memory/annotations. Examples are the pac-ret and stack-clash analyses
that currently have prototype implementations as described in
https://discourse.llvm.org/t/rfc-bolt-based-binary-analysis-tool-to-verify-correctness-of-security-hardening/78148
These analyses use the DataFlowAnalysis framework to sometimes store
quite a lot of information on each MCInst. They run in parallel on each
function. When the dataflow analysis is finished, the annotations on
each MCInst can be removed, hugely saving on memory consumption. The
only annotations that need to remain are those that indicate some
unexpected properties somewhere in the binary.

Fixing this bug makes it possible to free the memory used by that huge
number of DataFlowAnalysis annotations (by invoking
BC.MIB->freeValuesAllocator(AllocatorId)), even when run with
--no-threads. Without the fix, invoking
BC.MIB->freeValuesAllocator(AllocatorId) also frees the memory for
all other annotations, as AllocatorId is 0.

---------

Co-authored-by: Maksim Panchenko <maks@meta.com>
2024-05-04 09:26:35 +02:00
Andreas Jonson
1343e68862
[C API] Add function to create ConstantRange attributes to C API (#90505) 2024-05-04 16:01:59 +09:00
Nikita Popov
f16e234f11
[InstCombine] Do not request non-splat vector support in code reviews (NFC) (#90709)
The InstCombine contributor guide already says:

> Handle non-splat vector constants if doing so is free, but do
> not add handling for them if it adds any additional complexity
> to the code.

This change strengthens this guideline to explicitly discourage
asking (new) contributors to implement non-splat support during code
reviews. Doing so will almost certainly increase the number of
necessary review iterations, or result in outright contradictory review
feedback, as different people are willing to accept a different degree
of complexity for non-splat vector support.
2024-05-04 16:01:36 +09:00
Patrick O'Neill
96aac6798b
[lld] Error on unsupported split stack (#88063)
Targets with no `-fsplit-stack` support now emit `ld.lld: error: target
doesn't support split stacks` instead of `UNREACHABLE executed` with a
backtrace asking the user to report a bug.

Resolves #88061
2024-05-03 23:59:30 -07:00
Rafael Ubal
a42a2ca19b
Avoid buffer hoisting from parallel loops (#90735)
This change corrects an invalid behavior in pass
`--buffer-loop-hoisting`. The pass is in charge of extracting buffer
allocations (e.g., `memref.alloca`) from loop regions (e.g., `scf.for`)
when possible. This works OK for loops with sequential execution
semantics. However, a buffer allocated in the body of a parallel loop
may be concurrently accessed by multiple threads, each storing its own
local data. Extracting such a buffer from the loop causes all threads to
wrongly share the same memory region.

In the following example, dimension 1 of the input tensor is reversed.
Dimension 0 is traversed with a parallel loop.

```
func.func @f(%input: memref<2x3xf32>) -> memref<2x3xf32> {
  %c0 = index.constant 0
  %c1 = index.constant 1
  %c2 = index.constant 2
  %c3 = index.constant 3

  %output = memref.alloc() : memref<2x3xf32>
  scf.parallel (%index) = (%c0) to (%c2) step (%c1) {
    // Create subviews for working input and output slices
    %input_slice = memref.subview %input[%index, 2][1, 3][1, -1] : memref<2x3xf32> to memref<1x3xf32, strided<[3, -1], offset: ?>>
    %output_slice = memref.subview %output[%index, 0][1, 3][1, 1] : memref<2x3xf32> to memref<1x3xf32, strided<[3, 1], offset: ?>>

    // Copy the input slice into this temporary buffer. This intermediate
    // copy is unnecessary, but is used for illustration purposes.
    %temp = memref.alloc() : memref<1x3xf32>
    memref.copy %input_slice, %temp : memref<1x3xf32, strided<[3, -1], offset: ?>> to memref<1x3xf32>

    // Copy temporary buffer into output slice
    memref.copy %temp, %output_slice : memref<1x3xf32> to memref<1x3xf32, strided<[3, 1], offset: ?>>
    scf.reduce
  }

  return %output : memref<2x3xf32>
}
```

The patch submitted here prevents `%temp = memref.alloc() :
memref<1x3xf32>` from being hoisted when the containing op is
`scf.parallel` or `scf.forall`. A new op trait called
`HasParallelRegion` is introduced and assigned to these two ops to
indicate that their regions have parallel execution semantics.
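
The hazard can be modelled without real threads. The sketch below is a deterministic simulation written for illustration, not code from the pass: it replays one possible interleaving of the two loop iterations from the example above, in which both iterations fill their temporary buffer before either one writes its output row. All names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Row = std::array<float, 3>;
using Matrix = std::array<Row, 2>;

// Step 1 of an iteration: copy the reversed input row into the temp buffer.
void loadReversed(const Row &in, Row &temp) {
  for (std::size_t j = 0; j < in.size(); ++j)
    temp[j] = in[in.size() - 1 - j];
}

// Step 2 of an iteration: copy the temp buffer into the output row.
void store(const Row &temp, Row &out) { out = temp; }

// Replays the interleaving "step 1 of both iterations, then step 2 of both".
// With hoistTemp == true the iterations share one buffer, as they would
// after the allocation is hoisted out of the parallel loop.
Matrix runInterleaved(const Matrix &input, bool hoistTemp) {
  Matrix output{};
  Row shared{};                  // the hoisted allocation
  std::array<Row, 2> perIter{};  // one allocation per iteration
  Row &t0 = hoistTemp ? shared : perIter[0];
  Row &t1 = hoistTemp ? shared : perIter[1];
  assert(hoistTemp == (&t0 == &t1)); // buffers alias only when hoisted
  loadReversed(input[0], t0);
  loadReversed(input[1], t1); // overwrites iteration 0's data when hoisted
  store(t0, output[0]);
  store(t1, output[1]);
  return output;
}
```

With per-iteration buffers each output row is the reverse of its own input row. With the shared buffer, row 0 of the output ends up holding the reversed row 1, which is the wrong sharing the patch prevents.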

@joker-eph @ftynse @nicolasvasilache @sabauma
2024-05-04 08:35:36 +02:00
Joseph Huber
1022636b0c
[libc] Fix assert dependency on macro header (#91036)
Summary:
This file was missing a dependency so it wasn't being installed.
2024-05-03 20:57:34 -05:00
paperchalice
e7939d0df6
[Instrumentation] Support verifying machine function (#90931)
We need it to test isel-related passes. Currently
`verifyMachineFunction` is incomplete (no LiveIntervals support), but it
is enough for testing isel passes. It will migrate to the complete
`MachineVerifierPass` in the future.
2024-05-04 09:00:59 +08:00
Maksim Levental
b958ef1948
Update GettingInvolved.rst (#91008) 2024-05-03 19:02:28 -05:00
Fangrui Song
666679a559 [flang] Fix -Wunused-but-set-variable in lib/Evaluate 2024-05-03 16:51:35 -07:00
Johannes Doerfert
cd3a4c31bc
[Attributor][NFC] update tests (#91011) 2024-05-03 16:38:55 -07:00
Craig Topper
0c7e706c08 [AArch64] Pre-commit another test case for #90936. NFC
Another similar problem was added to the ticket after the first fix.
2024-05-03 16:29:47 -07:00
Vitaly Buka
a441645f80
[tsan] Don't crash on vscale (#91018)
Co-authored-by: Heejin Ahn <aheejin@gmail.com>
2024-05-03 16:29:26 -07:00
Teresa Johnson
e5cbe8fd9c
[MemProf] Optionally match profiles on to manually hinted hot/cold new (#91027)
While we don't currently rewrite the hints on manually hot/cold hinted
allocations, enable optionally matching profiles onto those allocations
as a first step to being able to do this.

Explicitly checking whether the library function is in the list of
operator new also fixes one limitation of the prior call to isNewLikeFn.
Some operator new calls (those that specify nothrow) are considered
Malloc-like because they may return null. We want to be able to match
and rewrite these. Therefore the new test uses a nothrow variant to test
the fix for this as well.
2024-05-03 16:18:04 -07:00
Benoit Jacob
b05a12e9d0
Let memref.expand_shape implement ReifyRankedShapedTypeOpInterface (#90975)
This is a new take on #89111. Now that #90040 is merged, this has become
trivial to implement. The added test shows the kind of benefit that we
get from this: now dim-of-expand-shape naturally folds without us
needing to implement an ad-hoc folding rewrite.
2024-05-03 18:33:01 -04:00
Heejin Ahn
5d81b1c50a
[WebAssembly] Add all remaining features to bleeding-edge (#90875)
I'm not entirely sure what the criteria for 'bleeding-edge' used to be,
but at this point it seems to be the set of all added features in LLVM.
This adds the remaining features to the bleeding-edge config.
2024-05-03 14:08:22 -07:00
Krystian Stasiowski
3191e0b527
[Clang][Sema] Fix template name lookup for operator= (#90999)
This fixes a bug in #90152 where `operator=` was never looked up in the
current instantiation, resulting in `<` never being interpreted as the
start of a template argument list.

Since function templates are not copy/move assignment operators, the fix
is accomplished by allowing lookup in the current instantiation for
`operator=` when looking up a template name.
2024-05-03 17:07:52 -04:00
Alexey Bataev
03972261a9 [SLP]Fix PR90892: do a correct sign analysis of the entries elements in gather shuffles.
Need to do extra analysis of the scalar elements of the tree entry to be
shuffled instead of the vectorized value to correctly deduce signedness
info.
2024-05-03 14:01:25 -07:00
Reid Kleckner
48039b195b Revert "[gn] port 2d4acb086541 (LLVM_ENABLE_CURL)"
This reverts commit 0558c7e01db81b3ac307fe59737fefd8bd060873 to match
the revert of 2d4acb086541 in 327bfc971e4dce3f6798843c92406cda95c07ba1
2024-05-03 20:59:37 +00:00
Reid Kleckner
385faf9cde
[ARM/X86] Standardize the isEligibleForTailCallOptimization prototypes (#90688)
Pass in CallLoweringInfo (CLI) instead of passing in the various fields
directly. Also pass in CCState (CCInfo), which is computed in both the
caller and the callee for a minor efficiency saving. There may also be a
small correctness improvement for sibcalls with vectorcall, which has an
odd way of recomputing argument locations.

This is a step towards improving the handling of musttail on armv7,
which we have numerous issues filed about in our tracker.

I took inspiration for this from the RISCV tail call eligibility check,
which uses a similar prototype.
2024-05-03 13:56:55 -07:00
Alexey Bataev
9620d3ee3e [SLP][NFC]Add a test with incorrect casting of shuffled gathered values, NFC. 2024-05-03 13:54:51 -07:00
Reid Kleckner
4e6d30e2c1
[clang] Note that optnone and target attributes do not apply to nested functions (#82815)
This behavior is true for all attributes, but this behavior can be
surprising for attributes which have function-wide effects, such as
`optnone` and `target`. Most other function attributes affect the
prototype or semantics, but do not affect code generation in the
function body. I believe it is worth calling this out in the
documentation of these function-wide attributes. There may be more,
these were the two that came to mind.
2024-05-03 13:50:27 -07:00
jeffreytan81
b8d38bb56d
Fix dap variable value format issue (#90799)
While adding a UI feature in VSCode to toggle hex/dec in the variables
view window, I noticed that it does not work after the second toggle. The
cause is a bug: we only explicitly set the hex format and never reset it
back to the default on later toggles. The new test demonstrates the bug.

This PR resets the format back to the default if not using hex. One
complication is that we explicitly set the register value format to
AddressInfo, which shouldn't be overridden by the default or hex settings.

---------

Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
2024-05-03 13:36:23 -07:00
Jorge Gorbe Moya
2cde0e2f97 Revert "[BasicBlockUtils] Remove redundant llvm.dbg instructions after blocks to reduce compile time (#89069)"
This reverts commit 2e3e0868748635b779ba89a772eae3664bd822e4. It caused
quadratic slowdown at compilation time in some cases. See the comments
in the original PR: https://github.com/llvm/llvm-project/pull/89069
2024-05-03 13:05:08 -07:00
Chris B
9299a136dc
[DirectX] Remove unneccary check lines (#90979)
These check lines break as of 91446e2aa687e due to changes in how LLVM
handles debug information. Since debug information isn't important to
what this test is verifying, we can remove the check lines.
2024-05-03 14:54:21 -05:00
Matt Arsenault
7ec698e6ed
AMDGPU: Add tests for minimum and maximum intrinsics (#90997)
Baseline tests for new expansion. I think we can do better and avoid the
classes.
2024-05-03 21:43:30 +02:00
whisperity
3cf574da40
[clang-tidy][NFC] Document CERT rule coverage and aliases for some primary checks (#90965) 2024-05-03 21:40:36 +02:00
Jonas Devlieghere
ca8b064973
Revert "[lldb] Unify CalculateMD5 return types" (#90998)
Reverts llvm/llvm-project#90921
2024-05-03 12:14:45 -07:00
Noah Goldstein
f561daf989 [InstCombine] Add example usage for new Checked matcher API
There is no real motivation for this change other than to highlight a
case where the new `Checked` matcher API can handle non-splat-vecs
without increasing code complexity.

Closes #85676
2024-05-03 14:10:24 -05:00
Noah Goldstein
1708788d2d [InstCombine] Add non-splat test for (icmp (lshr x, y), x); NFC 2024-05-03 14:10:24 -05:00
Noah Goldstein
d8428dfeb8 [PatternMatching] Add generic API for matching constants using custom conditions
The new API is:
    `m_CheckedInt(Lambda)`/`m_CheckedFp(Lambda)`
        - Matches non-undef constants s.t `Lambda(ele)` is true for all
          elements.
    `m_CheckedIntAllowUndef(Lambda)`/`m_CheckedFpAllowUndef(Lambda)`
        - Matches constants/undef s.t `Lambda(ele)` is true for all
          elements.

The goal with these is to be able to replace the common usage of:
```
    match(X, m_APInt(C)) && CustomCheck(C)
```
with
```
    match(X, m_CheckedInt(C, CustomChecks))
```

The rationale is that we often ignore non-splat vectors because there
are no good APIs to handle them with and it's not worth increasing code
complexity for such cases.

The hope is that the API creates a common method for handling
scalars/splat-vecs/non-splat-vecs to essentially make this a
non-issue.
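
The idea of one predicate-based check that covers scalars, splats and non-splat vectors can be sketched outside LLVM. Everything below is a hypothetical stand-in: `ConstantLanes` models a constant as a list of lanes with `std::nullopt` for an undef lane, and `matchChecked` is not the `m_CheckedInt` implementation, only an illustration of "the lambda must hold for every element".

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical model of a constant: one entry per lane, std::nullopt for an
// undef lane. A scalar is a single lane; a splat repeats the same value.
using ConstantLanes = std::vector<std::optional<long long>>;

// Returns true when pred holds for every defined lane. Undef lanes make the
// match fail unless allowUndef is set, mirroring the two API flavours
// described in the commit message.
template <typename Pred>
bool matchChecked(const ConstantLanes &lanes, Pred pred, bool allowUndef) {
  assert(!lanes.empty() && "a constant has at least one lane");
  for (const std::optional<long long> &lane : lanes) {
    if (!lane) {
      if (!allowUndef)
        return false;
      continue;
    }
    if (!pred(*lane))
      return false;
  }
  return true;
}
```

A caller passes the custom condition as a lambda, for example a power-of-two test, and gets the same answer path whether the constant is a scalar, a splat such as `{4, 4}`, or a non-splat vector such as `{2, 4, 8}`.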
2024-05-03 14:10:24 -05:00
Noah Goldstein
285dbed147 [Inliner] Propagate callee argument memory access attributes before inlining
To avoid losing information, we can propagate some access attribute
from the to-be-inlined callee to its callsites.

We can propagate argument memory access attributes to callsite
parameters if they are from the same underlying object.

Closes #89024
2024-05-03 14:10:24 -05:00
Noah Goldstein
f8ff51e1b0 [Inliner] Add tests for not propagating writable if readonly is present; NFC 2024-05-03 14:10:24 -05:00
Joseph Huber
70b79a9ccd
[AMDGPU] Allow the __builtin_flt_rounds functions on AMDGPU (#90994)
Summary:
Previous patches added support for the LLVM rounding intrinsic
functions. This patch allows them to be emitted using the clang builtins
when targeting AMDGPU.
2024-05-03 14:01:09 -05:00
Anthony Ha
2f58b9aae2
[lldb] Unify CalculateMD5 return types (#90921)
# Overview
In my previous PR: https://github.com/llvm/llvm-project/pull/88812,
@JDevlieghere suggested to match return types of the various calculate
md5 functions.

This PR achieves that by changing the various calculate md5 functions to
return `llvm::ErrorOr<llvm::MD5::MD5Result>`.
 
The suggestion was to go for `std::optional<>` but I opted for
`llvm::ErrorOr<>` because local calculate md5 was already possibly
returning `ErrorOr`.

To make sure I didn't break the md5 calculation functionality, I ran
some tests for the gdb remote client, and things seem to work.

# Testing
1. Remote file doesn't exist

![image](https://github.com/llvm/llvm-project/assets/1326275/b26859e2-18c3-4685-be8f-c6b6a5a4bc77)

2. Remote file differs

![image](https://github.com/llvm/llvm-project/assets/1326275/cbdb3c58-555a-401b-9444-c5ff4c04c491)

3. Remote file matches

![image](https://github.com/llvm/llvm-project/assets/1326275/07561572-22d1-4e0a-988f-bc91b5c2ffce)

## Test gaps
Unfortunately, I had to modify
`lldb/source/Plugins/Platform/MacOSX/PlatformDarwinDevice.cpp` and I
can't test the changes there. Hopefully, the existing test suite / code
review from whoever is reading this will catch any issues.

Co-authored-by: Anthony Ha <antha@microsoft.com>
2024-05-03 11:51:25 -07:00
Valentin Clement (バレンタイン クレメン)
f8a9973f8c
[flang][cuda] Add verifier for cuda_alloc/cuda_free (#90983)
Adding a verifier to check the associated cuda attribute.
2024-05-03 11:25:34 -07:00
David Green
a4d10266d2
[VectorCombine] Add foldShuffleToIdentity (#88693)
This patch adds a basic version of a combine that attempts to remove
shuffles that when combined simplify away to an identity shuffle. For
example:
  %ab = shufflevector <8 x half> %a, <8 x half> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  %at = shufflevector <8 x half> %a, <8 x half> poison, <4 x i32> <i32 7, i32 6, i32 5, i32 4>
  %abt = fneg <4 x half> %at
  %abb = fneg <4 x half> %ab
  %r = shufflevector <4 x half> %abt, <4 x half> %abb, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>
By looking through the shuffles and fneg, it can be simplified to:
  %r = fneg <8 x half> %a

The code tracks each lane starting from the original shuffle, keeping
track of a vector of {src, idx}. As we propagate up through the
instructions we will either look through intermediate instructions
(binops and unops) or see a collection of lanes that all have the same
src and incrementing idx (an identity). We can also see a single value
with identical lanes, which we can treat like a splat.

Only the basic version is added here, handling identities, splats,
binops and unops. In follow-up patches other instructions can be added
such as constants, intrinsics, cmp/sel and zext/sext/trunc.
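
The lane tracking can be sketched for the example above. This is a simplified model with hypothetical names, not the VectorCombine code: every lane is recorded only as an index into `%a` (a single source), and `fneg` is treated as transparent because it acts lane by lane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each entry is the lane of the original source vector %a that ends up in
// this position.
using Lanes = std::vector<int>;

// Resolve a two-operand shuffle mask against the source lanes of its
// operands: mask values below lhs.size() select from lhs, the rest from rhs.
Lanes lookThroughShuffle(const Lanes &lhs, const Lanes &rhs,
                         const std::vector<int> &mask) {
  const int lhsSize = static_cast<int>(lhs.size());
  const int total = lhsSize + static_cast<int>(rhs.size());
  Lanes result;
  result.reserve(mask.size());
  for (int m : mask) {
    assert(m >= 0 && m < total && "mask index out of range");
    result.push_back(m < lhsSize ? lhs[m] : rhs[m - lhsSize]);
  }
  return result;
}

// An identity means lane i of the result is lane i of the source.
bool isIdentity(const Lanes &lanes) {
  for (std::size_t i = 0; i < lanes.size(); ++i)
    if (lanes[i] != static_cast<int>(i))
      return false;
  return true;
}
```

For the example, `%abt` carries source lanes `{7, 6, 5, 4}` and `%abb` carries `{3, 2, 1, 0}`. Resolving the final mask `{7, 6, 5, 4, 3, 2, 1, 0}` against them yields lanes `0..7` in order, so the shuffles cancel and only the `fneg` on `%a` remains.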
2024-05-03 19:14:38 +01:00
annamthomas
46c2d93662
[StandardInstrumentation] Annotate loops with the function name (#90756)
When analyzing pass debug output it is helpful to have the function name
along with the loop name.
2024-05-03 14:13:59 -04:00
Jonas Devlieghere
a8fbe500fe
[lldb] Add TeeLogHandler to log to 2 handlers (#90984)
Add a T-style log handler that multiplexes messages to two log handlers.
The goal is to use this in combination with the SystemLogHandler to log
messages both to the user requested file as well as the system log. The
latter is part of a sysdiagnose on Darwin which is commonly attached to
bug reports.
2024-05-03 11:08:50 -07:00
Alex Langford
e2b3e4ea9f
[lldb][NFCI] Unify DW_TAG -> string conversions (#90657)
The high level goal is to have 1 way of converting a DW_TAG value into a
human-readable string.

There are 3 ways this change accomplishes that:
1.) Changing DW_TAG_value_to_name to not create custom error strings.
  The way it was doing this is error-prone: Specifically, it was using a
  function-local static char buffer and handing out a pointer to it.
  Initialization of this is thread-safe, but mutating it is definitely
  not. Multiple threads that want to call this function could step on
  each other's toes. The implementation in this patch sidesteps the issue
  by just returning a StringRef with no mention of the tag value in it.
2.) Changing all uses of DW_TAG_value_to_name to log the value of the
  tag since the function doesn't create a string with the value in it
  anymore.
3.) Removing `DWARFBaseDIE::GetTagAsCString()`. Callers should call
  DW_TAG_value_to_name on the tag directly.
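
The shape of the change in points 1 and 2 can be sketched as follows. The function names are hypothetical and the table covers only three tags; the tag values themselves are the standard DWARF encodings. The name lookup returns a view of a string literal, so there is no shared mutable buffer, and the caller that logs is the one that formats the numeric value.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <string_view>

// Returns the name for a few known tags and an empty view otherwise. Only
// string literals are handed out, so concurrent callers cannot interfere.
std::string_view tagName(std::uint16_t tag) {
  switch (tag) {
  case 0x11:
    return "DW_TAG_compile_unit";
  case 0x2e:
    return "DW_TAG_subprogram";
  case 0x34:
    return "DW_TAG_variable";
  default:
    return {};
  }
}

// Caller-side formatting: the numeric value is appended where the message
// is built, into a string owned by the caller.
std::string describeTag(std::uint16_t tag) {
  const std::string_view name = tagName(tag);
  std::ostringstream out;
  if (name.empty())
    out << "unknown DW_TAG";
  else
    out << name;
  out << " (0x" << std::hex << static_cast<unsigned>(tag) << ")";
  std::string text = out.str();
  assert(!text.empty());
  return text;
}
```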
2024-05-03 11:05:11 -07:00
Florian Hahn
401ecb4ccc
[LV] Add test showing miscompile with store reductions and RT checks.
Add a new test showing how a loop gets vectorized incorrectly with an
invariant store reduction where the same location is also read, when
vectorizing with runtime checks.
2024-05-03 18:54:00 +01:00
Abhinav Garg
76508dce43
[AMDGPU] Fix mode register pass for constrained FP operations (#90085)
This PR fixes the si-mode-register pass, which inserts an extra
setreg instruction in case of constrained FP operations. The pass will
be ignored for strictfp functions.
2024-05-03 19:47:15 +02:00
Aart Bik
fc398a112d
[mlir][sparse] test optimization of binary-valued operations (#90986)
Make sure consumer-producer fusion happens (to avoid the temporary dense
tensor) and constant folding occurs in the generated code.
2024-05-03 10:41:16 -07:00