505924 Commits

Author SHA1 Message Date
Sayhaan Siddiqui
bdee9b05de
Revert "[BOLT][DWARF][NFC] Split processUnitDIE into two lambdas" (#99904)
Reverts llvm/llvm-project#99225
2024-07-22 12:31:51 -07:00
Piotr Zegar
04d5003f59 [clang-tidy][DOC] Update check documentation
Fix issues in list.rst, addapt add_new_check.py to new
format of that file, and run gen-static-analyzer-docs.py
to generate missing documentation for clang-analyzer.
2024-07-22 19:31:00 +00:00
Petr Hosek
da2f7201f3
[libc] Include cbrt in baremetal targets (#99916)
This is a follow up to #99262.
2024-07-22 12:10:51 -07:00
Mingming Liu
a634171896
[InstrPGO][TypeProf]Annotate vtable types when they are present in the profile (#99402)
Before this change, when `file.profdata` have vtable profiles but `--enable-vtable-value-profiling` is not on for optimized build, warnings from this line [1] will show up. They are benign for performance but confusing.

It's better to automatically annotate vtable profiles if `file.profdata` has them. This PR implements it in profile use pass.
* If `-icp-max-num-vtables` is zero (default value is 6), vtable profiles won't be annotated.

[1] 464d321ee8/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1762-L1768)
2024-07-22 11:57:36 -07:00
Edd Dawson
3b24e5d450
Omit .debug_aranges if it is empty (#99897)
SIE tracker: https://jira.sie.sony.com/browse/TOOLCHAIN-16575
2024-07-22 19:56:11 +01:00
LLVM GN Syncbot
d6e17d7096 [gn build] Port 50c4e0392a42 2024-07-22 18:49:56 +00:00
Keith Smiley
50c4e0392a
[clang][test] Add missing test file to cmake (#99907)
Seems like this test was never running with cmake, but is running with
bazel and broke at head.
2024-07-22 11:49:37 -07:00
Hana Dusíková
52dd4dbb1a
[clang-tidy] bugprone-exception-escape didn't detech catching of an exception with pointer type by void * exception handler (#99773)
As in title, code which checks eligibility of exceptions with pointer
types to be handled by exception handler of type `void *` disallowed
this case. It was working like this:

```c++
if (isStandardPointerConvertible(ExceptionCanTy, HandlerCanTy) &&
          isUnambiguousPublicBaseClass(
              ExceptionCanTy->getTypePtr()->getPointeeType().getTypePtr(),
              HandlerCanTy->getTypePtr()->getPointeeType().getTypePtr())) {
```

but in `isUnambiguousPublicBaseClass` there was code which looked for
definitions:

```c++
bool isUnambiguousPublicBaseClass(const Type *DerivedType,
                                  const Type *BaseType) {
  const auto *DerivedClass =
      DerivedType->getCanonicalTypeUnqualified()->getAsCXXRecordDecl();
  const auto *BaseClass =
      BaseType->getCanonicalTypeUnqualified()->getAsCXXRecordDecl();
  if (!DerivedClass || !BaseClass)
    return false;
```

This code disallowed usage of `void *` type which was already correctly
detected in `isStandardPointerConvertible`.

AFAIK this seems like misinterpretation of specification:

> 14.4 Handling an exception
> a standard [pointer conversion](https://eel.is/c++draft/conv.ptr) not
involving conversions to pointers to private or protected or ambiguous
classes
(https://eel.is/c++draft/except.handle#3.3.1)

and 

> 7.3.12 Pointer conversions
> ... If B is an inaccessible
([[class.access]](https://eel.is/c++draft/class.access)) or ambiguous
([[class.member.lookup]](https://eel.is/c++draft/class.member.lookup))
base class of D, a program that necessitates this conversion is
ill-formed[.](https://eel.is/c++draft/conv.ptr#3.sentence-2) ...
(https://eel.is/c++draft/conv.ptr#3)

14.4 is carving out private, protected, and ambiguous base classes, but
they are already carved out in 7.3.12 and implemented in
`isStandardPointerConvertible`

---------

Co-authored-by: Piotr Zegar <me@piotrzegar.pl>
2024-07-22 20:49:25 +02:00
Med Ismail Bennani
bb8a74075b
[lldb] Change GetStartSymbol to GetStartAddress in DynamicLoader (#99909)
On linux, the start address doesn't necessarily have a symbol attached
to it.

This is why this patch replaces `DynamicLoader::GetStartSymbol` with
`DynamicLoader::GetStartAddress` instead to make it more generic.

Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
2024-07-22 11:43:32 -07:00
Haojian Wu
2ef12b55b2 [clang] Fix the broken DeductionGuide ToolingTests after c7bfc41860a6abe5c92dc5afb47348b0c9e69963 2024-07-22 20:31:25 +02:00
Sergei Barannikov
735974e550
[builtins] Use __builtin_clzll for 64-bit types (#99874)
This addresses the issue with `__LP64__` not being defined for targets
with 32-bit pointers but 64-bit longs, resulting in worse codegen.
2024-07-22 21:29:56 +03:00
Jessica Del
0eb719fef5
[AMDGPU] Fix build failure in raw.atomic.buffer.load tests (#99912)
This fixes the failing tests after rebasing
over the attributor move.
2024-07-22 14:28:20 -04:00
dyung
9374216d4b
Replace distutils.version with packaging.version since the former was deprecated in python 3.10 and removed in 3.12. (#99852)
Attempt to reland #99549, but using packaging.version instead of
looseversion, based on the usage used for LLDB in #93712.
2024-07-22 11:28:11 -07:00
matthew-f
9d76231fcd
[clang-tidy] Ensure functions are anchored in the global namespace (for cert-err-33) (#99380)
The regular expressions match functions that aren't anchored in the
global namespace. For example `::remove` matches any object with a
`removeXyz` method. This change is to remove these false positives
2024-07-22 20:26:54 +02:00
Nicolas van Kempen
315561c867
[run-clang-tidy.py] Refactor, add progress indicator, add type hints (#89490)
[There is
work](https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-python-version/67571)
to make Python 3.8 the minimum Python version for LLVM.

I edited this script because I wanted some indicator of progress while
going through files.
It now outputs `[XX/YYY]` with the number of processed and total files
after each completion.

The current version of this script is compatible downto Python 3.6 (this
is PyYAML's minimum version).
It would probably work with older Python 3 versions with an older PyYAML
or when YAML is disabled.

With the updates here, it is compatible downto Python 3.7. Python 3.7
was released June 2018.

https://github.com/llvm/llvm-project/pull/89302 is also touching this
file, I don't mind rebasing on top of that work if needed.

### Summary

 -  Add type annotations.
 -  Replace `threading` + `queue` with `asyncio`.
- **Add indicator of processed files over total files**. This is what I
set out to do initially.
- Only print the filename after completion, not the entire Clang-Tidy
invocation command. I find this neater but the behavior can easily be
restored.
2024-07-22 20:23:49 +02:00
James Y Knight
511e93b96e
Handle constant "pointers" for __atomic_always_lock_free/__atomic_is_lock_free. (#99340)
The second argument passed to these builtins is used to validate whether
the object's alignment is sufficient for atomic operations of the given
size.

Currently, the builtins can be folded at compile time only when the
argument is 0/nullptr, or if the _type_ of the pointer guarantees
appropriate alignment.

This change allows the compiler to also evaluate non-null constant
pointers, which enables callers to check a specified alignment, instead
of only the type or an exact object. E.g.:
 `__atomic_is_lock_free(sizeof(T), (void*)4)`
can be potentially evaluated to true at compile time, instead of
generating a libcall. This is also supported by GCC, and used by
libstdc++, and is also useful for libc++'s atomic_ref.

Also helps with (but doesn't fix) issue #75081.

This also fixes a crash bug, when the second argument was a non-pointer
implicitly convertible to a pointer (such as an array, or a function).
2024-07-22 14:20:25 -04:00
Daniil Kovalev
146fd7cd45
[PAC][Driver] Support pauthtest ABI for AArch64 Linux triples (#97237)
When `pauthtest` is either passed as environment part of AArch64 Linux
triple
or passed via `-mabi=`, enable the following ptrauth flags:

- `intrinsics`;
- `calls`;
- `returns`;
- `auth-traps`;
- `vtable-pointer-address-discrimination`;
- `vtable-pointer-type-discrimination`;
- `init-fini`.

Some related stuff is still subject to change, and the ABI itself might
be changed, so end users are not expected to use this and the ABI name
has 'test' suffix.

If `-mabi=pauthtest` option is used, it's normalized to effective
triple.

When the environment part of the effective triple is `pauthtest`, try
to use `aarch64-linux-pauthtest` as multilib directory.

The following is not supported:
- combination of `pauthtest` ABI with any branch protection scheme
except BTI;
- explicit set of environment part of the triple to a value different
  from `pauthtest` in combination with `-mabi=pauthtest`;
- usage on non-Linux OS.

---------

Co-authored-by: Anatoly Trosinenko <atrosinenko@accesssoftek.com>
2024-07-22 21:18:39 +03:00
Wei Wang
bee2654300
[Asan] Skip pre-split coroutine and noop coroutine frame (#99415)
CoroSplit expects the second parameter of `llvm.coro.id` to be the
promise alloca. Applying Asan on a pre-split coroutine breaks this
assumption and causes split to fail. This should be NFC because asan
pass happens late in the pipeline where all coroutines are split. This
is to prevent crash in case the order of passes are switched.

Also `NoopCoro.Frame.Const` is a special coroutine frame that does
nothing when resumed or destroyed. There is no point to do
instrumentation on it.
2024-07-22 10:53:40 -07:00
Craig Topper
0950533cff [RISCV] Move call to EmitLoweredCascadedSelect above some variable declarations. NFC
These variables aren't used if we call EmitLoweredCascadedSelect
so move the call above them.
2024-07-22 10:37:45 -07:00
Craig Topper
d221662ed0 [RISCV] In emitSelectPseudo, copy call frame size from LastSelectPseudo instead of MI.
The split point is LastSelectPseudo. If MI is earlier, we might
sink it to LastSelectPseudo.
2024-07-22 10:37:44 -07:00
Abid Qadeer
fa5971c298
[flang][debug] Generate correct name for external function. (#99510)
The `ExternalNameConversion` will add an _ at the end of the external
functions. We extract the real function name to use in the debug info.
The convention is to use the real name of function in the `name` field
and mangled name with extra _ at the end in the `linkageName` field.

Fixes #92391.
2024-07-22 18:32:48 +01:00
Dinar Temirbulatov
3f8d77bcc7 Revert "[AArch64][SVE] Improve code quality of vector unsigned/signed add reductions. (#97339)"
This reverts commit b7b0071680e60c60da9d4d858f944fd95d76fd42.

The change caused regression in a performance testing.
2024-07-22 17:27:07 +00:00
Fangrui Song
f2eb7c7344 [test] Delete a redundant mapping symbol test
Covered by llvm/test/tools/llvm-nm/special-syms-arm.test
2024-07-22 10:26:05 -07:00
Daniel Bertalan
90569e02e6
[Support] Add Arm NEON implementation for llvm::xxh3_64bits (#99634)
Compared to the generic scalar code, using Arm NEON instructions yields
a ~11x speedup: 31 vs 339.5 ms to hash 1 GiB of random data on the Apple
M1.

This follows the upstream implementation closely, with some
simplifications made:
- Removed workarounds for suboptimal codegen on older GCC
- Removed instruction reordering barriers which seem to have a
negligible impact according to my measurements
- We do not support WebAssembly's mostly NEON-compatible API
- There is no configurable mixing of SIMD and scalar code; according to
the upstream comments, this is only relevant for smaller Cortex cores
which can dispatch relatively few NEON micro-ops per cycle.

This commit intends to use only standard ACLE intrinsics and datatypes,
so it should build with all supported versions of GCC, Clang and MSVC.

This feature is enabled by default when targeting AArch64, but the
`LLVM_XXH_USE_NEON=0` macro can be set to explicitly disable it.

XXH3 is used for ICF, string deduplication and computing the UUID in
ld64.lld; this commit results in a -1.77% +/- 0.59% speed improvement
for a `--threads=8` link of Chromium.framework.
2024-07-22 19:06:43 +02:00
Craig Topper
1c798e0b07
[SelectionDAGBuilder][RISCV] Fix crash when using a memory constraint with scalable vector type. (#99821)
We need to use the minimum size of the scalable type and the correct
stack ID.

The code in the PR is still invalid because the instruction used doesn't
have a pointer operand. This is diagnosed later when the assembler
parses it.

Fixes #99782
2024-07-22 09:41:57 -07:00
Mikhail R. Gadelha
c80b799e90
[libc] No need to use recursion in fcntl (#99893)
This patch removes the recursion in fcntl introduced by PR #99675 as it is not required and may be dangerous in some cases: some toolchains define F_GETLK == F_GETLK64 causing infinite recursion.
2024-07-22 13:40:47 -03:00
OverMighty
70843bf658
[libc][math] Optimize copysign{,f,f16} and fabs{,f,f16} with builtins when available (#99037) 2024-07-22 18:37:44 +02:00
nicole mazzuca
4189226236
[libc++] Update some C++23 statuses to "Nothing to do" or "Complete" (#99621)
- [P2160R1][] "Locks lock lockables"
- [P2212R2][] "Relax Requirements for `time_point::clock`"
- [P1675R2][] "`rethrow_exception` must be allowed to copy"
- [P2340R1][] "Clarifying the status of the 'C headers'"
- [P2460R2][] "Relax requirements on `wchar_t` to match existing
practices"

Are all papers that change wording without changing implementation
behaviour.

Additionally, [P2736R2][] "Referencing The Unicode Standard", is an
already complete paper in 19.0 (as of [LLVM-86543][])

[P2160R1]: https://wg21.link/p2160r1
[P2212R2]: https://wg21.link/p2212r2
[P1675R2]: https://wg21.link/p1675r2
[P2340R1]: https://wg21.link/p2340r1
[P2460R2]: https://wg21.link/p2460r2
[P2736R2]: https://wg21.link/p2736r2
[LLVM-86543]: https://github.com/llvm/llvm-project/pull/86543
2024-07-22 18:33:41 +02:00
Xiaoyang Liu
3d7622ea0b
[libc++][ranges] LWG3618: Unnecessary iter_move for transform_view::iterator (#91809)
## Introduction

This patch implements LWG3618: Unnecessary `iter_move` for
`transform_view::iterator`.

`transform_view`'s iterator currently specifies a customization point
for `iter_move`. This customization point does the same thing that the
default implementation would do, but its sole purpose is to ensure the
appropriate conditional `noexcept` specification.

## Reference

-
[[range.transform.iterator]](https://eel.is/c++draft/range.transform.iterator)
- [LWG3618](https://cplusplus.github.io/LWG/issue3618)
2024-07-22 18:32:37 +02:00
Martin Storsjö
ec966f699d
[libcxx] [test] Make indentation more consistent in thousands_sep. NFC. (#99844)
This was made inconsistent recently in
f114eddb1923289b696f1b0980cc22c4dbaafa22.
2024-07-22 19:31:47 +03:00
Raphael Isemann
0b12e185bd
[clang-fuzzer-dictionary] Fix build failure with libfuzzer (#99871) 2024-07-22 09:29:34 -07:00
Joseph Huber
6911f823ad [libc] Fix invalid format specifier in benchmark
Summary:
This value is a uint32_t but is printed as a uint64_t, leading to
invalid offsets when done on AMDGPU due to its packed format extending
past the buffer.
2024-07-22 11:21:22 -05:00
Craig Topper
2c92335eb7
[RISCV] Copy call frame size when splitting basic block in emitSelectPseudo. (#99823)
Fixes #97304.
2024-07-22 09:21:01 -07:00
xur-llvm
b1ca2a9546
[PGO] Sampled instrumentation in PGO to speed up instrumentation binary (#69535)
In comparison to non-instrumented binaries, PGO instrumentation binaries
can be significantly slower. For highly threaded programs, this slowdown
can
reach 10x due to data races or false sharing within counters.

This patch incorporates sampling into the PGO instrumentation process to
enhance the speed of instrumentation binaries. The fundamental concept
is similar to the one proposed in https://reviews.llvm.org/D63949.

Three sampling modes are introduced:
1. Simple Sampling: When '-sampled-instr-bust-duration' is set to 1.
2. Fast Burst Sampling: When not using simple sampling, and 
'-sampled-instr-period' is set to 65535. This is the default mode of
sampling.
3. Full Burst Sampling: When neither simple nor fast burst sampling is
used.

Utilizing this sampled instrumentation significantly improves the
binary's
execution speed. Measurements show up to 5x speedup with default
settings. Fast burst sampling now results in only around 20% to 30%
slowdown (compared to 8 to 10x slowdown without sampling).

Out tests show that profile quality remains good with sampling,
with edge counts typically showing more than 90% overlap.
For applications whose behavior changes due to binary speed,
sampling instrumentation can enhance performance.
Observations have shown some apps experiencing up to
a ~2% improvement in PGO.

A potential drawback of this patch is the increased binary size
and compilation time. The Sampling method in this patch does
not improve single threaded program instrumentation binary
speed.
2024-07-22 09:19:17 -07:00
Jessica Del
ec7f8e1113
[AMDGPU] Add intrinsic for raw atomic buffer loads (#97707)
Upstream the intrinsics `llvm.amdgcn.raw.atomic.buffer.load`
and `llvm.amdgcn.raw.atomic.ptr.buffer.load`.

These additional intrinsics mark atomic buffer loads
as atomic to LLVM by removing the `IntrReadMem`
attribute. Otherwise, it could hoist these
intrinsics out of loops in cases where LLVM marks
them as invariant. That can cause issues such as
infinite loops.

Continuation of https://reviews.llvm.org/D138786
with the additional use in the fat buffer lowering,
more test cases and the additional ptr versions 
of these intrinsics.

---------

Co-authored-by: rtayl <>
Co-authored-by: Jay Foad <jay.foad@amd.com>
Co-authored-by: Mariusz Sikora <mariusz.sikora@amd.com>
2024-07-22 18:04:49 +02:00
Fangrui Song
4010ddf780
[MC,AArch64] Create mapping symbols with non-unique names
Add `createLocalSymbol` to create a local, non-temporary symbol.
Different from `createRenamableSymbol`, the `Used` bit is ignored,
therefore multiple local symbols might share the same name.

Utilizing `createLocalSymbol` in AArch64 allows for efficient mapping
symbol creation with non-unique names, saving .strtab space.
The behavior matches GNU assembler.

Pull Request: https://github.com/llvm/llvm-project/pull/99836
2024-07-22 09:03:05 -07:00
Pranav Bhandarkar
d7e185cca9
[OMPIRBuilder] - Handle dependencies in createTarget (#93977)
This patch handles dependencies specified by the `depend` clause on an
OpenMP target construct. It does this much the same way clang does it by
materializing an OpenMP `task` that is tagged with the dependencies.

The following functions are relevant to this patch -
1) `createTarget` - This function itself is largely unchanged except
that it now accepts a vector of `DependData` objects that it simply
forwards to `emitTargetCall`
2) `emitTargetCall` - This function has changed now to check if an outer
target-task needs to be materialized (i.e if `target` construct has
`nowait` or has `depend` clause). If yes, it calls `emitTargetTask` to
do all the heavy lifting for creating and dispatching the task.
3) `emitTargetTask` - Bulk of the change is here. See the large comment
explaining what it does at the beginning of this function
2024-07-22 10:56:45 -05:00
Alexey Bataev
9da221d15f [SLP][NFC]Remove incorrect attribure from the test, NFC. 2024-07-22 08:38:11 -07:00
Michael Klemm
a5447613de
[Flang][runtime] Add dependency to build FortranRuntime after flang-new (#99737)
Makefile-based builds did not have proper dependencies to built the
FortranRuntime target after Flang new is available. This PR introduces a
dependency to ensure that this is the case. Relates to PR #95388.

---------

Co-authored-by: Michael Kruse <github@meinersbur.de>
2024-07-22 17:30:45 +02:00
Krzysztof Parzyszek
e9709899db [clang][OpenMP] Avoid names that hide existing variables, NFC 2024-07-22 10:22:20 -05:00
Paul Kirth
5ea38b86a3
[lit][NFC] Avoid unintended -EMPTY suffix in check prefix (#99690)
FileCheck has special handline for the `-EMPTY` suffix, that should
match empty lines. Overloading the suffix can be a source of confusion
when reading tests. Additionally, the current implementation seems to
match the following expressions, which appears to be a bug in FileCheck.
2024-07-22 08:21:38 -07:00
Björn Pettersson
2b78303e3f
[DAGCombiner] Freeze maybe poison operands when folding select to logic (#84924)
Just like for regular IR we need to treat SELECT as conditionally
blocking poison in SelectionDAG. So (unless the condition itself is
poison) the result is only poison if the selected true/false value is
poison.
Thus, when doing DAG combines that turn SELECT into arithmetic/logical
operations (e.g. AND/OR) we need to make sure that the new operations
aren't more poisonous. One way to do that is to use FREEZE to make
sure the operands aren't posion.

This patch aims at fixing the kind of miscompiles reported in
  https://github.com/llvm/llvm-project/issues/84653
and
  https://github.com/llvm/llvm-project/issues/85190

Solution is to make sure that we insert FREEZE, if needed to make
the fold sound, when using the foldBoolSelectToLogic and
foldVSelectToSignBitSplatMask DAG combines.
2024-07-22 17:19:46 +02:00
Mikhail R. Gadelha
cda5b2b4b8
[libc] Change fcntl cmd when only fcntl64 is available (#99675)
In some systems like rv32, only fcntl64 is available and it employs a different structure for file locking and the correspoding F_GETLK64, F_SETLK64, and F_SETLKW64 commands.

So if we use fcntl64, the F_GETLK, F_SETLK, and F_SETLKW commands need to be changed to their 64 versions. This patch adds new cases to the swich(cmd) in our implementation of fcntl to do that.

The default case was moved to outside the switch, so we don't need to change anything, the F_GETLK, F_SETLK, and F_SETLKW commands will just go through the old implementation.
2024-07-22 12:18:48 -03:00
Joseph Huber
65825cd543
[libc] Use <assert.h> in overlay mode for LIBC_ASSERT (#99875)
Summary:
This uses `internal::exit` which is not built in overlay mode, leading
to linker errors. Fix this to just use `assert.h`.
2024-07-22 10:12:43 -05:00
Utkarsh Saxena
280b04f65a
Record mainfile name in the Frontend time trace (#99866) 2024-07-22 17:10:41 +02:00
Mikhail R. Gadelha
28e6095082
[libc] Add working entrypoints to riscv (#99885)
Added new fsqrt entrypoints and updated headers.txt, which I missed in PR #99771
2024-07-22 12:06:28 -03:00
Mikhail R. Gadelha
7ddcf7acf2
[libc] Change fsfilcnt_t and fsblkcnt_t to be 64-bits long (#99876)
In 32-bit systems with 64-bit offsets, both fsfilcnt_t and fsblkcnt_t are 64-bit long, just like 64-bit systems. This patch changes both types to be 64-bit long for all platforms and follows the reasoning used to change off_t: the standard only requires it to be an unsigned int, so making it 64-bit long doesn't violate this property.

It should be NFC for 64-bit systems.
2024-07-22 12:06:03 -03:00
Tom Eccles
2e6558b8bc
[flang][OpenMP] fix lastprivate for allocatables (#99686)
Don't use `copyHostAssociateVar` for allocatable variables. It isn't
clear to me whether or not this should be addressed in
`copyHostAssociateVar` instead of inside OpenMP. I opted for OpenMP
to minimise how many things I effected. `copyHostAssociateVar` will
not update the destination variable if the destination variable
was unallocated. This is incorrect because assignment inside of the
openmp block can cause the allocation status of the variable to
change. Furthermore, `copyHostAssociateVar` seems to only copy the
variable address not other metadata like the size of the allocation.
Reallocation by assignment could cause this to change.
2024-07-22 16:05:17 +01:00
AtariDreams
ca3d4dfe0c
[Metadata] Make range boundary variables unsigned (NFC) (#99338)
They should be unsigned because the source and target value are too.
2024-07-22 17:02:05 +02:00
Timm Bäder
613d2c3939 [clang][Interp][NFC] Avoid hitting an assertion in invalid code 2024-07-22 16:59:31 +02:00