Following the discussion on #131174, update generate-tests.py script to
emit atomicrmw tests where the result is unused and add a note to
explain why these do use ST[F]ADD.
Replace min and max overload implementation using macros with one using
templates.
Enable overloads of the forms:
vector<T,N> min/max(vector<T,N> p0, T p1)
vector<T,N> min/max(T p0, vector<T,N> p1)
Add new tests.
Closes#131170
This code avoids adding comdat groups to interposable linkage types
(weak, linkonce (non-ODR)) to avoid changing semantics, since comdat
elimination happens before weak/strong prevailaing symbol resolution.
However, if the function is already in a comdat, we can add to the group
without changing the semantics of the linked program.
Fixes an issue uncovered in PR #126240
This change introduces the cir-canonicalize pass. This is a simple
cir-to-cir transformation that eliminates empty scopes and redundant
branches. It will be expanded in future changes to simplify other
redundant instruction sequences.
MLIR verification and mlir-specific command-line option handling is also
introduced here.
The `OmpDirectiveSpecification` contains directive name, the list of
arguments, and the list of clauses. It was introduced to store the
directive specification in METADIRECTIVE, and could be reused everywhere
a directive representation is needed.
In the long term this would unify the handling of common directive
properties, as well as creating actual constructs from METADIRECTIVE by
linking the contained directive specification with any associated user
code.
https://wg21.link/LWG3088 requires that `forward_list::merge()` is a no-op when passed
`*this`, which aligns with the behavior of `list::merge`. Although libc++'s implementation
of `forward_list::merge()` already meets this requirement, there were no tests to verify
this behavior. This patch adds the necessary tests to ensure that self-merging remains a
no-op and prevents any future regressions on this.
Closes#104942.
When an allocator-aware container already defines a member type alias
`__alloc_traits` for `std::allocator_traits<allocator_type>`, we should
consistently use `__alloc_traits`. Mixing the usage of both
`__alloc_traits` and `std::allocator_traits` can lead to inconsistencies
and confusion.
I noticed this while debugging some unit tests that the logs
occasionally would intersperse two log statements.
Previously, it was possible for a log statement to have two messages
interspersed since the timestamp and log statement were two different
writes to the log stream.
I created a Log helper to ensure we have a lock while attempting to write
to the logs.
The existing tests for `vector<bool>` copy- and move-assignment
operators are limited to 3 bits only, which are inadequate to cover
realistic scenarios. Most `vector<bool>` operations have code paths that
are executed only when multiple storage words are involved, with each
storage word typically comprising 64 bits on a 64-bit platform.
Furthermore, the existing tests fail to cover all combinations
`POCCA`/`POCMA`, along with different allocator equality and/or
reallocation scenarios, leaving some critical code paths untested.
This patch enhances the test coverage by introducing new tests covering
up to 5 storage words, ensuring that partial words in the front or tail,
and whole words in the middle are all properly tested. Moreover, these
new tests ensure that the copy- and move-assignment operators are tested
under all combinations of `POCCA`/`POCMA` and various allocator equality
scenarios, both with or without reallocations.
This patch fixes an issue in libc++ where `std::copy_backward` and
`ranges::copy_backward` incorrectly copy `std::vector<bool>` with small
storage types (e.g., `uint8_t`, `uint16_t`). The problem arises from
flawed bit mask computations involving integral promotions and sign-bit
extension, leading to unintended zeroing of bits. This patch corrects
the bitwise operations to ensure correct bit-level copying.
Fixes#131718.
- When the operand type of an operation changes to a profile-dependent
type, the compliance metadata must be updated. Update compliance check
for the following:
- CONV2D, CONV3D, DEPTHWISE_CONV2D, and TRANSPOSE_CONV2D, as zero points
have changed to variable inputs.
- PAD, because pad_const has been changed to a variable input.
- GATHER and SCATTER, as indices has changed to index_t.
- Add an int16 extension check for CONCAT.
- Add a compliance check for COND_IF, WHILE_LOOP, VARIABLE,
VARIABLE_READ, and VARIABLE_WRITE.
- Correct the profile requirements for IDENTITY, TABLE, MATMUL and
LOGICAL-like operations.
- Remove unnecessary checks for non-v1.0 operations.
- Add condition requirements (anyOf and allOf) to the type mode of
metadata for modes that have multiple profile/extension considerations.
The current implementation of `{std, ranges}::copy` fails to copy
`vector<bool>` correctly when the underlying storage type
(`__storage_type`) is smaller than `int`, such as `unsigned char`,
`unsigned short`, `uint8_t` and `uint16_t`. The root cause is that the
unsigned small storage type undergoes integer promotion to (signed)
`int`, which is then left and right shifted, leading to UB (before
C++20) and sign-bit extension (since C++20) respectively. As a result,
the underlying bit mask evaluations become incorrect, causing erroneous
copying behavior.
This patch resolves the issue by correcting the internal bitwise
operations, ensuring that `{std, ranges}::copy` operates correctly for
`vector<bool>` with any custom (unsigned) storage types.
Fixes#131692.
The current implementation of `{std, ranges}::equal` fails to correctly
compare `vector<bool>`s when the underlying storage type is smaller than
`int` (e.g., `unsigned char`, `unsigned short`, `uint8_t` and
`uint16_t`). See [demo](https://godbolt.org/z/j4s87s6b3)). The problem
arises due to integral promotions on the intermediate bitwise
operations, leading to incorrect final equality comparison results. This
patch fixes the issue by ensuring that `{std, ranges}::equal` operate
properly for both aligned and unaligned bits.
Fixes#126369.
This should ensure we have the full logs prior to dumping the logs.
Additionally, printing log dumps to stderr so they are adjacent to
assertion failures.
In some cases using the newly introduced clamp overloads, when floats
were involved, clang would behave differently than DXC.
To ensure the same behavior as DXC, require that for mix scalar/vector
overloads the type of the scalar matches the type of the vector.
This PR fixes an ambiguous call encountered while using the
`std::ranges::count` and `std::count` algorithms with `vector<bool>`
with small `size_type`s.
The ambiguity arises from integral promotions during the internal
bitwise arithmetic of the `count` algorithms for small integral types.
This results in multiple viable candidates:
`__libcpp_popcount(unsigned)`,` __libcpp_popcount(unsigned long)`, and
`__libcpp_popcount(unsigned long long)`, leading to an ambiguous call
error. To resolve this ambiguity, we introduce a dispatcher function,
`__popcount`, which directs calls to the appropriate overloads of
`__libcpp_popcount`. This closes#122528.
The value in the map is set to "true" when something is added to the
map.
Techncally this:
if (!DefRegs.contains(Reg))
will set the value in the map to false if it didn't already exist, and
this is used to indicate the value wasn't in the map. This only occurs
after all the "true" values have already been added to the map.
The unicode test sends some unicode input to lldb through pexpect and
expects the output to be echoed back in an error message. This only
works correctly when editline was compiled with wide character support.
This commit modifies the test to require the necessary libedit
configuration.
calculateRegisterUsage adds end points for each user of an instruction
to Ends and ignores instructions not added to it, i.e. instructions with
no users.
This means things like stores aren't included, which in turn means
values that are only used in stores are also not included for
consideration. This means we underestimate the register usage in cases
where the only users are things like stores.
Update the code to don't skip instructions without users (i.e. not in
Ends) if they have side-effects.
PR: https://github.com/llvm/llvm-project/pull/126415
When using parse_headers validation, rpc_server.h fails to build, since
we don't expose any of the RPC deps in the Bazel build.
So at least for now, just exclude it.
Create entry points for addresses referenced by dynamic relocations and
allow getNewFunctionOrDataAddress to map addrs inside functions. By
adding addresses referenced by dynamic relocations as entry points. This
patch fixes an issue where bolt fails on code using computing goto's.
This also fixes a mapping issue with the bugfix from this PR:
https://github.com/llvm/llvm-project/pull/117766.
This patch introduces the `MemorySpaceAttrInterface` interface. This
interface is responsible for handling the semantics of `ptr` operations.
For example, this interface can be used to create read-only memory
spaces, making any other operation other than a load a verification
error, see `TestConstMemorySpaceAttr` for a possible implementation of
this concept.
This patch also introduces Enum dependencies `AtomicOrdering`, and
`AtomicBinOp`, both enumerations are clones of the Enums with the same
name in the LLVM Dialect.
Also, see:
- [[RFC] `ptr` dialect & modularizing ptr ops in the LLVM
dialect](https://discourse.llvm.org/t/rfc-ptr-dialect-modularizing-ptr-ops-in-the-llvm-dialect/75142)
for rationale.
- https://github.com/llvm/llvm-project/pull/73057 for a prototype
implementation of the full change.
**Note: Ignore the first commit, that's being reviewed in
https://github.com/llvm/llvm-project/pull/86860 .**
When adding SBLock, I realized you always have to add a new header in
two places: LLDB.h and headers.swig.
I can't think of a reason the latter can't use the umbrella header and
avoid the duplication.
Currently, the handling for calls is split between AA and BasicAA in an
awkward way. BasicAA does argument alias analysis for non-escaping
objects (but without considering MemoryEffects), while AA handles the
generic case using MemoryEffects. However, fundamentally, both of these
are really trying to do the same thing.
The new merged logic first tries to remove the OtherMR component of the
memory effects, which includes accesses to escaped memory. If a
function-local object does not escape, OtherMR can be set to NoModRef.
Then we perform the argument scan in basically the same way as AA
previously did. However, we also need to look at the operand bundles. To
support that, I've adjusted getArgModRefInfo to accept operand bundle
arguments.
I initially thought starting with a more narrow definition and later
expanding would make more sense. But as pointed out in review for PR
#129220, this restriction is generating additional unnecessary work.
This patch alters the intrinsic to accept patterns of any type. Future
patches will update LoopIdiomRecognize and PreISelIntrinsicLowering to
take advantage of this. The verifier will complain if an unsized type is
used. I've additionally taken the opportunity to remove a comment from
the LangRef about some bit widths potentially not being supported by the
target. I don't think this is any more true than it is for arbitrary
width loads/stores which don't carry a similar warning that I can see.
A verifier check ensures that only sized types are used for the pattern.
This iterates on #104717 (which we had to revert)
In a bid to increase our chances of success, we try to avoid blowing up
the stack by
- Using `runWithSufficientStackSpace` in ParseCompoundStatement
- Reducing the size of `StmtVector` a bit
- Reducing the size of `DeclsInGroup` a bit
- Removing a few `ParsedAttributes` from the stacks in places where they
are not strictly necessary. `ParsedAttributes` is a _huge_ object
On a 64 bits system, the following stack size reductions are observed
```
ParseStatementOrDeclarationAfterAttributes: 344 -> 264
ParseStatementOrDeclaration: 520 -> 376
ParseCompoundStatementBody: 1080 -> 1016
ParseDeclaration: 264 -> 120
```
Fixes#94728