These four functions all related in that they share tables and helper
functions. Furthermore, the acosh and atanh builtins call log1p.
As with other work in this area, these builtins are now vectorized. To
enable this, there are new table accessor functions which return a
vector of table values using a vector of indices. These are internally
scalarized, in the absence of gather operations. Some tables which were
tables of multiple entries (e.g., double2) are split into two separate
"low" and "high" tables. This might affect the performance of memory
operations but are hopefully mitigated by better codegen overall.
PR #130587 defined same SubTargetFeature for CPUs i6400 and i6500 which
resulted into following warning when -mcpu=i6500 was used:
+i6500' is not a recognized feature for this target (ignoring feature)
This PR fixes above issue by defining separate SubTargetFeature for
i6500.
When redrawing the statusline, the current implementation would clear
the current line before drawing the new content. Since we always
overwrite the whole statusline from beginning to end, there's no need to
clear it and we can avoid the potential for flickering.
There's no need to call RedrawStatusline from HandleProgressEvent. The
statusline gets redraw after handling all events, including progress
events, in the default event handler loop.
This patch uses a range constructor to collapse:
llvm::StringSet<> Dest;
for (const auto &S : Src)
Dest.insert(S);
down to:
llvm::StringSet<> Dest(llvm::from_range, Src);
We can name the sub-operands using a DAG in the 'ins'. This allows those
names to be matched to the encoding fields. This removes the need for a
custom encoder/decoder that treats the 2 sub-operands as a single 10-bit
value.
While doing this, I noticed the base and offset names in the
MIOperandInfo were swapped relative to how the operands are parsed and
printed. Assuming that I've correctly understood the parsing/print
format as "offset(base)".
When builtins are built with runtimes, it is built before compiler-rt,
and this makes some of the HAS_XXX_FLAGs missing. In this case, the
COMPILER_RT_HAS_FCF_PROTECTION_FLAG is missing which makes it impossible
to enable CET in this case. This patch addresses this issue by also
check for such flag in standalone build instead of relying on the
compiler-rt's detection.
This moves the EnumAttrCase and EnumAttr classes from Attribute.h/.cpp
to a new EnumInfo.h/cpp and renames them to EnumCase and EnumInfo,
respectively.
This doesn't change any of the tablegen files or any user-facing aspects
of the enum attribute generation system, just reorganizes code in order
to make main PR (#132148) shorter.
The primary effect of this is that we get proper scalable sizes printed
by the assembler, but this may also enable proper aliasing analysis. I
don't see any test changes resulting from the later.
Getting the size is slightly tricky as we store the scalable size as a
non-scalable quantity in the object size field for the frame index. We
really should remove that hack at some point...
For the synthetic tuple spills and fills, I dropped the size from the
split loads and stores to avoid incorrect (overly large) sizes. We could
also divide by the NF factor if we felt like writing the code to do so.
Per [range.view]/6, a `view_interface` isn't a base class of itself, so
`enable_view` should report `false`. Also, current implementation
strategy handles `const` but not `volatile`, IIUC cv-qualifiers should
be consistent handled.
In `enable_view.compile.pass.cpp`, coverage for (`const`) `volatile`
types are added.
Drive-by: Remove one unnessary `test_macro.h` inclusion in a test.
Fixes#132577.
The existing using _ForwardLike declaration already fails with a
subsitution failure. The LWG issue was filed to clarify what should
happen for non-referencable types.
Added test to verify libc++ is already enforcing the new Mandates.
Implements:
- LWG3757 What's the effect of std::forward_like<void>(x)
Closes: #105026
We previously assumed that every `DW_AT_LLVM_stmt_sequence` attribute
has a corresponding sequence in the processed line table. However, this
isn't always true. Some sequences can be removed by the linker if they
are empty, as shown
[here](https://github.com/alx32/llvm-project/blob/release/14.x/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp#L565-L566).
When an attribute refers to one of these removed sequences, there is no
actual sequence for it to match. In such cases, we update the attribute
to indicate that it is invalid and does not point to any sequence. This
informs readers that the attribute should be ignored.
The newly modified test would have triggered the assert that is being
removed in this patch.
Before this patch, making a change to a runtime directory (like libcxx)
would cause the project to be added to the LLVM_ENABLE_PROJECTS CMake
flag which is illegal as they can only be built as part of
LLVM_ENABLE_RUNTIMES. This patch fixes that behavior. Test added.
This PR fixes two crashes in ExtractAPI that occur when decls are
requested via libclang:
- A null-dereference would sometimes happen in
`DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization`
when the template being processed was loaded indirectly via a typedef,
with parameters filled in. The first commit loads the template parameter
locations ahead of time to perform a null check before dereferencing.
- An assertion (or another null-dereference) was happening in
`CXXRecordDecl::bases` when processing a forward-declaration (i.e. a
record without a definition). The second commit guards the use of
`bases` in `ExtractAPIVisitorBase::getBases` by first checking that the
decl in question has a complete definition.
The added test `extract-api-cursor-cpp` adds tests for these two
scenarios to protect against the crash in the future.
Fixes rdar://140592475, fixes rdar://123430367
This is generally good practice if the caches won't be reused (though
arguably pedantic for the `stage1-toolchain` stage).
`docker history` on comparable images showed that this saves a few
hundred MB on stage1, and ~60MB on the `apt-get` layer of
`ci-container-agent`.
Added a srl pattern for true16 flow. Changing right shift 16bit to a
reg_sequence
`srl vgpr32, 16 -> reg_sequence (vgpr32.hi16, 0)`
and finally it's lowered to two COPY
`vdst.lo16 = COPY vsrc.hi16`
`vdst.hi16 = COPY 0`
The benefits of this transform is allowing the following pass to
optimize out these copy.
Add a statusline to command-line LLDB to display information about the
current state of the debugger. The statusline is a dedicated area
displayed at the bottom of the screen. The information displayed is
configurable through a setting consisting of LLDB’s format strings.
Enablement
----------
The statusline is enabled by default, but can be disabled with the
following setting:
```
(lldb) settings set show-statusline false
```
Configuration
-------------
The statusline is configurable through the `statusline-format` setting.
The default configuration shows the target name, the current file, the
stop reason and any ongoing progress events.
```
(lldb) settings show statusline-format
statusline-format (format-string) = "${ansi.bg.blue}${ansi.fg.black}{${target.file.basename}}{ | ${line.file.basename}:${line.number}:${line.column}}{ | ${thread.stop-reason}}{ | {${progress.count} }${progress.message}}"
```
The statusline supersedes the current progress reporting implementation.
Consequently, the following settings no longer have any effect (but
continue to exist to not break anyone's `.lldbinit`):
```
show-progress -- Whether to show progress or not if the debugger's output is an interactive color-enabled terminal.
show-progress-ansi-prefix -- When displaying progress in a color-enabled terminal, use the ANSI terminal code specified in this format immediately before the progress message.
show-progress-ansi-suffix -- When displaying progress in a color-enabled terminal, use the ANSI terminal code specified in this format immediately after the progress message.
```
Format Strings
--------------
LLDB's format strings are documented in the LLDB documentation and on
the website: https://lldb.llvm.org/use/formatting.html#format-strings.
The current implementation is relatively limited but various
improvements have been discussed in the RFC.
One such improvement is being to display a string when a format string
is empty. Right now, when launching LLDB without a target, the
statusline will be empty, which is expected, but looks rather odd.
RFC
---
The full RFC can be found on Discourse:
https://discourse.llvm.org/t/rfc-lldb-statusline/83948
SubToSuperRegs was a DenseMap of std::vectors, where the vectors
typically had size 1. Switching to a vector of pairs avoids the overhead
of allocating tiny vectors.
I measured a 1.14x speed-up building AMDGPUGenRegisterInfo.inc with this
patch.
Create a proper way to build header-only libraries for llvm-libc code
sharing. Use it to group headers that can be shared with libcxx for
std::from_chars() implementation.
It mostly works, though the macro needs to be updated to enforce that no
.cpp files are listed in dependencies (it's not the case now) - see PR
#133126.
Follow-up to dfca6c0d3bf9d1a056 to extend isUnrolled handle any unrolled
VPlan, which means there's a single UF, but it will be > 1 if unrolling
took place.
This adds ELF::convertTripleArchTypeToEMachine and uses it in
llvm-ifs. It handles many more Triple::ArchType values than the
old code, though not all since I couldn't quickly discern
what all the mappings are.
In 664f345cd53d1f624d94f9889a1c9fff803e3391, a fix was introduced,
attempting to restore LLVM_DIR and Clang_DIR after doing
find_package(Clang).
However, 6775285e7695f2d45cf455f5d31b2c9fa9362d3d added a return if the
clangTidy target wasn't found. If this is hit, we don't restore LLVM_DIR
and Clang_DIR, which causes strange effects if CMake is rerun a second
time.
Move the code for restoring LLVM_DIR and Clang_DIR to directly after the
find_package calls, to make sure they are restored, regardless of the
find_package outcome.
This patch adds a python script, compute_projects, and associated unit
tests for computing the projects and runtimes that need to be tested in
premerge. Rewriting in Python opens up a couple new
improvements/opportunities:
1. I personally find python to be much easier to work with than shell
scripts for tasks like this. Particularly it becomes a lot easier to
work with paths with proper array support.
2. Unit testing becomes easier which makes it a lot easier to reason
about behavior changes, especially in review.
3. Most of the configuration is now setup in some dictionaries, which
makes changes much easier to apply for most of the common changes.
This preserves the behavior of the existing premerge scripts as much as
possible.
Reviewers: ldionne, lnihlen, Endilll, tstellar, Keenuts
Reviewed By: Keenuts
Pull Request: https://github.com/llvm/llvm-project/pull/132634
This PR does reduces the verbosity of parser errors for OpenACC block
constructs that do not parse correctly because they are missing their
trailing end block directive by:
- Removing the redundant error messages created by parsing 3 different
styles of directive tokens.
- Providing a general mechanism of configuring the max number of
contexts printed for every syntax error.
- Not printing less specific contexts that are at the same location.
Prior to the changes:
```
$ flang -fc1 -fopenacc -fsyntax-only flang/test/Parser/acc-data-statement.f90 2>&1 | tee acc-data-statement.prior.log | wc -l
262
```
[acc-data-statement.prior.log](https://github.com/user-attachments/files/19298165/acc-data-statement.prior.log)
```
$ flang -fc1 -fopenacc -fsyntax-only flang/test/Parser/acc-data-statement.f90 2>&1 | tee acc-data-statement.prior.log | wc -l
73
```
[acc-data-statement.post.log](https://github.com/user-attachments/files/19298181/acc-data-statement.post.log)
This moves the checks of MinTripCountTailFoldingThreshold later, during the
calculation of whether to tail fold. This allows it to check beforehand whether
tail predication is required, either for scalable or fixed-width vectors.
This option is only specified for AArch64, where it returns the minimum of 5.
This patch aims to allow the vectorization of TC=4 loops, preventing them from
performing slower when SVE is present.