Certain functions require the `returns_twice` attribute in order to
produce correct codegen. However, `-fno-builtin` removes all knowledge
of functions that require this attribute, so this PR modifies Clang to
add the `returns_twice` attribute even if `-fno-builtin` is set. This
behavior is also consistent with what GCC does.
It's not (easily) possible to get the builtin information from
`Builtins.td` because `-fno-builtin` causes Clang to never initialize
any builtins, so these functions are never recognized as builtins that
require `returns_twice`. Therefore, the most straightforward solution is
to explicitly hard-code the function names that require
`returns_twice`.
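For illustration, a minimal sketch of a name-based check, using a hypothetical helper; the actual list and its location in Clang may differ:
```cpp
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSwitch.h"

// Hypothetical helper: these library functions must carry returns_twice even
// when -fno-builtin disables builtin recognition. The real list in the patch
// may differ.
static bool requiresReturnsTwice(llvm::StringRef Name) {
  return llvm::StringSwitch<bool>(Name)
      .Cases("setjmp", "_setjmp", "sigsetjmp", "vfork", true)
      .Default(false);
}
```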
Fixes #122840
Include the LLDB version in the lldbassert error message, and prompt
users to include it in the bug report. The majority of users who bother
filing a bug report just copy-paste the stack trace and often forget to
include this important detail. By putting it after the backtrace and
before the prompt, I'm hoping it'll get copy-pasted in.
rdar://146793016
I observed that we have boundary comments in the codebase like:
```
//===----------------------------------------------------------------------===//
// ...
//===----------------------------------------------------------------------===//
```
I also observed that there are incomplete boundary comments, like the one
below, which is missing the closing line. The revision is generated by a
script that completes the boundary comments.
```
//===----------------------------------------------------------------------===//
// ...
...
```
Signed-off-by: hanhanW <hanhan0912@gmail.com>
TargetLoweringBase::getIRStackGuard refers to a platform-specific guard
variable. Before this change, TargetLoweringBase::getSDagStackGuard
referred to a different variable.
As a result, SelectionDAGBuilder's getLoadStackGuard did not get memory
operands. However, AArch64InstrInfo::expandPostRAPseudo assumes that the
passed MachineInstr has nonzero memoperands, causing a segfault.
We have two possible options here: either disabling the LOAD_STACK_GUARD
node entirely in AArch64TargetLowering::useLoadStackGuardNode or just
making the platform-specific values match across TargetLoweringBase.
Here, we try the latter.
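A conceptual sketch of the idea, with a hypothetical guard name and free functions standing in for the real virtual hooks on TargetLoweringBase: both lookups must resolve to the same variable so that the LOAD_STACK_GUARD pseudo gets a memory operand.
```cpp
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"
using namespace llvm;

// Sketch only: what getIRStackGuard conceptually inserts...
static Value *irStackGuard(IRBuilderBase &IRB) {
  Module &M = *IRB.GetInsertBlock()->getModule();
  return M.getOrInsertGlobal("__guard_var", IRB.getPtrTy()); // hypothetical name
}

// ...must be what getSDagStackGuard later looks up.
static Value *sdagStackGuard(const Module &M) {
  return M.getGlobalVariable("__guard_var"); // must name the same variable
}
```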
Splitting the 'ln_tbl' into two in db98e292 wasn't done thoroughly
enough, as some references to the old table still remained. This commit
fixes the unresolved references by updating them to the new split table.
Fixes #133712.
The change causes `c.slli` instructions whose immediate has bit 5 set to
be rejected when disassembling RV32C. Added a test to exhaustively cover
c.slli for 32-bit targets, plus a minor tweak to make the debug output a
little more readable.
The spec (20240411) says:
> For RV32C, shamt[5] must be zero; the code points with shamt[5]=1 are
> designated for custom extensions. For RV32C and RV64C, the shift amount
> must be non-zero; the code points with shamt=0 are HINTs. For all base
> ISAs, the code points with rd=x0 are HINTs, except those with shamt[5]=1
> in RV32C.
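A rough sketch of the constraint being enforced (not the actual decoder code):
```cpp
// When disassembling RV32C, reject c.slli encodings whose shift amount has
// bit 5 set, since those code points are designated for custom extensions.
// (shamt == 0 encodings are HINTs and are handled separately.)
static bool isLegalRV32CSlliShamt(unsigned Shamt) {
  return (Shamt & 0x20) == 0; // shamt[5] must be zero for RV32C
}
```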
In #133211, Greg suggested making the rate limit configurable through a
setting. Although adding the setting is easy, the two places where we
currently use rate limiting aren't tied to a particular debugger. While
it might be possible to hook that up, given how few progress events
currently implement rate limiting, I don't think it's worth threading
this through, if that's even possible.
I still think it's a good idea to be consistent and make it easy to pick
the same rate limiting value, so I've moved it into a constant in the
Progress class.
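An illustrative sketch only; the actual constant name and value in the Progress class may differ:
```cpp
#include <chrono>

class Progress {
public:
  // Shared default interval so every rate-limited progress report agrees on
  // the same value (name and value here are illustrative).
  static constexpr std::chrono::milliseconds kDefaultMinimumReportTime{20};
};
```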
Hey,
This solves an issue where running lldb-server-20 with a non-absolute
path (for example, when it's installed into `/usr/bin` and the user runs
it as `lldb-server-20 ...` and not `/usr/bin/lldb-server-20 ...`) fails
with `error: spawn_process failed: execve failed: No such file or
directory`. The underlying issue is that when run that way, it attempts
to execute a binary named `lldb-server-20` from its current directory.
This is also a mild security hazard because lldb-server is often being
run as root in the directory /tmp, meaning that an unprivileged user can
create the file /tmp/lldb-server-20 and lldb-server will execute it as
root. (although, well, it's a debugging server we're talking about, so
that may not be a real concern)
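A minimal sketch of the general idea (not necessarily the exact code in the patch): resolve the server's own absolute path instead of re-executing via argv[0]. The helper name below is hypothetical.
```cpp
#include "llvm/Support/FileSystem.h"
#include <string>

static std::string getSelfExecutablePath(const char *Argv0) {
  // getMainExecutable uses a platform-specific mechanism (e.g. /proc/self/exe)
  // rather than trusting argv[0] or the current working directory.
  return llvm::sys::fs::getMainExecutable(Argv0, (void *)&getSelfExecutablePath);
}
```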
I haven't previously contributed to this project; if you want me to
change anything in the code please don't hesitate to let me know.
Expose the target API mutex through the SB API. This is motivated by
lldb-dap, which is built on top of the SB API and needs a way to execute
a series of SB API calls in an atomic manner (see #131242).
We can solve this problem by either introducing an additional layer of
locking at the DAP level or by exposing the existing locking at the SB
API level. This patch implements the second approach.
This was discussed in an RFC on Discourse [0]. The original
implementation exposed a move-only lock rather than a mutex [1] which
doesn't work well with SWIG 4.0 [2]. This implements the alternative
solution of exposing the mutex rather than the lock. The SBMutex
conforms to the BasicLockable requirement [3] (which is why the methods
are called `lock` and `unlock` rather than Lock and Unlock) so it can be
used as `std::lock_guard<lldb::SBMutex>` and
`std::unique_lock<lldb::SBMutex>`.
[0]: https://discourse.llvm.org/t/rfc-exposing-the-target-api-lock-through-the-sb-api/85215/6
[1]: https://github.com/llvm/llvm-project/pull/131404
[2]: https://discourse.llvm.org/t/rfc-bumping-the-minimum-swig-version-to-4-1-0/85377/9
[3]: https://en.cppreference.com/w/cpp/named_req/BasicLockable
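For example, client code could look roughly like this; the header and the GetAPIMutex() accessor name are assumed here for illustration:
```cpp
#include "lldb/API/SBMutex.h"  // header name assumed
#include "lldb/API/SBTarget.h"
#include <mutex>

// Hold the target lock across several SB API calls so they don't interleave
// with other users of the same target.
void updateBreakpointsAtomically(lldb::SBTarget &target) {
  lldb::SBMutex m = target.GetAPIMutex();        // accessor name assumed
  std::lock_guard<lldb::SBMutex> guard(m);       // works: SBMutex is BasicLockable
  // ... perform a series of SB API calls ...
}
```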
This adds DWARF generation for fixed-point types. This feature is needed
by Ada.
Note that a pre-existing GNU extension is used in one case. This has
been emitted by GCC for years, and is needed because standard DWARF is
otherwise incapable of representing these types.
This turns on the unnecessary-virtual-specifier warning in general, but
disables it when building LLVM. It also tweaks the warning description
to be slightly more accurate.
Background: I've been working on cleaning up this warning in two
codebases: LLVM and chromium (plus its dependencies). The chromium
cleanup has been straightforward. Git archaeology shows that there are
two reasons for the warnings: classes to which `final` was added after
they were initially committed, and classes with virtual destructors that
nobody remarks on. Presumably the latter case is because people are just
very used to destructors being virtual.
The LLVM cleanup was more surprising: I discovered that we have an [old
policy](https://llvm.org/docs/CodingStandards.html#provide-a-virtual-method-anchor-for-classes-in-headers)
about including out-of-line virtual functions in every class with a
vtable, even `final` ones. This means our codebase has many virtual
"anchor" functions which do nothing except control where the vtable is
emitted, and which trigger the warning. I looked into alternatives to
satisfy the policy, such as using destructors instead of introducing a
new function, but it wasn't clear if they had larger implications.
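For reference, this is roughly the pattern that triggers the warning in LLVM (the class name here is illustrative):
```cpp
// A final class that follows the LLVM anchor policy: one out-of-line virtual
// method that only controls where the vtable is emitted.
class ExamplePass final {
public:
  virtual void anchor(); // triggers -Wunnecessary-virtual-specifier
};

// Defined out of line (normally in the .cpp file) so the vtable is emitted there.
void ExamplePass::anchor() {}
```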
Overall, it seems like the warning is genuinely useful in most codebases
(evidenced by chromium and its dependencies), and LLVM is an unusual
case. Therefore we should enable the warning by default, and turn it off
only for LLVM builds.
We only emit v_mov_b32/64_dpp. Don't combine t16 instructions with mov
dpp. Update the test inputs to be legal.
It is future work to emit v_mov_b16_dpp, and then update GCNDPPCombine
to combine it with the 16-bit instructions.
The 256-bit maximum vector register size control was removed from the
AVX10 whitepaper, ref: https://cdrdv2.intel.com/v1/dl/getContent/784343
- Re-target m[no-]avx10.1 to enable AVX10.1 with 512-bit maximum vector
register size;
- Emit warning for mavx10.x-256, noting AVX10/256 is not supported;
- Emit warning for mavx10.x-512, noting to use m[no-]avx10.x instead;
- Emit warning for m[no-]evex512, noting AVX10/256 is not supported;
This patch only changes Clang driver behavior. The avx10.x-256/512
features remain unchanged and will be removed in the next release.
For example, consider the following situation:
```
%6:gpr = SLLI %2:gpr, 2
%7:gpr = ADDI killed %6:gpr, 24
%8:gpr = ADD %0:gpr, %7:gpr
```
If we swap the two add instructions, we can merge the shift and add. The
final code will look something like this:
```
%7 = SH2ADD %0, %2
%8 = ADDI %7, 24
```
There is some code to make sure that C++ keywords that are identifiers
in the other languages are not treated as keywords. Right now, the token
kind is set to identifier, and the identifier info is cleared. The
latter is probably so that the code for identifying C++ structures does
not recognize those structures by mistake when formatting a language
that does not have them. However, we did not find an instance where the
language can actually contain such a sequence of tokens and the code
tries to parse the structure as if it were C++, using the identifier
info instead of the token kind without checking the language setting.
There are, on the other hand, places where the code checks whether the
identifier info field is null; those are places where an identifier and
a keyword are treated the same way, for example the name of a function
in JavaScript. This patch removes the lines that clear the identifier
info, so a C++ keyword gets treated in the same way as an identifier in
those places.
JavaScript
New
```JavaScript
async function
union(
myparamnameiswaytooloooong) {
}
```
Old
```JavaScript
async function
union(
myparamnameiswaytooloooong) {
}
```
Java
New
```Java
enum union { ABC, CDE }
```
Old
```Java
enum
union { ABC, CDE }
```
before
```Verilog
wait fork
;
wait fork
;
wait fork
;
```
after
```Verilog
wait fork;
wait fork;
wait fork;
```
The `wait fork` statement should not start a block. Previously the
formatter treated the `fork` part as the start of a new block. Now the
problem is fixed.
I recently added a new option to update_test_checks.py that can
filter out all CHECK lines after a certain point. We usually don't
care about checking for the original scalar loop after the vector
loop because it doesn't change. Cutting out unnecessary CHECK
lines makes the files smaller and hopefully the tests run quicker.
The patch splits the store-load forwarding distance analysis from the
other dependency analyses in LAA. Currently it supports only power-of-2
distances; the split is required to support non-power-of-2 distances in
the future.
Part of #100755
Required for mingw-w64, which uses the alias attribute in its CRT.
Follows ARM64EC mangling rules by mangling the alias symbol and emitting
an unmangled anti-dependency alias. Since metadata is not allowed on
GlobalAlias objects, extend arm64ec_unmangled_name to support multiple
unmangled names and attach the alias anti-dependency name to the target
function's metadata.
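For context, the CRT pattern in question is the GNU alias attribute; the symbol names below are illustrative, not the actual mingw-w64 symbols:
```cpp
// The alias target must be defined in the same translation unit. On ARM64EC,
// the alias symbol now gets the mangled ("#"-prefixed) name, and an unmangled
// anti-dependency alias is emitted alongside it.
extern "C" void real_impl(void) {}
extern "C" void crt_alias(void) __attribute__((alias("real_impl")));
```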
The function had special handling for -1, but that is incompatible with
functions whose entry point is not the first address. Use std::nullopt
instead.
LLVM vector predication reduction intrinsics return a scalar result, but
on RISC-V, vector reduction instructions write the result to the first
element of a vector register. So when a reduction in a loop uses a
scalar phi, we end up with unnecessary scalar moves:
```asm
loop:
vmv.s.x v8, zero
vredsum.vs v8, v10, v8
vmv.x.s a0, v8
```
This mainly affects vector predication reductions. This patch tries to
vectorize any scalar phis that feed into a vector predication reduction
in RISCVCodeGenPrepare, converting:
```llvm
vector.body:
%red.phi = phi i32 [ ..., %entry ], [ %red, %vector.body ]
%red = tail call i32 @llvm.vp.reduce.add.nxv4i32(i32 %red.phi, <vscale x 4 x i32> %wide.load, <vscale x 4 x i1> splat (i1 true), i32 %evl)
```
to
```llvm
vector.body:
%red.phi = phi <vscale x 2 x i32> [ ..., %entry ], [ %acc.vec, %vector.body]
%phi.scalar = extractelement <vscale x 2 x i32> %red.phi, i64 0
%acc = tail call i32 @llvm.vp.reduce.add.nxv4i32(i32 %phi.scalar, <vscale x 4 x i32> %wide.load, <vscale x 4 x i1> splat (i1 true), i32 %evl)
%acc.vec = insertelement <vscale x 2 x i32> poison, i32 %acc, i64 0
```
This eliminates the scalar -> vector -> scalar crossing during
instruction selection.
---------
Co-authored-by: yanming <ming.yan@terapines.com>
While it is UB in terms of the standard to delete an array of objects
via a pointer whose static type doesn't match its dynamic type, MSVC
supports an extension that allows doing so.
Aside from array deletion not working correctly in the mentioned case,
not having this extension implemented currently causes Clang to generate
code that is not compatible with the code generated by MSVC, because
Clang always puts the scalar deleting destructor into the vftable. This
PR aims to resolve these problems.
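For illustration, the pattern the extension permits (which is UB in standard C++) looks roughly like this:
```cpp
struct Shape {
  virtual ~Shape() {}
};
struct Circle : Shape {
  ~Circle() override {}
};

void destroy() {
  Shape *shapes = new Circle[4];
  // The static type (Shape) doesn't match the dynamic type (Circle): UB per
  // the standard, but accepted and supported as an MSVC extension.
  delete[] shapes;
}
```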
It was reverted due to link-time errors in chromium with sanitizer
coverage enabled, which are fixed by
https://github.com/llvm/llvm-project/pull/131929.
The second commit of this PR also contains a fix for a runtime failure
in chromium reported in
https://github.com/llvm/llvm-project/pull/126240#issuecomment-2730216384.
Fixes https://github.com/llvm/llvm-project/issues/19772
Due to a copy-paste mistake, we used simm12 instead of the correct
type. This doesn't matter in practice because we only generate these
instructions with C++ code and we expand them before the AsmPrinter.
Rework https://reviews.llvm.org/D31026
AllowAtInIdentifier is a misnomer: it should be false for ELF targets,
but is currently true as a hack to parse expr@specifier.
The existing pretty printer generates excessive parentheses for
MCBinaryExpr expressions. This update removes unnecessary parentheses
around MCBinaryExpr with +/- operators and around MCUnaryExpr.
Since relocatable expressions only use + and -, this change improves
readability in most cases.
Examples:
- (SymA - SymB) + C now prints as SymA - SymB + C.
  This updates the output of -fexperimental-relative-c++-abi-vtables for
  AArch64 and x86 to `.long _ZN1B3fooEv@PLT-_ZTV1B-8`.
- expr + (MCTargetExpr) now prints as expr + MCTargetExpr, with this
  change primarily affecting AMDGPUMCExpr.
The JITLinkMemoryManager::InFlightAlloc::abandon method should only abandon
memory for the current allocation, not any other allocations. In
MapperJITLinkMemoryManager this corresponds to the deinitialize operation, not
the deallocate operation (which releases whole slabs of memory that may be
shared by many allocations).
No testcase: This was spotted by inspection. The failing program was linking
concurrently when one linker instance raised an error. Through the call to
abandon an entire underlying slab was deallocated, resulting in segfaults in
other concurrent links that were sharing that slab.