The GitHub runner version was bumped recently, and it would be good to keep
this up to date. I am also debugging an issue where GitHub ARC fails to
create new pods, and want to see whether it is related to outdated
versions.
f64 -> bf16 conversion can be lowered to f64 -> f32 followed by f32 ->
bf16:
v_cvt_f32_f64_e32 v0, v[0:1]
v_cvt_pk_bf16_f32 v0, v0, s0
Both conversion instructions round to nearest even, so the sequence
rounds twice and may produce an incorrect result for some inputs. We
need to use round-to-odd rounding for the f64 -> f32 step to avoid the
double rounding.
NOTE: the f64 -> f16 conversion has the same issue. Round-to-odd
rounding for it will be added in a separate patch, which fixes
SWDEV-523856
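
The f64 -> bf16 double rounding described above can be reproduced on the host. The following is a minimal sketch, not the backend lowering: the helper names are hypothetical, the f32 -> bf16 step is a software round-to-nearest-even, and the round-to-odd step assumes finite inputs well inside float's normal range.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a float as its IEEE-754 bit pattern.
static uint32_t floatBits(float F) {
  uint32_t U;
  std::memcpy(&U, &F, sizeof(U));
  return U;
}

// f32 -> bf16 with round-to-nearest-even (finite, non-NaN inputs only).
uint16_t f32ToBF16(float F) {
  uint32_t U = floatBits(F);
  U += 0x7FFFu + ((U >> 16) & 1u); // bias so that ties go to the even result
  return static_cast<uint16_t>(U >> 16);
}

// f64 -> f32 with round-to-odd: if the conversion is inexact, pick the
// neighbouring float whose lowest mantissa bit is 1.
// Assumption: D is finite and well inside float's normal range.
float f64ToF32RoundOdd(double D) {
  float F = static_cast<float>(D); // round-to-nearest-even by default
  if (static_cast<double>(F) == D)
    return F; // exact, nothing to fix up
  uint32_t U = floatBits(F);
  if (std::fabs(static_cast<double>(F)) > std::fabs(D))
    --U; // step back to the value truncated toward zero
  U |= 1u; // force the low bit so the later rounding still sees "inexact"
  std::memcpy(&F, &U, sizeof(F));
  return F;
}
```

For the input 1 + 2^-8 + 2^-40, which lies just above the midpoint between two adjacent bf16 values, the plain cast followed by `f32ToBF16` yields 0x3F80, while going through `f64ToF32RoundOdd` first yields the correctly rounded 0x3F81.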
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
https://reviews.llvm.org/D24047 added `IsAtStartOfStatement` to
MCAsmLexer, while its subclass AsmLexer had a variable of the same name.
The assignment in `UnLex` is unnecessary and is now removed.
60b403e75cd25a0c76aaaf4e6b176923acf49443 (2019) named the result of
`parseStatement` `Parsed`. `HasError` is a clearer name.
Shift ELF `@plt` and `@gotpcrel` references in data directives, as well as
Mach-O `@specifier` notations, to use `AArch64MCExpr::Specifier` constants.
This is a follow-up to #132595. COFF-specific specifiers are not moved
yet.
In addition, partition @-specifiers into COFF, ELF, and Mach-O, so that
mix-and-match is rejected at parse time.
ELF and Mach-O specifiers are distinct, with `None` being the only
shared value. For Mach-O-specific specifiers, we adopt the `M_xxx` naming
convention.
Pull Request: https://github.com/llvm/llvm-project/pull/133214
If .L1 is not within +-4KiB range,
convert
qc.(e.)bge a0, 10, .L1
to
qc.(e.)blt a0, 10, 8(10)
j .L1
This is similar to what is done for the RISC-V conditional branches.
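
The relaxation rule above can be sketched as follows. This is an illustration only: the `Cond` enum, `fitsInBranchRange`, and `relaxBranch` are hypothetical names, and the range check assumes a 13-bit signed, 2-byte-aligned offset as the encoding behind "+-4KiB".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical condition codes; only the inversion logic matters here.
enum class Cond { EQ, NE, LT, GE, LTU, GEU };

// Return the condition that is true exactly when C is false.
Cond invert(Cond C) {
  switch (C) {
  case Cond::EQ:  return Cond::NE;
  case Cond::NE:  return Cond::EQ;
  case Cond::LT:  return Cond::GE;
  case Cond::GE:  return Cond::LT;
  case Cond::LTU: return Cond::GEU;
  case Cond::GEU: return Cond::LTU;
  }
  return C; // unreachable for valid enum values
}

// Assumption: the branch offset is a 13-bit signed, even immediate.
bool fitsInBranchRange(int64_t Offset) {
  return Offset >= -4096 && Offset <= 4094 && (Offset % 2) == 0;
}

struct Relaxed {
  Cond BranchCond; // condition used by the emitted branch
  bool NeedsJump;  // true if an unconditional jump to the target follows
};

// Out-of-range targets get an inverted branch that skips over a jump.
Relaxed relaxBranch(Cond C, int64_t Offset) {
  if (fitsInBranchRange(Offset))
    return {C, false};
  return {invert(C), true};
}
```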
In `X86MCInstLower::LowerMachineOperand`, a new `MCSymbol` can be
created in `GetSymbolFromOperand(MO)` where `MO.getType()` is
`MachineOperand::MO_ExternalSymbol`
```
case MachineOperand::MO_ExternalSymbol:
return LowerSymbolOperand(MO, GetSymbolFromOperand(MO));
```
at
725a7b664b/llvm/lib/Target/X86/X86MCInstLower.cpp (L196)
However, the `IsExternal` field of this newly created symbol will not be
set properly, since `Ctx.getOrCreateSymbol(Name)` doesn't know that the
newly created `MCSymbol` is for `MachineOperand::MO_ExternalSymbol`.
Looking at other backends, here is, for example, what `AArch64MCInstLower`
does to handle `MO_ExternalSymbol`:
14c36db16f/llvm/lib/Target/AArch64/AArch64MCInstLower.cpp (L366-L367)
14c36db16f/llvm/lib/Target/AArch64/AArch64MCInstLower.cpp (L145-L148)
It creates/gets the MCSymbol from `AsmPrinter.OutContext` instead of
from `Ctx`. Moreover, `Ctx` for `AArch64MCInstLower` is the same as
`AsmPrinter.OutContext`:
8e7d6baf0e/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (L100)
This applies to almost all the other backends except X86 and M68k.
```
$ git grep "MCInstLowering("
lib/Target/AArch64/AArch64AsmPrinter.cpp:100: : AsmPrinter(TM, std::move(Streamer)), MCInstLowering(OutContext, *this),
lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:223: AMDGPUMCInstLower MCInstLowering(OutContext, STI, *this);
lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:257: AMDGPUMCInstLower MCInstLowering(OutContext, STI, *this);
lib/Target/AMDGPU/R600MCInstLower.cpp:52: R600MCInstLower MCInstLowering(OutContext, STI, *this);
lib/Target/ARC/ARCAsmPrinter.cpp:41: MCInstLowering(&OutContext, *this) {}
lib/Target/AVR/AVRAsmPrinter.cpp:196: AVRMCInstLower MCInstLowering(OutContext, *this);
lib/Target/BPF/BPFAsmPrinter.cpp:144: BPFMCInstLower MCInstLowering(OutContext, *this);
lib/Target/CSKY/CSKYAsmPrinter.cpp:41: : AsmPrinter(TM, std::move(Streamer)), MCInstLowering(OutContext, *this) {}
lib/Target/Lanai/LanaiAsmPrinter.cpp:147: LanaiMCInstLower MCInstLowering(OutContext, *this);
lib/Target/Lanai/LanaiAsmPrinter.cpp:184: LanaiMCInstLower MCInstLowering(OutContext, *this);
lib/Target/MSP430/MSP430AsmPrinter.cpp:149: MSP430MCInstLower MCInstLowering(OutContext, *this);
lib/Target/Mips/MipsAsmPrinter.h:126: : AsmPrinter(TM, std::move(Streamer)), MCInstLowering(*this) {}
lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp:695: WebAssemblyMCInstLower MCInstLowering(OutContext, *this);
lib/Target/X86/X86MCInstLower.cpp:2200: X86MCInstLower MCInstLowering(*MF, *this);
```
This patch makes `X86MCInstLower` and `M68kMCInstLower` take their
`Ctx` from `AsmPrinter.OutContext` instead of from
`MF.getContext()`, to be consistent with the other backends.
In the normal use case (probably anything other than our
unconventional one), a single LLVM module is handled all the way through
the codegen pipeline to the end of code emission (AsmPrinter), so
`AsmPrinter.OutContext` is the same as the MachineFunction's MCContext,
and I think this change is an NFC.
----
This fixes an error while running the generated code in ORC JIT for our
use case with
[MCLinker](https://youtu.be/yuSBEXkjfEA?si=HjgjkxJ9hLfnSvBj&t=813) (see
more details below):
https://github.com/llvm/llvm-project/pull/133291#issuecomment-2759200983
We (Mojo) are trying to do MC-level linking: we break an LLVM
module into multiple submodules to compile and codegen in parallel
(technically into *.o files with a symbol linkage type change), but
instead of archiving all of them into one `.a` file, we want to fix the
symbol linkage type and still produce one *.o file. The parallel codegen
pipeline generates the codegen data structures in their own `MCContext`
(which is `Ctx` here). So if functions `f` and `g` are split into
different submodules, they will have different `Ctx`s. When we try to
create an external symbol with the same name for each of them with
`Ctx.getOrCreateSymbol(SymName)`, we get two different `MCSymbol*`s
because `f`'s and `g`'s `MCContext`s are different and can't see each
other. This is unfortunately not what we want for external symbols.
Using `AsmPrinter.OutContext` helps: since it is shared, getting or
creating the `MCSymbol` there lets us deduplicate.
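
The deduplication argument can be illustrated with a toy model. `ToyContext` and `ToySymbol` below are hypothetical stand-ins, not LLVM's `MCContext` or `MCSymbol`; the sketch only shows that a name looked up in two separate contexts gives two distinct objects, while a shared context gives one.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a symbol owned by a context.
struct ToySymbol {
  std::string Name;
};

// Hypothetical stand-in for a context that uniques symbols by name.
class ToyContext {
  std::unordered_map<std::string, std::unique_ptr<ToySymbol>> Symbols;

public:
  // Return the symbol for Name, creating it on first use.
  ToySymbol *getOrCreateSymbol(const std::string &Name) {
    std::unique_ptr<ToySymbol> &Slot = Symbols[Name];
    if (!Slot) {
      Slot = std::make_unique<ToySymbol>();
      Slot->Name = Name;
    }
    return Slot.get();
  }
};
```

With one `ToyContext` per submodule, looking up the same external name from `f`'s and `g`'s contexts returns different pointers; with a single shared context, both lookups return the same pointer.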
When unary operation support was initially upstreamed, the cir.cast
operation hadn't been upstreamed yet, so logical not wasn't included.
Since casts have now been added, this change adds support for logical
not.
I am having problems building the Fortran runtime for CUDA
after #134164. I need more time to investigate it,
but in the meantime including variant.h (or any header
that eventually includes a libcudacxx header) resolves
the issue.
In `OutputSegment.cpp`, we need to ensure a specific order for certain
sections. The current sorting logic incorrectly prioritizes code
sections over explicitly defined section orders. This is problematic
because the `__objc_stubs` section is both a code section *and* has a
specific ordering requirement. The current logic would incorrectly
prioritize its code section status, causing it to be sorted *before* the
`__stubs` section. This incorrect ordering breaks the branch extension
algorithm, ultimately leading to linker failures due to relocation
errors.
We also modify the `lld/test/MachO/arm64-objc-stubs.s` test to ensure
correct section ordering.
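
The intended priority can be sketched with a toy sort key. Everything here is hypothetical (the `ToySection` fields, the numeric order values, and which sections are marked as code); it is not lld's data structure, only an illustration of checking the explicit order before the code-section status.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical section record. ExplicitOrder == 0 means "no explicit
// order"; positive values (assumed < 1000) sort in ascending order.
struct ToySection {
  std::string Name;
  bool IsCode;
  int ExplicitOrder;
};

// An explicit order wins. Only sections without one fall back to the
// rule "code sections first, everything else last".
int sortKey(const ToySection &S) {
  if (S.ExplicitOrder != 0)
    return S.ExplicitOrder;
  return S.IsCode ? -1 : 1000;
}

std::vector<ToySection> sortSections(std::vector<ToySection> Sections) {
  std::stable_sort(Sections.begin(), Sections.end(),
                   [](const ToySection &A, const ToySection &B) {
                     return sortKey(A) < sortKey(B);
                   });
  return Sections;
}
```

Checking `IsCode` first instead would give `__objc_stubs` the same key as `__text` and place it ahead of `__stubs`, which is the misordering described above.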
Add ObjectFile::GetObjectName and SymbolFile::GetObjectName to retrieve
the name of the object file, including the `.a` for static libraries.
We currently do something similar in CommandObjectTarget, but the code
for dumping this is a lot more involved than what's being offered by the
new method. We have options to print the full path, the base name, and
the directory of the path and trim it to a specific width.
This is motivated by #133211, where Greg pointed out that the old code
would print the static archive (the .a file) rather than the actual
object file inside of it.
Fixes #102053
The check was added in 8decdc472f308b13d7fb7fd50c3919db086c0417, when
iOS 5 was the latest iOS version. Before that commit, tail calls
were disabled for all ARMv7 targets. Testing a build of wasm3 with the
patch on a device running iOS 3.0 shows a noticeable performance
improvement and no issues.
This adds ClangIR support for break and continue statements in loops.
Because only loops are currently implemented upstream in CIR, only
breaks in loops are supported here, but this same code will work (with
minor changes to the verification and cfg flattening) when switch
statements are upstreamed.
https://github.com/llvm/llvm-project/pull/132883 added support for CUDA
surfaces but reached into clang/test/Headers/ from clang/test/CodeGen/
to grab the minimal cuda.h. Duplicate that file instead based on
comments in the review, to fix remote test runs.
Signed-off-by: Austin Schuh <austin.linux@gmail.com>
Almost all of the rotate idioms that are valid for an 'or' are also
valid when the halves are combined with an 'add'. Further, many of these
cases are not handled by common-bits tracking, meaning that the 'add' is
not converted to a 'disjoint or'.
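
A small host-side sketch of why 'add' works for the basic idiom: the two shifted halves have no set bits in common, so the addition never carries and matches the 'or'. The function names are hypothetical. The masked variant is an arithmetic example of an idiom that does not carry over; it is not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left written with 'or'. Assumes 0 < N < 32.
uint32_t rotlOr(uint32_t X, unsigned N) {
  return (X << N) | (X >> (32 - N));
}

// The same rotate written with 'add'. The halves occupy disjoint bit
// positions, so no carries occur and the result matches rotlOr.
uint32_t rotlAdd(uint32_t X, unsigned N) {
  return (X << N) + (X >> (32 - N));
}

// Masked-amount idiom that also accepts N == 0.
uint32_t rotlOrMasked(uint32_t X, unsigned N) {
  return (X << (N & 31u)) | (X >> ((32u - N) & 31u));
}

// With 'add', N == 0 computes X + X instead of X, so this form is not a
// rotate for every amount.
uint32_t rotlAddMasked(uint32_t X, unsigned N) {
  return (X << (N & 31u)) + (X >> ((32u - N) & 31u));
}
```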
Allocatable or pointer module variables with the CUDA managed attribute
are defined with a double descriptor: one on the host and one on the
device. Only the data pointed to by the descriptor will be allocated in
managed memory.
Allow the registration of any allocatable or pointer module variables
like device or constant.
- When developing the RootSignatureLexer library, we are creating new
files, so we should set the standard of adhering to the coding
conventions for function naming.
- This was missed in the initial review but caught in the review of the
parser PR
[here](https://github.com/llvm/llvm-project/pull/133302#discussion_r2017632092)
Co-authored-by: Finn Plummer <finnplummer@microsoft.com>
Implement extended intrinsic PUTENV, both function and subroutine forms.
Add PUTENV documentation to flang/docs/Intrinsics.md. Add functional and
semantic unit tests.
The main issue was that the kernel expected `suseconds_t` to be 64 bits
but ours was 32. This caused inconsistent failures since all valid
`suseconds_t` values are less than 1000000 (1 million), and some
configurations caused `struct timeval` to be padded to 128 bits.
Also: forgot to use TEST_FILE instead of FILE_PATH in some places.
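
The `suseconds_t` size mismatch above can be pictured with two toy layouts. This is a sketch under assumptions, not the libc's actual definitions: the struct names are hypothetical, and the observations hold for a little-endian target where `int64_t` is 8-byte aligned. One plausible reading of the inconsistent failures is that, with tail padding, both layouts have the same size and the low 32 bits line up, so small values still read back correctly.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Layout the kernel is described as expecting: 64-bit microseconds.
struct WideTimeval {
  int64_t tv_sec;
  int64_t tv_usec;
};

// Hypothetical stand-in for the old layout: 32-bit microseconds. With
// 8-byte alignment the struct is padded out to 16 bytes (128 bits).
struct NarrowTimeval {
  int64_t tv_sec;
  int32_t tv_usec;
};

// Fill the wide layout, then view its leading bytes through the narrow
// layout, as would happen if the two sides disagreed on the type.
int32_t usecSeenThroughNarrow(int64_t Sec, int64_t Usec) {
  static_assert(sizeof(NarrowTimeval) <= sizeof(WideTimeval),
                "narrow layout must not be larger than the wide one");
  WideTimeval Wide{Sec, Usec};
  NarrowTimeval Narrow{};
  std::memcpy(&Narrow, &Wide, sizeof(Narrow));
  return Narrow.tv_usec;
}
```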
The `lldbassert` macro in LLDB behaves like a regular `assert` when
assertions are enabled, and otherwise prints a pretty backtrace and
prompts the user to file a bug. By default, this is emitted as a
diagnostic event, but vendors can provide their own behavior, for
example pre-populating a bug report.
Recently, we ran into an issue where an `lldbassert` (in the Swift
language plugin) would fire excessively, to the point that it was
interfering with the usability of the debugger.
Once an `lldbassert` has fired, there's no point in bothering the user
over and over again for the same problem. This PR solves the problem by
introducing a static `std::once_flag` in the macro. This way, every
`lldbassert` will fire at most once per lldb instance.
rdar://148520448
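
The once-per-site behavior can be sketched with a toy macro. `TOY_ASSERT_ONCE`, `reportFailure`, and `ReportCount` are hypothetical and this is not LLDB's actual macro; the sketch only shows how a static `std::once_flag` inside the expansion limits each use site to one report.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical reporting hook; counts how often a failure is reported.
static int ReportCount = 0;
void reportFailure(const char *Expr) {
  (void)Expr; // a real hook would print the expression and a backtrace
  ++ReportCount;
}

// Each expansion gets its own static flag, so a failing condition at a
// given source location is reported at most once per process.
#define TOY_ASSERT_ONCE(X)                                                     \
  do {                                                                         \
    static std::once_flag Flag;                                                \
    if (!(X))                                                                  \
      std::call_once(Flag, [] { reportFailure(#X); });                         \
  } while (0)

// One use site that may fail repeatedly.
void checkPositive(int V) { TOY_ASSERT_ONCE(V > 0); }
```

Calling `checkPositive(-1)` in a loop reports the failure on the first iteration only.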
Spimm in the spec refers to the 2-bit encoded value. All of the code
uses the 0, 16, 32, or 48 adjustment value.
Also remove decodeZcmpSpimm, as it's identical to the default
behavior with no custom DecoderMethod.
OpenACC 3.3-NEXT has changed the way tags for copy, copyin, copyout, and
create clauses are specified, adding a few extra tags and permitting
them as a list. This patch encodes these as a bitmask enum so they can
be stored succinctly but still be diagnosed reasonably.
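
A bitmask enum of this kind might look like the sketch below. The enum name, the enumerators, and the helpers are placeholders for illustration and are not taken from the patch; the point is that a list of tags fits in one small integer while each tag can still be tested individually for diagnostics.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder tag set; each tag occupies one bit.
enum class ToyModifier : uint8_t {
  None = 0,
  Always = 1 << 0,
  AlwaysIn = 1 << 1,
  AlwaysOut = 1 << 2,
  Readonly = 1 << 3,
  Zero = 1 << 4,
};

// Combine two tag sets.
constexpr ToyModifier operator|(ToyModifier A, ToyModifier B) {
  return static_cast<ToyModifier>(static_cast<uint8_t>(A) |
                                  static_cast<uint8_t>(B));
}

// Test whether every bit of M is present in Set.
constexpr bool hasModifier(ToyModifier Set, ToyModifier M) {
  return (static_cast<uint8_t>(Set) & static_cast<uint8_t>(M)) ==
         static_cast<uint8_t>(M);
}
```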
The code sanitizer is failing with this error: `Execution cannot reach
this statement.`
The code path exits early at line 928 if `(Lil && Ril)` is true.
There are some system libraries such as sqlite3 which forward declare a
struct then use a pointer to that forward declared type in various APIs.
Ignore these types in ForwardDeclChecker, like other pointer types.
We were previously telling the user how many arguments were passed to
the attribute rather than saying how many arguments were expected to be
passed to the callback function. This rewords the diagnostic to
hopefully be a bit more clear.
Fixes #47451
When I introduced the various `_LIBCPP_INTRODUCED_IN_LLVM_XY_ATTRIBUTE`
macros in 182f5e9b2f03, I tried to correlate them to the right OS
versions, but it seems that I made a few mistakes. This wasn't caught in
the CI because we don't test back-deployment that far.
rdar://148405946
TLS relocations may not have a valid BOLT symbol associated with them.
While symbolizing the operand, we were checking for the symbol value,
and since there was no symbol the check resulted in a crash.
Handle the TLS case while performing operand symbolization on AArch64.