The "preserve input debug-info format" flag allowed some tooling to opt
into not seeing the new debug records yet, and to not autoupgrade. This
was good at the time, but un-necessary now that we'll be ditching
intrinsics shortly.
It also hides errors now: verify-uselistorder was hardcoding this flag
to on, and as a result it hasn't seen debug records before. Thus, we
missed a uselistorder variation: constant expressions such as GEPs can
be contained within debug records and be completely isolated from the
value hierarchy; see the metadata-use-uselistorder.ll test. These Values
didn't get ordered, but were legitimate uses of constants like "i64 0",
and we now run into difficulty handling that. The patch to AsmWriter now
seeks out Values to order even through debug-info.
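As a hedged sketch (not the metadata-use-uselistorder.ll test itself),
the situation looks roughly like this: the constant GEP is used only
from within a debug record, so nothing in the instruction stream refers
to it, and use-list ordering has to look through debug-info to find it.
```llvm
; Sketch only: the GEP constant expression below appears solely inside the
; #dbg_value record, never as an instruction operand, so it (and the "i64 0"
; it uses) previously escaped use-list ordering.
@arr = global [4 x i32] zeroinitializer

define void @f() !dbg !4 {
entry:
    #dbg_value(ptr getelementptr inbounds ([4 x i32], ptr @arr, i64 0, i64 0), !7, !DIExpression(), !9)
  ret void
}

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2}

!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, emissionKind: FullDebug)
!1 = !DIFile(filename: "t.c", directory: "/")
!2 = !{i32 2, !"Debug Info Version", i32 3}
!4 = distinct !DISubprogram(name: "f", file: !1, unit: !0, spFlags: DISPFlagDefinition)
!7 = !DILocalVariable(name: "p", scope: !4, file: !1, line: 1, type: !8)
!8 = !DIBasicType(name: "long", size: 64, encoding: DW_ATE_signed)
!9 = !DILocation(line: 1, scope: !4)
```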
Finally, there are a few intrinsics tests relying on this flag that we
can just delete, such as one in llvm-reduce and another few in the
LocalTest unit tests. The fast-isel test was added in
https://reviews.llvm.org/D67703 explicitly to check the size of blocks
without debug-info, and in 1525abb9c94 the codepath it tests moved
towards being sunsetted. It'll be totally redundant once RemoveDIs is on
permanently.
Note that there's now no explicit test for the textual-IR autoupgrade
path. I submit that we can rely on the thousands of .ll files where
we've only bothered to update the outputs, not the inputs, to debug
records.
During the transition from debug intrinsics to debug records, we used
several different command line options to customise handling: the
printing of debug records to bitcode and textual IR could be independent
of how the debug-info was represented inside a module, and whether the
autoupgrader ran could be customised separately. This was all valuable
during development, but now that totally removing debug intrinsics is
coming up, this patch removes those options in favour of a single flag
(experimental-debuginfo-iterators), which enables autoupgrade, in-memory
debug records, and debug record printing to bitcode and textual IR.
We need to do this ahead of removing the
experimental-debuginfo-iterators flag, to reduce the amount of
test-juggling that happens at that time.
There are quite a number of weird test behaviours related to this --
some of which I simply delete in this commit. Things like
print-non-instruction-debug-info.ll: the test suite now checks for
debug records in all tests, and we don't want to check that we can
print as intrinsics. Or the update_test_checks tests -- these are
duplicated with write-experimental-debuginfo=false to ensure that file
writing for intrinsics is correct, but that's something we're imminently
going to delete.
A short survey of curious test changes:
* free-intrinsics.ll: we don't need to test that debug-info is a zero
cost intrinsic, because we won't be using intrinsics in the future.
* undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory
mode while we sorted something out; it works now either way.
* salvage-cast-debug-info.ll: was testing that intrinsics-in-memory get
salvaged; that isn't necessary now.
* localize-constexpr-debuginfo.ll: was producing "dead metadata"
intrinsics for optimised-out variable values; debug records take the
(correct) representation of poison/undef as an operand. Looks like we
didn't update this in the past to avoid spurious test differences.
* Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing
that debug-info affected codegen, and we deferred updating the tests
until now. This is just one of those silent codegen-change issues that
get fixed by RemoveDIs.
Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc,
that checks we can autoupgrade debug intrinsics that are in bitcode into
the new debug records.
This adds DWARF generation for fixed-point types. This feature is needed
by Ada.
Note that a pre-existing GNU extension is used in one case. This has
been emitted by GCC for years, and is needed because standard DWARF is
otherwise incapable of representing these types.
In Ada, an array can be packed and the elements can take less space than
their natural object size. For example, for this type:
type Packed_Array is array (4 .. 8) of Boolean;
pragma pack (Packed_Array);
... each element of the array occupies a single bit, even though the
"natural" size for a Boolean in memory is a byte.
In DWARF, this is represented by putting a DW_AT_bit_stride onto the
array type itself.
This patch adds a bit stride to DICompositeType so that gnat-llvm can
emit DWARF for these sorts of arrays.
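As a rough sketch of what this might look like in textual IR (the
spelling of the new bit-stride field here is an assumption; consult the
LangRef for the authoritative syntax), a packed Boolean array with a
one-bit stride could be described along these lines:
```llvm
; Sketch only: the bit-stride field spelling is an assumption; see the
; LangRef for the authoritative syntax.
!0 = !DIBasicType(name: "Boolean", size: 8, encoding: DW_ATE_boolean)
!1 = !{!2}
!2 = !DISubrange(lowerBound: 4, upperBound: 8)
!3 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, elements: !1,
                      bitStride: 1)
```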
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.
For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibility purposes, and instead
write out the needed Triple()s or str()s explicitly. This is because I
think a decent number of them should be changed to work on Triple as
well, to avoid unnecessary conversions back and forth.
The only interesting part of this patch is that the default triple is
Triple("") instead of Triple(), to preserve existing behavior. The
former defaults to the ELF object format instead of the unknown object
format; we should fix that as well.
This patch adds a function attribute `riscv_vls_cc` for the RISC-V VLS
calling convention, which takes 0 or 1 argument. The argument is the
`ABI_VLEN`, i.e. the `VLEN` used for passing fixed-vector arguments:
each such argument is wrapped as a scalable vector (VLA) using the
`ABI_VLEN` and handled by the corresponding mechanism. The range of
`ABI_VLEN` is [32, 65536]; if not specified, the default value is 128.
Here is an example of VLS argument passing:
Non-VLS call:
```
void original_call(__attribute__((vector_size(16))) int arg) {}
=>
define void @original_call(i128 noundef %arg) {
entry:
...
ret void
}
```
VLS call:
```
void __attribute__((riscv_vls_cc(256))) vls_call(__attribute__((vector_size(16))) int arg) {}
=>
define riscv_vls_cc void @vls_call(<vscale x 1 x i32> %arg) {
entry:
...
ret void
}
```
The first, non-VLS call passes the generic 16-byte vector argument as a
flattened integer. In contrast, the VLS call uses `ABI_VLEN=256`, which
wraps the vector into `<vscale x 1 x i32>`, where the number of
scalable-vector elements is calculated as
`ORIG_ELTS * RVV_BITS_PER_BLOCK / ABI_VLEN`.
Note: ORIG_ELTS = Vector Size / Element Type Size = 128 / 32 = 4, so the
element count here is 4 * 64 / 256 = 1.
PsABI PR: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/418
C-API PR: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/68
An Ada program can have types that are subranges of other types. This
patch adds a new DIType node, DISubrangeType, to represent this concept.
I considered extending the existing DISubrange to do this, but as
DISubrange does not derive from DIType, that approach seemed more
disruptive.
A DISubrangeType can be used both as an ordinary type and as the type
of an array index. This is also important for Ada.
Ada subrange types can also be stored using a bias. Representing this in
DWARF required the use of an extension. GCC has been emitting this
extension for years, so I've reused it here.
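As an illustrative sketch only (the field names below are assumptions;
the patch and LangRef define the authoritative syntax), a biased Ada
subrange might be described along these lines:
```llvm
; Sketch with assumed field names; see the LangRef entry added by this
; patch for the exact syntax.
!0 = !DIBasicType(name: "integer", size: 32, encoding: DW_ATE_signed)
!1 = !DISubrangeType(name: "small_int", baseType: !0, size: 8,
                     lowerBound: 100, upperBound: 200, bias: 100)
```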
Model the C/C++ `errno` macro by adding a corresponding `errno`
memory location kind to the IR. This is preliminary work to separate
`errno` writes from other memory accesses, to the benefit of
alias analysis and optimization correctness.
Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
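For example, the new location kind surfaces through the `memory`
attribute (a hedged sketch; the `errnomem` spelling and the declarations
below are assumptions for illustration):
```llvm
; Sketch: a function that writes no memory other than errno.
declare i32 @set_errno_only(i32 %err) memory(errnomem: write)

; And one that reads errno but accesses nothing else.
declare i32 @read_errno_only() memory(errnomem: read)
```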
When creating `EnumDecl`s from DWARF for Objective-C `NS_ENUM`s, the
Swift compiler tries to figure out if it should perform "swiftification"
of that enum (which involves renaming the enumerator cases, etc.). The
heuristic by which it determines whether to swiftify an enum is to check
the `enum_extensibility` attribute (because that's what `NS_ENUM`s
pretty much are). Currently LLDB fails to attach the
`EnumExtensibilityAttr` to `EnumDecl`s it creates (because there's not
enough info in DWARF to derive it), which means we have to fall back to
re-building Swift modules on-the-fly, slowing down expression evaluation
substantially. This happens around
4b3931c8ce/lib/ClangImporter/ImportEnumInfo.cpp (L37-L59)
To speed up Swift expression evaluation, this patch proposes encoding
the C/C++/Objective-C `enum_extensibility` attribute in DWARF via a new
`DW_AT_APPLE_enum_kind` attribute. This would currently only be used
from the LLDB Swift plugin, but it may be of interest to other language
plugins as well (though I haven't come up with a concrete use-case for
it outside of Swift).
I'm open to naming suggestions of the various new attributes/attribute
constants proposed here. I tried to be as generic as possible if we
wanted to extend it to other kinds of enum properties (e.g., flag
enums).
The new attribute would look as follows:
```
DW_TAG_enumeration_type
DW_AT_type (0x0000003a "unsigned int")
DW_AT_APPLE_enum_kind (DW_APPLE_ENUM_KIND_Closed)
DW_AT_name ("ClosedEnum")
DW_AT_byte_size (0x04)
DW_AT_decl_file ("enum.c")
DW_AT_decl_line (23)
DW_TAG_enumeration_type
DW_AT_type (0x0000003a "unsigned int")
DW_AT_APPLE_enum_kind (DW_APPLE_ENUM_KIND_Open)
DW_AT_name ("OpenEnum")
DW_AT_byte_size (0x04)
DW_AT_decl_file ("enum.c")
DW_AT_decl_line (27)
```
Absence of the attribute means the extensibility of the enum is unknown
and abides by whatever the language rules of that CU dictate.
This does feel like a big hammer for quite a specific use-case, so I'm
happy to discuss alternatives.
Alternatives considered:
* Re-using an existing DWARF attribute to express extensibility. E.g., a
`DW_TAG_enumeration_type` could have a `DW_AT_count` or
`DW_AT_upper_bound` indicating the number of enumerators, which could
imply closed-ness. I felt like a dedicated attribute (which could be
generalized further) seemed more applicable. But I'm open to re-using
existing attributes.
* Encoding the entire attribute string (i.e., `DW_TAG_LLVM_annotation
("enum_extensibility((open))")`) on the `DW_TAG_enumeration_type`. Then
in LLDB somehow parse that out into an `EnumExtensibilityAttr`. I haven't
found a great API in Clang to parse arbitrary strings into AST nodes
(the ones I've found required fully formed C++ constructs). Though if
someone knows of a good way to do this, happy to consider that too.
When we apply cloning decisions in the ThinLTO backend, we need to find
the corresponding summary for each function in the IR, and in some cases
for callee functions. This is complicated when the function was a
promoted local, in which case the GUID was formed from the hash of the
original source file prepended to the function name. Those functions
can be identified by the fact that they were given a ".llvm." suffix
during promotion.
We previously didn't do this correctly for promoted locals imported from
other modules, as we only tried the current module's source file name.
This led to crashes, in particular when the current module also had a
local function of the same original name. Specifically, we were
attempting to iterate through the wrong summary's callsites, and there
were fewer than in the actual function, so we accessed data off the end
(in a release build with assertion checking off; with assertion checking
on we double-check the stack ids and that would have failed). Even if we
hadn't crashed or hit an assert, we could have applied the wrong cloning
decisions, leading to unsats at link time.
Luckily, function importing attaches thinlto_src_file metadata
containing the original source file name to all imported functions. It
doesn't do this by default, but it always does if MemProf context
disambiguation is enabled. Therefore, we can just look to see if the
function contains this metadata and, if so, use it to recreate the
original GUID.
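For illustration, a hedged sketch of what an imported, promoted local
might look like in the ThinLTO backend (the single-string metadata shape
and all names/hash values here are assumptions for illustration):
```llvm
; Sketch: imported copy of a promoted local. The ".llvm." suffix marks the
; promotion, and the recorded source file name lets the backend recreate
; the original GUID.
define hidden void @foo.llvm.123456789() !thinlto_src_file !0 {
  ret void
}

!0 = !{!"original_module.cpp"}
```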
A similar issue can occur when looking for the ValueInfo / GUID of
a direct tail call to see if we synthesized a callsite record for a
missing tail call frame. In that case, the callee function may be a
declaration, if we imported its caller but not the callee function
definition. Because imported declarations don't get the thinlto_src_file
metadata, we instead look at its caller (which works because this
happens very early in the backend before any inlining).
Add a new AutoUpgrade function to convert some legacy nvvm.annotations
metadata to function-level attributes. These attributes are quicker to
look up, so they improve compile time, and they are more idiomatic than
metadata, which should not carry information that is required for the
meaning of the program.
Currently supported annotations are:
- !"kernel" -> ptx_kernel calling convention
- !"align" -> alignstack parameter attributes (return not yet supported)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.
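Concretely, the mechanical replacement looks like this:
```llvm
; Before:
declare void @old(ptr nocapture %p)

; After (what the upgrade produces):
declare void @new(ptr captures(none) %p)
```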
Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
make it easier to use old IR files and somewhat reduce the test churn in
this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
attribute. The representation in the LLVM IR dialect should be updated
separately.
This introduces the `captures` attribute as described in:
https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420
This initial patch only introduces the IR/bitcode support for the
attribute and its in-memory representation as `CaptureInfo`. This will
be followed by a patch to upgrade and remove the `nocapture` attribute,
and then by actual inference/analysis support.
Based on the RFC feedback, I've used a syntax similar to the `memory`
attribute, though the only "location" that can be specified is `ret`.
I've added some pretty extensive documentation to LangRef on the
semantics. One non-obvious bit here is that using ptrtoint will not
result in a "return-only" capture, even if the ptrtoint result is only
used in the return value. Without this requirement we wouldn't be able
to continue ordinary capture analysis on the return value.
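A small sketch of the syntax (mirroring the LangRef in spirit; treat the
specific declarations as illustrative):
```llvm
; Nothing about the pointer may be captured.
declare void @no_capture(ptr captures(none) %p)

; Only the pointer's address (not its provenance) may be captured.
declare void @address_only(ptr captures(address) %p)

; The pointer may be captured only through the return value.
declare ptr @ret_only(ptr captures(ret: address, provenance) %p)
```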
This consists of:
* Make these instructions part of FPMathOperator.
* Adjust bitcode/IR readers/writers to expect fast math flags on these
instructions.
* Make IRBuilder set the fast math flags on these instructions.
* Update LangRef and release notes.
* Update a bunch of tests. Some of these are due to InstCombineCasts
incorrectly adding fast math flags to fptrunc, which will be fixed in a
later patch.
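For example (assuming, per the test note above, that fptrunc/fpext are
among the instructions gaining fast math flag support):
```llvm
define half @trunc_with_flags(float %x) {
  ; fast math flags can now be attached directly to the cast
  %h = fptrunc nnan ninf float %x to half
  ret half %h
}

define double @ext_with_flags(float %x) {
  %d = fpext fast float %x to double
  ret double %d
}
```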
Add a specification attribute to LLVM DebugInfo, which is analogous
to DWARF's DW_AT_specification. According to the DWARF spec:
"A debugging information entry that represents a declaration that
completes another (earlier) non-defining declaration may have a
DW_AT_specification attribute whose value is a reference to the
debugging information entry representing the non-defining declaration."
This patch allows types to be specifications of other types. This is
used by Swift to represent generic types. For example, given this Swift
program:
```
struct MyStruct<T> {
let t: T
}
let variable = MyStruct<Int>(t: 43)
```
The Swift compiler emits (roughly) an unsubstituted type for MyStruct<T>:
```
DW_TAG_structure_type
DW_AT_name ("MyStruct")
// "$s1w8MyStructVyxGD" is a Swift mangled name roughly equivalent to
// MyStruct<T>
DW_AT_linkage_name ("$s1w8MyStructVyxGD")
// other attributes here
```
And a specification for MyStruct<Int>:
```
DW_TAG_structure_type
DW_AT_specification (<link to "MyStruct">)
// "$s1w8MyStructVySiGD" is a Swift mangled name equivalent to
// MyStruct<Int>
DW_AT_linkage_name ("$s1w8MyStructVySiGD")
DW_AT_byte_size (0x08)
// other attributes here
```
An extra inhabitant is a bit pattern that does not represent a valid
value for instances of a given type. The number of extra inhabitants is
the number of those bit configurations.
This is used by Swift to save space when composing types. For example,
because Bool only needs 2 bit patterns to represent all of its values
(true and false), an Optional<Bool> can occupy just 1 byte in memory by
using a bit configuration that is unused by Bool. Which bit patterns are
unused is part of the ABI of the language.
Since Swift generics are not monomorphized, when using dynamic libraries
you can have generic types whose size, alignment, etc. are known only
at runtime (which is why this feature is needed).
This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF
as an Apple extension.
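A hedged sketch of what this looks like on a debug-info type node
(assuming the field is spelled `num_extra_inhabitants`; the counts are
illustrative):
```llvm
; Sketch: a Bool-like type that uses 2 of its 256 one-byte bit patterns,
; leaving 254 extra inhabitants available to enclosing types.
!0 = !DIBasicType(name: "Bool", size: 8, encoding: DW_ATE_boolean,
                  num_extra_inhabitants: 254)
```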
StructType::setBody is the only mechanism that can potentially create
recursion in the type system. Add a runtime check that it is not
actually used to create recursion.
If the check fails, report an error from LLParser, BitcodeReader and
IRLinker. In all other cases assert that the check succeeds.
In future StructType::setBody will be removed in favor of specifying the
body when the type is created, so any performance hit from this runtime
check will be temporary.
Type::isScalableTy and StructType::containsScalableVectorType failed to
detect some cases of structs containing scalable vectors because
containsScalableVectorType did not call back into isScalableTy to check
the element types. Fix this, which requires sharing the same Visited set
in both functions. Also change the external API so that callers are
never required to pass in a Visited set, and normalize the naming to
isScalableTy.
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e., just lookup, no creation).
This change implements support of metadata strings in operand bundle
values. It makes possible calls like:
call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]
It requires some extension of the bitcode serialization. As SSA values
and metadata are stored in different tables, there must be a way to
distinguish them during deserialization. It is implemented by putting a
special marker before the metadata index. The marker cannot be treated
as a reference to any SSA value, so it unambiguously identifies
metadata. It allows extending the bitcode serialization without breaking
compatibility.
Metadata operand bundle values are intended to be used in floating-point
function calls. They would represent the same information that is
currently passed via the constrained intrinsics' arguments.
When the lexer hits an error during assembly parsing, it just logs the
error in the `ErrorMsg` object, and it's possible that error gets
overwritten later by the parser with a more generic error message. This
makes some errors reported by LLVM's asm parser less precise.
Address this by not having the parser overwrite the message logged by
the lexer: assign error messages generated by the lexer a higher
"priority" than those generated by the parser, and overwrite the error
message only if the new one has the same or higher priority.
Update several Assembler unit tests to check the more precise error
messages now reported by LLVM's AsmParser.
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
During the ThinLTO indexing step for one of our large applications, we
create 4 million instances of FunctionSummary.
Changing:
std::vector<EdgeTy> CallGraphEdgeList;
to:
SmallVector<EdgeTy, 0> CallGraphEdgeList;
in FunctionSummary reduces the size of each instance by 8 bytes. The
rest of the patch makes the same change to other places so that the
types stay compatible across function boundaries.
The primary motivation is to remove `EntryCount` from `FunctionSummary`.
This frees 8 bytes out of `sizeof(FunctionSummary)` (136 bytes as of
64498c5483).
While I'm at it, this PR cleans up {SummaryBasedOptimizations,
SyntheticCountsPropagation} since they were not used and there are no
plans to invest in them further.
With this patch, the bitcode writer writes a placeholder 0 at the byte
offset of `EntryCount`, and the bitcode reader can parse the function
entry count at the correct byte offset. Added a TODO to stop writing
`EntryCount` and bump the bitcode version.
During the ThinLTO indexing step for one of our large applications, we
create 7.5 million instances of GlobalValueSummary.
Changing:
std::vector<ValueInfo> RefEdgeList;
to:
SmallVector<ValueInfo, 0> RefEdgeList;
in GlobalValueSummary reduces the size of each instance by 8 bytes.
The rest of the patch makes the same change to other places so that
the types stay compatible across function boundaries.
Since IR Types are immutable it makes sense to check them on
construction instead of in the IR Verifier pass.
This patch checks that some TargetExtTypes are well-formed in the sense
that they have the expected number of type parameters and integer
parameters. When called from LLParser it gives a diagnostic message.
When called from anywhere else it just asserts that they are
well-formed.
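For instance (a hypothetical sketch: whether this particular target
extension type is among those checked is an assumption), a type that
takes no parameters would now be rejected at parse time if given one:
```llvm
; Well-formed: takes no type or integer parameters.
%ok = type target("aarch64.svcount")

; Would now be diagnosed by LLParser rather than later by the Verifier,
; assuming this type is among those checked:
; %bad = type target("aarch64.svcount", i32)
```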
After #98505, the textual IR keyword `x86_mmx` was temporarily made to
parse as `<1 x i64>`, so as not to require a lot of test update noise.
This completes the removal of the type, by removing the `x86_mmx`
keyword from the IR parser, and making the (now no-op) test updates via
`sed -i 's/\bx86_mmx\b/<1 x i64>/g' $(git grep -l x86_mmx llvm/test/)`.
Resulting bitcasts from <1 x i64> to itself were then manually deleted.
Changes to llvm/test/Bitcode/compatibility-$VERSION.ll were reverted, as
they're intended to be equivalent to the .bc file, if parsed by old
LLVM, so shouldn't be updated.
A few tests were removed, as they're no longer testing anything, in the
following files:
- llvm/test/Transforms/GlobalOpt/x86_mmx_load.ll
- llvm/test/Transforms/InstCombine/cast.ll
- llvm/test/Transforms/InstSimplify/ConstProp/gep-zeroinit-vector.ll
Works towards issue #98272.
The use of SmallVector here saves 1.03% of heap allocations during the
compilation of X86ISelLowering.cpp.ll, a .ll version of
X86ISelLowering.cpp. The 8 inline elements cover greater than 99% of
invocations encountered here.
While I am at it, the patch changes the parameter type to ArrayRef to
allow callers to use any vector type.
The `x86_mmx` type is now translated to `<1 x i64>`, which allows the
removal of a bunch of special casing.
This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.
This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.
Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this up before DAG type legalization, and
thus have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)
Works towards issue #98272.