isAllOnes() should return true for zero bit values because
there are no zeros in it.
Thanks to Jay Foad for pointing this out.
Differential Revision: https://reviews.llvm.org/D111241
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
Currently the max alignment representable is 1GB, see D108661.
Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945.
This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits.
We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now.
The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field.
Updating clang's max allowed alignment will come in a future patch.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D110451
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
We can use the raw_string_ostream::str() method to perform the implicit flush() and return a reference to the std::string container that we can then wrap inside Twine().
These should both clearly work with our current model for zero width
integers, but don't until now!
Differential Revision: https://reviews.llvm.org/D111113
Deriving NoAlias based on having the same index in two BaseIndexOffset
expressions seemed weird (and as shown in the added unittest the
correctness of doing so depended on undocumented pre-conditions that
the user of BaseIndexOffset::computeAliasing would need to take care
of.
This patch removes the code that dereived NoAlias based on indices
being the same. As a compensation, to avoid regressions/diffs in
various lit test, we also add a new check. The new check derives
NoAlias in case the two base pointers are based on two different
GlobalValue:s (neither of them being a GlobalAlias).
Reviewed By: niravd
Differential Revision: https://reviews.llvm.org/D110256
This fixes a bug detected in DAGCombiner when using global alias
variables. Here is an example:
@foo = global i16 0, align 1
@aliasFoo = alias i16, i16 * @foo
define i16 @bar() {
...
store i16 7, i16 * @foo, align 1
store i16 8, i16 * @aliasFoo, align 1
...
}
BaseIndexOffset::computeAliasing would incorrectly derive NoAlias
for the two accesses in the example above, resulting in DAGCombiner
miscompiles.
This patch fixes the problem by a defensive approach letting
BaseIndexOffset::computeAliasing return false, i.e. that the aliasing
couldn't be determined, when comparing two global values and at least
one is a GlobalAlias. In the future we might improve this with a
deeper analysis to look at the aliasee for the GlobalAlias etc. But
that is a bit more complicated considering that we could have
'local_unnamed_addr' and situations with several 'alias' variables.
Fixes PR51878.
Differential Revision: https://reviews.llvm.org/D110064
The delayed stack protector feature which is currently used for SDAG (and thus
allows for more commonly generating tail calls) depends on being able to extract
the tail call into a separate return block. To do this it also has to extract
the vreg->physreg copies that set up the call's arguments, since if it doesn't
then the call inst ends up using undefined physregs in it's new spliced block.
SelectionDAG implementations can do this because they delay emitting register
copies until *after* the stack arguments are set up. GISel however just
processes and emits the arguments in IR order, so stack arguments always end up
last, and thus this breaks the code that looks for any register arg copies that
precede the call instruction.
This patch adds a thunk argument to the assignValueToReg() and custom assignment
hooks. For outgoing arguments, register assignments use this return param to
return a thunk that does the actual generating of the copies. We collect these
until all the outgoing stack assignments have been done and then execute them,
so that the copies (and perhaps some artifacts like G_SEXTs) are placed after
any stores.
Differential Revision: https://reviews.llvm.org/D110610
Stop using APInt constructors and methods that were soft-deprecated in
D109483. This fixes all the uses I found in llvm, except for the APInt
unit tests which should still test the deprecated methods.
Differential Revision: https://reviews.llvm.org/D110807
This patch adds the functionalities to print MDNode in tree shape. For
example, instead of printing a MDNode like this:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
```
The printTree/dumpTree functions can give you:
```
<0x5643e1166888> = !DILocalVariable(name: "foo", arg: 2, scope: <0x5643e11c9740>, file: <0x5643e11c6ec0>, line: 8, type: <0x5643e11ca8e0>, flags: DIFlagPublic | DIFlagFwdDecl, align: 8)
<0x5643e11c9740> = distinct !DISubprogram(scope: null, spFlags: 0)
<0x5643e11c6ec0> = distinct !DIFile(filename: "file.c", directory: "/path/to/dir")
<0x5643e11ca8e0> = distinct !DIDerivedType(tag: DW_TAG_pointer_type, baseType: <0x5643e11668d8>, size: 1, align: 2)
<0x5643e11668d8> = !DIBasicType(tag: DW_TAG_unspecified_type, name: "basictype")
```
Which is useful when using it in debugger. Where sometimes printing the
whole module to see all MDNodes is too expensive.
Differential Revision: https://reviews.llvm.org/D110113
This patch implements suggestion done while reviewing D102634. It adds two fields:
ParentIdx and SiblingIdx. These fields allow fast navigation to die parent and
die sibling. These fields are set at the moment when dies are loaded.
dsymutil works 2% faster with this patch(run on clang binary).
Differential Revision: https://reviews.llvm.org/D110363
Rust allows use of non-ASCII identifiers, which in Rust mangling scheme
are encoded using Punycode.
The encoding deviates from the standard by using an underscore as the
separator between ASCII part and a base-36 encoding of non-ASCII
characters (avoiding hypen-minus in the symbol name). Other than that,
the encoding follows the standard, and the decoder implemented here in
turn follows the one given in RFC 3492.
To avoid an extra intermediate memory allocation while decoding
Punycode, the interface of OutputStream is extended with an insert
method.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104366
With the removal of OrcRPCExecutorProcessControl and OrcRPCTPCServer in
6aeed7b19c4 the ORC RPC library no longer has any in-tree users.
Clients needing serialization for ORC should move to Simple Packed
Serialization (usually by adopting SimpleRemoteEPC for remote JITing).
- Introduce a skeleton outline for the GOFFAsmParser
- Before instantiating AsmParser/HLASMAsmParser, target specific asm parsers are attempted to be initialized first before proceeding. If it doesn't exist for a particular file type, we report a fatal error.
- This patch allows to properly instantiate the HLASMAsmParser on z/OS, and ensures we can write lit tests and unit tests which will involve the instantiation of asm parsers, without an assert / fatal error.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D110730
Add MCDwarfLineStr class to the public API.
Note that MCDwarfLineTableHeader::Emit(), takes MCDwarfLineStr as
an Optional<> parameter making it impossible to use the API if the class
is not publicly defined.
Reviewed By: alexander-shaposhnikov
Differential Revision: https://reviews.llvm.org/D109412
This ensures that re-creating "the same" FS results in the same UIDs for files.
In turn, this means that creating a clang module (preamble) using one in-memory
filesystem and consuming it using another doesn't create duplicate FileEntrys
for files that are the same in both FSes.
It's tempting to give the creator control over the UIDs instead. However that
requires fiddly API changes, e.g. what should the UIDs of intermediate
directories be?
This change is more "magic" but seems safe given:
- InMemoryFilesystem is used in testing more than production
- comparing UIDs across filesystems is unusual
- files with the same path and content are usually logically equivalent
(The usual reason for re-creating virtual filesystems rather than reusing them
is that typical use involves mutating their CWD and so is not threadsafe).
Differential Revision: https://reviews.llvm.org/D110711
This patch introduces the vector-predicated version of the
experimental_vector_splice intrinsic [1] at the IR level. It considers
the active vector length for both vectors and and uses a vector mask to
disable certain lanes in the result.
[1] https://reviews.llvm.org/D94708
Change originally authored by Vineet Kumar <vineet.kumar@bsc.es>
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D103898
When using a datalayout that has pointer index width != pointer size this
code triggers an assertion in Value::stripAndAccumulateConstantOffsets().
I encountered this this while compiling FreeBSD for CHERI-RISC-V.
Also update LoadsTest.cpp to use a DataLayout with index width != pointer
width to ensure this case is tested.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D110406
This patch is for fixing potential insertElement-related bugs like D93818.
```
V = UndefValue::get(VecTy);
for(...)
V = Builder.CreateInsertElementy(V, Elt, Idx);
=>
V = PoisonValue::get(VecTy);
for(...)
V = Builder.CreateInsertElementy(V, Elt, Idx);
```
Like above, this patch changes the placeholder V to poison.
The patch will be separated into several commits.
Reviewed By: aqjune
Differential Revision: https://reviews.llvm.org/D110311
With improved analysis in determining CFG equivalence that does
not require strict dominance and post-dominance conditions, we
now relax isSafeToMoveBefore() such that an instruction I can
be moved before InsertPoint even if they do not strictly dominate
each other, as long as they follow the same control flow path.
For example, we can move Instruction 0 before Instruction 1,
and vice versa.
```
if (cond1)
// Instruction 0: %add = add i32 1, 2
if (cond1)
// Instruction 1: %add2 = add i32 2, 1
```
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D110456
Add a llvm::Split() implementation that can be used via range-for loop,
e.g.:
for (StringRef x : llvm::Split("foo,bar,baz", ','))
...
The implementation uses an additional SplittingIterator class that
uses StringRef::split() internally.
Differential Revision: https://reviews.llvm.org/D110496
This is a followup to D109844 (and alternative to D109907), which
integrates the new "earliest escape" tracking into AliasAnalysis.
This is done by replacing the pre-existing context-free capture
cache in AAQueryInfo with a replaceable (virtual) object with two
implementations: The SimpleCaptureInfo implements the previous
behavior (check whether object is captured at all), while
EarliestEscapeInfo implements the new behavior from DSE.
This combines the "earliest escape" analysis with the full power of
BasicAA: It subsumes the call handling from D109907, considers a
wider range of escape sources, and works with AA recursion. The
compile-time cost is slightly higher than with D109907.
Differential Revision: https://reviews.llvm.org/D110368
- This patch adds in the GOFFMCAsmInfo interfaces for the z/OS target.
- This patch decouples the previously existing SystemZMCAsmInfo interface for the ELF target and the z/OS target.
- This patch also removes a small test in the SystemZAsmLexerTest.cpp. The reason for this is because, the test is set up for the s390x-ibm-linux (SystemZ ELF triple), and the test checks a function which is overridden only for the z/OS target. The reason we can't change the test to use a z/OS triple outright is because there is still missing support which prevents the successful running of a test (assert in AsmParser.cpp due to missing GOFFAsmParser support)
Reviewed By: uweigand, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D110077
- This patch adds in the GOFF mangling support to the LLVM data layout string. A corresponding additional line has been added into the data layout section in the language reference documentation.
- Furthermore, this patch also sets the right data layout string for the z/OS target in the SystemZ backend.
Reviewed By: uweigand, Kai, abhina.sreeskantharajan, MaskRay
Differential Revision: https://reviews.llvm.org/D109362
There are several places in the code that are currently broken where
we assume an Instruction is always a member of a BasicBlock that
lives in a Function. This is a problem specifically when
attempting to get the vscale_range attribute. This patch adds checks
that an Instruction's parent also has a parent!
I've added a test for a function-less @llvm.vscale intrinsic call here:
unittests/Analysis/ValueTrackingTest.cpp
When moving an entire basic block BB before InsertPoint, currently
we check for all instructions whether the operands dominates
InsertPoint, however, this can be improved such that even an
operand does not dominate InsertPoint, as long as it appears as
a previous instruction in the same BB, it is safe to move.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D110378
There are several places in the code that are currently broken as
they assume an Instruction always has a parent Function when
attempting to get the vscale_range attribute. This patch adds checks
that an Instruction has a parent.
I've added a test for a parentless @llvm.vscale intrinsic call here:
unittests/Analysis/ValueTrackingTest.cpp
Differential Revision: https://reviews.llvm.org/D110158
Removing the 'ess' suffix improves the ergonomics without sacrificing clarity.
Since this class is likely to be used more frequently in the future it's worth
some short term pain to fix this now.
The closing namespace comment prevents clang-format from dropping a
blank line after the final test. Also add in a blank line (which
simplifies merging/rebasing/etc. WIP patches).
This allows VMOVL in tail predicated loops so long as the the vector
size the VMOVL is extending into is less than or equal to the size of
the VCTP in the tail predicated loop. These cases represent a
sign-extend-inreg (or zero-extend-inreg), which needn't block tail
predication as in https://godbolt.org/z/hdTsEbx8Y.
For this a vecsize has been added to the TSFlag bits of MVE
instructions, which stores the size of the elements that the MVE
instruction operates on. In the case of multiple size (such as a
MVE_VMOVLs8bh that extends from i8 to i16, the largest size was be
chosen). The sizes are encoded as 00 = i8, 01 = i16, 10 = i32 and 11 =
i64, which often (but not always) comes from the instruction encoding
directly. A unit test was added, and although only a subset of the
vecsizes are currently used, the rest should be useful for other cases.
Differential Revision: https://reviews.llvm.org/D109706
isValidAssumeForContext can provide better results with access to the
dominator tree in some cases. This patch adjusts computeConstantRange to
allow passing through a dominator tree.
The use VectorCombine is updated to pass through the DT to enable
additional scalarization.
Note that similar APIs like computeKnownBits already accept optional dominator
tree arguments.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D110175
Add generic helper function that matches constant splat. It has option to
match constant splat with undef (some elements can be undef but not all).
Add util function and matcher for G_FCONSTANT splat.
Differential Revision: https://reviews.llvm.org/D104410
Rework getConstantstVRegValWithLookThrough in order to make it clear if we
are matching integer/float constant only or any constant(default).
Add helper functions that get DefVReg and APInt/APFloat from constant instr
getIConstantVRegValWithLookThrough: integer constant, only G_CONSTANT
getFConstantVRegValWithLookThrough: float constant, only G_FCONSTANT
getAnyConstantVRegValWithLookThrough: either G_CONSTANT or G_FCONSTANT
Rename getConstantVRegVal and getConstantVRegSExtVal to getIConstantVRegVal
and getIConstantVRegSExtVal. These now only match G_CONSTANT as described
in comment.
Relevant matchers now return both DefVReg and APInt/APFloat.
Replace existing uses of getConstantstVRegValWithLookThrough and
getConstantVRegVal with new helper functions. Any constant match is
only required in:
ConstantFoldBinOp: for constant argument that was bit-cast of float to int
getAArch64VectorSplat: AArch64::G_DUP operands can be any constant
amdgpu select for G_BUILD_VECTOR_TRUNC: operands can be any constant
In other places use integer only constant match.
Differential Revision: https://reviews.llvm.org/D104409
Finalization and deallocation actions are a key part of the upcoming
JITLinkMemoryManager redesign: They generalize the existing finalization and
deallocate concepts (basically "copy-and-mprotect", and "munmap") to include
support for arbitrary registration and deregistration of parts of JIT linked
code. This allows us to register and deregister eh-frames, TLV sections,
language metadata, etc. using regular memory management calls with no additional
IPC/RPC overhead, which should both improve JIT performance and simplify
interactions between ORC and the ORC runtime.
The SimpleExecutorMemoryManager class provides executor-side support for memory
management operations, including finalization and deallocation actions.
This support is being added in advance of the rest of the memory manager
redesign as it will simplify the introduction of an EPC based
RuntimeDyld::MemoryManager (since eh-frame registration/deregistration will be
expressible as actions). The new RuntimeDyld::MemoryManager will in turn allow
us to remove older remote allocators that are blocking the rest of the memory
manager changes.
Most PDB fields on disk are 32-bit but describe the file in terms of MSF
blocks, which are 4 kiB by default.
So PDB files can be a bit larger than 4 GiB, and much larger if you create them
with a block size > 4 kiB.
This is a first (necessary, but by far not not sufficient) step towards
supporting such PDB files. Now we don't truncate in-memory file offsets (which
are in terms of bytes, not in terms of blocks).
No effective behavior change. lld-link will still error out if it were to
produce PDBs > 4 GiB.
Differential Revision: https://reviews.llvm.org/D109923
Summary:
add a new API seek for the Cursor class in the DataExtractor.cpp
Reviewers: James Henderson, Fangrui Song
Differential Revision: https://reviews.llvm.org/D109603
New field `elements` is added to '!DIImportedEntity', representing
list of aliased entities.
This is needed to dump optimized debugging information where all names
in a module are imported, but a few names are imported with overriding
aliases.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D109343
MethodWrapperHandler removes some of the boilerplate when writing wrapper
functions to wrap method calls. It can be used as a handler for wrapper
functions whose first argument is an ExecutorAddress: the address is cast to a
pointer of the given class type, then the given method function pointer is
called on that object pointer (passing the rest of the arguments).
E.g.
class MyClass {
public:
void myMethod(uint32_t, bool) { ... }
};
// SPS Method signature for myMethod -- note MyClass object address as first
// argument.
using SPSMyMethodWrapperSignature =
SPSTuple<SPSExecutorAddress, uint32_t, bool>;
// Wrapper function for myMethod.
WrapperFunctionResult
myMethodCallWrapper(const char *ArgData, size_t ArgSize) {
return WrapperFunction<SPSMyMethodWrapperSignature>::handle(
ArgData, ArgSize, makeMethodWrapperHandler(&MyClass::myMethod));
}