This reverts ce1b24cca886eb9f05e252f399059167473d0ba3.
Since opaque pointers are enabled, we can safely revert the patch now.
Opaque ptr cleanup effort (NFC).
This removes `CreateMalloc` from `CallInst` and adds it to the `IRBuilderBase`
class.
We no longer needed the `Instruction *InsertBefore` and
`BasicBlock *InsertAtEnd` arguments of the `createMalloc` helper
function because we're using `IRBuilder` now. That's why I we also don't
need 4 `CreateMalloc` functions, but only two.
Differential Revision: https://reviews.llvm.org/D158861
Rather than have distinct implementations of memcpy.inline and memmove
creators, just use the existing CreateMemTransferInst, which plain
memcpy already uses. The AMD backend already generates memmove from
that function.
Reviewed By: gchatelet, jroelofs
Differential Revision: https://reviews.llvm.org/D156991
The last use was removed by:
commit 5a7c2f17009460251cc27d9bda183d3272a419ba
Author: Eric Christopher <echristo@gmail.com>
Date: Thu Jun 22 22:58:12 2017 +0000
This reverts commit 593797ab9bedca6e9b0b7a9ed0589cf76023ab00.
I didn't realize that there was already a fix for the broken tests fd2254b7358d0f78a79784688bd8012c1a52b9cf.
A new builtin function __builtin_isfpclass is added. It is called as:
__builtin_isfpclass(<floating point value>, <test>)
and returns an integer value, which is non-zero if the floating point
argument falls into one of the classes specified by the second argument,
and zero otherwise. The set of classes is an integer value, where each
value class is represented by a bit. There are ten data classes, as
defined by the IEEE-754 standard, they are represented by bits:
0x0001 (__FPCLASS_SNAN) - Signaling NaN
0x0002 (__FPCLASS_QNAN) - Quiet NaN
0x0004 (__FPCLASS_NEGINF) - Negative infinity
0x0008 (__FPCLASS_NEGNORMAL) - Negative normal
0x0010 (__FPCLASS_NEGSUBNORMAL) - Negative subnormal
0x0020 (__FPCLASS_NEGZERO) - Negative zero
0x0040 (__FPCLASS_POSZERO) - Positive zero
0x0080 (__FPCLASS_POSSUBNORMAL) - Positive subnormal
0x0100 (__FPCLASS_POSNORMAL) - Positive normal
0x0200 (__FPCLASS_POSINF) - Positive infinity
They have corresponding builtin macros to facilitate using the builtin
function:
if (__builtin_isfpclass(x, __FPCLASS_NEGZERO | __FPCLASS_POSZERO) {
// x is any zero.
}
The data class encoding is identical to that used in llvm.is.fpclass
function.
Differential Revision: https://reviews.llvm.org/D152351
This patch introduces the reduction intrinsic for floating point minimum
and maximum which has the same semantics (for NaN and signed zero) as
llvm.minimum and llvm.maximum.
Reviewed-By: nikic
Differential Revision: https://reviews.llvm.org/D152370
These idioms already appear a number of places in code, and upcoming changes to the various sanitizers continue to need more instances of the same patterns.
Differential Revision: https://reviews.llvm.org/D145945
Instcombine prefers this canonical form (see getPreferredVectorIndex),
as does IRBuilder when passing the index as an integer so we may as
well use the prefered form from creation.
NOTE: All test changes are mechanical with nothing else expected
beyond a change of index type from i32 to i64.
Differential Revision: https://reviews.llvm.org/D140983
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
In case that opaque pointers not enabled, there may be some constexpr
bitcast uses for thread local variables and the design of llvm allow
people to sink constant arbitrarily. This breaks the assumption of
IRBuilder::CreateThreadLocalAddress. This patch tries to handle the
case.
Add a new IRBuilderBase::CreateIntrinsic which takes the return type and
argument values for the intrinsic call but does not take an explicit
list of types to mangle. Instead the builder works this out from the
intrinsic declaration and the types of the supplied arguments.
This means that the mangling is hidden from the client, which in turn
means that intrinsic definitions can change which arguments are mangled
without requiring any changes to the client code.
Differential Revision: https://reviews.llvm.org/D130776
This belongs to a series of patches which try to solve the thread
identification problem in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for a full background.
The problem consists of two concrete problems: TLS variable and readnone
functions. This patch tries to convert the TLS problem to readnone
problem by converting the access of TLS variable to an intrinsic which
is marked as readnone.
The readnone problem would be addressed in following patches.
Reviewed By: nikic, jyknight, nhaehnle, ychen
Differential Revision: https://reviews.llvm.org/D125291
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.
The idea is to get support from the compiler to easily write efficient memory function implementations.
This patch could be split in two:
- one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
- and another one for the Clang part providing the instrinsic as a builtin.
Differential Revision: https://reviews.llvm.org/D126903
I can't remove the function just yet as it is used in the generated .inc files.
I would also like to provide a way to compare alignment with TypeSize since it came up a few times.
Differential Revision: https://reviews.llvm.org/D126910
This header is very large (3M Lines once expended) and was included in location
where dwarf-specific information were not needed.
More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.
This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].
As a consequence of that patch, downstream user may need to manually some extra
files:
llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h
In some situations, codes maybe relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden
dependency now needs to be explicit.
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after: 10978519
before: 11245451
Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions
Differential Revision: https://reviews.llvm.org/D118781
This makes the statepoint methods in IRBuilder accept a
FunctionCallee, which carries both the callee and function type.
This is used to add the elementtype attribute to the statepoint call.
RS4GC requires an additional tweak to actually preserve that attribute
-- previously the attributes on the call were completely overwritten.
Differential Revision: https://reviews.llvm.org/D118886
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden ehader dependencies, something
the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest change below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k lines less to process is no that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
Currently the stepvector intrinsic only supports element types that
are integers of size 8 bits or more. This patch adds support for the
creation of stepvectors with smaller element types by creating
the intrinsic with i8 elements that we then truncate to the requested
size.
It's not currently possible to write a vectoriser test to exercise
this code path so I have added a unit test here:
llvm/unittests/IR/IRBuilderTest.cpp
Differential Revision: https://reviews.llvm.org/D113767
Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.
Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.
This reverts commit f3190dedeef9da2109ea57e4cb372f295ff53b88.
This reverts commit 749581d21f2b3f53e4fca4eb8728c942d646893b.
This reverts commit f3df87d57e096143670e0fd396e81d43393a2dd2.
This reverts commit ab1dbcecd6f0969976fafd62af34730436ad5944.
While we could emit such a tautological `select`,
it will stick around until the next instsimplify invocation,
which may happen after we count the cost of this redundant `select`.
Which is precisely what happens with loop vectorization legality checks,
and that artificially increases the cost of said checks,
which is bad.
There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`.
Refs. https://reviews.llvm.org/D109368#3089809
Use the elementtype attribute introduced in D105407 for the
llvm.preserve.array/struct.index intrinsics. It carries the
element type of the GEP these intrinsics effectively encode.
This patch:
* Adds a verifier check that the attribute is required.
* Adds it in the IRBuilder methods for these intrinsics.
* Autoupgrades old bitcode without the attribute.
* Updates the lowering code to use the attribute rather than
the pointer element type.
* Updates lots of tests to specify the attribute.
* Adds -force-opaque-pointers to the intrinsic-array.ll test
to demonstrate they work now.
https://reviews.llvm.org/D106184
Same as other CreateLoad-style APIs, these need an explicit type
argument to support opaque pointers.
Differential Revision: https://reviews.llvm.org/D105395
Specifically the CreateMaskedStore and CreateMaskedScatter APIs.
The CreateMaskedLoad and CreateMaskedGather APIs will need an
additional type argument.
There can be a need for some optimizations to get (base, offset)
for any GC pointer. The base can be calculated by generating
needed instructions as it is done by the
RewriteStatepointsForGC::findBasePointer() function. The offset
can be calculated in the same way. Though to not expose the base
calculation and to make the offset calculation as simple as
ptrtoint(derived_ptr) - ptrtoint(base_ptr), which is illegal
outside RS4GC, this patch introduces 2 intrinsics:
@llvm.experimental.gc.get.pointer.base(%derived_ptr)
@llvm.experimental.gc.get.pointer.offset(%derived_ptr)
These intrinsics are inlined by RS4GC along with generation of
statepoint sequences.
With these new intrinsics the GC parseable lowering for atomic
memcpy intrinsics (6ec2c5e402a724ba99bce82a9cac7a3006d660f4)
could be implemented as a separate pass.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100445
Adds support for scalable vectorization of loops containing first-order recurrences, e.g:
```
for(int i = 0; i < n; i++)
b[i] = a[i] + a[i - 1]
```
This patch changes fixFirstOrderRecurrence for scalable vectors to take vscale into
account when inserting into and extracting from the last lane of a vector.
CreateVectorSplice has been added to construct a vector for the recurrence, which
returns a splice intrinsic for scalable types. For fixed-width the behaviour
remains unchanged as CreateVectorSplice will return a shufflevector instead.
The tests included here are the same as test/Transform/LoopVectorize/first-order-recurrence.ll
Reviewed By: david-arm, fhahn
Differential Revision: https://reviews.llvm.org/D101076
In quite a few cases in LoopVectorize.cpp we call createStepForVF
with a step value of 0, which leads to unnecessary generation of
llvm.vscale intrinsic calls. I've optimised IRBuilder::CreateVScale
and createStepForVF to return 0 when attempting to multiply
vscale by 0.
Differential Revision: https://reviews.llvm.org/D100763