This fix checks the original LLVM IR node to identify opaque constants by
looking for the bitcast-constant pattern. Originally we looked at the generated
SDNode, but this might lead to incorrect results. The SDNode could have been
generated by an constant expression that was folded to a constant.
This fixes <rdar://problem/16050719>
llvm-svn: 201291
We are now no longer relying on the target-specific call lowering implementation
to lower a stackmap intrinsic call. Instead we perform the call lowering in a
target-independent way directly in the stackmap lowering code. This simplifies
the code and removes the need to fixup the code after the target-specific call
lowering.
llvm-svn: 201263
The ID type for the stackmap and patchpoint intrinsics are in both cases i64.
This fixes an zero extend in the SelectionDAGBuilder that still used i32. This
also updates the target independent instructions STACKMAP and PATCHPOINT to use
the correct type.
llvm-svn: 201262
Summary:
AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output.
The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as.
All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler.
Reviewers: rafael
Reviewed By: rafael
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D2686
llvm-svn: 201237
There's still one piece missing here, which is adding the
DW_AT_stmt_list to the type unit that refer's to the compile unit's line
table. Working on that.
llvm-svn: 201198
This is preliminary work to fix type unit file strings so they appear in
their originating CU's line table - but it's also just good/simple
cleanup, so I'm committing it ahead of time.
llvm-svn: 201195
We used to be pretty vague about what debug entities were what, with
many conditionals to silently drop/skip/accept things. These don't seem
to be relevant anymore.
llvm-svn: 201194
Debug info: Emit values in subregisters that do not have a separate
DWARF register number by emitting a super-register + DW_OP_bit_piece.
This is necessary because on x86_64, there are no DWARF register numbers
for i386-style subregisters.
Fixes a bunch of FIXMEs.
rdar://problem/16015314
llvm-svn: 201190
This comes up in empty files or files containing #file directives that
never reference the actual source file name. Came up in a small test of
line tables I was playing with.
llvm-svn: 201187
DWARF register number by emitting a super-register + DW_OP_bit_piece.
This is necessary because on x86_64, there are no DWARF register numbers
for i386-style subregisters.
Fixes a bunch of FIXMEs.
rdar://problem/16015314
llvm-svn: 201180
BUILD_VECTOR nodes, e.g.:
(concat_vectors (BUILD_VECTOR a1, a2, a3, a4), (BUILD_VECTOR b1, b2, b3, b4))
->
(BUILD_VECTOR a1, a2, a3, a4, b1, b2, b3, b4)
This fixes an issue with AVX, where a sequence was not recognized as a 256-bit
vbroadcast due to the concat_vectors.
llvm-svn: 201158
These methods normally call each other and it is really annoying if the
arguments are in different order. The more common rule was that the arguments
specific to call are first (GV, Encoding, Suffix) and the auxiliary objects
(Mang, TM) come after. This patch changes the exceptions.
llvm-svn: 201044
This solves a problem where a def machine operand has no uses but has
not been marked dead. In this case, the initial RP analysis was being
extra precise and determining from LiveIntervals the the register was
actually dead. This caused us to omit the register from the RP
tracker's block live out. That's all good, but the per-instruction
summary still accounted for it as a valid def. This could cause an
assertion in the tracker later when we underflow pressure.
This is from a bug report on an out-of-tree target. It is not
reproducible on well-behaved targets. I'm just making an obvious fix
without unit test.
llvm-svn: 200941
In a previous commit (r199818) we added a const_cast to an existing
subtarget info instead of creating a new one so that we could reuse
it when creating the TargetAsmParser for parsing inline assembly.
This cast was necessary because we needed to reuse the existing STI
to avoid generating incorrect code when the inline asm contained
mode-switching directives (e.g. .code 16).
The root cause of the failure was that there was an implicit sharing
of the STI between the parser and the MCCodeEmitter. To fix a
different but related issue, we now explicitly pass the STI to the
MCCodeEmitter (see commits r200345-r200351).
The const_cast is no longer necessary and we can now create a fresh
STI for the inline asm parser to use.
Differential Revision: http://llvm-reviews.chandlerc.com/D2709
llvm-svn: 200929
The aim in this patch is to reduce work that VirtRegRewriter needs to do when
telling MachineRegisterInfo which physregs are in use. Up until now
VirtRegRewriter::rewrite has been doing rewriting and populating def info and
then proceeding to set whether a physreg is used based this info for every
physreg that the target provides. This can be expensive when a target has an
unusually high number of supported physregs, and is a noticeable chunk of
compile time for small programs on such targets.
So to reduce compile time, this patch simply adds the use of a SparseSet to the
rewrite function that is used to flag each physreg that is encountered in a
MachineFunction. Afterward, rather than iterating over the set of all physregs
for a given target to set the physregs used in MachineRegisterInfo, the new way
is to iterate over the set of physregs that were actually encountered and set
in the SparseSet. This improves compile time because the existing rewrite
function was iterating over all MachineOperands already, and because the
iterations afterward to setPhysRegUsed is reduced by use of the SparseSet data.
llvm-svn: 200919
programs on targets with large register files. The root of the compile time
overhead was in the use of llvm::SmallVector to hold PhysRegEntries, which
resulted in slow-down from calling llvm::SmallVector::assign(N, 0). In contrast
std::vector uses the faster __platform_bzero to zero out primitive buffers when
assign is called, while SmallVector uses an iterator.
The fix for this was simply to replace the SmallVector with a dynamically
allocated buffer and to initialize or reinitialize the buffer based on the
total registers that the target architecture requires. The changes support
cases where a pass manager may be reused for different targets, and note that
the PhysRegEntries is allocated using calloc mainly for good for, and also to
quite tools like Valgrind (see comments for more info on this).
There is an rdar to track the fact that SmallVector doesn't have platform
specific speedup optimizations inside of it for things like this, and I'll
create a bugzilla entry at some point soon as well.
TL;DR: This fix replaces the expensive llvm::SmallVector<unsigned
char>::assign(N, 0) with a call to calloc for N bytes which is much faster
because SmallVector's assign uses iterators.
llvm-svn: 200917
large register files. The omission of Queries.clear() is perfectly safe because
LiveIntervalUnion::Query doesn't contain any data that needs freeing and
because LiveRegMatrix::runOnFunction happens to reset the OwningArrayPtr
holding Queries every time it is run, so there's no need to zero out the
queries either. Not having to do this for very large numbers of physregs
is a noticeable constant cost reduction in compilation of small programs.
llvm-svn: 200913
During DAGCombine visitShiftByConstant assumes that certain binary operations
with only constant operands can always be folded successfully. This is no longer
true when the constant is opaque. This commit fixes visitShiftByConstant by not
performing the optimization for opaque constants. Otherwise we would end up in
an infinite DAGCombine loop.
llvm-svn: 200900
find a register.
The idea is to choose a color for the variable that cannot be allocated and
recolor its interferences around. Unlike the current register allocation scheme,
it is allowed to change the color of an already assigned (but maybe not
splittable or spillable) live interval while propagating this change to its
neighbors.
In other word, there are two things that may help finding an available color:
- Already assigned variables (RS_Done) can be recolored to different color.
- The recoloring allows to catch solutions that needs to touch more that just
the neighbors of the current allocated variable.
E.g.,
vA can use {R1, R2 }
vB can use { R2, R3}
vC can use {R1 }
Where vA, vB, and vC cannot be split anymore (they are reloads for instance) and
they all interfere.
vA is assigned R1
vB is assigned R2
vC tries to evict vA but vA is already done.
=> Regular register allocation heuristic fails.
Last chance recoloring kicks in:
vC does as if vA was evicted => vC uses R1.
vC is marked as fixed.
vA needs to find a color.
None are available.
vA cannot evict vC: vC is a fixed virtual register now.
vA does as if vB was evicted => vA uses R2.
vB needs to find a color.
R3 is available.
Recoloring => vC = R1, vA = R2, vB = R3.
<rdar://problem/15947839>
llvm-svn: 200883
A bunch of test cases needed to be cleaned up for this, many my fault -
when implementid imported modules I updated test cases by simply
duplicating the prior metadata field - which wasn't always the empty
metadata entry.
llvm-svn: 200731
This changes the PrologueEpilogInserter and LocalStackSlotAllocation passes to
follow the extended stack layout rules for sspstrong and sspreq.
The sspstrong layout rules are:
1. Large arrays and structures containing large arrays (>= ssp-buffer-size)
are closest to the stack protector.
2. Small arrays and structures containing small arrays (< ssp-buffer-size) are
2nd closest to the protector.
3. Variables that have had their address taken are 3rd closest to the
protector.
Differential Revision: http://llvm-reviews.chandlerc.com/D2546
llvm-svn: 200601
Calls with inalloca are lowered by skipping all stores for arguments
passed in memory and the initial stack adjustment to allocate argument
memory.
Now the frontend is responsible for the memory layout, and the backend
doesn't have to do any work. As a result these changes are pretty
minimal.
Reviewers: echristo
Differential Revision: http://llvm-reviews.chandlerc.com/D2637
llvm-svn: 200596
Allocas marked inalloca are never static, but we were trying to put them
into the static alloca map if they were in the entry block. Also add an
assertion in x86 fastisel.
llvm-svn: 200593