llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-07 10:26:06 +00:00

Author	SHA1	Message	Date
Djordje Todorovic	df686842bc	[RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs This new MIR pass removes redundant DBG_VALUEs. After the register allocator is done, more precisely, after the Virtual Register Rewriter, we end up having duplicated DBG_VALUEs, since some virtual registers are being rewritten into the same physical register as some of the existing DBG_VALUEs. Each DBG_VALUE should indicate (at least before the LiveDebugValues) a variable's assignment, but it is being clobbered for function parameters during the SelectionDAG since it generates new DBG_VALUEs after COPY instructions, even though the parameter has no assignment. For example, if we had a DBG_VALUE $regX as an entry debug value representing the parameter, and a COPY and after the COPY, DBG_VALUE $virt_reg, and after the virtregrewrite the $virt_reg gets rewritten into $regX, we'd end up having a redundant DBG_VALUE. This breaks the definition of the DBG_VALUE since some analysis passes might be built on top of that premise..., and this patch tries to fix the MIR with respect to that. This first patch performs a backward scan, by trying to detect a sequence of consecutive DBG_VALUEs, and to remove all DBG_VALUEs describing one variable but the last one: For example: (1) DBG_VALUE $edi, !"var1", ... (2) DBG_VALUE $esi, !"var2", ... (3) DBG_VALUE $edi, !"var1", ... ... in this case, we can remove (1). By combining the forward scan that will be introduced in the next patch (from this stack), by inspecting the statistics, the RemoveRedundantDebugValues removes 15032 instructions by using gdb-7.11 as a testbed. Differential Revision: https://reviews.llvm.org/D105279	2021-07-14 04:29:42 -07:00
Matt Arsenault	eebe841a47	RegAlloc: Allow targets to split register allocation AMDGPU normally spills SGPRs to VGPRs. Previously, since all register classes are handled at the same time, this was problematic. We don't know ahead of time how many registers will be needed to be reserved to handle the spilling. If no VGPRs were left for spilling, we would have to try to spill to memory. If the spilled SGPRs were required for exec mask manipulation, it is highly problematic because the lanes active at the point of spill are not necessarily the same as at the restore point. Avoid this problem by fully allocating SGPRs in a separate regalloc run from VGPRs. This way we know the exact number of VGPRs needed, and can reserve them for a second run. This fixes the most serious issues, but it is still possible using inline asm to make all VGPRs unavailable. Start erroring in the case where we ever would require memory for an SGPR spill. This is implemented by giving each regalloc pass a callback which reports if a register class should be handled or not. A few passes need some small changes to deal with leftover virtual registers. In the AMDGPU implementation, a new pass is introduced to take the place of PrologEpilogInserter for SGPR spills emitted during the first run. One disadvantage of this is currently StackSlotColoring is no longer used for SGPR spills. It would need to be run again, which will require more work. Error if the standard -regalloc option is used. Introduce new separate -sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be controlled individually. PBQP is not currently supported, so this also prevents using the unhandled allocator.	2021-07-13 18:49:29 -04:00
Stanislav Mekhanoshin	0fdb25cd95	[AMDGPU] Disable garbage collection passes Differential Revision: https://reviews.llvm.org/D105593	2021-07-07 15:47:57 -07:00
Rong Xu	6745ffe4fa	[SampleFDO] New hierarchical discriminator for FS SampleFDO (ProfileData part) This patch was split from https://reviews.llvm.org/D102246 [SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This is mainly for ProfileData part of change. It will load FS Profile when such profile is detected. For an extbinary format profile, create_llvm_prof tool will add a flag to profile summary section. For other format profiles, the users need to use an internal option (-profile-isfs) to tell the compiler that the profile uses FS discriminators. This patch also simplified the bit API used by FS discriminators. Differential Revision: https://reviews.llvm.org/D103041	2021-06-02 10:32:52 -07:00
Arthur Eubanks	8815ce03e8	Remove "Rewrite Symbols" from codegen pipeline It breaks up the function pass manager in the codegen pipeline. With empty parameters, it looks at the -mllvm flag -rewrite-map-file. This is likely not in use. Add a check that we only have one function pass manager in the codegen pipeline. Some tests relied on the fact that we had a module pass somewhere in the codegen pipeline. addr-label.ll crashes on ARM due to this change. This is because a ARMConstantPoolConstant containing a BasicBlock to represent a blockaddress may hold an invalid pointer to a BasicBlock if the blockaddress is invalidated by its BasicBlock getting removed. In that case all referencing blockaddresses are RAUW a constant int. Making ARMConstantPoolConstant::CVal a WeakVH fixes the crash, but I'm not sure that's the right fix. As a workaround, create a barrier right before ISel so that IR optimizations can't happen while a ARMConstantPoolConstant has been created. Reviewed By: rnk, MaskRay, compnerd Differential Revision: https://reviews.llvm.org/D99707	2021-05-31 08:32:36 -07:00
Rong Xu	886629a8c9	[SampleFDO] New hierarchical discriminator for Flow Sensitive SampleFDO This patch implements first part of Flow Sensitive SampleFDO (FSAFDO). It has the following changes: (1) disable current discriminator encoding scheme, (2) new hierarchical discriminator for FSAFDO. For this patch, option "-enable-fs-discriminator=true" turns on the new functionality. Option "-enable-fs-discriminator=false" (the default) keeps the current SampleFDO behavior. When the fs-discriminator is enabled, we insert a flag variable, namely, llvm_fs_discriminator, to the object. This symbol will be checked by create_llvm_prof tool, and used to generate a profile with FS-AFDO discriminators enabled. If this happens, for an extbinary format profile, create_llvm_prof tool will add a flag to profile summary section. Differential Revision: https://reviews.llvm.org/D102246	2021-05-18 16:23:43 -07:00
Xiang1 Zhang	d4bdeca576	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-05-08 14:21:11 +08:00
Xiang1 Zhang	bebafe01a7	Revert "[X86] Support AMX fast register allocation" This reverts commit 77e2e5e07d01fe0b83c39d0c527c0d3d2e659146.	2021-05-08 13:43:32 +08:00
Xiang1 Zhang	77e2e5e07d	[X86] Support AMX fast register allocation	2021-05-08 13:27:21 +08:00
Simon Moll	1db4dbba24	Recommit "[VP,Integer,#2] ExpandVectorPredication pass" This reverts the revert 02c5ba8679873e878ae7a76fb26808a47940275b Fix: Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULES=1 builds. Original commit: This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-05-04 11:47:52 +02:00
Adrian Prantl	02c5ba8679	Revert "[VP,Integer,#2] ExpandVectorPredication pass" This reverts commit 43bc584dc05e24c6d44ece8e07d4bff585adaf6d. The commit broke the -DLLVM_ENABLE_MODULES=1 builds. http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561	2021-04-30 17:02:28 -07:00
Simon Moll	43bc584dc0	[VP,Integer,#2] ExpandVectorPredication pass This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-04-30 15:47:28 +02:00
Benjamin Kramer	df323ba445	Revert "[X86] Support AMX fast register allocation" This reverts commit 3b8ec86fd576b9808dc63da620d9a4f7bbe04372. Revert "[X86] Refine AMX fast register allocation" This reverts commit c3f95e9197643b699b891ca416ce7d72cf89f5fc. This pass breaks using LLVM in a multi-threaded environment by introducing global state.	2021-04-29 18:56:33 +02:00
Xiang1 Zhang	3b8ec86fd5	[X86] Support AMX fast register allocation Differential Revision: https://reviews.llvm.org/D100026	2021-04-25 09:45:41 +08:00
Arthur Eubanks	c88b87f9ce	Revert "Remove "Rewrite Symbols" from codegen pipeline" This reverts commit 6210261ecb21c84c9a440a76c0ccbc8ad211bed3. addr-label.ll crashes on armv7.	2021-04-10 23:28:16 -07:00
Arthur Eubanks	6210261ecb	Remove "Rewrite Symbols" from codegen pipeline It breaks up the function pass manager in the codegen pipeline. With empty parameters, it looks at the -mllvm flag -rewrite-map-file. This is likely not in use. Add a check that we only have one function pass manager in the codegen pipeline. This required reverting commit 9583a3f2625818b78c0cf6d473cdedb9f23ad82c: "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()". This was not NFC as initially thought. By coalescing two function pass managers, this exposed the reverted code as necessary. addr-label.ll was crashing due to an emitted blockaddress's block being removed but the label not emitted. Some tests relied on the fact that we had a module pass somewhere in the codegen pipeline. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D99707	2021-04-10 22:38:44 -07:00
Arthur Eubanks	040c1b49d7	Move EntryExitInstrumentation pass location This seems to be more of a Clang thing rather than a generic LLVM thing, so this moves it out of LLVM pipelines and as Clang extension hooks into LLVM pipelines. Move the post-inline EEInstrumentation out of the backend pipeline and into a late pass, similar to other sanitizer passes. It doesn't fit into the codegen pipeline. Also fix up EntryExitInstrumentation not running at -O0 under the new PM. PR49143 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D97608	2021-03-01 10:08:10 -08:00
Lukas Sommer	6577cef9b0	[CodeGen] New pass: Replace vector intrinsics with call to vector library This patch adds a pass to replace calls to vector intrinsics (i.e., LLVM intrinsics operating on vector operands) with calls to a vector library. Currently, calls to LLVM intrinsics are only replaced with calls to vector libraries when scalar calls to intrinsics are vectorized by the Loop- or SLP-Vectorizer. With this pass, it is now possible to replace calls to LLVM intrinsics already operating on vector operands, e.g., if such code was generated by MLIR. For the replacement, information from the TargetLibraryInfo, e.g., as specified via -vector-library is used. This is a re-try of the original commit 2303e93e66 that was reverted due to pass manager problems. Other minor changes have also been made. Differential Revision: https://reviews.llvm.org/D95373	2021-02-12 12:53:27 -05:00
Snehasish Kumar	d079dbc591	[CodeGen] Basic block sections should take precedence over splitting. The use of basic block sections should take precedence over the machine function splitting pass. Since they use the same underlying mechanism they are kept exclusive. Updated the tests to check that split machine functions is overridden by all flavours of basic block sections. Differential Revision: https://reviews.llvm.org/D96392	2021-02-11 11:14:10 -08:00
Sanjay Patel	c981f6f8e1	Revert "[Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls to vector library" This reverts commit 2303e93e666e13ebf6d24323729c28f520ecca37. Investigating bot failures.	2021-02-05 15:10:11 -05:00
Lukas Sommer	2303e93e66	[Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls to vector library This patch adds a pass to replace calls to vector intrinsics (i.e., LLVM intrinsics operating on vector operands) with calls to a vector library. Currently, calls to LLVM intrinsics are only replaced with calls to vector libraries when scalar calls to intrinsics are vectorized by the Loop- or SLP-Vectorizer. With this pass, it is now possible to replace calls to LLVM intrinsics already operating on vector operands, e.g., if such code was generated by MLIR. For the replacement, information from the TargetLibraryInfo, e.g., as specified via -vector-library is used. Differential Revision: https://reviews.llvm.org/D95373	2021-02-05 14:25:19 -05:00
Matt Arsenault	c9122ddef5	CodeGen: Refactor regallocator command line and target selection Make the sequence of passes to select and rewrite instructions to physical registers be a target callback. This is to prepare to allow targets to split register allocation into multiple phases.	2021-01-07 13:13:25 -05:00
Yuanfang Chen	480936e741	Reland "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" (again) This reverts commit 16c8f6e91344ec9840d6aa9ec6b8d0c87a104ca3 with fix. -Wswitch caught an unhandled enum value due to recent commits in TargetPassConfig.cpp.	2020-12-29 16:39:55 -08:00
Yuanfang Chen	16c8f6e913	Revert "Reland "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline"" This reverts commit 21314940c4856e0cb81b664fd2d2117d1b7dc3e3. Build failure in some bots.	2020-12-29 16:29:07 -08:00
Yuanfang Chen	21314940c4	Reland "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" This reverts commit 94427af60c66ffea655a3084825c6c3a9deec1ad (relands 4646de5d75cfce3da4ddeffb6eb8e66e38238800 with fix). Use "return std::move(AsmStreamer);" instead of "return AsmStreamer;" in LLVMTargetMachine::createMCStreamer. Unlike Clang, GCC seems to have trouble inserting an implicit lvalue->rvalue conversion.	2020-12-29 15:17:23 -08:00
Yuanfang Chen	94427af60c	Revert "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" This reverts commit 4646de5d75cfce3da4ddeffb6eb8e66e38238800. Some bots have build failure.	2020-12-28 17:44:22 -08:00
Yuanfang Chen	4646de5d75	[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline Following up on D67687. Please refer to the RFC here http://lists.llvm.org/pipermail/llvm-dev/2020-July/143309.html `CodeGenPassBuilder` is the NPM counterpart of `TargetPassConfig` with below differences. - Debugging features (MIR print/verify, disable pass, start/stop-before/after, etc.) living in `TargetPassConfig` are moved to use PassInstrument as much as possible. (Implementation also lives in `TargetPassConfig.cpp`) - `TargetPassConfig` is a polymorphic base (virtual inheritance) to build the target-dependent pipeline whereas `CodeGenPassBuilder` is the CRTP base/helper to implement the target-dependent pipeline. The motivation is flexibility for targets to customize the pipeline, inlining opportunity, and fits the overall NPM value semantics design. - `TargetPassConfig` is a legacy immutable pass to declare hooks for targets to customize some target-independent codegen layer behavior. This is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future. Reviewed By: arsenm, aeubanks Differential Revision: https://reviews.llvm.org/D83608	2020-12-28 17:36:36 -08:00
Xiang1 Zhang	39584ae5b5	[Debugify] Support checking Machine IR debug info Add mir-check-debug pass to check MIR-level debug info. For IR-level, LLVM currently has debugify + check-debugify to generate and check debug IR. Much like the IR-level pass debugify, mir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module, but there was no equivalent MIR-level check-debugify pass, so this patch adds one as "mir-check-debug". Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D91595	2020-12-16 22:17:25 -08:00
Xiang1 Zhang	1e42ad9d62	Revert "[Debugify] Support checking Machine IR debug info" This reverts commit 50aaa8c274910d78d7bf6c929a34fe58b1f45579.	2020-12-16 20:12:33 -08:00
Xiang1 Zhang	50aaa8c274	[Debugify] Support checking Machine IR debug info Add mir-check-debug pass to check MIR-level debug info. For IR-level, LLVM currently has debugify + check-debugify to generate and check debug IR. Much like the IR-level pass debugify, mir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module, but there was no equivalent MIR-level check-debugify pass, so this patch adds one as "mir-check-debug". Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D91595	2020-12-16 18:04:05 -08:00
Nico Weber	da2551f3d1	Revert "[Debugify] Support checking Machine IR debug info" This reverts commit c4d2d4337d50bed3cafd564daece1a197005b22b. Necessary to revert 2a5675f11d3bc803a245c0e.	2020-12-14 22:14:48 -05:00
Xiang1 Zhang	c4d2d4337d	[Debugify] Support checking Machine IR debug info Add mir-check-debug pass to check MIR-level debug info. For IR-level, LLVM currently has debugify + check-debugify to generate and check debug IR. Much like the IR-level pass debugify, mir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module, but there was no equivalent MIR-level check-debugify pass, so this patch adds one as "mir-check-debug". Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D91595	2020-12-14 17:53:46 -08:00
Xiang1 Zhang	fc0f4010bb	Revert "[Debugify] Support checking Machine IR debug info" This reverts commit 57a3d9ec4a8c1422f07264bed9f12a4ea416707e.	2020-12-14 17:48:49 -08:00
Xiang1 Zhang	57a3d9ec4a	[Debugify] Support checking Machine IR debug info Add mir-check-debug pass to check MIR-level debug info. For IR-level, LLVM currently has debugify + check-debugify to generate and check debug IR. Much like the IR-level pass debugify, mir-debugify inserts sequentially increasing line locations to each MachineInstr in a Module, but there was no equivalent MIR-level check-debugify pass, so this patch adds one as "mir-check-debug". Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D95195	2020-12-14 17:38:01 -08:00
Anna Thomas	29356e3279	[ScalarizeMaskedMemIntrin] Add new PM support This patch adds new PM support for the pass and the pass can be now used during middle-end transforms. The old pass is renamed to ScalarizeMaskedMemIntrinLegacyPass. Reviewed-By: skatkov, aeubanks Differential Revision: https://reviews.llvm.org/D92743	2020-12-08 17:15:22 -05:00
Hongtao Yu	24d4291ca7	[CSSPGO] Pseudo probes for function calls. An indirect call site needs to be probed for its potential call targets. With CSSPGO a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks that are in form of standalone intrinsic call instructions, pseudo probes for callsites have to be attached to the call instruction, thus a separate instruction would not work. One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. The special metadata will have to make its way through the optimization pipeline down to object emission. This requires additional efforts to maintain the metadata in various places. Given that the `!dbg` metadata is a first-class metadata and has all essential support in place, leveraging the `!dbg` metadata as a channel to encode pseudo probe information is probably the easiest solution. With the requirement of not inflating `!dbg` metadata that is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field which mainly serves AutoFDO can be reused for pseudo probes. DWARF discriminators distinguish identical source locations between instructions and with pseudo probes such support is not required. In this change we are using the discriminator field to encode the ID and type of a callsite probe and the encoded value will be unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field will go with the inlined instructions. The `!dbg` metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original `!dbg` metadata and the bottom of the stack is for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator field (which actually represents a callsite probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst. To avoid collision with the baseline AutoFDO in various places that handles dwarf discriminators where a check against the `-pseudo-probe-for-profiling` switch is not available, a special encoding scheme is used to tell apart a pseudo probe discriminator from a regular discriminator. For the regular discriminator, if all lowest 3 bits are non-zero, it means the discriminator is basically empty and all higher 29 bits can be reserved for pseudo probe use. Callsite pseudo probes are inserted in `SampleProfileProbePass` and a target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe ID/type from `!dbg`. Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91756	2020-12-02 13:45:20 -08:00
jasonliu	a65d8c5d72	[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences would be 1. AIX does not have CFI instructions. 2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality routine, because the interoperability is not there yet on AIX. 3. AIX does not use eh_frame sections. Instead, it would use an eh_info section (compact unwind section) to store the information about personality routine and LSDA data address. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D91455	2020-12-02 18:42:44 +00:00
Amara Emerson	6042c25b0a	[GlobalISel] Add translation support for vector reduction intrinsics. In order to prevent the ExpandReductions pass from expanding some intrinsics before they get to codegen, I had to add a -disable-expand-reductions flag for testing purposes. Differential Revision: https://reviews.llvm.org/D89028	2020-10-16 10:17:53 -07:00
Simon Pilgrim	3ae07b2a33	TargetPassConfig.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 17:17:11 +01:00
Yuanfang Chen	ad99e34c59	Revert "[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline" This reverts commit 31ecf8d29d81d196374a562c6d2bd2c25a62861e. This reverts commit 3fdaa8602a086a3fca5f0fc8527536ac659079d0. There is a layering violation for Target->CodeGen.	2020-09-11 18:52:32 -07:00
Yuanfang Chen	31ecf8d29d	[NewPM][CodeGen] Introduce CodeGenPassBuilder to help build codegen pipeline Following up on D67687. Please refer to the RFC here http://lists.llvm.org/pipermail/llvm-dev/2020-July/143309.html `CodeGenPassBuilder` is the NPM counterpart of `TargetPassConfig` with below differences. - Debugging features (MIR print/verify, disable pass, start/stop-before/after, etc.) living in `TargetPassConfig` are moved to use PassInstrument as much as possible. (Implementation also lives in `TargetPassConfig.cpp`) - `TargetPassConfig` is a polymorphic base (virtual inheritance) to build the target-dependent pipeline whereas `CodeGenPassBuilder` is the CRTP base/helper to implement the target-dependent pipeline. The motivation is flexibility for targets to customize the pipeline, inlining opportunity, and fits the overall NPM value semantics design. - `TargetPassConfig` is a legacy immutable pass to declare hooks for targets to customize some target-independent codegen layer behavior. This is partially ported to TargetMachine::options. The rest, such as `createMachineScheduler/createPostMachineScheduler`, are left out for now. They should be implemented in LLVMTargetMachine in the future. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D83608	2020-09-11 16:41:17 -07:00
Snehasish Kumar	94faadaca4	[llvm][CodeGen] Machine Function Splitter We introduce a codegen optimization pass which splits functions into hot and cold parts. This pass leverages the basic block sections feature recently introduced in LLVM from the Propeller project. The pass targets functions with profile coverage, identifies cold blocks and moves them to a separate section. The linker groups all cold blocks across functions together, decreasing fragmentation and improving icache and itlb utilization. We evaluated the Machine Function Splitter pass on clang bootstrap and SPECInt 2017. For clang bootstrap we observe a mean 2.33% runtime improvement with a ~32% reduction in itlb and stlb misses. Additionally, L1 icache misses reduced by 9.5% while L2 instruction misses reduced by 20%. For SPECInt we report the change in IntRate for the C/C++ benchmarks. All benchmarks apart from mcf and x264 improve, on average by 0.6% with the max for deepsjeng at 1.6%. Benchmark (% change): 500.perlbench_r 0.78; 502.gcc_r 0.82; 505.mcf_r -0.30; 520.omnetpp_r 0.18; 523.xalancbmk_r 0.37; 525.x264_r -0.46; 531.deepsjeng_r 1.61; 541.leela_r 0.83; 557.xz_r 0.15. Differential Revision: https://reviews.llvm.org/D85368	2020-08-28 11:10:14 -07:00
Snehasish Kumar	8d943a928d	[NFC] Rename BBSectionsPrepare -> BasicBlockSections. Rename the BBSectionsPrepare pass as suggested by the review comment in https://reviews.llvm.org/D85368. Differential Revision: https://reviews.llvm.org/D85380	2020-08-06 13:12:06 -07:00
Evgeny Leviant	dc619f3d7a	[CodeGen][TargetPassConfig] Add unreachable-mbb-elimination pass explicitly Differential revision: https://reviews.llvm.org/D84228	2020-07-23 18:05:11 +03:00
Yuanfang Chen	589c646a7e	[llc] (almost) remove `--print-machineinstrs` Its effect could be achieved by `-stop-after`,`-print-after`,`-print-after-all`. But a few tests need to print MIR after ISel which could not be done with `-print-after`/`-stop-after` since isel pass does not have commandline name. That's the reason `--print-machineinstrs` is downgraded to `--print-after-isel` in this patch. `--print-after-isel` could be removed after we switch to new pass manager since isel pass would have a commandline text name to use `print-after` or equivalent switches. The motivation of this patch is to reduce tests dependency on would-be-deprecated feature. Reviewed By: arsenm, dsanders Differential Revision: https://reviews.llvm.org/D83275	2020-07-20 10:43:28 -07:00
Evgeny Leviant	24089928be	[CodeGen][TargetPassConfig] Add TargetTransformInfo pass correctly Patch adds tti pass directly enforcing its execution with correctly set TargetTransformInfo. Differential revision: https://reviews.llvm.org/D84047	2020-07-18 14:11:40 +03:00
Yuanfang Chen	1e495e10e6	[NFC] change getLimitedCodeGenPipelineReason to static function	2020-07-06 15:39:27 -07:00
Juneyoung Lee	54b6457240	[TargetPassConfig] Add CanonicalizeFreezeInLoops before LSR Summary: This patch adds CanonicalizeFreezeInLoops before LSR. Relevant patch: https://reviews.llvm.org/D77523 Reviewers: spatel, efriedma, jdoerfert, fhahn, nikic, reames, xbolva00 Reviewed By: nikic Subscribers: xbolva00, nikic, lebedev.ri, hiraditya, llvm-commits, sanwou01, nlopes Tags: #llvm Differential Revision: https://reviews.llvm.org/D77524	2020-05-28 05:21:12 +09:00
Nikita Popov	2833c46f75	[DwarfEHPrepare] Don't prune unreachable resumes at optnone Disable pruning of unreachable resumes in the DwarfEHPrepare pass at optnone. While I expect the pruning itself to be essentially free, this does require a dominator tree calculation, that is not used for anything else. Saving this DT construction makes for a 0.4% O0 compile-time improvement. Differential Revision: https://reviews.llvm.org/D80400	2020-05-23 20:58:01 +02:00
Nikita Popov	0c6bba71e3	[TargetPassConfig] Don't add alias analysis at optnone When performing codegen at optnone, don't add alias analysis to the pipeline. We don't need it, but it causes an unnecessary dominator tree calculation. I've also moved the module verifier call to the top so that a bunch of disabled-at-optnone passes group more nicely. Differential Revision: https://reviews.llvm.org/D80378	2020-05-23 10:35:03 +02:00
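
The backward scan described in the RemoveRedundantDebugValues commit (df686842bc) can be illustrated with a small stand-alone sketch. This is not the pass's actual code: `Instr` and `removeRedundantBackward` are hypothetical names, a variable is identified by a plain string, and the block is a plain vector rather than a MachineBasicBlock. It only shows the idea of walking a block from the end and, within each run of consecutive DBG_VALUEs, keeping only the last one per variable.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction.
struct Instr {
  bool IsDbgValue;  // true for a DBG_VALUE, false for any other instruction
  std::string Reg;  // location operand, e.g. "$edi" (unused for non-debug)
  std::string Var;  // described variable, e.g. "var1" (unused for non-debug)
};

// Walks the block backwards. Inside a run of consecutive DBG_VALUEs, the first
// DBG_VALUE met for a variable (i.e. the last one in program order) is kept;
// any earlier DBG_VALUE for the same variable in that run is dropped.
std::vector<Instr> removeRedundantBackward(const std::vector<Instr> &Block) {
  std::vector<bool> Keep(Block.size(), true);
  std::unordered_set<std::string> SeenVars;

  for (std::size_t I = Block.size(); I-- > 0;) {
    const Instr &MI = Block[I];
    if (!MI.IsDbgValue) {
      // A real instruction ends the current run of DBG_VALUEs.
      SeenVars.clear();
      continue;
    }
    // insert() reports false if the variable was already seen in this run.
    if (!SeenVars.insert(MI.Var).second)
      Keep[I] = false;
  }

  std::vector<Instr> Result;
  for (std::size_t I = 0; I < Block.size(); ++I)
    if (Keep[I])
      Result.push_back(Block[I]);
  return Result;
}
```

Applied to the three-instruction example from the commit message, the sketch drops (1) and keeps (2) and (3). A non-debug instruction between two DBG_VALUEs for the same variable resets the run, so neither is removed.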

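The CodeGenPassBuilder commit (4646de5d75) contrasts a polymorphic base with a CRTP base for building the target-dependent pipeline. The sketch below shows the CRTP side of that contrast in isolation. Every name in it (`PipelineBuilderBase`, `ToyTargetBuilder`, the hook names and the pass-name strings) is made up for illustration and is not the real `CodeGenPassBuilder` interface.

```cpp
#include <string>
#include <vector>

// Hypothetical CRTP base: the pipeline skeleton lives here, and target hooks
// are resolved at compile time through Derived instead of virtual calls.
template <typename Derived>
class PipelineBuilderBase {
public:
  std::vector<std::string> buildPipeline() {
    std::vector<std::string> Passes;
    Passes.push_back("generic-prepare");
    derived().addInstSelector(Passes);  // target hook
    Passes.push_back("generic-regalloc");
    derived().addPreEmitPass(Passes);   // target hook
    Passes.push_back("generic-emit");
    return Passes;
  }

  // Default hooks add nothing; a target shadows only the ones it needs.
  void addInstSelector(std::vector<std::string> &) {}
  void addPreEmitPass(std::vector<std::string> &) {}

private:
  Derived &derived() { return static_cast<Derived &>(*this); }
};

// Hypothetical target that customizes a single hook.
class ToyTargetBuilder : public PipelineBuilderBase<ToyTargetBuilder> {
public:
  void addInstSelector(std::vector<std::string> &Passes) {
    Passes.push_back("toy-isel");
  }
};
```

Calling `ToyTargetBuilder().buildPipeline()` yields the three generic entries with `toy-isel` inserted after the first. Because the hook calls are ordinary non-virtual member calls on a known type, the compiler is free to inline them, which is one of the motivations the commit message lists.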

187 Commits