When InstCombine decides that an alloc is removable and the alloc fn is
called by an InvokeInst, it replaces that InvokeInst with an invoke of a
noop intrinsic; this patch also copies the original invoke's DILocation
to the new noop invoke.
Found using https://github.com/llvm/llvm-project/pull/107279.
Only check for diffs containing "undef" in .ll files. This prevents
comments like `// We should not have undef values...` from triggering
the undef checker bot.
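The filter described above can be sketched as follows. This is a minimal illustration, not the bot's actual code: the function name `files_flagged_for_undef` and the input shape (a mapping from file path to the lines a diff adds) are assumptions made for the example.

```python
import re


def files_flagged_for_undef(added_lines_by_file):
    """Return the .ll files whose added lines mention 'undef'.

    `added_lines_by_file` maps a file path to the list of lines that a diff
    adds to that file. Both the name and the input shape are hypothetical.
    """
    undef = re.compile(r"\bundef\b")
    flagged = []
    for path, lines in added_lines_by_file.items():
        # Only LLVM IR files are checked; comments in other files are ignored.
        if not path.endswith(".ll"):
            continue
        if any(undef.search(line) for line in lines):
            flagged.append(path)
    return sorted(flagged)
```

With this filter, a C++ comment that mentions undef is ignored, while an added `undef` in a `.ll` test is still reported.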
When the same source BB targets the same destination BB along different
conditions/edges, as with switch cases, we should use
prob[SrcBB->SuccIndx] instead of prob[SrcBB->DstBB] to get the probability.
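A small sketch of why the lookup has to be keyed by successor index. The helper names and the list-based representation of a terminator's successors are hypothetical; they are not LLVM's data structures.

```python
from fractions import Fraction


def probs_keyed_by_dest(successors, probs):
    """Key edge probabilities by destination block (lossy)."""
    # When a destination appears twice, the later edge overwrites the earlier one.
    return {dest: prob for dest, prob in zip(successors, probs)}


def probs_keyed_by_index(successors, probs):
    """Key edge probabilities by successor index (one entry per edge)."""
    return {
        index: (dest, prob)
        for index, (dest, prob) in enumerate(zip(successors, probs))
    }


# A switch whose successor 0 and successor 2 both jump to block "A".
SWITCH_SUCCESSORS = ["A", "B", "A"]
SWITCH_PROBS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
```

Keyed by destination, the two edges into "A" collapse into a single entry and one probability is lost; keyed by index, each edge keeps its own value.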
Patches in the Key Instructions (KeyInstr) stack need to access CGF in these
functions. 2 CGF fields are passed to these functions already; at this point it
felt natural to promote them to CGF methods.
The function `bitcast_v64i16_to_v128i8` in the newly added test file
`llvm-project/llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll` from
PR https://github.com/llvm/llvm-project/pull/133052 fails under expensive
checks (it passes the normal lit check).
Remove it for now.
Specify the initializes attribute in terms of an "initialized" shadow
state, such that:
* Loads prior to initialization return poison.
* Bytes that are not explicitly initialized are written with undef on
function return.
This is intended to preserve the core semantics of the attribute, but
adjusts the wording in a way that is compatible with existing
optimizations, such as insertion of spurious loads and removal of
uninitialized writes.
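The two rules above can be modelled with a toy shadow state. This is only an illustrative sketch: the class `ShadowedRange`, its methods, and the `POISON`/`UNDEF` marker strings are hypothetical and stand in for the byte range covered by the attribute.

```python
POISON = "poison"
UNDEF = "undef"


class ShadowedRange:
    """Toy model of a byte range with an 'initialized' shadow bit per byte."""

    def __init__(self, caller_bytes):
        self.bytes = list(caller_bytes)
        # Every byte starts out as 'not initialized' on function entry.
        self.initialized = [False] * len(self.bytes)

    def load(self, index):
        # Loads prior to initialization return poison.
        return self.bytes[index] if self.initialized[index] else POISON

    def store(self, index, value):
        self.bytes[index] = value
        self.initialized[index] = True

    def on_return(self):
        # Bytes never explicitly initialized are written with undef.
        for index, done in enumerate(self.initialized):
            if not done:
                self.bytes[index] = UNDEF
        return list(self.bytes)
```

In this model a spurious early load is harmless (it only yields poison), and dropping a store merely leaves a byte that becomes undef on return.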
Fixes https://github.com/llvm/llvm-project/issues/133038.
Fixes https://github.com/llvm/llvm-project/issues/133059.
Following from the discussion in #132224, this seems like the best
approach to deal with a mix of XO and RX output sections in the same
binary. This change will also simplify the implementation of the
PURECODE section flag for AArch64.
To control this behaviour, the `--[no-]xosegment` flag is added to LLD
(similarly to `--[no-]rosegment`), which determines whether to allow
merging XO and RX sections in the same segment. The default value is
`--no-xosegment`, which is a breaking change compared to the previous
behaviour.
Release notes are also added, since this will be a breaking change.
This feature is currently not supported in the compiler.
To facilitate this, we emit a stub version of each kernel
function body with a different name mangling scheme, and
replace the respective kernel call-sites appropriately.
Fixes https://github.com/llvm/llvm-project/issues/60313
D120566 was an earlier attempt made to upstream a solution
for this issue.
---------
Co-authored-by: anikelal <anikelal@amd.com>
This issue is very convoluted, but in essence, in the new version:
For a Pointer P that points to the root of a multidimensional, primitive
array:
`P.narrow()` does nothing.
`P.atIndex(0)` points to `P[0]`
`P.atIndex(0).atIndex(0)` is the same as `P.atIndex(0)` (as before)
`P.atIndex(0).narrow().atIndex(0)` points to `P[0][0]`
`P.atIndex(0).narrow().narrow()` is the same as `P.atIndex(0).narrow()`.
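The five rules above can be reproduced with a toy model. This is a sketch under loud assumptions: `ToyPointer` is hypothetical, is not the interpreter's `Pointer` class, does not track array dimensions, and only mirrors `atIndex`/`narrow` as `at_index`/`narrow` for a multidimensional primitive array.

```python
class ToyPointer:
    """Toy model of the listed rules; not the real Pointer implementation."""

    def __init__(self, path=(), on_element=False):
        self.path = tuple(path)       # indices applied so far, (0, 0) means P[0][0]
        self.on_element = on_element  # True right after at_index(), False on an array root

    def at_index(self, index):
        if self.on_element:
            # Already on an element: move to a sibling within the same array.
            return ToyPointer(self.path[:-1] + (index,), True)
        # On an array root: step into element `index`.
        return ToyPointer(self.path + (index,), True)

    def narrow(self):
        # Narrowing an element yields the root of that sub-array;
        # narrowing a root changes nothing.
        return ToyPointer(self.path, False)

    def __eq__(self, other):
        return (
            isinstance(other, ToyPointer)
            and (self.path, self.on_element) == (other.path, other.on_element)
        )
```

The key point the model captures is that reaching `P[0][0]` requires a `narrow()` between the two `atIndex` calls; without it, the second `atIndex` only re-selects an element of the outer array.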
This PR adds test cases for all types of bit conversion; it prepares
for PR: https://github.com/llvm/llvm-project/pull/132899
All tests pass because:
1. For DAG, the patterns do not distinguish SReg from VReg. One sample
is:
```
define <2 x double> @v_bitcast_v4f32_to_v2f64(<4 x float> inreg %a, i32 %b) {
  %cmp = icmp eq i32 %b, 0
  br i1 %cmp, label %cmp.true, label %cmp.false

cmp.true:
  %a1 = fadd <4 x float> %a, splat (float 1.000000e+00)
  %a2 = bitcast <4 x float> %a1 to <2 x double>
  br label %end

cmp.false:
  %a3 = bitcast <4 x float> %a to <2 x double>
  br label %end

end:
  %phi = phi <2 x double> [ %a2, %cmp.true ], [ %a3, %cmp.false ]
  ret <2 x double> %phi
}
```
It is supposed to select from the scalar register patterns, but the VReg
pattern is matched, as follows:
```
Debug log:
ISEL: Starting selection on root node: t3: v2f64 = bitcast t2
ISEL: Starting pattern match
Initial Opcode index to 440336
Skipped scope entry (due to false predicate) at index 440339, continuing at 440367
Skipped scope entry (due to false predicate) at index 440368, continuing at 440396
Skipped scope entry (due to false predicate) at index 440397, continuing at 440435
Skipped scope entry (due to false predicate) at index 440436, continuing at 440467
Skipped scope entry (due to false predicate) at index 440468, continuing at 440499
Skipped scope entry (due to false predicate) at index 440500, continuing at 440552
Skipped scope entry (due to false predicate) at index 440553, continuing at 440587
Skipped scope entry (due to false predicate) at index 440588, continuing at 440622
Skipped scope entry (due to false predicate) at index 440623, continuing at 440657
Skipped scope entry (due to false predicate) at index 440658, continuing at 440692
Skipped scope entry (due to false predicate) at index 440693, continuing at 440727
Skipped scope entry (due to false predicate) at index 440728, continuing at 440769
Skipped scope entry (due to false predicate) at index 440770, continuing at 440798
Skipped scope entry (due to false predicate) at index 440799, continuing at 440836
Skipped scope entry (due to false predicate) at index 440837, continuing at 440870
TypeSwitch[v2f64] from 440873 to 440892
Patterns:
/*440892*/ OPC_CompleteMatch, 1, 0,
// Src: (bitconvert:{ *:[v2f64] } VReg_128:{ *:[v4f32] }:$src0) - Complexity = 3
// Dst: VReg_128:{ *:[v2f64] }:$src0
```
2. GlobalISel uses `Select_COPY` to select the bitcast.
Fix https://github.com/llvm/llvm-project/issues/132059.
Providing incorrect mappings via `-fmodule-file=<name>=<path/to/bmi>`
can crash the compiler when loading a module that imports an
incorrectly mapped module.
The crash occurs during AST body deserialization, when the compiler
attempts to resolve remappings using the `ModuleFile` from the
incorrectly mapped module's BMI file.
The cause is an invalid access into an incorrectly loaded
`ModuleFile`.
This commit fixes the issue by verifying the identity of the imported
module.
After #134340, the availability of contextual profile isn't in itself an indication of compiling the module containing all the functions covered by that profile.
We will subsequently treat the whole profile as "flat" in the frontend (i.e., flatten it and combine it with the flat profile section), so we can have a profile for ThinLTO for parts of the application that don't come under the contextual profile. After ThinLTO, we will treat the module(s) containing contextual trees differently: they'll have only the contextual profile pertinent to them. The rest of the modules (non-contextual) will proceed "as usual", off the flattened profile.
This patch implements pruning of the contextual profile to enable the above.
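As a rough illustration of the "flatten" step mentioned above, the sketch below sums the counters of every context in which a function appears. The dict-based tree layout and the name `flatten_contexts` are hypothetical, and the sketch assumes a function has the same number of counters in every context; it is not the profile reader's actual representation.

```python
def flatten_contexts(roots):
    """Sum per-function counters over all contexts of a contextual profile.

    Each node is a dict with "function", "counters" and optional "callees".
    This layout is an assumption made for the example.
    """
    flat = {}
    stack = list(roots)
    while stack:
        node = stack.pop()
        counters = node["counters"]
        totals = flat.setdefault(node["function"], [0] * len(counters))
        for index, count in enumerate(counters):
            totals[index] += count
        stack.extend(node.get("callees", []))
    return flat
```

A function reached through several call paths ends up with a single, context-insensitive counter vector, which is what a flat profile consumer expects.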
#121323 changed the way the absolute path is computed. An empty file name
causes the absolute path computation to ignore the current folder.
This patch adds a "dummy" file name to avoid this issue.
Fixed: #134502
Currently, when printing a template argument of expression type, the
expression is converted immediately into a string to be sent to the
diagnostic engine, using a fake LangOpts.
This makes the expression printing look incorrect for the current
language, besides being inefficient, as we don't actually need to print
the expression if the diagnostic would be ignored.
This fixes a nastiness where the TemplateArgument constructor for
expressions was implicit, so all current users that just passed an
expression to a diagnostic were implicitly going through the template
argument path.
The expressions are also being printed unquoted. This will be fixed in a
subsequent patch, as the test churn is much larger.
After replacing the branch condition, this was calling simplifyCFG to
perform the cleanups of the branch. This is far too heavy of a hammer.
We do not want all of the extra optimizations in simplifyCFG, and
this could also leave behind dead code. Instead, minimally fold the
terminator and try to delete the newly dead code.
This is pretty much a direct copy of what bugpoint does.
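The "fold the terminator, then delete what became dead" approach can be sketched on a toy CFG. The tuple encoding of terminators and both helper names are hypothetical; this is not the reduction pass's code, only an illustration of the two steps.

```python
def fold_constant_branch(cfg, block):
    """Turn a conditional branch on a constant into an unconditional branch."""
    term = cfg[block]
    if term[0] == "condbr" and isinstance(term[1], bool):
        _, cond, if_true, if_false = term
        cfg[block] = ("br", if_true if cond else if_false)


def successors(term):
    if term[0] == "br":
        return [term[1]]
    if term[0] == "condbr":
        return [term[2], term[3]]
    return []  # e.g. ("ret",)


def delete_unreachable(cfg, entry):
    """Remove every block that can no longer be reached from the entry block."""
    reachable, worklist = set(), [entry]
    while worklist:
        block = worklist.pop()
        if block in reachable:
            continue
        reachable.add(block)
        worklist.extend(successors(cfg[block]))
    for block in list(cfg):
        if block not in reachable:
            del cfg[block]
```

Nothing else in the function is touched: no block merging, no other simplification, only the folded branch and the blocks it orphaned.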
Add support for import and translate.
MLIR does not support using basic block references outside a function
(as LLVM does). This PR does not consider changes to MLIR in that
respect. It instead introduces two new ops: `llvm.blockaddress` and
`llvm.blocktag`. Here's an example:
```
llvm.func @ba() -> !llvm.ptr {
  %0 = llvm.blockaddress <function = @ba, tag = <id = 1>> : !llvm.ptr
  llvm.br ^bb1
^bb1:  // pred: ^bb0
  llvm.blocktag <id = 1>
  llvm.return %0 : !llvm.ptr
}
```
Value `%0` holds the address of the block tagged as `id = 1` in function
`@ba`. Block tags need to be unique within a function, and use of
`llvm.blockaddress` requires a matching tag in a `llvm.blocktag`.
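The two constraints just stated (unique tags per function, and a matching tag for every block address) can be checked as sketched below. The function `verify_block_tags` and its flat input lists are hypothetical and only model the rule; they are not the MLIR verifier.

```python
def verify_block_tags(function_name, block_tag_ids, block_addresses):
    """Check tag uniqueness and that each block address has a matching tag.

    `block_tag_ids` lists the ids of the blocktag ops found in the function;
    `block_addresses` lists (function, tag id) pairs referenced by
    blockaddress ops. Both shapes are assumptions made for the example.
    """
    errors = []
    seen = set()
    for tag_id in block_tag_ids:
        if tag_id in seen:
            errors.append(f"duplicate block tag id {tag_id} in @{function_name}")
        seen.add(tag_id)
    for target_function, tag_id in block_addresses:
        if target_function == function_name and tag_id not in seen:
            errors.append(f"no block tag id {tag_id} in @{function_name}")
    return errors
```

For the `@ba` example, one tag with `id = 1` and one block address referring to it produce no errors.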
A requested follow-up from
https://github.com/llvm/llvm-project/pull/130912 by @JDevlieghere to
control Darwin parallel image loading with the same
`target.parallel-module-load` that controls the POSIX dyld parallel
image loading. Darwin parallel image loading was introduced by
https://github.com/llvm/llvm-project/pull/110646.
This small change:
* removes
`plugin.dynamic-loader.darwin.experimental.enable-parallel-image-load`
and associated code.
* changes setting call site in
`DynamicLoaderDarwin::PreloadModulesFromImageInfos` to use the new
setting.
Tested by running `ninja check-lldb` and loading some targets.
Co-authored-by: Tom Yang <toyang@fb.com>
## What?
Implement `areInlineCompatible` for the SystemZ target using
FeatureBitset comparison.
## Why?
The default implementation in `TargetTransformInfoImpl.h` makes a string
comparison and only inlines when the target-cpu and the target-features
for caller and callee are the same. We are missing out on optimizations
when the callee has a subset of features of the caller.
## How?
Get the FeatureBitset of the caller and callee and check whether the
callee's features are a subset of, or equal to, the caller's features.
It's a similar implementation to ARM, PowerPC...
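The subset test itself is a single bitwise check. The sketch below uses plain integers as bitsets; the helper names and the feature names `feat-a`, `feat-b`, `feat-c` are hypothetical placeholders, not real SystemZ features or LLVM APIs.

```python
# Hypothetical feature list; the position in the list is the bit index.
ALL_FEATURES = ["feat-a", "feat-b", "feat-c"]


def features_to_bits(feature_names):
    """Encode a collection of feature names as an integer bitset."""
    bits = 0
    for name in feature_names:
        bits |= 1 << ALL_FEATURES.index(name)
    return bits


def are_inline_compatible(caller_bits, callee_bits):
    """Allow inlining only if every callee feature is enabled in the caller."""
    return (caller_bits & callee_bits) == callee_bits
```

Unlike a string comparison of the feature lists, this accepts a callee that needs fewer features than the caller provides, while still rejecting a callee that needs a feature the caller lacks.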
## Testing?
Test cases check for when the callee is a subset of the caller, when
it's not a subset, and when both are equal.
When calculating the layout for a cbuffer field, if that field is a
ConstantArrayType, desugar it before casting it to a ConstantArrayType.
Closes #134668
---------
Co-authored-by: Eli Friedman <efriedma@quicinc.com>