When Sel and its Cmp condition use different integer types,
From: (K and N are bit widths, K < N; a and b are source operands.)
bN = Ext(bK)
cond = Cmp(aN, bN)
aK = Trunc aN
retK = Sel(cond, aK, bK)
To:
bN = Ext(bK)
cond = Cmp(aN, bN)
retN = Sel(cond, aN, bN)
retK = Trunc retN
Although the Sel operands become wider, making the Sel type width match
the Cmp enables combining into max/min intrinsics and also gives better
performance for SIMD instructions.
References for correctness:
https://alive2.llvm.org/ce/z/Y4Kegm
https://alive2.llvm.org/ce/z/qFtjtR
Reference for generated code comparison:
https://gcc.godbolt.org/z/o97svGvYM
https://gcc.godbolt.org/z/59Ynj91ov
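The From/To forms above can be modeled outside of LLVM to see why they agree. The following is a minimal Python sketch, not the InstCombine code: it assumes Ext is a sign extension and Cmp is a signed less-than (one instance among the predicates the transform may cover), and the helper names `sext`, `trunc`, `before`, and `after` are hypothetical.

```python
def sext(value, from_bits):
    """Interpret the low from_bits bits of value as a signed integer."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    return (value ^ sign) - sign


def trunc(value, to_bits):
    """Keep the low to_bits bits, returned as an unsigned bit pattern."""
    return value & ((1 << to_bits) - 1)


def before(a_n, b_k, k):
    """'From' form: the select works on K-bit values."""
    b_n = sext(b_k, k)
    cond = a_n < b_n          # assumed predicate: signed less-than
    a_k = trunc(a_n, k)
    return a_k if cond else trunc(b_k, k)


def after(a_n, b_k, k):
    """'To' form: the select works on N-bit values, then truncates once."""
    b_n = sext(b_k, k)
    cond = a_n < b_n          # same comparison as in before()
    ret_n = a_n if cond else b_n
    return trunc(ret_n, k)
```

Both forms agree because truncating the extended `b` gives back `b`. For example, with K = 8, `before(-5, 3, 8)` and `after(-5, 3, 8)` both yield 251, the 8-bit pattern of -5.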
This moves the -f(no-)?sanitize-recover parsing into a generic
parseSanitizerArgs function, and then applies it to parse
-f(no-)?sanitize-recover and -f(no-)?sanitize-trap.
N.B. parseSanitizeTrapArgs does *not* remove non-TrappingSupported
arguments. This maintains the legacy behavior of '-fsanitize=undefined
-fsanitize-trap=undefined' (clang/test/Driver/fsanitize.c), which is
that vptr is not enabled at all (not even in recover mode) in order to
avoid the need for a ubsan runtime.
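As an illustration of sharing one parser between the two flag families, here is a minimal Python sketch. It is a model under assumptions, not the clang driver code: the function name `parse_sanitizer_args` and its parameters are hypothetical, sanitizer groups such as `undefined` are not expanded, and flags are assumed to be processed left to right with later flags overriding earlier ones.

```python
def parse_sanitizer_args(argv, opt_in, opt_out, default=frozenset()):
    """Return the set of sanitizer names enabled by opt_in/opt_out flags.

    argv    -- list of command-line arguments
    opt_in  -- flag spelling that adds names, e.g. "-fsanitize-trap"
    opt_out -- flag spelling that removes names, e.g. "-fno-sanitize-trap"
    default -- names enabled before any flag is seen
    """
    enabled = set(default)
    for arg in argv:
        flag, sep, value = arg.partition("=")
        if not sep:
            continue  # not of the form flag=value
        names = set(value.split(","))
        if flag == opt_in:
            enabled |= names
        elif flag == opt_out:
            enabled -= names
    return enabled
```

The same function then serves both families by changing only the flag spellings, for example `parse_sanitizer_args(argv, "-fsanitize-recover", "-fno-sanitize-recover")` and `parse_sanitizer_args(argv, "-fsanitize-trap", "-fno-sanitize-trap")`.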
Optionally unconditionally hint allocations as cold or not cold during
the matching step if the percentage of bytes allocated is at least that
of the given threshold.
/llvm-project/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp:224:2:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
};
^
1 error generated.
This addresses the following issue I opened:
https://github.com/llvm/llvm-project/issues/118851.
This change generalizes the Type Legalization mechanism that currently
handles `v8[i/f/bf]16` upsizing to include loads _and_ stores of `v8i8`
+ `v16i8`, allowing all of the mentioned vectors to be lowered to PTX as
vectors of `b32`. This extension also allows us to remove the DagCombine
that only handled exactly `load v16i8`, thus centralizing all the
upsizing logic into one place.
Test changes include adding v8i8, v16i8, and v8i16 cases to
load-store.ll, and updating the CHECKs for other tests to match the
improved codegen.
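To make the "vectors of `b32`" view concrete, the following Python sketch repacks a vector of 8-bit lanes into 32-bit words. It only illustrates the bit-level correspondence (16 x i8 = 128 bits = 4 x b32); the little-endian lane order and the helper name `pack_i8_vector_to_b32` are assumptions, and none of this is the NVPTX lowering code.

```python
def pack_i8_vector_to_b32(lanes):
    """Repack a list of 8-bit lane values into 32-bit words.

    Assumes lane 0 occupies the least significant byte of word 0.
    """
    if len(lanes) % 4 != 0:
        raise ValueError("lane count must be a multiple of 4")
    raw = bytes(v & 0xFF for v in lanes)
    return [int.from_bytes(raw[i:i + 4], "little") for i in range(0, len(raw), 4)]
```

Under these assumptions a `v16i8` maps to four words and a `v8i8` to two.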
This patch forces the docs test build job to use the hashed dependencies
file rather than the normal requirements.txt. This ensures that we get
the exact transitive closure specified rather than whatever the
dependency solver feels like it should use in the CI job.
The generic verifier will do this if the operand type is
OPERAND_IMMEDIATE, but we use our own custom operand types. Immediate
operands are still allowed to be globals, constant pools, blockaddress,
etc. so we can't check !isImm().
Replace uses of register class in dag patterns with value types. These
types are much more concise and in cases where a single register class
maps to multiple types, they avoid the need for both.
`symtab.ctx.symtab` is just `symtab`. Looks like #119296 added
this using a global find-and-replace.
This was the only instance of `symtab.ctx.symtab` in lld/.
No behavior change.
Now that we have testing of all instructions in the isSupportedInstr
switch, and better coverage of getOperandInfo, I think it is a good time
to enable this by default.
This patch checks the result of YAML parsing at the level of
MemProfRecord instead of IndexedMemProfRecord, thereby avoiding use of
Frame::hash and hashCallStacks. This makes sense because we
ultimately care about consumers like MemProfiler.cpp obtaining
MemProfRecord correctly; IndexedMemProfData and hash values are just
intermediaries.
Once this patch lands, we call Frame::hash and hashCallStacks only
when adding Frames or call stacks to their respective data structures.
In other words, the hash functions become essentially an internal
detail of IndexedMemProfRecord.
Emit a message when we have aliased contexts that are conservatively
hinted not cold. This does not change behavior, only the messages emitted
when the -memprof-report-hinted-sizes flag is enabled.
This is a follow up to 2af2634 to use vmv.v.x of i8 constants instead of
the prior vid/vand/vmsne sequence. The advantage of the vmv.v.x sequence
is that it's always m1 (so cheaper at high LMUL), and can be
rematerialized by the register allocator if needed to locally reduce
register pressure.
This was originally done to reduce the diff for the change. Remove it
and update the remaining tests. NFC modulo reordering of incoming
values.
Clean up after https://github.com/llvm/llvm-project/pull/114292.
This typo in
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td#L701:L701
caused a compiler crash in 'RISC-V Assembly Printer' because
CV_SH_ri_inc was selected, leading to `getImmOpValue` being called for a
register operand.
This bug did not affect the Assembler output and therefore does not
trigger any existing unit tests, but is visible by examining the final
MIR function.
Consistent with glibc headers, where `noexcept` is used in C++
(or `throw()` in older C++ which llvm-libc doesn't support) in
the public function declarations, `__attribute__((__nothrow__))` is
used in C for compilers that support it.
…ility
We generally allow any legal procedure pointer target as an actual
argument for association with a dummy procedure, since many actual
procedures are underspecified EXTERNALs. But for proper generic
resolution, it is necessary to disallow incompatible functions with
explicit result types.
Fixes https://github.com/llvm/llvm-project/issues/119151.
Fixes a crash (assertion failure) in `mlir-tblgen -emit-enum-doc` caused
by calling `EnumAttr()` for the wrong type of `Record *`: `EnumAttr`
rather than `EnumAttrInfo` as asserted.
Compare the corresponding line in `emitDialectDoc()`:
0ad6be1927/mlir/tools/mlir-tblgen/OpDocGen.cpp (L532)
Co-authored-by: Malte Dehling <m.dehling@samsung.com>
This PR updates the debug info generation for `this`. This is
required to fix the generation of debug information for the HLSL RWBuffer
type. The change is needed by another PR:
https://github.com/llvm/llvm-project/pull/119041/files
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
This patch modifies the DWARF verifier to handle a valid case where two
or more functions have identical address ranges due to being merged by
ICF (Identical Code Folding). Previously, the verifier would incorrectly
report these as errors, but functions merged via ICF (such as when using
LLD's --keep-icf-stabs option) can legitimately share the same address
range.
A new test case has been added to verify this behavior using YAML-based
DWARF data that simulates two DW_TAG_subprogram entries with identical
address ranges. The test ensures that the verifier correctly identifies
this as a valid case and doesn't emit any errors, while still
maintaining the existing verification for truly invalid overlapping
ranges in other scenarios. Before this change, the newly added test case
would have failed, with `llvm-dwarfdump` marking the overlapping address
ranges in the DWARF as an error.
We also modify the existing tests `llvm-dwarfutil/ELF/X86/verify.test` and
`llvm/test/tools/llvm-dwarfdump/X86/verify_parent_zero_length.yaml`
which rely on the existence of the error that we're trying to
suppress. We slightly change one offset so that the ranges don't
perfectly overlap and an error is still generated.
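The distinction between identical ranges (accepted) and other overlaps (still reported) can be sketched as follows. This is a simplified Python model, not the verifier's implementation: it assumes non-empty half-open `(low, high)` ranges, compares all pairs directly, and the function name `find_invalid_overlaps` is hypothetical.

```python
def find_invalid_overlaps(ranges):
    """Return pairs of (low, high) ranges that overlap without being identical.

    Identical ranges are treated as functions merged by ICF and are skipped.
    """
    errors = []
    for i, (lo_a, hi_a) in enumerate(ranges):
        for lo_b, hi_b in ranges[i + 1:]:
            if (lo_a, hi_a) == (lo_b, hi_b):
                continue  # same range: assumed to be an ICF-merged function
            if lo_a < hi_b and lo_b < hi_a:
                errors.append(((lo_a, hi_a), (lo_b, hi_b)))
    return errors
```

Shifting one offset so that two ranges no longer match exactly, as done in the modified tests, turns an accepted pair back into a reported one in this model.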
Support true16 and fake16 format for more VOP3 instructions in MC
This patch updates the true16 and fake16 vop_profile for the following
instructions and updates the asm/dasm tests:
v_mad_u16
v_mad_i16
v_med3_f16
v_med3_i16
v_med3_u16
v_max3_f16
v_max3_i16
v_max3_u16
v_min3_f16
v_min3_i16
v_min3_u16
v_med3_num_f16
This patch introduces the runtime components for type sanitizer: a
sanitizer for type-based aliasing violations.
It is based on Hal Finkel's https://reviews.llvm.org/D32197.
C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit
these given TBAA metadata added by Clang. Roughly, a pointer of given
type cannot be used to access an object of a different type (with, of
course, certain exceptions). Unfortunately, there's a lot of code in the
wild that violates these rules (e.g. for type punning), and such code
often must be built with -fno-strict-aliasing. Performance is often
sacrificed as a result. Part of the problem is the difficulty of finding
TBAA violations. Hopefully, this sanitizer will help.
For each TBAA type-access descriptor, encoded in LLVM's IR using
metadata, the corresponding instrumentation pass generates descriptor
tables. Thus, for each type (and access descriptor), we have a unique
pointer representation. Excepting anonymous-namespace types, these
tables are comdat, so the pointer values should be unique across the
program. The descriptors refer to other descriptors to form a type
aliasing tree (just like LLVM's TBAA metadata does). The instrumentation
handles the "fast path" (where the types match exactly and no
partial-overlaps are detected), and defers to the runtime to handle all
of the more-complicated cases. The runtime, of course, is also
responsible for reporting errors when those are detected.
The runtime uses essentially the same shadow memory region as tsan, and
we use 8 bytes of shadow memory, the size of the pointer to the type
descriptor, for every byte of accessed data in the program. The value 0
is used to represent an unknown type. The value -1 is used to represent
an interior byte (a byte that is part of a type, but not the first
byte). The instrumentation first checks for an exact match between the
type of the current access and the type for that address recorded in the
shadow memory. If it matches, it then checks the shadow for the
remainder of the bytes in the type to make sure that they're all -1. If
not, we call the runtime. If the exact match fails, we next check if the
value is 0 (i.e. unknown). If it is, then we check the shadow for the
remainder of the bytes in the type (to make sure they're all 0). If
they're not, we call the runtime. We then set the shadow for the access
address and set the shadow for the remaining bytes in the type to -1
(i.e. marking them as interior bytes). If the type indicated by the
shadow memory for the access address is neither an exact match nor 0, we
call the runtime.
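The fast-path checks described above can be modeled in a few lines of Python. This is a sketch under assumptions, not the instrumentation itself: shadow memory is a dictionary keyed by byte address, type descriptors are stand-in integers other than 0 and -1, and the function name `instrumented_access` and its return strings are hypothetical.

```python
UNKNOWN = 0     # shadow value for a byte of unknown type
INTERIOR = -1   # shadow value for a non-first byte of a typed object


def instrumented_access(shadow, addr, size, type_desc):
    """Model one access; return "fast" if handled inline, else "runtime"."""
    first = shadow.get(addr, UNKNOWN)
    rest = [shadow.get(addr + i, UNKNOWN) for i in range(1, size)]

    if first == type_desc:
        # Exact match: the remaining bytes must all be interior bytes.
        return "fast" if all(v == INTERIOR for v in rest) else "runtime"

    if first == UNKNOWN:
        # Unknown type: the remaining bytes must be unknown as well.
        if any(v != UNKNOWN for v in rest):
            return "runtime"
        # Record the type for this address and mark the interior bytes.
        shadow[addr] = type_desc
        for i in range(1, size):
            shadow[addr + i] = INTERIOR
        return "fast"

    # Neither an exact match nor unknown: defer to the runtime.
    return "runtime"
```

In this model, a first 4-byte access to fresh memory records the descriptor and three interior markers, a repeated access with the same descriptor stays on the fast path, and an access with a different descriptor or a misaligned overlap falls through to the runtime.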
The instrumentation pass inserts calls to the memset intrinsic that reset
the shadow memory for memory updated by memset, memcpy, and memmove, as
well as for allocas/byval (and for lifetime.start/end), to reflect that
the type is now unknown. The runtime intercepts memset, memcpy, etc. to
perform the same function for the library calls.
The runtime essentially repeats these checks, but uses the full TBAA
algorithm, just as the compiler does, to determine when two types are
permitted to alias. In a situation where access overlap has occurred and
aliasing is not permitted, an error is generated.
As a note, this implementation does not use the compressed shadow-memory
scheme discussed previously
(http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That
scheme would not handle the struct-path (i.e. structure offset)
information that our TBAA represents. I expect we'll want to further
work on compressing the shadow-memory representation, but I think it
makes sense to do that as follow-up work.
This includes build fixes for Linux from Mingjie Xu.
Depends on #76260 (Clang support), #76259 (LLVM support)
PR: https://github.com/llvm/llvm-project/pull/76261
commit a9aff440d9dd ("[libc][docs] reorganize documentation (#118836)")
moved https://libc.llvm.org/math/index.html to
https://libc.llvm.org/headers/math/index.html which makes links from
various slide decks stale.
There's an extension for sphinx that can generate redirects. Add a dependency
on that, then use it to create a redirect so that those older links still work.
I was able to install this sphinx extension via:
$ sudo apt install python3-sphinx-reredirects
We may need to install this on whatever server generates the llvm
documentation.
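For reference, a redirect of this kind is typically configured in the Sphinx `conf.py`. The fragment below is a sketch, assuming the extension is loaded under its module name `sphinx_reredirects` and that the redirect target is given relative to the old page; any other extensions already listed in the real `conf.py` are not shown.

```python
# Fragment of a Sphinx conf.py (illustrative; other settings omitted).
extensions = [
    "sphinx_reredirects",
]

# Map the old document name to its new location, relative to the old page.
redirects = {
    "math/index": "../headers/math/index.html",
}
```

With this in place, a build would be expected to emit a stub page at `math/index.html` that forwards to `headers/math/index.html`.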