Adding support for Root Signature Flags Element extraction and writing
to DXContainer.
- Adding an analysis to deal with RootSignature metadata definition
- Adding validation for Flag
- writing RootSignature blob into DXIL
Closes: [126632](https://github.com/llvm/llvm-project/issues/126632)
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
The find-dynamic-unwind-info callback registration APIs in libunwind
limit the number of callbacks that can be registered. If we use multiple
UnwindInfoManager instances, each with their own own callback function
(as was the case prior to this patch) we can quickly exceed this limit
(see https://github.com/llvm/llvm-project/issues/126611).
This patch updates the UnwindInfoManager class to use a singleton
pattern, with the single instance shared between all LLVM JITs in the
process.
This change does _not_ apply to compact unwind info registered through
the ORC runtime (which currently installs its own callbacks).
As a bonus this change eliminates the need to load an IR "bouncer"
module to supply the unique callback for each instance, so support for
compact-unwind can be extended to the llvm-jitlink tools (which does not
support adding IR).
I want to use `spirv-link` from `SPIR-V-Tools` in a test, so let's build
it if `LLVM_INCLUDE_SPIRV_TOOLS_TESTS` is set, as we do with the other
tools.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This patch updates the getSectionAndRelocations function to also support
CREL relocation sections. Unit tests have been added. This patch also
updates consumers to say they explicitly do not support CREL format
relocations. Subsequent patches will make the consumers work with CREL
format relocations and also add in testing support.
Reviewers: red1bluelost, MaskRay, rlavaee
Reviewed By: MaskRay
Pull Request: https://github.com/llvm/llvm-project/pull/126445
This PR adds:
- `RootSignatureFlags` extraction from DXContainer using `obj2yaml`
This PR is part of: #121493
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
The system libunwind on older Darwins does not support JIT registration of
compact-unwind. Since the CompactUnwindManager utility discards redundant
eh-frame FDEs by default we need to remove the compact-unwind section first
when targeting older libunwinds in order to preserve eh-frames.
While LLJIT was already doing this as of eae6d6d18bd, MachOPlatform was not.
This was causing buildbot failures in the ORC runtime (e.g. in
https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-RA/3479/).
This patch updates both LLJIT and MachOPlatform to check a bootstrap value,
"darwin-use-ehframes-only", to determine whether to forcibly preserve
eh-frame sections. If this value is present and set to true then compact-unwind
sections will be discarded, causing eh-frames to be preserved. If the value is
absent or set to false then compact-unwind will be used and redundant FDEs in
eh-frames discarded (FDEs that are needed by the compact-unwind section are
always preserved).
rdar://143895614
RISCV Zicfilp/Zicfiss extensions uses the `.note.gnu.property` section
to store flags indicating the adoption of features based on these
extensions. This patch enables the llvm-readobj/llvm-readelf tools to
dump these flags with the `--note` flag.
For each instruction from the input assembly sequence, DependencyGraph
has a dedicated node (DGNode). Outgoing edges (data, resource and memory
dependencies) are tracked as SmallVector<..., 8> for each DGNode in the
graph. However, it's unlikely that a usual input instruction will have
approximately eight dependent instructions. Below is my statistics for
several RISC-V input sequences:
```
Number of | Number of nodes with
edges | this # of edges
---------------------------------
0 | 8239447
1 | 464252
2 | 6164
3 | 6783
4 | 939
5 | 500
6 | 545
7 | 116
8 | 2
9 | 1
10 | 1
```
Approximately the same distribution is produced by llvm-mca lit tests
for X86, AArch and RISC-V (even modified ones with extra dependencies
added).
On a rather big input asm sequences, the use of SmallVector<..., 8>
dramatically increases memory consumption without any need for it. In my
case, replacing it with SmallVector<...,0> reduces memory usage by ~28%
or ~1700% of input file size (2.2GB in absolute values).
There is no change in execution time, I verified it on mca lit-tests and
on my big test (execution time is ~30s in both cases).
This change was made with the same intention as #124904 and optimizes I
believe quite an unusual scenario. However, if there is no negative
impact on other known scenarios, I'd like to have the change in
llvm-project.
ResourceUsage is a very sparse table. On large input asm sequences it
consumes a lot of memory utilizing only a few percents of it (~4% on my
benchmark). Reorganization of ResourceUsage to keep only used fields
allows saving up to 18% of total memory use by mca or ~850% of input
file size (~1.1GB in absolute values in my case).
Added an extension point after vectorizer passes in the PassBuilder.
Additionally, added extension points before and after vectorizer passes
in `buildLTODefaultPipeline`. Credit goes to @mshockwave for guiding me
through my first LLVM contribution (and my first open source
contribution in general!) :)
- Implemented `registerVectorizerEndEPCallback`
- Implemented `invokeVectorizerEndEPCallbacks`
- Added `VectorizerEndEPCallbacks` SmallVector
- Added a command line option `passes-ep-vectorizer-end` to
`NewPMDriver.cpp`
- `buildModuleOptimizationPipeline` now calls
`invokeVectorizerEndEPCallbacks`
- `buildO0DefaultPipeline` now calls `invokeVectorizerEndEPCallbacks`
- `buildLTODefaultPipeline` now calls BOTH
`invokeVectorizerStartEPCallbacks` and `invokeVectorizerEndEPCallbacks`
- Added LIT tests to `new-pm-defaults.ll`, `new-pm-lto-defaults.ll`,
`new-pm-O0-ep-callbacks.ll`, and `pass-pipeline-parsing.ll`
- Renamed `CHECK-EP-Peephole` to `CHECK-EP-PEEPHOLE` in
`new-pm-lto-defaults.ll` for consistency.
This code is intended for developers that wish to implement and run
custom passes after the vectorizer passes in the PassBuilder pipeline.
For example, in #91796, a pass was created that changed the induction
variables of vectorized code. This is right after the vectorization
passes.
As of d0052ebbe2e the setUpGenericLLVMIRPlatform function will automatically
add an instance of the EHFrameRegistrationPlugin (for LLJIT instances whose
object linking layers are ObjectLinkingLayers, not RTDyldObjectLinkingLayers).
This commit removes the redundant plugin creation in the object linking
layer constructor function in lli.cpp to prevent duplicate registration of
eh-frames, which is likely the cause of recent bot failures, e.g.
https://lab.llvm.org/buildbot/#/builders/108/builds/8685.
Since #112694, MCDCRecord::isCondFolded() has returned true for
"partially folded" conditions. Besides,
isConditionIndependencePairCovered() returns true if the unfolded
condition is satisfied. This might break consistency
(CoveredPairs <= NumPairs).
Scalable vectorization factors are printed as "vscale x VF" where VF is
the known minimum number of elements, a integer. Currently,
llvm-opt-report always expects a integer (like for vectorization with
fixed-sized vectors), and does not display any vectorization factor in
the output (just 'V', but without a number).
This patch adds support for scalable vectorization factors and prints
them as "VNx<VF>", so for example "VNx4". The "Nx" is used to
differentiate between fixed-sized and scalable factors, and is
consistent with the way LLVM mangles scalable vectors in other places.
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.
This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.
This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
This fixes "unused-local-typedef" warnings in 9324e6a7a5.
This adds an option `--remove-note=[name/]type` to selectively delete
notes in ELF files, where `type` is the numeric value of the note type
and `name` is the name of the originator. The name can be omitted, in
which case all notes of the specified type will be removed. For now,
only `SHT_NOTE` sections that are not associated with segments are
handled. The implementation can be extended later as needed.
RFC: https://discourse.llvm.org/t/rfc-llvm-objcopy-feature-for-editing-notes/83491
This adds an option `--remove-note=[name/]type` to selectively delete
notes in ELF files, where `type` is the numeric value of the note type
and `name` is the name of the originator. The name can be omitted, in
which case all notes of the specified type will be removed. For now,
only `SHT_NOTE` sections that are not associated with segments are
handled. The implementation can be extended later as needed.
RFC: https://discourse.llvm.org/t/rfc-llvm-objcopy-feature-for-editing-notes/83491
Now that we have a dedicated abstraction for string tables, switch the
option parser library's string table over to it rather than using a raw
`const char*`. Also try to use the `StringTable::Offset` type rather
than a raw `unsigned` where we can to avoid accidental increments or
other issues.
This is based on review feedback for the initial switch of options to a
string table. Happy to tweak or adjust if desired here.
This update introduces the ability to filter merged functions during
lookups based on regex patterns derived from call site information in a
previous call to `llvm-gsymutil`. The regex patterns, extracted from
call sites, can then be passed to subsequent calls using the
`--merged-functions-filter` option along with `--merged-functions` and
`--address` (or `--addresses-from-stdin`). This allows for precise
filtering of functions during lookups, giving accurate results for call
stacks that contain merged functions.
Replace `static void exitWithError(Twine Message, std::string Whence =
"", std::string Hint = "")` std::string with StringRef to remove
constructing Strings on every call or passing by value
Fixes: #100065
Some of this was needed to fix implicit conversions from MCRegister to
unsigned when calling getReg() on MCOperand for example.
The majority was done by reviewing parts of the code that dealt with
registers, converting them to MCRegister and then seeing what new
implicit conversions were created and fixing those.
There were a few places where I used MCPhysReg instead of MCRegiser for
static arrays since its uint16_t instead of unsigned.
Adds support to objdump and readobj for reading the `UOP_Epilog` entries
of Windows x64 unwind v2.
`UOP_Epilog` has a weird format:
The first `UOP_Epilog` in the unwind data is the "header":
* The least-significant bit of `OpInfo` is the "At End" flag, which
signifies that there is an epilog at the very end of the associated
function.
* `CodeOffset` is the length each epilog described by the current unwind
information (all epilogs have the same length).
Any subsequent `UOP_Epilog` represents another epilog for the current
function, where `OpInfo` and `CodeOffset` are combined to a 12-bit value
which is the offset of the beginning of the epilog from the end of the
current function. If the offset is 0, then this entry is actually
padding and can be ignored.
This relands f8f8598fd886cddfd374fa43eb6d7d37d301b576
Follow up on #122371:
The problem here is a little subtle: when we dry-run the measurement
phase, we create a LLJIT instance without actually executing the
snippets. The key is, LLJIT has its own TargetMachine which uses triple
designated by LLVM_TARGET_ARCH (which is default to host). On a machine
that does not support Exegesis, the LLJIT would fail to create its
TargetMachine because llvm-exegesis don't even register the host's
target!
Putting this test into any of the target-specific folder won't help,
because it's about the host. And personally I don't really want to use
`exegesis-can-execute-<arch>` for generic tests like this -- it's too
strict as we don't actually need to execute the snippet.
My solution here is creating another test feature which is added only
when LLVM_TARGET_ARCH is supported by llvm-exegesis. This feature is
something in between `<arch>-registered-target` and
`exegesis-can-execute-<arch>`.
We have a textual representation of contextual profiles for test scenarios, mainly. This patch moves that to YAML instead of JSON. YAML is more succinct and readable (some of the .ll tests should be illustrative). In addition, JSON is parse-able by the YAML reader.
A subsequent patch will address deserialization.
(thanks, @kazutakahirata, for showing me how to use the llvm YAML reader/writer APIs, which I incorrectly thought to be more low-level than the JSON ones!)
This reapplies 6d72bf47606, which was reverted in 57447d3ddf to investigate
build failures, e.g. https://lab.llvm.org/buildbot/#/builders/3/builds/10114.
The original patch contained an invalid unused friend declaration of
std::make_shared. This has been removed.
Also adds a new IdleTask type and updates DynamicThreadPoolTaskDispatcher to
schedule IdleTasks whenever the total number of threads running is less than
the maximum number of MaterializationThreads.
A SimpleLazyReexportsSpeculator instance maintains a list of speculation
suggestions ((JITDylib, Function) pairs) and registered lazy reexports. When
speculation opportunities are available (having been added via
addSpeculationSuggestions or when lazy reexports were created) it schedules
an IdleTask that triggers the next speculative lookup as soon as resources
are available. Speculation suggestions are processed first, followed by
lookups for lazy reexport bodies. A callback can be registered at object
construction time to record lazy reexport executions as they happen, and these
executions can be fed back into the speculator as suggestions on subsequent
executions.
The llvm-jitlink tool is updated to support speculation when lazy linking is
used via three new arguments:
-speculate=[none|simple] : When the 'simple' value is specified a
SimpleLazyReexportsSpeculator instances is used
for speculation.
-speculate-order <path> : Specifies a path to a CSV containing
(jit dylib name, function name) triples to use
as speculative suggestions in the current run.
-record-lazy-execs <path> : Specifies a path in which to record lazy function
executions as a CSV of (jit dylib name, function
name) pairs, suitable for use with
-speculate-order.
The same path can be passed to -speculate-order and -record-lazy-execs, in
which case the file will be overwritten at the end of the execution.
No testcase yet: Speculative linking is difficult to test (since by definition
execution behavior should be unaffected by speculation) and this is an new
prototype of the concept*. Tests will be added in the future once the interface
and behavior settle down.
* An earlier implementation of the speculation concept can be found in
llvm/include/llvm/ExecutionEngine/Orc/Speculation.h. Both systems have the
same goal (hiding compilation latency) but different mechanisms. This patch
relies entirely on information available in the controller, where the old
system could receive additional information from the JIT'd runtime via
callbacks. I aim to combine the two in the future, but want to gain more
practical experience with speculation first.
…#121991)"
This reverts commit f8f8598fd886cddfd374fa43eb6d7d37d301b576.
This breaks ARMv7 and s390x buildbot with the following message:
```
llvm-exegesis error: No available targets are compatible with triple "armv8l-unknown-linux-gnueabihf"
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/tcwg-buildbot/worker/clang-armv7-2stage/stage2/bin/FileCheck /home/tcwg-buildbot/worker/clang-armv7-2stage/llvm/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
```