Inferring super-register classes naively requires checking every
register class against every other register class and sub-register
index.
Each of those checks is itself a non-trivial operation on register sets.
Culling as many (RC, RC, SubIdx) triples as possible is important for
the running time of TableGen for architectures with complex sub-register
relations.
Use transitivity to cull many (RC, RC, SubIdx) triples. This
unfortunately requires us to complete the transitive closure of
super-register classes explicitly, but it still cuts down the running
time on AMDGPU substantially -- in some upcoming work in the
backend by more than half (in very rough measurements).
This changes the names of some of the inferred register classes, since
the order in which they are inferred changes. The names of the inferred
register classes become shorter, which reduces the size of the generated
files.
Replacing some uses of SmallPtrSet by DenseSet shaves off a few more
percent; there are hundreds of register classes in AMDGPU.
Tweaking the topological signature check to skip reigsters without
super-registers further helps skip register classes that have "pseudo"
registers in them whose sub- and super-register structure is trivial.
The .h removals was done by the sync script. I manually cleaned up
the remaining removals based on the output of
git show 750da48b4aa52f libcxx/include/CMakeLists.txt | rg '^- ' | rg -v '\.'
Only check for diffs containing "undef" in .ll files, this prevents
comments like `// We should not have undef values...` triggering the
undef checker bot.
2b11c7de4ae182496438e166cb6758d41b6e1740 introduced
`llvm/include/llvm/MC/MCAsmLexer.h` and made `AsmLexer` inherit from
`MCAsmLexer`, likely to allow target-specific parsers to depend solely
on `MCAsmLexer`. However, this separation now seems unnecessary and
confusing.
`MCAsmLexer` defines virtual functions with `AsmLexer` as its only
implementation, and `AsmLexer` itself has few extra public methods.
To simplify the codebase, this change merges MCAsmLexer.{h,cpp} into
AsmLexer.{h,cpp}. MCAsmLexer.h is temporarily kept as a forwarder.
Note: I doubt that a downstream lexer handling an assembly syntax
significantly different from the standard GNU Assembler syntax would
want to inherit from `MCAsmLexer`. Instead, it's more likely they'd
extend `AsmLexer` by adding new states and modifying its internal logic,
as seen with variables for MASM, M68k, and HLASM.
When the list is large, using `llvm::is_contained` is of higher
performance than a sequence of comparisons. When the list is small,
the `llvm::is_contained` can be inlined and unrolled, which has the
same effect as using a sequence of comparisons.
And the generated code is more readable.
e2ba1b6ffde4ec607342b1b746d1b57f0f04390a references that it reverts a
commit that's not a parent of e2ba1b6ffde4ec607342b1b746d1b57f0f04390a.
Functionally, this can (and demonstrably does) work(*), but from the
standpoint of the revert checker, it's nonsense. Print a `logging.error`
when it's detected.
Tested by running the revert checker against a commit range that
includes the aforementioned commit; the logging.error was fired
appropriately.
(*) - the specifics here are:
- the _SHA_ that was referenced was on a non-main branch, but
- the commit from the non-main branch was merged into the non-main
branch from main
- ...so the _functional_ commit being reverted was originally landed on
main, but the _SHA_ referenced from main was from a branch that was cut
before the reverted-commit was landed on main
Currently iterators over EquivalenceClasses will iterate over std::set,
which guarantees the order specified by the comperator. Unfortunately in
many cases, EquivalenceClasses are used with pointers, so iterating over
std::set of pointers will not be deterministic across runs.
There are multiple places that explicitly try to sort the equivalence
classes before using them to try to get a deterministic order
(LowerTypeTests, SplitModule), but there are others that do not at the
moment and this can result at least in non-determinstic value naming in
Float2Int.
This patch updates EquivalenceClasses to keep track of all members via a
extra SmallVector and removes code from LowerTypeTests and SplitModule
to sort the classes before processing.
Overall it looks like compile-time slightly decreases in most cases, but
close to noise:
https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions
PR: https://github.com/llvm/llvm-project/pull/134075
Try to reduce individual subtarget features in the "target-features"
attribute. This attempts a textual removal of the fields in the string,
not a semantic removal. Typically there's a lot of redundant feature spam
in the feature list implied by the target-cpu (which I really wish clang
would stop emitting). If we could parse these out, we could easily drop
the fields without testing anything.
We can use *Set::insert_range to collapse:
for (auto Elem : Range)
Set.insert(E.first);
down to:
Set.insert_range(llvm::make_first_range(Range));
In some cases, we can further fold that into the set declaration.
I'm looking into using sub-operands for memory operands. This would use
MIOperandInfo to create a single operand that contains a register and
immediate as sub-operands. We can treat this as a single operand for
parsing and matching in the assembler. I believe this will provide some
simplifications like removing the InstAliases we need to support "(rs1)"
without an immediate.
Doing this requires making CompressInstEmitter aware of sub-operands.
I've chosen to use a flat list of operands in the CompressPats so each
sub-operand is represented individually.