This patch adds Version 3 for development purposes. For now, this
patch adds V3 as a copy of V2.
For the most part, this patch adds "case Version3:" wherever "case
Version2:" appears. One exception is writeMemProfV3, which is copied
from writeMemProfV2 but updated to write out memprof::Version3 to the
MemProf header. We'll incrementally modify writeMemProfV3 in
subsequent patches.
This implements the `nusw` and `nuw` flags for `getelementptr` as
proposed at
https://discourse.llvm.org/t/rfc-add-nusw-and-nuw-flags-for-getelementptr/78672.
The three possible flags are encapsulated in the new `GEPNoWrapFlags`
class. Currently this class has a ctor from bool, interpreted as the
InBounds flag. This ctor should be removed in the future, as code gets
migrated to handle all flags.
There are a few places annotated with `TODO(gep_nowrap)`, where I've had
to touch code but opted to not infer or precisely preserve the new
flags, so as to keep this as NFC as possible and make sure any changes
of that kind get test coverage when they are made.
Update the folder titles for targets in the monorepository that have not
seen taken care of for some time. These are the folders that targets are
organized in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property`, but are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
The profile density feature(the amount of samples in the profile
relative to the program size) is used to identify insufficient sample
issue and provide hints for user to increase sample count. A low-density
profile can be inaccurate due to statistical noise, which can hurt FDO
performance.
This change introduces two improvements to the current density work.
1. The density calculation/definition is changed. Previously, the
density of a profile was calculated as the minimum density for all warm
functions (a function was considered warm if its total samples were
within the top N percent of the profile). However, there is a problem
that a high total sample profile can have a very low density, which
makes the density value unstable.
- Instead, we want to find a density number such that if a function's
density is below this value, it is considered low-density function. We
consider the whole profile is bad if a group of low-density functions
have the sum of samples that exceeds N percent cut-off of the total
samples.
- In implementation, we sort the function profiles by density, iterate
them in descending order and keep accumulating the body samples until
the sum exceeds the (100% - N) percentage of the total_samples, the
profile-density is the last(minimum) function-density of processed
functions. We introduce the a flag(`--profile-density-threshold`) for
this percentage threshold.
2. The density is now calculated based on final(compiler used) profiles
instead of merged context-less profiles.
On Windows, perfscript generated by sep contains CR+LF at the end of
LBR records line. This '\r' will be treated as a LBR record when running
llvm-profgen on Linux and then generate warning.
The `llvm-profdata order` command is used to compute a function order
using traces from the input profile. Add the `--num-test-traces` flag to
keep aside N traces to evalute this order. These test traces are assumed
to be the actual function execution order in some experiment. The output
is a number that represents how many page faults we got. Lower is
better.
I tested on a large profile I already had.
```
llvm-profdata order default.profdata --num-test-traces=30
# Ordered 149103 functions
# Total area under the page fault curve: 2.271827e+09
...
```
I also improved `TemporalProfTraceTy::createBPFunctionNodes()` in a few
ways:
* Simplified how `UN`s are computed
* Change how the initial `Node` order is computed
* Filter out rare and common `UN`s
* Output vector is an aliased argument instead of a return
These changes slightly improved the evaluation in my test.
```
llvm-profdata order default.profdata --num-test-traces=30
# Ordered 149103 functions
# Total area under the page fault curve: 2.268586e+09
...
```
This patch fixes an issue where, when printing corefile notes with
llvm-readobj as json, the dumper generated llvm formatted output which
isn't valid json. This alters the dumper to, in the NT_FILE, note, dump
properly formatted json data.
Prior to this patch the JSON output was formatted like:
```
"Mapping": [
"Start": 4096,
"End": 8192,
"Offset": 12288,
"Filename": "/path/to/a.out"
],
```
Whereas now it is formatted as:
```
"Mappings": [
{
"Start": 4096,
"End": 8192,
"Offset": 12288,
"Filename": "/path/to/a.out"
},
```
Which is valid. Additionally the LLVM output has changed to match the
structure of the JSON output (i.e. instead of lists of keys it is a list
of dictionaries)
Currently we assume a constant latency of 100 cycles for call
instructions. This commit allows the user to specify a custom value for
the same as a command line argument. Default latency is set to 100.
For distributed ThinLTO, the LTO indexing step generates combined
summary for each module, and postlink pipeline reads the combined
summary which stores the information for link-time optimization.
This patch populates the 'import type' of a summary in bitcode, and
updates bitcode reader to parse the bit correctly.
Commit 6c0665e22174d474050e85ca367424f6e02476be
(https://reviews.llvm.org/D45164) enabled certain constant expression
evaluation for `MCObjectStreamer` at parse time (e.g. `.if` directives,
see llvm/test/MC/AsmParser/assembler-expressions.s).
`getUseAssemblerInfoForParsing` was added to make `clang -c` handling
inline assembly similar to `MCAsmStreamer` (e.g. `llvm-mc -filetype=asm`),
where such expression folding (related to
`AttemptToFoldSymbolOffsetDifference`) is unavailable.
I believe this is overly conservative. We can make some parse-time
expression folding work for `clang -c` even if `clang -S` would still
report an error, a MCAsmStreamer issue (we cannot print `.if`
directives) that should not restrict the functionality of
MCObjectStreamer.
```
% cat b.cc
asm(R"(
.pushsection .text,"ax"
.globl _start; _start: ret
.if . -_start == 1
ret
.endif
.popsection
)");
% gcc -S b.cc && gcc -c b.cc
% clang -S -fno-integrated-as b.cc # succeeded
% clang -c b.cc # succeeded with this patch
% clang -S b.cc # still failed
<inline asm>:4:5: error: expected absolute expression
4 | .if . -_start == 1
| ^
1 error generated.
```
However, removing `getUseAssemblerInfoForParsing` would make
MCDwarfFrameEmitter::Emit (for .eh_frame FDE) slow (~4% compile time
regression for sqlite3.c amalgamation) due to expensive
`AttemptToFoldSymbolOffsetDifference`. For now, make
`UseAssemblerInfoForParsing` false in MCDwarfFrameEmitter::Emit.
Close#62520
Link: https://discourse.llvm.org/t/rfc-clang-assembly-object-equivalence-for-files-with-inline-assembly/78841
Pull Request: https://github.com/llvm/llvm-project/pull/91082
https://github.com/llvm/llvm-project/pull/71328 refactored
`llvm-profdata.cpp` to use subcommands (which is super nice), but left
many unused `argv` variables. This opts to use `ProgName` where
necessary, and removes `argv` otherwise.
`llvm_test_dibuilder(/*NewDebugInfoMode=*/true)` isn't currently executed in
`return llvm_test_dibuilder(false) && llvm_test_dibuilder(true);`
because `llvm_test_dibuilder` returns 0 for success.
Split the llvm-c-test flag `--test-dibuilder` into two, one for the old and
one for the new debug info format. Add another lit test for the new format.
Now that the test actually runs, it crashes using the new format with
`llvm/lib/IR/LLVMContextImpl.cpp:53:llvm::LLVMContextImpl::~LLVMContextImpl(): Assertion 'TrailingDbgRecords.empty() && "DbgRecords in blocks not cleaned"' failed. Aborted`.
Insert terminators into the blocks so that we don't leave the debug records
trailing, unattached to any instructions, which fixes that.
This reverts commit 03c53c69a367008da689f0d2940e2197eb4a955c.
This causes very large compile-time regressions in some cases,
e.g. sqlite3 at O0 regresses by 5%.
Commit 6c0665e22174d474050e85ca367424f6e02476be
(https://reviews.llvm.org/D45164) enabled certain constant expression
evaluation for `MCObjectStreamer` at parse time (e.g. `.if` directives,
see llvm/test/MC/AsmParser/assembler-expressions.s).
`getUseAssemblerInfoForParsing` was added to make `clang -c` handling
inline assembly similar to `MCAsmStreamer` (e.g. `llvm-mc -filetype=asm`),
where such expression folding (related to
`AttemptToFoldSymbolOffsetDifference`) is unavailable.
I believe this is overly conservative. We can make some parse-time
expression folding work for `clang -c` even if `clang -S` would still
report an error, a MCAsmStreamer issue (we cannot print `.if`
directives) that should not restrict the functionality of
MCObjectStreamer.
```
% cat b.cc
asm(R"(
.pushsection .text,"ax"
.globl _start; _start: ret
.if . -_start == 1
ret
.endif
.popsection
)");
% gcc -S b.cc && gcc -c b.cc
% clang -S -fno-integrated-as b.cc # succeeded
% clang -c b.cc # succeeded with this patch
% clang -S b.cc # still failed
<inline asm>:4:5: error: expected absolute expression
4 | .if . -_start == 1
| ^
1 error generated.
```
Close#62520
Link: https://discourse.llvm.org/t/rfc-clang-assembly-object-equivalence-for-files-with-inline-assembly/78841
Pull Request: https://github.com/llvm/llvm-project/pull/91082
This reverts commit 91446e2aa687ec57ad88dc0df793d0c6e694a7c9 and
a unittest followup 1530f319311908b06fe935c89fca692d3e53184f (#90476).
In a stage-2 -flto=thin -gsplit-dwarf -g -fdebug-info-for-profiling
-fprofile-sample-use= build of clang, a ThinLTO backend compile has
assertion failures:
Global is external, but doesn't have external or weak linkage!
ptr @_ZN5clang12ast_matchers8internal18makeAllOfCompositeINS_8QualTypeEEENS1_15BindableMatcherIT_EEN4llvm8ArrayRefIPKNS1_7MatcherIS5_EEEE
function declaration may only have a unique !dbg attachment
ptr @_ZN5clang12ast_matchers8internal18makeAllOfCompositeINS_8QualTypeEEENS1_15BindableMatcherIT_EEN4llvm8ArrayRefIPKNS1_7MatcherIS5_EEEE
The failures somehow go away if -fprofile-sample-use= is removed.
Avoid using bitfield in dxbc::ProgramHeader.
It could potentially be read incorrectly on any host depending on the
compiler.
From [C++17's
[class.bit]](https://timsong-cpp.github.io/cppwp/n4659/class.bit#1)
> Bit-fields are packed into some addressable allocation unit. [ Note:
Bit-fields straddle allocation units on some machines and not on others.
Bit-fields are assigned right-to-left on some machines, left-to-right on
others. — end note ]
For #91793
Add a -q/--quiet flag to suppress dsymutil output. For now the flag is
limited to dsymutil, though there might be other places in the DWARF
linker that could be conditionalized by this flag.
The motivation is having a way to silence the "no debug symbols in
executable" warning. This is useful when we want to generate a dSYM for
a binary not containing debug symbols, but still want a dSYM that can be
indexed by spotlight.
rdar://127843467
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
70 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
This adds LLVMBuildCallBr to create CallBr instructions, and getters for
the CallBr-specific data. The remainder of its data, e.g.
arguments/function, can be accessed using existing getters.
[llvm-mca] Abort on parse error without -skip-unsupported-instructions
Prior to this patch, llvm-mca would continue executing after parse
errors. These errors can lead to some confusion since some analysis
results are printed on the standard output, and they're printed after
the errors, which could otherwise be easy to miss.
However it is still useful to be able to continue analysis after errors;
so extend the recently added -skip-unsupported-instructions to support
this.
Two tests which have parse errors for some of the 'RUN' branches are
updated to use -skip-unsupported-instructions so they can remain as-is.
Add a description of -skip-unsupported-instructions to the llvm-mca
command guide, and add it to the llvm-mca --help output:
```
--skip-unsupported-instructions=<value> - Force analysis to continue in the presence of unsupported instructions
=none - Exit with an error when an instruction is unsupported for any reason (default)
=lack-sched - Skip instructions on input which lack scheduling information
=parse-failure - Skip lines on the input which fail to parse for any reason
=any - Skip instructions or lines on input which are unsupported for any reason
```
Tests within this patch are intended to cover each of the cases.
Reason | Flag | Comment
--------------|------|-------
none | none | Usual case, existing test suite
lack-sched | none | Advises user to use -skip-unsupported-instructions=lack-sched, tested in llvm/test/tools/llvm-mca/X86/BtVer2/unsupported-instruction.s
parse-failure | none | Advises user to use -skip-unsupported-instructions=parse-failure, tested in llvm/test/tools/llvm-mca/bad-input.s
any | none | (N/A, covered above)
lack-sched | any | Continues, prints warnings, tested in llvm/test/tools/llvm-mca/X86/BtVer2/unsupported-instruction.s
parse-failure | any | Continues, prints errors, tested in llvm/test/tools/llvm-mca/bad-input.s
lack-sched | parse-failure | Advises user to use -skip-unsupported-instructions=lack-sched, tested in llvm/test/tools/llvm-mca/X86/BtVer2/unsupported-instruction.s
parse-failure | lack-sched | Advises user to use -skip-unsupported-instructions=parse-failure, tested in llvm/test/tools/llvm-mca/bad-input.s
none | * | This would be any test case with skip-unsupported-instructions, coverage added in llvm/test/tools/llvm-mca/X86/BtVer2/simple-test.s
any | * | (Logically covered by the other cases)
We need it to test isel related passes. Currently
`verifyMachineFunction` is incomplete (no LiveIntervals support), but is
enough for testing isel pass, will migrate to complete
`MachineVerifierPass` in future.
Reapplies the original commit:
2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05
The previous application of this patch failed due to some missing
DbgVariableRecord support in clang, which has been added now by commit
8805465e.
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html
This reverts commit 4fd319ae273ed6c252f2067909c1abd9f6d97efa.
Fixes the broken tests in the original commit:
2f01fd99eb8c8ab3db9aba72c4f00e31e9e60a05
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html
This reverts commit 00821fed09969305b0003d3313c44d1e761a7131.
This patch enables parsing and creating modules directly into the new
debug info format. Prior to this patch, all modules were constructed
with the old debug info format by default, and would be converted into
the new format just before running LLVM passes. This is an important
milestone, in that this means that every tool will now be exposed to
debug records, rather than those that run LLVM passes. As far as I've
tested, all LLVM tools/projects now either handle debug records, or
convert them to the old intrinsic format.
There are a few unit tests that need updating for this patch; these are
either cases of tests that previously needed to set the debug info
format to function, or tests that depend on the old debug info format in
some way. There should be no visible change in the output of any LLVM
tool as a result of this patch, although the likelihood of this patch
breaking downstream code means an NFC tag might be a little misleading,
if not technically incorrect:
This will probably break some downstream tools that don't already handle
debug records. If your downstream code breaks as a result of this
change, the simplest fix is to convert the module in question to the old
debug format before you process it, using
`Module::convertFromNewDbgValues()`. For more information about how to
handle debug records or about what has changed, see the migration
document:
https://llvm.org/docs/RemoveDIsDebugInfo.html
To support auto-conversion on z/OS text files need to be opened as text files. These changes will fix a number of LIT failures due to text files not being converted to the internal code page.
update a number of tools so they open the text files as text files
add support in the cat.py to open a text file as a text file (Windows will continue to treat all files as binary so new lines are handled correctly)
add env var definitions to enable auto-conversion in the lit config file.
In new pass system, `MachineFunction` could be an analysis result again,
machine module pass can now fetch them from analysis manager.
`MachineModuleInfo` no longer owns them.
Remove `FreeMachineFunctionPass`, replaced by
`InvalidateAnalysisPass<MachineFunctionAnalysis>`.
Now `FreeMachineFunction` is replaced by
`InvalidateAnalysisPass<MachineFunctionAnalysis>`, the workaround in
`MachineFunctionPassManager` is no longer needed, there is no difference
between `unittests/MIR/PassBuilderCallbacksTest.cpp` and
`unittests/IR/PassBuilderCallbacksTest.cpp`.
This adds a "hidden" alias kind that allows using LLD when symlinked as
`ld`; however, it does not install `ld` as a symlink. This is to allow
either using a mixed toolchain with both LLD and GNU ld, or a pure LLD
toolchain where LLD has been installed (or symlinked) to `ld` for
compatibility w/ older tools that expect `ld`.
Prior to this patch, if llvm-mca encountered an instruction which parses
but has no scheduler info, the instruction is always reported as
unsupported, and llvm-mca halts with an error.
However, it would still be useful to allow MCA to continue even in the
case of instructions lacking scheduling information. Obviously if
scheduling information is lacking, it's not possible to give an accurate
analysis for those instructions, and therefore a warning is emitted.
A user could previously have worked around such unsupported instructions
manually by deleting such instructions from the input, but this provides
them a way of doing this for bulk inputs where they may not have a list
of such unsupported instructions to drop up front.
Note that this behaviour of instructions with no scheduling information
under -skip-unsupported-instructions is analagous to current
instructions which fail to parse: those are currently dropped from the
input with a message printed, after which the analysis continues.
~Testing the feature is a little awkward currently, it relies on an
instruction
which is currently marked as unsupported, which may not remain so;
should the
situation change it would be necessary to find an alternative
unsupported
instruction or drop the test.~
A test is added to check that analysis still reports an error if all
instructions are removed from the input, to mirror the current behaviour
of giving an error if no instructions are supplied.
In the SubprocessMemory destructor, I was using a normal std::string to
hold the name of the current shared memory name, but a const reference
works just as well in this situation while having better performance
characteristics.
Fixes#90289
There are several insstances in the subprocess executor in llvm-exegesis
where we fail to close file descriptors after using them. This leaves
them open, which can cause issues later on if a third-party binary is
using the exegesis libraries and executing many blocks before exiting.
Leaving the descriptors open until process exit is also bad practice.
This patch fixes that by explicitly calling close() when we are done
with a specific file descriptor.
This patch fixes#86583.