This avoids recomputing string length that is already known at compile time.
It has a slight impact on preprocessing / compile time, see
https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u
This a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.
The above patchset caused some version of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.
Differential Revision: https://reviews.llvm.org/D139881
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4 (part 2)"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4"
GCC build hangs on this bot https://lab.llvm.org/buildbot/#/builders/37/builds/19104
compiling CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.d
The bot uses GNU 11.3.0, but I can reproduce locally with gcc (Debian 12.2.0-3) 12.2.0.
This reverts commit caa713559bd38f337d7d35de35686775e8fb5175.
This reverts commit 06b90e2e9c991e211fecc97948e533320a825470.
This reverts commit e953ae5bbc313fd0cc980ce021d487e5b5199ea4.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Summary:
The linker wrapper uses this metadata to determine which registration
code to emit, e.g. CUDA, HIP or OpenMP. If we encounter an OFK_None we
should just ignore it.
Breaks build of LLVMgold here:
```
/repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1108:19: error: no matching function for call to 'localCache'
Cache = check(localCache("ThinLTO", "Thin", options::cache_dir, AddBuffer));
^~~~~~~~~~
/repositories/llvm-project/llvm/include/llvm/Support/Caching.h:72:21: note: candidate function not viable: no known conversion from '(lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1102:20)' to 'llvm::AddBufferFn' (aka 'function<void (unsigned int, const llvm::Twine &, std::unique_ptr<MemoryBuffer>)>') for 4th argument
Expected<FileCache> localCache(
^
/repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1110:18: error: no viable conversion from '(lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1094:20)' to 'llvm::AddStreamFn' (aka 'function<Expected<std::unique_ptr<CachedFileStream>> (unsigned int, const llvm::Twine &)>')
check(Lto->run(AddStream, Cache));
^~~~~~~~~
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_function.h:375:7: note: candidate constructor not viable: no known conversion from '(lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1094:20)' to 'std::nullptr_t' for 1st argument
function(nullptr_t) noexcept
^
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_function.h:386:7: note: candidate constructor not viable: no known conversion from '(lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1094:20)' to 'const std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream>> (unsigned int, const llvm::Twine &)> &' for 1st argument
function(const function& __x)
^
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_function.h:404:7: note: candidate constructor not viable: no known conversion from '(lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1094:20)' to 'std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream>> (unsigned int, const llvm::Twine &)> &&' for 1st argument
function(function&& __x) noexcept
^
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_function.h:435:2: note: candidate template ignored: requirement '_Callable<(lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1094:20) &, (lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1094:20), std::__invoke_result<(lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1094:20) &, unsigned int, const llvm::Twine &>>::value' was not satisfied [with _Functor = (lambda at /repositories/llvm-project/llvm/tools/gold/gold-plugin.cpp:1094:20) &]
function(_Functor&& __f)
^
/repositories/llvm-project/llvm/include/llvm/LTO/LTO.h:278:25: note: passing argument to parameter 'AddStream' here
Error run(AddStreamFn AddStream, FileCache Cache = nullptr);
^
```
This reverts commit 387620aa8cea33174b6c1fb80c1af713fee732ac.
Currently the lto native object files have names like main.exe.lto.1.obj. In
PDB, those names are used as names for each compiland. Microsoft’s tool
SizeBench uses those names to present to users the size of each object files.
So, names like main.exe.lto.1.obj is not user friendly.
This patch makes the lto native object file names more readable by using
the bitcode file names as part of the file names. For example, if the input
bitcode file has path like "path/to/foo.obj", its corresponding lto native
object file path would be "path/to/main.exe.lto.foo.obj". Since the lto native
object file name only bothers PDB, this patch only changes the lld-linker's
behavior.
Reviewed By: tejohnson, MaskRay, #lld-macho
Differential Revision: https://reviews.llvm.org/D137217
This patch changes the device linking steps to be performed in parallel
when multiple offloading architectures are being used. We use the LLVM
parallelism support to accomplish this by simply doing each inidividual
device linking job in a single thread. This change required re-parsing
the input arguments as these arguments have internal state that would
not be properly shared between the threads otherwise.
By default, the parallelism uses all threads availible. But this can be
controlled with the `--wrapper-jobs=` option. This was required in a few
tests to ensure the ordering was still deterministic.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D136701
A previous patch introduced a common function used to extract offloading
binaries from an image. Therefore we no longer need to duplicate the
functionality in the `llvm-objdump` implementation. Functionally, this
removes the old warning behaviour when given malformed input. This has
been changed to a hard error, which is effectively the same.
This required a slight tweak in the linker wrapper to filter out the
user passing shared objects directly.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D136796
The linker wrapper does its own library searching for static archives
that can contain device code. The device linking phases happen before
the host linking phases so that we can generate the necessary
registration code and link it in with the rest of the code. Previously,
If a library containing needed device code was not found the execution
would continue silently until it failed with undefined symbols. This
patch allows the linker wrapper to perform its own check beforehand to
catch these errors.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D137180
The ptxas assembler does not allow the `-g` flag along with
optimizations. Normally this is degraded to line info in the driver, but
when using LTO we did not have this step and the linker wrapper was not
correctly degrading the option. Note that this will not work if the user
does not pass `-g` again to the linker invocation. That will require
setting some flags in the binary to indicate that debugging was used
when building.
This fixes#57990
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D134660
We currently extract offload binaries inside of the linker wrapper.
Other tools may wish to do the same extraction operation. This patch
simply factors out this handling into the `OffloadBinary.h` interface.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D132689
The OpenMP offloading runtine currently uses an array of linked
offloading images. One downside to this is that we cannot know the
architecture or triple associated with the given image. In this patch,
instead of embedding the image itself, we embed an offloading binary
instead. This binary is simply a binary format that wraps around the
original linked image to provide some additional metadata. This will
allow us to support offloading to multiple architecture, or performing
future JIT compilation inside of the runtime, more clearly.
Additionally, these can be placed at a special section such that the
supported architectures can be identified using objdump with the support
from D126904. This needs to be stored in a new section name
`.llvm.offloading.images` because the `.llvm.offloading` section
implicitly uses the `SHF_EXCLUDE` flag and will always be stripped.
This patch does not contain the necessary code to parse these in
libomptarget.
Depends on D127246
Reviewed By: saiislam
Differential Revision: https://reviews.llvm.org/D127304
Summary:
The previous patch moved some functoinality into a new function and
returned it. The vector contained move-only members. Newer compilers
should figure this out and I didn't notice any problems, but other ones
have problems. Explicitly move this vector to hopefully solve the issue.
Summary:
This patch adds the new `--wrapper-time-trace=` option to write a time
tracing JSON file indicating where time was spent in the linker wrapper.
We also reformat and group some of the existing code to make
constraining the scope easier for time tracing. We use the `--wrapper`
prefix to set this apart from the time tracing that lld may use.
Summary:
A previous patch added the Task to the output filename when doing
`save-temps` the majority of cases there is only a single task so we
only add the task explicitly to differentiate it from the first one.
Summary:
Previous assumptions held that the LTO stage would only have a single
output. This is incorrect when using thinLTO which outputs multiple
files. Additionally there were some bugs with how we hanlded input that
cause problems when performing thinLTO. This patch addresses these
issues.
The performance of Thin-LTO is currently pretty bad. But I am content to
leave it that way as long as it compiles.
This patch adds the necessary changes required to bundle and wrap HIP
files. The bundling is done using `clang-offload-bundler` currently to
mimic `fatbinary` and the wrapping is done using very similar runtime
calls to CUDA. This still does not support managed / surface / texture
variables, that would require some additional information in the entry.
One difference in the codegeneration with AMD is that I don't check if
the handle is null before destructing it, I'm not sure if that's
required.
With this we should be able to support HIP with the new driver.
Depends on D128850
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D128914
The LTO pipeline handles its errors using the diagnostics handler
callback function. We were not checking the results of these errors and
not properly returning an error code in the linker wrapper when errors
occured inside of the LTO pipeline. This patch adds a simple boolean
flag to indicate if the LTO backend failed to any reason and quit.
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D129423
This patch removes some uses of string savers that are no-longer needed.
We also create a new string saver when linking bitcode files. It seems
that occasionally the symbol string references can go out of scope when
they are added to the LTO input so we need to save these names that are
used for symbol resolution. Additionally, a previous patch added new
logic for handling bitcode libraries, but failed to actually add them to
the input. This bug has been fixed.
Fixes#56445
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D129383
Summary:
The previous path reworked some handling of temporary files which
exposed some bugs related to capturing local state by reference in the
callback labmda. Squashing this by copying in everything instead. There
was also a problem where the argument name was changed for
`--bitcode-library=` but clang still used `--target-library=`.
Summary:
This patch reworks the command line argument handling in the linker
wrapper from using the LLVM `cl` interface to using the `Option`
interface with TableGen. This has several benefits compared to the old
method.
We use arguments from the linker arguments in the linker
wrapper, such as the libraries and input files, this allows us to
properly parse these. Additionally we can now easily set up aliases to
the linker wrapper arguments and pass them in the linker input directly.
That is, pass an option like `cuda-path=` as `--offload-arg=cuda-path=`
in the linker's inputs. This will allow us to handle offloading
compilation in the linker itself some day. Finally, this is also a much
cleaner interface for passing arguments to the individual device linking
jobs.
Summary:
A previous patch added a new ELF section type for LLVM offloading. We
should use this when extracting the offloading sections rather than
checking the string. This pach also removes the implicit support for
COFF and MACH-O because we don't support those currently and should not
be included.
Currently we use the `embedBufferInModule` function to store binary
strings containing device offloading data inside the host object to
create a fatbinary. In the case of LTO, we need to extract this object
from the LLVM-IR. This patch adds a metadata node for the embedded
objects containing the embedded pointers and the sections they were
stored at. This should create a cleaner interface for identifying these
values.
In the future it may be worthwhile to also encode an `ID` in the
metadata corresponding to the object's special section type if relevant.
This would allow us to extract the data from an object file and LLVM-IR
using the same ID.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D129033
We use LLD to perform AMDGPU linking. This linker accepts some arguments
through the `-plugin-opt` facilities. These options match what `Clang`
will output when given the same input.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D128923
Summary:
This patch adds some new sanity checks to make sure that the sizes of
the offsets are within the bounds of the file or what is expected by the
binary. This also improves the error handling of the version structure
to be built into the binary itself so we can change it easier.
The target features are necessary for correctly compiling most programs
in LTO mode. Currently, these are derived in clang at link time and
passed as an arguemnt to the linker wrapper. This is problematic because
it requires knowing the required toolchain at link time, which should
not be necessry. Instead, these features should be embedded into the
offloading binary so we can unify them in the linker wrapper for LTO.
This also required changing the offload packager to interpret multiple
arguments as concatenation with a comma. This is so we can still use the
`,` separator for the argument list.
Depends on D127246
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D127686
Summary:
Currently we use temporary files to write the intermediate results to.
However, these are stored as regular strings and we do a few unnecessary
copies and conversions of them. This patch simply replaces these strings
with a reference to the filename stored in the list of temporary files.
The temporary files will stay alive during the whole linking phase and
have stable pointers, so we should be able to cheaply pass references to
them rather than copying them every time.
Summary:
A recent patch added some new code paths to the linker wrapper. Older
compilers seem to have problems with returning errors wrapped in
an Excepted type without explicitly moving them. This caused failures in
some of the buildbots. This patch fixes that.
The linker wrapper currently eagerly extracts all identified offloading
binaries to a file. This isn't ideal because we will soon open these
files again to examine their symbols for LTO and other things.
Additionally, we may not use every extracted file in the case of static
libraries. This would be very noisy in the case of static libraries that
may contain code for several targets not participating in the current
link.
Recent changes allow us to treat an Offloading binary as a standard
binary class. So that allows us to use an OwningBinary to model the
file. Now we keep it in memory and only write it once we know which
files will be participating in the final link job. This also reworks a
lot of the structure around how we handle this by removing the old
DeviceFile class.
The main benefit from this is that the following doesn't output 32+ files and
instead will only output a single temp file for the linked module.
```
$ clang input.c -fopenmp --offload-arch=sm_70 -foffload-lto -save-temps
```
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D127246