On ARM64X, symbol names alone are ambiguous as they may refer to either
a native or an EC symbol. Append '(EC symbol)' or '(native symbol)' in
diagnostic messages to distinguish them.
If the ECSYMBOLS section is missing in the archive, the archive could be
either a native-only ARM64 or x86_64 archive. Check the machine type of
the object containing a symbol to determine which symbol table to use.
`%T` is not unique and deprecated
[[1](https://llvm.org/docs/CommandGuide/lit.html#substitutions)].
This patch replaces all `%T` in `lld/test` with `%t.dir` (`mkdir` if
necessary)
---------
Signed-off-by: Peter Rong <PeterRong@meta.com>
For each imported module, emit null-terminated native import entries,
followed by null-terminated EC entries. If a view lacks imports for a
given module, only terminators are emitted. Use ARM64X relocations to
skip native entries in the EC view.
Move `delayLoadHelper` and `tailMergeUnwindInfoChunk` to `SymbolTable`
since they are different for each symbol table.
In hybrid images, the PE header references a single IAT for both native
and EC views, merging entries where possible. When merging isn't
feasible, different imports are grouped together, and ARM64X relocations
are emitted as needed.
`NullChunk` instances do write data, even if it's always zero. Setting
`hasData` to false causes `Writer::assignAddresses` to ignore them
when calculating `rawSize`. This typically isn't an issue, as null chunks
are usually positioned within a section, and later chunks adjust the
size accordingly.
However, on ARM64EC, the auxiliary IAT is placed at the end of the
`.rdata` section and terminates with a null chunk. As a result, `rawSize`
is never updated to account for it, and space for the null chunk is not
allocated. Consequently, when `NullChunk::writeTo` is called, it receives
an invalid pointer - either pointing to the next section or beyond the
allocated buffer.
This is a follow-up to #120452 in a way.
Since lld/COFF does not yet insert all defined in an obj file before all
undefineds (ELF and MachO do this, see #67445 and things linked from
there), it's possible that:
1. We add an obj file a.obj
2. a.obj contains an undefined that's in b.obj, causing b.obj to be
added
3. b.obj contains an undefined that's in a part of a.obj that's not yet
in the symbol table, causing a recursive load of a.obj, which adds the
symbols in there twice, leading to duplicate symbol errors.
For normal archives, `ArchiveFile::addMember()` has a `seen` check to
prevent this. For start-lib lazy objects, we can just check if the
archive is still lazy at the recursive call.
This bug is similar to issue #59162.
(Eventually, we'll probably want to do what the MachO and ELF ports do.)
Includes a test that caused duplicate symbol diagnostics before this
code change.
ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b tried to fix cases when LLD
links what seems to be import library header objects from MSVC. However,
the fix seems incorrect; the review at https://reviews.llvm.org/D133627
concluded that if this (treating this kind of symbol as a common symbol)
is what link.exe does, it's fine.
However, this is most probably not what link.exe does. The symbol
mentioned in the commit message of
ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b would be a common symbol with a
size of around 3 GB; this is not what might have been intended.
That commit tried to avoid running into the error ".idata$4 should not
refer to special section 0"; that issue is fixed for a similar style of
section symbols in 4a4a8a1476b1386b523dc5b292ba9a5a6748a9cf.
Therefore, revert ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b and extend
the fix from 4a4a8a1476b1386b523dc5b292ba9a5a6748a9cf to also work for
the section symbols in MSVC generated import libraries.
The main detail about them, is that for symbols of type
IMAGE_SYM_CLASS_SECTION, the Value field is not an offset, but it is an
optional set of flags, corresponding to the Characteristics of the
section header (although it may be empty).
This is a reland of a previous version of this commit, earlier merged in
9457418e66766d8fafc81f85eb8045986220ca3e / #122811. The previous version
failed tests when run with address sanitizer. The issue was that the
synthesized coff_symbol_generic object actually will be used to access a
full coff_symbol16 or coff_symbol32 struct, see
DefinedCOFF::getCOFFSymbol. Therefore, we need to make a copy of the
full size of either of them.
ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b tried to fix cases when LLD
links what seems to be import library header objects from MSVC. However,
the fix seems incorrect; the review at https://reviews.llvm.org/D133627
concluded that if this (treating this kind of symbol as a common symbol)
is what link.exe does, it's fine.
However, this is most probably not what link.exe does. The symbol
mentioned in the commit message of
ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b would be a common symbol with a
size of around 3 GB; this is not what might have been intended.
That commit tried to avoid running into the error ".idata$4 should not
refer to special section 0"; that issue is fixed for a similar style of
section symbols in 4a4a8a1476b1386b523dc5b292ba9a5a6748a9cf.
Therefore, revert ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b and extend
the fix from 4a4a8a1476b1386b523dc5b292ba9a5a6748a9cf to also work for
the section symbols in MSVC generated import libraries.
The main detail about them, is that for symbols of type
IMAGE_SYM_CLASS_SECTION, the Value field is not an offset, but it is an
optional set of flags, corresponding to the Characteristics of the
section header (although it may be empty).
This change implements support for the /stub flag to align with MS
link.exe. This option is useful when a program needs to optimize the DOS
program that executes when the PE runs on DOS, avoiding the traditional
hardcoded DOS program in LLD.
When LLD links against an import library (for the regular, short import
libraries), it doesn't actually link in the header/trailer object files
at all, but synthesizes new corresponding data structures into the right
sections.
If the whole of such an import library is forced to be linked, e.g. with
the -wholearchive: option, we actually end up linking in those
header/trailer objects. The header objects contain a construct which LLD
fails to handle; previously we'd error out with the error ".idata$4
should not refer to special section 0".
Within the import library header object, in the import directory we have
relocations towards the IAT (.idata$4 and .idata$5), but the header
object itself doesn't contain any data for those sections.
In the case of GNU generated import libraries, the header objects
contain zero length sections .idata$4 and .idata$5, with relocations
against them. However in the case of LLVM generated import libraries,
the sections .idata$4 and .idata$5 are not included in the list of
sections. The symbol table does contain section symbols for these
sections, but without any actual associated section. This can probably
be seen as a declaration of an empty section.
If the header/trailer objects of a short import library are linked
forcibly and we also reference other functions in the library, we end up
with two import directory entries for this DLL, one that gets
synthesized by LLD, and one from the actual header object file. This is
inelegant, but should be acceptable.
While it would seem unusual to link import libraries with the
-wholearchive: option, this can happen in certain scenarios.
Rust builds libraries that contain relevant import libraries bundled
along with compiled Rust code as regular object files, all within one
single archive. Such an archive can then end up linked with the
-wholarchive: option, if build systems decide to use such an option for
including static libraries.
This should fix https://github.com/msys2/MINGW-packages/issues/21017.
This works for the header/trailer object files in import libraries
generated by LLVM; import libraries generated by MSVC are vaguely
different. ecb5ea6a266d5cc4e05252f6db4c73613b73cc3b did an attempt at
fixing the issue for MSVC generated libraries, but it's not entirely
correct, and isn't enough for making things work for that case.
Move `LinkerDriver::addUndefined` to` SymbolTable` to allow its use with
both symbol tables on ARM64X and rename it to `addGCRoot` to clarify its
distinct role compared to the existing `SymbolTable::addUndefined`.
Command-line `-include` arguments now apply to the EC symbol table, with
`mainSymtab` introduced in `linkerMain`. There will be more similar
cases. For `.drectve` sections, the corresponding symbol table is used
based on the context.
This change ensures the load config in the hybrid image view is handled
correctly. It introduces a new Arm64XRelocVal class to abstract
relocation values, allowing them to be relative to a symbol. This class
will also be useful for managing ARM64X relocation offsets in the
future.