299 Commits

Author SHA1 Message Date
Hervé Poussineau
3b67383c6c
[MC][Mips] Generate required IMAGE_REL_MIPS_PAIR relocation (#120876)
Add the required IMAGE_REL_MIPS_PAIR relocation after
IMAGE_REL_MIPS_REFHI/IMAGE_REL_MIPS_SECRELHI

Microsoft PE/COFF specification says that the IMAGE_REL_MIPS_REFHI
relocation contains "the high 16 bits of the target's 32-bit virtual
address. [...] This relocation must be immediately followed by a PAIR
relocation whose SymbolTableIndex contains a 16-bit displacement which
is added to the upper 16 bits taken from the location being relocated."

Microsoft PE/COFF specification says that the IMAGE_REL_MIPS_SECRELHI
relocation contains "the high 16 bits of the 32-bit offset of the target
from the beginning of its section. A PAIR relocation must immediately
follow this on. The SymbolTableIndex of the PAIR relocation contains a
16-bit displacement, which is added to the upper 16 bits taken from the
location being relocated."

Behavior has been checked against Microsoft C compiler for MIPS.
2025-01-20 14:54:43 +08:00
Daniel Paoliello
ac2165fe7b
[coff] Don't try to write the obj if the assembler has errors (#123007)
The ASAN and MSAN tests have been failing after #122777 because some
fields are now set in `executePostLayoutBinding` which is skipped by the
assembler if it had errors but read in `writeObject`

Since the compilation has failed anyway, skip `writeObject` if the
assembler had errors.
2025-01-15 09:27:11 -08:00
Daniel Paoliello
283dca56f8
Reapply "[aarch64][win] Add support for import call optimization (equivalent to MSVC /d2ImportCallOptimization) (#121516)" (#122777)
This reverts commit 2f7ade4b5e399962e18f5f9a0ab0b7335deece51.

Fix is available in #122762
2025-01-13 14:00:14 -08:00
Kirill Stoimenov
2f7ade4b5e Revert "[aarch64][win] Add support for import call optimization (equivalent to MSVC /d2ImportCallOptimization) (#121516)"
Breaks sanitizer build: https://lab.llvm.org/buildbot/#/builders/52/builds/5179

This reverts commits:
5ee0a71df919a328c714e25f0935c21e586cc18b
d997a722c194feec5f3a94dec5acdce59ac5e55b
2025-01-13 19:09:01 +00:00
Daniel Paoliello
5ee0a71df9
[aarch64][win] Add support for import call optimization (equivalent to MSVC /d2ImportCallOptimization) (#121516)
This change implements import call optimization for AArch64 Windows
(equivalent to the undocumented MSVC `/d2ImportCallOptimization` flag).

Import call optimization adds additional data to the binary which can be
used by the Windows kernel loader to rewrite indirect calls to imported
functions as direct calls. It uses the same [Dynamic Value Relocation
Table mechanism that was leveraged on x64 to implement
`/d2GuardRetpoline`](https://techcommunity.microsoft.com/blog/windowsosplatform/mitigating-spectre-variant-2-with-retpoline-on-windows/295618).

The change to the obj file is to add a new `.impcall` section with the
following layout:
```cpp
  // Per section that contains calls to imported functions:
  //  uint32_t SectionSize: Size in bytes for information in this section.
  //  uint32_t Section Number
  //  Per call to imported function in section:
  //    uint32_t Kind: the kind of imported function.
  //    uint32_t BranchOffset: the offset of the branch instruction in its
  //                            parent section.
  //    uint32_t TargetSymbolId: the symbol id of the called function.
```

NOTE: If the import call optimization feature is enabled, then the
`.impcall` section must be emitted, even if there are no calls to
imported functions.

The implementation is split across a few parts of LLVM:
* During AArch64 instruction selection, the `GlobalValue` for each call
to a global is recorded into the Extra Information for that node.
* During lowering to machine instructions, the called global value for
each call is noted in its containing `MachineFunction`.
* During AArch64 asm printing, if the import call optimization feature
is enabled:
- A (new) `.impcall` directive is emitted for each call to an imported
function.
- The `.impcall` section is emitted with its magic header (but is not
filled in).
* During COFF object writing, the `.impcall` section is filled in based
on each `.impcall` directive that were encountered.

The `.impcall` section can only be filled in when we are writing the
COFF object as it requires the actual section numbers, which are only
assigned at that point (i.e., they don't exist during asm printing).

I had tried to avoid using the Extra Information during instruction
selection and instead implement this either purely during asm printing
or in a `MachineFunctionPass` (as suggested in [on the
forums](https://discourse.llvm.org/t/design-gathering-locations-of-instructions-to-emit-into-a-section/83729/3))
but this was not possible due to how loading and calling an imported
function works on AArch64. Specifically, they are emitted as `ADRP` +
`LDR` (to load the symbol) then a `BR` (to do the call), so at the point
when we have machine instructions, we would have to work backwards
through the instructions to discover what is being called. An initial
prototype did work by inspecting instructions; however, it didn't
correctly handle the case where the same function was called twice in a
row, which caused LLVM to elide the `ADRP` + `LDR` and reuse the
previously loaded address. Worse than that, sometimes for the
double-call case LLVM decided to spill the loaded address to the stack
and then reload it before making the second call. So, instead of trying
to implement logic to discover where the value in a register came from,
I instead recorded the symbol being called at the last place where it
was easy to do: instruction selection.
2025-01-11 21:30:17 -08:00
Fangrui Song
7b86fbbab7 [MC] Remove redundant MCSection::empty check. NFC
The section cannot be empty due to allocInitialFragment.
2024-12-21 23:02:45 -08:00
Kazu Hirata
d73d5c8c9b
[MC] Remove unused includes (NFC) (#116317)
Identified with misc-include-cleaner.
2024-11-15 07:26:22 -08:00
Fangrui Song
ae3c85a708 MCAssembler: Move CGProfile to MCObjectWriter 2024-07-22 21:56:45 -07:00
Fangrui Song
219d80bcb7 MCAssembler: Move FileNames and CompilerVersion to MCObjectWriter 2024-07-22 20:20:32 -07:00
Fangrui Song
9539a77960 [MC] Export llvm::WinCOFFObjectWriter and access it from MCWinCOFFStreamer
Similar to commit 28fcafb50274be2520117eacb0a886adafefe59d (2011) for
MachObjectWriter. MCWinCOFFStreamer can now access WinCOFFObjectWriter
directly without holding object file format specific inforamtion in
MCAssembler (e.g. IncrementalLinkerCompatible).
2024-07-21 12:04:47 -07:00
Fangrui Song
b75453bc07 MCAssembler: Remove unneeded non-const iterators for Sections and misleading size()
The pointers cannot be mutated even if the dereferenced MCSection can.
2024-07-05 15:42:38 -07:00
Fangrui Song
fdd04e8c0c WinCOFFObjectWriter: replace the MCAsmLayout parameter with MCAssembler 2024-07-01 17:21:14 -07:00
Fangrui Song
dbf12b2f77 [MC] Remove MCAsmLayout::{getSymbolOffset,getBaseSymbol}
The MCAsmLayout::* forwarders added by
67957a45ee1ec42ae1671cdbfa0d73127346cc95 have all been removed.
2024-07-01 11:51:26 -07:00
Fangrui Song
a40ca78bb9 [MC] Remove MCAsmLayout::{getSectionFileSize,getSectionAddressSize} 2024-07-01 11:27:32 -07:00
Fangrui Song
6b707a8cc1 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::executePostLayoutBinding 2024-07-01 10:47:46 -07:00
Fangrui Song
23e6224374 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::{writeObject,writeSectionData} 2024-07-01 10:04:59 -07:00
Fangrui Song
4289c422a8 [MC] Remove the MCAsmLayout parameter from MCObjectWriter::recordRelocation 2024-06-30 22:13:54 -07:00
Fangrui Song
67957a45ee [MC] Start merging MCAsmLayout into MCAssembler
Follow-up to 10c894cffd0f4bef21b54a43b5780240532e44cf.

MCAsmLayout, introduced by ac8a95498a99eb16dff9d3d0186616645d200b6e
(2010), provides APIs to compute fragment/symbol/section offsets.
The separate class is cumbersome and passing it around has overhead.
Let's remove it as the underlying implementation is tightly coupled with
MCAsmLayout anyway.

Some forwarders are added to ease migration.
2024-06-30 16:10:27 -07:00
Fangrui Song
bc6d925528 [MC] Simplify isSymbolRefDifferenceFullyResolvedImpl overloads. NFC
The base implementation is simple. Just inline it.
2024-06-29 16:10:33 -07:00
Fangrui Song
04c27852e4
[MC,COFF] Change how we handle section symbols
13a79bbfe583e1d8cc85d241b580907260065eb8 (2017) unified `BeginSymbol` and
section symbol for ELF. This patch does the same for COFF.

* In getCOFFSection, all sections now have a `BeginSymbol` (section
  symbol). We do not need a dummy symbol name when `getBeginSymbol` is
  needed (used by AsmParser::Run and DWARF generation).
* Section symbols are in the global symbol table. `call .text` will
  reference the section symbol instead of an undefined symbol. This
  matches GNU assembler. Unlike GNU, redefining the section symbol will
  cause a "symbol 'foo0' is already defined" error (see
  `section-sym-err.s`).

Pull Request: https://github.com/llvm/llvm-project/pull/96459
2024-06-25 14:00:47 -07:00
Fangrui Song
de0d4827ee [MC,COFF] Register .llvm.call-graph-profile in finalizeImpl
All sections should have been created before MCAssembler::layout so that
every section has an ordinal. Registering the section in
WinCOFFObjectWriter::executePostLayoutBinding is a hack.
2024-06-23 12:44:31 -07:00
aengelke
46beeaa394
[MC] Remove SectionKind from MCSection (#96067)
There are only three actual uses of the section kind in MCSection:
isText(), XCOFF, and WebAssembly. Store isText() in the MCSection, and
store other info in the actual section variants where required.

ELF and COFF flags also encode all relevant information, so for these
two section variants, remove the SectionKind parameter entirely.

This allows to remove the string switch (which is unnecessary and
inaccurate) from createELFSectionImpl. This was introduced in
[D133456](https://reviews.llvm.org/D133456), but apparently, it was
never hit for non-writable sections anyway and the resulting kind was
never used.
2024-06-20 10:52:49 +02:00
Fangrui Song
f808abf508
[MC] Add MCFragment allocation helpers
`allocFragment` might be changed to a placement new when the allocation
strategy changes.

`allocInitialFragment` is to deduplicate the following pattern
```
  auto *F = new MCDataFragment();
  Result->addFragment(*F);
  F->setParent(Result);
```

Pull Request: https://github.com/llvm/llvm-project/pull/95197
2024-06-14 09:39:32 -07:00
Fangrui Song
cb63abca27 [MC] Remove getFragmentList uses. NFC 2024-06-10 18:27:34 -07:00
Billy Laws
46b853a82c
[MC][COFF][AArch64] Treat ARM64EC/X as ARM64 for relocations (#86019)
Since ARM64EC/X objects use regular ARM64 relocations, any special
handling must be done for them too.
2024-03-22 15:17:06 +01:00
Kazu Hirata
586ecdf205
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-11 21:01:36 -08:00
Kazu Hirata
4a0ccfa865 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-12 21:21:45 -07:00
Hiroshi Yamauchi
1e7f592a89 [MC][COFF][AArch64] Avoid incorrect IMAGE_REL_ARM64_BRANCH26 relocations.
For a b/bl instruction that branches a temporary symbol (private
assembly label), an IMAGE_REL_ARM64_BRANCH26 relocation to another
symbol + a non-zero offset could be emitted but the linkers don't
support this type of relocation and cause incorrect relocations and
crashes. Avoid emitting this type of relocations.

Differential Revision: https://reviews.llvm.org/D155732
2023-08-17 14:31:31 -07:00
Haohai Wen
82dff24bde Reland [COFF] Support -gsplit-dwarf for COFF on Windows
This relands 3eee5aa528abd67bb6d057e25ce1980d0d38c445 with fixes.
2023-06-26 15:48:38 +08:00
Nico Weber
b851308b87 Revert "[COFF] Support -gsplit-dwarf for COFF on Windows"
This reverts commit 3eee5aa528abd67bb6d057e25ce1980d0d38c445.

Breaks tests on mac, see https://reviews.llvm.org/D152785#4447118
2023-06-25 14:32:36 -04:00
Haohai Wen
3eee5aa528 [COFF] Support -gsplit-dwarf for COFF on Windows
D152340 has split WinCOFFObjectWriter to WinCOFFWriter. This patch adds
another WinCOFFWriter as DwoWriter to write Dwo sections to dwo file.
Driver options are also updated accordingly to support -gsplit-dwarf in
CL mode.

e.g. $ clang-cl  -c -gdwarf -gsplit-dwarf foo.c

Like what -gsplit-dwarf did in ELF, using this option will create DWARF object
(.dwo) file. DWARF debug info is split between COFF object and DWARF object
file. It can reduce the executable file size especially for large project.

Reviewed By: skan, MaskRay

Differential Revision: https://reviews.llvm.org/D152785
2023-06-25 11:54:39 +08:00
Haohai Wen
56f3da5917 [NFC][COFF] Split WinCOFFObjectWriter to WinCOFFWriter
We'd like to support -gsplit-dwarf for Windows COFF. It requires to
write Dwo and NonDwo sections to different output streams.The original
implementation is not designed to do that and there can be only one
MCObjectWriter. This patch split WinCOFFObjectWriter to WinCOFFWriter so
that:
  1. WinCOFFObjectWriter can create multiple WinCOFFWriter.
  2. Each WinCOFFWriter can separately collect sections it is interested.
  3. Each WinCOFFWriter can write to it's own output stream.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D152340
2023-06-09 10:03:49 +08:00
Eli Friedman
7198baccda [COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols
This is mostly useful for ARM64EC, which uses such symbols extensively.

One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition). This handling is
currently restricted to weak_anti_dep symbols, because we depend on the
current behavior of resolving weak symbols in some cases.

Differential Revision: https://reviews.llvm.org/D145208
2023-06-07 11:07:21 -07:00
Haohai Wen
050227004c [NFC][COFF] Refine access specifiers for WinCOFFObjectWriter
Set public specifiers only for constructor and inherited methods from
MCObjectWriter and leave others as private. Also change the order of
MCObjectWriter methods' definition according to it's declaration order.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D152229
2023-06-06 21:01:45 +08:00
Haohai Wen
b56c439d7d [NFC][COFF] clang-format WinCOFFObjectWriter and MCWinCOFFObjectWriter
Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D152119
2023-06-05 13:42:01 +08:00
Haohai Wen
01cc38843d [NFC][COFF] Use COFFSection.MCSection when writeSection
Each COFFSection bind MCSection when created. No need to iterate
throught MCAssembler when writeSection.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D151793
2023-06-05 09:45:56 +08:00
Zequan Wu
439f804c47 Revert "[COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols"
This reverts commit 10c17c97ebaf81ac26f6830e51a7a57ddcf63cd2. It causes undefined symbol error on chromium windows build. A small repro was uploaded to the code review.
2023-04-27 10:01:56 -04:00
Akshay Khadse
c9f6912a3a [Coverity] Fix uninitialized scalar members in MC
This change fixes static code analysis errors

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D148814
2023-04-23 15:29:52 +08:00
Akshay Khadse
65d4d62ab7 Fix uninitialized pointer members in MC
Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D148421
2023-04-18 17:45:58 +08:00
Eli Friedman
10c17c97eb [COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols
This is mostly useful for ARM64EC, which uses such symbols extensively.

One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition).  This required a few
changes to the way we handle weak symbols on Windows.

Differential Revision: https://reviews.llvm.org/D145208
2023-04-17 13:17:25 -07:00
Arthur Eubanks
29a88f991b Revert "[COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols"
This reverts commit fffdb7eac58b4efde5e23c1281e7a7f93a42d280.

Causes crashes, see https://reviews.llvm.org/D145208
2023-04-13 09:09:36 -07:00
Eli Friedman
fffdb7eac5 [COFF] Add MC support for emitting IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY symbols
This is mostly useful for ARM64EC, which uses such symbols extensively.

One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition).  This required a few
changes to the way we handle weak symbols on Windows.

Differential Revision: https://reviews.llvm.org/D145208
2023-04-07 14:05:45 -07:00
Kazu Hirata
398af9b43b [llvm] Use *{Map,Set}::contains (NFC) 2023-03-15 18:06:32 -07:00
serge-sans-paille
38818b60c5
Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part
Use deduction guides instead of helper functions.

The only non-automatic changes have been:

1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t*), (uint8_t*))
2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There was a few similar situation across the codebase.
3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated.
4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that).

Per reviewers' comment, some useless makeArrayRef have been removed in the process.

This is a follow-up to https://reviews.llvm.org/D140896 that introduced
the deduction guides.

Differential Revision: https://reviews.llvm.org/D140955
2023-01-05 14:11:08 +01:00
Guillaume Chatelet
e647b4f519 [reland][Alignment][NFC] Use the Align type in MCSection
Differential Revision: https://reviews.llvm.org/D138653
2022-11-24 13:19:18 +00:00
Guillaume Chatelet
3467f9c7d6 Revert D138653 [Alignment][NFC] Use the Align type in MCSection"
This breaks the bolt project.
This reverts commit 409f0dc4a420db1c6b259d5ae965a070c169d930.
2022-11-24 12:42:30 +00:00
Guillaume Chatelet
409f0dc4a4 [Alignment][NFC] Use the Align type in MCSection
Differential Revision: https://reviews.llvm.org/D138653
2022-11-24 12:32:58 +00:00
Fangrui Song
d8162a7196 [MC] .addrsig_sym: ignore unregistered symbols
.addrsig_sym forces registering the symbol regardless whether it is otherwise
registered. This creates an undefined symbol which is inconvenient/undesired:

* `extern int x; void f() { (void)x; }` has inconsistent behavior whether `x` is emitted as an undefined symbol.
  `-O0 -faddrsig` makes `x` undefined while other -O levels and -fno-addrsig eliminate the symbol.
* In ThinLTO, after a non-prevailing linkonce_odr definition is converted to available_externally, and then a declaration,
  the addrsig code emits a symbol while the symbol is otherwise unseen.

D135427 fixed a bug that a non-prevailing `__cxx_global_var_init` was
incorrectly retained. However, the IR declaration causes an undesired
`.addrsig_sym __cxx_global_var_init`. This can be addressed in a way similar
to D101512 (`isTransitiveUsedByMetadataOnly`) but the increased
`OutStreamer->emitAddrsigSym(getSymbol(&GV));` complexity makes me nervous.
Just ignoring unregistered symbols circumvents the problem.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D135642
2022-10-11 15:07:14 -07:00
Tim Besard
c71d77876f [MC] Avoid UAF in WinCOFFObjectWriter with weak symbols.
When using weak symbols, the WinCOFFObjectWriter keeps a list (`WeakDefaults`)
that's used to make names unique. This list should be reset when the object
writer is reset, because otherwise reuse of the object writer can result in
freed symbols being accessed. With some added output, this becomes clear when
using `llc` in `--run-twice` mode:

```
$ ./llc --compile-twice -mtriple=x86_64-pc-win32 trivial.ll -filetype=obj

DefineSymbol::WeakDefaults
 - .weak.foo.default
 - .weak.bar.default

DefineSymbol::WeakDefaults
 - .weak.foo.default
 - áÑJij⌂  p§┼Ø┐☺
 - .debug_macinfo.dw
 - .weak.bar.default
```

This does not seem to leak into the output object file though, so I couldn't
come up with a test. I added one that just does `--run-twice` (and verified
that it does access freed memory), which should result in detecting the
invalid memory accesses when running under ASAN.

Observed in a Julia PR where we started using weak symbols:
https://github.com/JuliaLang/julia/pull/45649

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D129840
2022-07-16 13:24:08 +03:00
Martin Storsjö
298e9cac92 [MC] [Win64EH] Check that the SEH unwind opcodes match the actual instructions
It's a fairly common issue that the generating code incorrectly marks
instructions as narrow or wide; check that the instruction lengths
add up to the expected value, and error out if it doesn't. This allows
catching code generation bugs.

Also check that prologs and epilogs are properly terminated, to
catch other code generation issues.

Differential Revision: https://reviews.llvm.org/D125647
2022-06-01 11:25:49 +03:00