This commit adds support for WebAssembly's custom-page-sizes proposal to
`wasm-ld`. An overview of the proposal can be found
[here](https://github.com/WebAssembly/custom-page-sizes/blob/main/proposals/custom-page-sizes/Overview.md).
In a sentence, it allows customizing a Wasm memory's page size, enabling
Wasm to target environments with less than 64KiB of memory (the default
Wasm page size) available for Wasm memories.
This commit contains the following:
* Adds a `--page-size=N` CLI flag to `wasm-ld` for configuring the
linked Wasm binary's linear memory's page size.
* When the page size is configured to a non-default value, then the
final Wasm binary will use the encodings defined in the
custom-page-sizes proposal to declare the linear memory's page size.
* Defines a `__wasm_first_page_end` symbol, whose address points to the
first page in the Wasm linear memory, a.k.a. is the Wasm memory's page
size. This allows writing code that is compatible with any page size,
and doesn't require re-compiling its object code. At the same time,
because it just lowers to a constant rather than a memory access or
something, it enables link-time optimization.
* Adds tests for these new features.
r? @sbc100
cc @sunfishcode
Some tools (e.g. Rust tooling) produce element segment descriptors with
neither
elemkind or element type descriptors, but with init exprs instead of
func indices
(this is with the flags value of 4 in
https://webassembly.github.io/spec/core/binary/modules.html#element-section).
LLVM doesn't fully model reference types or the various ways to
initialize element
segments, but we do want to correctly parse and skip over all type
sections, so
this change updates the object parser to handle that case, and refactors
for more
clarity.
The test file is updated to include one additional elem segment with a
flags value
of 4, an initializer value of (32.const 0) and an empty vector.
Also support parsing files that export imported (undefined) functions.
There are only three actual uses of the section kind in MCSection:
isText(), XCOFF, and WebAssembly. Store isText() in the MCSection, and
store other info in the actual section variants where required.
ELF and COFF flags also encode all relevant information, so for these
two section variants, remove the SectionKind parameter entirely.
This allows to remove the string switch (which is unnecessary and
inaccurate) from createELFSectionImpl. This was introduced in
[D133456](https://reviews.llvm.org/D133456), but apparently, it was
never hit for non-writable sections anyway and the resulting kind was
never used.
Fragments are allocated with `operator new` and stored in an ilist with
Prev/Next/Parent pointers. A more efficient representation would be an
array of fragments without the overhead of Prev/Next pointers.
As the first step, replace ilist with singly-linked lists.
* `getPrevNode` uses have been eliminated by previous changes.
* The last use of the `Prev` pointer remains: for each subsection, there is an insertion point and
the current insertion point is stored at `CurInsertionPoint`.
* `HexagonAsmBackend::finishLayout` needs a backward iterator. Save all
fragments within `Frags`. Hexagon programs are usually small, and the
performance does not matter that much.
To eliminate `Prev`, change the subsection representation to
singly-linked lists for subsections and a pointer to the active
singly-linked list. The fragments from all subsections will be chained
together at layout time.
Since fragment lists are disconnected before layout time, we can remove
`MCFragment::SubsectionNumber` (https://reviews.llvm.org/D69411). The
current implementation of `AttemptToFoldSymbolOffsetDifference` requires
future improvement for robustness.
Pull Request: https://github.com/llvm/llvm-project/pull/95077
This change allows a WasmObjectFile to be created from a wasm file even
if it uses typed funcrefs and GC types. It does not significantly change how
lib/Object models its various internal types (e.g. WasmSignature,
WasmElemSegment), so LLVM does not really "support" or understand such
files, but it is sufficient to parse the type, global and element sections, discarding
types that are not understood. This is useful for low-level binary tools such as
nm and objcopy, which use only limited aspects of the binary (such as function
definitions) or deal with sections as opaque blobs.
This is done by allowing `WasmValType` to have a value of `OTHERREF`
(representing any unmodeled reference type), and adding a field to
`WasmSignature` indicating it's a placeholder for an unmodeled reference
type (since there is a 1:1 correspondence between WasmSignature objects
and types in the type section).
Then the object file parsers for the type and element sections are expanded
to parse encoded reference types and discard any unmodeled fields.
LLVM models some features found in the binary format with raw integers
and others with nested or enumerated types. This PR switches modeling of
tables and segments to use wasm::ValType rather than uint32_t. This NFC
change is in preparation for modeling more reference types, but IMO is
also cleaner and closer to the spec.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
Annotation attributes may be attached to a function to mark it with
custom data that will be contained in the final Wasm file. The
annotation causes a custom section named
"func_attr.annotate.<name>.<arg0>.<arg1>..." to be created that will
contain each function's index value that was marked with the annotation.
A new patchable relocation type for function indexes had to be created so
the custom section could be updated during linking.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D150803
Warn if `.size` is specified for a function symbol. The size of a
function symbol is determined solely by its content.
I noticed this simplification was possible while debugging #57427, but
this change doesn't fix that specific issue.
Differential Revision: https://reviews.llvm.org/D132929
This change continues to lay the ground work for supporting extended
const expressions in the linker.
The included test covers object file reading and writing and the YAML
representation.
Differential Revision: https://reviews.llvm.org/D121349
There's a few relevant forward declarations in there that may require downstream
adding explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
Which is significant and backs up the change in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
[NFC] This patch fixes URLs containing "master". Old URLs were either broken or
redirecting to the new URL.
Reviewed By: #libc, ldionne, mehdi_amini
Differential Revision: https://reviews.llvm.org/D113186
Emit __clangast in custom section instead of named data segment
to find it while iterating sections.
This could be avoided if all data segements (the wasm sense) were
represented as their own sections (in the llvm sense).
This can be resolved by https://github.com/WebAssembly/tool-conventions/issues/138
And the on-disk hashtable in clangast needs to be aligned by 4 bytes,
so add paddings in name length field in custom section header.
The length of clangast section name can be represented in 1 byte
by leb128, and possible maximum pads are 3 bytes, so the section
name length won't be invalid in theory.
Fixes https://bugs.llvm.org/show_bug.cgi?id=35928
Differential Revision: https://reviews.llvm.org/D74531
This removes `WasmTagType`. `WasmTagType` contained an attribute and a
signature index:
```
struct WasmTagType {
uint8_t Attribute;
uint32_t SigIndex;
};
```
Currently the attribute field is not used and reserved for future use,
and always 0. And that this class contains `SigIndex` as its property is
a little weird in the place, because the tag type's signature index is
not an inherent property of a tag but rather a reference to another
section that changes after linking. This makes tag handling in the
linker also weird that tag-related methods are taking both `WasmTagType`
and `WasmSignature` even though `WasmTagType` contains a signature
index. This is because the signature index changes in linking so it
doesn't have any info at this point. This instead moves `SigIndex` to
`struct WasmTag` itself, as we did for `struct WasmFunction` in D111104.
In this CL, in lib/MC and lib/Object, this now treats tag types in the
same way as function types. Also in YAML, this removes `struct Tag`,
because now it only contains the tag index. Also tags set `SigIndex` in
`WasmImport` union, as functions do.
I think this makes things simpler and makes tag handling more in line
with function handling. These two shares similar properties in that both
of them have signatures, but they are kind of nominal so having the same
signature doesn't mean they are the same element.
Also a drive-by fix: the reserved 'attirubute' part's encoding changed
from uleb32 to uint8 a while ago. This was fixed in lib/MC and
lib/Object but not in YAML. This doesn't change object files because the
field's value is always 0 and its encoding is the same for the both
encoding.
This is effectively NFC; I didn't mark it as such just because it
changed YAML test results.
Reviewed By: sbc100, tlively
Differential Revision: https://reviews.llvm.org/D111086
We previously had a limitation that TLS variables could not
be exported (and therefore could also not be imported). This
change removed that limitation.
Differential Revision: https://reviews.llvm.org/D108877
This add support for SjLj using Wasm exception handling instructions:
https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md
This does not yet support the mixed use of EH and SjLj within a
function. It will be added in a follow-up CL.
This currently passes all SjLj Emscripten tests for wasm0/1/2/3/s,
except for the below:
- `test_longjmp_standalone`: Uses Node
- `test_dlfcn_longjmp`: Uses NodeRAWFS
- `test_longjmp_throw`: Mixes EH and SjLj
- `test_exceptions_longjmp1`: Mixes EH and SjLj
- `test_exceptions_longjmp2`: Mixes EH and SjLj
- `test_exceptions_longjmp3`: Mixes EH and SjLj
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D108960
Debug info sections need R_WASM_FUNCTION_OFFSET_I32 relocs (with FK_Data_4 fixup
kinds) to refer to functions (instead of R_WASM_TABLE_INDEX as is used in data
sections). Usually this is done in a convoluted way, with unnamed temp data
symbols which target the start of the function, in which case
WasmObjectWriter::recordRelocation converts it to use the section symbol
instead. However in some cases the function can actually be undefined; in this
case the dwarf generator uses the function symbol (a named undefined function
symbol) instead. In that case the section-symbol transform doesn't work and we
need to generate the correct reloc type a different way. In this change
WebAssemblyWasmObjectWriter::getRelocType takes the fixup section type into
account to choose the correct reloc type.
Fixes PR50408
Differential Revision: https://reviews.llvm.org/D103557
This changes the encoding of the `attribute` field, which currently only
contains the value `0` denoting this tag is for an exception, from
`varuint32` to `uint8`. This field is effectively unused at the moment
and reserved for future use, and it is not likely to need `varuint32`
even in future.
See https://github.com/WebAssembly/exception-handling/pull/162.
This does not change any encoded binaries because `0` is encoded in the
same way both in `varuint32` and `uint8`.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D104571
__table_base is know 64-bit, since in LLVM it represents a function pointer offset
__table_base32 is a copy in wasm32 for use in elem init expr, since no truncation may be used there.
New reloc R_WASM_TABLE_INDEX_REL_SLEB64 added
Differential Revision: https://reviews.llvm.org/D101784
This change was originally landed in: 5000a1b4b9edeb9e994f2a5b36da8d48599bea49
It was reverted in: 061e071d8c9b98526f35cad55a918a4f1615afd4
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
This reverts commit 5000a1b4b9edeb9e994f2a5b36da8d48599bea49.
Breaks tests, see https://reviews.llvm.org/D97657#2749151
Easily repros locally with `ninja check-llvm-mc-webassembly`.
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657