46 Commits

Author SHA1 Message Date
Fangrui Song
5c3c0a8cec [ELF] Replace inExpr with lexState. NFC
We may add another state State::Wild to behave more like GNU ld.
2025-02-01 15:49:08 -08:00
Fangrui Song
483516fd83 [ELF] Remove unneeded Twine()
2024-11-16 20:32:44 -08:00
Fangrui Song
e24457a330 [ELF] Migrate away from global ctx
2024-11-14 22:17:10 -08:00
Fangrui Song
ed6c106e6a [ELF] Replace errorCount with errCount(ctx)
to reduce reliance on the global context.
2024-11-07 09:06:01 -08:00
Fangrui Song
09c2c5e1e9 [ELF] Replace error(...) with ErrAlways or Err
Most are migrated to ErrAlways mechanically.
In the future we should change most to Err.
2024-11-06 22:04:52 -08:00
Fangrui Song
cf57a670bb [ELF] ScriptParser: pass Ctx to ScriptParser and ScriptLexer. NFC
2024-09-21 11:06:06 -07:00
Fangrui Song
a7e8bddfc1 [ELF] Respect --sysroot for INCLUDE
If an included script is under the sysroot directory, when it opens an
absolute path file (`INPUT` or `GROUP`), add sysroot before the absolute
path. When the included script ends, the `isUnderSysroot` state is
restored.
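
The save-and-restore behavior described above can be sketched as follows. This is a minimal illustration, not the lld implementation: `ScriptState`, `underSysroot`, `resolveInput`, and `IncludeScope` are hypothetical names, and the sysroot is assumed to be given without a trailing slash.

```cpp
#include <cassert>
#include <string>

// Hypothetical parser state; the field names echo the commit message and are
// not the actual lld data structures.
struct ScriptState {
  std::string sysroot;         // e.g. "/sr", assumed to have no trailing '/'
  bool isUnderSysroot = false; // is the script being read inside sysroot?
};

// True if `path` names a file inside the `sysroot` directory.
bool underSysroot(const std::string &sysroot, const std::string &path) {
  return !sysroot.empty() && path.size() > sysroot.size() &&
         path.compare(0, sysroot.size(), sysroot) == 0 &&
         path[sysroot.size()] == '/';
}

// Path that an INPUT or GROUP entry would open: absolute paths get the
// sysroot prepended only while the current script is under the sysroot.
std::string resolveInput(const ScriptState &st, const std::string &path) {
  if (st.isUnderSysroot && !path.empty() && path[0] == '/')
    return st.sysroot + path;
  return path;
}

// RAII guard covering one INCLUDE: updates the flag for the included script
// and restores the previous value when the included script ends.
class IncludeScope {
public:
  IncludeScope(ScriptState &st, const std::string &includedPath)
      : st(st), saved(st.isUnderSysroot) {
    assert(st.sysroot.empty() || st.sysroot.back() != '/');
    st.isUnderSysroot = underSysroot(st.sysroot, includedPath);
  }
  ~IncludeScope() { st.isUnderSysroot = saved; }
  IncludeScope(const IncludeScope &) = delete;
  IncludeScope &operator=(const IncludeScope &) = delete;

private:
  ScriptState &st;
  bool saved;
};
```

Using a scope guard means the flag is restored on every exit path from the include handling, including early returns after an error.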
2024-07-28 11:43:27 -07:00
Fangrui Song
8f72b0cb08 [ELF] Fix INCLUDE cycle detection
Fix #93947: the cycle detection mechanism added by
https://reviews.llvm.org/D37524 also disallowed including a file twice,
which is an unnecessary limitation.

Now that we have an include stack #100493, supporting multiple inclusion
is trivial. Note: a filename can be referenced with many different
paths, e.g. a.lds, ./a.lds, ././a.lds. We don't attempt to detect the
cycle at the earliest point.
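
A minimal sketch of cycle detection on top of an include stack, assuming a hypothetical `IncludeStack` type (not the lld data structure). A name is rejected only while it is still being read, so including the same file twice in sequence is allowed. Names are compared verbatim, which matches the note that `a.lds` and `./a.lds` are not unified.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the lexer's include stack: the names of the
// scripts that are currently being read, outermost first.
struct IncludeStack {
  std::vector<std::string> active;

  // Returns false if `name` is already being read (a cycle). Names are
  // compared verbatim, so "a.lds" and "./a.lds" count as different files.
  bool enter(const std::string &name) {
    if (std::find(active.begin(), active.end(), name) != active.end())
      return false;
    active.push_back(name);
    return true;
  }

  // Called when the included script reaches its end.
  void leave() {
    assert(!active.empty() && "leave() without a matching enter()");
    active.pop_back();
  }
};
```

Because spellings are not canonicalized, a cycle through differently spelled paths is only caught once a spelling repeats, which is later than the earliest possible point.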
2024-07-27 17:25:13 -07:00
Fangrui Song
9328c20cc8 [ELF] Track line number precisely
`getLineNumber` is both imprecise (when `INCLUDE` is used) and
inefficient (see https://reviews.llvm.org/D104137). Track line number
precisely now that we have `struct Buffer` abstraction from #100493.
2024-07-27 14:46:41 -07:00
Fangrui Song
2a89356d64 [ELF] Add till and rewrite while (... consume("}"))
After #100493, the idiom `while (!errorCount() && !consume("}"))` could
lead to inaccurate diagnostics or dead loops. Introduce till to change
the code pattern.
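
The loop pattern can be sketched with a toy token stream. This is an illustration under assumptions: `ToyLexer`, `skip`, and `countBody` are hypothetical, and this `till` only approximates the helper named in the commit. The key property is that the loop condition itself reports an unexpected end of input and stops, so a missing `}` cannot spin forever.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy token stream (hypothetical; pre-split tokens instead of a real lexer).
struct ToyLexer {
  std::vector<std::string> toks;
  std::size_t pos = 0;
  int errors = 0;

  explicit ToyLexer(std::vector<std::string> t) : toks(std::move(t)) {}

  bool atEOF() const { return pos >= toks.size(); }
  void skip() {
    if (!atEOF())
      ++pos;
  }

  // Returns false at EOF without reporting an error.
  bool consume(const std::string &tok) {
    if (atEOF() || toks[pos] != tok)
      return false;
    ++pos;
    return true;
  }

  // Loop condition: true while the body should run. Stops on a previous
  // error, records an error on unexpected EOF, and eats the closing token.
  bool till(const std::string &tok) {
    assert(!tok.empty());
    if (errors)
      return false;
    if (atEOF()) {
      ++errors; // stands in for an "unexpected EOF" diagnostic
      return false;
    }
    return !consume(tok);
  }
};

// Counts the tokens before the closing "}".
int countBody(ToyLexer &lex) {
  int n = 0;
  while (lex.till("}")) {
    lex.skip();
    ++n;
  }
  return n;
}
```

With the older `while (!errors && !consume("}"))` shape, an input that ends before `}` would never set an error and never consume anything, so the loop would not terminate.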
2024-07-26 17:13:37 -07:00
Fangrui Song
1978c21d96 [ELF] ScriptLexer: generate tokens lazily
The current tokenize-whole-file approach has a few limitations.

* Lack of state information: `maybeSplitExpr` is needed to parse
  expressions. It's infeasible to add new states to behave more like GNU
  ld.
* `readInclude` may insert tokens in the middle, leading to a time
  complexity issue with N-nested `INCLUDE`.
* line/column information for diagnostics is inaccurate, especially
  after an `INCLUDE`.
* `getLineNumber` cannot be made more efficient without significant code
  complexity and memory consumption. https://reviews.llvm.org/D104137

The patch switches to a traditional lexer that generates tokens lazily.

* `atEOF` behavior is modified: we need to call `peek` to determine EOF.
* `peek` and `next` cannot call `setError` upon `atEOF`.
* Since `consume` no longer reports an error upon `atEOF`, the idiom `while (!errorCount() && !consume(")"))`
  would cause a dead loop. Use `while (peek() != ")" && !atEOF()) { ... } expect(")")` instead.
* An include stack is introduced to handle `readInclude`. This can be
  utilized to address #93947 properly.
* `tokens` and `pos` are removed.
* `commandString` is reimplemented. Since it is used in -Map output,
  `\n` needs to be replaced with space.
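
A minimal sketch of a lexer that produces tokens on demand, showing the `peek`/`atEOF` interplay and the loop idiom from the list above. `LazyLexer` and `countArgs` are hypothetical, and the tokenization rules (whitespace-separated words plus a few single-character punctuators) are a simplifying assumption, not lld's rules.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

class LazyLexer {
public:
  explicit LazyLexer(std::string_view text) : rest(text) {}

  // Lexes the next token only when asked; returns "" at end of input.
  std::string_view peek() {
    if (!cached.empty())
      return cached;
    while (!rest.empty() && isSpace(rest.front()))
      rest.remove_prefix(1);
    if (rest.empty()) {
      eof = true;
      return {};
    }
    std::size_t len = 1;
    if (!isPunct(rest[0]))
      while (len < rest.size() && !isSpace(rest[len]) && !isPunct(rest[len]))
        ++len;
    cached = rest.substr(0, len);
    rest.remove_prefix(len);
    return cached;
  }

  std::string_view next() {
    std::string_view tok = peek();
    cached = {};
    return tok;
  }

  // Only meaningful after peek(): EOF is discovered by trying to lex.
  bool atEOF() const { return eof; }

  // Unlike peek/next, expect records an error on a mismatch (including EOF).
  bool expect(std::string_view tok) {
    assert(!tok.empty());
    if (next() == tok)
      return true;
    ++errors;
    return false;
  }

  int errors = 0;

private:
  static bool isSpace(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
  }
  static bool isPunct(char c) {
    return c == '(' || c == ')' || c == '{' || c == '}' || c == ';';
  }

  std::string_view rest;   // input not yet lexed
  std::string_view cached; // token returned by peek() but not yet consumed
  bool eof = false;
};

// Counts the tokens inside "( ... )" using the idiom from the commit message.
int countArgs(LazyLexer &lex) {
  lex.expect("(");
  int n = 0;
  while (lex.peek() != ")" && !lex.atEOF()) {
    lex.next();
    ++n;
  }
  lex.expect(")");
  return n;
}
```

On truncated input such as `(x y`, the loop exits through `atEOF()` and the trailing `expect(")")` is what records the error.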

Pull Request: https://github.com/llvm/llvm-project/pull/100493
2024-07-26 14:26:38 -07:00
Fangrui Song
026972af9c [ELF] Remove obsoleted comment after #99567
2024-07-25 16:45:09 -07:00
Hongyu Chen
2ae862b74b [ELF] Remove consumeLabel in ScriptLexer (#99567)
This commit removes `consumeLabel`, since `consume` provides the same
functionality.
2024-07-23 22:03:46 -07:00
Hongyu Chen
b828c13f3c [ELF] Delete peek2 in Lexer (#99790)
Thanks to Fangrui's change 28045ceab0, peek2 can be removed.
2024-07-20 16:35:38 -07:00
Fangrui Song
c93554b82a [ELF] Simplify ScriptLexer::consume. NFC
2024-07-20 14:23:54 -07:00
Fangrui Song
fae96104d4 [ELF] Support operator ^ and ^=
GNU ld added ^ support in July 2023 and it looks like ^= is planned as
well.

For now, we don't support `a^=0` (^= without a preceding space).
2023-07-15 14:10:40 -07:00
Fangrui Song
8d85c96e0e [lld] StringRef::{starts,ends}with => {starts,ends}_with. NFC
The latter form is now preferred to be similar to C++20 starts_with.
This replacement also removes one function call when startswith is not inlined.
2023-06-05 14:36:19 -07:00
Fangrui Song
0a0effdd5b [ELF] Support -= *= /= <<= >>= &= |= in symbol assignments
2022-06-25 22:22:59 -07:00
Fangrui Song
77295c5486 [ELF] Allow ? without adjacent space
GNU ld allows 1 ? 2?3:4 : 5?6 :7
2022-06-25 21:16:59 -07:00
Fangrui Song
27bb799095 [ELF] Clean up headers. NFC
2022-02-07 21:53:34 -08:00
Colin Cross
e387778722 [ELF] Optimize ScriptLexer::getLineNumber by caching the previous line number and offset
getLineNumber() was counting the number of line feeds from the start of
the buffer to the current token. For large linker scripts this became a
performance bottleneck. For one 4MB linker script, over 4 minutes were
spent in getLineNumber's StringRef::count.

Store the line number from the last token, and only count the additional
line feeds since the last token.
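
The caching idea can be sketched as below. `LineCounter` and `lineAt` are hypothetical names, and `std::string_view` stands in for `StringRef`. The counter remembers the offset and line of the previous query and counts only the line feeds between that offset and the new one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

class LineCounter {
public:
  explicit LineCounter(std::string_view buf) : buf(buf) {}

  // Returns the 1-based line number of the byte at `offset`.
  std::size_t lineAt(std::size_t offset) {
    assert(offset <= buf.size());
    if (offset < lastOffset) {
      // Queried an earlier position: fall back to counting from the start.
      lastOffset = 0;
      lastLine = 1;
    }
    // Count only the line feeds between the previous query and this one.
    lastLine += static_cast<std::size_t>(
        std::count(buf.data() + lastOffset, buf.data() + offset, '\n'));
    lastOffset = offset;
    return lastLine;
  }

private:
  std::string_view buf;
  std::size_t lastOffset = 0;
  std::size_t lastLine = 1;
};
```

When tokens are queried in increasing offset order, each byte of the buffer is scanned at most once in total, instead of once per query.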

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D104137
2021-06-22 15:35:24 -07:00
Georgii Rymar
ae4279bd3e [LLD][ELF] - Linkerscript: report location for the "unclosed comment in a linker script" error.
Currently we print "error: unclosed comment in a linker script", which doesn't
provide information about the real error location.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46793.

Differential revision: https://reviews.llvm.org/D84300
2020-07-24 11:38:26 +03:00
Fangrui Song
ac6abc99e2 [ELF] Don't cause assertion failure if --dynamic-list or --version-script takes an empty file
Fixes PR46184
Report line 1 of the last memory buffer.
2020-06-05 15:59:54 -07:00
Fangrui Song
07837b8f49 [ELF] Use namespace qualifiers (lld:: or elf::) instead of namespace lld { namespace elf {
Similar to D74882. This reverts much code from commit
bd8cfe65f5fee4ad573adc2172359c9552e8cdc0 (D68323) and fixes some
problems before D68323.

Sorry for the churn but D68323 was a mistake. Namespace qualifiers avoid
bugs where the definition does not match the declaration from the
header. See
https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions (D74515)

Differential Revision: https://reviews.llvm.org/D79982
2020-05-15 08:49:53 -07:00
Kazuaki Ishizaki
7c5fcb3591 [lld] NFC: fix trivial typos in comments
Differential Revision: https://reviews.llvm.org/D72339
2020-04-02 01:21:36 +09:00
Benjamin Kramer
adcd026838 Make llvm::StringRef to std::string conversions explicit.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.

This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.

This doesn't actually modify StringRef yet, I'll do that in a follow-up.
2020-01-28 23:25:25 +01:00
Fangrui Song
bd8cfe65f5 [ELF] Wrap things in namespace lld { namespace elf {, NFC
This makes it clear `ELF/**/*.cpp` files define things in the `lld::elf`
namespace and simplifies `elf::foo` to `foo`.

Reviewed By: atanasyan, grimar, ruiu

Differential Revision: https://reviews.llvm.org/D68323

llvm-svn: 373885
2019-10-07 08:31:18 +00:00
Rui Ueyama
3837f4273f [Coding style change] Rename variables so that they start with a lowercase letter
This patch is mechanically generated by clang-llvm-rename tool that I wrote
using Clang Refactoring Engine just for creating this patch. You can see the
source code of the tool at https://reviews.llvm.org/D64123. There's no manual
post-processing; you can generate the same patch by re-running the tool against
lld's code base.

Here is the main discussion thread to change the LLVM coding style:
https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html
In the discussion thread, I proposed we use lld as a testbed for variable
naming scheme change, and this patch does that.

I chose to rename variables so that they are in camelCase, just because that
is a minimal change to make variables start with a lowercase letter.

Note to downstream patch maintainers: if you are maintaining a downstream lld
repo, just rebasing ahead of this commit would cause massive merge conflicts
because this patch essentially changes every line in the lld subdirectory. But
there's a remedy.

clang-llvm-rename tool is a batch tool, so you can rename variables in your
downstream repo with the tool. Given that, here is how to rebase your repo to
a commit after the mass renaming:

1. rebase to the commit just before the mass variable renaming,
2. apply the tool to your downstream repo to mass-rename variables locally, and
3. rebase again to the head.

Most changes made by the tool should be identical for a downstream repo and
for the head, so at step 3, almost all changes should be merged and
disappear. I'd expect that there would be some lines that you need to merge by
hand, but there shouldn't be too many.

Differential Revision: https://reviews.llvm.org/D64121

llvm-svn: 365595
2019-07-10 05:00:37 +00:00
George Rimar
0810f16fb9 [LLD][ELF] - Linkerscript: add a support for expressions for section's filling
Imagine the script:

.section: {
...
} = FILL_EXPR

LLD assumes that FILL_EXPR is a number and does not allow
it to be an expression, though the specification allows that:
https://sourceware.org/binutils/docs-2.32/ld/Output-Section-Fill.html

This patch adds support for cases where FILL_EXPR is a simple math expression.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42482.

Differential revision: https://reviews.llvm.org/D64130

llvm-svn: 365143
2019-07-04 14:17:31 +00:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
George Rimar
a46d08ebe6 [LLD][ELF] - Do not reject INFO output section type when used with a start address.
This is https://bugs.llvm.org/show_bug.cgi?id=38625

LLD accepts this:

".stack (INFO) : {",

but not this:

".stack address_expression (INFO) :"

The patch fixes it.

Differential revision: https://reviews.llvm.org/D51027

llvm-svn: 340804
2018-08-28 08:39:21 +00:00
George Rimar
e5cd32b702 [ELF] - Remove dead code #2.
'Pos' can never be 0 here.

llvm-svn: 336436
2018-07-06 13:30:50 +00:00
George Rimar
92bd49e874 [ELF] - Remove dead code. NFC.
'Pos' can never be 0.

llvm-svn: 336435
2018-07-06 13:23:49 +00:00
George Rimar
b5d6e76bb7 [ELF] - Add a comment. NFC.
Minor follow up for r336197
"[ELF] - Add support for '||' and '&&' in linker scripts."

llvm-svn: 336199
2018-07-03 14:16:19 +00:00
George Rimar
a50054829d [ELF] - Add support for '||' and '&&' in linker scripts.
This is https://bugs.llvm.org//show_bug.cgi?id=37976.
We had no support, but it seems someone ran into it.

llvm-svn: 336197
2018-07-03 14:02:52 +00:00
Rui Ueyama
c67d6b2da0 Simplify script lexer.
Differential Revision: https://reviews.llvm.org/D41577

llvm-svn: 321453
2017-12-26 10:13:10 +00:00
Bob Haarman
b8a59c8aa5 [lld] unified COFF and ELF error handling on new Common/ErrorHandler
Summary:
The COFF linker and the ELF linker have long had similar but separate
Error.h and Error.cpp files to implement error handling. This change
introduces new error handling code in Common/ErrorHandler.h, changes the
COFF and ELF linkers to use it, and removes the old, separate
implementations.

Reviewers: ruiu

Reviewed By: ruiu

Subscribers: smeenai, jyknight, emaste, sdardis, nemanjai, nhaehnle, mgorny, javed.absar, kbarton, fedor.sergeev, llvm-commits

Differential Revision: https://reviews.llvm.org/D39259

llvm-svn: 316624
2017-10-25 22:28:38 +00:00
George Rimar
81eca18df3 [ELF] - Linkerscript: Add ~ as separate math token.
Previously we did not support the following:
foo = ~0xFF;
and had to add a space before the numeric value:
foo = ~ 0xFF

That was consistent with ld.bfd < 2.30, which shows:
script.txt:3: undefined symbol `~2' referenced in expression,
but inconsistent with gold.

It was fixed for ld.bfd 2.30 as well:
https://sourceware.org/bugzilla/show_bug.cgi?id=22267

Differential revision: https://reviews.llvm.org/D36508

llvm-svn: 315569
2017-10-12 08:40:12 +00:00
George Rimar
970e783bdd [ELF] - Fix out of sync comment. NFC.
llvm-svn: 315442
2017-10-11 08:18:53 +00:00
George Rimar
de2d1066ae [ELF] - Do not report multiple errors for single one in ScriptLexer::setError.
Previously up to 3 errors were reported at once.
With this patch we always report only one,
just like in other linker code.

Differential revision: https://reviews.llvm.org/D37015

llvm-svn: 311537
2017-08-23 08:48:39 +00:00
Hafiz Abid Qadeer
6f1d954ef4 [ELF, LinkerScript] Support ! operator in linker script.
Summary: This small patch adds support for the ! operator in linker scripts.

Reviewers: ruiu, rafael

Reviewed By: ruiu

Subscribers: meadori, grimar, emaste, llvm-commits

Differential Revision: https://reviews.llvm.org/D36451

llvm-svn: 310607
2017-08-10 15:25:47 +00:00
George Rimar
ce6080819c [ELF] - Remove ScriptLexer::Error field and check ErrorCount instead.
D35945 introduces a change after which checking the Error flag is useless
in a few places, and ErrorCount must be checked instead.

But then we can probably just check ErrorCount always. That should simplify
things. This patch does that.

Differential revision: https://reviews.llvm.org/D36266

llvm-svn: 310046
2017-08-04 10:34:14 +00:00
Rui Ueyama
f5fce48679 Handle ":" as a regular token character in linker scripts.
This is an alternative to https://reviews.llvm.org/D30500 to simplify the
version definition parser and allow ":" in symbol names.

Differential Revision: https://reviews.llvm.org/D30722

llvm-svn: 297402
2017-03-09 19:23:00 +00:00
Rui Ueyama
731a66ae98 Apply different tokenization rules to linker script expressions.
The linker script lexer is context-sensitive. In the regular context,
arithmetic operator characters are regular characters, but in the
expression context, they are independent tokens. This affects how the
lexer tokenizes "3*4", for example. (This kind of expression is real;
the Linux kernel uses it.)

This patch defines function `maybeSplitExpr`. This function splits the
current token into multiple expression tokens if the lexer is in the
expression context.
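
A minimal sketch of context-sensitive splitting. `splitExpr` and `tokenize` are hypothetical helpers, and the operator set `+-*/` is an assumption chosen for illustration, not the set that `maybeSplitExpr` recognizes. Outside the expression context a token is returned unchanged; inside it, each operator character becomes its own token.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Splits one token at every operator character (assumed set: + - * /).
std::vector<std::string> splitExpr(std::string_view tok) {
  constexpr std::string_view ops = "+-*/";
  std::vector<std::string> out;
  while (!tok.empty()) {
    std::size_t e = tok.find_first_of(ops);
    if (e == std::string_view::npos) {
      out.emplace_back(tok); // no operator left: the rest is one token
      break;
    }
    if (e != 0)
      out.emplace_back(tok.substr(0, e)); // operand before the operator
    out.emplace_back(tok.substr(e, 1));   // the operator itself
    tok.remove_prefix(e + 1);
  }
  return out;
}

// In the regular context the token is kept whole; in the expression
// context it is split into operands and operators.
std::vector<std::string> tokenize(std::string_view tok, bool inExpr) {
  assert(!tok.empty());
  if (inExpr)
    return splitExpr(tok);
  return {std::string(tok)};
}
```

For the `3*4` example from the commit message, the regular context yields the single token `3*4`, while the expression context yields `3`, `*`, `4`.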

Differential Revision: https://reviews.llvm.org/D29963

llvm-svn: 295225
2017-02-15 19:58:17 +00:00
Rui Ueyama
4c82b4f6fa Add file comments for ScriptParser.cpp.
llvm-svn: 295023
2017-02-14 04:47:24 +00:00
Rui Ueyama
794366a237 Rename ScriptParser.{cpp,h} -> ScriptLexer.{cpp,h}.
These files contain a lexer, so the new names are better.
The parser is in LinkerScript.{cpp,h}.

llvm-svn: 295022
2017-02-14 04:47:05 +00:00