113 Commits

Author SHA1 Message Date
George Rimar
e2051efcbe [ELF] - Versionscript: support mangled symbols with the same name.
This is PR30312. Info from bug page:

Both of these symbols demangle to abc::abc():
_ZN3abcC1Ev
_ZN3abcC2Ev
(These would be abc's complete object constructor and base object constructor, respectively.)
however with "abc::abc()" in the version script only one of the two receives the symbol version.

Patch fixes that.
It uses testcase created by Ed Maste (D24306).

Differential revision: https://reviews.llvm.org/D24336

llvm-svn: 281318
2016-09-13 10:45:39 +00:00
Michael J. Spencer
0cb8a70adc [ELF] Fix memory leak in BinaryFile handling.
llvm-svn: 281129
2016-09-10 01:42:43 +00:00
George Rimar
c91930a17f [ELF] - Use std::regex instead of hand written logic in elf::globMatch()
Use std::regex instead of hand written matcher.

Patch based on code and ideas of Rui Ueyama.

Differential revision: https://reviews.llvm.org/D23829

llvm-svn: 280544
2016-09-02 21:17:20 +00:00
Rafael Espindola
0509876f3f Remove redundant argument. NFC.
llvm-svn: 280243
2016-08-31 13:49:23 +00:00
Rafael Espindola
cceb92a075 Pass Binding instead of IsWeak to addBitcode.
We were computing the binding on both the caller and callee.

llvm-svn: 280156
2016-08-30 20:53:26 +00:00
Davide Italiano
35af5b3d21 [LTO] Fix the logic for dropping unnamed_addr.
Differential Revision:  https://reviews.llvm.org/D24037

llvm-svn: 280144
2016-08-30 20:15:03 +00:00
George Rimar
e1937bb524 [ELF] - Give automatically generated __start_* and __stop_* symbols default visibility.
This patch is opposite to D19024, which made this symbols to be hidden by default.

Unfortunately FreeBSD loader wants to see
start_set_modmetadata_set/stop_set_modmetadata_set in the dynamic symbol table. 
They were not placed there because had hidden visibility.

Patch makes them to have default visibility again.

Differential revision: https://reviews.llvm.org/D23552

llvm-svn: 279262
2016-08-19 15:36:32 +00:00
Rui Ueyama
dace838138 Simplify symbol version handling.
r275711 for "speedng up symbol version handling" was committed
by misunderstanding; the benchmark number was measured with
a debug build. The number with a release build didn't actually change.
This patch removes false optimizations added in that patch.

llvm-svn: 276267
2016-07-21 13:13:21 +00:00
George Rimar
b084125dbf [ELF] - Fixed integral constant overflow warning under MSVS 2015. NFC.
Under MSVS 2015 I observed integral constant overflow warning when aggregate initialization was used
to init the bit field. Patch fixes that.

llvm-svn: 276118
2016-07-20 14:26:48 +00:00
Rui Ueyama
e33579072d Remove SymbolBody::PlaceholderKind.
In the last patch for --trace-symbol, I introduced a new symbol type
PlaceholderKind and store it to SymVector storage. It made all code
that iterates over SymVector to recognize and skip PlaceholderKind
symbols. I found that that's annoying.

In this patch, I removed PlaceholderKind and stop storing them to SymVector.
Now the information whether a symbol is being watched by --trace-symbol
is stored to the Symtab hash table.

llvm-svn: 275747
2016-07-18 01:35:00 +00:00
Rui Ueyama
69c778c084 Implement almost-zero-cost --trace-symbol.
--trace-symbol is a command line option to watch a symbol.
Previosly, we looked up a hash table for a new symbol if the
option is given. Any code that looks up a hash table for each
symbol is expensive because the linker handles a lot of symbols.
In our design, we look up a hash table strictly only once
for a symbol, so --trace-symbol was an exception.

This patch improves efficiency of the option by merging the
hash table into the symbol table.

Instead of looking up a separate hash table with a string,
this patch sets `Traced` flag to symbols specified by --trace-symbol.
So, if you insert a symbol and get a symbol with `Traced` flag on,
you know that you need to print out a log message for the symbol.
This is nearly zero cost.

llvm-svn: 275716
2016-07-17 17:50:09 +00:00
Rui Ueyama
663b8c2769 Handle versioned symbols efficiently.
Versions can be assigned to symbols in two different ways.
One is the usual version scripts, and the other is special
symbol suffix '@'. If a symbol contains '@', the string after
that is considered to specify a version name.

Previously, we look for '@' for all symbols.

Anything that works on every symbol can be expensive because
the linker has to handle a lot of symbols. The search for '@'
was not an exception.

In this patch, I made two optimizations.

The first optimization is to handle '@' only when at least one
version is defined. If no versions are defined, no versions can
be assigned to any symbols, so it's waste of time to search for '@'.

The second optimization is to scan only suffixes of symbol names
instead of entire symbol names. Symbol names can be very long, but
symbol versions are usually short, so scanning entire symbol names
is waste of time, too.

There are some error cases which we no longer be able to detect
with this patch. I don't think it's a major drawback because they
are minor errors. Speed is more important.

This change improves LLD with debug info self-link time from
6.6993 seconds to 6.3426 seconds (or -5.3%).

Differential Revision: https://reviews.llvm.org/D22433

llvm-svn: 275711
2016-07-17 17:23:17 +00:00
George Rimar
50dcece2a0 Recommit r275257 "[ELF] - Implement extern "c++" version script tag"
BSD toolchain contains a bug:
https://sourceforge.net/p/elftoolchain/tickets/491/

In short demangler works differently, fix was to update the testcase.
It should fix the FreeBSD bot failture:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/19432/steps/test_lld/logs/stdio

Original commit message was:
[ELF] - Implement extern "c++" version script tag

Patch implements 'extern' version script tag.
Currently only values in quotes(") are supported.

Matching of externs is performed in the same pass as exact match of globals.

Differential revision: http://reviews.llvm.org/D21930

llvm-svn: 275682
2016-07-16 12:26:39 +00:00
George Rimar
dd64bb38bd Reverted r275257 "[ELF] - Implement extern "c++" version script tag"
It broke build bots:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/8204
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/19432

llvm-svn: 275258
2016-07-13 08:19:04 +00:00
George Rimar
e05103ea11 [ELF] - Implement extern "c++" version script tag
Patch implements 'extern' version script tag.
Currently only values in quotes(") are supported.

Matching of externs is performed in the same pass as exact match of globals.

Differential revision: http://reviews.llvm.org/D21930

llvm-svn: 275257
2016-07-13 07:46:00 +00:00
Davide Italiano
8e1131dc46 [ELF] Support for wildcard in version scripts.
Example:

VERSION_1.0 {
  global: foo*;
  local: *; }

now correctly matches all the symbols which name starts with
`foo`.

Differential Revision:  http://reviews.llvm.org/D21732

llvm-svn: 274091
2016-06-29 02:46:51 +00:00
Rui Ueyama
d60dae8a6a Implement --trace-symbol=symbol option.
Patch by Shridhar Joshi.

This option provides names of all the link time modules which define and
reference symbols requested by user. This helps to speed up application
development by detecting references causing undefined symbols.
It also helps in detecting symbols being resolved to wrong (unintended)
definitions in case of applications containing multiple definitions for
same symbols with different types, bindings.

Implements PR28226.

llvm-svn: 273536
2016-06-23 07:00:17 +00:00
Rafael Espindola
cc70da39ff Internalize symbols in comdats.
We were dropping the CanOmitFromDynSym bit when creating undefined
symbols because of comdat.

llvm-svn: 272812
2016-06-15 17:56:10 +00:00
Rafael Espindola
65c65ce897 Don't include --start-lib/--end-lib files twice.
This should never happen with correct programs, but it is trivial
write a testcase where lld would crash or report duplicated
symbols. We now behave like when an archive is used and include the
file only once.

llvm-svn: 272724
2016-06-14 21:56:36 +00:00
Peter Collingbourne
c357278a38 ELF: Remove the function SymbolTable<ELFT>::findFile.
We already have the function SymbolBody::getSourceFile which does the same thing.

llvm-svn: 268353
2016-05-03 01:48:25 +00:00
Peter Collingbourne
6a4225962d ELF: Forbid all relative relocations to absolute symbols in PIC, except for weak undefined.
Weak undefined symbols resolve to the image base. This is a little strange,
but it allows us to link function calls to such symbols. Normally such a
call will be guarded with a comparison, which will load a zero from the GOT.

There's one example of such a function call in crti.o in Linux's CRT.

As part of this change, I also needed to make the synthetic start and end
symbols image base relative in the case where their sections were empty,
so that PC-relative references to those symbols would continue to work.

Differential Revision: http://reviews.llvm.org/D19844

llvm-svn: 268350
2016-05-03 01:21:08 +00:00
Peter Collingbourne
4f9527065c ELF: New symbol table design.
This patch implements a new design for the symbol table that stores
SymbolBodies within a memory region of the Symbol object. Symbols are mutated
by constructing SymbolBodies in place over existing SymbolBodies, rather
than by mutating pointers. As mentioned in the initial proposal [1], this
memory layout helps reduce the cache miss rate by improving memory locality.

Performance numbers:

           old(s) new(s)
Without debug info:
chrome      7.178  6.432 (-11.5%)
LLVMgold.so 0.505  0.502 (-0.5%)
clang       0.954  0.827 (-15.4%)
llvm-as     0.052  0.045 (-15.5%)
With debug info:
scylla      5.695  5.613 (-1.5%)
clang      14.396 14.143 (-1.8%)

Performance counter results show that the fewer required indirections is
indeed the cause of the improved performance. For example, when linking
chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and
instructions per cycle increases from 0.78 to 0.83. We are also executing
many fewer instructions (15,516,401,933 down to 15,002,434,310), probably
because we spend less time allocating SymbolBodies.

The new mechanism by which symbols are added to the symbol table is by calling
add* functions on the SymbolTable.

In this patch, I handle local symbols by storing them inside "unparented"
SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating
these SymbolBodies, we can probably do that separately.

I also removed a few members from the SymbolBody class that were only being
used to pass information from the input file to the symbol table.

This patch implements the new design for the ELF linker only. I intend to
prepare a similar patch for the COFF linker.

[1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html

Differential Revision: http://reviews.llvm.org/D19752

llvm-svn: 268178
2016-05-01 04:55:03 +00:00
George Rimar
22a95121b9 Removed dead code. NFC.
llvm-svn: 267685
2016-04-27 09:24:03 +00:00
Peter Collingbourne
892d498017 ELF: Re-implement -u directly and remove CanKeepUndefined flag.
The semantics of the -u flag are to load the lazy symbol named by the flag. We
were previously relying on this behavior falling out of symbol resolution
against a synthetic undefined symbol, but that didn't quite give us the
correct behavior, so we needed a flag to mark symbols created with -u so
we could treat them specially in the writer. However, it's simpler and less
error prone to implement the required behavior directly and remove the flag.

This fixes an issue where symbols loaded with -u would receive hidden
visibility even when the definition in an object file had wider visibility.

Differential Revision: http://reviews.llvm.org/D19560

llvm-svn: 267639
2016-04-27 00:05:03 +00:00
Peter Collingbourne
66ac1d6152 ELF: Implement basic support for --version-script.
This patch only implements support for version scripts of the form:
  { [ global: symbol1; symbol2; [...]; symbolN; ] local: *; };
No wildcards are supported, other than for the local entry. Symbol versioning
is also not supported.

It works by introducing a new Symbol flag which tracks whether a symbol
appears in the global section of a version script.

This patch also simplifies the logic in SymbolBody::isPreemptible(), and
teaches it to handle the case where symbols with default visibility in DSOs
do not appear in the dynamic symbol table because of a version script.

Fixes PR27482.

Differential Revision: http://reviews.llvm.org/D19430

llvm-svn: 267208
2016-04-22 20:21:26 +00:00
Rafael Espindola
29af472cf6 Use llvm::CachedHash.
llvm-svn: 266982
2016-04-21 12:21:06 +00:00
Rafael Espindola
7f0b727235 Specialize the symbol table data structure a bit.
We never need to iterate over the K,V pairs, so we can avoid copying the
key as MapVector does.

This is a small speedup on most benchmarks.

llvm-svn: 266364
2016-04-14 20:42:43 +00:00
Rafael Espindola
c9157d35d9 Hash symbol names only once per global SymbolBody.
The DenseMap doesn't store hash results. This means that when it is
resized it has to recompute them.

This patch is a small hack that wraps the StringRef in a struct that
remembers the hash value. That way we can be sure it is only hashed
once.

llvm-svn: 266357
2016-04-14 19:17:16 +00:00
Adhemerval Zanella
9df0720766 ELF: Implement --dynamic-list
This patch implements the --dynamic-list option, which adds a list of
global symbol that either should not be bounded by default definition
when creating shared libraries, or add in dynamic symbol table in the
case of creating executables.

The patch modifies the ScriptParserBase class to use a list of Token
instead of StringRef, which contains information if the token is a
quoted or unquoted strings. It is used to use a faster search for
exact match symbol name.

The input file follow a similar format of linker script with some
simplifications (it does not have scope or node names). It leads
to a simplified parser define in DynamicList.{cpp,h}.

Different from ld/gold neither glob pattern nor mangled names
(extern 'C++') are currently supported.

llvm-svn: 266227
2016-04-13 18:51:11 +00:00
Peter Collingbourne
f6e9b4ec24 ELF: Use hidden visibility for all DefinedSynthetic symbols.
This simplifies the code by allowing us to remove the visibility argument
to functions that create synthetic symbols.

The only functional change is that the visibility of the MIPS "_gp" symbol
is now hidden. Because this symbol is defined in every executable or DSO, it
would be difficult to observe a visibility change here.

Differential Revision: http://reviews.llvm.org/D19033

llvm-svn: 266208
2016-04-13 16:57:28 +00:00
Rui Ueyama
f8baa66056 ELF: Implement --start-lib and --end-lib
start-lib and end-lib are options to link object files in the same
semantics as archive files. If an object is in start-lib and end-lib,
the object is linked only when the file is needed to resolve
undefined symbols. That means, if an object is in start-lib and end-lib,
it behaves as if it were in an archive file.

In this patch, I introduced a new notion, LazyObjectFile. That is
analogous to Archive file type, but that works for a single object
file instead of for an archive file.

http://reviews.llvm.org/D18814

llvm-svn: 265710
2016-04-07 19:24:51 +00:00
Rafael Espindola
f47657301b Change the type hierarchy for undefined symbols.
We have to differentiate undefined symbols from bitcode and undefined
symbols from other sources.

Undefined symbols from bitcode should not inhibit the symbol being
internalized. Undefined symbols from other sources should.

llvm-svn: 265536
2016-04-06 13:22:41 +00:00
Rafael Espindola
28ed95f45e Revert unintentional change.
resolve can remain private.

Thanks to Rui for noticing it.

llvm-svn: 265336
2016-04-04 19:22:51 +00:00
Rafael Espindola
ccfe3cb3d6 Don't store an Elf_Sym for most symbols.
Our symbol representation was redundant, and some times would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.

Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.

There are two general approaches to fix this:
* Copy the required information and don't store and Elf_Sym.
* Don't copy the information and always use the Elf_Smy.

The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templeting
*everything* and creates a bit chicken and egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.

As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.

In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.

llvm-svn: 265293
2016-04-04 14:04:16 +00:00
Reid Kleckner
4f88914c5b Remove unused fwd decl for LLVM IR stuff that lives in LTO now
llvm-svn: 264911
2016-03-30 20:15:41 +00:00
Reid Kleckner
20e24193f3 Remove declaration of SymbolTable::codegen, this method was deleted in r264091
llvm-svn: 264441
2016-03-25 18:20:33 +00:00
Davide Italiano
7cd8e65187 [LTO] Remove dead code.
llvm-svn: 264133
2016-03-23 02:45:53 +00:00
Rui Ueyama
259924869b ELF: Create LTO.{cpp,h} and move LTO-related code to that file.
The code for LTO has been growing, so now is probably a good time to
move it to its own file. SymbolTable.cpp is for symbol table, and
because compiling bitcode files are semantically not a part of
symbol table, this is I think a good thing to do.

http://reviews.llvm.org/D18370

llvm-svn: 264091
2016-03-22 20:52:10 +00:00
Rui Ueyama
9328b2cdde Use ELFT instead of ELFFile<ELFT>.
llvm-svn: 263510
2016-03-14 23:16:09 +00:00
George Rimar
aa4dc20f09 [ELF] - Create _DYNAMIC symbol for dynamic output
lld needs to provide _DYNAMIC symbol when creating a shared library
both bfd and gold do that.

This should fix the https://llvm.org/bugs/show_bug.cgi?id=26732

Differential revision: http://reviews.llvm.org/D17607

llvm-svn: 262348
2016-03-01 16:23:13 +00:00
Rafael Espindola
e0df00b91f Rename elf2 to elf.
llvm-svn: 262159
2016-02-28 00:25:54 +00:00
Rafael Espindola
18f0950783 Report duplicated symbols in bitcode.
llvm-svn: 262076
2016-02-26 21:49:38 +00:00
Rafael Espindola
9f77ef0c08 Add initial LTO support.
llvm-svn: 260726
2016-02-12 20:54:57 +00:00
Rafael Espindola
3a6a0a0109 Delete addIgnoredStrong.
It is not needed now that we resolve symbols is shared libraries
correctly.

llvm-svn: 258104
2016-01-19 00:05:54 +00:00
Simon Atanasyan
188558e5eb [ELF][MIPS] Prevent substitution of _gp_disp symbol
On MIPS O32 ABI, _gp_disp is a magic symbol designates offset between
start of function and gp pointer into GOT. To make seal with such symbol
we add new method addIgnoredStrong(). It adds ignored symbol with global
binding to prevent the symbol substitution. The addIgnored call is not
enough here because this call adds a weak symbol which might be
substituted by symbol from shared library.

Differential Revision: http://reviews.llvm.org/D16084

llvm-svn: 257449
2016-01-12 06:23:57 +00:00
Rui Ueyama
52c7c5fa3e Group members to match a comment. NFC.
llvm-svn: 257218
2016-01-08 22:20:00 +00:00
Rui Ueyama
131e0ffa10 Use shorter name. NFC.
llvm-svn: 257217
2016-01-08 22:17:42 +00:00
Rui Ueyama
683564ea63 Update comments.
llvm-svn: 257216
2016-01-08 22:14:15 +00:00
Rui Ueyama
8b4879aec0 Remove an empty constructor.
We used to have code in SymbolTable constructor to add entry symbols, etc.
That code has been moved to Driver. We can remove the constructor.

llvm-svn: 257214
2016-01-08 22:06:25 +00:00
Rui Ueyama
79c7373232 ELF: Consistently return SymbolBody * from SymbolTable::add functions.
For historical reasons, some add* functions for SymbolTable returns a
pointer to a SymbolBody, while some are not. This patch is to make them
consistently return a pointer to a newly added symbol.

llvm-svn: 257211
2016-01-08 21:53:28 +00:00