The bug was introduced when the AddressRange class was no longer able to modify the End address directly and the entire range of the .text address range that contained the trailing empty symbol was replaced. There was no unit test for this, so it wasn't caught. I fixed the bug and added a unit test for it.
The effects of this bug are serious as the AddressOffsetSize in the header would be incorrectly calculated and an invalid GSYM would be created.
Differential Revision: https://reviews.llvm.org/D127811
Add proper CodeView encoding for positive constant integer values greater than
127. In addition, use the two byte encoding form for positive values less
than LF_NUMERIC.
Differential Revision: https://reviews.llvm.org/D126968
- truncateQuotedNameFront: The last use was removed on Jul 10, 2017 in
commit a9d944fd6fd19ac377b5ebea9272676642b7ceaa.
- truncateQuotedNameBack: The last use was removed on Mar 26, 2018 in
commit 7b84b678a993c8a8236868f65d1d4c2b3e29fb3d.
- truncateStringMiddle: The last use was removed on Mar 26, 2018 in
commit 7b84b678a993c8a8236868f65d1d4c2b3e29fb3d.
- truncateStringBack: The last use is in truncateQuotedNameBack being
removed above.
- truncateStringFront: The last use is in truncateQuotedNameFront
being removed above.
This is minimum changes extracted from https://reviews.llvm.org/D78950. The old patch tried to add LRU eviction of caching data structure. Due to multiple layers of interfaces that users could be using, it was not clear where to put the functionality. While we work out on where to put that functionality, it'll be great to add this minimum interface change so that the user could implement their own memory management. More specifically:
* Add a clearLineTable method for DWARFDebugLine which erases the given offset from the LineTableMap.
* DWARFDebugContext adds the clearLineTableForUnit method that leverages clearLineTable to remove the object corresponding to a given compile unit, for memory management purposes. When it is referred to again, the line table object will be repopulated.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D90006
Currently, llvm-symbolizer doesn't like to parse .debug_info in order to
show the line info for global variables. addr2line does this. In the
future, I'm looking to migrate AddressSanitizer off of internal metadata
over to using debuginfo, and this is predicated on being able to get the
line info for global variables.
This patch adds the requisite support for getting the line info from the
.debug_info section for symbolizing global variables. This only happens
when you ask for a global variable to be symbolized as data.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D123538
Discovered in a large object that would need a 64 bit index (but the
cu/tu index format doesn't include a 64 bit offset/length mode in
DWARF64 - a spec bug) but instead binutils dwp overflowed the offsets
causing overlapping regions.
llvm-profgen gives error message when the input binary contains premature terminator in .debug_aranges section. These zero length items point to some rodata with zero size type in embed Rust Library. Considering Zero-Sized Types are a valid feature in Rust. They are not real error. This change makes the "error:" message into a warning to avoid misleading.
Why do we still want a warning on such case? because it doesn't follow dwarf standard. https://bugs.llvm.org/show_bug.cgi?id=46805 contains early discussion.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D124121
Running iwyu-diff on LLVM codebase since fa5a4e1b95c8f37796 detected a few
regressions, fixing them.
Differential Revision: https://reviews.llvm.org/D124847
Right now, if we want to dump symbol at specified offset, we need to use `grep`.
And it can only show surrounding symbols in layout (not in lexical scope sense).
This adds similar options to `dump` command as `llvm-dwarfdump` to allow users
to dump symbol record at specified offset and its parents or children with
spcified depth.
`--symbol-offset=` must be used with `--modi` to dump only one symbol at given
offset.
`--show-parents`/`--show-children` must be used with `--symbol-offset` to
dump all symbols that are parents/children of the symbol at given offset.
`--parent-recurse-depth`/`--children-recurse-depth` must be used with
`--show-parents`/`--show-children` to specify the max up/down depth.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D124317
llvm-gsymutil has an implementation of AddressRange and AddressRanges
classes. That implementation might be reused in other parts of llvm.
This patch moves AddressRange and AddressRanges classes into llvm/ADT.
Differential Revision: https://reviews.llvm.org/D124350
The change described by:
https://reviews.llvm.org/D122226
Moved some llvm-pdbutil functionality to the debug PDB library.
This patch addresses a broken '-modi' argument handling, which
causes an assertion if its value is other than '0' or '1'.
In addition, it moves the assertion for the number of occurrences
of the '-modi' argument from the PDB library into the llvm-pdbutil
driver.
Reviewed By: zequanwu
Differential Revision: https://reviews.llvm.org/D123483
The changes described by:
https://reviews.llvm.org/D121801https://reviews.llvm.org/D122226
Moved some llvm-pdbutil functionality to the debug PDB library.
This patch addresses one outstanding issue concerning the global
state (Filters) created in the PDB library.
- Move 'Filters' inside the 'LinePrinter' class.
- Omit 'Optional' and just pass 'PrintScope &HeaderScope' everywhere.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D122887
This fixes the issue when the current line offset is actually for next range.
Maintain a current code range with current line offset and cache next file/line
offset. Update file/line offset after finishing current range.
Differential Revision: https://reviews.llvm.org/D123151
Returning `std::array<uint8_t, N>` is better ergonomics for the hashing functions usage, instead of a `StringRef`:
* When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`
As part of this patch also:
* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.
Differential Revision: https://reviews.llvm.org/D123100
Since enumerators may not be available in every translation unit they
can't be reliably used to name entities. (this also makes simplified
template name roundtripping infeasible - since the expected name could
only be rebuilt if the enumeration definition could be found (or only if
it couldn't be found, depending on the context of the original name))
At Sony we are developing llvm-dva
https://lists.llvm.org/pipermail/llvm-dev/2020-August/144174.html
For its PDB support, it requires functionality already present in
llvm-pdbutil.
We intend to move that functionaly into the PDB library to be
shared by both tools. That change will be done in 2 steps, that
will be submitted as 2 patches:
(1) Replace 'ExitOnError' with explicit error handling.
(2) Move the intended shared code to the PDB library.
Patch for step (1): https://reviews.llvm.org/D121801
This patch is for step (2).
Move InputFile.cpp[h], FormatUtil.cpp[h] and LinePrinter.cpp[h]
files to the debug PDB library.
It exposes the following functionality that can be used by tools:
- Open a PDB file.
- Get module debug stream.
- Traverse module sections.
- Traverse module subsections.
Most of the needed functionality is in InputFile, but there are
dependencies from LinePrinter and FormatUtil.
Some other functionality is in the following functions in
DumpOutputStyle.cpp file:
- iterateModuleSubsections
- getModuleDebugStream
- iterateOneModule
- iterateSymbolGroups
- iterateModuleSubsections
Only these specific functions from DumpOutputStyle are moved to
the PDB library.
Reviewed By: aganea, dblaikie, rnk
Differential Revision: https://reviews.llvm.org/D122226
Follow up to 1e396affca6a0d21247d960c93a415e8f6fe0301. On some standard
library configurations these have a dependency on the complete type of
SymbolizableModule.
This change adds a simple LRU cache to the Symbolize class to put a cap
on llvm-symbolizer memory usage. Previously, the Symbolizer's virtual
memory footprint would grow without bound as additional binaries were
referenced.
I'm putting this out there early for an informal review, since there may be
a dramatically different/better way to go about this. I still need to
figure out a good default constant for the memory cap and benchmark the
implementation against a large symbolization workload. Right now I've
pegged max memory usage at zero for testing purposes, which evicts the whole
cache every time.
Unfortunately, it looks like StringRefs in the returned DI objects can
directly refer to the contents of binaries. Accordingly, the cache
pruning must be explicitly requested by the caller, as the caller must
guarantee that none of the returned objects will be used afterwards.
For llvm-symbolizer this a light burden; symbolization occurs
line-by-line, and the returned objects are discarded after each.
Implementation wise, there are a number of nested caches that depend
on one another. I've implemented a simple Evictor callback system to
allow derived caches to register eviction actions to occur when the
underlying binaries are evicted.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D119784
On some standard library configurations these have a dependency on the
complete type of SymbolizableModule. They also do a lot of
copying/freeing so no point in inlining them.
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"
Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files
Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848
Which is a great diff!
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
Most notably,
llvm/Object/Binary.h no longer includes llvm/Support/MemoryBuffer.h
llvm/Object/MachOUniversal*.h no longer include llvm/Object/Archive.h
llvm/Object/TapiUniversal.h no longer includes llvm/Object/TapiFile.h
llvm-project preprocessed size:
before: 1068185081
after: 1068324320
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119457
This adds a --build-id=<hex build ID> flag to llvm-symbolizer. If --obj
is unspecified, this will attempt to look up the provided build ID using
whatever mechanisms are available to the Symbolizer (typically,
debuginfod). The semantics are then as if the found binary were given
using the --obj flag.
Reviewed By: jhenderson, phosek
Differential Revision: https://reviews.llvm.org/D118633