When you run lldb without colors (`-X`), the status line looks weird
because it doesn't have a background. You end up with what appears to be
floating text at the bottom of your terminal.
This patch changes the statusline to use the reverse video effect, even
when colors are off. The effect doesn't introduce any new colors and
just inverts the foreground and background color.
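A minimal sketch of what this amounts to, assuming a plain text stream; the function and variable names are illustrative, not the actual statusline code:
```
#include <cstdio>
#include <string>

// Minimal sketch of the idea (names illustrative): emit the ANSI
// reverse-video attribute around the text even when color output is
// disabled, since it introduces no new colors.
void PrintStatusline(const std::string &text) {
  const char *kReverseVideo = "\x1b[7m"; // swap foreground and background
  const char *kReset = "\x1b[0m";        // restore normal attributes
  std::printf("%s%s%s", kReverseVideo, text.c_str(), kReset);
}
```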
I considered an alternative approach which changes the behavior of the
`-X` option, so that turning off colors doesn't prevent emitting
non-color related control characters such as bold, underline, and
reverse video. I decided to go with this more targeted fix as (1) nobody
is asking for this more general change and (2) it introduces significant
complexity to plumb this through using a setting and driver flag so that
it can be disabled when running the tests.
Fixes #134112.
This moves all the common settings of the launch and attach operations
into the `lldb_dap::protocol::Configuration`. These common settings
can appear in both `launch` and `attach` requests, which allows us to
isolate the DAP configuration operations in a single common location.
This is split out from #133624.
Protect the various SetBreakpoint functions with the API mutex. This
fixes a race condition between the breakpoint being created and the DAP
label getting added. This was causing `TestDAP_breakpointEvents.py` to
be flaky.
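A sketch of the shape of the fix; the exact SetBreakpoint call sites, the label, and the accessor for the API mutex are assumptions here, not the code as committed:
```
#include "lldb/API/SBBreakpoint.h"
#include "lldb/API/SBMutex.h"
#include "lldb/API/SBTarget.h"
#include <cstdint>
#include <mutex>

// Sketch only; the real SetBreakpoint functions and label name differ.
lldb::SBBreakpoint SetSourceBreakpoint(lldb::SBTarget &target,
                                       const char *path, uint32_t line) {
  lldb::SBMutex api_mutex = target.GetAPIMutex();
  std::lock_guard<lldb::SBMutex> guard(api_mutex);
  lldb::SBBreakpoint bp = target.BreakpointCreateByLocation(path, line);
  bp.AddName("dap-label"); // creation and labeling can no longer interleave
  return bp;
}
```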
Fixes #131242.
This NFC patch simplifies the main loop in HandleProcessStateChanged
event by moving duplicated code into the StopInfo class, also allowing
StopInfo subclasses to override behavior.
More specifically, two functions are created (sketched below):
* ShouldShow: should a Thread with such a StopInfo be printed when
the debugger stops? Currently, no StopInfo subclasses override this, but
a subsequent patch will fix a bug by making StopInfoBreakpoint check
whether the breakpoint is internal.
* ShouldSelect: should a Thread with such a StopInfo be selected? This
is currently overridden by StopInfoUnixSignal but will, in the future,
be overridden by StopInfoBreakpoint.
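A minimal sketch of the two hooks; the exact signatures and default bodies in the patch may differ:
```
// Minimal sketch of the new hooks; exact signatures and defaults in the
// patch may differ.
class StopInfo {
public:
  virtual ~StopInfo() = default;

  // Should a thread stopping with this StopInfo be printed when the
  // debugger stops? StopInfoBreakpoint will later override this to hide
  // internal breakpoints.
  virtual bool ShouldShow() const { return true; }

  // Should a thread stopping with this StopInfo become the selected
  // thread? StopInfoUnixSignal already overrides this.
  virtual bool ShouldSelect() const { return true; }
};
```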
When using `SBFrame::EvaluateExpression` on a frame that's not the
currently selected frame, we would sometimes run into errors such as:
```
error: error: The context has changed before we could JIT the expression!
error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression
```
During expression parsing, we call `RunStaticInitializers`. On our
internal fork this happens quite frequently because any usage of, e.g.,
function pointers, will inject ptrauth fixup code into the expression.
The static initializers are run using `RunThreadPlan`. The
`ExecutionContext::m_frame_sp` going into the `RunThreadPlan` is the
`SBFrame` that we called `EvaluateExpression` on. LLDB then tries to
save this frame to restore it after the thread-plan ran (the restore
occurs by unconditionally overwriting whatever is in
`ExecutionContext::m_frame_sp`). However, if the `selected_frame_sp` is
not the same as the `SBFrame`, then `RunThreadPlan` would set the
`ExecutionContext`'s frame to a different frame than what we started
with. When we `PrepareToExecuteJITExpression`, LLDB checks whether the
`ExecutionContext` frame changed from the one we initially called
`EvaluateExpression` on, and if it did, bails out with the error above.
One such test-case is attached. This currently passes regardless of the
fix because our ptrauth static initializers code isn't upstream yet. But
the plan is to upstream it soon.
This patch addresses the issue by saving/restoring the frame of the
incoming `ExecutionContext`, if such a frame exists. Otherwise, we fall
back to using the selected frame.
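A simplified sketch of the fix; the surrounding `RunThreadPlan` logic is omitted and the helper is illustrative:
```
#include "lldb/Target/ExecutionContext.h"
#include "lldb/Target/Thread.h"

using namespace lldb_private;

// Simplified sketch; the surrounding RunThreadPlan logic is omitted.
static void RunPlanPreservingFrame(ExecutionContext &exe_ctx) {
  lldb::StackFrameSP preserved_frame_sp = exe_ctx.GetFrameSP();
  if (!preserved_frame_sp && exe_ctx.HasThreadScope())
    preserved_frame_sp =
        exe_ctx.GetThreadRef().GetSelectedFrame(DoNoSelectMostRelevantFrame);

  // ... run the static initializers / thread plan here ...

  // Restore the frame we actually started with instead of unconditionally
  // writing back whatever frame is selected afterwards.
  exe_ctx.SetFrameSP(preserved_frame_sp);
}
```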
rdar://147456589
SetExitStatus can be called a second time when we reap the debug
server process. This shouldn't be interesting, as at that point we've
already told everyone that the process has exited.
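A rough sketch of the early-out; the real `Process::SetExitStatus` does more bookkeeping and logging:
```
// Rough sketch of the early-out at the top of Process::SetExitStatus;
// the real function does more bookkeeping.
bool Process::SetExitStatus(int status, llvm::StringRef exit_string) {
  std::lock_guard<std::mutex> guard(m_exit_status_mutex);
  // A second call (e.g. when reaping the debug server) is not interesting:
  // the process has already been reported as exited.
  if (GetPrivateState() == lldb::eStateExited)
    return false;
  // ... existing logic: record the status/string and broadcast eStateExited ...
  return true;
}
```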
I believe/hope this will also help with sporadic shutdown crashes that
have cropped up recently. They happen because the debug server is
monitored from a detached thread, so this code can be called after main
returns (and starts destroying everything). This isn't a real fix for
that though, as the situation can still happen (it's just that it
usually happens after the exit status has already been set). I think the
real fix for that is to make sure these threads terminate before we
start shutting everything down.
This reverts commit 4e40c7c4bd66d98f529a807dbf410dc46444f4ca.
arm64 CI is getting a failure in
lldb-api.tools/lldb-server.TestGdbRemoteRegisterState.py
with this commit; it needs to be investigated before re-landing.
The motivation for this patch is that in Statistics.cpp we [check to see
if the module symfile is
loaded](990a086d9d/lldb/source/Target/Statistics.cpp (L353C60-L353C75))
to calculate how much debug info has been loaded. I have an external
utility that only wants to look at the loaded debug info, which isn't
exposed by the SBAPI.
debugserver isn't saving and restoring the SVE/SME register state around
inferior function calls.
Making arbitrary function calls while in Streaming SVE mode is generally
a poor idea because hitting a NEON instruction will crash the expression
execution (which is how I missed this), but such calls should be handled
correctly if the user knows it is safe to do.
rdar://146886210
Add a -v/--version command line argument to print the version of both
the lldb-dap binary and the liblldb it's linked against.
This is motivated by me trying to figure out which lldb-dap I had in my
PATH.
This is an attempt to fix a test failure from #133794 when running on
Windows builds. I suspect we are running into a case where the
[ICF](https://learn.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=msvc-170)
optimization kicks in and combines the CreateSystemRuntimePlugin*
functions into a single address. This means that we cannot uniquely
unregister the plugin based on its create function address.
The fix is to have each create function return a different (bogus) value.
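A sketch of the workaround in the unit test; the function names and return values are illustrative:
```
// Sketch of the workaround (names and values illustrative): distinct bodies
// keep ICF from folding the functions into one address, so each plugin can
// still be unregistered by its create function. The pointers are bogus and
// never dereferenced by the test.
static lldb_private::SystemRuntime *
CreateSystemRuntimePluginA(lldb_private::Process *) {
  return reinterpret_cast<lldb_private::SystemRuntime *>(0x1);
}
static lldb_private::SystemRuntime *
CreateSystemRuntimePluginB(lldb_private::Process *) {
  return reinterpret_cast<lldb_private::SystemRuntime *>(0x2);
}
```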
`llvm::convertUTF16ToUTF8String` opens with an assertion that the output
container is empty:
3bdf9a0880/llvm/lib/Support/ConvertUTFWrapper.cpp (L83-L84)
It's not clear to me why this function requires the output container to
be empty instead of just overwriting it, but the callsite in
`TargetThreadWindows::GetName` may reuse the container without clearing
it out first, resulting in an assertion failure:
```
# Child-SP RetAddr Call Site
00 000000d2`44b8ea48 00007ff8`beefc12e ntdll!NtTerminateProcess+0x14
01 000000d2`44b8ea50 00007ff8`bcf518ab ntdll!RtlExitUserProcess+0x11e
02 000000d2`44b8ea80 00007ff8`bc0e0143 KERNEL32!ExitProcessImplementation+0xb
03 000000d2`44b8eab0 00007ff8`bc0e4c49 ucrtbase!common_exit+0xc7
04 000000d2`44b8eb10 00007ff8`bc102ae6 ucrtbase!abort+0x69
05 000000d2`44b8eb40 00007ff8`bc102cc1 ucrtbase!common_assert_to_stderr<wchar_t>+0x6e
06 000000d2`44b8eb80 00007fff`b8e27a80 ucrtbase!wassert+0x71
07 000000d2`44b8ebb0 00007fff`b8b821e1 liblldb!llvm::convertUTF16ToUTF8String+0x30 [D:\r\_work\swift-build\swift-build\SourceCache\llvm-project\llvm\lib\Support\ConvertUTFWrapper.cpp @ 88]
08 000000d2`44b8ec30 00007fff`b83e9aa2 liblldb!lldb_private::TargetThreadWindows::GetName+0x1b1 [D:\r\_work\swift-build\swift-build\SourceCache\llvm-project\lldb\source\Plugins\Process\Windows\Common\TargetThreadWindows.cpp @ 198]
09 000000d2`44b8eca0 00007ff7`2a3c3c14 liblldb!lldb::SBThread::GetName+0x102 [D:\r\_work\swift-build\swift-build\SourceCache\llvm-project\lldb\source\API\SBThread.cpp @ 432]
0a 000000d2`44b8ed70 00007ff7`2a3a5ac6 lldb_dap!lldb_dap::CreateThread+0x1f4 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\JSONUtils.cpp @ 877]
0b 000000d2`44b8ef10 00007ff7`2a3b0ab5 lldb_dap!`anonymous namespace'::request_threads+0xa6 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\lldb-dap.cpp @ 3906]
0c 000000d2`44b8f010 00007ff7`2a3b0fe8 lldb_dap!lldb_dap::DAP::HandleObject+0x1c5 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\DAP.cpp @ 796]
0d 000000d2`44b8f130 00007ff7`2a3a8b96 lldb_dap!lldb_dap::DAP::Loop+0x78 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\DAP.cpp @ 812]
0e 000000d2`44b8f1d0 00007ff7`2a4b5fbc lldb_dap!main+0x1096 [S:\SourceCache\llvm-project\lldb\tools\lldb-dap\lldb-dap.cpp @ 5319]
0f (Inline Function) --------`-------- lldb_dap!invoke_main+0x22 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78]
10 000000d2`44b8fb80 00007ff8`bcf3e8d7 lldb_dap!__scrt_common_main_seh+0x10c [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288]
11 000000d2`44b8fbc0 00007ff8`beefbf6c KERNEL32!BaseThreadInitThunk+0x17
12 000000d2`44b8fbf0 00000000`00000000 ntdll!RtlUserThreadStart+0x2c
```
This stack trace was captured from the lldb distributed in the Swift
toolchain. The issue is easy to reproduce by resuming from a breakpoint
twice in VS Code.
I've verified that clearing out the container here fixes the assertion
failure.
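The fix boils down to something like the following; the member and buffer names are illustrative:
```
// Sketch of the fix; member and buffer names are illustrative.
m_name.clear(); // convertUTF16ToUTF8String asserts the output is empty
llvm::convertUTF16ToUTF8String(
    llvm::ArrayRef<llvm::UTF16>(thread_name_buffer, thread_name_length),
    m_name);
```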
Before #134048, TestDAP_Progress relied on wait_for_event to block until
the progressEnd came in. However, progress events were not added to the
packet list, so this call would always time out. This PR makes it so
that packets are added to the packet list, and you can block on them.
Change `objc tagged-pointer info` to call
`OptionArgParser::ToRawAddress`.
Previously `ToAddress` was used, but it calls `FixCodeAddress`, which
can erroneously mutate the bits of a tagged pointer.
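A sketch of the call-site change; the surrounding variable names are illustrative:
```
// Before: ToAddress runs the value through the ABI's FixCodeAddress,
// which can strip the tagged-pointer bits.
// addr = OptionArgParser::ToAddress(&exe_ctx, arg, LLDB_INVALID_ADDRESS, &error);

// After (sketch): take the value verbatim.
lldb::addr_t addr =
    OptionArgParser::ToRawAddress(&exe_ctx, arg, LLDB_INVALID_ADDRESS, &error);
```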
This commit adds support for enabling and disabling plugins by name. The
changes are made generically in the `PluginInstances` class, but
currently we only expose this ability for SystemRuntime plugins. Other
plugin types can be added easily.
We had a few design goals for how disabled plugins should work:
1. Plugins that are disabled should still be visible to the system. This
allows us to dynamically enable and disable plugins and report their
state to the user.
2. Plugin order should be stable across disable and enable changes. We
want to avoid changing the order of plugin lookup. When a plugin is
re-enabled it should return to its original slot in the creation order.
3. Disabled plugins should not appear in PluginManager operations.
Clients should be able to assume that only enabled plugins will be
returned from the PluginManager.
For the implementation we modify the plugin instance to maintain a bool
of its enabled state. Existing clients external to the Instances class
expect to iterate over only enabled instances, so we skip over disabled
instances in the query and snapshot APIs. This way the client does not
have to manually check which instances are enabled.
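A rough sketch of the bookkeeping; the member and method names are illustrative rather than the actual `PluginInstances` code:
```
#include "llvm/ADT/StringRef.h"
#include <vector>

// Rough sketch; member and method names are illustrative.
template <typename Instance> class PluginInstances {
public:
  // Flip the flag in place so the instance keeps its slot in creation order.
  bool SetEnabled(llvm::StringRef name, bool enable) {
    for (Instance &instance : m_instances)
      if (instance.name == name) {
        instance.enabled = enable;
        return true;
      }
    return false;
  }

  // Queries and snapshots only hand back enabled instances, so clients
  // never have to check the flag themselves.
  std::vector<Instance> GetSnapshot() const {
    std::vector<Instance> result;
    for (const Instance &instance : m_instances)
      if (instance.enabled)
        result.push_back(instance);
    return result;
  }

private:
  std::vector<Instance> m_instances; // creation order is never disturbed
};
```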
While trying to make progress on #133782, I noticed that
TestDAP_Progress was taking 90 seconds to complete. This patch brings
that down to 10 seconds by making the following changes:
1. Don't call `wait_for_event` with a 15 second timeout. By the time we
call this, all progress events have been emitted, which means that we're
just sitting there until we hit the timeout.
2. Don't use 10 steps (= 10 seconds) for indeterminate progress. We have
two indeterminate progress tests so that's 6 seconds instead of 20.
3. Don't launch the process over and over. Once we have a dap session,
we can clear the progress vector and emit new progress events.
According to [1], the template parameter must be cv-unqualified and one
of unsigned short, unsigned int, unsigned long, or unsigned long long.
Should fix the following MSVC error:
error: static assertion failed due to requirement
'_Is_any_of_v<unsigned char, unsigned short, unsigned int, unsigned
long, unsigned long long>': invalid template argument for
independent_bits_engine: N4659
[1] https://en.cppreference.com/w/cpp/numeric/random/independent_bits_engine
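A sketch of the kind of fix this implies: instantiate the engine with one of the permitted result types and narrow afterwards (the exact UUID-generation code may differ):
```
#include <cstdint>
#include <random>

// Sketch: use a permitted result type (unsigned int here) as the template
// argument and narrow afterwards, instead of instantiating with uint8_t.
uint8_t RandomByte() {
  std::independent_bits_engine<std::mt19937_64, /*bits=*/8, unsigned int>
      engine(std::random_device{}());
  return static_cast<uint8_t>(engine());
}
```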
Hoist UUID generation into the UUID class and add a trivial unit test.
This also changes the telemetry code to drop the double underscore if we
fail to generate a UUID, and to log to the Host instead of the Object
log channel.
The tests for the
[intel-pt](3483740289/lldb/docs/use/intel_pt.rst)
trace plugin were failing for multiple reasons.
On machines where tracing is supported many of the tests were crashing
because of a nullptr dereference. It looks like the `core_file`
parameter in `ProcessTrace::CreateInstance` was once ignored, but was
changed to always be dereferenced. This caused the tests to fail even
when tracing was supported.
On machines where tracing is not supported we would still run tests that
attempt to take a trace. These would obviously fail because the required
hardware is not present. Note that some of the tests simply read
serialized json as trace files which does not require any special
hardware.
This PR fixes these two issues by guarding the pointer dereference and
skipping the tests that require tracing on machines where it is
unsupported. With these changes the trace tests pass on both types of
machines.
We also add a new unit test to validate that a process can be created
with a nullptr core_file through the generic process trace plugin path.
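A sketch of the guard; the exact signature and constructor of `ProcessTrace` are abbreviated here:
```
// Sketch of the guard; the actual CreateInstance signature and constructor
// are abbreviated.
lldb::ProcessSP ProcessTrace::CreateInstance(lldb::TargetSP target_sp,
                                             lldb::ListenerSP listener_sp,
                                             const FileSpec *core_file,
                                             bool can_connect) {
  if (can_connect)
    return nullptr;
  // core_file may legitimately be null; don't dereference it blindly.
  return std::make_shared<ProcessTrace>(target_sp, listener_sp,
                                        core_file ? *core_file : FileSpec());
}
```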
So the dSYM can be told what target it has been loaded into.
When lldb is loading modules, while creating a target, it will run
"command script import" on any Python modules in Resources/Python in the
dSYM. However, this happens WHILE the target is being created, so it is
not yet in the target list. That means that these scripts can't act on
the target that they are a part of when they get loaded.
This patch adds a new Python API that lldb will call:
__lldb_module_added_to_target
if it is defined in the module, passing in the Target the module was
being added to, so that code in these dSYMs doesn't have to guess.
This patch adds a --platform-available-ports option to the dotest.py
script to avoid hardcoded gdb ports in the lldb test suite.
Currently, this option could be helpful in GdbRemoteTestCases (e.g.
TestLldbGdbServer, TestNonStop, TestGdbRemoteThreadsInStopReply,
TestGdbRemotePlatformFile, TestGdbRemote_vCont)
Since lldb 20, these have had no effect:
https://releases.llvm.org/20.1.0/docs/ReleaseNotes.html#changes-to-lldb
> lldb-server now listens to a single port for gdbserver connections and
> provides that port to the connection handler processes. This means that
> only 2 ports need to be opened in the firewall (one for the lldb-server
> platform, one for gdbserver connections). In addition, due to this work,
> lldb-server now works on Windows in the server mode.
Remove them.
Simplify and fix the logic to clear the old statusline when the terminal
window dimensions have changed. I accidentally broke the terminal
resizing behavior when addressing code review feedback.
I'd really like to figure out a way to test this. PExpect isn't a good
fit for this, because I really need to check the result, rather than the
control characters, as the latter doesn't tell me whether any part of
the old statusline is still visible.
These tests are currently filtered on macOS if you're on an M1 (or newer)
device. These tests do work on macOS, at least for me on an M1 Max with
macOS 15.3.2 and Xcode 16.2.
Enabling them again, but if we run into CI problems with them we can
disable them again.
The `locked` variable can be accessed from the asynchronous thread until
the call to `f.wait()` completes. However, the variable is scoped in a
lexical block that ends before that, leading to a use-after-free.
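A generic illustration of the pattern, not the test's actual code:
```
#include <future>

// Generic illustration of the bug pattern, not the test's actual code.
void Broken() {
  std::future<void> f;
  {
    bool locked = false; // destroyed when this block ends
    f = std::async(std::launch::async, [&locked] { locked = true; });
  } // the async thread may still write to `locked` here: use-after-free
  f.wait();
}

void Fixed() {
  bool locked = false; // now outlives the wait
  std::future<void> f =
      std::async(std::launch::async, [&locked] { locked = true; });
  f.wait();
}
```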
Convert the Breakpoint & Watchpoint structs to classes to provide proper
access control. This is in preparation for adopting SBMutex to protect
the underlying SBBreakpoint and SBWatchpoint.
This patch improves LLDB launch time on Linux machines for **preload
scenarios**, particularly for executables with a lot of shared library
dependencies (or modules). Specifically:
* Launching a binary with `target.preload-symbols = true`
* Attaching to a process with `target.preload-symbols = true`.
It's completely controlled by a new flag added in the first commit
`plugin.dynamic-loader.posix-dyld.parallel-module-load`, which *defaults
to false*. This was inspired by similar work on Darwin #110646.
Some rough numbers to showcase perf improvement, run on a very beefy
machine:
* Executable with ~5600 modules: baseline 45s, improvement 15s
* Executable with ~3800 modules: baseline 25s, improvement 10s
* Executable with ~6650 modules: baseline 67s, improvement 20s
* Executable with ~12500 modules: baseline 185s, improvement 85s
* Executable with ~14700 modules: baseline 235s, improvement 120s
A lot of targets we deal with have a *ton* of modules, and unfortunately
we're unable to convince other folks to reduce the number of modules, so
performance improvements like this can be very impactful for user
experience.
This patch achieves the performance improvement by parallelizing
`DynamicLoaderPOSIXDYLD::RefreshModules` for the launch scenario, and
`DynamicLoaderPOSIXDYLD::LoadAllCurrentModules` for the attach scenario.
The commits have some context on their specific changes as well --
hopefully this helps the review.
# More context on implementation
We discovered the bottlenecks by running `perf record -g -p <lldb's pid>`
on a Linux machine. With an executable known to have 1000s of shared
library dependencies, I ran
```
(lldb) b main
(lldb) r
# taking a while
```
and showed the resulting perf trace (snippet shown)
```
Samples: 85K of event 'cycles:P', Event count (approx.): 54615855812
Children Self Command Shared Object Symbol
- 93.54% 0.00% intern-state libc.so.6 [.] clone3
clone3
start_thread
lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) r
std::_Function_handler<void* (), lldb_private::Process::StartPrivateStateThread(bool)::$_0>::_M_invoke(std::_Any_data const&)
lldb_private::Process::RunPrivateStateThread(bool) n
- lldb_private::Process::HandlePrivateEvent(std::shared_ptr<lldb_private::Event>&)
- 93.54% lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*)
- 93.54% lldb_private::ThreadList::ShouldStop(lldb_private::Event*)
- lldb_private::Thread::ShouldStop(lldb_private::Event*) *
- 93.53% lldb_private::StopInfoBreakpoint::ShouldStopSynchronous(lldb_private::Event*) t
- 93.52% lldb_private::BreakpointSite::ShouldStop(lldb_private::StoppointCallbackContext*) i
lldb_private::BreakpointLocationCollection::ShouldStop(lldb_private::StoppointCallbackContext*) k
lldb_private::BreakpointLocation::ShouldStop(lldb_private::StoppointCallbackContext*) b
lldb_private::BreakpointOptions::InvokeCallback(lldb_private::StoppointCallbackContext*, unsigned long, unsigned long) i
DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long, unsigned lo
- DynamicLoaderPOSIXDYLD::RefreshModules() O
- 93.42% DynamicLoaderPOSIXDYLD::RefreshModules()::$_0::operator()(DYLDRendezvous::SOEntry const&) const u
- 93.40% DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, bools
- lldb_private::DynamicLoader::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, boos
- 83.90% lldb_private::DynamicLoader::FindModuleViaTarget(lldb_private::FileSpec const&) o
- 83.01% lldb_private::Target::GetOrCreateModule(lldb_private::ModuleSpec const&, bool, lldb_private::Status*
- 77.89% lldb_private::Module::PreloadSymbols()
- 44.06% lldb_private::Symtab::PreloadSymbols()
- 43.66% lldb_private::Symtab::InitNameIndexes()
...
```
We saw that the majority of the time was spent in `RefreshModules`, with
the main culprit within it being `LoadModuleAtAddress`, which eventually
calls `PreloadSymbols`.
At first, `DynamicLoaderPOSIXDYLD::LoadModuleAtAddress` appears fairly
independent -- most of it deals with different files and then getting or
creating Modules from these files. The portions that aren't independent
seem to deal with ModuleLists, which appear concurrency safe. There were
members of `DynamicLoaderPOSIXDYLD` I had to synchronize though: namely
`m_loaded_modules`, which `DynamicLoaderPOSIXDYLD` maintains to map its
loaded modules to their link addresses. Without synchronizing this, I
ran into segfaults and other issues when running `check-lldb`. I also
locked the assignment and comparison of `m_interpreter_module`, which
may be unnecessary.
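A very rough sketch of the shape of the change; the loop and container names are simplified, and the real patch also honors the new setting:
```
// Very rough sketch; the loop shape and container names are simplified, and
// the real patch also synchronizes m_interpreter_module and checks the
// parallel-module-load setting.
llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool());
std::mutex loaded_modules_mutex;
for (const DYLDRendezvous::SOEntry &so : new_entries) { // new_entries: illustrative
  task_group.async([this, &loaded_modules_mutex, so] {
    if (lldb::ModuleSP module_sp = LoadModuleAtAddress(
            so.file_spec, so.link_addr, so.base_addr,
            /*base_addr_is_offset=*/true)) {
      std::lock_guard<std::mutex> guard(loaded_modules_mutex);
      m_loaded_modules[module_sp] = so.link_addr; // shared map: serialize writes
    }
  });
}
task_group.wait();
```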
# Alternate implementations
When creating this patch, another implementation I considered was
directly backgrounding the call to `Module::PreloadSymbols` in
`Target::GetOrCreateModule`. It would have the added benefit of working
across platforms generically, and appeared to be concurrency safe. It
was done via `Debugger::GetThreadPool().async` directly. However, there
were a ton of concurrency issues, so I abandoned that approach for now.
# Testing
With the feature active, I tested via `ninja check-lldb` on both Debug
and Release builds several times (~5 or 6 altogether?), and didn't spot
additional failing or flaky tests.
I also tested manually on several different binaries, some with around
14000 modules, but just basic operations: launching, reaching main,
setting breakpoint, stepping, showing some backtraces.
I've also tested with the flag off just to make sure things behave
properly synchronously.
Remove raw access to PluginInstances vector
This commit modifies the PluginInstances class to remove direct access
to the m_instances vector. Instead, we expose a new `GetSnapshot` method
that returns a copy of the current state of the instances vector. All
external iteration over the instances is updated to use the new method.
The motivation for the change is to allow modifying the way we store
instances without having to change all the clients. This is a
preliminary change to allow enabling/disabling of plugins in which case
we want to iterate over only enabled plugins.
We also considered using a custom iterator that wraps the vector
iterator and can skip over disabled instances. That works, but the
iterator code gets messy with all the templates and typedefs needed to
make a compliant iterator.
Include the LLDB version in the lldbassert error message, and prompt
users to include it in the bug report. The majority of users who bother
filing a bug report just copy-paste the stack trace and often forget to
include this important detail. By putting it after the backtrace and
before the prompt, I'm hoping it'll get copy-pasted in.
rdar://146793016
In #133211, Greg suggested making the rate limit configurable through a
setting. Although adding the setting is easy, the two places where we
currently use rate limiting aren't tied to a particular debugger.
While it might be possible to hook this up, given how few progress events
currently implement rate limiting, I don't think it's worth threading
this through, if that's even possible.
I still think it's a good idea to be consistent and make it easy to pick
the same rate limiting value, so I've moved it into a constant in the
Progress class.
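Something along these lines; the constant's actual name and value live in the Progress class, so treat this as a sketch:
```
#include <chrono>

// Sketch only; the actual constant name and value live in Progress.h.
class Progress {
public:
  // One shared rate limit so every reporter that opts in picks the same value.
  static constexpr std::chrono::milliseconds kDefaultHighFrequencyReportTime =
      std::chrono::milliseconds(20);
};
```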
Hey,
This solves an issue where running lldb-server-20 with a non-absolute
path (for example, when it's installed into `/usr/bin` and the user runs
it as `lldb-server-20 ...` and not `/usr/bin/lldb-server-20 ...`) fails
with `error: spawn_process failed: execve failed: No such file or
directory`. The underlying issue is that when run that way, it attempts
to execute a binary named `lldb-server-20` from its current directory.
This is also a mild security hazard because lldb-server is often being
run as root in the directory /tmp, meaning that an unprivileged user can
create the file /tmp/lldb-server-20 and lldb-server will execute it as
root. (although, well, it's a debugging server we're talking about, so
that may not be a real concern)
I haven't previously contributed to this project; if you want me to
change anything in the code please don't hesitate to let me know.
Expose the target API mutex through the SB API. This is motivated by
lldb-dap, which is built on top of the SB API and needs a way to execute
a series of SB API calls in an atomic manner (see #131242).
We can solve this problem by either introducing an additional layer of
locking at the DAP level or by exposing the existing locking at the SB
API level. This patch implements the second approach.
This was discussed in an RFC on Discourse [0]. The original
implementation exposed a move-only lock rather than a mutex [1] which
doesn't work well with SWIG 4.0 [2]. This implements the alternative
solution of exposing the mutex rather than the lock. The SBMutex
conforms to the BasicLockable requirement [3] (which is why the methods
are called `lock` and `unlock` rather than Lock and Unlock) so it can be
used as `std::lock_guard<lldb::SBMutex>` and
`std::unique_lock<lldb::SBMutex>`.
[0]: https://discourse.llvm.org/t/rfc-exposing-the-target-api-lock-through-the-sb-api/85215/6
[1]: https://github.com/llvm/llvm-project/pull/131404
[2]: https://discourse.llvm.org/t/rfc-bumping-the-minimum-swig-version-to-4-1-0/85377/9
[3]: https://en.cppreference.com/w/cpp/named_req/BasicLockable
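A usage sketch, assuming the mutex is obtained from the target (the accessor name here is an assumption):
```
#include "lldb/API/SBMutex.h"
#include "lldb/API/SBTarget.h"
#include <mutex>

// Usage sketch: SBMutex satisfies BasicLockable, so the standard guards work.
void UpdateTargetAtomically(lldb::SBTarget &target) {
  lldb::SBMutex api_mutex = target.GetAPIMutex();
  std::lock_guard<lldb::SBMutex> guard(api_mutex);
  // ... a series of SB API calls that must not interleave with other users ...
}
```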
The function had special handling for -1, but that is incompatible with
functions whose entry point is not the first address. Use std::nullopt
instead.