royitaqi e8dc8d614a
Add new Python API SBCommandInterpreter::GetTranscript() (#90703)
# Motivation

Currently, the user can already get the "transcript" (for "what is the
transcript", see `CommandInterpreter::SaveTranscript`). However, the
only way to obtain the transcript data as a user is to first destroy the
debugger, then read the save directory. Note that destroy-callbacks
cannot be used, because 1\ transcript data is private to the command
interpreter (see `CommandInterpreter.h`), and 2\ the writing of the
transcript is *after* the invocation of destory-callbacks (see
`Debugger::Destroy`).

So basically, there is no way to obtain the transcript:
* during the lifetime of a debugger (including the destroy-callbacks,
which often performs logging tasks, where the transcript can be useful)
* without relying on external storage

In theory, there are other ways for user to obtain transcript data
during a debugger's life cycle:
* Use Python API and intercept commands and results.
* Use CLI and record console input/output.

However, such ways rely on the client's setup and are not supported
natively by LLDB.


# Proposal

Add a new Python API `SBCommandInterpreter::GetTranscript()`.

Goals:
* It can be called at any time during the debugger's life cycle,
including in destroy-callbacks.
* It returns data in-memory.

Structured data:
* To make data processing easier, the return type is `SBStructuredData`.
See comments in code for how the data is organized.
* In the future, `SaveTranscript` can be updated to write different
formats using such data (e.g. JSON). This is probably accompanied by a
new setting (e.g. `interpreter.save-session-format`).

# Alternatives

The return type can also be `std::vector<std::pair<std::string,
SBCommandReturnObject>>`. This will make implementation easier, without
having to translate it to `SBStructuredData`. On the other hand,
`SBStructuredData` can convert to JSON easily, so it's more convenient
for user to process.

# Privacy

Both user commands and output/error in the transcript can contain
privacy data. However, as mentioned, the transcript is already available
to the user. The addition of the new API doesn't increase the level of
risk. In fact, it _lowers_ the risk of privacy data being leaked later
on, by avoiding writing such data to external storage.

Once the user (or their code) gets the transcript, it will be their
responsibility to make sure that any required privacy policies are
guaranteed.

# Tests

```
bin/llvm-lit -sv ../external/llvm-project/lldb/test/API/python_api/interpreter/TestCommandInterpreterAPI.py
```

```
bin/llvm-lit -sv ../external/llvm-project/lldb/test/API/commands/session/save/TestSessionSave.py
```

---------

Co-authored-by: Roy Shi <royshi@meta.com>
Co-authored-by: Med Ismail Bennani <ismail@bennani.ma>
2024-05-20 15:49:46 -07:00
..