
=====================================
Cross Translation Unit (CTU) Analysis
=====================================

Normally, static analysis works within the boundary of one translation unit (TU).
However, with additional steps and configuration, we can enable the analysis to inline the definition of a function from
another TU.

.. contents::
   :local:

Overview
________
CTU analysis can be used in a variety of ways. The importing of external TU definitions can work with pre-dumped PCH
files or by generating the necessary AST structure on demand, during the analysis of the main TU. Driving the static
analysis can also be implemented in multiple ways. The most direct way is to specify the necessary command-line options
of the Clang frontend manually (and generate the prerequisite dependencies of the specific import method by hand). This
process can be automated by other tools, like `CodeChecker <https://github.com/Ericsson/codechecker>`_ and scan-build-py
(the former is preferred).

PCH-based analysis
__________________
The analysis needs the PCH dumps of all the translation units used in the project.
These can be generated by the Clang frontend itself, and must be arranged in a specific way in the filesystem.
The index, which maps symbols' USR names to the PCH dumps containing them, must also be generated by the
`clang-extdef-mapping` tool. Entries in the index *must* have an `.ast` suffix if the goal
is to use PCH-based analysis, as the lack of that extension signals that the entry is to be used as a source file and parsed on demand.
This tool uses a :doc:`compilation database <../../JSONCompilationDatabase>` to
determine the compilation flags used.
The analysis invocation must be provided with the directory which contains the dumps and the mapping files.


Manual CTU Analysis
###################
Let's consider these source files in our minimal example:

.. code-block:: cpp

  // main.cpp
  int foo();

  int main() {
    return 3 / foo();
  }

.. code-block:: cpp

  // foo.cpp
  int foo() {
    return 0;
  }

And a compilation database:

.. code-block:: bash

  [
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c foo.cpp -o foo.o",
      "file": "foo.cpp"
    },
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c main.cpp -o main.o",
      "file": "main.cpp"
    }
  ]

We'd like to analyze `main.cpp` and discover the division by zero bug.
In order to be able to inline the definition of `foo` from `foo.cpp`, we first have to generate the `AST` (or `PCH`) file
of `foo.cpp`:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ -emit-ast -o foo.cpp.ast foo.cpp
  $ # Check that the .ast file is generated:
  $ ls
  compile_commands.json  foo.cpp.ast  foo.cpp  main.cpp
  $

The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the
source files, in the format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash

  $ clang-extdef-mapping -p . foo.cpp
  9:c:@F@foo# /path/to/your/project/foo.cpp
  $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt

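
Because each entry starts with the length of the USR, a consumer can split it without treating the first space as a
delimiter. This matters because USRs may themselves contain spaces (for example in conversion operator names). The
following is a minimal sketch in POSIX shell, not the analyzer's actual parser; the function name `split_index_entry`
is hypothetical, and USRs are assumed to consist of ASCII characters only:

.. code-block:: bash

  # Hypothetical helper: split one "<USR-Length>:<USR> <File-Path>" entry.
  # Prints the USR on the first line and the file path on the second.
  split_index_entry() {
    entry=$1
    len=${entry%%:*}   # the digits before the first ':'
    rest=${entry#*:}   # "<USR> <File-Path>"
    printf '%s\n' "$rest" | cut -c "1-$len"          # the USR: exactly $len characters
    printf '%s\n' "$rest" | cut -c "$((len + 2))-"   # skip the USR and the separating space
  }

  split_index_entry '9:c:@F@foo# /path/to/your/project/foo.cpp'
  # prints:
  #   c:@F@foo#
  #   /path/to/your/project/foo.cpp
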
We have to modify `externalDefMap.txt` to contain the name of the `.ast` files instead of the source files:

.. code-block:: bash

  $ sed -i -e "s/.cpp/.cpp.ast/g" externalDefMap.txt

We still have to further modify the `externalDefMap.txt` file to contain relative paths:

.. code-block:: bash

  $ sed -i -e "s|$(pwd)/||g" externalDefMap.txt

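
The two `sed` rewrites above can also be expressed as a single filter. The sketch below is an alternative under stated
assumptions, not part of the original procedure: the function name `to_pch_index` is hypothetical, every source file
is assumed to end in `.cpp`, and the project path is assumed to contain no characters that are special in a `sed`
pattern. Escaping the dot and anchoring the suffix pattern to the end of the line keeps the rewrite from touching
`cpp` elsewhere in an entry:

.. code-block:: bash

  # Hypothetical helper: read index entries on stdin and write PCH-style
  # entries whose paths are relative to the directory given as $1.
  to_pch_index() {
    project_dir=$1
    sed -e 's/\.cpp$/.cpp.ast/' -e "s| ${project_dir}/| |"
  }

  printf '%s\n' '9:c:@F@foo# /path/to/your/project/foo.cpp' \
    | to_pch_index /path/to/your/project
  # prints: 9:c:@F@foo# foo.cpp.ast

The length prefix counts only the characters of the USR, so rewriting the path leaves it valid.
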
Now everything is available for the CTU analysis.
We have to feed Clang with CTU specific extra arguments:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ --analyze \
      -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
      -Xclang -analyzer-config -Xclang ctu-dir=. \
      -Xclang -analyzer-output=plist-multi-file \
      main.cpp
  main.cpp:5:12: warning: Division by zero
    return 3 / foo();
           ~~^~~~~~~
  1 warning generated.
  $ # The plist file with the result is generated.
  $ ls -F
  compile_commands.json  externalDefMap.txt  foo.ast  foo.cpp  foo.cpp.ast  main.cpp  main.plist
  $

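
Each analyzer option in the command above is forwarded to the frontend in its own
`-Xclang -analyzer-config -Xclang <key>=<value>` group. As a small illustration of that pattern, the groups can be
generated from a list of `key=value` pairs. The helper name `analyzer_config_args` is hypothetical and not part of
Clang, and the values are assumed to contain no whitespace:

.. code-block:: bash

  # Hypothetical helper: print one "-Xclang -analyzer-config -Xclang <pair>"
  # group for every key=value pair given as an argument.
  analyzer_config_args() {
    for opt in "$@"; do
      printf '%s %s %s %s ' -Xclang -analyzer-config -Xclang "$opt"
    done
    echo
  }

  analyzer_config_args experimental-enable-naive-ctu-analysis=true ctu-dir=.

The output could then be spliced, unquoted, into the command line, for example
``clang++ --analyze $(analyzer_config_args ctu-dir=.) main.cpp``.
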
This manual procedure is error-prone and does not scale; to analyze real projects, it is therefore recommended to use
`CodeChecker` or `scan-build-py`.

Automated CTU Analysis with CodeChecker
#######################################
The `CodeChecker <https://github.com/Ericsson/codechecker>`_ project fully supports automated CTU analysis with Clang.
Once we have set up the `PATH` environment variable and activated the Python `venv`, this is all it takes:

.. code-block:: bash

  $ CodeChecker analyze --ctu compile_commands.json -o reports
  $ ls -F
  compile_commands.json  foo.cpp  foo.cpp.ast  main.cpp  reports/
  $ tree reports
  reports
  ├── compile_cmd.json
  ├── compiler_info.json
  ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist
  ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist
  ├── metadata.json
  └── unique_compile_commands.json

  0 directories, 6 files
  $

The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools.
For example, one may use `CodeChecker parse` to view the results on the command line:

.. code-block:: bash

  $ CodeChecker parse reports
  [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero]
    return 3 / foo();
             ^

  Found 1 defect(s) in main.cpp


  ----==== Summary ====----
  -----------------------
  Filename | Report count
  -----------------------
  main.cpp |            1
  -----------------------
  -----------------------
  Severity | Report count
  -----------------------
  HIGH     |            1
  -----------------------
  ----=================----
  Total number of reports: 1
  ----=================----

Or we can use `CodeChecker parse -e html` to export the results into HTML format:

.. code-block:: bash

  $ CodeChecker parse -e html -o html_out reports
  $ firefox html_out/index.html

Automated CTU Analysis with scan-build-py (don't do it)
#######################################################
We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU.
`scan-build-py` has various errors and issues; expect it to work only with very basic projects.

Example usage of scan-build-py:

.. code-block:: bash

  $ /your/path/to/llvm-project/clang/tools/scan-build-py/bin/analyze-build --ctu
  analyze-build: Run 'scan-view /tmp/scan-build-2019-07-17-17-53-33-810365-7fqgWk' to examine bug reports.
  $ /your/path/to/llvm-project/clang/tools/scan-view/bin/scan-view /tmp/scan-build-2019-07-17-17-53-33-810365-7fqgWk
  Starting scan-view at: http://127.0.0.1:8181
    Use Ctrl-C to exit.
  [6336:6431:0717/175357.633914:ERROR:browser_process_sub_thread.cc(209)] Waited 5 ms for network service
  Opening in existing browser session.
  ^C
  $

.. _ctu-on-demand:

On-demand analysis
__________________
The analysis produces the necessary AST structure of external TUs during analysis. This requires the
exact compiler invocations for each TU, which can be generated by hand, or by tools driving the analyzer.
The compiler invocation is a shell command that could be used to compile the TU's main source file.
The mapping from absolute source file paths of a TU to lists of compilation command segments used to
compile said TU is given in YAML format, referred to as the `invocation list`, and must be passed as an
analyzer-config argument.
The index, which maps function USR names to the source files containing them, must also be generated by the
`clang-extdef-mapping` tool. Entries in the index must *not* have an `.ast` suffix if the goal
is to use on-demand analysis, as that extension signals that the entry is to be used as a PCH dump.
The mapping of external definitions implicitly uses a
:doc:`compilation database <../../JSONCompilationDatabase>` to determine the compilation flags used.
The analysis invocation must be provided with the directory which contains the mapping
files, and the `invocation list` which is used to determine compiler flags.


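
The suffix rule described above decides how each index entry is loaded. As a minimal sketch of that rule (the
function name `entry_kind` is hypothetical, and this is not the analyzer's implementation), a path from the index can
be classified like this:

.. code-block:: bash

  # Hypothetical helper: report how the path of an index entry would be interpreted.
  entry_kind() {
    case $1 in
      *.ast) echo 'pch' ;;      # loaded as a PCH dump
      *)     echo 'source' ;;   # treated as a source file and parsed on demand
    esac
  }

  entry_kind foo.cpp.ast   # prints: pch
  entry_kind foo.cpp       # prints: source
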
Manual CTU Analysis
###################

Let's consider these source files in our minimal example:

.. code-block:: cpp

  // main.cpp
  int foo();

  int main() {
    return 3 / foo();
  }

.. code-block:: cpp

  // foo.cpp
  int foo() {
    return 0;
  }

The compilation database:

.. code-block:: bash

  [
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c foo.cpp -o foo.o",
      "file": "foo.cpp"
    },
    {
      "directory": "/path/to/your/project",
      "command": "clang++ -c main.cpp -o main.o",
      "file": "main.cpp"
    }
  ]

The `invocation list`, saved as `invocations.yaml` in the project directory:

.. code-block:: bash

  "/path/to/your/project/foo.cpp":
    - "clang++"
    - "-c"
    - "/path/to/your/project/foo.cpp"
    - "-o"
    - "/path/to/your/project/foo.o"

  "/path/to/your/project/main.cpp":
    - "clang++"
    - "-c"
    - "/path/to/your/project/main.cpp"
    - "-o"
    - "/path/to/your/project/main.o"

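
For a handful of files, an invocation list of this shape can be written out by a short script. The sketch below only
reproduces the simple `clang++ -c <source> -o <object>` commands of this example; the function name
`write_invocation_list` is hypothetical, and a real project would need the full flags from its compilation database:

.. code-block:: bash

  # Hypothetical helper: print an invocation list for .cpp files that live
  # directly in the project directory given as $1.
  write_invocation_list() {
    project_dir=$1
    shift
    for src in "$@"; do
      # Key: the absolute path of the source file.
      printf '"%s/%s":\n' "$project_dir" "$src"
      # Value: one list item per command segment (the format is reused per argument).
      printf '  - "%s"\n' 'clang++' '-c' "$project_dir/$src" '-o' "$project_dir/${src%.cpp}.o"
      echo
    done
  }

  write_invocation_list /path/to/your/project foo.cpp main.cpp

Redirecting the output to a file gives a list that can be passed to the analyzer through the `ctu-invocation-list`
option.
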
We'd like to analyze `main.cpp` and discover the division by zero bug.
As we are using on-demand mode, we only need to create a CTU index file which holds the `USR` name and location of
external definitions in the source files, in the format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash

  $ clang-extdef-mapping -p . foo.cpp
  9:c:@F@foo# /path/to/your/project/foo.cpp
  $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt

Now everything is available for the CTU analysis.
We have to feed Clang with CTU specific extra arguments:

.. code-block:: bash

  $ pwd
  /path/to/your/project
  $ clang++ --analyze \
      -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \
      -Xclang -analyzer-config -Xclang ctu-dir=. \
      -Xclang -analyzer-config -Xclang ctu-invocation-list=invocations.yaml \
      -Xclang -analyzer-output=plist-multi-file \
      main.cpp
  main.cpp:5:12: warning: Division by zero
    return 3 / foo();
           ~~^~~~~~~
  1 warning generated.
  $ # The plist file with the result is generated.
  $ ls -F
  compile_commands.json  externalDefMap.txt  foo.cpp  main.cpp  main.plist
  $

This manual procedure is error-prone and does not scale; to analyze real projects, it is therefore recommended to use
`CodeChecker` (`scan-build-py` does not currently support on-demand analysis).

Automated CTU Analysis with CodeChecker
#######################################
The `CodeChecker <https://github.com/Ericsson/codechecker>`_ project fully supports automated CTU analysis with Clang.
Once we have set up the `PATH` environment variable and activated the Python `venv`, this is all it takes:

.. code-block:: bash

  $ CodeChecker analyze --ctu --ctu-ast-loading-mode on-demand compile_commands.json -o reports
  $ ls -F
  compile_commands.json  foo.cpp  main.cpp  reports/
  $ tree reports
  reports
  ├── compile_cmd.json
  ├── compiler_info.json
  ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist
  ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist
  ├── metadata.json
  └── unique_compile_commands.json

  0 directories, 6 files
  $

The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools.
For example, one may use `CodeChecker parse` to view the results on the command line:

.. code-block:: bash

  $ CodeChecker parse reports
  [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero]
    return 3 / foo();
             ^

  Found 1 defect(s) in main.cpp


  ----==== Summary ====----
  -----------------------
  Filename | Report count
  -----------------------
  main.cpp |            1
  -----------------------
  -----------------------
  Severity | Report count
  -----------------------
  HIGH     |            1
  -----------------------
  ----=================----
  Total number of reports: 1
  ----=================----

Or we can use `CodeChecker parse -e html` to export the results into HTML format:

.. code-block:: bash

  $ CodeChecker parse -e html -o html_out reports
  $ firefox html_out/index.html

Automated CTU Analysis with scan-build-py (don't do it)
#######################################################
We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU.
`scan-build-py` has various errors and issues; expect it to work only with very basic projects.

Currently, on-demand analysis is not supported with `scan-build-py`.