mirror of
https://github.com/llvm/llvm-project.git
synced 2025-04-25 02:16:05 +00:00

Add probe inline tree information to YAML profile, at function level: - function GUID, - checksum, - parent node id, - call site in the parent. This information is used for pseudo probe block matching (#99891). The encoding adds/changes probe information in multiple levels of YAML profile: - BinaryProfile: add pseudo_probe_desc with GUIDs and Hashes, which permits deduplication of data: - many GUIDs are duplicate as the same callee is commonly inlined into multiple callers, - hashes are also very repetitive, especially for functions with low block counts. - FunctionProfile: add inline tree (see above). Top-level function is included as root of function inline tree, which makes guid and pseudo_probe_desc_hash fields redundant. - BlockProfile: densely-encoded block probe information: - probes reference their containing inline tree node, - separate lists for block, call, indirect call probes, - block probe encoding is specialized: ids are encoded as bitset in uint64_t. If only block probe with id=1 is present, it's encoded as implicit entry (id=0, omitted). - inline tree nodes with identical probes share probe description where node indices are combined into a list. On top of #107970, profile with new probe encoding has the following characteristics (profile for a large binary): - Profile without probe information: 33MB, 3.8MB compressed (baseline). - Profile with inline tree information: 92MB, 14MB compressed. Profile processing time (YAML parsing, inference, attaching steps): - profile without pseudo probes: 5s, - profile with pseudo probes, without pseudo probe matching: 11s, - with pseudo probe matching: 12.5s. Test Plan: updated pseudoprobe-decoding-inline.test Reviewers: wlei-llvm, ayermolo, rafaelauler, dcci, maksfb Reviewed By: wlei-llvm, rafaelauler Pull Request: https://github.com/llvm/llvm-project/pull/107137
114 lines
5.1 KiB
Plaintext
114 lines
5.1 KiB
Plaintext
# REQUIRES: system-linux
|
|
# RUN: llvm-bolt %S/../../../llvm/test/tools/llvm-profgen/Inputs/noinline-cs-pseudoprobe.perfbin --print-pseudo-probes=all -o %t.bolt --lite=0 --enable-bat 2>&1 | FileCheck %s
|
|
|
|
# PREAGG: B X:0 #foo# 1 0
|
|
# PREAGG: B X:0 #bar# 1 0
|
|
# PREAGG: B X:0 #main# 1 0
|
|
|
|
## Check pseudo-probes in regular YAML profile (non-BOLTed binary)
|
|
# RUN: link_fdata %s %S/../../../llvm/test/tools/llvm-profgen/Inputs/noinline-cs-pseudoprobe.perfbin %t.preagg PREAGG
|
|
# RUN: perf2bolt %S/../../../llvm/test/tools/llvm-profgen/Inputs/noinline-cs-pseudoprobe.perfbin -p %t.preagg --pa -w %t.yaml -o %t.fdata --profile-write-pseudo-probes
|
|
# RUN: FileCheck --input-file %t.yaml %s --check-prefix CHECK-YAML
|
|
## Check pseudo-probes in BAT YAML profile (BOLTed binary)
|
|
# RUN: link_fdata %s %t.bolt %t.preagg2 PREAGG
|
|
# RUN: perf2bolt %t.bolt -p %t.preagg2 --pa -w %t.yaml2 -o %t.fdata2 --profile-write-pseudo-probes
|
|
# RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
|
|
# CHECK-YAML: name: bar
|
|
# CHECK-YAML: - bid: 0
|
|
# CHECK-YAML: probes: [ { blx: 9 } ]
|
|
# CHECK-YAML: inline_tree: [ { } ]
|
|
#
|
|
# CHECK-YAML: name: foo
|
|
# CHECK-YAML: - bid: 0
|
|
# CHECK-YAML: probes: [ { blx: 3 } ]
|
|
# CHECK-YAML: inline_tree: [ { g: 2 } ]
|
|
#
|
|
# CHECK-YAML: name: main
|
|
# CHECK-YAML: - bid: 0
|
|
# CHECK-YAML: probes: [ { blx: 1, call: [ 2 ] } ]
|
|
# CHECK-YAML: inline_tree: [ { g: 1 } ]
|
|
#
|
|
# CHECK-YAML: pseudo_probe_desc:
|
|
# CHECK-YAML-NEXT: gs: [ 0xE413754A191DB537, 0xDB956436E78DD5FA, 0x5CF8C24CDB18BDAC ]
|
|
# CHECK-YAML-NEXT: gh: [ 2, 1, 0 ]
|
|
# CHECK-YAML-NEXT: hs: [ 0x200205A19C5B4, 0x10000FFFFFFFF, 0x10E852DA94 ]
|
|
#
|
|
## Check that without --profile-write-pseudo-probes option, no pseudo probes are
|
|
## generated
|
|
# RUN: perf2bolt %S/../../../llvm/test/tools/llvm-profgen/Inputs/noinline-cs-pseudoprobe.perfbin -p %t.preagg --pa -w %t.yaml3 -o %t.fdata
|
|
# RUN: FileCheck --input-file %t.yaml3 %s --check-prefix CHECK-NO-OPT
|
|
# CHECK-NO-OPT-NOT: probes:
|
|
# CHECK-NO-OPT-NOT: inline_tree:
|
|
# CHECK-NO-OPT-NOT: pseudo_probe_desc:
|
|
;; Report of decoding input pseudo probe binaries
|
|
|
|
; CHECK: GUID: 6699318081062747564 Name: foo
|
|
; CHECK: Hash: 563088904013236
|
|
; CHECK: GUID: 15822663052811949562 Name: main
|
|
; CHECK: Hash: 281479271677951
|
|
; CHECK: GUID: 16434608426314478903 Name: bar
|
|
; CHECK: Hash: 72617220756
|
|
|
|
CHECK: [Probe]: FUNC: bar Index: 1 Type: Block
|
|
CHECK: [Probe]: FUNC: bar Index: 4 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 1 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 2 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 5 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 6 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 2 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 3 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 4 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 8 Type: DirectCall
|
|
CHECK: [Probe]: FUNC: foo Index: 6 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 2 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 7 Type: Block
|
|
CHECK: [Probe]: FUNC: foo Index: 9 Type: DirectCall
|
|
CHECK: [Probe]: FUNC: main Index: 1 Type: Block
|
|
CHECK: [Probe]: FUNC: main Index: 2 Type: DirectCall
|
|
|
|
CHECK: Pseudo Probe Address Conversion results:
|
|
|
|
CHECK: Address: 0x201760 FUNC: bar Index: 1 Type: Block
|
|
CHECK: Address: 0x201760 FUNC: bar Index: 4 Type: Block
|
|
CHECK: Address: 0x201780 FUNC: foo Index: 1 Type: Block
|
|
CHECK: Address: 0x201780 FUNC: foo Index: 2 Type: Block
|
|
CHECK: Address: 0x20178d FUNC: foo Index: 5 Type: Block
|
|
CHECK: Address: 0x20178d FUNC: foo Index: 6 Type: Block
|
|
CHECK: Address: 0x20178d FUNC: foo Index: 2 Type: Block
|
|
CHECK: Address: 0x20179b FUNC: foo Index: 3 Type: Block
|
|
CHECK: Address: 0x2017ba FUNC: foo Index: 4 Type: Block
|
|
CHECK: Address: 0x2017bc FUNC: foo Index: 8 Type: DirectCall
|
|
CHECK: Address: 0x2017ba FUNC: foo Index: 6 Type: Block
|
|
CHECK: Address: 0x2017ba FUNC: foo Index: 2 Type: Block
|
|
CHECK: Address: 0x2017ce FUNC: foo Index: 7 Type: Block
|
|
CHECK: Address: 0x2017d5 FUNC: foo Index: 9 Type: DirectCall
|
|
CHECK: Address: 0x2017f0 FUNC: main Index: 1 Type: Block
|
|
CHECK: Address: 0x2017f4 FUNC: main Index: 2 Type: DirectCall
|
|
|
|
CHECK: Address: 2103136
|
|
CHECK-NEXT: [Probe]: FUNC: bar Index: 1 Type: Block
|
|
CHECK-NEXT: [Probe]: FUNC: bar Index: 4 Type: Block
|
|
CHECK-NEXT: Address: 2103168
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 1 Type: Block
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block
|
|
CHECK-NEXT: Address: 2103181
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 5 Type: Block
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 6 Type: Block
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block
|
|
CHECK-NEXT: Address: 2103195
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 3 Type: Block
|
|
CHECK-NEXT: Address: 2103226
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 4 Type: Block
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 6 Type: Block
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block
|
|
CHECK-NEXT: Address: 2103228
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 8 Type: DirectCall
|
|
CHECK-NEXT: Address: 2103246
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 7 Type: Block
|
|
CHECK-NEXT: Address: 2103253
|
|
CHECK-NEXT: [Probe]: FUNC: foo Index: 9 Type: DirectCall
|
|
CHECK-NEXT: Address: 2103280
|
|
CHECK-NEXT: [Probe]: FUNC: main Index: 1 Type: Block
|
|
CHECK-NEXT: Address: 2103284
|
|
CHECK-NEXT: [Probe]: FUNC: main Index: 2 Type: DirectCall
|