verilator/test_regress/t/t_gantt_io_arm.out
Geza Lore 4c0edd2efb Improve --prof-exec infrastructure and report
Again --prof-exec has bit-rotted a little with all the recent changes
to the structure of the generated code. This patch contains a few
improvements:
- Replace the eval/eval_loop begin/end events with generic
  section_push/section_pop events, which can be sprinkled arbitrarily
  into the generated code (so long as they are matched correctly) to
  measure various sections. The report then contains a nested profile
  of the sections, and the VCD trace shows the section names.
- Better handling of exec graphs
- Clearer overall statistics
2023-10-21 21:09:03 +01:00
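
The commit message describes matched section_push/section_pop events producing a nested profile. The sketch below shows one way such events could be folded into a tree. It is not taken from the Verilator sources: the event tuple layout `(tick, kind, name)`, the function name `nested_profile`, and the dictionary-based tree are all assumptions made for illustration.

```python
def nested_profile(events):
    """Fold matched push/pop events into a nested profile.

    `events` is an iterable of (tick, kind, name) tuples, where kind is
    "push" or "pop". This layout is hypothetical, not Verilator's format.
    """
    root = {"name": "<root>", "ticks": 0, "children": {}}
    # Each stack entry is (node, tick at which the section was entered).
    stack = [(root, None)]
    for tick, kind, name in events:
        if kind == "push":
            parent = stack[-1][0]
            node = parent["children"].setdefault(
                name, {"name": name, "ticks": 0, "children": {}})
            stack.append((node, tick))
        elif kind == "pop":
            if len(stack) == 1:
                raise ValueError("section_pop without matching section_push")
            node, start = stack.pop()
            node["ticks"] += tick - start
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    if len(stack) != 1:
        raise ValueError("section_push without matching section_pop")
    return root


# Example with made-up ticks: a "loop" section nested inside "eval".
profile = nested_profile([
    (0, "push", "eval"),
    (10, "push", "loop"),
    (30, "pop", None),
    (50, "pop", None),
])
eval_node = profile["children"]["eval"]
print(eval_node["ticks"], eval_node["children"]["loop"]["ticks"])
```

The explicit stack is what enforces the "matched correctly" requirement: an unbalanced pop or a leftover push raises an error instead of silently skewing the totals.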


Verilator Gantt report

Argument settings:
  +verilator+prof+exec+start+1
  +verilator+prof+exec+window+2

Summary:
  Total elapsed time = 300000 rdtsc ticks
  Parallelized code = 80.49% of elapsed time
  Total threads = 2
  Total CPUs used = 2
  Total mtasks = 5
  Total yields = 51

Parallelized code, measured:
  Thread utilization = 42.50%
  Speedup = 0.85x

Parallelized code, predicted during static scheduling:
  Thread utilization = 82.44%
  Speedup = 1.65x

All code, measured:
  Thread utilization = 43.96%
  Speedup = 0.879x

All code, measured, scaled by predicted speedup:
  Thread utilization = 72.06%
  Speedup = 1.44x

MTask statistics:
  Longest mtask id = 79
  Longest mtask time = 57.05% of time elapsed in parallelized code
  min log(p2e) = -1.054 from mtask 79 (predict 48001, elapsed 137754)
  max log(p2e) = 3.641 from mtask 87 (predict 33809, elapsed 887)
  mean = 1.656
  stddev = 2.104
  e ^ stddev = 8.200

CPU info:
   Id | Time spent executing MTask | Socket | Core | Model
      | % of elapsed ticks / ticks |        |      |
  ====|============================|========|======|======
    2 |    67.44% / 202323         |        |      | Phytium,FT-2500/128
    3 |     0.97% /   2914         |        |      | Phytium,FT-2500/128
Writing profile_exec.vcd
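
The figures in the report above are consistent with two simple relationships: log(p2e) as the natural logarithm of predicted over elapsed time, and speedup as thread utilization multiplied by the thread count. Both are inferred from the numbers in this file, not taken from the verilator_gantt source, and the helper names `log_p2e` and `speedup` are hypothetical. A minimal sketch for recomputing them:

```python
import math


def log_p2e(predict, elapsed):
    """Natural log of the predicted-to-elapsed ratio (inferred definition)."""
    return math.log(predict / elapsed)


def speedup(utilization_percent, threads):
    """Speedup as utilization times thread count (inferred relationship)."""
    return utilization_percent / 100.0 * threads


# Values copied from the report above.
print(f"mtask 79 log(p2e) = {log_p2e(48001, 137754):.3f}")
print(f"mtask 87 log(p2e) = {log_p2e(33809, 887):.3f}")
print(f"measured parallel speedup = {speedup(42.50, 2):.2f}x")
print(f"predicted parallel speedup = {speedup(82.44, 2):.2f}x")
```

A negative log(p2e) means the mtask ran longer than the static scheduler predicted, and a positive one means it finished sooner than predicted.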