Commit Graph

75 Commits

Author SHA1 Message Date
Geza Lore
dc5c259069
Improve tracing performance. (#2257)
* Improve tracing performance.

Various tactics used to improve performance of both VCD and FST tracing:
- Both: Change tracing functions to templates to take variable widths as
  template parameters. For VCD, subsequently specialize these to the
  values used by Verilator. This avoids redundant instructions and hard
  to predict branches.
- Both: Check for value changes via direct pointer access into the
  previous signal value buffer. This eliminates a lot of simple pointer
  arithmetic instructions form the tracing code.
- Both: Verilator provides clean input, no need to mask out used bits.
- VCD: pre-compute identifier codes and use memory copy instead of
  re-computing them every time a code is emitted. This saves a lot of
  instructions and hard to predict branches. The added D-cache misses
  are cheaper than the removed branches/instructions.
- VCD: re-write the routines emitting the changes to be more efficient.
- FST: Use previous signal value buffer the same way as the VCD tracing
  code, and only call the FST API when a change is detected.

Performance as measured on SweRV EH1, with the pre-canned CoreMark
benchmark running from DCCM/ICCM, clang 6.0.0, Intel i7-3770 @ 3.40GHz,
and IO to ramdisk:

            +--------------+---------------+----------------------+
            | VCD          | FST           | FST separate thread  |
            | (--trace)    | (--trace-fst) | (--trace-fst-thread) |
------------+-----------------------------------------------------+
Before      |  30.2 s      | 121.1 s       |  69.8 s              |
============+==============+===============+======================+
After       |  24.7 s      |  45.7 s       |  32.4 s              |
------------+--------------+---------------+----------------------+
Speedup     |    22 %      |   256 %       |   215 %              |
------------+--------------+---------------+----------------------+
Rel. to VCD |     1 x      |  1.85 x       |  1.31 x              |
------------+--------------+---------------+----------------------+

In addition, FST trace size for the above reduced by 48%.
2020-04-14 00:13:10 +01:00
Nathan Myers
4c1ae4701a
Add assertion for monotonic dump times #2103 (#2237) 2020-04-09 19:00:27 -04:00
Geza Lore
05f213c266
VCD tracing speed improvements (#2246)
* Don't inline VCD dump functions

Improves model speed with tracing. Measured on SweRW cmark:
- GCC 5.5      ~3% faster
- Clang 6.0    ~12% faster (!)

* Remove redundant test from VCD bit tracing.

Improves model speed with tracing. Measured on SweRW cmark:
 - GCC 5.5      ~7.5% faster
 - Clang 6.0    ~1.5% faster
2020-04-09 08:19:26 -04:00
Geza Lore
991d8b178b
Fix FST tracing performance by removing std::map from hot path. (#2244)
This patch eliminates a major piece of inefficiency in FST tracing
support, by using an array to lookup fstHandle values corresponding
to trace codes, instead of a tree based std::map. With this change, FST
tracing is now only about 3x slower than VCD tracing. We do require
more memory to store the symbol lookup table, but the size of that is
still small, for the speed benefit.
2020-04-08 17:54:35 -04:00
Wilson Snyder
e07e9390f6 Internals: clang-format cleanups. No functional change. 2020-04-04 14:09:21 -04:00
Wilson Snyder
1ce360ed5b Add SPDX license identifiers. No functional change. 2020-03-21 11:24:24 -04:00
Wilson Snyder
6c6d70a5e5 Fix FST when multi-model tracing. 2020-03-07 18:39:58 -05:00
Wilson Snyder
905067d13f Fix $dumpvar multithreaded assert, broke last commit. 2020-03-02 07:43:10 -05:00
Wilson Snyder
30a33a6104 Add support for and , #2126. 2020-03-01 21:39:23 -05:00
Wilson Snyder
0ca0e07354 Internals: Vcd doesn't need code when registering. No functional change intended. 2020-02-29 20:42:52 -05:00
Wilson Snyder
9978cbfa5c Fix tracing -1 index arrays. Closes #2090. 2020-01-08 07:32:31 -05:00
Wilson Snyder
f23fe8fd84 Update copyright year. 2020-01-06 18:05:53 -05:00
Wilson Snyder
521418d832 Update FST trace API for better performance. 2019-12-10 18:55:09 -05:00
Wilson Snyder
700f2072c0 Framework for WDatas being vectors of 64-bit EDatas, but not supporting this at this time. 2019-12-08 21:36:38 -05:00
Wilson Snyder
f87107e757 Tests etc: Cleanup some clang-format suggestions. No functional change. 2019-11-09 20:35:12 -05:00
Wilson Snyder
96c70ea2df Internals: Fix some long lines in include files. No functional change.
When merging, recommend using "git merge -Xignore-all-space"
2019-05-14 22:49:32 -04:00
Wilson Snyder
afdfa4df87 Support VerilatedFstC set_time_unit, bug1433. 2019-05-12 08:19:49 -04:00
Wilson Snyder
8a4aeddbb0 Copyright year update. 2019-01-03 19:17:22 -05:00
Wilson Snyder
304a24d03a Internals: Fix many clang-tidy issues. No functional change intended. 2018-10-14 18:39:33 -04:00
Wilson Snyder
d87b9d25ca Internals: Cleanup and standardize include order. No functional change intended. 2018-10-14 13:59:40 -04:00
Wilson Snyder
595419b370 Internals: Sort includes for clang-tidy. No functional change intended. 2018-10-14 07:04:18 -04:00
Wilson Snyder
cc45a3dd72 For --trace-fst, save enum decoding information, bug1358. 2018-10-08 07:21:22 -04:00
Wilson Snyder
d11592cadd Support signal types in FST dumps, bug1358. 2018-10-04 20:24:41 -04:00
Wilson Snyder
8ef9ac7dba Support in/out directions in FST dumps, bug1358. 2018-10-03 19:51:05 -04:00
Sergi Granell
a5aa0e2b0a Add GTKWave FST native tracing, bug1356.
Signed-off-by: Wilson Snyder <wsnyder@wsnyder.org>
2018-10-02 18:42:53 -04:00