Commit Graph

496 Commits

Author SHA1 Message Date
Geza Lore
c96a43b452
Fix unused variable in VL_READMEM_N (#2274) 2020-04-22 17:25:35 -04:00
Geza Lore
c52f3349d1
Initial implementation of generic multithreaded tracing (#2269)
The --trace-threads option can now be used to perform tracing on a
thread separate from the main thread when using VCD tracing (with
--trace-threads 1). For FST tracing --trace-threads can be 1 or 2, and
--trace-fst --trace-threads 1 is the same a what --trace-fst-threads
used to be (which is now deprecated).

Performance numbers on SweRV EH1 CoreMark, clang 6.0.0, Intel i7-3770 @
3.40GHz, IO to ramdisk, with numactl set to schedule threads on different
physical cores. Relative speedup:

--trace     ->  --trace --trace-threads 1      +22%
--trace-fst ->  --trace-fst --trace-threads 1  +38% (as --trace-fst-thread)
--trace-fst ->  --trace-fst --trace-threads 2  +93%

Speed relative to --trace with no threaded tracing:
--trace                                 1.00 x
--trace --trace-threads 1               0.82 x
--trace-fst                             1.79 x
--trace-fst --trace-threads 1           1.23 x
--trace-fst --trace-threads 2           0.87 x

This means FST tracing with 2 extra threads is now faster than single
threaded VCD tracing, and is on par with threaded VCD tracing. You do
pay for it in total compute though as --trace-fst --trace-threads 2 uses
about 240% CPU vs 150% for --trace-fst --trace-threads 1, and 155% for
--trace --trace threads 1. Still for interactive use it should be
helpful with large designs.
2020-04-21 23:49:07 +01:00
Wilson Snyder
174fd1bf0e Codacy cleanups. No functional change. 2020-04-20 22:01:47 -04:00
Geza Lore
39d903375b
Factor out trace implementation common to all formats. (#2268)
This patch de-duplicates common functionality between the VCD and FST
trace implementation. It also enables adding new trace formats more
easily and consistently.

No functional nor performance change intended.
2020-04-19 23:57:36 +01:00
Geza Lore
6a54922044
Set FST timescale correctly. (#2266)
The FST trace timescale used to be set in the constructor via
set_time_unit, but at that point we haven't normally opened the
file yet so it was just dropped. On top of that, we actually want
to use set_time_resolution... FST trace timescales now match the VCD.
2020-04-19 08:47:22 -04:00
Geza Lore
74e16d85c5
Fix FST trace initial time stamp. (#2264)
If the first dump was not at time zero, then the FST trace used
to contain the initial values as if they were set at time zero. Now
they only appear at the time the first dump call is actually made,
and hence match the VCD trace exactly.
2020-04-18 18:54:02 -04:00
Wilson Snyder
e6f345e45d Internal: clang-tidy fixes. No functional change. 2020-04-15 21:47:37 -04:00
Wilson Snyder
d4f7f5297a
Support IEEE time units and time precisions, #234. (#2253)
Includes `timescale, $printtimescale, $timeformat.
VL_TIME_MULTIPLIER, VL_TIME_PRECISION, VL_TIME_UNIT have been removed
and the time precision must now match the SystemC time precision.
To get closer behavior to older versions, use e.g. --timescale-override
"1ps/1ps".
2020-04-15 19:39:03 -04:00
Wilson Snyder
5c966ec510 clang-format many files. No functional change.
Use nodist/clang_formatter to reformat files that are now clean.
2020-04-13 22:52:23 -04:00
Geza Lore
dc5c259069
Improve tracing performance. (#2257)
* Improve tracing performance.

Various tactics used to improve performance of both VCD and FST tracing:
- Both: Change tracing functions to templates to take variable widths as
  template parameters. For VCD, subsequently specialize these to the
  values used by Verilator. This avoids redundant instructions and hard
  to predict branches.
- Both: Check for value changes via direct pointer access into the
  previous signal value buffer. This eliminates a lot of simple pointer
  arithmetic instructions form the tracing code.
- Both: Verilator provides clean input, no need to mask out used bits.
- VCD: pre-compute identifier codes and use memory copy instead of
  re-computing them every time a code is emitted. This saves a lot of
  instructions and hard to predict branches. The added D-cache misses
  are cheaper than the removed branches/instructions.
- VCD: re-write the routines emitting the changes to be more efficient.
- FST: Use previous signal value buffer the same way as the VCD tracing
  code, and only call the FST API when a change is detected.

Performance as measured on SweRV EH1, with the pre-canned CoreMark
benchmark running from DCCM/ICCM, clang 6.0.0, Intel i7-3770 @ 3.40GHz,
and IO to ramdisk:

            +--------------+---------------+----------------------+
            | VCD          | FST           | FST separate thread  |
            | (--trace)    | (--trace-fst) | (--trace-fst-thread) |
------------+-----------------------------------------------------+
Before      |  30.2 s      | 121.1 s       |  69.8 s              |
============+==============+===============+======================+
After       |  24.7 s      |  45.7 s       |  32.4 s              |
------------+--------------+---------------+----------------------+
Speedup     |    22 %      |   256 %       |   215 %              |
------------+--------------+---------------+----------------------+
Rel. to VCD |     1 x      |  1.85 x       |  1.31 x              |
------------+--------------+---------------+----------------------+

In addition, FST trace size for the above reduced by 48%.
2020-04-14 00:13:10 +01:00
Wilson Snyder
dc27a179e2 Always define VL_SIG etc; conditional definitions were long removed SystemPerl. 2020-04-13 19:07:56 -04:00
Wilson Snyder
ef211fc9e0 Make sure SystemC always included in -sc mode to prevent ordering issues. 2020-04-11 10:33:40 -04:00
Nathan Myers
4c1ae4701a
Add assertion for monotonic dump times #2103 (#2237) 2020-04-09 19:00:27 -04:00
Geza Lore
05f213c266
VCD tracing speed improvements (#2246)
* Don't inline VCD dump functions

Improves model speed with tracing. Measured on SweRW cmark:
- GCC 5.5      ~3% faster
- Clang 6.0    ~12% faster (!)

* Remove redundant test from VCD bit tracing.

Improves model speed with tracing. Measured on SweRW cmark:
 - GCC 5.5      ~7.5% faster
 - Clang 6.0    ~1.5% faster
2020-04-09 08:19:26 -04:00
Geza Lore
0f617988d4
Compile fast tracing code with OPT_FAST in single compile mode. (#2245)
When using the __ALL*.cpp based single compile mode (i.e.: without
VM_PARALLEL_BUILDS), the fast path tracing code used to be included in
__Allsup.cpp, which was compiled with OPT_SLOW, severely harming tracing
performance. We now have __ALLfast.cpp and __ALLslow.cpp instead of
__ALLcls.cpp and __ALLsup.cpp, so we can compile the fast support code
with OPT_FAST as well.
2020-04-08 21:05:43 -04:00
Geza Lore
991d8b178b
Fix FST tracing performance by removing std::map from hot path. (#2244)
This patch eliminates a major piece of inefficiency in FST tracing
support, by using an array to lookup fstHandle values corresponding
to trace codes, instead of a tree based std::map. With this change, FST
tracing is now only about 3x slower than VCD tracing. We do require
more memory to store the symbol lookup table, but the size of that is
still small, for the speed benefit.
2020-04-08 17:54:35 -04:00
Wilson Snyder
914a6edd33 Add error if use SystemC 2.2 and earlier (pre-2011) as is deprecated. 2020-04-07 19:58:17 -04:00
Geza Lore
0cfa828572 Fix DPI import/export to be standard compliant, #2236. 2020-04-07 19:07:47 -04:00
Wilson Snyder
cba05480ba Fix clang warning. 2020-04-06 20:13:24 -04:00
Wilson Snyder
2abbae8dd0 Internals: Remove strncpy to appease codacity. 2020-04-06 19:26:31 -04:00
Wilson Snyder
50535a1894 Internals: cppcheck 1.90 fixes. No functional change intended. 2020-04-05 18:57:47 -04:00
Wilson Snyder
763f621d4c Deprecate VL_ULL. 2020-04-05 16:45:53 -04:00
Wilson Snyder
efaf375887 Configuring with ccache present now defaults to using it; see OBJCACHE in the manual. 2020-04-05 16:10:33 -04:00
Wilson Snyder
a494ad5ec7 Support $ferror, #1638. 2020-04-05 11:22:05 -04:00
Wilson Snyder
e55338f927 Support $fflush without arguments, #1638. 2020-04-05 10:11:28 -04:00
Wilson Snyder
6eadb8e771 Add simplistic class support with many restrictions, see manual, #377. 2020-04-05 09:30:23 -04:00
Wilson Snyder
9fdb026e95 Add VM_C11 for future need of C++11 2020-04-04 20:48:03 -04:00
Wilson Snyder
5302a9d0e6 Internals: clang-format cleanups. No functional change. 2020-04-04 17:55:37 -04:00
Wilson Snyder
e07e9390f6 Internals: clang-format cleanups. No functional change. 2020-04-04 14:09:21 -04:00
Wilson Snyder
a13eab55f5 Internals: Add missing VL_DO_CLEARs. No functional change. 2020-04-04 13:06:31 -04:00
Wilson Snyder
38a31ae168 Cleanup misc clang-tidy warnings. No functional change intended 2020-04-03 22:31:54 -04:00
Wilson Snyder
6f4a8fe695 Fix Travis-CI failures. 2020-04-02 09:22:10 -04:00
Wilson Snyder
4361c4b57a Add vlsint8_t types. 2020-03-31 21:30:18 -04:00
Wilson Snyder
e6beab4037 Fix implicit conversion of floats to wide integers. 2020-03-31 20:42:07 -04:00
Sean Cross
a1a2650f1e
Modernize va args (#2214)
Verilator uses a form of variadic macros that are nonstandard, making it
unable to be compiled under MSVC.  Replace the old synax with the
standard syntax.  This fixes MSVC usage.

Signed-off-by: Sean Cross <sean@xobs.io>
2020-03-29 10:29:12 -04:00
Matthew Ballance
510be53521
Expose VPI cbNextDeadline via the public API (#2212)
Signed-off-by: Matthew Ballance <matt.ballance@gmail.com>
2020-03-28 13:47:21 -04:00
Wilson Snyder
08a51e3e09 Fix VCD open with empty filename, #2198. 2020-03-24 17:32:47 -04:00
Wilson Snyder
75ebe7a4be Update gtkwave from upstream. 2020-03-21 21:45:57 -04:00
Wilson Snyder
1ce360ed5b Add SPDX license identifiers. No functional change. 2020-03-21 11:24:24 -04:00
Wilson Snyder
5d9d1cc09f Internals: Favor const_iterator (in a few obvious spots, more needed). No functional change. 2020-03-15 18:34:09 -04:00
Wilson Snyder
6c6d70a5e5 Fix FST when multi-model tracing. 2020-03-07 18:39:58 -05:00
Wilson Snyder
e70cba77e6 Add support for dynamic arrays, #379. 2020-03-07 10:24:27 -05:00
Wilson Snyder
905067d13f Fix $dumpvar multithreaded assert, broke last commit. 2020-03-02 07:43:10 -05:00
Wilson Snyder
30a33a6104 Add support for and , #2126. 2020-03-01 21:39:23 -05:00
Wilson Snyder
0ca0e07354 Internals: Vcd doesn't need code when registering. No functional change intended. 2020-02-29 20:42:52 -05:00
Wilson Snyder
991d81cd0a Recommend -Os. 2020-02-27 07:46:34 -05:00
Wilson Snyder
23eb96579c Fix duplicate __STDC_FORMAT_MACROS definition. 2020-02-24 18:11:41 -05:00
Wilson Snyder
28e19cef90 Fix undeclared VL_SHIFTR_WWQ, #2114. 2020-02-23 19:33:37 -05:00
Tobias Wölfel
18f8cd0529
Allow assert disable (#2168)
* Add +verilator+noassert flag

This allows to disable the assert check per simulation argument.

* Add AssertOn check for assert

Insert the check AssertOn to allow disabling of asserts.
Asserts can be disabled by not using the `--assert` flag or by calling
`AssertOn(false)`, or passing the "+verilator+noassert" runtime flag.
Add tests for this behavior.
Bad tests check that the assert still causes a stop.
Non bad tests check that asserts are properly disabled and cause no stop
of the simulation.

Fixes #2162.

Signed-off-by: Tobias Wölfel <tobias.woelfel@mailbox.org>

* Correct file location

Signed-off-by: Tobias Wölfel <tobias.woelfel@mailbox.org>

* Add description for single test execution

Without this description it is not obvious how to run a single test from
the regression test suite.

Signed-off-by: Tobias Wölfel <tobias.woelfel@mailbox.org>
2020-02-15 18:17:23 -06:00
Wilson Snyder
70358e8587 Fix compiler warning. 2020-02-08 10:58:07 -05:00