This provides minor simulation performance benefit, but can provide
large C++ compilation time improvement, notably with Clang (4x).
This patch implements #2366 .
This adds the flag --generate-waivefile <filename>. This will generate
a verilator config file with the proper lint_off statemens to turn off
warnings emitted during this particular run.
This feature can be used to start with using Verilator as linter and
systematically capture all known lint warning for further
elimination. It hopefully helps people turning of -Wno-fatal or
-Wno-lint and gradually improve their code base.
Signed-off-by: Stefan Wallentowitz <stefan.wallentowitz@hm.edu>
Convert trace buffer to 32-bit entries, rather than a union containing a
pointer type. Also tweaked trace entry layouts for a bit more
performance. This gains another 10% on SweRV EH1 CoreMark.
The --trace-threads option can now be used to perform tracing on a
thread separate from the main thread when using VCD tracing (with
--trace-threads 1). For FST tracing --trace-threads can be 1 or 2, and
--trace-fst --trace-threads 1 is the same a what --trace-fst-threads
used to be (which is now deprecated).
Performance numbers on SweRV EH1 CoreMark, clang 6.0.0, Intel i7-3770 @
3.40GHz, IO to ramdisk, with numactl set to schedule threads on different
physical cores. Relative speedup:
--trace -> --trace --trace-threads 1 +22%
--trace-fst -> --trace-fst --trace-threads 1 +38% (as --trace-fst-thread)
--trace-fst -> --trace-fst --trace-threads 2 +93%
Speed relative to --trace with no threaded tracing:
--trace 1.00 x
--trace --trace-threads 1 0.82 x
--trace-fst 1.79 x
--trace-fst --trace-threads 1 1.23 x
--trace-fst --trace-threads 2 0.87 x
This means FST tracing with 2 extra threads is now faster than single
threaded VCD tracing, and is on par with threaded VCD tracing. You do
pay for it in total compute though as --trace-fst --trace-threads 2 uses
about 240% CPU vs 150% for --trace-fst --trace-threads 1, and 155% for
--trace --trace threads 1. Still for interactive use it should be
helpful with large designs.
Includes `timescale, $printtimescale, $timeformat.
VL_TIME_MULTIPLIER, VL_TIME_PRECISION, VL_TIME_UNIT have been removed
and the time precision must now match the SystemC time precision.
To get closer behavior to older versions, use e.g. --timescale-override
"1ps/1ps".
This switch exposes VARs, PORTs and WIREs to C++ code. It must be use
with care as it has a significant performance impact and may result in
mis-simulation of generated clocks. Anyhow, it is prefered over
--public and useful for VPI.
Signed-off-by: Lukasz Dalek <ldalek@antmicro.com>
Signed-off-by: Stefan Wallentowitz <stefan@wallentowitz.de>
Signed-off-by: Wilson Snyder <wsnyder@wsnyder.org>