Change the Travis builds to use workspaces and persistent ccache
We proceed in 2 stages (as before, but using workspaces for
persistence):
1. In the 'build' stage, we clone the repo, build it and
save the whole checkout ($TRAVIS_BUILD_DIR) as a workspace
2. In the 'test' stage, rather than cloning the repo, multiple jobs
pull down the same workspace we built to run the tests from
This enables:
- Reuse of the build in multiple test jobs (this is what we used the Travis
cache for before)
- Each job having a separate persistent Travis cache, which now only
contains the ccache. This means all jobs, including 'build' and 'test'
jobs can make maximum use of ccache across runs. This drastically cuts
down build times when the ccache hits, which is very often the case for
'test' jobs. Also, the separate caches only store the objects build by
the particular job that owns the cache, so we can keep the per job
ccache small.
If the commit message contains '[travis ccache clear]', the ccache will
be cleared at the beginning of the build. This can be used to test build
complete within the 50 minute timeout imposed by Travis, even without a
persistent ccache.
This provides minor simulation performance benefit, but can provide
large C++ compilation time improvement, notably with Clang (4x).
This patch implements #2366 .
This allows compiling the run-time library with optimization even when OPT_FAST is not used in order to imporove model build speed, possibly during debug cycles.
This adds the flag --generate-waivefile <filename>. This will generate
a verilator config file with the proper lint_off statemens to turn off
warnings emitted during this particular run.
This feature can be used to start with using Verilator as linter and
systematically capture all known lint warning for further
elimination. It hopefully helps people turning of -Wno-fatal or
-Wno-lint and gradually improve their code base.
Signed-off-by: Stefan Wallentowitz <stefan.wallentowitz@hm.edu>
--output-split is now on by default with value 20000.
--output-split-cfuncs and --output-split-ctrace now defaults to the
value of --output-split unless explicitly specified.
Instead of __ALLfast.cpp and __ALLslow.cpp, we now create only a single
__ALL.cpp and compile it with OPT_FAST, this speeds up small builds
where the C compiler does not dominate. A separate patch will follow
turning VM_PARALLEL_BUILDS on by default at a certain size.
Given this change to the build there is now no point in emitting both
fast and slow routines into the same .cpp file when --output-split is
not set as they will be just included in the same __ALL.cpp file. To
keep things simpler and the output easier to comprehend, V3EmitC has
also been changed to always emit the fast and slow files separately.
Also change verilated.mk to apply OPT_SLOW to all slow files, not just
ones called *__Slow.cpp. This change in particular ensures __Syms.cpp
is build as slow.
Part of #2360.
- Improve flag pruning heuristic
- Set all trace activity flags in slow code. This in turns enables us
to remove checking the slow flag on the fast path.
Firstly, we always use a byte array for fine grained activity flags
instead of a bit vector (we used to use a byte array only if we had
parallel mtasks). The byte vector can be set more cheaply in eval,
closing about 1/3 of the gap in performance between compiling with
or without --trace on SweRV EH1. The speed of tracing itself is not
measurably different.
Secondly, we prune the activity tracking such that if a set of activity
flag combinations only guard a small number of signals, we will turn
those signals into awayls traced signals. This avoids code which
sometimes tests dozens of activity flags just to subsequently check one
signal and dump it if it's value changed. We can just check the signal
state straight instead, and not bother with the flags. This removes
about 30% of activity flags in SweRV EH1, and makes both single threaded
VCD and FST tracing 8-9% faster.