Change the Travis builds to use workspaces and persistent ccache
We proceed in 2 stages (as before, but using workspaces for
persistence):
1. In the 'build' stage, we clone the repo, build it and
save the whole checkout ($TRAVIS_BUILD_DIR) as a workspace
2. In the 'test' stage, rather than cloning the repo, multiple jobs
pull down the same workspace we built to run the tests from
This enables:
- Reuse of the build in multiple test jobs (this is what we used the Travis
cache for before)
- Each job having a separate persistent Travis cache, which now only
contains the ccache. This means all jobs, including 'build' and 'test'
jobs can make maximum use of ccache across runs. This drastically cuts
down build times when the ccache hits, which is very often the case for
'test' jobs. Also, the separate caches only store the objects build by
the particular job that owns the cache, so we can keep the per job
ccache small.
If the commit message contains '[travis ccache clear]', the ccache will
be cleared at the beginning of the build. This can be used to test build
complete within the 50 minute timeout imposed by Travis, even without a
persistent ccache.
This allows compiling the run-time library with optimization even when OPT_FAST is not used in order to imporove model build speed, possibly during debug cycles.
- Packaged SystemC lives in /usr so needed to update regex in test
driver
- Clang 10 complains about mixed named and positional initializers in
struct definitions.
The --trace-threads option can now be used to perform tracing on a
thread separate from the main thread when using VCD tracing (with
--trace-threads 1). For FST tracing --trace-threads can be 1 or 2, and
--trace-fst --trace-threads 1 is the same a what --trace-fst-threads
used to be (which is now deprecated).
Performance numbers on SweRV EH1 CoreMark, clang 6.0.0, Intel i7-3770 @
3.40GHz, IO to ramdisk, with numactl set to schedule threads on different
physical cores. Relative speedup:
--trace -> --trace --trace-threads 1 +22%
--trace-fst -> --trace-fst --trace-threads 1 +38% (as --trace-fst-thread)
--trace-fst -> --trace-fst --trace-threads 2 +93%
Speed relative to --trace with no threaded tracing:
--trace 1.00 x
--trace --trace-threads 1 0.82 x
--trace-fst 1.79 x
--trace-fst --trace-threads 1 1.23 x
--trace-fst --trace-threads 2 0.87 x
This means FST tracing with 2 extra threads is now faster than single
threaded VCD tracing, and is on par with threaded VCD tracing. You do
pay for it in total compute though as --trace-fst --trace-threads 2 uses
about 240% CPU vs 150% for --trace-fst --trace-threads 1, and 155% for
--trace --trace threads 1. Still for interactive use it should be
helpful with large designs.
Includes `timescale, $printtimescale, $timeformat.
VL_TIME_MULTIPLIER, VL_TIME_PRECISION, VL_TIME_UNIT have been removed
and the time precision must now match the SystemC time precision.
To get closer behavior to older versions, use e.g. --timescale-override
"1ps/1ps".