806 Commits

Author SHA1 Message Date
Geza Lore
b51f887567
Perform VCD tracing in parallel when using --threads (#3449)
VCD tracing is now parallelized using the same thread pool as the model.
We achieve this by breaking the top level trace functions into multiple
top level functions (as many as --threads), and after emitting the time
stamp to the VCD file on the main thread, we execute the tracing
functions in parallel on the same thread pool as the model (which we
pass to the trace file during registration), tracing into a secondary
per thread buffer. The main thread will then stitch (memcpy) the buffers
together into the output file.

This makes the `--trace-threads` option redundant with `--trace`, which
now only affects `--trace-fst`. FST tracing uses the previous offloading
scheme.

This obviously helps a lot in VCD tracing performance, and I have seen
better than Amdahl speedup, namely I get 3.9x on XiangShan 4T (2.7x on
OpenTitan 4T).
2022-05-29 19:08:39 +01:00
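
The scheme described in b51f887567 (time stamp on the main thread, parallel tracing into per-thread buffers, then stitching) might be sketched roughly as follows. This is an illustrative sketch only, not Verilator's code: `TraceFn` and `dumpTimeStep` are hypothetical names, and the sketch spawns plain `std::thread` workers, whereas the commit reuses the model's existing thread pool.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for one of the split top-level trace functions:
// it appends its share of the value changes to a private buffer.
using TraceFn = std::function<void(std::string& buffer)>;

// Emit one time step. Assumption: std::thread workers replace the model's
// thread pool, purely to keep the sketch self-contained.
inline void dumpTimeStep(std::string& out, unsigned long long time,
                         const std::vector<TraceFn>& traceFns) {
    // 1. The main thread writes the time stamp before any worker runs.
    out += "#" + std::to_string(time) + "\n";

    // 2. One private buffer per trace function, so workers share no state.
    std::vector<std::string> buffers(traceFns.size());
    std::vector<std::thread> workers;
    workers.reserve(traceFns.size());
    for (std::size_t i = 0; i < traceFns.size(); ++i) {
        workers.emplace_back([&traceFns, &buffers, i] { traceFns[i](buffers[i]); });
    }
    for (std::thread& worker : workers) worker.join();

    // 3. The main thread stitches the buffers together in a fixed order,
    //    so the output does not depend on how the workers were scheduled.
    for (const std::string& buffer : buffers) out += buffer;
}
```

Because each worker only touches its own buffer, no locking is needed during tracing; the only serial parts are the time stamp and the final copy.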
Geza Lore
c4b8675d77 Always inline some small, hot trace routines 2022-05-28 12:47:09 +01:00
Geza Lore
a7cd7a1ed9 Initialize VerilatedTrace members in class 2022-05-28 12:47:07 +01:00
Geza Lore
a48c779367 Rename verilated_trace_imp.cpp -> verilated_trace_imp.h
Also fix file header to describe purpose of this file.
2022-05-28 12:20:35 +01:00
Geza Lore
cf1eccc24f Make local function 'static' in verilated_profiler.h 2022-05-28 12:17:39 +01:00
Geza Lore
d45caca011 Remove legacy VCD tracing API
This has not been used by Verilator for a while, but was kept for
compatibility with some external code. Now removed.
2022-05-28 12:07:24 +01:00
Geza Lore
b130a8cfeb Add -DVM_TRACE_VCD in model builds with Make with --trace 2022-05-20 16:44:38 +01:00
Geza Lore
551bd284dd Rename some internals related to multi-threaded tracing
Rename the implementation internals of current multi-threaded tracing to
be "offload mode". No functional change, nor user interface change
intended.
2022-05-20 16:44:35 +01:00
Wilson Snyder
99bdc27be3 Internals: Cleanup some statics, trivial part towards (#3419) 2022-05-15 14:26:55 -04:00
Wilson Snyder
c3c46967dc Tests: Appease sanitizer (#3121). 2022-05-15 11:50:52 -04:00
Wilson Snyder
5aa12e9b51 Add assert when VerilatedContext is mis-deleted (#3121). 2022-05-15 10:51:03 -04:00
Wilson Snyder
f6035447ae Internals: Use mutable for mutexes. No functional change. 2022-05-13 07:21:39 -04:00
Wilson Snyder
38438b3373 Internals: Cleanup some defaults. No functional change. 2022-05-12 23:30:39 -04:00
HungMingWu
880a9be3b1
Internal: Add C++20ish reverse_view for range loops. No functional change (#3388).
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-04-18 13:03:56 -04:00
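
For readers unfamiliar with the idea behind 880a9be3b1, a reverse view for range-based loops can be sketched as a thin adaptor over a container's reverse iterators. The names `ReverseView` and `reverseView` below are hypothetical and the sketch assumes C++14 return type deduction; it is not the implementation added by the commit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal adaptor: exposes a container's reverse iterators
// through begin()/end() so a range-based for loop walks it backwards.
template <typename Container>
class ReverseView final {
    Container& m_container;  // Not owned; must outlive the view

public:
    explicit ReverseView(Container& container)
        : m_container(container) {}
    auto begin() const { return m_container.rbegin(); }
    auto end() const { return m_container.rend(); }
};

// Helper so the container type is deduced at the call site.
// Only accepts lvalues, which avoids dangling references to temporaries.
template <typename Container>
ReverseView<Container> reverseView(Container& container) {
    return ReverseView<Container>(container);
}
```

A loop such as `for (int x : reverseView(values)) { ... }` then visits the elements of `values` from last to first without copying the container.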
Wilson Snyder
33105f017c Commentary 2022-03-30 20:17:59 -04:00
Wilson Snyder
e02f97854c Deprecate 'vluint64_t' and similar types (#3255). 2022-03-27 15:27:40 -04:00
Wilson Snyder
3f7bf3d2dc Fix MSVC localtime_s (#3124). 2022-03-27 13:59:18 -04:00
Geza Lore
b1b5b5dfe2 Improve run-time profiling
The --prof-threads option has been split into two independent options:
1. --prof-exec, for collecting verilator_gantt and other execution
related profiling data, and
2. --prof-pgo, for collecting data needed for PGO

The implementation of execution profiling is extricated from
VlThreadPool and is now a separate class VlExecutionProfiler. This means
--prof-exec can now be used for single-threaded models (though it does
not measure a lot of things just yet). For consistency VerilatedProfiler
is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are
in verilated_profiler.{h/cpp}, but can be used completely independently.

Also re-worked the execution profile format so it now only emits events
without holding onto any temporaries. This is in preparation for some
future optimizations that would be hindered by the introduction of function
locals via AstText.

Also removed the Barrier event. Clearing the profile buffers is not
notably more expensive as the profiling records are trivially
destructible.
2022-03-27 15:57:30 +02:00
Geza Lore
c7440b250f Validate integer run-time arguments 2022-03-26 22:58:47 +00:00
Geza Lore
bab8462789 Rebuild run-time library if generated makefile changes
The generated makefile contains compiler options that are passed when
building the run-time library, so re-build if it changes.
2022-03-26 21:29:03 +00:00
Xi Zhang
14d24213a8
Support LoongArch ISA multithreading (#3353) (#3354) 2022-03-17 09:04:47 -04:00
Wilson Snyder
b5ce7d5982 Add VERILATOR_VERSION_INTEGER for determining API (#3343). 2022-03-12 11:17:39 -05:00
Wilson Snyder
ef87d057fc Fix $fscanf etc to return -1 on EOF (#3113). 2022-03-07 17:43:33 -05:00
Wilson Snyder
321880f5a6 Add trace dumpvars() call for selective runtime tracing (#3322). 2022-03-05 15:44:32 -05:00
Wilson Snyder
956f64c6ba Fix compile error with --trace-fst --sc (#3332). 2022-03-02 07:26:26 -05:00
Jamie Iles
b6ca2a42f2
Fix FST traces to include vector range (#3296) (#3297) 2022-02-26 12:52:24 -05:00
Wilson Snyder
e52a4ac74f Fix $readmem file not found to be warning not error (#3310). 2022-02-19 10:04:12 -05:00
Wilson Snyder
3b7ad1820d GTKWave header updates from upstream. 2022-02-09 21:56:22 -05:00
Guokai Chen
818aaa8b89
Fix macOS arm64 build by excluding x86 only cpuid header (#3285) (#3291)
Signed-off-by: Guokai Chen <chenguokai17@mails.ucas.ac.cn>
2022-01-23 09:15:09 -05:00
Julie Schwartz
f5b1a5cd58 Fix make support for BSD ar (#2999) (#3256). [Julie Schwartz]
While GNU 'ar' supports '@' to specify a file, BSD 'ar' does not.
The max line length can be handled by 'xargs' instead, which will know
to break up the command.  In case there are multiple calls, only build
the index (specified with '-s') once in a later call.
2022-01-17 14:04:43 -05:00
Wilson Snyder
50094ca296 Internals: Add cpplint control file and related cleanups 2022-01-09 16:49:38 -05:00
Wilson Snyder
15b32dc140 Internals: cpplint cleanups. No functional change. 2022-01-08 12:01:39 -05:00
Wilson Snyder
9bda91b3bf Fix clang compile warning 2022-01-01 19:33:12 -05:00
Wilson Snyder
d679d50eca Fix $random not updating seed (#3238). [Julie Schwartz] 2022-01-01 16:43:06 -05:00
Wilson Snyder
4cd56b1fb9 Use C++11 standard types for MacOS portability (#3254) (#3257). 2022-01-01 16:04:20 -05:00
Wilson Snyder
ca42be982c Copyright year update. 2022-01-01 08:26:40 -05:00
Yutetsu TAKATSUKASA
0658a7654f
Add tests of tracing SystemC model verilated with --hierarchical (#3252)
* Tests: Add t_hier_block_sc_trace(fst|vcd) that tests tracing hierarchical block on SystemC.

* Add a check that elaboration is done before a trace file is opened.

* Add a check that elaboration is done before trace() is called to verilated SystemC model.

* Tests: call sc_core::sc_start(sc_core::SC_ZERO_TIME) before opening a trace file

* Tests: Fix t_trace_two_sc to call sc_start before opening trace

* Use vl_fatal as suggested in PR review.
2021-12-23 08:41:11 +09:00
Wilson Snyder
7526151670 Fix bad ending address on $readmem (#3205). 2021-12-21 19:55:04 -05:00
Wilson Snyder
560b59f97f Use C++11 standard types for MacOS portability (#3254). 2021-12-21 13:18:05 -05:00
Geza Lore
ff425369ac
Reduce .rodata footprint of trace initialization (#3250)
Trace initialization (tracep->decl* functions) used to explicitly pass
the complete hierarchical names of signals as string constants. This
contains a lot of redundancy (path prefixes), does not scale well with
large designs and resulted in .rodata sections (the string constants) in
ELF executables being extremely large.

This patch changes the API of trace initialization to allow pushing
and popping name prefixes as we walk the hierarchy tree, which are
prepended to declared signal names at run-time during trace
initialization. This in turn allows us to emit repeated path/name
components only once, effectively removing all duplicate path prefixes.
On SweRV EH1 this reduces the .rodata section in a --trace build by 94%.

Additionally, trace declarations are now emitted in lexical order by
hierarchical signal names, and the top level trace initialization
function respects --output-split-ctrace.
2021-12-19 15:15:07 +00:00
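
The prefix push/pop idea from ff425369ac can be illustrated with a small sketch. The class `TraceDecls` and its methods `pushPrefix`, `popPrefix` and `declSignal` are hypothetical names chosen for this example, not the run-time library's actual API; the point is only that each path component appears once as a string constant and full names are assembled at run-time.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical declaration recorder: full hierarchical names are built at
// run-time from a stack of prefixes instead of being passed as constants.
class TraceDecls final {
    std::vector<std::string> m_prefixStack;  // Full prefix at each nesting level
    std::vector<std::string> m_names;  // Qualified names, in declaration order

public:
    // Enter a scope: extend the current prefix by one path component.
    void pushPrefix(const std::string& component) {
        const std::string parent
            = m_prefixStack.empty() ? std::string{} : m_prefixStack.back();
        m_prefixStack.push_back(parent + component + ".");
    }
    // Leave the innermost scope.
    void popPrefix() {
        assert(!m_prefixStack.empty());
        m_prefixStack.pop_back();
    }
    // Declare a signal by its local name; the current prefix is prepended.
    void declSignal(const std::string& name) {
        const std::string prefix
            = m_prefixStack.empty() ? std::string{} : m_prefixStack.back();
        m_names.push_back(prefix + name);
    }
    const std::vector<std::string>& names() const { return m_names; }
};
```

With this shape, generated code for a scope `top.core` would call `pushPrefix("top")` and `pushPrefix("core")` once each, then declare every signal in that scope by its short name, so the string `"top.core."` is never repeated per signal in the executable.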
Wilson Snyder
8696e38e6f Primary inputs and outputs (VL_INW/VL_OUTW) now use VlWide type (#3236). 2021-12-09 19:41:33 -05:00
Adrien Le Masle
c3f17ce2c4
Fix VL_STREAML_FAST_QQI with 64 bit left-hand-side (#3232) (#3235) 2021-12-09 17:30:04 -05:00
Wilson Snyder
b7d20b102b Internals: Remove unused and cleanup VL_ASSIGNSEL. 2021-12-05 11:59:49 -05:00
Wilson Snyder
293a5f402b Fix timescale portability on Arm64 (#3222). 2021-11-28 15:47:19 -05:00
Wilson Snyder
692306ef44 Optimize $random concatenates/selects (#3114). 2021-11-28 14:17:28 -05:00
Wilson Snyder
98037cad56 Internals: Optimize VL_RANDOM to have unclean output 2021-11-28 14:00:19 -05:00
Wilson Snyder
61e3536163 Internals: Remove some unused arguments. 2021-11-28 13:44:16 -05:00
Wilson Snyder
a1a186a86c Internals: Remove some unused arguments. 2021-11-28 13:07:37 -05:00
Wilson Snyder
cd737065f2 Internals: More const. No functional change intended. 2021-11-26 17:55:36 -05:00
Wilson Snyder
393b9e435d Internals: Revert previous commit const for clang. 2021-11-25 09:47:06 -05:00