Commit Graph

79 Commits

Author SHA1 Message Date
Krzysztof Boroński
a87fb57656
Allow assigning events (#4403)
Signed-off-by: Krzysztof Boronski <kboronski@antmicro.com>
2023-10-26 16:38:47 +02:00
Geza Lore
95c4ade718
Unify code generation for trace declarations in both trace formats (#4612)
This patch adds some abstract enums to pass to the trace decl* APIs, so
the VCD/FST specific code can be kept in verilated_{vcd,fst}_*.cc, and
removed from V3Emit*. It also reworks the generation of the trace init
functions (those that call 'decl*' for the signals) such that the scope
hierarchy is traversed precisely once during initialization, which
simplifies the FST writer. This later change also has the side effect of
fixing tracing of nested interfaces when traced via an interface
reference - see the change in the expected t_interface_ref_trace - which
previously were missed.
2023-10-24 16:33:29 +01:00
Geza Lore
d1b6224c2b
Associate trace codes with function indices (#4610)
For each traced variable, also register the trace function index that
will write it.
2023-10-23 16:01:55 +01:00
Geza Lore
165a2ef1b8 Separate tracing of const values from non-const values
Some values emitted to the trace files are constant (e.g.: actual
parameter values), and never change. Previously we used to trace these
in the 'full' dumps, which also included all other truly variable
signals. This patch introduces a new generated trace function 'const',
to complement the 'full' and 'chg' flavour, and 'const' now only
contains the constant signals, while 'full' and 'chg' contain only the
truly variable signals. The generate 'full' and 'chg' trace functions
now have exactly the same shape. Note that 'const' signals are still
traced using the 'full*' dump methods of the trace buffers, so there is
no need for a third flavour of those.
2023-10-23 14:07:52 +01:00
Larry Doolittle
2ab70ba452
Internals: Cleanup .txt file whitespace (#3842) 2023-01-05 05:00:54 -05:00
Wilson Snyder
b24d7c83d3 Copyright year update 2023-01-01 10:18:39 -05:00
github action
821dd070bf Apply 'make format' 2022-11-23 09:08:02 +00:00
Yves Mathieu
06fdf7be58
Add support of Events for VCD/FST traces (#3759) 2022-11-23 04:07:14 -05:00
Wilson Snyder
8c6d1e53ca Internals: Fix some 'p' names, and make new base class for VlDeleter. No functional change intended. 2022-11-13 17:40:50 -05:00
Kamil Rakoczy
d6126c4b32
Remove --no-threads; require --threads 1 for single threaded (#3703). 2022-11-05 08:47:34 -04:00
Wilson Snyder
a2d26b45bb Internals: Fix some clang-tidy issues. No functional change intended. 2022-07-30 11:54:28 -04:00
Geza Lore
1d400dd98c
Configure tracing at run-time, instead of compile time (#3504)
All remaining use of conditional compilation in the tracing
implementation of the run-time library are replaced with the use of
VerilatedModel::traceConfig, and is now done at run-time.
2022-07-20 11:27:10 +01:00
Geza Lore
f8b7981be4 Make use of FST writer thread switchable at run-time.
Always build the FST libray with -DFST_WRITER_PARALLEL, iff VL_THREADED.
This supports run-time enablement of the FST writer thread, and has no
measurable performance impact on single threaded tracing but simplifies
the library build.

Note: the actual choice of using the fst writer thread is still compile
time, but can now be made run-time easily.
2022-07-19 13:48:03 +01:00
Geza Lore
db59c07f27 Implement trace offloading with fewer ifdefs
Step towards a proper run-time library. Reduce the amount of ifdefs in
the implementation of offloaded tracing. There are still a very small
number of ifdefs left, which will need more careful changes in order to
keep user API compatibility.
2022-07-19 11:31:35 +01:00
Geza Lore
b51f887567
Perform VCD tracing in parallel when using --threads (#3449)
VCD tracing is now parallelized using the same thread pool as the model.
We achieve this by breaking the top level trace functions into multiple
top level functions (as many as --threads), and after emitting the time
stamp to the VCD file on the main thread, we execute the tracing
functions in parallel on the same thread pool as the model (which we
pass to the trace file during registration), tracing into a secondary
per thread buffer. The main thread will then stitch (memcpy) the buffers
together into the output file.

This makes the `--trace-threads` option redundant with `--trace`, which
now only affects `--trace-fst`. FST tracing uses the previous offloading
scheme.

This obviously helps a lot in VCD tracing performance, and I have seen
better than Amdahl speedup, namely I get 3.9x on XiangShan 4T (2.7x on
OpenTitan 4T).
2022-05-29 19:08:39 +01:00
Geza Lore
a48c779367 Rename verilated_trace_imp.cpp -> verilated_trace_imp.h
Also fix file header to describe purpose of this file.
2022-05-28 12:20:35 +01:00
Wilson Snyder
e02f97854c Deprecate 'vluint64_t' and similar types (#3255). 2022-03-27 15:27:40 -04:00
Wilson Snyder
321880f5a6 Add trace dumpvars() call for selective runtime tracing (#3322). 2022-03-05 15:44:32 -05:00
Jamie Iles
b6ca2a42f2
Fix FST traces to include vector range (#3296) (#3297) 2022-02-26 12:52:24 -05:00
Wilson Snyder
50094ca296 Internals: Add cpplint control file and related cleanups 2022-01-09 16:49:38 -05:00
Wilson Snyder
ca42be982c Copyright year update. 2022-01-01 08:26:40 -05:00
Geza Lore
ff425369ac
Reduce .rodata footprint of trace initialization (#3250)
Trace initialization (tracep->decl* functions) used to explicitly pass
the complete hierarchical names of signals as string constants. This
contains a lot of redundancy (path prefixes), does not scale well with
large designs and resulted in .rodata sections (the string constants) in
ELF executables being extremely large.

This patch changes the API of trace initialization that allows pushing
and popping name prefixes as we walk the hierarchy tree, which are
prepended to declared signal names at run-time during trace
initialization. This in turn allows us to emit repeat path/name
components only once, effectively removing all duplicate path prefixes.
On SweRV EH1 this reduces the .rodata section in a --trace build by 94%.

Additionally, trace declarations are now emitted in lexical order by
hierarchical signal names, and the top level trace initialization
function respects --output-split-ctrace.
2021-12-19 15:15:07 +00:00
Pieter Kapsenberg
d1836b7b6f
Traces show array instances using brackets instead of parens (#3092) (#3095) 2021-08-12 20:40:44 +03:00
Wilson Snyder
b8e804f05b Internals: Some clang-tidy cleanups. No functional change intended. 2021-07-25 13:38:27 -04:00
Wilson Snyder
ab13a2ebdc Internals: Use C++11 const and initializers. No functional change intended. 2021-07-24 08:36:11 -04:00
Wilson Snyder
52cde49a6f Internals: Add more const. No functional change. 2021-06-18 22:24:08 -04:00
David Metz
f5ad5cf034
Fix dumping waveforms to multiple FST files (#2889) 2021-04-14 16:52:14 -04:00
github action
52fc134272 Apply clang-format 2021-04-07 13:56:12 +00:00
Àlex Torregrosa
2b2680770b
Improve scope types in FST and VCD traces (#2805). 2021-04-07 09:55:11 -04:00
Wilson Snyder
a1ab295b74 Commentary: Cleanup all include/* header comments. 2021-03-20 17:46:00 -04:00
Wilson Snyder
2cad22a22a
Add simulation context (VerilatedContext) (#2660). (#2813)
**   Add simulation context (VerilatedContext) to allow multiple fully independent
      models to be in the same process.  Please see the updated examples.
**   Add context->time() and context->timeInc() API calls, to set simulation time.
      These now are recommended in place of the legacy sc_time_stamp().
2021-03-07 11:01:54 -05:00
Wilson Snyder
caa9c99837 Commentary 2021-03-07 08:28:13 -05:00
Wilson Snyder
9650aefa42 Internals: Cleanup unneeded {}. No functional change 2021-02-21 21:25:21 -05:00
Wilson Snyder
bd602d0e2d Copyright year update 2021-01-01 10:29:54 -05:00
Wilson Snyder
ee9d6dd63f C++11: Favor auto, range for. No functional change intended. 2020-08-16 11:44:06 -04:00
Wilson Snyder
72d2cff0a1 C++11: Use member declaration initalizations. No functional change intended. 2020-08-16 11:44:06 -04:00
Wilson Snyder
c0127599df C++11: Use nullptr. No functional change. 2020-08-16 11:44:05 -04:00
Geza Lore
378d947702 Travis: Add FreeBSD build + portability fixes 2020-06-28 15:37:24 +01:00
Wilson Snyder
6ce878cb0d Fix some clang-tidy warnings 2020-06-01 23:16:17 -04:00
Geza Lore
95534fa5c5
Remove unused headers (#2389) 2020-05-31 20:21:07 +01:00
Wilson Snyder
c4f31d3bb6 Tracing: Remove dead code. No functional change intended. 2020-05-17 09:52:03 -04:00
Wilson Snyder
29bcbb0417 Suppress impossible code coverage issues 2020-05-15 22:34:29 -04:00
Geza Lore
8afcd67a1f Fix FST tracing of little endian vectors 2020-05-03 22:39:45 +01:00
Geza Lore
aa9cde22c8
Use SIMD intrinsics to render VCD traces (#2289)
Use SIMD intrinsics to render VCD traces.

I have measured 10-40% single threaded performance increase with VCD
tracing on SweRV EH1 and lowRISC Ibex using SSE2 intrinsics to render
the trace. Also helps a tiny bit with FST, but now almost all of the FST
overhead is in the FST library.

I have reworked the tracing routines to use more precisely sized
arguments. The nice thing about this is that the performance without the
intrinsics is pretty much the same as it was before, as we do at most 2x
as much work as necessary, but in exchange there are no data dependent
branches at all.
2020-04-30 00:09:09 +01:00
Geza Lore
b79ef672e1 Various minor optimizations of VCD trace routines
- Change templated trace routines to branch table.

Removed templating from trace chgBus and fullBus and replaced them with
a branch table like the other there is a very small (< 1%) penalty for
this on SwerRV EH1 CoreMark, but this is less than the variability of
disk IO so it's worth it to keep the code simpler and smaller.

- Prefetch VCD suffix buffer at the top of emit*

- Increase ILP in VCD emit* routines

- Use a 64-bit unaligned store to emit the VCD suffix (on x86 only)

The performance difference with these is very small, but the changes
hopefully make this code more performance-portable across various
micro-architectures.
2020-04-27 18:44:53 +01:00
Geza Lore
c52f3349d1
Initial implementation of generic multithreaded tracing (#2269)
The --trace-threads option can now be used to perform tracing on a
thread separate from the main thread when using VCD tracing (with
--trace-threads 1). For FST tracing --trace-threads can be 1 or 2, and
--trace-fst --trace-threads 1 is the same a what --trace-fst-threads
used to be (which is now deprecated).

Performance numbers on SweRV EH1 CoreMark, clang 6.0.0, Intel i7-3770 @
3.40GHz, IO to ramdisk, with numactl set to schedule threads on different
physical cores. Relative speedup:

--trace     ->  --trace --trace-threads 1      +22%
--trace-fst ->  --trace-fst --trace-threads 1  +38% (as --trace-fst-thread)
--trace-fst ->  --trace-fst --trace-threads 2  +93%

Speed relative to --trace with no threaded tracing:
--trace                                 1.00 x
--trace --trace-threads 1               0.82 x
--trace-fst                             1.79 x
--trace-fst --trace-threads 1           1.23 x
--trace-fst --trace-threads 2           0.87 x

This means FST tracing with 2 extra threads is now faster than single
threaded VCD tracing, and is on par with threaded VCD tracing. You do
pay for it in total compute though as --trace-fst --trace-threads 2 uses
about 240% CPU vs 150% for --trace-fst --trace-threads 1, and 155% for
--trace --trace threads 1. Still for interactive use it should be
helpful with large designs.
2020-04-21 23:49:07 +01:00
Geza Lore
39d903375b
Factor out trace implementation common to all formats. (#2268)
This patch de-duplicates common functionality between the VCD and FST
trace implementation. It also enables adding new trace formats more
easily and consistently.

No functional nor performance change intended.
2020-04-19 23:57:36 +01:00
Geza Lore
6a54922044
Set FST timescale correctly. (#2266)
The FST trace timescale used to be set in the constructor via
set_time_unit, but at that point we haven't normally opened the
file yet so it was just dropped. On top of that, we actually want
to use set_time_resolution... FST trace timescales now match the VCD.
2020-04-19 08:47:22 -04:00
Geza Lore
74e16d85c5
Fix FST trace initial time stamp. (#2264)
If the first dump was not at time zero, then the FST trace used
to contain the initial values as if they were set at time zero. Now
they only appear at the time the first dump call is actually made,
and hence match the VCD trace exactly.
2020-04-18 18:54:02 -04:00
Wilson Snyder
d4f7f5297a
Support IEEE time units and time precisions, #234. (#2253)
Includes `timescale, $printtimescale, $timeformat.
VL_TIME_MULTIPLIER, VL_TIME_PRECISION, VL_TIME_UNIT have been removed
and the time precision must now match the SystemC time precision.
To get closer behavior to older versions, use e.g. --timescale-override
"1ps/1ps".
2020-04-15 19:39:03 -04:00