forked from github/verilator
b51f887567
VCD tracing is now parallelized using the same thread pool as the model. We achieve this by splitting the top-level trace function into multiple top-level functions (as many as `--threads`). After the main thread emits the time stamp to the VCD file, the tracing functions run in parallel on the model's thread pool (which we pass to the trace file during registration), each tracing into a secondary per-thread buffer. The main thread then stitches (memcpy) the buffers together into the output file. This makes the `--trace-threads` option redundant when used with `--trace`; it now only affects `--trace-fst`. FST tracing still uses the previous offloading scheme. This improves VCD tracing performance considerably, and I have seen better-than-Amdahl speedup: 3.9x on XiangShan 4T (2.7x on OpenTitan 4T).
gtkwave
vltstd
.gitignore
verilated_config.h.in
verilated_cov_key.h
verilated_cov.cpp
verilated_cov.h
verilated_dpi.cpp
verilated_dpi.h
verilated_fst_c.cpp
verilated_fst_c.h
verilated_fst_sc.cpp
verilated_fst_sc.h
verilated_funcs.h
verilated_heavy.h
verilated_imp.h
verilated_intrinsics.h
verilated_profiler.cpp
verilated_profiler.h
verilated_save.cpp
verilated_save.h
verilated_sc.h
verilated_sym_props.h
verilated_syms.h
verilated_threads.cpp
verilated_threads.h
verilated_trace_defs.h
verilated_trace_imp.h
verilated_trace.h
verilated_types.h
verilated_vcd_c.cpp
verilated_vcd_c.h
verilated_vcd_sc.cpp
verilated_vcd_sc.h
verilated_vpi.cpp
verilated_vpi.h
verilated.cpp
verilated.h
verilated.mk.in
verilated.v
verilatedos.h
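
The scheme described in the commit message (time stamp written by the main thread, per-thread buffers filled in parallel, buffers stitched together in order) can be illustrated with a small standalone sketch. This is not the Verilator implementation: `Signal`, `traceChunk`, and `dumpTimestep` are hypothetical names, only 1-bit signals are handled, and plain `std::thread` workers stand in for the model's thread pool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical signal record for illustration; not a Verilator type.
struct Signal {
    std::string code;  // VCD identifier code
    bool value;        // current 1-bit value
};

// Hypothetical per-thread trace function: dumps signals [begin, end)
// into a buffer owned exclusively by this worker, so no locking is needed.
static void traceChunk(const std::vector<Signal>& signals, std::size_t begin,
                       std::size_t end, std::string& buffer) {
    for (std::size_t i = begin; i < end; ++i) {
        buffer += signals[i].value ? '1' : '0';
        buffer += signals[i].code;
        buffer += '\n';
    }
}

// Returns the VCD text for one time step. Assumption: threads are spawned
// per call here, whereas the commit reuses the model's existing thread pool.
std::string dumpTimestep(std::uint64_t time, const std::vector<Signal>& signals,
                         unsigned numThreads) {
    assert(numThreads > 0);

    // 1. The main thread emits the time stamp first.
    std::string out = "#" + std::to_string(time) + "\n";

    // 2. Each worker traces its slice into its own buffer.
    std::vector<std::string> buffers(numThreads);
    std::vector<std::thread> workers;
    const std::size_t chunk = (signals.size() + numThreads - 1) / numThreads;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t begin
            = std::min(signals.size(), static_cast<std::size_t>(t) * chunk);
        const std::size_t end = std::min(signals.size(), begin + chunk);
        workers.emplace_back(traceChunk, std::cref(signals), begin, end,
                             std::ref(buffers[t]));
    }
    for (auto& w : workers) w.join();

    // 3. The main thread stitches the buffers together in a fixed order,
    //    so the output matches what a single-threaded dump would produce.
    for (const auto& b : buffers) out += b;
    return out;
}
```

Because each worker writes only to its own buffer and the stitch order is fixed, the output does not depend on how the workers are scheduled.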