The --prof-threads option has been split into two independent options:
1. --prof-exec, for collecting verilator_gantt and other execution
related profiling data, and
2. --prof-pgo, for collecting data needed for PGO
The implementation of execution profiling is extricated from
VlThreadPool and is now a separate class VlExecutionProfiler. This means
--prof-exec can now be used for single-threaded models (though it does
not measure a lot of things just yet). For consistency VerilatedProfiler
is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are
in verilated_profiler.{h/cpp}, but can be used completely independently.
Also re-worked the execution profile format so it now only emits events
without holding onto any temporaries. This is in preparation for some
future optimizations that would be hindered by the introduction of function
locals via AstText.
Also removed the Barrier event. Clearing the profile buffers is not
notably more expensive as the profiling records are trivially
destructible.
Trace initialization (tracep->decl* functions) used to explicitly pass
the complete hierarchical names of signals as string constants. This
contains a lot of redundancy (path prefixes), does not scale well with
large designs and resulted in .rodata sections (the string constants) in
ELF executables being extremely large.
This patch changes the API of trace initialization that allows pushing
and popping name prefixes as we walk the hierarchy tree, which are
prepended to declared signal names at run-time during trace
initialization. This in turn allows us to emit repeat path/name
components only once, effectively removing all duplicate path prefixes.
On SweRV EH1 this reduces the .rodata section in a --trace build by 94%.
Additionally, trace declarations are now emitted in lexical order by
hierarchical signal names, and the top level trace initialization
function respects --output-split-ctrace.
** Add simulation context (VerilatedContext) to allow multiple fully independent
models to be in the same process. Please see the updated examples.
** Add context->time() and context->timeInc() API calls, to set simulation time.
These now are recommended in place of the legacy sc_time_stamp().
* Guarantee mechanism to initialize just once is now in VerilatedInitializer. No functional change is intended.
* Make sure to initialize Verilated::NonInitialized just once. Fixes
memory leak in t_prot_lib_shared and t_hier_block_prot_lib_shared.
* Call setup() and teardown() of Verilated::NonSerialized.