This patch addresses two issues with NBAs in non-inlined functions/tasks:
- If the NBA writes to a local automatic var, the var could cease to exist before the NBA executes. This is normally addressed by fork dynscopes (#4356), but NBA-to-fork transformation happens way after `V3Fork` (in `V3Timing`). To solve this, we put NBAs that write to locals under forks in `V3Fork` already. This way, such locals will be put in dynscopes, and will still exist after the task containing the NBA exits.
- The above change means that any writes in forks other than `fork..join` should be handled by `V3Fork`. Thus, in `V3SchedTiming`, we only have to worry about read references, so we can simply copy all remaining locals. Because we copy, lifetimes are not an issue. This fixes a bug that allowed assignment intravals to be overwritten if they go out of scope in the containing function.
This patch adds some abstract enums to pass to the trace decl* APIs, so
the VCD/FST specific code can be kept in verilated_{vcd,fst}_*.cc, and
removed from V3Emit*. It also reworks the generation of the trace init
functions (those that call 'decl*' for the signals) such that the scope
hierarchy is traversed precisely once during initialization, which
simplifies the FST writer. This later change also has the side effect of
fixing tracing of nested interfaces when traced via an interface
reference - see the change in the expected t_interface_ref_trace - which
previously were missed.
Prior to this we failed to create implicit nets for inputs of gate
primitives, which is required by the standard (IEEE 1800-2017 6.10).
Note: outputs were covered due to being modeled as the LHS of
assignments, which do create implicit nets.
Again --prof-exec have bit-rotted a little with all the recent changes
to the structure of the generated code. This patch contains a few
improvements:
- Repalce the eval/evl_loop begin/end events with generic
section_push/section_pop events, that can be arbitrarily sprinkled
into the generate code (so long as they are matched correctly) to
measure various sections. The report then contains a nested profile
of the sections, and the VCD trace shows the section names.
- Better handling of exec graphs
- Clearer overall statistics
Still some remains of the --threads 0 mode. Remove unnecessary complexity
from V3EmitCModel. (Also don't pretend there is an MTask in single
threaded mode, when there really isn't.)
It's unlikely one value fits all use case, so making VL_LOCK_SPINS
configurable at model build time.
For testing, we reduce the value as we expect high contention.