Commit Graph

922 Commits

Author SHA1 Message Date
Wilson Snyder
c9634695a7 Fix std::exchange for C++11 compilers 2022-10-02 16:25:11 -04:00
Wilson Snyder
880cac2fdd Merge branch 'master' into develop-v5 2022-10-01 11:24:55 -04:00
Wilson Snyder
5ed882faf2 Fix unused compiler warning when not VL_THREADED. 2022-09-30 23:41:35 -04:00
Krzysztof Bieganski
9c2ead90d5
Add custom memory management for verilated classes (#3595)
This change introduces a custom reference-counting pointer class that
allows creating such pointers from 'this'. This lets us keep the
receiver object around even if all references to it outside of a class
method no longer exist. Useful for coroutine methods, which may outlive
all external references to the object.

The deletion of objects is deferred until the next time slot. This is to
make clearing the triggered flag on named events in classes safe
(otherwise freed memory could be accessed).
2022-09-28 18:54:18 -04:00
github action
91823c41c5 Apply 'make format' 2022-09-28 02:22:05 +00:00
Wilson Snyder
c6bce636ee Merge branch 'master' into develop-v5 2022-09-27 22:19:04 -04:00
Wilson Snyder
75a70bee6d Update to clang-format-14 on Ubuntu22.04 2022-09-27 21:47:45 -04:00
Wilson Snyder
d162619bd3 Merge branch 'master' into develop-v5 2022-09-20 20:06:21 -04:00
Mariusz Glebocki
fc3ce29845
Improve Verilation memory by reducing V3Number size (#3521) 2022-09-20 16:46:47 -04:00
Geza Lore
af305bf280 Merge branch 'master' into develop-v5 2022-09-16 16:24:36 +01:00
Geza Lore
38a8d7fb2e Remove redundant 'inline' keywords from definitions
Also add checks to t/t_dist_cppstyle
2022-09-16 15:52:25 +01:00
Geza Lore
0c70a0dcbf Remove redundant 'virtual' keywords from overridden methods
'virtual' is redundant when 'override' is present, so keep only
'override'.

Add t/t_dist_cppstyle.pl to check for this.
2022-09-16 15:19:38 +01:00
Geza Lore
27031ed688 Merge branch 'master' into develop-v5 2022-09-15 10:28:35 +01:00
github action
e94cdcf29c Apply 'make format' 2022-09-05 22:43:09 +00:00
Mladen Slijepcevic
1af046986d
Fix thread saftey in SystemC VL_ASSIGN_SBW/WSB (#3494) (#3513). 2022-09-05 18:42:12 -04:00
Krzysztof Bieganski
da7ad35577
Fix fork debug output (#3593)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-09-05 11:27:24 +01:00
Wilson Snyder
819e8741cc Merge branch 'master' into develop-v5 2022-08-30 00:20:21 -04:00
Wilson Snyder
6a5f77b278 Internals: Cleanup some string/model constructors. No functional change. 2022-08-29 23:50:32 -04:00
Wilson Snyder
2358ced061 Rename tracing rolloverSize and add test (#3570). 2022-08-28 08:25:02 -04:00
Geza Lore
5c356a4680 Merge branch 'master' into develop-v5 2022-08-22 14:32:06 +01:00
Krzysztof Bieganski
39af5d020e
Timing support (#3363)
Adds timing support to Verilator. It makes it possible to use delays,
event controls within processes (not just at the start), wait
statements, and forks.

Building a design with those constructs requires a compiler that
supports C++20 coroutines (GCC 10, Clang 5).

The basic idea is to have processes and tasks with delays/event controls
implemented as C++20 coroutines. This allows us to suspend and resume
them at any time.

There are five main runtime classes responsible for managing suspended
coroutines:
* `VlCoroutineHandle`, a wrapper over C++20's `std::coroutine_handle`
  with move semantics and automatic cleanup.
* `VlDelayScheduler`, for coroutines suspended by delays. It resumes
  them at a proper simulation time.
* `VlTriggerScheduler`, for coroutines suspended by event controls. It
  resumes them if its corresponding trigger was set.
* `VlForkSync`, used for syncing `fork..join` and `fork..join_any`
  blocks.
* `VlCoroutine`, the return type of all verilated coroutines. It allows
  for suspending a stack of coroutines (normally, C++ coroutines are
  stackless).

There is a new visitor in `V3Timing.cpp` which:
  * scales delays according to the timescale,
  * simplifies intra-assignment timing controls and net delays into
    regular timing controls and assignments,
  * simplifies wait statements into loops with event controls,
  * marks processes and tasks with timing controls in them as
    suspendable,
  * creates delay, trigger scheduler, and fork sync variables,
  * transforms timing controls and fork joins into C++ awaits

There are new functions in `V3SchedTiming.cpp` (used by `V3Sched.cpp`)
that integrate static scheduling with timing. This involves providing
external domains for variables, so that the necessary combinational
logic gets triggered after coroutine resumption, as well as statements
that need to be injected into the design eval function to perform this
resumption at the correct time.

There is also a function that transforms forked processes into separate
functions.

See the comments in `verilated_timing.h`, `verilated_timing.cpp`,
`V3Timing.cpp`, and `V3SchedTiming.cpp`, as well as the internals
documentation for more details.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-22 13:26:32 +01:00
Geza Lore
9ac64d0b92 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-20 21:18:50 +01:00
Geza Lore
1404319b28 Merge branch 'master' into develop-v5 2022-08-19 13:39:44 +01:00
Wilson Snyder
1e2219347e Internals: Cleanup ifdef, move up not under compilver version ifdef 2022-08-11 17:41:43 -04:00
Geza Lore
a4fd6d38fb Add operator != to VlWide
This is required by VlUnpacked::neq
2022-08-07 13:13:28 +01:00
Geza Lore
c266739e9f Merge branch 'master' into develop-v5 2022-08-05 12:17:57 +01:00
Geza Lore
96a4b3e5a5 Update clang-format config and apply
- Regroup and sort #include directives (like we used to, but automatic)
- Set AlwaysBreakTemplateDeclarations to true
2022-08-05 12:00:24 +01:00
Geza Lore
39d1a62f9e Fix change detection on unpacked arrays
Expand array assignment when creating the trigger, as V3Expand might
mangle it otherwise.
2022-08-02 13:01:41 +01:00
Wilson Snyder
3c54d5df70 Merge branch 'master' into develop-v5 2022-07-30 14:42:51 -04:00
Wilson Snyder
f91793e931 Revert - SC overrides cause non-override clang error. 2022-07-30 13:53:54 -04:00
Wilson Snyder
daac7cb90d Merge branch 'master' into develop-v5 2022-07-30 12:09:05 -04:00
Wilson Snyder
a2d26b45bb Internals: Fix some clang-tidy issues. No functional change intended. 2022-07-30 11:54:28 -04:00
Wilson Snyder
dce8f3d25d Internals: Spacing from develop-v5. No functional change. 2022-07-30 11:54:28 -04:00
Geza Lore
38e5b6c1ad Replace __gcov_flush with __gcov_dump
__gcov_flush was a private function and was removed from later GCC
versions (at least from 11.2.0, possibly earlier). Replace with the
documented public __gcov_dump.
2022-07-30 16:02:03 +01:00
Wilson Snyder
4859f5e1fa Merge branch 'master' into develop-v5 2022-07-30 10:26:16 -04:00
Wilson Snyder
b9d7819faa Internals: Fix some cppcheck issues. Some dump functions fixed. 2022-07-30 10:01:39 -04:00
Geza Lore
ad2fbfe62d Merge branch 'master' into develop-v5 2022-07-29 12:04:24 +01:00
Gustav Svensk
eeef5ab4de
Fix sformat string incorrectly cleared (#3515) (#3519). 2022-07-25 17:36:34 +02:00
Geza Lore
386401da60 Merge branch 'master' into develop-v5 2022-07-22 15:09:20 +01:00
Geza Lore
e0b61ceabd Remove legacy #ifdef SYSTEMC_64BIT_PATCHES
These days this is always false, see #3505
2022-07-21 15:01:17 +01:00
Geza Lore
f9ecbdc70b Merge branch 'master' into develop-v5 2022-07-21 09:56:14 +01:00
Geza Lore
30e3edb81d Remove deprecated and unused timescale override defines
These have been 'deprecated' for 2 years and are otherwise unused except
for using a temporary placeholder value, which I have inlined with the
default value.

Also remove the now VL_TIME_STR_CONVERT utility function (and
corresponding unit tests), which have no references in any project on
GitHub.
2022-07-20 14:06:09 +01:00
Geza Lore
1d400dd98c
Configure tracing at run-time, instead of compile time (#3504)
All remaining use of conditional compilation in the tracing
implementation of the run-time library are replaced with the use of
VerilatedModel::traceConfig, and is now done at run-time.
2022-07-20 11:27:10 +01:00
Geza Lore
a4ed3c2086 Make parallel tracing switchable at run-time 2022-07-19 17:13:13 +01:00
Geza Lore
efb5caad22 Improve robustness of trace configuration
Always fail if adding a model to a trace file that has already executed
a dump. We used to do this before as well, though in a less robust way.
We will be relying on this property more in the future, so improve the
check.
2022-07-19 14:16:08 +01:00
Geza Lore
3a002b6cf2 Remove VerilatedVcd::m_evcd and related dead code.
The legacy code that was using this was removed earlier, and m_evcd was
constant false, so removed.
2022-07-19 13:58:18 +01:00
Geza Lore
f8b7981be4 Make use of FST writer thread switchable at run-time.
Always build the FST libray with -DFST_WRITER_PARALLEL, iff VL_THREADED.
This supports run-time enablement of the FST writer thread, and has no
measurable performance impact on single threaded tracing but simplifies
the library build.

Note: the actual choice of using the fst writer thread is still compile
time, but can now be made run-time easily.
2022-07-19 13:48:03 +01:00
Geza Lore
b55ee79d86 Fix typo 2022-07-19 12:36:21 +01:00
Geza Lore
db59c07f27 Implement trace offloading with fewer ifdefs
Step towards a proper run-time library. Reduce the amount of ifdefs in
the implementation of offloaded tracing. There are still a very small
number of ifdefs left, which will need more careful changes in order to
keep user API compatibility.
2022-07-19 11:31:35 +01:00
Geza Lore
9085e34d70 Pass VerilatedModel at trace registration time 2022-07-19 11:00:09 +01:00
Geza Lore
c28bf9ce24 Fix change detection over unpacked arrays. 2022-07-18 12:25:22 +01:00
Geza Lore
c9ac9a75a6 Merge branch 'master' into develop-v5 2022-07-12 17:29:45 +01:00
Geza Lore
79c901c220 Tighten signatures/implementaion of VerilatedModel abstract methods. 2022-07-12 16:06:08 +01:00
Geza Lore
b61d819fcb Move contextp() under VerilatedModel 2022-07-12 16:06:08 +01:00
Geza Lore
f4038e3674
Move thread pool and execution profiler into the context. (#3477)
Fixes #3454
2022-07-12 11:41:15 +01:00
Arkadiusz Kozdra
8377514127
Add support for $test$plusargs(expr) (#3489) 2022-07-11 06:21:35 -04:00
Geza Lore
0de1bbc85b Add and use VL_CONSTEXPR_CXX17 2022-07-05 14:21:28 +01:00
Geza Lore
42b711b862 Don't use 'assert' in profiler initialization 2022-07-05 12:18:54 +01:00
Wilson Snyder
b25b798dbe Merge branch 'master' into develop-v5 2022-07-04 13:20:03 -04:00
Geza Lore
1bb6433649 Improve worker thread shutdown.
Always ensure worker thread task queue is drained before shutting down.
2022-06-27 15:03:36 +01:00
Wilson Snyder
fc4d6a62af Remove VL_PROFILER ifdef. Partial (#3454). 2022-06-22 20:06:23 -04:00
Wilson Snyder
49455721a3 Commentary 2022-06-21 19:28:23 -04:00
Wilson Snyder
0f324c8309 Merge branch 'master' into develop-v5 2022-06-04 11:59:49 -04:00
Geza Lore
b51f887567
Perform VCD tracing in parallel when using --threads (#3449)
VCD tracing is now parallelized using the same thread pool as the model.
We achieve this by breaking the top level trace functions into multiple
top level functions (as many as --threads), and after emitting the time
stamp to the VCD file on the main thread, we execute the tracing
functions in parallel on the same thread pool as the model (which we
pass to the trace file during registration), tracing into a secondary
per thread buffer. The main thread will then stitch (memcpy) the buffers
together into the output file.

This makes the `--trace-threads` option redundant with `--trace`, which
now only affects `--trace-fst`. FST tracing uses the previous offloading
scheme.

This obviously helps a lot in VCD tracing performance, and I have seen
better than Amdahl speedup, namely I get 3.9x on XiangShan 4T (2.7x on
OpenTitan 4T).
2022-05-29 19:08:39 +01:00
Geza Lore
c4b8675d77 Always inline some small, hot trace routines 2022-05-28 12:47:09 +01:00
Geza Lore
a7cd7a1ed9 Initialize VerilatedTrace members in class 2022-05-28 12:47:07 +01:00
Geza Lore
a48c779367 Rename verilated_trace_imp.cpp -> verilated_trace_imp.h
Also fix file header to describe purpose of this file.
2022-05-28 12:20:35 +01:00
Geza Lore
cf1eccc24f Make local function 'static' in verilated_profiler.h 2022-05-28 12:17:39 +01:00
Geza Lore
d45caca011 Remove legacy VCD tracing API
This has not been used by Verilator for a while, but was kept for
compatibility with some external code. Now removed.
2022-05-28 12:07:24 +01:00
Krzysztof Bieganski
3a310f19f0 Adjust loop conditions in VlTriggerVec functions
This change is not a functional one; it is only meant to appease the
compiler with respect to warnings such as GCC's `-Wtype-limits`.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-05-25 18:45:59 +01:00
Krzysztof Bieganski
d7a75dc026 Merge branch 'master' into develop-v5 2022-05-25 11:06:38 +02:00
Geza Lore
b130a8cfeb Add -DVM_TRACE_VCD in model builds with Make with --trace 2022-05-20 16:44:38 +01:00
Geza Lore
551bd284dd Rename some internals related to multi-threaded tracing
Rename the implementation internals of current multi-threaded tracing to
be "offload mode". No functional change, nor user interface change
intended.
2022-05-20 16:44:35 +01:00
Wilson Snyder
99bdc27be3 Internals: Cleanup some statics, trivial part towards (#3419) 2022-05-15 14:26:55 -04:00
Wilson Snyder
c3c46967dc Tests: Appease sanitizer (#3121). 2022-05-15 11:50:52 -04:00
Geza Lore
599d23697d
IEEE compliant scheduler (#3384)
This is a major re-design of the way code is scheduled in Verilator,
with the goal of properly supporting the Active and NBA regions of the
SystemVerilog scheduling model, as defined in IEEE 1800-2017 chapter 4.

With this change, all internally generated clocks should simulate
correctly, and there should be no more need for the `clock_enable` and
`clocker` attributes for correctness in the absence of Verilator
generated library models (`--lib-create`).

Details of the new scheduling model and algorithm are provided in
docs/internals.rst.

Implements #3278
2022-05-15 16:03:32 +01:00
Wilson Snyder
5aa12e9b51 Add assert when VerilatedContext is mis-deleted (#3121). 2022-05-15 10:51:03 -04:00
Wilson Snyder
f6035447ae Internals: Use mutable for mutexes. No functional change. 2022-05-13 07:21:39 -04:00
Wilson Snyder
38438b3373 Internals: Cleanup some defaults. No functional change. 2022-05-12 23:30:39 -04:00
HungMingWu
880a9be3b1
Internal: Add C++20ish reverse_view for range loops. No functional change (#3388).
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-04-18 13:03:56 -04:00
Wilson Snyder
33105f017c Commentary 2022-03-30 20:17:59 -04:00
Wilson Snyder
e02f97854c Deprecate 'vluint64_t' and similar types (#3255). 2022-03-27 15:27:40 -04:00
Wilson Snyder
3f7bf3d2dc Fix MSVC localtime_s (#3124). 2022-03-27 13:59:18 -04:00
Geza Lore
b1b5b5dfe2 Improve run-time profiling
The --prof-threads option has been split into two independent options:
1. --prof-exec, for collecting verilator_gantt and other execution
related profiling data, and
2. --prof-pgo, for collecting data needed for PGO

The implementation of execution profiling is extricated from
VlThreadPool and is now a separate class VlExecutionProfiler. This means
--prof-exec can now be used for single-threaded models (though it does
not measure a lot of things just yet). For consistency VerilatedProfiler
is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are
in verilated_profiler.{h/cpp}, but can be used completely independently.

Also re-worked the execution profile format so it now only emits events
without holding onto any temporaries. This is in preparation for some
future optimizations that would be hindered by the introduction of function
locals via AstText.

Also removed the Barrier event. Clearing the profile buffers is not
notably more expensive as the profiling records are trivially
destructible.
2022-03-27 15:57:30 +02:00
Geza Lore
c7440b250f Validate integer run-time arguments 2022-03-26 22:58:47 +00:00
Geza Lore
bab8462789 Rebuild run-time library if generated makefile changes
The generated makefile contains compiler options that are passed when
building the run-time library, so re-build if it changes.
2022-03-26 21:29:03 +00:00
Xi Zhang
14d24213a8
Support LoongArch ISA multithreading (#3353) (#3354) 2022-03-17 09:04:47 -04:00
Wilson Snyder
b5ce7d5982 Add VERILATOR_VERSION_INTEGER for determining API (#3343). 2022-03-12 11:17:39 -05:00
Wilson Snyder
ef87d057fc Fix $fscanf etc to return -1 on EOF (#3113). 2022-03-07 17:43:33 -05:00
Wilson Snyder
321880f5a6 Add trace dumpvars() call for selective runtime tracing (#3322). 2022-03-05 15:44:32 -05:00
Wilson Snyder
956f64c6ba Fix compile error with --trace-fst --sc (#3332). 2022-03-02 07:26:26 -05:00
Jamie Iles
b6ca2a42f2
Fix FST traces to include vector range (#3296) (#3297) 2022-02-26 12:52:24 -05:00
Wilson Snyder
e52a4ac74f Fix $readmem file not found to be warning not error (#3310). 2022-02-19 10:04:12 -05:00
Wilson Snyder
3b7ad1820d GTKWave header updates from upstream. 2022-02-09 21:56:22 -05:00
Guokai Chen
818aaa8b89
Fix macOS arm64 build by excluding x86 only cpuid header (#3285) (#3291)
Signed-off-by: Guokai Chen <chenguokai17@mails.ucas.ac.cn>
2022-01-23 09:15:09 -05:00
Julie Schwartz
f5b1a5cd58 Fix make support for BSD ar (#2999) (#3256). [Julie Schwartz]
While GNU 'ar' supports '@' to specify a file, BSD 'ar' does not.
The max line length can be handled by 'xargs' instead, which will know
to break up the command.  In case there are multiple calls, only build
the index (specified with '-s') once in a later call.
2022-01-17 14:04:43 -05:00
Wilson Snyder
50094ca296 Internals: Add cpplint control file and related cleanups 2022-01-09 16:49:38 -05:00
Wilson Snyder
15b32dc140 Internals: cpplint cleanups. No functional change. 2022-01-08 12:01:39 -05:00
Wilson Snyder
9bda91b3bf Fix clang compile warning 2022-01-01 19:33:12 -05:00
Wilson Snyder
d679d50eca Fix $random not updating seed (#3238). [Julie Schwartz] 2022-01-01 16:43:06 -05:00