verilator

Author	SHA1	Message	Date
Wilson Snyder	5a1bcf9794	Tests: Add lint-py checker	2022-09-07 22:04:57 -04:00
Wilson Snyder	361cef4633	Fix pylint warnings.	2022-09-07 21:48:52 -04:00
Geza Lore	90ab746a42	Make it possible to parallelize ico and act scheduling sections Small fixup patch so the 'ico' and 'act' scheduling sections could be ordered as multi-threaded. However, we still only order these single threaded at the moment (but switching them to multi-threaded now works).	2022-09-06 16:01:13 +01:00
github action	e94cdcf29c	Apply 'make format'	2022-09-05 22:43:09 +00:00
Mladen Slijepcevic	1af046986d	Fix thread safety in SystemC VL_ASSIGN_SBW/WSB (#3494) (#3513).	2022-09-05 18:42:12 -04:00
Wilson Snyder	1c9263a25b	Commentary	2022-09-05 15:20:08 -04:00
Geza Lore	fd6275a62b	Merge branch 'master' into develop-v5	2022-09-05 17:03:43 +01:00
Krzysztof Bieganski	6b6790fc50	Preserve return type of `AstNode::addNext` via templating (#3597) No functional change intended. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 16:56:57 +01:00
Krzysztof Bieganski	fb931087ab	Add stats tracking for `V3Undriven`. (#3600) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 16:20:38 +01:00
Krzysztof Bieganski	a2e1b32a1c	Fix inlining of forks (#3594) Before this change, some forked processes were being inlined in `V3Timing` because they contained no `CAwait`s. This only works under the assumption that no `CAwait`s will be added there later, which is not true, as a function called by a forked process could be turned into a coroutine later. The call would be wrapped in a new `CAwait`, but the process itself would have already been inlined at this point. This commit moves the inlining to `transformForks` in `V3SchedTiming`, which is called at a point when all `CAwait`s are already in place. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 15:19:19 +01:00
Krzysztof Bieganski	54f89bce42	Move `SenExprBuilder` to a header. (#3598) No functional change intended. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 15:17:51 +01:00
Krzysztof Bieganski	8b19d02e3b	Fix `co_await VlNow{}` being added too many times (#3596) (or not at all) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 11:46:34 +01:00
Krzysztof Bieganski	da7ad35577	Fix fork debug output (#3593) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 11:27:24 +01:00
Geza Lore	937e893b6d	Build verilator_bin with -O3 (#3592) This is consistently a few percent faster.	2022-09-03 22:10:07 +01:00
Geza Lore	d42a2d6494	Fix V3Gate crash on circular logic The recent patch to defer substitutions on V3Gate crashes on circular logic that has cycle length >= 3 with all inlineable signals (cycle length 2 is detected correctly and is not inlined). Fix by stopping recursion at the loop-back edge. Fixes #3543	2022-09-02 19:58:58 +01:00
Geza Lore	8e8f4b1e5c	Remove AstVarScope::valuep() and related code This is detritus from when V3TraceDecl used to run after V3Gate, today V3TraceDecl runs before V3Gate and this value has no function at all. No functional change intended.	2022-09-02 16:44:13 +01:00
Geza Lore	298f71f2b1	Merge branch 'master' into develop-v5	2022-09-02 12:19:35 +01:00
Geza Lore	2ba39b25f1	Replace dynamic_casts with static_casts dynamic_cast is not free. Replace obvious instances (where the result is unconditionally dereferenced) with static_cast in contexts with performance implications.	2022-09-02 12:08:34 +01:00
Geza Lore	5c828b7e60	V3Partition: use V3Lists to keep track of SiblingMCs Replace std::set<SiblingMC> with V3Lists to keep track of SiblingMCs associated with MTasks, use a std::set<LogicMTask*> for ensuring uniqueness. This yields a bit more speed in PartContraction.	2022-09-01 19:40:44 +01:00
Geza Lore	4640bea31a	V3Partition: More improvements for PartFixDataHazards - Remove redundant loop through the MTask graph - Gather variables directly from the OrderGraph, which is simpler and faster.	2022-09-01 16:30:04 +01:00
Geza Lore	875361d7ce	V3Partition: Reduce working set size of PartContraction (#3587) This yields an additional 25% speedup of MT scheduling.	2022-09-01 16:29:40 +01:00
Wilson Snyder	849bb5590a	Merge branch 'master' into develop-v5	2022-08-31 19:51:07 -04:00
Wilson Snyder	8d0c06e570	devel release	2022-08-31 19:49:24 -04:00
Wilson Snyder	5b2fbf4f37	Version bump	2022-08-31 19:46:45 -04:00
Wilson Snyder	592dab2bdb	Commentary: Changes update	2022-08-31 19:27:43 -04:00
Wilson Snyder	51daa64e9a	Fix --hierarchical with order-based pin connections (#3585).	2022-08-31 18:12:21 -04:00
Geza Lore	c0f9b0d8f6	V3Partition: Refactor initialization of MTask dependencies No functional change	2022-08-31 16:54:04 +01:00
Geza Lore	505bba14eb	Improve PartFixDataHazards for clarity and speed. - Use modern C++ - Implement OrderLogicVertex->LogicMTask map with OrderLogicVertex::userp(), instead of std::unordered_map - Simplify data structures - Simplify code and assert properties No functional change.	2022-08-31 16:52:05 +01:00
Geza Lore	ebbe24966c	Remove unnecessary virtual methods	2022-08-31 16:52:05 +01:00
Geza Lore	881c3f6e40	Minor optimization of PartContraction Remove rarely used debug code from initialization loop.	2022-08-31 16:52:05 +01:00
Geza Lore	546aeab9f2	V3Order: Minor refactoring for clarity Refactor ProcessMoveBuildGraph utilizing the fact that OrderGraph is a bipartite graph, also remove unnecessary unordered_map and distribute variable domain map. No functional change.	2022-08-31 16:52:05 +01:00
Geza Lore	8de21e9bb7	Document and ensure OrderGraph is bipartite Minor refactoring and documentation. No functional change.	2022-08-31 16:52:05 +01:00
Geza Lore	2ecda74471	Merge branch 'master' into develop-v5	2022-08-31 10:45:18 +01:00
Aleksander Kiryk	2136afde6b	Support negated properties (#3572)	2022-08-30 06:33:42 -04:00
Wilson Snyder	ea55db7286	Internals: Cleanup some string constructors. No functional change.	2022-08-30 01:02:39 -04:00
Wilson Snyder	819e8741cc	Merge branch 'master' into develop-v5	2022-08-30 00:20:21 -04:00
Wilson Snyder	6a5f77b278	Internals: Cleanup some string/model constructors. No functional change.	2022-08-29 23:50:32 -04:00
Wilson Snyder	8658a0d7dc	Internals: Constructor format update. No functional change.	2022-08-29 23:05:52 -04:00
Wilson Snyder	c335aad25f	Fix --hierarchical with order-based pin connections (#3583).	2022-08-29 22:49:19 -04:00
Wilson Snyder	9d9d647c1f	Fix indentation of --protect import function SV code.	2022-08-29 22:28:02 -04:00
Wilson Snyder	d47a37fb76	Internals: Cleanup constructors etc. No functional change.	2022-08-29 22:17:27 -04:00
Aleksander Kiryk	24ec84851a	Support $sampled (#3569)	2022-08-29 08:39:41 -04:00
Arkadiusz Kozdra	0a3a15a66e	Support class parameters (#2231) (#3541)	2022-08-28 10:24:55 -04:00
Wilson Snyder	2358ced061	Rename tracing rolloverSize and add test (#3570).	2022-08-28 08:25:02 -04:00
Krzysztof Bieganski	2af5304884	Fix tracing of slow coroutines (#3576 part) (#3579)	2022-08-26 05:11:44 -05:00
Varun Koyyalagunta	5869fdf7f6	Fix $dump systemtask with --output-split-cfuncs (#3495) (#3497)	2022-08-25 18:29:11 -05:00
Krzysztof Bieganski	1a1d2ecfd9	Enable tracing in generated main (#3578) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-08-25 14:55:37 +01:00
Geza Lore	5c356a4680	Merge branch 'master' into develop-v5	2022-08-22 14:32:06 +01:00
Krzysztof Bieganski	39af5d020e	Timing support (#3363) Adds timing support to Verilator. It makes it possible to use delays, event controls within processes (not just at the start), wait statements, and forks. Building a design with those constructs requires a compiler that supports C++20 coroutines (GCC 10, Clang 5). The basic idea is to have processes and tasks with delays/event controls implemented as C++20 coroutines. This allows us to suspend and resume them at any time. There are five main runtime classes responsible for managing suspended coroutines: * `VlCoroutineHandle`, a wrapper over C++20's `std::coroutine_handle` with move semantics and automatic cleanup. * `VlDelayScheduler`, for coroutines suspended by delays. It resumes them at a proper simulation time. * `VlTriggerScheduler`, for coroutines suspended by event controls. It resumes them if its corresponding trigger was set. * `VlForkSync`, used for syncing `fork..join` and `fork..join_any` blocks. * `VlCoroutine`, the return type of all verilated coroutines. It allows for suspending a stack of coroutines (normally, C++ coroutines are stackless). There is a new visitor in `V3Timing.cpp` which: * scales delays according to the timescale, * simplifies intra-assignment timing controls and net delays into regular timing controls and assignments, * simplifies wait statements into loops with event controls, * marks processes and tasks with timing controls in them as suspendable, * creates delay, trigger scheduler, and fork sync variables, * transforms timing controls and fork joins into C++ awaits There are new functions in `V3SchedTiming.cpp` (used by `V3Sched.cpp`) that integrate static scheduling with timing. This involves providing external domains for variables, so that the necessary combinational logic gets triggered after coroutine resumption, as well as statements that need to be injected into the design eval function to perform this resumption at the correct time. There is also a function that transforms forked processes into separate functions. See the comments in `verilated_timing.h`, `verilated_timing.cpp`, `V3Timing.cpp`, and `V3SchedTiming.cpp`, as well as the internals documentation for more details. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-08-22 13:26:32 +01:00
Geza Lore	9ac64d0b92	Improve performance of MTask coarsening Various optimizations to speed up MTasks coarsening (which is the long pole in the multi-threaded scheduling of very large designs). The biggest impact ones: - Use efficient hand written Pairing Heaps for implementing priority queues and the scoreboard, instead of the old SortByValueMap. This helps us avoid having to sort a lot of merge candidates that we will never actually consider and helps a lot in performance. - Remove unnecessary associative containers and store data structures (the heap nodes in particular) directly in the object they relate to. This eliminates a huge amount of lookups and helps a lot in performance. - Distribute storage for SiblingMC instances into the LogicMTask instances, and combine with the sibling maps. This again eliminates hash table lookups and makes storage structures smaller. - Remove some now bidirectional edge maps, keep only the forward map. There are also some other smaller optimizations: - Replaced more unnecessary dynamic_casts with static_casts - Templated some functions/classes to reduce the number of static branches in loops. - Improves sorting of edges for sibling candidate creation - Various micro-optimizations here and there This speeds up MTask coarsening by 3.8x on a large design, which translates to a 2.5x speedup of the ordering pass in multi-threaded mode. (Combined with the earlier optimizations, ordering is now 3x faster.) Due to the elimination of a lot of the auxiliary data structures, and ensuring a minimal size for the necessary ones, memory consumption of the MTask coarsening is also reduced (measured up to 4.4x reduction though the accuracy of this is low). The algorithm is identical except for minor alterations of the order some candidates are added or removed, this can cause perturbation in the output due to tied scores being broken based on IDs.	2022-08-20 21:18:50 +01:00
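
Note on 9ac64d0b92: the message credits much of the speedup to replacing a fully sorted map with pairing heaps, so that candidates which are never popped are never sorted. The following is a minimal, generic sketch of a min-ordered pairing heap, for illustration only. It is not Verilator's implementation (which the message describes as storing heap nodes directly in the objects they relate to); the names `PairingHeap` and `heapSorted` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified min-ordered pairing heap of ints.
// Each node owns its key; children are kept as a singly linked sibling list.
class PairingHeap final {
    struct Node {
        int key;
        Node* child = nullptr;  // Leftmost child
        Node* sibling = nullptr;  // Next sibling to the right
        explicit Node(int k) : key{k} {}
    };
    Node* m_root = nullptr;
    std::size_t m_size = 0;

    // Link two roots: the larger key becomes the leftmost child of the smaller
    static Node* meld(Node* a, Node* b) {
        if (!a) return b;
        if (!b) return a;
        if (b->key < a->key) std::swap(a, b);
        b->sibling = a->child;
        a->child = b;
        return a;
    }
    // Standard two-pass merge of a sibling list into a single tree
    static Node* mergePairs(Node* first) {
        Node* pairs = nullptr;  // Melded pairs, stored in reverse order
        while (first) {
            Node* const a = first;
            Node* const b = a->sibling;
            first = b ? b->sibling : nullptr;
            a->sibling = nullptr;
            if (b) b->sibling = nullptr;
            Node* const melded = meld(a, b);
            melded->sibling = pairs;
            pairs = melded;
        }
        Node* result = nullptr;
        while (pairs) {  // Second pass: accumulate right to left
            Node* const next = pairs->sibling;
            pairs->sibling = nullptr;
            result = meld(result, pairs);
            pairs = next;
        }
        return result;
    }

public:
    PairingHeap() = default;
    PairingHeap(const PairingHeap&) = delete;
    PairingHeap& operator=(const PairingHeap&) = delete;
    ~PairingHeap() {
        while (!empty()) pop();
    }
    bool empty() const { return !m_root; }
    std::size_t size() const { return m_size; }
    // Insert is a single meld; no sorting work is done up front
    void push(int key) {
        m_root = meld(m_root, new Node{key});
        ++m_size;
    }
    int top() const {
        assert(m_root);
        return m_root->key;
    }
    // Removing the minimum is where the (amortized) reordering cost is paid
    void pop() {
        assert(m_root);
        Node* const old = m_root;
        m_root = mergePairs(old->child);
        delete old;
        --m_size;
    }
};

// Drain a heap built from 'values'; yields the keys in ascending order
std::vector<int> heapSorted(const std::vector<int>& values) {
    PairingHeap heap;
    for (const int v : values) heap.push(v);
    std::vector<int> out;
    while (!heap.empty()) {
        out.push_back(heap.top());
        heap.pop();
    }
    return out;
}
```

Because `push` is a single link operation and the restructuring is deferred to `pop`, entries that are inserted but never extracted cost very little, which matches the motivation given in the commit message.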

5368 Commits