Commit Graph

1881 Commits

Author SHA1 Message Date
Wilson Snyder
035bf13e4a Fix foreach unnamedblk duplicate error (#3885). 2023-01-18 21:48:06 -05:00
Wilson Snyder
5fce23e90d Fix empty case items crash (#3851). 2023-01-10 07:18:12 -05:00
Wilson Snyder
30f6831be6 Commentary: Changes update 2023-01-07 14:54:12 -05:00
Wilson Snyder
b24d7c83d3 Copyright year update 2023-01-01 10:18:39 -05:00
Wilson Snyder
3ccb2e0f2d Fix initiation of function variables (#3815). 2022-12-23 10:51:52 -05:00
Wilson Snyder
71d29a235f Commentary: Changes update 2022-12-20 19:51:17 -05:00
Wilson Snyder
bae60ab8ea devel release 2022-12-14 22:08:39 -05:00
Wilson Snyder
4cefccf5d2 Version bump 2022-12-14 21:59:58 -05:00
Wilson Snyder
01b521a0ea Commentary: Changes update 2022-12-11 15:10:23 -05:00
Wilson Snyder
afc66f6a85 Fix make jobserver with submakes (#3758). 2022-12-11 14:19:40 -05:00
Wilson Snyder
722e38f532 Commentary, part of last commit (#3783) 2022-12-11 13:30:00 -05:00
Wilson Snyder
3f4d4dec77 Fix ENUMVALUE on typedef (#3777) 2022-12-11 11:50:22 -05:00
Wilson Snyder
a0e7930036 docs: Fix spelling 2022-12-09 22:39:41 -05:00
Wilson Snyder
e465a30eee Fix lint_off EOFNEWLINE in .vlt files (#3796). 2022-12-01 18:27:36 -05:00
Wilson Snyder
d87ef8394a Fix CASEINCOMPLETE when covers all enum values (#3745) (#3782).
Co-authored-by: "G-A. Kamendje" <gkamendje@gmail.com>
2022-11-30 19:42:21 -05:00
Wilson Snyder
8ff607f679 Deprecate verilated_fst_sc.cpp and verilated_vcd_sc.cpp (#3507) 2022-11-29 22:17:50 -05:00
Wilson Snyder
f4be3d5d2b Fix empty string literals converting to string types (#3774). 2022-11-27 13:28:57 -05:00
Wilson Snyder
aacb38b776 Support assignment expressions. 2022-11-19 15:23:37 -05:00
Wilson Snyder
3c77c7bb92 Support and 2022-11-16 21:10:54 -05:00
Wilson Snyder
c6ecd60993 Support pre_randomize and post_randomize. 2022-11-13 11:59:40 -05:00
Wilson Snyder
d25834e57b Add ENUMVALUE warning when value misused for enum (#726). 2022-11-12 20:11:05 -05:00
Wilson Snyder
0a045a7bf6 Change ENDLABEL from warning into an error. 2022-11-12 12:09:48 -05:00
Wilson Snyder
a427860825 Support randcase. 2022-11-11 21:53:05 -05:00
Wilson Snyder
227e61f891 Fix comparing ranged slices of unpacked arrays. 2022-11-11 18:01:30 -05:00
Wilson Snyder
9d7c4d9af3 Fix wait 0. 2022-11-11 17:18:59 -05:00
Wilson Snyder
16586d1d37 Fix tracing parameters overridden with -G (#3723). 2022-11-10 20:30:10 -05:00
Wilson Snyder
e64295e92b Fix missing UNUSED warnings with --coverage (#3736). 2022-11-09 21:45:14 -05:00
Wilson Snyder
167f4ebbd4 Commentary: Changes update 2022-11-05 09:32:41 -04:00
Wilson Snyder
0b0b642241 devel release 2022-10-29 17:54:12 -04:00
Wilson Snyder
52d29d238c Version bump 2022-10-29 17:45:54 -04:00
Wilson Snyder
5c658f8cd5 Fix width mismatch on inside operator (#3714). 2022-10-28 06:38:49 -04:00
Wilson Snyder
5f9b0929b4 Commentary: Changes update 2022-10-27 21:35:09 -04:00
Wilson Snyder
4154584c4b Commentary: Changes update 2022-10-21 20:04:07 -04:00
Wilson Snyder
a57a3579c0 Fix false LATCH warning on 'unique if' (#3088). 2022-10-21 19:10:06 -04:00
Wilson Snyder
347e9b4ec8 Fix cell assigning integer array parameters (#3299). 2022-10-21 18:26:39 -04:00
Wilson Snyder
79682e6072 Support empty generate_regions (#3695). [mpb27] 2022-10-20 22:04:50 -04:00
Wilson Snyder
7e1b92fa75 Add --get-supported to determine what features are in Verilator (#3688). 2022-10-20 21:42:30 -04:00
Wilson Snyder
e7068369fe Fix $display of fixed-width numbers (#3565). 2022-10-18 21:10:35 -04:00
Wilson Snyder
b930d0731a Fix foreach and pre/post increment in functions (#3613). 2022-10-18 20:04:09 -04:00
Wilson Snyder
2723223884 Fix LSB error on --hierarchical submodules (#3539). 2022-10-18 17:29:51 -04:00
Wilson Snyder
cb7b024e8f Commentary: Spelling, and add upgrade notes (#3462) 2022-10-16 11:10:41 -04:00
Wilson Snyder
b16b607b98 Commentary: Changes update 2022-10-15 11:04:03 -04:00
Wilson Snyder
732d5bea10 Commentary: Standard format for company contributions 2022-10-15 10:59:31 -04:00
Wilson Snyder
d9a0d0ade2 Commentary (fix earlier commit) 2022-10-15 10:57:46 -04:00
Wilson Snyder
14f58ed6c7 Add error on real edge event control. 2022-10-15 06:21:34 -04:00
Geza Lore
b2070a9407 Commentary: Mention DFG in changes 2022-10-12 10:21:02 +01:00
Wilson Snyder
880cac2fdd Merge branch 'master' into develop-v5 2022-10-01 11:24:55 -04:00
Wilson Snyder
0b843ada03 devel release 2022-10-01 08:34:43 -04:00
Wilson Snyder
746c7ea8f7 Version bump 2022-10-01 08:28:27 -04:00
Wilson Snyder
fa4b10b4d9 Commentary: Changes update 2022-09-30 23:03:26 -04:00
Wilson Snyder
cd2a5771b8 Add --timing to --binary (#3625). 2022-09-28 19:02:23 -04:00
Wilson Snyder
b92173bf3d Add --binary option as alias of --main --exe --build (#3625). 2022-09-28 09:04:33 -04:00
Wilson Snyder
d162619bd3 Merge branch 'master' into develop-v5 2022-09-20 20:06:21 -04:00
Wilson Snyder
fc4ffd454e Rename --bin to --build-dep-bin. 2022-09-18 10:32:43 -04:00
Geza Lore
27031ed688 Merge branch 'master' into develop-v5 2022-09-15 10:28:35 +01:00
Wilson Snyder
75fd71d7e5 Add --main to generate main() C++ (previously was experimental only) (#3265). 2022-09-14 20:18:40 -04:00
Wilson Snyder
9efd64ab98 Commentary 2022-09-14 20:13:28 -04:00
Wilson Snyder
7aa01625d8 Commentary: Changes update 2022-09-14 08:15:42 -04:00
Wilson Snyder
81fe35ee2e Fix typedef'ed class conversion to boolean (#3616). 2022-09-12 18:03:56 -04:00
Geza Lore
fd6275a62b Merge branch 'master' into develop-v5 2022-09-05 17:03:43 +01:00
Geza Lore
d42a2d6494 Fix V3Gate crash on circular logic
The recent patch to defer substitutions on V3Gate crashes on circular
logic that has cycle length >= 3 with all inlineable signals (cycle
length 2 is detected correctly and is not inlined). Fix by stopping
recursion at the loop-back edge.

Fixes #3543
2022-09-02 19:58:58 +01:00
Wilson Snyder
849bb5590a Merge branch 'master' into develop-v5 2022-08-31 19:51:07 -04:00
Wilson Snyder
8d0c06e570 devel release 2022-08-31 19:49:24 -04:00
Wilson Snyder
5b2fbf4f37 Version bump 2022-08-31 19:46:45 -04:00
Wilson Snyder
592dab2bdb Commentary: Changes update 2022-08-31 19:27:43 -04:00
Wilson Snyder
51daa64e9a Fix --hierarchical with order-based pin connections (#3585). 2022-08-31 18:12:21 -04:00
Wilson Snyder
819e8741cc Merge branch 'master' into develop-v5 2022-08-30 00:20:21 -04:00
Wilson Snyder
c335aad25f Fix --hierarchical with order-based pin connections (#3583). 2022-08-29 22:49:19 -04:00
Wilson Snyder
2358ced061 Rename tracing rolloverSize and add test (#3570). 2022-08-28 08:25:02 -04:00
Geza Lore
5c356a4680 Merge branch 'master' into develop-v5 2022-08-22 14:32:06 +01:00
Krzysztof Bieganski
39af5d020e
Timing support (#3363)
Adds timing support to Verilator. It makes it possible to use delays,
event controls within processes (not just at the start), wait
statements, and forks.

Building a design with those constructs requires a compiler that
supports C++20 coroutines (GCC 10, Clang 5).

The basic idea is to have processes and tasks with delays/event controls
implemented as C++20 coroutines. This allows us to suspend and resume
them at any time.

There are five main runtime classes responsible for managing suspended
coroutines:
* `VlCoroutineHandle`, a wrapper over C++20's `std::coroutine_handle`
  with move semantics and automatic cleanup.
* `VlDelayScheduler`, for coroutines suspended by delays. It resumes
  them at a proper simulation time.
* `VlTriggerScheduler`, for coroutines suspended by event controls. It
  resumes them if its corresponding trigger was set.
* `VlForkSync`, used for syncing `fork..join` and `fork..join_any`
  blocks.
* `VlCoroutine`, the return type of all verilated coroutines. It allows
  for suspending a stack of coroutines (normally, C++ coroutines are
  stackless).

There is a new visitor in `V3Timing.cpp` which:
  * scales delays according to the timescale,
  * simplifies intra-assignment timing controls and net delays into
    regular timing controls and assignments,
  * simplifies wait statements into loops with event controls,
  * marks processes and tasks with timing controls in them as
    suspendable,
  * creates delay, trigger scheduler, and fork sync variables,
  * transforms timing controls and fork joins into C++ awaits

There are new functions in `V3SchedTiming.cpp` (used by `V3Sched.cpp`)
that integrate static scheduling with timing. This involves providing
external domains for variables, so that the necessary combinational
logic gets triggered after coroutine resumption, as well as statements
that need to be injected into the design eval function to perform this
resumption at the correct time.

There is also a function that transforms forked processes into separate
functions.

See the comments in `verilated_timing.h`, `verilated_timing.cpp`,
`V3Timing.cpp`, and `V3SchedTiming.cpp`, as well as the internals
documentation for more details.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-22 13:26:32 +01:00
Geza Lore
9ac64d0b92 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-20 21:18:50 +01:00
Wilson Snyder
ebb37b0156 Merge branch 'master' into develop-v5 2022-08-20 14:02:09 -04:00
Wilson Snyder
90dc04cf93 Add --future0 and --future1 options. 2022-08-20 14:01:13 -04:00
Geza Lore
4d81eb021d Revert "Improve performance of MTask coarsening"
This reverts commit 83475008d9.
2022-08-19 18:03:45 +01:00
Geza Lore
83475008d9 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-19 16:59:20 +01:00
Geza Lore
1404319b28 Merge branch 'master' into develop-v5 2022-08-19 13:39:44 +01:00
Wilson Snyder
f435d96241 Fix case statement comparing string literal (#3544). 2022-08-15 21:56:09 -04:00
Wilson Snyder
cbe1b8e266 Fix segfault exporting non-existant package (#3535). 2022-08-08 17:53:50 -04:00
Geza Lore
ad2fbfe62d Merge branch 'master' into develop-v5 2022-07-29 12:04:24 +01:00
Yutetsu TAKATSUKASA
1f9323d086
Set correct dtype in replaceShiftSame() (#3520)
* Tests: Add a test to reproduce bug3399

* Fix3399. Set the correct dtype in replaceShiftSame().

* Tests: update stats.

* Update Changes
2022-07-29 07:05:04 +09:00
Yutetsu TAKATSUKASA
60eab3eb8c
Fix wrong result of bit op tree optimization #3509 (#3516)
* Tests: Add a test to reproduce #3509

* Tests: Compile without tautological-compare check because bit op tree optimization is disabled in the test.

* Internals: Dedup code. No functional change is intended.

* Fix #3509.

"2'b10 == (2'b11 & {1'b0, val[0]})"  and "2'b10 != (2'b11 & {1'b0, val[0]})" were
wrongly optimized to "!val[0]" and "val[0]" respectively.
Now properly optimize them to 1'b0 and 1'b1.

* Commentary

* Commentary: Update Changes
2022-07-24 19:54:37 +09:00
Geza Lore
c9ac9a75a6 Merge branch 'master' into develop-v5 2022-07-12 17:29:45 +01:00
Wilson Snyder
5f3316d3dc * Fix empty string arguments to display (#3484). 2022-07-09 08:30:57 -04:00
Wilson Snyder
a4fddb3fbe Fix table misoptimizing away display (#3488). 2022-07-09 07:55:46 -04:00
Yutetsu TAKATSUKASA
9f37cef1bb
Fix #3470 of incorrect bit op tree optimization (#3476)
* Tests: Add a test to reproduce #3470

* Update LSB during return path of traversal. No functional change is intended.

* Introduce LeafInfo::m_msb

* Update LeafInfo::m_msb when visitin AstCCast

* Internals: Add comment, reorder. No functional change is intended.

* Delete explicit from copy constructor to fix build error.

* Update Changes

* Internals: Remove unused parameter. No functional change is intended.

* Tests: Add explanation to t_const_opt.
2022-07-06 08:33:37 +09:00
Wilson Snyder
e7ca4a69e3 Merge branch 'master' into develop-v5 2022-06-19 15:22:09 -04:00
Wilson Snyder
f2fba51fe2 devel release 2022-06-19 15:13:29 -04:00
Wilson Snyder
7c79f0d431 Version bump (Changes update) 2022-06-19 15:10:23 -04:00
Wilson Snyder
1fa82ffb7b Commentary: Update ChangeLog 2022-06-19 15:00:10 -04:00
Wilson Snyder
e7dc2de14b Fix BLKANDNBLK on $readmem/$writemem (#3379). 2022-06-04 12:43:18 -04:00
Wilson Snyder
0f324c8309 Merge branch 'master' into develop-v5 2022-06-04 11:59:49 -04:00
Wilson Snyder
59dc2853e3 Support concat assignment to packed array (#3446). 2022-06-03 21:32:13 -04:00
Wilson Snyder
ada58465b2 Add -f<optimization> options to replace -O<letter> options (#3436). 2022-06-03 20:43:16 -04:00
Wilson Snyder
173f57c636 Changed --no-merge-const-pool to -fno-merge-const-pool (#3436). 2022-06-03 19:41:59 -04:00
Geza Lore
b51f887567
Perform VCD tracing in parallel when using --threads (#3449)
VCD tracing is now parallelized using the same thread pool as the model.
We achieve this by breaking the top level trace functions into multiple
top level functions (as many as --threads), and after emitting the time
stamp to the VCD file on the main thread, we execute the tracing
functions in parallel on the same thread pool as the model (which we
pass to the trace file during registration), tracing into a secondary
per thread buffer. The main thread will then stitch (memcpy) the buffers
together into the output file.

This makes the `--trace-threads` option redundant with `--trace`, which
now only affects `--trace-fst`. FST tracing uses the previous offloading
scheme.

This obviously helps a lot in VCD tracing performance, and I have seen
better than Amdahl speedup, namely I get 3.9x on XiangShan 4T (2.7x on
OpenTitan 4T).
2022-05-29 19:08:39 +01:00
Geza Lore
0722f47539
Improve V3MergeCond by reordering statements (#3125)
V3MergeCond merges consecutive conditional `_ = cond ? _ : _` and
`if (cond) ...` statements. This patch adds an analysis and ordering
phase that moves statements with identical conditions closer to each
other, in order to enable more merging opportunities. This in turn
eliminates a lot of repeated conditionals which reduced dynamic branch
count and branch misprediction rate. Observed 6.5% improvement on
multi-threaded large designs, at the cost of less than 2% increase in
Verilation speed.
2022-05-27 16:57:51 +01:00
Krzysztof Bieganski
d7a75dc026 Merge branch 'master' into develop-v5 2022-05-25 11:06:38 +02:00
Wilson Snyder
530817191e Support non-ANSI interface port declarations (#3439). 2022-05-25 00:50:50 -04:00
Geza Lore
c7610ed044 Fix FST tracing thread in CMake build 2022-05-20 17:04:46 +01:00