Commit Graph

5310 Commits

Author SHA1 Message Date
Geza Lore
03ac7ad730 Make PartPropagateCp specific to the MTask graph
While keeping the client code abstract in PartPropagateCp is nice for
testing, there is performance to be had removing the abstraction. As
this code dominates in scheduling large designs, we eliminate the
abstraction and re-work the testing to use the actual LogicMTask and
MTaskEdge graph types. No functional change intended.
2022-08-19 14:06:11 +01:00
Geza Lore
cd50949a7e Reuse MTaskEdge instances in MT scheduling
Instead of deleting then re-allocating MTaskEdge instances when merging
two MTasks, just redirect the edges of the donor MTask to the recipient
MTask. This is both faster as it avoids an allocation and a deletion,
together with one update of the sibling maps, and also makes the
algorithm more stable due to MergeCandidate IDs being stable and
allocated up front for all MTaskEdges, before any SiblingMCs are
allocated.

Perturbations in output are expected as the IDs used to break ties
between merge candidates with equal costs are not updated when
redirecting an edge (on purpose). The relinking of only one end of the
graph edges also perturbs the order in which they are enumerated, which
does change candidate opportunities when the number of edges is larger
than PART_SIBLING_EDGE_LIMIT. Confirmed output is identical when
IDs are updated and edges are updated to appear in their original order.
2022-08-19 14:06:11 +01:00
Geza Lore
f0040c7b9a Remove reliance on pointer comparison in MT scheduling
The critical path propagation used to rely on a pointer comparison to
break equal scoring critical path updates. Use the corresponding mtask
ids instead, which is deterministic across invocations.
2022-08-19 14:06:11 +01:00
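
The tie-break described in f0040c7b9a can be illustrated with a minimal sketch. This is not the Verilator code: the `Task` struct, `preferred` and `pickBest` are hypothetical names, and "higher cost wins" is an assumption made only for the example. The point is that equal scores are resolved by a stable id, not by an address that may differ between runs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an mtask vertex (not Verilator's LogicMTask).
struct Task {
    uint32_t id;    // Unique id, assigned in a deterministic order
    uint32_t cost;  // Score being compared (assumed: higher is preferred)
};

// Returns true if 'a' should be preferred over 'b'.
// Equal costs are resolved by id, never by comparing the pointers themselves.
inline bool preferred(const Task* a, const Task* b) {
    assert(a && b);
    if (a->cost != b->cost) return a->cost > b->cost;
    return a->id < b->id;
}

// Pick the preferred task from a non-empty list, independent of list order.
inline const Task* pickBest(const std::vector<const Task*>& tasks) {
    assert(!tasks.empty());
    const Task* best = tasks.front();
    for (const Task* t : tasks) {
        if (preferred(t, best)) best = t;
    }
    return best;
}
```

With a pointer comparison in place of the id comparison, the winner among equal-cost tasks would depend on where the allocator happened to place them.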
Geza Lore
f8a0389e73 Do not use stepCost when gathering sibling merge candidates
siblingPairFromRelatives gathers neighbours of a vertex, and sorts them.
It then takes the N best nodes, and creates sibling merge candidates
from them. We now use the unadjusted cost instead of the step cost of
the vertices when sorting. This is both faster as we need not do the
log-space rounding to compute stepCost, and will also make similar yet
cheaper nodes appear closer to the front as we don't lose precision
in rounding, hence they are more likely to be entered as merge
candidates. Note that when creating the merge candidate, we still use
the stepCost, so its purpose of reducing the propagation of critical
path updates is maintained in full. In summary, this should make both
Verilator and the generated model very slightly faster, at least in
theory, and I have observed minor improvement in places.
2022-08-19 14:06:11 +01:00
Geza Lore
b436794773 Add specialized GraphStreamUnordered
GraphStreamUnordered used to be GraphStream<std::less<const
V3GraphVertex*>>, but a lot of performance improvements can be had by a
specialized implementation, so added a highly optimized one. This helps
a lot with --debug-partition.
2022-08-19 14:06:11 +01:00
Geza Lore
90d22cbec6 Fix AstNode::exists return type 2022-08-19 13:22:06 +01:00
Krzysztof Bieganski
33e2acfe61 Fix AstNode::forall return type (#3559)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-19 12:33:17 +01:00
Ryszard Rozak
db5fdfb0ee Fix === with some tristate constants (#3551). 2022-08-18 07:03:05 -04:00
Krzysztof Bieganski
951cd73fe0 Handle MemberSel in V3EmitV.cpp (#3555) 2022-08-18 06:33:45 -04:00
Arkadiusz Kozdra
0eeb40b975 Fix converting subclasses to string (#3552) 2022-08-17 18:08:43 -04:00
Wilson Snyder
93272c13fd Tests: Confirm fixed (#181) 2022-08-15 22:17:36 -04:00
Wilson Snyder
43abaeb055 Tests: Confirm fixed (#485) 2022-08-15 22:17:17 -04:00
Wilson Snyder
18b9e661c9 Tests: Confirm fixed (#446) 2022-08-15 22:17:09 -04:00
Wilson Snyder
f435d96241 Fix case statement comparing string literal (#3544). 2022-08-15 21:56:09 -04:00
github action
d32e3f042f Apply 'make format' 2022-08-12 10:56:12 +00:00
Mostafa Gamal
df5f95a5bd Fix nested default assignment for struct pattern (#3511) (#3524) 2022-08-12 06:55:07 -04:00
Drew Ranck
b0c475205b Fix void-cast queue pop_front or pop_back (#3542) (#3364)
Fix compile error for queue method usage, if it is the
first statement in a block of code, and the return
value is not used. Example:

>  if (foo)
>    void'(bar.pop_front());
2022-08-12 06:51:25 -04:00
Wilson Snyder
1e2219347e Internals: Cleanup ifdef, move up not under compiler version ifdef 2022-08-11 17:41:43 -04:00
Wilson Snyder
cbe1b8e266 Fix segfault exporting non-existent package (#3535). 2022-08-08 17:53:50 -04:00
Mariusz Glebocki
2b12fe5773 Internals: Construct V3Number with correct type instead of changing it manually. (#3529) 2022-08-08 08:17:02 -04:00
Yutetsu TAKATSUKASA
d20f22beb1 Fix tristate logic when reading inout port in a module #3399 (#3523)
* Tests: Add a test to reproduce #3399

* Fix #3399. When reading an inout port in a module, it should refer to the
original inout port, not the generated MODTEMP.
2022-08-07 21:12:57 +09:00
Wilson Snyder
f4fe10844b Tests: Fix t_flag_help.pl (#3532). 2022-08-07 04:57:59 -04:00
Mariusz Glebocki
122e89ffde Fix V3Number::isMsbXZ(). (#3530) 2022-08-05 19:12:52 +01:00
Geza Lore
96a4b3e5a5 Update clang-format config and apply
- Regroup and sort #include directives (like we used to, but automatic)
- Set AlwaysBreakTemplateDeclarations to true
2022-08-05 12:00:24 +01:00
Geza Lore
fac8e76923 Rework SortByValueMap for better performance
Keep a single std::set of key/value pairs, and a single unordered_map
from key to iterators into the set. Also improve some of the accessing
mechanisms using modern C++. This speeds up multi-threaded ordering by
about 10%.
2022-08-03 21:17:02 +01:00
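
The data layout described in fac8e76923 (one std::set of key/value pairs plus one unordered_map from key to iterators into that set) can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual SortByValueMap: the class name `ValueSortedMap` and its methods are hypothetical, and the real class may order, name, and expose things differently.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <unordered_map>
#include <utility>

// Hypothetical simplified container: lookup by key, iteration ordered by value.
template <typename Key, typename Value>
class ValueSortedMap final {
    using Pair = std::pair<Value, Key>;  // Value first, so the set sorts by value, then key
    std::set<Pair> m_sorted;  // All entries, ordered by value
    std::unordered_map<Key, typename std::set<Pair>::iterator> m_index;  // Key -> set entry

public:
    // Insert a key, or update the value of an existing key
    void set(const Key& key, const Value& value) {
        const auto it = m_index.find(key);
        if (it != m_index.end()) {
            // std::set iterators to other elements stay valid across erase/insert
            m_sorted.erase(it->second);
            it->second = m_sorted.emplace(value, key).first;
        } else {
            m_index.emplace(key, m_sorted.emplace(value, key).first);
        }
    }
    // Remove a key, returns true if it was present
    bool erase(const Key& key) {
        const auto it = m_index.find(key);
        if (it == m_index.end()) return false;
        m_sorted.erase(it->second);
        m_index.erase(it);
        return true;
    }
    bool empty() const { return m_sorted.empty(); }
    std::size_t size() const { return m_sorted.size(); }
    // Key with the smallest value (smallest key on equal values)
    const Key& minKey() const {
        assert(!empty());
        return m_sorted.begin()->second;
    }
};
```

In this sketch an update costs one hash lookup plus one erase and one insert in the set, and the smallest entry is always at `begin()`.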
Geza Lore
b864f5f5ba V3Partition: use static_cast with LogicMTaskVertex
dynamic_cast is not free, and the mtask graph contains only
LogicMTaskVertex vertices, so use static_cast instead for some speedup.
2022-08-03 17:05:01 +01:00
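
As an illustration of the trade-off in b864f5f5ba, here is a minimal sketch of a downcast that uses static_cast on the hot path and keeps a dynamic_cast only as a debug check. The types `Vertex` and `TaskVertex` and the helper `asTask` are hypothetical, and the debug assertion is an addition for the example; the commit itself only states that static_cast replaces dynamic_cast.

```cpp
#include <cassert>

// Hypothetical base and derived vertex types (not Verilator's classes).
struct Vertex {
    virtual ~Vertex() = default;
};
struct TaskVertex final : Vertex {
    int id;
    explicit TaskVertex(int i)
        : id{i} {}
};

// Downcast without a run-time type check. This is only valid when the caller
// guarantees that every vertex in the graph really is a TaskVertex.
inline TaskVertex* asTask(Vertex* vp) {
    // Verified in debug builds only; compiled out when NDEBUG is defined
    assert(dynamic_cast<TaskVertex*>(vp) == vp);
    return static_cast<TaskVertex*>(vp);
}
```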
Geza Lore
f9f66d787e Fix integer overflow in V3Unroll (#3451) 2022-08-03 09:41:30 +01:00
Geza Lore
bd211c87aa astgen: split 'visit' method declarations from definitions
Add definitions to V3Ast.cpp, and use static_cast.
This fixes a lot of clang-tidy noise.
2022-08-02 17:53:19 +01:00
Geza Lore
6c33e6e889 Tell clang-tidy .h files are C++ (not C) headers 2022-08-02 17:53:19 +01:00
Kamil Rakoczy
cfb6fd8b34 Reduce max RSS usage (#3483)
By constant folding nodes earlier in V3Expand, we can save some max RSS on large designs.
2022-08-02 13:36:14 +01:00
Geza Lore
cb60663d49 V3Gate: Defer substitutions until required as well
Similarly to the earlier patch that defers constant folding on optimized
logic, we now also defer the variable substitutions. This again
eliminates a lot of traversals, and yields another ~10x speedup of V3Gate
on a design where V3Gate used to dominate while producing identical
results.
2022-08-01 12:54:41 +01:00
Geza Lore
0d2bf23d82 V3Gate: Defer constant folding until required
Rather than constant folding each logic block after every substitution,
only constant fold updated blocks when re-analysed, or at the end. This
removes a lot of invocations of V3Const on large blocks that can be
optimized well, and should yield the same result.

This speeds up V3Gate by ~4x on a design where V3Gate dominates.
2022-07-31 20:42:04 +01:00
Geza Lore
682a60e325 Cleanup V3Gate, no functional change 2022-07-31 20:07:54 +01:00
Geza Lore
2ab6272cc7 Use AstNode::foreach in V3Gate
This yields a little speedup.
2022-07-31 20:05:25 +01:00
Geza Lore
152a6cd886 Improve AstNode::foreach (also exists and forall)
Speed improvements:
- Use a direct, recursion-free implementation
- Improve pre-fetching

Functionality:
- Support remove/replace of currently iterated node
2022-07-31 19:07:32 +01:00
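
A recursion-free traversal of the kind mentioned in 152a6cd886 can be sketched with an explicit stack. This is a generic illustration, not the AstNode::foreach implementation: the `Node` struct and `foreachNode` are hypothetical, and the sketch does not attempt the pre-fetching or the node removal/replacement support listed above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tree node (not Verilator's AstNode).
struct Node {
    int value;
    std::vector<Node*> children;
};

// Visit 'root' and all of its descendants in pre-order, without recursion.
template <typename Callable>
void foreachNode(Node* root, Callable&& f) {
    assert(root);
    std::vector<Node*> stack{root};  // Nodes still to be visited
    while (!stack.empty()) {
        Node* const node = stack.back();
        stack.pop_back();
        // Push children in reverse so that the first child is visited next.
        // They are read before calling 'f', so in this sketch 'f' may change
        // the child list of the visited node without affecting the walk.
        for (auto it = node->children.rbegin(); it != node->children.rend(); ++it) {
            stack.push_back(*it);
        }
        f(node);
    }
}
```

The explicit stack keeps the traversal depth off the call stack, which matters for very deep trees.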
Wilson Snyder
f91793e931 Revert - SC overrides cause non-override clang error. 2022-07-30 13:53:54 -04:00
Wilson Snyder
a2d26b45bb Internals: Fix some clang-tidy issues. No functional change intended. 2022-07-30 11:54:28 -04:00
Wilson Snyder
dce8f3d25d Internals: Spacing from develop-v5. No functional change. 2022-07-30 11:54:28 -04:00
Geza Lore
38e5b6c1ad Replace __gcov_flush with __gcov_dump
__gcov_flush was a private function and was removed from later GCC
versions (at least from 11.2.0, possibly earlier). Replace with the
documented public __gcov_dump.
2022-07-30 16:02:03 +01:00
Wilson Snyder
b9d7819faa Internals: Fix some cppcheck issues. Some dump functions fixed. 2022-07-30 10:01:39 -04:00
Yutetsu TAKATSUKASA
1f9323d086 Set correct dtype in replaceShiftSame() (#3520)
* Tests: Add a test to reproduce bug3399

* Fix3399. Set the correct dtype in replaceShiftSame().

* Tests: update stats.

* Update Changes
2022-07-29 07:05:04 +09:00
Geza Lore
574dbfded1 V3MergeCond: Fix incorrect merge of assignments to the condition 2022-07-28 15:50:02 +01:00
Wilson Snyder
2a87387eb3 Documentation fixes (#3514) 2022-07-28 08:41:01 -04:00
Geza Lore
a5ddd10e31 Tests: compare VCD files both ways
vcddiff is a bit broken, and sometimes 'vcddiff a b' fails while the
files are indeed equivalent. There is a chance however that 'vcddiff b a'
will succeed in this case, so compare trace files both ways when
checking test results and claim success if vcddiff succeeds in at least
one direction.
2022-07-27 10:48:02 +01:00
github action
e871cd8a44 Apply 'make format' 2022-07-25 21:47:29 +00:00
Mostafa Gamal
7b431b37c7 Fix struct pattern assignment (#2328) (#3517). 2022-07-25 17:46:22 -04:00
Gustav Svensk
eeef5ab4de Fix sformat string incorrectly cleared (#3515) (#3519). 2022-07-25 17:36:34 +02:00
Geza Lore
ac4ec87942 Respect clang's default -fbracket-depth by default
Set default value of --comp-limit-parens to 240, to respect default
maximum nesting of parentheses in clang (which is controlled by
-fbracket-depth and defaults to 256). For code generation consistency,
also use the same default with gcc.
2022-07-25 12:59:26 +01:00
Geza Lore
290c2e0388 Mark FileLine::v3errorEndFatal as noreturn 2022-07-25 12:51:02 +01:00
Geza Lore
89924bda51 Always type '$clog2' as signed 32 2022-07-25 12:48:13 +01:00