Commit Graph

36 Commits

Author SHA1 Message Date
Geza Lore
83475008d9 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-19 16:59:20 +01:00
Geza Lore
96a4b3e5a5 Update clang-format config and apply
- Regroup and sort #include directives (like we used to, but automatic)
- Set AlwaysBreakTemplateDeclarations to true
2022-08-05 12:00:24 +01:00
Geza Lore
fac8e76923 Rework SortByValueMap for better performance
Keep a single std::set of key/value pairs, and a single unordered_map
from key to iterators into the set. Also improve some of the accessing
mechanisms using modern C++. This speeds up multi-threaded ordering by
about 10%.
2022-08-03 21:17:02 +01:00
Wilson Snyder
15b32dc140 Internals: cpplint cleanups. No functional change. 2022-01-08 12:01:39 -05:00
Wilson Snyder
24a0d2a0c9 Internals: Favor member assignment initialization. No functional change intended. 2022-01-01 11:46:49 -05:00
Wilson Snyder
ca42be982c Copyright year update. 2022-01-01 08:26:40 -05:00
Wilson Snyder
cd737065f2 Internals: More const. No functional change intended. 2021-11-26 17:55:36 -05:00
Wilson Snyder
8b00939f0c Improve performance of V3Scoreboard. Only performance change intended. 2021-11-03 22:16:18 -04:00
Wilson Snyder
c26ce25cea Internals: Add more const. No functional change. 2021-11-03 17:49:19 -04:00
Wilson Snyder
512fe0a2d1 Internals: Add const. No functional change. 2021-06-20 18:33:13 -04:00
Wilson Snyder
3a55600913 Internals: Restyle with C++11 using replacing typedef 2021-03-12 18:10:45 -05:00
Wilson Snyder
404b323f8c Internals: Remove some unnecessary typedefs. No functional change. 2021-03-12 17:26:53 -05:00
Wilson Snyder
be31fdcfe4 Use Google-style-guide header guard naming, to avoid __ prefix. 2021-03-03 21:57:07 -05:00
Wilson Snyder
bd602d0e2d Copyright year update 2021-01-01 10:29:54 -05:00
Wilson Snyder
b6ded59c2b Internals: Use and enforce class final for ~5% performance boost. 2020-11-18 21:32:16 -05:00
Wilson Snyder
1b0a48ea02 Internals: Use C++11 = default where obvious. No functional change intended. 2020-11-16 19:56:16 -05:00
Wilson Snyder
ee9d6dd63f C++11: Favor auto, range for. No functional change intended. 2020-08-16 11:44:06 -04:00
Wilson Snyder
72d2cff0a1 C++11: Use member declaration initalizations. No functional change intended. 2020-08-16 11:44:06 -04:00
Wilson Snyder
c0127599df C++11: Use nullptr. No functional change. 2020-08-16 11:44:05 -04:00
Wilson Snyder
7c54a451a9 C++11: Remove pre-c11 VL_OVERRIDE etc. No functional change. 2020-08-16 11:44:05 -04:00
Wilson Snyder
9927e8b3ee clang-format uses C++11 style. No functional change. 2020-08-15 09:48:08 -04:00
Wilson Snyder
17e7da77f0 Misc internal coverage improvements. 2020-05-16 18:02:54 -04:00
Wilson Snyder
57a937df03 Misc internal coverage cleanups 2020-05-16 07:43:22 -04:00
Wilson Snyder
5c966ec510 clang-format many files. No functional change.
Use nodist/clang_formatter to reformat files that are now clean.
2020-04-13 22:52:23 -04:00
Wilson Snyder
38a31ae168 Cleanup misc clang-tidy warnings. No functional change intended 2020-04-03 22:31:54 -04:00
Wilson Snyder
1ce360ed5b Add SPDX license identifiers. No functional change. 2020-03-21 11:24:24 -04:00
Wilson Snyder
0aabe6ce00 Internals: Fix cppcheck warning including missing init. 2020-02-03 22:10:29 -05:00
Wilson Snyder
f23fe8fd84 Update copyright year. 2020-01-06 18:05:53 -05:00
Wilson Snyder
5811ec07e6 Update URLs to https://verilator.org 2019-11-07 22:33:59 -05:00
Wilson Snyder
771a301f66 Commentary: Remove newlines, upsets some patches. No functional change. 2019-10-04 20:17:11 -04:00
Wilson Snyder
01ef7122e9 Internals: Add lcov code coverage markers. 2019-06-30 22:37:03 -04:00
Wilson Snyder
8a4aeddbb0 Copyright year update. 2019-01-03 19:17:22 -05:00
Wilson Snyder
d87b9d25ca Internals: Cleanup and standardize include order. No functional change intended. 2018-10-14 13:59:40 -04:00
Kevin Kiningham
b2ca4995ab Fix Mac OSX 10.13.6 / LLVM 9.1 compile issues, bug1348. 2018-09-17 18:35:23 -04:00
Wilson Snyder
0070520edb Fix cppcheck warnings. 2018-07-22 20:41:46 -04:00
Wilson Snyder
e37dce9d85 Internals: Add new graph algs for future partitioning. 2018-07-15 22:09:27 -04:00