mirror of https://github.com/verilator/verilator.git
synced 2025-01-22 06:14:02 +00:00
9ac64d0b92
Various optimizations to speed up MTask coarsening (which is the long pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand-written Pairing Heaps for implementing priority queues and the scoreboard, instead of the old SortByValueMap. This avoids sorting a lot of merge candidates that we will never actually consider, and helps performance a lot.
- Remove unnecessary associative containers and store data structures (the heap nodes in particular) directly in the object they relate to. This eliminates a huge number of lookups and helps performance a lot.
- Distribute storage for SiblingMC instances into the LogicMTask instances, and combine with the sibling maps. This again eliminates hash table lookups and makes storage structures smaller.
- Remove some bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static branches in loops
- Improved sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which translates to a 2.5x speedup of the ordering pass in multi-threaded mode. (Combined with the earlier optimizations, ordering is now 3x faster.)

Due to the elimination of a lot of the auxiliary data structures, and ensuring a minimal size for the necessary ones, memory consumption of the MTask coarsening is also reduced (measured up to 4.4x reduction, though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order in which some candidates are added or removed; this can cause perturbation in the output due to tied scores being broken based on IDs.

||
---|---|---|
gtkwave | ||
vltstd | ||
.gitignore | ||
verilated_config.h.in | ||
verilated_cov_key.h | ||
verilated_cov.cpp | ||
verilated_cov.h | ||
verilated_dpi.cpp | ||
verilated_dpi.h | ||
verilated_fst_c.cpp | ||
verilated_fst_c.h | ||
verilated_fst_sc.cpp | ||
verilated_fst_sc.h | ||
verilated_funcs.h | ||
verilated_heavy.h | ||
verilated_imp.h | ||
verilated_intrinsics.h | ||
verilated_profiler.cpp | ||
verilated_profiler.h | ||
verilated_save.cpp | ||
verilated_save.h | ||
verilated_sc.h | ||
verilated_sym_props.h | ||
verilated_syms.h | ||
verilated_threads.cpp | ||
verilated_threads.h | ||
verilated_trace_defs.h | ||
verilated_trace_imp.h | ||
verilated_trace.h | ||
verilated_types.h | ||
verilated_vcd_c.cpp | ||
verilated_vcd_c.h | ||
verilated_vcd_sc.cpp | ||
verilated_vcd_sc.h | ||
verilated_vpi.cpp | ||
verilated_vpi.h | ||
verilated.cpp | ||
verilated.h | ||
verilated.mk.in | ||
verilated.v | ||
verilatedos.h |
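
The commit message above mentions two ideas that are easier to see in code: a pairing heap used as a priority queue, and heap links stored directly in the object they relate to instead of in a separate associative container. The following is a minimal sketch of that general technique only; it is not Verilator's implementation. The names `Task`, `TaskHeap`, and `drainIds` are hypothetical and chosen for illustration, and the heap is a simple max-heap on an integer score with no tie-breaking, decrease-key, or removal support.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical work item. The heap links live inside the object itself,
// so no side table is needed to find an item's heap node.
struct Task {
    int id;
    int score;
    Task* child = nullptr;    // First child in the heap
    Task* sibling = nullptr;  // Next sibling in the parent's child list
    Task(int i, int s) : id{i}, score{s} {}
};

// Minimal intrusive max pairing heap (illustrative only).
class TaskHeap {
    Task* m_root = nullptr;

    // Link two roots: the lower-scoring root becomes the first child of the other.
    static Task* meld(Task* a, Task* b) {
        if (!a) return b;
        if (!b) return a;
        if (a->score < b->score) std::swap(a, b);
        b->sibling = a->child;
        a->child = b;
        return a;
    }

    // Two-pass pairing: meld adjacent siblings, then fold the results together.
    // Recursive for brevity; a production version would likely iterate.
    static Task* mergePairs(Task* first) {
        if (!first) return nullptr;
        Task* const second = first->sibling;
        if (!second) return first;
        Task* const rest = second->sibling;
        first->sibling = nullptr;
        second->sibling = nullptr;
        return meld(meld(first, second), mergePairs(rest));
    }

public:
    bool empty() const { return !m_root; }

    // O(1): the new item is simply melded with the current root.
    void insert(Task* task) {
        task->child = nullptr;
        task->sibling = nullptr;
        m_root = meld(m_root, task);
    }

    // Remove and return the highest-scoring item.
    Task* pop() {
        assert(m_root);
        Task* const top = m_root;
        m_root = mergePairs(top->child);
        top->child = nullptr;
        return top;
    }
};

// Usage example: return task IDs in descending score order.
std::vector<int> drainIds(std::vector<Task>& tasks) {
    TaskHeap heap;
    for (Task& task : tasks) heap.insert(&task);
    std::vector<int> order;
    while (!heap.empty()) order.push_back(heap.pop()->id);
    return order;
}
```

Because insertion is constant time and ordering work is deferred to `pop()`, candidates that are never popped are never fully sorted, which is the benefit the commit message attributes to replacing a sorted map.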