Commit Graph

178 Commits

Author SHA1 Message Date
Geza Lore
f048cff093
Add inline for VL_ATTR_ALWINLINE, even on non GNU-C platforms (#4734)
This is required to avoid multiple definition errors at link time for
functions defined in headers and marked VL_ATTR_ALWINLINE.
2023-12-02 21:51:59 +00:00
Kamil Rakoczy
8bd6d7c5b1
Internals: Add V3ThreadSafety (#4477) 2023-09-13 07:57:48 -04:00
Anthony Donlon
cbdee5a804
Fix Windows filename format, etc. (#3873) (#4421)
* Ignore CLion project files and CMake outputs
* Supporting stripping file path that contains backslash
* Set /bigobj flag and increase stack size for windows platform
* Fix MSVC warnings
2023-08-16 07:34:57 -04:00
Mariusz Glebocki
85b7f828b3
Internals: Use std::packaged_task as a job wrapper; add and use V3ThreadPool::ScopedExclusiveAccess (#4310).
* Add VL_ASSERT_CAPABILITY; add assumeLocked and pretendUnlock to V3Mutex.

* Pass jobs as template-arguments and use std::packaged_task.

* Add and use V3ThreadPool::ScopedExclusiveAccess.
2023-06-23 18:25:12 -04:00
Mariusz Glebocki
e7714e0902
Internals: Add additional clang's thread safety analysis annotations (#4195)
* Simplify some Clang-specific attribute defines.

* Add `VL_RETURN_CAPABILITY` and `VL_PT_GUARDED_BY`.
2023-05-12 09:06:36 -04:00
Wilson Snyder
befb415f27 Fix mod/div/round argument side effects 2023-05-07 22:31:31 -04:00
Wilson Snyder
dc25be536c Internals: Deprecate VL_ATTR_ALIGNED, use alignas instead. 2023-05-02 21:21:10 -04:00
Kamil Rakoczy
65a484e00b
Internal: Update clang_check_annotations conditions (#4134) 2023-04-20 07:02:31 -04:00
Kamil Rakoczy
bbf53bd5af
Add VL_MT_SAFE attribute to several functions. (#3729) 2023-03-16 19:48:56 -04:00
Larry Doolittle
aff3f7c4f6
Fix build on hppa (#3954)
As supplied by John David Anglin in Debian bug #1030913
2023-02-11 04:23:26 -05:00
Wilson Snyder
317fe7a787 Fix VL_CPU_RELAX on MIPS/Armel/s390/sparc (#3891) 2023-01-19 17:22:28 -05:00
Andrew Nolte
e5eb7d8930
Add VL_VALUE_STRING_MAX_WORDS override (#3869)
Co-authored-by: Andrew Nolte <anolte@hudson-trading.com>
2023-01-13 15:23:15 -05:00
Wilson Snyder
b24d7c83d3 Copyright year update 2023-01-01 10:18:39 -05:00
Jevin Sweval
3340f7b0b4
Fix macOS weak symbols with -U linker flag (#3823) 2022-12-20 11:17:43 -05:00
Kamil Rakoczy
7a15457511
Tests: Add multithreading attribute checks (#3748) 2022-12-16 11:19:27 -05:00
Kritik Bhimani
9d2f1c607a
Fix MSVCC issues (#3813) 2022-12-14 07:07:25 -05:00
Kritik Bhimani
7b073fec7d
Fix MSVC++ portability issues (#3812) 2022-12-12 18:45:32 -05:00
Wilson Snyder
a0e7930036 docs: Fix spelling 2022-12-09 22:39:41 -05:00
Kamil Rakoczy
5aa935d170
Internals: Add annotations for check attributes (#3803) 2022-12-09 07:12:26 -05:00
Larry Doolittle
f27cf4c804
Commentary: Fix spelling in C++ comments (#3797) (#3798) 2022-12-02 18:46:38 -05:00
Wilson Snyder
7d807a7e0e Commentary 2022-11-29 07:33:12 -05:00
Larry Doolittle
6349e76abd
Remove $date from .vcd files (#3779) 2022-11-27 20:24:22 -05:00
Miodrag Milanović
f782496092
Fix for mingw cross-compile, arm and riscv (#3752) 2022-11-16 05:34:25 -08:00
Kamil Rakoczy
207bc2b18a Fix comment annotation
Signed-off-by: Kamil Rakoczy <krakoczy@antmicro.com>
2022-11-10 13:36:14 +00:00
Kamil Rakoczy
d6126c4b32
Remove --no-threads; require --threads 1 for single threaded (#3703). 2022-11-05 08:47:34 -04:00
Kamil Rakoczy
54e3f15dce
Internals: Add attribute when using clang to VL_MT_SAFE and VL_MT_UNSAFE (#3685) 2022-10-18 05:15:33 -04:00
Wilson Snyder
4367e03e46 Internals: Make VL_UNREACHABLE similar to std::unreachable() 2022-10-02 16:35:45 -04:00
Wilson Snyder
c9634695a7 Fix std::exchange for C++11 compilers 2022-10-02 16:25:11 -04:00
Mariusz Glebocki
fc3ce29845
Improve Verilation memory by reducing V3Number size (#3521) 2022-09-20 16:46:47 -04:00
Geza Lore
38a8d7fb2e Remove redundant 'inline' keywords from definitions
Also add checks to t/t_dist_cppstyle
2022-09-16 15:52:25 +01:00
Geza Lore
9ac64d0b92 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-20 21:18:50 +01:00
Wilson Snyder
1e2219347e Internals: Cleanup ifdef, move up not under compilver version ifdef 2022-08-11 17:41:43 -04:00
Geza Lore
96a4b3e5a5 Update clang-format config and apply
- Regroup and sort #include directives (like we used to, but automatic)
- Set AlwaysBreakTemplateDeclarations to true
2022-08-05 12:00:24 +01:00
Geza Lore
38e5b6c1ad Replace __gcov_flush with __gcov_dump
__gcov_flush was a private function and was removed from later GCC
versions (at least from 11.2.0, possibly earlier). Replace with the
documented public __gcov_dump.
2022-07-30 16:02:03 +01:00
Wilson Snyder
b9d7819faa Internals: Fix some cppcheck issues. Some dump functions fixed. 2022-07-30 10:01:39 -04:00
Geza Lore
0de1bbc85b Add and use VL_CONSTEXPR_CXX17 2022-07-05 14:21:28 +01:00
Geza Lore
b51f887567
Perform VCD tracing in parallel when using --threads (#3449)
VCD tracing is now parallelized using the same thread pool as the model.
We achieve this by breaking the top level trace functions into multiple
top level functions (as many as --threads), and after emitting the time
stamp to the VCD file on the main thread, we execute the tracing
functions in parallel on the same thread pool as the model (which we
pass to the trace file during registration), tracing into a secondary
per thread buffer. The main thread will then stitch (memcpy) the buffers
together into the output file.

This makes the `--trace-threads` option redundant with `--trace`, which
now only affects `--trace-fst`. FST tracing uses the previous offloading
scheme.

This obviously helps a lot in VCD tracing performance, and I have seen
better than Amdahl speedup, namely I get 3.9x on XiangShan 4T (2.7x on
OpenTitan 4T).
2022-05-29 19:08:39 +01:00
HungMingWu
880a9be3b1
Internal: Add C++20ish reverse_view for range loops. No functional change (#3388).
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-04-18 13:03:56 -04:00
Wilson Snyder
e02f97854c Deprecate 'vluint64_t' and similar types (#3255). 2022-03-27 15:27:40 -04:00
Wilson Snyder
3f7bf3d2dc Fix MSVC localtime_s (#3124). 2022-03-27 13:59:18 -04:00
Geza Lore
b1b5b5dfe2 Improve run-time profiling
The --prof-threads option has been split into two independent options:
1. --prof-exec, for collecting verilator_gantt and other execution
related profiling data, and
2. --prof-pgo, for collecting data needed for PGO

The implementation of execution profiling is extricated from
VlThreadPool and is now a separate class VlExecutionProfiler. This means
--prof-exec can now be used for single-threaded models (though it does
not measure a lot of things just yet). For consistency VerilatedProfiler
is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are
in verilated_profiler.{h/cpp}, but can be used completely independently.

Also re-worked the execution profile format so it now only emits events
without holding onto any temporaries. This is in preparation for some
future optimizations that would be hindered by the introduction of function
locals via AstText.

Also removed the Barrier event. Clearing the profile buffers is not
notably more expensive as the profiling records are trivially
destructible.
2022-03-27 15:57:30 +02:00
Xi Zhang
14d24213a8
Support LoongArch ISA multithreading (#3353) (#3354) 2022-03-17 09:04:47 -04:00
Wilson Snyder
321880f5a6 Add trace dumpvars() call for selective runtime tracing (#3322). 2022-03-05 15:44:32 -05:00
Wilson Snyder
50094ca296 Internals: Add cpplint control file and related cleanups 2022-01-09 16:49:38 -05:00
Wilson Snyder
4cd56b1fb9 Use C++11 standard types for MacOS portability (#3254) (#3257). 2022-01-01 16:04:20 -05:00
Wilson Snyder
ca42be982c Copyright year update. 2022-01-01 08:26:40 -05:00
Wilson Snyder
560b59f97f Use C++11 standard types for MacOS portability (#3254). 2021-12-21 13:18:05 -05:00
Wilson Snyder
293a5f402b Fix timescale portability on Arm64 (#3222). 2021-11-28 15:47:19 -05:00
Wilson Snyder
55da66164b Fix verilator_gantt time on Arm. 2021-10-04 22:13:34 -04:00
Wilson Snyder
959793cde3 Internals: Cleanup VL_VALUE_STRING_MAX widths (#3050). 2021-08-23 21:13:33 -04:00