verilator

mirror of https://github.com/verilator/verilator.git synced 2025-01-01 04:07:34 +00:00

Author	SHA1	Message	Date
Geza Lore	8afcd67a1f	Fix FST tracing of little endian vectors	2020-05-03 22:39:45 +01:00
John Demme	6e9008fb5a	Fix VerilatedVarProps::totalSize missing the first unpacked dim (#2296 )	2020-05-01 07:42:29 -04:00
Wilson Snyder	5ded80cf79	Fix MacOs Homebrew by removing default LIBS, #2298 .	2020-04-30 19:53:21 -04:00
Wilson Snyder	9fd4541069	Fix reduction OR on wide data, broke in v4.026, #2300 .	2020-04-30 17:53:54 -04:00
Peter Horvath	dc64b43152	Fix xcode clang bug workaround (#2295 )	2020-04-30 07:20:31 -04:00
Geza Lore	209a585a68	Remove VL_NEGATE_{I,Q,E}, use C native unary '-' instead This is to avoid slowing down -O0 models unnecessarily.	2020-04-30 01:05:52 +01:00
Geza Lore	aa9cde22c8	Use SIMD intrinsics to render VCD traces (#2289 ) Use SIMD intrinsics to render VCD traces. I have measured 10-40% single threaded performance increase with VCD tracing on SweRV EH1 and lowRISC Ibex using SSE2 intrinsics to render the trace. Also helps a tiny bit with FST, but now almost all of the FST overhead is in the FST library. I have reworked the tracing routines to use more precisely sized arguments. The nice thing about this is that the performance without the intrinsics is pretty much the same as it was before, as we do at most 2x as much work as necessary, but in exchange there are no data dependent branches at all.	2020-04-30 00:09:09 +01:00
Geza Lore	dd967f7769	Improve trace buffer memory utilization and performance. Convert trace buffer to 32-bit entries, rather than a union containing a pointer type. Also tweaked trace entry layouts for a bit more performance. This gains another 10% on SweRV EH1 CoreMark.	2020-04-27 19:00:17 +01:00
Geza Lore	b79ef672e1	Various minor optimizations of VCD trace routines - Change templated trace routines to branch table. Removed templating from trace chgBus and fullBus and replaced them with a branch table like the other routines. There is a very small (< 1%) penalty for this on SweRV EH1 CoreMark, but this is less than the variability of disk IO, so it's worth it to keep the code simpler and smaller. - Prefetch VCD suffix buffer at the top of emit* - Increase ILP in VCD emit* routines - Use a 64-bit unaligned store to emit the VCD suffix (on x86 only) The performance difference with these is very small, but the changes hopefully make this code more performance-portable across various micro-architectures.	2020-04-27 18:44:53 +01:00
Geza Lore	9991b19610	Another attempt at flushing threaded VCD correctly.	2020-04-25 18:40:09 +01:00
Geza Lore	c1665818b9	Fix missing flush with threaded VCD tracing. (#2282 ) VerilatedVcdC::openNext() failed to flush the tracing thread before opening the next output file, which caused t_trace_cat.pl to fail with --vltmt on occasion.	2020-04-24 03:09:26 +01:00
Wilson Snyder	df52e481fb	Collected minor output code cleanups.	2020-04-23 21:22:47 -04:00
Geza Lore	c96a43b452	Fix unused variable in VL_READMEM_N (#2274 )	2020-04-22 17:25:35 -04:00
Geza Lore	c52f3349d1	Initial implementation of generic multithreaded tracing (#2269 ) The --trace-threads option can now be used to perform tracing on a thread separate from the main thread when using VCD tracing (with --trace-threads 1). For FST tracing --trace-threads can be 1 or 2, and --trace-fst --trace-threads 1 is the same as what --trace-fst-threads used to be (which is now deprecated). Performance numbers on SweRV EH1 CoreMark, clang 6.0.0, Intel i7-3770 @ 3.40GHz, IO to ramdisk, with numactl set to schedule threads on different physical cores. Relative speedup: --trace -> --trace --trace-threads 1 +22% --trace-fst -> --trace-fst --trace-threads 1 +38% (as --trace-fst-thread) --trace-fst -> --trace-fst --trace-threads 2 +93% Speed relative to --trace with no threaded tracing: --trace 1.00 x --trace --trace-threads 1 0.82 x --trace-fst 1.79 x --trace-fst --trace-threads 1 1.23 x --trace-fst --trace-threads 2 0.87 x This means FST tracing with 2 extra threads is now faster than single threaded VCD tracing, and is on par with threaded VCD tracing. You do pay for it in total compute though, as --trace-fst --trace-threads 2 uses about 240% CPU vs 150% for --trace-fst --trace-threads 1, and 155% for --trace --trace-threads 1. Still, for interactive use it should be helpful with large designs.	2020-04-21 23:49:07 +01:00
Wilson Snyder	174fd1bf0e	Codacy cleanups. No functional change.	2020-04-20 22:01:47 -04:00
Geza Lore	39d903375b	Factor out trace implementation common to all formats. (#2268 ) This patch de-duplicates common functionality between the VCD and FST trace implementation. It also enables adding new trace formats more easily and consistently. No functional nor performance change intended.	2020-04-19 23:57:36 +01:00
Geza Lore	6a54922044	Set FST timescale correctly. (#2266 ) The FST trace timescale used to be set in the constructor via set_time_unit, but at that point we haven't normally opened the file yet so it was just dropped. On top of that, we actually want to use set_time_resolution... FST trace timescales now match the VCD.	2020-04-19 08:47:22 -04:00
Geza Lore	74e16d85c5	Fix FST trace initial time stamp. (#2264 ) If the first dump was not at time zero, then the FST trace used to contain the initial values as if they were set at time zero. Now they only appear at the time the first dump call is actually made, and hence match the VCD trace exactly.	2020-04-18 18:54:02 -04:00
Wilson Snyder	e6f345e45d	Internal: clang-tidy fixes. No functional change.	2020-04-15 21:47:37 -04:00
Wilson Snyder	d4f7f5297a	Support IEEE time units and time precisions, #234 . (#2253 ) Includes `timescale, $printtimescale, $timeformat. VL_TIME_MULTIPLIER, VL_TIME_PRECISION, VL_TIME_UNIT have been removed and the time precision must now match the SystemC time precision. To get closer behavior to older versions, use e.g. --timescale-override "1ps/1ps".	2020-04-15 19:39:03 -04:00
Wilson Snyder	5c966ec510	clang-format many files. No functional change. Use nodist/clang_formatter to reformat files that are now clean.	2020-04-13 22:52:23 -04:00
Geza Lore	dc5c259069	Improve tracing performance. (#2257 ) Various tactics used to improve performance of both VCD and FST tracing: - Both: Change tracing functions to templates to take variable widths as template parameters. For VCD, subsequently specialize these to the values used by Verilator. This avoids redundant instructions and hard to predict branches. - Both: Check for value changes via direct pointer access into the previous signal value buffer. This eliminates a lot of simple pointer arithmetic instructions from the tracing code. - Both: Verilator provides clean input, no need to mask out unused bits. - VCD: pre-compute identifier codes and use memory copy instead of re-computing them every time a code is emitted. This saves a lot of instructions and hard to predict branches. The added D-cache misses are cheaper than the removed branches/instructions. - VCD: re-write the routines emitting the changes to be more efficient. - FST: Use previous signal value buffer the same way as the VCD tracing code, and only call the FST API when a change is detected. Performance as measured on SweRV EH1, with the pre-canned CoreMark benchmark running from DCCM/ICCM, clang 6.0.0, Intel i7-3770 @ 3.40GHz, and IO to ramdisk, for VCD (--trace) / FST (--trace-fst) / FST separate thread (--trace-fst-thread): Before: 30.2 s / 121.1 s / 69.8 s. After: 24.7 s / 45.7 s / 32.4 s. Speedup: 22 % / 256 % / 215 %. Rel. to VCD: 1 x / 1.85 x / 1.31 x. In addition, FST trace size for the above reduced by 48%.	2020-04-14 00:13:10 +01:00
Wilson Snyder	dc27a179e2	Always define VL_SIG etc; the conditional definitions were for the long-removed SystemPerl.	2020-04-13 19:07:56 -04:00
Wilson Snyder	ef211fc9e0	Make sure SystemC always included in -sc mode to prevent ordering issues.	2020-04-11 10:33:40 -04:00
Nathan Myers	4c1ae4701a	Add assertion for monotonic dump times #2103 (#2237 )	2020-04-09 19:00:27 -04:00
Geza Lore	05f213c266	VCD tracing speed improvements (#2246 ) * Don't inline VCD dump functions Improves model speed with tracing. Measured on SweRV cmark: - GCC 5.5 ~3% faster - Clang 6.0 ~12% faster (!) * Remove redundant test from VCD bit tracing. Improves model speed with tracing. Measured on SweRV cmark: - GCC 5.5 ~7.5% faster - Clang 6.0 ~1.5% faster	2020-04-09 08:19:26 -04:00
Geza Lore	0f617988d4	Compile fast tracing code with OPT_FAST in single compile mode. (#2245 ) When using the __ALL*.cpp based single compile mode (i.e.: without VM_PARALLEL_BUILDS), the fast path tracing code used to be included in __ALLsup.cpp, which was compiled with OPT_SLOW, severely harming tracing performance. We now have __ALLfast.cpp and __ALLslow.cpp instead of __ALLcls.cpp and __ALLsup.cpp, so we can compile the fast support code with OPT_FAST as well.	2020-04-08 21:05:43 -04:00
Geza Lore	991d8b178b	Fix FST tracing performance by removing std::map from hot path. (#2244 ) This patch eliminates a major piece of inefficiency in FST tracing support, by using an array to look up fstHandle values corresponding to trace codes, instead of a tree based std::map. With this change, FST tracing is now only about 3x slower than VCD tracing. We do require more memory to store the symbol lookup table, but its size is still small relative to the speed benefit.	2020-04-08 17:54:35 -04:00
Wilson Snyder	914a6edd33	Add error when using SystemC 2.2 or earlier (pre-2011), as it is deprecated.	2020-04-07 19:58:17 -04:00
Geza Lore	0cfa828572	Fix DPI import/export to be standard compliant, #2236 .	2020-04-07 19:07:47 -04:00
Wilson Snyder	cba05480ba	Fix clang warning.	2020-04-06 20:13:24 -04:00
Wilson Snyder	2abbae8dd0	Internals: Remove strncpy to appease Codacy.	2020-04-06 19:26:31 -04:00
Wilson Snyder	50535a1894	Internals: cppcheck 1.90 fixes. No functional change intended.	2020-04-05 18:57:47 -04:00
Wilson Snyder	763f621d4c	Deprecate VL_ULL.	2020-04-05 16:45:53 -04:00
Wilson Snyder	efaf375887	Configuring with ccache present now defaults to using it; see OBJCACHE in the manual.	2020-04-05 16:10:33 -04:00
Wilson Snyder	a494ad5ec7	Support $ferror, #1638 .	2020-04-05 11:22:05 -04:00
Wilson Snyder	e55338f927	Support $fflush without arguments, #1638 .	2020-04-05 10:11:28 -04:00
Wilson Snyder	6eadb8e771	Add simplistic class support with many restrictions, see manual, #377 .	2020-04-05 09:30:23 -04:00
Wilson Snyder	9fdb026e95	Add VM_C11 for future need of C++11	2020-04-04 20:48:03 -04:00
Wilson Snyder	5302a9d0e6	Internals: clang-format cleanups. No functional change.	2020-04-04 17:55:37 -04:00
Wilson Snyder	e07e9390f6	Internals: clang-format cleanups. No functional change.	2020-04-04 14:09:21 -04:00
Wilson Snyder	a13eab55f5	Internals: Add missing VL_DO_CLEARs. No functional change.	2020-04-04 13:06:31 -04:00
Wilson Snyder	38a31ae168	Cleanup misc clang-tidy warnings. No functional change intended	2020-04-03 22:31:54 -04:00
Wilson Snyder	6f4a8fe695	Fix Travis-CI failures.	2020-04-02 09:22:10 -04:00
Wilson Snyder	4361c4b57a	Add vlsint8_t types.	2020-03-31 21:30:18 -04:00
Wilson Snyder	e6beab4037	Fix implicit conversion of floats to wide integers.	2020-03-31 20:42:07 -04:00
Sean Cross	a1a2650f1e	Modernize va args (#2214 ) Verilator uses a form of variadic macros that are nonstandard, making it unable to be compiled under MSVC. Replace the old syntax with the standard syntax. This fixes MSVC usage. Signed-off-by: Sean Cross <sean@xobs.io>	2020-03-29 10:29:12 -04:00
Matthew Ballance	510be53521	Expose VPI cbNextDeadline via the public API (#2212 ) Signed-off-by: Matthew Ballance <matt.ballance@gmail.com>	2020-03-28 13:47:21 -04:00
Wilson Snyder	08a51e3e09	Fix VCD open with empty filename, #2198 .	2020-03-24 17:32:47 -04:00
Wilson Snyder	75ebe7a4be	Update gtkwave from upstream.	2020-03-21 21:45:57 -04:00
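
Several of the tracing commits above (dc5c259069, 991d8b178b) describe the same pattern: keep the previous value of every signal in a flat buffer indexed by trace code, emit only when the value changed, and pre-compute the VCD identifier text instead of rebuilding it on every emit. The following is a minimal sketch of that idea, not Verilator's actual code. The class `TraceSketch`, its members, and the identifier encoding in `makeIdentifier` are hypothetical names chosen for illustration; the only external fact assumed is that VCD identifier codes use the printable ASCII characters '!' through '~'.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical illustration only; not Verilator's real trace classes.
class TraceSketch {
public:
    explicit TraceSketch(std::size_t numCodes)
        : m_prev(numCodes, 0), m_suffix(numCodes) {
        // Pre-compute "<identifier>\n" once per trace code, so that emitting
        // a change is a plain copy rather than a re-encoding.
        for (std::size_t code = 0; code < numCodes; ++code) {
            m_suffix[code] = makeIdentifier(code) + "\n";
        }
    }

    // Emit a 1-bit signal only if it differs from the stored previous value.
    // Assumes clean input: newval is exactly 0 or 1, so no masking is done.
    void chgBit(std::uint32_t code, std::uint32_t newval) {
        assert(code < m_prev.size());
        std::uint32_t& old = m_prev[code];  // array lookup, no std::map
        if (old == newval) return;          // unchanged: nothing to emit
        old = newval;
        m_out += static_cast<char>('0' + newval);
        m_out += m_suffix[code];
    }

    const std::string& output() const { return m_out; }

    // Illustrative base-94 encoding over the printable characters '!'..'~'.
    static std::string makeIdentifier(std::size_t code) {
        std::string id;
        do {
            id += static_cast<char>('!' + code % 94);
            code /= 94;
        } while (code != 0);
        return id;
    }

private:
    std::vector<std::uint32_t> m_prev;   // previous value per trace code
    std::vector<std::string> m_suffix;   // pre-computed identifier + newline
    std::string m_out;                   // accumulated VCD-like text
};
```

In this sketch every signal starts with a previous value of 0, so an initial `chgBit(code, 0)` emits nothing; a real tracer would write a full dump of all signals first. The trade-off matches the one the commit messages mention: the lookup tables cost some memory, in exchange for fewer instructions and branches on the hot path.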


508 Commits