forked from github/verilator
aa9cde22c8
Use SIMD intrinsics to render VCD traces. I measured a 10-40% single-threaded performance increase with VCD tracing on SweRV EH1 and lowRISC Ibex when using SSE2 intrinsics to render the trace. This also helps FST slightly, but almost all of the remaining FST overhead is now in the FST library. I have reworked the tracing routines to use more precisely sized arguments. The nice thing about this is that performance without the intrinsics is pretty much the same as before: we do at most 2x as much work as necessary, but in exchange there are no data-dependent branches at all.
42 lines
1.5 KiB
C++
// -*- mode: C++; c-file-style: "cc-mode" -*-
//*************************************************************************
//
// Copyright 2003-2020 by Wilson Snyder. This program is free software; you can
// redistribute it and/or modify it under the terms of either the GNU
// Lesser General Public License Version 3 or the Perl Artistic License
// Version 2.0.
// SPDX-License-Identifier: LGPL-3.0-only OR Artistic-2.0
//
//*************************************************************************
///
/// \file
/// \brief Verilator: Common include for target-specific intrinsics.
///
/// Code using machine-specific intrinsics for optimization should
/// include this header rather than directly including the target-
/// specific headers. We provide macros to check for availability
/// of instruction sets, and a common mechanism to disable them.
///
//*************************************************************************

#ifndef _VERILATED_INTRINSICS_H_
#define _VERILATED_INTRINSICS_H_ 1 ///< Header Guard

// clang-format off

// Use VL_DISABLE_INTRINSICS to disable all intrinsics-based optimization
#if !defined(VL_DISABLE_INTRINSICS) && !defined(VL_PORTABLE_ONLY)
# if defined(__SSE2__) && !defined(VL_DISABLE_SSE2)
# define VL_HAVE_SSE2 1
# include <emmintrin.h>
# endif
# if defined(__AVX2__) && defined(VL_HAVE_SSE2) && !defined(VL_DISABLE_AVX2)
# define VL_HAVE_AVX2 1
# include <immintrin.h>
# endif
#endif

// clang-format on

#endif // Guard