verilator/include/verilated_intrinsics.h
Geza Lore aa9cde22c8
Use SIMD intrinsics to render VCD traces (#2289)

I have measured a 10-40% single-threaded performance increase with VCD
tracing on SweRV EH1 and lowRISC Ibex using SSE2 intrinsics to render
the trace. Also helps a tiny bit with FST, but now almost all of the FST
overhead is in the FST library.

I have reworked the tracing routines to use more precisely sized
arguments. The nice thing about this is that the performance without the
intrinsics is pretty much the same as it was before, as we do at most 2x
as much work as necessary, but in exchange there are no data dependent
branches at all.
2020-04-30 00:09:09 +01:00

// -*- mode: C++; c-file-style: "cc-mode" -*-
//*************************************************************************
//
// Copyright 2003-2020 by Wilson Snyder. This program is free software; you can
// redistribute it and/or modify it under the terms of either the GNU
// Lesser General Public License Version 3 or the Perl Artistic License
// Version 2.0.
// SPDX-License-Identifier: LGPL-3.0-only OR Artistic-2.0
//
//*************************************************************************
///
/// \file
/// \brief Verilator: Common include for target specific intrinsics.
///
/// Code using machine specific intrinsics for optimization should
/// include this header rather than directly including the target
/// specific headers. We provide macros to check for availability
/// of instruction sets, and a common mechanism to disable them.
///
//*************************************************************************
#ifndef _VERILATED_INTRINSICS_H_
#define _VERILATED_INTRINSICS_H_ 1 ///< Header Guard
// clang-format off
// Define VL_DISABLE_INTRINSICS (or VL_PORTABLE_ONLY) to disable all intrinsics-based optimizations
#if !defined(VL_DISABLE_INTRINSICS) && !defined(VL_PORTABLE_ONLY)
# if defined(__SSE2__) && !defined(VL_DISABLE_SSE2)
#  define VL_HAVE_SSE2 1
#  include <emmintrin.h>
# endif
# if defined(__AVX2__) && defined(VL_HAVE_SSE2) && !defined(VL_DISABLE_AVX2)
#  define VL_HAVE_AVX2 1
#  include <immintrin.h>
# endif
#endif
// clang-format on
#endif // Guard
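
A minimal sketch of how client code might consume a feature macro like `VL_HAVE_SSE2`: take an SSE2 path when the macro is defined and fall back to a portable, branch-free scalar loop otherwise. This is not Verilator's actual trace code. The `DEMO_` macros and `renderBits8` are hypothetical names, and the detection logic is copied inline only so the sketch compiles without Verilator; real code would include `verilated_intrinsics.h` and test `VL_HAVE_SSE2` instead.

```cpp
#include <cstdint>
#include <string>

// Stand-in for including verilated_intrinsics.h. The DEMO_ prefix is
// hypothetical and mirrors the VL_HAVE_SSE2 / VL_DISABLE_INTRINSICS pattern.
#if !defined(DEMO_DISABLE_INTRINSICS) && defined(__SSE2__)
# define DEMO_HAVE_SSE2 1
# include <emmintrin.h>
#endif

// Render the 8 bits of 'value' as ASCII '0'/'1' characters, MSB first.
// Hypothetical helper for illustration only.
static std::string renderBits8(uint8_t value) {
    char buf[8];
#ifdef DEMO_HAVE_SSE2
    // Broadcast the byte to every lane, then isolate one bit per lane.
    // _mm_set_epi8 takes arguments from lane 15 down to lane 0, so lane 0
    // tests the MSB (0x80) and lane 7 tests the LSB (0x01).
    const __m128i bcast = _mm_set1_epi8(static_cast<char>(value));
    const __m128i masks = _mm_set_epi8(0, 0, 0, 0, 0, 0, 0, 0,
                                       0x01, 0x02, 0x04, 0x08,
                                       0x10, 0x20, 0x40, static_cast<char>(0x80));
    // Lanes whose bit is set compare equal to the mask and become 0xFF (-1).
    const __m128i hit = _mm_cmpeq_epi8(_mm_and_si128(bcast, masks), masks);
    // '0' - (-1) == '1' and '0' - 0 == '0', so no per-bit branch is needed.
    const __m128i ascii = _mm_sub_epi8(_mm_set1_epi8('0'), hit);
    // Store only the low 8 bytes; the upper lanes are unused.
    _mm_storel_epi64(reinterpret_cast<__m128i*>(buf), ascii);
#else
    // Portable fallback: fixed trip count, no data-dependent branches.
    for (int i = 0; i < 8; ++i) {
        buf[i] = static_cast<char>('0' + ((value >> (7 - i)) & 1));
    }
#endif
    return std::string(buf, 8);
}
```

Both paths produce the same string, so building with `-DDEMO_DISABLE_INTRINSICS` (the analogue of `VL_DISABLE_INTRINSICS` here) is a quick way to check the fallback against the SSE2 path.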