The build is now by default configured to link performance critical
libraries (libgcc, libstdc++, libtcmalloc) statically. This improves
Verilation speed by between 4.5-7% based on my measurements as it
eliminates approx 20% of the mispredicted branches from the execution.
With partial static linking, the size of the .text section in
verilator_bin is increased by about 14%, and the binary is itself only
about 800KB bigger on disk, so hopefully this is not a big issue in
exchange for the faster compilation speed. A configure option
"--disable-partial-static" is provided to restore the old behaviour of
linking everything dynamically.
Note: This patch also changes to use libtcmalloc_minimal, which is all
we really need and itself has fewer dependencies.
Replace the virtual type() method on AstNode with a non-virtual, inlined
accessor to a const member variable m_type. This means that in order to be
able to use this for type testing, it needs to be initialized based on the
final type of the node. This is achieved by passing the relevant AstType
value back through the constructor call chain. Most of the boilerplate
involved is auto generated by first feeding V3AstNodes.h through astgen to
get V3AstNodes__gen.h, which is then included in V3Ast.h. No client code
needs to be aware and there is no functional change intended.
Eliminating the virtual function call to fetch the node type identifier
results in measured compilation speed improvement of 5-10% as it
eliminates up to 20% of all mispredicted branches from the execution.
As Verilator continuously allocates and releases small objects (e.g.:
AstNode, V3GraphVertex, V3GraphEdge), it spends a significant amount of
time in malloc/free and friends. This patch adds the --enable-tcmalloc
configure option to link Verilator against the high performance malloc
implementation library libtcmalloc. The default is to use libtcmalloc if
available on the system. Note that there are no source code change, we
are simply replacing the standard library memory allocation functions.
Measured major compilation speed improvement of 27% when running
Verilator with -O3 on a large design.