docs: Fix grammar

Wilson Snyder 2022-12-09 23:16:14 -05:00
parent a0e7930036
commit a9ff0a0f32
7 changed files with 134 additions and 136 deletions

@ -32,7 +32,7 @@ Welcome to Verilator
* Single- and multithreaded output models
* - **Widely Used**
* Wide industry and academic deployment
* Out-of-the-box support from Arm, and RISC-V vendor IP
* Out-of-the-box support from Arm and RISC-V vendor IP
- |verilator usage|
* - |verilator community|
- **Community Driven & Openly Licensed**
@ -62,7 +62,7 @@ performs the design simulation. Verilator also supports linking Verilated
generated libraries, optionally encrypted, into other simulators.
Verilator may not be the best choice if you are expecting a full-featured
replacement for a closed-source Verilog simulator, need SDF annotation,
replacement for a closed-source Verilog simulator, need SDF annotation,
mixed-signal simulation, or are doing a quick class project (we recommend
`Icarus Verilog`_ for classwork.) However, if you are looking for a path
to migrate SystemVerilog to C++/SystemC, or want high-speed simulation of
@ -101,7 +101,7 @@ For more information:
- `Verilator manual (HTML) <https://verilator.org/verilator_doc.html>`_,
or `Verilator manual (PDF) <https://verilator.org/verilator_doc.pdf>`_
- `Subscribe to verilator announcements
- `Subscribe to Verilator announcements
<https://github.com/verilator/verilator-announce>`_
- `Verilator forum <https://verilator.org/forum>`_

@ -10,7 +10,7 @@ Revision History
"Revision History" in the sidebar.
Changes are contained in the :file:`Changes` file of the distribution, and
also summarized below. To subscribe to new versions see `Verilator
also summarized below. To subscribe to new versions, see `Verilator
Announcements <https://github.com/verilator/verilator-announce>`_.
.. include:: ../_build/gen/Changes

@ -137,11 +137,11 @@ Historical Origins
Verilator was conceived in 1994 by Paul Wasson at the Core Logic Group at
Digital Equipment Corporation. The Verilog code that was converted to C
was then merged with a C based CPU model of the Alpha processor and
simulated in a C based environment called CCLI.
was then merged with a C-based CPU model of the Alpha processor and
simulated in a C-based environment called CCLI.
In 1995 Verilator started being used also for Multimedia and Network
Processor development inside Digital. Duane Galbi took over active
In 1995, Verilator also started being used for Multimedia and Network
Processor development inside Digital. Duane Galbi took over the active
development of Verilator, and added several performance enhancements. CCLI
was still being used as the shell.
@ -149,7 +149,7 @@ In 1998, through the efforts of existing DECies, mainly Duane Galbi,
Digital graciously agreed to release the source code. (Subject to the code
not being resold, which is compatible with the GNU General Public License.)
In 2001, Wilson Snyder took the kit, and added a SystemC mode, and called
In 2001, Wilson Snyder took the kit, added a SystemC mode, and called
it Verilator2. This was the first packaged public release.
In 2002, Wilson Snyder created Verilator 3.000 by rewriting Verilator from
@ -168,5 +168,5 @@ fork/join, delay handling, DFG performance optimizations, and other
improvements.
Currently, various language features and performance enhancements are added
as the need arises, with a focus towards getting to full Universal
as the need arises, with a focus on getting to complete Universal
Verification Methodology (UVM, IEEE 1800.2-2017) support.

@ -13,7 +13,7 @@ can redistribute it and/or modify the Verilator internals under the terms
of either the GNU Lesser General Public License Version 3 or the Perl
Artistic License Version 2.0.
All Verilog and C++/SystemC code quoted within this documentation file are
released as Creative Commons Public Domain (CC0). Many example files and
All Verilog and C++/SystemC code quoted within this documentation file is
released as Creative Commons Public Domain (CC0). Many example files and
test files are likewise released under CC0 into effectively the Public
Domain as described in the files themselves.

@ -1,7 +1,7 @@
.. Copyright 2003-2022 by Wilson Snyder.
.. SPDX-License-Identifier: LGPL-3.0-only OR Artistic-2.0
First you need Verilator installed, see :ref:`Installation`. In brief, if
First, you need Verilator installed; see :ref:`Installation`. In brief, if
you installed Verilator using the package manager of your operating system,
or did a :command:`make install` to place Verilator into your default path,
you do not need anything special in your environment, and should not have

@ -8,8 +8,8 @@ Overview
Welcome to Verilator!
The Verilator package converts Verilog [#]_ and SystemVerilog [#]_ hardware
description language (HDL) designs into a C++ or SystemC model that after
compiling can be executed. Verilator is not a traditional simulator, but a
description language (HDL) designs into a C++ or SystemC model that, after
compiling, can be executed. Verilator is not a traditional simulator, but a
compiler.
Verilator is typically used as follows:
@ -18,13 +18,13 @@ Verilator is typically used as follows:
to GCC, or other simulators such as Cadence Verilog-XL/NC-Verilog, or
Synopsys VCS. Verilator reads the specified SystemVerilog code, lints it,
optionally adds coverage and waveform tracing support, and compiles the
design into a source level multithreaded C++ or SystemC "model". The
design into a source-level multithreaded C++ or SystemC "model". The
resulting model's C++ or SystemC code is output as .cpp and .h files. This
is referred to as "Verilating" and the process is "to Verilate"; the output
is a "Verilated" model.
is referred to as "Verilating", and the process is "to Verilate"; the
output is a "Verilated" model.
2. For simulation, a small user written C++ wrapper file is required, the
"wrapper". This wrapper defines the C++ standard function "main()" which
2. For simulation, a small user-written C++ wrapper file is required, the
"wrapper". This wrapper defines the C++ standard function "main()", which
instantiates the Verilated model as a C++/SystemC object.
3. The user C++ wrapper, the files created by Verilator, a "runtime
@ -44,12 +44,12 @@ The best place to get started is to try the :ref:`Examples`.
.. [#] Verilog is defined by the `Institute of Electrical and Electronics
Engineers (IEEE) Standard for Verilog Hardware Description
Language`, Std. 1364, released in 1995, 2001, and 2005. The
Verilator documentation uses the shorthand e.g. "IEEE 1394-2005" to
refer to the e.g. 2005 version of this standard.
Verilator documentation uses a shorthand such as "IEEE 1364-2005"
to refer to, e.g., the 2005 version of this standard.
.. [#] SystemVerilog is defined by the `Institute of Electrical and
Electronics Engineers (IEEE) Standard for SystemVerilog - Unified
Hardware Design, Specification, and Verification Language`, Standard
1800, released in 2005, 2009, 2012, and 2017. The Verilator
documentation uses the shorthand e.g. "IEEE 1800-2017" to refer to
documentation uses a shorthand such as "IEEE 1800-2017" to refer to,
e.g., the 2017 version of this standard.

@ -39,12 +39,12 @@ The main flow of Verilator can be followed by reading the Verilator.cpp
3. Cells in the AST are first linked, which will read and parse additional
files as above.
4. Functions, variable and other references are linked to their
4. Function, variable, and other references are linked to their
definitions.
5. Parameters are resolved, and the design is elaborated.
6. Verilator then performs many additional edits and optimizations on
6. Verilator then performs additional edits and optimizations on
the hierarchical design. This includes coverage, assertions, X
elimination, inlining, constant propagation, and dead code
elimination.
@ -56,15 +56,15 @@ The main flow of Verilator can be followed by reading the Verilator.cpp
a single scope and single VarScope for each variable. A module that
occurs twice will have a scope for each occurrence, and two
VarScopes for each variable. This allows optimizations to proceed
across the flattened design, while still preserving the hierarchy.
across the flattened design while still preserving the hierarchy.
8. Additional edits and optimizations proceed on the pseudo-flat
design. These include module references, function inlining, loop
unrolling, variable lifetime analysis, lookup table creation, always
splitting, and logic gate simplifications (pushing inverters, etc).
splitting, and logic gate simplifications (pushing inverters, etc.).
9. Verilator orders the code. Best case, this results in a single
"eval" function which has all always statements flowing from top to
"eval" function, which has all always statements flowing from top to
bottom with no loops.
10. Verilator mostly removes the flattening, so that code may be shared
@ -95,14 +95,14 @@ this.
Each ``AstNode`` has pointers to up to four children, accessed by the
``op1p`` through ``op4p`` methods. These methods are then abstracted in a
specific Ast\* node class to a more specific name. For example with the
specific Ast\* node class to a more specific name. For example, with the
``AstIf`` node (for ``if`` statements), ``thensp`` calls ``op2p`` to give the
pointer to the AST for the "then" block, while ``elsesp`` calls ``op3p`` to
give the pointer to the AST for the "else" block, or NULL if there is not
one. These accessors are automatically generated by ``astgen`` after
parsing the ``@astgen`` directives in the specific ``AstNode`` subclasses.
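The accessor layering described above can be pictured with a minimal sketch.
The ``Node`` and ``IfNode`` classes below are hypothetical, heavily simplified
stand-ins and not the real ``AstNode`` or ``AstIf``; the mapping of ``condp``
to ``op1p`` and the ``setOp`` helper are assumptions made for illustration,
and the real accessors are generated by ``astgen`` rather than written by
hand.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for AstNode: four generic child slots
class Node {
    std::array<Node*, 4> m_ops{};  // Child pointers behind op1p..op4p
public:
    virtual ~Node() = default;
    Node* op1p() const { return m_ops[0]; }
    Node* op2p() const { return m_ops[1]; }
    Node* op3p() const { return m_ops[2]; }
    Node* op4p() const { return m_ops[3]; }

protected:
    // Hypothetical helper: store a child in slot n (1..4)
    void setOp(int n, Node* nodep) {
        assert(n >= 1 && n <= 4);
        m_ops[n - 1] = nodep;
    }
};

// Hypothetical stand-in for AstIf: named accessors forward to generic slots
class IfNode final : public Node {
public:
    IfNode(Node* condp, Node* thensp, Node* elsesp = nullptr) {
        setOp(1, condp);  // Assumption: the condition lives in slot 1
        setOp(2, thensp);
        setOp(3, elsesp);
    }
    Node* condp() const { return op1p(); }
    Node* thensp() const { return op2p(); }  // AST of the "then" block
    Node* elsesp() const { return op3p(); }  // AST of the "else" block, or nullptr
};
```

Callers then use ``thensp()`` and ``elsesp()`` and never need to know which
numbered slot holds which child.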
``AstNode`` has the concept of a next and previous AST - for example the
``AstNode`` has the concept of a next and previous AST - for example, the
next and previous statements in a block. Pointers to the AST for these
statements (if they exist) can be obtained using the ``back`` and ``next``
methods.
@ -136,7 +136,7 @@ the pass.
A number of passes use graph algorithms, and the class ``V3Graph`` is
provided to represent those graphs. Graphs are directed, and algorithms are
provided to manipulate the graphs and to output them in `GraphViz
provided to manipulate the graphs and output them in `GraphViz
<https://www.graphviz.org>`__ dot format. ``V3Graph.h`` provides
documentation of this class.
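For readers unfamiliar with the dot format, the following minimal sketch
shows what such output looks like for a small directed graph. It does not use
``V3Graph``; the ``toDot`` function and its parameters are hypothetical, and
vertex names are assumed to be plain identifiers that need no quoting.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using EdgeList = std::vector<std::pair<std::size_t, std::size_t>>;

// Emit a directed graph in GraphViz dot syntax.
// Each edge is a (from, to) pair of indices into names.
inline std::string toDot(const std::vector<std::string>& names, const EdgeList& edges) {
    std::ostringstream os;
    os << "digraph g {\n";
    for (const std::string& name : names) os << "  " << name << ";\n";
    for (const auto& edge : edges) {
        os << "  " << names.at(edge.first) << " -> " << names.at(edge.second) << ";\n";
    }
    os << "}\n";
    return os.str();
}
```

The resulting text can be rendered with the GraphViz ``dot`` tool.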
@ -150,7 +150,7 @@ algorithms for ordering the graph. A generic ``user``/``userp`` member
variable is also provided.
Virtual methods are provided to specify the name, color, shape, and style
to be used in dot output. Typically, users provide derived classes from
to be used in dot output. Typically users provide derived classes from
``V3GraphVertex`` which will reimplement these methods.
Iterators are provided to access in and out edges. Typically these are used
@ -173,9 +173,9 @@ vertices. Edges have an associated ``weight`` and may also be made
Accessors ``fromp`` and ``top`` return the "from" and "to" vertices,
respectively.
Virtual methods are provided to specify the label, color and style to be
Virtual methods are provided to specify the label, color, and style to be
used in dot output. Typically users provide derived classes from
``V3GraphEdge`` which will reimplement these methods.
``V3GraphEdge``, which will reimplement these methods.
``V3GraphAlg``
@ -183,7 +183,7 @@ used in dot output. Typically users provided derived classes from
This is the base class for graph algorithms. It implements a ``bool``
method, ``followEdge``, which algorithms can use to decide whether an edge
is followed. This method returns true if the graph edge has weight greater
is followed. This method returns true if the graph edge has a weight greater
than one and a user function, ``edgeFuncp`` (supplied in the constructor)
returns ``true``.
@ -194,11 +194,11 @@ provided and documented in ``V3GraphAlg.cpp``.
``DfgGraph``
^^^^^^^^^^^^^
The data-flow graph based combinational logic optimizer (DFG optimizer)
The data-flow graph-based combinational logic optimizer (DFG optimizer)
converts an ``AstModule`` into a ``DfgGraph``. The graph represents the
combinational equations (~continuous assignments) in the module, and for the
duration of the DFG passes, it takes over the role of the represented
``AstModule``. The ``DfgGraph`` keeps holds of the represented ``AstModule``,
``AstModule``. The ``DfgGraph`` keeps hold of the represented ``AstModule``,
and the ``AstModule`` retains all other logic that is not representable as a
data-flow graph. At the end of optimization, the combinational logic
represented by the ``DfgGraph`` is converted back into AST form and is
@ -212,7 +212,7 @@ writing DFG passes easier.
The ``DfgGraph`` represents combinational logic equations as a graph of
``DfgVertex`` vertices. Each sub-class of ``DfgVertex`` corresponds to an
expression (a sub-class of ``AstNodeExpr``), a constanat, or a variable
expression (a sub-class of ``AstNodeExpr``), a constant, or a variable
reference. LValues and RValues referencing the same storage location are
represented by the same ``DfgVertex``. Consumers of such vertices read as the
LValue, writers of such vertices write the RValue. The bulk of the final
@ -225,11 +225,11 @@ Scheduling
Verilator implements the Active and NBA regions of the SystemVerilog scheduling
model as described in IEEE 1800-2017 chapter 4, and in particular sections
4.5 and Figure 4.1. The static (verilation time) scheduling of SystemVerilog
4.5 and Figure 4.1. The static (Verilation time) scheduling of SystemVerilog
processes is performed by code in the ``V3Sched`` namespace. The single
entry-point to the scheduling algorithm is ``V3Sched::schedule``. Some
entry point to the scheduling algorithm is ``V3Sched::schedule``. Some
preparatory transformations important for scheduling are also performed in
``V3Active`` and ``V3ActiveTop``. High level evaluation functions are
``V3Active`` and ``V3ActiveTop``. High-level evaluation functions are
constructed by ``V3Order``, which ``V3Sched`` invokes on subsets of the logic
in the design.
@ -267,8 +267,8 @@ The classes of logic we distinguish between are:
below.
- Clocked logic. Any process or construct that has an explicit sensitivity
list, with no implicit sensitivities is considered 'clocked' (or
'sequential') logic. This includes among other things ``always`` and
list, with no implicit sensitivities, is considered 'clocked' (or
'sequential') logic. This includes, among other things, ``always`` and
``always_ff`` processes with an explicit sensitivity list.
Note that the distinction between clocked logic and combinational logic is only
@ -321,7 +321,7 @@ At the highest level, ordering is performed by ``V3Order::order``, which is
invoked by ``V3Sched::schedule`` on various subsets of the combinational and
clocked logic as described below. The important thing to highlight now is that
``V3Order::order`` operates by assuming that the state of all variables driven
by combinational logic are consistent with that combinational logic. While this
by combinational logic is consistent with that combinational logic. While this
might seem subtle, it is very important, so here is an example:
::
@ -335,7 +335,7 @@ first, and all downstream combinational logic (like the assignment to ``d``)
will execute after the clocked logic that drives inputs to the combinational
logic, in data-flow (or dependency) order. At the end of the evaluation step,
this ordering restores the invariant that variables driven by combinational
logic are consistent with that combinational logic (i.e.: the circuit is in a
logic are consistent with that combinational logic (i.e., the circuit is in a
settled/steady state).
One of the most important optimizations for performance is to only evaluate
@ -344,12 +344,12 @@ point in evaluating the above assignment to ``d`` on a negative edge of the
clock signal. Verilator does this by pushing the combinational logic into the
same (possibly multiple) event domains as the logic driving the inputs to that
combinational logic, and only evaluating the combinational logic if at least
one driving domains have been triggered. The impact of this activity gating is
one driving domain has been triggered. The impact of this activity gating is
very high (observed 100x slowdown on large designs when turning it off); it is
the reason we prefer to convert clocked logic to combinational logic in
``V3Active`` whenever possible.
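The activity gating described above can be sketched as follows. This is an
illustration only and not code that Verilator emits: ``CombBlock``,
``DomainSet``, ``kNumDomains``, and ``evalIfActive`` are hypothetical names,
and each event domain is modeled as one bit in a fixed-size set.

```cpp
#include <bitset>
#include <cstddef>

constexpr std::size_t kNumDomains = 8;  // Hypothetical number of event domains
using DomainSet = std::bitset<kNumDomains>;

// Hypothetical record for one block of combinational logic
struct CombBlock {
    DomainSet domains;  // Event domains of the logic driving this block's inputs
    int evalCount = 0;  // How many times the block body has run
    void eval() { ++evalCount; }  // Stand-in for the real logic
};

// Run the block only if at least one of its driving domains has triggered
inline bool evalIfActive(CombBlock& block, const DomainSet& triggered) {
    if ((block.domains & triggered).none()) return false;  // Gated off
    block.eval();
    return true;
}
```

With this shape, logic fed only from a positive-edge domain is skipped
entirely when only some other domain has triggered.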
The ordering procedure described above works straight forward unless there are
The ordering procedure described above is straightforward unless there are
combinational logic constructs that are circularly dependent (a.k.a.: the
UNOPTFLAT warning). Combinational scheduling loops can arise in sound
(realizable) circuits as Verilator considers each SystemVerilog process as a
@ -369,7 +369,7 @@ To achieve this, ``V3Sched::schedule`` calls ``V3Sched::breakCycles``, which
builds a dependency graph of all combinational logic in the design, and then
breaks all combinational cycles by converting all combinational logic that
consumes a variable driven via a 'back-edge' into hybrid logic. Here
'back-edge' just means a graph edge that points from a higher rank vertex to a
'back-edge' just means a graph edge that points from a higher-rank vertex to a
lower-rank vertex in some consistent ranking of the directed graph. Variables
driven via a back-edge in the dependency graph are marked, and all
combinational logic that depends on such variables is converted into hybrid
@ -382,7 +382,7 @@ logic, with two exceptions:
- Explicit sensitivities of hybrid logic are ignored for the purposes of
data-flow ordering with respect to other combinational or hybrid logic. I.e.:
an explicit sensitivity suppresses the implicit sensitivity on the same
variable. This cold also be interpreted as ordering the hybrid logic as if
variable. This could also be interpreted as ordering the hybrid logic as if
all variables listed as explicit sensitivities were substituted as constants
with their current values.
@ -396,7 +396,7 @@ explicit sensitivities are triggered.
The effect of this transformation is that ``V3Order`` can proceed as if there
are no combinational cycles (or alternatively, under the assumption that the
back-edge driven variables don't change during one evaluation pass). The
back-edge-driven variables don't change during one evaluation pass). The
evaluation loop invoking the ordered code will then re-invoke it on a
follow-on iteration if any of the explicit sensitivities of hybrid logic have
actually changed due to the previous invocation, iterating until all the
@ -422,8 +422,8 @@ combinationally driven variables are consistent with the combinational logic.
To achieve this, we invoke ``V3Order::order`` on all of the combinational and
hybrid logic, and iterate the resulting evaluation function until no more
hybrid logic is triggered. This yields the `_eval_settle` function which is
invoked at the beginning of simulation, after the `_eval_initial`.
hybrid logic is triggered. This yields the `_eval_settle` function, which is
invoked at the beginning of simulation after the `_eval_initial`.
Partitioning logic for correct NBA updates
@ -432,17 +432,17 @@ Partitioning logic for correct NBA updates
``V3Order`` can order logic corresponding to non-blocking assignments (NBAs) to
yield correct simulation results, as long as all the sensitivity expressions of
clocked logic triggered in the Active scheduling region of the current time
step are known up front. I.e.: the ordering of NBA updates is only correct if
step are known up front. I.e., the ordering of NBA updates is only correct if
derived clocks that are computed in an Active region update (that is, via a
blocking or continuous assignment) are known up front.
We can ensure this by partitioning the logic into two regions. Note these
regions are a concept of the Verilator scheduling algorithm and they do not
regions are a concept of the Verilator scheduling algorithm, and they do not
directly correspond to the similarly named SystemVerilog scheduling regions
as defined in the standard:
- All logic (clocked, combinational and hybrid) that transitively feeds into,
or drives, via a non-blocking or continuous assignments (or via any update
or drives, via a non-blocking or continuous assignment (or via any update
that SystemVerilog executes in the Active scheduling region), a variable that
is used in the explicit sensitivity list of some clocked or hybrid logic, is
assigned to the 'act' region.
@ -450,10 +450,10 @@ as defined in the standard:
- All other logic is assigned to the 'nba' region.
For completeness, note that a subset of the 'act' region logic, specifically,
the logic related to the pre-assignments of NBA updates (i.e.: AstAssignPre
the logic related to the pre-assignments of NBA updates (i.e., AstAssignPre
nodes), is handled separately, but is executed as part of the 'act' region.
Also note that all logic representing the committing of an NBA (i.e.: Ast*Post)
Also note that all logic representing the committing of an NBA (i.e., Ast*Post
nodes) will be in the 'nba' region. This means that the evaluation of the 'act'
region logic will not commit any NBA updates. As a result, the 'act' region
logic can be iterated to compute all derived clock signals up front.
@ -462,7 +462,7 @@ The correspondence between the SystemVerilog Active and NBA scheduling regions,
and the internal 'act' and 'nba' regions, is that 'act' contains all Active
region logic that can compute a clock signal, while 'nba' contains all other
Active and NBA region logic. For example, if the only clocks in the design are
top level inputs, then 'act' will be empty, and 'nba' will contain the whole of
top-level inputs, then 'act' will be empty, and 'nba' will contain the whole of
the design.
The partitioning described above is performed by ``V3Sched::partition``.
@ -475,10 +475,10 @@ We will separately invoke ``V3Order::order`` on the 'act' and 'nba' region
logic.
Combinational logic that reads variables driven from both 'act' and 'nba'
region logic has the problem of needing to be re-evaluated even if only one of
region logic has the problem of needing to be reevaluated even if only one of
the regions updates an input variable. We could pass additional trigger
expressions between the regions to make sure combinational logic is always
re-evaluated, or we can replicate combinational logic that is driven from
reevaluated, or we can replicate combinational logic that is driven from
multiple regions, by copying it into each region that drives it. Experiments
show this simple replication works well performance-wise (and notably
``V3Combine`` is good at combining the replicated code), so this is what we do
@ -506,7 +506,7 @@ the top level `_eval` function, which on the high level has the form:
::
void _eval() {
// Update combinational logic dependent on top level inptus ('ico' region)
// Update combinational logic dependent on top level inputs ('ico' region)
while (true) {
_eval__triggers__ico();
// If no 'ico' region trigger is active
@ -534,7 +534,7 @@ the top level `_eval` function, which on the high level has the form:
// If no 'nba' region trigger is active
if (!nba_triggers.any()) break;
// Evaluate all other Active region logic, and commti NBAs
// Evaluate all other Active region logic, and commit NBAs
_eval_nba();
}
}
@ -628,7 +628,7 @@ coroutines ``co_await`` its ``join`` function, and forked ones call ``done``
when they're finished. Once the required number of coroutines (set using
``setCounter``) finish execution, the forking coroutine is resumed.
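The counting behavior can be modeled without coroutines, as in the sketch
below. ``JoinCounter`` is a hypothetical class written for illustration; the
method names mirror the ones mentioned above, but a stored callback stands in
for suspending and resuming the forking coroutine, so this is not the runtime
implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical, coroutine-free model of the join counter
class JoinCounter {
    std::size_t m_remaining = 0;  // Forked processes that still have to finish
    std::function<void()> m_resume;  // Stand-in for resuming the forking coroutine
public:
    void setCounter(std::size_t count) { m_remaining = count; }
    // Stand-in for co_await on join(): remember how to resume the forker
    void join(std::function<void()> resume) {
        m_resume = std::move(resume);
        if (m_remaining == 0 && m_resume) m_resume();  // Everything already done
    }
    // Called by each forked process when it finishes
    void done() {
        assert(m_remaining > 0);
        if (--m_remaining == 0 && m_resume) m_resume();
    }
};
```

The forker continues only after the last required ``done()`` call, regardless
of the order in which the forked processes finish.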
Awaitable utilities
Awaitable Utilities
^^^^^^^^^^^^^^^^^^^
There are also two small utility awaitable types:
@ -639,7 +639,7 @@ There are also two small utility awaitable types:
* ``VlForever`` is used for blocking a coroutine forever. See the `Timing pass`
section for more detail.
Timing pass
Timing Pass
^^^^^^^^^^^
The visitor in ``V3Timing.cpp`` transforms each timing control into a ``co_await``.
@ -668,7 +668,7 @@ before them and stored in temporary variables.
and then await changes in variables used in the condition. If the condition is
always false, the ``wait`` statement is replaced by a ``co_await`` on a
``VlForever``. This is done instead of a return in case the ``wait`` is deep in
a call stack (otherwise the coroutine's caller would continue execution).
a call stack (otherwise, the coroutine's caller would continue execution).
Each sub-statement of a ``fork`` is put in an ``AstBegin`` node for easier
grouping. In a later step, each of these gets transformed into a new, separate
@ -748,7 +748,7 @@ doesn't suspend the forking process.
In forked processes, references to local variables are only allowed in
``fork..join``, as this is the only case that ensures the lifetime of these
locals is at least as long as the execution of the forked processes. This is
locals is at least as long as the execution of the forked processes. This is
where ``VlNow`` is used, to ensure the locals are moved to the heap before they
are passed by reference to the forked processes.
@ -770,7 +770,7 @@ graph, while maintaining as much available parallelism as possible. Often
the partitioner can transform an input graph with millions of nodes into a
coarsened execution graph with a few dozen nodes, while maintaining enough
parallelism to take advantage of a modern multicore CPU. Runtime
synchronization cost is not prohibitive with so few nodes.
synchronization cost is reasonable with so few nodes.
Partitioning
@ -789,7 +789,7 @@ The available parallelism or "par-factor" of a DAG is the total cost to
execute all nodes, divided by the cost to execute the longest critical path
through the graph. This is the speedup you would get from running the graph
in parallel, if given infinite CPU cores available and communication and
synchronization are zero.
synchronization costs are zero.
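As a worked illustration of this definition, the sketch below computes the
par-factor of a small DAG. It is not the partitioner's code: ``parFactor`` is
a hypothetical helper that assumes strictly positive node costs and nodes
numbered in topological order, so that every edge goes from a lower index to
a higher one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using EdgeList = std::vector<std::pair<std::size_t, std::size_t>>;

// Par-factor = (sum of all node costs) / (cost of the most expensive path)
inline double parFactor(const std::vector<double>& cost, const EdgeList& edges) {
    assert(!cost.empty());
    std::vector<double> pathCost(cost);  // Costliest path ending at each node
    EdgeList sorted(edges);
    std::sort(sorted.begin(), sorted.end());  // Visit edges by source index
    for (const auto& edge : sorted) {
        assert(edge.first < edge.second && edge.second < cost.size());
        pathCost[edge.second]
            = std::max(pathCost[edge.second], pathCost[edge.first] + cost[edge.second]);
    }
    double total = 0;
    for (const double c : cost) total += c;
    const double critical = *std::max_element(pathCost.begin(), pathCost.end());
    return total / critical;
}
```

For a diamond-shaped graph with costs 1, 2, 3, and 1, the total cost is 7 and
the critical path costs 5, giving a par-factor of 1.4; a pure chain always
gives 1.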
Macro Task
@ -847,7 +847,7 @@ synchronization costs.
Verilator's cost estimates are assigned by ``InstrCountVisitor``. This
class is perhaps the most fragile piece of the multithread
implementation. It's easy to have a bug where you count something cheap
(eg. accessing one element of a huge array) as if it were expensive (eg.
(e.g., accessing one element of a huge array) as if it were expensive (e.g.,
by counting it as if it were an access to the entire array.) Even without
such gross bugs, the estimates this produces are only loosely predictive of
actual runtime cost. Multithread performance would be better with better
@ -879,13 +879,13 @@ fragmentation.
Locating Variables for Best Spatial Locality
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
After scheduling all code, we attempt to locate variables in memory such
After scheduling all code, we attempt to locate variables in memory such
that variables accessed by a single macro-task are close together in
memory. This provides "spatial locality" - when we pull in a 64-byte cache
line to access a 2-byte variable, we want the other 62 bytes to be ones
we'll also likely access soon, for best cache performance.
This turns out to be critical for performance. It should allow Verilator
This is critical for performance. It should allow Verilator
to scale to very large models. We don't rely on our working set fitting
in any CPU cache; instead we essentially "stream" data into caches from
memory. It's not literally streaming, where the address increases
@ -904,7 +904,7 @@ The footprint ordering is literally the traveling salesman problem, and
we use a TSP-approximation algorithm to get close to an optimal sort.
This is an old idea. Simulators designed at DEC in the early 1990s used
similar techniques to optimize both single-thread and multi-thread
similar techniques to optimize both single-thread and multithread
modes. (Verilator does not optimize variable placement for spatial
locality in serial mode; that is a possible area for improvement.)
@ -918,7 +918,7 @@ Wave Scheduling
To allow the Verilated model to run in parallel with the testbench, it
might be nice to support "wave" scheduling, in which work on a cycle begins
before ``eval()`` is called or continues after ``eval()`` returns. For now
before ``eval()`` is called or continues after ``eval()`` returns. For now,
all work on a cycle happens during the ``eval()`` call, leaving Verilator's
threads idle while the testbench (everything outside ``eval()``) is
working. This would involve fundamental changes within the partitioner,
@ -929,7 +929,7 @@ Efficient Dynamic Scheduling
""""""""""""""""""""""""""""
To scale to more than a few threads, we may revisit a fully dynamic
scheduler. For large (>16 core) systems it might make sense to dedicate an
scheduler. For large (>16 core) systems, it might make sense to dedicate an
entire core to scheduling, so that scheduler data structures would fit in
its L1 cache and thus the cost of traversing priority-ordered ready lists
would not be prohibitive.
@ -983,7 +983,7 @@ Performance Regression
""""""""""""""""""""""
It would be nice if we had a regression of large designs, with some
diversity of design styles, to test on both single- and multi-threaded
diversity of design styles, to test on both single- and multithreaded
modes. This would help to avoid performance regressions, and also to
evaluate the optimizations while minimizing the impact of parasitic noise.
@ -992,7 +992,7 @@ Per-Instance Classes
""""""""""""""""""""
If we have multiple instances of the same module, and they partition
differently (likely; we make no attempt to partition them the same) then
differently (likely; we make no attempt to partition them the same), then
the variable sort will be suboptimal for either instance. A possible
improvement would be to emit a unique class for each instance of a module,
and sort its variables optimally for that instance's code stream.
@ -1011,17 +1011,17 @@ until all signals are stable.
On other evaluations, the Verilated code detects what input signals have
changed. If any are clocks, it calls the appropriate sequential functions
(from ``always @ posedge`` statements). Interspersed with sequential
functions it calls combo functions (from ``always @*``). After this is
functions, it calls combo functions (from ``always @*``). After this is
complete, it detects any changes due to combo loops or internally generated
clocks, and if one is found, it must reevaluate the model.
For SystemC code, the ``eval()`` function is wrapped in a SystemC
``SC_METHOD``, sensitive to all inputs. (Ideally it would only be sensitive
``SC_METHOD``, sensitive to all inputs. (Ideally, it would only be sensitive
to clocks and combo inputs, but tracing requires all signals to cause
evaluation, and the performance difference is small.)
If tracing is enabled, a callback examines all variables in the design for
changes, and writes the trace for each change. To accelerate this process
changes, and writes the trace for each change. To accelerate this process,
the evaluation process records a bitmask of variables that might have
changed; if clear, checking those signals for changes may be skipped.
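A minimal sketch of this skip logic is shown below. It is an illustration
under stated assumptions rather than the generated trace code:
``TracedSignal``, ``dumpChanges``, and ``kNumGroups`` are hypothetical names,
and each activity bit is assumed to cover a group of signals.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kNumGroups = 4;  // Hypothetical number of activity bits

// Hypothetical traced signal
struct TracedSignal {
    std::uint32_t value = 0;  // Current value in the model
    std::uint32_t lastDumped = 0;  // Value most recently written to the trace
    std::size_t group = 0;  // Index of the activity bit covering this signal
};

// Returns how many signals were written to the trace. Signals whose
// activity bit is clear are skipped without comparing their values.
inline std::size_t dumpChanges(std::vector<TracedSignal>& signals,
                               std::bitset<kNumGroups>& activity) {
    std::size_t written = 0;
    for (TracedSignal& sig : signals) {
        if (!activity.test(sig.group)) continue;  // Cannot have changed
        if (sig.value != sig.lastDumped) {
            sig.lastDumped = sig.value;  // Stand-in for writing the change
            ++written;
        }
    }
    activity.reset();  // The next evaluation sets bits again as needed
    return written;
}
```

A set bit only means "might have changed", so the value comparison is still
needed for signals in active groups.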
@ -1045,7 +1045,7 @@ is appreciated if you could match our style:
- Use "mixedCapsSymbols" instead of "underlined_symbols".
- Uas a "p" suffix on variables that are pointers, e.g. "nodep".
- Use a "p" suffix on variables that are pointers, e.g., "nodep".
- Comment every member variable.
@ -1057,12 +1057,12 @@ using clang-format version 10.0.0, and yapf for python, and is
automatically corrected in the CI actions. For those manually formatting C
code:
- Use 4 spaces per level, and no tabs.
- Use four spaces per level, and no tabs.
- Use 2 spaces between the end of source and the beginning of a
- Use two spaces between the end of source and the beginning of a
comment.
- Use 1 space after if/for/switch/while and similar keywords.
- Use one space after if/for/switch/while and similar keywords.
- No spaces before semicolons, nor between a function's name and open
parenthesis (only applies to functions; if/else has a following space).
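A short fragment written to these rules might look like the sketch below. The
``ListNode`` type and ``countNonNegative`` function are hypothetical and exist
only to show the formatting, and the ``m_`` prefix on member variables is an
assumption that the rules above do not state.

```cpp
#include <cstddef>

// Hypothetical singly linked node, used only to illustrate the style rules
struct ListNode {
    ListNode* m_nextp = nullptr;  // Next node in the list, or nullptr at the end
    int m_value = 0;  // Payload carried by this node
};

// Count the nodes reachable from headp that hold a non-negative value
inline std::size_t countNonNegative(const ListNode* headp) {
    std::size_t nodeCount = 0;  // mixedCaps name, two spaces before the comment
    for (const ListNode* nodep = headp; nodep; nodep = nodep->m_nextp) {
        if (nodep->m_value < 0) continue;  // One space after "if" and "for"
        ++nodeCount;
    }
    return nodeCount;
}
```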
@ -1073,8 +1073,8 @@ The ``astgen`` Script
The ``astgen`` script is used to generate some of the repetitive C++ code
related to the ``AstNode`` type hierarchy. An example is the abstract ``visit``
methods in ``VNVisitor``. There are other uses, please see the ``*__gen*``
files in the bulid directories and the ``astgen`` script itself for details. A
methods in ``VNVisitor``. There are other uses; please see the ``*__gen*``
files in the build directories and the ``astgen`` script for details. A
description of the more advanced features of ``astgen`` is provided here.
@ -1099,7 +1099,7 @@ sub-class definitions are parsed and contribute to the code generated by
``astgen``. The general syntax is ``@astgen <keywords> := <description>``,
where ``<keywords>`` determines what is being defined, and ``<description>`` is
a ``<keywords>`` dependent description of the definition. The list of
``@astgen`` directives is as follows:
``@astgen`` directives is as follows:
``op<N>`` operand directives
@ -1128,7 +1128,7 @@ An example of the full syntax of the directive is
``astnode`` generates accessors for the child nodes based on these directives.
For non-list children, the names of the getter and setter both are that of the
given ``<identifier>``. For list type children, the getter is ``<identifier>``,
given ``<identifier>``. For list-type children, the getter is ``<identifier>``,
and instead of a setter, an ``add<Identifier>`` method is generated
that appends new nodes (or lists of nodes) to the child list.
@ -1185,10 +1185,10 @@ and applies the visit method of the ``VNVisitor`` to the invoking AstNode
instance (i.e. ``this``).
One possible difficulty is that a call to ``accept`` may perform an edit
which destroys the node it receives as argument. The
which destroys the node it receives as an argument. The
``acceptSubtreeReturnEdits`` method of ``AstNode`` is provided to apply
``accept`` and return the resulting node, even if the original node is
destroyed (if it is not destroyed it will just return the original node).
destroyed (if it is not destroyed, it will just return the original node).
The behavior of the visitor classes is achieved by overloading the
``visit`` function for the different ``AstNode`` derived classes. If a
@ -1212,7 +1212,7 @@ There are three ways data is passed between visitor functions.
it's cleared. Children under an ``AstModule`` will see it set, while
nodes elsewhere will see it clear. If there can be nested items (for
example an ``AstFor`` under an ``AstFor``) the variable needs to be
save-set-restored in the ``AstFor`` visitor, otherwise exiting the
save-set-restored in the ``AstFor`` visitor; otherwise exiting the
lower for will lose the upper for's setting.
2. User attributes. Each ``AstNode`` (**Note.** The AST node, not the
@ -1243,14 +1243,14 @@ There are three ways data is passed between visitor functions.
These comments are important to make sure a ``user#()`` on a given
``AstNode`` type is never being used for two different purposes.
Note that calling ``user#ClearTree`` is fast, it doesn't walk the
Note that calling ``user#ClearTree`` is fast; it doesn't walk the
tree, so it's ok to call fairly often. For example, it's commonly
called on every module.
3. Parameters can be passed between the visitors in close to the
"normal" function caller to callee way. This is the second ``vup``
parameter of type ``AstNUser`` that is ignored on most of the visitor
functions. V3Width does this, but it proved more messy than the above
functions. V3Width does this, but it proved messier than the above
and is deprecated. (V3Width was nearly the first module written.
Someday this scheme may be removed, as it slows the program down to
have to pass vup everywhere.)
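
The save-set-restore pattern mentioned in the first item can be sketched as follows. This is an illustration under stated assumptions, not Verilator source: ``ToyFor`` and ``ToyVisitor`` are hypothetical stand-ins for ``AstFor`` and a visitor class, and ``enclosingp`` exists only so the effect can be observed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a loop node; not Verilator's AstFor.
struct ToyFor {
    std::vector<std::unique_ptr<ToyFor>> children;  // Nested loops
    ToyFor* enclosingp = nullptr;  // Enclosing loop seen by the visitor
};

class ToyVisitor {
    ToyFor* m_forp = nullptr;  // Innermost loop being visited, or nullptr

public:
    void visit(ToyFor* nodep) {
        assert(nodep);
        nodep->enclosingp = m_forp;        // Observe the state set by the parent
        ToyFor* const savedForp = m_forp;  // Save
        m_forp = nodep;                    // Set
        for (auto& childp : nodep->children) visit(childp.get());
        m_forp = savedForp;                // Restore
    }
    ToyFor* currentFor() const { return m_forp; }
};
```

Without the restore step, the second child of a loop would see its sibling, rather than its parent, as the enclosing loop.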
@ -1305,7 +1305,7 @@ change. For example:
iterateAndNextNull(nodep->lhsp());
Will work fine, as even if the first iterate causes a new node to take
the place of the ``lhsp()``, that edit will update ``nodep->lhsp()`` and
the place of the ``lhsp()``, that edit will update ``nodep->lhsp()``, and
the second call will correctly see the change. Alternatively:
::
@ -1318,8 +1318,8 @@ the second call will correctly see the change. Alternatively:
This will cause bugs or a core dump, as lp is a dangling pointer. Thus
it is advisable to set lhsp=NULL, as shown in the \*'s above, to make sure
these dangles are avoided. Another alternative used in special cases
mostly in V3Width is to use acceptSubtreeReturnEdits, which operates on
these dangles are avoided. Another alternative used in special cases,
mostly in V3Width, is to use acceptSubtreeReturnEdits, which operates on
a single node and returns the new pointer if any. Note
acceptSubtreeReturnEdits does not follow ``nextp()`` links.
@ -1332,7 +1332,7 @@ Identifying Derived Classes
---------------------------
A common requirement is to identify the specific ``AstNode`` class we
are dealing with. For example a visitor might not implement separate
are dealing with. For example, a visitor might not implement separate
``visit`` methods for ``AstIf`` and ``AstGenIf``, but just a single
method for the base class:
@ -1355,7 +1355,7 @@ use:
Additionally, the ``VN_CAST`` method converts pointers similarly to C++
``dynamic_cast``. This either returns a pointer to the object cast to
that type (if it is of class ``SOMETYPE``, or a derived class of
``SOMETYPE``) or else NULL. (However, for true/false tests use ``VN_IS``
``SOMETYPE``) or else NULL. (However, for true/false tests, use ``VN_IS``
as that is faster.)
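
The distinction between a true/false type test and a checked cast can be illustrated with plain ``dynamic_cast``. This is only an analogy and does not show how ``VN_IS`` or ``VN_CAST`` are implemented: the ``Toy*`` classes and the ``toyIs``/``toyCast`` helpers are hypothetical, and here the test is built on the cast, so it is not faster.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class hierarchy; not Verilator's AstNode classes.
struct ToyNode {
    virtual ~ToyNode() = default;
};
struct ToyNodeIf : ToyNode {};
struct ToyGenIf final : ToyNodeIf {};
struct ToyVar final : ToyNode {};

// Checked cast: a pointer of type T, or nullptr if nodep is not a T.
template <typename T>
T* toyCast(ToyNode* nodep) {
    return dynamic_cast<T*>(nodep);
}

// True/false test: is nodep a T, or a class derived from T?
template <typename T>
bool toyIs(ToyNode* nodep) {
    return toyCast<T>(nodep) != nullptr;
}

// Count how many nodes in an array are ToyNodeIf or derived from it.
std::size_t countIfs(ToyNode* const* nodes, std::size_t n) {
    assert(nodes || n == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (toyIs<ToyNodeIf>(nodes[i])) ++count;
    }
    return count;
}
```

As with ``dynamic_cast``, a cast to a base class succeeds for any derived object, while a cast to an unrelated class yields a null pointer.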
@ -1364,13 +1364,13 @@ as that is faster.)
Testing
=======
For an overview of how to write a test see the BUGS section of the
For an overview of how to write a test, see the BUGS section of the
`Verilator Manual <https://verilator.org/verilator_doc.html>`_.
It is important to add tests for failures as well as successes (for
example, to check that an error message is correctly triggered).
Tests that fail should by convention have the suffix ``_bad`` in their
Tests that fail should, by convention, have the suffix ``_bad`` in their
name, and include ``fails = 1`` in either their ``compile`` or
``execute`` step as appropriate.
@ -1378,11 +1378,11 @@ name, and include ``fails = 1`` in either their ``compile`` or
Preparing to Run Tests
----------------------
For all tests to pass you must install the following packages:
For all tests to pass, you must install the following packages:
- SystemC to compile the SystemC outputs, see http://systemc.org
- Parallel::Forker from CPAN to run tests in parallel, you can install
- Parallel::Forker from CPAN to run tests in parallel; you can install
this with e.g. "sudo cpan install Parallel::Forker".
- vcddiff to find differences in VCD outputs. See the readme at
@ -1417,9 +1417,9 @@ This can be changed using the ``top_filename`` subroutine, for example
top_filename("t/t_myothertest.v");
By default all tests will run with major simulators (Icarus Verilog, NC,
VCS, ModelSim, etc) as well as Verilator, to allow results to be
compared. However if you wish a test only to be used with Verilator, you
By default, all tests will run with major simulators (Icarus Verilog, NC,
VCS, ModelSim, etc.) as well as Verilator, to allow results to be
compared. However, if you wish a test only to be used with Verilator, you
can use the following:
::
@ -1435,7 +1435,7 @@ Of the many options that can be set through arguments to ``compiler`` and
``fails``
Set to 1 to indicate that the compilation or execution is intended to fail.
For example the following would specify that compilation requires two
For example, the following would specify that compilation requires two
defines and is expected to fail.
::
@ -1452,15 +1452,15 @@ Regression Testing for Developers
Developers will also want to call ./configure with two extra flags:
``--enable-ccwarn``
Causes the build to stop on warnings as well as errors. A good way to
ensure no sloppy code gets added, however it can be painful when it
This causes the build to stop on warnings as well as errors. This is a
good way to ensure no sloppy code gets added; however, it can be painful
when it comes to testing, since third-party code used in the tests (e.g.,
SystemC) may not be warning-free.
``--enable-longtests``
In addition to the standard C and SystemC examples, also run the tests
in the ``test_regress`` directory when using *make test*. This is
disabled by default as SystemC installation problems would otherwise
disabled by default, as SystemC installation problems would otherwise
falsely indicate a Verilator problem.
When enabling the long tests, some additional Perl modules are needed,
@ -1477,7 +1477,7 @@ There are some traps to avoid when running regression tests
- Not all Linux systems install Perldoc by default. This is needed for the
``--help`` option to Verilator, and also for regression testing. This
can be installed using cpan:
can be installed using CPAN:
::
@ -1489,8 +1489,8 @@ There are some traps to avoid when running regression tests
- Running regression may exhaust resources on some Linux systems,
particularly file handles and user processes. Increase these to
respectively 16,384 and 4,096. The method of doing this is system
dependent, but on Fedora Linux it would require editing the
16,384 and 4,096, respectively. The method of doing this is
system-dependent, but on Fedora Linux it would require editing the
``/etc/security/limits.conf`` file as root.
@ -1510,7 +1510,7 @@ Continuous Integration
Verilator uses GitHub Actions, which automatically tests the master branch
for test failures on new commits. It also runs a daily cron job to validate
all of the tests against different OS and compiler versions.
all tests against different OS and compiler versions.
Developers can enable Actions on their GitHub repository so that the CI
environment can check their branches too by enabling the build workflow:
@ -1555,7 +1555,7 @@ debug level 5, with the V3Width.cpp file at level 9.
--debug
-------
When you run with ``--debug`` there are two primary output file types
When you run with ``--debug``, there are two primary output file types
placed into the obj_dir, .tree and .dot files.
@ -1572,7 +1572,7 @@ output, for example:
dot -Tps -o ~/a.ps obj_dir/Vtop_foo.dot
You can then print a.ps. You may prefer gif format, which doesn't get
scaled so can be more useful with large graphs.
scaled, so it can be more useful with large graphs.
For interactive graph viewing consider `xdot
<https://github.com/jrfonseca/xdot.py>`__ or `ZGRViewer
@ -1617,21 +1617,21 @@ field in the section below.
+---------------+--------------------------------------------------------+
| ``w32`` | The data-type width() is 32 bits. |
+---------------+--------------------------------------------------------+
| ``out_wide`` | The name() of the node, in this case the name of the |
| ``out_wide`` | The name() of the node, in this case, the name of the |
| | variable. |
+---------------+--------------------------------------------------------+
| ``[O]`` | Flags which vary with the type of node, in this |
| | case it means the variable is an output. |
| | case, it means the variable is an output. |
+---------------+--------------------------------------------------------+
In more detail the following fields are dumped common to all nodes. They
In more detail, the following fields are dumped common to all nodes. They
are produced by the ``AstNode::dump()`` method:
Tree Hierarchy
The dump lines begin with numbers and colons to indicate the child
node hierarchy. As noted above, ``AstNode`` has lists of items at the
same level in the AST, connected by the ``nextp()`` and ``prevp()``
pointers. These appear as nodes at the same level. For example after
pointers. These appear as nodes at the same level. For example, after
inlining:
::
@ -1655,20 +1655,20 @@ Address of the node
with the debugger. If the actual address values are not important,
then using the ``--dump-tree-addrids`` option will convert address
values to short identifiers of the form ``([A-Z]*)``, which is
hopefully easier for the reader to cross reference throughout the
hopefully easier for the reader to cross-reference throughout the
dump.
Last edit number
Of the form ``<ennnn>`` or ``<ennnn#>`` , where ``nnnn`` is the
number of the last edit to modify this node. The trailing ``#``
indicates the node has been edited since the last tree dump (which
typically means in the last refinement or optimization pass). GDB can
watch for this, see << /Debugging >>.
indicates the node has been edited since the last tree dump
(typically in the last refinement or optimization pass). GDB can
watch for this; see << /Debugging >>.
Source file and line
Of the form ``{xxnnnn}``, where ``xx`` is the filename letter (or
letters) and ``nnnn`` is the line number within that file. The first
file is ``a``, the 26th is ``z``, the 27th is ``aa`` and so on.
file is ``a``, the 26th is ``z``, the 27th is ``aa``, and so on.
User pointers
Shows the value of the node's user1p...user5p, if non-NULL.
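
The file lettering described under "Source file and line" can be sketched as a small function. This is an illustration, not Verilator's implementation: ``fileLetters`` is a hypothetical name, and the continuation beyond ``aa`` (``ab``, ..., ``az``, ``ba``, ...) is an assumption, since the text above only specifies ``a``, ``z``, and ``aa``.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps a 1-based file number to its letter code,
// assuming spreadsheet-column-style numbering (1 -> a, 26 -> z, 27 -> aa).
std::string fileLetters(unsigned fileNumber) {
    assert(fileNumber >= 1);  // File numbers start at one in this sketch
    std::string out;
    while (fileNumber > 0) {
        --fileNumber;  // Shift to 0..25 so that 'z' does not carry
        out.insert(out.begin(), static_cast<char>('a' + fileNumber % 26));
        fileNumber /= 26;
    }
    return out;
}
```

The decrement before each digit is what makes the 26th file ``z`` rather than rolling over to a two-letter code.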
@ -1683,7 +1683,7 @@ Data type
- ``s`` if the node is signed.
- ``d`` if the node is a double (i.e a floating point entity).
- ``d`` if the node is a double (i.e. a floating point entity).
- ``w`` always present, indicating this is the width field.
@ -1693,9 +1693,9 @@ Data type
width.
Name of the entity represented by the node if it exists
For example for a ``VAR`` it is the name of the variable.
For example, for a ``VAR`` it is the name of the variable.
Many nodes follow these fields with additional node specific
Many nodes follow these fields with additional node-specific
information. Thus the ``VARREF`` node will print either ``[LV]`` or
``[RV]`` to indicate a left value or right value, followed by the node
of the variable being referred to. For example:
@ -1710,7 +1710,7 @@ type in question to determine additional fields that may be printed.
The ``MODULE`` has a list of ``CELLINLINE`` nodes referred to by its
``op1p()`` pointer, connected by ``nextp()`` and ``prevp()`` pointers.
Similarly the ``NETLIST`` has a list of modules referred to by its
Similarly, the ``NETLIST`` has a list of modules referred to by its
``op1p()`` pointer.
@ -1728,7 +1728,7 @@ Debugging with GDB
------------------
The test_regress/driver.pl script accepts ``--debug --gdb`` to start
Verilator under gdb and break when an error is hit or the program is about
Verilator under gdb and break when an error is hit, or the program is about
to exit. You can also use ``--debug --gdbbt`` to just backtrace and then
exit gdb. To debug the Verilated executable, use ``--gdbsim``.
@ -1805,7 +1805,7 @@ backtrace. You will typically see a frame sequence something like:
Adding a New Feature
====================
Generally what would you do to add a new feature?
Generally, what would you do to add a new feature?
1. File an issue (if there isn't one already) so others know what you're
working on.
@ -1823,7 +1823,7 @@ Generally what would you do to add a new feature?
Ordering of definitions is enforced by ``astgen``.
5. Now you can run "test_regress/t/t_<newtestcase>.pl --debug" and it'll
probably fail but you'll see a
probably fail, but you'll see a
"test_regress/obj_dir/t_<newtestcase>/*.tree" file which you can examine
to see if the parsing worked. See also the sections above on debugging.
@ -1833,12 +1833,12 @@ Generally what would you do to add a new feature?
Adding a New Pass
-----------------
For more substantial changes you may need to add a new pass. The simplest
For more substantial changes, you may need to add a new pass. The simplest
way to do this is to copy the ``.cpp`` and ``.h`` files from an existing
pass. You'll need to add a call into your pass from the ``process()``
function in ``src/verilator.cpp``.
To get your pass to build you'll need to add its binary filename to the
To get your pass to build, you'll need to add its binary filename to the
list in ``src/Makefile_obj.in`` and reconfigure.
@ -1854,11 +1854,9 @@ IEEE 1800-2017 3.3 modules within modules
IEEE 1800-2017 6.12 "shortreal"
Little/no tool support, and easily promoted to real.
IEEE 1800-2017 11.11 Min, typ, max
No SDF support so will always use typical.
No SDF support, so will always use typical.
IEEE 1800-2017 11.12 "let"
Little/no tool support, makes difficult to implement parsers.
IEEE 1800-2017 20.15 Probabilistic functions
Little industry use.
Little/no tool support, makes it difficult to implement parsers.
IEEE 1800-2017 20.16 Stochastic analysis
Little industry use.
IEEE 1800-2017 20.17 PLA modeling