// DESCRIPTION: Verilator: List of To Do issues.
//
// Copyright 2004-2009 by Wilson Snyder. This program is free software; you can
// redistribute it and/or modify it under the terms of either the GNU
// General Public License or the Perl Artistic License.
Features:
Finish 3.400 new ordering fixes
Latch optimizations {Need here}
Task I/Os connecting to non-simple variables.
Fix nested casez statements expanding into huge C++. [JeanPaul Vanitegem]
Fix FileLine memory inefficiency
Fix ordering of each bit separately in a signal (mips)
assign b[3:0] = b[7:4]; assign b[7:4] = in;
Support gate primitives/ cell libraries from xilinx, etc
Assign dont_care value to a 1'bzzz assignment
Function to eval combo logic after /*verilator public*/ functions [gwaters]
Support generated clocks (correctness)
?gcov coverage
Selectable SystemC types based on widths (see notes below)
Compile time
Inline first level trace routines
Coverage
Points should be per-scope like everything else rather than per-module
Expression coverage (see notes)
Constant functions for widths, etc, IE "input [log2(PARAM):0] xx;"
More Verilog 2001 Support
(* *) Attributes (just ignore -- preprocessor?)
Real numbers (NEVER)
Recursive functions (NEVER)
Verilog configuration files (NEVER)
DPI to define C/C++ calls from Verilog
Long-term Features
Assertions
VHDL parser [Philips]
Tristate support
SystemPerl integration
Multithreaded execution
Front end parser integration into Verilog-Perl like Netlist
Testing:
Capture all inputs into global "rerun it" file
Code to make wrapper that sets signals, so can do comparison checks
New random program generator
Better graph viewer with search and zoom
Port and test against opencores.org code
Usability:
Better reporting of unopt problems, including what lines of code
Report more errors (all of them?) before exiting [Eugene Weber]
Internal Code:
Eliminate the AstNUser* passed to all visitors; it's only needed in V3Width,
and removing it will speed up and simplify all the other code.
V3Graph should be templated container type, taking in Vertex + Edge types
Performance:
Constant propagation
Extra cleaning AND: 1 & ((VARREF >> 1) | ((&VARREF >> 1) & VARREF))
Extra shift (perhaps due to clean): if (1 & CAST (VARREF >> #))
Gated clock and latch conversion to flops. [JeanPaul Vanitegem]
Could propagate the AND into pos/negedges and let domaining optimize.
Negedge reset
Switch to remove negedges that don't matter
Can't remove async resets from control flops (like in synchronizers)
If all references to array have a constant index, blow up into separate signals-per-index
Multithreaded execution
Bit-multiply for faster bit swapping and a=b[1,3,2] random bit reorderings.
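One known instance of the bit-multiply idea is the 64-bit multiply-and-mask byte reversal. Whether this is the exact form intended by the item above is an assumption; the function names below are hypothetical and the sketch only compares the trick against a plain loop.

```cpp
#include <cassert>
#include <cstdint>

// Reference: reverse the 8 bits of b one bit at a time.
uint8_t reverse_byte_loop(uint8_t b) {
    uint8_t out = 0;
    for (int i = 0; i < 8; ++i) {
        out = static_cast<uint8_t>((out << 1) | ((b >> i) & 1u));
    }
    return out;
}

// The multiply places five copies of b at 8-bit offsets; the mask keeps
// one bit per output position, and the modulo by 1023 (2^10 - 1) folds
// the selected bits back into a single byte in reversed order.
uint8_t reverse_byte_multiply(uint8_t b) {
    return static_cast<uint8_t>(
        ((b * 0x0202020202ULL) & 0x010884422010ULL) % 1023);
}

// Exhaustive check over all 256 byte values.
void check_all_bytes() {
    for (int v = 0; v < 256; ++v) {
        const uint8_t b = static_cast<uint8_t>(v);
        assert(reverse_byte_multiply(b) == reverse_byte_loop(b));
    }
}
```

An arbitrary reordering such as a=b[1,3,2] would need its own multiplier and mask; the constants above only cover full byte reversal.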
Move _last sets and all other combo logic inside master
if() that triggers on all possible sense items
Rewrite and combine V3Life, V3Subst
If block temp only ever set in one place to constant, propagate it
Used in t_mem for array delayed assignments
Replace variables if set later in same cfunc branch
See for example duplicate sets of _narrow in cycle 90/91 of t_select_plusloop
Same assignment on both if branches
"if (a) { ... b=2; } else { ... b=2;}" -> "b=2; if ..."
Careful though, as b could appear in the statement or multiple times in statement
(Could just require exactly two 'b's in statement)
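A minimal sketch of the hoisting above and of the case it must avoid, written as plain C++ rather than Verilator output. The struct and function names are hypothetical; the point is only that the hoist is safe when neither branch reads b, and changes results when one does.

```cpp
#include <cassert>

struct State {
    int b = 0;
    int x = 0;
    int y = 0;
};

bool same(const State& l, const State& r) {
    return l.b == r.b && l.x == r.x && l.y == r.y;
}

// Original shape: both branches end with the same assignment b = 2.
State before_hoist(bool a, int in) {
    State s;
    if (a) { s.x = in + 1; s.b = 2; }
    else   { s.y = in - 1; s.b = 2; }
    return s;
}

// Hoisted shape: legal here because neither branch reads b.
State after_hoist(bool a, int in) {
    State s;
    s.b = 2;
    if (a) { s.x = in + 1; }
    else   { s.y = in - 1; }
    return s;
}

// Unsafe case: the 'then' branch reads b before overwriting it.
State reads_b_original(bool a, int in) {
    State s;
    s.b = in;
    if (a) { s.x = s.b + 1; s.b = 2; }
    else   { s.y = in - 1;  s.b = 2; }
    return s;
}

// Naive hoist of the unsafe case: x now sees 2 instead of the old b.
State reads_b_naive_hoist(bool a, int in) {
    State s;
    s.b = in;
    s.b = 2;
    if (a) { s.x = s.b + 1; }
    else   { s.y = in - 1; }
    return s;
}
```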
Simplify XOR/XNOR/AND/OR bit selection trees
Foo = A[1] ^ A[2] ^ A[3] etc are better as ^ ( A & 32'b...1110 )
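The equivalence behind this item can be sketched in C++: XORing individual bit selects gives the same result as a reduction XOR (parity) of the masked word. The helper names are hypothetical, and the mask 0xE corresponds to the 32'b...1110 above (bits 1, 2 and 3).

```cpp
#include <cassert>
#include <cstdint>

// Bit-select tree: A[1] ^ A[2] ^ A[3].
bool xor_tree(uint32_t a) {
    return (((a >> 1) ^ (a >> 2) ^ (a >> 3)) & 1u) != 0;
}

// Reduction XOR (parity) of a 32-bit word, folding halves together.
bool reduce_xor(uint32_t v) {
    v ^= v >> 16;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return (v & 1u) != 0;
}

// Masked form: ^(A & 32'b1110).
bool xor_masked(uint32_t a) {
    return reduce_xor(a & 0xEu);
}

// Check both forms agree on every value of the low byte,
// with the upper bits either all clear or all set.
void check_xor_forms() {
    for (uint32_t v = 0; v < 256; ++v) {
        assert(xor_tree(v) == xor_masked(v));
        assert(xor_tree(v | 0xFFFFFF00u) == xor_masked(v | 0xFFFFFF00u));
    }
}
```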
Combine variables into wider elements
Parallel statements on different bits should become single signal
Variables that are always consumed in "parallel" can be joined
Duplicate assignments in gate optimization
Common to have many separate posedge blocks, each with identical
reset_r <= rst_in
*If signal is used only once (not counting trace), always gate substitute
Don't merge if any combining would form circ logic (out goes back to in)
Multiple assignments each bit can become single assign with concat
Make sure a SEL of a CONCAT can get the single bit back.
Usually blocks/values
Enable only after certain time, so VL_TIME_I(32) > 0x1e gets eliminated out
Better ordering of a<=b, b<=c, put all refs to 'b' next to each other to optimize caching
Allow Split of case statements without a $display/$stop
I-cache packing improvements (what/how?)
Data cache organization (order of vars in class)
First have clocks,
then bools instead of uint32_t's
then based on what sense list they come from, all outputs, then all inputs
finally have any signals part of a "usually" block, or constant.
Rather than tracking widths, have a MSB...LSB of this expression
(or better, a bitmask of bits relevant in this expression)
Track recirculation and convert into clock-enables
Clock enables should become new clocking domains for speed
If flopped(a) & flopped(b) and no other a&b, then instead flop(a&b).
Sort by output bitselects so can combine more assignments (see DDP example dx_dm signal)
All of the temp vars that get set, exp pre_ vars and never feedback
(not flops) don't need to be stored in the structs, but instead can
be per-invocation, and even better register-colored-like to reuse
the space. This will greatly reduce the data footprint.
//**********************************************************************
//* Detailed notes on 'todo' features
Selectable SystemC types based on widths (see notes below)
Statements:
verilator sc_type uint32_t {override specific I/O signals}
verilator sc_typedef width 1 bool
verilator sc_typedef width 2-32 uint32_t
verilator sc_typedef width 33-64 uint64_t
verilator sc_typedef width 65+ sc_bv<width> //<< programmable width
Replace `systemc_clock, --pins64
Have accessor function with type overloading to get or set the pin value.
(Get rid of getdatap mess?)
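The width-to-type table above could be expressed in C++ roughly as follows. This is a sketch under assumptions, not Verilator's implementation: PinType and Pin are hypothetical names, and std::bitset stands in for sc_bv<width> only so the example compiles without SystemC.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Stand-in for sc_bv<width>; an assumption made so the sketch is
// self-contained. A real build would name the SystemC type here.
template <std::size_t Width>
using WideStandIn = std::bitset<Width>;

// width 1 -> bool, 2-32 -> uint32_t, 33-64 -> uint64_t, 65+ -> wide type.
template <std::size_t Width>
using PinType = typename std::conditional<
    Width == 1, bool,
    typename std::conditional<
        (Width <= 32), uint32_t,
        typename std::conditional<
            (Width <= 64), uint64_t,
            WideStandIn<Width>>::type>::type>::type;

// Accessor pair whose parameter and return types follow the pin width.
template <std::size_t Width>
struct Pin {
    PinType<Width> value{};
    void set(const PinType<Width>& v) { value = v; }
    const PinType<Width>& get() const { return value; }
};

static_assert(std::is_same<PinType<1>, bool>::value, "width 1 is bool");
static_assert(std::is_same<PinType<32>, uint32_t>::value, "2-32 is uint32_t");
static_assert(std::is_same<PinType<33>, uint64_t>::value, "33-64 is uint64_t");
static_assert(std::is_same<PinType<65>, std::bitset<65>>::value, "65+ is wide");
```

A per-signal override, as in the "sc_type" statement above, would need an extra lookup keyed by signal name; that part is not sketched here.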
//**********************************************************************
//* Eventual tristate bus Stuff allowed (old verilator)
1) Tristate assignments must be continuous assignments
The RHS of a tristate assignment can be the following
a) a node (tristate or non-tristate)
b) a constant (must be all or no z's)
x'b0, x'bz, x{x'bz}, x{x'b0} -> are allowed
c) a conditional whose possible values are (a) or (b)
2) One can lose the fact that a node is a tristate node. This happens
if a tristate node is assigned to a 'standard' node, or is used on
RHS of a conditional. The following infer tristate signals:
a) inout <SIGNAL>
b) tri <SIGNAL>
c) assigning to 'Z' (maybe through a conditional)
Note: tristate-ness of an output port determined only by
statements in the module (not the instances it calls)
4) Tristate variables can't be multidimensional arrays
5) Only check tristate contention between modules (not within!)
6) Only simple compares with 'Z' are allowed (===)