// DESCRIPTION: Verilator: List of To Do issues. // // Copyright 2004-2009 by Wilson Snyder. This program is free software; you can // redistribute it and/or modify it under the terms of either the GNU // General Public License or the Perl Artistic License. Features: Finish 3.400 new ordering fixes Latch optimizations {Need here} Task I/Os connecting to non-simple variables. Fix nested casez statements expanding into to huge C++. [JeanPaul Vanitegem] Fix FileLine memory inefficiency Fix ordering of each bit separately in a signal (mips) assign b[3:0] = b[7:4]; assign b[7:4] = in; Support gate primitives/ cell libraries from xilinx, etc Assign dont_care value to an 1'bzzz assignment Function to eval combo logic after /*verilator public*/ functions [gwaters] Support generated clocks (correctness) ?gcov coverage Selectable SystemC types based on widths (see notes below) Compile time Inline first level trace routines Coverage Points should be per-scope like everything else rather then per-module Expression coverage (see notes) Constant functions for widths, etc, IE "input [log2(PARAM):0] xx;" More Verilog 2001 Support (* *) Attributes (just ignore -- preprocessor?) Real numbers (NEVER) Recursive functions (NEVER) Verilog configuration files (NEVER) DPI to define C/C++ calls from Verilog Long-term Features Assertions VHDL parser [Philips] Tristate support SystemPerl integration Multithreaded execution Front end parser integration into Verilog-Perl like Netlist Testing: Capture all inputs into global "rerun it" file Code to make wrapper that sets signals, so can do comparison checks New random program generator Better graph viewer with search and zoom Port and test against opencores.org code Usability: Better reporting of unopt problems, including what lines of code Report more errors (all of them?) before exiting [Eugene Weber] Internal Code: Eliminate the AstNUser* passed to all visitors; its only needed in V3Width, and removing it will speed up and simplify all the other code. V3Graph should be templated container type, taking in Vertex + Edge types Performance: Constant propagation Extra cleaning AND: 1 & ((VARREF >> 1) | ((&VARREF >> 1) & VARREF)) Extra shift (perhaps due to clean): if (1 & CAST (VARREF >> #)) Gated clock and latch conversion to flops. [JeanPaul Vanitegem] Could propagate the AND into pos/negedges and let domaining optimize. Negedge reset Switch to remove negedges that don't matter Can't remove async resets from control flops (like in syncronizers) If all references to array have a constant index, blow up into separate signals-per-index Multithreaded execution Bit-multiply for faster bit swapping and a=b[1,3,2] random bit reorderings. Move _last sets and all other combo logic inside master if() that triggers on all possible sense items Rewrite and combine V3Life, V3Subst If block temp only ever set in one place to constant, propagate it Used in t_mem for array delayed assignments Replace variables if set later in same cfunc branch See for example duplicate sets of _narrow in cycle 90/91 of t_select_plusloop Same assignment on both if branches "if (a) { ... b=2; } else { ... b=2;}" -> "b=2; if ..." Careful though, as b could appear in the statement or multiple times in statement (Could just require exatly two 'b's in statement) Simplify XOR/XNOR/AND/OR bit selection trees Foo = A[1] ^ A[2] ^ A[3] etc are better as ^ ( A & 32'b...1110 ) Combine variables into wider elements Parallel statements on different bits should become single signal Variables that are always consumed in "parallel" can be joined Duplicate assignments in gate optimization Common to have many separate posedge blocks, each with identical reset_r <= rst_in *If signal is used only once (not counting trace), always gate substitute Don't merge if any combining would form circ logic (out goes back to in) Multiple assignments each bit can become single assign with concat Make sure a SEL of a CONCAT can get the single bit back. Usually blocks/values Enable only after certain time, so VL_TIME_I(32) > 0x1e gets eliminated out Better ordering of a<=b, b<=c, put all refs to 'b' next to each other to optimize caching Allow Split of case statements without a $display/$stop I-cache packing improvements (what/how?) Data cache organization (order of vars in class) First have clocks, then bools instead of uint32_t's then based on what sense list they come from, all outputs, then all inputs finally have any signals part of a "usually" block, or constant. Rather then tracking widths, have a MSB...LSB of this expression (or better, a bitmask of bits relevant in this expression) Track recirculation and convert into clock-enables Clock enables should become new clocking domains for speed If floped(a) & flopped(b) and no other a&b, then instead flop(a&b). Sort by output bitselects so can combine more assignments (see DDP example dx_dm signal) All of the temp vars that get set, exp pre_ vars and never feedback (not flops) don't need to be stored in the structs, but instead can be per-invocation, and even better register-colored-like to reuse the space. This will greatly reduce the data footprint. //********************************************************************** //* Detailed notes on 'todo' features Selectable SystemC types based on widths (see notes below) Statements: verilator sc_type uint32_t {override specific I/O signals} verilator sc_typedef width 1 bool verilator sc_typedef width 2-32 uint32_t verilator sc_typedef width 33-64 uint64_t verilator sc_typedef width 65+ sc_bv //<< programmable width Replace `systemc_clock, --pins64 Have accessor function with type overloading to get or set the pin value. (Get rid of getdatap mess?) //********************************************************************** //* Eventual tristate bus Stuff allowed (old verilator) 1) Tristate assignments must be continuous assignments The RHS of a tristate assignment can be the following a) a node (tristate or non-tristate) b) a constant (must be all or no z's) x'b0, x'bz, x{x'bz}, x{x'b0} -> are allowed c) a conditional whose possible values are (a) or (b) 2) One can lose that fact that a node is a tristate node. This happens if a tristate node is assigned to a 'standard' node, or is used on RHS of a conditional. The following infer tristate signals: a) inout b) tri c) assigning to 'Z' (maybe through a conditional) Note: tristate-ness of an output port determined only by statements in the module (not the instances it calls) 4) Tristate variables can't be multidimensional arrays 5) Only check tristate contention between modules (not within!) 6) Only simple compares with 'Z' are allowed (===)