// DESCRIPTION: Verilator: List of To Do issues. // // Copyright 2004-2014 by Wilson Snyder. This program is free software; you can // redistribute it and/or modify it under the terms of either the GNU // Lesser General Public License Version 3 or the Perl Artistic License // Version 2.0. Language support: * Fix ordering of each bit separately in a signal (mips) assign b[3:0] = b[7:4]; assign b[7:4] = in; * Support UDP gate primitives/ cell libraries (have code for combos - problem is sequential udps) * Function to eval combo logic after /*verilator public*/ functions [gwaters] * Support generated clocks (correctness) * Real numbers * Recursive functions * Verilog configuration files * Structs/unions (have starting point) * DPI to define C/C++ calls from Verilog * Expression coverage (see notes) * Better tristate support Long-term Features * Assertions * Tristate support * Multithreaded execution Configure/Make/Install * Full MSVC++ compilation (does scons support this?) (4.000?) * Distribute with flex/bison already expanded? Flex library not needed. Probably too difficult to be worth it. * Integrate SystemPerl coverage see the SystemPerl git branch coverage_only (Note in /usr/include there are no upper cased include files.) Coverage.pm -- Need all functionality, but in C? Coverage/Item.pm -- Need all functionality, but in C? Coverage/ItemKey.pm -- Need all functionality, but in C? sp_preproc -- Some steps in here need to be moved to generated C -- -- note uses Verilog::Getopt src/Sp.cpp -- n/a src/SpCommon.h -- mostly overlaps verilatedos.h src/SpCoverage.cpp/h -- All needed src/SpFunctor.cpp/h -- No longer used src/SpTraceVcd.cpp/h -- MOVED src/SpTraceVcdC.cpp/h -- MOVED src/sp_log.cpp/h -- Not needed src/systemperl.h -- some stuff may be cut vcoverage -- Need all functionality, but in C? Testing: * Move test_c/sp/v/verilated into test_regress format (4.000?) * Capture all inputs into global "rerun it" file * Code to make wrapper that sets signals, so can do comparison checks * New random program generator * Better graph viewer with search and zoom * Port and test against opencores.org code Usability: * Detect and pre-remove most UNOPTFLATs (4.000) * Better reporting of unopt problems, including what lines of code * Report more errors (all of them?) before exiting [Eugene Weber] * Auto-create scons config files * Print version/etc message at runtime. (4.000?) Include number of lines of code, percent comments, code complexity measurement <-80chars-------------------------------------------------------------------> Verilator 3.600 - fast, free, open-sourced. Copyright 2001-2013. Verilated #### modules, #### instances, ##### sigs, #### non-comment lines, ##### ops, ### KB model size * Default the --l2name to remove extra "v" level of hierarchy (flag to make "top") Lint: * CDCRSTLOGIC should allow filtering with paths "waive CDCRSTLOGIC --from a.b.sig --to a.c.sig --via OR" Internal Code: * Eliminate the AstNUser* passed to all visitors; its only needed in V3Width, and removing it will speed up and simplify all the other code. * V3Graph should be templated container type, taking in Vertex + Edge types * Rename V3PreLex etc to match VerilogPerl filenames * Instead of string, have an VEncodedString/VIdString which contains __DOT__ish things, to reduce bugs. Also add _20 trailing space to \ encoded names. (4.000) Runtime: * New evalulation loop ~/src/verilator/notes/event_loop.txt (4.000?) * Remove all private internal functions from top level wrapper header, move to new level (4.000?) * Completely standalone simulation (4.000) main() records arguments for $test$plusvars instantiates top, does tracing (support $dump?) calls top->simulateForever() exits Performance: * Latch optimizations * Constant propagation Extra cleaning AND: 1 & ((VARREF >> 1) | ((&VARREF >> 1) & VARREF)) Extra shift (perhaps due to clean): if (1 & CAST (VARREF >> #)) * Gated clock and latch conversion to flops. [JeanPaul Vanitegem] Could propagate the AND into pos/negedges and let domaining optimize. * Negedge reset Switch to remove negedges that don't matter Can't remove async resets from control flops (like in syncronizers) * If all references to array have a constant index, blow up into separate signals-per-index * Bit-multiply for faster bit swapping and a=b[1,3,2] random bit reorderings. * Move _last sets and all other combo logic inside master if() that triggers on all possible sense items * Rewrite and combine V3Life, V3Subst If block temp only ever set in one place to constant, propagate it Used in t_mem for array delayed assignments Replace variables if set later in same cfunc branch See for example duplicate sets of _narrow in cycle 90/91 of t_select_plusloop * Same assignment on both if branches "if (a) { ... b=2; } else { ... b=2;}" -> "b=2; if ..." Careful though, as b could appear in the statement or multiple times in statement (Could just require exatly two 'b's in statement) * Simplify XOR/XNOR/AND/OR bit selection trees Foo = A[1] ^ A[2] ^ A[3] etc are better as ^ ( A & 32'b...1110 ) * Combine variables into wider elements Parallel statements on different bits should become single signal Variables that are always consumed in "parallel" can be joined * Duplicate assignments in gate optimization Common to have many separate posedge blocks, each with identical reset_r <= rst_in * If signal is used only once (not counting trace), always gate substitute Don't merge if any combining would form circ logic (out goes back to in) * Multiple assignments each bit can become single assign with concat Make sure a SEL of a CONCAT can get the single bit back. * Usually blocks/values Enable only after certain time, so VL_TIME_I(32) > 0x1e gets eliminated out * Better ordering of a<=b, b<=c, put all refs to 'b' next to each other to optimize caching * Allow Split of case statements without a $display/$stop * I-cache packing improvements (what/how?) * Data cache organization (order of vars in class) First have clocks, then bools instead of uint32_t's then based on what sense list they come from, all outputs, then all inputs finally have any signals part of a "usually" block, or constant. * Rather then tracking widths, have a MSB...LSB of this expression (or better, a bitmask of bits relevant in this expression) * Track recirculation and convert into clock-enables * Clock enables should become new clocking domains for speed * If floped(a) & flopped(b) and no other a&b, then instead flop(a&b). * Sort by output bitselects so can combine more assignments (see DDP example dx_dm signal)