// DESCRIPTION: Verilator: List of To Do issues.
//
// Copyright 2004-2015 by Wilson Snyder. This program is free software; you can
// redistribute it and/or modify it under the terms of either the GNU
// Lesser General Public License Version 3 or the Perl Artistic License
// Version 2.0.


Language support:
	* Fix ordering of each bit separately in a signal (mips)
		assign b[3:0] = b[7:4];  assign b[7:4] = in;
	* Support UDP gate primitives/ cell libraries
		(have code for combos - problem is sequential udps)
	* Function to eval combo logic after /*verilator public*/ functions [gwaters]
	* Support generated clocks (correctness)
	* Real numbers
	* Recursive functions
	* Verilog configuration files
	* Structs/unions (have starting point)
	* DPI to define C/C++ calls from Verilog
	* Expression coverage (see notes)
	* Better tristate support

Long-term Features
	* Assertions
	* Tristate support
	* Multithreaded execution

Configure/Make/Install
	* Full MSVC++ compilation (does scons support this?) (4.000?)
	* Distribute with flex/bison already expanded?
	  Flex library not needed.  Probably too difficult to be worth it.

Testing:
	* Move test_c/sp/v/verilated into test_regress format (4.000?)
	* Capture all inputs into global "rerun it" file
	* Code to make wrapper that sets signals, so can do comparison checks
	* New random program generator
	* Better graph viewer with search and zoom
	* Port and test against opencores.org code
	* // verilator debug in code so can see only tree affecting those nodes

Usability:
	* Detect and pre-remove most UNOPTFLATs (4.000) 
	* Better reporting of unopt problems, including what lines of code
	* Report more errors (all of them?) before exiting [Eugene Weber]
	* Auto-create scons config files
	* Print version/etc message at runtime. (4.000?)
	  Include number of lines of code, percent comments, code complexity measurement
	  <-80chars------------------------------------------------------------------->
	  Verilator 3.600 - fast, free, open-sourced.  Copyright 2001-2013.
	  Verilated #### modules, #### instances, ##### sigs,
	  	    #### non-comment lines, ##### ops, ### KB model size
	* Default the --l2name to remove extra "v" level of hierarchy (flag to make "top")

Lint:
	* CDCRSTLOGIC should allow filtering with paths
	  "waive CDCRSTLOGIC --from a.b.sig  --to a.c.sig --via OR"

Internal Code:
	* Eliminate the AstNUser* passed to all visitors; its only needed in V3Width,
	  and removing it will speed up and simplify all the other code.
	* V3Graph should be templated container type, taking in Vertex + Edge types
	* Rename V3PreLex etc to match VerilogPerl filenames
	* Instead of string, have an VEncodedString/VIdString which contains __DOT__ish
	  things, to reduce bugs.  Also add _20 trailing space to \ encoded names. (4.000)

Runtime:
	* New evalulation loop   ~/src/verilator/notes/event_loop.txt (4.000?)
	* Remove all private internal functions from top level wrapper header, move
	  to new level (4.000?)
	* Completely standalone simulation (4.000)
	   main() records arguments for $test$plusvars
	   instantiates top,
	   does tracing  (support $dump?)
	   calls top->simulateForever()
	   exits

Performance:
	* Latch optimizations
	* Constant propagation
		Extra cleaning AND:  1 & ((VARREF >> 1) | ((&VARREF >> 1) & VARREF))
		Extra shift (perhaps due to clean): if (1 & CAST (VARREF >> #))
	* Gated clock and latch conversion to flops.  [JeanPaul Vanitegem]
		Could propagate the AND into pos/negedges and let domaining optimize.
	* Negedge reset
		Switch to remove negedges that don't matter
		Can't remove async resets from control flops (like in syncronizers)
	* If all references to array have a constant index, blow up into separate signals-per-index
	* Bit-multiply for faster bit swapping and a=b[1,3,2] random bit reorderings.
	* Move _last sets and all other combo logic inside master
		if() that triggers on all possible sense items
	* Rewrite and combine V3Life, V3Subst
		If block temp only ever set in one place to constant, propagate it
			Used in t_mem for array delayed assignments
		Replace variables if set later in same cfunc branch
			See for example duplicate sets of _narrow in cycle 90/91 of t_select_plusloop
	* Same assignment on both if branches
		"if (a) { ... b=2; } else { ... b=2;}" -> "b=2; if ..."
		Careful though, as b could appear in the statement or multiple times in statement
		(Could just require exatly two 'b's in statement)
	* Simplify XOR/XNOR/AND/OR bit selection trees
		Foo = A[1] ^ A[2] ^ A[3] etc are better as ^ ( A & 32'b...1110 )
	* Combine variables into wider elements
		Parallel statements on different bits should become single signal
		Variables that are always consumed in "parallel" can be joined
	* Duplicate assignments in gate optimization
		Common to have many separate posedge blocks, each with identical
		reset_r <= rst_in
	* If signal is used only once (not counting trace), always gate substitute
		Don't merge if any combining would form circ logic (out goes back to in)
	* Multiple assignments each bit can become single assign with concat
		Make sure a SEL of a CONCAT can get the single bit back.
	* Usually blocks/values
		Enable only after certain time, so VL_TIME_I(32) > 0x1e gets eliminated out
	* Better ordering of a<=b, b<=c, put all refs to 'b' next to each other to optimize caching
	* Allow Split of case statements without a $display/$stop
	* I-cache packing improvements (what/how?)
	* Data cache organization (order of vars in class)
		First have clocks,
		then bools instead of uint32_t's
		then based on what sense list they come from, all outputs, then all inputs
		finally have any signals part of a "usually" block, or constant.
	* Rather then tracking widths, have a MSB...LSB of this expression
		(or better, a bitmask of bits relevant in this expression)
	* Track recirculation and convert into clock-enables
	* Clock enables should become new clocking domains for speed
	* If floped(a) & flopped(b) and no other a&b, then instead flop(a&b).
	* Sort by output bitselects so can combine more assignments (see DDP example dx_dm signal)