diff --git a/README.adoc b/README.adoc
index 395302e64..3e2eb528b 100644
--- a/README.adoc
+++ b/README.adoc
@@ -25,7 +25,7 @@ endif::[]
^.^| *Welcome to Verilator, the fastest free Verilog HDL simulator.*
+++
+++ • Accepts synthesizable Verilog or SystemVerilog
+++
+++ • Performs lint code-quality checks
-+++
+++ • Compiles into multithreaded {cpp}, SystemC, or (soon) {cpp}-under-Python
++++
+++ • Compiles into multithreaded {cpp} or SystemC
+++
+++ • Creates XML to front-end your own tools
<.^|image:https://www.veripool.org/img/verilator_256_200_min.png[Logo,256,200]
@@ -81,7 +81,7 @@ touch of {cpp} code, Verilator is the tool for you.
Verilator does not simply convert Verilog HDL to {cpp} or SystemC. Rather
than only translate, Verilator compiles your code into a much faster
optimized and optionally thread-partitioned model, which is in turn wrapped
-inside a {cpp}/SystemC/Python module. The results are a compiled Verilog
+inside a {cpp}/SystemC/{cpp}-under-Python module. The results are a compiled Verilog
model that executes even on a single-thread over 10x faster than standalone
SystemC, and on a single thread is about 100 times faster than interpreted
Verilog simulators such as http://iverilog.icarus.com[Icarus
diff --git a/bin/verilator b/bin/verilator
index 1ec4e722c..bb3dbb69b 100755
--- a/bin/verilator
+++ b/bin/verilator
@@ -1981,9 +1981,10 @@ Unfortunately, using the optimizer with SystemC files can result in
compiles taking several minutes. (The SystemC libraries have many little
inlined functions that drive the compiler nuts.)
-For best results, use GCC 3.3 or newer. GCC 3.2 and earlier have
-optimization bugs around pointer aliasing detection, which can result in 2x
-performance losses.
+For best results, use the latest clang compiler (about 10% faster than
+GCC). Note that the now fairly old GCC 3.2 and earlier have optimization bugs
+around pointer aliasing detection, which can result in 2x performance
+losses.
If you will be running many simulations on a single compile, investigate
feedback driven compilation. With GCC, using -fprofile-arcs, then
@@ -1994,6 +1995,9 @@ especially if you link in DPI code. To enable LTO on GCC, pass "-flto" in
both compilation and link. Note LTO may cause excessive compile times on
large designs.
+Using profile driven compiler optimization, with feedback from a real
+design, can yield up to 30% improvements.
+
If you are using your own makefiles, you may want to compile the Verilated
code with -DVL_INLINE_OPT=inline. This will inline functions, however this
requires that all cpp files be compiled in a single compiler run.
@@ -2004,7 +2008,7 @@ either oprofile or gprof to see where in the C++ code the time is spent.
Run the gprof output through verilator_profcfunc and it will tell you what
Verilog line numbers on which most of the time is being spent.
-When done, please let the author know the results. I like to keep tabs on
+When done, please let the author know the results. We like to keep tabs on
how Verilator compares, and may be able to suggest additional improvements.
@@ -2079,7 +2083,7 @@ After running Make, the C++ compiler may produce the following:
A generic Linux/OS variable specifying what directories have shared object
(.so) files. This path should include SystemC and any other shared objects
-needed at simultion runtime.
+needed at simulation runtime.
=item OBJCACHE