() This is simply to clean up the code for cycle and timing
calculations, as there are quite a few unnecessary AND
operations.
() This also moves the cycle calculation closer to the print
statement, as a few calculations were being done between
grabbing the counter values.
() PRINT_STATS() now takes only two parameters, as the third
one was always CYCLES_TO_NS() of the second anyway.
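
For illustration only, a minimal sketch of the two-parameter form.
The clock rate (SKETCH_HZ), the format string, the macro bodies and
the helper sketch_elapsed_cycles() are assumptions made up for this
example; they are not the definitions in the tree.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 100 MHz counter, i.e. 10 ns per cycle (assumption). */
#define SKETCH_HZ 100000000ULL

/* Illustrative conversion; the real CYCLES_TO_NS() may differ. */
#define CYCLES_TO_NS(x) ((uint64_t)(x) * 1000000000ULL / SKETCH_HZ)

/*
 * Two-parameter form: the nanosecond value is derived from the
 * cycle count inside the macro, so callers no longer pass it.
 */
#define PRINT_STATS(name, cycles)                                  \
	printf("%-40s: %8llu cycles, %8llu ns\n", (name),          \
	       (unsigned long long)(cycles),                       \
	       (unsigned long long)CYCLES_TO_NS(cycles))

/*
 * Hypothetical helper: with 32-bit unsigned counters, plain
 * subtraction already wraps modulo 2^32, so no extra masking is
 * needed to get the elapsed count across a single rollover.
 */
static inline uint32_t sketch_elapsed_cycles(uint32_t start, uint32_t end)
{
	return end - start;
}
```

With this shape, a call site would compute the elapsed count right
before printing, e.g.
PRINT_STATS("example", sketch_elapsed_cycles(start, end)).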

Signed-off-by: Daniel Leung <daniel.leung@intel.com>