------------------------------------------------------

    Memo on SimpleScalar 2.0 for cse586

------------------------------------------------------


00. Table of Contents

0.   Preliminaries
0.1  Directories and Files
I.   Introduction
II.  Using SimpleScalar (basic)
III. Using SimpleScalar (advanced)
IV.  SimpleScalar for Assignment 2

0. Preliminaries

The SimpleScalar home page is:

  http://www.cs.wisc.edu/~mscalar/simplescalar.html

I've downloaded a technical report that gives an overview of the
system and placed it in the course directory at

 /cse/courses/cse586/01au/ss2.0/doc/TR_1342.ps

I'll refer to this technical report frequently in this memo.

I assume working knowledge of unix: familiarity with shells, input and
output redirection, and the command-line environment.  The examples
here are in bash; users of other shells should translate.  If there's
large demand for a unix tutorial I'll put one up.

0.1  Directories and Files

All the paths I reference here are relative to the course directory.
This directory is visible from the instructional servers: fiji,
ceylon, sumatra and tahiti, all off cs.washington.edu.

/cse/courses/cse586/01au/              --- root of course directory
                         bin/          --- binaries, e.g. sim-outorder
                                                and 586-setup-bash
                         bin-ss/       --- SimpleScalar binaries,
                                                e.g. li.ss
                         etc/          --- configuration files, 
                                                e.g. hw2.cfg
                         src/          --- input files for the benchmarks, 
                                                organized by program.
                         ss2.0/        --- the SimpleScalar 2.0 tree.
                                            Contains cross-compiler
                                            and all tools.


I. Introduction

SimpleScalar is a processor simulator.  That is, it is a program that
runs on one platform (e.g., x86) and executes binaries for another
processor (e.g., MIPS).  A program run on the simulator should behave
the "same" as it would on the simulator's target platform.  The trick
is to define what you mean by "same".

The usual trade-off when writing a simulator is accuracy vs. speed.  If
your criterion of same-ness simply considers program output, you can
write a simple simulator that runs fast, but cannot produce detailed
statistics on, for example, cache or pipeline performance.  On the
other hand, if your criterion of same-ness extends to how instructions
move through the pipeline, the simulator can produce detailed pipeline
statistics, but it will run much more slowly.

Because simulators are used for such varied purposes, no single point
in this trade-off suits every use.  Accordingly, the SimpleScalar tool
set provides several simulators, each at a different point in this
trade-off.  The fastest, sim-fast, executes instructions at about
4 MHz (roughly 4 million simulated instructions per second), but
guarantees nothing more than serial execution of the instructions.
The slowest, sim-outorder, runs at about 150 kHz, but simulates most
aspects of a chip, including an out-of-order execution pipeline and
branch prediction.  An intermediate simulator, sim-cache, accurately
simulates cache behavior but is not cycle accurate.

The simulators in SimpleScalar implement a MIPS-like instruction set
and chip design.  The instruction set is detailed in the technical
report.  One interesting quirk is the 64-bit instruction format, which
has 16 spare bits intended to be used for poke extensions; the
simulator is meant to be used for heavy experimentation.


II. Using SimpleScalar (basic)

I've put sim-outorder, the simulator to be used with assignment 2, in
the bin/ directory of the course directory (/cse/courses/cse586/01au).
Also in the bin/ directory is 586-setup-bash.  Sourcing this sets up
your environment; among other things it sets $CDIR, which is used in
section IV.  If there's demand, I'll put up a csh version as well.

Full documentation on running the simulator is found in the technical
report.  Simulator parameters are set by command-line flags; as
specifying long lists of flags can get tedious, the simulator can also
read command-line flags from a configuration file, using the -config
<file> switch.  
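
As a rough sketch of the idea, a configuration file is just a list of
flags kept in a file.  The example below assumes the file holds one
flag per line, spelled as on the command line, with # starting a
comment; check the technical report for the exact syntax.  The file
name my.cfg is made up for illustration, and the -bpred flag is
described in section IV.

```shell
# Write a small configuration file (my.cfg is a hypothetical name).
# Assumed syntax: one flag per line, '#' starts a comment.
cat > my.cfg <<'EOF'
# use the bimodal branch predictor
-bpred bimod
EOF

# The simulator would then be run as (not executed here):
#   sim-outorder -config my.cfg bin-ss/perl.ss -e 'print "hello\n";'
```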

You must also specify what program to execute on the simulator.  This
is done by specifying it last on the command line.  As is standard for
this sort of thing, any flags to be passed to the program are appended
to the command line.  For example,

 sim-outorder bin-ss/perl.ss -e 'print "hello\n";'

executes the file perl.ss with sim-outorder, with perl.ss getting the
flags -e 'print "hello\n";'.  

Remember that files executed by the simulator are binary files
containing machine code for the simulator instruction set.  Thus, for
example, you cannot execute an x86 binary using sim-outorder.  You
must use a binary cross-compiled to the SimpleScalar architecture
instead.  Binaries that will be used in this course are found in
bin-ss/ in the course directory.

Note that statistics are written to stderr.  Thus it's useful to do
something like

 sim-outorder bin-ss/perl.ss -e 'print "hello\n";' 2> sim-results

to save the statistics into sim-results for later analysis.
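
Once the statistics are in a file, individual values can be pulled
out with standard tools.  The following is a minimal sketch, assuming
each statistic appears on its own line as a name, a value, and a #
comment.  The helper get_stat and the statistic name sim_IPC are
assumptions for illustration, so check your own sim-results for the
actual names.

```shell
# get_stat NAME FILE: print the value of the statistic called NAME.
# Assumes lines of the form:  NAME  VALUE  # description
get_stat() {
    awk -v name="$1" '$1 == name { print $2; exit }' "$2"
}

# Example use (sim_IPC is an assumed statistic name):
#   get_stat sim_IPC sim-results
```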


III. Using SimpleScalar (advanced)

I've built the complete SimpleScalar 2.0 tool-set at ss2.0/ in the
course directory.  There are two flavors of the simulator, one
big-endian and the other little-endian.  The simulator apparently
doesn't work well when its endian-ness doesn't match the host
platform; therefore I've built the little-endian version of the
tools.  This is signified by the word "little" or "sslittle" in the
tools or directories.

The SimpleScalar environment is based on the GNU binutils tool-set, all
compiled to be cross-platform, with the host being x86 and the target
sslittle.  Cross-platform binutils and compiler/assembler/loader are
located in ss2.0/bin (with long, fully descriptive names), or in
sslittle-na-sstrix/bin (with the usual short names).

The more adventurous students can use the tools to compile C-code to
the SimpleScalar platform for simulation.
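
A sketch of what that might look like follows.  The compiler name
sslittle-na-sstrix-gcc is an assumption based on the sslittle naming
mentioned above, and hello.c and hello.ss are made-up file names;
list ss2.0/bin to see what is actually installed.  The script only
attempts the compile if it finds the compiler.

```shell
# Hypothetical example: cross-compile a small C program for the simulator.
# CDIR is assumed to point at the root of the course directory.
CDIR=${CDIR:-/cse/courses/cse586/01au}
SSCC="$CDIR/ss2.0/bin/sslittle-na-sstrix-gcc"   # assumed compiler name

cat > hello.c <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("hello\n");
    return 0;
}
EOF

if [ -x "$SSCC" ]; then
    "$SSCC" -O -o hello.ss hello.c      # produce a SimpleScalar binary
    echo "built hello.ss; run it with: sim-outorder hello.ss"
else
    echo "cross-compiler not found at $SSCC" >&2
fi
```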


IV. SimpleScalar for Assignment 2

A configuration file that contains flags suitable for the first part
of assignment 2 can be found at etc/hw2.cfg in the course directory.
Read through hw2.cfg and make sure you understand it.  It sets up the
caches and the 2-level branch predictor as directed.  Note that even
though the parameters for the 2-level predictor are specified, the
configuration file selects the bimodal predictor as the one to use.
You can change this by using the -bpred flag on the command line
(command line flags take precedence over configuration file settings).

You've been directed to use bin-ss/cc1.ss as the test program.  This
program takes input on stdin (the input is just a preprocessed C
file).  Assembly code is output on stdout.  The SPEC95 inputs are
located in src/cc1.  Here is an example (split across multiple lines
for formatting purposes only):

 sim-outorder -config $CDIR/etc/hw2.cfg -bpred 2lev $CDIR/bin-ss/cc1.ss \
                 < $CDIR/src/cc1/cexp.i > output.s 2>sim-output

$CDIR is a variable set by 586-setup-bash to be the root of the course
directory tree.  The -config flag reads in hw2.cfg, -bpred 2lev sets
the branch prediction to be 2-level and cc1.ss is specified as the
binary to execute.  stdin is redirected to cexp.i, stdout to output.s,
and stderr to sim-output.  All the useful statistics are located in
sim-output.
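
To compare predictors, the same run can be repeated with only the
-bpred value changed.  The loop below is a sketch: the function name
sweep_bpred, the DRYRUN variable, and the per-predictor output file
names are all made up for illustration.  By default it only prints
the commands; set DRYRUN=0 to actually run them.

```shell
# CDIR is assumed to point at the root of the course directory.
CDIR=${CDIR:-/cse/courses/cse586/01au}

# sweep_bpred: run cc1.ss once per predictor, keeping the outputs apart.
# With DRYRUN=1 (the default here) the commands are printed, not run.
sweep_bpred() {
    for pred in bimod 2lev; do
        if [ "${DRYRUN:-1}" = 1 ]; then
            echo "sim-outorder -config $CDIR/etc/hw2.cfg -bpred $pred" \
                 "$CDIR/bin-ss/cc1.ss < $CDIR/src/cc1/cexp.i" \
                 "> output-$pred.s 2> sim-output-$pred"
        else
            sim-outorder -config "$CDIR/etc/hw2.cfg" -bpred "$pred" \
                "$CDIR/bin-ss/cc1.ss" \
                < "$CDIR/src/cc1/cexp.i" > "output-$pred.s" 2> "sim-output-$pred"
        fi
    done
}

sweep_bpred
```

Keeping one statistics file per predictor makes it easy to compare
the runs afterwards.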

Note that "-bpred bimod" is the switch to use to get 2-bit dynamic
branch prediction.  "bimod" is short for bimodal, which refers to the
fact that 2-bit dynamic prediction moves between two predictions
(modes): taken and not taken.
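
For intuition, the textbook 2-bit scheme can be sketched in a few
lines of shell arithmetic.  This illustrates the general technique,
not SimpleScalar's implementation, and the function names are made
up.  The counter ranges over 0..3, predicts taken when it is 2 or 3,
and moves one step toward each actual outcome.

```shell
# predict COUNTER: print "taken" if the 2-bit counter is 2 or 3.
predict() {
    if [ "$1" -ge 2 ]; then
        echo taken
    else
        echo not-taken
    fi
}

# update COUNTER OUTCOME: print the new counter value.
# OUTCOME is 1 (branch was taken) or 0 (not taken).
# The counter saturates at 0 and 3.
update() {
    c=$1
    if [ "$2" -eq 1 ]; then
        if [ "$c" -lt 3 ]; then c=$((c + 1)); fi
    else
        if [ "$c" -gt 0 ]; then c=$((c - 1)); fi
    fi
    echo "$c"
}

# Start at 3 (strongly taken) and observe one not-taken branch.
c=$(update 3 0)
predict "$c"    # still "taken": a single miss does not flip the prediction
```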

BTFN (backward taken, forward not taken) prediction, a.k.a. "static
combined", has been added to SimpleScalar.  Access it using
"-bpred static_comb".
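
Under this scheme the prediction depends only on the direction of the
branch: a branch whose target is at a lower address than the branch
itself (typically the bottom of a loop) is predicted taken, and any
other branch is predicted not taken.  The sketch below illustrates
the rule only; it is not the code added to the simulator, and the
function name and addresses are made up.

```shell
# btfn PC TARGET: print the static prediction for a branch at address PC
# that jumps to address TARGET.
btfn() {
    if [ $(($2)) -lt $(($1)) ]; then
        echo taken          # backward branch
    else
        echo not-taken      # forward branch
    fi
}

btfn 0x400100 0x4000f0      # target below the branch: backward
btfn 0x400100 0x400120      # target above the branch: forward
```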