CSE331 Autumn 2011
Software Design and Implementation
|
Assignment
4 (10%): Due Wednesday November 9, 2011
at 11:59PM |
This assignment was motivated by a number of questions from students (on
Piazza, in office hours, etc.) and some answers to some of the midterm
questions. Alert: It's a seat-of-the-pants definition of the assignment, so
it's possible there are gaps in it. For anyone who finds it
straightforward, there is an optional part, but it is not part of the grade.
Program Description
One question on the midterm was about random test generation. A number of
answers said things lik
- The randomly generated array length might not be consistent with the
number of values in the array.
- The randomly generated array might not be sorted.
- Random keys are much more likely to be not found than to be found.
- There's no way to determine the oracle.
We will try to address these issues in this assignment. (There are some
other problems we will not address in this assignment including making sure to
cover edge cases and the ability to repeat random tests deterministically.)
There are also parts intended to enhance your understanding of abstraction
functions and representation invariants, and perhaps a bit on design
patterns.
Here's the basic idea. You will write a program that generates legal test
cases for binary search, specifically the Java 6.0 binarySearch
over int arrays, which simply takes a sorted int array and a key. How will you
address the four issues above?
- Length and number of values must be consistent: A
straightforward way to handle this is to generate a length (within some
bounds, using some distribution) and then create an array of that size for
which you will generate random values.
- You should select a length for the array between 0 and 1000
(inclusive); you should select values for the ints in the array between
[-10000,10000] (inclusive).
- These values should be easy to change (at least using constants and
optionally using a configuration file).
- Array must be sorted. There are two ways to do this. The first
is to sort a randomly generated array. The second is to generate the array
in a way so that it is sorted. You will implement the second way, which
itself has two approaches. One, you can ensure that your code never
generates an int less than the most-recently generated int; this is
straightforward. Two, you can generate ints in any order, and insert them
into a data structure that can easily be traversed to produce a sorted
array. You will implement the second way.
- You will use a binary search tree (BST). This is not to be
mistaken for a binary search algorithm, although there are some
commonalities. Here is the wikipedia entry on
BST. And an applet that
animates a BST might help you (I recommend using the random button to
play with it at first). You will not have to be concerned about
deletion of nodes nor about any destructive updates (for saving space,
rebalancing, etc. -- you will have to update node links as described in
a bullet below). The basic idea of a BST is to create a binary tree
with the following properties.
- Each node in the tree has an int value and two children, a left
child and a right child.
- The value of a node has to be greater than or equal to the value
of its left child, and less than or equal to the value of the right
child.
- Elements are inserted sequentially -- the first value goes in the
top node of the tree (there are different ways to handle this) with
two null children. After that, a value is inserted by comparing it
to the top node, determining if it is less than or equal to the
value of the node -- in which case it is recursively inserted in
the tree represented by the left child -- or greater than or equal
to the value of the node -- in which case it is recursively
inserted in the tree represented by the right child. When a null
node is encountered, a null node is created, filled with the
inserted value, and linked in appropriately to the existing
BST.
- There are plenty of BST implementations in Java on the web, which you
can look at. Here is one
that I haven't fully vetted but looks fine (except I'd prefer traversal
to use the visitor pattern explicitly). You can base your own
implementation on one of these, but you must give appropriate
credit. And make sure that your implementation doesn't include
methods, etc. that you do not need for this assignment. (This is one
way we will be able to tell if you took an existing implementation,
didn't cite it, and didn't understand it enough to modify it. That
would be bad :-).)
- Good distribution of found/not found keys. Once you generate the
sorted array, you can use any distribution you wish to select a key that is
found or not found. For instance, if you want 10% of the tests to have keys
that should be found, roll a 10-sided die and use it to decide whether to
randomly select one of the elements in the array or to randomly select an
int not in the array.
- Like #1, the percentage of found elements should be easy to change.
- No oracle. Once you've handled the previous point, you've
essentially created an oracle -- you know if it should be found or not
found, and if it's found you know the index. (This is a tad tricker than it
looks, because if there are multiple occurrences of the key, you might
choose one and the binarySearch might choose the other.)
A few more things:
- You must generate runnable JUnit tests and store them into a file named
BinarySearchRandomTests.java
. (Although this is a bit of a
pain, it allows you to more easily test your program by running the
generated tests on binarySearch.)
- You might decide to first generate the basics, and then work on the
formatting into JUnit tests. You can copy-and-paste the tests into
Eclipse.
- You should generate a complete test class -- imports, etc., so that
copy-and-paste will "just work."
- Your program must accept a command-line parameter indicating how many
JUnit tests you are to generate during a single run of your program.
- You must be explicit about the representation invariant for the BST and
must implement it as
repOK()
. You will want this to ensure
that you have a legal BST.
- You must be explicit about the abstraction function for the BST -- this
is somewhat easier than the rep invariant, but still takes a little
attention. This appears in documentation, not in the code per se.
- You must write tests to test your program carefully.
Warnings
- Remember, this is not necessary a well-designed assignment. This suggests
-- more strongly than usual -- that you start very early.
- You have to keep your mind clear, especially in the following dimensions
binarySearch
(which you are testing) vs. binary
search trees (which you are using in your implementation)
- Tests you are generating for
binarySearch
vs. tests of
your program that generates those tests.
Optional Options
If anybody wants additional challenges, here are some options. (Again,
not for credit. So make sure you have the base program right first, and then
work from a copy of that.)
- Support the ability to rerun a test deterministically by storing the
values generated for a test (array, key, oracle). Depending on how you
expect this feature to be used, it could be implemented with a pattern.
- Figure out a clean way to increase the likelihood of generating some edge
cases -- for instance, perhaps skew the distributions. (This could be used
to make some specific sized arrays more/less likely, and to make some
placements of found keys more/less likely, etc.)
- Make it more generic -- this may be tricky with respect to generating the
random elements.