CSE143 Winter 2005 Project 3

Deep Aqua, optimal Tic-Tac-Toe player

Part A due: Saturday, March 5, at 9:00 pm. Assignments may be turned in up to 24 hours late without penalty; however, you should try to finish part A by the original deadline.

Part B due: Friday, March 11 at 11 am. No late assignments will be accepted.

Goals

The primary goal of this project is to give you experience using recursion, and more generally, algorithms -- systematic procedures for accomplishing a task.  You'll also get some more practice in splitting up a problem into convenient classes, and a bit of reinforcement on putting together a simple GUI for user interaction.

Background

The bread-and-butter of classical artificial intelligence is search -- evaluate all the options for what to do and pick the best.  Often, a task to be accomplished has multiple steps, and multiple options at each step.  In that case, the search procedure starts at the beginning, and considers each option for the first step, to evaluate the benefit of taking that step in helping complete the task.  But to evaluate a given first step, the search has to examine all the options for the second step that follow from that first step, and see how well they would do.  And to evaluate an option for the second step, it has to check out all possible third steps that follow from that second step.  And so on, until at some point, a sequence of steps reaches a point where the task is either accomplished or failed, or, more generally, some final state is reached; that state has some "value" as far as accomplishing the task goes.  After evaluating all possible paths, there will be one (or several) paths with the highest value -- any of those will do.

As you might imagine, examining all possible sequences of steps can take quite a bit of time.  In fact, for all but the simplest tasks, there just isn't enough time to check out all possibilities.  Just think about how many cases we'd need to check:  Say that, for each step, there are k options, and say that we expect to find a final state after n steps.  How many sequences are there?  Well, there are k choices for the first step, and for each of those there are k choices for the second step, so that's k^2 so far.  For each of those k^2 pairs of first-and-second steps, there are k third steps -- now we're up to k^3.  For all n steps, we'll need to try k^n sequences.  That is, the complexity of this sort of search is O(k^n) -- this is exponential in the expected number of steps, which can very rapidly turn into "not before the sun burns out".
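To get a feel for how fast k^n grows, here's a tiny sketch (the particular k and n are made up for illustration):

```java
// How many sequences must a full search examine?  k options per step,
// n steps -- the values below are just for illustration.
class SearchCost {
    public static void main(String[] args) {
        int k = 4, n = 20;               // 4 options per step, 20 steps
        double sequences = Math.pow(k, n);
        System.out.println(sequences);   // roughly 1.1e12 -- already huge
    }
}
```

Even these modest-looking numbers give about 1.1 trillion sequences; tic-tac-toe, fortunately, is much smaller.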

Nevertheless, we're going to try it on a very small problem -- playing tic-tac-toe.  In playing a game like this, the "task" is to make moves until we get to the end of the game, and the value is our final score.  For tic-tac-toe, we can win, draw, or lose -- win has the highest value, then draw, then lose.  (By tradition, we give a positive score (e.g. 1) to a win, nothing (0) for a draw, and a negative score (e.g. -1) for a loss.)  So when we assign a value to a sequence of moves, what matters is what's at the end -- the value of the whole sequence of moves is our score at the end.

If we're playing tic-tac-toe (or any other two-player game), we control only our own moves.  So if we're going to search for the best move, we can imagine picking each of our moves in turn, but the usefulness of those moves depends on what our opponent does, so if we don't know what they're going to pick, how can we possibly evaluate our own moves?  Well, we can imagine what we'd do, if we were the opponent -- we can make their moves for them, in the hypothetical games we're evaluating.  So, do we want them to make random moves?  Or be an average player?  Or should we assume they're going to mess up?  No!  Selecting a move for our opponent that is less than the best they could do is wishful thinking on our part, and is risky, because the real opponent won't want to give us any breaks -- they're going to do the best they can, and we have no reason to expect they won't choose their best move.  So, when we play for them, we should do just that -- choose their best move against us.

How do we figure out what would be our opponent's best move?  The same way we figure out ours -- search.  And if we're careful to be somewhat general, we can use our same search procedure, or something very like it, for our moves and theirs.

What's different when we're playing for our own player, versus for the opponent?  It's in what value we assign to moves.  It's easiest to think of the value of moves in terms of our own player's score in both cases.  When we make a move, we're trying to make our score as large as possible.  When the opponent makes a move, they're trying to make our score as small as possible.  So when we alternately make our moves and theirs, we're alternately trying to maximize, then minimize, our score.  This scheme of assuming both we and the opponent make the best moves we can, and that our best is what maximizes our score, and their best is what minimizes our score, is called minimax search.  It's the optimal choice for any two-player "zero-sum" game (a game where if one wins, the other loses), and is at the heart of even the most sophisticated game-playing computer programs.

Not administrivia

Unlike previous projects, there is really not a separate first part that can be done without taking into account what will come later. The design decisions you make depend on knowing what you're ultimately going to be using your code for.  So although we'll split the requirements into two sections, you should read the entire project description up front, and keep in mind, as you're doing the first part, what you'll need for the second.  Also, the first part is short -- you should go straight on to the second when you're done, and not wait for the part A due date.

Administrivia

For this project you may work with a partner of your own choosing, possibly from a different section, or you may work alone. If you don't choose either of these options, you will be assigned a new partner from your section, as in past projects. If you work with a partner, you and your partner should turn in a single set of files. After the final part of the project is done, you will individually produce a written report. We strongly suggest that you work with a partner, as it will be very helpful to have someone to talk to about detailed design and implementation decisions.

Grading for part A:  This is a progress check.  We will try to give you quick feedback on the scale of 0-3: 3=no major problems, 2=something needs to be fixed, 1=serious trouble, and 0=no credible effort.  As always, besides the stated requirements, be sure to include appropriate JavaDoc and other comments in your code, use meaningful names, indent sensibly, provide toString() methods where appropriate, and so forth.

Grading for part B:  This is the final code turnin for the project.  If you work with a partner, you and your partner will receive the same scores on the programming parts of the project.  Programming projects will be evaluated both for how well the code works and how well it is written, each on a scale of 0 to 4 (see the project assessment guidelines on the main projects page for details).

Overview

In this project, you'll be implementing a tic-tac-toe playing program using minimax search, and making it possible for a human to play against your automated player.  You'll start by allowing two humans to play each other, with a simple GUI, then add the option to play against the computer (and perhaps even have two automated players play each other).

For purposes of the progress check only, we're dividing this into part A, where you'll get the human-versus-human version running, with a mouse-operated GUI, and part B, where you'll add the automated player.  But as mentioned above, you should keep in mind at all times what you'll need to do for part B.  We'll include some notes on what to watch out for as you start on part A.  In the Details section, we'll note whether each topic is included in part A or part B, and will summarize the requirements for each at the end.

Just in case anyone is unsure of the rules of tic-tac-toe, here is a brief description (which is all that's necessary...).  Tic-tac-toe is played on a board with 9 squares, arranged in 3 rows and 3 columns.  Two players alternately mark an unmarked square with their symbol -- an X for the player who starts first, and O for the player who starts second.  A player's goal is to get three of their own symbols in a row, either horizontally, vertically, or diagonally.  Play stops either when one player wins, or when all squares have been filled without a win -- in that case, the game is a draw.

Details

Dividing up the problem  (part A)

This is an opportunity for you to get some practice in splitting up an application into classes.  Think of what objects there are in a tic-tac-toe game, what tasks each has to perform, and what information you need to have for each.

There isn't One Best Way to do this, and, in fact, you may find that what's convenient with two human players isn't so convenient when one player is part of your code.  For instance...
  • Human players using a GUI will just make their moves at will, but an automated player (or a human player using a command line (text) interface), will need to be asked for a move.  So when you have an automated player, what will do the asking?  Two humans will "run" the game themselves, by taking turns, but what will control the game if one or both players don't take the initiative, but instead need to be asked?  It may be helpful to think about what you'd do with two humans if you were doing just a command line interface.

  • Two human players are the same type of thing -- it would be easy to just build them directly into the game.  In fact, because you know whether it's X or O that plays next, you don't even need to know which human clicked the mouse in your GUI!  An automated player is rather different, but it would be nice to make the two types "plug-replaceable", i.e. not to have to change anything else, but just "plug in" different types of players.

  • When you implement minimax, you'll need to represent many different possible moves, without losing track of what the current state of the game actually is.  With two human players making their own decisions, you don't need any recursive search, so all you need to know is the current state of the game.

  • The MVC architecture is your friend.  It tells you to keep the model (e.g. the game state) separate from the GUI.  When you're doing a minimax search, you don't want to drag an entire GUI along with each possible move's game state!

Representing the state of the tic-tac-toe game  (part A)

To support playing the game, you'll need a way to...
  • ...represent the "state" of the game -- what Xs and Os are on the board, and whose turn is next.
  • ...tell if the game is over, and what the outcome is.
  • ...get moves from the players.
  • ...decide if a player's proposed move is legal.  (It's not if there's a symbol already there.)
We suggest that you make it possible to use each of these features without a GUI, so you can test by executing single method calls from DrJava or from JUnit tests. JUnit tests are not required for this project, but you may find them helpful as you debug your code.
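For instance, a game-state class supporting those four features might look something like this.  This is only a sketch -- the names and the 2-D array representation here are our own invention, not a required design:

```java
import java.util.Arrays;

// A sketch of one possible game-state class -- the names and layout
// here are our own invention, not a required design.
class GameState {
    static final char EMPTY = ' ';
    private final char[][] board = new char[3][3];
    private char toMove = 'X';                 // X always moves first

    GameState() {
        for (char[] row : board) Arrays.fill(row, EMPTY);
    }

    char whoseTurn() { return toMove; }

    boolean isLegal(int row, int col) {
        return board[row][col] == EMPTY;       // can't reuse a square
    }

    void move(int row, int col) {
        board[row][col] = toMove;
        toMove = (toMove == 'X') ? 'O' : 'X';
    }

    // Returns 'X' or 'O' if that player has won, 'D' for a draw,
    // or '?' if the game isn't over yet.
    char outcome() {
        for (int i = 0; i < 3; i++) {
            if (board[i][0] != EMPTY && board[i][0] == board[i][1]
                    && board[i][1] == board[i][2]) return board[i][0];
            if (board[0][i] != EMPTY && board[0][i] == board[1][i]
                    && board[1][i] == board[2][i]) return board[0][i];
        }
        if (board[1][1] != EMPTY
                && ((board[0][0] == board[1][1] && board[1][1] == board[2][2])
                 || (board[0][2] == board[1][1] && board[1][1] == board[2][0])))
            return board[1][1];
        for (char[] row : board)
            for (char c : row)
                if (c == EMPTY) return '?';
        return 'D';
    }
}
```

Notice there's nothing GUI-related here -- you can exercise every method from DrJava or a JUnit test with plain method calls.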

A simple GUI  (part A)

Put together a simple GUI that lets players select moves with the mouse.  Because time is short, and we don't want you bogged down in learning new Swing classes, we'll give more explicit recommendations for this GUI.  You'll need three things:
  • ...the board itself -- a 3 x 3 grid of JButtons will work nicely.
  • ...a button to start or reset the game.
  • ...a place for a message, to tell who should move next, or to tell when the game is over and what the outcome is.

You'll want to use a JFrame as the top level window as usual, with a JPanel to display the game board and another one beneath it for the control button and message.

Recall that the default layout manager for a JPanel displays the components that are added to it from left to right. For the game board, we'd like to specify instead that the buttons that make up the game cells be laid out in a 3 x 3 grid. A different layout manager, GridLayout, is perfect for this. So when you create the JPanel to display the game, instead of using the no-argument JPanel constructor, use this sequence to specify that you want to use GridLayout as the layout manager, and it should have 3 rows and 3 columns:

JPanel grid = new JPanel(new GridLayout(3, 3));

Then create 9 buttons and add them to the grid -- they should fill in left to right, then top to bottom.  In order to keep your buttons from shrinking to nothing when you "pack" your JFrame, you should set their preferred size to something reasonable.  For instance:

JButton button = new JButton();
button.setPreferredSize(new Dimension(80, 80));
grid.add(button);


Then add this panel to the JFrame using its getContentPane().add() method.

For your start button and a status message area, you can put another JPanel in (say) the south part of your content pane.  A convenient component for a text message is a JLabel -- it's like a JButton in that it can display a line of text, or an icon, but it does nothing in response to mouse clicks.  JLabel has a setText method that you can use to change what message it displays.

If you don't give JPanel a specific layout manager as above, it has a FlowLayout -- this arranges components mainly left to right, top to bottom, but doesn't constrain them to be the same size, and it will "flow" them onto multiple rows as needed depending on their size and the width of the window.  If you add a JButton and a JLabel, they'll likely end up side by side, which is what you want.
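Put together, the control panel might look like this (a sketch -- the message text and names are our own):

```java
import javax.swing.*;

// A possible control panel (sketch): the default FlowLayout puts the
// start button and the status label side by side.
class ControlPanel {
    static JPanel make() {
        JPanel controls = new JPanel();            // default FlowLayout
        JButton start = new JButton("Start");
        JLabel status = new JLabel("X moves next");
        controls.add(start);
        controls.add(status);
        return controls;
    }
}
```

You'd then add the returned panel to (say) the south part of your frame's content pane.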

Running a human versus human game  (part A)

You've seen how to get button press events (look back at BallSimControl.java).  Since you know whether X or O plays next, you don't need anything but a mouse click on a button to tell both what square the move is in and which symbol is needed.  You can display the symbol right on the button -- you can change the symbol with setText.  (If a player clicks a button that's already been used for a move, you can just ignore it -- don't worry about telling the player they're being silly.)

Make sure you can stop the game when someone wins, or when it's a draw (e.g. stop responding to attempts to move).

Display a message to tell who should move next, or, when the game is over, what the outcome was.
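Wiring up one board button might look something like the sketch below: the first click fills in the current player's symbol, and later clicks on the same square are ignored.  The class and field names are our own invention, and a real version would also tell the game state about the move and stop responding once the game is over:

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JButton;

// Sketch: a board cell that fills in with the current player's symbol
// on its first click, and ignores clicks on used squares.
class MoveButton {
    static String next = "X";            // whose symbol goes down next

    static JButton makeCell() {
        JButton b = new JButton("");
        b.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                if (b.getText().equals("")) {     // ignore used squares
                    b.setText(next);
                    next = next.equals("X") ? "O" : "X";
                }
            }
        });
        return b;
    }
}
```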

Implementing an automated player -- minimax search  (part B)

When your automated player is called on to choose a move, it may be starting from some partly-filled board.  From there, it will need to check each allowed move.  What it needs for each possible move is the expected score if it chooses that move.  Then it can choose a move with the best score.  In this case, where the player is making a move for itself, the "best" score is the highest.  By tradition, since it's picking the highest score, when the player tries out a move for itself, it's called "Max".

The expected score from making a move is also called the "value" of the move.  (We sometimes also say this is the value of the game state that results from making that move.)

How do we get the value of one of these moves?   Well, if the move ends the game, then its value is the final score: +1 for a win, 0 for a draw, -1 for a loss.

If the move didn't end the game, then it's time to make a move pretending we're the opponent.  So we do almost exactly the same thing:  Starting from this new game state (including the hypothetical move the player is trying out), we go through all allowed moves for the opponent, and get their values.  Again, these are our player's expected scores.  But now, the opponent does their worst to us, and chooses the smallest, and returns that as the value of their move.  Because it chooses the minimum score, the opponent is traditionally called "Min".

We just said that when the opponent tries out their moves, they have to evaluate them.  How we do that is suspiciously similar to what we did before.  If the move ends the game, we return the final score.  If not, we go on to...try out moves for our player again.

So now we've come full circle.  Surely we don't want to replicate all the player's search code right here, and then replicate the opponent's code, and then the player's code, and then...  This is where recursion comes in -- we already have (or will have) a method that tries out our player's moves and returns the best value.  So, we just call that.  And inside that call, it may need to try out opponent moves again, but we've got that covered, because it's just using the same opponent code as before.
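The two halves of the search can be sketched as a pair of mutually recursive methods.  The sketch below is self-contained, with the board stored as a 9-character array and X assumed to be our player ("Max"); the names and representation are our own invention, and your real version will use whatever state class you designed:

```java
// Minimax sketch: X is "Max", O is "Min"; the board is a 9-char array
// holding 'X', 'O', or ' '.  Scores are from X's point of view.
class Minimax {
    static final int[][] LINES = {
        {0,1,2},{3,4,5},{6,7,8},{0,3,6},{1,4,7},{2,5,8},{0,4,8},{2,4,6}};

    // Returns 'X' or 'O' for a win, 'D' for a draw, '?' if not over.
    static char outcome(char[] b) {
        for (int[] l : LINES)
            if (b[l[0]] != ' ' && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
                return b[l[0]];
        for (char c : b) if (c == ' ') return '?';
        return 'D';
    }

    static int score(char end) {             // X's score at an end state
        return end == 'X' ? 1 : end == 'O' ? -1 : 0;
    }

    static int maxValue(char[] b) {          // X ("Max") to move
        char end = outcome(b);
        if (end != '?') return score(end);
        int best = Integer.MIN_VALUE;
        for (int i = 0; i < 9; i++)
            if (b[i] == ' ') {
                b[i] = 'X';                  // try the move...
                best = Math.max(best, minValue(b));
                b[i] = ' ';                  // ...then undo it
            }
        return best;
    }

    static int minValue(char[] b) {          // O ("Min") to move
        char end = outcome(b);
        if (end != '?') return score(end);
        int best = Integer.MAX_VALUE;
        for (int i = 0; i < 9; i++)
            if (b[i] == ' ') {
                b[i] = 'O';
                best = Math.min(best, maxValue(b));
                b[i] = ' ';
            }
        return best;
    }
}
```

Choosing an actual move is then just a matter of remembering which move produced the best value at the top level.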

There are two major gotchas when writing a method that calls itself (or calls something that calls it back, or any series of calls that can come around and call the same thing again).

First, we have to stop at some point.  If you think your program is really slow, it may not be slow at all, but stuck in an infinite recursion.  Unlike an infinite loop, an infinite recursion can't go on indefinitely.  This is because on each method call, Java needs some memory for parameter values and other local variables of the method it's calling, and some place to remember where to go back to when the method is done.  It can't get rid of this memory until the method returns.  If the method keeps calling itself, Java will have to keep allocating more of these "call frames".  Eventually it will run out of memory, and you'll get a StackOverflowError.  So be sure you correctly detect end states!

Second, our recursive methods must be "reentrant".  This means it must be possible for any number of callers to be using the same code without interfering with each other.  If the method relies on storing information in instance variables, or other "global" variables, then when a recursive call comes along and changes those variables, the original call's information will be gone.  So recursive methods must store any information they need only in local variables, as each call gets its own copy of those.
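Here is reentrancy in miniature (a toy example, not real search code): 'best' is a local variable, so every call in the chain gets its own copy.  If 'best' were an instance or static variable instead, the deeper recursive calls would overwrite the value the outer call was still using.

```java
// Toy example of a reentrant recursive method: 'best' is local, so
// each call in the recursion has its own independent copy.
class Reentrant {
    static int largest(int[] a, int from) {
        if (from == a.length - 1) return a[from];
        int best = a[from];                  // local: one per call
        return Math.max(best, largest(a, from + 1));
    }
}
```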

Running a human versus machine game  (part B)

Now that you have an automated player, you need to let it take the place of one of the human players.  Unlike the human versus human case, the players won't take turns on their own -- you'll have to alternately get a move from the human and the automated player.  Be careful not to let the human sneak extra moves in before the automated player has had a chance to take their turn!

Summary of what to do for part A

  • Come up with a way to represent the state of a tic-tac-toe game.
  • Let two human players play against each other, using a simple GUI.

Summary of what to do for part B

  • Make an automated player, using minimax search.
  • Allow a human to play the automated player through the GUI.

Hints

Understanding minimax  --  It's good to try out a minimax search by hand, drawing all the possible moves that branch out from each game state.  But this gets big very fast.  What you might do instead is try a 2 x 2 version of tic-tac-toe.

Multiple game states  --  When you need to try out lots of moves starting from some state, you could make an object representing each new move's state from scratch, specifying what's in each square.  An alternative is to take the starting state and a move, and make a new state based on those.

Different types of players  --  This is an opportunity to try making your own interface or (abstract) superclass to represent a generic player.  Because the human player operating a GUI and the automated player that has to be asked for moves are rather different, giving them a common interface may involve some compromises.
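For instance, a common interface might look like this.  It's only a sketch -- the method names are our own, and the trivial player below just takes the first empty square rather than running minimax:

```java
// One possible common interface for the two kinds of players; the game
// code that asks for moves doesn't need to know which kind it has.
interface Player {
    char symbol();                       // 'X' or 'O'
    int chooseMove(char[] board);        // index 0-8 of the chosen square
}

// A trivial automated player, just to show the plug-replaceable shape;
// a real one would run minimax here instead.
class FirstEmptyPlayer implements Player {
    private final char sym;
    FirstEmptyPlayer(char sym) { this.sym = sym; }
    public char symbol() { return sym; }
    public int chooseMove(char[] board) {
        for (int i = 0; i < board.length; i++)
            if (board[i] == ' ') return i;
        return -1;                       // no move available
    }
}
```

A GUI-driven human player is the awkward case: its "move" arrives from a mouse click rather than from being asked, which is one of the compromises mentioned above.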

Write the code first, remove redundancy later  --  When you're implementing minimax, you may find you have two very similar sets of code, one for "Min" moves and one for "Max" moves.  If you can find a way to merge them right away, that's great, but don't get hung up on this.  Another way to do it is to go ahead and get it working with two almost identical sets of code, then afterward look for tricks to combine the two.

Here's another case where you might want to tolerate some redundancy to begin with.  You've probably noticed that sometimes before you start into a loop, you have to "prime the pump" by doing something for the first time outside the loop, that you repeat inside the loop.  The same sort of thing happens with recursion, only there, the pump-priming usually includes the call to start the recursion, and inside the recursive method, the parallel code includes the recursive call.  Later, if you want to try to get rid of the duplication, one trick is to make an intermediate method, and call it both from where the recursion is started, and from inside the recursive method.  The intermediate method then calls the original recursive method.  That is, you have a method a that calls method b, then b calls a.  This is a perfectly valid form of recursion.  (Note you may not need this trick at all, so don't worry if it doesn't match how you set up the recursion.)
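Here's the intermediate-method trick in miniature, with a Fibonacci-style computation standing in for the real search (the names are our own): search() is called both to start the recursion and from inside valueOf(), so any shared pump-priming code lives in just one place.

```java
// a-calls-b-calls-a: valueOf() is the recursive method, and search()
// is the intermediate method called both to start the recursion and
// from inside it.
class Pump {
    static int valueOf(int n) {          // the recursive method
        if (n <= 1) return n;
        return search(n - 1) + search(n - 2);
    }

    static int search(int n) {           // the intermediate method
        // a natural home for shared bookkeeping (logging, caching, ...)
        return valueOf(n);
    }
}
```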

Extra credit

We strongly recommend that you get your basic requirements running before trying for extra credit.

Optimize your search  --  There are many opportunities for making your search faster.  Some of these have to do with not re-doing work you've already done, others with realizing you're on a hopeless path.  For instance:
  • It's possible to reach the same game state in tic-tac-toe by different sequences of moves.  So any state you reach, you might have already evaluated.  To avoid evaluating it again, you could save the value.  You'd want a way to store it so you could find it given the state.  By now, you can probably guess what Java library class is appropriate for this...  But there's a problem.  If you're making new game state objects on each move, you won't have the exact same object when you want to look up a previous state.  So you'll need to change how equals() works, to compare the contents of the state, not the reference.

  • An additional wrinkle is that some states are mirror images of each other, or rotated versions of each other.  Does this make any difference to the eventual score?  So you could extend your equality check to allow these equivalent states as well.

  • At any point when you're trying out a set of moves, you have a current best that you have to beat.  If you hit some end state that is going to spoil that entire "branch" of the recursion, so that there's no way you can beat that previous best, you can quit evaluating that branch.
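The content-based equality from the first suggestion above might look like the sketch below (the field names are our own invention).  One caution we're sure of: whenever you override equals(), you must also override hashCode() to agree with it, or HashMap lookups will silently miss.

```java
import java.util.Arrays;

// Content-based equality for a game state, so a HashMap of saved
// values can find a state reached by a different sequence of moves.
class BoardState {
    private final char[] squares;        // 9 chars: 'X', 'O', or ' '

    BoardState(char[] squares) {
        this.squares = squares.clone();  // defensive copy
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof BoardState
            && Arrays.equals(squares, ((BoardState) other).squares);
    }

    @Override
    public int hashCode() {              // must agree with equals()
        return Arrays.hashCode(squares);
    }
}
```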

Turn your application into an applet  --  An applet is a Java program that can be loaded from a web server and run in a browser.  If you convert your GUI into an applet, and package all your code up appropriately, you can add your tic-tac-toe game to your own web page.  Look at these Java tutorial sections on writing applets, and on using Swing components in applets.

Show what the automated player is doing  --  E.g. when the automated player is choosing, pop up a window or highlight the board showing each move it's trying.  Give it a "next" button to step through the moves slowly.

Make a general two-player game driver  --  Can you come up with an abstract representation of a game -- perhaps a game interface -- so that you can reuse your minimax search for other games?

For more information (optional, but if you're interested in this stuff...)

Artificial Intelligence: A Modern Approach, by Russell and Norvig, has a chapter on search for game playing.  This is ch. 6 in the 2nd edition, ch. 5 in the 1st edition.

Even much more complex game-playing programs, like Deep Blue, which beat the world chess champion Garry Kasparov in 1997, use minimax.  But they don't run games all the way out to a win or loss!  (Think about how many options there are for each next move -- the "branching factor" k we mentioned above when talking about complexity -- and how many moves there might be to the end of a chess game -- that was n above.  So how big is k^n for chess?  Riiiight...)  And anyway, moves in championship chess have time limits.  So chess-playing programs stop at some depth of future moves, and they estimate the "future value" of the board positions they do reach -- this substitutes for the actual score at the end.  The competence of a chess program depends not only on how many moves it can check in the time available, but also on how well it estimates the value of a board position.  These "evaluation functions" are often carefully-guarded secrets.

If you'd like to read about the software (and the specialized computer hardware) that were required to beat Garry Kasparov, a description of the events and technology is recorded here.  There's an article by the UW's own William Calvin about how humans play chess.  Or go to the ACM (Association for Computing Machinery) Library web site, and search for "deep blue" -- there are several apropos articles.  (If you can't access articles here from outside the UW, try it from a UW machine -- we may have a site license for their content.)

Why did the "background" section say "classical AI" depends on search?  That implies it's history -- what's wrong with it?  And if that's "classical", what's "modern"?  Search takes too long for most realistic problems.  We've got more effective methods now that don't require exponential runtimes.  They also deal with the fact that the real world is imprecise, and often what we want is an approximation, or a good-enough solution.  These new methods go by names like "machine learning" and "data mining".  Instead of search, they depend on statistics...  Explaining how this works is beyond the scope of this writeup.  ;-)

What to Turn In

Use the appropriate turnin form for part A or part B to turn in the Java source files that make up your project, including any JUnit tests. If your project uses any other files, such as images, turn those in also. If you have many files, including things like images, you can bundle them into an archive file (zip, jar, or tar) and turn that in. Multiple turnins are fine and highly recommended if you are planning to add extra credit features. Once you've got something that meets the basic requirements, turn that in. Then if you add to your project, turn in the extended version(s) later -- we'll grade the last one you turn in.