CSE 373, Spring 2004: Assignment 3

Title: Heaps for Huffman (V1.03 -- Version)

Due Dates: Part I -- Wednesday, April 21 in class; Part II -- Wednesday, April 28 at 11:59 PM

Work Modes: Part I -- Individual; Part II -- Partnership

The partnership planning form should be turned in during class on Friday, April 16.

The partnerships can be found here.

Part I (Individual Work)

(a)  Take the following list of values and show how
the Build-Heap operation turns them into a binary heap.
Illustrate each step with a tree diagram similar to those
in transparencies 28-30 of the Priority Queues lecture slides.

6, 15, 2, 20, 9, 3, 35, 19, 4, 8, 27.

(b) Show the same sequence of steps but using an array to
hold the values, as in lecture slide 12.

(c) Read about Huffman coding in the text or online.
Using pencil and paper, build a Huffman tree for the 
following set of characters and corresponding frequencies
of occurrence:

a: 7
b: 2
c: 1
d: 1
e: 9
i: 3
k: 1
0: 4

This tree should have 15 nodes.

Turn in your solutions in class on Wednesday, April 21.

Part II (Partnership Work)

Let's refer to the two partners in each team as "Partner A" and "Partner B".

Implement the following sequence of demonstrations in Java
related to Huffman Coding.

Program 1. The Basic Huffman Tree

    We will define a Huffman tree to be a binary
      tree in which:
        each internal node has two children;
        each leaf has two data fields:
          a character and a positive integer count;
        each internal node has a count, and
          the count of an internal node is equal to
          the sum of the counts of its two children.

    The following basic operations can be associated with
      Huffman trees:
    CREATE-LEAF(c : character, count : integer)
      Creates a 1-node tree where the node is a LEAF node.
      Assigns a particular character as the first value 
      of the node. Assigns an integer value to the second.

    COMBINE(t1, t2 : HuffmanTree)
      Combine two Huffman trees to create a new tree by creating
      a new root node (that is an INTERNAL node) and 
      making the given trees be its left and right subtrees.
      The new node's count value is set to the sum of the children's values.

    BOUNDING-BOX:
      Determine the width and length of the bounding box of
      the displayed version of the tree.

    PAINT
      Paint the tree at a particular location in the display panel.
      Use the inorder-traversal display method shown in Ch. 6 of
      the text.  Use the VisStackApplet's paintComponent implementation
      as a rough guide to implementing this.

 Implement the class HuffmanTree in Java and implement the above operations.
 Create an applet like the VisStackApplet to test and
   demonstrate this.
 Design your applet so that you can have as many as 3 Huffman trees
   defined and displayed at one time. Call them T1, T2, and T3.
   This will make it possible to build up an arbitrarly large example
   through a sequence of calls to CREATE-LEAF and COMBINE.
 Come up with your own "command language" similar to that
 used in the VisStackApplet for controlling the demo.

   Partner A in each team should take the lead on this.
A demo applet giving an operational solution to this problem (without source code, naturally) is shown here.
Program 2. The Binary Heap

   Partner B in each team should take the lead on this.

    Implement a variation of the Binary Heap ADT using the vector-based 
    method (i.e., a one-dimensional array) suggested in Chapter 7,
    section 7.3.2.
    Call the special version by the name HuffmanHeap.
      Each cell of your heap should contain
        A Huffman tree (i.e., a reference to the root
        of a Huffman tree
        (as specified in Part 1)
        and
      You may also include
        An integer that will represent the relative 
        probability of occurrence of a character that
        belongs to that tree.

    You should support the following methods:
       RESET (empties the heap of any data it might already contain)
       ISEMPTY
       INSERT symbol, count
          The count value is used to partially order the cells in
          the heap.
       MINHTREE - return the Huffman tree in the heap
        that has the minimum count value in its root.
       DELETEMIN  - throw away the root of the heap and reform the
         result into a new heap.
    The heap should be able to display itself at a given location.
A demo applet giving an operational solution to this problem (without source code, naturally) is shown here.

Program 3. Forest of Huffman Trees
  (Partners A and B)

    This is a relatively minor extention of the Huffman Tree ADT
   to allow multiple trees -- a set of trees.

       The forest can be represented by a combination of a list
          and a Huffman Heap (to permit use of MINSYMBOL, MINVALUE, DELETEMIN).
          [The list is optional, not necessary.]

    You should implement a class ForestOfHuffmanTrees.
    The most important method of this class should be
    one called MERGEALL that starts with a bunch of single-node Huffman
    trees and constructs one big Huffman tree from them
    by repeatedly combining the two minimal-weight
    Huffman trees at each step.

    Your implementation of this method should make use of
    code from both Program 1 (for the Huffman trees) and
    Program 2 (used to keep returning the minimum weight
    Huffman tree during MERGEALL.
A demo applet giving an operational solution to this problem (without source code, naturally) is shown here.

OPTIONAL Programs for extra credit:

Each optional program is worth 5 percent extra credit, for a maximum of 20 percent extra credit. Each program should incorporate the features of the previous programs and use the same applet framework (with scrollable display panel, history window, and textual command parsing) as the VisStackApplet. For example, if you implement program 5, then it will be able to scan a textual string (perhaps delimited in the text input by special BEGIN TEXT and END TEXT lines), initialize the forest of Huffman trees, merge them into one big tree, and create the code. The user will be able to watch all this happening in the scrollable display panel.

4. Implement the Huffman tree method CREATE-CODE.

     CREATE-CODE:
      Perform an inorder traversal of the tree and
       create a list of pairs (c, turn-sequence) where
       each c is a character in a leaf of the tree,
       and a turn-sequence is a string of 0 and 1 
       characters that describe how to get to c starting
       from the root of the tree. A 0 means go to the left
       subtree and 1 means go to the right subtree.

5. Implement String processor 1
    which builds a table of characters and their counts.
    This table would then be used to initialize the forest of
    Huffman trees before doing MERGEALL and CREATE-CODE.

6. Implement a Huffman encoder
    that takes a Huffman code and a text document
    and creates the encoded version of it using the
    given code.

7. Implement a Huffman decoder
    that takes a Huffman code and a string representing
    a document compressed with that Huffman code.
    It should return the decoded document, which should
    be identical to the original document before encoding.
A demo applet that shows the extra credit features is shown here.

Turn-in

For Part I, turn in your individual papers on Wednesday, April 21 in class. For Part II, post your applet on the web, using the UW Computing and Communications "students" server. Make sure your applet works under both Internet Explorer and Mozilla by using the example HTML given in the VisStackApplet demonstration. The layout of the web page and other elements to put on the web page will be announced later.

Turn in your source code via the web using E-Submit.

Minor updates made April 26.