Assignment 2: Graph Algorithms - Theory and Practice
CSE 417: Algorithms and Computational Complexity
The University of Washington, Seattle, Winter 2013
The reading for this assignment is Chapter 3 of Algorithm Design by Kleinberg and Tardos. Do this assignment independently, without the use of online resources or collaboration.
Due Monday, January 28 (changed from Friday, January 25). Submit your answers to Part I ( items 1 to 4) as hardcopy at the beginning of class. Submit your answers to Part II ( item 5) electronically via Catalyst CollectIt by 2:00 PM on Monday, January 28 (also changed from Friday, January 25).
 
PART I. (Written answers).
  1. Topological Sorting (10 points). List all the possible topological orderings for the following graph.

     
  2. Butterfly Time (15 points). In Kleinberg and Tardos, Chapter 3, Exercise 4.
     
  3. When Depth and Breadth Come Together (10 points). In Kleinberg and Tardos, Chapter 3, Exercise 6.
     
  4. Who Lived When (15 points). In Kleinberg and Tardos, Chapter 3, Exercise 12.
     
PART II. (Code).
  1. Image Segmentation with Minimum Spanning Forests (50 points). Develop a Python 3 implementation of the image segmentation algorithm that we explored on the first day of class. Your implementation should include the following parts.
    • An implementation of Kruskal's algorithm, with a provision for stopping when the number of trees in the spanning forest reaches the value of a parameter ntrees. (25 points).
    • A method that takes the input image data (in the special form described later) and constructs a graph from it. There should be one vertex for each pixel of the graph. There should be an edge between two vertices if their pixels are adjacent (share a pixel boundary) in the image. Each pixel has an RGB triple (r,g,b) associated with it. The weight on an edge should be computed as the euclidean distance between the color values of the two pixels involved: D(vi, vj) = sqrt( (r1 - r2)^2 + (g1 - g2)^2 + (b1 - b2)^2 ). Because this graph has a very regular structure, you should create a special kind of array-based representation for it. You should design this representation yourself, and explain in the comments how each vertex and each edge is represented. (Hint: consider one array for the "horizontal" edges and one array for the "vertical" edges.) (15 points).
    • A means to output the results of image segmentation as a list of lists of the form below.
      [[0, 0, 0, 1], [2, 0, 1, 1], [2, 2, 1, 1]]
      
      In this example, the image is 3 rows by 4 columns. The value of ntrees is 3, and so the pixels are labeled with values in the range {0, 1, 2}. The numbers of rows and columns will depend on the input data, and your implementation should be able to handle any cases having at least one row and at least one column.
    • (optional, for 5 extra points). Implement the UNION/FIND ADT using uptrees and path compression, and use it in your implementation of Kruskal's algorithm. At the end of a run of Kruskal's algorithm, report the number of UNION, FIND, and pointer-following steps used by your algorithm. Here again, you may wish to use a special array-based uptree representation that takes advantage of the regular structure of the original graph. However, this is not required; you can also use a more conventional object-based approach.
    • (optional, for 5 extra points). Implement the priority queue ADT using a binary heap (which you should also implement yourself), and incorporate it into your implementation of Kruskal's algorithm. At the end of a run of Kruskal's algorithm, report the number of steps taken (count the number of comparisons of keys) within the binary heap to perform the operations.
    The input to your algorithm will be obtained in two phases. First, a string filename. Then your program will open and read the specified file to get the image data. The image data will be in the form of a list of lists of triples of integers. Here is a sample of the file contents for a very small image that could be the input for the sample output given above.
    [[(50, 50, 50), (50, 50, 50), (50, 50, 50), (79, 20, 16)],
     [(32, 93, 78), (50, 50, 50), (78, 21, 17), (79, 20, 16)],
     [(31, 89, 78), (33, 90, 77), (79, 21, 15), (77, 20, 17)]]
    
    Be sure to comment your code using either multi-line strings or comments following pound (number sign) characters.
     
    Turn in (a) your code, (b) a file BASIC-RESULTS.TXT showing your results on the example above with ntrees = 3, (c) and a file EARTH-RESULTS.TXT showing your results on the file earth-sm2.txt with ntrees = 2. Your code should consist of a main program file, named ImageSeg.py plus any additional python files that are imported by ImageSeg.py in which you implement various parts of your program. Here is the Catalyst dropbox for turning in these files.
     
Updates and Corrections: Any changes or important clarifications to the assignment will be posted here. (The due date was changed from Jan. 25 to Jan. 28 on Jan. 24.) (The leading "[" in lines 2 and 3 of the basic data example were removed on Jan. 23, so that the data is a valid list of lists of tuples.) (Data file changed at 5:27 PM on Jan. 16.) (Last updated Jan. 18 at 11:40 AM, with naming of ImageSeg.py specified, and transition from draft version to regular released version.)