Steam-powered Turing Machine University of Washington Department of Computer Science & Engineering
 Modern Data Structures
  CSE Home  About Us    Search    Contact Info 

People
 Paul Beame
 Matt Cary
 Erik Vee
   

Modern Data Structures

We are all familiar with the nlog n lower bounds for sorting. Except for small ranges of inputs where algorithms like bucket-sort or radix-sort are efficient, the received wisdom has been that this is the best one can do. When one considers the instruction sets available on real machines, however, one has much more latitude than merely the comparisons or addition of values for which the n log n lower bound applies; one can perform indirect addressing, shifts and multiplications of word-side quantities at essentially unit cost. What new and improved data structures can we obtain with these or other reasonable operations on word-size quantities at our disposal?

Consider the classic problem solved by binary search - the predecessor or 1-dimensional range searching problem: Given a sorted list of integers y1,...,yn and an integer x, find the i such that yi< x< yi+1. With traditional analyses this requires log n time but we show that for word-size integers we can solve this in time sqrt(log n/loglog n) and that this bizarre query time is optimal! We also obtain the same bound on the worst-case query and update cost of dynamically maintaining sorted lists to answer such range search queries. We have extended this to optimal bounds for other problems such as bounded-precision nearest-neighbor search in spaces of fixed dimension.

We are examining automated garbage collection in this model as a data structure problem. The bugaboo of automated garbage collection is the unpredictable long response time as systems periodically stop productive work and collect garbage. With the increasing popularity of object-oriented languages such as Java, improving this garbage collection has become even more critical. Can one spread out this work and still keep the overhead of garbage collection small? We have results showing that it cannot be completely smoothed out, but the question remains of how lumpy this has to be in order to retain its low overhead.

The first of these modern data structure results, due to Fredman and Willard, showed that one can sort n word-size quantities asymptotically faster than nlog n time. The current best algorithms solve this in nearly nloglog n time. Is it possible to sort in linear time or implement a constant-time priority queue? We are just beginning to scratch the surface on these problems.

Publications


CSE logo Department of Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA  98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX
[comments to beame@cs.washington.edu]