|
CSE Home |
Previous Quarters |
About Us |
Search |
Contact Info |
|
590z meets Tuesdays, 1:30-2:20pm in GWN 301.
590zz meets Wednesdays, 2:30-3:20pm in MOR 220.
Click here to view the combined email archive for cse590z@cs and theory-group@cs. The plan for 590z this quarter is to continue presenting material from Proofs from The Book, although people who would like to talk about something else from the current theory literature are welcome to do so. There will also be a few outside speakers talking about their recent research in theoretical computer science. 590zz, as usual, will focus on the current research work of students and faculty. This can be anything from half-baked to fully developed ideas. The point is to let everyone in the theory group know what you are working on and perhaps get some useful insights from others into problems that you are stuck at. Graduating students are also welcome to present their job talks.
ScheduleNote for speakers: Please email me (ashish@cs.washington.edu) the details of your talk as soon as you decide. Thanks!
|
|
| Speaker | Jeremy Barbay, University of Paris |
| Title | Intersection and t-Threshold Problem on Sorted Sets, Lower Bound and Algorithms for Indexed Search Engines |
| Abstract |
Consider the problem of computing the intersection of k sorted
sets. In the comparison model, we prove a new lower bound which
depends on the non-deterministic complexity of the instance, and
implies that the algorithm of Demaine, López-Ortiz and
Munro is optimal in this "adaptive" sense (for k much
smaller than n). We extend the lower bound and the algorithm to the
t-Threshold Problem, which consists in finding the elements
which are in at least t of the k sets.
These problems will be presented at the conference SODA02.
These problems are motivated by boolean queries in indexed text database systems such as Google. Two variants of the intersection problem, Bidimensional Intersection and InterUnion Problem, are studied in the context of the FFSS project, a new sharing file system protocol, associated with an indexed search engine. This protocol is currently tested on the local student network of the Federation AURORE, at the university of Paris-Sud. A generalisation of the t-Threshold Set Problem, the optimal Threshold Set problem would be of interest in indexed search engine and is also under study. |
---------------
| Speaker | Stefan Saroiu |
| Title | Content Distribution for Peer-to-Peer Systems |
| Abstract |
Recently, many proposals of peer-to-peer system designs and protocols
have emerged, all providing similar functionality: content name
lookup, that is resolving a query for a piece of content into the
address of the peer storing the content. While content name lookup is
vital to their success, peer-to-peer systems have another, equally
difficult, challenge to overcome: distributing content efficiently.
The characteristics of peer-to-peer systems pose a novel, unique environment to the problem of distributing content. First, in these systems, hosts have homogeneous and symmetric roles, acting as both clients and servers. Thus, in contrast to classic client-server architectures, distributing content in peer-to-peer can be done more efficiently, since any host can participate in the distribution. Second, a recent study has found that most of the participating peers are home-users, using asymmetric, low-speed (broadband and modem) Internet connections. Thus, bandwidths of transfers between peers are limited by the peer's connection speeds and a natural ordering of these speeds exists in practice. In this talk, we start by formalizing the content distribution problem for distributed systems in general and we describe two practical time-based objective functions. Based on this formal description, we find that most content distribution problems trivially reduce to or are known to be NP-hard. Surprisingly, however, in the case of distributing content from well-provisioned nodes in peer-to-peer, the content distribution problem has a linear time optimal algorithm for one of the proposed objective functions. Finally, we present an evaluation of this optimal algorithm and directions for future work. This is joint work with Krishna Gummadi, Jared Saia, Steve Gribble and Anna Karlin. |
---------------
| Speakers | Jason Hartline, Ed Hong, Alex Mohr, William Pentney |
| Title | Anti-Persistence: History Independent Data Structures |
| Abstract |
Paper by: Moni Naor and Vanessa Teague
Many data structures give away much more information than they were intended to. Whenever privacy is important, we need to be concerned that it might be possible to infer information from the memory representation of a data structure that is not available through its legitimate interface. Word processors that quietly maintain old versions of a document are merely the most egregious example of a general problem. We deal with data structures whose current memory representation does not reveal their history. We focus on dictionaries, where this means revealing nothing about the order of insertions or deletions. Our first algorithm is a hash table based on open addressing, allowing O(1) insertion and search. We also present a history independent dynamic perfect hash table that uses space linear in the number of elements inserted and has expected amortized insertion and deletion time O(1). To solve the dynamic perfect hashing problem we devise a general scheme for history independent memory allocation. For fixed-size records this is quite efficient, with insertion and deletion both linear in the size of the record. Our variable-size record scheme is efficient enough for dynamic perfect hashing but not for general use. The main open problem we leave is whether it is possible to implement a variable-size record scheme with low overhead. |
---------------
| Speaker | Frank McSherry |
| Title | Data Mining through Spectral Analysis |
| Abstract |
The heading Data Mining covers many problems, from web searching to
clustering to data prediction. In this talk, we will see how spectral
analysis, the use of matrix eigenstructure, can be applied to these
problems. We will see how the three problems mentioned above can be
cast in a common framework, and why spectral techniques are sucessful.
Joint work with lots of people, none of whom should be held
responsible
for the quality of the talk :)
|
---------------
| Speaker | Justin Campbell |
| Title | Coding Theory, An Introduction |
| Abstract | I will give an introduction to one of the current "hot" topics, Coding Theory. This will include, but is not limited to: background, basic terminology, trade-offs, linear codes, Hamming codes, Hamming bound, Singleton bound and possibly Reed-Solomon codes. My intent is for this to be understood by everyone who's had a (very) little linear algebra. Questions are encouraged. |
---------------
| Speaker | Erik Vee |
| Title | Lower bounds for the Nearest Neighbor Problem |
| Abstract |
Suppose you are given a set of m points in {0,1}^d, and you wish to
construct a database such that given any query from {0,1}^d, you can
quickly find the nearest point in your database. We investigate the
decision version of this problem, called Lambda-Nearest Neigbor: Given
the query point, decide whether any point in the database is within
distance lambda of the query. We show several different time-space
tradeoff lower bounds for the problem. Under the right assumptions,
these bounds relate the amount of space needed to store the database
vs. the amount of time needed to process a query to the database.
This is joint work with Paul Beame. The paper will appear in STOC02. |
---------------
| Speakers | Theory Night Group |
| Title | History Independent Data Structures |
| Abstract | We will present recent results from our study of History Independent Data Structures. The talk will begin with a brief review of history independence. We will make a few observations about history independence and space(memory) usage of data structures. We will give some general results about dynamically resizing and history independent data structures as adding the typical doubling techniques to a history independent data structure breaks their history independence. We also make some arguments as to the necessity of canonical representations for strong history independence. |
---------------
| Speaker | Ashish Sabharwal |
| Title | The Monotone Circuit Complexity of 3-Clique |
| Abstract |
This is a "half-baked" idea that I worked on 4 years ago with a bunch
of people at IIT Kanpur but never took to completion. I recently
started thinking about it again for fun!
One can easily construct circuits based on AND, OR and NOT gates to compute any Boolean (i.e. 0/1) function. In 1949, Shannon proved that almost all such functions require exponential size circuits. However, demonstrating even one specific function that indeed requires even super-linear circuit size is still an open problem. Such lower bounds are known only for circuits with various restrictions, such as monotonicity (not using any NOT gates) and bounded depth. 3-Clique is the problem of detecting whether a graph input in the form of (n \choose 2) edge variables has a 3 clique or not. This has a trivial monotone circuit using n^3 gates. An \Omega(n^3 / log^4 n) lower bound is also known, using somewhat complex techniques. We present a very simple combinatorial approach that we believe would get a better, simpler lower bound of \Omega(n^3 / log^2 n). The tangible results so far are for circuits of depth 3 or 4 only. |
---------------
| Speakers | Nilesh Dalvi, Karim Filali |
| Title | The Slope Problem, PFTB Chapter 9 |
| Abstract | We'll prove that any n >= 3 points in the plane that are not co-linear determine at least n-1 different slopes. The proof uses a nice reduction to a combinatorial model. |
---------------
| Speaker | Jared Saia |
| Title | Censorship Resistant Peer-to-peer Content Addressable |
| Abstract |
We present the first peer-to-peer network which is provably robust to
adversarial attack. Our network is robust in the sense that even
after adversarial removal of half the nodes in the network, an
arbitrarily large fraction of the remaining nodes can still access an
arbitrarily large fraction of the original data items. We also give a
variant of our scheme that has the property that it is highly spam
resistant: an adversary can have complete control of a constant
fraction of the peers and yet will still be unable to generate spam.
Our network is fully distributed and has time and resource bounds
which are competitive with other proposed peer-to-peer systems
(e.g. Chord, CAN and Tapestry) which are not provably robust to
adversarial attack.
This work appears in the Symposium on Discrete Algorithms, 2002, the Journal of Algorithms (by invitation), and the First International Workshop on Peer-to-peer Systems, 2002. The work is joint with Amos Fiat, Steve Gribble, Anna Karlin and Stefan Saroiu. |
---------------
| Speaker | Paul Beame |
| Title | A sharp threshold in proof complexity yields lower bounds for satisfiability search |
| Abstract |
It is well known that for random 3-CNF formulas there is a sharp
threshold in the ratio of the number of clauses to the number of
variables, below which formulas are almost certainly satisfiable but
above which they are almost certainly unsatisfiable. There is
substantial empirical and heuristic evidence that this satisfiability
threshold is at roughly 4.2 but its precise value is unknown. There
has been considerable work aimed at finding upper and lower bounds on
this value.
We prove that random 3-CNF formulas, at clause-to-variable ratios below 4 and thus presumed to be satisfiable almost certainly, almost certainly cause natural backtracking search algorithms (also known as Davis-Putnam or DPLL algorithms) to take exponential time to find satisfying assignments. We derive these results by giving the first example of a sharp threshold in proof complexity. More precisely, we show that for any sufficiently small epsilon>0 and D > 2.28, random formulas consisting of (1-epsilon)n 2-clauses and D n 3-clauses, which are known to be unsatisfiable almost certainly, almost certainly require resolution and Davis-Putnam proofs of unsatisfiability of exponential size, whereas it is easily seen that random formulas with (1+epsilon)n 2-clauses (and D n 3-clauses) have linear size proofs of unsatisfiability almost certainly. (Joint work with Dimitris Achlioptas and Mike Molloy) |
---------------
| Speaker | Gidon Shavit |
| Title | New Coins from Old: Computing with Unknown Bias |
| Abstract | The paper I'll be presenting is New Coins from Old: Computing with Unknown Bias by Elchanan Mossel (MSR) and Yuval Peres (UCB). The paper itself hasn't been published yet, and I'll be presenting from a draft. In addition, I'll take a somewhat major excursion into The Algebraic Theory of Context-Free Languages by N. Chomsky and M.P.Schutzenberger (1963). |
---------------
| Speaker | Ed Hong |
| Title | Extended Golomb Codes for Binary Markov Sources |
| Abstract | We evaluate the performance of elementary Golomb coding on a binary two-state Markov source. In addition to the simple context independent Golomb coding method that treats the source as memoryless, we also consider two other methods: sequential coding, and interleaved coding. Sequential coding is a context-dependent method that sequentially codes the source, choosing the order of the elementary Golomb code based on the last bit seen. Interleaved coding codes the even-numbered bits before the odd-numbered bits using elementary Golomb codes of several different orders. Of these methods, we show that no one method is best on all Markov sources, but that the best method depends upon the transition probabilities in the Markov source. |
---------------
| Speaker | Michael Saks |
| Title | Space lower bounds for distance approximation in the data stream model |
| Abstract |
We consider the problem of approximating the distance of two
$d$-dimensional vectors $x$ and $y$ in the data stream model. In this
model, the $2d$ coordinates are presented as a ``stream'' of data in
some arbitrary order, where each data item includes the index and
value of some coordinate and a bit that identifies the vector ($x$ or
$y$) to which it belongs. The goal is to minimize the amount of
memory needed to approximate the distance. For the case of
$L^p$-distance with $p \in [1,2]$, there are good approximation
algorithms that run in polylogarithmic space in $d$ (here we assume
that each coordinate is an integer with $O(\log d)$ bits). Here we
prove that they do not exist for $p>2$. In particular, we prove an
optimal approximation-space tradeoff of approximating $L^\infty$
distance of two vectors. We show that any randomized algorithm that
approximates $L^\infty$ distance of two length $d$ vectors within
factor of $d^\delta$ requires $\Omega(d^{1-4\delta})$ space. As a
consequence we show that for $p>2/(1-4\delta)$, any randomized
algorithm that approximate $L^p$ distance of two length $d$ vectors
within a factor $d^{\delta}$ requires $\Omega(d^{1-\frac 2p-4\delta})$
space.
The lower bound follows from a lower bound on the two-party one-round communication complexity of this problem. This lower bound is proved using a combination of information theory and Fourier analysis. This is joint work with Xiaodong Sun, and will appear in STOC 2002. |
---------------
| Speaker | Richard Ladner |
| Title | Windows Scheduling Problems for Broadcast Systems |
| Abstract |
The windows scheduling problem is defined by the positive integers h
and w_1,w_2,...,w_n. The window w_i is associated with page i and h is
the number of slotted channels available for broadcasting the pages. A
schedule that solves the problem assigns pages to slots such that the
gap between any two consecutive appearances of page i is at most w_i
slots. We investigate two optimization problems.
|
|
Department of Computer Science & Engineering University of Washington Box 352350 Seattle, WA 98195-2350 (206) 543-1695 voice, (206) 543-2969 FAX [Comments to Ashish Sabharwal, ashish] |
||