University of Washington Department of Computer Science & Engineering
 CSEP 552: PMP Distributed Systems, Spring 2013

Overview

Instructor: Steve Gribble
Office hours: Monday 5pm-6pm, in CSE 578, or by appointment.

TA: Johnson Goh
Office hours: by appointment
(send email to johnson at cs dot washington dot edu)

Lectures: Mondays, 6:30pm-9:20pm, in CSE 305.

CSEP 552 is a graduate course on distributed systems. Distributed systems have become central to many aspects of how computers are used, from web applications to e-commerce to content distribution. This course will cover abstractions and implementation techniques for the construction of distributed systems, including client-server computing, the web, cloud computing, peer-to-peer systems, and distributed storage systems. Topics will include remote procedure call, maintaining consistency of distributed state, fault tolerance, and high availability, among others. As we believe the best way to learn the material is to build it, there will be a series of hands-on programming projects.

Prerequisites: the basic prerequisite is to have taken an undergraduate operating systems course (CSE 451 or equivalent) or an undergraduate networking course (CSE 461 or equivalent). If you haven't taken an undergrad OS or networks course, please come talk to Steve. We will not be covering undergraduate material in this course.

Papers: you will be responsible for reading approximately three papers before each class, and contributing your thoughts on each assigned paper to the class discussion board before the class that covers it.

Projects: every few weeks, I'll hand out a programming assignment related to the course material. You will work on the programming assignments solo.

Administrivia

Mailing list: When you register for the course, you'll automatically be added to the class mailing list (csep552a_sp13@uw.edu). This list will be created on April 1st, 2013. To manage your subscription after that, visit the mailing list web page. You will be subscribed using your u.washington.edu email address, but you can modify your subscription to use an email address of your choice. Note that you can only post to the mailing list from your subscribed email address.

Announcements:

Discussion Board

Here's the link to the class discussion board:
https://catalyst.uw.edu/gopost/board/gribble/32317/
The discussion board has three areas:

Lecture and Paper schedule

Here is the schedule of papers for the quarter; this schedule might be tweaked as we progress. The discussion board entries for the assigned papers are due by noon on the day of the associated lecture.

Videos of the lectures will be made available from this page within a day or two of the lecture itself.

Date | Reading | Slides / Notes

April 1 | Introduction: intro; rpc; dns
April 8 | Time, Clocks, and Global States: clock sync; logical clocks; snapshots (slides)
April 15 | Consistency, Coherence: overview; DSM; coda; bayou
April 22 | Transactions and Replication: concurrency control and recovery; two-phase commit
April 29 | Consensus: intro to consensus; paxos slides; chubby
May 13 | The Google Storage Stack: bigtable; megastore; spanner
May 20 | More Data Center Topics: cap intro; PNUTS; Dynamo; COPS
May 30 | Big Data Processing: MapReduce; Dryad/DryadLINQ; GraphLab/PowerGraph
June 3 | DHTs and P2P: P2P filesharing; P2P DHTs; BitTyrant
June 10 | Security: PBFT; SUNDR; BAN logic

Assignments

Everybody registered for the course should already have had an instructional UNIX account created for them by the department support staff, and have been notified of it. Using this account, you can remotely log into (via ssh) the attu.cs.washington.edu compute cluster. You can find more information about instructional resources here.

You should also be able to do the programming assignments on your own personal machines; none of them require large or exceptionally powerful machines. I'd recommend doing your work on Linux; I'd start with a standard Ubuntu Linux distribution. Note that the department has made virtual machine images available with the departmental Linux installation on them. You'll need to get ahold of VMware to use them.

A few rules of the road are worth mentioning. For design questions, you should feel free to talk with each other about the questions and the ideas that you come up with. You should not, however, share your written answers with each other directly. If you do discuss ideas with each other, please cite whom you discussed them with in your turned-in work. This is mostly so that you get in the habit of properly attributing collaborations.

Similarly, you should feel free to talk with each other about the programming assignments, and share ideas as you see fit. You can also make use of Google or other resources. However, you must not share code with each other, or rely on code you find elsewhere, such as the Web, to solve the programming assignment directly: you must implement your own code to solve each programming assignment. Unless the programming assignment specifies otherwise (and a few of them will), you can pick whatever programming environment or tools you like to build on -- e.g., you can make use of shells, interpreters, and within reason, libraries or other building blocks that don't directly solve the problem for you. As before, if you do discuss a programming assignment with someone else or find useful sources of information (e.g., code or technical descriptions on the Web), please cite or otherwise attribute all of your sources.